# API
The best way to deploy the REST API instance is to use Docker. Use the `docker/api/Dockerfile` to build a minimal Docker image:

```bash
docker build -f docker/api/Dockerfile -t inference-server-hpc:latest .

# Or, if you need root permissions for Docker:
sudo docker build -f docker/api/Dockerfile -t inference-server-hpc:latest .
```
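As a quick sanity check that the build succeeded, you can list the image; the name matches the `-t` tag used above:

```bash
docker images inference-server-hpc
```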
## Virtual machine setup
We recommend running the service in a virtual machine with the following minimum specs:

- Operating system: Linux (any modern distribution)
- RAM: 4 GB
- vCPUs: 2
- Free disk space: 50 GB
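As a quick way to check whether a host meets these requirements, the standard Linux tools can be used (exact output formats vary by distribution):

```bash
# CPU cores, memory, and free disk space on the root filesystem
nproc
free -h
df -h /
```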
## Running the container
To connect the service to your HEAppE server, create a `.env` file based on `.env.example` and fill in the credentials. All `HEAPPE` variables are mandatory. The `LEXIS` variables are only needed if you wish to connect the service to LEXIS; otherwise you can authenticate using the prototype static key `INFERENCE_SERVICE_FASTAPI_TOKEN`, which is local to your API instance.
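As a rough sketch of what such a file can look like, the variable names below (other than `INFERENCE_SERVICE_FASTAPI_TOKEN`) are illustrative placeholders; the authoritative list is in `.env.example` in the repository:

```bash
# Hypothetical HEAppE credentials -- check .env.example for the real variable names
HEAPPE_URL=https://heappe.example.org
HEAPPE_USERNAME=my-user
HEAPPE_PASSWORD=my-password

# Prototype static key used when not authenticating through LEXIS
INFERENCE_SERVICE_FASTAPI_TOKEN=change-me
```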
To further customize the API and the job daemon, it's possible to override the default `config.yaml`. Simply mount your config to `/app/config.yaml` in the container.
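The keys below are purely hypothetical and only illustrate the override workflow; consult the default `config.yaml` shipped with the repository for the actual schema:

```yaml
# Hypothetical overrides -- the real keys are defined in the repository's default config.yaml
api:
  log_level: info
daemon:
  poll_interval_seconds: 30
```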
Example of running the Docker container with a custom `.env` and `config.yaml`:

```bash
docker run -d \
  --name ai-inference-service-fastapi \
  --env-file .env \
  -v /home/vm/config.yaml:/app/config.yaml \
  -p 8000:8000 \
  inference-server-hpc:latest
```
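Once started, you can confirm the container is up and that the mounted config is visible inside it (the container name matches the `--name` flag above):

```bash
# Check the container status
docker ps --filter name=ai-inference-service-fastapi

# Confirm the mounted config is in place
docker exec ai-inference-service-fastapi cat /app/config.yaml
```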
Here are some useful commands for monitoring the API:

```bash
# Monitor the FastAPI and daemon output
docker logs -f ai-inference-service-fastapi

# View request and response logs
docker exec -it ai-inference-service-fastapi tail -f -n 1000 /app/logs/fastapi_logging.log
```
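A simple liveness check from the host is to hit the API on the published port; assuming the default FastAPI interactive docs are enabled, the `/docs` route should respond:

```bash
# Expect an HTTP 200 response if the service is up
curl -i http://localhost:8000/docs
```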