# API
The best way to deploy the REST API instance is to use Docker. Use the `docker/api/Dockerfile` to build a minimal Docker image:

```bash
docker build -f docker/api/Dockerfile -t inference-server-hpc:latest .

# Or, if you need root permissions for Docker:
sudo docker build -f docker/api/Dockerfile -t inference-server-hpc:latest .
```
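As a quick sanity check that the build succeeded, you can list the image; the name matches the `-t` tag used above:

```bash
docker images inference-server-hpc
```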
## Virtual machine setup
We recommend running the service in a virtual machine with the following minimum specs:

- Operating system: Linux (any modern distribution)
- RAM: 4 GB
- vCPUs: 2
- Free disk space: 50 GB
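As a quick way to check whether a host meets these requirements, the standard Linux tools can be used (exact output formats vary by distribution):

```bash
# CPU cores, memory, and free disk space on the root filesystem
nproc
free -h
df -h /
```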
## Running the container
To connect the service to your HEAppE server, create a `.env` file based on `.env.example` and fill in the credentials. All `HEAPPE` variables are mandatory. The `LEXIS` variables are only needed if you wish to connect the service to LEXIS; otherwise you can authenticate using the prototype static key `INFERENCE_SERVICE_FASTAPI_TOKEN`, which is local to your API instance.
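As a rough sketch of what such a file can look like, the variable names below (other than `INFERENCE_SERVICE_FASTAPI_TOKEN`) are illustrative placeholders; the authoritative list is in `.env.example` in the repository:

```bash
# Hypothetical HEAppE credentials -- check .env.example for the real variable names
HEAPPE_URL=https://heappe.example.org
HEAPPE_USERNAME=my-user
HEAPPE_PASSWORD=my-password

# Prototype static key used when not authenticating through LEXIS
INFERENCE_SERVICE_FASTAPI_TOKEN=change-me
```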
To further customize the API and the job daemon, it's possible to override the default `config.yaml`. Simply mount your config to `/app/config.yaml` in the container.
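The keys below are purely hypothetical and only illustrate the override workflow; consult the default `config.yaml` shipped with the repository for the actual schema:

```yaml
# Hypothetical overrides -- the real keys are defined in the repository's default config.yaml
api:
  log_level: info
daemon:
  poll_interval_seconds: 30
```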
Example of running the Docker container with a custom `.env` and `config.yaml`:

```bash
docker run -d \
  --name ai-inference-service-fastapi \
  --env-file .env \
  -v /home/vm/config.yaml:/app/config.yaml \
  -p 8000:8000 \
  inference-server-hpc:latest
```
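Once started, you can confirm the container is up and that the mounted config is visible inside it (the container name matches the `--name` flag above):

```bash
# Check the container status
docker ps --filter name=ai-inference-service-fastapi

# Confirm the mounted config is in place
docker exec ai-inference-service-fastapi cat /app/config.yaml
```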
Here are some useful commands for monitoring the API:

```bash
# Monitor the FastAPI and daemon output
docker logs -f ai-inference-service-fastapi

# View request and response logs
docker exec -it ai-inference-service-fastapi tail -f -n 1000 /app/logs/fastapi_logging.log
```
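A simple liveness check from the host is to hit the API on the published port; assuming the default FastAPI interactive docs are enabled, the `/docs` route should respond:

```bash
# Expect an HTTP 200 response if the service is up
curl -i http://localhost:8000/docs
```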