Overview

This repository contains the EXA4MIND AI Inference Service, which enables inference of AI models, such as large language models (LLMs), on supercomputers. The service is part of the EXA4MIND project, which aims to bring user-friendly AI capabilities to high-performance computing (HPC) environments.

The service provides a simple, user-facing interface for selecting Hugging Face models and submitting inference requests, abstracting away the complexity of HPC job orchestration and resource management. Under the hood, it integrates with the HEAppE middleware to acquire HPC resources and to launch and monitor inference jobs across HPC clusters.
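
To illustrate the kind of workflow the service abstracts away, the sketch below shows what a client-side request might look like. The endpoint paths, payload fields, and job-polling scheme are assumptions for illustration only and are not the service's documented API.

```python
import requests

# Hypothetical base URL -- the real deployment address is not given in this overview.
SERVICE_URL = "https://inference.example.org/api/v1"

# Submit an inference request for a Hugging Face model (endpoint and payload
# shape are assumptions; the model ID is just an example).
response = requests.post(
    f"{SERVICE_URL}/inference",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "prompt": "Summarize the EXA4MIND project in one sentence.",
        "max_tokens": 128,
    },
    timeout=30,
)
response.raise_for_status()
job = response.json()
print("Submitted job:", job.get("job_id"))

# The service orchestrates the underlying HPC job via HEAppE; the client only
# polls for the result (this polling endpoint is likewise an assumption).
result = requests.get(f"{SERVICE_URL}/inference/{job['job_id']}", timeout=30)
print(result.json())
```

The key point is that the client never talks to the scheduler or the cluster directly; job submission, monitoring, and resource handling happen behind the service's interface.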