Use Baseten for production ML model deployments with optimized inference, autoscaling, and multi-cloud support.

Clone the repository

git clone https://github.com/Liquid4All/lfm-inference

Deployment

The deployment script is based on Baseten's "Run any LLM with vLLM" documentation. To launch the deployment, run:
cd basten
pip install truss
truss push lfm2-8b --publish

Test call

curl -X POST https://<model-id>.api.baseten.co/environments/production/predict \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -d '{
    "model": "LiquidAI/LFM2-8B-A1B",
    "messages": [
      {
        "role": "user",
        "content": "What is the melting temperature of silver?"
      }
    ],
    "max_tokens": 32,
    "temperature": 0
  }'
Baseten endpoints expect the Api-Key prefix in the Authorization header.
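
The curl request above can also be mirrored in Python using only the standard library. This is a minimal sketch, not an official Baseten client. The helper names (`build_predict_request`, `extract_reply`, `ask`) are hypothetical, and the explicit `Content-Type` header is an addition that the curl command does not send. The response parsing assumes the deployment returns an OpenAI-style chat-completion body, which this page does not confirm.

```python
import json
import os
import urllib.request


def build_predict_request(model_id: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request mirroring the curl test call (nothing is sent here)."""
    url = f"https://{model_id}.api.baseten.co/environments/production/predict"
    payload = {
        "model": "LiquidAI/LFM2-8B-A1B",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 32,
        "temperature": 0,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Baseten expects the "Api-Key" prefix in the Authorization header.
            "Authorization": f"Api-Key {api_key}",
            # Added here for clarity; the curl command above omits this header.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Return the assistant text.

    Assumption: the body follows the OpenAI chat-completion layout
    (choices -> message -> content). Adjust if the deployment differs.
    """
    return response_body["choices"][0]["message"]["content"]


def ask(model_id: str, prompt: str) -> str:
    """Send the prompt to the deployed model and return its reply."""
    api_key = os.environ["BASETEN_API_KEY"]  # same variable the curl call uses
    request = build_predict_request(model_id, api_key, prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_reply(json.load(response))
```

With `BASETEN_API_KEY` exported, calling `ask("<model-id>", "What is the melting temperature of silver?")` with the real model ID substituted would send the same question as the curl test call.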