API Documentation

Getting Started

Welcome to the OllamaGW API! Our API is designed to be compatible with the OpenAI API format, making it easy to integrate with your existing applications and workflows.

  1. Sign up for an account.
  2. Navigate to your dashboard and generate an API key.
  3. Use your API key in the Authorization header as a Bearer token.
  4. Start making requests to our API endpoints!

Authentication

All API requests must be authenticated using an API key. Include your API key in the Authorization header as a Bearer token:

Authorization: Bearer YOUR_API_KEY

You can manage your API keys from your API Keys dashboard.

API Endpoints

Our primary base URL is: https://ollama-gateway.freeleakhub.com/v1

Chat Completions

POST https://ollama-gateway.freeleakhub.com/v1/chat/completions

This endpoint generates a model response for the given conversation. It supports streaming.

curl -X POST https://ollama-gateway.freeleakhub.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "stream": false
  }'
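
Since the API follows the OpenAI chat-completion format, a successful non-streaming response looks like the following (ids, timestamps, and token counts are illustrative):

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1715000000,
  "model": "llama3:8b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of France is Paris."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 14, "completion_tokens": 8, "total_tokens": 22}
}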

Python Example:

import openai

client = openai.OpenAI(
    base_url="https://ollama-gateway.freeleakhub.com/v1",
    api_key="YOUR_API_KEY" # Can be an empty string if Authorization header is set globally
)

try:
    chat_completion = client.chat.completions.create(
        model="llama3:8b",
        messages=[
            {"role": "user", "content": "Explain quantum computing in simple terms."}
        ],
        stream=False
    )
    print(chat_completion.choices[0].message.content)
except openai.APIError as e:
    print(f"An API error occurred: {e}")

Node.js (JavaScript/TypeScript) Example:

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://ollama-gateway.freeleakhub.com/v1',
  apiKey: 'YOUR_API_KEY', // Your OllamaGW API key (the SDK falls back to the OPENAI_API_KEY env var if omitted)
});

async function main() {
  try {
    const chatCompletion = await openai.chat.completions.create({
      messages: [{ role: 'user', content: 'Write a short poem about AI.' }],
      model: 'llama3:8b',
      stream: false,
    });
    console.log(chatCompletion.choices[0].message.content);
  } catch (error) {
    console.error('Error fetching chat completion:', error);
  }
}

main();

Completions (Legacy)

POST https://ollama-gateway.freeleakhub.com/v1/completions

This is the legacy completions endpoint. While supported for compatibility, we recommend using the Chat Completions endpoint for new applications.
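
The endpoint mirrors OpenAI's legacy completions shape, taking a plain prompt string instead of a messages list. A minimal sketch with the OpenAI Python SDK (the prompt and max_tokens values are illustrative):

import openai

client = openai.OpenAI(
    base_url="https://ollama-gateway.freeleakhub.com/v1",
    api_key="YOUR_API_KEY"
)

# The legacy endpoint takes a single prompt string rather than a messages list.
completion = client.completions.create(
    model="llama3:8b",
    prompt="Say hello in French.",
    max_tokens=32
)
print(completion.choices[0].text)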

Models

GET https://ollama-gateway.freeleakhub.com/v1/models

Lists the models currently available to your API key.

curl https://ollama-gateway.freeleakhub.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Streaming Responses

For real-time interactions, you can stream responses by setting "stream": true in your request payload. The server will send Server-Sent Events (SSE).
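
On the wire, each SSE event carries a JSON chunk in a data: line, and the stream terminates with data: [DONE]. An abridged, illustrative stream in the OpenAI-compatible chunk format:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]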

Python Streaming Example:

import openai

client = openai.OpenAI(
    base_url="https://ollama-gateway.freeleakhub.com/v1",
    api_key="YOUR_API_KEY"
)

try:
    stream = client.chat.completions.create(
        model="llama3:8b",
        messages=[{"role": "user", "content": "Tell me a story about a brave robot."}],
        stream=True
    )
    for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")
    print() # Newline at the end
except openai.APIError as e:
    print(f"An API error occurred: {e}")

Available Models

You can find a dynamically updated list of all publicly available models, their server counts, and recent usage statistics on our dedicated Available Models page.

To see which models are specifically available to your API key (based on your subscription tier), use the /v1/models endpoint.

Token Usage & Billing

Token usage is calculated based on the input prompt and the generated completion. We use a tokenization method similar to OpenAI's cl100k_base encoding for most models.
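
Assuming the tiktoken library is available, you can make a rough client-side estimate with its cl100k_base encoding (a sketch; actual billed counts may differ per model):

import tiktoken

# cl100k_base approximates the tokenizer used for billing on most models here.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "Explain quantum computing in simple terms."
print(f"Estimated prompt tokens: {len(enc.encode(prompt))}")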

You can monitor your token usage in your dashboard.

Error Codes

Our API uses standard HTTP status codes to indicate the success or failure of a request. Common codes are listed below, followed by a short error-handling sketch:

  • 200 OK: Request successful.
  • 400 Bad Request: Invalid request payload or parameters.
  • 401 Unauthorized: Invalid or missing API key.
  • 403 Forbidden: API key does not have permission for the requested resource/model.
  • 429 Too Many Requests: Rate limit exceeded (though we aim to be generous!).
  • 500 Internal Server Error: An unexpected error occurred on our end.
  • 503 Service Unavailable: The upstream Ollama server is temporarily unavailable, or no healthy server is available for the requested model.
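
As a sketch of handling these codes with the OpenAI Python SDK, which raises openai.RateLimitError for 429 and openai.APIStatusError for other non-2xx responses (the backoff values here are illustrative):

import time

import openai

client = openai.OpenAI(
    base_url="https://ollama-gateway.freeleakhub.com/v1",
    api_key="YOUR_API_KEY"
)

def chat_with_retry(messages, retries=3):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(model="llama3:8b", messages=messages)
        except openai.RateLimitError:
            time.sleep(2 ** attempt)  # 429: back off exponentially, then retry
        except openai.APIStatusError as e:
            if e.status_code == 503:
                time.sleep(2 ** attempt)  # transient upstream outage: retry
            else:
                raise  # 400/401/403/500 are not retryable
    raise RuntimeError("Retries exhausted")

print(chat_with_retry([{"role": "user", "content": "Hello!"}]).choices[0].message.content)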