# Ollama

## Overview
- Models: Llama 3, Mistral, CodeLlama, Gemma, Qwen2.5, DeepSeek-Coder
- Features: Local inference, no API keys required, optimized for Apple Silicon
## Configuration

```go
client, err := omnillm.NewClient(omnillm.ClientConfig{
	Providers: []omnillm.ProviderConfig{
		{Provider: omnillm.ProviderNameOllama, BaseURL: "http://localhost:11434"},
	},
})
if err != nil {
	log.Fatal(err)
}
```
## Running Ollama

- Install Ollama from [ollama.ai](https://ollama.ai)
- Pull a model: `ollama pull llama3`
- Ollama's server runs automatically on `localhost:11434`
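Once installed, you can pull a model and confirm the local server is responding (this assumes the default port; `/api/tags` is Ollama's endpoint for listing locally installed models):

```shell
# Download model weights on first run
ollama pull llama3

# Verify the server is up; returns installed models as JSON
curl http://localhost:11434/api/tags
```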
## Available Models

| Model | Description |
|---|---|
| `llama3` | Meta's Llama 3 |
| `llama3:70b` | Llama 3 70B (larger) |
| `mistral` | Mistral 7B |
| `codellama` | Code-specialized Llama |
| `gemma` | Google's Gemma |
| `qwen2.5` | Alibaba's Qwen 2.5 |
| `deepseek-coder` | DeepSeek Coder |
## Example

```go
response, err := client.CreateChatCompletion(ctx, &omnillm.ChatCompletionRequest{
	Model: "llama3",
	Messages: []omnillm.Message{
		{Role: omnillm.RoleUser, Content: "Explain quantum computing simply."},
	},
})
if err != nil {
	log.Fatal(err)
}
fmt.Println(response.Choices[0].Message.Content)
```
## Streaming

```go
stream, err := client.CreateChatCompletionStream(ctx, &omnillm.ChatCompletionRequest{
	Model: "llama3",
	Messages: messages,
})
if err != nil {
	log.Fatal(err)
}
for {
	chunk, err := stream.Recv()
	if err == io.EOF {
		break
	}
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(chunk.Choices[0].Delta.Content)
}
```
## Custom Ollama Server

Connect to a remote Ollama instance:
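Point `BaseURL` at the remote host instead of `localhost`, using the same provider configuration shown above (the hostname below is a placeholder for your own server):

```go
client, err := omnillm.NewClient(omnillm.ClientConfig{
	Providers: []omnillm.ProviderConfig{
		// Replace the placeholder host with your remote Ollama server's address.
		{Provider: omnillm.ProviderNameOllama, BaseURL: "http://ollama.example.internal:11434"},
	},
})
if err != nil {
	log.Fatal(err)
}
```

Ollama listens only on `localhost` by default, so a remote instance must be configured to accept external connections before the client can reach it.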