Streaming¶
OmniLLM supports real-time response streaming for all providers.
Basic Streaming¶
```go
// Assumes the standard imports: context, fmt, io, log.
stream, err := client.CreateChatCompletionStream(context.Background(), &omnillm.ChatCompletionRequest{
	Model: omnillm.ModelGPT4o,
	Messages: []omnillm.Message{
		{Role: omnillm.RoleUser, Content: "Tell me a short story about AI."},
	},
	MaxTokens:   &[]int{200}[0],
	Temperature: &[]float64{0.8}[0],
})
if err != nil {
	log.Fatal(err)
}
defer stream.Close()

fmt.Print("AI Response: ")
for {
	chunk, err := stream.Recv()
	if err == io.EOF {
		break
	}
	if err != nil {
		log.Fatal(err)
	}
	if len(chunk.Choices) > 0 && chunk.Choices[0].Delta != nil {
		fmt.Print(chunk.Choices[0].Delta.Content)
	}
}
fmt.Println()
```
Stream Interface¶
```go
type ChatCompletionStream interface {
	// Recv receives the next chunk from the stream,
	// returning io.EOF when the stream is finished.
	Recv() (*ChatCompletionStreamResponse, error)

	// Close closes the stream
	Close() error
}
```
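As a sketch of how this interface can be consumed, the receive loop from the basic example can be packaged into a small helper that drains a stream into a single string. Everything below except the interface itself is hypothetical: the `Delta`, `Choice`, and response types are minimal stand-ins, and `CollectStream` and `mockStream` are illustration-only names, not part of the library.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Minimal stand-ins for the library's response types (illustration only).
type Delta struct{ Content string }
type Choice struct{ Delta *Delta }
type ChatCompletionStreamResponse struct{ Choices []Choice }

type ChatCompletionStream interface {
	Recv() (*ChatCompletionStreamResponse, error)
	Close() error
}

// CollectStream drains a stream into one string (hypothetical helper).
func CollectStream(s ChatCompletionStream) (string, error) {
	defer s.Close()
	var b strings.Builder
	for {
		chunk, err := s.Recv()
		if err == io.EOF {
			return b.String(), nil
		}
		if err != nil {
			return b.String(), err
		}
		if len(chunk.Choices) > 0 && chunk.Choices[0].Delta != nil {
			b.WriteString(chunk.Choices[0].Delta.Content)
		}
	}
}

// mockStream replays fixed chunks, standing in for a provider stream.
type mockStream struct{ chunks []string }

func (m *mockStream) Recv() (*ChatCompletionStreamResponse, error) {
	if len(m.chunks) == 0 {
		return nil, io.EOF
	}
	c := m.chunks[0]
	m.chunks = m.chunks[1:]
	return &ChatCompletionStreamResponse{Choices: []Choice{{Delta: &Delta{Content: c}}}}, nil
}

func (m *mockStream) Close() error { return nil }

func main() {
	text, err := CollectStream(&mockStream{chunks: []string{"Hello, ", "world!"}})
	fmt.Println(text, err) // prints: Hello, world! <nil>
}
```

Because any value with `Recv` and `Close` satisfies the interface, mocks like this are also a convenient way to unit-test code that consumes streams without calling a real provider.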
Provider Support¶
| Provider | Streaming |
|---|---|
| OpenAI | Yes |
| Anthropic | Yes (SSE) |
| Google Gemini | Yes |
| X.AI | Yes |
| Ollama | Yes |
| AWS Bedrock | Yes |
Streaming with Observability¶
When using observability hooks, wrap the stream so that streaming-specific metrics, such as time to first token and total chunk count, are recorded as chunks arrive.