v0.2.0

Release Date: 2026-01-18

Major feature release adding audio codec support, an MCP server for Claude Code voice interactions, and pipeline components that connect voice providers.

Highlights

  • Audio codec package with PCM, mu-law, and a-law support for telephony
  • MCP server enabling Claude Code to make voice calls
  • Pipeline components connecting STT, TTS, and transport providers

Added

  • Audio codec package with PCM sample conversions (int16, float32, float64, bytes) (f64fe1e)
  • Mu-law encoding/decoding for Twilio Media Streams (f64fe1e)
  • A-law encoding/decoding for international telephony (f64fe1e)
  • Audio resampling, normalization, and analysis utilities (f64fe1e)
  • MCP server with stdio transport for voice interactions (721cbac)
  • Voice interaction tools: initiate_call, continue_call, speak_to_user, end_call (721cbac)
  • Session management for tracking active voice calls (721cbac)
  • TTSPipeline for streaming TTS output to transport connections (11c906d)
  • StreamingTTSPipeline for feeding streamed LLM text through TTS to a transport connection (11c906d)
  • STTPipeline for streaming audio from transport to STT with event callbacks (11c906d)
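
To illustrate what the codec entries above cover, here is a minimal sketch of G.711 mu-law companding and PCM sample conversion. It is written in Python for illustration only and is not this package's API: the names `linear_to_ulaw`, `ulaw_to_linear`, `int16_to_float`, `float_to_int16`, and `encode_pcm16_bytes` are hypothetical, and the input is assumed to be 16-bit little-endian PCM.

```python
import struct

BIAS = 0x84   # 132, added before the segment search (standard G.711 mu-law bias)
CLIP = 32635  # largest magnitude that still fits after the bias is added


def linear_to_ulaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample to one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): exponent e corresponds to bit e + 7.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Expand one mu-law byte back to a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if byte & 0x80 else magnitude


def int16_to_float(sample: int) -> float:
    """Map an int16 sample to the range [-1.0, 1.0)."""
    return sample / 32768.0


def float_to_int16(value: float) -> int:
    """Map a float sample to int16, clamping out-of-range input."""
    clamped = max(-1.0, min(1.0, value))
    return int(round(clamped * 32767))


def encode_pcm16_bytes(pcm: bytes) -> bytes:
    """Encode little-endian 16-bit PCM bytes as mu-law bytes (2:1 compression)."""
    count = len(pcm) // 2
    samples = struct.unpack("<%dh" % count, pcm[: count * 2])
    return bytes(linear_to_ulaw(s) for s in samples)
```

Each 16-bit sample becomes a single byte, so the encoded stream is half the size of the PCM input. Silence (sample 0) encodes to 0xFF, and decoding is lossy: full-scale input of 32767 decodes to 32124.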

Documentation

  • Voice integration PRD outlining goals, user stories, and success metrics (fd86611)
  • Twilio integration TRD detailing Media Streams architecture (fd86611)

Tests

  • Comprehensive unit tests for audio codec functions (mu-law, a-law, PCM) (f64fe1e)
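
As a rough idea of what a codec unit test can check, the sketch below implements G.711 a-law in Python and asserts a round-trip property: decoding any a-law byte and re-encoding the result yields the same byte. This is an illustration under assumptions, not the project's test suite; `linear_to_alaw`, `alaw_to_linear`, and `test_alaw_round_trip` are hypothetical names.

```python
# Upper bound of each a-law segment, in 13-bit magnitude units.
SEGMENT_END = (0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF)


def linear_to_alaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample to one a-law byte."""
    sample = max(-32768, min(32767, sample))
    value = sample >> 3  # reduce to 13 bits; Python's >> floors negatives
    if value >= 0:
        mask = 0xD5      # sign bit set, plus the even-bit inversion (0x55)
    else:
        mask = 0x55
        value = -value - 1
    segment = next((i for i, end in enumerate(SEGMENT_END) if value <= end), 8)
    if segment >= 8:     # out of range: saturate to the largest code
        return 0x7F ^ mask
    shift = 1 if segment < 2 else segment
    return ((segment << 4) | ((value >> shift) & 0x0F)) ^ mask


def alaw_to_linear(byte: int) -> int:
    """Expand one a-law byte back to a signed 16-bit PCM sample."""
    byte ^= 0x55
    segment = (byte & 0x70) >> 4
    magnitude = (byte & 0x0F) << 4
    if segment == 0:
        magnitude += 8
    else:
        magnitude = (magnitude + 0x108) << (segment - 1)
    return magnitude if byte & 0x80 else -magnitude


def test_alaw_round_trip() -> None:
    """Every a-law code must survive decode followed by encode unchanged."""
    for code in range(256):
        assert linear_to_alaw(alaw_to_linear(code)) == code


def test_alaw_silence() -> None:
    """A zero sample encodes to the a-law idle code 0xD5."""
    assert linear_to_alaw(0) == 0xD5
```

Testing the byte-to-byte round trip rather than sample-to-sample equality is deliberate: companding is lossy, so a PCM sample generally does not survive encode followed by decode exactly, while every code byte should.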

Full Changelog | Compare to v0.1.0