Image Generation
Image generation services receive text inputs and output images.

| Service | Repository | Maintainer(s) |
|---|---|---|
| Replicate | https://github.com/bnovik0v/pipecat-replicate | bnovik0v |
Knowledge Retrieval
Semantic retrieval services enable context-aware search and retrieval of relevant information.

| Service | Repository | Maintainer(s) |
|---|---|---|
| Moss | https://github.com/usemoss/pipecat-moss | Moss |
Large Language Models
LLMs receive text or audio-based input and output a streaming text response.

Memory
Memory services provide persistent, cross-session user memory for voice agents.

| Service | Repository | Maintainer(s) |
|---|---|---|
| Synap | https://github.com/maximem-ai/maximem_synap_sdk/tree/main/packages/integrations | maximem-ai |
Observability
Observability services enable telemetry and metrics data to be passed to an OpenTelemetry backend or another service.

Speech-to-Text
Speech-to-Text services receive an audio input and output transcriptions.

Telephony Serializers
Serializers convert between frames and media streams, enabling real-time communication over a websocket.

Text-to-Speech
Text-to-Speech services receive text input and output audio streams or chunks.

Translation
Translation services enable real-time speech-to-speech and speech-to-text translation.

| Service | Repository | Maintainer(s) |
|---|---|---|
| Pinch | https://github.com/pinch-eng/pipecat-plugins-pinch | pinch-eng |
VAD
VAD services analyze audio input to detect when a user starts and stops speaking.

| Service | Repository | Maintainer(s) |
|---|---|---|
| TEN VAD | https://github.com/rahulsolanki001/pipecat-ten-vad | rahul solanki |
| FIRE RED VAD | https://github.com/rahulsolanki001/pipecat-fire-vad | rahul solanki |
Video
Video services enable you to build an avatar where audio and video are synchronized.

| Service | Repository | Maintainer(s) |
|---|---|---|
| Anam | https://github.com/anam-org/pipecat-anam | anam-org |
| Beyond Presence | https://github.com/bey-dev/pipecat-bey | bey-dev |
Vision
Vision services receive a streaming video input and output text describing the video input.

| Service | Repository | Maintainer(s) |
|---|---|---|
| SmolVLM | https://github.com/rahulsolanki001/pipecat-smolvlm | [rahul solanki](https://github.com/rahulsolanki001) |