Kafka Connect WebSocket Source Connector¶
Stream real-time data from any WebSocket endpoint directly into Apache Kafka with automatic reconnection, authentication support, and comprehensive monitoring.
Features¶
Real-Time Streaming¶
Connect to any WebSocket endpoint (ws:// or wss://) and stream data directly into Kafka topics with minimal latency.
Automatic Reconnection¶
Built-in reconnection logic with configurable intervals ensures resilient connections even when endpoints are unstable.
Authentication Support¶
Bearer token authentication and custom headers for connecting to secured WebSocket APIs.
Subscription Messages¶
Send subscription messages after connection to exchanges like Binance, Coinbase, and custom WebSocket servers.
Built-in Monitoring¶
Comprehensive metrics via JMX, detailed logging, and integration with Prometheus/Grafana for production observability.
:material-buffer: Message Buffering¶
Configurable in-memory queue to handle traffic bursts and optimize throughput to Kafka.
Quick Example¶
Stream Bitcoin trades from Binance into Kafka:
{
"name": "binance-btcusdt-connector",
"config": {
"connector.class": "io.conduktor.connect.websocket.WebSocketSourceConnector",
"tasks.max": "1",
"websocket.url": "wss://stream.binance.com:9443/ws",
"kafka.topic": "binance-btcusdt-trades",
"websocket.subscription.message": "{\"method\":\"SUBSCRIBE\",\"params\":[\"btcusdt@trade\"],\"id\":1}",
"websocket.reconnect.enabled": "true"
}
}
Use Cases¶
The connector is ideal for streaming real-time data from:
- Cryptocurrency Exchanges - Live trades, order books, and ticker data (Binance, Coinbase, Kraken)
- IoT Devices - Sensor data and telemetry streams
- Financial Markets - Stock tickers, forex rates, market data
- Collaboration Tools - Chat messages, presence updates, notifications
- Gaming Platforms - Live scores, player updates, game events
- Custom WebSocket APIs - Any service exposing a WebSocket endpoint
Architecture¶
graph LR
A[WebSocket Endpoint] -->|OkHttp Client| B[Message Queue]
B -->|Poll| C[Kafka Connect Task]
C -->|Produce| D[Kafka Topic]
style A fill:#3b82f6
style B fill:#f59e0b
style C fill:#10b981
style D fill:#ef4444 The connector uses OkHttp's WebSocket client for reliable connections, maintains an in-memory queue for buffering, and integrates seamlessly with Kafka Connect's task framework.
Why This Connector?¶
| Feature | Kafka Connect WebSocket | Custom Consumer | REST Polling |
|---|---|---|---|
| Real-time data | ✅ True push model | ✅ True push | ❌ Polling delays |
| Kafka integration | ✅ Native | ⚠️ Manual | ⚠️ Manual |
| Reconnection logic | ✅ Built-in | ⚠️ Custom code | ⚠️ Custom code |
| Monitoring | ✅ JMX/Prometheus | ⚠️ Custom | ⚠️ Custom |
| Deployment | ✅ Kafka Connect | ❌ Separate service | ❌ Separate service |
| Maintenance | ✅ Integrated | ⚠️ DIY | ⚠️ DIY |
Performance¶
- Throughput: Handles 10,000+ messages/second on standard hardware
- Latency: < 10ms from WebSocket receipt to Kafka produce
- Reliability: Automatic reconnection with exponential backoff
- Scalability: Configurable queue size for traffic bursts
Operational Features¶
Includes comprehensive operational tooling:
- Detailed error handling and structured logging
- JMX metrics for Prometheus/Grafana integration
- Automatic reconnection with exponential backoff
- Configurable message buffering
- Extensive troubleshooting documentation
⚠️ Important: Data Delivery Semantics¶
At-Most-Once Delivery
This connector provides at-most-once delivery semantics due to architectural limitations:
- In-memory buffering: Messages in the queue are lost on shutdown/crashes
- No replay capability: WebSocket protocol doesn't support message replay
- Single task limitation: WebSocket connections don't shard like Kafka partitions
Use cases: Best suited for telemetry, monitoring, and scenarios where occasional data loss is acceptable. For critical data, consider additional validation or complementary ingestion methods. See the README for detailed mitigation strategies.
Community & Support¶
- GitHub Issues: Report bugs and request features
- Slack Community: Join Conduktor Slack
- Documentation: Full reference guide
License¶
Apache License 2.0 - see LICENSE for details.