Frequently Asked Questions¶
Common questions and answers about the Kafka Connect gRPC Source Connector.
General Questions¶
What is this connector used for?¶
The Kafka Connect gRPC connector streams real-time data from gRPC server streaming endpoints into Kafka topics. It's ideal for:
- Microservices communication using gRPC
- Event streaming from gRPC-based platforms
- Real-time data pipelines with gRPC sources
- Cloud-native applications exposing gRPC APIs
- IoT and telemetry data from gRPC-enabled devices
- Financial services data feeds via gRPC
How does it differ from the WebSocket connector?¶
| Feature | gRPC Connector | WebSocket Connector |
|---|---|---|
| Protocol | HTTP/2 + Protocol Buffers | WebSocket |
| Schema | Strong typing via Protobuf | String messages only |
| Streaming | Server streaming RPC | WebSocket push |
| TLS/mTLS | Built-in support | TLS only |
| Offset Tracking | Sequence-based | None |
| Message Format | Binary (Protobuf) | Text (JSON typically) |
| Use Cases | Microservices, typed APIs | Exchange feeds, IoT |
What gRPC patterns are supported?¶
Supported:
- ✅ Server streaming (one request, stream of responses)

Not supported:
- ❌ Unary (request-response)
- ❌ Client streaming (stream of requests, one response)
- ❌ Bidirectional streaming (streams in both directions)
Only server streaming is compatible with Kafka Connect's source connector model.
Installation & Setup¶
Do I need to build from source?¶
No. Pre-built JARs are available from GitHub Releases:
# Download the latest release
wget https://github.com/conduktor/kafka-connect-grpc/releases/download/v1.0.0/kafka-connect-grpc-1.0.0-jar-with-dependencies.jar
# Copy to plugin directory
cp kafka-connect-grpc-1.0.0-jar-with-dependencies.jar $KAFKA_HOME/plugins/kafka-connect-grpc/
Building from source is only needed for development or custom modifications.
Which Kafka version do I need?¶
Minimum: Kafka 3.9.0
Recommended: latest stable Kafka release
The connector uses Kafka Connect API features available in 3.9.0+.
Do I need protoc installed?¶
Yes, if you need to generate Protocol Buffer descriptor files (.desc).
# Install protoc
brew install protobuf # macOS
sudo apt install protobuf-compiler # Ubuntu/Debian
# Generate descriptor
protoc --descriptor_set_out=service.desc \
--include_imports \
your_service.proto
The descriptor file is required for dynamic message handling.
Can I use this with Confluent Platform?¶
Yes, the connector works with:
- Apache Kafka (open source)
- Confluent Platform
- Amazon MSK (Managed Streaming for Kafka)
- Azure Event Hubs for Kafka
- Any Kafka-compatible platform supporting Connect API
Configuration¶
What's the minimum configuration?¶
Five required parameters (plus the standard `connector.class`):
{
"connector.class": "io.conduktor.connect.grpc.GrpcSourceConnector",
"grpc.server.host": "localhost",
"grpc.server.port": "9090",
"grpc.service.name": "com.example.MyService",
"grpc.method.name": "StreamData",
"kafka.topic": "grpc-messages"
}
How do I configure TLS/mTLS?¶
TLS (server verification only):
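For server-only TLS, a minimal sketch using the same keys as the mTLS example below — only the CA certificate is needed:

```json
{
  "grpc.tls.enabled": "true",
  "grpc.tls.ca.cert": "/path/to/ca.crt"
}
```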
mTLS (mutual authentication):
{
"grpc.tls.enabled": "true",
"grpc.tls.ca.cert": "/path/to/ca.crt",
"grpc.tls.client.cert": "/path/to/client.crt",
"grpc.tls.client.key": "/path/to/client.key"
}
How do I add custom metadata/headers?¶
Option 1: Comma-separated format
Option 2: Individual entries
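A sketch of the comma-separated style; the property names here (`grpc.metadata`, `grpc.metadata.<key>`) are illustrative assumptions — confirm the exact keys in the connector's configuration reference:

```json
{
  "grpc.metadata": "authorization=Bearer TOKEN,x-client-id=kafka-connect"
}
```

Option 2 would instead set one property per header, e.g. `grpc.metadata.authorization`.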
How do I send a request message?¶
Use grpc.request.message with JSON format:
{
"grpc.request.message": "{\"filter\":\"active\",\"limit\":100}",
"grpc.proto.descriptor": "/path/to/service.desc"
}
The JSON is converted to Protobuf using the descriptor file.
Can I connect to multiple gRPC endpoints?¶
No, each connector instance connects to one gRPC endpoint. To stream from multiple endpoints:
- Create separate connector instances
- Use different connector names
- Route to the same or different Kafka topics
Example:
# Connector 1: Service A
curl -X POST http://localhost:8083/connectors \
-d '{"name":"grpc-service-a", "config":{...}}'
# Connector 2: Service B
curl -X POST http://localhost:8083/connectors \
-d '{"name":"grpc-service-b", "config":{...}}'
Proto Descriptors¶
What is a proto descriptor file?¶
A compiled Protocol Buffer schema containing all message and service definitions. It's used for dynamic message handling without code generation.
How do I generate a descriptor file?¶
Important: Always use --include_imports to include dependencies.
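The basic invocation (the same one used during setup):

```shell
# Compile the .proto into a self-contained descriptor set
protoc --descriptor_set_out=service.desc \
       --include_imports \
       your_service.proto
```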
Can I use base64-encoded descriptors?¶
Yes, you can base64-encode the descriptor and provide it inline:
# Encode descriptor
base64 service.desc > service.desc.b64
# Use in configuration
{
"grpc.proto.descriptor": "$(cat service.desc.b64)"
}
What if my proto imports other files?¶
Use --include_imports when generating the descriptor:
protoc --descriptor_set_out=service.desc \
--include_imports \
--proto_path=. \
--proto_path=./vendor \
your_service.proto
This includes all imported message definitions in the descriptor.
Data & Reliability¶
What delivery guarantees does this connector provide?¶
The connector provides at-least-once delivery semantics with sequence-based offset tracking:
- Messages are tracked by sequence number
- Offsets are committed to Kafka Connect
- Gaps in sequence numbers are detected and logged
- Cannot replay from gRPC server (server streaming limitation)
What happens if the connector crashes?¶
- In-memory queue: Messages in queue are lost
- Offset tracking: Last committed offset is preserved
- Reconnection: Connector reconnects and requests new stream
- Gap detection: Sequence gaps are logged as warnings
The connector cannot replay lost messages from the gRPC server.
Can I replay historical data?¶
No. gRPC server streaming limitations:
- Server doesn't store message history
- Cannot "rewind" to previous position
- Each connection gets a new stream starting from current state
For historical data, use the gRPC server's unary RPC methods (if available) or another mechanism.
What data format is produced to Kafka?¶
Messages are produced as JSON strings (converted from Protobuf):
{
"topic": "grpc-messages",
"partition": 0,
"offset": 12345,
"key": null,
"value": "{\"field1\":\"value1\",\"field2\":123}"
}
For structured processing, use Kafka Streams or ksqlDB to parse JSON downstream.
How are offsets tracked?¶
The connector uses sequence-based offset tracking:
- Session ID: Unique per connection (UUID)
- Sequence Number: Monotonically increasing counter
- Offset Format:
{"sessionId":"abc123","sequence":1000}
This enables gap detection and tracking across reconnections.
Operations¶
How do I monitor the connector?¶
Three monitoring approaches:

1. JMX Metrics (recommended)
2. Structured Logging
3. Kafka Connect REST API
See Configuration for monitoring setup.
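As a quick example of the REST API approach (connector name assumed to be `grpc-connector`):

```shell
# Connector and task state (RUNNING, FAILED, PAUSED, ...)
curl -s http://localhost:8083/connectors/grpc-connector/status
```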
What metrics should I alert on?¶
Critical alerts:
- Connector down: `connector.state != RUNNING`
- Not connected: `IsConnected = false` for more than 5 minutes
- No messages: `MessagesReceived` not increasing
- Queue overflow: `QueueUtilizationPercent > 80%`
- High reconnection rate: `TotalReconnects` increasing rapidly
How do I troubleshoot connection failures?¶
1. Test the gRPC endpoint
2. Check the TLS configuration
3. Review the connector logs
4. Verify the proto descriptor
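A walk through those four steps with standard tools (host, port, and paths are placeholders):

```shell
# 1. Test the gRPC endpoint directly
grpcurl -plaintext grpc-server:9090 list

# 2. Check the TLS handshake (for TLS-enabled endpoints)
openssl s_client -connect grpc-server:9090 </dev/null

# 3. Review connector logs for connection events
grep "event=connection_failed" $KAFKA_HOME/logs/connect.log

# 4. Verify the service exists in the descriptor
grpcurl -protoset service.desc list
```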
How do I update connector configuration?¶
For running connectors:
curl -X PUT http://localhost:8083/connectors/grpc-connector/config \
-H "Content-Type: application/json" \
-d @updated-config.json
The connector restarts automatically with the new configuration.
Note: Some changes require full restart (TLS, server host/port, service/method names).
Can I pause and resume the connector?¶
Yes, use the Kafka Connect REST API's `pause` and `resume` endpoints.
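Assuming the connector is named `grpc-connector` (substitute your own) and the Connect worker listens on port 8083:

```shell
# Pause: stops the task and closes the gRPC stream
curl -X PUT http://localhost:8083/connectors/grpc-connector/pause

# Resume: restarts the task and opens a fresh stream
curl -X PUT http://localhost:8083/connectors/grpc-connector/resume
```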
Warning: Pausing closes the gRPC connection. Messages sent during pause are lost.
Performance¶
What throughput can I expect?¶
Typical performance on standard hardware (4 CPU, 8 GB RAM):
- Message rate: 10,000+ messages/second
- Latency: < 5ms from gRPC receipt to Kafka produce
- Queue capacity: Configurable (default: 10,000 messages)
Actual throughput depends on:
- Message size
- Kafka broker performance
- Network latency
- gRPC server throughput
How many tasks can I run per connector?¶
Always 1 task per connector.
gRPC server streaming connections are single-threaded by design. Each connector maintains one gRPC stream.
To parallelize:
- Create multiple connector instances
- Point each at a different service/method, or use different request filters
What's the memory footprint?¶
Memory usage depends on queue size and message size:
Formula: memory ≈ queue size × average message size × overhead factor (~2×)

Example:
- Queue size: 10,000 messages
- Average message: 2 KB (Protobuf)
- Memory: ~40 MB (with overhead)

Recommendation:
- Development: 512 MB heap
- Production: 2 GB heap
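Plugging the example numbers into the formula as a quick sanity check:

```shell
# queue_size * avg_message_kb * overhead_factor -> approximate heap for the queue, in KB
echo $((10000 * 2 * 2))   # prints 40000 (KB), i.e. ~40 MB
```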
How do I optimize throughput?¶
1. Increase the queue size (to absorb bursts)
2. Increase the message size limit (if needed)
3. Optimize the Kafka producer (in `connect-distributed.properties`)
4. Tune keepalive settings (to reduce overhead)
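A combined sketch: `grpc.message.queue.size` appears elsewhere on this page, the `producer.override.*` keys are standard Kafka Connect per-connector overrides, and the message-size and keepalive key names are illustrative assumptions — check the configuration reference for the exact names:

```json
{
  "grpc.message.queue.size": "50000",
  "grpc.max.message.size": "8388608",
  "grpc.keepalive.time.seconds": "60",
  "producer.override.batch.size": "65536",
  "producer.override.linger.ms": "10"
}
```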
Troubleshooting¶
Why isn't my connector appearing in the plugin list?¶
Check:
- JAR is in the correct plugin directory
- `plugin.path` is configured in `connect-distributed.properties`
- Kafka Connect was restarted after installation
- All dependencies are present (use the uber JAR)
Verify:
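To confirm the worker sees the plugin:

```shell
# List installed connector plugins and filter for the gRPC connector
curl -s http://localhost:8083/connector-plugins | grep -i grpc
```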
Why do I get "UNAVAILABLE: io exception"?¶
Cause: gRPC server not reachable.
Solution:
# Test connectivity
telnet grpc-server 9090
# Test with grpcurl
grpcurl -plaintext grpc-server:9090 list
# Check connector logs
grep "event=connection_failed" $KAFKA_HOME/logs/connect.log
Why do I get "Service not found in descriptor"?¶
Cause: Service name doesn't match proto definition.
Solution:
# List services in descriptor
grpcurl -protoset service.desc list
# Verify exact service name (case-sensitive)
grpcurl -protoset service.desc describe com.example.MyService
Why is my queue constantly full?¶
Cause: Kafka producer throughput < gRPC message rate.
Solutions:

1. Increase the queue size (temporary relief)
2. Optimize the Kafka producer (permanent fix)
3. Scale the Kafka infrastructure:
- Add more brokers
- Increase partition count
- Use SSD storage
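For the first two solutions, a sketch — the `producer.override.*` keys are standard Kafka Connect settings, and the values are starting points rather than tuned recommendations:

```json
{
  "grpc.message.queue.size": "50000",
  "producer.override.batch.size": "131072",
  "producer.override.linger.ms": "20",
  "producer.override.compression.type": "lz4"
}
```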
Why do I see sequence gaps in logs?¶
Cause: Messages lost due to queue overflow or network issues.
Investigation:
# Check for gap warnings
grep "event=sequence_gap" $KAFKA_HOME/logs/connect.log
# Check queue utilization
# Monitor QueueUtilizationPercent metric via JMX
Solutions:
- Increase `grpc.message.queue.size`
- Optimize Kafka producer settings
- Investigate network stability
Compatibility¶
Does this work with Kafka 2.x?¶
No, minimum Kafka version is 3.9.0. The connector uses APIs introduced in Kafka 3.x.
Does this work with Java 8?¶
No, minimum Java version is 11. The connector uses Java 11 language features and gRPC Java 1.60.0 requires Java 11+.
Does this work with Kubernetes?¶
Yes, deploy Kafka Connect in Kubernetes and include this connector:
- Strimzi Kafka Operator: build a custom Connect image
- Confluent for Kubernetes: use a custom Connect image
- Helm charts: mount the plugin directory via a ConfigMap or PersistentVolume
Does this work with Docker?¶
Yes. Create a custom image:
FROM confluentinc/cp-kafka-connect:7.5.0
# Copy the uber JAR (includes all dependencies)
COPY kafka-connect-grpc-1.0.0-jar-with-dependencies.jar \
/usr/share/confluent-hub-components/kafka-connect-grpc/
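Building the image (the tag is illustrative):

```shell
# Build the custom Connect image from the Dockerfile above
docker build -t kafka-connect-grpc:1.0.0 .
```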
Still Have Questions?¶
- GitHub Issues: Open an issue
- Slack Community: Join Conduktor Slack
- Documentation: Browse documentation