# Quick Start
Get up and running with streamt in 5 minutes. By the end of this guide, you'll have a working streaming pipeline.
## Prerequisites
- streamt installed
- Docker running (for local Kafka)
## 1. Start Local Infrastructure
```bash
# Start Kafka, Flink, and supporting services
docker compose up -d

# Wait for services to be healthy
docker compose ps
```
## 2. Create Your Project
Create a new directory for your streamt project:
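A minimal shell sketch, using the `my-streaming-project` name and subdirectories shown in the project layout at the end of this guide:

```shell
# Create the project directory and the subfolders used later in this guide
mkdir -p my-streaming-project/sources my-streaming-project/models my-streaming-project/tests
cd my-streaming-project
```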
Create the main configuration file, `stream_project.yml`:

```yaml
project:
  name: my-first-pipeline
  version: "1.0.0"
  description: My first streaming pipeline with streamt

runtime:
  kafka:
    bootstrap_servers: localhost:9092
  flink:
    default: local
    clusters:
      local:
        type: rest
        rest_url: http://localhost:8082
        sql_gateway_url: http://localhost:8084
```
## 3. Define a Source
Create a source representing incoming data in `sources/events.yml`:

```yaml
sources:
  - name: raw_events
    topic: events.raw.v1
    description: Raw user events from the web application
    owner: platform-team
    columns:
      - name: event_id
        description: Unique event identifier
      - name: user_id
        description: User who triggered the event
      - name: event_type
        description: Type of event (click, view, purchase)
      - name: timestamp
        description: When the event occurred
```
## 4. Create Your First Model
Create a model that transforms the raw events in `models/events_clean.yml`:

```yaml
models:
  - name: events_clean
    description: Cleaned and validated events
    sql: |
      SELECT
        event_id,
        user_id,
        event_type,
        `timestamp`
      FROM {{ source("raw_events") }}
      WHERE event_id IS NOT NULL
        AND user_id IS NOT NULL
    # Optional: customize topic settings
    advanced:
      topic:
        name: events.clean.v1
        partitions: 6
```
The model is automatically materialized as a topic since it's a simple SELECT statement.
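The cleaning logic here is just a null filter. As a rough illustration only (plain Python, not streamt or Flink), the same rule applied to a few sample events shaped like the `raw_events` columns:

```python
# Hypothetical sample events matching the raw_events source columns
raw_events = [
    {"event_id": "e1", "user_id": "u1", "event_type": "click", "timestamp": "2024-01-01T00:00:00Z"},
    {"event_id": None, "user_id": "u2", "event_type": "view", "timestamp": "2024-01-01T00:00:01Z"},
    {"event_id": "e3", "user_id": None, "event_type": "purchase", "timestamp": "2024-01-01T00:00:02Z"},
]

# Equivalent of: WHERE event_id IS NOT NULL AND user_id IS NOT NULL
events_clean = [
    e for e in raw_events
    if e["event_id"] is not None and e["user_id"] is not None
]

print([e["event_id"] for e in events_clean])  # only "e1" passes the filter
```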
## 5. Validate Your Project
Check that everything is configured correctly:
You should see:
## 6. View the Lineage
See how data flows through your pipeline:
Output:
## 7. Plan the Deployment
See what will be created:
Output:

```text
Plan: 1 to create, 0 to update, 0 to delete

Topics:
  + events.clean.v1 (6 partitions, replication: 1)
```
## 8. Deploy!
Apply your pipeline to the infrastructure:
Output:

```text
Applying changes...

Topics:
  + events.clean.v1 ............... created

Applied: 1 created, 0 updated, 0 unchanged
```
## 9. Verify in Conduktor
Open http://localhost:8080 and log in with:
- Email: admin@localhost
- Password: Admin123!
You should see the events.clean.v1 topic in the Topics view.
## 10. Add a Test
Create a test in `tests/events_test.yml` to validate data quality:

```yaml
tests:
  - name: events_schema_validation
    model: events_clean
    type: schema
    assertions:
      - not_null:
          columns: [event_id, user_id, event_type]
      - accepted_values:
          column: event_type
          values: [click, view, purchase, signup]
```
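The two assertion types map to simple row-level checks. A plain-Python sketch of what they verify (an illustration of the semantics, not streamt's implementation):

```python
# Hypothetical rows from the events_clean topic
rows = [
    {"event_id": "e1", "user_id": "u1", "event_type": "click"},
    {"event_id": "e2", "user_id": "u2", "event_type": "signup"},
]

required = ["event_id", "user_id", "event_type"]    # not_null columns
accepted = {"click", "view", "purchase", "signup"}  # accepted_values for event_type

# not_null: every required column is present and non-null in every row
not_null_ok = all(row.get(col) is not None for row in rows for col in required)

# accepted_values: event_type only ever takes a known value
accepted_ok = all(row["event_type"] in accepted for row in rows)

print(not_null_ok and accepted_ok)
```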
Run the test:
## Project Structure
Your project should now look like this:
```text
my-streaming-project/
├── stream_project.yml        # Main configuration
├── sources/
│   └── events.yml            # Source definitions
├── models/
│   └── events_clean.yml      # Model definitions
└── tests/
    └── events_test.yml       # Test definitions
```
## What's Next?
Congratulations! You've created your first streaming pipeline with streamt.
- Build a complete pipeline — Add stateful processing with Flink
- Learn about concepts — Understand sources, models, tests
- Explore materializations — Topics, Flink jobs, sinks
- See examples — Real-world pipeline examples