Real-Time Data Pipelines with Apache Kafka and Flink
Chapter 1: Stream Processing Architectures and Semantics
Evolution of Distributed Logs
Lambda versus Kappa Architecture
Processing Guarantees and Semantics
Event Time versus Processing Time
Hands-on Practical: Designing a Kappa Pipeline
Chapter 2: Advanced Kafka Producer and Consumer Internals
Replication Protocols and ISRs
Idempotent Producers and Transactions
Custom Partitioning Strategies
Consumer Group Rebalancing Protocols
Hands-on Practical: Implementing Transactional Writes
Chapter 3: Flink State Management and Checkpointing
State Backends: HashMap versus RocksDB
Asynchronous Barrier Snapshots
Incremental Checkpointing
Hands-on Practical: State Migration with Savepoints
Chapter 4: Advanced Windowing and Time Attributes
Watermark Generation Strategies
Handling Late Data and Side Outputs
Custom Window Triggers and Evictors
Session Windows and Gap Analysis
Hands-on Practical: Building a Custom Trigger
Chapter 5: Low-Level Operations with ProcessFunctions
The ProcessFunction Hierarchy
Timer Services and Event Scheduling
Async I/O for External Lookups
Hands-on Practical: Dynamic Rule Evaluation
Chapter 6: Production Deployment and Reliability
Serialization with Avro and Protobuf
Schema Registry Integration
Kafka Connect for Sources and Sinks
Failure Recovery Strategies
Hands-on Practical: Schema Evolution in Flight
Chapter 7: Performance Tuning and Monitoring
Memory Management and Slot Allocation
Tuning RocksDB Performance
Kafka Consumer Lag Analysis
Hands-on Practical: Resolving Skewed Data
Chapter 8: Real-Time AI and Feature Engineering
Online Feature Generation
Model Serving Patterns in Streams
Request-Response over Async Streams
Feature Store Integration
Hands-on Practical: Real-Time Inference Pipeline