Amazon Managed Streaming for Apache Kafka (MSK) is a fully managed Apache Kafka service for ingesting and processing streaming data in real time. It is the open-source-based alternative to Kinesis Data Streams. MSK creates and manages the Kafka brokers and ZooKeeper nodes of an MSK cluster, deploys the cluster across multiple Availability Zones (up to 3), and stores data on EBS volumes. MSK can also recover automatically from common Apache Kafka failures.
Another option is MSK Serverless, which provides the same service without requiring you to provision or manage cluster capacity.
Consumers
Four types of consumers:
- Kinesis Data Analytics for Apache Flink (now named Amazon Managed Service for Apache Flink)
- Glue (using Spark streaming)
- Lambda
- a custom application running on EC2, ECS, or EKS that uses a Kafka client library
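
For the Lambda consumer, a minimal handler might look like the sketch below. It assumes the event shape that Lambda delivers for an MSK event source: a `records` map keyed by `<topic>-<partition>`, with each record's `value` base64-encoded. The function name `handler` and the tuple layout it returns are illustrative choices, not part of any AWS API.

```python
import base64


def handler(event, context):
    """Decode every record value in an MSK-triggered Lambda event.

    Assumption: event["records"] maps "<topic>-<partition>" to a list of
    record dicts whose "value" field is base64-encoded.
    """
    decoded = []
    for _topic_partition, records in event.get("records", {}).items():
        for record in records:
            raw = record.get("value")
            # A record may carry no value (for example a tombstone).
            value = base64.b64decode(raw).decode("utf-8") if raw else None
            decoded.append(
                (record["topic"], record["partition"], record["offset"], value)
            )
    return decoded
```

Returning the decoded tuples keeps the sketch easy to test locally; a real consumer would process each value instead of collecting it.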
Comparing Kinesis Data Streams and MSK
| | Kinesis Data Streams | MSK |
|---|---|---|
| Message size limit | 1 MB | 1 MB default; configurable higher |
| Components | Streams with shards | Topics with partitions |
| Scaling | Scale up by splitting shards; scale down by merging shards | Scale up by adding partitions; partitions cannot be removed |
| Encryption in-flight | TLS | Plaintext or TLS |
| Encryption at-rest | KMS | KMS |
| Data retention (max) | 365 days | Indefinitely |
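
The Scaling row can be illustrated with a small sketch. In Kinesis, each shard owns a contiguous range of 128-bit hash keys (the MD5 of the partition key), so splitting and merging just divide or join ranges. In Kafka, a keyed record usually lands on `hash(key) % partition_count`, so adding partitions changes where some keys go for new records. All function names here are hypothetical, and CRC32 is a stand-in hash: Kafka's default partitioner uses murmur2.

```python
import hashlib
import zlib

MAX_HASH = 2**128 - 1  # Kinesis hash keys span 0 .. 2^128 - 1


def split_shard(shard):
    """Split a (start, end) hash-key range into two adjacent halves."""
    start, end = shard
    mid = (start + end) // 2
    return (start, mid), (mid + 1, end)


def merge_shards(left, right):
    """Merge two adjacent hash-key ranges back into one."""
    if left[1] + 1 != right[0]:
        raise ValueError("shards must be adjacent")
    return (left[0], right[1])


def kinesis_shard_for(key, shards):
    """Return the index of the shard whose range contains MD5(key)."""
    h = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    for index, (start, end) in enumerate(shards):
        if start <= h <= end:
            return index
    raise LookupError("no shard covers this hash key")


def kafka_partition_for(key, num_partitions):
    """Pick a partition by hashing the key (CRC32 is an assumption here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count, new_count):
    """List keys whose partition changes when the partition count changes."""
    return [
        k for k in keys
        if kafka_partition_for(k, old_count) != kafka_partition_for(k, new_count)
    ]
```

For example, `moved_keys([f"user-{i}" for i in range(100)], 3, 4)` lists the keys whose new records would go to a different partition after growing a topic from 3 to 4 partitions, which is one reason per-key ordering needs care when adding partitions.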