Simple Queue Service is a queuing product much like the Tuxedo product, now owned by Oracle. Together with AWS Simple Notification Service, SQS enables distributed, fault tolerate applications to be easily created.
SQS guarantees each message will be delivered at least once; the only issues is that the message might be delivered more than once. Message delivery order is not guaranteed either. By default SQS is NOT FIFO; there is, however, a FIFO SQS mode.. The max time a SQS message can be retained (
MessageRetentionPeriod) is 14 days.
Each 64kb of SQS message is billed as a message for a max of 256kb per message total. If the message payload is larger than 256kb it is best practice to store the message in ElasticCache, DynamoDB, or on S3.
For use cases that must receive messages in order OR require exactly once processing, there are newer FIFO Queue option. FIFO Queues have less throughput - 300 messages/s without batching and 3000 messages/s with batching.
Dead Letter Queue (DLQ)
If a consumer can’t process a message (likely because of an error) it goes back into the queue - there is a max time a message can go back into the queue (
maxReceiveCount) and once the message hits that limit it gets sent to the DLQ. The DLQ is the same type of queue: a FIFO queue outputs to a FIFO DLQ queue; a Standard queue outputs to a Standard DLQ queue. Messages can expire in the DLQ - so be mindful to process them!
The Redrive to Source feature enables you to push the DLQ messages back into the queue - perhaps to reprocess the messages after a hot fix to address the error that lead the messages to end up in the DLQ.
By default SQS uses short polling where the listening process hits the queue and gets a message (or not) and disconnects. Using long polling, the
ReceiveMessage call issued from the worker of the queue will wait and listen to the queue for as long as 20 seconds before it times out or retrieves one or more messages. A
ReceiveMessageWaitTimeSeconds attribute of 0 enables short polling; enable long polling by increasing the
ReceiveMessageWaitTimeSeconds attribute of the queue to a value greater then 0.
VisibilityTimeout defines how long a message is INVISIBLE to other workers after being
ReceiveMessage‘d by a worker. It is invisible so the worker who retrieved the message has the opportunity to process the message and remove it from the queue. If the worker is not successfully in processing the message, the
VisibilityTimeout then expires and the message is again available to be accessed by another worker. This ensures that if part of your application fails the message is not lost. Default visibility timeout is 30 seconds and the max is 12 hours and can be changed by the
Lambda Event Mapping
By default Lambda uses long polling with a batch size configurable from 1-10. The queue visibility timeout should be 6X the timeout of the Lambda function. Have to set up the DLQ on the SQS queue.
- Message priority? Multiple queues; more resources on high priority queue; Add
DelaySecondsto the lower priority queue
- Bigger than 256kb? More than one message chunk
- bigger than 256kb? Message with pointer to data in DynamoDB or S3
- Workflow longer than 14 days? SWF
- Decrease SQS cost? long poll
- Lots of Visibility timeout? up timeout
- Enable long poll?
- Exactly once processing? FIFO queue
- First-in, First-out delivery? FIFO queue
- Unlimited throughput? Standard queue
- IBM MQ, or other on-prem MQ? Amazon MQ