AWS - SQS

Simple Queue Service (SQS) is a message queuing product, much like Tuxedo (now owned by Oracle). Together with AWS Simple Notification Service (SNS), SQS makes it easy to build distributed, fault-tolerant applications.

SQS guarantees each message will be delivered at least once; the only issue is that a message might be delivered more than once. Message delivery order is not guaranteed either. By default SQS is NOT FIFO; there is, however, a FIFO SQS mode. The max time an SQS message can be retained (MessageRetentionPeriod) is 14 days; the default is 4 days.
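Because the same message can arrive more than once, consumers are usually written to be idempotent. A minimal sketch, assuming each message is a dict with MessageId and Body keys (the shape boto3's receive_message returns) and that an in-memory set is an acceptable stand-in for a durable store; process_once and seen_ids are hypothetical names:

```python
def process_once(message, seen_ids, handler):
    """Run handler on the message body unless its MessageId was already handled.

    seen_ids is an in-memory set (an assumption for illustration only);
    a real consumer would need a durable store shared by all workers.
    Returns True if the handler ran, False for a duplicate delivery.
    """
    message_id = message["MessageId"]
    if message_id in seen_ids:
        return False
    handler(message["Body"])
    # Record the ID only after the handler succeeds, so a failed
    # attempt can still be retried when the message is redelivered.
    seen_ids.add(message_id)
    return True
```

Calling process_once twice with the same message runs the handler only the first time.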

Each 64 KB of an SQS message is billed as one request, up to a max of 256 KB per message. If the message payload is larger than 256 KB, it is best practice to store the payload in ElastiCache, DynamoDB, or S3 and send a reference to it in the message.
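The 64 KB billing rule is simple arithmetic. A small sketch, assuming 1 KB means 1024 bytes and using the 256 KB limit from these notes; billed_chunks is a hypothetical helper, not an AWS API:

```python
CHUNK_BYTES = 64 * 1024          # one billed unit per 64 KB chunk (assumes 1 KB = 1024 bytes)
MAX_MESSAGE_BYTES = 256 * 1024   # per-message limit as stated in these notes


def billed_chunks(payload_bytes):
    """Return how many 64 KB chunks a payload of this size is billed as."""
    if payload_bytes < 1 or payload_bytes > MAX_MESSAGE_BYTES:
        raise ValueError("payload must be between 1 byte and 256 KB")
    # Ceiling division: any partial chunk counts as a whole chunk.
    return -(-payload_bytes // CHUNK_BYTES)
```

For example, a 100 KB payload spans two chunks, and a full 256 KB payload spans four.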

SQS FIFO

For use cases that must receive messages in order OR require exactly-once processing, there is a newer FIFO queue option. FIFO queues have lower throughput - by default 300 messages/s without batching and 3000 messages/s with batching.
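Sending to a FIFO queue requires a MessageGroupId (ordering is kept within a group) and, unless content-based deduplication is enabled on the queue, a MessageDeduplicationId. A minimal sketch that only builds the keyword arguments for boto3's send_message; fifo_send_params is a hypothetical helper, and deriving the deduplication ID from a SHA-256 hash of the body is an assumption chosen for illustration:

```python
import hashlib


def fifo_send_params(queue_url, body, group_id):
    """Build keyword arguments for an SQS send_message call to a FIFO queue."""
    # FIFO queue names (and therefore their URLs) end with ".fifo".
    if not queue_url.endswith(".fifo"):
        raise ValueError("not a FIFO queue URL")
    # Assumption: identical bodies should be treated as duplicates.
    dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": group_id,
        "MessageDeduplicationId": dedup_id,
    }
```

With a boto3 SQS client in hand, the result could be passed as client.send_message(**params).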

Dead Letter Queue (DLQ)

If a consumer can’t process a message (likely because of an error) it goes back into the queue - there is a max number of times a message can be received (maxReceiveCount) and once the message exceeds that limit it gets sent to the DLQ. The DLQ is the same type of queue: a FIFO queue outputs to a FIFO DLQ; a Standard queue outputs to a Standard DLQ. Messages can expire in the DLQ - so be mindful to process them!
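A DLQ is attached to a source queue through the queue's RedrivePolicy attribute, which is a JSON string. A minimal sketch that builds that attribute and checks the same-type rule; redrive_policy_attributes is a hypothetical helper, the default of 5 receives is an arbitrary assumption, and the ARNs in the usage note are placeholders:

```python
import json


def redrive_policy_attributes(source_arn, dlq_arn, max_receive_count=5):
    """Build the Attributes dict that points a source queue at its DLQ."""
    # A FIFO queue needs a FIFO DLQ; a Standard queue needs a Standard DLQ.
    if source_arn.endswith(".fifo") != dlq_arn.endswith(".fifo"):
        raise ValueError("source queue and DLQ must be the same queue type")
    if max_receive_count < 1:
        raise ValueError("maxReceiveCount must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    }
    return {"RedrivePolicy": json.dumps(policy)}
```

The result could be passed to a boto3 SQS client as client.set_queue_attributes(QueueUrl=..., Attributes=attrs), for example with arn:aws:sqs:us-east-1:123456789012:orders and arn:aws:sqs:us-east-1:123456789012:orders-dlq.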

The Redrive to Source feature enables you to push the DLQ messages back into the source queue - perhaps to reprocess the messages after a hotfix that addresses the error that led the messages to end up in the DLQ.

Polling

By default SQS uses short polling, where the listening process hits the queue, gets a message (or not), and disconnects. Using long polling, the ReceiveMessage call issued by a worker will wait and listen to the queue for as long as 20 seconds before it either retrieves one or more messages or times out. A ReceiveMessageWaitTimeSeconds attribute of 0 means short polling; enable long polling by setting the ReceiveMessageWaitTimeSeconds attribute of the queue to a value greater than 0.
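Long polling can also be requested per call with the WaitTimeSeconds parameter of ReceiveMessage, which overrides the queue attribute for that call. A minimal sketch, assuming client is a boto3 SQS client (or anything with the same receive_message method); receive_with_long_polling is a hypothetical helper:

```python
def receive_with_long_polling(client, queue_url, wait_seconds=20):
    """Issue one ReceiveMessage call and return the list of messages (possibly empty)."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("wait_seconds must be between 0 and 20")
    response = client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # up to 10 messages per call
        WaitTimeSeconds=wait_seconds,  # 0 = short polling, >0 = long polling
    )
    # When nothing arrives before the wait expires, the response
    # has no "Messages" key at all.
    return response.get("Messages", [])
```

A worker would typically call this in a loop and hand each returned message to its processing code.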

Visibility Timeout

VisibilityTimeout defines how long a message is INVISIBLE to other workers after a worker retrieves it with ReceiveMessage. It is invisible so the worker that retrieved the message has the opportunity to process it and remove it from the queue. If the worker does not successfully process (and delete) the message, the VisibilityTimeout expires and the message is again available to be received by another worker. This ensures that if part of your application fails the message is not lost. The default visibility timeout is 30 seconds and the max is 12 hours; it can be changed for an in-flight message with the ChangeMessageVisibility action.
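The receive, process, delete cycle can be sketched as follows, assuming client is a boto3 SQS client (or a stand-in with the same methods) and message is one entry from a ReceiveMessage response; handle_message and extend_visibility are hypothetical helper names:

```python
MAX_VISIBILITY_SECONDS = 12 * 60 * 60  # 12 hour maximum


def handle_message(client, queue_url, message, process):
    """Process one message; delete it only if processing succeeds."""
    try:
        process(message["Body"])
    except Exception:
        # Do nothing: when the visibility timeout expires, the message
        # becomes visible again and another worker can retry it.
        return False
    client.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return True


def extend_visibility(client, queue_url, message, seconds):
    """Ask for more processing time on a message that is already in flight."""
    if not 0 <= seconds <= MAX_VISIBILITY_SECONDS:
        raise ValueError("visibility timeout must be between 0 and 43200 seconds")
    client.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=message["ReceiptHandle"],
        VisibilityTimeout=seconds,
    )
```

The key design point is that a failed message is never deleted, so the timeout itself acts as the retry mechanism.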

Lambda Event Mapping

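Lambda polls the queue using long polling, with a configurable batch size: 1-10 for FIFO queues, while Standard queues allow larger batches (up to 10,000) when a batching window is set. The queue visibility timeout should be at least 6X the timeout of the Lambda function. The DLQ has to be set up on the SQS queue itself, not on the Lambda function.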
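A minimal sketch of a Lambda handler for an SQS batch, plus the 6X rule as arithmetic. It assumes partial batch responses (ReportBatchItemFailures) are enabled on the event source mapping, otherwise the return value is ignored; process and recommended_visibility_timeout are hypothetical names, and rejecting empty bodies is placeholder business logic:

```python
def recommended_visibility_timeout(function_timeout_seconds):
    """Queue visibility timeout suggested by the 6X rule of thumb."""
    return 6 * function_timeout_seconds


def process(body):
    # Hypothetical business logic: treat an empty body as an error.
    if not body:
        raise ValueError("empty message body")


def handler(event, context):
    """Process each SQS record and report only the ones that failed."""
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            # Only failed messages return to the queue; the rest of the
            # batch is deleted by Lambda.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

For a function with a 30 second timeout, the rule gives a queue visibility timeout of 180 seconds.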

Use Cases