As part of the application I am currently developing, I need a reliable queue that will not allow duplicate reads when popped from multiple threads or processes.
The queue in this case is an outgoing mail queue. The system needs to read the task from the queue, generate the email by substituting in the member’s details, and then send it to the mail server.
When reading the queue with only 1 process, it is easy — read the queue, process the email, delete the queue entry, repeat. Concurrency adds the problem that 2 (or more) processes could read the same email, and we really don’t want an email going to a member 2 or 3 times.
My first solution involved a single “master” thread that would read the queue and delegate the processing to worker threads. While this worked, it was complicated and error-prone. After some discussion with people in #mysql on freenode, I found what should be a suitable database-level method.
Selecting the queue entry with the FOR UPDATE modifier places the row under an exclusive lock — the same lock used when a row is being updated. Any other transaction attempting a locking read of that row will block until the first transaction commits or rolls back. (N.B. this only works with the InnoDB storage engine, and the statements must run inside a transaction — with autocommit enabled, the lock is released as soon as the SELECT completes.)
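A minimal sketch of that locking read, assuming a hypothetical mail_queue table with id, member_id, template, and status columns (the table and column names here are illustrative, not from the original post):

```sql
-- FOR UPDATE locks are only held for the duration of a transaction,
-- so start one explicitly rather than relying on autocommit.
START TRANSACTION;

-- Claim the oldest pending entry. The returned row is now under an
-- exclusive lock until this transaction commits or rolls back.
SELECT id, member_id, template
  FROM mail_queue
 WHERE status = 'pending'
 ORDER BY id
 LIMIT 1
 FOR UPDATE;
```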
The process now holds a lock on that row, so no other process can claim it. The row can then be updated to mark it as “being processed”, or deleted from the queue outright, depending on your needs.
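Continuing the sketch above with the same hypothetical mail_queue table, marking the claimed row and releasing the lock might look like this:

```sql
-- Mark the claimed row as being processed (or DELETE it instead,
-- if you don't need to keep a record of in-flight entries).
UPDATE mail_queue
   SET status = 'processing'
 WHERE id = 42;  -- the id returned by the earlier SELECT

-- Commit to release the row lock so other workers can continue.
COMMIT;
```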
With the queue entry safely claimed by the worker, it can now take as long as it needs to process — the lock only has to be held between the SELECT … FOR UPDATE and the UPDATE that marks the row. By keeping that window to a minimum, throughput should increase significantly over the original non-concurrent solution.
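Putting the steps together, one worker’s full interaction with the queue could look like the following (again using the hypothetical mail_queue table — the key point is that the email is generated and sent outside the locking transaction):

```sql
-- Claim phase: hold the lock only long enough to mark the row.
START TRANSACTION;
SELECT id, member_id, template
  FROM mail_queue
 WHERE status = 'pending'
 ORDER BY id
 LIMIT 1
 FOR UPDATE;
UPDATE mail_queue SET status = 'processing' WHERE id = 42;
COMMIT;

-- ... generate and send the email here, with no locks held ...

-- Finish phase: record completion (or simply DELETE the row).
UPDATE mail_queue SET status = 'sent' WHERE id = 42;
```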