Deprecated Behaviour

The inane, sometimes insane, ramblings from the mind of Brenton Alker.

Using a MySQL Table as a Thread-safe Queue

As part of the application I am currently developing, I need a reliable queue that will not allow duplicate reads when popped from multiple threads or processes.

The queue in this case is an outgoing mail queue. The system needs to read the task from the queue, generate the email by substituting in the member’s details, and then send it to the mail server.

When reading the queue with only one process, it is easy: read the queue, process the email, delete the queue entry, repeat. Concurrency adds the problem that two (or more) processes could read the same email, and we really don’t want an email going to a member two or three times.

My first solution involved a single “master” thread that would read the queue and delegate the processing to worker threads. While this worked, it was complicated and error-prone. After some discussion with people in #mysql on freenode, I found what should be a suitable database-level method.
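
For the examples that follow, assume a queue table along these lines (a hypothetical schema, inferred from the queries below; id, task and processing are the only columns that matter here):

CREATE TABLE queue (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    task TEXT NOT NULL,                    -- the mail-merge task to perform
    processing TINYINT NOT NULL DEFAULT 0  -- 0 = waiting, 1 = claimed
) ENGINE=InnoDB;                           -- row-level locking requires InnoDB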

Selecting the queue entry with the FOR UPDATE modifier places the row under an exclusive lock (the same lock used when a row is being updated), so any other locking read will block until the transaction holding the lock completes. N.B. this only works with the InnoDB storage engine, and the statements must be run inside a transaction; with autocommit enabled, the lock is released as soon as the SELECT finishes.

SELECT id, task FROM queue WHERE processing = 0 LIMIT 1 FOR UPDATE;

The process now holds a lock on that row, and no other worker’s locking read will return it. The entry can then be updated to mark it as “being processed”, or deleted from the queue, depending on your needs.

UPDATE queue SET processing = 1 WHERE id = :id;

With the queue entry safely belonging to the thread, it can now take as long as it needs to process. By keeping the time between the SELECT … FOR UPDATE and the UPDATE to a minimum, throughput should be increased significantly over the original non-concurrent solution.
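
Putting the pieces together, each worker’s loop looks something like the sketch below. The :id placeholder stands for the id returned by the SELECT, and the slow work of generating and sending the email happens after the COMMIT, so the row lock is only held briefly:

-- Claim one unprocessed entry; the row lock is held until COMMIT.
-- (On MySQL 8.0+, adding SKIP LOCKED would let waiting workers take
-- the next row instead of blocking on this one.)
START TRANSACTION;
SELECT id, task FROM queue WHERE processing = 0 LIMIT 1 FOR UPDATE;

-- Mark the entry as claimed, then release the lock immediately.
UPDATE queue SET processing = 1 WHERE id = :id;
COMMIT;

-- Generate and send the email here, outside the transaction,
-- taking as long as necessary without blocking the other workers.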

Set Manipulation of Files With Uniq & Sort

I often find myself operating on single-column CSV files to perform tasks such as merging member id lists or email addresses, counting unique entries, finding duplicates, etc.

By applying some set theory (remember that from math class) and some standard Linux/Unix command line tools, almost any operation I have ever wanted to achieve in this vein is possible. I have also found a post over at Linux By Examples that provides a simple list of operations and examples of how to achieve them on text files using sort and uniq.
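
For instance, treating each file as a set of lines, the basic operations look like the sketch below. The names a.txt and b.txt are illustrative, and each file is assumed to be free of internal duplicates; run them through sort -u first if they aren’t:

# Union: every entry that appears in either file, once.
sort a.txt b.txt | uniq

# Intersection: entries that appear in both files.
sort a.txt b.txt | uniq -d

# Difference (a minus b): entries that appear only in a.txt.
sort a.txt b.txt b.txt | uniq -u

# Count occurrences of each entry, most frequent first.
sort a.txt | uniq -c | sort -rn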

Manually Deliver Email to Maildir

I have, on occasion, had need to manually drop an email into a user’s inbox. For example, when an over-zealous virus scanner dumps mail into “quarantine”.

After confirming that the emails are, in fact, virus free, they can be injected into the user’s maildir with the help of safecat, which guarantees atomic copying (even from a slow standard input stream) and unique file naming.

The basic format of the command is:

safecat <tmp_dir> .Maildir/new/ < msg.eml

with the only restriction being that tmp_dir must be on the same file system as the destination directory, because the algorithm uses hard links to guarantee instant availability of the complete file.
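
For example, to re-inject a directory of quarantined messages (a sketch; the quarantine path and Maildir location are illustrative):

# Temporary directory on the same file system as the Maildir.
mkdir -p ~/Maildir/tmp-inject

# safecat writes each message to the temporary directory first,
# then hard-links the completed file into new/ with a unique name.
for msg in /var/quarantine/*.eml; do
    safecat ~/Maildir/tmp-inject ~/Maildir/new < "$msg"
done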