Is lock-free logging safe?

Published on May 30, 2013 by Jesse Storimer

This is a bit of an odd question without some context.

Recently, I saw the mono_logger project. It claims to provide "a lock-free Logger for Ruby 2.0". When I saw it I thought "Cool! I want to see what tricks they used to implement this". It looked pretty straightforward at first. When I looked closer and compared it to Logger found in Ruby's stdlib, I saw that it was the exact same, minus the mutex.

This led me to asking the question, is lock-free logging safe? Ruby probably put a mutex there for a reason. What if multiple threads try to log a message at the same time? Will their log messages be interleaved like this?

When I started looking around for the answer to this question, I realized that the overarching questions I was asking was: are calls to write(2) atomic? In other words, if two threads or two processes write to the same file at the same time, are the writes atomic? Or can they be interleaved?

Unix beard too short...

This is a deep Unix question. When I posed this question on the mono_logger bug tracker, I admitted that my Unix beard wasn't long enough to know the answer. So I turned to my favourite source of Unix lore: The Linux Programming Interface.

It had the answers I was looking for. There are a few bits of knowledge that lead to the answer to this question.

For starters, all system calls are atomic. From TLPI:

"All system calls are executed atomically. By this, we mean that the kernel guarantees that all of the steps in a system call are completed as a single operation, without being interrupted by another process of thread."

Great! So if two threads or two processes write to a file at the same time, we can be sure their data won't be interleaved. This confirms that we don't really need a mutex around the call to write(2) to ensure that it's atomic, the kernel already guarantees that.

But there's one problem. Typically when writing to a file (from C-land), you need to seek to the location in the file where you want to write, then perform the actual write to the file descriptor. Unfortunately, this involves two system calls, lseek(2) and write(2). Individually they're atomic, but as two operations that's not the case. Let me demonstrate.

As pseudo-code, this might look like:

lseek(fd, 0, SEEK_END);        // seek to the end of the file
write(fd, "log message", len); // perform the write

This isn't safe in a multi-threaded situation. Since there's no atomicity guarantees across these two system calls, the following situation is possible:

  • Thread A seeks to the end of the file
  • -- the thread scheduler decides to switch context to Thread B --
  • Thread B seeks to the end of the file
  • Thread B writes its data
  • -- thread scheduler switches context back to Thread A --
  • Thread A has already done its lseek, so it doesn't do it again, it just performs its write

d'oh, Thread A is no longer pointing to the true end of the file, it's pointing to the previous end of the file, the location where Thread B wrote its message. When Thread A performs its write, it overwrites whatever Thread B wrote.

So with this approach, it's possible to lose data.

But there's a solution!

In order for the seek and the write to happen atomically, you can set the O_APPEND flag when opening the file. Then any writes performed will atomically be appended to the file. This is exactly what we want in the case of a shared logfile. Both Logger and MonoLogger use this option when opening the logfile.

open(filename, (File::WRONLY | File::APPEND))

So MonoLogger should be able to append to the logfile atomically without a mutex.

Update: @jacknagel raised some good points on the MonoLogger issue tracker. Most notably that write(2) doesn't guarantee that all of the data you pass it will be written. It returns the number of bytes written. This is for cases where the call is interrupted by the arrival of a signal. 

Update #2: Eric Wong got in touch and confirmed that MonoLogger's behaviour is safe. His messages are posted to the usp.ruby email list.


In the github issue, I raised a point about something called PIPE_BUF_MAX. I won't say much about it here, because it's not relevant when writing to files. Essentially, a write(2) of any size is guaranteed to be atomic when the target is a file. However, when writing to a pipe, your write(2) is only guaranteed to be atomic if it's less than PIPE_BUF_MAX.

I'm interested to see what happens with MonoLogger in the wild. Anyone out there using it yet?