Passing Open Files Between Processes with UNIX Sockets

Published on May 07, 2012 by Jesse Storimer

I want to share with you today a neat little technique I learned involving UNIX sockets.

In the land of Unix everything is a file. This is faithfully mirrored in Ruby with the IO class. The IO class models any so-called files on a Unix system. This includes stuff like File, TCPSocket, UDPSocket, all of which are subclasses of IO. Since these classes all share a common parent they also share a common API, as they should.

Here's a quick example:

require 'socket'

# open 3 different IO objects
file        = File.open('/etc/passwd', 'r')
tcp_socket  = TCPSocket.new('google.com', 80)
pipe        = IO.pipe

# write some data to the tcp socket and close
# the writing stream afterwards
tcp_socket.write("GET / HTTP/1.0\r\n\r\n")
tcp_socket.close_write

# write some data to the pipe and close
# the writable IO afterwards
pipe[1].write('some test data')
pipe[1].close_write

# read from all the IO objects in the
# same way. Yay for polymorphism and
# duck typing!
[file, tcp_socket, pipe[0]].each do |io|
  puts io.read
  io.close
end

This example shows how different subclasses of IO can be treated in the same way. But each subclass also implements some methods of its own. In some cases a subclass overrides the parent implementation, and in others it can do things that its siblings can't. Enter UNIXSocket.
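
To make that concrete, here's a small sketch that asks Ruby about the class hierarchy. It only uses reflection methods from core Ruby, and the variable name unix_only is just something I picked for illustration.

```ruby
require 'socket'

# Every one of these classes descends from IO.
[File, TCPSocket, UDPSocket, UNIXSocket].each do |klass|
  puts "#{klass} descends from IO: #{klass.ancestors.include?(IO)}"
end

# Methods that UNIXSocket defines itself rather than inheriting.
unix_only = UNIXSocket.instance_methods(false)
puts "UNIXSocket defines send_io: #{unix_only.include?(:send_io)}"
puts "UNIXSocket defines recv_io: #{unix_only.include?(:recv_io)}"

# TCPSocket has no such method.
puts "TCPSocket responds to send_io: #{TCPSocket.method_defined?(:send_io)}"
```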

UNIX Sockets are Local

UNIX sockets have (almost) the exact same API as TCP sockets. The defining difference is that while TCP sockets can connect two endpoints across a network, UNIX sockets can only connect to other UNIX sockets on the same machine. As such, the kernel can take some shortcuts: UNIX sockets skip the networking layer entirely, which generally makes them faster than TCP sockets.

Since UNIX sockets are local and fast, they make a great candidate for inter-process communication (IPC). Besides sharing plain data between processes, you can also share open file descriptors between unrelated processes using UNIX sockets. The most important word in the last sentence is unrelated.
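
As a quick illustration of plain data flowing over a UNIX socket, here's a minimal sketch using UNIXSocket.pair, which hands back two sockets that are already connected to each other. For brevity both ends live in one process here; in real IPC each end would belong to a different process.

```ruby
require 'socket'

# Two UNIX sockets, already connected to each other.
left, right = UNIXSocket.pair

# Send a message one way and signal EOF on that direction.
left.write('ping')
left.close_write

# The other end reads until EOF, then answers.
message = right.read
right.write(message.reverse)
right.close

reply = left.read
left.close

puts message
puts reply
```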

Related Processes Share Open Files By Default

When I say related, I'm talking about a hierarchical relationship. One process is related to another when it has a parent/child, parent/grandchild, etc., relationship with it.

I've written previously about how open files are shared between parent/child processes when forking. Related processes share file descriptors through the semantics of fork(2). Even if the child process decides to exec(2) and become a totally different program, it can still access any shared file descriptors, provided they haven't been marked close-on-exec.
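
Here's a minimal sketch of that inheritance, assuming a platform where fork is available. A pipe is opened before forking, so the child gets its own copies of both ends without anything being passed explicitly.

```ruby
# Open the pipe before forking so that both processes have it.
reader, writer = IO.pipe

pid = fork do
  # The child inherited both ends of the pipe from its parent.
  reader.close
  writer.write("hello from child #{Process.pid}")
  writer.close
  # Leave right away without running at_exit handlers
  # inherited from the parent.
  exit!(0)
end

# The parent only needs the read end.
writer.close
message = reader.read
reader.close
Process.wait(pid)

puts message
```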

But what if you want to pass a file descriptor to a totally unrelated process? You can! Using UNIX sockets.

Sharing File Descriptors Between Unrelated Processes

Let's imagine that I have a Ruby daemon process that listens on a UNIX socket. It listens on that socket to receive open TCP connections. When it gets one it will respond in its own special way. I could distribute this daemon as a rubygem, at which point developers could write clients that interact with it by sending open connections over the UNIX socket. By 'open' I mean that the client would accept a new connection, then pass it to the daemon over a UNIX socket and the daemon would write a response and close the connection. Fun!

The daemon could look like this:

require 'socket'

# This socket listens for local connections.
listener = UNIXServer.new('/tmp/pingback.sock')
client = listener.accept

loop do
  # This is where an open connection is received.
  connection = client.recv_io

  # An exciting response. Echo!
  connection.write(connection.read)
  connection.close
end

Stick that code into a file and run it. Then it can be interacted with using a client like:

require 'socket'

# This socket listens for connections from the outside world.
listener = TCPServer.new('0.0.0.0', 4481)

# This socket connects up to the local daemon.
receiver = UNIXSocket.new('/tmp/pingback.sock')

Socket.accept_loop(listener) do |connection, _|
  # When it gets a new connection it passes it right to the
  # daemon process.
  receiver.send_io(connection)

  # Passing the connection to the daemon duplicated its file
  # descriptor, so both processes now hold the same connection
  # open. If we didn't close our copy here, the connection would
  # remain open even after the daemon closed its copy.
  connection.close
end

If you run both the daemon and the client you can test drive the two pieces using netcat:

$ echo boofar | nc localhost 4481
    boofar

The response is pretty unexciting; it's just an echo server. But realize what's happening: you're initiating a TCP connection to the client, which passes that connection to the daemon, and it's the daemon that actually writes the response back to you.
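
If you'd like to see what the "copy" made by send_io really is, here's a sketch that does the hand-off inside a single script so it's easy to run. It passes a temp file instead of a TCP connection (my substitution, to keep things self-contained) and uses sysread to bypass Ruby's own buffering. The receiver gets a different descriptor number, but both descriptors refer to the same open file, so they share one read position.

```ruby
require 'socket'
require 'tempfile'

# Tempfile.create returns a plain File, which send_io accepts.
file = Tempfile.create('fd-passing')
file.write('hello world')
file.flush
file.rewind

sender, receiver = UNIXSocket.pair
sender.send_io(file)
received = receiver.recv_io

# The received IO wraps a new descriptor number...
different_fd = file.fileno != received.fileno

# ...but it shares the read position with the original.
first  = received.sysread(5)
second = file.sysread(6)

puts different_fd
puts first
puts second

# Clean up the temp file and the sockets.
path = file.path
[file, received, sender, receiver].each(&:close)
File.unlink(path)
```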

Note that these two processes are started up independently and are unrelated. There's no forking semantics at play here.

One limitation: the daemon as written calls accept only once, so it serves only the first client that connects to its UNIX socket.
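
One way around that limitation, sketched here rather than offered as the definitive fix, is to accept in a loop and give each client its own thread. The method names handle_client and serve are mine, the socket lives in a temporary directory for the demo, and a UNIXSocket pair stands in for an accepted TCP connection. I'm also assuming that recv_io raises once the client on the other end disconnects.

```ruby
require 'socket'
require 'tmpdir'

# Echo on every connection that one client passes to us.
def handle_client(client)
  loop do
    connection = client.recv_io
    connection.write(connection.read)
    connection.close
  end
rescue SocketError, SystemCallError
  # Assumption: recv_io raises once the client has disconnected.
ensure
  client.close
end

# Accept any number of clients, one thread per client.
def serve(listener)
  loop do
    Thread.new(listener.accept) { |client| handle_client(client) }
  end
end

# Demo: two separate clients talk to the same daemon thread.
replies = Dir.mktmpdir do |dir|
  path = File.join(dir, 'pingback.sock') # demo-only path
  listener = UNIXServer.new(path)
  Thread.new { serve(listener) }

  2.times.map do |i|
    client = UNIXSocket.new(path)
    # Stand-in for a TCP connection accepted from the outside.
    ours, theirs = UNIXSocket.pair
    ours.write("client #{i}")
    ours.close_write

    client.send_io(theirs)
    theirs.close

    reply = ours.read
    ours.close
    client.close
    reply
  end
end

puts replies
```

Threads keep the sketch short. A daemon that expects many clients might prefer IO.select or a pool of worker processes instead.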

Conclusion

So the important methods here were UNIXSocket#send_io and UNIXSocket#recv_io. The first sends an open file, the second receives one.

You could use this to pass open files between any two processes on your system. And you don't need to restrict yourself to two Ruby processes. Thanks to the fact that most modern languages support UNIX sockets you could write a client to interact with the Ruby daemon in Python, or even in C. I'll leave that as an exercise for the reader :)


If you want to know more about socket programming then stay tuned. I'm working on a new book specifically about socket programming...in Ruby! Sign up for the email list to stay in touch.

