My ebook Publishing Process
Published on February 28, 2013 by Jesse Storimer
Over the past year and some, since I've been publishing ebooks, various people have asked about my publishing tools. I've given a few responses via email, or links via Twitter, but never laid everything out in one place.
In December of 2012 I published an update to one of my books. Here I'll lay out the steps (and tools) involved in writing, editing, and distributing an update to my ebooks.
Let's start with the writing.
I do all of my writing in Markdown. Its simple formatting and ubiquity make it a great fit. The fact that my toolchain is based on HTML makes Markdown an even better fit.
In terms of an editor, I spend most of my time in console vim. This makes it easy to simultaneously run example code, grab output, etc. while also writing. Lately I've also been enjoying Mou.app.
I use an open-source tool called Kitabu to handle the lion's share of the ebook conversion process. It takes my raw Markdown source text and transforms it into a book.
Kitabu provides nice defaults and conventions, so I don't have to think about where to put everything, how I'm going to generate the various formats, etc. I just focus on the content.
For example, all of the source text goes in the text/ folder. Here's mine for my sockets book:
~/projects/wwtcps/Working With TCP Sockets > tree text
text
├── 10_intro.md
├── 25_sockets.md
├── 30_establishing.md
├── 40_servers.md
├── 50_clients.md
├── 70_data.md
├── 71_read.md
├── 72_write.md
├── 73_buffering.md
├── 75_hcloud.md
├── 80_options.md
├── 81_nonblocking.md
├── 82_multiplexing.md
├── 83_nagle.md
├── 84_framing.md
├── 85_timeouts.md
├── 86_dns.md
├── 87_ssl.md
├── 88_oob.md
├── 90_arch.md
├── 91_serial.md
├── 92_ppc.md
├── 93_tpc.md
├── 94_prefork.md
├── 95_thread_pool.md
├── 96_evented.md
├── 97_hybrid.md
├── 98_closing.md
└── CHANGELOG.md

0 directories, 29 files
There's roughly a 1:1 mapping of chapter:file. The numbering of the files is to ensure that they're imported in the correct order.
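The ordering convention can be sketched in a few lines of Ruby. This is only an illustration, not Kitabu's actual code: the method name chapter_order is hypothetical, and it assumes every chapter file starts with a numeric prefix while un-numbered files such as CHANGELOG.md are handled separately. It sorts by the prefix's numeric value, which agrees with plain alphabetical order as long as all prefixes have the same number of digits.

```ruby
# Hypothetical helper (not part of Kitabu): order chapter files by their
# numeric prefix, setting aside files that have no prefix (e.g. CHANGELOG.md).
def chapter_order(filenames)
  numbered = filenames.select { |name| name.match?(/\A\d+_/) }
  numbered.sort_by { |name| name[/\A\d+/].to_i }
end

files = ["90_arch.md", "CHANGELOG.md", "25_sockets.md", "10_intro.md"]
puts chapter_order(files)
```

Leaving gaps in the numbering (10, 25, 30, ...) means a new chapter can be slotted in later without renaming its neighbours.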
When I want to generate a new version of the book, I run kitabu export. This gives me a version of the book in HTML, PDF, ePub, Mobi, and TXT formats.
When I'm working on new content, I leave kitabu export --only=pdf --auto running in the background. This re-generates just the PDF every time I save a file. I leave my Preview.app open so I can always peek at the reader-centric view of what I'm writing.
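For the curious, a rebuild-on-save loop can be approximated with simple polling. The sketch below is not how Kitabu implements --auto; the method names snapshot, changed_files, and watch are hypothetical, and it assumes the source files are the .md files directly inside one directory.

```ruby
# Hypothetical sketch of a "rebuild on save" loop; not Kitabu's implementation.

# Record the last-modified time of every Markdown file in a directory.
def snapshot(dir)
  Dir.glob(File.join(dir, "*.md")).map { |path| [path, File.mtime(path)] }.to_h
end

# List the files that are new or whose modification time differs.
def changed_files(before, after)
  after.select { |path, mtime| before[path] != mtime }.keys
end

# Poll the directory and yield the changed files whenever something was saved.
def watch(dir, interval: 1)
  previous = snapshot(dir)
  loop do
    sleep interval
    current = snapshot(dir)
    changed = changed_files(previous, current)
    yield changed unless changed.empty?
    previous = current
  end
end

# Usage (runs until interrupted):
# watch("text") { |files| system("kitabu", "export", "--only=pdf") }
```

Polling is crude compared to filesystem notifications, but it needs nothing beyond the standard library.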
Whenever I'm editing or proof-reading, I always do it from the reader's point of view. Gauging the proper flow, pace, wording, etc. is very different in my console window than in a PDF or Kindle reader.
Those are the tools that help me write the text.
All of the code lives independently of the text, in the code/ folder. This makes the code easier to maintain and test, compared to putting it directly in the source text.
Again, Kitabu helps me get that code back into the right place in the book.
Kitabu does some pre-processing of the Markdown files so that I can easily import code files. It even runs them through Pygments so they look pretty. Here's how it's done.
@@@ ruby ftp/arch/thread_pool.rb @@@

Again, two main methods here. One spawns the threads, the other encapsulates the spawning and thread behaviour. Since we're working with threads, we'll once again be using the `Connection` class.

@@@ ruby ftp/arch/thread_pool.rb:24,40 @@@
The @@@ syntax tells Kitabu it should import a file from the code/ folder. The first line imports the whole thread_pool.rb file, using the ruby lexer. The middle section is just some Markdown. The last line imports the same thread_pool.rb file, but only lines 24-40.
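To make the import directive concrete, here is a rough sketch of how such a line could be parsed and the requested lines pulled out. It is not Kitabu's actual pre-processor: the constant IMPORT_DIRECTIVE and the methods parse_import and extract are hypothetical names, and the line range is assumed to be 1-indexed and inclusive.

```ruby
# Hypothetical sketch of an "@@@ lexer path[:first,last] @@@" import directive.
# Not Kitabu's actual implementation; all names here are illustrative.
IMPORT_DIRECTIVE = /\A@@@\s+(\w+)\s+([^\s:]+)(?::(\d+),(\d+))?\s+@@@\z/

# Parse one line of source text. Returns nil unless the line is a directive.
def parse_import(line)
  match = IMPORT_DIRECTIVE.match(line.strip)
  return nil unless match

  range = match[3] ? (match[3].to_i..match[4].to_i) : nil
  { lexer: match[1], path: match[2], lines: range }
end

# Return the requested slice of the file contents (1-indexed, inclusive),
# or the whole contents when no range was given.
def extract(source, lines)
  return source unless lines

  (source.lines[(lines.first - 1)..(lines.last - 1)] || []).join
end

p parse_import("@@@ ruby ftp/arch/thread_pool.rb:24,40 @@@")
```

Any line that does not match the directive pattern is left alone, so ordinary Markdown passes through untouched.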
The whole PDF generation process is based on HTML/CSS. As a web developer by day, I'm very comfortable with this. I know there are other book formatting tools out there that are more powerful. But the fact that I could style the book using HTML/CSS meant that I could get started faster and focus on the content, rather than having to learn a new markup language.
Kitabu creates the master HTML file by converting all the Markdown files to HTML, slamming them together, and running the result through an ERB template.
Here's my ERB template that encapsulates the whole book. Some things are notable:
- Notice how I'm including a font from Google Web Fonts? That's a handy trick. Works nicely all the way through the PDF generation as well.
- There are method calls in there to things like toc, etc. These are filled in by the framework with the appropriate values.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html>
  <head>
    <title><%= title %></title>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <link href='http://fonts.googleapis.com/css?family=Poly:400,400italic' rel='stylesheet' type='text/css' />
    <link rel="stylesheet" type="text/css" href="../templates/html/layout.css" />
    <link rel="stylesheet" type="text/css" href="../templates/html/syntax.css" />
    <link rel="stylesheet" type="text/css" href="../templates/html/user.css" />
    <meta name="author" content="<%= authors.join(', ') %>" />
    <meta name="subject" content="<%= subject %>" />
    <meta name="keywords" content="<%= keywords %>" />
    <meta name="date" content="<%= published_at %>" />
  </head>
  <body>
    <div class="frontcover container">
      <img style='width:900px' src='../templates/html/cover.jpg' />
    </div>
    <div id='copyright-notice'>
      <div>
        <h2><%= title %></h2>
        <p><%= copyright %></p>
      </div>
    </div>
    <div id="chapters">
      <div class="table-of-contents">
        <h2 class="no-toc">Contents</h2>
        <div id="toc">
          <%= toc %>
        </div>
      </div>
      <div class='chapter' id='changelog'>
        <%= changelog %>
      </div>
      <%= content %>
    </div>
  </body>
</html>
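The "slam the chapters together and run them through ERB" step can be sketched with Ruby's standard erb library. This is a simplified illustration rather than Kitabu's code: the TEMPLATE constant is a stripped-down stand-in for the real template, render_book is a hypothetical method name, and the chapters are assumed to have already been converted from Markdown to HTML.

```ruby
require "erb"

# Hypothetical, stripped-down stand-in for the book template.
TEMPLATE = <<~HTML
  <html>
    <head><title><%= title %></title></head>
    <body>
      <div id="chapters">
        <%= content %>
      </div>
    </body>
  </html>
HTML

# Join already-converted chapter HTML and render it through the template.
# The local variables title and content are exposed to ERB via binding.
def render_book(title, chapter_html)
  content = chapter_html.join("\n")
  ERB.new(TEMPLATE).result(binding)
end

puts render_book("Working With TCP Sockets", ["<h1>Intro</h1>", "<h1>Sockets</h1>"])
```

The real template gets values like toc and changelog the same way: anything the framework exposes to the template can be dropped in with a <%= %> tag.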
From that HTML document, the PDF document is generated using PrinceXML. Prince is a pretty impressive tool that takes HTML documents and transforms them into PDFs.
Prince is a fairly complete HTML5/CSS3 implementation, supporting all the stuff you're used to, and a bunch of CSS features that you probably haven't used in a browser. This article describes some of the features nicely.
When I'm generating PDFs locally, I use the free development version of Prince. This puts a Prince watermark on the front page of your document. When I'm generating the finished version, I use DocRaptor to generate a watermark-less version.
The reason for using DocRaptor is simple. It costs $15/month for me to use, whereas a personal Prince license costs $495 (one time).
Note that Kitabu still manages the interaction with Prince via the kitabu export command, so I never have to use Prince directly.
ePub is the format for iPad and most other e-readers (except Kindle).
The ePub packaging is handled mostly by the eeepub gem. An ePub file is essentially a zip of some HTML and XML files. There's actually quite an array of XML files that need to be generated so that the ToC and other linking inside the ePub work properly.
Thankfully, the eeepub gem handles this, and Kitabu handles the eeepub gem, so I don't even have to worry about it.
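To give a flavour of what gets generated, here is a sketch of the two fixed files that sit at the root of every ePub archive. It is not taken from eeepub or Kitabu: epub_skeleton is a hypothetical method name, and the OEBPS/content.opf location is a common convention assumed here. The package file it points at, along with the ToC files, is the part that varies from book to book and is not shown.

```ruby
# Hypothetical sketch: the fixed files at the root of an ePub archive.
# The OEBPS/content.opf location is a common convention, assumed here.
def epub_skeleton(opf_path = "OEBPS/content.opf")
  container = <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
      <rootfiles>
        <rootfile full-path="#{opf_path}" media-type="application/oebps-package+xml"/>
      </rootfiles>
    </container>
  XML

  {
    # Must be the first entry in the zip, stored uncompressed.
    "mimetype" => "application/epub+zip",
    # Tells the reading system where to find the package (OPF) file.
    "META-INF/container.xml" => container
  }
end

epub_skeleton.each_key { |path| puts path }
```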
With ePub, there's much less control over formatting than with PDF, so it's more or less just a direct inclusion of the HTML files generated from Markdown, with no CSS applied.
MOBI is the format for Kindles.
The MOBI file format is similar in spirit to ePub: HTML and XML content packaged into a single file. For this step, Amazon provides a command-line tool called kindlegen. This little utility can take my ePub and auto-convert it to a MOBI.
Not only that, but it provides verbose output when there's some issue with the generated XML. Useful for compliance.
Up until now, I've covered all the tools that take me from raw text to a completed book in a multitude of formats, ready to ship.
Once I've got all the file formats generated, I have a rake task that builds a distribution of the book. It's nothing fancy. It copies the generated book builds, as well as the bundled source code files, to a dist/ directory and zips them up.
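The body of a task like that might look roughly like the sketch below. This is not my actual Rakefile: stage_distribution and zip_distribution are hypothetical names, the layout inside dist/ is an assumption, and the zipping step assumes the zip command-line tool is on the PATH. The demo at the bottom runs against a throwaway temp directory.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical sketch of the "build a distribution" step.
# Copies the generated book files and the code directory into dist_dir
# and returns the sorted list of staged files.
def stage_distribution(book_files, code_dir, dist_dir)
  FileUtils.rm_rf(dist_dir)   # start from a clean directory
  FileUtils.mkdir_p(dist_dir)
  FileUtils.cp(book_files, dist_dir)                      # PDF, ePub, MOBI, ...
  FileUtils.cp_r(code_dir, File.join(dist_dir, "code"))   # bundled source code
  Dir.glob(File.join(dist_dir, "**", "*")).select { |path| File.file?(path) }.sort
end

# Zip the staged directory by shelling out to `zip` (assumed to be installed).
def zip_distribution(dist_dir, zip_path)
  system("zip", "-r", "-q", File.expand_path(zip_path), File.basename(dist_dir),
         chdir: File.dirname(File.expand_path(dist_dir)))
end

# Demo in a temporary directory with placeholder files.
Dir.mktmpdir do |tmp|
  pdf = File.join(tmp, "book.pdf")
  File.write(pdf, "placeholder")
  code = File.join(tmp, "code")
  FileUtils.mkdir_p(code)
  File.write(File.join(code, "server.rb"), "puts 'hi'")
  puts stage_distribution([pdf], code, File.join(tmp, "dist"))
end
```

Wrapping these two calls in a rake task is then a one-liner, and keeping the staging step separate from the zipping makes it easy to eyeball the contents first.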
At this point I take a look at what the zip includes, make sure it all looks OK, then put it up for download.
Once I've got a .zip file ready for distribution, I have to upload it to the various places that you can get my book. This is the least automated step.
Disclaimer: I'm a Shopify employee.
My primary platform for selling is my Shopify store. I am an employee, but I also think it's been a great platform for my needs, usually providing more firepower than I need in most situations.
To be fair, I've only tried one competitor: Digital Delivery App. Ultimately their platform was a bit smoother for my workflow, but didn't offer anything that Shopify did not.
The core of Shopify is focused on selling hard goods. But there's a free add-on for digital deliveries. I upload my .zip file here and it gets automatically emailed to people who buy through the website.
My book is also available on PragProg. They don't provide an interface for uploading new releases, so I send them an email with the new release attached and they take care of it.
So that's a detailed look at my publishing process and all the various tools involved. That's how I take some raw text, format it, and get it into people's hands.
Although I've got the process down to a science by now, there certainly was a lot of experimentation along the way. For most of the tools that I use, I either maintain my own custom fork, or have contributed things upstream that I needed. The point of me sharing this is that if you want to publish something, the most important place to focus is on the content.
Tinkering with toolchains is fun (for us programmers), but you need some toolchain to start with if you have content to share, and this is the best solution that I've arrived at yet.
Want to see the finished product? Check out a free sample chapter to see what it looks like, or buy the whole book to see what the complete package is like and level up your network programming skills.
There are other related topics, like how I approach writing, or how I market my books, that I didn't cover here. Let me know if you're interested in hearing more about these.
One more bit of fun. Kitabu also has a kitabu stats command. Here's mine for the sockets book:
Chapters: 28
Words: 25207
Images: 0
Links: 27
Code blocks: 103