How Many eBooks, Ultimately?

By Glenn Fleishman February 2, 2010

The focus in consumer electronics right now is on electronic books and tablet-sized devices. Apple's iPad announcement was a big deal, but there will be a dozen or more widely available ebook readers and computer-like slates driven by average people's apparent high interest in buying electronic versions of books.

There's also the standoff between Amazon and Macmillan. Macmillan, one of the biggest publishers in the U.S., has given Amazon new terms for selling its ebooks. Despite the fact that the terms were pretty reasonable, Amazon didn't like them, and in what can only be called a fit of pique pulled all of Macmillan's ebooks and print books from sale.

In the end, Macmillan will prevail: it holds the keys to the kingdom in terms of copyright. That means that while new mass-market books in electronic form will be priced higher ($13 to $15 instead of Amazon's $10), many more books will be available across a broader range of prices.

But that raises a question: What titles can you buy or read now? How long will it take for all books to be electronically available either free or for a fee?

Let's start with the current totals. Amazon is the most forthright about what it has available. On January 28, the company said it has more than 410,000 ebook titles available for the Kindle.

Other companies have been more cagey about totals, with Sony claiming 600,000 several months ago and B&N one million. However, both firms include at least 500,000 public-domain titles scanned by Google Books. In any case, neither has a library larger than Amazon's for in-print books.

And even with hundreds of thousands of in-print books available, all the various ebook libraries combined represent no more than 15 percent of all books currently available in the United States. Also, because publishers don't love Amazon's ebook revenue terms—I don't know how they feel about B&N and Sony—not all bestselling or mid-list books are available.

Back in 1996-1997, when I worked as catalog manager at Amazon.com, we estimated that there were about two million books in print in the U.S., and we were able to list availability for only a fraction of those. Over time, Amazon improved its access to more titles, while the number of books issued each year exploded. There are likely   in-print titles today.

R.R. Bowker, the folks who produce the quasi-definitive Books In Print resource, have a neat chart that shows the number of new titles issued each year from 2002 to 2008 (first link on this page). You can see how crazy it's become if you download the PDF.

Back in the mid-90s, we figured about 200,000 books a year were being released in the U.S. The Bowker data from 2002 shows about 250,000 books; by 2008, it was 560,000. The growth has come almost entirely from print on demand (POD), "short run," and "unclassified" books.
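For a rough sense of how fast that is, here is a minimal sketch that turns the two Bowker figures quoted above into an average yearly growth rate. It assumes smooth compound growth between 2002 and 2008, which the real year-by-year numbers almost certainly don't follow, and the function name is just an illustration.

```python
def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# Approximate new-title counts quoted from the Bowker data.
titles_2002 = 250_000
titles_2008 = 560_000

# Assumption: growth is treated as steady compounding over the six years.
rate = compound_annual_growth(titles_2002, titles_2008, 2008 - 2002)

print(f"Overall growth: {titles_2008 / titles_2002:.2f}x")
print(f"Implied average annual growth: {rate:.1%}")
```

Under that assumption, the number of new titles a year more than doubled, which works out to roughly 14 percent growth a year.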

POD refers to books printed one at a time by fancy photocopying machines that have in-line binding. You can see this in action at Third Place Books in Lake Forest Park with its Espresso Book Machine.

The term "short run" refers to printings of hundreds of copies. It used to be unaffordable to produce books in batches of less than a few thousand; no more.

"Unclassified" refers to books that don't have subject categories, but it represents a small percentage of the surge.

A few publishers represent a large percentage of all in-print books, but the book tail is quite long, and the POD and short-run possibilities have opened up book publishing to an ever larger audience.

There may now be more than a million publishers in the U.S.; at the very least, there are hundreds of thousands. Most publishers have between one title and a few dozen. These tiny publishers are often formed to print one book, or are a small part of a larger business that produces books for a particular purpose.

Small press doesn't mean small sales. The statistical design genius Edward Tufte's Graphics Press, for instance, has produced exactly seven unique titles, but has sold many millions of copies. (Tufte started his own press when he couldn't find a mainstream publisher that could produce his first book in the way he wanted. Good move.)

Bookstores consider most of these books special orders. One of Amazon's early secrets to success was building a huge special orders department. Books listed with four-to-six-week availability on Amazon in 1996 and 1997 were shunted for order to one of a couple dozen people who contacted small and academic presses directly to place orders. (Remember what a pain it was to special order a book at a physical bookstore?)

Eventually, Amazon started a program (Amazon Advantage) that allowed small publishers to send stock directly to Amazon warehouses to get a faster fulfillment time.

There's a similar problem with ebooks. Getting the first 200,000 books was trivial for the various booksellers as big publishers had already ramped up electronic-book conversion programs. Getting the next couple hundred thousand was harder and took much longer. Getting to a million will be quite difficult. The second million involves working with 50,000 different publishers. The next million after that, hundreds of thousands.

Amazon has built its own self-serve ebook upload system, the Kindle Publishing Program. A couple of weeks ago, it even changed the terms for books priced $2.99 to $9.99 so that publishers get 70 percent of the price at which Amazon sells the book, rather than the 35 percent the company offered before.
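To see what that change means per copy, here is a small sketch of the arithmetic. It uses only the percentages and price band mentioned above. Two things are assumptions made for illustration: that titles priced outside the $2.99 to $9.99 band stay at 35 percent, and that no other fees or conditions apply. The function and constant names are hypothetical, not anything Amazon publishes.

```python
OLD_RATE = 0.35               # share the company offered before
NEW_RATE = 0.70               # share for titles inside the price band
BAND_LOW, BAND_HIGH = 2.99, 9.99


def publisher_payout(sale_price: float) -> float:
    """Estimate the publisher's take per copy sold.

    Assumption: prices outside the band keep the old 35 percent rate,
    and no delivery fees or other deductions are modeled.
    """
    in_band = BAND_LOW <= sale_price <= BAND_HIGH
    rate = NEW_RATE if in_band else OLD_RATE
    return round(sale_price * rate, 2)


for price in (2.99, 9.99, 14.99):
    before = round(price * OLD_RATE, 2)
    after = publisher_payout(price)
    print(f"${price:.2f}: before ${before:.2f}, after ${after:.2f}")
```

For a $9.99 title, that is the difference between about $3.50 and about $6.99 a copy, which is a strong nudge for small publishers to price inside the band.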

There's a whole other treasure trove of books, too, that fall into two categories: public domain works, and in-copyright but out-of-print titles, which are known as "orphaned" works if the publisher and author cannot be found.

Public domain titles in the United States include everything printed before 1923, as well as some titles created after that which carry a host of conditions. This Cornell University page gives you a sense of the morass of complexity for 1923-and-later titles. Google has made about 500,000 public-domain works widely available. (Amazon doesn't offer these, but B&N and Sony do, as noted earlier, and at no cost.)

There are likely hundreds of thousands, if not millions, of books released between 1923 and the last few years that are no longer in print, but remain under copyright. These orphaned works are in limbo, although a proposed settlement between publishers and authors on one side and Google on the other may make these broadly available too.

The final count of books published in the U.S. that could be available on ebook readers or via ebook software is likely more than 10 million. I mention just U.S. titles, because while copyright law has some international harmony—based on an agreement made in the late 1970s—the copyright for a work can be sold individually in every country to different parties. There could be 100 million unique works in general book form worldwide.

This count also doesn't include the likely explosion in digital-only titles that's already underway. There are likely already hundreds of thousands of ebook-only format books, but that's before there were enough channels for someone to sell and market all but certain niche categories. With many more places to sell ebooks, the number in this category will rise.

No one knows how many books are in print, but it's a very large number. The long-tail theory proposed by Chris Anderson, the editor in chief of Wired magazine, explains how making a very large catalog of anything available means that the aggregate of all the less-popular titles (the long tail of sales by volume) is still a very large number.

That means that while the low-hanging fruit of bestselling and new titles is currently emphasized, eventually most of the books that have ever been available will be in electronic form.