Thursday, July 28, 2011

My search for an Android ebook reader

[Edit: I edited the Moon+ review to take into account the latest version as of January 5, 2012. -ARP]

Every couple of weeks, my Palm TX stops working, and the only way to get it working again is to short out its battery quickly in order to clear out the RAM (disconnecting the battery would work, too, but then I'd have to unsolder it and resolder it; hmm, maybe I could solder in a tiny switch?).  So I need a new home for my large ebook collection.  It looks like the current options are iOS and Android.  But iOS development requires a Mac, plus iOS is notoriously closed.  So Android.

I acquired an Archos 43.  I don't particularly recommend the device, but it has the advantage for me of not being a phone (so I don't have to switch from my grandfathered-in phone plan that gives me unlimited Internet on my Treo at a ridiculously low rate), plus a manufacturer who has a very open attitude.  A happy thing: PDF readers for Android are an order of magnitude better than PalmPDF (though PalmPDF was a great step forward from Adobe's reader for Palm), and the higher resolution screen is quite helpful.

But what about non-PDF ebooks?  It looks like the standard format of the future is epub.  So I converted my Plucker version of Aquinas's Summa Theologica to html.  This generated 626 html files (about one per article), total size 25 MB.  These html files use very simple formatting, which should make them easily convertible.  I then converted the html to epub with Calibre.  Options: 260K segments, don't split on page breaks.  Result: a 6.9 MB epub file (the Plucker file was 5.6 MB).  This is quite large as epubs go, since most epubs are novel-length, rather than Summa-length.

What would I like in an ebook reader?
  1. Speed (I don't want to wait 30 seconds to open the Summa at a conference to look something up prior to asking a question). 
  2. Good searching at a decent speed through large texts.
  3. Multiple bookmarks/annotations.  
  4. Good use of a small screen.
  5. Scrolling rather than paging (this is a taste preference, but I think paging is a left-over from dead-tree technology;  why should one have to flip back and forth to see a difficult passage that is broken between pages--one should just be able to scroll to locate the passage conveniently).
  6. Open source.
Items (1) and (2) are essential.  Item (3) is close to essential, but I find I don't use Plucker's bookmarks/annotations quite as much as I thought I would.

I tried: Moon+, FBReader, CoolReader, Aldiko, the Nook app, StarBooks, Foliant Beta, Mantano Trial and the Kindle app.  FBReaderJ and CoolReader are open source.  I don't know about Foliant and StarBooks.  The others are closed source.  All are free or have free or trial versions, and those are what I tested.

Summary: None of the readers was 100% satisfactory for my purposes.  Moon+ and Mantano are the best choices.  Moreover, both have very responsive developers who are interested in working with me on large text issues.  If you care more about searching than opening large documents very quickly, Moon+ is the choice.  If you care more about opening large documents very quickly, Mantano is the choice.  But this may change with future versions.

My main tests were just opening the Summa and searching through it for the nonsense word "trubbli".  As background for the speed tests, the Summa opens in two or three seconds in Plucker on my Palm TX, which is underclocked to 208 MHz, and a search through the Summa takes 41 seconds.

The Archos has a 1000 MHz CPU.  The epub format is basically zipped html files, each at most 260K long, with some additional meta-data.  It should be possible to extract a single file from an epub zip just about instantly, so to open an epub file it shouldn't be necessary to do more than open some meta-data files and then load the correct html segment.  It takes the Archos 0.05 seconds to unzip a 200K file (unrepresentatively large) from my summa2.epub file using busybox's unzip.  Unzipping all of the files in the Summa takes 17 seconds on the device, and then searching them with grep takes less than two seconds.
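To make the point concrete, here is a minimal Python sketch of the two operations, assuming only that an epub is an ordinary zip archive of html segments.  The function names, the segment names and the tiny demo archive are made up for illustration; this is not how any of the readers below actually work.

```python
import io
import zipfile


def read_segment(epub, member):
    """Decompress a single member of the epub (a zip archive), nothing else."""
    with zipfile.ZipFile(epub) as archive:
        return archive.read(member).decode("utf-8", errors="replace")


def search_epub(epub, needle):
    """Brute-force search: return the html members whose raw bytes contain needle."""
    target = needle.encode("utf-8")
    hits = []
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if name.endswith((".html", ".xhtml", ".htm")):
                if target in archive.read(name):
                    hits.append(name)
    return hits


def build_demo_epub():
    """Build a tiny in-memory stand-in for an epub; the contents are invented."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("part1.html", "<p>Whether God exists?</p>")
        archive.writestr("part2.html", "<p>Whether evil is a nature?</p>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(read_segment(build_demo_epub(), "part1.html"))
    print(search_epub(build_demo_epub(), "trubbli"))
```

Opening a book only needs something like read_segment for the current segment, while a full search has to decompress every segment once, which is why the unzip time is roughly the floor for an unindexed search.  Note that this sketch searches the raw bytes, so it would also match text inside html tags.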

Moon+: This was the first reader I tried, after hearing really good things about it. This mini-review is edited as of January 5, 2012, and is of the version downloaded from the developer's site. It takes 10 seconds to load the document. That is slightly disappointing--I thought it would be like Plucker, namely almost instant.  Display options are great, scrolling is great.  I haven't tried the annotation features, but I've been told they're good.  Search took about 22 seconds, which is close to the best that one can expect given how long unzipping the epub takes.
The developer is great and responsive. For instance, the version I tried in the summer took an unacceptable 20 seconds to open the document, and the developer has worked on reducing this. Moreover, the summer's release had an annoying dialog each time you clicked on an intra-document link, but it's now gone.

Apart from the imperfect 10 second load time, Moon+ is great. You can scroll, you have a ton of display options, etc. It is the best choice right now for large ebooks as far as I can tell. And the 10 second load time is decent given some of the competition.

FBReader: This is open source, which means a lot to me as I don't expect any reader to have all the features I want, and so I expect to have to add features myself.  It took about 40 seconds to load the document, which is unacceptable.  Search time was a creditable 20 seconds.  Since the opening time was utterly unacceptable, further tests were unnecessary.

CoolReader: Another open source offering.  Took a minute to load.  Took three seconds to flip a page.  I couldn't click on any of the links.  Didn't try any more as it was not usable.

Aldiko: A pretty popular closed-source reader.  It loaded the file instantly, thereby showing that there is nothing intrinsic to the epub file structure that makes that impossible.  But the search was unacceptably slow at about 74 seconds.  Moreover, it pages rather than scrolls, which is annoying.  I didn't try any more as the search speed killed it as an option.

Nook: Annoyingly, it wants epubs in its own directory, and doesn't let you browse the file system to get to them, like other readers let you.  It also loaded the file instantly.  However, it had really annoying large margins, showing too little text per screen.  Maybe it's optimized for larger screens (the Archos has a 4.3" screen), but such large margins should be adjustable in the app, and I couldn't find an adjustment for them.  The deal-killer was the search.  After two minutes it wasn't done, and I gave up and uninstalled it.

StarBooks: After about 20 seconds of first-time importing, it loads instantly.  Page-based model.  But no search!  So, that's that.

Foliant Beta: It took a while to scan the Summa, but it cached the scan, so next time it started instantly.  Developers whose model requires the epub to be all scanned (which I am guessing is what is behind the unacceptable startup times on otherwise good apps, such as Moon+ and FBReader) should take note.  Search speed was marginal at 35 seconds, and it scrolled fine.  The killers were that I couldn't click on intra-document links, and I couldn't change the absurdly large font size.  The inability to click on links makes books like the Summa, which seem to have been written with hyperlinks in mind (wasn't St Thomas ahead of his time?), unusable.  However, it is a beta version, so it may improve.
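I don't know how Foliant implements its cache, but the general idea can be sketched in a few lines of Python: save the scan result next to a stamp of the file's size and modification time, and rescan only when the stamp changes.  Everything here (scan_epub, load_scan, the JSON cache file) is a hypothetical illustration, and scan_epub is just a cheap stand-in for whatever slow full-book scan a real reader does.

```python
import json
import os
import zipfile


def scan_epub(epub_path):
    """Stand-in for the slow step: record the uncompressed size of each segment."""
    with zipfile.ZipFile(epub_path) as archive:
        return {info.filename: info.file_size for info in archive.infolist()}


def load_scan(epub_path, cache_path):
    """Return (scan, from_cache), rescanning only if the epub has changed."""
    stat = os.stat(epub_path)
    stamp = [stat.st_size, stat.st_mtime_ns]
    try:
        with open(cache_path, encoding="utf-8") as handle:
            cached = json.load(handle)
        if cached["stamp"] == stamp:
            return cached["scan"], True
    except (OSError, ValueError, KeyError, TypeError):
        pass  # no usable cache yet; fall through to a full scan
    scan = scan_epub(epub_path)
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump({"stamp": stamp, "scan": scan}, handle)
    return scan, False
```

With something like this, only the first open of a book pays the scanning cost, and every later open just reads one small cache file.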

Mantano: Starts the Summa instantly.  Unfortunately, like many readers, it's page based rather than allowing for continuous scrolling (why pages? most page breaks are an accidental division with no semantic value).  And as with a number of other readers that started the book instantly, searching the Summa was slow, about 80 seconds.  I was using the free seven day trial.  Unlike the other apps, Mantano has no permanent free version.  However, Mantano has an extremely responsive development team that collects suggestions, and it looks like they are quite interested in working on these issues.

Kindle: Unlike the other apps, this doesn't have epub support as far as I know.  Fortunately, Calibre can convert to Mobipocket format, too, so I put a Mobipocket conversion of the Summa, based on the same html files, in the Kindle directory (like some of the other apps, it only reads books in a special directory; my beloved Plucker on PalmOS does that, so I don't complain too much).  It opened in a second or less.  However, it is page-based, with no scrolling.  More seriously, the search.  The good news is that unlike most other apps, it has a progress bar for the search.  The other apps mostly just show you a spinner and so you don't know what percentage has been searched.  The bad news is that I started the search around the time I started writing this paragraph, and it's still going.  It's now about 3/4 done, and my timer says that it took 4 minutes to get there.  It's definitely one of the slower searchers.  Since Kindle is the big name in ebooks, I'm going to let the timer run to the end.  Almost there.  Finally: "0 results for 'trubbli'."  I missed the exact time of finishing, but it was around 6 minutes.  Wow!  How did they manage to make it so slow? There is also a subpixel rendering bug, at least as of January 5, 2012.

Conclusions: I have yet to find anything that works as well for my purposes as Plucker on my 5+ year old Palm TX, though the faster search speed of Moon+ almost compensates for the slower document opening speed.  Perhaps I should try readers that use other formats than epub, like Mobi.  Or perhaps I should just take CoolReader or FBReader and bang on the source until it does what I want it to do.  Or perhaps I should just wait until something comes along that is compellingly better than Plucker on Palm.

Or maybe I should just keep on downloading epub apps.  I'll post comments with test results on other epub readers or update the post.

14 comments:

David Parker said...

I've used a Kindle for several years. The pdf reader is pretty bad for texts with large margins, but you can crop them on your pc before emailing the pdf to the kindle.

Kindle supports mobi natively and (supposedly) ePub with conversion.

You can also view GoogleReader with a WiFi or 3G connection, which was a big plus for me.

Probably not as portable as you'd like though.

Alexander R Pruss said...

Converting is a nuisance, and requires you to be near a computer or something like that. And the Kindle is too large to keep comfortably in my pocket. Plus, I like backlit screens.

How hard can it be to make a good ebook reader app when you have the sort of resources that Amazon or Barnes and Noble have? I suppose I am in a small minority of users who have large texts. But I think this small minority of users is probably overrepresented among techie types who buy gadgets. Lawyers, doctors and humanities academics all need very large reference works.

My comparing to Plucker is not exactly fair, since I worked as part of the Plucker project for several years to get it doing all the things I wanted it to do. But when I want a reader that opens a large book instantly and searches at the full speed of the device (namely, 20 seconds for the Summa, since that's how long "unzip /storage/sdcard/Summa.epub" followed by "grep trubbli *.html" takes), I don't think I am asking for anything too esoteric.

Out of curiosity, I'd like to know how long your Kindle would take to load and search for "trubbli" in my Mobipocket build of the Summa. Email me at arpruss@gmail.com if you'd like to help, and I'll send you the file.

Alexander R Pruss said...

Let me add: one could do the search much faster than 20 seconds if one indexed. But that increases storage space, and initial load times. I am not asking for that. (I kept on meaning to implement this with Plucker, but never got around to it. The algorithms were too complicated for this amateur.) The 20 second search is adequate.
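For the curious, the simplest form of indexing is an inverted index: a table from each word to the segments that contain it.  Here is a minimal Python sketch, with made-up segment names and a deliberately crude tokenizer (lowercase a-z words only, tags stripped with a regex).  It is an illustration of the idea, not something Plucker or any of the readers above is known to do.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z]+")
TAG = re.compile(r"<[^>]+>")


def build_index(segments):
    """Map each lowercase word to the sorted names of the segments containing it."""
    index = defaultdict(set)
    for name, html in segments.items():
        text = TAG.sub(" ", html).lower()  # crude tag stripping, fine for simple html
        for word in WORD.findall(text):
            index[word].add(name)
    return {word: sorted(names) for word, names in index.items()}


def lookup(index, word):
    """Return the segments containing word; a miss costs one dictionary lookup."""
    return index.get(word.lower(), [])


if __name__ == "__main__":
    demo = {
        "q2.html": "<p>Whether God exists?</p>",
        "q48.html": "<p>Whether evil is a nature?</p>",
    }
    print(lookup(build_index(demo), "evil"))
```

The trade-off is the one mentioned above: building the index means reading the whole book once, and the index has to be stored somewhere, but afterwards a search for a word that isn't there returns immediately.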

David Parker said...

Total time between emailing the attachment to my Kindle and seeing it pop up on the home screen was about 5 minutes.

Opened the Summa and the first page displayed < 1 second.

The search for trubbli gave me "your search can not be completed as this item has not been indexed. Please try again later."

Ba! Apparently you have to wait 5-10 minutes for the document to be indexed before you can search it.

Got "search not found" in < 1 second once indexing finished of course. Tried a search for "evil" and returned 600 pages of search results in < 1 second.

This edition of the Summa is very nice with links back to the Index and next sections and such!

Alexander R Pruss said...

Nice!

David Parker said...

One quick addition. Text to speech conversion also works with this document. You get a male and female voice (but no British accent!) and three speed settings.

Alexander R Pruss said...

Wordoholic: Failed to load the Summa epub at all: just gave an error message.

Captionary: Loaded in about five seconds, but links don't work and no search.

Jarrett Cooper said...

Prof. Pruss,

There is one good thing about the usage of pages. For example, when reading scholarly work you will see superscripts used with the author telling you to look up pages (X-Y) in whichever book for more detail.

But yeah, I agree with you that scrolling is better, especially for the more difficult passages.

Alexander R Pruss said...

It's good to have small page numbers in brackets marking where the pages begin in some scholarly edition, yes. But the page-breaks on the screen rarely correspond to those.

Alexander R Pruss said...

Updates:

Just tried Laputa (the link is to the pro version, but I think there should be a place there to download the trial that I used). Search speed is 17 seconds, the best so far. However, it took an unacceptable 36 seconds to open the file.

Looks pretty, but I found it hard to use because of all the prettiness.

For some reason, Laputa has been pulled from Android Market (the link is to PDAssi.de).

Alexander R Pruss said...

I also tried generating a .chm file and using Moon+ with it. Opening times were instant. Yay! But unfortunately the search was unusable for me: in .chm files, Moon+ only searches the current segment.

Alexander R Pruss said...

AReader: Looks like an updated version of Laputa, but nicer. Test file opens snappily, and works nicely. Search is a bit slow, about two minutes (maybe three--I lost track of the minute hand due to multitasking).

Alexander R Pruss said...

Update: The latest pre-release version of Moon+ Reader opens my test file in 10 seconds, which is twice as fast as the previous versions. Moon+ is starting to be quite usable.