Summary

TL;DR: I'm considering replacing the various Calibre components with...

ebook-viewer: using a Kobo or other ebook reader, possibly Atril or MuPDF on the desktop?
ebook-editor: Sigil.
collection browser: Liber? see also bookmarks
device synchronisation: git-annex?
RSS reader: feed2exec, wallabako
ebook web server: Liber?

See below for the reasons why, and for a deeper discussion of all the features.

Problems with Calibre

Calibre is an amazing piece of software: it allows users to manage ebooks on their desktop and on a multitude of ebook readers. It's used by Linux geeks as well as Windows power-users and vastly surpasses any native app shipped by ebook manufacturers. I know almost exactly zero people who have an ebook reader and do not use Calibre.

However, it has had many problems over the years:

Calibre is a complex piece of machinery, and it's therefore buggy. It manages to simultaneously ship with embedded libraries (Debian bug #872595, Debian bug #704977, Debian bug #684229, Debian bug #555352, Debian bug #555368, Debian bug #700838, most fixed in Debian) and also suffer from the NIH syndrome. For example, it implements its own web framework instead of reusing stuff like requests or flask.
There are numerous security issues in Calibre. For example, it can execute arbitrary code while fetching news (Debian bug #873795) or plugin updates (Debian bug #640026), it would phone home (Debian bug #584334, fixed in Debian), and it has had arbitrary file access via crafted files (Debian bug #853004, Debian bug #608822), arbitrary code execution in bookmark data (Debian bug #892242), an XSS vulnerability (Debian bug #608822), and even insecure embedded libraries (Debian bug #873660, Debian bug #787085). Some of those issues have been fixed upstream but, in my experience, it's clear that upstream does not take security seriously. The best example is probably the legendary security bug about how Calibre handled mounting partitions, which upstream refused to fix properly even after an LWN article came out about it.
No support for Python 3: because of this, Calibre was removed from Debian in 2019 (Debian bug #936270). There is now a port in progress, but the author infamously claimed it wasn't necessary to port to Python 3 because he could maintain Python 2 himself.

The latest issue (lack of Python 3 support) is the last straw for me. While Calibre is an awesome piece of software, I can't help but think it's doing too much, and in the wrong way. It's one of those tools that looks amazing on the surface, but when you look underneath, it's a monster that is impossible to maintain, a liability that is just bound to cause more problems in the future.

What does Calibre do anyways

So let's say I wanted to get rid of Calibre, what would that mean exactly? What do I actually use Calibre for anyways?

Calibre is...

an ebook viewer: Calibre ships with the ebook-viewer command, which allows one to browse a vast variety of ebook formats. I rarely use this feature, since I read my ebooks on an e-reader, on purpose. There is, besides, a good variety of ebook readers, on different platforms, that can replace Calibre here:
- Atril, MATE's version of Evince, supports ePUBs (Evince doesn't)
- MuPDF also reads ePUBs without problems and is really fast
- fbreader also supports ePUBs, but is much slower than all those others
- Emacs (of course) supports ebooks through nov.el
- Okular apparently supports ePUBs, but I must be missing a library because it doesn't actually work here
- coolreader is another alternative, not yet in Debian (#715470)
- lucidor also looks interesting, but is not packaged in Debian either (although upstream provides a .deb)
- koreader and plato are good alternatives for the Kobo reader (although koreader also now has builds for Debian)
an ebook editor: Calibre also ships with an ebook-edit command, which allows you to do all sorts of nasty things to your ebooks. I have rarely used this tool, having found it hard to use and not giving me the results I needed, in my use case (which was to reformat ePUBs before publication). For this purpose, Sigil is a much better option, now packaged in Debian. There are also various tools that render to ePUB: I often use the Sphinx documentation system for that purpose, and have been able to produce ePUBs from LaTeX for some projects.
a file converter: Calibre can convert between many ebook formats, to accommodate the various readers. In my experience, this doesn't work very well: the layout is often broken and I have found it's much better to find pristine copies of ePUB books than fight with the converter. There are, unfortunately, very few alternatives to this functionality.
a collection browser: this is the main functionality I would miss from Calibre. I am constantly adding books to my library, and Calibre does have this incredibly nice functionality of just hitting "add book" and having it Just Do The Right Thing™ after that. Specifically, what I like is that it:
- lets me sort, view, and search books in folders, per author, date, editor, etc
- has an especially powerful quick search
- allows downloading and editing metadata (like covers) easily
- tracks read/unread status (although that's a custom field I had to add)
Calibre is, as far as I know, the only tool that goes so deep in solving that problem. The Liber web server, however, does provide similar search and metadata functionality. It also supports migrating from an existing Calibre database as it can read the Calibre metadata stores.

This also connects with the more general "book inventory" problem I have, which involves an inventory of physical books and a directory of online articles. See also firefox (Zotero section) and bookmarks for a longer discussion of that problem.
a device synchronization tool: I mostly use Calibre to synchronize books with an ebook reader. It can also automatically update the database on the ebook reader with relevant metadata (e.g. collection or "shelves"), although I do not really use that feature. I do like to use Calibre to quickly search and prune books from my ebook reader, however. I might be able to use git-annex for this, given that I already use it to synchronize and back up my ebook collection in the first place...
an RSS reader: I used this for a while to read RSS feeds on my ebook reader, but it was pretty clunky. Calibre would be continuously generating new ebooks based on those feeds and I would never read them, because I would never find the time to transfer them to my ebook reader in the first place. Instead, I use a regular RSS feed reader (I ended up writing my own, feed2exec) and when I find an article I like, I add it to Wallabag, which gets sync'd to my reader using wallabako, another tool I wrote.
an ebook web server: Calibre can also act as a web server, presenting your entire ebook collection as a website. It also supports acting as an OPDS directory, which is kind of neat. There are, as far as I know, no alternatives for such a system, although there are servers to share and store ebooks, like Trantor or Liber.
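
The collection browser item above mentions that other tools can read the Calibre metadata stores. As a rough sketch of what that can look like, the following Python assumes the usual Calibre library layout, where each book folder holds a metadata.opf file with Dublin Core fields. The function names are made up for illustration, and this is not a description of how Liber is actually implemented.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

# Dublin Core namespace used for title/creator fields in OPF files.
DC = "{http://purl.org/dc/elements/1.1/}"


def read_opf(path):
    """Return (title, [authors]) from one metadata.opf file."""
    root = ET.parse(str(path)).getroot()
    title = root.findtext(f".//{DC}title", default="(untitled)")
    authors = [el.text for el in root.iter(f"{DC}creator") if el.text]
    return title, authors


def index_by_author(library):
    """Map each author to the sorted titles found under `library`."""
    index = defaultdict(list)
    for opf in Path(library).rglob("metadata.opf"):
        title, authors = read_opf(opf)
        # Books without a creator field are filed under a placeholder.
        for author in authors or ["(unknown)"]:
            index[author].append(title)
    return {author: sorted(titles) for author, titles in index.items()}
```

Calling index_by_author() on the library directory gives a plain dictionary that a simple "browse per author" view could be built on, without starting Calibre at all.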

Note that I might have forgotten functionality in Calibre in the above list: I'm only listing the things I have used or am using on a regular basis. For example, you can have a USB stick with Calibre on it to carry the actual software, along with the book library, around on different computers, but I never used that feature.

So there you go. It's a colossal task! And while it's great that Calibre does all those things, I can't help but think that it would be better if Calibre was split up into multiple components, each maintained separately. I would love to use only the document converter, for example. It's possible to do that on the commandline, but it still means I have the entire Calibre package installed.
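
To give an idea of what using only the converter looks like, here is a minimal sketch that drives Calibre's ebook-convert command from Python. It relies only on the basic "ebook-convert input output" form, where the output format is picked from the file extension. The function names and the default target format are my own choices for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def convert_command(source, target_format="epub"):
    """Build the ebook-convert command line for one file."""
    source = Path(source)
    # The output format is inferred by ebook-convert from the extension.
    target = source.with_suffix("." + target_format)
    return ["ebook-convert", str(source), str(target)]


def convert(source, target_format="epub"):
    """Run the conversion; raises if ebook-convert is missing or fails."""
    if shutil.which("ebook-convert") is None:
        raise FileNotFoundError("ebook-convert not found; is Calibre installed?")
    subprocess.run(convert_command(source, target_format), check=True)
```

Note that this still needs the whole Calibre package on the machine, which is exactly the problem: the wrapper is tiny, the dependency is not.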

Maybe a simple solution, from Debian's point of view, would be to split the package into multiple components, with the GUI and web servers packaged separately from the commandline converter. This way I would be able to install only the parts of Calibre I need and have limited exposure to other security issues. It would also make it easier to run Calibre headless, in a virtual machine or remote server for extra isolation, for example.

from Planet Python

Daily Python

Sunday, October 6, 2019

Anarcat: Calibre replacement considerations
