Linux

Compressing epub in Linux CLI


Most epub files are not optimized for small file sizes because publishers expect you to download them once to a device. However, if you use something like calibre-web, which streams the file to the reader every time you open a book, you will wish they had used better compression.

Epubs are actually just zip files containing HTML text and images. The format has some very specific restrictions, such as the mimetype file having to be the first entry in the archive and stored uncompressed, but opening the files to recompress the images isn’t very difficult.
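Since an epub is a regular zip archive, the standard tools can look inside it before you change anything. Here is a minimal sketch, assuming unzip is installed; the function name epub_list_images is made up for this example.

```shell
# Hypothetical helper: list the image files stored inside an epub.
epub_list_images() {
    # -Z1 switches unzip to zipinfo mode and prints one member name per line
    unzip -Z1 "$1" | grep -iE '\.(png|jpe?g|gif)$'
}
```

Calling `epub_list_images book.epub` (book.epub is a placeholder) gives a quick idea of how many images there are and which formats the book uses.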

I also found that a lot of epubs use the PNG format for their images, and that is a mixed bag for improvement. PNG is a lossless format that is great for transparency and for keeping text readable, but it is not the best for file size. Converting to JPG would give huge savings, but updating the HTML to reflect the new file extension is annoying. Most browsers actually don’t care if you send a JPG under the name of a PNG, but in this first example we will just take the normal PNG compression route. For low-color images, PNG has an 8-bit paletted mode that can outperform JPG in both quality and file size.
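To see whether an image is already in paletted mode, you can read the color type straight from the PNG header. This is only a sketch: the name png_color_type is invented here, and the function does not check that the file really is a PNG.

```shell
# Hypothetical helper: print the color type stored in a PNG header.
# 0 = grayscale, 2 = truecolor, 3 = palette, 4 = gray + alpha, 6 = truecolor + alpha
png_color_type() {
    # Byte 25 (counting from 0) of a PNG file is the color type field of the IHDR chunk.
    od -An -tu1 -j25 -N1 "$1" | tr -d ' \n'
}
```

For example, `[ "$(png_color_type cover.png)" = 3 ] && echo "already paletted"` (cover.png is a placeholder) tells you that palette conversion has nothing left to gain on that file.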

vi ~/bin/compressepub

#!/bin/bash
echo 'starting epub compression...'

rm -rf ./epub #delete the folder if it already exists
unzip "$1" -d epub

# process images (only inside the extracted epub folder)
# nullglob makes the loop a no-op when the book has no PNG files
shopt -s globstar nullglob
for i in epub/**/*.png; do
    # 1) pngpal is a lossless compressor that works best if it can go to 8-bit
    #pngpal "${i}"
    # 2) pngquant is a lossy png compressor that saves more bytes
    pngquant --skip-if-larger --output "${i}" --force "${i}"
done

# rezip the epub
cd epub
zip -0 -X "../compress.epub" mimetype
zip -rDX9 "../compress.epub" * -x mimetype
cd ..

echo '-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-'

rm -rf ./epub # remove our work directory

mv "$1" "${1}.original"
mv compress.epub "$1"

du -h "${1}.original"
du -h "$1"

echo '-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-'

Save the script, make it executable, and run it on a book (this assumes ~/bin is on your PATH):

chmod +x ~/bin/compressepub
compressepub <filename>
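
The two zip calls in the script exist because of the epub restriction on the mimetype entry: it has to come first and be stored without compression or extra fields. A rough way to check the result, relying on the fact that a zip local file header is 30 bytes long; the name epub_mimetype_ok is made up for this sketch.

```shell
# Hypothetical helper: succeed only if the mimetype entry is first and stored uncompressed.
epub_mimetype_ok() {
    # In a valid epub, bytes 31-58 are the entry name "mimetype"
    # followed directly by its raw, uncompressed content.
    [ "$(head -c 58 "$1" | tail -c 28 | tr -d '\0')" = "mimetypeapplication/epub+zip" ]
}
```

Running `epub_mimetype_ok book.epub && echo "mimetype looks fine"` after compressing is a cheap sanity check. It is not a full validation of the book.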

Better Compression using webp image format

This builds on the trick mentioned above: your browser doesn’t care if a file is named .png but actually contains a different format, as long as it knows how to render it. So let’s go all out and use a next-gen image format like webp. Make sure the webp CLI tools, which provide the cwebp encoder, are installed on the machine that will do the compression.

sudo apt install webp
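
If you are not sure whether everything is installed, a small guard can stop the script before it touches any files. This is only a sketch, and the function name require_tools is invented for this example.

```shell
# Hypothetical helper: stop early if a required command is missing.
require_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            return 1
        fi
    done
}
```

A line such as `require_tools unzip zip cwebp || exit 1` near the top of a compression script would then bail out with a readable message instead of failing halfway through.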

Taking a large book full of screenshots and example images, at a huge 112MB, and re-encoding all the PNGs as webp got it down to 13MB.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
112M Unity 2018 Shaders and Effects Cookbook – John P. Doran.epub.original
13M Unity 2018 Shaders and Effects Cookbook – John P. Doran.epub
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
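
Going by the rounded du figures above, that is (112 - 13) / 112, or roughly 88% smaller. To get the number from the exact byte counts, something like this works; percent_saved is a made-up name and the sketch assumes the original file is not empty.

```shell
# Hypothetical helper: print how many percent smaller the second file is than the first.
percent_saved() {
    orig=$(wc -c < "$1")
    new=$(wc -c < "$2")
    # integer arithmetic, so the result is rounded down
    echo $(( (orig - new) * 100 / orig ))
}
```

Usage would be `percent_saved "book.epub.original" "book.epub"`, with the file names standing in for your own book.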

Our compression script looks a little different this time around:

#!/bin/bash
echo 'compressing with webp...'

rm -rf ./epub #delete the folder if it already exists
unzip "$1" -d epub

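# find all images (only inside the extracted epub folder)
# nullglob skips extensions that do not occur in the book
shopt -s globstar nullglob
for i in epub/**/*.{png,jpg,jpeg}; do
    # encode to a temp file first so a failed encode never clobbers the source image;
    # the result keeps the original file name, so the HTML needs no changes
    cwebp -short "${i}" -o "${i}.tmp" && mv "${i}.tmp" "${i}" || rm -f "${i}.tmp"
done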

# zip epub
cd epub
zip -0 -X "../compress.epub" mimetype
zip -rDX9 "../compress.epub" * -x mimetype
cd ..

rm -rf ./epub # remove work directory

# move the compressed one and rename the other to original
mv "$1" "${1}.original"
mv compress.epub "$1"

echo '-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-'
du -h "${1}.original"
du -h "$1"
echo '-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-'
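
Because the converted images keep their old .png and .jpg names, the extension no longer tells you what is inside a file. A small sketch that reads the magic bytes instead; sniff_image_type is a made-up name and it only knows the three formats discussed here.

```shell
# Hypothetical helper: report the real image format from the file's magic bytes.
sniff_image_type() {
    # first 12 bytes of the file as one lowercase hex string
    hex=$(od -An -tx1 -N12 "$1" | tr -d ' \n')
    case "$hex" in
        89504e470d0a1a0a*)          echo png ;;
        ffd8ff*)                    echo jpeg ;;
        # "RIFF", four size bytes, then "WEBP"
        52494646????????57454250*)  echo webp ;;
        *)                          echo unknown ;;
    esac
}
```

Unzip a compressed book into a scratch folder and run the function on one of its images to confirm that the conversion actually happened.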

Downside to webp

Most current browsers support it (Chrome, Firefox, Edge, etc.). The odd one out is Safari, which only handles webp from version 14 onward. If you open these epubs on older e-reader devices, they probably won’t like webp either, so you are stuck with the PNG compression at the top.


I'm a 35 year old UIUC Computer Engineer building mobile apps, websites and hardware integrations with an interest in 3D printing, biotechnology and Arduinos.

