Friday, August 14, 2015

Extract images embedded in a PDF

Trick source

You need to install poppler-utils.
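As a quick sanity check before going further, the sketch below tests whether the pdfimages command is on the PATH. The function name need_pdfimages is made up for this post, and the install hint in the message assumes a Debian or Ubuntu system, where the package is called poppler-utils; other distributions may name it differently.

```shell
# Hypothetical helper (not part of poppler): report whether pdfimages is available.
need_pdfimages() {
    if command -v pdfimages >/dev/null 2>&1; then
        echo "pdfimages found"
    else
        # Assumption: Debian/Ubuntu package name; adjust for other systems.
        echo "pdfimages missing: try 'sudo apt-get install poppler-utils'"
        return 1
    fi
}
```

Calling need_pdfimages prints one of the two messages and returns a non-zero status when the tool is missing, so it can guard a longer script.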

Use pdfimages, a PDF image extractor that saves the images in a PDF file as PPM, PBM, or JPEG files.

Usage: pdfimages [options] PDF-file image-root

Example: Save images in JPEG format


$ pdfimages -j in.pdf /tmp/out

This saves the images from in.pdf as /tmp/out-000.jpg (or /tmp/out-000.pbm; see below), /tmp/out-001.jpg, etc.
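If you have several PDFs, the same command can be wrapped in a small loop. The following is only a sketch: the function names image_root_for and extract_all are invented for this post, and it assumes each PDF should get its own output prefix named after the file.

```shell
# Hypothetical helper: build the image-root prefix for one PDF.
# Example: image_root_for /docs/report.pdf /tmp/out  prints  /tmp/out/report
image_root_for() {
    pdf=$1
    outdir=$2
    name=$(basename "$pdf" .pdf)   # strip the directory and the .pdf suffix
    printf '%s/%s\n' "$outdir" "$name"
}

# Hypothetical helper: run "pdfimages -j" on every PDF passed after the output directory.
extract_all() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for pdf in "$@"; do
        pdfimages -j "$pdf" "$(image_root_for "$pdf" "$outdir")" || return 1
    done
}
```

For example, extract_all /tmp/out a.pdf b.pdf should write files such as /tmp/out/a-000.jpg and /tmp/out/b-000.jpg, following the naming described above.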

From the pdfimages man page:

-j: Normally, all images are written as PBM (for monochrome images) or PPM (for non-monochrome images) files. With this option, images in DCT format are saved as JPEG files. All non-DCT images are saved in PBM/PPM format as usual.
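Because -j only affects DCT images, one extraction can leave a mix of .jpg, .pbm, and .ppm files behind. To see what you actually got, something like the sketch below counts the files per extension. The name count_by_ext is hypothetical, and it assumes the files follow the image-root-NNN.ext naming shown in the example.

```shell
# Hypothetical helper: count extracted files per extension for a given image-root.
count_by_ext() {
    root=$1
    for ext in jpg pbm ppm; do
        n=0
        for f in "$root"-*."$ext"; do
            # An unmatched glob stays a literal pattern, so check the file exists.
            if [ -e "$f" ]; then
                n=$((n + 1))
            fi
        done
        printf '%s %s\n' "$ext" "$n"
    done
}
```

Running count_by_ext /tmp/out after the example command prints one line per extension with the number of matching files.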

EDIT: Note that this trick only extracts raster images! Vector graphics in the PDF are not extracted.
