pdfreader.readthedocs.io - pdfreader 0.1.13dev documentation

Description: pdfreader - Python API to parse PDF documents and extract text, images, and other objects.

pdfreader is a Pythonic API to PDF documents that follows the PDF-1.7 specification.

It lets you parse documents; extract text, images, fonts, CMaps, and other data; and access individual objects within PDF documents.
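As a rough illustration of the text-extraction use case, the sketch below reads the strings from the first page of a file. It assumes the `SimplePDFViewer` class, its `render()` method, and the `canvas.strings` attribute behave as described in the pdfreader tutorial for this version; the helper name `first_page_strings` and the file name in the usage note are hypothetical.

```python
def first_page_strings(pdf_path):
    """Return the text strings found on the first page of a PDF file."""
    # Imported inside the function so the sketch loads even where
    # pdfreader is not installed (assumption: pip package "pdfreader").
    from pdfreader import SimplePDFViewer

    with open(pdf_path, "rb") as fd:
        viewer = SimplePDFViewer(fd)  # assumed to start on page 1
        viewer.render()               # interpret the page's content stream
        # canvas.strings is assumed to hold the decoded text strings
        return list(viewer.canvas.strings)
```

Calling `first_page_strings("example.pdf")` with a hypothetical local file should return a list of strings; joining them with `"".join(...)` gives an approximation of the page text.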

If you’re having trouble, have questions about pdfreader, or need a feature, the best place to ask is the GitHub issue tracker. Once you get an answer, it would be great if you could work it back into this documentation and contribute!
