
Is there a way to read a PDF file with Python?
I couldn't find anything in the forum or on Google :/
I'd appreciate any help.

SigMA
Code:
$ pdfinfo CoArray_Python.pdf
Title: LNCS 3149 - Co-array Python: A Parallel Extension to the Python Language
Subject: Euro-Par 2004 Parallel Processing
Author: Craig E. Rasmussen, Matthew J. Sottile, Jarek Nieplocha, Robert W. Numrich, and Eric Jones
Producer: Acrobat Distiller 5.0.5 (Windows)
CreationDate: Mon Dec 20 11:10:08 2004
ModDate: Wed Dec 22 12:08:51 2004
Tagged: no
Pages: 6
Encrypted: no
Page size: 439 x 666 pts
File size: 108647 bytes
Optimized: yes
PDF version: 1.4
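If you want that information inside Python, one option is to call pdfinfo as a subprocess and split its "Key: value" lines into a dict. This is only a rough sketch: the helper names `parse_pdfinfo` and `pdfinfo` are made up for illustration, and it assumes the pdfinfo tool is installed and on the PATH. The sample text is copied from the output above, so the parsing part can be tried without a PDF.

```python
import subprocess


def parse_pdfinfo(text):
    """Turn pdfinfo's 'Key: value' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; titles and dates contain colons too.
        key, sep, value = line.partition(':')
        if sep:
            info[key.strip()] = value.strip()
    return info


def pdfinfo(path):
    """Run the pdfinfo command-line tool on `path` and parse its output.

    Assumes pdfinfo is installed; raises CalledProcessError if it fails.
    """
    output = subprocess.check_output(['pdfinfo', path])
    return parse_pdfinfo(output.decode('utf-8', 'replace'))


# Sample lines taken from the pdfinfo output shown above.
sample = """Title: LNCS 3149 - Co-array Python: A Parallel Extension to the Python Language
Pages: 6
Encrypted: no
CreationDate: Mon Dec 20 11:10:08 2004"""

info = parse_pdfinfo(sample)
print(info['Title'])
print(int(info['Pages']))
```

With a real file you would call `pdfinfo('CoArray_Python.pdf')` instead of parsing the sample string. All values come back as strings, so numeric fields such as Pages need converting yourself.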
Code:
$ pdftk CoArray_Python.pdf dump_data
InfoKey: Title
InfoValue: LNCS 3149 - Co-array Python: A Parallel Extension to the Python Language
InfoKey: Producer
InfoValue: Acrobat Distiller 5.0.5 (Windows)
InfoKey: Author
InfoValue: Craig E. Rasmussen, Matthew J. Sottile, Jarek Nieplocha, Robert W. Numrich, and Eric Jones
InfoKey: Subject
InfoValue: Euro-Par 2004 Parallel Processing
InfoKey: ModDate
InfoValue: D:20041222120851+01'00'
InfoKey: CreationDate
InfoValue: D:20041220111008Z
PdfID0: 99516261b52c495aa6db34acbe4e3fdf
PdfID1: abfaca3213d5bbbe56c2ae66029b71b
NumberOfPages: 6
BookmarkTitle: 1 Introduction
BookmarkLevel: 1
BookmarkPageNumber: 1
BookmarkTitle: 2 Co-array Python Implementation
BookmarkLevel: 1
BookmarkPageNumber: 2
BookmarkTitle: 2.1 Implementation Details
BookmarkLevel: 2
BookmarkPageNumber: 3
BookmarkTitle: 3 Co-array Python Example
BookmarkLevel: 1
BookmarkPageNumber: 3
BookmarkTitle: 4 Future Work
BookmarkLevel: 1
BookmarkPageNumber: 5
BookmarkTitle: Acknowledgments
BookmarkLevel: 1
BookmarkPageNumber: 5
BookmarkTitle: References
BookmarkLevel: 1
BookmarkPageNumber: 6
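The dump_data format is line-oriented as well, so it can be picked apart in Python without much effort. The following is a sketch under a few assumptions: `parse_dump_data` and `parse_pdf_date` are hypothetical helper names, only the InfoKey/InfoValue and Bookmark* lines are handled, and the date helper reads just the `D:YYYYMMDDHHMMSS` part and ignores the timezone suffix. The sample is a shortened copy of the output above.

```python
from datetime import datetime


def parse_dump_data(text):
    """Parse pdftk dump_data text into (info dict, list of bookmark dicts)."""
    info, bookmarks = {}, []
    pending_key = None
    # Map pdftk's bookmark field names to shorter dict keys.
    bookmark_fields = {'BookmarkLevel': 'level', 'BookmarkPageNumber': 'page'}
    for line in text.splitlines():
        key, sep, value = line.partition(': ')
        if not sep:
            continue
        if key == 'InfoKey':
            # The value for this key arrives on the following InfoValue line.
            pending_key = value
        elif key == 'InfoValue' and pending_key is not None:
            info[pending_key] = value
            pending_key = None
        elif key == 'BookmarkTitle':
            # A title line starts a new bookmark entry.
            bookmarks.append({'title': value})
        elif key in bookmark_fields and bookmarks:
            bookmarks[-1][bookmark_fields[key]] = int(value)
        # Any other line (PdfID0, NumberOfPages, ...) is skipped in this sketch.
    return info, bookmarks


def parse_pdf_date(value):
    """Parse the D:YYYYMMDDHHMMSS prefix of a PDF date (timezone ignored)."""
    return datetime.strptime(value[2:16], '%Y%m%d%H%M%S')


# Shortened sample taken from the dump_data output shown above.
sample = """InfoKey: Title
InfoValue: LNCS 3149 - Co-array Python: A Parallel Extension to the Python Language
InfoKey: ModDate
InfoValue: D:20041222120851+01'00'
NumberOfPages: 6
BookmarkTitle: 2.1 Implementation Details
BookmarkLevel: 2
BookmarkPageNumber: 3"""

info, bookmarks = parse_dump_data(sample)
print(info['Title'])
print(parse_pdf_date(info['ModDate']))
print(bookmarks)
```

The text itself could come from running `pdftk CoArray_Python.pdf dump_data` through `subprocess.check_output`, in the same way as any other command-line tool.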
Code:
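from pyPdf import PdfFileReader

# Open in binary mode; the file must stay open while the reader is used.
with open('test.pdf', 'rb') as pdf_file:
    pdf = PdfFileReader(pdf_file)
    info = pdf.getDocumentInfo()
    # Print a few of the standard metadata fields.
    for key in ('title', 'author', 'subject', 'creator'):
        print(getattr(info, key))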