Python: PDF

From OnnoCenterWiki
Revision by Onnowpurbo
Latest revision as of 22:29, 24 October 2018

PyPDF2

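# install PyPDF2
pip install PyPDF2

# import the required module
import PyPDF2

# open the PDF file in binary mode
file = open('example.pdf', 'rb')

# create a PDF reader object
# (PdfFileReader was renamed to PdfReader; PyPDF2 3.0 removed the old name)
fileReader = PyPDF2.PdfReader(file)

# print the number of pages in the PDF file
# (numPages was replaced by len(reader.pages))
print(len(fileReader.pages))

# close the file when done
file.close()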
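
Beyond counting pages, a reader object can also be used to pull the text out of each page. The following is a minimal sketch, assuming the PyPDF2 3.x names (`PdfReader`, `reader.pages`, `page.extract_text()`). The helper names `extract_pages` and `read_pdf_text` are hypothetical and not part of PyPDF2, and `example.pdf` is just the placeholder file name used above.

```python
def extract_pages(reader):
    """Return a list with the text of every page.

    `reader` is assumed to behave like a PyPDF2 3.x PdfReader:
    it exposes `.pages`, and each page has `.extract_text()`.
    """
    # extract_text() may return None or "" for image-only pages,
    # so fall back to an empty string
    return [page.extract_text() or "" for page in reader.pages]


def read_pdf_text(path):
    """Open the PDF at `path` and return its text, one string per page."""
    import PyPDF2  # imported here so extract_pages() works without PyPDF2

    # the text must be extracted while the file is still open
    with open(path, "rb") as f:
        return extract_pages(PyPDF2.PdfReader(f))


if __name__ == "__main__":
    # placeholder file name; replace with a real PDF
    for number, text in enumerate(read_pdf_text("example.pdf"), start=1):
        print(f"--- page {number} ---")
        print(text)
```

The `with` block closes the file automatically, which avoids leaving the handle open if reading fails partway through.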

textract

pip install textract

# extract text from a PDF (process() returns bytes, not str)
import textract
text = textract.process('path/to/pdf/file', method='pdfminer')
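
Because `textract.process` returns raw bytes, the result usually needs decoding before it can be handled as ordinary text. The following is a minimal sketch, assuming the extracted bytes are UTF-8 encoded. The helper names `decode_extracted` and `pdf_to_text` are hypothetical and not part of textract, and the path is the same placeholder used above.

```python
def decode_extracted(raw, encoding="utf-8"):
    """Turn the bytes returned by textract into a clean str.

    The UTF-8 default is an assumption; undecodable bytes are
    replaced rather than raising an error.
    """
    return raw.decode(encoding, errors="replace").strip()


def pdf_to_text(path):
    """Extract the text of the PDF at `path` as a str."""
    import textract  # imported here so decode_extracted() works without textract

    raw = textract.process(path, method="pdfminer")
    return decode_extracted(raw)


if __name__ == "__main__":
    # placeholder path; replace with a real PDF
    print(pdf_to_text("path/to/pdf/file"))
```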
