- Free software: GNU General Public License v3
- Documentation: https://google-drive-ocr.readthedocs.io.
Features
- Perform OCR using Google’s Drive API v3
- Class GoogleOCRApplication() for use in projects
- Highly configurable CLI
- Run OCR on a single image file
- Run OCR on multiple image files
- Run OCR on all images in directory
- Use multiple workers (multiprocessing)
- Work on a PDF document directly
Install
To install Google OCR (Drive API v3), run this command in your terminal:
pip install google-drive-ocr
Note: You must set up a Google application and download the client_secret.json file before using google_drive_ocr.
Setup
- Create a project on Google Cloud Platform
- Wizard: https://console.developers.google.com/start/api?id=drive
- Instructions: https://cloud.google.com/genomics/downloading-credentials-for-api-access
- Select application type as “Installed Application”
- Create credentials “OAuth consent screen” –> “OAuth client ID”
- Save client_secret.json
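Before handing the saved file to the library, it can be useful to confirm that it is the right kind of credentials file. The sketch below is not part of google_drive_ocr: check_client_secret is a hypothetical helper, and it assumes the usual layout of Google OAuth client files, in which a top-level "installed" (or "web") key holds client_id and client_secret.

```python
import json
from pathlib import Path


def check_client_secret(path):
    """Return the client ID stored in a downloaded OAuth client secret file.

    Hypothetical helper, not part of google_drive_ocr. Assumes the file has
    one top-level key ("installed" or "web") containing "client_id" and
    "client_secret".
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        for kind in ("installed", "web"):
            section = data.get(kind)
            if (
                isinstance(section, dict)
                and "client_id" in section
                and "client_secret" in section
            ):
                return section["client_id"]
    raise ValueError(f"{path} does not look like an OAuth client secret file")
```

A ValueError here usually means that a different credential type (for example a service account key) was downloaded instead of an OAuth client ID.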
Usage
Using in a Project
Create a GoogleOCRApplication instance:
from google_drive_ocr import GoogleOCRApplication
app = GoogleOCRApplication('client_secret.json')
Perform OCR on a single image:
app.perform_ocr('image.png')
Perform OCR on multiple images:
app.perform_batch_ocr(['image_1.png', 'image_2.png', 'image_3.png'])
Perform OCR on multiple images using multiple workers (multiprocessing):
app.perform_batch_ocr(['image_1.png', 'image_3.png', 'image_2.png'], workers=2)
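When the image list is not known in advance, it can be built from a directory before calling perform_batch_ocr. The following is a minimal sketch: collect_images is a hypothetical helper (not part of google_drive_ocr), and the commented usage relies only on the constructor and perform_batch_ocr call shown above.

```python
from pathlib import Path


def collect_images(image_dir, extension=".png"):
    """Return the paths of files in image_dir that end with extension.

    Hypothetical helper, not part of google_drive_ocr. The match is
    case-insensitive and the result is sorted lexicographically.
    """
    suffix = extension.lower()
    return sorted(
        str(p)
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


# Usage with the library (needs google-drive-ocr and a client secret file):
# from google_drive_ocr import GoogleOCRApplication
# app = GoogleOCRApplication('client_secret.json')
# app.perform_batch_ocr(collect_images('images/', '.jpg'), workers=2)
```

Because the sort is lexicographic, image_10.png comes before image_2.png; zero-padded file names avoid that if the order matters.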
Using Command Line Interface
Typical usage with several options:
google-ocr --client-secret client_secret.json \
--upload-folder-id <google-drive-folder-id> \
--image-dir images/ --extension .jpg \
--workers 4 --no-keep
Show help message with the full set of options:
google-ocr --help
Configuration
The default location for configuration is ~/.gdo.cfg.
If a set of options is written to this location, those options do not have to be specified again on subsequent runs.
Save configuration and exit:
google-ocr --client-secret client_secret.json --write-config ~/.gdo.cfg
Read configuration from a custom location (if it was written to a custom location):
google-ocr --config ~/.my_config_file ..
Performing OCR
Note: It is assumed that the client-secret option is saved in the configuration file.
Single image file:
google-ocr -i image.png
Multiple image files:
google-ocr -b image_1.png image_2.png image_3.png
All image files from a directory with a specific extension:
google-ocr --image-dir images/ --extension .png
Multiple workers (multiprocessing):
google-ocr -b image_1.png image_2.png image_3.png --workers 2
PDF files:
google-ocr --pdf document.pdf --pages 1-3 5 7-10 13
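The --pages value above mixes single pages and ranges. To make that notation concrete, here is a small sketch of how such a specification could expand into page numbers. expand_pages is a hypothetical function, not the CLI's own parser, and it assumes that ranges are inclusive and pages are numbered from 1.

```python
def expand_pages(specs):
    """Expand page specs such as ["1-3", "5"] into a flat list of pages.

    Hypothetical illustration, not the parser used by google-ocr.
    Assumes inclusive ranges and 1-based page numbers.
    """
    pages = []
    for spec in specs:
        start, sep, end = spec.partition("-")
        first = int(start)
        last = int(end) if sep else first
        if first < 1 or last < first:
            raise ValueError(f"invalid page spec: {spec!r}")
        pages.extend(range(first, last + 1))
    return pages


print(expand_pages(["1-3", "5", "7-10", "13"]))
# prints [1, 2, 3, 5, 7, 8, 9, 10, 13]
```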