Set Up Serverless Cloud GPU at runpod.io
1. Set up a serverless GPU at [runpod.io](http://runpod.io/) with the relevant OCR libraries installed
2. Develop a program or script that exposes the OCR services as a REST API, and host it on the same serverless GPU
## Problems
We currently use the OCRmyPDF and EasyOCR libraries to perform OCR on PDF documents. These libraries run on the same server as the web application. Because that server has no GPU, OCR processing is slow.
## Objective
The proposed solution is to host these OCR libraries at [runpod.io](http://runpod.io/) using their serverless GPU service (https://www.runpod.io/serverless-gpu). For the web application server to use the OCR services, a custom OCR REST API needs to be developed and exposed on the serverless GPU.
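A minimal sketch of how the OCR logic might be wrapped as a RunPod serverless worker, assuming the `runpod` Python SDK, which hands each job to a handler function with the request payload under `input`. The names `make_handler`, `ocr_fn`, and `start_worker` are hypothetical, and the OCR function itself is left to the caller.

```python
from typing import Callable


def make_handler(ocr_fn: Callable[[str, str], str]) -> Callable[[dict], dict]:
    """Build a job handler around an OCR function (hypothetical helper).

    `ocr_fn(file_url, doc_type)` is assumed to return the OCR text
    and to raise an exception on failure.
    """

    def handler(job: dict) -> dict:
        params = job.get("input") or {}
        file_url = params.get("file_url")
        doc_type = params.get("doc_type")
        if not file_url or not doc_type:
            return {"status": "fail", "error": "file_url and doc_type are required"}
        try:
            text = ocr_fn(file_url, doc_type)
        except Exception as exc:  # report any processing error to the caller
            return {"status": "fail", "error": str(exc) or type(exc).__name__}
        return {"status": "success", "output": text}

    return handler


def start_worker(ocr_fn: Callable[[str, str], str]) -> None:
    """Entry point for the container; imports the SDK only when called."""
    import runpod  # RunPod Python SDK, assumed to be installed in the image

    runpod.serverless.start({"handler": make_handler(ocr_fn)})
```

The container's entry script would call `start_worker` with the real OCR function. Keeping the SDK import inside `start_worker` lets the handler be exercised locally without RunPod installed.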
## Libraries Needed To Be Installed in the Serverless GPU
- It is recommended to use Ubuntu 22.04 as the base OS
- OCRmyPDF https://ocrmypdf.readthedocs.io/en/latest/
- EasyOCR https://github.com/JaidedAI/EasyOCR
- Other dependency libraries:
  - Tesseract https://github.com/tesseract-ocr/tesseract
  - Ghostscript (gs) https://www.ghostscript.com/
  - Ps2Pdf https://web.mit.edu/ghostscript/www/Ps2pdf.htm
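As a rough sketch, the worker image could verify at startup that the tools listed above are actually installed. The helper names below are hypothetical. The check assumes the command-line tools are on `PATH` under their usual names (`gs` and `ps2pdf` both ship with Ghostscript) and that EasyOCR is installed as the Python package `easyocr`.

```python
import importlib.util
import shutil

# Command-line tools the OCR pipeline shells out to.
REQUIRED_BINARIES = ["ocrmypdf", "tesseract", "gs", "ps2pdf"]
# Python packages imported by the OCR pipeline.
REQUIRED_MODULES = ["easyocr"]


def missing_binaries(names=REQUIRED_BINARIES, which=shutil.which):
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def missing_modules(names=REQUIRED_MODULES, find_spec=importlib.util.find_spec):
    """Return the package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


def dependency_report():
    """Summarize what is missing so the worker can fail fast at startup."""
    return {"binaries": missing_binaries(), "modules": missing_modules()}
```

Calling `dependency_report()` when the container starts gives an early, readable failure instead of an error in the middle of the first OCR request.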
## OCR REST API Specifications
The server must expose an OCR REST API with the following request parameters:
- `file_url`, string. URL to the PDF file
- `doc_type`, string. Document type. Valid values are:
  - `BOOKING_FORM`: Booking Form
  - `IDENTITY_DOC`: Identity Doc
  - `LOAN_OFFER_LETTER`: Loan Offer Letter
- `api_key`, string. API key
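The request parameters above could be validated along these lines. This is only a sketch: `OcrRequest` and `parse_request` are hypothetical names, the restriction of `file_url` to http(s) URLs is an assumption, and how the expected API key is stored is left open.

```python
import hmac
from dataclasses import dataclass
from urllib.parse import urlparse

VALID_DOC_TYPES = frozenset({"BOOKING_FORM", "IDENTITY_DOC", "LOAN_OFFER_LETTER"})


@dataclass(frozen=True)
class OcrRequest:
    file_url: str
    doc_type: str


def parse_request(params: dict, expected_api_key: str) -> OcrRequest:
    """Validate the three request parameters; raise ValueError on bad input."""
    api_key = params.get("api_key")
    # Constant-time comparison avoids leaking key contents through timing.
    if not isinstance(api_key, str) or not hmac.compare_digest(
        api_key.encode("utf-8"), expected_api_key.encode("utf-8")
    ):
        raise ValueError("invalid api_key")

    doc_type = params.get("doc_type")
    if not isinstance(doc_type, str) or doc_type not in VALID_DOC_TYPES:
        raise ValueError(f"invalid doc_type: {doc_type!r}")

    file_url = params.get("file_url")
    parsed = urlparse(file_url) if isinstance(file_url, str) else None
    # Assumption: only http(s) URLs with a host are accepted.
    if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("file_url must be an http(s) URL")

    return OcrRequest(file_url=file_url, doc_type=doc_type)
```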
The API must then return the following response:
- `status`, string. Status. Valid values are `success` or `fail`
- `error`, string. The error message if the request fails. This field does not exist on success
- `output`, string. The OCR text output if the request succeeds
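One way the response fields could be assembled is sketched below. The function names are hypothetical, and leaving `output` out of a failed response is an assumption, since the specification only says that `error` is absent on success.

```python
import json


def success_response(text: str) -> dict:
    """Response body for a successful OCR run; no `error` field."""
    return {"status": "success", "output": text}


def fail_response(message: str) -> dict:
    """Response body for a failed OCR run (assumption: no `output` field)."""
    return {"status": "fail", "error": message}


def run_and_respond(ocr_fn, *args) -> str:
    """Call `ocr_fn(*args)` and serialize the outcome as a JSON string."""
    try:
        body = success_response(ocr_fn(*args))
    except Exception as exc:
        # Fall back to the exception class name if the message is empty.
        body = fail_response(str(exc) or type(exc).__name__)
    return json.dumps(body)
```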
### Processing Logic for Different Document Type
1. Booking Form
   1. Use `ps2pdf` to minimize the file
   2. Use `ocrmypdf` to read the minimized file
2. Identity Doc
   1. Use `gs` to convert each page to a PNG file
   2. Use `easyocr` to read each PNG file
3. Loan Offer Letter
   1. Use `ocrmypdf` to deskew the file
   2. Use `ocrmypdf` to read the deskewed file
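The steps above can be sketched as command lines built per document type. This is not the existing Laravel logic, which is not shown here. File names, the 300 dpi resolution, and the `en` EasyOCR language are assumptions. The sketch also collapses the two Loan Offer Letter steps into a single `ocrmypdf --deskew` run, and PDFs that already contain a text layer may need an extra option such as `--force-ocr` or `--skip-text`.

```python
from pathlib import Path


def build_commands(doc_type: str, pdf: Path, workdir: Path) -> list[list[str]]:
    """Return the external commands to run, in order, for one document."""
    sidecar = str(workdir / "out.txt")   # plain-text OCR output from ocrmypdf
    out_pdf = str(workdir / "out.pdf")
    if doc_type == "BOOKING_FORM":
        small = str(workdir / "minimized.pdf")
        return [
            ["ps2pdf", str(pdf), small],
            ["ocrmypdf", "--sidecar", sidecar, small, out_pdf],
        ]
    if doc_type == "IDENTITY_DOC":
        pages = str(workdir / "page-%03d.png")  # one PNG per page
        return [
            ["gs", "-dNOPAUSE", "-dBATCH", "-sDEVICE=png16m", "-r300",
             f"-sOutputFile={pages}", str(pdf)],
        ]
    if doc_type == "LOAN_OFFER_LETTER":
        return [["ocrmypdf", "--deskew", "--sidecar", sidecar, str(pdf), out_pdf]]
    raise ValueError(f"unknown doc_type: {doc_type!r}")


def read_pngs(workdir: Path, languages=("en",)) -> str:
    """Run EasyOCR over the PNG pages produced by `gs` (Identity Doc only)."""
    import easyocr  # imported lazily; loading the models is slow

    reader = easyocr.Reader(list(languages), gpu=True)
    lines = []
    for png in sorted(workdir.glob("page-*.png")):
        # detail=0 returns only the recognized strings.
        lines.extend(reader.readtext(str(png), detail=0))
    return "\n".join(lines)
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. For Booking Form and Loan Offer Letter the text is then read from the sidecar file, while Identity Doc uses `read_pngs`.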
### Remarks
1. The details of the above processing logic are already implemented in Laravel PHP, which you can use as a reference. Ask for the source code if you need it.
2. The OCR REST API can be written in any language, but Python is preferred. Alternatively, if you think hosting the Laravel PHP code on the serverless GPU is appropriate (so that we don't have to rewrite the processing logic in another language), you can suggest that too.
Python, Ubuntu, Laravel Framework, Google Cloud Platform, AWS Cloud
Finish Days: 7
Published Date: 08/01/2024 15:16:24 WIB
Start Date: 11/01/2024 14:55:42 WIB
Finish Date: 22/01/2024 15:32:35 WIB
Accepted Worker: masumar (masumar)
Accepted Budget: Rp 3,000,000
Project Ending: Completed
Project Owner
masumar completed the project as per the requirements.
He showed passion for the task by doing in-depth research even before the project was awarded to him. He also advised on alternative solutions so that the problem could be solved more efficiently.
Accepted Worker
Clear instructions and willing to accept any suggestions. A highly recommended owner. Never fail him.