
We use GPT-4o for data extraction from documents, and it's really good. I published a small library that handles much of the document conversion and output parsing: https://npmjs.com/package/llm-document-ocr

For straight OCR it works really well, but at the end of the day it's still not 100% accurate.
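
To give a rough idea of what the output-parsing side can involve, here is a minimal sketch. It is not the API of llm-document-ocr; `parseModelJson` and `ParseResult` are hypothetical names made up for illustration. It assumes the model was asked to reply with JSON and may wrap that JSON in a Markdown code fence.

```typescript
// Hypothetical helper for illustration only; not part of llm-document-ocr.
interface ParseResult<T> {
  ok: boolean;
  value?: T;
  error?: string;
}

// Three backticks, built at runtime so this snippet can itself sit inside a fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);

function parseModelJson<T>(raw: string): ParseResult<T> {
  // If the reply contains a fenced block, parse its contents; otherwise parse the whole reply.
  const fenced = raw.match(FENCED_BLOCK);
  const candidate = (fenced ? fenced[1] : raw).trim();
  try {
    return { ok: true, value: JSON.parse(candidate) as T };
  } catch (err) {
    // Report the failure instead of throwing, so the caller decides what to do next.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Since extraction is never 100%, returning a result object rather than throwing makes it easy to retry the request or flag the document for manual review when `ok` is false.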



Thanks! Looking forward to checking this out as soon as I get home.



