ocr - Tesseract HOCR to a structured text for LLMs - Stack Overflow

Posted by programmeradmin on 2025-04-06

I want to take the hOCR output that I get from TesseractJS (I work in JavaScript) and transform it into something an LLM can read. The goal is to read technical documents that contain prices, tabs, headers, lines, footers, and so on, not just plain running text.

Currently, I plan to transform the hOCR into structured text, but I don't know how yet.
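To make the idea concrete, here is a rough sketch of one possible transformation, not a tested solution. It assumes the hOCR string contains word spans in the usual Tesseract shape (`class='ocrx_word'` followed by a `title` holding `bbox x0 y0 x1 y1`). The function names `parseWords`, `groupIntoLines` and `hocrToText`, the ` | ` column separator, and the `columnGap` threshold of 40 pixels are all hypothetical choices made for illustration. Words are grouped into rows by vertical position, and a wide horizontal gap between two words is treated as a column break, so table-like rows such as item and price stay distinguishable in the text given to the LLM.

```javascript
// Sketch only: flatten hOCR into layout-aware plain text.
// Assumes word spans look like:
//   <span class='ocrx_word' ... title='bbox x0 y0 x1 y1; x_wconf NN'>text</span>
const WORD_RE =
  /<span[^>]*class=["']ocrx_word["'][^>]*title=["']([^"']*)["'][^>]*>([\s\S]*?)<\/span>/g;

// Extract every word with its bounding box.
function parseWords(hocr) {
  const words = [];
  for (const m of hocr.matchAll(WORD_RE)) {
    const bbox = /bbox (\d+) (\d+) (\d+) (\d+)/.exec(m[1]);
    const text = m[2]
      .replace(/<[^>]+>/g, "") // drop nested tags such as <strong> or <em>
      .replace(/&lt;/g, "<")
      .replace(/&gt;/g, ">")
      .replace(/&quot;/g, '"')
      .replace(/&#39;/g, "'")
      .replace(/&amp;/g, "&") // decoded last to avoid double decoding
      .trim();
    if (!bbox || !text) continue;
    const [x0, y0, x1, y1] = bbox.slice(1).map(Number);
    words.push({ text, x0, y0, x1, y1 });
  }
  return words;
}

// Group words into rows: a word joins the current row when its vertical
// center is within half its own height of the row's first word.
function groupIntoLines(words) {
  const sorted = [...words].sort((a, b) => a.y0 + a.y1 - (b.y0 + b.y1));
  const lines = [];
  for (const w of sorted) {
    const cy = (w.y0 + w.y1) / 2;
    const last = lines[lines.length - 1];
    if (last && Math.abs(cy - last.cy) <= (w.y1 - w.y0) / 2) {
      last.words.push(w);
    } else {
      lines.push({ cy, words: [w] });
    }
  }
  return lines;
}

// Render rows as text; a horizontal gap wider than columnGap pixels
// (hypothetical default: 40) becomes a " | " column separator.
function hocrToText(hocr, columnGap = 40) {
  return groupIntoLines(parseWords(hocr))
    .map((line) => {
      const row = [...line.words].sort((a, b) => a.x0 - b.x0);
      return row.reduce((out, w, i) => {
        if (i === 0) return w.text;
        const gap = w.x0 - row[i - 1].x1;
        return out + (gap > columnGap ? " | " : " ") + w.text;
      }, "");
    })
    .join("\n");
}
```

Calling `hocrToText(hocr)` on the hOCR string would then give one line of text per visual row. The threshold probably needs tuning per document, since it depends on scan resolution and font size.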

Any ideas, or any other suggestions?
