I wonder how this compares to open-source models (which might be less accurate, but even cheaper if self-hosted), e.g. Llama 3.2. I'll see if I can run the benchmark.
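For a self-hosted run, this is roughly what I'd try first. Just a minimal sketch assuming Ollama serving a vision-capable Llama 3.2 variant locally; the model tag and prompt are placeholders, not something from the article:

```python
# Sketch only: assumes a local Ollama server with a vision-capable
# Llama 3.2 model pulled (the exact model tag may differ on your setup).
import base64
import requests


def extract_markdown_table(page_png: str) -> str:
    """Send one rendered PDF page to a local model and ask for Markdown tables."""
    with open(page_png, "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode()

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2-vision",  # assumption: whichever vision model you have pulled
            "prompt": "Extract every table on this page as GitHub-flavored Markdown.",
            "images": [img_b64],
            "stream": False,
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```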
Everything I tried previously had very disappointing results. I was trying to get rid of Azure's Document Intelligence, which is kind of expensive at scale. The models could often output a portion of a table, but it was nearly impossible to get them to produce a structured output of a large table on a single page; they'd insert "...rest of table follows" and similar terminations no matter what kind of prompting I tried.
Maybe processing the table incrementally in chunks and stitching the pieces back together would have worked, but if Gemini can just handle the whole thing in one pass, that would be pretty good.
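Something like this is what I mean by chunking and stitching. It's only a sketch: `call_llm` is a stand-in for whatever model API you're using, and cropping the table into row strips is assumed to happen upstream.

```python
# Hypothetical chunk-and-stitch approach; `call_llm` and the row-strip
# cropping are assumptions, not part of the original benchmark.
from typing import Callable, List


def extract_table_in_chunks(
    row_strips: List[bytes],                 # pre-cropped image strips, a few rows each
    call_llm: Callable[[str, bytes], str],   # (prompt, image bytes) -> model output
) -> str:
    """Send the table to the model a slice at a time, then stitch the rows."""
    prompt = (
        "Extract ONLY the table rows visible in this image as Markdown table "
        "rows. No header, no commentary."
    )
    all_rows: List[str] = []
    for strip in row_strips:
        output = call_llm(prompt, strip)
        # Keep only lines that look like Markdown table rows.
        all_rows += [ln for ln in output.splitlines() if ln.strip().startswith("|")]
    # Caller prepends the header and separator row once at the end.
    return "\n".join(all_rows)
```

The stitching step is where I'd expect it to get fiddly (duplicated rows at chunk boundaries, cells that span strips), which is why a model that just takes the whole page is appealing.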
Also, regarding the failure case in the footnote: I think Gemini actually got that right (or at least outperformed Reducto). The original document seems to have what I'd call a "3D" table, where the third axis is rows within each cell, and having multiple headers is probably the best approximation you can get in Markdown.