magic_pdf-0.10.2-released

@myhloli released this 27 Nov 10:33
· 1 commit to master since this release
8afff9a

What's Changed

  • fix(pdf_parse): Move the logic for filling text content into spans ahead of discarded_block recognition, fixing empty text blocks in discarded_block by @myhloli in #1082
  • refactor(txt_spans_extract_v2): optimize span processing and OCR logic by @myhloli in #1086
  • feat(ocr): filter out low confidence ocr results by @myhloli in #1088
  • feat(pdf_parse): add OCR score to span data by @myhloli in #1089
  • fix: test_rag by @icecraft in #1105
  • perf(image_processing): reduce maximum image size for analysis by @myhloli in #1106
  • fix: test_tools unittest by @icecraft in #1104
  • refactor(libs): remove unused imports and functions by @myhloli in #1112
  • feat: add S3 read/write example by @icecraft in #1117
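
As a rough illustration of the two OCR items above (#1088 and #1089), the sketch below shows one way a per-span OCR score could be used to drop low-confidence results. It is not taken from the magic_pdf source: the `score` and `content` keys, the function name `filter_low_confidence_spans`, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Any

# Hypothetical cutoff; the release notes do not state the value magic_pdf uses.
MIN_OCR_SCORE = 0.5


def filter_low_confidence_spans(
    spans: list[dict[str, Any]],
    min_score: float = MIN_OCR_SCORE,
) -> list[dict[str, Any]]:
    """Keep spans whose OCR score is at least min_score.

    Spans without a "score" key (assumed here to be non-OCR text) are kept.
    """
    kept = []
    for span in spans:
        score = span.get("score")  # "score" is an assumed key name
        if score is None or score >= min_score:
            kept.append(span)
    return kept


# Illustrative data only, not real parser output.
example_spans = [
    {"content": "Invoice", "score": 0.93},
    {"content": "l1|", "score": 0.21},
    {"content": "Total"},
]
kept_contents = [s["content"] for s in filter_low_confidence_spans(example_spans)]
print(kept_contents)  # ['Invoice', 'Total']
```

Keeping the score on each span, rather than discarding it after filtering, lets later stages apply a stricter or looser threshold without re-running OCR.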

Full Changelog: magic_pdf-0.10.1-released...magic_pdf-0.10.2-released