Improving PDF-to-Text Conversion: Integrating Tables as Markup Text on a Page-by-Page Basis #984

Open
Isha09Garg opened this issue Sep 11, 2023 · 1 comment
Labels
feature-request All feature requests receive this label initially, can be upgraded to "enhancement"

Comments

@Isha09Garg

Is it possible to integrate text seamlessly with tables, essentially converting tables into markup text, to enhance the quality of PDF-to-text conversion on a per-page basis?

Isha09Garg added the feature-request label on Sep 11, 2023
@jsvine
Owner

jsvine commented Sep 11, 2023

Hi @Isha09Garg, and thanks for your interest in this library. That's a really interesting feature to propose, but my instinct is that it is best handled by a third-party library, given how much customization I can imagine users wanting (e.g., how to represent more complex text layouts in Markdown, or how to decide whether a line of text should be rendered as a Markdown heading). If you or another member of the community wants to build that, I'd be happy to link to the project from pdfplumber's documentation.
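
For anyone who wants to try this outside the library, here is a minimal sketch of one possible approach. It is not part of pdfplumber and not the maintainer's method. It assumes pdfplumber's documented `Page.find_tables()`, `Table.bbox`, `Table.extract()`, `Page.filter()`, and `Page.extract_text()`; the helper names `table_to_markdown`, `page_to_text_with_tables`, and `pdf_to_pages` are hypothetical. It also makes two simplifying assumptions: the first row of each table is treated as the header, and tables are appended after the page's remaining text rather than interleaved at their original position.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[str]]


def table_to_markdown(rows: Sequence[Row]) -> str:
    """Render extracted table rows as a Markdown pipe table.

    Assumption: the first row is the header row.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def clean(cell: Optional[str]) -> str:
        # Empty cells come back as None; newlines and pipes would break the table.
        return (cell or "").replace("\n", " ").replace("|", "\\|").strip()

    def fmt(row: Row) -> str:
        # Pad short rows so every line has the same number of columns.
        cells = [clean(c) for c in row] + [""] * (width - len(row))
        return "| " + " | ".join(cells) + " |"

    lines = [fmt(rows[0]), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(row) for row in rows[1:])
    return "\n".join(lines)


def page_to_text_with_tables(page) -> str:
    """Return non-table text followed by each table as Markdown.

    `page` is assumed to be a pdfplumber Page.
    """
    tables = page.find_tables()
    boxes = [t.bbox for t in tables]  # each bbox is (x0, top, x1, bottom)

    def outside_tables(obj) -> bool:
        # Keep an object only if its center lies outside every table bbox.
        cx = (obj["x0"] + obj["x1"]) / 2
        cy = (obj["top"] + obj["bottom"]) / 2
        return not any(
            x0 <= cx <= x1 and top <= cy <= bottom
            for x0, top, x1, bottom in boxes
        )

    text = page.filter(outside_tables).extract_text()
    parts = [text] + [table_to_markdown(t.extract()) for t in tables]
    return "\n\n".join(p for p in parts if p)


def pdf_to_pages(path: str) -> List[str]:
    """Convert every page of the PDF at `path` (requires pdfplumber)."""
    import pdfplumber  # imported lazily so the helpers above work without it

    with pdfplumber.open(path) as pdf:
        return [page_to_text_with_tables(page) for page in pdf.pages]
```

Calling `pdf_to_pages("report.pdf")` (a placeholder filename) would return one string per page. Heading detection, multi-column layouts, and placing each table at its original position in the text flow are exactly the kinds of customization mentioned above, and are left out here.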
