Google Docs support #1022

Open
doberst opened this issue Oct 4, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@doberst
Contributor

doberst commented Oct 4, 2024

LLMWare provides extensive built-in parsing for Microsoft document types (PPTX, DOCX, and XLSX), but does not currently integrate a solution for parsing Google Docs, Slides, and Sheets, or for connecting to Google Drive repositories to store and access documents.

It would be great to have an integrated capability that supports parsing, text chunking, and ingestion of Google document types and repositories. The implementation could take several forms: a de novo parser/text chunker in Python or C/C++ or, more likely, an interface to an existing Google document parser, with the supporting code to integrate it seamlessly into LLMWare.
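As a rough illustration of the "interface" option, one possible route is to have Google Drive export each native file to the matching Office format, which LLMWare's existing PPTX/DOCX/XLSX parsers could then handle. The sketch below only covers the format-mapping step; the helper names `export_target` and `local_filename` are hypothetical and not part of LLMWare, and the actual download (for example through the Drive API's `files.export` method) is assumed to happen elsewhere.

```python
from typing import NamedTuple, Optional


class ExportTarget(NamedTuple):
    """Office format that a Google Workspace file can be exported to."""
    export_mime: str
    extension: str


# Google Workspace MIME types mapped to the Office formats that LLMWare
# already parses. The helper names below are hypothetical, for illustration.
GOOGLE_TO_OFFICE = {
    "application/vnd.google-apps.document": ExportTarget(
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        ".docx",
    ),
    "application/vnd.google-apps.spreadsheet": ExportTarget(
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        ".xlsx",
    ),
    "application/vnd.google-apps.presentation": ExportTarget(
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        ".pptx",
    ),
}


def export_target(google_mime: str) -> Optional[ExportTarget]:
    """Return the Office export format for a Google MIME type, or None."""
    return GOOGLE_TO_OFFICE.get(google_mime)


def local_filename(name: str, google_mime: str) -> Optional[str]:
    """Build a local file name for the exported copy, or None if unsupported."""
    target = export_target(google_mime)
    if target is None:
        return None
    # Avoid doubling the extension if the Drive file name already carries it.
    if name.lower().endswith(target.extension):
        return name
    return name + target.extension
```

With a mapping like this, a Drive connector would only need to list files, request each export in the mapped MIME type, and pass the saved file to the existing ingestion path. A de novo parser would be needed only if the exported Office copies lose information that matters for chunking.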

@doberst doberst added the enhancement New feature or request label Oct 4, 2024
@EricLiclair

@doberst this seems interesting to me. Could you shed some light on what you'd suggest for this?

  1. any specific libs that you recommend,
  2. any existing code/class/component/PR in llmware that could be referenced/extended to add support for GDocs.

I'll try to scope out, from my perspective, what changes to add and where, but it might be time-consuming since I'm new to this codebase.

Suggestions for pt. 2 would help speed up the scoping; pt. 1 would help align the expected solution.
