Increase Total tokens to 128K (currently 4K) #1186

Open
Padmaapparao opened this issue Nov 23, 2024 · 1 comment

Padmaapparao commented Nov 23, 2024

For the DocSum example, we will upload hundreds of files, so we need a large input token length and a correspondingly large output token length. The total is currently fixed at 4096 tokens, so if we upload even one large file, only 32 tokens are left for the summarization output, which is far too short for a summary.

We need a total of 128K tokens so that we can get a summary of at least 16K-32K tokens.
These values are hardcoded in compose.yaml and need to be parameterizable.
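
To make the arithmetic in this report concrete, here is a minimal sketch. It assumes the output budget is simply whatever remains of the total token limit once the input is counted, and it takes 128K to mean 128 * 1024 tokens; the function name `output_budget` is hypothetical and not part of the project.

```python
def output_budget(max_total_tokens: int, input_tokens: int) -> int:
    """Tokens left for generation once the input is counted against the total.

    Assumption: input and output share one fixed total, as the report describes.
    """
    return max(0, max_total_tokens - input_tokens)


# Figures from the report: a 4096-token total and a 32-token summary imply
# that the uploaded file consumed about 4064 tokens of the budget.
print(output_budget(4096, 4064))  # 32

# With a 128K total, reserving 32K tokens for the summary leaves 96K for input.
print(128 * 1024 - 32 * 1024)  # 98304
```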

@lvliang-intel
Collaborator

We are considering adding these parameters and making them configurable in compose.yaml to support flexible setups. A PR will be created for this, and we will update the details here once the PR is ready.

However, the models themselves currently don't support a 256K context length. Some hardware also limits the input token length and maximum output token length that can be supported. We recommend exploring alternative approaches, such as chunking files or using recursive summarization techniques, to achieve optimal results within the current technical limitations.
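
The chunking and recursive summarization approach mentioned above could look roughly like the following. This is a sketch under stated assumptions, not the project's implementation: whitespace-separated words stand in for tokens, `summarize` is a hypothetical callable that would wrap the real model call, and `first_words` is a toy stand-in used only to make the example runnable.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int) -> List[str]:
    """Split text into pieces of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def recursive_summarize(text: str, summarize: Callable[[str], str], max_words: int) -> str:
    """Summarize each chunk, then summarize the joined summaries until one chunk remains."""
    chunks = chunk_words(text, max_words)
    if len(chunks) <= 1:
        return summarize(text)
    combined = " ".join(summarize(chunk) for chunk in chunks)
    if len(combined.split()) >= len(text.split()):
        # The summaries are not shrinking the text; truncate and stop
        # rather than recurse forever.
        return summarize(" ".join(combined.split()[:max_words]))
    return recursive_summarize(combined, summarize, max_words)


def first_words(chunk: str, n: int = 5) -> str:
    """Toy stand-in for a model call: keeps the first n words of the chunk."""
    return " ".join(chunk.split()[:n])


document = " ".join(f"w{i}" for i in range(100))
print(recursive_summarize(document, first_words, max_words=20))
```

In a real setup, `max_words` would be replaced by a token count from the model's tokenizer and chosen so that each chunk plus the requested summary length fits within the configured total token limit.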

@lvliang-intel lvliang-intel self-assigned this Nov 27, 2024
@Padmaapparao Padmaapparao changed the title Increase Total tokens to 256K (currently 4K) Increase Total tokens to 128K (currently 4K) Nov 27, 2024