For the DocSum example, since we will upload hundreds of files, we need both the input and output token lengths to be large. Currently the total is fixed at 4096 tokens, so uploading even one large file leaves only about 32 tokens for the summary output, which is far too small for a summary.
We need a total of 128K tokens so we can get at least a 16K-32K token summary.
These values are hardcoded in compose.yaml and need to be parametrizable.
We are considering adding these parameters and making them configurable in compose.yaml to support flexible setups. A PR will be created for this, and we will update the details here once the PR is ready.
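One way this could look (a hypothetical sketch only — the actual service and variable names will be defined in the PR) is exposing the limits through Compose variable interpolation, so users can override them from the environment or a `.env` file while keeping the current values as defaults:

```yaml
# Hypothetical compose.yaml fragment — names are illustrative, not final.
services:
  llm:
    environment:
      # Overridable from the shell or a .env file; defaults preserve
      # today's hardcoded behavior.
      MAX_INPUT_TOKENS: ${MAX_INPUT_TOKENS:-3072}
      MAX_TOTAL_TOKENS: ${MAX_TOTAL_TOKENS:-4096}
```

With this shape, `MAX_INPUT_TOKENS=120000 MAX_TOTAL_TOKENS=131072 docker compose up` would raise the limits without editing the file.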
However, the models themselves currently don't support a 256K context length, and some hardware also limits the maximum input and output token lengths it can handle. We recommend exploring alternative approaches, such as chunking files or using recursive summarization, to achieve good results within the current technical limitations.
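The chunking / recursive-summarization idea can be sketched as follows. This is a minimal illustration of the control flow only: `summarize` is a hypothetical placeholder (here it just truncates), where a real system would call the LLM with a configured output token budget, and chunking is done by character count as a rough stand-in for tokenizer-based splitting.

```python
def chunk_text(text: str, max_chars: int) -> list[str]:
    """Split text into pieces of at most max_chars characters.

    A real implementation would count tokens with the model's tokenizer
    and split on sentence boundaries; characters serve as a proxy here.
    """
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def summarize(text: str, max_chars: int) -> str:
    # Placeholder for an LLM call (hypothetical): truncation keeps the
    # sketch runnable without a model.
    return text[:max_chars]


def recursive_summarize(text: str,
                        context_chars: int = 4096,
                        summary_chars: int = 512) -> str:
    """Summarize each chunk, then summarize the concatenated summaries,
    repeating until the input fits in one context window."""
    while len(text) > context_chars:
        chunks = chunk_text(text, context_chars)
        text = "\n".join(summarize(c, summary_chars) for c in chunks)
    return summarize(text, summary_chars)
```

Each pass shrinks the input by roughly `context_chars / summary_chars`, so even inputs far beyond the model's context length converge to a single summary in a few rounds.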