
Llm large reference #1915

Open
pgmpablo157321 wants to merge 10 commits into master from llm_large_reference
Conversation

pgmpablo157321
Contributor

No description provided.


github-actions bot commented Nov 12, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@pgmpablo157321 force-pushed the llm_large_reference branch 2 times, most recently from 3b33ce1 to 7be9b13 on November 14, 2024 23:43
no_eos_ids = []
for qid, output in tqdm(run_outputs.items()):
    L = list(output)
    # Prune trailing 2s (EOS token)
Contributor

EOS ID is not 2 for llama-405B. Need to use tokenizer.eos_token_id.
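A minimal sketch of that pruning with the EOS ID read from the tokenizer rather than hard-coded as 2 (the variable names mirror the snippet above; the tokenizer loading path and the exact bookkeeping for no_eos_ids are assumptions, not the reference code):

from tqdm import tqdm
from transformers import AutoTokenizer

# Assumed: checkpoint_path points at the Llama-405B checkpoint used by this reference.
tokenizer = AutoTokenizer.from_pretrained(checkpoint_path)
eos_id = tokenizer.eos_token_id  # do not assume the EOS token is 2

no_eos_ids = []
for qid, output in tqdm(run_outputs.items()):
    L = list(output)
    # Prune trailing EOS tokens (previously hard-coded as 2s)
    while L and L[-1] == eos_id:
        L.pop()
    if len(L) == len(output):
        # Assumed meaning: the output never emitted EOS, so record its query id.
        no_eos_ids.append(qid)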

Contributor

Same thing below

Contributor Author

What is the purpose of consolidate_results.py? I copied it from Llama2, but I don't know why it is needed.

Contributor

It might not be needed since we don't split the pickle anymore. Please remove it if it is not needed.

predictions=preds, references=targets, use_stemmer=True, use_aggregator=False
)

assert len(rouge_scores["rouge1"]) == 24576
Contributor

TODO

Comment on lines +158 to +161
self.model = LLM(
    self.model_path,
    dtype=self.dtype,
    tensor_parallel_size=self.tensor_parallel_size,
Contributor

Is there a reason we are not using AsyncLLMEngine here? It may be more efficient since it supports continuous batching.
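For reference, a rough sketch of driving vLLM through AsyncLLMEngine instead of the synchronous LLM class; the engine arguments mirror the constructor call quoted above, while the sampling parameters and request-id scheme are illustrative assumptions, not the reference implementation:

import asyncio
from vllm import AsyncEngineArgs, AsyncLLMEngine, SamplingParams

# Assumed configuration, mirroring the LLM(...) call above.
engine = AsyncLLMEngine.from_engine_args(
    AsyncEngineArgs(
        model=model_path,
        dtype=dtype,
        tensor_parallel_size=tensor_parallel_size,
    )
)

async def generate_one(prompt: str, request_id: str) -> str:
    params = SamplingParams(max_tokens=1024, temperature=0.0)  # assumed values
    final = None
    # The async generator yields partial results; keep only the last one.
    async for output in engine.generate(prompt, params, request_id):
        final = output
    return final.outputs[0].text

async def generate_all(prompts):
    # Submitting all requests concurrently lets the engine batch them continuously.
    tasks = [generate_one(p, f"req-{i}") for i, p in enumerate(prompts)]
    return await asyncio.gather(*tasks)

# results = asyncio.run(generate_all(prompts))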


def main(args):
    # Set up decode and evaluation objects
    tokenizer = LlamaTokenizerFast.from_pretrained(args.model_dir)
Contributor

I may be wrong, but would it be better to use AutoTokenizer here?

Contributor

I also suggest using AutoTokenizer for robustness.
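For illustration, the swap would look roughly like the following; AutoTokenizer resolves the correct tokenizer class from the checkpoint's config, so the code keeps working if the model family changes (args.model_dir is taken from the snippet above):

from transformers import AutoTokenizer

def main(args):
    # Set up decode and evaluation objects.
    # AutoTokenizer picks the tokenizer class from the checkpoint config,
    # so this does not break for non-Llama (or differently packaged) checkpoints.
    tokenizer = AutoTokenizer.from_pretrained(args.model_dir)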

@pgmpablo157321 force-pushed the llm_large_reference branch 2 times, most recently from 5da4409 to e44c62a on November 22, 2024 21:07
@pgmpablo157321 marked this pull request as ready for review on November 22, 2024 21:08
@pgmpablo157321 requested a review from a team as a code owner on November 22, 2024 21:08
@pgmpablo157321 changed the title from [WIP] Llm large reference to Llm large reference on Nov 22, 2024