What I want to do:
from trl import SFTTrainer

# MosaicGPT is the class shipped in the model's remote code
model = MosaicGPT.from_pretrained(
    "mosaicml/mpt-1b-redpajama-200b",
    trust_remote_code=True,
    attn_impl='torch',
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,                          # defined earlier
    train_dataset=tokenized_train_data["train"],
    eval_dataset=tokenized_val_data["validation"],
    dataset_text_field="text",
    args=training_args,                           # TrainingArguments, defined earlier
    neftune_noise_alpha=5,  # the one feature I actually need here
)
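For context, trl implements neftune_noise_alpha by registering a forward hook on the model's input embedding layer. A rough sketch of the mechanism (my paraphrase, not trl's verbatim code; the hard-coded alpha stands in for the neftune_noise_alpha argument):

```python
import torch

def neftune_post_forward_hook(module, inputs, output):
    # NEFTune (Jain et al., 2023): during training, add uniform noise to
    # the token embeddings, scaled by alpha / sqrt(seq_len * hidden_dim).
    if module.training:
        alpha = 5.0  # stands in for neftune_noise_alpha
        dims = output.size(1) * output.size(2)  # seq_len * hidden_dim
        mag_norm = alpha / dims ** 0.5
        output = output + torch.zeros_like(output).uniform_(-mag_norm, mag_norm)
    return output

# trl attaches the hook to whatever model.get_input_embeddings() returns,
# so the model must expose that accessor for NEFTune to work.
embeddings = model.get_input_embeddings()
handle = embeddings.register_forward_hook(neftune_post_forward_hook)
```

So a model whose remote code predates these parts of the transformers interface will trip SFTTrainer up here.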
Yet it fails with various features missing from the MPT-1b implementation, and potentially others.

Please help the community use MPT-1b by either:
a) retraining MPT-7b at a 1B-parameter size on the MPT-7b code base, or
b) updating the MPT-1b code base (which diverges a bit from MPT-7b architecturally); a possible user-side stopgap is sketched after this list.
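In the meantime, if the blocker really is just the missing get_input_embeddings()/set_input_embeddings() accessors, a monkey-patch on the loaded model might unblock SFTTrainer without touching the remote code. This is a hypothetical sketch: the attribute path model.transformer.wte is my guess at where MosaicGPT keeps its embedding table and needs to be verified against the model's remote code.

```python
import types
import torch.nn as nn

def get_input_embeddings(self) -> nn.Embedding:
    # Assumed location of the token embedding table in the MPT-1b
    # remote code -- verify before relying on this.
    return self.transformer.wte

def set_input_embeddings(self, new_embeddings: nn.Embedding) -> None:
    self.transformer.wte = new_embeddings

# Patch the loaded instance so trl's NEFTune hook can find the embeddings.
model.get_input_embeddings = types.MethodType(get_input_embeddings, model)
model.set_input_embeddings = types.MethodType(set_input_embeddings, model)
```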