Upgrade to v0.10.0 #1427

Status: Open — wants to merge 399 commits into base branch habana_alpha.

Changes from all commits

Commits (399)
2e8982e
improve error msg when checking target_blocks in activation_checkpoin…
cli99 Feb 15, 2024
1ef7409
Torch 2.2 upgrade - Part 1 (#976)
dakinggg Feb 15, 2024
e0756e1
Torch 2.2 - Part 2 (#979)
dakinggg Feb 15, 2024
da2c863
PyTorch 2.2 - Part 3 (#981)
dakinggg Feb 16, 2024
3a99270
Remove torch 2.1 from docker build (#982)
dakinggg Feb 16, 2024
6e3842b
Async callback: Don't skip checkpoints, reliably only launch async ev…
aspfohl Feb 16, 2024
2431730
Token accuracy metrics (#983)
dakinggg Feb 21, 2024
63c88d0
do not mention 1.13 in readme (#988)
irenedea Feb 22, 2024
dff2cf4
Patch test, lock mcli version (#990)
aspfohl Feb 22, 2024
386ae36
Bump gha timeouts (#991)
aspfohl Feb 22, 2024
2478f0a
Fix readme typo (#993)
dakinggg Feb 23, 2024
e5fffac
if condition in tie weights added (#989)
megha95 Feb 23, 2024
44fd365
bump composer version (#995)
dakinggg Feb 24, 2024
d527c9b
Trim examples ahead of time for auto packing (#994)
irenedea Feb 27, 2024
b082511
add oom observer callback (#932)
cli99 Feb 27, 2024
e3f214e
Change ci/cd to use ci-testing repo
b-chu Feb 27, 2024
5abbca0
Revert "Change ci/cd to use ci-testing repo"
b-chu Feb 27, 2024
2436c00
Use ci-testing repo (#1000)
b-chu Feb 29, 2024
d104d16
Make CodeEval respect device_eval_batch_size (#956)
josejg Mar 1, 2024
2dea737
Remove try except around imports (#1004)
dakinggg Mar 2, 2024
3880d04
Deprecate triton, prefix lm, llama attention patch, and text denoisin…
irenedea Mar 4, 2024
96c8218
add magic filename for sharded state dicts (#1001)
milocress Mar 5, 2024
cbdddf0
bump (#1009)
mvpatel2000 Mar 5, 2024
09ff550
Fix evaluators actually pulling eval metrics (#1006)
mvpatel2000 Mar 5, 2024
fd8cbaf
Build torch 2.2.1 images (#1010)
dakinggg Mar 5, 2024
5728969
add 2.2.1 tests (#1011)
dakinggg Mar 5, 2024
f4f6414
Bump min torch pin (#1013)
dakinggg Mar 6, 2024
cf0f5e5
Fix extra BOS token in front of response for some tokenizers (#1003)
dakinggg Mar 6, 2024
86c8746
Bump min composer pin (#1015)
dakinggg Mar 6, 2024
5261a55
add default for eval interval (#987)
irenedea Mar 6, 2024
93d7a05
Add support for olmo (#1016)
dakinggg Mar 7, 2024
64212cb
Add deeper support for multi-turn chats and loss-generating tokens in…
alextrott16 Mar 7, 2024
c2aec30
Fix profiling packing ratio to explicitly say 1 (#1019)
irenedea Mar 8, 2024
2b17497
Bump transformers to 4.38.2 (#1018)
dakinggg Mar 8, 2024
36ab1ba
that kwargs (#1020)
snarayan21 Mar 10, 2024
2fc5d33
Update readme with pytorch 2.2.1 (#1021)
irenedea Mar 11, 2024
d61c53d
Add code import to train/eval scripts (#1002)
dakinggg Mar 11, 2024
4e43792
finish (#1022)
bmosaicml Mar 11, 2024
257c25d
Bump version to 0.6.0 (#1023)
dakinggg Mar 12, 2024
4e8a875
Fix typo in monolithic chkpt callback docs (#1024)
sashaDoubov Mar 13, 2024
14e2dec
Allow code-quality workflow to be callable (#1026)
b-chu Mar 13, 2024
cffd75e
Fix llama attention patch (#1036)
dakinggg Mar 14, 2024
c88169d
Adds a decorator for experimental features (#1038)
dakinggg Mar 15, 2024
c173dd3
Bump version in yamls (#1040)
dakinggg Mar 18, 2024
c26309d
Remove references to attn_impl: triton (#1041)
dakinggg Mar 18, 2024
9bf3701
Registry based config - Part 1 (#975)
dakinggg Mar 20, 2024
3348b59
Deprecate attention patching for llama (#1047)
dakinggg Mar 21, 2024
26a5fd4
Compile GLU (#1049)
josejg Mar 22, 2024
31e4879
log details to metadata for run analytics (#992)
angel-ruiz7 Mar 23, 2024
c9685cf
Update README.md (#1056)
dennyglee Mar 24, 2024
94a05bd
Add chat schema example for mlflow (#1054)
dakinggg Mar 24, 2024
813d596
Metrics registry (#1052)
dakinggg Mar 24, 2024
67dcab9
LLM Foundry CLI (just registry) (#1043)
dakinggg Mar 24, 2024
5c8a829
Bump Composer to 0.21.1 (#1053)
jjanezhang Mar 24, 2024
32e14a6
Dataloaders registry (#1044)
dakinggg Mar 24, 2024
648b1bd
Fix multi model eval (#1055)
dakinggg Mar 24, 2024
b9a2de6
Remove unnecessary test workflow (#1058)
dakinggg Mar 24, 2024
8f25c18
Fix peft llama test (#1059)
dakinggg Mar 24, 2024
2d65fc2
Models registry (#1057)
dakinggg Mar 25, 2024
e590acf
Remove under construction from registry (#1060)
dakinggg Mar 25, 2024
0ef7cd6
Custom Exceptions for Mosaic Logger (#1014)
jjanezhang Mar 26, 2024
7f0fdae
Bump version to 0.7.0 (#1063)
irenedea Mar 26, 2024
e42ea70
filter files (#1067)
dakinggg Mar 27, 2024
f044d6c
Fix context printing (#1068)
irenedea Mar 27, 2024
8a69bd7
Update README with DBRX (#1069)
hanlint Mar 27, 2024
28467bb
added attributes to construct msg in mapi (#1071)
jjanezhang Mar 27, 2024
7a8a156
Output eval logging batch (#961)
maxisawesome Mar 28, 2024
349c2ff
Add expandeable segments flag (#1075)
dakinggg Apr 1, 2024
d8ea2c5
Check the user provided eos / bos token id against the tokenizer eos …
ShashankMosaicML Apr 1, 2024
b765b47
Triton RMSNorm (#1050)
josejg Apr 2, 2024
caf7fda
Fix tiktoken vocab size (#1081)
dakinggg Apr 2, 2024
632cb73
Doing the loss reduction in foundry instead of in the loss functions.…
ShashankMosaicML Apr 2, 2024
580a4b0
remove (#1082)
mvpatel2000 Apr 2, 2024
394735b
Upgrade hf chat (#1061)
j316chuck Apr 2, 2024
d452c60
Fixes for streaming and auto packing (#1083)
dakinggg Apr 3, 2024
23c3173
Background mlflow model registration (#1078)
irenedea Apr 3, 2024
5455b40
Update README.md to include DBRX blog under "Latest News" (#1085)
lupesko Apr 4, 2024
c766cf9
decrease mlflow hf model shard size (#1087)
dakinggg Apr 4, 2024
e70e424
log packing ratio progress (#1070)
milocress Apr 4, 2024
26f0619
Bump HF version (#1091)
b-chu Apr 4, 2024
06ff95f
fix typo in expandable_segments (#1088)
mammothb Apr 4, 2024
607b982
Bump transformers to 4.39.3 (#1086)
dakinggg Apr 4, 2024
f18768b
fix yamls typo (#1092)
dakinggg Apr 4, 2024
96b27c5
Allow overrides for nested PretrainedConfig (#1089)
dakinggg Apr 4, 2024
94301cd
cleaned up HF/MPT conversion test (#1048)
milocress Apr 4, 2024
60a1ab4
update (#1097)
dakinggg Apr 5, 2024
b81897a
Norms registry (#1080)
dakinggg Apr 5, 2024
d0d9434
fixing evaluator microbatch size (#1100)
ShashankMosaicML Apr 8, 2024
2939cc9
Updating the streaming version in setup.py (#1103)
ShashankMosaicML Apr 9, 2024
53160f4
MegaBlocks release (#1102)
mvpatel2000 Apr 9, 2024
f12bc8a
Remove torch compile (#1101)
josejg Apr 9, 2024
17f8aeb
Update config_moe_args.py (#1104)
vchiley Apr 10, 2024
7337429
Add remote code option to allow execution of DBRX tokenizer (#1106)
b-chu Apr 10, 2024
b5fc0fa
GRT-2819 fix overwritting in script (#1107)
cli99 Apr 10, 2024
4cd2324
Support ShareGPT chat format (#1098)
samhavens Apr 11, 2024
ed3daef
FC layer registry (#1093)
dakinggg Apr 12, 2024
560012b
Attention layer registry (#1094)
dakinggg Apr 12, 2024
e9b1c6e
Dbrx finetune yaml requires save folder specified to enable autoresum…
mvpatel2000 Apr 12, 2024
b58d68c
Revert "Update config_moe_args.py (#1104)" (#1111)
vchiley Apr 12, 2024
6257e5b
Update config_moe_args.py (#1112)
vchiley Apr 12, 2024
3729ba3
Migrate ICL classes to foundry (#936)
bmosaicml Apr 12, 2024
cb0de4f
FFN layer registry (#1095)
dakinggg Apr 12, 2024
676ad7f
Param init registry (#1096)
dakinggg Apr 13, 2024
f01f625
Add missing init file (#1113)
dakinggg Apr 13, 2024
84b6410
Update tests to not rely on mistral (#1117)
dakinggg Apr 18, 2024
4bb4d4a
Bump transformers to 4.40 (#1118)
dakinggg Apr 18, 2024
20cb40c
add `.json` to SUPPORTED_EXTENSIONS (#1114)
eitanturok Apr 19, 2024
63a7f12
Add option for subclasses to convert model and tokenizer in hf checkp…
dakinggg Apr 19, 2024
698206d
Bump Composer to 0.21.3 (#1122)
b-chu Apr 19, 2024
f0646e8
catch misconfigured hf dataset (#1123)
milocress Apr 19, 2024
3426415
Pin mlflow (#1124)
dakinggg Apr 20, 2024
6caa75a
Change main to a dev version (#1126)
dakinggg Apr 20, 2024
0c6bd75
Fix deprecation versions (#1129)
dakinggg Apr 22, 2024
4952183
Clean up the publicly exported API (#1128)
dakinggg Apr 22, 2024
c53622e
Fix HF checkpointer + mlflow bugs (#1125)
dakinggg Apr 23, 2024
0d62e61
Update JSONL sources in eval README (#1110)
emmanuel-ferdman Apr 23, 2024
72da1d7
Mlflow datasets (#1119)
KuuCi Apr 24, 2024
6252f79
Fix InvalidPromptResponseKeysError bug (#1131)
b-chu Apr 24, 2024
76f74b6
First initialize dist with gloo (#1133)
dakinggg Apr 24, 2024
15abf8c
Fix saving of generation_config for Llama-3 (#1134)
eldarkurtic Apr 25, 2024
24f65fd
Bump datasets version (#1138)
dakinggg Apr 25, 2024
4aef5de
Revert "First initialize dist with gloo (#1133)" (#1139)
dakinggg Apr 25, 2024
6cfd2a3
Barrier immediately after initialize dist with logs (#1140)
dakinggg Apr 25, 2024
f97f02e
Add new FT instructions (#1143)
b-chu Apr 26, 2024
8be3254
Upgrade ci-testing (#1145)
mvpatel2000 Apr 27, 2024
704a90a
Fix typos in callbacks with configs (#1146)
dakinggg Apr 29, 2024
fbcf311
remove olmo (#1148)
snarayan21 Apr 29, 2024
de5a394
build inner model (#1147)
milocress Apr 29, 2024
738956e
fix DatasetConstants.splints default value to protect dictionary over…
ivan-kud Apr 29, 2024
bd0d1cb
Bump flash attention version (#1150)
dakinggg Apr 30, 2024
6fde283
Add torch 2.3 images (#1149)
dakinggg Apr 30, 2024
63ac1a4
add torch 2.3 CI (#1151)
dakinggg Apr 30, 2024
b570b61
Comment out torch 2.3 tests (#1155)
dakinggg Apr 30, 2024
2f58965
Fix yaml lint (#1156)
dakinggg May 1, 2024
6561330
Move sentencepiece import (#1157)
aspfohl May 1, 2024
5f39606
composer verision BUMP (#1160)
snarayan21 May 1, 2024
fa7a78a
Uncomment GPU tests (#1162)
milocress May 1, 2024
124de4c
Update setup.py (#1163)
milocress May 2, 2024
0d58f46
Update pr-gpu.yaml (#1164)
dakinggg May 2, 2024
a3e0fb9
Bump min torch version to 2.3.0 (#1152)
dakinggg May 2, 2024
10b7caf
Add line splitting and other linting (#1161)
b-chu May 2, 2024
3b82735
refactoring dataloader into registries. (#1165)
ShashankMosaicML May 2, 2024
ddf4aa4
Migrate eval output logging to foundry (#1166)
maxisawesome May 2, 2024
c0d591c
Fix import and mocking (#1169)
dakinggg May 3, 2024
bfbb8c5
minor fix to (#1170)
ShashankMosaicML May 3, 2024
ab9dde7
Fix config access for DBRX (#1177)
dakinggg May 6, 2024
a777014
Bump version v0.9.0.dev0 (#1181)
milocress May 8, 2024
cc8351c
structuredconfig for train.py and eval.py (#1051)
milocress May 8, 2024
46b8bee
update version names (#1185)
milocress May 8, 2024
ac563e6
Refactoring attention (#1182)
ShashankMosaicML May 8, 2024
0c7bc2a
checking if attention mask present for ignoring pad tokens in ffn (#1…
ShashankMosaicML May 9, 2024
c6679d3
Remove python 3.8 and add python 3.11 (#1189)
j316chuck May 9, 2024
139abab
Docstring fix for curriculum learning callback (#1186)
snarayan21 May 9, 2024
51d0d09
Set ft dataloader name explicitly (#1187)
milocress May 9, 2024
1dd37c5
Fix to_container call (#1190)
dakinggg May 9, 2024
983234d
fix eval (#1193)
milocress May 10, 2024
994209c
Log exception on inactivity callback (#1194)
jjanezhang May 10, 2024
eef4872
Pass FC type along for all FFN types (#1196)
dakinggg May 11, 2024
0449b60
streaming version bump (#1195)
snarayan21 May 14, 2024
8274c6c
Clearer error message for unknown example type (#1202)
milocress May 14, 2024
b414626
Added torch_dmoe defaults, bug fixes for 2D inputs (#1210)
snarayan21 May 15, 2024
cfee4e4
log eval dataset misconfiguration (#1179)
milocress May 15, 2024
8cd23d5
using self.shift_labels instead of self.model.transformer.shift_label…
ShashankMosaicML May 15, 2024
dc3212e
Add fc to HF export (#1209)
dakinggg May 16, 2024
e70891b
TransformerEngine Image Build (#1204)
mvpatel2000 May 16, 2024
3fe7f09
remove dead code (#1213)
dakinggg May 16, 2024
38ae65b
Make `fc_type` a dict to pass fc kwargs through (#1201)
snarayan21 May 16, 2024
3a15082
torch dmoe tests gpu oom (#1216)
snarayan21 May 16, 2024
77f9ab1
that mod (#1219)
snarayan21 May 17, 2024
001e7c3
Modularize components of megablocks layer builder (#1224)
dakinggg May 22, 2024
8e29698
Add user error superclass (#1225)
milocress May 22, 2024
c891bed
Make config/class properties on ComposerMPTForCausalLM (#1227)
dakinggg May 22, 2024
9cc945c
Quick patch to check that Dataset Keys contain non-None Values (#1228)
KuuCi May 22, 2024
9120c27
Modularize backbone class and block creation (#1229)
dakinggg May 23, 2024
c213ea8
Loss v len callback (#1226)
ShashankMosaicML May 23, 2024
6fa6026
Fixing the state.timestamp.batch.value issue in loss v len callback (…
ShashankMosaicML May 23, 2024
b4bb34c
fix attr error for attention_classes (#1230)
cli99 May 24, 2024
09d8892
fix typing (#1235)
dakinggg May 24, 2024
ef530bf
Add example eval scripts for dbrx PT sizes (#1218)
aspfohl May 24, 2024
ff92f3c
configurable submesh (#1236)
dakinggg May 24, 2024
fdaa58b
Add retries to downloads in convert_text_to_mds.py (#1238)
irenedea May 24, 2024
1e4bd37
Move MLFlow dataset outside of log_config (#1234)
KuuCi May 24, 2024
2e10d95
add error when chat template fails (#1222)
milocress May 25, 2024
867e58f
Make the exceptions serializable (#1239)
dakinggg May 25, 2024
c9257b5
removing rich install (#1198)
jjanezhang May 27, 2024
43d149b
Chunk file reads and tokenization for text to mds conversion (#1240)
irenedea May 28, 2024
b82a82b
Make HF conversion automatically add missing imports (#1241)
dakinggg May 29, 2024
fb9a225
Add logging to convert_text_to_mds.py script (#1243)
irenedea May 29, 2024
d846731
Update CODEOWNERS (#1248)
dakinggg Jun 3, 2024
6c260f5
replacing icl_task_type question_answering with generation_task_with_…
ShashankMosaicML Jun 4, 2024
ac56dc5
Change TE docker image to enable te_shard_weight (#1251)
j316chuck Jun 4, 2024
67928cb
Fix MPT HF conversion (#1257)
dakinggg Jun 6, 2024
3966f0e
remove warning (#1258)
dakinggg Jun 6, 2024
42c2d9a
Adding more token encoding types (#1254)
snarayan21 Jun 6, 2024
14f296c
Bump Composer to 0.23.0 (#1259)
KuuCi Jun 6, 2024
bea61fb
Bump Version to 0.10.0.dev0 (#1255)
KuuCi Jun 7, 2024
e4b8b57
Fix typo in setup.py (#1263)
XiaohanZhangCMU Jun 7, 2024
db70135
Update TE Dockerfile (#1265)
j316chuck Jun 7, 2024
4e53e74
Revert "Update TE Dockerfile (#1265)" (#1266)
j316chuck Jun 7, 2024
dddb9b8
revert to nvidia code (#1267)
mvpatel2000 Jun 7, 2024
dd92abf
Bump composer to 0.23.2 (#1269)
dakinggg Jun 8, 2024
5571101
fix linting (#1270)
milocress Jun 9, 2024
ffec54b
Add torch 2.3.1 docker images (#1275)
dakinggg Jun 13, 2024
c30856f
Make expandable segments on by default (#1278)
b-chu Jun 13, 2024
630fc68
Add CI for torch 2.3.1 (#1281)
dakinggg Jun 14, 2024
9b9fc24
Update README.md to use variables (#1282)
milocress Jun 14, 2024
1a2fac0
Add registry for ICL datasets (#1252)
sanjari-orb Jun 14, 2024
82ef072
fix typo in CI (#1284)
dakinggg Jun 14, 2024
4350990
Fix backwards compatibility for ICL arg (#1286)
dakinggg Jun 14, 2024
dbd798e
Fix packing + streaming + resumption (#1277)
dakinggg Jun 14, 2024
1ff6c5b
Dbfs HF (#1214)
KuuCi Jun 15, 2024
ca528d5
bump mlflow (#1285)
KuuCi Jun 16, 2024
f8b2875
Add missing dependency group (#1287)
dakinggg Jun 17, 2024
618db6f
Update Dockerfile with TE main (#1273)
j316chuck Jun 17, 2024
c23be4a
Fix TE HF checkpoint saving (#1280)
j316chuck Jun 18, 2024
4b1fecb
added systemMetricsMonitor callback (#1260)
JackZ-db Jun 19, 2024
8241f9c
Extendability refactors (#1290)
dakinggg Jun 20, 2024
78e4cc6
Small refactor for update batch size (#1293)
dakinggg Jun 21, 2024
13bd8f9
Bump min composer version to 0.23.3 (#1294)
dakinggg Jun 21, 2024
e8ba9b7
Fix grad accum type (#1296)
dakinggg Jun 21, 2024
129e3e1
Bump composer to 0.23.4 (#1297)
mvpatel2000 Jun 21, 2024
2196d07
Allow passing in lbl_process_group directly (#1298)
dakinggg Jun 21, 2024
8b5a1bb
Add `all` transforms to train script (#1300)
dakinggg Jun 23, 2024
fd7b187
Add Retries to run_query (#1302)
KuuCi Jun 24, 2024
2267bc7
Bumping mlflow version to include buffering (#1303)
JackZ-db Jun 24, 2024
21c9e0a
Ignore mosaicml logger for exception if excephook is active (#1301)
jjanezhang Jun 24, 2024
ef14849
Add curriculum learning callback (#1256)
b-chu Jun 25, 2024
2412b59
Avoid circular import in hf checkpointer (#1304)
dakinggg Jun 26, 2024
bbfebda
Remove CodeQL workflow (#1305)
dakinggg Jun 26, 2024
901eee3
Update CI test to v0.0.8 (#1306)
KuuCi Jun 27, 2024
3edce07
update (#1307)
dakinggg Jun 27, 2024
14348fa
bump ci-testing to 0.0.9 (#1310)
dakinggg Jun 27, 2024
472d009
Fix 4 gpu tests (#1311)
dakinggg Jun 27, 2024
f141ee1
2.3.1 (#1312)
dakinggg Jun 27, 2024
0ebd7c9
Provide default seed value in TrainConfig, matching EvalConfig (#1315)
mvpatel2000 Jun 29, 2024
88511f7
Refactor hf checkpointer (#1318)
irenedea Jun 29, 2024
8604bba
Allows interweaving of arbitrary kinds of 'attention' layers, like sl…
ShashankMosaicML Jun 30, 2024
68c2625
Add optional logging of text output to EvalOutputLogging (#1283)
sjawhar Jul 1, 2024
742f340
Update version to release version
dakinggg Jul 3, 2024
e99ec07
Add support to run MPT-1b training on Habana device (HPU) using DeepS…
vivekgoe Aug 22, 2023
48b239e
wip
Aug 31, 2023
a76d824
add act ckpt
abhi-mosaic Sep 6, 2023
d238edc
fix
abhi-databricks Nov 15, 2023
b723c25
cleanup
abhi-mosaic Jan 4, 2024
820221a
update README
abhi-mosaic Jan 4, 2024
a69273d
update reqs to 1.13
abhi-mosaic Jan 4, 2024
8ec8716
Add Model sharding support with deepspeed (#836)
vivekgoe Feb 5, 2024
de1240d
Make config compliant to 0.10.0
hlahkar Jul 9, 2024
51beeb0
Update ds_gaudi.sh
ckvermaAI Jul 15, 2024
1939ae4
Update README
hlahkar Jul 17, 2024
2 changes: 0 additions & 2 deletions .ci/FILE_HEADER

This file was deleted.

12 changes: 12 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,12 @@
# Require admin approval to modify all files in the root of the repository
# This includes setup.py, the README, and the CODEOWNERS file itself!
/* @mosaicml/composer-team-admins

# Require team approval for code changes
/llmfoundry/ @mosaicml/composer-team-eng
/scripts/ @mosaicml/composer-team-eng

# Require admin approval to change the CI build configuration
# All CI Changes should be reviewed for security
/.ci/ @mosaicml/composer-team-admins
/.github/ @mosaicml/composer-team-admins
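The ownership rules above can be illustrated with a minimal sketch of how CODEOWNERS resolution behaves — assuming simplified GitHub semantics where the last matching pattern wins, `/*` matches only files directly in the repository root, and directory patterns match by prefix (the helper `owners` and the hard-coded rule list are hypothetical, for illustration only):

```python
# Sketch of simplified CODEOWNERS matching (assumption: last matching
# pattern wins; "/*" matches root-level files only; directory patterns
# match by path prefix). Not GitHub's actual implementation.
rules = [
    ("/*", ["@mosaicml/composer-team-admins"]),        # root files only
    ("/llmfoundry/", ["@mosaicml/composer-team-eng"]),
    ("/scripts/", ["@mosaicml/composer-team-eng"]),
    ("/.ci/", ["@mosaicml/composer-team-admins"]),
    ("/.github/", ["@mosaicml/composer-team-admins"]),
]

def owners(path: str) -> list[str]:
    result: list[str] = []
    for pattern, teams in rules:
        if pattern == "/*":
            # Matches files directly in the repo root (no slash in path)
            if "/" not in path.strip("/"):
                result = teams
        elif path.startswith(pattern.lstrip("/")):
            result = teams
    return result
```

Under these assumptions, `setup.py` resolves to the admins team, while `llmfoundry/models/mpt.py` resolves to the engineering team.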
133 changes: 0 additions & 133 deletions .github/mcp/mcp_pytest.py

This file was deleted.

27 changes: 13 additions & 14 deletions .github/workflows/code-quality.yaml
@@ -20,24 +20,23 @@ defaults:
 jobs:
   code-quality:
     runs-on: ubuntu-20.04
-    timeout-minutes: 20
+    timeout-minutes: 30
     strategy:
       matrix:
         python_version:
-        - '3.9'
-        - '3.10'
+        - "3.9"
+        - "3.10"
         pip_deps:
-        - '[dev]'
+        - "[dev]"
     steps:
     - uses: actions/checkout@v3
-    - uses: actions/setup-python@v4
+    - name: Get composite run steps repository
+      uses: actions/checkout@v3
       with:
-        python-version: ${{ matrix.python_version }}
-    - name: Setup
-      run: |
-        set -ex
-        python -m pip install --upgrade 'pip<23' wheel
-        python -m pip install --upgrade .${{ matrix.pip_deps }}
-    - name: Run checks
-      run: |
-        pre-commit run --all-files
+        repository: mosaicml/ci-testing
+        ref: v0.0.9
+        path: ./ci-testing
+    - uses: ./ci-testing/.github/actions/code-quality
+      with:
+        python_version: ${{ matrix.python_version }}
+        pip_deps: ${{ matrix.pip_deps }}
71 changes: 0 additions & 71 deletions .github/workflows/codeql-analysis.yml

This file was deleted.

25 changes: 8 additions & 17 deletions .github/workflows/coverage.yaml
@@ -12,21 +12,12 @@ jobs:
     steps:
     - name: Checkout Repo
       uses: actions/checkout@v3
-    - name: Setup
-      run: |
-        set -ex
-        python -m pip install --upgrade 'pip<23' wheel
-        pip install coverage[toml]==6.5.0
-    - name: Download artifacts
-      uses: actions/download-artifact@v3
+    - name: Get composite run steps repository
+      uses: actions/checkout@v3
       with:
-        path: ${{ inputs.download-path }}
-    - name: Generate coverage report
-      run: |
-        set -ex
-
-        # Flatten the coverage files
-        ls ${{ inputs.download-path }} | while read x; do mv ${{ inputs.download-path }}/$x/.coverage .coverage.$x; done
-
-        python -m coverage combine
-        python -m coverage report
+        repository: mosaicml/ci-testing
+        ref: v0.0.9
+        path: ./ci-testing
+    - uses: ./ci-testing/.github/actions/coverage
+      with:
+        download-path: ${{ inputs.download-path }}
43 changes: 15 additions & 28 deletions .github/workflows/docker.yaml
@@ -7,34 +7,22 @@ on:
     branches:
     - main
     paths:
-    - ./Dockerfile
+    - Dockerfile
    - .github/workflows/docker.yaml
   workflow_dispatch: {}
 jobs:
   docker-build:
-    runs-on: ubuntu-latest
+    runs-on: mosaic-8wide
     if: github.repository_owner == 'mosaicml'
     strategy:
       matrix:
         include:
-        - name: '1.13.1_cu117'
-          base_image: mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04
-          dep_groups: '[gpu]'
-        - name: '2.0.1_cu118'
-          base_image: mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04
-          dep_groups: '[gpu]'
-        - name: '2.1.0_cu121'
-          base_image: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04
-          dep_groups: '[gpu]'
-        - name: '2.1.0_cu121_flash2'
-          base_image: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04
-          dep_groups: '[gpu-flash2]'
-        - name: '2.1.0_cu121_aws'
-          base_image: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04-aws
-          dep_groups: '[gpu]'
-        - name: '2.1.0_cu121_flash2_aws'
-          base_image: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04-aws
-          dep_groups: '[gpu-flash2]'
+        - name: "2.3.1_cu121"
+          base_image: mosaicml/pytorch:2.3.1_cu121-python3.11-ubuntu20.04
+          dep_groups: "[gpu]"
+        - name: "2.3.1_cu121_aws"
+          base_image: mosaicml/pytorch:2.3.1_cu121-python3.11-ubuntu20.04-aws
+          dep_groups: "[gpu]"
     steps:
     - name: Maximize Build Space on Worker
       uses: easimon/maximize-build-space@v4
@@ -69,19 +57,17 @@ jobs:
         GIT_SHA=$(echo ${{ github.sha }} | cut -c1-7)
         echo "IMAGE_TAG=${GIT_SHA}" >> ${GITHUB_ENV}

-        if [ "${{ github.event_name }}" == "push" ]; then
-          echo "Triggered by push event."
-          PROD_REPO="mosaicml/llm-foundry"
-          IMAGE_TAG="${PROD_REPO}:${{matrix.name}}-${GIT_SHA},${PROD_REPO}:${{matrix.name}}-latest"
-          IMAGE_CACHE="${PROD_REPO}:${{matrix.name}}-buildcache"
-        elif [ "${{ github.event_name }}" == "pull_request" ]; then
+        if [ "${{ github.event_name }}" == "pull_request" ]; then
           echo "Triggered by pull_request event."
           STAGING_REPO="mosaicml/ci-staging"
           IMAGE_TAG="${STAGING_REPO}:${{matrix.name}}-${GIT_SHA}"
           IMAGE_CACHE="${STAGING_REPO}:${{matrix.name}}-buildcache"
         else
-          echo "Triggered by unknown event: ${{ github.event_name }}"
-          exit 1
+          # Triggered by push or workflow_dispatch event
+          echo "Triggered by ${{ github.event_name }} event, releasing to prod"
+          PROD_REPO="mosaicml/llm-foundry"
+          IMAGE_TAG="${PROD_REPO}:${{matrix.name}}-${GIT_SHA},${PROD_REPO}:${{matrix.name}}-latest"
+          IMAGE_CACHE="${PROD_REPO}:${{matrix.name}}-buildcache"
         fi

         echo "IMAGE_TAG=${IMAGE_TAG}" >> ${GITHUB_ENV}
@@ -96,5 +82,6 @@ jobs:
           cache-from: type=registry,ref=${{ env.IMAGE_CACHE }}
           cache-to: type=registry,ref=${{ env.IMAGE_CACHE }},mode=max
           build-args: |
+            BRANCH_NAME=${{ github.head_ref || github.ref_name }}
             BASE_IMAGE=${{ matrix.base_image }}
             DEP_GROUPS=${{ matrix.dep_groups }}
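The event-based image tagging in the docker workflow above — staging tags for pull requests, a prod tag plus `-latest` for push and workflow_dispatch — can be sketched as a standalone helper (the function name and arguments are hypothetical; the repo names and tag formats come from the workflow):

```python
# Sketch of the docker workflow's tag-selection logic (assumption:
# event names mirror GitHub Actions' github.event_name values).
def image_tags(event_name: str, name: str, git_sha: str) -> str:
    if event_name == "pull_request":
        # PR builds go to the staging repo with a SHA-only tag.
        repo = "mosaicml/ci-staging"
        return f"{repo}:{name}-{git_sha}"
    # push / workflow_dispatch builds release to prod with a
    # SHA tag and a rolling "-latest" tag.
    repo = "mosaicml/llm-foundry"
    return f"{repo}:{name}-{git_sha},{repo}:{name}-latest"
```

For example, a push build of the `2.3.1_cu121` image at short SHA `abc1234` yields both `mosaicml/llm-foundry:2.3.1_cu121-abc1234` and `mosaicml/llm-foundry:2.3.1_cu121-latest`.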