Sync <- Mlperf inference for November release #585

Merged: 75 commits, Nov 22, 2024
Changes from 74 commits
Commits (75)
973e9c2
take user conf and measurements json from scenario folder
anandhu-eng Nov 14, 2024
5e7f196
folder creation
anandhu-eng Nov 14, 2024
4f8818e
mlperf conf is not required
anandhu-eng Nov 14, 2024
b046b5e
take system meta from sut root
anandhu-eng Nov 14, 2024
0c7a5d4
get framework from sut info if not filled
anandhu-eng Nov 14, 2024
2cbb243
fix typo
anandhu-eng Nov 14, 2024
95e2634
throw error if measurements or user conf is missing
anandhu-eng Nov 14, 2024
a563ef0
paths to files rearranged
anandhu-eng Nov 14, 2024
cb1a504
commit for backwards compatibility
anandhu-eng Nov 19, 2024
3318471
fix typo
anandhu-eng Nov 19, 2024
7f95181
update the response
anandhu-eng Nov 19, 2024
b6f07c0
commit to make user conf backward compatible
anandhu-eng Nov 19, 2024
f90af77
optimize user conf copy logic
anandhu-eng Nov 19, 2024
8f07511
measurements.json made backward compatible
anandhu-eng Nov 19, 2024
9481038
changes reverted
anandhu-eng Nov 19, 2024
c54f4ba
bug fix
anandhu-eng Nov 19, 2024
4fe0fbe
code optimization
anandhu-eng Nov 19, 2024
51963e2
eliminate checking in compliance folders
anandhu-eng Nov 19, 2024
5980174
Merge branch 'mlperf-inference' into issue-#542
anandhu-eng Nov 19, 2024
fd0904e
fix unmatched fstring
anandhu-eng Nov 19, 2024
c6e9d42
test commit
anandhu-eng Nov 19, 2024
751856f
default division set to closed
anandhu-eng Nov 19, 2024
4cc558b
optimize system_meta part
anandhu-eng Nov 19, 2024
741c482
switch off submission preprocessing - short run
anandhu-eng Nov 19, 2024
10e5b1a
fix typo
anandhu-eng Nov 19, 2024
d7f46a5
fix typo
anandhu-eng Nov 19, 2024
48df9dc
test commit
anandhu-eng Nov 19, 2024
c7f8c1b
provided absolute path
anandhu-eng Nov 19, 2024
15196a0
added extra run args
anandhu-eng Nov 19, 2024
158d91a
added new case - closed
anandhu-eng Nov 19, 2024
aa6df9c
test case repo mapped to mlcommons
anandhu-eng Nov 19, 2024
17d6fb7
handle suts with different sut meta files
anandhu-eng Nov 19, 2024
a563476
add extra run args for skipping sdxl accuracy checks
anandhu-eng Nov 19, 2024
c3ba940
added rest of test cases for closed division
anandhu-eng Nov 19, 2024
0fb76e0
updated description
anandhu-eng Nov 19, 2024
ce74af3
additional tag for dataset sampling
anandhu-eng Nov 19, 2024
99b504a
Merge pull request #576 from mlcommons/mixtral-ghaction-patch
arjunsuresh Nov 19, 2024
71b0876
match test query count with sampled data
anandhu-eng Nov 20, 2024
9efa46f
Merge pull request #577 from mlcommons/patch-mixtral
arjunsuresh Nov 20, 2024
af749bd
Update default-config.yaml
arjunsuresh Nov 21, 2024
ab3f900
Update default-config.yaml
arjunsuresh Nov 21, 2024
3ce2b6a
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 21, 2024
e27bede
Merge pull request #578 from GATEOverflow/mlperf-inference
arjunsuresh Nov 21, 2024
bb6154d
Fix sdxl ss model generation for Nvidia mlperf inference
arjunsuresh Nov 21, 2024
fb2eb99
Improvements to docker run command in detached mode
arjunsuresh Nov 21, 2024
1fcec02
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 21, 2024
b770ea4
Merge pull request #579 from GATEOverflow/mlperf-inference
arjunsuresh Nov 21, 2024
255c63c
Use absolute paths for docker mounts
arjunsuresh Nov 21, 2024
539e4a8
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 21, 2024
974fdbc
Merge pull request #580 from GATEOverflow/mlperf-inference
arjunsuresh Nov 21, 2024
5863f42
Add TEST04 for sdxl
arjunsuresh Nov 21, 2024
bc3eb3f
Fixed the dependencies for MLPerf inference SDXL accuracy script
arjunsuresh Nov 21, 2024
ab1b7b4
Added rerun option for SDXL accuracy script
arjunsuresh Nov 21, 2024
c74c9c8
Update test-cm-based-submission-generation.yml
arjunsuresh Nov 21, 2024
889d67b
Added cm-deps.mmd and cm-deps.png to submission generation
arjunsuresh Nov 21, 2024
6638a79
Merge pull request #574 from mlcommons/issue-#542
arjunsuresh Nov 21, 2024
e55fdcc
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 21, 2024
9a18419
Merge pull request #582 from GATEOverflow/mlperf-inference
arjunsuresh Nov 21, 2024
d1525af
Use edge as the default system in run-mlperf-inference-app
arjunsuresh Nov 21, 2024
42c0b6f
Added use_model_from_host option for run-mlperf-inference-app
arjunsuresh Nov 21, 2024
58cf0b8
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 21, 2024
29e0974
Fix typo in benchmark-program
arjunsuresh Nov 21, 2024
b37b7dd
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 22, 2024
94e127d
Merge pull request #583 from GATEOverflow/mlperf-inference
arjunsuresh Nov 22, 2024
3061b78
Added _full for nvidia mlperf inference sdxl
arjunsuresh Nov 22, 2024
dc86953
Added auto clean of partial dataset for coco2014 for a full run
arjunsuresh Nov 22, 2024
441859d
Added auto clean of partial dataset for coco2014 for a full run
arjunsuresh Nov 22, 2024
3b466ad
Added auto clean of partial dataset for coco2014 for a full run
arjunsuresh Nov 22, 2024
2c774da
Added auto clean of partial dataset for coco2014 for a full run
arjunsuresh Nov 22, 2024
2fbabe6
Renamed _cm.json files to _cm.yaml
arjunsuresh Nov 22, 2024
61f312d
Updated _cm.yaml contents from _cm.json
arjunsuresh Nov 22, 2024
b12f055
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 22, 2024
9fd8d2d
Merge pull request #584 from GATEOverflow/mlperf-inference
arjunsuresh Nov 22, 2024
2abf849
Merge branch 'main' into mlperf-inference
arjunsuresh Nov 22, 2024
9f79f45
Update VERSION
arjunsuresh Nov 22, 2024
22 changes: 16 additions & 6 deletions .github/workflows/test-cm-based-submission-generation.yml
@@ -19,7 +19,7 @@ jobs:
python-version: [ "3.12" ]
division: ["closed", "open", "closed-open"]
category: ["datacenter", "edge"]
case: ["case-3", "case-7", "case-8"]
case: ["closed", "closed-no-compliance", "closed-power", "closed-failed-power-logs", "case-3", "case-7", "case-8"]
action: ["run", "docker"]
exclude:
- os: macos-latest
@@ -37,21 +37,31 @@
cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
- name: Pull repo where test cases are uploaded
run: |
-git clone -b submission-generation-tests https://github.com/anandhu-eng/inference.git submission_generation_tests
+git clone -b submission-generation-tests https://github.com/mlcommons/inference.git submission_generation_tests
- name: Run Submission Generation - ${{ matrix.case }} ${{ matrix.action }} ${{ matrix.category }} ${{ matrix.division }}
run: |
if [ "${{ matrix.case }}" == "case-3" ]; then
#results_dir="submission_generation_tests/case-3/"
description="Submission generation (model_mapping.json not present but model name matches with official one)"
elif [ "${{ matrix.case }}" == "case-7" ]; then
#results_dir="submission_generation_tests/case-7/"
description="Submission generation (sut_info.json incomplete, SUT folder name in required format)"
elif [ "${{ matrix.case }}" == "case-8" ]; then
#results_dir="submission_generation_tests/case-8/"
extra_run_args=" --category=datacenter"
description="Submission generation (system_meta.json not found in results folder)"
elif [ "${{ matrix.case }}" == "closed" ]; then
extra_run_args=" --env.CM_MLPERF_SUBMISSION_CHECKER_EXTRA_ARGS="--skip-extra-accuracy-files-check""
description="Test submission - contains closed edge and datacenter"
elif [ "${{ matrix.case }}" == "closed-no-compliance" ]; then
extra_run_args=" --env.CM_MLPERF_SUBMISSION_CHECKER_EXTRA_ARGS="--skip-extra-accuracy-files-check""
description="Test submission - contains closed edge and datacenter with no compliance tests"
elif [ "${{ matrix.case }}" == "closed-power" ]; then
extra_run_args=" --env.CM_MLPERF_SUBMISSION_CHECKER_EXTRA_ARGS="--skip-extra-accuracy-files-check""
description="Test submission - contains closed-power edge and datacenter results"
elif [ "${{ matrix.case }}" == "closed-failed-power-logs" ]; then
extra_run_args=" --env.CM_MLPERF_SUBMISSION_CHECKER_EXTRA_ARGS="--skip-extra-accuracy-files-check""
description="Test submission - contains closed-power edge and datacenter results with failed power logs"
fi
# Dynamically set the log group to simulate a dynamic step name
echo "::group::$description"
-cm ${{ matrix.action }} script --tags=generate,inference,submission --clean --preprocess_submission=yes --results_dir=submission_generation_tests/${{ matrix.case }}/ --run-checker --submitter=MLCommons --tar=yes --env.CM_TAR_OUTFILE=submission.tar.gz --division=${{ matrix.division }} --category=${{ matrix.category }} --env.CM_DETERMINE_MEMORY_CONFIGURATION=yes --quiet
+cm ${{ matrix.action }} script --tags=generate,inference,submission --adr.submission-checker-src.tags=_branch.dev --clean --preprocess_submission=yes --results_dir=$PWD/submission_generation_tests/${{ matrix.case }}/ --run-checker --submitter=MLCommons --tar=yes --env.CM_TAR_OUTFILE=submission.tar.gz --division=${{ matrix.division }} --env.CM_DETERMINE_MEMORY_CONFIGURATION=yes --quiet $extra_run_args
echo "::endgroup::"

2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-mixtral.yml
@@ -30,5 +30,5 @@ jobs:
git config --global credential.helper store
huggingface-cli login --token ${{ secrets.HF_TOKEN }} --add-to-git-credential
cm pull repo
-cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=mixtral-8x7b --implementation=reference --batch_size=1 --precision=${{ matrix.precision }} --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --docker --quiet --test_query_count=1 --target_qps=0.001 --clean --env.CM_MLPERF_MODEL_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --env.CM_MLPERF_DATASET_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes
+cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=mixtral-8x7b --implementation=reference --batch_size=1 --precision=${{ matrix.precision }} --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --docker --quiet --test_query_count=3 --target_qps=0.001 --clean --env.CM_MLPERF_MODEL_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --env.CM_MLPERF_DATASET_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --adr.openorca-mbxp-gsm8k-combined-preprocessed.tags=_generate-test-data.1
cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_test_submissions_v5.0 --repo_branch=main --commit_message="Results from self hosted Github actions - GO-phoenix" --quiet --submission_dir=$HOME/gh_action_submissions
@@ -39,6 +39,6 @@ jobs:
pip install --upgrade cm4mlops
cm pull repo

-cm run script --tags=run-mlperf,inference,_all-scenarios,_submission,_full,_r4.1-dev --preprocess_submission=yes --adr.submission-checker-src.tags=_branch.dev --execution_mode=valid --gpu_name=rtx_4090 --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=$hw_name --implementation=nvidia --backend=tensorrt --category=datacenter,edge --division=closed --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=cuda --use_dataset_from_host=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet
+cm run script --tags=run-mlperf,inference,_all-scenarios,_submission,_full,_r4.1-dev --preprocess_submission=yes --adr.submission-checker-src.tags=_branch.dev --execution_mode=valid --gpu_name=rtx_4090 --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=$hw_name --implementation=nvidia --backend=tensorrt --category=datacenter,edge --division=closed --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=cuda --use_model_from_host=yes --use_dataset_from_host=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet

cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_unofficial_submissions_v5.0 --repo_branch=main --commit_message="Results from GH action on NVIDIA_$hw_name" --quiet --submission_dir=$HOME/gh_action_submissions --hw_name=$hw_name
25 changes: 0 additions & 25 deletions script/activate-python-venv/_cm.json

This file was deleted.

18 changes: 18 additions & 0 deletions script/activate-python-venv/_cm.yaml
@@ -0,0 +1,18 @@
alias: activate-python-venv
automation_alias: script
automation_uid: 5b4e0237da074764
category: Python automation
developers: '[Grigori Fursin](https://cKnowledge.org/gfursin)'
name: Activate virtual Python environment
prehook_deps:
- names:
  - python-venv
  reuse_version: true
  tags: install,python-venv
tags:
- activate
- python
- activate-python-venv
- python-venv
tags_help: activate python-venv
uid: fcbbb84946f34c55
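
This is one of the _cm.json files renamed to _cm.yaml in this PR, with the metadata (alias, uid, tags, prehook deps) carried over unchanged. A minimal usage sketch, assuming the script can still be invoked through the tags listed above (this command is illustrative and does not appear in the diff):

    # Assumed invocation via the script's tags (see tags_help above)
    cm run script --tags=activate,python-venv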
46 changes: 0 additions & 46 deletions script/app-image-classification-tf-onnx-cpp/_cm.json

This file was deleted.

27 changes: 27 additions & 0 deletions script/app-image-classification-tf-onnx-cpp/_cm.yaml
@@ -0,0 +1,27 @@
alias: app-image-classification-tf-onnx-cpp
automation_alias: script
automation_uid: 5b4e0237da074764
category: Modular AI/ML application pipeline
default_env:
  CM_BATCH_COUNT: '1'
  CM_BATCH_SIZE: '1'
deps:
- tags: detect,os
- tags: get,sys-utils-cm
- tags: get,gcc
- tags: get,dataset,image-classification,original
- tags: get,dataset-aux,image-classification
- tags: get,ml-model,raw,image-classification,resnet50,_onnx,_opset-11
- tags: tensorflow,from-src
  version: v2.0.0
tags:
- app
- image-classification
- tf
- tensorflow
- tf-onnx
- tensorflow-onnx
- onnx
- cpp
tags_help: app image-classification cpp tensorflow onnx
uid: 879ed32e47074033
86 changes: 0 additions & 86 deletions script/app-image-classification-torch-py/_cm.json

This file was deleted.

46 changes: 46 additions & 0 deletions script/app-image-classification-torch-py/_cm.yaml
@@ -0,0 +1,46 @@
alias: app-image-classification-torch-py
automation_alias: script
automation_uid: 5b4e0237da074764
category: Modular AI/ML application pipeline
default_env:
  CM_BATCH_COUNT: '1'
  CM_BATCH_SIZE: '1'
deps:
- tags: detect,os
- names:
  - python
  - python3
  tags: get,python3
- tags: get,dataset,imagenet,image-classification,preprocessed
- tags: get,dataset-aux,imagenet-aux,image-classification
- tags: get,imagenet-helper
- tags: get,ml-model,image-classification,resnet50,_pytorch,_fp32
- skip_if_env:
    USE_CUDA:
    - 'yes'
  tags: get,generic-python-lib,_torch
- enable_if_env:
    USE_CUDA:
    - 'yes'
  tags: get,generic-python-lib,_torch_cuda
- skip_if_env:
    USE_CUDA:
    - 'yes'
  tags: get,generic-python-lib,_torchvision
- enable_if_env:
    USE_CUDA:
    - 'yes'
  tags: get,generic-python-lib,_torchvision_cuda
tags:
- app
- image-classification
- torch
- python
tags_help: app image-classification python torch
uid: e3986ae887b84ca8
variations:
  cuda:
    deps:
    - tags: get,cuda
    env:
      USE_CUDA: 'yes'
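
The cuda variation above sets USE_CUDA to 'yes', which the enable_if_env/skip_if_env guards use to swap the plain torch/torchvision dependencies for their CUDA builds. A sketch of how one might select it, assuming the _<variation> tag convention used elsewhere in this PR (these commands are illustrative and do not appear in the diff):

    # CPU run: default deps resolve to _torch and _torchvision
    cm run script --tags=app,image-classification,torch,python

    # CUDA run: the _cuda suffix picks the "cuda" variation, which sets USE_CUDA=yes,
    # adds the get,cuda dep and switches to the _torch_cuda/_torchvision_cuda deps
    cm run script --tags=app,image-classification,torch,python,_cuda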