Partial fix for hard-coded GCST #101 #109

Open
wants to merge 1 commit into
base: master

joshchiou

There are currently multiple points in the pipeline where errors are raised because the study name is expected to start with "GCST". This fixes one of them, so that the study name in the metadata YAML file no longer needs to match GCST*.

The harmonization pipeline still exits with an error at the end when it is collecting metrics. I believe this is because GCST is also hard-coded in gwas_sumstats_tools.schema.metadata.SumStatsMetadata from gwas-sumstats-tools (https://pypi.org/project/gwas-sumstats-tools/).

Nextflow log:

executor >  slurm (56)
[e2/d0e39e] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 2 of 2, failed: 1...
[9f/2106e9] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 25 of 25 ✔
[1a/0c48b1] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 1 of 1 ✔
[-        ] process > NFCORE_GWASCATALOGHARM:GWAS... -
[-        ] process > NFCORE_GWASCATALOGHARM:GWAS... -
[05/b5b195] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 25 of 25, failed:...
[35/3f542c] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 1 of 1 ✔
[57/ee4335] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 1 of 1 ✔
[80/7c50b8] process > NFCORE_GWASCATALOGHARM:GWAS... [100%] 1 of 1, failed: 1 ✔
[80/7c50b8] NOTE: Process `NFCORE_GWASCATALOGHARM:GWASCATALOGHARM:quality_control:harmonization_log (CAD)` terminated with an error exit status (1) -- Error is ignored
Completed at: 26-Sep-2024 22:06:02
Duration    : 1h 26m 20s
CPU hours   : 10.1 (0.8% failed)
Succeeded   : 51
Ignored     : 4
Failed      : 5

.command.log from failed job:

```
Traceback (most recent call last):
  File "/bin/gwas_metadata.py", line 71, in <module>
    main()
  File "/bin/gwas_metadata.py", line 60, in main
    m.from_file()
  File "/bin/gwas_metadata.py", line 21, in from_file
    self.update_metadata(self._meta_dict)
  File "/bin/gwas_metadata.py", line 34, in update_metadata
    self.metadata = self.metadata.parse_obj(self._meta_dict)
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 2 validation errors for SumStatsMetadata
gwas_id
  string does not match regex "^GCST\d+$" (type=value_error.str.regex; pattern=^GCST\d+$)
samples
  value is not a valid list (type=type_error.list)
```
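
The two validation errors in this traceback can be reproduced without running the pipeline. The following is a minimal sketch using only Python's standard library; it is not the gwas-sumstats-tools implementation. The pattern `^GCST\d+$` is copied from the traceback. The function `check_metadata`, the sample metadata values, and the relaxed pattern are hypothetical and only illustrate why a study named "CAD" is rejected. The list check is also simplified compared with what pydantic accepts.

```python
import re

# Pattern copied from the ValidationError in the traceback.
GCST_PATTERN = re.compile(r"^GCST\d+$")

# Hypothetical relaxed pattern: any non-empty run of letters, digits, '_' or '-'.
# This is an assumption for illustration, not a pattern used by the pipeline.
RELAXED_PATTERN = re.compile(r"^[A-Za-z0-9_-]+$")


def check_metadata(meta, id_pattern=GCST_PATTERN):
    """Return error strings mimicking the two checks that failed in the log."""
    errors = []
    gwas_id = meta.get("gwas_id")
    if not isinstance(gwas_id, str) or not id_pattern.match(gwas_id):
        errors.append("gwas_id: string does not match regex " + id_pattern.pattern)
    # Simplified: only a plain list is treated as valid here.
    if not isinstance(meta.get("samples"), list):
        errors.append("samples: value is not a valid list")
    return errors


# Hypothetical metadata resembling the failing run (study named "CAD",
# with samples given as a mapping instead of a list).
bad = {"gwas_id": "CAD", "samples": {"sample_size": 1000}}
print(check_metadata(bad))

# Same study name, samples wrapped in a list, checked against the relaxed pattern.
fixed = {"gwas_id": "CAD", "samples": [{"sample_size": 1000}]}
print(check_metadata(fixed, id_pattern=RELAXED_PATTERN))
```

With the default pattern, `fixed` would still fail the `gwas_id` check, which matches the observation that the study name is constrained in more than one place.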

@jiyue1214 (Collaborator) left a comment

Thank you for your suggestions and contribution! Currently, input naming is hard-coded to include a shortened identifier, which makes job monitoring easier. In an upcoming version, we plan to support more flexible naming conventions and to account for two user scenarios.
