Add publication info to assembly or BioProject fields. #369

Open
conchoecia opened this issue May 24, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@conchoecia

Is your feature request related to a problem? Please describe.
I would like to be able to quickly determine what research article I should reference simply from looking at the output of datasets for a particular genome assembly or BioProject. This publication information is often on the NCBI website, but when I programmatically access the same accession/BioProject, there is no reference to the publication.

For example, on the BioProject page for the common carp, there is a paper to cite: Chang YS et al., "The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome.", J Mol Evol, 1994 Feb;38(2):138-55

However, when I query the same BioProject on the command line, there is no publication information:

datasets summary genome accession PRJNA682709

{"reports": [{"accession":"GCF_018340385.1","annotation_info":{"busco":{"busco_lineage":"actinopterygii_odb10","busco_ver":"4.1.4","complete":0.98571426,"duplicated":0.62445056,"fragmented":0.0054945056,"missing":0.008791209,"single_copy":0.36126372,"total_count":"3640"},"method":"Best-placed RefSeq; Gnomon","name":"NCBI Cyprinus carpio Annotation Release 101","pipeline":"NCBI eukaryotic genome annotation pipeline","provider":"NCBI RefSeq","release_date":"2021-07-20","report_url":"https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Cyprinus_carpio/101","software_version":"9.0","stats":{"gene_counts":{"non_coding":9553,"other":301,"protein_coding":43531,"pseudogene":6174,"total":59559}},"status":"Full annotation"},"assembly_info":{"assembly_level":"Chromosome","assembly_method":"wtdbg v. 2; quickmerge v. 1","assembly_name":"ASM1834038v1","assembly_status":"current","assembly_type":"haploid","bioproject_accession":"PRJNA682709","bioproject_lineage":[{"bioprojects":[{"accession":"PRJNA682709","title":"Cyprinus carpio isolate:SPL01 Genome sequencing and assembly"}]}],"biosample":{"accession":"SAMN17005855","attributes":[{"name":"isolate","value":"SPL01"},{"name":"dev_stage","value":"adult"},{"name":"sex","value":"not collected"},{"name":"tissue","value":"muscle"}],"bioprojects":[{"accession":"PRJNA682709"}],"description":{"organism":{"organism_name":"Cyprinus carpio","tax_id":7962},"title":"Model organism or animal sample from Cyprinus carpio"},"last_updated":"2021-05-13T06:39:37.300","models":["Model organism or animal"],"owner":{"contacts":[{}],"name":"Chinese Academy of Fishery Sciences"},"package":"Model.organism.animal.1.0","publication_date":"2021-05-13T06:39:37.300","sample_ids":[{"label":"Sample 
name","value":"Common_carp"}],"status":{"status":"live","when":"2021-05-13T06:39:37.300"},"submission_date":"2020-12-05T07:09:04.477"},"blast_url":"https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch\u0026PROG_DEF=blastn\u0026BLAST_SPEC=GDH_GCF_018340385.1","paired_assembly":{"accession":"GCA_018340385.1","only_genbank":"7 unlocalized scaffolds on chromosome MT","only_refseq":"chromosome MT","status":"current"},"refseq_category":"representative genome","release_date":"2021-05-12","sequencing_tech":"PacBio; Oxford Nanopore; Illumina HiSeq","submitter":"Chinese Academy of Fishery Sciences"},"assembly_stats":{"contig_l50":229,"contig_n50":1558716,"gc_count":"620441384","gc_percent":37,"genome_coverage":"184.8x","number_of_component_sequences":6700,"number_of_contigs":19837,"number_of_organelles":1,"number_of_scaffolds":6700,"scaffold_l50":24,"scaffold_n50":29545497,"total_number_of_chromosomes":50,"total_sequence_length":"1680118328","total_ungapped_length":"1672146419"},"current_accession":"GCF_018340385.1","organelle_info":[{"description":"Mitochondrion","submitter":"Chinese Academy of Fishery Sciences","total_seq_length":"16575"}],"organism":{"common_name":"common carp","infraspecific_names":{"isolate":"SPL01"},"organism_name":"Cyprinus carpio","tax_id":7962},"paired_accession":"GCA_018340385.1","source_database":"SOURCE_DATABASE_REFSEQ","wgs_info":{"master_wgs_url":"https://www.ncbi.nlm.nih.gov/nuccore/JAEOAB000000000.1","wgs_contigs_url":"https://www.ncbi.nlm.nih.gov/Traces/wgs/JAEOAB01","wgs_project_accession":"JAEOAB01"}},{"accession":"GCA_018340385.1","assembly_info":{"assembly_level":"Chromosome","assembly_method":"wtdbg v. 2; quickmerge v. 
1","assembly_name":"ASM1834038v1","assembly_status":"current","assembly_type":"haploid","bioproject_accession":"PRJNA682709","bioproject_lineage":[{"bioprojects":[{"accession":"PRJNA682709","title":"Cyprinus carpio isolate:SPL01 Genome sequencing and assembly"}]}],"biosample":{"accession":"SAMN17005855","attributes":[{"name":"isolate","value":"SPL01"},{"name":"dev_stage","value":"adult"},{"name":"sex","value":"not collected"},{"name":"tissue","value":"muscle"}],"bioprojects":[{"accession":"PRJNA682709"}],"description":{"organism":{"organism_name":"Cyprinus carpio","tax_id":7962},"title":"Model organism or animal sample from Cyprinus carpio"},"last_updated":"2021-05-13T06:39:37.300","models":["Model organism or animal"],"owner":{"contacts":[{}],"name":"Chinese Academy of Fishery Sciences"},"package":"Model.organism.animal.1.0","publication_date":"2021-05-13T06:39:37.300","sample_ids":[{"label":"Sample name","value":"Common_carp"}],"status":{"status":"live","when":"2021-05-13T06:39:37.300"},"submission_date":"2020-12-05T07:09:04.477"},"blast_url":"https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch\u0026PROG_DEF=blastn\u0026BLAST_SPEC=GDH_GCA_018340385.1","paired_assembly":{"accession":"GCF_018340385.1","annotation_name":"NCBI Cyprinus carpio Annotation Release 101","only_genbank":"7 unlocalized scaffolds on chromosome MT","only_refseq":"chromosome MT","status":"current"},"release_date":"2021-05-12","sequencing_tech":"PacBio; Oxford Nanopore; Illumina HiSeq","submitter":"Chinese Academy of Fishery 
Sciences"},"assembly_stats":{"contig_l50":229,"contig_n50":1558716,"gc_count":"620441384","gc_percent":37,"genome_coverage":"184.8x","number_of_component_sequences":6700,"number_of_contigs":19837,"number_of_organelles":1,"number_of_scaffolds":6700,"scaffold_l50":24,"scaffold_n50":29545497,"total_number_of_chromosomes":50,"total_sequence_length":"1680118328","total_ungapped_length":"1672146419"},"current_accession":"GCA_018340385.1","organelle_info":[{"description":"Mitochondrion","submitter":"Chinese Academy of Fishery Sciences"}],"organism":{"common_name":"common carp","infraspecific_names":{"isolate":"SPL01"},"organism_name":"Cyprinus carpio","tax_id":7962},"paired_accession":"GCF_018340385.1","source_database":"SOURCE_DATABASE_GENBANK","wgs_info":{"master_wgs_url":"https://www.ncbi.nlm.nih.gov/nuccore/JAEOAB000000000.1","wgs_contigs_url":"https://www.ncbi.nlm.nih.gov/Traces/wgs/JAEOAB01","wgs_project_accession":"JAEOAB01"}}],"total_count": 2}
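
To make the gap concrete, here is a minimal sketch that walks a report like the one above and lists every key whose name contains "publication". The `SAMPLE` string is a hand-trimmed excerpt of the output above (one report, a few fields), and `find_keys` is a hypothetical helper written for this illustration, not part of the datasets tool.

```python
import json


def find_keys(node, needle, path=""):
    """Recursively yield (path, value) for every dict key containing `needle`."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if needle in key.lower():
                yield child, value
            yield from find_keys(value, needle, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_keys(item, needle, f"{path}[{i}]")


# Hand-trimmed excerpt of the report shown above (assumption: only a few
# fields are kept for brevity; the real output has many more).
SAMPLE = """
{"reports": [{"accession": "GCF_018340385.1",
  "assembly_info": {"bioproject_accession": "PRJNA682709",
    "biosample": {"accession": "SAMN17005855",
      "publication_date": "2021-05-13T06:39:37.300"},
    "submitter": "Chinese Academy of Fishery Sciences"}}]}
"""

hits = list(find_keys(json.loads(SAMPLE), "publication"))
for path, value in hits:
    print(path, "=", value)

# Keys that could plausibly hold a citation, i.e. anything that is not a date.
citation_hits = [(p, v) for p, v in hits if not p.endswith("_date")]
print("citation-like fields:", citation_hits)
```

On this excerpt the only match is the BioSample `publication_date`, which is a release timestamp rather than an article reference. To run the same check on the full report, the command's output could be redirected to a file and loaded with `json.load` in place of `SAMPLE`.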

Describe the solution you'd like
It would be nice if there were a meaningful "publication" field that was the publication that should be cited when that data source is used!

Thank you-

@conchoecia conchoecia added the enhancement New feature or request label May 24, 2024
@olearyna
Contributor

Hi conchoecia,

Thank you for opening this issue. While we won't be able to get to it any time soon, please know that we are exploring better ways to enhance data attribution. We'll keep this issue open until it is resolved.

Nuala

Nuala A. O'Leary, PhD
Product Owner, NCBI Datasets
National Center for Biotechnology Information, NLM, NIH, DHHS

@conchoecia
Author

Thanks for the response, @olearyna! Good luck with the development.
