-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Miscellaneous VCF formatting and typo fixes #391
Conversation
Fixes samtools#382. Fix formatting of ID and REF+ALT as separate table rows.
In particular, use a string that is not subject to change! Hat tip Thomas Colthurst. Closes samtools#363. Also fix a|b formatting and note that the inline/overflow boundary is >= 15 rather than > 15.
Also avoid using \textbackslash within \texttt{}.
Move the example \begin{enumerate} up to the otherwise blank half page before the landscape example SV VCF. Use "ht" in \begin{figure}[ht] where necessary, and keep some of the associated paragraphs together. Use \makecell to avoid an underful hbox.
Also make "dbsnp" and "138" verbatim, as the double quotes are part of what is being shown.
Compress the list of columns and encourage a pagebreak in "4 REF" to get more vertical space for Table 1 on page 8. Use \longtable so Table 1 can be broken across pages; don't use \begin{table}, so that the tables are positioned within their related ("8 INFO" / "Genotype fields") text. Keep initial "INFO keys used for structural variants" paragraphs together so they don't creep back to page 11. (Previously it was very easy to misread the INFO keys table as pertaining to the genotype text surrounding it.)
The bar is only for genotype fields. Fixes samtools#390. Also for the genotype fields, use plain | rather than \mid, which adds excess space around the delimiter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to header line and all the tables look great, thanks! Just one comment on my side.
Multiple bases are permitted. | ||
The value in the POS field refers to the position of the first base in the String. | ||
For simple insertions and deletions in which either the REF or one of the ALT alleles would otherwise be null/empty, the REF and ALT Strings must include the base before the event (which must be reflected in the POS field), unless the event occurs at position 1 on the contig in which case it must include the base after the event; this padding base is not required (although it is permitted) for e.g. complex substitutions or other events where all alleles have at least one base represented in their Strings. | ||
For simple insertions and deletions in which either the REF or one of the ALT alleles would otherwise be null/empty, the REF and ALT Strings must include the base before the event (which must be reflected in the POS field), unless the event occurs at position 1 on the contig in which case it must include the base after the event; \pagebreak[1] this padding base is not required (although it is permitted) for e.g.\ complex substitutions or other events where all alleles have at least one base represented in their Strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is a pretty plain paragraph, I'm not sure the \pagebreak[1]
command is very useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the commit message: encourage a pagebreak in "4 REF" to get more vertical space for Table 1 on page 8. Without this, there is wasted space on the previous page and Table 1 is split more awkwardly.
(Also just pushed one more fix.)
Thanks. I would have merged that with a merge commit, as it's a sequence of distinct commits that tell a story. Squashing them into a single commit makes it a bit harder for future us to look back and understand the history, but YMMV. |
I actually prefer merge commits, but I thought the standard policy was always squashing... |
I don't think we have a policy as such. What often happens in this repository is that a single-commit PR gets improved and workshopped and acquires a bunch of “Fixed typos” and “Respond to review comments“ commits. Clearly the best practice in that case is to squash, as the fixup commits are not of later historical interest. Multi-commit PRs containing distinct fixes are a different case. |
Fix the BCF encoding errors noted in #382, and various formatting improvements.
No changes to the format or descriptions, so this should be straightforward to review and merge.
The non-TeX quotes etc in §1.4.7 Contig field format are left untouched so as not to conflict with lines changed in PR #379.