psites_number table is not generated with v1.2.15 #55

Open
hsinyenwu opened this issue May 18, 2023 · 1 comment

hsinyenwu commented May 18, 2023

I think I am still running into a problem similar to issue #32,
i.e., the psites_number table is missing from the hdf5 file.

  group           name       otype dclass   dim
0     /        p_sites H5I_DATASET   VLEN 46826
1     / transcript_ids H5I_DATASET STRING 46826

Also the dimension of the p_sites table seems to be wrong.
Otherwise the code ran okay and generated other files.

This job is running on skl-119 on Wed May 17 23:21:46 EDT 2023
0% has finished! 2% has finished! [... progress updates continue ...] 98% has finished!
[2023-05-17 23:29:14] Finished!
        Loading transcripts.pickle ...
        Reading bam file: /mnt/home/larrywu/CTRL_arabidopsis/data/RiboCode_STAR/ribo_mapped/D1//star_D1_Aligned.toTranscriptome.out.bam......
Finished reading bam file!

Any suggestions for how to deal with this issue? Thanks!


Roleren commented Sep 30, 2024

An easy workaround is to add the missing psites_number dataset yourself; it is very easy in R at least:




library(rhdf5)
# h5file: path to the hdf5 file (the same variable is now used for reading and writing)
p_sites <- rhdf5::h5read(h5file, "p_sites") # Read in the p_sites object; it can be quite big, ~1 GB for human
if (is.list(p_sites)) {
  p_sites_sum <- sum(sapply(p_sites, sum)) # VLEN dataset, i.e. read as an R list
} else {
  p_sites_sum <- as.integer(sum(rowSums(p_sites))) # Fixed-size dataset, i.e. read as an R matrix
}
h5 <- rhdf5::H5Fopen(h5file)
rhdf5::h5writeDataset(obj = p_sites_sum, h5loc = h5, name = "psites_number")
rhdf5::H5Fclose(h5) # Close the handle so the new dataset is flushed to disk
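
Since RiboCode itself is a Python tool, the same workaround can probably be done from Python as well. The following is only a minimal sketch, assuming that `psites_number` is meant to be the total of all counts in `p_sites` (which is what the R snippet above computes) and that the h5py package is installed. The function names `total_psites` and `add_psites_number` are made up for illustration and are not part of RiboCode.

```python
def total_psites(p_sites):
    """Sum every count in p_sites.

    Works both for ragged rows (a VLEN dataset, one array per transcript)
    and for a rectangular 2-D matrix, because it only iterates row by row.
    """
    total = 0
    for row in p_sites:
        for count in row:
            total += int(count)
    return total


def add_psites_number(h5_path):
    """Hypothetical helper: write the total as a 'psites_number' dataset.

    Assumption: RiboCode expects a scalar total under this name.
    """
    import h5py  # third-party; imported lazily so total_psites works without it

    with h5py.File(h5_path, "r+") as f:      # "r+" opens an existing file read/write
        total = total_psites(f["p_sites"][:])
        if "psites_number" in f:             # replace a stale value if one exists
            del f["psites_number"]
        f.create_dataset("psites_number", data=total)
    return total
```

Calling `add_psites_number("your_file.hdf5")` (the path is a placeholder) would modify the file in place, so it may be safer to try it on a copy first. The pure-Python loop is slow for very large files; it is written this way only to keep the VLEN and matrix cases in one code path.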

You can also insert the p_sites object directly if you want to skip the whole bam creation and hdf5 step.
# p_sites as VLEN: this is much simpler, but you need 1 TB of memory for large species
library(hdf5r)
h5file_link <- H5File$new(h5file, mode = "w") # mode "w" creates the file, overwriting any existing one
h5file_link[["p_sites"]] <- p_sites # I usually use as(IntegerList(ORFik::coveragePerTiling(tx, RPF)), "list") on a bigwig or covRle object
# Then tx names can be made like this:
h5file_link[["transcript_ids"]] <- names(tx) # tx: GRangesList of all transcripts, must match the gtf used by RiboCode!
h5file_link$close_all() # Close the file so everything is written to disk
