CLI tool for aggregating single cell data #1264

arteymix · 2024-10-23T19:46:46Z

TODO

arteymix · 2024-10-29T15:40:11Z

This is done; I'm just doing a bit more testing at this point.

arteymix · 2024-10-29T23:51:34Z

I'm almost done with deleting aggregated data. There are a few caveats to consider, such as whether to remove the dimension and reset the single-cell metrics (see #1273), as well as deleting the generated data files for that QT.

arteymix · 2024-10-30T22:01:43Z

Good, now we can reliably aggregate and delete aggregated vectors! I'm looking into some -Infinity values slipping through the aggregation process and causing the processed vectors to be filled with NaNs...

I've also made some improvements to which files get deleted when regenerating platform annotations and pre-processing an experiment.

arteymix · 2024-10-31T00:24:44Z

OK, got the NaN situation figured out. We need to adjust the data to the library size and add a pseudocount, just like we do for log2cpm of RNA-Seq data.
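
To illustrate the idea, here is a minimal sketch of a log2cpm transform with a pseudocount. The function name `log2cpm` and the constants (0.5 added to each count, 1 added to the library size) are assumptions borrowed from a common convention, not necessarily what the aggregation code uses. The point is only that a zero count no longer maps to -Infinity, so nothing downstream turns into NaN.

```python
import math


def log2cpm(counts, pseudocount=0.5):
    """Log2 counts-per-million for one sample (hypothetical helper).

    The pseudocount keeps zero counts away from log2(0); the +1.0 on the
    library size keeps the ratio defined even for an all-zero sample.
    Both constants are illustrative assumptions.
    """
    library_size = sum(counts)
    return [
        math.log2((c + pseudocount) / (library_size + 1.0) * 1e6)
        for c in counts
    ]


if __name__ == "__main__":
    # A sample with a zero count: every output value is finite.
    values = log2cpm([0, 10, 90])
    print(all(math.isfinite(v) for v in values))
```

Without the pseudocount, a zero count would have no finite logarithm, and any later arithmetic on a -Infinity value (for example subtracting two of them) yields NaN.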

arteymix · 2024-10-31T00:52:23Z

Counting data would become linear after library size normalization.

Linear data would technically not be CPM, but I don't think that is important.

We also need to look into allowing count data to use a logarithmic scale type. We might find counting data out there that is unfortunately already log-transformed.

arteymix · 2024-10-31T16:17:17Z

Another thing to include in the tests is non-integer counting data.

This happens with some methods that regress out ambient RNA or other contaminants from the data. Thus, we would get a general type COUNT and a scale type LINEAR, or something similar. We can add a way to generate such vectors by adding a little bit of multiplicative Gaussian noise.
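
A rough sketch of how such test vectors could be generated, assuming integer counts as input. The helper name `add_multiplicative_noise`, the default standard deviation, and the clamping at zero are all assumptions made for illustration, not the actual test fixture.

```python
import random


def add_multiplicative_noise(counts, sd=0.05, seed=None):
    """Scale each count by a factor drawn from N(1, sd) (hypothetical helper).

    Zero counts stay exactly zero because the noise is multiplicative, and
    results are clamped at 0 so a rare negative factor cannot produce a
    negative count.
    """
    rng = random.Random(seed)
    return [max(0.0, c * rng.gauss(1.0, sd)) for c in counts]


if __name__ == "__main__":
    # Seeded so that the generated test data is reproducible.
    print(add_multiplicative_noise([0, 3, 10, 250], seed=42))
```

Using multiplicative rather than additive noise keeps the sparsity pattern of the original counts intact, which matters for single-cell data where most entries are zero.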

arteymix added the cli and single cell labels Oct 23, 2024

arteymix self-assigned this Oct 23, 2024

arteymix closed this as completed Nov 1, 2024
