Obtaining transcript confidence and assignment_confidence from baysor segmentation #149

Open
pngai1 opened this issue Nov 4, 2024 · 2 comments

pngai1 commented Nov 4, 2024

Hello Sopa team,

Thank you so much for developing such a great pipeline for spatial analysis. I have been applying the Baysor segmentation pipeline to my CosMx dataset, and it has generally been working well. However, I am struggling to find a good way to obtain the Baysor confidence and assignment_confidence scores for all the transcripts retained in the sdata after the sopa pipeline has resolved conflicts and applied filtering. I saw in a previous issue that _map_transcript_to_cell would be the function to use here, but I am unsure whether it is appropriate when the Baysor segmentation is done in 3D. It would be great if more of the Baysor information were included in the data output, as it could be helpful for filtering.

Thank you so much.
Paul

@quentinblampey
Collaborator

Hello @pngai1,

Indeed, we are not currently adding the Baysor information to the transcripts. The blocker is that, since we run Baysor on overlapping patches, one transcript can appear in several patches. Each transcript may therefore have several confidence scores, one for each patch in which it was found...
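To make the ambiguity concrete, here is a minimal pandas sketch of one way the per-patch scores could be collapsed to a single value per transcript. It is not what sopa does today. The table layout is an assumption: `transcript_id` and `patch_id` are hypothetical column names, and the choice of aggregation (`max` here) is arbitrary and would need to be justified for a real analysis.

```python
import pandas as pd

# Assumed score columns; adjust to match the actual Baysor output.
SCORE_COLS = ["confidence", "assignment_confidence"]


def aggregate_patch_scores(per_patch: pd.DataFrame, how: str = "max") -> pd.DataFrame:
    """Collapse per-patch scores to one row per transcript.

    `per_patch` is assumed to hold one row per (transcript, patch) pair,
    with a hypothetical `transcript_id` column identifying the transcript.
    """
    return per_patch.groupby("transcript_id")[SCORE_COLS].agg(how)


# Toy data: transcript 1 lies in the overlap of patches 0 and 1,
# so it has two sets of scores.
per_patch = pd.DataFrame(
    {
        "transcript_id": [0, 1, 1, 2],
        "patch_id": [0, 0, 1, 1],
        "confidence": [0.9, 0.4, 0.8, 0.7],
        "assignment_confidence": [0.95, 0.5, 0.6, 0.3],
    }
)

scores = aggregate_patch_scores(per_patch)
print(scores)
```

Taking the maximum keeps the most optimistic score for each transcript; `how="mean"` or `how="min"` would be equally easy to try, and the right choice probably depends on what the score is used for.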

The _map_transcript_to_cell function adds a column to the transcript dataframe that indicates which cell contains each transcript, but it does not include the Baysor information.

I have one question: what do you plan to do with this confidence score? Something like removing the transcripts whose confidence score is too low?

Author

pngai1 commented Nov 4, 2024

Hello quentinblampey,

Yes, I agree that accounting for transcripts with multiple scores will be tricky. I would expect that cells with a low confidence score might not be real, and that cells with a low assignment confidence score could be doublets or triplets. I could then filter or label those cells in my data for downstream analysis. I also thought of removing transcripts with low confidence or assignment confidence, but I am unsure whether that would introduce bias into the gene expression.
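As a rough illustration of the labelling idea, the sketch below averages the two scores per cell and flags cells that fall under a threshold, without removing any transcripts. Everything here is an assumption: the `cell_id` column name, the availability of one score per transcript, the use of the mean, and the 0.5 thresholds are placeholders rather than recommended values.

```python
import pandas as pd


def flag_low_confidence_cells(
    transcripts: pd.DataFrame,
    cell_col: str = "cell_id",      # hypothetical column name
    min_confidence: float = 0.5,    # placeholder threshold
    min_assignment: float = 0.5,    # placeholder threshold
) -> pd.DataFrame:
    """Return per-cell mean scores plus boolean flags for low-scoring cells."""
    # Ignore transcripts that are not assigned to any cell.
    assigned = transcripts[transcripts[cell_col].notna()]
    per_cell = assigned.groupby(cell_col)[
        ["confidence", "assignment_confidence"]
    ].mean()
    per_cell["low_confidence"] = per_cell["confidence"] < min_confidence
    per_cell["low_assignment"] = per_cell["assignment_confidence"] < min_assignment
    return per_cell


# Toy data: cell "b" has low transcript confidence, the last transcript is unassigned.
transcripts = pd.DataFrame(
    {
        "cell_id": ["a", "a", "b", "b", None],
        "confidence": [0.9, 0.8, 0.3, 0.2, 0.1],
        "assignment_confidence": [0.9, 0.7, 0.9, 0.8, 0.2],
    }
)

flags = flag_low_confidence_cells(transcripts)
print(flags)
```

Flagging cells instead of dropping transcripts leaves the count matrix untouched, so the effect of any threshold on gene expression can be checked afterwards.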
