
Switch to Vega for visualizations #12

Open
keckelt opened this issue Dec 5, 2019 · 4 comments
Labels: type: refactor (Refactor the current implementation)

@keckelt
Member

keckelt commented Dec 5, 2019

Hi,
with the transition to Vega (Lite) we can replace the charts used by Tourdino as well.

The charts are:

  • Line Chart
  • Scatter Plot
  • Parallel sets
  • Grouped Histogram (Abs and Relative)
  • Box Plot

Apart from the parallel sets, this should be straightforward.

keckelt added the "type: feature" and "type: refactor" labels and removed the "type: feature" label on Dec 5, 2019
keckelt self-assigned this on Mar 8, 2021
keckelt added a commit that referenced this issue Apr 22, 2021
@keckelt
Member Author

keckelt commented Apr 22, 2021

I started on the box plot today, using Vega-Lite v4.
v5 is already out and I wanted to upgrade directly, but that caused some build errors in Coral that I didn't know how to fix right away. To keep Coral and TourDino on the same Vega versions, I stuck with v4 for now.
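
For reference, a minimal sketch of what a per-set box plot could look like as a Vega-Lite v4 spec. The row shape (`set`, `value`) and the function name are assumptions for illustration, not TourDino's actual data model:

```typescript
// Hypothetical row shape: one numerical value per item, tagged with its set.
interface SetValue {
  set: string;
  value: number;
}

// Builds a Vega-Lite v4 box plot spec with one horizontal box per set.
function boxPlotSpec(values: SetValue[]) {
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v4.json",
    data: { values },
    // extent 1.5 draws whiskers at 1.5 * IQR (Tukey-style box plot).
    mark: { type: "boxplot", extent: 1.5 },
    encoding: {
      x: { field: "value", type: "quantitative" },
      y: { field: "set", type: "nominal" },
    },
  };
}

const boxSpec = boxPlotSpec([
  { set: "Set A", value: 1.2 },
  { set: "Set A", value: 3.4 },
  { set: "Set B", value: 2.8 },
]);
console.log(JSON.stringify(boxSpec, null, 2));
```

The resulting object can be pasted into the Vega editor or handed to vega-embed's `embed(element, spec)`.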

keckelt added a commit that referenced this issue Apr 22, 2021
keckelt added a commit that referenced this issue Apr 22, 2021
keckelt added a commit that referenced this issue Apr 23, 2021
@keckelt
Member Author

keckelt commented May 4, 2021

The space-saving y-axis example might be worth a try:
image

https://vega.github.io/editor/#/examples/vega-lite/bar_axis_space_saving
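
A rough sketch of the idea, loosely modeled on the linked example: the category labels are drawn above the bars instead of to their left, so the axis takes no horizontal space. The data shape, function name, and all offset values are assumptions and would need tuning against real labels:

```typescript
// Hypothetical pre-aggregated input: one count per category.
interface CategoryCount {
  category: string;
  count: number;
}

// Bar chart whose y-axis labels sit above each bar rather than beside it.
function spaceSavingBarSpec(values: CategoryCount[]) {
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v4.json",
    data: { values },
    height: { step: 40 }, // fixed height per category band
    mark: { type: "bar", yOffset: 5, cornerRadiusEnd: 2 },
    encoding: {
      y: {
        field: "category",
        type: "nominal",
        scale: { padding: 0.5 }, // leaves room in each band for the label
        axis: {
          title: null,
          domain: false,
          ticks: false,
          bandPosition: 0, // anchor labels at the top of the band
          labelAlign: "left", // labels start at the left edge of the plot
          labelBaseline: "middle",
          labelPadding: -5, // negative padding pulls labels into the plot area
          labelOffset: -15,
        },
      },
      x: { field: "count", type: "quantitative", axis: { grid: false } },
    },
  };
}

const axisSpec = spaceSavingBarSpec([
  { category: "Breast Cancer", count: 120 },
  { category: "Non-Small Cell Lung Cancer", count: 95 },
]);
console.log(JSON.stringify(axisSpec, null, 2));
```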

@keckelt
Member Author

keckelt commented May 6, 2021

I had a thought about the GENIE dataset and the many categories some of its attributes contain. Additionally, the GENIE dataset is about 10 times larger than TCGA.

I'll focus on TourDino visualizations here, but of course the dataset size also impacts the statistics part.

Row Comparison

In addition to the visualizations, selecting the sets/rows to compare also suffers from the many categories. They are all listed in the select2 input. It is searchable, but manual browsing becomes impractical.

Significance Matrix

If you choose to compare all tumor types with each other, the significance matrix gets very wide and likely impossible to navigate. It's already a problem with the ~30 tumor types. The row headers are not sticky. You also spawn a lot of comparisons.
Due to the large dataset size, results tend to be significant, although not necessarily relevant, because the effect size is low.

Numerical ↔ Numerical

As you just take the numerical values per set, regardless of the categories within it, this should not be an issue.
Also, the boxplots pretty much abstract away the dataset size.

image

Categorical ↔ Categorical

For categorical, we could use a Grouped Relative Histogram, similar to what we have in Coral.

image

Open the Chart in the Vega Editor

In Coral, we sort by the total percent per category. Sorting alphabetically would also be an option, or sorting by the differences between sets, which could be shown by a separate negative bar:

image
Open the Chart in the Vega Editor
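
To make the sorting options concrete, here is a small sketch of how the bar values, the total-percent order, and the per-category difference could be computed before handing them to Vega-Lite. The `Item` shape and all function names are assumptions; this is not how Coral implements it:

```typescript
// Hypothetical input: one record per item, with its set and its category.
interface Item { set: string; category: string; }
interface Bar { set: string; category: string; percent: number; }

// Share of each category within each set (the relative histogram values).
function relativeBars(items: Item[]): Bar[] {
  const setTotals: Record<string, number> = {};
  const counts: Record<string, Record<string, number>> = {}; // category -> set -> count
  for (const item of items) {
    setTotals[item.set] = (setTotals[item.set] || 0) + 1;
    counts[item.category] = counts[item.category] || {};
    counts[item.category][item.set] = (counts[item.category][item.set] || 0) + 1;
  }
  const bars: Bar[] = [];
  for (const category of Object.keys(counts)) {
    for (const set of Object.keys(setTotals)) {
      bars.push({ set, category, percent: (counts[category][set] || 0) / setTotals[set] });
    }
  }
  return bars;
}

// Category order by summed percent across sets, largest first.
function orderByTotalPercent(bars: Bar[]): string[] {
  const totals: Record<string, number> = {};
  for (const bar of bars) {
    totals[bar.category] = (totals[bar.category] || 0) + bar.percent;
  }
  return Object.keys(totals).sort((a, b) => totals[b] - totals[a]);
}

// Signed difference (setA minus setB) per category, for a separate negative bar.
function differences(bars: Bar[], setA: string, setB: string): Record<string, number> {
  const diff: Record<string, number> = {};
  for (const bar of bars) {
    const sign = bar.set === setA ? 1 : bar.set === setB ? -1 : 0;
    diff[bar.category] = (diff[bar.category] || 0) + sign * bar.percent;
  }
  return diff;
}

// Grouped relative histogram: one facet row per category, one bar per set.
function groupedRelativeHistogramSpec(items: Item[]) {
  const bars = relativeBars(items);
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v4.json",
    data: { values: bars },
    mark: "bar",
    encoding: {
      row: { field: "category", type: "nominal", sort: orderByTotalPercent(bars) },
      y: { field: "set", type: "nominal", title: null },
      x: { field: "percent", type: "quantitative", axis: { format: ".0%" } },
      color: { field: "set", type: "nominal" },
    },
  };
}

const demoItems: Item[] = [
  { set: "A", category: "x" }, { set: "A", category: "x" }, { set: "A", category: "y" },
  { set: "B", category: "x" }, { set: "B", category: "y" },
  { set: "B", category: "y" }, { set: "B", category: "y" },
];
const demoBars = relativeBars(demoItems);
const demoOrder = orderByTotalPercent(demoBars);
const demoDiff = differences(demoBars, "A", "B");
const histogramSpec = groupedRelativeHistogramSpec(demoItems);
```

Computing the order in code and passing it as an explicit `sort` array keeps the three options (total percent, alphabetical, difference) interchangeable without touching the spec.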

Column Comparison

Numerical ↔ Numerical

The large number of marks could be an issue for scatterplots (also for getting the opacity right). Alternatively, we could create a heatmap:
image

Open the Chart in the Vega Editor

Optionally with a superimposed trend line.
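
A sketch of such a binned heatmap as a Vega-Lite v4 spec, with the trend line as an optional second layer that uses the `regression` transform. The `Point` shape, the function name, and the bin count are assumptions, and I have not checked how the layered version renders on the real data:

```typescript
// Hypothetical input: one (x, y) pair of numerical attribute values per item.
interface Point { x: number; y: number; }

// 2D-binned heatmap (count per cell), optionally with a linear trend line on top.
function binnedHeatmapSpec(values: Point[], withTrend: boolean, maxBins: number = 40) {
  const heatmap = {
    mark: "rect",
    encoding: {
      x: { field: "x", type: "quantitative", bin: { maxbins: maxBins } },
      y: { field: "y", type: "quantitative", bin: { maxbins: maxBins } },
      color: { aggregate: "count", type: "quantitative" },
    },
  };
  const trend = {
    // Fits y as a linear function of x and emits the fitted line's points.
    transform: [{ regression: "y", on: "x" }],
    mark: { type: "line", color: "firebrick" },
    encoding: {
      x: { field: "x", type: "quantitative" },
      y: { field: "y", type: "quantitative" },
    },
  };
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v4.json",
    data: { values },
    layer: withTrend ? [heatmap, trend] : [heatmap],
  };
}

const demoPoints: Point[] = [
  { x: 1, y: 2 }, { x: 2, y: 3.9 }, { x: 3, y: 6.1 }, { x: 4, y: 8 },
];
const heatmapOnly = binnedHeatmapSpec(demoPoints, false);
const heatmapWithTrend = binnedHeatmapSpec(demoPoints, true, 20);
```

The number of marks is then bounded by the number of bins rather than the number of items, which also sidesteps the opacity question.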

Categorical ↔ Categorical

I would replace the parallel sets with a heatmap. As with the significance matrix, the attribute with more categories should have the categories as rows.
For two attributes with many categories it is still problematic.
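
A small sketch of the orientation rule: count the distinct categories per attribute and put the one with more categories on the rows (y axis). The generic row type, the field names in the example, and the function names are assumptions:

```typescript
// Hypothetical input: one record per item, mapping attribute name to category.
type CategoryRow = Record<string, string>;

// Number of distinct categories an attribute takes in the data.
function distinctValues(rows: CategoryRow[], field: string): number {
  const seen: Record<string, boolean> = {};
  for (const row of rows) {
    seen[row[field]] = true;
  }
  return Object.keys(seen).length;
}

// Count heatmap of two categorical attributes; the attribute with more
// categories goes on the y axis so the chart grows downwards, not sideways.
function categoricalHeatmapSpec(rows: CategoryRow[], fieldA: string, fieldB: string) {
  const aHasMore = distinctValues(rows, fieldA) >= distinctValues(rows, fieldB);
  const rowField = aHasMore ? fieldA : fieldB;
  const columnField = aHasMore ? fieldB : fieldA;
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v4.json",
    data: { values: rows },
    mark: "rect",
    encoding: {
      y: { field: rowField, type: "nominal" },
      x: { field: columnField, type: "nominal" },
      color: { aggregate: "count", type: "quantitative" },
    },
  };
}

const catSpec = categoricalHeatmapSpec(
  [
    { gender: "female", tumorType: "BRCA" },
    { gender: "male", tumorType: "LUAD" },
    { gender: "female", tumorType: "SKCM" },
  ],
  "gender",
  "tumorType"
);
```

This does not solve the case where both attributes have many categories; it only keeps the wider dimension scrollable vertically.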

Categorical ↔ Numerical

A line chart with 5 lines corresponding to the top 5 enrichment scores. With more categories, this number should probably be adjustable.
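
A sketch of how the number of lines could be made adjustable: pick the top N categories by score before building the chart. The `EnrichmentCurve` shape is an assumption about what the enrichment computation returns, and ranking by the raw score (rather than, say, its absolute value) is also just an assumption:

```typescript
// Hypothetical result shape: one curve and one summary score per category.
interface EnrichmentCurve {
  category: string;
  score: number;
  points: { x: number; y: number }[];
}

// Keeps the `limit` curves with the highest enrichment score.
function topCurves(curves: EnrichmentCurve[], limit: number = 5): EnrichmentCurve[] {
  return curves.slice().sort((a, b) => b.score - a.score).slice(0, limit);
}

// Line chart with one line per selected category.
function enrichmentLineSpec(curves: EnrichmentCurve[], limit: number = 5) {
  const rows: { category: string; x: number; y: number }[] = [];
  for (const curve of topCurves(curves, limit)) {
    for (const point of curve.points) {
      rows.push({ category: curve.category, x: point.x, y: point.y });
    }
  }
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v4.json",
    data: { values: rows },
    mark: "line",
    encoding: {
      x: { field: "x", type: "quantitative" },
      y: { field: "y", type: "quantitative" },
      color: { field: "category", type: "nominal" },
    },
  };
}

const demoCurves: EnrichmentCurve[] = ["a", "b", "c", "d", "e", "f", "g"].map(
  (category, index) => ({
    category,
    score: index / 10, // "g" scores highest, "a" lowest
    points: [{ x: 0, y: 0 }, { x: 1, y: index / 10 }],
  })
);
const topFive = topCurves(demoCurves).map((curve) => curve.category);
const topThreeSpec = enrichmentLineSpec(demoCurves, 3);
```

Exposing `limit` as a UI control would let users trade readability against coverage for attributes with many categories.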

@keckelt
Member Author

keckelt commented Nov 24, 2021

Related issue: https://github.com/Caleydo/cohort/issues/585
