Lazy Streaming of Tabs #291
Yeah! The tabs can be computed in any order. For now I just reversed the order so that correlation is computed last. How do you want me to prioritize the actions? By default they come in alphabetical order.
Codecov Report
@@            Coverage Diff             @@
##           master     #291      +/-   ##
==========================================
- Coverage   85.09%   81.16%   -3.94%
==========================================
  Files          52       50       -2
  Lines        4033     3663     -370
==========================================
- Hits         3432     2973     -459
- Misses        601      690      +89
lux/core/frame.py
Outdated
@@ -90,6 +90,7 @@ def __init__(self, *args, **kw):
self._min_max = None
self.pre_aggregated = None
self._type_override = {}
self.loadingBar = None
Let's rename this to loading_bar to be consistent with PEP 8.
lux/core/frame.py
Outdated
self._widget = rec_df.render_widget()
self._widget = rec_df.render_widget(
    pandasHtml=rec_df.to_html(max_rows=5, classes="pandasStyle")
Let's make this 10 so that it matches the default display.min_rows value, which is 10.
lux/core/frame.py
Outdated
# re-render widget for the current dataframe if previous rec is not recomputed
elif show_prev:
    rec_df.show_all_column_vis()
    if lux.config.render_widget:
        self._widget = rec_df.render_widget()
        self._widget = rec_df.render_widget(
            pandasHtml=rec_df.to_html(max_rows=5, classes="pandasStyle")
Same as above.
@@ -675,11 +722,15 @@ def render_widget(self, renderer: str = "altair", input_current_vis=""):
import luxwidget

widgetJSON = self.to_JSON(self._rec_info, input_current_vis=input_current_vis)
if pandasHtml is None:
This is for backwards compatibility with other versions of lux-api, right?
Perhaps. There was an error in one of the tests that caused pandasHtml to be None, so I just put a None check here.
Yeah, when we change the input to luxwidget, the missing traitlet variable can cause an error, so it's good that we have this check in here.
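A minimal sketch of the kind of guard discussed in this thread, for illustration only. The function name build_widget_payload, the empty-string fallback, and the "recommendations" key are assumptions, not the code in this PR; only the pandasHtml name comes from the diff.

```python
def build_widget_payload(widget_json, pandas_html=None):
    """Assemble the values passed to the widget (hypothetical helper).

    If the caller did not supply rendered HTML, fall back to an empty
    string so the widget never receives a missing or None value.
    """
    if pandas_html is None:
        pandas_html = ""  # assumed fallback; the real default may differ
    return {"recommendations": widget_json, "pandasHtml": pandas_html}


# Older call sites that omit the HTML keep working.
legacy = build_widget_payload({"tabs": []})
current = build_widget_payload({"tabs": []}, pandas_html="<table></table>")
```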
if loadingBar is not None:
    loadingBar.value = progress

# # Pushing back correlation and geographical actions for performance reasons
How do we currently determine what to compute first vs. what to compute lazily? Or for now, are we explicitly saying that "correlation" and "geographical" will be computed later?
Right now, actions are checked in alphabetical order (with correlation being first). I manually moved correlation and geographical later because I noticed those two take the longest to compute and render. The order in which we check and compute is very flexible, as we can just move the action names around in the list.
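As a rough illustration of the ordering described here, the sketch below sorts action names alphabetically and then moves the slow ones to the end. The helper order_actions, the DEFERRED tuple, and the sample action names are hypothetical and are not taken from the Lux code.

```python
# Actions assumed to be slow; they are computed after everything else.
DEFERRED = ("correlation", "geographical")


def order_actions(action_names, deferred=DEFERRED):
    """Return names alphabetically, with deferred names moved to the end."""
    names = sorted(action_names)
    early = [n for n in names if n not in deferred]
    # Keep deferred names in the order given by `deferred`.
    late = [n for n in deferred if n in names]
    return early + late


ordered = order_actions(
    ["distribution", "correlation", "occurrence", "geographical", "temporal"]
)
```

Changing the priority then only means editing the list of deferred names.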
Overview
This change (along with the frontend changes) implements lazy loading of the tabs. In essence, all tabs are initially greyed out and tabs are loaded one by one.
For now, the ordering of the calculation of tabs is just reversed (to put correlation last). We can add some sort of tab prioritization later on if needed.
Changes
I had to change many of the tests in terms of the tabs they expect: because tabs are now streamed in lazily, their order can vary, so the tests now compare sets instead of lists. I also changed the way we compute actions. Instead of computing them all at once, I compute the first available one and display it (so we have something to display). Then, when ipython_display reaches its end, I start calculating the rest of the tabs. The button has also been replaced with tabs.
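The compute-one-then-the-rest flow can be sketched with a generator. This is a simplified illustration, not the PR's implementation: stream_tabs, the toy actions, and the sample data are all made up for the example.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def stream_tabs(
    actions: Dict[str, Callable[[List[float]], object]],
    data: List[float],
) -> Iterator[Tuple[str, object]]:
    """Lazily yield (tab name, result); nothing runs until requested."""
    for name, compute in actions.items():
        yield name, compute(data)


# Toy stand-ins for real actions (hypothetical).
actions = {
    "distribution": lambda xs: sorted(xs),
    "occurrence": lambda xs: len(xs),
    "correlation": lambda xs: sum(x * x for x in xs),
}
data = [3.0, 1.0, 2.0]

stream = stream_tabs(actions, data)
first_tab = next(stream)   # enough to render the widget right away
remaining = dict(stream)   # computed afterwards and streamed in

# Tests compare tab names as sets, since arrival order may vary.
assert set(remaining) == {"correlation", "occurrence"}
```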
Actions are now checked before any computation in order to prevent unnecessary tabs from being displayed initially. The cost of these checks is negligible.
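A small sketch of checking before computing, assuming each action pairs a cheap condition with an expensive compute function. The ACTIONS table, the column type labels, and applicable_tabs are hypothetical names chosen for the example.

```python
calls = []  # records which compute functions actually ran


def make_compute(name):
    def compute(columns):
        calls.append(name)
        return f"{name} tab"
    return compute


# Each action: (cheap condition, expensive compute). Hypothetical rules.
ACTIONS = {
    "correlation": (
        lambda cols: sum(t == "quantitative" for t in cols.values()) >= 2,
        make_compute("correlation"),
    ),
    "temporal": (
        lambda cols: "temporal" in cols.values(),
        make_compute("temporal"),
    ),
    "occurrence": (
        lambda cols: "nominal" in cols.values(),
        make_compute("occurrence"),
    ),
}


def applicable_tabs(actions, columns):
    """Run only the conditions, so the tab list is known before any compute."""
    return [
        name
        for name, (condition, _compute) in actions.items()
        if condition(columns)
    ]


columns = {"age": "quantitative", "income": "quantitative", "state": "nominal"}
tabs = applicable_tabs(ACTIONS, columns)
```

Here the temporal tab is never shown as a placeholder because its condition fails, and no compute function has run yet.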
Example Output
Please see frontend PR: lux-org/lux-widget#71.
Performance Benefit
The Correlation action can make computing the widget very slow: datasets such as the communities dataset take around 40-45 seconds to compute. By streaming in the tabs, we can display the widget much earlier, letting users interact with it almost right away while the computation-heavy tabs are added later. For example, Lux for the communities dataset now appears almost instantly instead of after roughly 40 seconds.