-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding support for SQL as execution engine #306
Conversation
Uses unique value metadata to verify if a value is valid
frame.py was trying to import luxWidget instead of luxwidget
… with SQL Executor Some interestingness functions required the number of observations in the data and visualization, so I added these values to the metadata to make the scoring work when using the SQL executor Added tests for SQL executor
Removed lines that changed Year column type to datetime
SQL Executor tests interfering with travis build, commenting out for now
…nto Database-Executor
Issue where validator was relying on metadata which was not yet generated, moved metadata calculation before validation step in frame.py
Renamed num_obs to length, removed ordinal variable from Executor mapping function
* Merging Recent SQL Executor changes * Fix to Validator Uses unique value metadata to verify if a value is valid * Fix Bug with Widget Rendering frame.py was trying to import luxWidget instead of luxwidget * Added Number of Observations to MetaData, Fixed Interestingness issue with SQL Executor Some interestingness functions required the number of observations in the data and visualization, so I added these values to the metadata to make the scoring work when using the SQL executor Added tests for SQL executor * Re-added Licensing Headers * Adding Recent frame.py changes * Adjusted SQL Executor Tests Removed lines that changed Year column type to datetime * Update Frame with new Action Registering * Resolving Conflicts in frame.py * Commenting out local SQL Executor tests SQL Executor tests interfering with travis build, commenting out for now * Update correlation.py * Update frame.py * Fixing Code Format * Cleaning up Pandas Executor imports * Fix Validation Bug Issue where validator was relying on metadata which was not yet generated, moved metadata calculation before validation step in frame.py * Changed metadata variable name Renamed num_obs to length, removed ordinal variable from Executor mapping function Co-authored-by: 19thyneb <[email protected]> Co-authored-by: Doris Lee <[email protected]>
Updated travis.yml file to create postgresql database in test instance. Added script to populate test database with data.
Updated database credentials
* Merging Recent SQL Executor changes * Fix to Validator Uses unique value metadata to verify if a value is valid * Fix Bug with Widget Rendering frame.py was trying to import luxWidget instead of luxwidget * Added Number of Observations to MetaData, Fixed Interestingness issue with SQL Executor Some interestingness functions required the number of observations in the data and visualization, so I added these values to the metadata to make the scoring work when using the SQL executor Added tests for SQL executor * Re-added Licensing Headers * Adding Recent frame.py changes * Adjusted SQL Executor Tests Removed lines that changed Year column type to datetime * Update Frame with new Action Registering * Resolving Conflicts in frame.py * Commenting out local SQL Executor tests SQL Executor tests interfering with travis build, commenting out for now * Update correlation.py * Update frame.py * Fixing Code Format * Cleaning up Pandas Executor imports * Fix Validation Bug Issue where validator was relying on metadata which was not yet generated, moved metadata calculation before validation step in frame.py * Changed metadata variable name Renamed num_obs to length, removed ordinal variable from Executor mapping function * Added script to generate Postgresql database Updated travis.yml file to create postgresql database in test instance. Added script to populate test database with data. * Update upload_car_data.py Updated database credentials * Updated script name in travis.yml * Removed unnecessary import from travis.yml * Added psycopg2 to requirements.txt * Creating Postgres test database in travis * Fixed directory issue Co-authored-by: 19thyneb <[email protected]> Co-authored-by: Doris Lee <[email protected]>
Added tests for basic SQL Executor functionality.
Added an example notebook to showcase how to use the sql-engine. Fixed variable reference in interestingness.py that was causing issues.
Rather than referencing the _length parameter throughout the code, update and use the LuxSQLTable len() function. Added _setup_done parameter to the LuxSQLTable. This will check if the initial setup of the table, retrieving and populating attributes, is completed. This will inform which len() function to use, as the parent len() is required while populating the columns of the LuxSQLTable.
Rename _repr_html_() to _ipython_display_()
This reverts commit e350ab4.
This reverts commit 5d1a2f4.
This reverts commit 7c7dcd3.
Merge in Master branch changes, add len() functionality to the LuxSQLTable
This reverts commit b5998c7.
Updated _is_datetime_number() in the PandasExecutor to use the is_integer_dtype() function to check if a series is of int dtype. Cleaned up SQLExecutor checks in frame.py
* Revert "Revert "Update LuxSQLTable __len__() and metadata computation"" This reverts commit b5998c7. * Cleaned up datatype and SQLExecutor checks Updated _is_datetime_number() in the PandasExecutor to use the is_integer_dtype() function to check if a series is of int dtype. Cleaned up SQLExecutor checks in frame.py
Merging in Master branch changes to sql-engine branch
Black Reformatting
@@ -17,6 +17,7 @@ | |||
import pandas as pd | |||
from lux.vis.Vis import Vis | |||
from lux.vis.VisList import VisList | |||
import psycopg2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the SQL tests into a different file so that the separation between SQL and Pandas tests is more clear? That way, it would make it easier to add a flag that only runs PandasExecutor tests (e.g., when developing locally).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a related note, the line connection = psycopg2.connect("host=localhost dbname=postgres user=postgres password=lux")
is repeated many times across the tests. We could probably put this inside conftest.py
as a @pytest.fixture
(variables that are reused across tests in the same session).
@@ -21,6 +21,7 @@ | |||
|
|||
# Suite of test that checks if data_type inferred correctly by Lux | |||
def test_check_cars(): | |||
lux.config.set_SQL_connection("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this empty?
Aiming to simplify the initial Lux installation. Will include a notice in the SQL documentation letting users know that they will have to install the library themselves if the want to use the LuxSQLTable functionality.
Removed psycopg2 from Lux requirements
…nto Database-Executor
Merging in master branch changes
Thanks @thyneb19 @dj-khandelwal @NiStannum @sophiahhuang for working hard on the SQL features! We're really looking forward to this new addition to the upcoming release! |
Overview
Merging in the sql-engine branch to bring the updated Lux SQL functionality to the main release. Users will be able to connect LuxSQLTable objects to their database tables and views, and leverage Lux' recommendation system without having to pull all of their database data locally.
Changes
This PR will update the Lux SQLExecutor as well as add a new LuxSQLTable object. The LuxSQLTable object is meant to help users differentiate between Lux' functionality when connecting to a Dataframe versus a SQL database. This PR makes the following major changes:
Example Output
Script to reproduce:
After setting up Postgres, go into command line via
psql postgres
, then setup via:Then run
python upload_car_data.py
or other data upload scripts insidelux/data
Users will now be able to connect Lux to their database tables like so: