Data connection

Data connection is used for accessing data in a particular data source. Connection parameters depend on the data source. Typically, you would need to provide server name and login credentials.

Connectors

A connector could work with a database, an Excel file, a CSV file, a web service, or basically anything that is capable of providing the data. We currently support over 20 different connectors, and the list is quickly growing. Most of our data connectors are open-sourced and could be found on GitHub (MIT license).

Name	Type
Access	JDBC
Athena	JDBC
BigQuery	JDBC
Cassandra	JDBC
DB2	JDBC
Denodo	JDBC
DropBox	Files
Files	Files
Firebird	JDBC
Git	Files
GoogleCloud	Files
HBase	JDBC
Hive	JDBC
Hive2	JDBC
Impala	JDBC
MariaDB	JDBC
MongoDB	JDBC
MS SQL	JDBC
MySql	JDBC
Neo4j	JDBC
OData
Oracle	JDBC
Postgres	JDBC
Redshift	JDBC
S3	Files
Snowflake	JDBC
Socrata
Sparql
SQLite	JDBC
Teradata	JDBC
Twitter
Vertica	JDBC
Virtuoso	JDBC
Web

Creating a connection

To create a new data connection, open the "Databases" pane (Open | Databases), right-click on the appropriate connector in the tree, and choose "Add connection...". Alternatively, click on "New Connection" under the "Actions" tab, and select the appropriate connector.

Editing properties

Then, edit the connection attributes, and click on TEST to confirm that you've entered everything correctly. The set of attributes you can edit depends on the connector. Typically, for JDBC-based connectors you can provide a custom connection string (but do not enter login and password there, they will still be picked up from the corresponding fields).

Once a connection is set up, you are ready to start creating queries. There are multiple ways to do so: manually or programmatically.

Note that the platform supports caching of results.

Access control

Connections are first-class entities in the Datagrok platform, and as such are subjects to the standard checks and routines performed against them whenever they are used in the specific context. Some of the most popular privileges are: view, edit, delete, and share. Those privileges can be given to individual users, or to groups. For more information on the access privilege model, check out privileges.

Another “out of the box” feature that comes with connections being first-class entity is the audit trail for every action performed against the connection. For details on that, check out Audit page.

Caching

You can force the platform to cache results of executing queries (taking into account parameters as well). This is useful when a query executes slowly and returns relatively small results. A popular use case is caching the values that are used for building the UI automatically (typically this is some form of select distinct <name> from <table>) is a good idea.

You can turn caching on for either the whole connection, or for a particular query. For a connection, open its properties, and check the "Cache Results" checkbox. When you check it, the "Invalidate On" input becomes visible. Enter the cron expression there to define cache invalidation timepoints (for instance, if a database refreshes overnight and you want to invalidate it at 1am each night, enter 0 0 1 * * ?). Leaving the field blank will make the cache stay forever.

If a connection gets created automatically as part of the package, you can specify the cacheResults parameter in the connection json definition:

{
  "name": "Northwind",
  "parameters": {
    "server": "dev.datagrok.ai",
    "port": 23306,
    "db": "Northwind",
    "cacheSchema": false,
    "cacheResults": true,
    "ssl": false,
    "connString": ""
  }
}

To cache results of individual queries, edit the query (either via Datagrok UI if a query already exists, or by editing the corresponding .sql file of the package queries) and specify the meta.cache and meta.invalidate fields:

--name: getProductNames
--input: string department
--meta.cache: true
--meta.invalidate: 0 0 1 * * ?
select distinct name from products p 
where p.department = @department

Filtering

You can use these fields to filter connections with smart search:

Field	Description
id
name
server
port
db
login
dataSource
description
createdOn
updatedOn
author	User object
starredBy	User object
commentedBy	User object
usedBy	User object

JDBC connection

For some cases connection may require custom JDBC connection string. For this case, JDBC-based data connection has parameter "Conn. string". If filled, it will be used for connection, and all other parameters will be ignored except for "Login" and "Password".

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data-connection.md

data-connection.md

Data connection

Connectors

Creating a connection

Editing properties

Access control

Caching

Filtering

JDBC connection

Videos

Files

data-connection.md

Latest commit

History

data-connection.md

File metadata and controls

Data connection

Connectors

Creating a connection

Editing properties

Access control

Caching

Filtering

JDBC connection

Videos