Data connection is used for accessing data in a particular data source. Connection parameters depend on the data source. Typically, you would need to provide server name and login credentials.
A connector could work with a database, an Excel file, a CSV file, a web service, or basically anything that is capable of providing the data. We currently support over 20 different connectors, and the list is quickly growing. Most of our data connectors are open-sourced and could be found on GitHub (MIT license).
Name | Type |
---|---|
Access | JDBC |
Athena | JDBC |
BigQuery | JDBC |
Cassandra | JDBC |
DB2 | JDBC |
Denodo | JDBC |
DropBox | Files |
Files | Files |
Firebird | JDBC |
Git | Files |
GoogleCloud | Files |
HBase | JDBC |
Hive | JDBC |
Hive2 | JDBC |
Impala | JDBC |
MariaDB | JDBC |
MongoDB | JDBC |
MS SQL | JDBC |
MySql | JDBC |
Neo4j | JDBC |
OData | |
Oracle | JDBC |
Postgres | JDBC |
Redshift | JDBC |
S3 | Files |
Snowflake | JDBC |
Socrata | |
Sparql | |
SQLite | JDBC |
Teradata | JDBC |
Vertica | JDBC |
Virtuoso | JDBC |
Web |
To create a new data connection, open the "Databases" pane (Open | Databases), right-click on the appropriate connector in the tree, and choose "Add connection...". Alternatively, click on "New Connection" under the "Actions" tab, and select the appropriate connector.
Then, edit the connection attributes, and click on TEST to confirm that you've entered everything correctly. The set of attributes you can edit depends on the connector. Typically, for JDBC-based connectors you can provide a custom connection string (but do not enter login and password there, they will still be picked up from the corresponding fields).
Once a connection is set up, you are ready to start creating queries. There are multiple ways to do so: manually or programmatically.
Note that the platform supports caching of results.
Connections are first-class entities in the Datagrok platform, and as such are subjects to the standard
checks and routines performed against them whenever they are used in the specific context. Some of the
most popular privileges are: view
, edit
, delete
, and share
.
Those privileges can be given to individual users, or to groups.
For more information on the access privilege model, check out privileges.
Another “out of the box” feature that comes with connections being first-class entity is the audit trail for every action performed against the connection. For details on that, check out Audit page.
You can force the platform to cache results of executing queries (taking into account parameters as well).
This is useful when a query executes slowly and returns relatively small results.
A popular use case is caching the values that are used for building the UI automatically
(typically this is some form of select distinct <name> from <table>
) is a good idea.
You can turn caching on for either the whole connection, or for a particular query. For a connection,
open its properties, and check the "Cache Results" checkbox.
When you check it, the "Invalidate On" input becomes visible. Enter the
cron expression
there to define cache invalidation timepoints (for instance, if a database refreshes overnight
and you want to invalidate it at 1am each night, enter 0 0 1 * * ?
). Leaving the field blank
will make the cache stay forever.
If a connection gets created automatically as part of the package, you can specify the cacheResults
parameter in the connection json definition:
{
"name": "Northwind",
"parameters": {
"server": "dev.datagrok.ai",
"port": 23306,
"db": "Northwind",
"cacheSchema": false,
"cacheResults": true,
"ssl": false,
"connString": ""
}
}
To cache results of individual queries, edit the query (either via Datagrok UI if a
query already exists, or by editing the corresponding .sql file of the package queries)
and specify the meta.cache
and meta.invalidate
fields:
--name: getProductNames
--input: string department
--meta.cache: true
--meta.invalidate: 0 0 1 * * ?
select distinct name from products p
where p.department = @department
You can use these fields to filter connections with smart search:
Field | Description |
---|---|
id | |
name | |
server | |
port | |
db | |
login | |
dataSource | |
description | |
createdOn | |
updatedOn | |
author | User object |
starredBy | User object |
commentedBy | User object |
usedBy | User object |
For some cases connection may require custom JDBC connection string. For this case, JDBC-based data connection has parameter "Conn. string". If filled, it will be used for connection, and all other parameters will be ignored except for "Login" and "Password".
See also: