-
Notifications
You must be signed in to change notification settings - Fork 111
TM027 Tables
This page describes the design goals, semantics, and syntax for table values in Pyret. It is a mish-mash of various design proposals, with an attempt to solidify what it is we are aiming to create.
Warning: This proposal is still in its infancy, so this page might change drastically before any final launch.
- Summary
- Notation
- Constructing Tables
- Extending Tables
- Selecting Tables
-
Filtering Tables (
sieve
) - Ordering Tables
- Extracting Tables
Tables are a new type of value being added to Pyret with the aim of allowing students to process tabular data with ease. In order to make this as intuitive as possible, we have opted to implement an SQL-like mini language in the syntax of Pyret which provides various data-processing functions to users.
The following notation is used in this document:
Notation | Meaning |
---|---|
<NAME> |
Meta-variable |
<[NAME]> |
JS Function NAME
|
The syntax for constructing a table is as follows:
table: <COL-NAME> [:: <COL-ANN>]? [, <COL-NAME> [:: <COL-ANN>]?]*
[row: <ROW-VALUE> [, <ROW-VALUE>]*]+
end
Consider the following table:
Timestamp | What is your name? | What is your age? | Who is your vote for president? |
---|---|---|---|
2016-01-01 | James McAvoy | 30 | Magneto |
2016-01-02 | Matt Murdock | 35 | Stick |
2016-01-03 | Shallan Davar | 20 | Stick |
This table would be constructed in the following way in Pyret:
my-table = table: timestamp, name, age, president
row: "2016-01-01", "James McAvoy", 30, "Magneto"
row: "2016-01-02", "Matt Murdock", 35, "Stick"
row: "2016-01-03", "Shallan Davar", 20, "Stick"
end
Another example:
world = table: name, gdp, pop, area
row: "Atlantis", 1000, 10, 0
row: "Amestris", 9999, 1000, 5000
row: "Kamina City", 2000, 9999, 9999
end
(Note that references to my-table
in this document are
referring to the above example)
Note: As it currently stands, tables are immutable.
Under the hood, tables are Pyret values (hosted in a JS module...see
/src/js/trove/table.js
) which are targets of desugaring.
For example, the following code:
table: a :: Number, b
row: 1, 2
row: 3, 4
end
desugars into
<[makeTable]>([raw-array: "a", "b"],
[raw-array:
[raw-array: <[checkNumber]>(1), 2]
[raw-array: <[checkNumber]>(3), 4]])
"Extending" tables refers to updating their schema to include new columns. In general, there are two main types of extensions:
Name | Description |
---|---|
mapped |
Values can be computed from one row |
reduced |
Values are accumulated across multiple rows |
For example, adding a column to my-table
with a boolean value indicating
whether or not that person is voting for Magneto would be a mapped
extension,
while something like the average age would be a reduced
extension.
The syntax for extending tables is as follows:
extend <TABLE> [using <COLUMN> [, <COLUMN>]*]? :
(<NEW-COLUMN-NAME>: (<BINOP> | (<EXPR> of <COLUMN>)))+
end
The column names following using
are in scope within
the body of the extend
expression.
There are two variants of extensions. The first one, of the form
<NEW-COLUMN-NAME>: <BINOP>
represents a mapped
extension.
Alternatively, extensions of the form <NEW-COLUMN-NAME>: <EXPR> of <COLUMN>
represent the usage of a Reducer object with the specified column
to create a reduced
extension.
A Reducer object is defined as follows:
type Reducer<A, B> = {
# The base case of the reducer
one :: (A -> B),
# The combinator function
reduce :: (B, A -> B)
}
Note that extending an empty table returns an empty table.
Example usage:
running-sum :: Reducer<Number, Number> = {
one: lam(v :: Number): v end,
reduce: lam(curr, from-row):
curr + from-row
end
}
extend world using gdp, pop, area:
gdp-per-pop: gdp / pop,
area-so-far: running-sum of area
end
extend my-table:
seven: 7
end
Transforming tables is similar to extending them, but users are limited to overwriting the contents of existing columns.
The syntax for transforming tables is as follows:
transform <TABLE> [using <COLUMN> [, <COLUMN>]*]? :
<DEST-COLUMN>: (<BINOP> | (<EXPR> of <COLUMN>))
end
The semantics are the same as those in extend
, with the additional
note that if <DEST-COLUMN>
is one of the columns specified after using
,
its value in either the mapped
or reduced
value is its original
(pre-transformed) value (naturally).
Example usage:
transform my-table using president:
# Change to a boolean indicating if they are voting for Magneto
president: (president == "Magneto")
end
Selecting from tables is the process of pulling out a
subset of columns from the table (akin to the SELECT
statement
in SQL).
The syntax for selection is as follows:
select <COLUMN> [, <COLUMN>]* from <TABLE> end
Example usage:
names-and-ages = select name, age from my-table end
Filtering is the process of selecting rows from a table which pass a given criteria.
The syntax for filtering a table is as follows:
sieve <TABLE> [using <COLUMN> [, <COLUMN>]*]? :
<BINOP>
end
<BINOP>
should return a boolean value. If that value is
true
for a given row, that row is kept.
Example usage:
wants-stick = sieve my-table using president: president == "Stick" end
Users may also sort tables by any column (assuming, of course, that its datatype can be compared...something for the to-do list is the ability to use an arbitrary comparator).
The syntax for ordering a table is as follows:
order <TABLE>:
<COLUMN> (ascending | descending)
end
Example usage:
sorted-by-age = order my-table: age descending end
The extract
functionality allows users to pull columns out of
tables as lists.
The syntax for extraction is as follows:
extract <COLUMN> from <TABLE> end
Example usage:
check:
(extract name from world end) is [list: "Atlantis", "Amestris", "Kamina City"]
end