GraphQL API
Types
A typed schema is automatically generated from the table and its columns. However, advanced usage of tables often creates new columns - or changes the type of existing ones - and therefore falls outside the schema. Fields which create columns also allow aliasing; otherwise the existing column is replaced.
Output
A column within the schema can be accessed by Table.columns.
{
  columns {
    <name> { ... }
  }
}
Any column can be accessed by name using Dataset.column and inline fragments.
{
  column(name: "...") {
    ... on <Type>Column { ... }
  }
}
Input
Input types don't have the equivalent of inline fragments, but GraphQL is converging on the OneOf input pattern. Effectively the type of the field becomes the name of the field.
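As a rough illustration of the OneOf idea, here is a minimal sketch in plain Python; it is not Graphique's implementation. It assumes an input object is valid only when exactly one field is provided and that field is non-null. The function name `resolve_one_of` and the field names `int` and `str` are hypothetical.

```python
def resolve_one_of(value: dict) -> tuple[str, object]:
    """Return the (field name, value) pair of a OneOf-style input."""
    # Exactly one field may be provided.
    if len(value) != 1:
        raise ValueError(f"expected exactly one field, got {sorted(value)}")
    ((name, item),) = value.items()
    # The single provided field must carry a value.
    if item is None:
        raise ValueError(f"field {name!r} must not be null")
    return name, item


# The field name tells the server which type of value was sent.
print(resolve_one_of({"int": 1}))
```

The field name does for inputs what an inline fragment does for outputs: it selects the type.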
Table.filter provides simple queries for columns within the schema, and a where argument for complex expressions.
{
  filter(<name>: { ... }, ..., where: { ... }) { ... }
}
Table.project also supports complex expressions with aliased column names.
{
  project(columns: [{ ... }]) { ... }
}
Note that list inputs also accept a single value, which is coerced to a list of one.
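The coercion rule comes from GraphQL itself rather than from anything specific to this library. A minimal sketch of the behavior in plain Python follows; the helper name `coerce_list` is hypothetical.

```python
def coerce_list(value):
    """Mimic GraphQL input coercion for a list type."""
    if value is None:
        return None  # null stays null; it does not become [None]
    if isinstance(value, list):
        return value  # lists pass through unchanged
    return [value]  # a single value becomes a list of one


print(coerce_list({"name": "x"}))
print(coerce_list([{"name": "x"}, {"name": "y"}]))
```

Under this rule a single projection can be written without the surrounding brackets.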
The versatility of expressions can become complicated and unreadable. Consider applying expected projections to the root at startup, making filter and project simpler without a performance penalty. Fields are not automatically camel-cased for the same reason; it's simpler to project first.
Partitions
Partitioned parquet datasets have custom optimizations for fragment keys:
- filter on fragment keys
- group on fragment keys with counts
- order with limit on fragment keys
Otherwise there is no advantage in the root being an arrow dataset over an ibis table.
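To show why fragment keys allow such optimizations, the sketch below prunes fragments using only their paths. It is an illustration, not Graphique's implementation. It assumes hive-style `key=value` directory names, which the text above does not specify. The helper names and paths are hypothetical.

```python
def partition_keys(path: str) -> dict[str, str]:
    """Parse hive-style `key=value` directory segments from a fragment path."""
    segments = path.split("/")[:-1]  # drop the file name
    return dict(segment.split("=", 1) for segment in segments if "=" in segment)


def prune(paths: list[str], **keys: str) -> list[str]:
    """Keep fragments whose partition keys match, without opening any file."""
    return [
        path
        for path in paths
        if all(partition_keys(path).get(name) == value for name, value in keys.items())
    ]


paths = [
    "year=2023/month=1/part-0.parquet",
    "year=2023/month=2/part-0.parquet",
    "year=2024/month=1/part-0.parquet",
]
print(prune(paths, year="2023"))
```

Because the key values live in the paths, a filter on them can skip whole files before any rows are read.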
Roots
A single root can be attached directly. This is not common - but perfectly valid - GraphQL usage. The resolvers receive the table object as the first argument, just as with nested fields.
Alternatively multiple named roots can be attached. This also enables federation, which can be ignored if not needed. Optionally keys can also be specified, which are translated into matching equality queries.
IntColumn.takeFrom enables using one table as an index into another. Whether there is a performance benefit over filter will vary by backend, but it has the potential to be orders of magnitude faster than a scan.
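The idea of one table indexing another can be sketched in plain Python. This is a conceptual illustration only, and it assumes the integer column holds row positions in the target table. The data and helper names are hypothetical.

```python
cities = [
    {"state": "CA", "city": "Fresno"},
    {"state": "NY", "city": "Albany"},
    {"state": "CA", "city": "Oakland"},
]

# Hypothetical index table: each state maps to the row positions of its cities.
index = {"CA": [0, 2], "NY": [1]}


def take(rows: list[dict], positions: list[int]) -> list[dict]:
    """Fetch rows by position; the work grows with the number of positions."""
    return [rows[position] for position in positions]


def scan(rows: list[dict], state: str) -> list[dict]:
    """Filter by value; the work grows with the size of the table."""
    return [row for row in rows if row["state"] == state]


# Both approaches select the same rows.
print(take(cities, index["CA"]) == scan(cities, "CA"))
```

The potential speedup comes from touching only the indexed rows rather than every row.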
Column selection
Each field resolver transforms a table or column as needed. Ibis is lazily executed, so there is no select field because it's handled automatically. Conversely if multiple table fields are requested, the table selection is cached for performance and consistency.
Arrays
Ibis Array columns are supported. unnest flattens arrays back to scalars, and group: {aggregate: {collect: ...}} also creates arrays.
Nulls
GraphQL continues the long tradition of confusing "optional" with "nullable". Graphique strives to be explicit regarding what may be omitted versus what may be null.
Output
Ibis has first-class support for nulls, so array scalars are nullable. Non-null scalars are used where relevant.
Columns and rows are nullable to allow partial query results. Dataset.optional enables client-controlled nullability.
Input
Default values and non-null types are used wherever possible. When an input is optional and has no natural default, there are two cases to distinguish:
- if null is expected and semantically different, the input's description explains null behavior
- otherwise the input has an @optional directive, and explicit null behavior is undefined
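The difference between an omitted input and an explicit null can be sketched with a sentinel default. This is a generic Python pattern, not Graphique's code. The names `UNSET` and `describe` are hypothetical here, though Strawberry exposes a similar `UNSET` sentinel.

```python
UNSET = object()  # hypothetical sentinel meaning "the argument was omitted"


def describe(value=UNSET) -> str:
    """Report whether an input was omitted, null, or given a value."""
    if value is UNSET:
        return "omitted"
    if value is None:
        return "explicit null"
    return f"value {value!r}"


print(describe())
print(describe(None))
print(describe(0))
```

A plain `None` default cannot make this distinction, because omitting the argument and passing null then look the same.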
Extending
The Dataset interface can be subclassed to add fields, making a custom strawberry ASGI app.
import strawberry
from graphique import Dataset


@strawberry.type
class NewType(Dataset):
    __init__ = Dataset.__init__  # reuse the Dataset constructor

    @strawberry.field
    def new_field(self) -> T:  # T is a placeholder for the field's return type
        self.table  # the underlying ibis table