Example Schema

Generated from a test fixture of zipcodes.

Query (Table)

a dataset with a derived schema

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/dataset.html#classes)
schema Schema! dataset schema
optional Table Nullable field to stop error propagation, enabling partial query results. Will be replaced by client controlled nullability.
length Long! number of rows
any Boolean! Return whether there are at least `length` rows. May be significantly faster than `length` for out-of-core data.
length Long!
size Long buffer size in bytes; null if table is not loaded
column Column Return column of any type by name. This is typically only needed for aliased or casted columns. If the column is in the schema, `columns` can be used instead.
name [String!]! column name(s); multiple names access nested struct fields
cast String! cast array to [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
safe Boolean! check for conversion errors on cast
slice Table! Return zero-copy slice of table. Can also be used to force loading a dataset.
offset Long! number of rows to skip; negative value skips from the end
length Long maximum number of rows to return
reverse Boolean! reverse order after slicing; forces a copy
group Table! Return table grouped by columns. See `column` for accessing any column which has changed type. See `tables` to split on any aggregated list columns.
by [String!]! column names; empty will aggregate into a single row table
counts String! optionally include counts in an aliased column
ordered Boolean! optionally disable parallelization to maintain ordering
aggregate HashAggregates! aggregation functions applied to other columns
runs Table! Return table grouped by pairwise differences. Differs from `group` by relying on adjacency, and is typically faster. Other columns are transformed into list columns. See `column` and `tables` to further access lists.
by [String!]! column names
split [Diff!]! optional predicates to split on; scalars are compared to pairwise difference
counts String! optionally include counts in an aliased column
sort Table! Return table slice sorted by specified columns. Optimized for length == 1; matches min or max values.
by [String!]! column names; prefix with `-` for descending order
length Long maximum number of rows to return; may be significantly faster but is unstable
nullPlacement String! where nulls in input should be sorted; incompatible with `length`
rank Table! Return table selected by maximum dense rank.
by [String!]! column names; prefix with `-` for descending order
max Int! maximum dense rank to select; optimized for == 1 (min or max)
apply Table! Return view of table with vector functions applied across columns. Applied functions load arrays into memory as needed. See `scan` for scalar functions, which do not require loading.
cumulativeMax [Cumulative!]! Compute the cumulative max over a numeric input.
cumulativeMean [Cumulative!]! Compute the cumulative mean over a numeric input.
cumulativeMin [Cumulative!]! Compute the cumulative min over a numeric input.
cumulativeProd [Cumulative!]! Compute the cumulative product over a numeric input.
cumulativeSum [Cumulative!]! Compute the cumulative sum over a numeric input.
fillNullBackward [Field!]! Carry non-null values backward to fill null slots.
fillNullForward [Field!]! Carry non-null values forward to fill null slots.
pairwiseDiff [Pairwise!]! Compute first order difference of an array.
rank [Rank!]! Compute numerical ranks of an array (1-based).
list ListFunction! functions for list arrays.
flatten Table! Return table with list arrays flattened. At least one list column must be referenced, and all list columns must have the same lengths.
indices String!
tables [Table]! Return a list of tables by splitting list columns. At least one list column must be referenced, and all list columns must have the same lengths.
aggregate ⚠️ Table! Return table with scalar aggregate functions applied to list columns.

⚠️ DEPRECATED

List scalar functions will be moved to `scan(...: {list: ...})`
approximateMedian [ScalarAggregate!]! Approximate median of a numeric array with T-Digest algorithm.
count [CountAggregate!]! Count the number of null / non-null values.
countDistinct [CountAggregate!]! Count the number of unique values.
distinct [CountAggregate!]! distinct values within each scalar
first [Field!]! first value of each list scalar
last [Field!]! last value of each list scalar
max [ScalarAggregate!]! Compute the minimum or maximum values of a numeric array.
mean [ScalarAggregate!]! Compute the mean of a numeric array.
min [ScalarAggregate!]! Compute the minimum or maximum values of a numeric array.
product [ScalarAggregate!]! Compute the product of values in a numeric array.
stddev [VarianceAggregate!]! Calculate the standard deviation of a numeric array.
sum [ScalarAggregate!]! Compute the sum of a numeric array.
tdigest [TDigestAggregate!]! Approximate quantiles of a numeric array with T-Digest algorithm.
variance [VarianceAggregate!]! Calculate the variance of a numeric array.
scan Table! Select rows and project columns without memory usage.
filter Expression! selected rows
columns [Projection!]! projected columns
join Table! Provisional: [join](https://arrow.apache.org/docs/python/generated/pyarrow.dataset.Dataset.html#pyarrow.dataset.Dataset.join) this table with another table on the root Query type.
right String! name of right table; must be on root Query type
keys [String!]! column names used as keys on the left side
rightKeys [String!] column names used as keys on the right side; defaults to left side.
joinType String! the kind of join: 'left semi', 'right semi', 'left anti', 'right anti', 'inner', 'left outer', 'right outer', 'full outer'
leftSuffix String! add suffix to left column names; for preventing collisions
rightSuffix String! add suffix to right column names; for preventing collisions.
coalesceKeys Boolean! omit duplicate keys
take Table! Select rows from indices.
indices [Long!]!
dropNull Table! Remove missing values from referenced columns in the table.
columns Columns! fields for each column
row Row Return scalar values at index.
index Long!
filter Table! Return table with rows which match all queries. See `scan(filter: ...)` for more advanced queries. Additional feature: sorted tables support binary search
latitude FloatFilter!
longitude FloatFilter!
state StrFilter!
city StrFilter!
county StrFilter!
zipcode IntFilter!
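
As a rough illustration of how the fields above compose, the sketch below builds a request body for a chained query: filter rows by `zipcode`, sort the result, and read back two columns. It assumes the server accepts the conventional GraphQL-over-HTTP JSON body, that the fields listed for the root `Query` are also available on every returned `Table`, and that arguments left out of the query have defaults. The zipcode bound is invented for the example.

```python
import json

# Chain of Table fields taken from the listing above: filter -> sort -> columns.
# The zipcode bound is illustrative only.
QUERY = """
{
  filter(zipcode: {ge: 90000}) {
    length
    sort(by: ["state", "-zipcode"], length: 3) {
      columns {
        state { values }
        zipcode { values }
      }
    }
  }
}
"""


def build_request_body(query: str) -> str:
    """Serialize a query as the JSON body of a GraphQL POST request."""
    return json.dumps({"query": query})


body = build_request_body(QUERY)
print(body)
```

The body can be sent with any HTTP client; no endpoint is named here because the source does not give one.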

Objects

Base64Column

column of ordinal values

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Base64]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique Base64Set! unique values and counts
value Base64 scalar value at index
index Long!
dropNull [Base64!]! Drop nulls from the input.
first Base64 Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Base64 Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Base64 Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Base64 Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Base64!
start Long!
end Long
fillNull [Base64!]! Replace each null element in values with `value`.
value Base64!

Base64Set

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Base64]! list of values

BoolSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Boolean]! list of values

BooleanColumn

column of booleans

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Boolean]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique BoolSet! unique values and counts
value Boolean scalar value at index
index Long!
dropNull [Boolean!]! Drop nulls from the input.
first Boolean Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Boolean Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Boolean Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Boolean Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Boolean!
start Long!
end Long
fillNull [Boolean!]! Replace each null element in values with `value`.
value Boolean!
mode BoolSet! Compute the modal (most common) values of a numeric array.
n Int!
skipNulls Boolean!
minCount Int!
sum Boolean Compute the sum of a numeric array.
skipNulls Boolean!
minCount Int!
product Boolean Compute the product of values in a numeric array.
skipNulls Boolean!
minCount Int!
mean Float Compute the mean of a numeric array.
skipNulls Boolean!
minCount Int!
indicesNonzero [Long!]! Return the indices of the values in the array that are non-zero.
any Boolean Test whether any element in a boolean array evaluates to true.
skipNulls Boolean!
minCount Int!
all Boolean Test whether all elements in a boolean array evaluate to true.
skipNulls Boolean!
minCount Int!

Columns

fields for each column

Field Argument Type Description
latitude FloatColumn
longitude FloatColumn
state StringColumn
city StringColumn
county StringColumn
zipcode IntColumn

DateColumn

column of ordinal values

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Date]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique DateSet! unique values and counts
value Date scalar value at index
index Long!
dropNull [Date!]! Drop nulls from the input.
first Date Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Date Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Date Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Date Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Date!
start Long!
end Long
fillNull [Date!]! Replace each null element in values with `value`.
value Date!

DateSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Date]! list of values

DatetimeColumn

column of ordinal values

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [DateTime]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique DatetimeSet! unique values and counts
value DateTime scalar value at index
index Long!
dropNull [DateTime!]! Drop nulls from the input.
first DateTime Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last DateTime Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min DateTime Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max DateTime Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value DateTime!
start Long!
end Long
fillNull [DateTime!]! Replace each null element in values with `value`.
value DateTime!

DatetimeSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [DateTime]! list of values

DecimalColumn

column of floats or decimals

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Decimal]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique DecimalSet! unique values and counts
value Decimal scalar value at index
index Long!
dropNull [Decimal!]! Drop nulls from the input.
first Decimal Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Decimal Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Decimal Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Decimal Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Decimal!
start Long!
end Long
fillNull [Decimal!]! Replace each null element in values with `value`.
value Decimal!
mode DecimalSet! Compute the modal (most common) values of a numeric array.
n Int!
skipNulls Boolean!
minCount Int!
sum Decimal Compute the sum of a numeric array.
skipNulls Boolean!
minCount Int!
product Decimal Compute the product of values in a numeric array.
skipNulls Boolean!
minCount Int!
mean Float Compute the mean of a numeric array.
skipNulls Boolean!
minCount Int!
indicesNonzero [Long!]! Return the indices of the values in the array that are non-zero.
stddev Float Calculate the standard deviation of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
variance Float Calculate the variance of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
quantile [Float]! Compute an array of quantiles of a numeric array or chunked array.
q [Float!]!
interpolation String!
skipNulls Boolean!
minCount Int!
tdigest [Float]! Approximate quantiles of a numeric array with T-Digest algorithm.
q [Float!]!
delta Int!
bufferSize Int!
skipNulls Boolean!
minCount Int!

DecimalSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Decimal]! list of values

DurationColumn

column of elapsed times

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Duration]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique DurationSet! unique values and counts
value Duration scalar value at index
index Long!
dropNull [Duration!]! Drop nulls from the input.

DurationSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Duration]! list of values

FloatColumn

column of floats or decimals

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Float]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique FloatSet! unique values and counts
value Float scalar value at index
index Long!
dropNull [Float!]! Drop nulls from the input.
first Float Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Float Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Float Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Float Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Float!
start Long!
end Long
fillNull [Float!]! Replace each null element in values with `value`.
value Float!
mode FloatSet! Compute the modal (most common) values of a numeric array.
n Int!
skipNulls Boolean!
minCount Int!
sum Float Compute the sum of a numeric array.
skipNulls Boolean!
minCount Int!
product Float Compute the product of values in a numeric array.
skipNulls Boolean!
minCount Int!
mean Float Compute the mean of a numeric array.
skipNulls Boolean!
minCount Int!
indicesNonzero [Long!]! Return the indices of the values in the array that are non-zero.
stddev Float Calculate the standard deviation of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
variance Float Calculate the variance of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
quantile [Float]! Compute an array of quantiles of a numeric array or chunked array.
q [Float!]!
interpolation String!
skipNulls Boolean!
minCount Int!
tdigest [Float]! Approximate quantiles of a numeric array with T-Digest algorithm.
q [Float!]!
delta Int!
bufferSize Int!
skipNulls Boolean!
minCount Int!
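
The `ddof` argument of `stddev` and `variance` is the delta degrees of freedom: the divisor is the number of values minus `ddof`. A minimal pure-Python sketch of that formula follows, checked against the standard library. It ignores nulls and `minCount`, and it is a model of the arithmetic only, not the server's implementation.

```python
import math
import statistics


def variance(values, ddof=0):
    """Variance with divisor len(values) - ddof."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - ddof)


def stddev(values, ddof=0):
    """Standard deviation as the square root of the variance."""
    return math.sqrt(variance(values, ddof))


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# ddof=0 is the population variance, ddof=1 the sample variance.
print(variance(sample, ddof=0), statistics.pvariance(sample))
print(variance(sample, ddof=1), statistics.variance(sample))
```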

FloatSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Float]! list of values

IntColumn

column of integers

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Int]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique IntSet! unique values and counts
value Int scalar value at index
index Long!
dropNull [Int!]! Drop nulls from the input.
first Int Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Int Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Int Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Int Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Int!
start Long!
end Long
fillNull [Int!]! Replace each null element in values with `value`.
value Int!
mode IntSet! Compute the modal (most common) values of a numeric array.
n Int!
skipNulls Boolean!
minCount Int!
sum Int Compute the sum of a numeric array.
skipNulls Boolean!
minCount Int!
product Int Compute the product of values in a numeric array.
skipNulls Boolean!
minCount Int!
mean Float Compute the mean of a numeric array.
skipNulls Boolean!
minCount Int!
indicesNonzero [Long!]! Return the indices of the values in the array that are non-zero.
stddev Float Calculate the standard deviation of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
variance Float Calculate the variance of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
quantile [Float]! Compute an array of quantiles of a numeric array or chunked array.
q [Float!]!
interpolation String!
skipNulls Boolean!
minCount Int!
tdigest [Float]! Approximate quantiles of a numeric array with T-Digest algorithm.
q [Float!]!
delta Int!
bufferSize Int!
skipNulls Boolean!
minCount Int!
takeFrom Dataset Select indices from a table on the root Query type.
field String!

IntSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Int]! list of values

ListColumn

column of lists

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
value Column scalar column at index
index Long!
values [Column]! list of columns
dropNull [Column!]! Drop nulls from the input.
flatten Column! concatenation of all sub-lists

LongColumn

column of integers

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Long]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique LongSet! unique values and counts
value Long scalar value at index
index Long!
dropNull [Long!]! Drop nulls from the input.
first Long Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Long Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Long Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Long Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Long!
start Long!
end Long
fillNull [Long!]! Replace each null element in values with `value`.
value Long!
mode LongSet! Compute the modal (most common) values of a numeric array.
n Int!
skipNulls Boolean!
minCount Int!
sum Long Compute the sum of a numeric array.
skipNulls Boolean!
minCount Int!
product Long Compute the product of values in a numeric array.
skipNulls Boolean!
minCount Int!
mean Float Compute the mean of a numeric array.
skipNulls Boolean!
minCount Int!
indicesNonzero [Long!]! Return the indices of the values in the array that are non-zero.
stddev Float Calculate the standard deviation of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
variance Float Calculate the variance of a numeric array.
ddof Int!
skipNulls Boolean!
minCount Int!
quantile [Float]! Compute an array of quantiles of a numeric array or chunked array.
q [Float!]!
interpolation String!
skipNulls Boolean!
minCount Int!
tdigest [Float]! Approximate quantiles of a numeric array with T-Digest algorithm.
q [Float!]!
delta Int!
bufferSize Int!
skipNulls Boolean!
minCount Int!
takeFrom Dataset Select indices from a table on the root Query type.
field String!

LongSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Long]! list of values

Row

scalar fields

Field Argument Type Description
latitude Float
longitude Float
state String
city String
county String
zipcode Int

Schema

dataset schema

Field Argument Type Description
names [String!]! field names
types [String!]! [arrow types](https://arrow.apache.org/docs/python/api/datatypes.html), corresponding to `names`
partitioning [String!]! partition keys
index [String!]! sorted index columns

StrSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [String]! list of values

StringColumn

column of strings

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [String]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique StrSet! unique values and counts
value String scalar value at index
index Long!
dropNull [String!]! Drop nulls from the input.
first String Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last String Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min String Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max String Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value String!
start Long!
end Long
fillNull [String!]! Replace each null element in values with `value`.
value String!

StructColumn

column of structs

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
value JSON scalar json object at index
index Long!
names [String!]! field names
column Column Return struct field as a column.
name [String!]! field name(s); multiple names access nested fields

TimeColumn

column of ordinal values

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!
values [Time]! list of values
countDistinct Long! Count the number of unique values.
mode String!
unique TimeSet! unique values and counts
value Time scalar value at index
index Long!
dropNull [Time!]! Drop nulls from the input.
first Time Compute the first value in each group.
skipNulls Boolean!
minCount Int!
last Time Compute the last value in each group.
skipNulls Boolean!
minCount Int!
min Time Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
max Time Compute the minimum or maximum values of a numeric array.
skipNulls Boolean!
minCount Int!
index Long! Find the index of the first occurrence of a given value.
value Time!
start Long!
end Long
fillNull [Time!]! Replace each null element in values with `value`.
value Time!

TimeSet

unique values and counts

Field Argument Type Description
counts [Long!]! list of counts
length Long! array length
values [Time]! list of values

Inputs

Binary

Binary functions.

Field Type Description
length Expression Compute string lengths.
repeat [Expression!]! Repeat a binary string.
reverse Expression Reverse binary input.
join [Expression!]! Join a list of strings together with a separator.
joinElementWise [Expression!]! Join string arguments together, with the last argument as separator.
nullHandling String!
nullReplacement String!
replaceSlice Expression Replace a slice of a binary string.
start Int!
stop Int!
replacement Base64!

BitWise

Bit-wise functions.

Field Type Description
and [Expression!]! Bit-wise AND the arguments element-wise.
not [Expression!]! Bit-wise negate the arguments element-wise.
or [Expression!]! Bit-wise OR the arguments element-wise.
xor [Expression!]! Bit-wise XOR the arguments element-wise.
shiftLeft [Expression!]! Left shift `x` by `y`.
shiftRight [Expression!]! Right shift `x` by `y`.

CountAggregate

options for count aggregation

Field Type Description
name String! column name
alias String! output column name
mode String!

Cumulative

Field Type Description
name String! column name
alias String! output column name
start Float!
skipNulls Boolean!
checked Boolean!
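
For orientation, the five cumulative functions that take this input (`cumulativeSum`, `cumulativeProd`, `cumulativeMin`, `cumulativeMax`, `cumulativeMean`) can be modeled on a plain list with `itertools.accumulate`. This is a sketch of the arithmetic only: it assumes no nulls and does not model `start`, `skipNulls`, or `checked`.

```python
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]

cumulative_sum = list(accumulate(values))                 # running total
cumulative_prod = list(accumulate(values, operator.mul))  # running product
cumulative_min = list(accumulate(values, min))            # smallest value so far
cumulative_max = list(accumulate(values, max))            # largest value so far
# Running mean: running total divided by the number of values seen so far.
cumulative_mean = [
    total / count for count, total in enumerate(cumulative_sum, start=1)
]

print(cumulative_sum, cumulative_prod, cumulative_min, cumulative_max)
print(cumulative_mean)
```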

Diff

Discrete difference predicates, applied in forwards direction (array[i + 1] ? array[i]).

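By default, adjacent values are compared with not equal. Specifying null for a predicate compares adjacent values pairwise with that operator. Specifying a float computes the discrete difference first and compares it to that float; durations may be given in float seconds.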

Field Type Description
name String!
lt Float <
le Float <=
gt Float >
ge Float >=
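
One reading of the description above, sketched in pure Python: a new run starts wherever the predicate holds between a value and its predecessor. The function name and the way predicates are passed are hypothetical; only the comparison rules come from the text.

```python
import operator

# Predicate names mirror the Diff fields; the mapping itself is illustrative.
PREDICATES = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}


def split_runs(values, predicate=None, threshold=None):
    """Split values into runs of adjacent items.

    No predicate: split where adjacent values are not equal.
    Predicate without threshold: split where predicate(current, previous) holds.
    Predicate with threshold: split where predicate(current - previous, threshold) holds.
    """
    compare = PREDICATES.get(predicate, operator.ne)
    runs = []
    for index, current in enumerate(values):
        if index == 0:
            runs.append([current])
            continue
        previous = values[index - 1]
        if threshold is None:
            boundary = compare(current, previous)
        else:
            boundary = compare(current - previous, threshold)
        if boundary:
            runs.append([current])
        else:
            runs[-1].append(current)
    return runs


print(split_runs(["CA", "CA", "NV", "CA"]))
print(split_runs([1, 2, 3, 7, 8], "gt", 1))
print(split_runs([1, 2, 3, 1, 2], "lt"))
```

Because only neighbors are compared, the same value can appear in more than one run, which is the difference from `group` noted in the `runs` description.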

ElementWise

Element-wise aggregate functions.

Field Type Description
min [Expression!]! Find the element-wise minimum value.
max [Expression!]! Find the element-wise maximum value.
skipNulls Boolean!

Expression

Dataset expression used for scanning.

Expects one of: a field name, a scalar, or an operator with expressions. Single values can be passed for an input List.
* eq with a list scalar is equivalent to isin
* eq with a null scalar is equivalent to is_null
* ne with a null scalar is equivalent to is_valid

Field Type Description
name [String!]! field name(s)
cast String! cast as [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
safe Boolean! check for conversion errors on cast
value JSON JSON scalar; also see typed scalars
kleene Boolean! use kleene logic for booleans
checked Boolean! check for overflow errors
base64 [Base64!]!
date [Date!]!
datetime [DateTime!]!
decimal [Decimal!]!
duration [Duration!]!
time [Time!]!
eq [Expression!]! ==
ne [Expression!]! !=
lt [Expression!]! <
le [Expression!]! <=
gt [Expression!]! >
ge [Expression!]! >=
inv Expression ~
abs Expression Calculate the absolute value of the argument element-wise.
add [Expression!]! Add the arguments element-wise.
divide [Expression!]! Divide the arguments element-wise.
multiply [Expression!]! Multiply the arguments element-wise.
negate Expression Negate the argument element-wise.
power [Expression!]! Raise arguments to power element-wise.
sign Expression Get the signedness of the arguments element-wise.
subtract [Expression!]! Subtract the arguments element-wise.
bitWise BitWise bit-wise functions
rounding Rounding rounding functions
log Log logarithmic functions
trig Trig trigonometry functions
elementWise ElementWise element-wise aggregate functions
and [Expression!]! &
andNot [Expression!]! Logical 'and not' boolean values.
or [Expression!]! |
xor [Expression!]! Logical 'xor' boolean values.
utf8 Utf8 utf8 string functions
stringIsAscii Expression Classify strings as ASCII.
substring MatchSubstring match substring functions
binary Binary binary functions
setLookup SetLookup set lookup functions
isFinite Expression Return true if value is finite.
isInf Expression Return true if infinity.
isNan Expression Return true if NaN.
trueUnlessNull Expression Return true if non-null, else return null.
caseWhen [Expression!]! Choose values based on multiple conditions.
choose [Expression!]! Choose values from several arrays.
coalesce [Expression!]! Select the first non-null value.
ifElse [Expression!]! Choose values based on a condition.
temporal Temporal temporal functions
replaceWithMask [Expression!]! Replace items selected with a mask.
list Lists list array functions
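
The three equivalences stated for `eq` and `ne` can be modeled with a small evaluator over Expression-shaped dictionaries. This is a sketch for illustration, not the server's evaluation: it handles only `name`, `value`, `eq`, `ne`, and `and`, it takes `name` as a single string, and the sample rows are made up.

```python
def evaluate(expr, row):
    """Evaluate a small subset of Expression-shaped dicts against one row."""
    if "name" in expr:
        return row[expr["name"]]
    if "value" in expr:
        return expr["value"]
    if "eq" in expr:
        left, right = (evaluate(arg, row) for arg in expr["eq"])
        if right is None:
            return left is None        # eq with null scalar: is_null
        if isinstance(right, list):
            return left in right       # eq with list scalar: isin
        return left == right
    if "ne" in expr:
        left, right = (evaluate(arg, row) for arg in expr["ne"])
        if right is None:
            return left is not None    # ne with null scalar: is_valid
        return left != right
    if "and" in expr:
        return all(evaluate(arg, row) for arg in expr["and"])
    raise ValueError(f"unsupported expression: {expr!r}")


# Made-up rows; None stands in for a null value.
rows = [
    {"state": "CA", "county": "Los Angeles"},
    {"state": "NV", "county": None},
    {"state": "NY", "county": "Kings"},
]

expression = {
    "and": [
        {"eq": [{"name": "state"}, {"value": ["CA", "NV"]}]},
        {"ne": [{"name": "county"}, {"value": None}]},
    ]
}

selected = [row for row in rows if evaluate(expression, row)]
print(selected)
```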

Field

name and optional alias for compute functions

Field Type Description
name String! column name
alias String! output column name

FloatFilter

predicates for scalars

Field Type Description
eq [Float] == or `isin`; `null` is equivalent to arrow `is_null`.
ne Float !=; `null` is equivalent to arrow `is_valid`.
lt Float <
le Float <=
gt Float >
ge Float >=

HashAggregates

Field Type Description
all [ScalarAggregate!]! Test whether all elements in a boolean array evaluate to true.
any [ScalarAggregate!]! Test whether any element in a boolean array evaluates to true.
approximateMedian [ScalarAggregate!]! Approximate median of a numeric array with T-Digest algorithm.
count [CountAggregate!]! Count the number of null / non-null values.
countDistinct [CountAggregate!]! Count the number of unique values.
first [ScalarAggregate!]! Compute the first value in each group.
firstLast [ScalarAggregate!]! Compute the first and last values of an array.
last [ScalarAggregate!]! Compute the last value in each group.
max [ScalarAggregate!]! Compute the minimum or maximum values of a numeric array.
mean [ScalarAggregate!]! Compute the mean of a numeric array.
min [ScalarAggregate!]! Compute the minimum or maximum values of a numeric array.
minMax [ScalarAggregate!]! Compute the minimum and maximum values of a numeric array.
product [ScalarAggregate!]! Compute the product of values in a numeric array.
stddev [VarianceAggregate!]! Calculate the standard deviation of a numeric array.
sum [ScalarAggregate!]! Compute the sum of a numeric array.
tdigest [TDigestAggregate!]! Approximate quantiles of a numeric array with T-Digest algorithm.
variance [VarianceAggregate!]! Calculate the variance of a numeric array.
distinct [CountAggregate!]! distinct values within each scalar
list [Field!]! all values within each scalar
one [Field!]! arbitrary value within each scalar
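
To make the aggregate names concrete, the sketch below groups made-up `(state, zipcode)` pairs in pure Python and computes a few of the aggregates listed above per group. It models the meaning of the names only; aliases, null handling, and parallel execution are not modeled.

```python
from collections import defaultdict

# Illustrative rows only: (state, zipcode).
rows = [("CA", 90001), ("CA", 90002), ("NV", 89101), ("CA", 90001)]


def group_by(pairs):
    """Group values by key and compute a handful of per-group aggregates."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {
        key: {
            "count": len(values),
            "countDistinct": len(set(values)),
            "first": values[0],
            "last": values[-1],
            "min": min(values),
            "max": max(values),
            "list": values,
        }
        for key, values in groups.items()
    }


result = group_by(rows)
print(result["CA"])
print(result["NV"])
```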

Index

Field Type Description
name String! column name
alias String! output column name
value JSON!
start Long!
end Long

IntFilter

predicates for scalars

Field Type Description
eq [Int] == or `isin`; `null` is equivalent to arrow `is_null`.
ne Int !=; `null` is equivalent to arrow `is_valid`.
lt Int <
le Int <=
gt Int >
ge Int >=

ListFunction

functions for lists

Field Type Description
filter Expression! filter within list scalars
sort Sort sort within list scalars
rank Ranked select by dense rank within list scalars

Lists

List array functions.

Field Type Description
element [Expression!]! Compute elements of nested list values using an index.
valueLength Expression Compute list lengths.
all Expression Test whether all elements in a boolean array evaluate to true.
any Expression Test whether any element in a boolean array evaluates to true.
slice Expression Compute slice of list-like array.
start Int!
stop Int
step Int!
returnFixedSizeList Boolean
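
The `start`, `stop`, and `step` arguments of `slice` read like ordinary slice bounds applied to each list scalar. A rough pure-Python sketch under that assumption, restricted to non-negative `start` and positive `step`. The helper name `list_slice`, the default `step` of 1, and the handling of null lists are assumptions; `returnFixedSizeList` is not modeled.

```python
from typing import List, Optional


def list_slice(
    lists: List[Optional[list]],
    start: int,
    stop: Optional[int] = None,
    step: int = 1,
) -> List[Optional[list]]:
    """Slice every list scalar independently; null lists stay null (assumed)."""
    return [None if xs is None else xs[start:stop:step] for xs in lists]


column = [[1, 2, 3, 4], [5], None]
print(list_slice(column, start=1, stop=3))
print(list_slice(column, start=0, step=2))
```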

Log

Logarithmic functions.

Field Type Description
ln Expression Compute natural logarithm.
log1p Expression Compute natural log of (1+x).
logb [Expression!]! Compute base `b` logarithm.

MatchSubstring

Match substring functions.

Field Type Description
count Expression Count occurrences of substring.
endsWith Expression Check if strings end with a literal pattern.
find Expression Find first occurrence of substring.
match Expression Match strings against literal pattern.
startsWith Expression Check if strings start with a literal pattern.
replace Expression Replace matching non-overlapping substrings with replacement.
split Expression Split string according to separator.
extract Expression Extract substrings captured by a regex pattern.
pattern String!
ignoreCase Boolean!
regex Boolean!
replacement String!
maxReplacements Int
maxSplits Int
reverse Boolean!

Mode

Field Type Description
name String! column name
alias String! output column name
n Int!
skipNulls Boolean!
minCount Int!

Pairwise

Field Type Description
name String! column name
alias String! output column name
period Int!
checked Boolean!
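
A first-order difference with a `period` can be sketched in plain Python. This assumes the behavior follows Arrow's `pairwise_diff`, where each output is the current value minus the value `period` positions earlier and positions without a partner are null. `pairwise_diff` below is an illustrative stand-in, and `checked` is not modeled because Python integers do not overflow.

```python
from typing import List, Optional


def pairwise_diff(values: List[Optional[int]], period: int = 1) -> List[Optional[int]]:
    """Difference between each value and the one `period` positions before it."""
    out: List[Optional[int]] = []
    for i, current in enumerate(values):
        j = i - period
        if j < 0 or j >= len(values) or current is None or values[j] is None:
            # No partner at that distance, or a null operand: emit null.
            out.append(None)
        else:
            out.append(current - values[j])
    return out


print(pairwise_diff([1, 4, 9, 16]))
print(pairwise_diff([1, 4, 9, 16], period=2))
```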

Projection

an Expression with an optional alias

Field Type Description
name [String!]! field name(s)
cast String! cast as [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
safe Boolean! check for conversion errors on cast
value JSON JSON scalar; also see typed scalars
kleene Boolean! use kleene logic for booleans
checked Boolean! check for overflow errors
base64 [Base64!]!
date [Date!]!
datetime [DateTime!]!
decimal [Decimal!]!
duration [Duration!]!
time [Time!]!
eq [Expression!]! ==
ne [Expression!]! !=
lt [Expression!]! <
le [Expression!]! <=
gt [Expression!]! >
ge [Expression!]! >=
inv Expression ~
abs Expression Calculate the absolute value of the argument element-wise.
add [Expression!]! Add the arguments element-wise.
divide [Expression!]! Divide the arguments element-wise.
multiply [Expression!]! Multiply the arguments element-wise.
negate Expression Negate the argument element-wise.
power [Expression!]! Raise arguments to power element-wise.
sign Expression Get the signedness of the arguments element-wise.
subtract [Expression!]! Subtract the arguments element-wise.
bitWise BitWise bit-wise functions
rounding Rounding rounding functions
log Log logarithmic functions
trig Trig trigonometry functions
elementWise ElementWise element-wise aggregate functions
and [Expression!]! &
andNot [Expression!]! Logical 'and not' boolean values.
or [Expression!]! |
xor [Expression!]! Logical 'xor' boolean values.
utf8 Utf8 utf8 string functions
stringIsAscii Expression Classify strings as ASCII.
substring MatchSubstring match substring functions
binary Binary binary functions
setLookup SetLookup set lookup functions
isFinite Expression Return true if value is finite.
isInf Expression Return true if infinity.
isNan Expression Return true if NaN.
trueUnlessNull Expression Return true if non-null, else return null.
caseWhen [Expression!]! Choose values based on multiple conditions.
choose [Expression!]! Choose values from several arrays.
coalesce [Expression!]! Select the first non-null value.
ifElse [Expression!]! Choose values based on a condition.
temporal Temporal temporal functions
replaceWithMask [Expression!]! Replace items selected with a mask.
list Lists list array functions
alias String! name of projected column

Quantile

Field Type Description
name String! column name
alias String! output column name
q [Float!]!
interpolation String!
skipNulls Boolean!
minCount Int!

Rank

Field Type Description
name String! column name
alias String! output column name
sortKeys String!
nullPlacement String!
tiebreaker String!

Ranked

Field Type Description
by [String!]!
max Int!
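
Dense ranking gives tied values the same rank and leaves no gaps between ranks, so `max` = 1 keeps every row tied for the minimum. A small pure-Python sketch of selecting by dense rank on a single key; `select_by_dense_rank` and its `descending` flag are hypothetical, and keeping the original row order is an assumption.

```python
from typing import List


def select_by_dense_rank(values: List[int], max_rank: int = 1, descending: bool = False) -> List[int]:
    """Keep values whose dense rank is <= max_rank; ties share a rank."""
    # The distinct sorted values are exactly the dense ranks 1, 2, 3, ...
    distinct = sorted(set(values), reverse=descending)
    allowed = set(distinct[:max_rank])
    return [v for v in values if v in allowed]


data = [3, 1, 2, 1, 3]
print(select_by_dense_rank(data))                   # all rows tied for the minimum
print(select_by_dense_rank(data, max_rank=2))       # the two smallest distinct values
print(select_by_dense_rank(data, descending=True))  # all rows tied for the maximum
```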

Rounding

Rounding functions.

Field Type Description
ceil Expression Round up to the nearest integer.
floor Expression Round down to the nearest integer.
trunc Expression Compute the integral part.
round Expression Round to a given precision.
ndigits Int!
roundMode String!
multiple Float!

ScalarAggregate

options for scalar aggregation

Field Type Description
name String! column name
alias String! output column name
skipNulls Boolean!

SetLookup

Set lookup functions.

Field Type Description
indexIn [Expression!]! Return index of each element in a set of values.
digitize [Expression!]! numpy [digitize](https://numpy.org/doc/stable/reference/generated/numpy.digitize.html)
skipNulls Boolean!
right Boolean!
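
Since `digitize` points at numpy's function, its `right` flag can be illustrated with the standard library. For increasing bins, numpy's documented rule is `bins[i-1] <= x < bins[i]` by default and `bins[i-1] < x <= bins[i]` when `right` is true, which corresponds to `bisect_right` and `bisect_left`. The function below is a sketch of that rule only, not this API's implementation.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def digitize(values: Sequence[float], bins: Sequence[float], right: bool = False) -> List[int]:
    """Bin index for each value, assuming `bins` is sorted in increasing order."""
    find = bisect_left if right else bisect_right
    return [find(bins, v) for v in values]


bins = [0.0, 1.0, 2.5, 4.0, 10.0]
print(digitize([0.2, 6.4, 3.0, 1.6], bins))
print(digitize([1.0], bins), digitize([1.0], bins, right=True))
```

The second line shows the only place the flag matters: a value that sits exactly on a bin edge.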

Sort

Field Type Description
by [String!]!
length Long

StrFilter

predicates for scalars

Field Type Description
eq [String] == or `isin`; `null` is equivalent to arrow `is_null`.
ne String !=; `null` is equivalent to arrow `is_valid`.
lt String <
le String <=
gt String >
ge String >=

TDigestAggregate

options for tdigest aggregation

Field Type Description
name String! column name
alias String! output column name
skipNulls Boolean!
q [Float!]!
delta Int!
bufferSize Int!

Temporal

Temporal functions.

Field Type Description
day Expression Extract day number.
dayOfYear Expression Extract day of year number.
hour Expression Extract hour value.
isoWeek Expression Extract ISO week of year number.
isoYear Expression Extract ISO year number.
isoCalendar Expression Extract (ISO year, ISO week, ISO day of week) struct.
isLeapYear Expression Extract if year is a leap year.
microsecond Expression Extract microsecond values.
millisecond Expression Extract millisecond values.
minute Expression Extract minute values.
month Expression Extract month number.
nanosecond Expression Extract nanosecond values.
quarter Expression Extract quarter of year number.
second Expression Extract second values.
subsecond Expression Extract subsecond values.
usWeek Expression Extract US week of year number.
usYear Expression Extract US epidemiological year number.
year Expression Extract year number.
yearMonthDay Expression Extract (year, month, day) struct.
dayTimeIntervalBetween [Expression!]! Compute the number of days and milliseconds between two timestamps.
daysBetween [Expression!]! Compute the number of days between two timestamps.
hoursBetween [Expression!]! Compute the number of hours between two timestamps.
microsecondsBetween [Expression!]! Compute the number of microseconds between two timestamps.
millisecondsBetween [Expression!]! Compute the number of millisecond boundaries between two timestamps.
minutesBetween [Expression!]! Compute the number of minute boundaries between two timestamps.
monthDayNanoIntervalBetween [Expression!]! Compute the number of months, days and nanoseconds between two timestamps.
monthIntervalBetween [Expression!]! Compute the number of months between two timestamps.
nanosecondsBetween [Expression!]! Compute the number of nanoseconds between two timestamps.
quartersBetween [Expression!]! Compute the number of quarters between two timestamps.
secondsBetween [Expression!]! Compute the number of seconds between two timestamps.
weeksBetween [Expression!]! Compute the number of weeks between two timestamps.
yearsBetween [Expression!]! Compute the number of years between two timestamps.
ceil Expression Round temporal values up to nearest multiple of specified time unit.
floor Expression Round temporal values down to nearest multiple of specified time unit.
round Expression Round temporal values to the nearest multiple of specified time unit.
multiple Int!
unit String!
weekStartsMonday Boolean!
ceilIsStrictlyGreater Boolean!
calendarBasedOrigin Boolean!
week Expression Extract week of year number.
countFromZero Boolean
firstWeekIsFullyInYear Boolean!
dayOfWeek Expression Extract day of the week number.
weekStart Int!
strftime Expression Format temporal values according to a format string.
strptime Expression Parse timestamps.
format String!
locale String!
errorIsNull Boolean!
assumeTimezone Expression Convert naive timestamp to timezone-aware timestamp.
timezone String!
ambiguous String!
nonexistent String!
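
The ISO fields above (`isoYear`, `isoWeek`, `isoCalendar`) can differ from the calendar year near January 1. Python's `date.isocalendar()` follows the same ISO 8601 week rules, so it serves as a quick illustration; `iso_calendar` is a hypothetical wrapper, and Monday = 1 for the day of week is assumed to match the field described here.

```python
from datetime import date
from typing import Tuple


def iso_calendar(day: date) -> Tuple[int, int, int]:
    """Return (ISO year, ISO week, ISO day of week) for a date."""
    iso_year, iso_week, iso_weekday = day.isocalendar()
    return (iso_year, iso_week, iso_weekday)


# Sunday 2021-01-03 still belongs to the last ISO week of 2020.
print(iso_calendar(date(2021, 1, 3)))  # (2020, 53, 7)
# Monday 2021-01-04 starts ISO week 1 of 2021.
print(iso_calendar(date(2021, 1, 4)))  # (2021, 1, 1)
```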

Trig

Trigonometry functions.

Field Type Description
checked Boolean! check for overflow errors
acos Expression Compute the inverse cosine.
asin Expression Compute the inverse sine.
atan Expression Compute the inverse tangent of x.
atan2 [Expression!]! Compute the inverse tangent of y/x.
cos Expression Compute the cosine.
sin Expression Compute the sine.
tan Expression Compute the tangent.

Utf8

Utf8 string functions.

Field Type Description
isAlnum Expression Classify strings as alphanumeric.
isAlpha Expression Classify strings as alphabetic.
isDecimal Expression Classify strings as decimal.
isDigit Expression Classify strings as digits.
isLower Expression Classify strings as lowercase.
isNumeric Expression Classify strings as numeric.
isPrintable Expression Classify strings as printable.
isSpace Expression Classify strings as whitespace.
isTitle Expression Classify strings as titlecase.
isUpper Expression Classify strings as uppercase.
capitalize Expression Capitalize the first character of input.
length Expression Compute UTF8 string lengths.
lower Expression Transform input to lowercase.
reverse Expression Reverse input.
swapcase Expression Transform input lowercase characters to uppercase and uppercase characters to lowercase.
title Expression Titlecase each word of input.
upper Expression Transform input to uppercase.
ltrim Expression Trim leading characters.
rtrim Expression Trim trailing characters.
trim Expression Trim leading and trailing characters.
characters String! trim options; by default trims whitespace
replaceSlice Expression Replace a slice of a string.
sliceCodeunits Expression Slice string.
start Int!
stop Int
step Int!
replacement String!
center Expression Center strings by padding with a given character.
lpad Expression Right-align strings by padding with a given character.
rpad Expression Left-align strings by padding with a given character.
width Int!
padding String!
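
The naming of `lpad` and `rpad` is easy to misread: padding on the left pushes the text to the right, so `lpad` right-aligns. A sketch using Python's string methods as stand-ins, assuming `padding` is a single character and that strings already wider than `width` are returned unchanged:

```python
def lpad(text: str, width: int, padding: str = " ") -> str:
    """Pad on the left, which right-aligns the text."""
    return text.rjust(width, padding)


def rpad(text: str, width: int, padding: str = " ") -> str:
    """Pad on the right, which left-aligns the text."""
    return text.ljust(width, padding)


print(lpad("ab", 5, "*"))  # ***ab
print(rpad("ab", 5, "*"))  # ab***
```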

VarianceAggregate

options for variance aggregation

Field Type Description
name String! column name
alias String! output column name
skipNulls Boolean!
ddof Int!
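
`ddof` is the delta degrees of freedom: the sum of squared deviations is divided by `n - ddof`, so 0 gives the population variance and 1 the sample variance. A minimal sketch of that formula; the function names and the default of 0 are assumptions, and null handling (`skipNulls`) is left out.

```python
from math import sqrt
from typing import Sequence


def variance(values: Sequence[float], ddof: int = 0) -> float:
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - ddof)


def stddev(values: Sequence[float], ddof: int = 0) -> float:
    """Standard deviation is the square root of the variance."""
    return sqrt(variance(values, ddof))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), stddev(data))  # 4.0 2.0
print(variance(data, ddof=1))        # 32 / 7
```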

Scalars

Base64

Represents binary data as Base64-encoded strings, using the standard alphabet.

Boolean

The Boolean scalar type represents true or false.

Date

Date (isoformat)

DateTime

Date with time (isoformat)

Decimal

Decimal (fixed-point)

Duration

Duration (isoformat)

Float

The Float scalar type represents signed double-precision fractional values as specified by IEEE 754.

Int

The Int scalar type represents non-fractional signed whole numeric values. Int can represent values between -(2^31) and 2^31 - 1.

JSON

The JSON scalar type represents JSON values as specified by ECMA-404.

Long

64-bit int
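
The practical difference between `Int` and `Long` is range. A short sketch of the bounds, assuming `Long` is a signed 64-bit integer; the helper names are illustrative.

```python
INT_MIN, INT_MAX = -(2**31), 2**31 - 1
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1  # assumption: signed 64-bit


def fits_int(value: int) -> bool:
    """True if the value is representable as a GraphQL Int."""
    return INT_MIN <= value <= INT_MAX


def fits_long(value: int) -> bool:
    """True if the value is representable as a signed 64-bit Long."""
    return LONG_MIN <= value <= LONG_MAX


row_count = 3_000_000_000  # too large for Int, fine for Long
print(fits_int(row_count), fits_long(row_count))
```

This is why row counts and offsets in the schema are typed as `Long`.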

String

The String scalar type represents textual data, represented as UTF-8 character sequences. The String type is most often used by GraphQL to represent free-form human-readable text.

Time

Time (isoformat)

Interfaces

Column

an arrow array

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
length Long! array length
size Long! buffer size in bytes
count Long! Count the number of null / non-null values.
mode String!

Possible Types: Base64Column, BooleanColumn, DateColumn, DatetimeColumn, DecimalColumn, DurationColumn, FloatColumn, IntColumn, ListColumn, LongColumn, StringColumn, StructColumn, TimeColumn

Dataset

an arrow dataset, scanner, or table

Field Argument Type Description
type String! [arrow type](https://arrow.apache.org/docs/python/api/dataset.html#classes)
schema Schema! dataset schema
optional Dataset Nullable field to stop error propagation, enabling partial query results. Will be replaced by client controlled nullability.
length Long! number of rows
any Boolean! Return whether there are at least `length` rows. May be significantly faster than `length` for out-of-core data.
length Long!
size Long buffer size in bytes; null if table is not loaded
column Column Return column of any type by name. This is typically only needed for aliased or casted columns. If the column is in the schema, `columns` can be used instead.
name [String!]! column name(s); multiple names access nested struct fields
cast String! cast array to [arrow type](https://arrow.apache.org/docs/python/api/datatypes.html)
safe Boolean! check for conversion errors on cast
slice Dataset! Return zero-copy slice of table. Can also be used to force loading a dataset.
offset Long! number of rows to skip; negative value skips from the end
length Long maximum number of rows to return
reverse Boolean! reverse order after slicing; forces a copy
group Dataset! Return table grouped by columns. See `column` for accessing any column which has changed type. See `tables` to split on any aggregated list columns.
by [String!]! column names; empty will aggregate into a single row table
counts String! optionally include counts in an aliased column
ordered Boolean! optionally disable parallelization to maintain ordering
aggregate HashAggregates! aggregation functions applied to other columns
runs Dataset! Return table grouped by pairwise differences. Differs from `group` by relying on adjacency, and is typically faster. Other columns are transformed into list columns. See `column` and `tables` to further access lists.
by [String!]! column names
split [Diff!]! optional predicates to split on; scalars are compared to pairwise difference
counts String! optionally include counts in an aliased column
sort Dataset! Return table slice sorted by specified columns. Optimized for length == 1; matches min or max values.
by [String!]! column names; prefix with `-` for descending order
length Long maximum number of rows to return; may be significantly faster but is unstable
nullPlacement String! where nulls in input should be sorted; incompatible with `length`
rank Dataset! Return table selected by maximum dense rank.
by [String!]! column names; prefix with `-` for descending order
max Int! maximum dense rank to select; optimized for == 1 (min or max)
apply Dataset! Return view of table with vector functions applied across columns. Applied functions load arrays into memory as needed. See `scan` for scalar functions, which do not require loading.
cumulativeMax [Cumulative!]! Compute the cumulative max over a numeric input.
cumulativeMean [Cumulative!]! Compute the cumulative mean over a numeric input.
cumulativeMin [Cumulative!]! Compute the cumulative min over a numeric input.
cumulativeProd [Cumulative!]! Compute the cumulative product over a numeric input.
cumulativeSum [Cumulative!]! Compute the cumulative sum over a numeric input.
fillNullBackward [Field!]! Carry non-null values backward to fill null slots.
fillNullForward [Field!]! Carry non-null values forward to fill null slots.
pairwiseDiff [Pairwise!]! Compute first order difference of an array.
rank [Rank!]! Compute numerical ranks of an array (1-based).
list ListFunction! functions for list arrays.
flatten Dataset! Return table with list arrays flattened. At least one list column must be referenced, and all list columns must have the same lengths.
indices String!
tables [Dataset]! Return a list of tables by splitting list columns. At least one list column must be referenced, and all list columns must have the same lengths.
aggregate ⚠️ Dataset! Return table with scalar aggregate functions applied to list columns.

⚠️ DEPRECATED

List scalar functions will be moved to `scan(...: {list: ...})`
approximateMedian [ScalarAggregate!]! Approximate median of a numeric array with T-Digest algorithm.
count [CountAggregate!]! Count the number of null / non-null values.
countDistinct [CountAggregate!]! Count the number of unique values.
distinct [CountAggregate!]! distinct values within each scalar
first [Field!]! first value of each list scalar
last [Field!]! last value of each list scalar
max [ScalarAggregate!]! Compute the maximum values of a numeric array.
mean [ScalarAggregate!]! Compute the mean of a numeric array.
min [ScalarAggregate!]! Compute the minimum values of a numeric array.
product [ScalarAggregate!]! Compute the product of values in a numeric array.
stddev [VarianceAggregate!]! Calculate the standard deviation of a numeric array.
sum [ScalarAggregate!]! Compute the sum of a numeric array.
tdigest [TDigestAggregate!]! Approximate quantiles of a numeric array with T-Digest algorithm.
variance [VarianceAggregate!]! Calculate the variance of a numeric array.
scan Dataset! Select rows and project columns without memory usage.
filter Expression! selected rows
columns [Projection!]! projected columns
join Dataset! Provisional: [join](https://arrow.apache.org/docs/python/generated/pyarrow.dataset.Dataset.html#pyarrow.dataset.Dataset.join) this table with another table on the root Query type.
right String! name of right table; must be on root Query type
keys [String!]! column names used as keys on the left side
rightKeys [String!] column names used as keys on the right side; defaults to left side.
joinType String! the kind of join: 'left semi', 'right semi', 'left anti', 'right anti', 'inner', 'left outer', 'right outer', 'full outer'
leftSuffix String! add suffix to left column names; for preventing collisions
rightSuffix String! add suffix to right column names; for preventing collisions.
coalesceKeys Boolean! omit duplicate keys
take Dataset! Select rows from indices.
indices [Long!]!
dropNull Dataset! Remove missing values from referenced columns in the table.

Possible Types: Table
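
The `group` and `runs` fields of Dataset differ in that `runs` relies on adjacency. A pure-Python sketch of the distinction on (key, value) rows, using `itertools.groupby` for the adjacency case. Both functions are illustrative stand-ins: the real fields return tables, and the first-seen key order shown for `group` is an assumption.

```python
from itertools import groupby
from typing import Dict, List, Tuple

Row = Tuple[str, int]


def runs(rows: List[Row]) -> List[Tuple[str, List[int]]]:
    """Group only adjacent rows with equal keys; a key may appear more than once."""
    return [(key, [value for _, value in run]) for key, run in groupby(rows, key=lambda r: r[0])]


def group(rows: List[Row]) -> Dict[str, List[int]]:
    """Group all rows with equal keys, regardless of position."""
    out: Dict[str, List[int]] = {}
    for key, value in rows:
        out.setdefault(key, []).append(value)
    return out


rows = [("a", 1), ("a", 2), ("b", 3), ("a", 4)]
print(runs(rows))   # [('a', [1, 2]), ('b', [3]), ('a', [4])]
print(group(rows))  # {'a': [1, 2, 4], 'b': [3]}
```

On data already sorted by the key the two give the same groups, and the adjacency version needs only one pass without hashing.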