Scalar Functions
Math Functions
- abs
- acos
- acosh
- asin
- asinh
- atan
- atanh
- atan2
- cbrt
- ceil
- cos
- cosh
- degrees
- exp
- factorial
- floor
- gcd
- isnan
- iszero
- lcm
- ln
- log
- log10
- log2
- nanvl
- pi
- power
- pow
- radians
- random
- round
- signum
- sin
- sinh
- sqrt
- tan
- tanh
- trunc
abs
Returns the absolute value of a number.
abs(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
acos
Returns the arc cosine or inverse cosine of a number.
acos(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
acosh
Returns the area hyperbolic cosine or inverse hyperbolic cosine of a number.
acosh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
asin
Returns the arc sine or inverse sine of a number.
asin(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
asinh
Returns the area hyperbolic sine or inverse hyperbolic sine of a number.
asinh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atan
Returns the arc tangent or inverse tangent of a number.
atan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atanh
Returns the area hyperbolic tangent or inverse hyperbolic tangent of a number.
atanh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atan2
Returns the arc tangent or inverse tangent of expression_y / expression_x
.
atan2(expression_y, expression_x)
Arguments
- expression_y: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_x: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
cbrt
Returns the cube root of a number.
cbrt(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
ceil
Returns the nearest integer greater than or equal to a number.
ceil(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
cos
Returns the cosine of a number.
cos(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
cosh
Returns the hyperbolic cosine of a number.
cosh(numeric_expression)
degrees
Converts radians to degrees.
degrees(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
exp
Returns the base-e exponential of a number.
exp(numeric_expression)
Arguments
- numeric_expression: Numeric expression to use as the exponent. Can be a constant, column, or function, and any combination of arithmetic operators.
factorial
Factorial. Returns 1 if value is less than 2.
factorial(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
floor
Returns the nearest integer less than or equal to a number.
floor(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
gcd
Returns the greatest common divisor of expression_x
and expression_y
. Returns 0 if both inputs are zero.
gcd(expression_x, expression_y)
Arguments
- expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
isnan
Returns true if a given number is +NaN or -NaN otherwise returns false.
isnan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
iszero
Returns true if a given number is +0.0 or -0.0 otherwise returns false.
iszero(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
lcm
Returns the least common multiple of expression_x
and expression_y
. Returns 0 if either input is zero.
lcm(expression_x, expression_y)
Arguments
- expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
ln
Returns the natural logarithm of a number.
ln(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
log
Returns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.
log(base, numeric_expression)
log(numeric_expression)
Arguments
- base: Base numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
log10
Returns the base-10 logarithm of a number.
log10(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
log2
Returns the base-2 logarithm of a number.
log2(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
nanvl
Returns the first argument if it's not NaN. Returns the second argument otherwise.
nanvl(expression_x, expression_y)
Arguments
- expression_x: Numeric expression to return if it's not NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_y: Numeric expression to return if the first expression is NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
pi
Returns an approximate value of π.
pi()
power
Returns a base expression raised to the power of an exponent.
power(base, exponent)
Arguments
- base: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- exponent: Exponent numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
Aliases
- pow
pow
Alias of power.
radians
Converts degrees to radians.
radians(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
random
Returns a random float value in the range [0, 1). The random seed is unique to each row.
random()
round
Rounds a number to the nearest integer.
round(numeric_expression[, decimal_places])
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- decimal_places: Optional. The number of decimal places to round to. Defaults to 0.
signum
Returns the sign of a number.
Negative numbers return -1
.
Zero and positive numbers return 1
.
signum(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
sin
Returns the sine of a number.
sin(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
sinh
Returns the hyperbolic sine of a number.
sinh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
sqrt
Returns the square root of a number.
sqrt(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
tan
Returns the tangent of a number.
tan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
tanh
Returns the hyperbolic tangent of a number.
tanh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
trunc
Truncates a number to a whole number or truncated to the specified decimal places.
trunc(numeric_expression[, decimal_places])
Arguments
-
numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
-
decimal_places: Optional. The number of decimal places to truncate to. Defaults to 0 (truncate to a whole number). If
decimal_places
is a positive integer, truncates digits to the right of the decimal point. Ifdecimal_places
is a negative integer, replaces digits to the left of the decimal point with0
.
Conditional Functions
coalesce
Returns the first of its arguments that is not null. Returns null if all arguments are null. This function is often used to substitute a default value for null values.
coalesce(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expression to use if previous expressions are null. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
nullif
Returns null if expression1 equals expression2; otherwise it returns expression1.
This can be used to perform the inverse operation of coalesce
.
nullif(expression1, expression2)
Arguments
- expression1: Expression to compare and return if equal to expression2. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression2: Expression to compare to expression1. Can be a constant, column, or function, and any combination of arithmetic operators.
String Functions
- ascii
- bit_length
- btrim
- char_length
- character_length
- concat
- concat_ws
- chr
- ends_with
- initcap
- instr
- left
- length
- lower
- lpad
- ltrim
- octet_length
- repeat
- replace
- reverse
- right
- rpad
- rtrim
- split_part
- starts_with
- strpos
- substr
- to_hex
- translate
- trim
- upper
- uuid
- overlay
- levenshtein
- substr_index
- find_in_set
ascii
Returns the ASCII value of the first character in a string.
ascii(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: chr
bit_length
Returns the bit length of a string.
bit_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: length, octet_length
btrim
Trims the specified trim string from the start and end of a string. If no trim string is provided, all whitespace is removed from the start and end of the input string.
btrim(str[, trim_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- trim_str: String expression to trim from the beginning and end of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. Default is whitespace characters.
Related functions: ltrim, rtrim, trim
char_length
Alias of length.
character_length
Alias of length.
concat
Concatenates multiple strings together.
concat(str[, ..., str_n])
Arguments
- str: String expression to concatenate. Can be a constant, column, or function, and any combination of string operators.
- str_n: Subsequent string column or literal string to concatenate.
Related functions: contcat_ws
concat_ws
Concatenates multiple strings together with a specified separator.
concat(separator, str[, ..., str_n])
Arguments
- separator: Separator to insert between concatenated strings.
- str: String expression to concatenate. Can be a constant, column, or function, and any combination of string operators.
- str_n: Subsequent string column or literal string to concatenate.
Related functions: concat
chr
Returns the character with the specified ASCII or Unicode code value.
chr(expression)
Arguments
- expression: Expression containing the ASCII or Unicode code value to operate on. Can be a constant, column, or function, and any combination of arithmetic or string operators.
Related functions: ascii
ends_with
Tests if a string ends with a substring.
ends_with(str, substr)
Arguments
- str: String expression to test. Can be a constant, column, or function, and any combination of string operators.
- substr: Substring to test for.
initcap
Capitalizes the first character in each word in the input string. Words are delimited by non-alphanumeric characters.
initcap(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: lower, upper
instr
Returns the location where substr first appeared in str (counting from 1). If substr does not appear in str, return 0.
instr(str, substr)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- substr: Substring expression to search for. Can be a constant, column, or function, and any combination of string operators.
left
Returns a specified number of characters from the left side of a string.
left(str, n)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- n: Number of characters to return.
Related functions: right
length
Returns the number of characters in a string.
length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Aliases
- char_length
- character_length
Related functions: bit_length, octet_length
lower
Converts a string to lower-case.
lower(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: initcap, upper
lpad
Pads the left side of a string with another string to a specified string length.
lpad(str, n[, padding_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- n: String length to pad to.
- padding_str: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.
Related functions: rpad
ltrim
Removes leading spaces from a string.
ltrim(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: btrim, rtrim, trim
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
octet_length
Returns the length of a string in bytes.
octet_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: bit_length, length
repeat
Returns a string with an input string repeated a specified number.
repeat(str, n)
Arguments
- str: String expression to repeat. Can be a constant, column, or function, and any combination of string operators.
- n: Number of times to repeat the input string.
replace
Replaces all occurrences of a specified substring in a string with a new substring.
replace(str, substr, replacement)
Arguments
- str: String expression to repeat. Can be a constant, column, or function, and any combination of string operators.
- substr: Substring expression to replace in the input string. Can be a constant, column, or function, and any combination of string operators.
- replacement: Replacement substring expression. Can be a constant, column, or function, and any combination of string operators.
reverse
Reverses the character order of a string.
reverse(str)
Arguments
- str: String expression to repeat. Can be a constant, column, or function, and any combination of string operators.
right
Returns a specified number of characters from the right side of a string.
right(str, n)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- n: Number of characters to return.
Related functions: left
rpad
Pads the right side of a string with another string to a specified string length.
rpad(str, n[, padding_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- n: String length to pad to.
- padding_str: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.
Related functions: lpad
rtrim
Removes trailing spaces from a string.
rtrim(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: btrim, ltrim, trim
split_part
Splits a string based on a specified delimiter and returns the substring in the specified position.
split_part(str, delimiter, pos)
Arguments
- str: String expression to spit. Can be a constant, column, or function, and any combination of string operators.
- delimiter: String or character to split on.
- pos: Position of the part to return.
starts_with
Tests if a string starts with a substring.
starts_with(str, substr)
Arguments
- str: String expression to test. Can be a constant, column, or function, and any combination of string operators.
- substr: Substring to test for.
strpos
Returns the starting position of a specified substring in a string. Positions begin at 1. If the substring does not exist in the string, the function returns 0.
strpos(str, substr)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- substr: Substring expression to search for. Can be a constant, column, or function, and any combination of string operators.
substr
Extracts a substring of a specified number of characters from a specific starting position in a string.
substr(str, start_pos[, length])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- start_pos: Character position to start the substring at. The first character in the string has a position of 1.
- length: Number of characters to extract. If not specified, returns the rest of the string after the start position.
translate
Translates characters in a string to specified translation characters.
translate(str, chars, translation)
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- chars: Characters to translate.
- translation: Translation characters. Translation characters replace only characters at the same position in the chars string.
to_hex
Converts an integer to a hexadecimal string.
to_hex(int)
Arguments
- int: Integer expression to convert. Can be a constant, column, or function, and any combination of arithmetic operators.
trim
Removes leading and trailing spaces from a string.
trim(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: btrim, ltrim, rtrim
upper
Converts a string to upper-case.
upper(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Related functions: initcap, lower
uuid
Returns UUID v4 string value which is unique per row.
uuid()
overlay
Returns the string which is replaced by another string from the specified position and specified count length.
For example, overlay('Txxxxas' placing 'hom' from 2 for 4) → Thomas
overlay(str PLACING substr FROM pos [FOR count])
Arguments
- str: String expression to operate on.
- substr: the string to replace part of str.
- pos: the start position to replace of str.
- count: the count of characters to be replaced from start position of str. If not specified, will use substr length instead.
levenshtein
Returns the Levenshtein distance between the two given strings.
For example, levenshtein('kitten', 'sitting') = 3
levenshtein(str1, str2)
Arguments
- str1: String expression to compute Levenshtein distance with str2.
- str2: String expression to compute Levenshtein distance with str1.
substr_index
Returns the substring from str before count occurrences of the delimiter delim.
If count is positive, everything to the left of the final delimiter (counting from the left) is returned.
If count is negative, everything to the right of the final delimiter (counting from the right) is returned.
For example, substr_index('www.apache.org', '.', 1) = www
, substr_index('www.apache.org', '.', -1) = org
substr_index(str, delim, count)
Arguments
- str: String expression to operate on.
- delim: the string to find in str to split str.
- count: The number of times to search for the delimiter. Can be both a positive or negative number.
find_in_set
Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings.
For example, find_in_set('b', 'a,b,c,d') = 2
find_in_set(str, strlist)
Arguments
- str: String expression to find in strlist.
- strlist: A string list is a string composed of substrings separated by , characters.
Binary String Functions
encode
Encode binary data into a textual representation.
encode(expression, format)
Arguments
-
expression: Expression containing string or binary data
-
format: Supported formats are:
base64
,hex
Related functions: decode
decode
Decode binary data from textual representation in string.
decode(expression, format)
Arguments
-
expression: Expression containing encoded string data
-
format: Same arguments as encode
Related functions: encode
Regular Expression Functions
Apache DataFusion uses the POSIX regular expression syntax and supports the following regular expression functions:
regexp_match
Returns a list of regular expression matches in a string.
regexp_match(str, regexp)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- regexp: Regular expression to match against. Can be a constant, column, or function.
regexp_replace
Replaces substrings in a string that match a regular expression.
regexp_replace(str, regexp, replacement, flags)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
- regexp: Regular expression to match against. Can be a constant, column, or function.
- replacement: Replacement string expression. Can be a constant, column, or function, and any combination of string operators.
- flags: Regular expression flags that control the behavior of the
regular expression. The following flags are supported.
- g: (global) Search globally and don't return after the first match.
- i: (insensitive) Ignore case when matching.
Time and Date Functions
- now
- current_date
- current_time
- date_bin
- date_trunc
- datetrunc
- date_part
- datepart
- extract
- today
- to_timestamp
- to_timestamp_millis
- to_timestamp_micros
- to_timestamp_seconds
- to_timestamp_nanos
- from_unixtime
now
Returns the current UTC timestamp.
The now()
return value is determined at query time and will return the same timestamp,
no matter when in the query plan the function executes.
now()
current_date
Returns the current UTC date.
The current_date()
return value is determined at query time and will return the same date,
no matter when in the query plan the function executes.
current_date()
Aliases
- today
today
Alias of current_date.
current_time
Returns the current UTC time.
The current_time()
return value is determined at query time and will return the same time,
no matter when in the query plan the function executes.
current_time()
date_bin
Calculates time intervals and returns the start of the interval nearest to the specified timestamp.
Use date_bin
to downsample time series data by grouping rows into time-based "bins" or "windows"
and applying an aggregate or selector function to each window.
For example, if you "bin" or "window" data into 15 minute intervals, an input
timestamp of 2023-01-01T18:18:18Z
will be updated to the start time of the 15
minute bin it is in: 2023-01-01T18:15:00Z
.
date_bin(interval, expression, origin-timestamp)
Arguments
- interval: Bin interval.
- expression: Time expression to operate on. Can be a constant, column, or function.
- origin-timestamp: Optional. Starting point used to determine bin boundaries. If not specified
defaults
1970-01-01T00:00:00Z
(the UNIX epoch in UTC).
The following intervals are supported:
- nanoseconds
- microseconds
- milliseconds
- seconds
- minutes
- hours
- days
- weeks
- months
- years
- century
date_trunc
Truncates a timestamp value to a specified precision.
date_trunc(precision, expression)
Arguments
-
precision: Time precision to truncate to. The following precisions are supported:
- year / YEAR
- quarter / QUARTER
- month / MONTH
- week / WEEK
- day / DAY
- hour / HOUR
- minute / MINUTE
- second / SECOND
-
expression: Time expression to operate on. Can be a constant, column, or function.
Aliases
- datetrunc
datetrunc
Alias of date_trunc.
date_part
Returns the specified part of the date as an integer.
date_part(part, expression)
Arguments
-
part: Part of the date to return. The following date parts are supported:
- year
- quarter (emits value in inclusive range [1, 4] based on which quartile of the year the date is in)
- month
- week (week of the year)
- day (day of the month)
- hour
- minute
- second
- millisecond
- microsecond
- nanosecond
- dow (day of the week)
- doy (day of the year)
- epoch (seconds since Unix epoch)
-
expression: Time expression to operate on. Can be a constant, column, or function.
Aliases
- datepart
datepart
Alias of date_part.
extract
Returns a sub-field from a time value as an integer.
Similar to date_part
, but with different arguments.
extract(field FROM source)
Arguments
-
field: Part or field of the date to return. The following date fields are supported:
- year
- quarter (emits value in inclusive range [1, 4] based on which quartile of the year the date is in)
- month
- week (week of the year)
- day (day of the month)
- hour
- minute
- second
- millisecond
- microsecond
- nanosecond
- dow (day of the week)
- doy (day of the year)
- epoch (seconds since Unix epoch)
-
source: Source time expression to operate on. Can be a constant, column, or function.
to_timestamp
Converts a value to a timestamp (YYYY-MM-DDT00:00:00Z
).
Supports strings, integer, unsigned integer, and double types as input.
Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no [Chrono formats] are provided.
Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z
).
Returns the corresponding timestamp.
Note: to_timestamp
returns Timestamp(Nanosecond)
. The supported range for integer input is between -9223372037
and 9223372036
.
Supported range for string input is between 1677-09-21T00:12:44.0
and 2262-04-11T23:47:16.0
. Please use to_timestamp_seconds
for the input outside of supported bounds.
to_timestamp(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
❯ select to_timestamp('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------+
| to_timestamp(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------+
| 2023-01-31T14:26:56.123456789 |
+-----------------------------------------------------------+
❯ select to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------+
| to_timestamp(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789 |
+--------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_millis
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.
Integers and unsigned integers are interpreted as milliseconds since the unix epoch (1970-01-01T00:00:00Z
).
Returns the corresponding timestamp.
to_timestamp_millis(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
❯ select to_timestamp_millis('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_millis(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123 |
+------------------------------------------------------------------+
❯ select to_timestamp_millis('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_millis(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123 |
+---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_micros
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.
Integers and unsigned integers are interpreted as microseconds since the unix epoch (1970-01-01T00:00:00Z
)
Returns the corresponding timestamp.
to_timestamp_micros(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
❯ select to_timestamp_micros('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_micros(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123456 |
+------------------------------------------------------------------+
❯ select to_timestamp_micros('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_micros(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456 |
+---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_nanos
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.
Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z
).
Returns the corresponding timestamp.
to_timestamp_nanos(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
❯ select to_timestamp_nanos('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------------+
| to_timestamp_nanos(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------------+
| 2023-01-31T14:26:56.123456789 |
+-----------------------------------------------------------------+
❯ select to_timestamp_nanos('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------------+
| to_timestamp_nanos(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789 |
+---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_seconds
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z
).
Supports strings, integer, and unsigned integer types as input.
Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.
Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z
).
Returns the corresponding timestamp.
to_timestamp_seconds(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
❯ select to_timestamp_seconds('2023-01-31T09:26:56.123456789-05:00');
+-------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-------------------------------------------------------------------+
| 2023-01-31T14:26:56 |
+-------------------------------------------------------------------+
❯ select to_timestamp_seconds('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+----------------------------------------------------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+----------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00 |
+----------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
from_unixtime
Converts an integer to RFC3339 timestamp format (YYYY-MM-DDT00:00:00.000000000Z
).
Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z
)
return the corresponding timestamp.
from_unixtime(expression)
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
Hashing Functions
digest
Computes the binary hash of an expression using the specified algorithm.
digest(expression, algorithm)
Arguments
-
expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
-
algorithm: String expression specifying algorithm to use. Must be one of:
- md5
- sha224
- sha256
- sha384
- sha512
- blake2s
- blake2b
- blake3
md5
Computes an MD5 128-bit checksum for a string expression.
md5(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha224
Computes the SHA-224 hash of a binary string.
sha224(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha256
Computes the SHA-256 hash of a binary string.
sha256(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha384
Computes the SHA-384 hash of a binary string.
sha384(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
sha512
Computes the SHA-512 hash of a binary string.
sha512(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of string operators.
Other Functions
arrow_cast
Casts a value to a specific Arrow data type:
arrow_cast(expression, datatype)
Arguments
- expression: Expression to cast. Can be a constant, column, or function, and any combination of arithmetic or string operators.
- datatype: Arrow data type name
to cast to, as a string. The format is the same as that returned by [
arrow_typeof
]
Example
❯ select arrow_cast(-5, 'Int8') as a,
arrow_cast('foo', 'Dictionary(Int32, Utf8)') as b,
arrow_cast('bar', 'LargeUtf8') as c,
arrow_cast('2023-01-02T12:53:02', 'Timestamp(Microsecond, Some("+08:00"))') as d
;
+----+-----+-----+---------------------------+
| a | b | c | d |
+----+-----+-----+---------------------------+
| -5 | foo | bar | 2023-01-02T12:53:02+08:00 |
+----+-----+-----+---------------------------+
1 row in set. Query took 0.001 seconds.
arrow_typeof
Returns the name of the underlying Arrow data type of the expression:
arrow_typeof(expression)
Arguments
- expression: Expression to evaluate. Can be a constant, column, or function, and any combination of arithmetic or string operators.
Example
❯ select arrow_typeof('foo'), arrow_typeof(1);
+---------------------------+------------------------+
| arrow_typeof(Utf8("foo")) | arrow_typeof(Int64(1)) |
+---------------------------+------------------------+
| Utf8 | Int64 |
+---------------------------+------------------------+
1 row in set. Query took 0.001 seconds.
Large portions of this page is copied from the Apache Datafusion documentation on January 26th 2024 - where there have been customisations to match Hypi's deployment this has been noted. Apache Datafusion and the Apache name are the property of the Apache Foundation and licensed under the Apache V2 license .