Skip to main content

String functions and operators

String operators

OperatorExpression & DescriptionExample
||expression1 || expression2 [ || expression ] ...
Concatenates two or more expressions.
'Abcde' || 1 || 23 Abcde123
^@string ^@ substring
Returns true (t) if string starts with substring. This operator is equivalent to the starts_with() function.
'abcdef' ^@ 'abc't

String functions

ascii

This function returns the Unicode code point of the first character of the input_string. If the string is empty, it returns NULL.

ascii ( input_string ) → int
ascii('RisingWave') → 82
ascii('🌊') → 127754

bit_length

This function returns the number of bits in the input string, which is 8 times the octet_length.

bit_length ( input_string ) → integer
bit_length('wave') → 32

btrim

This function is equal to trim (BOTH). It removes the specified characters from both the beginning and end of the input string.

btrim ( input_string [, characters ] ) → output_string
btrim(' cake ') → 'cake'
btrim('abcxyzabc', 'cba') → 'xyz'

char_length, character_length, length

These functions return the number of characters in the input string.

char_length ( input_string ) → integer_output
character_length ( input_string ) → integer_output
length ( input_string ) → integer_output
char_length('wave') → 4

chr

This function returns the character with the Unicode code point equivalent to the input integer value provided.

chr ( input_int ) → string
chr(65) → 'A'

concat

This function concatenates the arguments. NULL arguments are ignored.

concat ( any_input-value_1 [, any_input-value_2 [, ...] ]) → output_string
concat('Abcde', 2, NULL, 22) → 'Abcde222'
concat(variadic array['abcde', '2', NULL, '22']); -> "abcde222"

concat_ws

This function concatenates the arguments with a separator. The first argument is used as the separator and should not be NULL. Other NULL arguments are ignored.

concat_ws ( separator_string, any_input-value_1 [, any_input-value_2 [, ...] ]) → output_string
concat_ws(',', 'Abcde', 2, NULL, 22) → 'Abcde,2,22'
concat_ws(',', variadic array['abcde', 2, NULL, 22] :: varchar[]); -> "abcde,2,22"

convert_from

This function converts a string to the database encoding. The original encoding is specified by src_encoding name. The string must be valid in this encoding.

convert_from(string bytea, src_encoding name)  →  text
convert_from('\x4346464558'::bytea, 'utf8')  →  'CFFEX'
note

For this function, only encoding UTF8 is supported. RisingWave uses UTF8 encoding to store text, so this function primarily serves as a type conversion operation.


convert_to

This function converts a string to the encoding specified by dest_encoding name and returns a byte array.

convert_to(string text, dest_encoding name) → bytea
convert_to('Hello World', 'UTF8') → '\\x48656c6c6f20576f726c64'
note

For this function, only encoding UTF8 is supported. RisingWave uses UTF8 encoding to store text, so this function primarily serves as a type conversion operation.


decode

This function decodes the text data in the input string into binary data. Supported formats for the encoded input string include base64, hex, and escape.

decode ( input_string, format_type ) → bytea
decode('MTIz', 'base64') → '123'

encode

This function encodes the binary data in bytea into its textual representation. Supported encoding formats include base64, hex, and escape.

encode ( bytea, format_type ) → output_string
encode(E'123'::bytea, 'base64') → 'MTIz'

format

This function produces output formatted according to a format string, in a style similar to the C function sprintf.

format( format_string [, format_arg [, ...] ] ) → output_string

The format_string specifies how the output_string should be formatted. It consists of text and format specifiers. Text is copied directly to the output string. Format specifiers are placeholders for the arguments to be inserted into the output string. The number of format specifiers should be equal to or less than the number of arguments.

The syntax of the format specifier:

%type

type is the type of format conversion to use to generate the output of the format specifier.

The allowed values for type are:

  • s: Formats the argument value as a string. NULL is treated as an empty string.
  • I: Treats the argument value as an SQL identifier.

Please note that format_string and format_arg can be variables.

For example, the following query works fine in RisingWave.

SELECT format(f, a, b) from (values
('%s %s', 'Hello', 'World'),
('%s%s', 'Hello', null),
(null, 'Hello', 'World')
) as t(f, a, b);
----
Hello World
Hello
NULL
More examples
format('%s %s', variadic array['Hello', 'World']); -> "Hello World"

initcap

This function capitalizes the first letter of each word in the input string and converts the remaining characters to lowercase.

initcap ( input_string ) → string
initcap('POWERFUL and flexible') → 'Powerful And Flexible'

lower

This function converts the string to all lowercase.

lower ( input_string ) → output_string
lower('TOM') → 'tom'

left

This function returns the first input integer characters in the input string. If the input integer is negative, the last input_integer characters are removed from the output string.

left ( input_string, input_integer ) → output_string
left('risingwave', 4) → 'risi'
left('risingwave', -4) → 'rising'

lpad

This function pads the input string on the left with spaces until it reaches the specified input integer length. If the input string is longer than the input integer length, it is truncated to the specified length. Providing the optional padding string replaces the spaces with the padding string.

lpad ( input_string, input_int ) → string

lpad ( input_string, input_int, padding_string ) → string
lpad('42', 5) → '  &nbsp42'
lpad('42', 5, 'R') → 'RRR42'

ltrim

This function is equal to trim (LEADING). It removes the specified characters from the beginning of the input string.

ltrim ( input_string [, characters ] ) → output_string
ltrim(' cake ') → 'cake '
ltrim('abcxyzabc', 'cba') → 'xyzabc'

octet_length

This function returns the number of bytes in the string.

octet_length ( input_string )
octet_length('wave') → 4

overlay

This function replaces a substring in the input string with a substring, starting at a specified position and with an optional length. If the length is omitted, its value is the length of the substring.

overlay ( input_string PLACING substring FROM start_int [ FOR length_int ] ) → output_string
overlay('yabadoo' PLACING 'daba' FROM 5 FOR 0) → 'yabadabadoo'
overlay('abcdef' PLACING '45' FROM 4) → 'abc45f'
overlay('RisingWave' PLACING '🌊' FROM 7) → 'Rising🌊ave'

position

This function returns the starting index of the specified substring within the input string, or zero if it is not present.

position ( substring in input_string ) → integer_output
position('ing' in 'rising') → 4

regexp_count

Returns the number of times a POSIX regular expressions pattern appears in input_string. Back reference, positive, negative lookahead, and positive, negative lookbehind are supported. Optional flags include i, which stands for case-insensitive matching, and c, which represents case-sensitive matching.

regexp_count( input_string, pattern [, start_int [, optional_flag ]] ) → output_int
regexp_count('ABCABCAXYaxy', 'A.', 1, 'c') → 3
regexp_count('ABCABCAXYaxy', 'A.', 2, 'c') → 2

regexp_match

Returns a string array of captured substring(s) resulting from the first match of a POSIX regular expression pattern to a string. If there is no match, the result is NULL. Back reference, positive, negative lookahead, and positive, negative lookbehind are supported. Optional flags include i, which stands for case-insensitive matching, and c, which represents case-sensitive matching.

regexp_match( input_string, pattern [, optional_flag ] ) → matched_string[]
regexp_match('foobarbequebaz', '(bar)(beque)') → {bar,beque}
regexp_match('abc', 'd') → NULL
regexp_match('abc', 'Bc', 'ici') → {bc}

regexp_matches

Returns a set of string arrays of captured substring(s) resulting from matching a POSIX regular expression pattern to a string. Returns all matches by default. Back reference, positive, negative lookahead, and positive, negative lookbehind are supported. Optional flags include i, which stands for case-insensitive matching, and c, which represents case-sensitive matching.

regexp_matches( input_string, pattern [, optional_flag ] ) → set_of_matched_string[]
regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)') →
{bar,beque}
{bazil,barf}

regexp_matches('abcabc', 'Bc', 'i') →
{bc}
{bc}

regexp_replace

Replaces the substring that is either the first match or, optionally, the N'th match to the POSIX regular expression pattern in the input_string, starting from the character index specified by the optional start_integer.

Back reference, positive, negative lookahead, and positive, negative lookbehind are supported.

Optional flags can modify the matching behavior:

  • The g flag indicates that all occurrences of the pattern in the input string should be replaced. If not used, only the first occurrence is replaced.
  • The i flag enables case-insensitive matching.
  • The c flag enables case-sensitive matching.

Note: If start_integer is used, flags is not permitted unless N_integer is also specified.

regexp_replace( input_string, pattern, replacement_string [, start_integer [, N_integer ] ] [, flags ] ) → output_string
regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g') → fooXarYXazY

regexp_replace('HELLO world', '[aeiou]', 'X', 'ig') → HXLLX wXrld

regexp_replace('RisingWave', '[aeiou]', 'X', 1, 3, 'i') → RisingWXve

regexp_split_to_array

This function splits a string into an array of substrings based on a regular expression pattern.

regexp_split_to_array ( input_string TEXT, pattern TEXT )TEXT[]
regexp_split_to_array('apple,banana,orange', ',') → {apple,banana,orange}
regexp_split_to_array('apple.banana!orange', '[.!]') → {apple,banana,orange}
regexp_split_to_array('applebananaorange', ',') → {applebananaorange}

repeat

Repeats input_string specific times. Null is returned when times_int is zero, negative, or null.

repeat( input_string, times_int ) → output_string
repeat('A1b2', 3) → A1b2A1b2A1b2

replace

Replaces all occurrences of substring from_string in input_string with substring to_string.

replace( input_string, from_string, to_string ) → output_string
replace('abcdefabcdef', 'cd', 'XX') → abXXefabXXef

reverse

Returns the input_string with its characters in the reverse order.

reverse( input_string ) → string
reverse('RisingWave') → evaWgnisiR

Returns the last input_integer characters in the input_string. If input_integer is negative, the first |input_integer| characters are removed from output_string.

right( input_string, input_integer ) → output_string
right('risingwave', 4) → wave
right('risingwave', -4) → ngwave

rpad

Pads the input string on the right with spaces until it reaches the specified length. If the string is longer than the specified length, it is truncated to the specified length. Providing the optional padding string replaces the spaces with the padding string.

rpad( input_string, input_int, padding_string ) → string
rpad('42', 5) → '42    '
rpad('42', 5, 'R') → '42RRR'

rtrim

Equals to trim (TRAILING).

rtrim( input_string[, characters] ) → output_string
rtrim(' cake ') → 'cake'
rtrim('abcxyzabc', 'cba') → 'abcxyz'

split_part

Splits the input string at occurrences of the delimiter string and returns the n'th field (counting from one), or when n is negative, returns the |n|'th-from-last field. When n is zero, returns an 'InvalidParameterValue' error. When the input delimiter string is an empty string, returns the input string if querying the first or last field. Otherwise, returns an empty string.

split_part( input_string, delimiter_string, int_n ) → varchar
split_part('abc~@~def~@~ghi', '~@~', 2) → 'def'

starts_with

Returns true if the input string starts with the specified prefix string, otherwise returns false.

starts_with( input_string, prefix_string ) → boolean
starts_with('RisingWave is powerful', 'Rising') → true

substr/substring

Extracts the substring from input_string starting at position start_int and extending for count_int characters.

substr( input_string, start_int[, count_int] ) → output_string
substr('alphabet', 3) → 'phabet';
substring('alphabet', 3, 2) → 'ph'

to_ascii

Returns the input string with non-ASCII characters replaced by their closest ASCII equivalents.

to_ascii( input_string ) → string
to_ascii('Café') → 'Cafe'

to_hex

Converts input_int or input_bigint to its hexadecimal representation as a string.

to_hex( input_int ) → string
to_hex(255) → 'ff'
to_hex(123456789012345678) → '1b69b4ba630f34e'

translate

Replaces each character in the input_string that matches a character in the from_string with the corresponding character in the to_string.

translate( input_string, from_string, to_string ) → output_string
translate('M1X3', '13', 'ae') → 'MaXe'

trim

Trims the longest contiguous substring of characters from the beginning, end, or both ends (BOTH by default) of input_string that contains only the characters specified in characters (which defaults to whitespace if not specified).

There are two syntax variants.

trim( [ LEADING | TRAILING | BOTH ] [ characters ] FROM input_string ) → output_string
trim( [ LEADING | TRAILING | BOTH ]  [FROM ] input_string[, characters] ) → output_string
trim(' cake ') → 'cake'
trim(both 'cba' from 'abcxyzabc') → 'xyz'
trim(both from 'abcxyzabc', 'cba') → 'xyz'
trim('abcxyzabc', 'cba') → 'xyz'

upper

Converts the string to all uppercase.

upper( input_string ) → output_string
upper('tom') → 'TOM'

LIKE pattern matching expressions

string [ NOT ] { LIKE | ILIKE } pattern

string [!]~~[*] pattern

The LIKE expression returns true if the string matches the supplied pattern. The NOT LIKE expression returns false if LIKE returns true. By using ILIKE instead of LIKE, the matching becomes case-insensitive.

Alternatively, you can use the operators ~~ and ~~* as equivalents to LIKE and ILIKE, respectively. Similarly, the operators !~~ and !~~* equal to NOT LIKE and NOT ILIKE.

Wildcards

  • An underscore _ in a pattern matches any single character.

  • A percent sign % matches any sequence of zero or more characters.

If the pattern does not contain _ or %, then the pattern only represents the string itself. For example, the pattern 'apple' matches only the string 'apple'. In that case, LIKE acts like the equals operator =.

Escape

To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character \. To match the escape character itself, write two escape characters: \\.

note

You can use ESCAPE '' to disable the escape mechanism, but specifying a custom escape character using the ESCAPE clause is not supported.

Examples

'abc' LIKE 'abc'           true
'abc' LIKE 'a%' true
'abc' LIKE '_b_' true
'abc' LIKE 'c' false

SIMILAR TO pattern matching expressions

string [ NOT ] SIMILAR TO pattern [ ESCAPE escape-character ]

The SIMILAR TO expression returns true if the string matches the supplied pattern. The NOT SIMILAR TO expression returns false if SIMILAR TO returns true. The matching is case-sensitive.

Metacharacter

OperatorDescription
%Matches any sequence of zero or more characters.
_Matches any single character.
|Denotes alternation (either of two alternatives).
*Repeats the previous item zero or more times.
+Repeats the previous item one or more times.
?Repeats the previous item zero or one time.
{m}Repeats the previous item exactly m times.
{m,}Repeats the previous item m or more times.
{m,n}Repeats the previous item at least m and not more than n times.
()Parentheses group items into a single logical item.
[...]A bracket expression specifies a character class.

Escape

To match a metacharacter literally, use the escape character \ before the respective character in the pattern. To match the escape character itself, write two escape characters: \\.

You can use ESCAPE '' to disable the escape mechanism. The ESCAPE clause supports specifying a custom escape character, which must be either empty or a single character.

Examples

'abc' SIMILAR TO 'a|b|c'                false
'abc' SIMILAR TO '(a|b|c)+' true
'a_b' SIMILAR TO 'a$_b' ESCAPE '$' true

Help us make this doc better!