Moving from BigQuery to ClickHouse in the SQL Editor

Introduction

We recently moved from BigQuery to ClickHouse as our primary database management system, as part of a major infrastructure upgrade. You may notice effects of the change in queries generated by Moby, as well as in the SQL editor when writing your own queries.

This guide aims to help you transition from using BigQuery to ClickHouse by highlighting the key differences in SQL syntax, functions, and potential pitfalls to watch out for. Please refer to the ClickHouse Documentation for more detailed syntax and function references:

Summary of Common Pitfalls

Case Sensitivity

  • Ensure that all identifiers are used with the correct case. ClickHouse is more strictly case sensitive than BigQuery.

Aliases

  • In ClickHouse, aliases can be referenced throughout the query. Because this shortcut is available, this can lead to strange behavior or errors.

For example, the spend alias in the WHERE clause refers to the aliased SUM(spend), leading to an error since WHERE cannot filter on aggregated results.

SELECT SUM(spend) AS spend FROM ads WHERE spend > 8

One way to fix this is to use different aliases for aggregated columns:

SELECT SUM(spend) AS total_spend FROM ads WHERE spend > 8

This ensures spend is understood as the original column, and total_spend is the result of the aggregation.

Another possible fix is to use the table or table alias whenever referencing the original column:

SELECT SUM(spend) AS spend FROM ads a WHERE a.spend > 8

Data Type Mismatches

  • Ensure that data types are correctly mapped when migrating schemas. For instance, a STRING in BigQuery should be mapped to String in ClickHouse, BOOL to UInt8, etc.

Date and Time Handling

  • Be aware of differences in date and time functions. Test queries thoroughly to ensure dates are handled correctly.

Syntax Differences

Some SQL constructs available in BigQuery may not have direct equivalents in ClickHouse. It may require rewriting parts of queries.

Data Types

BigQueryClickHouse
STRINGString
INT64Int64
FLOAT64Float64
BOOLUInt8

Date Functions

BigQueryClickHouse
PARSE_DATE(format_string, date_string)parseDateTimeBestEffort(date_string)
FORMAT_DATE(format_string, date)formatDate(format_string, date)

Array Functions

BigQueryClickHouse
ARRAY_LENGTH(array)arrayLength(array)
ARRAY_AGG(expression)groupArray(expression)

JSON Functions

BigQueryClickHouse
JSON_EXTRACT(json_string, json_path)JSONExtract(json_string, json_path)
JSON_EXTRACT_SCALAR(json_string, json_path)JSONExtractString(json_string, json_path)

Unnest

ClickHouse does not have an exact equivalent of UNNEST as found in BigQuery. However, you can achieve similar functionality using array functions like arrayJoin. Here's an example:

BigQuery:

SELECT
    user,
    item
FROM
    users,
    UNNEST(items) AS item;

ClickHouse:

SELECT
    user,
    item
FROM
    users
    array join items as item