Skip to content

SQL API Regression: Pushdown Query - Incorrect date range filtering in v1.5.14 (BigQuery) #10284

@duychinhnguyenvn

Description

@duychinhnguyenvn

Describe the bug
After upgrading @cubejs-backend/core from v1.3.23 to v1.5.14, we observed a significant regression in how the SQL API generates queries (in pushdown mode) for DATE dimensions on BigQuery.

There are two critical issues:

  1. Partition Pruning is lost: The generated subquery no longer includes the date filter, leading to a WHERE (1=1) clause instead of filtering on the partition column. This causes a Full Table Scan on BigQuery.
  2. Data Inconsistency (Off-by-one error): The date filter omits the first day of the requested range. For example, a filter for 2025-09-01 to 2025-09-02 only returns data for 2025-09-02.

Environment

  • Cube.js Version: 1.5.14 (Regression from 1.3.23)
  • Database: Google BigQuery
  • Deployment: Production
  • Timezone Configuration: Asia/Ho_Chi_Minh

Steps to Reproduce

  1. Define a Cube with a DATE type dimension (e.g., date).
  2. Run a query via the SQL API with a date range filter (e.g., 2025-09-01 to 2025-09-02).
  3. Observe the generated SQL via EXPLAIN.

My Query:

SELECT
  date_trunc('day', DATE) as date_d,
  sum(sumRevenue), 
  count(1) -- force pushdown mode
FROM FnB_Performance_Raw
WHERE DATE BETWEEN '2025-09-01' AND '2025-09-02'
GROUP BY date_d;

Expected Behavior (v1.3.23)
The SQL generator should push the filter down to the subquery to enable partition pruning:

-- v1.3.23 SQL (Correct)
SELECT ......
FROM (
  SELECT
        ........
      FROM
        (SELECT * FROM `project.dataset.table`
     WHERE 1=1
     AND ((
           date BETWEEN DATE(TIMESTAMP(?), 'Asia/Ho_Chi_Minh') AND DATE(TIMESTAMP(?), 'Asia/Ho_Chi_Minh')
        ))
  ) AS `fn_b__performance__raw`  WHERE (TIMESTAMP(`fn_b__performance__raw`.date, 'Asia/Ho_Chi_Minh') >= TIMESTAMP(?) AND TIMESTAMP(`fn_b__performance__raw`.date) <= TIMESTAMP(?)) GROUP BY 1
) AS `FnB_Performance_Raw`
LIMIT 50000

Actual Behavior (v1.5.14)
The filter is missing from the subquery, and the outer filter logic seems to cause data loss:

-- v1.5.14 SQL (Buggy)
SELECT ....
FROM (
  SELECT
        ......
      FROM
        (SELECT * FROM `project.dataset.table`
     WHERE 1=1
     AND (1 = 1)
  ) AS `fn_b__performance__raw`  WHERE (TIMESTAMP(`fn_b__performance__raw`.date, 'Asia/Ho_Chi_Minh') BETWEEN ? AND ?) GROUP BY 1
) AS `FnB_Performance_Raw`
LIMIT 50000

Result: Only returns 2025-09-02, omitting 2025-09-01.

Image

Unlike v1.5.14, version 1.3.23 correctly included both 2025-09-01 and 2025-09-02 in the results

Image

Our Cube Schema
My minimally reproducible Cube Schema here.

cubes:
  - name: FnB_Performance_Raw
    sql: |
      SELECT * FROM `project.dataset.table`
         WHERE 1=1
         AND {FILTER_PARAMS.FnB_Performance_Raw.DATE.filter(
               lambda x, y: f"""
               date BETWEEN DATE(TIMESTAMP({x}), 'Asia/Ho_Chi_Minh') AND DATE(TIMESTAMP({y}), 'Asia/Ho_Chi_Minh')
            """
         )}
    dimensions:
      - name: DATE
        sql: "TIMESTAMP({CUBE}.date,'Asia/Ho_Chi_Minh')"
        type: time
    measures:
      - name: sumRevenue
        sql: "{CUBE}.revenue"
        type: sum

Metadata

Metadata

Assignees

Labels

api:sqlIssues related to SQL APIquestionThe issue is a question. Please use Stack Overflow for questions.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions