Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 38 additions & 1 deletion datafusion/physical-expr/src/analysis.rs
Original file line number Diff line number Diff line change
Expand Up @@ -297,12 +297,14 @@ mod tests {
use std::sync::Arc;

use arrow::datatypes::{DataType, Field, Schema};
use datafusion_common::{DFSchema, assert_contains};
use datafusion_common::stats::Precision;
use datafusion_common::{DFSchema, Result, ScalarValue, assert_contains};
use datafusion_expr::{
Expr, col, execution_props::ExecutionProps, interval_arithmetic::Interval, lit,
};

use crate::{AnalysisContext, create_physical_expr};
use crate::expressions::Column;

use super::{ExprBoundaries, analyze};

Expand Down Expand Up @@ -435,4 +437,39 @@ mod tests {
.unwrap_err();
assert_contains!(analysis_error.to_string(), expected_error);
}

#[test]
fn analyze_not_eq_around() -> Result<()> {
let schema = Arc::new(Schema::new(vec![make_field("a", DataType::Float32)]));
let boundaries = vec![ExprBoundaries {
column: Column::new("a", 0),
interval: Some(Interval::try_new(
ScalarValue::Float32(Some(-1.0)),
ScalarValue::Float32(Some(1.0)),
)?),
distinct_count: Precision::Absent,
}];

// NOT (a = 0.0)
let pred_not_eq = datafusion_expr::not(col("a").eq(lit(0.0f32)));

let df_schema = DFSchema::try_from(Arc::clone(&schema))?;
let physical_expr =
create_physical_expr(&pred_not_eq, &df_schema, &ExecutionProps::new())?;

let out_not_eq = analyze(
&physical_expr,
AnalysisContext::new(boundaries),
df_schema.as_ref(),
)?;

let actual = out_not_eq.boundaries[0].interval.clone();
let expected = Some(Interval::try_new(
ScalarValue::Float32(Some(-1.0)),
ScalarValue::Float32(Some(1.0)),
)?);

assert_eq!(expected, actual);
Ok(())
}
}
2 changes: 1 addition & 1 deletion datafusion/physical-expr/src/intervals/cp_solver.rs
Original file line number Diff line number Diff line change
Expand Up @@ -364,7 +364,7 @@ pub fn propagate_comparison(
match op {
Operator::Eq => {
// TODO: Propagation is not possible until we support interval sets.
Ok(None)
Ok(Some((left_child.clone(), right_child.clone())))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct, but only if the left and right child intervals do not intersect. So maybe this should be

match left_child.intersect(right_child)? {
    None => Ok(Some((left_child.clone(), right_child.clone()))),
    // TODO: Propagation is not possible until we support interval sets.
    Some(_) => Ok(None)
}

instead?

In the test case

#[test]
fn test_propagate_eq_false() -> Result<()> {
    let left = Interval::make(Some(-10_i64), Some(10_i64))?;
    let right = Interval::make(Some(0_i64), Some(0_i64))?;
    let outcome = propagate_comparison(&Operator::Eq, &Interval::FALSE, &left, &right)?;
    assert_eq!(
        None,
        outcome
    );
    Ok(())
}

for instance, the result for the left child should be the intervals [-10, 0[ and ]0, 10], but as the TODO suggests that can't currently be represented. The proposed fix would return [-10, 10] for the left child, but that is not correct since a value of 0 for the left child would result in true, not false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on @berkaysynnada's answer I see I misinterpreted the contract of propagate_comparison. I had assumed it needed to provide a correct stricter constraint if it could or None if it couldn't. If not restricting the child constraints is valid, which makes perfect sense, then indeed there's just the situation where we can know for sure that (left == right) == true, i.e. two equal point intervals, that we should return None. In all other cases we can simply retain the existing constraints.

}
Operator::Gt => satisfy_greater(right_child, left_child, false),
Operator::GtEq => satisfy_greater(right_child, left_child, true),
Expand Down
Loading