Distributed query on batch mode #320
Replies: 1 comment
-
SQL plan optimization for where and group op
Related issue #317
Previous work
The OpenMLDB planner already applies some optimization passes to the physical plan, so that group/join/filter operations can be optimized when their keys match table indexes. For instance, given the table creation statement:
CREATE TABLE t1 (
col0 string,
col1 int32,
col2 int16,
col3 float,
col4 double,
col5 int64,
col6 string,
INDEX index0(col0) OPTIONS (ts = col5));
and the query statement:
SELECT col0,
sum(col1) as col1_sum, sum(col3) as col3_sum,
sum(col4) as col4_sum, sum(col2) as col2_sum,
sum(col5) as col5_sum
FROM t1 WHERE col0 = "1" and col5 < 2 Group By col0;
Before optimization, the physical plan will be:
After applying the optimization passes, the group op will be eliminated since the group key
Issue description
In this issue, we are going to enhance the existing passes in a way to further optimize
SELECT col0,
sum(col1) as col1_sum, sum(col3) as col3_sum,
sum(col4) as col4_sum, sum(col2) as col2_sum,
sum(col5) as col5_sum
FROM t1 WHERE col0 = "1" and col5 < 2 Group By col0;
Then, in the query statement above, since both
Implementation
From the above discussion, we can see that our essential work is to further optimize an OP when its keys can match the
bool GroupAndSortOptimized::KeysOptimized(
const SchemasContext* root_schemas_ctx, PhysicalOpNode* in, Key* left_key,
Key* index_key, Key* right_key, Sort* sort, PhysicalOpNode** new_in) {
    // TODO: handle key optimization when the data provider type is
    // `kProviderTypePartition`
}
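The key-matching idea behind this pass can be illustrated with a small standalone sketch. It is not the OpenMLDB implementation: `IndexDef` and `FindMatchingIndex` are hypothetical names introduced here, and the order-insensitive comparison of key columns is an assumption made for illustration. The sketch only shows the check "do the keys of an op coincide with the key columns of some table index?".

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a table index definition:
// a name, the key columns, and the ts column. Not an OpenMLDB type.
struct IndexDef {
    std::string name;
    std::vector<std::string> key_columns;
    std::string ts_column;
};

// Returns a pointer to the first index whose key columns are exactly the
// given keys (compared order-insensitively, an assumption of this sketch),
// or nullptr when no index matches. The pointer refers into `indexes` and
// is only valid while that vector is alive and unmodified.
const IndexDef* FindMatchingIndex(const std::vector<IndexDef>& indexes,
                                  std::vector<std::string> keys) {
    std::sort(keys.begin(), keys.end());
    for (const IndexDef& index : indexes) {
        std::vector<std::string> index_keys = index.key_columns;
        std::sort(index_keys.begin(), index_keys.end());
        if (index_keys == keys) {
            return &index;
        }
    }
    return nullptr;
}
```

For the table `t1` above, a group key of `col0` would match `index0`, while a group key of `col1` would match no index, so an op keyed on `col1` could not be optimized this way.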
-
Motivation
Users get frustrated when they cannot run distributed batch queries on OpenMLDB, so we are considering supporting distributed batch queries gradually.
First, we might support some batch queries under specific restrictions. For example:
COUNT, MAX, MIN, SUM
ISSUE related #219