
Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources

Edmon Begoli

Oak Ridge National Laboratory

(ORNL)

Oak Ridge, Tennessee, USA

begolie@ornl.gov

Jesús Camacho-Rodríguez

Hortonworks Inc.

Santa Clara, California, USA

jcamacho@hortonworks.com

Julian Hyde

Hortonworks Inc.

Santa Clara, California, USA

jhyde@hortonworks.com

Michael J. Mior

David R. Cheriton School of

Computer Science

University of Waterloo

Waterloo, Ontario, Canada

mmior@uwaterloo.ca

Daniel Lemire

University of Quebec (TELUQ)

Montreal, Quebec, Canada

lemire@gmail.com

ABSTRACT

Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite’s architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for new types of data sources, query languages, and approaches to query processing and optimization.

CCS CONCEPTS

• Information systems → DBMS engine architectures;

KEYWORDS

Apache Calcite, Relational Semantics, Data Management, Query Algebra, Modular Query Optimization, Storage Adapters

1 INTRODUCTION

Following the seminal System R, conventional relational database engines dominated the data processing landscape. Yet, as far back as 2005, Stonebraker and Çetintemel [ ] predicted that we would see the rise of a collection of specialized engines such as column stores, stream processing engines, text search engines, and so forth. They argued that specialized engines can offer more cost-effective performance and that they would bring the end of the “one size fits all” paradigm. Their vision seems today more relevant than ever. Indeed, many specialized open-source data systems have since become popular, such as Storm [ ] and Flink [ ] (stream processing), Elasticsearch [ ] (text search), Apache Spark [ ], Druid [ ], etc.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

SIGMOD’18, June 10–15, 2018, Houston, TX, USA

© 2018 Copyright held by the owner/author(s). Publication rights licensed to the Association for Computing Machinery.

ACM ISBN 978-1-4503-4703-7/18/06... $15.00

https://doi.org/10.1145/3183713.3190662

As organizations have invested in data processing systems tailored towards their specific needs, two overarching problems have arisen:

• The developers of such specialized systems have encountered related problems, such as query optimization [ ] or the need to support query languages such as SQL and related extensions (e.g., streaming queries [ ]) as well as language-integrated queries inspired by LINQ [ ]. Without a unifying framework, having multiple engineers independently develop similar optimization logic and language support wastes engineering effort.

• Programmers using these specialized systems often have to integrate several of them together. An organization might rely on Elasticsearch, Apache Spark, and Druid. We need to build systems capable of supporting optimized queries across heterogeneous data sources [55].

Apache Calcite was developed to solve these problems. It is a complete query processing system that provides much of the common functionality (query execution, optimization, and query languages) required by any database management system, except for data storage and management, which are left to specialized engines. Calcite was quickly adopted by Hive, Drill [ ], Storm, and many other data processing engines, providing them with advanced query optimizations and query languages. For example, Hive [ ] is a popular data warehouse project built on top of Apache Hadoop. As Hive moved from its batch processing roots towards an interactive SQL query answering platform, it became clear that the project needed a powerful optimizer at its core. Thus, Hive adopted Calcite as its optimizer and their integration has been growing since. Many other projects and products have followed suit, including Flink, MapD [12], etc.
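The division of labor described above, in which a shared layer plans and optimizes queries while specialized engines keep storage and management, can be illustrated with a toy sketch. It is not Calcite's API: `Scan`, `Filter`, `Join`, `ListAdapter`, `push_filter_into_scan`, `execute`, and `demo` are hypothetical names invented for this illustration, and the single rewrite rule stands in for the much larger rule set the paper mentions.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Tuple

Row = Dict[str, object]

# A tiny logical algebra (hypothetical; not Calcite's operator classes).
@dataclass(frozen=True)
class Scan:
    source: str                                    # name of a registered adapter
    predicate: Optional[Tuple[str, object]] = None  # equality filter pushed to the source

@dataclass(frozen=True)
class Filter:
    child: object
    column: str
    value: object

@dataclass(frozen=True)
class Join:
    left: object
    right: object
    key: str

class ListAdapter:
    """Stands in for a specialized engine that owns its data and can
    evaluate a simple equality predicate on its own."""
    def __init__(self, rows: List[Row]) -> None:
        self.rows = list(rows)
        self.rows_returned = 0  # how many rows left the "engine"

    def scan(self, predicate: Optional[Tuple[str, object]] = None) -> Iterator[Row]:
        for row in self.rows:
            if predicate is None or row.get(predicate[0]) == predicate[1]:
                self.rows_returned += 1
                yield row

def push_filter_into_scan(node):
    """One rewrite rule: turn Filter(Scan) into a Scan carrying the predicate."""
    if isinstance(node, Filter):
        child = push_filter_into_scan(node.child)
        if isinstance(child, Scan) and child.predicate is None:
            return Scan(child.source, (node.column, node.value))
        return Filter(child, node.column, node.value)
    if isinstance(node, Join):
        return Join(push_filter_into_scan(node.left),
                    push_filter_into_scan(node.right), node.key)
    return node

def execute(node, adapters: Dict[str, ListAdapter]) -> List[Row]:
    """Evaluate a plan; only Scan touches an adapter."""
    if isinstance(node, Scan):
        return list(adapters[node.source].scan(node.predicate))
    if isinstance(node, Filter):
        return [r for r in execute(node.child, adapters)
                if r.get(node.column) == node.value]
    if isinstance(node, Join):
        left = execute(node.left, adapters)
        index: Dict[object, List[Row]] = {}
        for r in execute(node.right, adapters):
            index.setdefault(r[node.key], []).append(r)
        return [{**l, **r} for l in left for r in index.get(l[node.key], [])]
    raise TypeError(f"unknown plan node: {node!r}")

def demo():
    # Two independent "engines" holding different data sets.
    adapters = {
        "events": ListAdapter([{"user": 1, "page": "home"},
                               {"user": 2, "page": "docs"},
                               {"user": 1, "page": "docs"}]),
        "users": ListAdapter([{"user": 1, "name": "ann"},
                              {"user": 2, "name": "bob"}]),
    }
    plan = Join(Filter(Scan("events"), "page", "docs"), Scan("users"), "user")
    optimized = push_filter_into_scan(plan)
    rows = execute(optimized, adapters)
    return optimized, rows, adapters["events"].rows_returned
```

Calling `demo()` joins rows from two separate sources through one plan, and after the rewrite the `events` source evaluates the predicate itself, so fewer rows cross the adapter boundary. A real optimizer would choose among many such rules using cost estimates, which this sketch leaves out.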

http://calcite.apache.org/docs/powered_by

arXiv:1802.10233v1 [cs.DB] 28 Feb 2018
