├── LICENSE ├── algebra.md ├── flink1.9.md ├── function.md ├── helloworld.md ├── images ├── calcitea.png ├── kafkaa.png ├── kafkainstall.png ├── window-types.png └── zkinstall.png ├── streaming.1.md ├── streaming.2.md └── tutorial.md /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 
179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /algebra.md: -------------------------------------------------------------------------------- 1 | # 前言 2 | 3 | 本章主旨介绍关系代数在`Calcite`中的应用,如果还对`Calcite`不了解的同学,也可以异步到`https://github.com/dafei1288/CalciteDocTrans/blob/master/tutorial.md`去看 4 | 5 | # 正文 6 | 7 | 关系代数是`Calcite`的核心。每个查询都可以被表述为一个关系运算符树(`a tree of relational operators`)。你可以将SQL翻译成关系代数或者直接构建树。 8 | 9 | Relational algebra is at the heart of Calcite. Every query is represented as a tree of relational operators. You can translate from SQL to relational algebra, or you can build the tree directly. 10 | 11 | Planner rules transform expression trees using mathematical identities that preserve semantics. For example, it is valid to push a filter into an input of an inner join if the filter does not reference columns from the other input. 12 | 13 | Calcite optimizes queries by repeatedly applying planner rules to a relational expression. A cost model guides the process, and the planner engine generates an alternative expression that has the same semantics as the original but a lower cost. 14 | 15 | The planning process is extensible. You can add your own relational operators, planner rules, cost model, and statistics. -------------------------------------------------------------------------------- /flink1.9.md: -------------------------------------------------------------------------------- 1 | 大家期盼已久的1.9已经剪支有些日子了,兴冲冲的切换到跑去编译,我在之前的文章《尝尝Blink》里也介绍过如何编译,本文只针对不同的地方以及遇到的坑做一些说明,希望对遇到同样问题的朋友有一些帮助。 2 | 3 | 首先,切换分支 `git checkout release-1.9` 4 | 这次我们不修改pom文件,将镜像添加到`settings.xml`里,在文章末尾,我会分享出来我用的文件全文,这里就不再赘述了。 5 | 直接使用 `clean package -DskipTests -Dfast`进行编译 6 | 7 | ``` 8 | ​[INFO] Reactor Summary for flink 1.9-SNAPSHOT: 9 | [INFO] 10 | [INFO] force-shading ...................................... SUCCESS [ 2.233 s] 11 | [INFO] flink .............................................. SUCCESS [ 2.536 s] 12 | [INFO] flink-annotations .................................. SUCCESS [ 1.447 s] 13 | [INFO] flink-shaded-curator ............................... SUCCESS [ 1.291 s] 14 | [INFO] flink-metrics ...................................... SUCCESS [ 0.101 s] 15 | [INFO] flink-metrics-core ................................. SUCCESS [ 0.959 s] 16 | [INFO] flink-test-utils-parent ............................ 
SUCCESS [ 0.091 s] 17 | [INFO] flink-test-utils-junit ............................. SUCCESS [ 1.048 s] 18 | [INFO] flink-core ......................................... SUCCESS [ 19.790 s] 19 | [INFO] flink-java ......................................... SUCCESS [ 4.944 s] 20 | [INFO] flink-queryable-state .............................. SUCCESS [ 0.085 s] 21 | [INFO] flink-queryable-state-client-java .................. SUCCESS [ 1.671 s] 22 | [INFO] flink-filesystems .................................. SUCCESS [ 0.079 s] 23 | [INFO] flink-hadoop-fs .................................... SUCCESS [ 3.029 s] 24 | [INFO] flink-runtime ...................................... SUCCESS [ 48.913 s] 25 | [INFO] flink-scala ........................................ SUCCESS [ 39.109 s] 26 | [INFO] flink-mapr-fs ...................................... SUCCESS [ 2.523 s] 27 | [INFO] flink-filesystems :: flink-fs-hadoop-shaded ........ SUCCESS [ 3.966 s] 28 | [INFO] flink-s3-fs-base ................................... SUCCESS [ 7.892 s] 29 | [INFO] flink-s3-fs-hadoop ................................. SUCCESS [ 10.222 s] 30 | [INFO] flink-s3-fs-presto ................................. SUCCESS [ 14.337 s] 31 | [INFO] flink-swift-fs-hadoop .............................. SUCCESS [ 13.493 s] 32 | [INFO] flink-oss-fs-hadoop ................................ SUCCESS [ 7.104 s] 33 | [INFO] flink-azure-fs-hadoop .............................. SUCCESS [ 8.093 s] 34 | [INFO] flink-optimizer .................................... SUCCESS [ 3.843 s] 35 | [INFO] flink-clients ...................................... SUCCESS [ 3.200 s] 36 | [INFO] flink-streaming-java ............................... SUCCESS [ 15.939 s] 37 | [INFO] flink-test-utils ................................... SUCCESS [ 4.398 s] 38 | [INFO] flink-runtime-web .................................. SUCCESS [06:05 min] 39 | [INFO] flink-examples ..................................... SUCCESS [ 0.196 s] 40 | [INFO] flink-examples-batch ............................... SUCCESS [ 15.297 s] 41 | [INFO] flink-connectors ................................... SUCCESS [ 0.076 s] 42 | [INFO] flink-hadoop-compatibility ......................... SUCCESS [ 6.228 s] 43 | [INFO] flink-state-backends ............................... SUCCESS [ 0.088 s] 44 | [INFO] flink-statebackend-rocksdb ......................... SUCCESS [ 4.283 s] 45 | [INFO] flink-tests ........................................ SUCCESS [01:00 min] 46 | [INFO] flink-streaming-scala .............................. SUCCESS [ 33.076 s] 47 | [INFO] flink-table ........................................ SUCCESS [ 0.082 s] 48 | [INFO] flink-table-common ................................. SUCCESS [ 2.936 s] 49 | [INFO] flink-table-api-java ............................... FAILURE [ 1.958 s] 50 | [INFO] flink-table-api-java-bridge ........................ SKIPPED 51 | [INFO] flink-table-api-scala .............................. SKIPPED 52 | [INFO] flink-table-api-scala-bridge ....................... SKIPPED 53 | [INFO] flink-sql-parser ................................... SKIPPED 54 | [INFO] flink-libraries .................................... SKIPPED 55 | [INFO] flink-cep .......................................... SKIPPED 56 | [INFO] flink-table-planner ................................ SKIPPED 57 | [INFO] flink-orc .......................................... SKIPPED 58 | [INFO] flink-jdbc ......................................... SKIPPED 59 | [INFO] flink-hbase ........................................ 
SKIPPED 60 | [INFO] flink-hcatalog ..................................... SKIPPED 61 | [INFO] flink-metrics-jmx .................................. SKIPPED 62 | [INFO] flink-connector-kafka-base ......................... SKIPPED 63 | [INFO] flink-connector-kafka-0.9 .......................... SKIPPED 64 | [INFO] flink-connector-kafka-0.10 ......................... SKIPPED 65 | [INFO] flink-connector-kafka-0.11 ......................... SKIPPED 66 | [INFO] flink-formats ...................................... SKIPPED 67 | [INFO] flink-json ......................................... SKIPPED 68 | [INFO] flink-connector-elasticsearch-base ................. SKIPPED 69 | [INFO] flink-connector-elasticsearch2 ..................... SKIPPED 70 | [INFO] flink-connector-elasticsearch5 ..................... SKIPPED 71 | [INFO] flink-connector-elasticsearch6 ..................... SKIPPED 72 | [INFO] flink-connector-hive ............................... SKIPPED 73 | [INFO] flink-connector-rabbitmq ........................... SKIPPED 74 | [INFO] flink-connector-twitter ............................ SKIPPED 75 | [INFO] flink-connector-nifi ............................... SKIPPED 76 | [INFO] flink-connector-cassandra .......................... SKIPPED 77 | [INFO] flink-avro ......................................... SKIPPED 78 | [INFO] flink-connector-filesystem ......................... SKIPPED 79 | [INFO] flink-connector-kafka .............................. SKIPPED 80 | [INFO] flink-connector-gcp-pubsub ......................... SKIPPED 81 | [INFO] flink-sql-connector-elasticsearch6 ................. SKIPPED 82 | [INFO] flink-sql-connector-kafka-0.9 ...................... SKIPPED 83 | [INFO] flink-sql-connector-kafka-0.10 ..................... SKIPPED 84 | [INFO] flink-sql-connector-kafka-0.11 ..................... SKIPPED 85 | [INFO] flink-sql-connector-kafka .......................... SKIPPED 86 | [INFO] flink-connector-kafka-0.8 .......................... SKIPPED 87 | [INFO] flink-avro-confluent-registry ...................... SKIPPED 88 | [INFO] flink-parquet ...................................... SKIPPED 89 | [INFO] flink-sequence-file ................................ SKIPPED 90 | [INFO] flink-csv .......................................... SKIPPED 91 | [INFO] flink-examples-streaming ........................... SKIPPED 92 | [INFO] flink-examples-table ............................... SKIPPED 93 | [INFO] flink-examples-build-helper ........................ SKIPPED 94 | [INFO] flink-examples-streaming-twitter ................... SKIPPED 95 | [INFO] flink-examples-streaming-state-machine ............. SKIPPED 96 | [INFO] flink-examples-streaming-gcp-pubsub ................ SKIPPED 97 | [INFO] flink-container .................................... SKIPPED 98 | [INFO] flink-queryable-state-runtime ...................... SKIPPED 99 | [INFO] flink-end-to-end-tests ............................. SKIPPED 100 | [INFO] flink-cli-test ..................................... SKIPPED 101 | [INFO] flink-parent-child-classloading-test-program ....... SKIPPED 102 | [INFO] flink-parent-child-classloading-test-lib-package ... SKIPPED 103 | [INFO] flink-dataset-allround-test ........................ SKIPPED 104 | [INFO] flink-datastream-allround-test ..................... SKIPPED 105 | [INFO] flink-stream-sql-test .............................. SKIPPED 106 | [INFO] flink-bucketing-sink-test .......................... SKIPPED 107 | [INFO] flink-distributed-cache-via-blob ................... 
SKIPPED 108 | [INFO] flink-high-parallelism-iterations-test ............. SKIPPED 109 | [INFO] flink-stream-stateful-job-upgrade-test ............. SKIPPED 110 | [INFO] flink-queryable-state-test ......................... SKIPPED 111 | [INFO] flink-local-recovery-and-allocation-test ........... SKIPPED 112 | [INFO] flink-elasticsearch2-test .......................... SKIPPED 113 | [INFO] flink-elasticsearch5-test .......................... SKIPPED 114 | [INFO] flink-elasticsearch6-test .......................... SKIPPED 115 | [INFO] flink-quickstart ................................... SKIPPED 116 | [INFO] flink-quickstart-java .............................. SKIPPED 117 | [INFO] flink-quickstart-scala ............................. SKIPPED 118 | [INFO] flink-quickstart-test .............................. SKIPPED 119 | [INFO] flink-confluent-schema-registry .................... SKIPPED 120 | [INFO] flink-stream-state-ttl-test ........................ SKIPPED 121 | [INFO] flink-sql-client-test .............................. SKIPPED 122 | [INFO] flink-streaming-file-sink-test ..................... SKIPPED 123 | [INFO] flink-state-evolution-test ......................... SKIPPED 124 | [INFO] flink-e2e-test-utils ............................... SKIPPED 125 | [INFO] flink-mesos ........................................ SKIPPED 126 | [INFO] flink-yarn ......................................... SKIPPED 127 | [INFO] flink-gelly ........................................ SKIPPED 128 | [INFO] flink-gelly-scala .................................. SKIPPED 129 | [INFO] flink-gelly-examples ............................... SKIPPED 130 | [INFO] flink-metrics-dropwizard ........................... SKIPPED 131 | [INFO] flink-metrics-graphite ............................. SKIPPED 132 | [INFO] flink-metrics-influxdb ............................. SKIPPED 133 | [INFO] flink-metrics-prometheus ........................... SKIPPED 134 | [INFO] flink-metrics-statsd ............................... SKIPPED 135 | [INFO] flink-metrics-datadog .............................. SKIPPED 136 | [INFO] flink-metrics-slf4j ................................ SKIPPED 137 | [INFO] flink-cep-scala .................................... SKIPPED 138 | [INFO] flink-table-uber ................................... SKIPPED 139 | [INFO] flink-sql-client ................................... SKIPPED 140 | [INFO] flink-python ....................................... SKIPPED 141 | [INFO] flink-scala-shell .................................. SKIPPED 142 | [INFO] flink-dist ......................................... SKIPPED 143 | [INFO] flink-end-to-end-tests-common ...................... SKIPPED 144 | [INFO] flink-metrics-availability-test .................... SKIPPED 145 | [INFO] flink-metrics-reporter-prometheus-test ............. SKIPPED 146 | [INFO] flink-heavy-deployment-stress-test ................. SKIPPED 147 | [INFO] flink-connector-gcp-pubsub-emulator-tests .......... SKIPPED 148 | [INFO] flink-streaming-kafka-test-base .................... SKIPPED 149 | [INFO] flink-streaming-kafka-test ......................... SKIPPED 150 | [INFO] flink-streaming-kafka011-test ...................... SKIPPED 151 | [INFO] flink-streaming-kafka010-test ...................... SKIPPED 152 | [INFO] flink-plugins-test ................................. SKIPPED 153 | [INFO] flink-state-processor-api .......................... SKIPPED 154 | [INFO] flink-table-runtime-blink .......................... SKIPPED 155 | [INFO] flink-table-planner-blink .......................... 
SKIPPED 156 | [INFO] flink-contrib ...................................... SKIPPED 157 | [INFO] flink-connector-wikiedits .......................... SKIPPED 158 | [INFO] flink-yarn-tests ................................... SKIPPED 159 | [INFO] flink-fs-tests ..................................... SKIPPED 160 | [INFO] flink-docs ......................................... SKIPPED 161 | [INFO] flink-ml-parent .................................... SKIPPED 162 | [INFO] flink-ml-api ....................................... SKIPPED 163 | [INFO] flink-ml-lib ....................................... SKIPPED 164 | [INFO] ------------------------------------------------------------------------ 165 | [INFO] BUILD FAILURE 166 | [INFO] ------------------------------------------------------------------------ 167 | [INFO] Total time: 11:58 min 168 | [INFO] Finished at: 2019-07-24T16:37:45+08:00 169 | [INFO] ------------------------------------------------------------------------ 170 | [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project flink-table-api-java: Compilation failure 171 | [ERROR] /E:/devlop/sourcespace/flink/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/OperationTreeBuilder.java:[560,85] 未报告的异常错误X; 必须对其进行捕获或声明以便抛出 172 | [ERROR] 173 | [ERROR] -> [Help 1] 174 | [ERROR] 175 | [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 176 | [ERROR] Re-run Maven using the -X switch to enable full debug logging. 177 | [ERROR] 178 | [ERROR] For more information about the errors and possible solutions, please read the following articles: 179 | [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException 180 | [ERROR] 181 | [ERROR] After correcting the problems, you can resume the build with the command 182 | [ERROR] mvn -rf :flink-table-api-java 183 | ``` 184 | 185 | 这个问题 `未报告的异常错误X; 必须对其进行捕获或声明以便抛出` 问题卡了我好久,查看源码 186 | ``` 187 | private CalculatedQueryOperation unwrapFromAlias(CallExpression call) { 188 | List children = call.getChildren(); 189 | List aliases = children.subList(1, children.size()) 190 | .stream() 191 | .map(alias -> ExpressionUtils.extractValue(alias, String.class) 192 | .orElseThrow(() -> new ValidationException("Unexpected alias: " + alias))) <= 这里是异常提示 193 | .collect(toList()); 194 | 195 | if (!isFunctionOfKind(children.get(0), TABLE)) { 196 | throw fail(); 197 | } 198 | 199 | CallExpression tableCall = (CallExpression) children.get(0); 200 | TableFunctionDefinition tableFunctionDefinition = 201 | (TableFunctionDefinition) tableCall.getFunctionDefinition(); 202 | return createFunctionCall(tableFunctionDefinition, aliases, tableCall.getResolvedChildren()); 203 | } 204 | ``` 205 | 再看一下`ValidationException`的代码 206 | ``` 207 | @PublicEvolving 208 | public class ValidationException extends RuntimeException { 209 | 210 | public ValidationException(String message, Throwable cause) { 211 | super(message, cause); 212 | } 213 | 214 | public ValidationException(String message) { 215 | super(message); 216 | } 217 | } 218 | 219 | ``` 220 | 似乎也没啥问题,然后翻了半天,终于在stackoverflow上找到问题所在了 221 | `https://stackoverflow.com/questions/25523375/java8-lambdas-and-exceptions` 222 | 可以在前面加上异常类型 `.orElseThrow(() -> new ValidationException("Unexpected alias: " + alias)))` 还有几个文件,也要修改,这个问题也可以通过更换JDK来规避。 223 | 224 | 225 | 当时使用JDK 226 | ``` 227 | E:\devlop\envs\Java8x64bak\bin>java -version 228 | java version "1.8.0_60" 229 | Java(TM) SE Runtime Environment (build 
1.8.0_60-b27) 230 | Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode) 231 | ``` 232 | 更换JDK 233 | ``` 234 | E:\devlop\envs\Java8x64\bin>java -version 235 | java version "1.8.0_131" 236 | Java(TM) SE Runtime Environment (build 1.8.0_131-b11) 237 | Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode) 238 | ``` 239 | 240 | 编译成功 241 | ``` 242 | [INFO] Reactor Summary for flink 1.9-SNAPSHOT: 243 | [INFO] 244 | [INFO] force-shading ...................................... SUCCESS [ 3.341 s] 245 | [INFO] flink .............................................. SUCCESS [ 3.686 s] 246 | [INFO] flink-annotations .................................. SUCCESS [ 1.474 s] 247 | [INFO] flink-shaded-curator ............................... SUCCESS [ 1.275 s] 248 | [INFO] flink-metrics ...................................... SUCCESS [ 0.100 s] 249 | [INFO] flink-metrics-core ................................. SUCCESS [ 0.959 s] 250 | [INFO] flink-test-utils-parent ............................ SUCCESS [ 0.094 s] 251 | [INFO] flink-test-utils-junit ............................. SUCCESS [ 0.963 s] 252 | [INFO] flink-core ......................................... SUCCESS [ 20.784 s] 253 | [INFO] flink-java ......................................... SUCCESS [ 7.953 s] 254 | [INFO] flink-queryable-state .............................. SUCCESS [ 0.084 s] 255 | [INFO] flink-queryable-state-client-java .................. SUCCESS [ 1.925 s] 256 | [INFO] flink-filesystems .................................. SUCCESS [ 0.094 s] 257 | [INFO] flink-hadoop-fs .................................... SUCCESS [ 3.108 s] 258 | [INFO] flink-runtime ...................................... SUCCESS [ 52.749 s] 259 | [INFO] flink-scala ........................................ SUCCESS [ 40.804 s] 260 | [INFO] flink-mapr-fs ...................................... SUCCESS [ 2.281 s] 261 | [INFO] flink-filesystems :: flink-fs-hadoop-shaded ........ SUCCESS [ 3.865 s] 262 | [INFO] flink-s3-fs-base ................................... SUCCESS [ 7.667 s] 263 | [INFO] flink-s3-fs-hadoop ................................. SUCCESS [ 11.142 s] 264 | [INFO] flink-s3-fs-presto ................................. SUCCESS [ 14.022 s] 265 | [INFO] flink-swift-fs-hadoop .............................. SUCCESS [ 13.379 s] 266 | [INFO] flink-oss-fs-hadoop ................................ SUCCESS [ 7.149 s] 267 | [INFO] flink-azure-fs-hadoop .............................. SUCCESS [ 8.124 s] 268 | [INFO] flink-optimizer .................................... SUCCESS [ 3.841 s] 269 | [INFO] flink-clients ...................................... SUCCESS [ 3.081 s] 270 | [INFO] flink-streaming-java ............................... SUCCESS [ 13.254 s] 271 | [INFO] flink-test-utils ................................... SUCCESS [ 4.429 s] 272 | [INFO] flink-runtime-web .................................. SUCCESS [03:56 min] 273 | [INFO] flink-examples ..................................... SUCCESS [ 0.195 s] 274 | [INFO] flink-examples-batch ............................... SUCCESS [01:27 min] 275 | [INFO] flink-connectors ................................... SUCCESS [ 0.156 s] 276 | [INFO] flink-hadoop-compatibility ......................... SUCCESS [ 7.404 s] 277 | [INFO] flink-state-backends ............................... SUCCESS [ 0.103 s] 278 | [INFO] flink-statebackend-rocksdb ......................... SUCCESS [ 4.041 s] 279 | [INFO] flink-tests ........................................ 
SUCCESS [ 57.677 s] 280 | [INFO] flink-streaming-scala .............................. SUCCESS [ 39.897 s] 281 | [INFO] flink-table ........................................ SUCCESS [ 0.093 s] 282 | [INFO] flink-table-common ................................. SUCCESS [ 3.252 s] 283 | [INFO] flink-table-api-java ............................... SUCCESS [ 3.382 s] 284 | [INFO] flink-table-api-java-bridge ........................ SUCCESS [ 1.691 s] 285 | [INFO] flink-table-api-scala .............................. SUCCESS [ 5.564 s] 286 | [INFO] flink-table-api-scala-bridge ....................... SUCCESS [ 6.084 s] 287 | [INFO] flink-sql-parser ................................... SUCCESS [01:45 min] 288 | [INFO] flink-libraries .................................... SUCCESS [ 0.071 s] 289 | [INFO] flink-cep .......................................... SUCCESS [ 7.880 s] 290 | [INFO] flink-table-planner ................................ SUCCESS [02:02 min] 291 | [INFO] flink-orc .......................................... SUCCESS [ 2.537 s] 292 | [INFO] flink-jdbc ......................................... SUCCESS [ 2.255 s] 293 | [INFO] flink-hbase ........................................ SUCCESS [ 7.450 s] 294 | [INFO] flink-hcatalog ..................................... SUCCESS [ 5.875 s] 295 | [INFO] flink-metrics-jmx .................................. SUCCESS [ 1.468 s] 296 | [INFO] flink-connector-kafka-base ......................... SUCCESS [ 6.826 s] 297 | [INFO] flink-connector-kafka-0.9 .......................... SUCCESS [ 5.396 s] 298 | [INFO] flink-connector-kafka-0.10 ......................... SUCCESS [ 3.076 s] 299 | [INFO] flink-connector-kafka-0.11 ......................... SUCCESS [ 3.337 s] 300 | [INFO] flink-formats ...................................... SUCCESS [ 0.070 s] 301 | [INFO] flink-json ......................................... SUCCESS [ 1.535 s] 302 | [INFO] flink-connector-elasticsearch-base ................. SUCCESS [ 4.051 s] 303 | [INFO] flink-connector-elasticsearch2 ..................... SUCCESS [ 10.091 s] 304 | [INFO] flink-connector-elasticsearch5 ..................... SUCCESS [ 11.304 s] 305 | [INFO] flink-connector-elasticsearch6 ..................... SUCCESS [ 5.441 s] 306 | [INFO] flink-connector-hive ............................... SUCCESS [ 10.140 s] 307 | [INFO] flink-connector-rabbitmq ........................... SUCCESS [ 1.770 s] 308 | [INFO] flink-connector-twitter ............................ SUCCESS [ 2.210 s] 309 | [INFO] flink-connector-nifi ............................... SUCCESS [ 1.993 s] 310 | [INFO] flink-connector-cassandra .......................... SUCCESS [ 4.067 s] 311 | [INFO] flink-avro ......................................... SUCCESS [ 6.819 s] 312 | [INFO] flink-connector-filesystem ......................... SUCCESS [ 3.599 s] 313 | [INFO] flink-connector-kafka .............................. SUCCESS [ 3.106 s] 314 | [INFO] flink-connector-gcp-pubsub ......................... SUCCESS [ 6.798 s] 315 | [INFO] flink-sql-connector-elasticsearch6 ................. SUCCESS [ 5.708 s] 316 | [INFO] flink-sql-connector-kafka-0.9 ...................... SUCCESS [ 0.579 s] 317 | [INFO] flink-sql-connector-kafka-0.10 ..................... SUCCESS [ 0.665 s] 318 | [INFO] flink-sql-connector-kafka-0.11 ..................... SUCCESS [ 0.748 s] 319 | [INFO] flink-sql-connector-kafka .......................... SUCCESS [ 1.050 s] 320 | [INFO] flink-connector-kafka-0.8 .......................... 
SUCCESS [ 2.633 s] 321 | [INFO] flink-avro-confluent-registry ...................... SUCCESS [ 1.856 s] 322 | [INFO] flink-parquet ...................................... SUCCESS [ 2.886 s] 323 | [INFO] flink-sequence-file ................................ SUCCESS [ 1.368 s] 324 | [INFO] flink-csv .......................................... SUCCESS [ 1.404 s] 325 | [INFO] flink-examples-streaming ........................... SUCCESS [ 14.729 s] 326 | [INFO] flink-examples-table ............................... SUCCESS [ 8.828 s] 327 | [INFO] flink-examples-build-helper ........................ SUCCESS [ 0.189 s] 328 | [INFO] flink-examples-streaming-twitter ................... SUCCESS [ 0.826 s] 329 | [INFO] flink-examples-streaming-state-machine ............. SUCCESS [ 0.696 s] 330 | [INFO] flink-examples-streaming-gcp-pubsub ................ SUCCESS [ 4.980 s] 331 | [INFO] flink-container .................................... SUCCESS [ 2.574 s] 332 | [INFO] flink-queryable-state-runtime ...................... SUCCESS [ 4.981 s] 333 | [INFO] flink-end-to-end-tests ............................. SUCCESS [ 0.078 s] 334 | [INFO] flink-cli-test ..................................... SUCCESS [ 0.933 s] 335 | [INFO] flink-parent-child-classloading-test-program ....... SUCCESS [ 1.070 s] 336 | [INFO] flink-parent-child-classloading-test-lib-package ... SUCCESS [ 0.519 s] 337 | [INFO] flink-dataset-allround-test ........................ SUCCESS [ 0.734 s] 338 | [INFO] flink-datastream-allround-test ..................... SUCCESS [ 2.613 s] 339 | [INFO] flink-stream-sql-test .............................. SUCCESS [ 1.742 s] 340 | [INFO] flink-bucketing-sink-test .......................... SUCCESS [ 1.580 s] 341 | [INFO] flink-distributed-cache-via-blob ................... SUCCESS [ 0.880 s] 342 | [INFO] flink-high-parallelism-iterations-test ............. SUCCESS [ 7.606 s] 343 | [INFO] flink-stream-stateful-job-upgrade-test ............. SUCCESS [ 1.518 s] 344 | [INFO] flink-queryable-state-test ......................... SUCCESS [ 2.314 s] 345 | [INFO] flink-local-recovery-and-allocation-test ........... SUCCESS [ 0.966 s] 346 | [INFO] flink-elasticsearch2-test .......................... SUCCESS [ 4.529 s] 347 | [INFO] flink-elasticsearch5-test .......................... SUCCESS [ 5.285 s] 348 | [INFO] flink-elasticsearch6-test .......................... SUCCESS [ 3.856 s] 349 | [INFO] flink-quickstart ................................... SUCCESS [ 1.481 s] 350 | [INFO] flink-quickstart-java .............................. SUCCESS [ 4.658 s] 351 | [INFO] flink-quickstart-scala ............................. SUCCESS [ 0.414 s] 352 | [INFO] flink-quickstart-test .............................. SUCCESS [ 1.497 s] 353 | [INFO] flink-confluent-schema-registry .................... SUCCESS [ 2.361 s] 354 | [INFO] flink-stream-state-ttl-test ........................ SUCCESS [ 3.930 s] 355 | [INFO] flink-sql-client-test .............................. SUCCESS [ 3.859 s] 356 | [INFO] flink-streaming-file-sink-test ..................... SUCCESS [ 1.164 s] 357 | [INFO] flink-state-evolution-test ......................... SUCCESS [ 1.532 s] 358 | [INFO] flink-e2e-test-utils ............................... SUCCESS [ 6.745 s] 359 | [INFO] flink-mesos ........................................ SUCCESS [ 18.941 s] 360 | [INFO] flink-yarn ......................................... SUCCESS [ 3.017 s] 361 | [INFO] flink-gelly ........................................ 
SUCCESS [ 5.259 s] 362 | [INFO] flink-gelly-scala .................................. SUCCESS [ 13.110 s] 363 | [INFO] flink-gelly-examples ............................... SUCCESS [ 11.624 s] 364 | [INFO] flink-metrics-dropwizard ........................... SUCCESS [ 1.044 s] 365 | [INFO] flink-metrics-graphite ............................. SUCCESS [ 0.570 s] 366 | [INFO] flink-metrics-influxdb ............................. SUCCESS [ 2.176 s] 367 | [INFO] flink-metrics-prometheus ........................... SUCCESS [ 1.361 s] 368 | [INFO] flink-metrics-statsd ............................... SUCCESS [ 0.956 s] 369 | [INFO] flink-metrics-datadog .............................. SUCCESS [ 0.711 s] 370 | [INFO] flink-metrics-slf4j ................................ SUCCESS [ 0.917 s] 371 | [INFO] flink-cep-scala .................................... SUCCESS [ 9.729 s] 372 | [INFO] flink-table-uber ................................... SUCCESS [ 2.603 s] 373 | [INFO] flink-sql-client ................................... SUCCESS [ 7.800 s] 374 | [INFO] flink-python ....................................... SUCCESS [ 2.724 s] 375 | [INFO] flink-scala-shell .................................. SUCCESS [ 10.762 s] 376 | [INFO] flink-dist ......................................... SUCCESS [ 34.086 s] 377 | [INFO] flink-end-to-end-tests-common ...................... SUCCESS [ 1.229 s] 378 | [INFO] flink-metrics-availability-test .................... SUCCESS [ 0.946 s] 379 | [INFO] flink-metrics-reporter-prometheus-test ............. SUCCESS [ 0.798 s] 380 | [INFO] flink-heavy-deployment-stress-test ................. SUCCESS [ 7.118 s] 381 | [INFO] flink-connector-gcp-pubsub-emulator-tests .......... SUCCESS [ 3.777 s] 382 | [INFO] flink-streaming-kafka-test-base .................... SUCCESS [ 1.260 s] 383 | [INFO] flink-streaming-kafka-test ......................... SUCCESS [ 6.750 s] 384 | [INFO] flink-streaming-kafka011-test ...................... SUCCESS [ 6.230 s] 385 | [INFO] flink-streaming-kafka010-test ...................... SUCCESS [ 8.173 s] 386 | [INFO] flink-plugins-test ................................. SUCCESS [ 0.799 s] 387 | [INFO] flink-state-processor-api .......................... SUCCESS [ 3.276 s] 388 | [INFO] flink-table-runtime-blink .......................... SUCCESS [ 7.159 s] 389 | [INFO] flink-table-planner-blink .......................... SUCCESS [02:26 min] 390 | [INFO] flink-contrib ...................................... SUCCESS [ 0.070 s] 391 | [INFO] flink-connector-wikiedits .......................... SUCCESS [ 1.790 s] 392 | [INFO] flink-yarn-tests ................................... SUCCESS [04:22 min] 393 | [INFO] flink-fs-tests ..................................... SUCCESS [ 1.905 s] 394 | [INFO] flink-docs ......................................... SUCCESS [ 2.258 s] 395 | [INFO] flink-ml-parent .................................... SUCCESS [ 0.066 s] 396 | [INFO] flink-ml-api ....................................... SUCCESS [ 1.020 s] 397 | [INFO] flink-ml-lib ....................................... 
SUCCESS [ 0.797 s] 398 | [INFO] ------------------------------------------------------------------------ 399 | [INFO] BUILD SUCCESS 400 | [INFO] ------------------------------------------------------------------------ 401 | [INFO] Total time: 29:11 min 402 | [INFO] Finished at: 2019-07-24T16:03:03+08:00 403 | [INFO] ------------------------------------------------------------------------ 404 | ``` 405 | 406 | 去dist里启动玩耍了。 407 | 408 | 分享一下我的 `settings.xml` 409 | ``` 410 | 411 | 412 | 428 | 429 | 453 | 456 | 462 | 470 | 471 | 478 | 479 | 484 | 485 | 489 | 490 | 491 | 496 | 497 | 511 | 512 | 513 | 517 | 518 | 531 | 532 | 539 | 540 | 541 | 552 | 553 | 558 | 559 | nexus-aliyun 560 | Nexus aliyun 561 | *,!jeecg,!jeecg-snapshots,!mapr-releases,!cloudera,!cdh,!confluent 562 | http://maven.aliyun.com/nexus/content/groups/public 563 | 564 | 565 | mapr-public 566 | mapr-releases 567 | mapr-releases,*,!confluent 568 | https://maven.aliyun.com/repository/mapr-public 569 | 570 | 571 | cloudera 572 | cloudera 573 | https://repository.cloudera.com/artifactory/cloudera-repos 574 | *,!mapr-releases,!confluent 575 | 576 | 577 | 578 | 599 | 600 | 627 | 628 | 660 | 677 | 678 | 679 | 680 | 681 | 689 | 690 | 691 | ``` 692 | 693 | 694 | 695 | 696 | 697 | -------------------------------------------------------------------------------- /function.md: -------------------------------------------------------------------------------- 1 | # 直播改BUG 2 | 3 | ## 修复内联查询 4 | 5 | 在[上期文章](https://github.com/dafei1288/CalciteDocTrans/blob/master/helloworld.md)撰写的时候,我还认为只完成了单表查询,但经过几天的研究发现,上次那寥寥几十行代码,其实已经可以完成了表联接,过滤等功能了,只是由于当时粗心写错了一些东西,造成过滤失效了,下面我来剖析一下问题。 6 | 7 | 首先在`Storage.java`这个文件里 8 | ``` 9 | public class Storage { 10 | public static final String SCHEMA_NAME = "bookshop"; 11 | public static final String TABLE_AUTHOR = "AUTHOR"; 12 | public static final String TABLE_BOOK = "BOOK"; 13 | 14 | // public static List tables = new ArrayList<>(); 15 | public static Hashtable _bag = new Hashtable<>(); 16 | static{ 17 | DummyTable author = new DummyTable(TABLE_AUTHOR); 18 | DummyColumn id = new DummyColumn("ID","String"); 19 | DummyColumn name = new DummyColumn("NAME","String"); 20 | DummyColumn age = new DummyColumn("AGE","String"); 21 | DummyColumn aid = new DummyColumn("AID","String"); 22 | DummyColumn type = new DummyColumn("TYPE","String"); 23 | author.addColumn(id).addColumn(name).addColumn(age); 24 | author.addRow("1","jacky","33"); 25 | author.addRow("2","wang","23"); 26 | author.addRow("3","dd","32"); 27 | author.addRow("4","ma","42"); 28 | // tables.add(author); 29 | _bag.put(TABLE_AUTHOR,author); 30 | 31 | DummyTable book = new DummyTable(TABLE_BOOK); 32 | book.addColumn(id).addColumn(name).addColumn(aid).addColumn(type); 33 | book.addRow("1","1","数据山","java"); 34 | book.addRow("2","2","大关","sql"); 35 | book.addRow("3","1","lili","sql"); 36 | book.addRow("4","3","ten","c#"); 37 | // tables.add(book); 38 | _bag.put(TABLE_BOOK,book); 39 | } 40 | ...... 41 | } 42 | ``` 43 | 只截取了部分片段, 在`book.addColumn(id).addColumn(name).addColumn(aid).addColumn(type);`这里,我把`name`列和`aid`列写颠倒了。更正过来,就可以进行正确的连接查询了。 44 | 45 | ## 过滤条件 46 | 47 | 在可以进行内联查询以后,我一直在对不能做过滤这点存在质疑,从关系代数的角度分析,应该是先做笛卡尔积,然后再过滤数据,那么就应该可以对数据进行过滤了,那么问题出在哪呢? 48 | 49 | 于是抱着试试看的心里,构建了一条查询`select * from "BOOK" as b where b.name = 数据山`,看看效果,果不其然,喜闻乐见... 
50 | 51 | ``` 52 | java.sql.SQLException: Error while executing SQL "select * from "BOOK" as b where b.name = 数据山": From line 1, column 42 to line 1, column 44: Column '数据山' not found in any table 53 | at org.apache.calcite.avatica.Helper.createException(Helper.java:56) 54 | at org.apache.calcite.avatica.Helper.createException(Helper.java:41) 55 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163) 56 | at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) 57 | at com.dafei1288.calcite.TestJDBC.main(TestJDBC.java:81) 58 | Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, column 42 to line 1, column 44: Column '数据山' not found in any table 59 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 60 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 61 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 62 | at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 63 | at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) 64 | at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:783) 65 | at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:768) 66 | at org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4759) 67 | at org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:259) 68 | at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5619) 69 | at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5601) 70 | at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:334) 71 | at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:134) 72 | at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:101) 73 | at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:859) 74 | at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visitScoped(SqlValidatorImpl.java:5654) 75 | at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:50) 76 | at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:33) 77 | at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138) 78 | at org.apache.calcite.sql.validate.SqlValidatorImpl.expand(SqlValidatorImpl.java:5208) 79 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateWhereClause(SqlValidatorImpl.java:3948) 80 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3276) 81 | at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) 82 | at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) 83 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:967) 84 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:943) 85 | at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:225) 86 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:918) 87 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:628) 88 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:552) 89 | at 
org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:264) 90 | at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:230) 91 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:772) 92 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:636) 93 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:606) 94 | at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:229) 95 | at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550) 96 | at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) 97 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) 98 | ... 2 more 99 | Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Column '数据山' not found in any table 100 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 101 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 102 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 103 | at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 104 | at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) 105 | at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572) 106 | ... 36 more 107 | ``` 108 | 109 | 看到报错一脸懵逼,`From line 1, column 42 to line 1, column 44: Column '数据山' not found in any table`这是把过滤条件当列处理了?在百思不得其解的时候,突然想起了mysql的引号问题,于是把sql修正为`select * from "BOOK" as b where b.name = '数据山'`,于是喜闻乐见变了一下 110 | 111 | ``` 112 | java.sql.SQLException: Error while executing SQL "select * from "BOOK" as b where b.name = '数据山'": while converting `B`.`NAME` = '数据山' 113 | at org.apache.calcite.avatica.Helper.createException(Helper.java:56) 114 | at org.apache.calcite.avatica.Helper.createException(Helper.java:41) 115 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163) 116 | at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) 117 | at com.dafei1288.calcite.TestJDBC.main(TestJDBC.java:81) 118 | Caused by: java.lang.RuntimeException: while converting `B`.`NAME` = '数据山' 119 | at org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:86) 120 | at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63) 121 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4670) 122 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3977) 123 | at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138) 124 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4541) 125 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertWhere(SqlToRelConverter.java:965) 126 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:643) 127 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:621) 128 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3051) 129 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:557) 130 | at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:264) 131 | at 
org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:230) 132 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:772) 133 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:636) 134 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:606) 135 | at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:229) 136 | at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550) 137 | at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) 138 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) 139 | ... 2 more 140 | Caused by: java.lang.reflect.InvocationTargetException 141 | at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 142 | at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 143 | at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 144 | at java.lang.reflect.Method.invoke(Method.java:497) 145 | at org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:83) 146 | ... 21 more 147 | Caused by: org.apache.calcite.runtime.CalciteException: Failed to encode '数据山' in character set 'ISO-8859-1' 148 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 149 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 150 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 151 | at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 152 | at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) 153 | at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572) 154 | at org.apache.calcite.util.NlsString.(NlsString.java:81) 155 | at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:878) 156 | at org.apache.calcite.rex.RexBuilder.makeCharLiteral(RexBuilder.java:1093) 157 | at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertLiteral(SqlNodeToRexConverterImpl.java:118) 158 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4659) 159 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3977) 160 | at org.apache.calcite.sql.SqlLiteral.accept(SqlLiteral.java:532) 161 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4541) 162 | at org.apache.calcite.sql2rel.StandardConvertletTable.convertExpressionList(StandardConvertletTable.java:767) 163 | at org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:743) 164 | at org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:727) 165 | ... 
26 more 166 | ``` 167 | 对于开发者来说,有变化总是好事,说明可能找到问题点了,再仔细看看日志,果然,编码问题,搞起。于是在各种检索下,得到了答案 168 | 169 | ``` 170 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 171 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 172 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 173 | ``` 174 | 175 | 在获取连接之前,将入上述环境变量,就好了,代码片段如下: 176 | 177 | ``` 178 | public static void main(String[] args) { 179 | try { 180 | Class.forName("org.apache.calcite.jdbc.Driver"); 181 | } catch (ClassNotFoundException e1) { 182 | e1.printStackTrace(); 183 | } 184 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 185 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 186 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 187 | 188 | Properties info = new Properties(); 189 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\bookshop.json"; 190 | 191 | Connection connection = 192 | DriverManager.getConnection("jdbc:calcite:model="+jsonmodle, info); 193 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 194 | ...... 195 | } 196 | ``` 197 | 198 | # 数据类型处理 199 | 200 | 在写过滤的时候,其实就发现了,没有正确的数据类型,是走不远的,所以有必要把数据类型对应好,既然是模拟数据库,那么数据基本类型,还是使用SQL习惯的类型,而这里对java还是需要有一个映射关系。 201 | 202 | ``` 203 | package com.dafei1288.calcite.storage; 204 | 205 | import com.google.common.collect.HashBasedTable; 206 | import com.google.common.collect.Table; 207 | import org.apache.calcite.sql.type.SqlTypeName; 208 | 209 | import java.math.BigDecimal; 210 | import java.sql.Date; 211 | import java.util.Set; 212 | 213 | /** 214 | * 这里使用了GUAVA的table 作为存SQL和JAVA数据类型的数据结构 215 | * 这并不是一个好的设计,而是为了让大家更容易理解而做的设计 216 | */ 217 | public class DataTypeMapping { 218 | 219 | public static Table TYPEMAPPING= HashBasedTable.create(); 220 | public static final String CHAR = "char"; 221 | public static final String VARCHAR = "varchar"; 222 | public static final String BOOLEAN = "boolean"; 223 | public static final String DATE = "date"; 224 | public static final String INTEGER = "integer"; 225 | public static final String TINYINT = "tinyint"; 226 | public static final String SMALLINT = "smallint"; 227 | public static final String BIGINT = "bigint"; 228 | public static final String DECIMAL = "decimal"; 229 | public static final String NUMERIC = "numeric"; 230 | public static final String FLOAT = "float"; 231 | public static final String REAL = "real"; 232 | public static final String DOUBLE = "double"; 233 | public static final String TIME = "time"; 234 | public static final String TIMESTAMP = "timestamp"; 235 | public static final String ANY = "any"; 236 | static{ 237 | TYPEMAPPING.put(DataTypeMapping.CHAR,SqlTypeName.CHAR,Character.class); 238 | TYPEMAPPING.put(DataTypeMapping.VARCHAR,SqlTypeName.VARCHAR,String.class); 239 | TYPEMAPPING.put(DataTypeMapping.BOOLEAN,SqlTypeName.BOOLEAN,Boolean.class); 240 | TYPEMAPPING.put(DataTypeMapping.DATE,SqlTypeName.DATE,Date.class); 241 | TYPEMAPPING.put(DataTypeMapping.INTEGER,SqlTypeName.INTEGER,Integer.class); 242 | TYPEMAPPING.put(DataTypeMapping.TINYINT, SqlTypeName.TINYINT,Integer.class); 243 | TYPEMAPPING.put(DataTypeMapping.SMALLINT, SqlTypeName.SMALLINT,Integer.class); 244 | TYPEMAPPING.put(DataTypeMapping.BIGINT, SqlTypeName.BIGINT,Long.class); 245 | TYPEMAPPING.put(DataTypeMapping.DECIMAL, 
SqlTypeName.DECIMAL, BigDecimal.class); 246 | TYPEMAPPING.put(DataTypeMapping.NUMERIC, SqlTypeName.DECIMAL,Long.class); 247 | TYPEMAPPING.put(DataTypeMapping.FLOAT, SqlTypeName.FLOAT,Float.class); 248 | TYPEMAPPING.put(DataTypeMapping.REAL, SqlTypeName.REAL,Double.class); 249 | TYPEMAPPING.put(DataTypeMapping.DOUBLE, SqlTypeName.DOUBLE,Double.class); 250 | TYPEMAPPING.put(DataTypeMapping.TIME, SqlTypeName.TIME, Date.class); 251 | TYPEMAPPING.put(DataTypeMapping.TIMESTAMP, SqlTypeName.TIMESTAMP,Long.class); 252 | TYPEMAPPING.put(DataTypeMapping.ANY, SqlTypeName.ANY,String.class); 253 | } 254 | /** 255 | * 根据名字获取,对应的java类型 256 | * */ 257 | public static Class getJavaClassByName(String name){ 258 | Set> table = TYPEMAPPING.cellSet(); 259 | for(Table.Cell it:table){ 260 | if(it.getRowKey().equals(name)){ 261 | return it.getValue(); 262 | } 263 | } 264 | return null; 265 | } 266 | public static SqlTypeName getSqlTypeByName(String name){ 267 | for(Table.Cell it:TYPEMAPPING.cellSet()){ 268 | if(it.getRowKey().equals(name)){ 269 | return it.getColumnKey(); 270 | } 271 | } 272 | return null; 273 | } 274 | } 275 | 276 | ``` 277 | 278 | 栗子中,我使用了`guava`的`table`结构,主要还是为了大家方便只管理解,`Table`,第一个泛型代表内部定义的数据类型字符串,第二个是SQL数据类型,第三个参数是对应的JAVA类型。由于`guava`的`table`是`xy-->z`模式,但实际上我们需要的是一个`x->yz`模式,所以在下面添加了两个辅助方法:`public static Class getJavaClassByName(String name)`通过拿到的数据类型的字符串,获得java的类型,`public static SqlTypeName getSqlTypeByName(String name)`这个方法则是拿到SQL的数据类型。 279 | 280 | 笔者一直对充血模型钟爱有加,所以在`Storage`里为`DummyColumn`添加了两个方法 281 | 282 | ``` 283 | public static class DummyColumn{ 284 | private String name; 285 | private String type; 286 | 287 | public DummyColumn(String name, String type) { 288 | this.name = name; 289 | this.type = type; 290 | } 291 | 292 | public String getName() { 293 | return name; 294 | } 295 | 296 | public String getType() { 297 | return type; 298 | } 299 | 300 | public void setName(String name) { 301 | this.name = name; 302 | } 303 | 304 | public void setType(String type) { 305 | this.type = type; 306 | } 307 | 308 | //我充血模型 309 | //获取JAVA类型 310 | public Class getJavaClass(){ 311 | return DataTypeMapping.getJavaClassByName(this.type); 312 | } 313 | 314 | public SqlTypeName getSqlTypeName(){ 315 | return DataTypeMapping.getSqlTypeByName(this.type); 316 | } 317 | } 318 | ``` 319 | 320 | 而在数据初始化的时候,我们可以使用正确的数据类型了 321 | 322 | ``` 323 | DummyColumn id = new DummyColumn("ID","integer"); 324 | DummyColumn name = new DummyColumn("NAME","varchar"); 325 | DummyColumn age = new DummyColumn("AGE","integer"); 326 | DummyColumn aid = new DummyColumn("AID","integer"); 327 | DummyColumn type = new DummyColumn("TYPE","varchar"); 328 | ``` 329 | 330 | 这样数据类型的准备工作就做好了,后面我们需要将数据类型正确的注册表里,所以在`InMemoryTable`里我们原来写死的`String.class`类型,需要根据实际数据进行设置了 331 | 332 | ``` 333 | @Override 334 | public RelDataType getRowType(RelDataTypeFactory typeFactory) { 335 | // System.out.println("RelDataType !!!!!!"); 336 | if(dataType == null) { 337 | RelDataTypeFactory.FieldInfoBuilder fieldInfo = typeFactory.builder(); 338 | for (Storage.DummyColumn column : this._table.getColumns()) { 339 | RelDataType sqlType = typeFactory.createJavaType(column.getJavaClass()); //这里使用了新增的方法,原来是String.class 340 | sqlType = SqlTypeUtil.addCharsetAndCollation(sqlType, typeFactory); 341 | fieldInfo.add(column.getName(), sqlType); 342 | } 343 | this.dataType = typeFactory.createStructType(fieldInfo); 344 | } 345 | return this.dataType; 346 | } 347 | ``` 348 | 349 | 到这里我们就搞定了数据类型。 350 | 351 | # UDF 352 | 353 | 
有了基础数据类型,对于我们后面做函数处理就方便多了,现在我们以2个简单的UDF为例,让我们继续领略`calcite`的魅力, 354 | 355 | 首先我们定义两个类,一个数学类,提供一个乘方的方法 356 | 357 | ``` 358 | package com.dafei1288.calcite.function; 359 | 360 | public class MathFunction { 361 | public int square(int i){ 362 | return i*i; 363 | } 364 | } 365 | ``` 366 | 367 | 另一个字符处理类,提供一个字符串连接方法,提供一个将参数转换成字符串的方法 368 | 369 | ``` 370 | package com.dafei1288.calcite.function; 371 | 372 | public class StringFunction { 373 | public String concat(Object o1,Object o2){ 374 | return "["+o1.toString()+" , "+o2.toString()+"] => "+this.toString(); 375 | } 376 | public String parseString(Object o){ 377 | return o.toString(); 378 | } 379 | } 380 | ``` 381 | 382 | 接下来,在`InMemorySchemaFactory`里,将函数注册到数据库上, 383 | 384 | ``` 385 | package com.dafei1288.calcite; 386 | 387 | import com.dafei1288.calcite.function.MathFunction; 388 | import com.dafei1288.calcite.function.StringFunction; 389 | import org.apache.calcite.schema.Schema; 390 | import org.apache.calcite.schema.SchemaFactory; 391 | import org.apache.calcite.schema.SchemaPlus; 392 | import org.apache.calcite.schema.impl.ScalarFunctionImpl; 393 | 394 | import java.util.Map; 395 | 396 | public class InMemorySchemaFactory implements SchemaFactory { 397 | @Override 398 | public Schema create(SchemaPlus parentSchema, String name, Map operand) { 399 | 400 | 401 | System.out.println("schema name ==> "+ name); 402 | System.out.println("operand ==> "+operand); 403 | 404 | parentSchema.add("SQUARE_FUNC",ScalarFunctionImpl.create(MathFunction.class,"square")); 405 | parentSchema.add("TOSTRING_FUNC",ScalarFunctionImpl.create(StringFunction.class,"parseString")); 406 | parentSchema.add("CONCAT_FUNC",ScalarFunctionImpl.create(StringFunction.class,"concat")); 407 | 408 | return new InMemorySchema(name,operand); 409 | } 410 | } 411 | ``` 412 | 413 | `SchemaPlus parentSchema`提供了一个`void add(String name, Function function)`方法,`name`为函数名,这里`calcite`提供了一个工具类`ScalarFunction`,它可以通过`create`方法,可以将你写好的函数类和其对应的方法反射出来。 414 | 415 | 接下来我们做一个测试: 416 | ``` 417 | result = st.executeQuery("select SQUARE_FUNC(b.id),CONCAT_FUNC(b.id,b.name) from \"BOOK\" as b"); 418 | while(result.next()) { 419 | System.out.println(result.getString(1) + "\t" +result.getString(2) + "\t" ); 420 | } 421 | ``` 422 | 423 | 结果 424 | 425 | ``` 426 | 1 [1 , 数据山] => com.dafei1288.calcite.function.StringFunction@22bb5646 427 | 4 [2 , 大关] => com.dafei1288.calcite.function.StringFunction@1be59f28 428 | 9 [3 , lili] => com.dafei1288.calcite.function.StringFunction@2ce45a7b 429 | 16 [4 , ten] => com.dafei1288.calcite.function.StringFunction@153d4abb 430 | ``` 431 | 432 | # 技术总结 433 | 434 | 这期新内容代码有点少,但是在实验成功以后,就克不住自己的兴奋了,一蹴而就写了这篇文章。 435 | 436 | 后续: 437 | 1. 希望能对聚合函数做一些尝试。 438 | 2. 函数的下推,这只是一个想法,目前实现的还是相当于UDF,那么实际上数据库层应该提供了很多函数的,那么在这里,是否可以透过`calcite`将函数交给`Storage`处理... 439 | 3. 
streaming sql 440 | 441 | 代码已更新:`https://github.com/dafei1288/CalciteHelloworld.git` 442 | 443 | 文档的翻译工作感觉暂时鸽了:) 444 | 445 | -------------------------------------------------------------------------------- /helloworld.md: -------------------------------------------------------------------------------- 1 | # 前言 2 | 3 | 说不定期更新,就不定期更新:)。 4 | 5 | 在翻译[关系代数](https://github.com/dafei1288/CalciteDocTrans/blob/master/algebra.md)这篇文档的时候,总有一种惴惴不安的感觉伴随着我,其实还是对之前[概览](https://github.com/dafei1288/CalciteDocTrans/blob/master/tutorial.md)的一知半解,而DEMO项目`Calcite-example-CSV`为了介绍特性,添加了太多代码进来,这虽然很好,因为当你执行代码的时候,就能看到所有特性,但是对于一个新手来讲却未必够友好,我也是这样的一个新手,看着文档里不知所云的概念和代码片段,经常会有挫败感。那不如我们就来实实在在的完成一个`Helloworld`来查询一个表(当然这个表示我们自己定义的格式)就这么简单。来体会一下`Calcite`的魅力吧。 6 | 7 | 这里我们的目标是: 8 | 9 | 1. 数据在一个自己可控的位置,本文写在一个Java文件的静态块里 10 | 1. 可以执行一个简单查询并返回数据 11 | 12 | # model.json 13 | 14 | 我习惯gradle,所以起手构建一个空白gradle项目,添加依赖: 15 | 16 | `compile group: 'org.apache.calcite', name: 'calcite-core', version: '1.17.0'` 17 | 18 | 在`resources`下构建一个`bookshop.json`: 19 | ``` 20 | { 21 | "version": "1.0", 22 | "defaultSchema": "bookshop", 23 | "schemas": [ 24 | { 25 | "type": "custom", 26 | "name": "bookshop", 27 | "factory": "com.dafei1288.calcite.InMemorySchemaFactory", 28 | "operand": { 29 | "p1": "hello", 30 | "p2": "world" 31 | } 32 | } 33 | ] 34 | } 35 | ``` 36 | 首先给库定义一个名字:`"defaultSchema": "bookshop"` 37 | 然后描述类型`"type": "custom"`,自定义类型,其他还包括`table`,`view`等 38 | 接下来`"factory": "com.dafei1288.calcite.InMemorySchemaFactory"`相当于定义我们程序的入口,如何加载一个`schema` 39 | 40 | 在构想初期只是想实现一个简单的bookshop数据库,后面在`Storage`介绍里,也会提到,我设计了2张表,`book`和`author`。 41 | 42 | # InMemorySchemaFactory 43 | 44 | 首先让我们来看一下代码: 45 | ``` 46 | public class InMemorySchemaFactory implements SchemaFactory { 47 | @Override 48 | public Schema create(SchemaPlus parentSchema, String name, Map operand) { 49 | 50 | 51 | System.out.println("schema name ==> "+ name); 52 | System.out.println("operand ==> "+operand); 53 | 54 | return new InMemorySchema(name,operand); 55 | } 56 | } 57 | ``` 58 | 因为在`bookshop.json`里定义了属性`"factory": "com.dafei1288.calcite.InMemorySchemaFactory"`,所以`InMemorySchemaFactory`被默认加载,该类需要继承`SchemaFactory`,重写`create`方法的时候,可以根据自己需要来构建逻辑,这里我们只打印了几个参数看一眼,就略过,实例化一个`InMemorySchema`。 59 | 60 | # InMemorySchema 61 | 我们还是先把代码贴上: 62 | ``` 63 | public class InMemorySchema extends AbstractSchema { 64 | private String dbName; 65 | private Map operand; 66 | 67 | public InMemorySchema(String name, Map operand) { 68 | this.operand = operand; 69 | this.dbName = dbName; 70 | System.out.println(""); 71 | System.out.println("in this class ==> "+ this); 72 | 73 | } 74 | @Override 75 | public Map getTableMap() { 76 | 77 | Map tables = new HashMap(); 78 | 79 | Storage.getTables().forEach(it->{ 80 | //System.out.println("it = "+it.getName()); 81 | tables.put(it.getName(),new InMemoryTable(it. 
getName(),it)); 82 | }); 83 | 84 | return tables; 85 | } 86 | } 87 | ``` 88 | `InMemorySchema`类也是相当简单的,首先继承`AbstractSchema`,实际上需要复写的`getTableMap`就是这个方法,它的职责就是要提供一个表名和表的映射表,为了实现这个,我们需要做一些处理,当然本例里是使用了一个`Storage`类,来模拟存储表结构信息,以及数据的,这里的表结构以及其他信息都不需要外接再提供额外辅助,如果是使用其他类型的,就可能需要根据自己的实际需求,扩展`operand`属性,来携带必要参数进来了。 89 | 90 | `Storage`直接提供了`getTables`方法,可以直接从里面获取到当前存在的表,这样直接将`Storage`内的表转化成`InMemoryTable`类就可以了。 91 | 92 | # InMemoryTable 93 | 还是先从代码入手: 94 | ``` 95 | public class InMemoryTable extends AbstractTable implements ScannableTable { 96 | private String name; 97 | private Storage.DummyTable _table; 98 | private RelDataType dataType; 99 | 100 | InMemoryTable(String name){ 101 | System.out.println("InMemoryTable !!!!!! "+name ); 102 | this.name = name; 103 | } 104 | 105 | public InMemoryTable(String name, Storage.DummyTable it) { 106 | this.name = name; 107 | this._table = it; 108 | } 109 | 110 | @Override 111 | public RelDataType getRowType(RelDataTypeFactory typeFactory) { 112 | // System.out.println("RelDataType !!!!!!"); 113 | if(dataType == null) { 114 | RelDataTypeFactory.FieldInfoBuilder fieldInfo = typeFactory.builder(); 115 | for (Storage.DummyColumn column : this._table.getColumns()) { 116 | RelDataType sqlType = typeFactory.createJavaType( 117 | String.class); 118 | sqlType = SqlTypeUtil.addCharsetAndCollation(sqlType, typeFactory); 119 | // System.out.println(column.getName()+" / "+sqlType); 120 | fieldInfo.add(column.getName(), sqlType); 121 | } 122 | this.dataType = typeFactory.createStructType(fieldInfo); 123 | } 124 | return this.dataType; 125 | } 126 | 127 | 128 | @Override 129 | public Enumerable scan(DataContext root) { 130 | System.out.println("scan ...... "); 131 | return new AbstractEnumerable() { 132 | public Enumerator enumerator() { 133 | return new Enumerator(){ 134 | private int cur = 0; 135 | @Override 136 | public Object[] current() { 137 | // System.out.println("cur = "+cur+" => "); 138 | // for (int i =0;i<_table.getData(cur).length;i++){ 139 | // System.out.println(_table.getData(cur)[i]); 140 | // } 141 | return _table.getData(cur++); 142 | } 143 | 144 | @Override 145 | public boolean moveNext() { 146 | // System.out.println("++cur < _table.getRowCount() = "+(cur+1 < _table.getRowCount())); 147 | return cur < _table.getRowCount() ; 148 | } 149 | 150 | @Override 151 | public void reset() { 152 | 153 | } 154 | 155 | @Override 156 | public void close() { 157 | 158 | } 159 | }; 160 | } 161 | }; 162 | } 163 | } 164 | ``` 165 | 这里我保留了很多难看的`System.out`,其实也是为了展示一下我走过的弯路,在这里面,遇到奇奇怪怪的坑,由于`Calcite`的结构原因,有时出错从日志上很难发现原因,或者说很难准确断定原因,当然也许是笔者水平所限的缘故。 166 | `InMemoryTable`需要继承`AbstractTable`实现`ScannableTable`的接口,在这里`Calcite`提供了几种`Table`接口,待日后分解。这个类里,我们主要需要处理的2个方法`public RelDataType getRowType(RelDataTypeFactory typeFactory)`和`public Enumerable scan(DataContext root)`. 
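在分别解读这两个方法之前,先从"消费端"补充一个示意:`scan`返回的`Enumerable`,最终会被`Calcite`按类似下面的约定来遍历(以下只是帮助理解的示意代码,并非`Calcite`源码,`table`、`root`均为假设的变量):

```
// 示意:Enumerable/Enumerator 的遍历约定,风格上很像 JDBC 的 ResultSet
Enumerable<Object[]> rows = table.scan(root);
Enumerator<Object[]> it = rows.enumerator();
try {
    while (it.moveNext()) {          // 先移动游标,返回是否还有下一行
        Object[] row = it.current(); // 再取当前行
        System.out.println(Arrays.toString(row));
    }
} finally {
    it.close();                      // 遍历结束后释放资源
}
```

也就是说,只要`moveNext`/`current`的语义正确,数据究竟来自内存、文件还是消息队列,对`Calcite`来说都是透明的。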
167 | 168 | `getRowType`用来处理列的类型的,不要被那几句代码所迷惑,为了顺利运行,并没有针对数据的类型做什么处理,而是简单粗暴了使用了String,有兴趣的话,可以根据自己的实际情况来注册,日后有机会会详细介绍这部分。 169 | `scan`这个方法相对复杂一点,提供了全表扫面的功能,这里主要需要高速引擎,如何遍历及获取数据。其结构还是比较复杂得,为了减少本例中类的个数,避免复杂得代码结构,吓跑初学者,所以,采用了内部类嵌套的形式,含义还是比较明确的。 主要就是实现`current`和`moveNext`方法。这里还是由`Storage`提供了数据的存储功能,所以只需要遍历,获取一下数据而已,其他方法暂时不管。 170 | 171 | 写到这,其实和`Calcite`相关的代码已经完成了,整个工程的主体代码也完成了,现在只需要再介绍一下`Storage` 172 | 173 | # Storage 174 | ``` 175 | /** 176 | * 用于模拟数据库结构及数据 177 | * 178 | * author : id,name,age 179 | * book : id,aid,name,type 180 | * */ 181 | public class Storage { 182 | public static final String SCHEMA_NAME = "bookshop"; 183 | public static final String TABLE_AUTHOR = "AUTHOR"; 184 | public static final String TABLE_BOOK = "BOOK"; 185 | 186 | // public static List tables = new ArrayList<>(); 187 | public static Hashtable _bag = new Hashtable<>(); 188 | static{ 189 | DummyTable author = new DummyTable(TABLE_AUTHOR); 190 | DummyColumn id = new DummyColumn("ID","String"); 191 | DummyColumn name = new DummyColumn("NAME","String"); 192 | DummyColumn age = new DummyColumn("AGE","String"); 193 | DummyColumn aid = new DummyColumn("AID","String"); 194 | DummyColumn type = new DummyColumn("TYPE","String"); 195 | author.addColumn(id).addColumn(name).addColumn(age); 196 | author.addRow("1","jacky","33"); 197 | author.addRow("2","wang","23"); 198 | author.addRow("3","dd","32"); 199 | author.addRow("4","ma","42"); 200 | // tables.add(author); 201 | _bag.put(TABLE_AUTHOR,author); 202 | 203 | DummyTable book = new DummyTable(TABLE_BOOK); 204 | book.addColumn(id).addColumn(name).addColumn(aid).addColumn(type); 205 | book.addRow("1","1","数据山","java"); 206 | book.addRow("2","2","大关","sql"); 207 | book.addRow("3","1","lili","sql"); 208 | book.addRow("4","3","ten","c#"); 209 | // tables.add(book); 210 | _bag.put(TABLE_BOOK,book); 211 | } 212 | 213 | public static Collection getTables(){ 214 | return _bag.values(); 215 | } 216 | public static DummyTable getTable(String tableName){return _bag.get(tableName);} 217 | 218 | public static class DummyTable{ 219 | private String name; 220 | private List columns; 221 | private List> datas = new ArrayList<>(); 222 | DummyTable(String name){ 223 | this.name = name; 224 | } 225 | 226 | public String getName(){ 227 | return this.name; 228 | } 229 | 230 | public List getColumns() { 231 | return columns; 232 | } 233 | 234 | public DummyTable addColumn(DummyColumn dc){ 235 | if(this.columns == null){ 236 | this.columns = new ArrayList<>(); 237 | } 238 | this.columns.add(dc); 239 | return this; 240 | } 241 | 242 | public void setColumns(List columns) { 243 | this.columns = columns; 244 | } 245 | 246 | public Object[] getData(int index){ 247 | return this.datas.get(index).toArray(); 248 | } 249 | 250 | public int getRowCount(){ 251 | return this.datas.size(); 252 | } 253 | 254 | public void addRow(Object...objects){ 255 | this.datas.add(Arrays.asList(objects)); 256 | } 257 | 258 | 259 | } 260 | 261 | public static class DummyColumn{ 262 | private String name; 263 | private String type; 264 | 265 | public DummyColumn(String name, String type) { 266 | this.name = name; 267 | this.type = type; 268 | } 269 | 270 | public String getName() { 271 | return name; 272 | } 273 | 274 | public String getType() { 275 | return type; 276 | } 277 | 278 | public void setName(String name) { 279 | this.name = name; 280 | } 281 | 282 | public void setType(String type) { 283 | this.type = type; 284 | } 285 | } 286 | 287 | } 288 | ``` 289 | 
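为了更直观地感受`Storage`的用法,这里再给出一小段遍历示意(仅为示意代码,数据已经在上面的静态块里初始化好了):

```
// 示意:遍历 Storage 中注册的所有表及其数据
for (Storage.DummyTable t : Storage.getTables()) {
    System.out.println("table: " + t.getName());
    for (int i = 0; i < t.getRowCount(); i++) {
        System.out.println(Arrays.toString(t.getData(i)));
    }
}
```

`InMemorySchema`和`InMemoryTable`所做的事情,本质上就是把这套遍历逻辑包装成`Calcite`认识的接口。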
这里我们用了一个简单的结构来模拟了存储,`Storage`下面包含`DummyTable`,`DummyTable`包含`DummyColumn`,用于存放元数据信息,而数据则包含在一个`List>`里,各类都提供基础的`getter`和`setter`方法,数据初始化则写在静态块里。 290 | 291 | # 测试 292 | 写个main方法测试一下: 293 | ``` 294 | public static void main(String[] args) { 295 | try { 296 | Class.forName("org.apache.calcite.jdbc.Driver"); 297 | } catch (ClassNotFoundException e1) { 298 | e1.printStackTrace(); 299 | } 300 | 301 | Properties info = new Properties(); 302 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\bookshop.json"; 303 | try { 304 | Connection connection = 305 | DriverManager.getConnection("jdbc:calcite:model="+jsonmodle, info); 306 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 307 | 308 | ResultSet result = connection.getMetaData().getTables(null, null, null, null); 309 | while(result.next()) { 310 | System.out.println("Catalog : " + result.getString(1) + ",Database : " + result.getString(2) + ",Table : " + result.getString(3)); 311 | } 312 | result.close(); 313 | Statement st = connection.createStatement(); 314 | result = st.executeQuery("select * from book as b"); 315 | while(result.next()) { 316 | System.out.println(result.getString(1) + "\t" + result.getString(2) + "\t" + result.getString(3)); 317 | } 318 | result.close(); 319 | //connection.close(); 320 | st = connection.createStatement(); 321 | result = st.executeQuery("select a.name from author as a"); 322 | while(result.next()) { 323 | System.out.println(result.getString(1)); 324 | } 325 | result.close(); 326 | connection.close(); 327 | }catch(Exception e){ 328 | e.printStackTrace(); 329 | } 330 | } 331 | ``` 332 | 333 | # 技术总结 334 | 1. `Calcite`能提供一个透明的JDBC实现,使用者可以按自己的方式规划存储,这个特性在数据分析中,其实更适合,比如在多源、跨源联合查询上,威力巨大。 335 | 2. 按接口实现相关`schema`和`table`,目前只实现了流程上跑通,单不代表他们就是这样,在这里我们还有很长的路要走 336 | 3. 
自定义视图配上model上配置的参数,也许可以作为数据权限一种实现 337 | 338 | 339 | # 后记 340 | 341 | 上述项目代码库传送门:`https://github.com/dafei1288/CalciteHelloworld.git` 342 | 343 | 目前只提供了全表扫面,条件判断表连接都还不行,待日后更新。 344 | 而`Calcite`强大的优化工作还没登场呢。 -------------------------------------------------------------------------------- /images/calcitea.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/calcitea.png -------------------------------------------------------------------------------- /images/kafkaa.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/kafkaa.png -------------------------------------------------------------------------------- /images/kafkainstall.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/kafkainstall.png -------------------------------------------------------------------------------- /images/window-types.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/window-types.png -------------------------------------------------------------------------------- /images/zkinstall.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/zkinstall.png -------------------------------------------------------------------------------- /streaming.1.md: -------------------------------------------------------------------------------- 1 | # 概述 2 | 3 | 在前面两篇中介绍了 [存储](https://github.com/dafei1288/CalciteDocTrans/blob/master/helloworld.md) 和 [UDF](https://github.com/dafei1288/CalciteDocTrans/blob/master/function.md),然后就开始着手准备streaming了,开始走了些弯路,本以为需要构建起一个简单的流系统,才能写`streaming sql`呢,所以跑去看来几天的flink,然后再仔细研究了calcite的源码后发现,其实并不用那么麻烦,所以这个系列又能继续了。 4 | 5 | 现在,我打算用2-3章来说说streaming。 6 | 7 | 首先streaming是对表的一种补充,因为他代表着当前和未来的情况,而表则代表着过去。流是连续,流动的记录的集合,与表不同,流通常不存储再磁盘上,而是再网络上流动,在内存中保留的时间也很短。 8 | 9 | 但是与表类似,业务上也通常希望以基于关系代数的高级语言查询流,根据模式进行验证,并优化以利用可用的资源和算法。 10 | 11 | `Calcite`的`Streaming SQL`是标准SQL的扩展,而不是另一种`SQL like`的语言。主要原因如下(翻译自calcite官方文档: 12 | 13 | - 对于任何了解标准SQL的人来说,流式SQL都很容易学习。 14 | - 语义清晰,无论使用表或是流,都可以返回相同的数据。 15 | - 可以编写结合流和表的查询(或者流的历史记录,它基本上是内存中的表)。 16 | - 许多现有的工具可以生成标准SQL。 17 | - 如果不使用stream关键字,则返回常规标准SQL。 18 | 19 | 介绍了一下基本概念,关于流,还由一点是必须说的,就是窗口 20 | 21 | ![架构图](./images/window-types.png) 22 | 23 | - tumbling window (GROUP BY) 24 | - hopping window (multi GROUP BY) 25 | - sliding window (window functions) 26 | - cascading window (window functions) 27 | 28 | 对于窗口和时间的一些理解,也可以看看,我的另外一篇文章《再谈Flink》 29 | 30 | 31 | # 案例 32 | 33 | 好了,基础先说到这,下面来看看代码吧,这次其实非常简单,就可以完成`streaming`了,再一次强调,`calcite`的`streaming sql`和`flink`及`spark`的支持不同,不是api级别上的,而是支持`stream`关键字来支持流 34 | 35 | 我们已经有了前面工程的积累,这样代码量非常小的改动就可以完成了。 36 | 37 | 38 | ## bookshopStream.json 39 | 40 | 首先,我们重新定义一个模型文件,取名`bookshopStream.json` 41 | 42 | ``` 43 | { 44 | "version": "1.0", 45 | "defaultSchema": "bookshopstream", 46 | "schemas": [ 47 | { 48 | "name": "bookshopstream", 49 | "tables": [ 50 | { 51 | "name": "BOOK", 52 | "type": "custom", 53 | "factory": 
"com.dafei1288.calcite.stream.InMemoryStreamTableFactory", 54 | "stream": { 55 | "stream": true 56 | }, 57 | "operand": { 58 | "p1": "hello", 59 | "p2": "world" 60 | } 61 | } 62 | ] 63 | } 64 | ] 65 | } 66 | ``` 67 | 68 | 这里我们对`schema`并没有过多的设置,而是直接对`tables`属性进行了设置,将`factory`指定为`com.dafei1288.calcite.stream.InMemoryStreamTableFactory`,这类后续在细讲。这里我们将表名定义为`BOOK`,意在后续使用之前案例的`Storage`。 69 | 70 | ## InMemoryStreamTableFactory 71 | 72 | ``` 73 | public class InMemoryStreamTableFactory implements TableFactory { 74 | @Override 75 | public Table create(SchemaPlus schema, String name, Map operand, RelDataType rowType) { 76 | System.out.println(operand); 77 | System.out.println(name); 78 | return new InMemoryStreamTable(name, Storage.getTable(name)); 79 | } 80 | } 81 | ``` 82 | 83 | 因为在模型里,直接指定了`TableFactory`,这个类的职责就是构建`Table`表对象,其职责,有点类似之前案例里的`InMemorySchema`类的`public Map getTableMap()`方法。前文描述了过,指定了`"name": "BOOK"`,所以,在这里代码执行的结果就是加载了`BOOK`表。 84 | 85 | ## InMemoryStreamTable 86 | 87 | ``` 88 | public class InMemoryStreamTable extends InMemoryTable implements StreamableTable { 89 | public InMemoryStreamTable(String name, Storage.DummyTable it) { 90 | super(name, it); 91 | } 92 | 93 | @Override 94 | public Table stream() { 95 | System.out.println("streaming ....."); 96 | return this; 97 | } 98 | } 99 | ``` 100 | 101 | 这里,为了能复用之前的存储逻辑,所以直接继承了`InMemoryTable`,所以,这个实现,其实底层并不是一个彻底的`streaming`实现,而是和之前案例一直的内存实现,但是这样就可以通过stream关键字,来进行sql查询了。 102 | 103 | ## 测试 104 | 105 | ``` 106 | public class TestStreamJDBC { 107 | public static void main(String[] args) { 108 | try { 109 | Class.forName("org.apache.calcite.jdbc.Driver"); 110 | } catch (ClassNotFoundException e1) { 111 | e1.printStackTrace(); 112 | } 113 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 114 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 115 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 116 | 117 | Properties info = new Properties(); 118 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\bookshopStream.json"; 119 | try { 120 | Connection connection = 121 | DriverManager.getConnection("jdbc:calcite:model=" + jsonmodle, info); 122 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 123 | 124 | ResultSet result = null; 125 | 126 | Statement st = connection.createStatement(); 127 | 128 | st = connection.createStatement(); 129 | //where b.name = '数据山' 130 | result = st.executeQuery("select stream * from BOOK as b "); 131 | while(result.next()) { 132 | System.out.println(result.getString(1)+" \t "+result.getString(2)+" \t "+result.getString(3)+" \t "+result.getString(4)); 133 | } 134 | result.close(); 135 | }catch(Exception e){ 136 | e.printStackTrace(); 137 | } 138 | 139 | } 140 | } 141 | ``` 142 | 143 | `select stream * from BOOK as b`这里撰写了一个简单的SQL,并使用了`stream`关键字,结果如下。 144 | 145 | ``` 146 | {p1=hello, p2=world, modelUri=E:\working\others\写作\calcitetutorial\src\main\resources\bookshopStream.json, baseDirectory=E:\working\others\写作\calcitetutorial\src\main\resources} 147 | BOOK 148 | streaming ..... 149 | scan ...... 
150 | 1 1 数据山 java 151 | 2 2 大关 sql 152 | 3 1 lili sql 153 | 4 3 ten c# 154 | ``` 155 | 156 | 那么对于一个非stream表,使用stream关键字,会怎么样呢?那么我们会得到一个异常 157 | 158 | > ERROR: Cannot convert table 'xxx' to a stream 159 | 160 | 161 | # 结尾 162 | 163 | 目前只是完成了最基础的查询,代码已提交到demo仓库 164 | 165 | TBD -------------------------------------------------------------------------------- /streaming.2.md: -------------------------------------------------------------------------------- 1 | # 概述 2 | 3 | 在上一篇文章中介绍了,如何在`select`语句中使用`stream`关键字,进行`流查询`,并且模拟了简单数据结构,有兴趣的同学可以移步去看看( [streaming上篇](https://github.com/dafei1288/CalciteDocTrans/blob/master/streaming.1.md))。本文将会继续扩展这个案例,把`calcite`和`kafka`联合起来,将`kafka`作为数据提供者,并进行`SQL`查询。 4 | 5 | # 什么是 kafka 6 | 7 | `kafka` 是一个分布式消息队列。具有高性能、持久化、多副本备份、横向扩展能力。生产者往队列里写消息,消费者从队列里取消息进行业务逻辑。一般在架构设计中起到解耦、削峰、异步处理的作用。 8 | `kafka`对外使用`topic`的概念,生产者往`topic`里写消息,消费者从读消息。为了做到水平扩展,一个`topic`实际是由多个`partition`组成的,遇到瓶颈时,可以通过增加`partition`的数量来进行横向扩容。单个`parition`内是保证消息有序。 9 | 每新写一条消息,`kafka`就是在对应的文件`append写`,所以性能非常高。 10 | `kafka`的总体数据流是这样的: 11 | 12 | ![zk](./images/kafkaa.png) 13 | 14 | 大概用法就是,`Producers`往`Brokers`里面的指定`Topic`中写消息,`Consumers`从`Brokers`里面拉去指定`Topic`的消息,然后进行业务处理。 15 | 16 | `以上内容这部分引用自:https://www.jianshu.com/p/d3e963ff8b70 ` 17 | 18 | 至于什么是`zookeeper`?有兴趣的读者自行搜索吧,这里就不过多介绍了... 19 | 20 | 21 | # kafka 环境搭建 22 | 23 | 本章以`windows`环境下搭建`kafka`环境为例,如果您已经熟悉这部分内容,可以跳过这个章节。搭建测试的方法有很多,这里我们使用一种较为便捷且成功率较高的方式。 24 | 25 | ## zookeeper 环境搭建 26 | 27 | - 下载并解压zookeeper `http://zookeeper.apache.org/releases.html#download` 28 | - 进入解压后的文件夹的`conf目录`,复制`zoo_sample.cfg`重命名成`zoo.cfg` 29 | - 编辑`zoo.cfg`文件,修改`dataDir`为`dataDir=$zookeeper解压路径\data`,这个路径可自行配置,只要有权限写入即可 30 | - 添加环境变量`ZOOKEEPER_HOME`,指向`zookeeper解压路径` 31 | - 在`PATH`变量里添加`ZOOKEEPER_HOME\bin` 32 | - 新建一个命令行,执行`zkServer` 33 | 34 | ![zk](./images/zkinstall.png) 35 | 36 | ## kafka 环境搭建 37 | 38 | - 下载并解压kafka `http://kafka.apache.org/downloads` , 下载的时候,注意`scala`版本,后续开发,可能会有影响 39 | - 进入解压后的文件夹的`config目录` 40 | - 编辑`server.properties`文件,修改`log.dirs=$kafka解压路径\kafka-logs`,这个路径可自行配置,只要有权限写入即可 41 | - 在`kafka解压路径`执行`.\bin\windows\kafka-server-start.bat .\config\server.properties`,建议将此命令,保存为`start.cmd`存放在该路径下,以便日后使用 42 | 43 | ![kafka](./images/kafkainstall.png) 44 | 45 | # kafka 环境测试 46 | 47 | 我们已经搭建起来了一个简单的`kafka`环境,接下来我们需要测试一下环境 48 | 49 | 首先,在之前的工程里加入`kafka`的依赖 50 | 51 | ``` 52 | compile group: 'org.apache.kafka', name: 'kafka_2.12', version: '2.1.0' 53 | compile group: 'org.apache.kafka', name: 'kafka-clients', version: '2.1.0' 54 | compile group: 'org.apache.kafka', name: 'kafka-streams', version: '2.1.0' 55 | ``` 56 | 57 | 然后来创建主题 58 | 59 | ## 创建 topic 60 | 61 | ``` 62 | package com.dafei1288.calcite.stream.kafka; 63 | 64 | import org.apache.kafka.clients.admin.AdminClient; 65 | import org.apache.kafka.clients.admin.CreateTopicsResult; 66 | import org.apache.kafka.clients.admin.NewTopic; 67 | 68 | import java.util.ArrayList; 69 | import java.util.Properties; 70 | import java.util.concurrent.ExecutionException; 71 | 72 | public class CreateTopic { 73 | public static void main(String[] args) { 74 | //创建topic 75 | Properties props = new Properties(); 76 | props.put("bootstrap.servers", "localhost:2181"); 77 | AdminClient adminClient = AdminClient.create(props); 78 | ArrayList topics = new ArrayList(); 79 | NewTopic newTopic = new NewTopic("calcitekafka", 1, (short) 1); 80 | topics.add(newTopic); 81 | CreateTopicsResult result = adminClient.createTopics(topics); 82 | try { 83 | result.all().get(); 84 | } catch (InterruptedException e) { 85 | e.printStackTrace(); 86 | } 
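            // 注:上面把 bootstrap.servers 配成了 localhost:2181(这是 ZooKeeper 的端口),
            // 而 AdminClient 需要的是 Kafka broker 地址(按本文环境应为 localhost:9092,与后面的 Producter/Consumer 保持一致),
            // 否则 result.all().get() 很可能在这里超时,并以 ExecutionException 的形式抛出。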
catch (ExecutionException e) { 87 | e.printStackTrace(); 88 | } 89 | } 90 | } 91 | 92 | ``` 93 | 94 | 创建`topic`以后,我们来构建一个基础的生产者`producter`。 95 | 96 | ## 创建 producter 97 | 98 | ``` 99 | package com.dafei1288.calcite.stream.kafka; 100 | 101 | import org.apache.kafka.clients.producer.KafkaProducer; 102 | import org.apache.kafka.clients.producer.ProducerRecord; 103 | 104 | import java.util.Properties; 105 | import java.util.Random; 106 | 107 | public class Producter { 108 | private static KafkaProducer producer; 109 | //刚才构建的topic 110 | private final static String TOPIC = "calcitekafka"; 111 | public Producter(){ 112 | Properties props = new Properties(); 113 | props.put("bootstrap.servers", "localhost:9092"); 114 | props.put("acks", "all"); 115 | props.put("retries", 0); 116 | props.put("batch.size", 16384); 117 | props.put("linger.ms", 1); 118 | props.put("buffer.memory", 33554432); 119 | props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer"); 120 | props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); 121 | //设置分区类,根据key进行数据分区 122 | producer = new KafkaProducer(props); 123 | } 124 | public void produce(){ 125 | int i = 0; 126 | Random r = new Random(); 127 | for(;;){ 128 | //每一秒创建一个随机的布尔值 129 | producer.send(new ProducerRecord(TOPIC,i+++"",r.nextBoolean()+"" )); 130 | try { 131 | Thread.sleep(1000); 132 | } catch (InterruptedException e) { 133 | e.printStackTrace(); 134 | } 135 | } 136 | // producer.close(); 137 | } 138 | 139 | public static void main(String[] args) { 140 | new Producter().produce(); 141 | } 142 | } 143 | ``` 144 | 145 | 由于没有正式的业务场景,我们进行一个简单的模拟,每秒生成一个随机的布尔值,一直循环下去,有了生产者,下面我们来构建一个消费者。 146 | 147 | ## 创建 consumer 148 | 149 | ``` 150 | package com.dafei1288.calcite.stream.kafka; 151 | 152 | import org.apache.kafka.clients.consumer.ConsumerRecord; 153 | import org.apache.kafka.clients.consumer.ConsumerRecords; 154 | import org.apache.kafka.clients.consumer.KafkaConsumer; 155 | 156 | import java.util.Arrays; 157 | import java.util.Properties; 158 | 159 | public class Consumer { 160 | private static KafkaConsumer consumer; 161 | private final static String TOPIC = "calcitekafka"; 162 | public Consumer(){ 163 | Properties props = new Properties(); 164 | props.put("bootstrap.servers", "localhost:9092"); 165 | //每个消费者分配独立的组号 166 | props.put("group.id", "test2"); 167 | //如果value合法,则自动提交偏移量 168 | props.put("enable.auto.commit", "true"); 169 | //设置多久一次更新被消费消息的偏移量 170 | props.put("auto.commit.interval.ms", "1000"); 171 | //设置会话响应的时间,超过这个时间kafka可以选择放弃消费或者消费下一条消息 172 | props.put("session.timeout.ms", "30000"); 173 | //自动重置offset 174 | props.put("auto.offset.reset","earliest"); 175 | props.put("key.deserializer", 176 | "org.apache.kafka.common.serialization.StringDeserializer"); 177 | props.put("value.deserializer", 178 | "org.apache.kafka.common.serialization.StringDeserializer"); 179 | consumer = new KafkaConsumer(props); 180 | } 181 | 182 | public void consume(){ 183 | consumer.subscribe(Arrays.asList(TOPIC)); 184 | while (true) { 185 | ConsumerRecords records = consumer.poll(100); 186 | for (ConsumerRecord record : records){ 187 | System.out.printf("offset = %d, key = %s, value = %s",record.offset(), record.key(), record.value()); 188 | System.out.println(); 189 | } 190 | } 191 | } 192 | 193 | public static void main(String[] args) { 194 | new Consumer().consume(); 195 | } 196 | } 197 | ``` 198 | 199 | 这里就是简单的将数据在控制台,进行一下输出,片段如下: 200 | 201 | ``` 202 | offset = 328, key = 0, value = false 203 | offset = 329, key = 1, value 
= false 204 | offset = 330, key = 2, value = true 205 | offset = 331, key = 3, value = true 206 | offset = 332, key = 4, value = false 207 | offset = 333, key = 5, value = false 208 | offset = 334, key = 6, value = true 209 | offset = 335, key = 7, value = true 210 | offset = 336, key = 8, value = false 211 | offset = 337, key = 9, value = true 212 | offset = 338, key = 10, value = true 213 | offset = 339, key = 11, value = true 214 | ``` 215 | 216 | 这样就说明之前我们搭建的`kafka`环境成功了,下面我们来和`calcite`进行整合,代替前文案例中,我们自己撰写的`storage` 217 | 218 | # calcite 整合 kafka 219 | 220 | 我们这次的目的是取代之前使用`java`文件来存储的数据,而是使用`kafka`作为数据的提供者,首先我们需要重新构建一个`schema`文件 221 | 222 | ## 创建 kafkaStream.json 223 | 224 | ``` 225 | { 226 | "version": "1.0", 227 | "defaultSchema": "bookshopstream", 228 | "schemas": [ 229 | { 230 | "name": "bookshopstream", 231 | "tables": [ 232 | { 233 | "name": "KF", 234 | "type": "custom", 235 | "factory": "com.dafei1288.calcite.stream.kafka.KafkaStreamTableFactory", 236 | "stream": { 237 | "stream": true 238 | }, 239 | "operand": { 240 | "topic": "calcitekafka", 241 | "bootstrap.servers": "localhost:9092", 242 | "group.id": "test2", 243 | "enable.auto.commit": "true", 244 | "auto.commit.interval.ms": "1000", 245 | "session.timeout.ms": "30000", 246 | "auto.offset.reset":"earliest", 247 | "key.deserializer": "org.apache.kafka.common.serialization.StringDeserializer", 248 | "value.deserializer": "org.apache.kafka.common.serialization.StringDeserializer", 249 | "colnames": "KK,VV", 250 | "timeouts": "2000" 251 | } 252 | } 253 | ] 254 | } 255 | ] 256 | } 257 | ``` 258 | 259 | 在这里,我们重新构建了一个`factory`,它是`com.dafei1288.calcite.stream.kafka.KafkaStreamTableFactory`,这个类的具体内容,我们下面会详细介绍。 260 | 261 | __请注意,在`operand`里的配置,我们加入了一系列配置,这是从通用性考虑,我们将`kafka`以及其他必要配置全部写在了这里面。__ 262 | 263 | 接下来,我们看一下`com.dafei1288.calcite.stream.kafka.KafkaStreamTableFactory`,做了一些什么? 
264 | 265 | ## KafkaStreamTableFactory 266 | 267 | ``` 268 | package com.dafei1288.calcite.stream.kafka; 269 | 270 | import org.apache.calcite.rel.type.RelDataType; 271 | import org.apache.calcite.schema.SchemaPlus; 272 | import org.apache.calcite.schema.Table; 273 | import org.apache.calcite.schema.TableFactory; 274 | 275 | import java.util.Map; 276 | 277 | public class KafkaStreamTableFactory implements TableFactory { 278 | @Override 279 | public Table create(SchemaPlus schema, String name, Map operand, RelDataType rowType) { 280 | System.out.println(operand); 281 | System.out.println(name); 282 | return new KafkaStreamTable(name,operand); 283 | } 284 | } 285 | 286 | ``` 287 | 288 | 这个类,和之前的类职责基本相同,代码也几乎一致,只是在返回的时候,变成了`return new KafkaStreamTable(name,operand);`,这里我们将operand直接作为参数下发到了,`Table`类的实现里,这里是为了提高`Table`的灵活性,将职责下发。而不是像之前案例那样。 289 | 290 | ## KafkaStreamTable 291 | 292 | `KafkaStreamTable`这个类,是这个案例最有意思的部分,我们先来看一下代码 293 | 294 | ``` 295 | package com.dafei1288.calcite.stream.kafka; 296 | 297 | import org.apache.calcite.DataContext; 298 | import org.apache.calcite.linq4j.AbstractEnumerable; 299 | import org.apache.calcite.linq4j.Enumerable; 300 | import org.apache.calcite.linq4j.Enumerator; 301 | import org.apache.calcite.rel.type.RelDataType; 302 | import org.apache.calcite.rel.type.RelDataTypeFactory; 303 | import org.apache.calcite.schema.ScannableTable; 304 | import org.apache.calcite.schema.StreamableTable; 305 | import org.apache.calcite.schema.Table; 306 | import org.apache.calcite.schema.impl.AbstractTable; 307 | import org.apache.calcite.sql.type.SqlTypeUtil; 308 | import org.apache.kafka.clients.consumer.ConsumerRecord; 309 | import org.apache.kafka.clients.consumer.ConsumerRecords; 310 | import org.apache.kafka.clients.consumer.KafkaConsumer; 311 | 312 | import java.util.Arrays; 313 | import java.util.Iterator; 314 | import java.util.Map; 315 | 316 | 317 | public class KafkaStreamTable extends AbstractTable implements StreamableTable, ScannableTable { 318 | 319 | @Override 320 | public Table stream() { 321 | return this; 322 | } 323 | 324 | private String name; 325 | private RelDataType dataType; 326 | private Map operand; 327 | private static KafkaConsumer consumer; 328 | 329 | public KafkaStreamTable(String name){ 330 | System.out.println("KafkaStreamTable !!!!!! "+name ); 331 | this.name = name; 332 | } 333 | 334 | public KafkaStreamTable(String name, Map operand) { 335 | System.out.println("KafkaStreamTable !!!!!! "+name +" , "+operand); 336 | this.name = name; 337 | this.operand = operand; 338 | 339 | 340 | } 341 | 342 | @Override 343 | public RelDataType getRowType(RelDataTypeFactory typeFactory) { 344 | // System.out.println("RelDataType !!!!!!"); 345 | if(dataType == null) { 346 | RelDataTypeFactory.FieldInfoBuilder fieldInfo = typeFactory.builder(); 347 | //我们需要存储stream table的元数据信息,为了案例,我写在了kafkaStream.json文件里配置信息里colnames 348 | for (String col : operand.get("colnames").toString().split(",")) { 349 | RelDataType sqlType = typeFactory.createJavaType(String.class); 350 | sqlType = SqlTypeUtil.addCharsetAndCollation(sqlType, typeFactory); 351 | fieldInfo.add(col, sqlType); 352 | } 353 | this.dataType = typeFactory.createStructType(fieldInfo); 354 | } 355 | return this.dataType; 356 | } 357 | 358 | 359 | @Override 360 | public Enumerable scan(DataContext root) { 361 | System.out.println("scan ...... 
"); 362 | consumer = new KafkaConsumer(operand); 363 | consumer.subscribe(Arrays.asList(operand.get("topic").toString())); 364 | 365 | return new AbstractEnumerable() { 366 | 367 | public Enumerator enumerator() { 368 | return new Enumerator(){ 369 | //因为,刚才的producter里面,数据是每秒产生的,如果这里值太下,则会出现取不出值的可能 370 | ConsumerRecords records = consumer.poll(Integer.parseInt(operand.get("timeouts").toString())); 371 | Iterator it =records.iterator(); 372 | private int cur = 0; 373 | @Override 374 | public Object[] current() { 375 | ConsumerRecord reco = (ConsumerRecord) it.next(); 376 | return new String[]{reco.key(),reco.value()}; 377 | } 378 | 379 | @Override 380 | public boolean moveNext() { 381 | //ConsumerRecord record : records 382 | return it.hasNext(); 383 | } 384 | 385 | @Override 386 | public void reset() { 387 | 388 | } 389 | 390 | @Override 391 | public void close() { 392 | consumer.close(); 393 | } 394 | }; 395 | } 396 | }; 397 | } 398 | } 399 | ``` 400 | 401 | 这个类的职责与之前的`InMemoryTable`类似,即提供数据如何遍历,如何转化数据类型。 402 | 403 | 前文提及将何定义一个`streaming`的职责下发到此类里,这是为了提高了灵活性,即如果不使用`kafka`提供数据,想使用其他的`streaming`工具来构造数据,也会变得相对简单一些。 404 | 405 | 在`public RelDataType getRowType(RelDataTypeFactory typeFactory)`这个方法里,我们需要对流里的数据,提供元数据的类型映射,前文提到过,我是把元数据,放在了`kafkaStream.json`文件里的`operand`节中的`colnames`属性里,这里,`producter`的数据提供,只有一个`key`和一个`boolean`值,所以我们只创建了两列`KK`和`VV`。而为了演示,我们也粗暴的将数据类型,定义为`string`类型。 406 | 407 | 接下来,我们将在`public Enumerable scan(DataContext root)`方法里,订阅`kafka`的主题,并消费其发射来的数据。由于我们的生产者是每秒产生一次数据,所以在`consumer.poll(Integer.parseInt(operand.get("timeouts").toString()));`这里,我们不能把时间设置的太小,否则会出现取不出数据的情况,我们可以通过在`operand`里加入类似参数`"max.poll.records": 20,`来控制每页数据量。 408 | 409 | 到这里,我们的基础工作完成了,下面来测试一下 410 | 411 | ## 测试 412 | 413 | ``` 414 | package com.dafei1288.calcite.stream.kafka; 415 | 416 | import org.apache.calcite.jdbc.CalciteConnection; 417 | import org.apache.calcite.util.ConversionUtil; 418 | 419 | import java.sql.Connection; 420 | import java.sql.DriverManager; 421 | import java.sql.ResultSet; 422 | import java.sql.Statement; 423 | import java.util.Properties; 424 | 425 | public class TestKafkaStreamJDBC { 426 | public static void main(String[] args) { 427 | try { 428 | Class.forName("org.apache.calcite.jdbc.Driver"); 429 | } catch (ClassNotFoundException e1) { 430 | e1.printStackTrace(); 431 | } 432 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 433 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 434 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 435 | 436 | Properties info = new Properties(); 437 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\kafkaStream.json"; 438 | try { 439 | Connection connection = 440 | DriverManager.getConnection("jdbc:calcite:model=" + jsonmodle, info); 441 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 442 | 443 | ResultSet result = null; 444 | 445 | Statement st = connection.createStatement(); 446 | 447 | st = connection.createStatement(); 448 | //where b.name = '数据山' 449 | result = st.executeQuery("select stream kf.kk,kf.vv from KF as kf "); 450 | while(result.next()) { 451 | System.out.println(result.getString(1)+" \t "+result.getString(2)); 452 | } 453 | 454 | result.close(); 455 | }catch(Exception e){ 456 | e.printStackTrace(); 457 | } 458 | 459 | } 460 | } 461 | 462 | ``` 463 | 464 | 可以看到我们的测试语句 `select stream kf.kk,kf.vv from KF as kf`,结果如下 465 | 466 | ``` 
467 | {topic=calcitekafka, bootstrap.servers=localhost:9092, group.id=test2, enable.auto.commit=true, auto.commit.interval.ms=1000, session.timeout.ms=30000, auto.offset.reset=earliest, key.deserializer=org.apache.kafka.common.serialization.StringDeserializer, value.deserializer=org.apache.kafka.common.serialization.StringDeserializer, colnames=key,value, timeouts=2000, modelUri=E:\working\others\写作\calcitetutorial\src\main\resources\kafkaStream.json, baseDirectory=E:\working\others\写作\calcitetutorial\src\main\resources} 468 | KF 469 | KafkaStreamTable !!!!!! KF , {topic=calcitekafka, bootstrap.servers=localhost:9092, group.id=test2, enable.auto.commit=true, auto.commit.interval.ms=1000, session.timeout.ms=30000, auto.offset.reset=earliest, key.deserializer=org.apache.kafka.common.serialization.StringDeserializer, value.deserializer=org.apache.kafka.common.serialization.StringDeserializer, colnames=key,value, timeouts=2000, modelUri=E:\working\others\写作\calcitetutorial\src\main\resources\kafkaStream.json, baseDirectory=E:\working\others\写作\calcitetutorial\src\main\resources} 470 | scan ...... 471 | 283 false 472 | 284 false 473 | 285 false 474 | 286 true 475 | 287 true 476 | 288 true 477 | 289 false 478 | 290 false 479 | 291 false 480 | 292 true 481 | 293 false 482 | 294 true 483 | 295 false 484 | 296 true 485 | 297 true 486 | ``` 487 | 488 | 到这,基础整合完成了。 489 | 490 | # 结尾 491 | 492 | 当前案例仅完成了初步整合,后续会继续扩展这个案例,例如时间窗滑动等,敬请期待 493 | 494 | TBD -------------------------------------------------------------------------------- /tutorial.md: -------------------------------------------------------------------------------- 1 | # 前言 2 | 3 | Apache Calcite 是独立于存储与执行的SQL解析、优化引擎,广泛应用于各种离线、搜索、实时查询引擎,如Drill、Hive、Kylin、Solr、flink、Samza等。 4 | 5 | ![架构图](./images/calcitea.png) 6 | 7 | 偶然的机会了解到这个项目,然后就深深的为之着迷了,很感慨为什么没能早几年遇到她。也是为了更加了解她,光读文档不过瘾了,所以想动手翻译一下。 但是本人英文水平有限,又是第一次干这种事,所以欢迎大家帮我勘正谬误。 [联系我:dafei1288@sina.com](mailto:dafei1288@sina.com) 欢迎转载,请注明出处。 8 | 9 | 我们先从引导文件开始:[原文链接](http://calcite.apache.org/docs/tutorial.html) 10 | 11 | 12 | 13 | # 正文 14 | 15 | 这是一个手把手式文档,教你如何构建并且连接到`Calcite`。我们用一个简单的适配器来将一个包含[CSV](https://en.wikipedia.org/wiki/Comma-separated_values)文件的目录变成一个包含数据表的数据库(原文描述为`schema`)。`Calcite`可以提供一个完整的SQL接口。 16 | 17 | `Calcite-example-CSV`是一个全功能适配器来使得`Calcite`可以读取`CSV`格式文件。可以通过几百行代码就能够完成一个全SQL查询功能。 18 | 19 | `CSV`适配器可以作为抛砖引玉的模板套用到其他数据格式上。尽管他代码量不多,但是麻雀虽小五脏俱全,重要原理都包含其中: 20 | 21 | 1. 使用`SchemaFactory`和`Schema interfaces`来自定义`schema` 22 | 2. 使用固定格式的JSON文件来(`a model JSON file`模型文件)声明数据库(`schemas`) 23 | 3. 使用固定格式的JSON文件来(`a model JSON file`模型文件)声明视图(`views`) 24 | 4. 使用`Table interface`来自定义表(`Table`) 25 | 5. 确定表格的记录类型 26 | 6. 使用`ScannableTable interface`来实现一个简单的表(`Table`),来枚举所有行(`rows`) 27 | 7. 进阶实现`FilterableTable`,可以根据条件(`simple predicates`)来过滤数据 28 | 8. 
表的进阶实现`TranslatableTable`,将执行计划翻译成关系运算(`translates to relational operators using planner rules`) 29 | 30 | ## 下载和编译 31 | 32 | 需要Java环境(1.7及以上版本,推荐1.8),git以及maven(3.2.1及以上版本) 33 | 34 | ``` 35 | $ git clone https://github.com/apache/calcite.git 36 | $ cd calcite 37 | $ mvn install -DskipTests -Dcheckstyle.skip=true 38 | $ cd example/csv 39 | ``` 40 | 41 | ## 第一个查询 42 | 43 | 现在让我们来使用[sqlline](https://github.com/julianhyde/sqlline)来连接`Calcite`,`sqlline`是一个包含在整个`Calcite`项目里的SQL的命令行工具。 44 | 45 | ``` 46 | $ ./sqlline 47 | sqlline> !connect jdbc:calcite:model=target/test-classes/model.json admin admin 48 | ``` 49 | 50 | (如果是windows操作系统,使用`sqlline.bat`) 51 | 52 | 执行一个元数据查询: 53 | 54 | ``` 55 | sqlline> !tables 56 | +------------+--------------+-------------+---------------+----------+------+ 57 | | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYPE | 58 | +------------+--------------+-------------+---------------+----------+------+ 59 | | null | SALES | DEPTS | TABLE | null | null | 60 | | null | SALES | EMPS | TABLE | null | null | 61 | | null | SALES | HOBBIES | TABLE | null | null | 62 | | null | metadata | COLUMNS | SYSTEM_TABLE | null | null | 63 | | null | metadata | TABLES | SYSTEM_TABLE | null | null | 64 | +------------+--------------+-------------+---------------+----------+------+ 65 | ``` 66 | 67 | (*译者注:上面案例里使用的`!tables`命令查询元数据,但是译者在使用的时候发现这个命令不好使) 68 | 69 | ``` 70 | 0: jdbc:calcite:model=target/test-classes/mod> !table 71 | +-----------+-------------+------------+------------+---------+----------+------------+-----------+---------------------------+----------------+ 72 | | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYPE_CAT | TYPE_SCHEM | TYPE_NAME | SELF_REFERENCING_COL_NAME | REF_GENERATION | 73 | +-----------+-------------+------------+------------+---------+----------+------------+-----------+---------------------------+----------------+ 74 | | | SALES | DEPTS | TABLE | | | | | | | 75 | | | SALES | EMPS | TABLE | | | | | | | 76 | | | SALES | SDEPTS | TABLE | | | | | | | 77 | | | metadata | COLUMNS | SYSTEM TABLE | | | | | | | 78 | | | metadata | TABLES | SYSTEM TABLE | | | | | | | 79 | +-----------+-------------+------------+------------+---------+----------+------------+-----------+---------------------------+----------------+ 80 | ``` 81 | (JDBC提示: 在`sqlline`里`!tables`命令只是执行了`DatabaseMetaData.getTables()`方法,还有其他的获取元数据命令如:`!columns`,`!describe`) 82 | 83 | (译者注:`!describe`需要加表名) 84 | 85 | ``` 86 | 0: jdbc:calcite:model=target/test-classes/mod> !describe 87 | Usage: describe 88 | 89 | 0: jdbc:calcite:model=target/test-classes/mod> !describe DEPTS 90 | +-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+ 91 | 92 | | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | 93 | 94 | 
+-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+ 95 | 96 | | | SALES | DEPTS | DEPTNO | 4 | INTEGER | -1 | null | null | 10 | 1 | | | null | null | -1 | 1 | YES | | | | 97 | 98 | | | SALES | DEPTS | NAME | 12 | VARCHAR CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" | -1 | null | null | 10 | 1 | | | null | null | -1 | 2 | 99 | 100 | +-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+ 101 | 102 | ``` 103 | 104 | 你能看到,在执行`!tables`的时候有5个表,表`EMPS`, `DEPTS`和`HOBBIES`在`SALES`库(`schema`)里,表`COLUMNS`和`TABLES`在系统元数据库(`system metadata schema`)里。系统表总是在`Calcite`里显示,但其他表是由库(`schema`)的实现来指定的,在本例中,`EMPS`和`DEPTS`表来源于`target/test-classes`路径下的`EMPS.csv`和`DEPTS.csv`。 105 | 106 | 让我们来执行一些查询,来展示`Calcite`的全SQL功能,首先表检索: 107 | 108 | ``` 109 | sqlline> SELECT * FROM emps; 110 | +--------+--------+---------+---------+----------------+--------+-------+---+ 111 | | EMPNO | NAME | DEPTNO | GENDER | CITY | EMPID | AGE | S | 112 | +--------+--------+---------+---------+----------------+--------+-------+---+ 113 | | 100 | Fred | 10 | | | 30 | 25 | t | 114 | | 110 | Eric | 20 | M | San Francisco | 3 | 80 | n | 115 | | 110 | John | 40 | M | Vancouver | 2 | null | f | 116 | | 120 | Wilma | 20 | F | | 1 | 5 | n | 117 | | 130 | Alice | 40 | F | Vancouver | 2 | null | f | 118 | +--------+--------+---------+---------+----------------+--------+-------+---+ 119 | ``` 120 | 121 | 接下来是表连接和分组聚合查询: 122 | 123 | ``` 124 | sqlline> SELECT d.name, COUNT(*) 125 | . . . .> FROM emps AS e JOIN depts AS d ON e.deptno = d.deptno 126 | . . . 
.> GROUP BY d.name; 127 | +------------+---------+ 128 | | NAME | EXPR$1 | 129 | +------------+---------+ 130 | | Sales | 1 | 131 | | Marketing | 2 | 132 | +------------+---------+ 133 | ``` 134 | 135 | 最后,一个计算操作返回一个单行记录,也可以通过这种简便的方法来测试表达式和SQL函数 136 | 137 | ``` 138 | sqlline> VALUES CHAR_LENGTH('Hello, ' || 'world!'); 139 | +---------+ 140 | | EXPR$0 | 141 | +---------+ 142 | | 13 | 143 | +---------+ 144 | ``` 145 | 146 | `Calcite`还包含很多SQL特性,这里就不一一列举了。 147 | 148 | ## Schema探索 149 | 150 | 那么`Calcite`是如何发现表的呢?事实上`Calcite`的核心是并不能理解`CSV`文件的(作为一个“没有存储层的databse”,`Calcite`是了解任何文件格式),之所以`Calcite`能读取上文中的元数据,是因为在`calcite-example-csv`里我们撰写了相关代码。 151 | 152 | 在执行链里包含着很多步骤。首先我们定义一个可以被库工厂加载的模型文件(`we define a schema based on a schema factory class in a model file.`)。然后库工厂会加载成数据库并创建许多表,每一个表都需要知道自己如何加载CSV中的数据。最后`Calcite`解析完查询并将查询计划映射到这几个表上时,`Calcite`会在查询执行时触发这些表去读取数据。接下来我们更深入地解析其中的细节步骤。 153 | 154 | 举个栗子(a model in JSON format): 155 | 156 | ``` 157 | { 158 | version: '1.0', 159 | defaultSchema: 'SALES', 160 | schemas: [ 161 | { 162 | name: 'SALES', 163 | type: 'custom', 164 | factory: 'org.apache.calcite.adapter.csv.CsvSchemaFactory', 165 | operand: { 166 | directory: 'target/test-classes/sales' 167 | } 168 | } 169 | ] 170 | } 171 | ``` 172 | 173 | 这个模型文件定义了一个库(`schema`)叫`SALES`,这个库是由一个插件类(`a plugin class`)支持的,[org.apache.calcite.adapter.csv.CsvSchemaFactory](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchemaFactory.java)这个是`calcite-example-csv`工程里`interface SchemaFactory`的一个实现。它的`create`方法将一个schema实例化了,将model file中的directory作为参数传递过去了。 174 | 175 | ``` 176 | public Schema create(SchemaPlus parentSchema, String name, 177 | Map operand) { 178 | String directory = (String) operand.get("directory"); 179 | String flavorName = (String) operand.get("flavor"); 180 | CsvTable.Flavor flavor; 181 | if (flavorName == null) { 182 | flavor = CsvTable.Flavor.SCANNABLE; 183 | } else { 184 | flavor = CsvTable.Flavor.valueOf(flavorName.toUpperCase()); 185 | } 186 | return new CsvSchema( 187 | new File(directory), 188 | flavor); 189 | } 190 | ``` 191 | 根据模型(`model`)描述,库工程(`schema factory`)实例化了一个名为'SALES'的简单库(`schema`)。这个库(`schema`)是[org.apache.calcite.adapter.csv.CsvSchema](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchema.java)的实例并且实现了`Calcite`里的接口[Schema](http://calcite.apache.org/apidocs/org/apache/calcite/schema/Schema.html)。 192 | 193 | 一个库(`schema`)的主要职责就是创建一个表(`table`)的列表(库的职责还包括子库列表、函数列表等,但是`calcite-example-csv`项目里并没有包含他们)。这些表实现了`Calcite`的[Table](http://calcite.apache.org/apidocs/org/apache/calcite/schema/Table.html)接口。CsvSchema创建的表全部是[CsvTable](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTable.java)和他的子类的实例。 194 | 195 | 下面是`CsvSchema`的一些相关代码,对基类`AbstractSchema`中的[getTableMap()](http://calcite.apache.org/apidocs/org/apache/calcite/schema/impl/AbstractSchema.html#getTableMap())方法进行了重载。 196 | 197 | ``` 198 | protected Map getTableMap() { 199 | // Look for files in the directory ending in ".csv", ".csv.gz", ".json", 200 | // ".json.gz". 
201 | File[] files = directoryFile.listFiles( 202 | new FilenameFilter() { 203 | public boolean accept(File dir, String name) { 204 | final String nameSansGz = trim(name, ".gz"); 205 | return nameSansGz.endsWith(".csv") 206 | || nameSansGz.endsWith(".json"); 207 | } 208 | }); 209 | if (files == null) { 210 | System.out.println("directory " + directoryFile + " not found"); 211 | files = new File[0]; 212 | } 213 | // Build a map from table name to table; each file becomes a table. 214 | final ImmutableMap.Builder builder = ImmutableMap.builder(); 215 | for (File file : files) { 216 | String tableName = trim(file.getName(), ".gz"); 217 | final String tableNameSansJson = trimOrNull(tableName, ".json"); 218 | if (tableNameSansJson != null) { 219 | JsonTable table = new JsonTable(file); 220 | builder.put(tableNameSansJson, table); 221 | continue; 222 | } 223 | tableName = trim(tableName, ".csv"); 224 | final Table table = createTable(file); 225 | builder.put(tableName, table); 226 | } 227 | return builder.build(); 228 | } 229 | 230 | /** Creates different sub-type of table based on the "flavor" attribute. */ 231 | private Table createTable(File file) { 232 | switch (flavor) { 233 | case TRANSLATABLE: 234 | return new CsvTranslatableTable(file, null); 235 | case SCANNABLE: 236 | return new CsvScannableTable(file, null); 237 | case FILTERABLE: 238 | return new CsvFilterableTable(file, null); 239 | default: 240 | throw new AssertionError("Unknown flavor " + flavor); 241 | } 242 | } 243 | ``` 244 | 245 | `schema`会扫描指定路径,找到所有以`.csv/`结尾的文件。在本例中,指定路径是 `target/test-classes/sales`,路径中包含文件'EMPS.csv'和'DEPTS.csv',这两个文件会转换成表`EMPS`和`DEPTS`。 246 | 247 | ## 表和视图 248 | 249 | 值得注意的是,我们在模型文件(`model`)里并不需要定义任何表,`schema`会自动创建的。 250 | 你可以额外扩展一些表(`tables`),使用这个`schema`中其他表的属性。 251 | 252 | 253 | 让我们看看如何创建一个重要且常用的一种表——视图。 254 | 255 | 在写一个查询时,视图就相当于一个table,但它不存储数据。它通过执行查询来生成数据。在查询转换为执行计划时,视图会被展开,所以查询执行器可以执行一些优化策略,例如移除一些`SELECT`子句中存在但在最终结果中没有用到的表达式。 256 | 257 | 举个栗子: 258 | 259 | ``` 260 | { 261 | version: '1.0', 262 | defaultSchema: 'SALES', 263 | schemas: [ 264 | { 265 | name: 'SALES', 266 | type: 'custom', 267 | factory: 'org.apache.calcite.adapter.csv.CsvSchemaFactory', 268 | operand: { 269 | directory: 'target/test-classes/sales' 270 | }, 271 | tables: [ 272 | { 273 | name: 'FEMALE_EMPS', 274 | type: 'view', 275 | sql: 'SELECT * FROM emps WHERE gender = \'F\'' 276 | } 277 | ] 278 | } 279 | ] 280 | } 281 | ``` 282 | 283 | 栗子中`type:view`这一行将`FEMALE_EMPS`定义为一个视图,而不是常规表或者是自定义表。注意通常在JSON文件里,定义`view`的时候,需要对单引号进行转义。 284 | 285 | 用JSON来定义长字符串易用性不太高,因此`Calcite`支持了一种替代语法。如果你的视图定义中有长SQL语句,可以使用多行来定义一个长字符串: 286 | 287 | ``` 288 | { 289 | name: 'FEMALE_EMPS', 290 | type: 'view', 291 | sql: [ 292 | 'SELECT * FROM emps', 293 | 'WHERE gender = \'F\'' 294 | ] 295 | } 296 | ``` 297 | 现在我们定义了一个视图(`view`),我们可以再查询中使用它就像使用普通表(`table`)一样: 298 | 299 | ``` 300 | sqlline> SELECT e.name, d.name FROM female_emps AS e JOIN depts AS d on e.deptno = d.deptno; 301 | +--------+------------+ 302 | | NAME | NAME | 303 | +--------+------------+ 304 | | Wilma | Marketing | 305 | +--------+------------+ 306 | ``` 307 | 308 | ## 自定义表 309 | 310 | 自定义表是由用户定义的代码来实现定义的,不需要额外自定义`schema`。 311 | 312 | 继续举个栗子`model-with-custom-table.json`: 313 | 314 | ``` 315 | { 316 | version: '1.0', 317 | defaultSchema: 'CUSTOM_TABLE', 318 | schemas: [ 319 | { 320 | name: 'CUSTOM_TABLE', 321 | tables: [ 322 | { 323 | name: 'EMPS', 324 | type: 'custom', 325 | factory: 'org.apache.calcite.adapter.csv.CsvTableFactory', 326 | operand: { 327 | file: 
'target/test-classes/sales/EMPS.csv.gz', 328 | flavor: "scannable" 329 | } 330 | } 331 | ] 332 | } 333 | ] 334 | } 335 | ``` 336 | 337 | 我们可以一样来查询表数据: 338 | 339 | ``` 340 | sqlline> !connect jdbc:calcite:model=target/test-classes/model-with-custom-table.json admin admin 341 | sqlline> SELECT empno, name FROM custom_table.emps; 342 | +--------+--------+ 343 | | EMPNO | NAME | 344 | +--------+--------+ 345 | | 100 | Fred | 346 | | 110 | Eric | 347 | | 110 | John | 348 | | 120 | Wilma | 349 | | 130 | Alice | 350 | +--------+--------+ 351 | ``` 352 | 353 | 上面的`schema`是通用格式,包含了一个自定义表[org.apache.calcite.adapter.csv.CsvTableFactory](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTableFactory.java),这个类实现了`Calcite`中的`TableFactory`接口。它在`create`方法里实例化了`CsvScannableTable`,将`model`文件中的`file`参数传递过去。 354 | 355 | 356 | ``` 357 | public CsvTable create(SchemaPlus schema, String name, 358 | Map map, RelDataType rowType) { 359 | String fileName = (String) map.get("file"); 360 | final File file = new File(fileName); 361 | final RelProtoDataType protoRowType = 362 | rowType != null ? RelDataTypeImpl.proto(rowType) : null; 363 | return new CsvScannableTable(file, protoRowType); 364 | } 365 | ``` 366 | 367 | 通常做法是实现一个自定义表(`a custom table`)来替代实现一个自定义库(`a custom schema`)。两个方法最后都会创建一个`Table`接口的实例,但是自定义表无需重新实现元数据(`metadata`)获取部分。(`CsvTableFactory`和`CsvSchema`一样,都创建了`CsvScannableTable`,但是自定表实现就不需要实现在文件系统里检索`.csv`文件。) 368 | 369 | 自定义表(`table`)要求开发者在`model`上执有多操作(开发者需要在`model`文件中显式指定每一个`table`和它对应的文件),同时也提供给了开发者更多的控制选项(例如,为每一个table提供不同参数)。 370 | 371 | ## 模型中的注释 372 | 373 | 374 | 注释使用语法 `/* ... */` 和 `//`: 375 | 376 | ``` 377 | { 378 | version: '1.0', 379 | /* 多行 380 | 注释 */ 381 | defaultSchema: 'CUSTOM_TABLE', 382 | // 单行注释 383 | schemas: [ 384 | .. 
385 | ] 386 | } 387 | ``` 388 | 389 | (注释不是标准JSON格式,但不会造成影响。) 390 | 391 | ## 使用查询计划来优化查询 392 | 393 | 目前来看表(`table`)实现和查询都没有问题,因为我们的表中并没有大量的数据。但如果你的自定义表(`table`)有,例如,有100列和100万行数据,你肯定希望用户在每次查询过程中不检索全量数据。你会希望`Calcite`通过适配器来进行衡量,并找到一个更有效的方法来访问数据。 394 | 395 | 这个衡量过程是一个简单的查询优化格式。`Calcite`是通过添加执行器规则(`planner rules`)来支持查询优化的。执行器规则(`planner rules`)通过在查询解析中寻找指定模式(`patterns`)(例如在某个项目中匹配到某种类型的`table`是生效),使用实现优化后的新节点替换寻找到节点。 396 | 397 | 执行器规则(`planner rules`)也是可扩展的,就像`schemas`和`tables`一样。所以如果你有一些存储下来的数据希望通过SQL访问它,首先需要定义一个自定义表或是schema,然后再去定义一些能使数据访问高效的规则。 398 | 399 | 为了查看效果,我们可以使用一个执行器规则(`planner rules`)来访问一个`CSV`文件中的某些子列集合。我们可以在两个相似的schema中执行同样的查询: 400 | 401 | ``` 402 | sqlline> !connect jdbc:calcite:model=target/test-classes/model.json admin admin 403 | sqlline> explain plan for select name from emps; 404 | +-----------------------------------------------------+ 405 | | PLAN | 406 | +-----------------------------------------------------+ 407 | | EnumerableCalcRel(expr#0..9=[{inputs}], NAME=[$t1]) | 408 | | EnumerableTableScan(table=[[SALES, EMPS]]) | 409 | +-----------------------------------------------------+ 410 | sqlline> !connect jdbc:calcite:model=target/test-classes/smart.json admin admin 411 | sqlline> explain plan for select name from emps; 412 | +-----------------------------------------------------+ 413 | | PLAN | 414 | +-----------------------------------------------------+ 415 | | EnumerableCalcRel(expr#0..9=[{inputs}], NAME=[$t1]) | 416 | | CsvTableScan(table=[[SALES, EMPS]]) | 417 | +-----------------------------------------------------+ 418 | ``` 419 | 420 | 这两个计划到底有什么不同呢?通过对比可以发现,在`smart.json`里只多了一行: 421 | 422 | ``` 423 | flavor: "translatable" 424 | ``` 425 | 426 | 这会让`CsvSchema`携带参数参数`falvor = TRANSLATABLE` 参数进行创建,并且它的`createTable`方法会创建[CsvTranslatableTable](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTranslatableTable.java),而不是`CsvScannableTable`. 427 | 428 | `CsvTranslatableTable`实现了`TranslatableTable.toRel()`方法来创建[CsvTableScan](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTableScan.java). 扫描表(`Table scan`)操作是查询执行树中的叶子节点,默认实现方式是`EnumerableTableScan`,但我们构造了一种不同的的子类型来让规则生效。 429 | 430 | 下面是完整的代码: 431 | 432 | ``` 433 | public class CsvProjectTableScanRule extends RelOptRule { 434 | public static final CsvProjectTableScanRule INSTANCE = 435 | new CsvProjectTableScanRule(); 436 | 437 | private CsvProjectTableScanRule() { 438 | super( 439 | operand(Project.class, 440 | operand(CsvTableScan.class, none())), 441 | "CsvProjectTableScanRule"); 442 | } 443 | 444 | @Override 445 | public void onMatch(RelOptRuleCall call) { 446 | final Project project = call.rel(0); 447 | final CsvTableScan scan = call.rel(1); 448 | int[] fields = getProjectFields(project.getProjects()); 449 | if (fields == null) { 450 | // Project contains expressions more complex than just field references. 
## The query optimization process

There is a lot to be said about how clever Calcite's query planner is, but we won't say it here. The cleverness is designed to take the burden off you, the writer of planner rules.

First, Calcite does not fire rules in a prescribed order. The query optimization process follows the branches of a branching tree, just as a chess-playing program examines many possible sequences of moves. If rules A and B both match a given section of the query operator tree, Calcite can fire both.

Second, Calcite uses cost when choosing between plans, but the cost model does not prevent rules from firing even if they seem more expensive in the short term.

Many optimizers have a linear optimization scheme: faced with a choice between rule A and rule B, such an optimizer must decide immediately. It might follow a policy such as "apply rule A to the whole tree, then apply rule B to the whole tree", or apply a cost-based policy, firing whichever rule produces the cheaper result.

Calcite does not require such compromises. This makes it simple to combine sets of rules. If you want to combine rules that recognize materialized views with rules that read from CSV and JDBC sources, you just give Calcite the full set of rules and tell it to go at it.

Calcite does use a cost-based optimization model. The cost model decides which plan is ultimately used, and it sometimes prunes the search tree to keep the search space from exploding, but it never forces you to choose between rule A and rule B. This matters because it avoids falling into local optima in the search space that are not, in fact, optimal.

The cost model is also pluggable, as are the table and query-operator statistics it is based on. We will return to this topic later.

## JDBC adapter

The JDBC adapter maps a schema in a JDBC data source into a Calcite schema.

For example, here is a model of MySQL's classic "foodmart" database:

```
{
  version: '1.0',
  defaultSchema: 'FOODMART',
  schemas: [
    {
      name: 'FOODMART',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.jdbc.JdbcSchema$Factory',
      operand: {
        jdbcDriver: 'com.mysql.jdbc.Driver',
        jdbcUrl: 'jdbc:mysql://localhost/foodmart',
        jdbcUser: 'foodmart',
        jdbcPassword: 'foodmart'
      }
    }
  ]
}
```

(The foodmart database will be familiar to anyone who has used the Mondrian OLAP engine; it is one of Mondrian's main test data sets. If it is new to you, see the [setup instructions](https://mondrian.pentaho.com/documentation/installation.php#2_Set_up_test_data).)

Current limitations: the JDBC adapter only pushes down table scan operations; all other processing (filtering, joins, aggregations and so forth) happens inside Calcite. The goal is to push down as much processing as possible to the source system, translating syntax, data types and built-in functions along the way. If a Calcite query is based on tables from a single JDBC database, in principle the whole query should be pushed down to that database. If the tables come from multiple JDBC sources, or from a mixture of JDBC and non-JDBC sources, Calcite will use the most efficient distributed query approach it can.
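Assuming the model above is saved as `foodmart-model.json` (a hypothetical file name) and a foodmart database is reachable at the configured URL, a quick smoke test from sqlline might look like the following; `sales_fact_1997` is one of the standard foodmart tables, and depending on how the underlying table names are cased you may need to quote identifiers:

```
sqlline> !connect jdbc:calcite:model=foodmart-model.json admin admin
sqlline> SELECT COUNT(*) FROM "sales_fact_1997";
```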
(Translator's note: starting around 2015 we designed a data-analytics product aiming at similar functionality, but it ultimately failed; its overall completeness came nowhere near Calcite's. Calcite's repository history only goes back to about 2014. One can only admire that level of engineering and sigh at the comparison!)


## The cloning JDBC adapter

The cloning JDBC adapter creates a hybrid. Data comes from a JDBC database, but it is read into in-memory tables the first time each table is accessed. Calcite evaluates queries against those in-memory tables, effectively acting as a cache of the database.

For example, the following model reads tables from a MySQL "foodmart" database:

```
{
  version: '1.0',
  defaultSchema: 'FOODMART_CLONE',
  schemas: [
    {
      name: 'FOODMART_CLONE',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.clone.CloneSchema$Factory',
      operand: {
        jdbcDriver: 'com.mysql.jdbc.Driver',
        jdbcUrl: 'jdbc:mysql://localhost/foodmart',
        jdbcUser: 'foodmart',
        jdbcPassword: 'foodmart'
      }
    }
  ]
}
```

Another technique is to build a clone schema on top of an existing schema. Use the `source` property to reference a schema defined earlier in the model, like this:

```
{
  version: '1.0',
  defaultSchema: 'FOODMART_CLONE',
  schemas: [
    {
      name: 'FOODMART',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.jdbc.JdbcSchema$Factory',
      operand: {
        jdbcDriver: 'com.mysql.jdbc.Driver',
        jdbcUrl: 'jdbc:mysql://localhost/foodmart',
        jdbcUser: 'foodmart',
        jdbcPassword: 'foodmart'
      }
    },
    {
      name: 'FOODMART_CLONE',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.clone.CloneSchema$Factory',
      operand: {
        source: 'FOODMART'
      }
    }
  ]
}
```

You can use this approach to create a clone schema on top of any kind of schema, not just JDBC.

The cloning adapter is not the final word. We plan to develop more sophisticated caching strategies and more complete and efficient implementations of in-memory tables, but for now the cloning JDBC adapter shows what is possible and lets us try out initial implementations.


## Next chapter?

Updates will come irregularly. :) If you like this translation, feel free to get in touch and nudge me for more. And if enough people complain, I might just keep updating anyway...
--------------------------------------------------------------------------------