├── LICENSE ├── algebra.md ├── flink1.9.md ├── function.md ├── helloworld.md ├── images ├── calcitea.png ├── kafkaa.png ├── kafkainstall.png ├── window-types.png └── zkinstall.png ├── streaming.1.md ├── streaming.2.md └── tutorial.md /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 
179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /algebra.md: -------------------------------------------------------------------------------- 1 | # 前言 2 | 3 | 本章主旨介绍关系代数在`Calcite`中的应用,如果还对`Calcite`不了解的同学,也可以异步到`https://github.com/dafei1288/CalciteDocTrans/blob/master/tutorial.md`去看 4 | 5 | # 正文 6 | 7 | 关系代数是`Calcite`的核心。每个查询都可以被表述为一个关系运算符树(`a tree of relational operators`)。你可以将SQL翻译成关系代数或者直接构建树。 8 | 9 | Relational algebra is at the heart of Calcite. Every query is represented as a tree of relational operators. You can translate from SQL to relational algebra, or you can build the tree directly. 10 | 11 | Planner rules transform expression trees using mathematical identities that preserve semantics. For example, it is valid to push a filter into an input of an inner join if the filter does not reference columns from the other input. 12 | 13 | Calcite optimizes queries by repeatedly applying planner rules to a relational expression. A cost model guides the process, and the planner engine generates an alternative expression that has the same semantics as the original but a lower cost. 14 | 15 | The planning process is extensible. You can add your own relational operators, planner rules, cost model, and statistics. -------------------------------------------------------------------------------- /flink1.9.md: -------------------------------------------------------------------------------- 1 | 大家期盼已久的1.9已经剪支有些日子了,兴冲冲的切换到跑去编译,我在之前的文章《尝尝Blink》里也介绍过如何编译,本文只针对不同的地方以及遇到的坑做一些说明,希望对遇到同样问题的朋友有一些帮助。 2 | 3 | 首先,切换分支 `git checkout release-1.9` 4 | 这次我们不修改pom文件,将镜像添加到`settings.xml`里,在文章末尾,我会分享出来我用的文件全文,这里就不再赘述了。 5 | 直接使用 `clean package -DskipTests -Dfast`进行编译 6 | 7 | ``` 8 | ​[INFO] Reactor Summary for flink 1.9-SNAPSHOT: 9 | [INFO] 10 | [INFO] force-shading ...................................... SUCCESS [ 2.233 s] 11 | [INFO] flink .............................................. SUCCESS [ 2.536 s] 12 | [INFO] flink-annotations .................................. SUCCESS [ 1.447 s] 13 | [INFO] flink-shaded-curator ............................... SUCCESS [ 1.291 s] 14 | [INFO] flink-metrics ...................................... SUCCESS [ 0.101 s] 15 | [INFO] flink-metrics-core ................................. SUCCESS [ 0.959 s] 16 | [INFO] flink-test-utils-parent ............................ 
SUCCESS [ 0.091 s] 17 | [INFO] flink-test-utils-junit ............................. SUCCESS [ 1.048 s] 18 | [INFO] flink-core ......................................... SUCCESS [ 19.790 s] 19 | [INFO] flink-java ......................................... SUCCESS [ 4.944 s] 20 | [INFO] flink-queryable-state .............................. SUCCESS [ 0.085 s] 21 | [INFO] flink-queryable-state-client-java .................. SUCCESS [ 1.671 s] 22 | [INFO] flink-filesystems .................................. SUCCESS [ 0.079 s] 23 | [INFO] flink-hadoop-fs .................................... SUCCESS [ 3.029 s] 24 | [INFO] flink-runtime ...................................... SUCCESS [ 48.913 s] 25 | [INFO] flink-scala ........................................ SUCCESS [ 39.109 s] 26 | [INFO] flink-mapr-fs ...................................... SUCCESS [ 2.523 s] 27 | [INFO] flink-filesystems :: flink-fs-hadoop-shaded ........ SUCCESS [ 3.966 s] 28 | [INFO] flink-s3-fs-base ................................... SUCCESS [ 7.892 s] 29 | [INFO] flink-s3-fs-hadoop ................................. SUCCESS [ 10.222 s] 30 | [INFO] flink-s3-fs-presto ................................. SUCCESS [ 14.337 s] 31 | [INFO] flink-swift-fs-hadoop .............................. SUCCESS [ 13.493 s] 32 | [INFO] flink-oss-fs-hadoop ................................ SUCCESS [ 7.104 s] 33 | [INFO] flink-azure-fs-hadoop .............................. SUCCESS [ 8.093 s] 34 | [INFO] flink-optimizer .................................... SUCCESS [ 3.843 s] 35 | [INFO] flink-clients ...................................... SUCCESS [ 3.200 s] 36 | [INFO] flink-streaming-java ............................... SUCCESS [ 15.939 s] 37 | [INFO] flink-test-utils ................................... SUCCESS [ 4.398 s] 38 | [INFO] flink-runtime-web .................................. SUCCESS [06:05 min] 39 | [INFO] flink-examples ..................................... SUCCESS [ 0.196 s] 40 | [INFO] flink-examples-batch ............................... SUCCESS [ 15.297 s] 41 | [INFO] flink-connectors ................................... SUCCESS [ 0.076 s] 42 | [INFO] flink-hadoop-compatibility ......................... SUCCESS [ 6.228 s] 43 | [INFO] flink-state-backends ............................... SUCCESS [ 0.088 s] 44 | [INFO] flink-statebackend-rocksdb ......................... SUCCESS [ 4.283 s] 45 | [INFO] flink-tests ........................................ SUCCESS [01:00 min] 46 | [INFO] flink-streaming-scala .............................. SUCCESS [ 33.076 s] 47 | [INFO] flink-table ........................................ SUCCESS [ 0.082 s] 48 | [INFO] flink-table-common ................................. SUCCESS [ 2.936 s] 49 | [INFO] flink-table-api-java ............................... FAILURE [ 1.958 s] 50 | [INFO] flink-table-api-java-bridge ........................ SKIPPED 51 | [INFO] flink-table-api-scala .............................. SKIPPED 52 | [INFO] flink-table-api-scala-bridge ....................... SKIPPED 53 | [INFO] flink-sql-parser ................................... SKIPPED 54 | [INFO] flink-libraries .................................... SKIPPED 55 | [INFO] flink-cep .......................................... SKIPPED 56 | [INFO] flink-table-planner ................................ SKIPPED 57 | [INFO] flink-orc .......................................... SKIPPED 58 | [INFO] flink-jdbc ......................................... SKIPPED 59 | [INFO] flink-hbase ........................................ 
SKIPPED 60 | [INFO] flink-hcatalog ..................................... SKIPPED 61 | [INFO] flink-metrics-jmx .................................. SKIPPED 62 | [INFO] flink-connector-kafka-base ......................... SKIPPED 63 | [INFO] flink-connector-kafka-0.9 .......................... SKIPPED 64 | [INFO] flink-connector-kafka-0.10 ......................... SKIPPED 65 | [INFO] flink-connector-kafka-0.11 ......................... SKIPPED 66 | [INFO] flink-formats ...................................... SKIPPED 67 | [INFO] flink-json ......................................... SKIPPED 68 | [INFO] flink-connector-elasticsearch-base ................. SKIPPED 69 | [INFO] flink-connector-elasticsearch2 ..................... SKIPPED 70 | [INFO] flink-connector-elasticsearch5 ..................... SKIPPED 71 | [INFO] flink-connector-elasticsearch6 ..................... SKIPPED 72 | [INFO] flink-connector-hive ............................... SKIPPED 73 | [INFO] flink-connector-rabbitmq ........................... SKIPPED 74 | [INFO] flink-connector-twitter ............................ SKIPPED 75 | [INFO] flink-connector-nifi ............................... SKIPPED 76 | [INFO] flink-connector-cassandra .......................... SKIPPED 77 | [INFO] flink-avro ......................................... SKIPPED 78 | [INFO] flink-connector-filesystem ......................... SKIPPED 79 | [INFO] flink-connector-kafka .............................. SKIPPED 80 | [INFO] flink-connector-gcp-pubsub ......................... SKIPPED 81 | [INFO] flink-sql-connector-elasticsearch6 ................. SKIPPED 82 | [INFO] flink-sql-connector-kafka-0.9 ...................... SKIPPED 83 | [INFO] flink-sql-connector-kafka-0.10 ..................... SKIPPED 84 | [INFO] flink-sql-connector-kafka-0.11 ..................... SKIPPED 85 | [INFO] flink-sql-connector-kafka .......................... SKIPPED 86 | [INFO] flink-connector-kafka-0.8 .......................... SKIPPED 87 | [INFO] flink-avro-confluent-registry ...................... SKIPPED 88 | [INFO] flink-parquet ...................................... SKIPPED 89 | [INFO] flink-sequence-file ................................ SKIPPED 90 | [INFO] flink-csv .......................................... SKIPPED 91 | [INFO] flink-examples-streaming ........................... SKIPPED 92 | [INFO] flink-examples-table ............................... SKIPPED 93 | [INFO] flink-examples-build-helper ........................ SKIPPED 94 | [INFO] flink-examples-streaming-twitter ................... SKIPPED 95 | [INFO] flink-examples-streaming-state-machine ............. SKIPPED 96 | [INFO] flink-examples-streaming-gcp-pubsub ................ SKIPPED 97 | [INFO] flink-container .................................... SKIPPED 98 | [INFO] flink-queryable-state-runtime ...................... SKIPPED 99 | [INFO] flink-end-to-end-tests ............................. SKIPPED 100 | [INFO] flink-cli-test ..................................... SKIPPED 101 | [INFO] flink-parent-child-classloading-test-program ....... SKIPPED 102 | [INFO] flink-parent-child-classloading-test-lib-package ... SKIPPED 103 | [INFO] flink-dataset-allround-test ........................ SKIPPED 104 | [INFO] flink-datastream-allround-test ..................... SKIPPED 105 | [INFO] flink-stream-sql-test .............................. SKIPPED 106 | [INFO] flink-bucketing-sink-test .......................... SKIPPED 107 | [INFO] flink-distributed-cache-via-blob ................... 
SKIPPED 108 | [INFO] flink-high-parallelism-iterations-test ............. SKIPPED 109 | [INFO] flink-stream-stateful-job-upgrade-test ............. SKIPPED 110 | [INFO] flink-queryable-state-test ......................... SKIPPED 111 | [INFO] flink-local-recovery-and-allocation-test ........... SKIPPED 112 | [INFO] flink-elasticsearch2-test .......................... SKIPPED 113 | [INFO] flink-elasticsearch5-test .......................... SKIPPED 114 | [INFO] flink-elasticsearch6-test .......................... SKIPPED 115 | [INFO] flink-quickstart ................................... SKIPPED 116 | [INFO] flink-quickstart-java .............................. SKIPPED 117 | [INFO] flink-quickstart-scala ............................. SKIPPED 118 | [INFO] flink-quickstart-test .............................. SKIPPED 119 | [INFO] flink-confluent-schema-registry .................... SKIPPED 120 | [INFO] flink-stream-state-ttl-test ........................ SKIPPED 121 | [INFO] flink-sql-client-test .............................. SKIPPED 122 | [INFO] flink-streaming-file-sink-test ..................... SKIPPED 123 | [INFO] flink-state-evolution-test ......................... SKIPPED 124 | [INFO] flink-e2e-test-utils ............................... SKIPPED 125 | [INFO] flink-mesos ........................................ SKIPPED 126 | [INFO] flink-yarn ......................................... SKIPPED 127 | [INFO] flink-gelly ........................................ SKIPPED 128 | [INFO] flink-gelly-scala .................................. SKIPPED 129 | [INFO] flink-gelly-examples ............................... SKIPPED 130 | [INFO] flink-metrics-dropwizard ........................... SKIPPED 131 | [INFO] flink-metrics-graphite ............................. SKIPPED 132 | [INFO] flink-metrics-influxdb ............................. SKIPPED 133 | [INFO] flink-metrics-prometheus ........................... SKIPPED 134 | [INFO] flink-metrics-statsd ............................... SKIPPED 135 | [INFO] flink-metrics-datadog .............................. SKIPPED 136 | [INFO] flink-metrics-slf4j ................................ SKIPPED 137 | [INFO] flink-cep-scala .................................... SKIPPED 138 | [INFO] flink-table-uber ................................... SKIPPED 139 | [INFO] flink-sql-client ................................... SKIPPED 140 | [INFO] flink-python ....................................... SKIPPED 141 | [INFO] flink-scala-shell .................................. SKIPPED 142 | [INFO] flink-dist ......................................... SKIPPED 143 | [INFO] flink-end-to-end-tests-common ...................... SKIPPED 144 | [INFO] flink-metrics-availability-test .................... SKIPPED 145 | [INFO] flink-metrics-reporter-prometheus-test ............. SKIPPED 146 | [INFO] flink-heavy-deployment-stress-test ................. SKIPPED 147 | [INFO] flink-connector-gcp-pubsub-emulator-tests .......... SKIPPED 148 | [INFO] flink-streaming-kafka-test-base .................... SKIPPED 149 | [INFO] flink-streaming-kafka-test ......................... SKIPPED 150 | [INFO] flink-streaming-kafka011-test ...................... SKIPPED 151 | [INFO] flink-streaming-kafka010-test ...................... SKIPPED 152 | [INFO] flink-plugins-test ................................. SKIPPED 153 | [INFO] flink-state-processor-api .......................... SKIPPED 154 | [INFO] flink-table-runtime-blink .......................... SKIPPED 155 | [INFO] flink-table-planner-blink .......................... 
SKIPPED 156 | [INFO] flink-contrib ...................................... SKIPPED 157 | [INFO] flink-connector-wikiedits .......................... SKIPPED 158 | [INFO] flink-yarn-tests ................................... SKIPPED 159 | [INFO] flink-fs-tests ..................................... SKIPPED 160 | [INFO] flink-docs ......................................... SKIPPED 161 | [INFO] flink-ml-parent .................................... SKIPPED 162 | [INFO] flink-ml-api ....................................... SKIPPED 163 | [INFO] flink-ml-lib ....................................... SKIPPED 164 | [INFO] ------------------------------------------------------------------------ 165 | [INFO] BUILD FAILURE 166 | [INFO] ------------------------------------------------------------------------ 167 | [INFO] Total time: 11:58 min 168 | [INFO] Finished at: 2019-07-24T16:37:45+08:00 169 | [INFO] ------------------------------------------------------------------------ 170 | [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project flink-table-api-java: Compilation failure 171 | [ERROR] /E:/devlop/sourcespace/flink/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/OperationTreeBuilder.java:[560,85] 未报告的异常错误X; 必须对其进行捕获或声明以便抛出 172 | [ERROR] 173 | [ERROR] -> [Help 1] 174 | [ERROR] 175 | [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 176 | [ERROR] Re-run Maven using the -X switch to enable full debug logging. 177 | [ERROR] 178 | [ERROR] For more information about the errors and possible solutions, please read the following articles: 179 | [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException 180 | [ERROR] 181 | [ERROR] After correcting the problems, you can resume the build with the command 182 | [ERROR] mvn -rf :flink-table-api-java 183 | ``` 184 | 185 | 这个问题 `未报告的异常错误X; 必须对其进行捕获或声明以便抛出` 问题卡了我好久,查看源码 186 | ``` 187 | private CalculatedQueryOperation unwrapFromAlias(CallExpression call) { 188 | List children = call.getChildren(); 189 | List aliases = children.subList(1, children.size()) 190 | .stream() 191 | .map(alias -> ExpressionUtils.extractValue(alias, String.class) 192 | .orElseThrow(() -> new ValidationException("Unexpected alias: " + alias))) <= 这里是异常提示 193 | .collect(toList()); 194 | 195 | if (!isFunctionOfKind(children.get(0), TABLE)) { 196 | throw fail(); 197 | } 198 | 199 | CallExpression tableCall = (CallExpression) children.get(0); 200 | TableFunctionDefinition tableFunctionDefinition = 201 | (TableFunctionDefinition) tableCall.getFunctionDefinition(); 202 | return createFunctionCall(tableFunctionDefinition, aliases, tableCall.getResolvedChildren()); 203 | } 204 | ``` 205 | 再看一下`ValidationException`的代码 206 | ``` 207 | @PublicEvolving 208 | public class ValidationException extends RuntimeException { 209 | 210 | public ValidationException(String message, Throwable cause) { 211 | super(message, cause); 212 | } 213 | 214 | public ValidationException(String message) { 215 | super(message); 216 | } 217 | } 218 | 219 | ``` 220 | 似乎也没啥问题,然后翻了半天,终于在stackoverflow上找到问题所在了 221 | `https://stackoverflow.com/questions/25523375/java8-lambdas-and-exceptions` 222 | 可以在前面加上异常类型 `.orElseThrow(() -> new ValidationException("Unexpected alias: " + alias)))` 还有几个文件,也要修改,这个问题也可以通过更换JDK来规避。 223 | 224 | 225 | 当时使用JDK 226 | ``` 227 | E:\devlop\envs\Java8x64bak\bin>java -version 228 | java version "1.8.0_60" 229 | Java(TM) SE Runtime Environment (build 
1.8.0_60-b27) 230 | Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode) 231 | ``` 232 | 更换JDK 233 | ``` 234 | E:\devlop\envs\Java8x64\bin>java -version 235 | java version "1.8.0_131" 236 | Java(TM) SE Runtime Environment (build 1.8.0_131-b11) 237 | Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode) 238 | ``` 239 | 240 | 编译成功 241 | ``` 242 | [INFO] Reactor Summary for flink 1.9-SNAPSHOT: 243 | [INFO] 244 | [INFO] force-shading ...................................... SUCCESS [ 3.341 s] 245 | [INFO] flink .............................................. SUCCESS [ 3.686 s] 246 | [INFO] flink-annotations .................................. SUCCESS [ 1.474 s] 247 | [INFO] flink-shaded-curator ............................... SUCCESS [ 1.275 s] 248 | [INFO] flink-metrics ...................................... SUCCESS [ 0.100 s] 249 | [INFO] flink-metrics-core ................................. SUCCESS [ 0.959 s] 250 | [INFO] flink-test-utils-parent ............................ SUCCESS [ 0.094 s] 251 | [INFO] flink-test-utils-junit ............................. SUCCESS [ 0.963 s] 252 | [INFO] flink-core ......................................... SUCCESS [ 20.784 s] 253 | [INFO] flink-java ......................................... SUCCESS [ 7.953 s] 254 | [INFO] flink-queryable-state .............................. SUCCESS [ 0.084 s] 255 | [INFO] flink-queryable-state-client-java .................. SUCCESS [ 1.925 s] 256 | [INFO] flink-filesystems .................................. SUCCESS [ 0.094 s] 257 | [INFO] flink-hadoop-fs .................................... SUCCESS [ 3.108 s] 258 | [INFO] flink-runtime ...................................... SUCCESS [ 52.749 s] 259 | [INFO] flink-scala ........................................ SUCCESS [ 40.804 s] 260 | [INFO] flink-mapr-fs ...................................... SUCCESS [ 2.281 s] 261 | [INFO] flink-filesystems :: flink-fs-hadoop-shaded ........ SUCCESS [ 3.865 s] 262 | [INFO] flink-s3-fs-base ................................... SUCCESS [ 7.667 s] 263 | [INFO] flink-s3-fs-hadoop ................................. SUCCESS [ 11.142 s] 264 | [INFO] flink-s3-fs-presto ................................. SUCCESS [ 14.022 s] 265 | [INFO] flink-swift-fs-hadoop .............................. SUCCESS [ 13.379 s] 266 | [INFO] flink-oss-fs-hadoop ................................ SUCCESS [ 7.149 s] 267 | [INFO] flink-azure-fs-hadoop .............................. SUCCESS [ 8.124 s] 268 | [INFO] flink-optimizer .................................... SUCCESS [ 3.841 s] 269 | [INFO] flink-clients ...................................... SUCCESS [ 3.081 s] 270 | [INFO] flink-streaming-java ............................... SUCCESS [ 13.254 s] 271 | [INFO] flink-test-utils ................................... SUCCESS [ 4.429 s] 272 | [INFO] flink-runtime-web .................................. SUCCESS [03:56 min] 273 | [INFO] flink-examples ..................................... SUCCESS [ 0.195 s] 274 | [INFO] flink-examples-batch ............................... SUCCESS [01:27 min] 275 | [INFO] flink-connectors ................................... SUCCESS [ 0.156 s] 276 | [INFO] flink-hadoop-compatibility ......................... SUCCESS [ 7.404 s] 277 | [INFO] flink-state-backends ............................... SUCCESS [ 0.103 s] 278 | [INFO] flink-statebackend-rocksdb ......................... SUCCESS [ 4.041 s] 279 | [INFO] flink-tests ........................................ 
SUCCESS [ 57.677 s] 280 | [INFO] flink-streaming-scala .............................. SUCCESS [ 39.897 s] 281 | [INFO] flink-table ........................................ SUCCESS [ 0.093 s] 282 | [INFO] flink-table-common ................................. SUCCESS [ 3.252 s] 283 | [INFO] flink-table-api-java ............................... SUCCESS [ 3.382 s] 284 | [INFO] flink-table-api-java-bridge ........................ SUCCESS [ 1.691 s] 285 | [INFO] flink-table-api-scala .............................. SUCCESS [ 5.564 s] 286 | [INFO] flink-table-api-scala-bridge ....................... SUCCESS [ 6.084 s] 287 | [INFO] flink-sql-parser ................................... SUCCESS [01:45 min] 288 | [INFO] flink-libraries .................................... SUCCESS [ 0.071 s] 289 | [INFO] flink-cep .......................................... SUCCESS [ 7.880 s] 290 | [INFO] flink-table-planner ................................ SUCCESS [02:02 min] 291 | [INFO] flink-orc .......................................... SUCCESS [ 2.537 s] 292 | [INFO] flink-jdbc ......................................... SUCCESS [ 2.255 s] 293 | [INFO] flink-hbase ........................................ SUCCESS [ 7.450 s] 294 | [INFO] flink-hcatalog ..................................... SUCCESS [ 5.875 s] 295 | [INFO] flink-metrics-jmx .................................. SUCCESS [ 1.468 s] 296 | [INFO] flink-connector-kafka-base ......................... SUCCESS [ 6.826 s] 297 | [INFO] flink-connector-kafka-0.9 .......................... SUCCESS [ 5.396 s] 298 | [INFO] flink-connector-kafka-0.10 ......................... SUCCESS [ 3.076 s] 299 | [INFO] flink-connector-kafka-0.11 ......................... SUCCESS [ 3.337 s] 300 | [INFO] flink-formats ...................................... SUCCESS [ 0.070 s] 301 | [INFO] flink-json ......................................... SUCCESS [ 1.535 s] 302 | [INFO] flink-connector-elasticsearch-base ................. SUCCESS [ 4.051 s] 303 | [INFO] flink-connector-elasticsearch2 ..................... SUCCESS [ 10.091 s] 304 | [INFO] flink-connector-elasticsearch5 ..................... SUCCESS [ 11.304 s] 305 | [INFO] flink-connector-elasticsearch6 ..................... SUCCESS [ 5.441 s] 306 | [INFO] flink-connector-hive ............................... SUCCESS [ 10.140 s] 307 | [INFO] flink-connector-rabbitmq ........................... SUCCESS [ 1.770 s] 308 | [INFO] flink-connector-twitter ............................ SUCCESS [ 2.210 s] 309 | [INFO] flink-connector-nifi ............................... SUCCESS [ 1.993 s] 310 | [INFO] flink-connector-cassandra .......................... SUCCESS [ 4.067 s] 311 | [INFO] flink-avro ......................................... SUCCESS [ 6.819 s] 312 | [INFO] flink-connector-filesystem ......................... SUCCESS [ 3.599 s] 313 | [INFO] flink-connector-kafka .............................. SUCCESS [ 3.106 s] 314 | [INFO] flink-connector-gcp-pubsub ......................... SUCCESS [ 6.798 s] 315 | [INFO] flink-sql-connector-elasticsearch6 ................. SUCCESS [ 5.708 s] 316 | [INFO] flink-sql-connector-kafka-0.9 ...................... SUCCESS [ 0.579 s] 317 | [INFO] flink-sql-connector-kafka-0.10 ..................... SUCCESS [ 0.665 s] 318 | [INFO] flink-sql-connector-kafka-0.11 ..................... SUCCESS [ 0.748 s] 319 | [INFO] flink-sql-connector-kafka .......................... SUCCESS [ 1.050 s] 320 | [INFO] flink-connector-kafka-0.8 .......................... 
SUCCESS [ 2.633 s] 321 | [INFO] flink-avro-confluent-registry ...................... SUCCESS [ 1.856 s] 322 | [INFO] flink-parquet ...................................... SUCCESS [ 2.886 s] 323 | [INFO] flink-sequence-file ................................ SUCCESS [ 1.368 s] 324 | [INFO] flink-csv .......................................... SUCCESS [ 1.404 s] 325 | [INFO] flink-examples-streaming ........................... SUCCESS [ 14.729 s] 326 | [INFO] flink-examples-table ............................... SUCCESS [ 8.828 s] 327 | [INFO] flink-examples-build-helper ........................ SUCCESS [ 0.189 s] 328 | [INFO] flink-examples-streaming-twitter ................... SUCCESS [ 0.826 s] 329 | [INFO] flink-examples-streaming-state-machine ............. SUCCESS [ 0.696 s] 330 | [INFO] flink-examples-streaming-gcp-pubsub ................ SUCCESS [ 4.980 s] 331 | [INFO] flink-container .................................... SUCCESS [ 2.574 s] 332 | [INFO] flink-queryable-state-runtime ...................... SUCCESS [ 4.981 s] 333 | [INFO] flink-end-to-end-tests ............................. SUCCESS [ 0.078 s] 334 | [INFO] flink-cli-test ..................................... SUCCESS [ 0.933 s] 335 | [INFO] flink-parent-child-classloading-test-program ....... SUCCESS [ 1.070 s] 336 | [INFO] flink-parent-child-classloading-test-lib-package ... SUCCESS [ 0.519 s] 337 | [INFO] flink-dataset-allround-test ........................ SUCCESS [ 0.734 s] 338 | [INFO] flink-datastream-allround-test ..................... SUCCESS [ 2.613 s] 339 | [INFO] flink-stream-sql-test .............................. SUCCESS [ 1.742 s] 340 | [INFO] flink-bucketing-sink-test .......................... SUCCESS [ 1.580 s] 341 | [INFO] flink-distributed-cache-via-blob ................... SUCCESS [ 0.880 s] 342 | [INFO] flink-high-parallelism-iterations-test ............. SUCCESS [ 7.606 s] 343 | [INFO] flink-stream-stateful-job-upgrade-test ............. SUCCESS [ 1.518 s] 344 | [INFO] flink-queryable-state-test ......................... SUCCESS [ 2.314 s] 345 | [INFO] flink-local-recovery-and-allocation-test ........... SUCCESS [ 0.966 s] 346 | [INFO] flink-elasticsearch2-test .......................... SUCCESS [ 4.529 s] 347 | [INFO] flink-elasticsearch5-test .......................... SUCCESS [ 5.285 s] 348 | [INFO] flink-elasticsearch6-test .......................... SUCCESS [ 3.856 s] 349 | [INFO] flink-quickstart ................................... SUCCESS [ 1.481 s] 350 | [INFO] flink-quickstart-java .............................. SUCCESS [ 4.658 s] 351 | [INFO] flink-quickstart-scala ............................. SUCCESS [ 0.414 s] 352 | [INFO] flink-quickstart-test .............................. SUCCESS [ 1.497 s] 353 | [INFO] flink-confluent-schema-registry .................... SUCCESS [ 2.361 s] 354 | [INFO] flink-stream-state-ttl-test ........................ SUCCESS [ 3.930 s] 355 | [INFO] flink-sql-client-test .............................. SUCCESS [ 3.859 s] 356 | [INFO] flink-streaming-file-sink-test ..................... SUCCESS [ 1.164 s] 357 | [INFO] flink-state-evolution-test ......................... SUCCESS [ 1.532 s] 358 | [INFO] flink-e2e-test-utils ............................... SUCCESS [ 6.745 s] 359 | [INFO] flink-mesos ........................................ SUCCESS [ 18.941 s] 360 | [INFO] flink-yarn ......................................... SUCCESS [ 3.017 s] 361 | [INFO] flink-gelly ........................................ 
SUCCESS [ 5.259 s] 362 | [INFO] flink-gelly-scala .................................. SUCCESS [ 13.110 s] 363 | [INFO] flink-gelly-examples ............................... SUCCESS [ 11.624 s] 364 | [INFO] flink-metrics-dropwizard ........................... SUCCESS [ 1.044 s] 365 | [INFO] flink-metrics-graphite ............................. SUCCESS [ 0.570 s] 366 | [INFO] flink-metrics-influxdb ............................. SUCCESS [ 2.176 s] 367 | [INFO] flink-metrics-prometheus ........................... SUCCESS [ 1.361 s] 368 | [INFO] flink-metrics-statsd ............................... SUCCESS [ 0.956 s] 369 | [INFO] flink-metrics-datadog .............................. SUCCESS [ 0.711 s] 370 | [INFO] flink-metrics-slf4j ................................ SUCCESS [ 0.917 s] 371 | [INFO] flink-cep-scala .................................... SUCCESS [ 9.729 s] 372 | [INFO] flink-table-uber ................................... SUCCESS [ 2.603 s] 373 | [INFO] flink-sql-client ................................... SUCCESS [ 7.800 s] 374 | [INFO] flink-python ....................................... SUCCESS [ 2.724 s] 375 | [INFO] flink-scala-shell .................................. SUCCESS [ 10.762 s] 376 | [INFO] flink-dist ......................................... SUCCESS [ 34.086 s] 377 | [INFO] flink-end-to-end-tests-common ...................... SUCCESS [ 1.229 s] 378 | [INFO] flink-metrics-availability-test .................... SUCCESS [ 0.946 s] 379 | [INFO] flink-metrics-reporter-prometheus-test ............. SUCCESS [ 0.798 s] 380 | [INFO] flink-heavy-deployment-stress-test ................. SUCCESS [ 7.118 s] 381 | [INFO] flink-connector-gcp-pubsub-emulator-tests .......... SUCCESS [ 3.777 s] 382 | [INFO] flink-streaming-kafka-test-base .................... SUCCESS [ 1.260 s] 383 | [INFO] flink-streaming-kafka-test ......................... SUCCESS [ 6.750 s] 384 | [INFO] flink-streaming-kafka011-test ...................... SUCCESS [ 6.230 s] 385 | [INFO] flink-streaming-kafka010-test ...................... SUCCESS [ 8.173 s] 386 | [INFO] flink-plugins-test ................................. SUCCESS [ 0.799 s] 387 | [INFO] flink-state-processor-api .......................... SUCCESS [ 3.276 s] 388 | [INFO] flink-table-runtime-blink .......................... SUCCESS [ 7.159 s] 389 | [INFO] flink-table-planner-blink .......................... SUCCESS [02:26 min] 390 | [INFO] flink-contrib ...................................... SUCCESS [ 0.070 s] 391 | [INFO] flink-connector-wikiedits .......................... SUCCESS [ 1.790 s] 392 | [INFO] flink-yarn-tests ................................... SUCCESS [04:22 min] 393 | [INFO] flink-fs-tests ..................................... SUCCESS [ 1.905 s] 394 | [INFO] flink-docs ......................................... SUCCESS [ 2.258 s] 395 | [INFO] flink-ml-parent .................................... SUCCESS [ 0.066 s] 396 | [INFO] flink-ml-api ....................................... SUCCESS [ 1.020 s] 397 | [INFO] flink-ml-lib ....................................... 
SUCCESS [ 0.797 s] 398 | [INFO] ------------------------------------------------------------------------ 399 | [INFO] BUILD SUCCESS 400 | [INFO] ------------------------------------------------------------------------ 401 | [INFO] Total time: 29:11 min 402 | [INFO] Finished at: 2019-07-24T16:03:03+08:00 403 | [INFO] ------------------------------------------------------------------------ 404 | ``` 405 | 406 | 去dist里启动玩耍了。 407 | 408 | 分享一下我的 `settings.xml` 409 | ``` 410 | 411 | 412 | 428 | 429 | 453 | 456 | 462 | 470 | 471 | 478 | 479 | 484 | 485 | 489 | 490 | 491 | 496 | 497 | 511 | 512 | 513 | 517 | 518 | 531 | 532 | 539 | 540 | 541 | 552 | 553 | 558 | 559 | nexus-aliyun 560 | Nexus aliyun 561 | *,!jeecg,!jeecg-snapshots,!mapr-releases,!cloudera,!cdh,!confluent 562 | http://maven.aliyun.com/nexus/content/groups/public 563 | 564 | 565 | mapr-public 566 | mapr-releases 567 | mapr-releases,*,!confluent 568 | https://maven.aliyun.com/repository/mapr-public 569 | 570 | 571 | cloudera 572 | cloudera 573 | https://repository.cloudera.com/artifactory/cloudera-repos 574 | *,!mapr-releases,!confluent 575 | 576 | 577 | 578 | 599 | 600 | 627 | 628 | 660 | 677 | 678 | 679 | 680 | 681 | 689 | 690 | 691 | ``` 692 | 693 | 694 | 695 | 696 | 697 | -------------------------------------------------------------------------------- /function.md: -------------------------------------------------------------------------------- 1 | # 直播改BUG 2 | 3 | ## 修复内联查询 4 | 5 | 在[上期文章](https://github.com/dafei1288/CalciteDocTrans/blob/master/helloworld.md)撰写的时候,我还认为只完成了单表查询,但经过几天的研究发现,上次那寥寥几十行代码,其实已经可以完成了表联接,过滤等功能了,只是由于当时粗心写错了一些东西,造成过滤失效了,下面我来剖析一下问题。 6 | 7 | 首先在`Storage.java`这个文件里 8 | ``` 9 | public class Storage { 10 | public static final String SCHEMA_NAME = "bookshop"; 11 | public static final String TABLE_AUTHOR = "AUTHOR"; 12 | public static final String TABLE_BOOK = "BOOK"; 13 | 14 | // public static List tables = new ArrayList<>(); 15 | public static Hashtable _bag = new Hashtable<>(); 16 | static{ 17 | DummyTable author = new DummyTable(TABLE_AUTHOR); 18 | DummyColumn id = new DummyColumn("ID","String"); 19 | DummyColumn name = new DummyColumn("NAME","String"); 20 | DummyColumn age = new DummyColumn("AGE","String"); 21 | DummyColumn aid = new DummyColumn("AID","String"); 22 | DummyColumn type = new DummyColumn("TYPE","String"); 23 | author.addColumn(id).addColumn(name).addColumn(age); 24 | author.addRow("1","jacky","33"); 25 | author.addRow("2","wang","23"); 26 | author.addRow("3","dd","32"); 27 | author.addRow("4","ma","42"); 28 | // tables.add(author); 29 | _bag.put(TABLE_AUTHOR,author); 30 | 31 | DummyTable book = new DummyTable(TABLE_BOOK); 32 | book.addColumn(id).addColumn(name).addColumn(aid).addColumn(type); 33 | book.addRow("1","1","数据山","java"); 34 | book.addRow("2","2","大关","sql"); 35 | book.addRow("3","1","lili","sql"); 36 | book.addRow("4","3","ten","c#"); 37 | // tables.add(book); 38 | _bag.put(TABLE_BOOK,book); 39 | } 40 | ...... 41 | } 42 | ``` 43 | 只截取了部分片段, 在`book.addColumn(id).addColumn(name).addColumn(aid).addColumn(type);`这里,我把`name`列和`aid`列写颠倒了。更正过来,就可以进行正确的连接查询了。 44 | 45 | ## 过滤条件 46 | 47 | 在可以进行内联查询以后,我一直在对不能做过滤这点存在质疑,从关系代数的角度分析,应该是先做笛卡尔积,然后再过滤数据,那么就应该可以对数据进行过滤了,那么问题出在哪呢? 48 | 49 | 于是抱着试试看的心里,构建了一条查询`select * from "BOOK" as b where b.name = 数据山`,看看效果,果不其然,喜闻乐见... 
50 | 51 | ``` 52 | java.sql.SQLException: Error while executing SQL "select * from "BOOK" as b where b.name = 数据山": From line 1, column 42 to line 1, column 44: Column '数据山' not found in any table 53 | at org.apache.calcite.avatica.Helper.createException(Helper.java:56) 54 | at org.apache.calcite.avatica.Helper.createException(Helper.java:41) 55 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163) 56 | at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) 57 | at com.dafei1288.calcite.TestJDBC.main(TestJDBC.java:81) 58 | Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, column 42 to line 1, column 44: Column '数据山' not found in any table 59 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 60 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 61 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 62 | at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 63 | at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) 64 | at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:783) 65 | at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:768) 66 | at org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4759) 67 | at org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:259) 68 | at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5619) 69 | at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5601) 70 | at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:334) 71 | at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:134) 72 | at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:101) 73 | at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:859) 74 | at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visitScoped(SqlValidatorImpl.java:5654) 75 | at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:50) 76 | at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:33) 77 | at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138) 78 | at org.apache.calcite.sql.validate.SqlValidatorImpl.expand(SqlValidatorImpl.java:5208) 79 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateWhereClause(SqlValidatorImpl.java:3948) 80 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3276) 81 | at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) 82 | at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) 83 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:967) 84 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:943) 85 | at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:225) 86 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:918) 87 | at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:628) 88 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:552) 89 | at 
org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:264) 90 | at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:230) 91 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:772) 92 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:636) 93 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:606) 94 | at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:229) 95 | at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550) 96 | at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) 97 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) 98 | ... 2 more 99 | Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Column '数据山' not found in any table 100 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 101 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 102 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 103 | at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 104 | at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) 105 | at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572) 106 | ... 36 more 107 | ``` 108 | 109 | 看到报错一脸懵逼,`From line 1, column 42 to line 1, column 44: Column '数据山' not found in any table`这是把过滤条件当列处理了?在百思不得其解的时候,突然想起了mysql的引号问题,于是把sql修正为`select * from "BOOK" as b where b.name = '数据山'`,于是喜闻乐见变了一下 110 | 111 | ``` 112 | java.sql.SQLException: Error while executing SQL "select * from "BOOK" as b where b.name = '数据山'": while converting `B`.`NAME` = '数据山' 113 | at org.apache.calcite.avatica.Helper.createException(Helper.java:56) 114 | at org.apache.calcite.avatica.Helper.createException(Helper.java:41) 115 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163) 116 | at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) 117 | at com.dafei1288.calcite.TestJDBC.main(TestJDBC.java:81) 118 | Caused by: java.lang.RuntimeException: while converting `B`.`NAME` = '数据山' 119 | at org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:86) 120 | at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:63) 121 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4670) 122 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3977) 123 | at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138) 124 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4541) 125 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertWhere(SqlToRelConverter.java:965) 126 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:643) 127 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:621) 128 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3051) 129 | at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:557) 130 | at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:264) 131 | at 
org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:230) 132 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:772) 133 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:636) 134 | at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:606) 135 | at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:229) 136 | at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:550) 137 | at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) 138 | at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) 139 | ... 2 more 140 | Caused by: java.lang.reflect.InvocationTargetException 141 | at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 142 | at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 143 | at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 144 | at java.lang.reflect.Method.invoke(Method.java:497) 145 | at org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:83) 146 | ... 21 more 147 | Caused by: org.apache.calcite.runtime.CalciteException: Failed to encode '数据山' in character set 'ISO-8859-1' 148 | at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 149 | at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 150 | at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 151 | at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 152 | at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463) 153 | at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572) 154 | at org.apache.calcite.util.NlsString.(NlsString.java:81) 155 | at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:878) 156 | at org.apache.calcite.rex.RexBuilder.makeCharLiteral(RexBuilder.java:1093) 157 | at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertLiteral(SqlNodeToRexConverterImpl.java:118) 158 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4659) 159 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3977) 160 | at org.apache.calcite.sql.SqlLiteral.accept(SqlLiteral.java:532) 161 | at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4541) 162 | at org.apache.calcite.sql2rel.StandardConvertletTable.convertExpressionList(StandardConvertletTable.java:767) 163 | at org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:743) 164 | at org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:727) 165 | ... 
26 more 166 | ``` 167 | 对于开发者来说,有变化总是好事,说明可能找到问题点了,再仔细看看日志,果然,编码问题,搞起。于是在各种检索下,得到了答案 168 | 169 | ``` 170 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 171 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 172 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 173 | ``` 174 | 175 | 在获取连接之前,将入上述环境变量,就好了,代码片段如下: 176 | 177 | ``` 178 | public static void main(String[] args) { 179 | try { 180 | Class.forName("org.apache.calcite.jdbc.Driver"); 181 | } catch (ClassNotFoundException e1) { 182 | e1.printStackTrace(); 183 | } 184 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 185 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 186 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 187 | 188 | Properties info = new Properties(); 189 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\bookshop.json"; 190 | 191 | Connection connection = 192 | DriverManager.getConnection("jdbc:calcite:model="+jsonmodle, info); 193 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 194 | ...... 195 | } 196 | ``` 197 | 198 | # 数据类型处理 199 | 200 | 在写过滤的时候,其实就发现了,没有正确的数据类型,是走不远的,所以有必要把数据类型对应好,既然是模拟数据库,那么数据基本类型,还是使用SQL习惯的类型,而这里对java还是需要有一个映射关系。 201 | 202 | ``` 203 | package com.dafei1288.calcite.storage; 204 | 205 | import com.google.common.collect.HashBasedTable; 206 | import com.google.common.collect.Table; 207 | import org.apache.calcite.sql.type.SqlTypeName; 208 | 209 | import java.math.BigDecimal; 210 | import java.sql.Date; 211 | import java.util.Set; 212 | 213 | /** 214 | * 这里使用了GUAVA的table 作为存SQL和JAVA数据类型的数据结构 215 | * 这并不是一个好的设计,而是为了让大家更容易理解而做的设计 216 | */ 217 | public class DataTypeMapping { 218 | 219 | public static Table TYPEMAPPING= HashBasedTable.create(); 220 | public static final String CHAR = "char"; 221 | public static final String VARCHAR = "varchar"; 222 | public static final String BOOLEAN = "boolean"; 223 | public static final String DATE = "date"; 224 | public static final String INTEGER = "integer"; 225 | public static final String TINYINT = "tinyint"; 226 | public static final String SMALLINT = "smallint"; 227 | public static final String BIGINT = "bigint"; 228 | public static final String DECIMAL = "decimal"; 229 | public static final String NUMERIC = "numeric"; 230 | public static final String FLOAT = "float"; 231 | public static final String REAL = "real"; 232 | public static final String DOUBLE = "double"; 233 | public static final String TIME = "time"; 234 | public static final String TIMESTAMP = "timestamp"; 235 | public static final String ANY = "any"; 236 | static{ 237 | TYPEMAPPING.put(DataTypeMapping.CHAR,SqlTypeName.CHAR,Character.class); 238 | TYPEMAPPING.put(DataTypeMapping.VARCHAR,SqlTypeName.VARCHAR,String.class); 239 | TYPEMAPPING.put(DataTypeMapping.BOOLEAN,SqlTypeName.BOOLEAN,Boolean.class); 240 | TYPEMAPPING.put(DataTypeMapping.DATE,SqlTypeName.DATE,Date.class); 241 | TYPEMAPPING.put(DataTypeMapping.INTEGER,SqlTypeName.INTEGER,Integer.class); 242 | TYPEMAPPING.put(DataTypeMapping.TINYINT, SqlTypeName.TINYINT,Integer.class); 243 | TYPEMAPPING.put(DataTypeMapping.SMALLINT, SqlTypeName.SMALLINT,Integer.class); 244 | TYPEMAPPING.put(DataTypeMapping.BIGINT, SqlTypeName.BIGINT,Long.class); 245 | TYPEMAPPING.put(DataTypeMapping.DECIMAL, 
SqlTypeName.DECIMAL, BigDecimal.class); 246 | TYPEMAPPING.put(DataTypeMapping.NUMERIC, SqlTypeName.DECIMAL,Long.class); 247 | TYPEMAPPING.put(DataTypeMapping.FLOAT, SqlTypeName.FLOAT,Float.class); 248 | TYPEMAPPING.put(DataTypeMapping.REAL, SqlTypeName.REAL,Double.class); 249 | TYPEMAPPING.put(DataTypeMapping.DOUBLE, SqlTypeName.DOUBLE,Double.class); 250 | TYPEMAPPING.put(DataTypeMapping.TIME, SqlTypeName.TIME, Date.class); 251 | TYPEMAPPING.put(DataTypeMapping.TIMESTAMP, SqlTypeName.TIMESTAMP,Long.class); 252 | TYPEMAPPING.put(DataTypeMapping.ANY, SqlTypeName.ANY,String.class); 253 | } 254 | /** 255 | * 根据名字获取,对应的java类型 256 | * */ 257 | public static Class getJavaClassByName(String name){ 258 | Set> table = TYPEMAPPING.cellSet(); 259 | for(Table.Cell it:table){ 260 | if(it.getRowKey().equals(name)){ 261 | return it.getValue(); 262 | } 263 | } 264 | return null; 265 | } 266 | public static SqlTypeName getSqlTypeByName(String name){ 267 | for(Table.Cell it:TYPEMAPPING.cellSet()){ 268 | if(it.getRowKey().equals(name)){ 269 | return it.getColumnKey(); 270 | } 271 | } 272 | return null; 273 | } 274 | } 275 | 276 | ``` 277 | 278 | 栗子中,我使用了`guava`的`table`结构,主要还是为了大家方便只管理解,`Table`,第一个泛型代表内部定义的数据类型字符串,第二个是SQL数据类型,第三个参数是对应的JAVA类型。由于`guava`的`table`是`xy-->z`模式,但实际上我们需要的是一个`x->yz`模式,所以在下面添加了两个辅助方法:`public static Class getJavaClassByName(String name)`通过拿到的数据类型的字符串,获得java的类型,`public static SqlTypeName getSqlTypeByName(String name)`这个方法则是拿到SQL的数据类型。 279 | 280 | 笔者一直对充血模型钟爱有加,所以在`Storage`里为`DummyColumn`添加了两个方法 281 | 282 | ``` 283 | public static class DummyColumn{ 284 | private String name; 285 | private String type; 286 | 287 | public DummyColumn(String name, String type) { 288 | this.name = name; 289 | this.type = type; 290 | } 291 | 292 | public String getName() { 293 | return name; 294 | } 295 | 296 | public String getType() { 297 | return type; 298 | } 299 | 300 | public void setName(String name) { 301 | this.name = name; 302 | } 303 | 304 | public void setType(String type) { 305 | this.type = type; 306 | } 307 | 308 | //我充血模型 309 | //获取JAVA类型 310 | public Class getJavaClass(){ 311 | return DataTypeMapping.getJavaClassByName(this.type); 312 | } 313 | 314 | public SqlTypeName getSqlTypeName(){ 315 | return DataTypeMapping.getSqlTypeByName(this.type); 316 | } 317 | } 318 | ``` 319 | 320 | 而在数据初始化的时候,我们可以使用正确的数据类型了 321 | 322 | ``` 323 | DummyColumn id = new DummyColumn("ID","integer"); 324 | DummyColumn name = new DummyColumn("NAME","varchar"); 325 | DummyColumn age = new DummyColumn("AGE","integer"); 326 | DummyColumn aid = new DummyColumn("AID","integer"); 327 | DummyColumn type = new DummyColumn("TYPE","varchar"); 328 | ``` 329 | 330 | 这样数据类型的准备工作就做好了,后面我们需要将数据类型正确的注册表里,所以在`InMemoryTable`里我们原来写死的`String.class`类型,需要根据实际数据进行设置了 331 | 332 | ``` 333 | @Override 334 | public RelDataType getRowType(RelDataTypeFactory typeFactory) { 335 | // System.out.println("RelDataType !!!!!!"); 336 | if(dataType == null) { 337 | RelDataTypeFactory.FieldInfoBuilder fieldInfo = typeFactory.builder(); 338 | for (Storage.DummyColumn column : this._table.getColumns()) { 339 | RelDataType sqlType = typeFactory.createJavaType(column.getJavaClass()); //这里使用了新增的方法,原来是String.class 340 | sqlType = SqlTypeUtil.addCharsetAndCollation(sqlType, typeFactory); 341 | fieldInfo.add(column.getName(), sqlType); 342 | } 343 | this.dataType = typeFactory.createStructType(fieldInfo); 344 | } 345 | return this.dataType; 346 | } 347 | ``` 348 | 349 | 到这里我们就搞定了数据类型。 350 | 351 | # UDF 352 | 353 | 
有了基础数据类型,对于我们后面做函数处理就方便多了,现在我们以2个简单的UDF为例,让我们继续领略`calcite`的魅力, 354 | 355 | 首先我们定义两个类,一个数学类,提供一个乘方的方法 356 | 357 | ``` 358 | package com.dafei1288.calcite.function; 359 | 360 | public class MathFunction { 361 | public int square(int i){ 362 | return i*i; 363 | } 364 | } 365 | ``` 366 | 367 | 另一个字符处理类,提供一个字符串连接方法,提供一个将参数转换成字符串的方法 368 | 369 | ``` 370 | package com.dafei1288.calcite.function; 371 | 372 | public class StringFunction { 373 | public String concat(Object o1,Object o2){ 374 | return "["+o1.toString()+" , "+o2.toString()+"] => "+this.toString(); 375 | } 376 | public String parseString(Object o){ 377 | return o.toString(); 378 | } 379 | } 380 | ``` 381 | 382 | 接下来,在`InMemorySchemaFactory`里,将函数注册到数据库上, 383 | 384 | ``` 385 | package com.dafei1288.calcite; 386 | 387 | import com.dafei1288.calcite.function.MathFunction; 388 | import com.dafei1288.calcite.function.StringFunction; 389 | import org.apache.calcite.schema.Schema; 390 | import org.apache.calcite.schema.SchemaFactory; 391 | import org.apache.calcite.schema.SchemaPlus; 392 | import org.apache.calcite.schema.impl.ScalarFunctionImpl; 393 | 394 | import java.util.Map; 395 | 396 | public class InMemorySchemaFactory implements SchemaFactory { 397 | @Override 398 | public Schema create(SchemaPlus parentSchema, String name, Map operand) { 399 | 400 | 401 | System.out.println("schema name ==> "+ name); 402 | System.out.println("operand ==> "+operand); 403 | 404 | parentSchema.add("SQUARE_FUNC",ScalarFunctionImpl.create(MathFunction.class,"square")); 405 | parentSchema.add("TOSTRING_FUNC",ScalarFunctionImpl.create(StringFunction.class,"parseString")); 406 | parentSchema.add("CONCAT_FUNC",ScalarFunctionImpl.create(StringFunction.class,"concat")); 407 | 408 | return new InMemorySchema(name,operand); 409 | } 410 | } 411 | ``` 412 | 413 | `SchemaPlus parentSchema`提供了一个`void add(String name, Function function)`方法,`name`为函数名,这里`calcite`提供了一个工具类`ScalarFunction`,它可以通过`create`方法,可以将你写好的函数类和其对应的方法反射出来。 414 | 415 | 接下来我们做一个测试: 416 | ``` 417 | result = st.executeQuery("select SQUARE_FUNC(b.id),CONCAT_FUNC(b.id,b.name) from \"BOOK\" as b"); 418 | while(result.next()) { 419 | System.out.println(result.getString(1) + "\t" +result.getString(2) + "\t" ); 420 | } 421 | ``` 422 | 423 | 结果 424 | 425 | ``` 426 | 1 [1 , 数据山] => com.dafei1288.calcite.function.StringFunction@22bb5646 427 | 4 [2 , 大关] => com.dafei1288.calcite.function.StringFunction@1be59f28 428 | 9 [3 , lili] => com.dafei1288.calcite.function.StringFunction@2ce45a7b 429 | 16 [4 , ten] => com.dafei1288.calcite.function.StringFunction@153d4abb 430 | ``` 431 | 432 | # 技术总结 433 | 434 | 这期新内容代码有点少,但是在实验成功以后,就克不住自己的兴奋了,一蹴而就写了这篇文章。 435 | 436 | 后续: 437 | 1. 希望能对聚合函数做一些尝试。 438 | 2. 函数的下推,这只是一个想法,目前实现的还是相当于UDF,那么实际上数据库层应该提供了很多函数的,那么在这里,是否可以透过`calcite`将函数交给`Storage`处理... 439 | 3. 
streaming sql 440 | 441 | 代码已更新:`https://github.com/dafei1288/CalciteHelloworld.git` 442 | 443 | 文档的翻译工作感觉暂时鸽了:) 444 | 445 | -------------------------------------------------------------------------------- /helloworld.md: -------------------------------------------------------------------------------- 1 | # 前言 2 | 3 | 说不定期更新,就不定期更新:)。 4 | 5 | 在翻译[关系代数](https://github.com/dafei1288/CalciteDocTrans/blob/master/algebra.md)这篇文档的时候,总有一种惴惴不安的感觉伴随着我,其实还是对之前[概览](https://github.com/dafei1288/CalciteDocTrans/blob/master/tutorial.md)的一知半解,而DEMO项目`Calcite-example-CSV`为了介绍特性,添加了太多代码进来,这虽然很好,因为当你执行代码的时候,就能看到所有特性,但是对于一个新手来讲却未必够友好,我也是这样的一个新手,看着文档里不知所云的概念和代码片段,经常会有挫败感。那不如我们就来实实在在的完成一个`Helloworld`来查询一个表(当然这个表示我们自己定义的格式)就这么简单。来体会一下`Calcite`的魅力吧。 6 | 7 | 这里我们的目标是: 8 | 9 | 1. 数据在一个自己可控的位置,本文写在一个Java文件的静态块里 10 | 1. 可以执行一个简单查询并返回数据 11 | 12 | # model.json 13 | 14 | 我习惯gradle,所以起手构建一个空白gradle项目,添加依赖: 15 | 16 | `compile group: 'org.apache.calcite', name: 'calcite-core', version: '1.17.0'` 17 | 18 | 在`resources`下构建一个`bookshop.json`: 19 | ``` 20 | { 21 | "version": "1.0", 22 | "defaultSchema": "bookshop", 23 | "schemas": [ 24 | { 25 | "type": "custom", 26 | "name": "bookshop", 27 | "factory": "com.dafei1288.calcite.InMemorySchemaFactory", 28 | "operand": { 29 | "p1": "hello", 30 | "p2": "world" 31 | } 32 | } 33 | ] 34 | } 35 | ``` 36 | 首先给库定义一个名字:`"defaultSchema": "bookshop"` 37 | 然后描述类型`"type": "custom"`,自定义类型,其他还包括`table`,`view`等 38 | 接下来`"factory": "com.dafei1288.calcite.InMemorySchemaFactory"`相当于定义我们程序的入口,如何加载一个`schema` 39 | 40 | 在构想初期只是想实现一个简单的bookshop数据库,后面在`Storage`介绍里,也会提到,我设计了2张表,`book`和`author`。 41 | 42 | # InMemorySchemaFactory 43 | 44 | 首先让我们来看一下代码: 45 | ``` 46 | public class InMemorySchemaFactory implements SchemaFactory { 47 | @Override 48 | public Schema create(SchemaPlus parentSchema, String name, Map operand) { 49 | 50 | 51 | System.out.println("schema name ==> "+ name); 52 | System.out.println("operand ==> "+operand); 53 | 54 | return new InMemorySchema(name,operand); 55 | } 56 | } 57 | ``` 58 | 因为在`bookshop.json`里定义了属性`"factory": "com.dafei1288.calcite.InMemorySchemaFactory"`,所以`InMemorySchemaFactory`被默认加载,该类需要继承`SchemaFactory`,重写`create`方法的时候,可以根据自己需要来构建逻辑,这里我们只打印了几个参数看一眼,就略过,实例化一个`InMemorySchema`。 59 | 60 | # InMemorySchema 61 | 我们还是先把代码贴上: 62 | ``` 63 | public class InMemorySchema extends AbstractSchema { 64 | private String dbName; 65 | private Map operand; 66 | 67 | public InMemorySchema(String name, Map operand) { 68 | this.operand = operand; 69 | this.dbName = dbName; 70 | System.out.println(""); 71 | System.out.println("in this class ==> "+ this); 72 | 73 | } 74 | @Override 75 | public Map getTableMap() { 76 | 77 | Map tables = new HashMap(); 78 | 79 | Storage.getTables().forEach(it->{ 80 | //System.out.println("it = "+it.getName()); 81 | tables.put(it.getName(),new InMemoryTable(it. 
getName(),it)); 82 | }); 83 | 84 | return tables; 85 | } 86 | } 87 | ``` 88 | `InMemorySchema`类也是相当简单的,首先继承`AbstractSchema`,实际上需要复写的`getTableMap`就是这个方法,它的职责就是要提供一个表名和表的映射表,为了实现这个,我们需要做一些处理,当然本例里是使用了一个`Storage`类,来模拟存储表结构信息,以及数据的,这里的表结构以及其他信息都不需要外接再提供额外辅助,如果是使用其他类型的,就可能需要根据自己的实际需求,扩展`operand`属性,来携带必要参数进来了。 89 | 90 | `Storage`直接提供了`getTables`方法,可以直接从里面获取到当前存在的表,这样直接将`Storage`内的表转化成`InMemoryTable`类就可以了。 91 | 92 | # InMemoryTable 93 | 还是先从代码入手: 94 | ``` 95 | public class InMemoryTable extends AbstractTable implements ScannableTable { 96 | private String name; 97 | private Storage.DummyTable _table; 98 | private RelDataType dataType; 99 | 100 | InMemoryTable(String name){ 101 | System.out.println("InMemoryTable !!!!!! "+name ); 102 | this.name = name; 103 | } 104 | 105 | public InMemoryTable(String name, Storage.DummyTable it) { 106 | this.name = name; 107 | this._table = it; 108 | } 109 | 110 | @Override 111 | public RelDataType getRowType(RelDataTypeFactory typeFactory) { 112 | // System.out.println("RelDataType !!!!!!"); 113 | if(dataType == null) { 114 | RelDataTypeFactory.FieldInfoBuilder fieldInfo = typeFactory.builder(); 115 | for (Storage.DummyColumn column : this._table.getColumns()) { 116 | RelDataType sqlType = typeFactory.createJavaType( 117 | String.class); 118 | sqlType = SqlTypeUtil.addCharsetAndCollation(sqlType, typeFactory); 119 | // System.out.println(column.getName()+" / "+sqlType); 120 | fieldInfo.add(column.getName(), sqlType); 121 | } 122 | this.dataType = typeFactory.createStructType(fieldInfo); 123 | } 124 | return this.dataType; 125 | } 126 | 127 | 128 | @Override 129 | public Enumerable scan(DataContext root) { 130 | System.out.println("scan ...... "); 131 | return new AbstractEnumerable() { 132 | public Enumerator enumerator() { 133 | return new Enumerator(){ 134 | private int cur = 0; 135 | @Override 136 | public Object[] current() { 137 | // System.out.println("cur = "+cur+" => "); 138 | // for (int i =0;i<_table.getData(cur).length;i++){ 139 | // System.out.println(_table.getData(cur)[i]); 140 | // } 141 | return _table.getData(cur++); 142 | } 143 | 144 | @Override 145 | public boolean moveNext() { 146 | // System.out.println("++cur < _table.getRowCount() = "+(cur+1 < _table.getRowCount())); 147 | return cur < _table.getRowCount() ; 148 | } 149 | 150 | @Override 151 | public void reset() { 152 | 153 | } 154 | 155 | @Override 156 | public void close() { 157 | 158 | } 159 | }; 160 | } 161 | }; 162 | } 163 | } 164 | ``` 165 | 这里我保留了很多难看的`System.out`,其实也是为了展示一下我走过的弯路,在这里面,遇到奇奇怪怪的坑,由于`Calcite`的结构原因,有时出错从日志上很难发现原因,或者说很难准确断定原因,当然也许是笔者水平所限的缘故。 166 | `InMemoryTable`需要继承`AbstractTable`实现`ScannableTable`的接口,在这里`Calcite`提供了几种`Table`接口,待日后分解。这个类里,我们主要需要处理的2个方法`public RelDataType getRowType(RelDataTypeFactory typeFactory)`和`public Enumerable scan(DataContext root)`. 
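在分别解读这两个方法之前,先从"消费端"补充一个示意:`scan`返回的`Enumerable`,最终会被`Calcite`按类似下面的约定来遍历(以下只是帮助理解的示意代码,并非`Calcite`源码,`table`、`root`均为假设的变量):

```
// 示意:Enumerable/Enumerator 的遍历约定,风格上很像 JDBC 的 ResultSet
Enumerable<Object[]> rows = table.scan(root);
Enumerator<Object[]> it = rows.enumerator();
try {
    while (it.moveNext()) {          // 先移动游标,返回是否还有下一行
        Object[] row = it.current(); // 再取当前行
        System.out.println(Arrays.toString(row));
    }
} finally {
    it.close();                      // 遍历结束后释放资源
}
```

也就是说,只要`moveNext`/`current`的语义正确,数据究竟来自内存、文件还是消息队列,对`Calcite`来说都是透明的。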
167 | 168 | `getRowType`用来处理列的类型的,不要被那几句代码所迷惑,为了顺利运行,并没有针对数据的类型做什么处理,而是简单粗暴了使用了String,有兴趣的话,可以根据自己的实际情况来注册,日后有机会会详细介绍这部分。 169 | `scan`这个方法相对复杂一点,提供了全表扫面的功能,这里主要需要高速引擎,如何遍历及获取数据。其结构还是比较复杂得,为了减少本例中类的个数,避免复杂得代码结构,吓跑初学者,所以,采用了内部类嵌套的形式,含义还是比较明确的。 主要就是实现`current`和`moveNext`方法。这里还是由`Storage`提供了数据的存储功能,所以只需要遍历,获取一下数据而已,其他方法暂时不管。 170 | 171 | 写到这,其实和`Calcite`相关的代码已经完成了,整个工程的主体代码也完成了,现在只需要再介绍一下`Storage` 172 | 173 | # Storage 174 | ``` 175 | /** 176 | * 用于模拟数据库结构及数据 177 | * 178 | * author : id,name,age 179 | * book : id,aid,name,type 180 | * */ 181 | public class Storage { 182 | public static final String SCHEMA_NAME = "bookshop"; 183 | public static final String TABLE_AUTHOR = "AUTHOR"; 184 | public static final String TABLE_BOOK = "BOOK"; 185 | 186 | // public static List tables = new ArrayList<>(); 187 | public static Hashtable _bag = new Hashtable<>(); 188 | static{ 189 | DummyTable author = new DummyTable(TABLE_AUTHOR); 190 | DummyColumn id = new DummyColumn("ID","String"); 191 | DummyColumn name = new DummyColumn("NAME","String"); 192 | DummyColumn age = new DummyColumn("AGE","String"); 193 | DummyColumn aid = new DummyColumn("AID","String"); 194 | DummyColumn type = new DummyColumn("TYPE","String"); 195 | author.addColumn(id).addColumn(name).addColumn(age); 196 | author.addRow("1","jacky","33"); 197 | author.addRow("2","wang","23"); 198 | author.addRow("3","dd","32"); 199 | author.addRow("4","ma","42"); 200 | // tables.add(author); 201 | _bag.put(TABLE_AUTHOR,author); 202 | 203 | DummyTable book = new DummyTable(TABLE_BOOK); 204 | book.addColumn(id).addColumn(name).addColumn(aid).addColumn(type); 205 | book.addRow("1","1","数据山","java"); 206 | book.addRow("2","2","大关","sql"); 207 | book.addRow("3","1","lili","sql"); 208 | book.addRow("4","3","ten","c#"); 209 | // tables.add(book); 210 | _bag.put(TABLE_BOOK,book); 211 | } 212 | 213 | public static Collection getTables(){ 214 | return _bag.values(); 215 | } 216 | public static DummyTable getTable(String tableName){return _bag.get(tableName);} 217 | 218 | public static class DummyTable{ 219 | private String name; 220 | private List columns; 221 | private List> datas = new ArrayList<>(); 222 | DummyTable(String name){ 223 | this.name = name; 224 | } 225 | 226 | public String getName(){ 227 | return this.name; 228 | } 229 | 230 | public List getColumns() { 231 | return columns; 232 | } 233 | 234 | public DummyTable addColumn(DummyColumn dc){ 235 | if(this.columns == null){ 236 | this.columns = new ArrayList<>(); 237 | } 238 | this.columns.add(dc); 239 | return this; 240 | } 241 | 242 | public void setColumns(List columns) { 243 | this.columns = columns; 244 | } 245 | 246 | public Object[] getData(int index){ 247 | return this.datas.get(index).toArray(); 248 | } 249 | 250 | public int getRowCount(){ 251 | return this.datas.size(); 252 | } 253 | 254 | public void addRow(Object...objects){ 255 | this.datas.add(Arrays.asList(objects)); 256 | } 257 | 258 | 259 | } 260 | 261 | public static class DummyColumn{ 262 | private String name; 263 | private String type; 264 | 265 | public DummyColumn(String name, String type) { 266 | this.name = name; 267 | this.type = type; 268 | } 269 | 270 | public String getName() { 271 | return name; 272 | } 273 | 274 | public String getType() { 275 | return type; 276 | } 277 | 278 | public void setName(String name) { 279 | this.name = name; 280 | } 281 | 282 | public void setType(String type) { 283 | this.type = type; 284 | } 285 | } 286 | 287 | } 288 | ``` 289 | 
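为了更直观地感受`Storage`的用法,这里再给出一小段遍历示意(仅为示意代码,数据已经在上面的静态块里初始化好了):

```
// 示意:遍历 Storage 中注册的所有表及其数据
for (Storage.DummyTable t : Storage.getTables()) {
    System.out.println("table: " + t.getName());
    for (int i = 0; i < t.getRowCount(); i++) {
        System.out.println(Arrays.toString(t.getData(i)));
    }
}
```

`InMemorySchema`和`InMemoryTable`所做的事情,本质上就是把这套遍历逻辑包装成`Calcite`认识的接口。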
这里我们用了一个简单的结构来模拟了存储,`Storage`下面包含`DummyTable`,`DummyTable`包含`DummyColumn`,用于存放元数据信息,而数据则包含在一个`List>`里,各类都提供基础的`getter`和`setter`方法,数据初始化则写在静态块里。 290 | 291 | # 测试 292 | 写个main方法测试一下: 293 | ``` 294 | public static void main(String[] args) { 295 | try { 296 | Class.forName("org.apache.calcite.jdbc.Driver"); 297 | } catch (ClassNotFoundException e1) { 298 | e1.printStackTrace(); 299 | } 300 | 301 | Properties info = new Properties(); 302 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\bookshop.json"; 303 | try { 304 | Connection connection = 305 | DriverManager.getConnection("jdbc:calcite:model="+jsonmodle, info); 306 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 307 | 308 | ResultSet result = connection.getMetaData().getTables(null, null, null, null); 309 | while(result.next()) { 310 | System.out.println("Catalog : " + result.getString(1) + ",Database : " + result.getString(2) + ",Table : " + result.getString(3)); 311 | } 312 | result.close(); 313 | Statement st = connection.createStatement(); 314 | result = st.executeQuery("select * from book as b"); 315 | while(result.next()) { 316 | System.out.println(result.getString(1) + "\t" + result.getString(2) + "\t" + result.getString(3)); 317 | } 318 | result.close(); 319 | //connection.close(); 320 | st = connection.createStatement(); 321 | result = st.executeQuery("select a.name from author as a"); 322 | while(result.next()) { 323 | System.out.println(result.getString(1)); 324 | } 325 | result.close(); 326 | connection.close(); 327 | }catch(Exception e){ 328 | e.printStackTrace(); 329 | } 330 | } 331 | ``` 332 | 333 | # 技术总结 334 | 1. `Calcite`能提供一个透明的JDBC实现,使用者可以按自己的方式规划存储,这个特性在数据分析中,其实更适合,比如在多源、跨源联合查询上,威力巨大。 335 | 2. 按接口实现相关`schema`和`table`,目前只实现了流程上跑通,单不代表他们就是这样,在这里我们还有很长的路要走 336 | 3. 
自定义视图配上model上配置的参数,也许可以作为数据权限一种实现 337 | 338 | 339 | # 后记 340 | 341 | 上述项目代码库传送门:`https://github.com/dafei1288/CalciteHelloworld.git` 342 | 343 | 目前只提供了全表扫面,条件判断表连接都还不行,待日后更新。 344 | 而`Calcite`强大的优化工作还没登场呢。 -------------------------------------------------------------------------------- /images/calcitea.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/calcitea.png -------------------------------------------------------------------------------- /images/kafkaa.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/kafkaa.png -------------------------------------------------------------------------------- /images/kafkainstall.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/kafkainstall.png -------------------------------------------------------------------------------- /images/window-types.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/window-types.png -------------------------------------------------------------------------------- /images/zkinstall.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dafei1288/CalciteDocTrans/354f4abefcdd35077ea7c1fdae15405ad87eb8c0/images/zkinstall.png -------------------------------------------------------------------------------- /streaming.1.md: -------------------------------------------------------------------------------- 1 | # 概述 2 | 3 | 在前面两篇中介绍了 [存储](https://github.com/dafei1288/CalciteDocTrans/blob/master/helloworld.md) 和 [UDF](https://github.com/dafei1288/CalciteDocTrans/blob/master/function.md),然后就开始着手准备streaming了,开始走了些弯路,本以为需要构建起一个简单的流系统,才能写`streaming sql`呢,所以跑去看来几天的flink,然后再仔细研究了calcite的源码后发现,其实并不用那么麻烦,所以这个系列又能继续了。 4 | 5 | 现在,我打算用2-3章来说说streaming。 6 | 7 | 首先streaming是对表的一种补充,因为他代表着当前和未来的情况,而表则代表着过去。流是连续,流动的记录的集合,与表不同,流通常不存储再磁盘上,而是再网络上流动,在内存中保留的时间也很短。 8 | 9 | 但是与表类似,业务上也通常希望以基于关系代数的高级语言查询流,根据模式进行验证,并优化以利用可用的资源和算法。 10 | 11 | `Calcite`的`Streaming SQL`是标准SQL的扩展,而不是另一种`SQL like`的语言。主要原因如下(翻译自calcite官方文档: 12 | 13 | - 对于任何了解标准SQL的人来说,流式SQL都很容易学习。 14 | - 语义清晰,无论使用表或是流,都可以返回相同的数据。 15 | - 可以编写结合流和表的查询(或者流的历史记录,它基本上是内存中的表)。 16 | - 许多现有的工具可以生成标准SQL。 17 | - 如果不使用stream关键字,则返回常规标准SQL。 18 | 19 | 介绍了一下基本概念,关于流,还由一点是必须说的,就是窗口 20 | 21 | ![架构图](./images/window-types.png) 22 | 23 | - tumbling window (GROUP BY) 24 | - hopping window (multi GROUP BY) 25 | - sliding window (window functions) 26 | - cascading window (window functions) 27 | 28 | 对于窗口和时间的一些理解,也可以看看,我的另外一篇文章《再谈Flink》 29 | 30 | 31 | # 案例 32 | 33 | 好了,基础先说到这,下面来看看代码吧,这次其实非常简单,就可以完成`streaming`了,再一次强调,`calcite`的`streaming sql`和`flink`及`spark`的支持不同,不是api级别上的,而是支持`stream`关键字来支持流 34 | 35 | 我们已经有了前面工程的积累,这样代码量非常小的改动就可以完成了。 36 | 37 | 38 | ## bookshopStream.json 39 | 40 | 首先,我们重新定义一个模型文件,取名`bookshopStream.json` 41 | 42 | ``` 43 | { 44 | "version": "1.0", 45 | "defaultSchema": "bookshopstream", 46 | "schemas": [ 47 | { 48 | "name": "bookshopstream", 49 | "tables": [ 50 | { 51 | "name": "BOOK", 52 | "type": "custom", 53 | "factory": 
"com.dafei1288.calcite.stream.InMemoryStreamTableFactory", 54 | "stream": { 55 | "stream": true 56 | }, 57 | "operand": { 58 | "p1": "hello", 59 | "p2": "world" 60 | } 61 | } 62 | ] 63 | } 64 | ] 65 | } 66 | ``` 67 | 68 | 这里我们对`schema`并没有过多的设置,而是直接对`tables`属性进行了设置,将`factory`指定为`com.dafei1288.calcite.stream.InMemoryStreamTableFactory`,这类后续在细讲。这里我们将表名定义为`BOOK`,意在后续使用之前案例的`Storage`。 69 | 70 | ## InMemoryStreamTableFactory 71 | 72 | ``` 73 | public class InMemoryStreamTableFactory implements TableFactory { 74 | @Override 75 | public Table create(SchemaPlus schema, String name, Map operand, RelDataType rowType) { 76 | System.out.println(operand); 77 | System.out.println(name); 78 | return new InMemoryStreamTable(name, Storage.getTable(name)); 79 | } 80 | } 81 | ``` 82 | 83 | 因为在模型里,直接指定了`TableFactory`,这个类的职责就是构建`Table`表对象,其职责,有点类似之前案例里的`InMemorySchema`类的`public Map getTableMap()`方法。前文描述了过,指定了`"name": "BOOK"`,所以,在这里代码执行的结果就是加载了`BOOK`表。 84 | 85 | ## InMemoryStreamTable 86 | 87 | ``` 88 | public class InMemoryStreamTable extends InMemoryTable implements StreamableTable { 89 | public InMemoryStreamTable(String name, Storage.DummyTable it) { 90 | super(name, it); 91 | } 92 | 93 | @Override 94 | public Table stream() { 95 | System.out.println("streaming ....."); 96 | return this; 97 | } 98 | } 99 | ``` 100 | 101 | 这里,为了能复用之前的存储逻辑,所以直接继承了`InMemoryTable`,所以,这个实现,其实底层并不是一个彻底的`streaming`实现,而是和之前案例一直的内存实现,但是这样就可以通过stream关键字,来进行sql查询了。 102 | 103 | ## 测试 104 | 105 | ``` 106 | public class TestStreamJDBC { 107 | public static void main(String[] args) { 108 | try { 109 | Class.forName("org.apache.calcite.jdbc.Driver"); 110 | } catch (ClassNotFoundException e1) { 111 | e1.printStackTrace(); 112 | } 113 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 114 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 115 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 116 | 117 | Properties info = new Properties(); 118 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\bookshopStream.json"; 119 | try { 120 | Connection connection = 121 | DriverManager.getConnection("jdbc:calcite:model=" + jsonmodle, info); 122 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 123 | 124 | ResultSet result = null; 125 | 126 | Statement st = connection.createStatement(); 127 | 128 | st = connection.createStatement(); 129 | //where b.name = '数据山' 130 | result = st.executeQuery("select stream * from BOOK as b "); 131 | while(result.next()) { 132 | System.out.println(result.getString(1)+" \t "+result.getString(2)+" \t "+result.getString(3)+" \t "+result.getString(4)); 133 | } 134 | result.close(); 135 | }catch(Exception e){ 136 | e.printStackTrace(); 137 | } 138 | 139 | } 140 | } 141 | ``` 142 | 143 | `select stream * from BOOK as b`这里撰写了一个简单的SQL,并使用了`stream`关键字,结果如下。 144 | 145 | ``` 146 | {p1=hello, p2=world, modelUri=E:\working\others\写作\calcitetutorial\src\main\resources\bookshopStream.json, baseDirectory=E:\working\others\写作\calcitetutorial\src\main\resources} 147 | BOOK 148 | streaming ..... 149 | scan ...... 
150 | 1 1 数据山 java 151 | 2 2 大关 sql 152 | 3 1 lili sql 153 | 4 3 ten c# 154 | ``` 155 | 156 | 那么对于一个非stream表,使用stream关键字,会怎么样呢?那么我们会得到一个异常 157 | 158 | > ERROR: Cannot convert table 'xxx' to a stream 159 | 160 | 161 | # 结尾 162 | 163 | 目前只是完成了最基础的查询,代码已提交到demo仓库 164 | 165 | TBD -------------------------------------------------------------------------------- /streaming.2.md: -------------------------------------------------------------------------------- 1 | # 概述 2 | 3 | 在上一篇文章中介绍了,如何在`select`语句中使用`stream`关键字,进行`流查询`,并且模拟了简单数据结构,有兴趣的同学可以移步去看看( [streaming上篇](https://github.com/dafei1288/CalciteDocTrans/blob/master/streaming.1.md))。本文将会继续扩展这个案例,把`calcite`和`kafka`联合起来,将`kafka`作为数据提供者,并进行`SQL`查询。 4 | 5 | # 什么是 kafka 6 | 7 | `kafka` 是一个分布式消息队列。具有高性能、持久化、多副本备份、横向扩展能力。生产者往队列里写消息,消费者从队列里取消息进行业务逻辑。一般在架构设计中起到解耦、削峰、异步处理的作用。 8 | `kafka`对外使用`topic`的概念,生产者往`topic`里写消息,消费者从读消息。为了做到水平扩展,一个`topic`实际是由多个`partition`组成的,遇到瓶颈时,可以通过增加`partition`的数量来进行横向扩容。单个`parition`内是保证消息有序。 9 | 每新写一条消息,`kafka`就是在对应的文件`append写`,所以性能非常高。 10 | `kafka`的总体数据流是这样的: 11 | 12 | ![zk](./images/kafkaa.png) 13 | 14 | 大概用法就是,`Producers`往`Brokers`里面的指定`Topic`中写消息,`Consumers`从`Brokers`里面拉去指定`Topic`的消息,然后进行业务处理。 15 | 16 | `以上内容这部分引用自:https://www.jianshu.com/p/d3e963ff8b70 ` 17 | 18 | 至于什么是`zookeeper`?有兴趣的读者自行搜索吧,这里就不过多介绍了... 19 | 20 | 21 | # kafka 环境搭建 22 | 23 | 本章以`windows`环境下搭建`kafka`环境为例,如果您已经熟悉这部分内容,可以跳过这个章节。搭建测试的方法有很多,这里我们使用一种较为便捷且成功率较高的方式。 24 | 25 | ## zookeeper 环境搭建 26 | 27 | - 下载并解压zookeeper `http://zookeeper.apache.org/releases.html#download` 28 | - 进入解压后的文件夹的`conf目录`,复制`zoo_sample.cfg`重命名成`zoo.cfg` 29 | - 编辑`zoo.cfg`文件,修改`dataDir`为`dataDir=$zookeeper解压路径\data`,这个路径可自行配置,只要有权限写入即可 30 | - 添加环境变量`ZOOKEEPER_HOME`,指向`zookeeper解压路径` 31 | - 在`PATH`变量里添加`ZOOKEEPER_HOME\bin` 32 | - 新建一个命令行,执行`zkServer` 33 | 34 | ![zk](./images/zkinstall.png) 35 | 36 | ## kafka 环境搭建 37 | 38 | - 下载并解压kafka `http://kafka.apache.org/downloads` , 下载的时候,注意`scala`版本,后续开发,可能会有影响 39 | - 进入解压后的文件夹的`config目录` 40 | - 编辑`server.properties`文件,修改`log.dirs=$kafka解压路径\kafka-logs`,这个路径可自行配置,只要有权限写入即可 41 | - 在`kafka解压路径`执行`.\bin\windows\kafka-server-start.bat .\config\server.properties`,建议将此命令,保存为`start.cmd`存放在该路径下,以便日后使用 42 | 43 | ![kafka](./images/kafkainstall.png) 44 | 45 | # kafka 环境测试 46 | 47 | 我们已经搭建起来了一个简单的`kafka`环境,接下来我们需要测试一下环境 48 | 49 | 首先,在之前的工程里加入`kafka`的依赖 50 | 51 | ``` 52 | compile group: 'org.apache.kafka', name: 'kafka_2.12', version: '2.1.0' 53 | compile group: 'org.apache.kafka', name: 'kafka-clients', version: '2.1.0' 54 | compile group: 'org.apache.kafka', name: 'kafka-streams', version: '2.1.0' 55 | ``` 56 | 57 | 然后来创建主题 58 | 59 | ## 创建 topic 60 | 61 | ``` 62 | package com.dafei1288.calcite.stream.kafka; 63 | 64 | import org.apache.kafka.clients.admin.AdminClient; 65 | import org.apache.kafka.clients.admin.CreateTopicsResult; 66 | import org.apache.kafka.clients.admin.NewTopic; 67 | 68 | import java.util.ArrayList; 69 | import java.util.Properties; 70 | import java.util.concurrent.ExecutionException; 71 | 72 | public class CreateTopic { 73 | public static void main(String[] args) { 74 | //创建topic 75 | Properties props = new Properties(); 76 | props.put("bootstrap.servers", "localhost:2181"); 77 | AdminClient adminClient = AdminClient.create(props); 78 | ArrayList topics = new ArrayList(); 79 | NewTopic newTopic = new NewTopic("calcitekafka", 1, (short) 1); 80 | topics.add(newTopic); 81 | CreateTopicsResult result = adminClient.createTopics(topics); 82 | try { 83 | result.all().get(); 84 | } catch (InterruptedException e) { 85 | e.printStackTrace(); 86 | } 
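            // 注:上面把 bootstrap.servers 配成了 localhost:2181(这是 ZooKeeper 的端口),
            // 而 AdminClient 需要的是 Kafka broker 地址(按本文环境应为 localhost:9092,与后面的 Producter/Consumer 保持一致),
            // 否则 result.all().get() 很可能在这里超时,并以 ExecutionException 的形式抛出。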
catch (ExecutionException e) { 87 | e.printStackTrace(); 88 | } 89 | } 90 | } 91 | 92 | ``` 93 | 94 | 创建`topic`以后,我们来构建一个基础的生产者`producter`。 95 | 96 | ## 创建 producter 97 | 98 | ``` 99 | package com.dafei1288.calcite.stream.kafka; 100 | 101 | import org.apache.kafka.clients.producer.KafkaProducer; 102 | import org.apache.kafka.clients.producer.ProducerRecord; 103 | 104 | import java.util.Properties; 105 | import java.util.Random; 106 | 107 | public class Producter { 108 | private static KafkaProducer producer; 109 | //刚才构建的topic 110 | private final static String TOPIC = "calcitekafka"; 111 | public Producter(){ 112 | Properties props = new Properties(); 113 | props.put("bootstrap.servers", "localhost:9092"); 114 | props.put("acks", "all"); 115 | props.put("retries", 0); 116 | props.put("batch.size", 16384); 117 | props.put("linger.ms", 1); 118 | props.put("buffer.memory", 33554432); 119 | props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer"); 120 | props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); 121 | //设置分区类,根据key进行数据分区 122 | producer = new KafkaProducer(props); 123 | } 124 | public void produce(){ 125 | int i = 0; 126 | Random r = new Random(); 127 | for(;;){ 128 | //每一秒创建一个随机的布尔值 129 | producer.send(new ProducerRecord(TOPIC,i+++"",r.nextBoolean()+"" )); 130 | try { 131 | Thread.sleep(1000); 132 | } catch (InterruptedException e) { 133 | e.printStackTrace(); 134 | } 135 | } 136 | // producer.close(); 137 | } 138 | 139 | public static void main(String[] args) { 140 | new Producter().produce(); 141 | } 142 | } 143 | ``` 144 | 145 | 由于没有正式的业务场景,我们进行一个简单的模拟,每秒生成一个随机的布尔值,一直循环下去,有了生产者,下面我们来构建一个消费者。 146 | 147 | ## 创建 consumer 148 | 149 | ``` 150 | package com.dafei1288.calcite.stream.kafka; 151 | 152 | import org.apache.kafka.clients.consumer.ConsumerRecord; 153 | import org.apache.kafka.clients.consumer.ConsumerRecords; 154 | import org.apache.kafka.clients.consumer.KafkaConsumer; 155 | 156 | import java.util.Arrays; 157 | import java.util.Properties; 158 | 159 | public class Consumer { 160 | private static KafkaConsumer consumer; 161 | private final static String TOPIC = "calcitekafka"; 162 | public Consumer(){ 163 | Properties props = new Properties(); 164 | props.put("bootstrap.servers", "localhost:9092"); 165 | //每个消费者分配独立的组号 166 | props.put("group.id", "test2"); 167 | //如果value合法,则自动提交偏移量 168 | props.put("enable.auto.commit", "true"); 169 | //设置多久一次更新被消费消息的偏移量 170 | props.put("auto.commit.interval.ms", "1000"); 171 | //设置会话响应的时间,超过这个时间kafka可以选择放弃消费或者消费下一条消息 172 | props.put("session.timeout.ms", "30000"); 173 | //自动重置offset 174 | props.put("auto.offset.reset","earliest"); 175 | props.put("key.deserializer", 176 | "org.apache.kafka.common.serialization.StringDeserializer"); 177 | props.put("value.deserializer", 178 | "org.apache.kafka.common.serialization.StringDeserializer"); 179 | consumer = new KafkaConsumer(props); 180 | } 181 | 182 | public void consume(){ 183 | consumer.subscribe(Arrays.asList(TOPIC)); 184 | while (true) { 185 | ConsumerRecords records = consumer.poll(100); 186 | for (ConsumerRecord record : records){ 187 | System.out.printf("offset = %d, key = %s, value = %s",record.offset(), record.key(), record.value()); 188 | System.out.println(); 189 | } 190 | } 191 | } 192 | 193 | public static void main(String[] args) { 194 | new Consumer().consume(); 195 | } 196 | } 197 | ``` 198 | 199 | 这里就是简单的将数据在控制台,进行一下输出,片段如下: 200 | 201 | ``` 202 | offset = 328, key = 0, value = false 203 | offset = 329, key = 1, value 
= false 204 | offset = 330, key = 2, value = true 205 | offset = 331, key = 3, value = true 206 | offset = 332, key = 4, value = false 207 | offset = 333, key = 5, value = false 208 | offset = 334, key = 6, value = true 209 | offset = 335, key = 7, value = true 210 | offset = 336, key = 8, value = false 211 | offset = 337, key = 9, value = true 212 | offset = 338, key = 10, value = true 213 | offset = 339, key = 11, value = true 214 | ``` 215 | 216 | 这样就说明之前我们搭建的`kafka`环境成功了,下面我们来和`calcite`进行整合,代替前文案例中,我们自己撰写的`storage` 217 | 218 | # calcite 整合 kafka 219 | 220 | 我们这次的目的是取代之前使用`java`文件来存储的数据,而是使用`kafka`作为数据的提供者,首先我们需要重新构建一个`schema`文件 221 | 222 | ## 创建 kafkaStream.json 223 | 224 | ``` 225 | { 226 | "version": "1.0", 227 | "defaultSchema": "bookshopstream", 228 | "schemas": [ 229 | { 230 | "name": "bookshopstream", 231 | "tables": [ 232 | { 233 | "name": "KF", 234 | "type": "custom", 235 | "factory": "com.dafei1288.calcite.stream.kafka.KafkaStreamTableFactory", 236 | "stream": { 237 | "stream": true 238 | }, 239 | "operand": { 240 | "topic": "calcitekafka", 241 | "bootstrap.servers": "localhost:9092", 242 | "group.id": "test2", 243 | "enable.auto.commit": "true", 244 | "auto.commit.interval.ms": "1000", 245 | "session.timeout.ms": "30000", 246 | "auto.offset.reset":"earliest", 247 | "key.deserializer": "org.apache.kafka.common.serialization.StringDeserializer", 248 | "value.deserializer": "org.apache.kafka.common.serialization.StringDeserializer", 249 | "colnames": "KK,VV", 250 | "timeouts": "2000" 251 | } 252 | } 253 | ] 254 | } 255 | ] 256 | } 257 | ``` 258 | 259 | 在这里,我们重新构建了一个`factory`,它是`com.dafei1288.calcite.stream.kafka.KafkaStreamTableFactory`,这个类的具体内容,我们下面会详细介绍。 260 | 261 | __请注意,在`operand`里的配置,我们加入了一系列配置,这是从通用性考虑,我们将`kafka`以及其他必要配置全部写在了这里面。__ 262 | 263 | 接下来,我们看一下`com.dafei1288.calcite.stream.kafka.KafkaStreamTableFactory`,做了一些什么? 
264 | 265 | ## KafkaStreamTableFactory 266 | 267 | ``` 268 | package com.dafei1288.calcite.stream.kafka; 269 | 270 | import org.apache.calcite.rel.type.RelDataType; 271 | import org.apache.calcite.schema.SchemaPlus; 272 | import org.apache.calcite.schema.Table; 273 | import org.apache.calcite.schema.TableFactory; 274 | 275 | import java.util.Map; 276 | 277 | public class KafkaStreamTableFactory implements TableFactory { 278 | @Override 279 | public Table create(SchemaPlus schema, String name, Map operand, RelDataType rowType) { 280 | System.out.println(operand); 281 | System.out.println(name); 282 | return new KafkaStreamTable(name,operand); 283 | } 284 | } 285 | 286 | ``` 287 | 288 | 这个类,和之前的类职责基本相同,代码也几乎一致,只是在返回的时候,变成了`return new KafkaStreamTable(name,operand);`,这里我们将operand直接作为参数下发到了,`Table`类的实现里,这里是为了提高`Table`的灵活性,将职责下发。而不是像之前案例那样。 289 | 290 | ## KafkaStreamTable 291 | 292 | `KafkaStreamTable`这个类,是这个案例最有意思的部分,我们先来看一下代码 293 | 294 | ``` 295 | package com.dafei1288.calcite.stream.kafka; 296 | 297 | import org.apache.calcite.DataContext; 298 | import org.apache.calcite.linq4j.AbstractEnumerable; 299 | import org.apache.calcite.linq4j.Enumerable; 300 | import org.apache.calcite.linq4j.Enumerator; 301 | import org.apache.calcite.rel.type.RelDataType; 302 | import org.apache.calcite.rel.type.RelDataTypeFactory; 303 | import org.apache.calcite.schema.ScannableTable; 304 | import org.apache.calcite.schema.StreamableTable; 305 | import org.apache.calcite.schema.Table; 306 | import org.apache.calcite.schema.impl.AbstractTable; 307 | import org.apache.calcite.sql.type.SqlTypeUtil; 308 | import org.apache.kafka.clients.consumer.ConsumerRecord; 309 | import org.apache.kafka.clients.consumer.ConsumerRecords; 310 | import org.apache.kafka.clients.consumer.KafkaConsumer; 311 | 312 | import java.util.Arrays; 313 | import java.util.Iterator; 314 | import java.util.Map; 315 | 316 | 317 | public class KafkaStreamTable extends AbstractTable implements StreamableTable, ScannableTable { 318 | 319 | @Override 320 | public Table stream() { 321 | return this; 322 | } 323 | 324 | private String name; 325 | private RelDataType dataType; 326 | private Map operand; 327 | private static KafkaConsumer consumer; 328 | 329 | public KafkaStreamTable(String name){ 330 | System.out.println("KafkaStreamTable !!!!!! "+name ); 331 | this.name = name; 332 | } 333 | 334 | public KafkaStreamTable(String name, Map operand) { 335 | System.out.println("KafkaStreamTable !!!!!! "+name +" , "+operand); 336 | this.name = name; 337 | this.operand = operand; 338 | 339 | 340 | } 341 | 342 | @Override 343 | public RelDataType getRowType(RelDataTypeFactory typeFactory) { 344 | // System.out.println("RelDataType !!!!!!"); 345 | if(dataType == null) { 346 | RelDataTypeFactory.FieldInfoBuilder fieldInfo = typeFactory.builder(); 347 | //我们需要存储stream table的元数据信息,为了案例,我写在了kafkaStream.json文件里配置信息里colnames 348 | for (String col : operand.get("colnames").toString().split(",")) { 349 | RelDataType sqlType = typeFactory.createJavaType(String.class); 350 | sqlType = SqlTypeUtil.addCharsetAndCollation(sqlType, typeFactory); 351 | fieldInfo.add(col, sqlType); 352 | } 353 | this.dataType = typeFactory.createStructType(fieldInfo); 354 | } 355 | return this.dataType; 356 | } 357 | 358 | 359 | @Override 360 | public Enumerable scan(DataContext root) { 361 | System.out.println("scan ...... 
"); 362 | consumer = new KafkaConsumer(operand); 363 | consumer.subscribe(Arrays.asList(operand.get("topic").toString())); 364 | 365 | return new AbstractEnumerable() { 366 | 367 | public Enumerator enumerator() { 368 | return new Enumerator(){ 369 | //因为,刚才的producter里面,数据是每秒产生的,如果这里值太下,则会出现取不出值的可能 370 | ConsumerRecords records = consumer.poll(Integer.parseInt(operand.get("timeouts").toString())); 371 | Iterator it =records.iterator(); 372 | private int cur = 0; 373 | @Override 374 | public Object[] current() { 375 | ConsumerRecord reco = (ConsumerRecord) it.next(); 376 | return new String[]{reco.key(),reco.value()}; 377 | } 378 | 379 | @Override 380 | public boolean moveNext() { 381 | //ConsumerRecord record : records 382 | return it.hasNext(); 383 | } 384 | 385 | @Override 386 | public void reset() { 387 | 388 | } 389 | 390 | @Override 391 | public void close() { 392 | consumer.close(); 393 | } 394 | }; 395 | } 396 | }; 397 | } 398 | } 399 | ``` 400 | 401 | 这个类的职责与之前的`InMemoryTable`类似,即提供数据如何遍历,如何转化数据类型。 402 | 403 | 前文提及将何定义一个`streaming`的职责下发到此类里,这是为了提高了灵活性,即如果不使用`kafka`提供数据,想使用其他的`streaming`工具来构造数据,也会变得相对简单一些。 404 | 405 | 在`public RelDataType getRowType(RelDataTypeFactory typeFactory)`这个方法里,我们需要对流里的数据,提供元数据的类型映射,前文提到过,我是把元数据,放在了`kafkaStream.json`文件里的`operand`节中的`colnames`属性里,这里,`producter`的数据提供,只有一个`key`和一个`boolean`值,所以我们只创建了两列`KK`和`VV`。而为了演示,我们也粗暴的将数据类型,定义为`string`类型。 406 | 407 | 接下来,我们将在`public Enumerable scan(DataContext root)`方法里,订阅`kafka`的主题,并消费其发射来的数据。由于我们的生产者是每秒产生一次数据,所以在`consumer.poll(Integer.parseInt(operand.get("timeouts").toString()));`这里,我们不能把时间设置的太小,否则会出现取不出数据的情况,我们可以通过在`operand`里加入类似参数`"max.poll.records": 20,`来控制每页数据量。 408 | 409 | 到这里,我们的基础工作完成了,下面来测试一下 410 | 411 | ## 测试 412 | 413 | ``` 414 | package com.dafei1288.calcite.stream.kafka; 415 | 416 | import org.apache.calcite.jdbc.CalciteConnection; 417 | import org.apache.calcite.util.ConversionUtil; 418 | 419 | import java.sql.Connection; 420 | import java.sql.DriverManager; 421 | import java.sql.ResultSet; 422 | import java.sql.Statement; 423 | import java.util.Properties; 424 | 425 | public class TestKafkaStreamJDBC { 426 | public static void main(String[] args) { 427 | try { 428 | Class.forName("org.apache.calcite.jdbc.Driver"); 429 | } catch (ClassNotFoundException e1) { 430 | e1.printStackTrace(); 431 | } 432 | System.setProperty("saffron.default.charset", ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 433 | System.setProperty("saffron.default.nationalcharset",ConversionUtil.NATIVE_UTF16_CHARSET_NAME); 434 | System.setProperty("saffron.default.collation.name",ConversionUtil.NATIVE_UTF16_CHARSET_NAME + "$en_US"); 435 | 436 | Properties info = new Properties(); 437 | String jsonmodle = "E:\\working\\others\\写作\\calcitetutorial\\src\\main\\resources\\kafkaStream.json"; 438 | try { 439 | Connection connection = 440 | DriverManager.getConnection("jdbc:calcite:model=" + jsonmodle, info); 441 | CalciteConnection calciteConn = connection.unwrap(CalciteConnection.class); 442 | 443 | ResultSet result = null; 444 | 445 | Statement st = connection.createStatement(); 446 | 447 | st = connection.createStatement(); 448 | //where b.name = '数据山' 449 | result = st.executeQuery("select stream kf.kk,kf.vv from KF as kf "); 450 | while(result.next()) { 451 | System.out.println(result.getString(1)+" \t "+result.getString(2)); 452 | } 453 | 454 | result.close(); 455 | }catch(Exception e){ 456 | e.printStackTrace(); 457 | } 458 | 459 | } 460 | } 461 | 462 | ``` 463 | 464 | 可以看到我们的测试语句 `select stream kf.kk,kf.vv from KF as kf`,结果如下 465 | 466 | ``` 
467 | {topic=calcitekafka, bootstrap.servers=localhost:9092, group.id=test2, enable.auto.commit=true, auto.commit.interval.ms=1000, session.timeout.ms=30000, auto.offset.reset=earliest, key.deserializer=org.apache.kafka.common.serialization.StringDeserializer, value.deserializer=org.apache.kafka.common.serialization.StringDeserializer, colnames=key,value, timeouts=2000, modelUri=E:\working\others\写作\calcitetutorial\src\main\resources\kafkaStream.json, baseDirectory=E:\working\others\写作\calcitetutorial\src\main\resources} 468 | KF 469 | KafkaStreamTable !!!!!! KF , {topic=calcitekafka, bootstrap.servers=localhost:9092, group.id=test2, enable.auto.commit=true, auto.commit.interval.ms=1000, session.timeout.ms=30000, auto.offset.reset=earliest, key.deserializer=org.apache.kafka.common.serialization.StringDeserializer, value.deserializer=org.apache.kafka.common.serialization.StringDeserializer, colnames=key,value, timeouts=2000, modelUri=E:\working\others\写作\calcitetutorial\src\main\resources\kafkaStream.json, baseDirectory=E:\working\others\写作\calcitetutorial\src\main\resources} 470 | scan ...... 471 | 283 false 472 | 284 false 473 | 285 false 474 | 286 true 475 | 287 true 476 | 288 true 477 | 289 false 478 | 290 false 479 | 291 false 480 | 292 true 481 | 293 false 482 | 294 true 483 | 295 false 484 | 296 true 485 | 297 true 486 | ``` 487 | 488 | 到这,基础整合完成了。 489 | 490 | # 结尾 491 | 492 | 当前案例仅完成了初步整合,后续会继续扩展这个案例,例如时间窗滑动等,敬请期待 493 | 494 | TBD -------------------------------------------------------------------------------- /tutorial.md: -------------------------------------------------------------------------------- 1 | # 前言 2 | 3 | Apache Calcite 是独立于存储与执行的SQL解析、优化引擎,广泛应用于各种离线、搜索、实时查询引擎,如Drill、Hive、Kylin、Solr、flink、Samza等。 4 | 5 | ![架构图](./images/calcitea.png) 6 | 7 | 偶然的机会了解到这个项目,然后就深深的为之着迷了,很感慨为什么没能早几年遇到她。也是为了更加了解她,光读文档不过瘾了,所以想动手翻译一下。 但是本人英文水平有限,又是第一次干这种事,所以欢迎大家帮我勘正谬误。 [联系我:dafei1288@sina.com](mailto:dafei1288@sina.com) 欢迎转载,请注明出处。 8 | 9 | 我们先从引导文件开始:[原文链接](http://calcite.apache.org/docs/tutorial.html) 10 | 11 | 12 | 13 | # 正文 14 | 15 | 这是一个手把手式文档,教你如何构建并且连接到`Calcite`。我们用一个简单的适配器来将一个包含[CSV](https://en.wikipedia.org/wiki/Comma-separated_values)文件的目录变成一个包含数据表的数据库(原文描述为`schema`)。`Calcite`可以提供一个完整的SQL接口。 16 | 17 | `Calcite-example-CSV`是一个全功能适配器来使得`Calcite`可以读取`CSV`格式文件。可以通过几百行代码就能够完成一个全SQL查询功能。 18 | 19 | `CSV`适配器可以作为抛砖引玉的模板套用到其他数据格式上。尽管他代码量不多,但是麻雀虽小五脏俱全,重要原理都包含其中: 20 | 21 | 1. 使用`SchemaFactory`和`Schema interfaces`来自定义`schema` 22 | 2. 使用固定格式的JSON文件来(`a model JSON file`模型文件)声明数据库(`schemas`) 23 | 3. 使用固定格式的JSON文件来(`a model JSON file`模型文件)声明视图(`views`) 24 | 4. 使用`Table interface`来自定义表(`Table`) 25 | 5. 确定表格的记录类型 26 | 6. 使用`ScannableTable interface`来实现一个简单的表(`Table`),来枚举所有行(`rows`) 27 | 7. 进阶实现`FilterableTable`,可以根据条件(`simple predicates`)来过滤数据 28 | 8. 
表的进阶实现`TranslatableTable`,将执行计划翻译成关系运算(`translates to relational operators using planner rules`) 29 | 30 | ## 下载和编译 31 | 32 | 需要Java环境(1.7及以上版本,推荐1.8),git以及maven(3.2.1及以上版本) 33 | 34 | ``` 35 | $ git clone https://github.com/apache/calcite.git 36 | $ cd calcite 37 | $ mvn install -DskipTests -Dcheckstyle.skip=true 38 | $ cd example/csv 39 | ``` 40 | 41 | ## 第一个查询 42 | 43 | 现在让我们来使用[sqlline](https://github.com/julianhyde/sqlline)来连接`Calcite`,`sqlline`是一个包含在整个`Calcite`项目里的SQL的命令行工具。 44 | 45 | ``` 46 | $ ./sqlline 47 | sqlline> !connect jdbc:calcite:model=target/test-classes/model.json admin admin 48 | ``` 49 | 50 | (如果是windows操作系统,使用`sqlline.bat`) 51 | 52 | 执行一个元数据查询: 53 | 54 | ``` 55 | sqlline> !tables 56 | +------------+--------------+-------------+---------------+----------+------+ 57 | | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYPE | 58 | +------------+--------------+-------------+---------------+----------+------+ 59 | | null | SALES | DEPTS | TABLE | null | null | 60 | | null | SALES | EMPS | TABLE | null | null | 61 | | null | SALES | HOBBIES | TABLE | null | null | 62 | | null | metadata | COLUMNS | SYSTEM_TABLE | null | null | 63 | | null | metadata | TABLES | SYSTEM_TABLE | null | null | 64 | +------------+--------------+-------------+---------------+----------+------+ 65 | ``` 66 | 67 | (*译者注:上面案例里使用的`!tables`命令查询元数据,但是译者在使用的时候发现这个命令不好使) 68 | 69 | ``` 70 | 0: jdbc:calcite:model=target/test-classes/mod> !table 71 | +-----------+-------------+------------+------------+---------+----------+------------+-----------+---------------------------+----------------+ 72 | | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYPE_CAT | TYPE_SCHEM | TYPE_NAME | SELF_REFERENCING_COL_NAME | REF_GENERATION | 73 | +-----------+-------------+------------+------------+---------+----------+------------+-----------+---------------------------+----------------+ 74 | | | SALES | DEPTS | TABLE | | | | | | | 75 | | | SALES | EMPS | TABLE | | | | | | | 76 | | | SALES | SDEPTS | TABLE | | | | | | | 77 | | | metadata | COLUMNS | SYSTEM TABLE | | | | | | | 78 | | | metadata | TABLES | SYSTEM TABLE | | | | | | | 79 | +-----------+-------------+------------+------------+---------+----------+------------+-----------+---------------------------+----------------+ 80 | ``` 81 | (JDBC提示: 在`sqlline`里`!tables`命令只是执行了`DatabaseMetaData.getTables()`方法,还有其他的获取元数据命令如:`!columns`,`!describe`) 82 | 83 | (译者注:`!describe`需要加表名) 84 | 85 | ``` 86 | 0: jdbc:calcite:model=target/test-classes/mod> !describe 87 | Usage: describe 88 | 89 | 0: jdbc:calcite:model=target/test-classes/mod> !describe DEPTS 90 | +-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+ 91 | 92 | | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME | COLUMN_SIZE | BUFFER_LENGTH | DECIMAL_DIGITS | NUM_PREC_RADIX | NULLABLE | REMARKS | COLUMN_DEF | SQL_DATA_TYPE | SQL_DATETIME_SUB | CHAR_OCTET_LENGTH | ORDINAL_POSITION | IS_NULLABLE | SCOPE_CATALOG | SCOPE_SCHEMA | SCOPE_TABLE | 93 | 94 | 
+-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+ 95 | 96 | | | SALES | DEPTS | DEPTNO | 4 | INTEGER | -1 | null | null | 10 | 1 | | | null | null | -1 | 1 | YES | | | | 97 | 98 | | | SALES | DEPTS | NAME | 12 | VARCHAR CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" | -1 | null | null | 10 | 1 | | | null | null | -1 | 2 | 99 | 100 | +-----------+-------------+------------+-------------+-----------+-----------+-------------+---------------+----------------+----------------+----------+---------+------------+---------------+------------------+-------------------+------------------+-------------+---------------+--------------+-------------+ 101 | 102 | ``` 103 | 104 | 你能看到,在执行`!tables`的时候有5个表,表`EMPS`, `DEPTS`和`HOBBIES`在`SALES`库(`schema`)里,表`COLUMNS`和`TABLES`在系统元数据库(`system metadata schema`)里。系统表总是在`Calcite`里显示,但其他表是由库(`schema`)的实现来指定的,在本例中,`EMPS`和`DEPTS`表来源于`target/test-classes`路径下的`EMPS.csv`和`DEPTS.csv`。 105 | 106 | 让我们来执行一些查询,来展示`Calcite`的全SQL功能,首先表检索: 107 | 108 | ``` 109 | sqlline> SELECT * FROM emps; 110 | +--------+--------+---------+---------+----------------+--------+-------+---+ 111 | | EMPNO | NAME | DEPTNO | GENDER | CITY | EMPID | AGE | S | 112 | +--------+--------+---------+---------+----------------+--------+-------+---+ 113 | | 100 | Fred | 10 | | | 30 | 25 | t | 114 | | 110 | Eric | 20 | M | San Francisco | 3 | 80 | n | 115 | | 110 | John | 40 | M | Vancouver | 2 | null | f | 116 | | 120 | Wilma | 20 | F | | 1 | 5 | n | 117 | | 130 | Alice | 40 | F | Vancouver | 2 | null | f | 118 | +--------+--------+---------+---------+----------------+--------+-------+---+ 119 | ``` 120 | 121 | 接下来是表连接和分组聚合查询: 122 | 123 | ``` 124 | sqlline> SELECT d.name, COUNT(*) 125 | . . . .> FROM emps AS e JOIN depts AS d ON e.deptno = d.deptno 126 | . . . 
.> GROUP BY d.name; 127 | +------------+---------+ 128 | | NAME | EXPR$1 | 129 | +------------+---------+ 130 | | Sales | 1 | 131 | | Marketing | 2 | 132 | +------------+---------+ 133 | ``` 134 | 135 | 最后,一个计算操作返回一个单行记录,也可以通过这种简便的方法来测试表达式和SQL函数 136 | 137 | ``` 138 | sqlline> VALUES CHAR_LENGTH('Hello, ' || 'world!'); 139 | +---------+ 140 | | EXPR$0 | 141 | +---------+ 142 | | 13 | 143 | +---------+ 144 | ``` 145 | 146 | `Calcite`还包含很多SQL特性,这里就不一一列举了。 147 | 148 | ## Schema探索 149 | 150 | 那么`Calcite`是如何发现表的呢?事实上`Calcite`的核心是并不能理解`CSV`文件的(作为一个“没有存储层的databse”,`Calcite`是了解任何文件格式),之所以`Calcite`能读取上文中的元数据,是因为在`calcite-example-csv`里我们撰写了相关代码。 151 | 152 | 在执行链里包含着很多步骤。首先我们定义一个可以被库工厂加载的模型文件(`we define a schema based on a schema factory class in a model file.`)。然后库工厂会加载成数据库并创建许多表,每一个表都需要知道自己如何加载CSV中的数据。最后`Calcite`解析完查询并将查询计划映射到这几个表上时,`Calcite`会在查询执行时触发这些表去读取数据。接下来我们更深入地解析其中的细节步骤。 153 | 154 | 举个栗子(a model in JSON format): 155 | 156 | ``` 157 | { 158 | version: '1.0', 159 | defaultSchema: 'SALES', 160 | schemas: [ 161 | { 162 | name: 'SALES', 163 | type: 'custom', 164 | factory: 'org.apache.calcite.adapter.csv.CsvSchemaFactory', 165 | operand: { 166 | directory: 'target/test-classes/sales' 167 | } 168 | } 169 | ] 170 | } 171 | ``` 172 | 173 | 这个模型文件定义了一个库(`schema`)叫`SALES`,这个库是由一个插件类(`a plugin class`)支持的,[org.apache.calcite.adapter.csv.CsvSchemaFactory](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchemaFactory.java)这个是`calcite-example-csv`工程里`interface SchemaFactory`的一个实现。它的`create`方法将一个schema实例化了,将model file中的directory作为参数传递过去了。 174 | 175 | ``` 176 | public Schema create(SchemaPlus parentSchema, String name, 177 | Map operand) { 178 | String directory = (String) operand.get("directory"); 179 | String flavorName = (String) operand.get("flavor"); 180 | CsvTable.Flavor flavor; 181 | if (flavorName == null) { 182 | flavor = CsvTable.Flavor.SCANNABLE; 183 | } else { 184 | flavor = CsvTable.Flavor.valueOf(flavorName.toUpperCase()); 185 | } 186 | return new CsvSchema( 187 | new File(directory), 188 | flavor); 189 | } 190 | ``` 191 | 根据模型(`model`)描述,库工程(`schema factory`)实例化了一个名为'SALES'的简单库(`schema`)。这个库(`schema`)是[org.apache.calcite.adapter.csv.CsvSchema](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvSchema.java)的实例并且实现了`Calcite`里的接口[Schema](http://calcite.apache.org/apidocs/org/apache/calcite/schema/Schema.html)。 192 | 193 | 一个库(`schema`)的主要职责就是创建一个表(`table`)的列表(库的职责还包括子库列表、函数列表等,但是`calcite-example-csv`项目里并没有包含他们)。这些表实现了`Calcite`的[Table](http://calcite.apache.org/apidocs/org/apache/calcite/schema/Table.html)接口。CsvSchema创建的表全部是[CsvTable](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTable.java)和他的子类的实例。 194 | 195 | 下面是`CsvSchema`的一些相关代码,对基类`AbstractSchema`中的[getTableMap()](http://calcite.apache.org/apidocs/org/apache/calcite/schema/impl/AbstractSchema.html#getTableMap())方法进行了重载。 196 | 197 | ``` 198 | protected Map getTableMap() { 199 | // Look for files in the directory ending in ".csv", ".csv.gz", ".json", 200 | // ".json.gz". 
201 | File[] files = directoryFile.listFiles( 202 | new FilenameFilter() { 203 | public boolean accept(File dir, String name) { 204 | final String nameSansGz = trim(name, ".gz"); 205 | return nameSansGz.endsWith(".csv") 206 | || nameSansGz.endsWith(".json"); 207 | } 208 | }); 209 | if (files == null) { 210 | System.out.println("directory " + directoryFile + " not found"); 211 | files = new File[0]; 212 | } 213 | // Build a map from table name to table; each file becomes a table. 214 | final ImmutableMap.Builder builder = ImmutableMap.builder(); 215 | for (File file : files) { 216 | String tableName = trim(file.getName(), ".gz"); 217 | final String tableNameSansJson = trimOrNull(tableName, ".json"); 218 | if (tableNameSansJson != null) { 219 | JsonTable table = new JsonTable(file); 220 | builder.put(tableNameSansJson, table); 221 | continue; 222 | } 223 | tableName = trim(tableName, ".csv"); 224 | final Table table = createTable(file); 225 | builder.put(tableName, table); 226 | } 227 | return builder.build(); 228 | } 229 | 230 | /** Creates different sub-type of table based on the "flavor" attribute. */ 231 | private Table createTable(File file) { 232 | switch (flavor) { 233 | case TRANSLATABLE: 234 | return new CsvTranslatableTable(file, null); 235 | case SCANNABLE: 236 | return new CsvScannableTable(file, null); 237 | case FILTERABLE: 238 | return new CsvFilterableTable(file, null); 239 | default: 240 | throw new AssertionError("Unknown flavor " + flavor); 241 | } 242 | } 243 | ``` 244 | 245 | `schema`会扫描指定路径,找到所有以`.csv/`结尾的文件。在本例中,指定路径是 `target/test-classes/sales`,路径中包含文件'EMPS.csv'和'DEPTS.csv',这两个文件会转换成表`EMPS`和`DEPTS`。 246 | 247 | ## 表和视图 248 | 249 | 值得注意的是,我们在模型文件(`model`)里并不需要定义任何表,`schema`会自动创建的。 250 | 你可以额外扩展一些表(`tables`),使用这个`schema`中其他表的属性。 251 | 252 | 253 | 让我们看看如何创建一个重要且常用的一种表——视图。 254 | 255 | 在写一个查询时,视图就相当于一个table,但它不存储数据。它通过执行查询来生成数据。在查询转换为执行计划时,视图会被展开,所以查询执行器可以执行一些优化策略,例如移除一些`SELECT`子句中存在但在最终结果中没有用到的表达式。 256 | 257 | 举个栗子: 258 | 259 | ``` 260 | { 261 | version: '1.0', 262 | defaultSchema: 'SALES', 263 | schemas: [ 264 | { 265 | name: 'SALES', 266 | type: 'custom', 267 | factory: 'org.apache.calcite.adapter.csv.CsvSchemaFactory', 268 | operand: { 269 | directory: 'target/test-classes/sales' 270 | }, 271 | tables: [ 272 | { 273 | name: 'FEMALE_EMPS', 274 | type: 'view', 275 | sql: 'SELECT * FROM emps WHERE gender = \'F\'' 276 | } 277 | ] 278 | } 279 | ] 280 | } 281 | ``` 282 | 283 | 栗子中`type:view`这一行将`FEMALE_EMPS`定义为一个视图,而不是常规表或者是自定义表。注意通常在JSON文件里,定义`view`的时候,需要对单引号进行转义。 284 | 285 | 用JSON来定义长字符串易用性不太高,因此`Calcite`支持了一种替代语法。如果你的视图定义中有长SQL语句,可以使用多行来定义一个长字符串: 286 | 287 | ``` 288 | { 289 | name: 'FEMALE_EMPS', 290 | type: 'view', 291 | sql: [ 292 | 'SELECT * FROM emps', 293 | 'WHERE gender = \'F\'' 294 | ] 295 | } 296 | ``` 297 | 现在我们定义了一个视图(`view`),我们可以再查询中使用它就像使用普通表(`table`)一样: 298 | 299 | ``` 300 | sqlline> SELECT e.name, d.name FROM female_emps AS e JOIN depts AS d on e.deptno = d.deptno; 301 | +--------+------------+ 302 | | NAME | NAME | 303 | +--------+------------+ 304 | | Wilma | Marketing | 305 | +--------+------------+ 306 | ``` 307 | 308 | ## 自定义表 309 | 310 | 自定义表是由用户定义的代码来实现定义的,不需要额外自定义`schema`。 311 | 312 | 继续举个栗子`model-with-custom-table.json`: 313 | 314 | ``` 315 | { 316 | version: '1.0', 317 | defaultSchema: 'CUSTOM_TABLE', 318 | schemas: [ 319 | { 320 | name: 'CUSTOM_TABLE', 321 | tables: [ 322 | { 323 | name: 'EMPS', 324 | type: 'custom', 325 | factory: 'org.apache.calcite.adapter.csv.CsvTableFactory', 326 | operand: { 327 | file: 
'target/test-classes/sales/EMPS.csv.gz', 328 | flavor: "scannable" 329 | } 330 | } 331 | ] 332 | } 333 | ] 334 | } 335 | ``` 336 | 337 | 我们可以一样来查询表数据: 338 | 339 | ``` 340 | sqlline> !connect jdbc:calcite:model=target/test-classes/model-with-custom-table.json admin admin 341 | sqlline> SELECT empno, name FROM custom_table.emps; 342 | +--------+--------+ 343 | | EMPNO | NAME | 344 | +--------+--------+ 345 | | 100 | Fred | 346 | | 110 | Eric | 347 | | 110 | John | 348 | | 120 | Wilma | 349 | | 130 | Alice | 350 | +--------+--------+ 351 | ``` 352 | 353 | 上面的`schema`是通用格式,包含了一个自定义表[org.apache.calcite.adapter.csv.CsvTableFactory](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTableFactory.java),这个类实现了`Calcite`中的`TableFactory`接口。它在`create`方法里实例化了`CsvScannableTable`,将`model`文件中的`file`参数传递过去。 354 | 355 | 356 | ``` 357 | public CsvTable create(SchemaPlus schema, String name, 358 | Map map, RelDataType rowType) { 359 | String fileName = (String) map.get("file"); 360 | final File file = new File(fileName); 361 | final RelProtoDataType protoRowType = 362 | rowType != null ? RelDataTypeImpl.proto(rowType) : null; 363 | return new CsvScannableTable(file, protoRowType); 364 | } 365 | ``` 366 | 367 | 通常做法是实现一个自定义表(`a custom table`)来替代实现一个自定义库(`a custom schema`)。两个方法最后都会创建一个`Table`接口的实例,但是自定义表无需重新实现元数据(`metadata`)获取部分。(`CsvTableFactory`和`CsvSchema`一样,都创建了`CsvScannableTable`,但是自定表实现就不需要实现在文件系统里检索`.csv`文件。) 368 | 369 | 自定义表(`table`)要求开发者在`model`上执有多操作(开发者需要在`model`文件中显式指定每一个`table`和它对应的文件),同时也提供给了开发者更多的控制选项(例如,为每一个table提供不同参数)。 370 | 371 | ## 模型中的注释 372 | 373 | 374 | 注释使用语法 `/* ... */` 和 `//`: 375 | 376 | ``` 377 | { 378 | version: '1.0', 379 | /* 多行 380 | 注释 */ 381 | defaultSchema: 'CUSTOM_TABLE', 382 | // 单行注释 383 | schemas: [ 384 | .. 
385 | ] 386 | } 387 | ``` 388 | 389 | (注释不是标准JSON格式,但不会造成影响。) 390 | 391 | ## 使用查询计划来优化查询 392 | 393 | 目前来看表(`table`)实现和查询都没有问题,因为我们的表中并没有大量的数据。但如果你的自定义表(`table`)有,例如,有100列和100万行数据,你肯定希望用户在每次查询过程中不检索全量数据。你会希望`Calcite`通过适配器来进行衡量,并找到一个更有效的方法来访问数据。 394 | 395 | 这个衡量过程是一个简单的查询优化格式。`Calcite`是通过添加执行器规则(`planner rules`)来支持查询优化的。执行器规则(`planner rules`)通过在查询解析中寻找指定模式(`patterns`)(例如在某个项目中匹配到某种类型的`table`是生效),使用实现优化后的新节点替换寻找到节点。 396 | 397 | 执行器规则(`planner rules`)也是可扩展的,就像`schemas`和`tables`一样。所以如果你有一些存储下来的数据希望通过SQL访问它,首先需要定义一个自定义表或是schema,然后再去定义一些能使数据访问高效的规则。 398 | 399 | 为了查看效果,我们可以使用一个执行器规则(`planner rules`)来访问一个`CSV`文件中的某些子列集合。我们可以在两个相似的schema中执行同样的查询: 400 | 401 | ``` 402 | sqlline> !connect jdbc:calcite:model=target/test-classes/model.json admin admin 403 | sqlline> explain plan for select name from emps; 404 | +-----------------------------------------------------+ 405 | | PLAN | 406 | +-----------------------------------------------------+ 407 | | EnumerableCalcRel(expr#0..9=[{inputs}], NAME=[$t1]) | 408 | | EnumerableTableScan(table=[[SALES, EMPS]]) | 409 | +-----------------------------------------------------+ 410 | sqlline> !connect jdbc:calcite:model=target/test-classes/smart.json admin admin 411 | sqlline> explain plan for select name from emps; 412 | +-----------------------------------------------------+ 413 | | PLAN | 414 | +-----------------------------------------------------+ 415 | | EnumerableCalcRel(expr#0..9=[{inputs}], NAME=[$t1]) | 416 | | CsvTableScan(table=[[SALES, EMPS]]) | 417 | +-----------------------------------------------------+ 418 | ``` 419 | 420 | 这两个计划到底有什么不同呢?通过对比可以发现,在`smart.json`里只多了一行: 421 | 422 | ``` 423 | flavor: "translatable" 424 | ``` 425 | 426 | 这会让`CsvSchema`携带参数参数`falvor = TRANSLATABLE` 参数进行创建,并且它的`createTable`方法会创建[CsvTranslatableTable](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTranslatableTable.java),而不是`CsvScannableTable`. 427 | 428 | `CsvTranslatableTable`实现了`TranslatableTable.toRel()`方法来创建[CsvTableScan](https://github.com/apache/calcite/blob/master/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvTableScan.java). 扫描表(`Table scan`)操作是查询执行树中的叶子节点,默认实现方式是`EnumerableTableScan`,但我们构造了一种不同的的子类型来让规则生效。 429 | 430 | 下面是完整的代码: 431 | 432 | ``` 433 | public class CsvProjectTableScanRule extends RelOptRule { 434 | public static final CsvProjectTableScanRule INSTANCE = 435 | new CsvProjectTableScanRule(); 436 | 437 | private CsvProjectTableScanRule() { 438 | super( 439 | operand(Project.class, 440 | operand(CsvTableScan.class, none())), 441 | "CsvProjectTableScanRule"); 442 | } 443 | 444 | @Override 445 | public void onMatch(RelOptRuleCall call) { 446 | final Project project = call.rel(0); 447 | final CsvTableScan scan = call.rel(1); 448 | int[] fields = getProjectFields(project.getProjects()); 449 | if (fields == null) { 450 | // Project contains expressions more complex than just field references. 
## The query optimization process

There is a lot to be said about how clever Calcite's query planner is, but we won't say it here. The cleverness is designed to take the burden off you, the writer of planner rules.

First, Calcite does not fire rules in a prescribed order. The query optimization process follows the branches of a branching tree, just as a chess-playing program examines many possible sequences of moves. If rules A and B both match a given section of the query operator tree, Calcite can fire both.

Second, Calcite uses cost when choosing between plans, but the cost model does not prevent rules from firing even if they seem more expensive in the short term.

Many optimizers have a linear optimization scheme: faced with a choice between rule A and rule B, such an optimizer must decide immediately. It might follow a policy such as "apply rule A to the whole tree, then apply rule B to the whole tree", or apply a cost-based policy, firing whichever rule produces the cheaper result.

Calcite does not require such compromises. This makes it simple to combine sets of rules. If you want to combine rules that recognize materialized views with rules that read from CSV and JDBC sources, you just give Calcite the full set of rules and tell it to go at it.

Calcite does use a cost-based optimization model. The cost model decides which plan is ultimately used, and it sometimes prunes the search tree to keep the search space from exploding, but it never forces you to choose between rule A and rule B. This matters because it avoids falling into local optima in the search space that are not, in fact, optimal.

The cost model is also pluggable, as are the table and query-operator statistics it is based on. We will return to this topic later.

## JDBC adapter

The JDBC adapter maps a schema in a JDBC data source into a Calcite schema.

For example, here is a model of MySQL's classic "foodmart" database:

```
{
  version: '1.0',
  defaultSchema: 'FOODMART',
  schemas: [
    {
      name: 'FOODMART',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.jdbc.JdbcSchema$Factory',
      operand: {
        jdbcDriver: 'com.mysql.jdbc.Driver',
        jdbcUrl: 'jdbc:mysql://localhost/foodmart',
        jdbcUser: 'foodmart',
        jdbcPassword: 'foodmart'
      }
    }
  ]
}
```

(The foodmart database will be familiar to anyone who has used the Mondrian OLAP engine; it is one of Mondrian's main test data sets. If it is new to you, see the [setup instructions](https://mondrian.pentaho.com/documentation/installation.php#2_Set_up_test_data).)

Current limitations: the JDBC adapter only pushes down table scan operations; all other processing (filtering, joins, aggregations and so forth) happens inside Calcite. The goal is to push down as much processing as possible to the source system, translating syntax, data types and built-in functions along the way. If a Calcite query is based on tables from a single JDBC database, in principle the whole query should be pushed down to that database. If the tables come from multiple JDBC sources, or from a mixture of JDBC and non-JDBC sources, Calcite will use the most efficient distributed query approach it can.
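Assuming the model above is saved as `foodmart-model.json` (a hypothetical file name) and a foodmart database is reachable at the configured URL, a quick smoke test from sqlline might look like the following; `sales_fact_1997` is one of the standard foodmart tables, and depending on how the underlying table names are cased you may need to quote identifiers:

```
sqlline> !connect jdbc:calcite:model=foodmart-model.json admin admin
sqlline> SELECT COUNT(*) FROM "sales_fact_1997";
```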
(Translator's note: starting around 2015 we designed a data-analytics product aiming at similar functionality, but it ultimately failed; its overall completeness came nowhere near Calcite's. Calcite's repository history only goes back to about 2014. One can only admire that level of engineering and sigh at the comparison!)


## The cloning JDBC adapter

The cloning JDBC adapter creates a hybrid. Data comes from a JDBC database, but it is read into in-memory tables the first time each table is accessed. Calcite evaluates queries against those in-memory tables, effectively acting as a cache of the database.

For example, the following model reads tables from a MySQL "foodmart" database:

```
{
  version: '1.0',
  defaultSchema: 'FOODMART_CLONE',
  schemas: [
    {
      name: 'FOODMART_CLONE',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.clone.CloneSchema$Factory',
      operand: {
        jdbcDriver: 'com.mysql.jdbc.Driver',
        jdbcUrl: 'jdbc:mysql://localhost/foodmart',
        jdbcUser: 'foodmart',
        jdbcPassword: 'foodmart'
      }
    }
  ]
}
```

Another technique is to build a clone schema on top of an existing schema. Use the `source` property to reference a schema defined earlier in the model, like this:

```
{
  version: '1.0',
  defaultSchema: 'FOODMART_CLONE',
  schemas: [
    {
      name: 'FOODMART',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.jdbc.JdbcSchema$Factory',
      operand: {
        jdbcDriver: 'com.mysql.jdbc.Driver',
        jdbcUrl: 'jdbc:mysql://localhost/foodmart',
        jdbcUser: 'foodmart',
        jdbcPassword: 'foodmart'
      }
    },
    {
      name: 'FOODMART_CLONE',
      type: 'custom',
      factory: 'org.apache.calcite.adapter.clone.CloneSchema$Factory',
      operand: {
        source: 'FOODMART'
      }
    }
  ]
}
```

You can use this approach to create a clone schema on top of any kind of schema, not just JDBC.

The cloning adapter is not the final word. We plan to develop more sophisticated caching strategies and more complete and efficient implementations of in-memory tables, but for now the cloning JDBC adapter shows what is possible and lets us try out initial implementations.


## Next chapter?

Updates will come irregularly. :) If you like this translation, feel free to get in touch and nudge me for more. And if enough people complain, I might just keep updating anyway...
--------------------------------------------------------------------------------