├── .gitignore
├── LICENSE
├── MinikubeCluster.md
├── README.md
├── ReferenceGuides
│   └── VitessApi.md
├── VitessApi.md
├── VitessCluster.md
├── etcd.md
├── overview
│   ├── Concepts.md
│   ├── ScalingMySQL.md
│   └── VitessOverview.md
├── res
│   ├── Kubernetes_ui.png
│   ├── etcd_pods.png
│   ├── kvtctl_list.png
│   ├── minikube_struct.png
│   ├── vitess_architecture.png
│   └── vtctld_web.png
├── started
│   ├── DockerBuild.md
│   └── GettingStartedKubernetes.md
├── userguide
│   └── ShardingKubernetes.md
└── warning.md
/.gitignore:
--------------------------------------------------------------------------------
1 | # Compiled Object files, Static and Dynamic libs (Shared Objects)
2 | *.o
3 | *.a
4 | *.so
5 |
6 | # Folders
7 | _obj
8 | _test
9 |
10 | # Architecture specific extensions/prefixes
11 | *.[568vq]
12 | [568vq].out
13 |
14 | *.cgo1.go
15 | *.cgo2.c
16 | _cgo_defun.c
17 | _cgo_gotypes.go
18 | _cgo_export.*
19 |
20 | _testmain.go
21 |
22 | *.exe
23 | *.test
24 | *.prof
25 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "{}"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright {yyyy} {name of copyright owner}
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/MinikubeCluster.md:
--------------------------------------------------------------------------------
1 | # Deploying a Kubernetes Cluster with minikube
2 |
3 | ## Introduction
4 | [minikube](https://github.com/kubernetes/minikube/blob/v0.16.0/README.md) is a tool that makes it easy to run a Kubernetes cluster locally.
5 | minikube runs a single-node Kubernetes cluster inside a virtual machine on your computer, which makes it convenient for day-to-day Kubernetes development. On Linux, minikube depends on either [VirtualBox](https://www.virtualbox.org/wiki/Downloads) or [KVM](http://www.linux-kvm.org/); this article describes building a single-machine cluster environment on the KVM driver.
6 | The overall structure of a minikube deployment is roughly as follows:
7 | 
8 |
9 |
10 | ## Setting up the minikube environment
11 | ### Installing minikube
12 | minikube is written in Go and distributed as a single stand-alone binary, so installing it is just a matter of downloading the file and placing it in the right location.
13 | ``` sh
14 | # Download the minikube-linux-amd64 binary
15 | $ wget https://storage.googleapis.com/minikube/releases/v0.16.0/minikube-linux-amd64
16 |
17 | # Make the file executable
18 | $ chmod +x minikube-linux-amd64
19 |
20 | # Move the binary to /usr/local/bin
21 | $ mv minikube-linux-amd64 /usr/local/bin/minikube
22 |
23 | # Check the version to confirm the installation succeeded
24 | $ minikube version
25 | # minikube version: v0.16.0
26 | ```
27 | ### Installing kubectl
28 | kubectl is likewise written in Go and distributed as a single stand-alone binary; once downloaded it is ready to use.
29 | ``` sh
30 | # Download the kubectl binary
31 | $ wget http://storage.googleapis.com/kubernetes-release/release/v1.3.0/bin/linux/amd64/kubectl
32 |
33 | # Make the file executable
34 | $ chmod +x kubectl
35 |
36 | # Move the file to /usr/local/bin
37 | $ mv kubectl /usr/local/bin
38 |
39 | # Check the version to confirm the installation
40 | $ kubectl version
41 | ```
42 |
43 | ### Installing docker-machine-driver-kvm
44 | docker-machine-driver-kvm is the Docker Machine driver that lets minikube provision Docker hosts on KVM virtual machines. It is distributed as a binary and can be used directly after download.
45 | ``` sh
46 | # Download the driver
47 | $ wget https://github.com/dhiltgen/docker-machine-kvm/releases/download/v0.7.0/docker-machine-driver-kvm
48 |
49 | # Make the file executable
50 | $ chmod +x docker-machine-driver-kvm
51 |
52 | # Move the file to /usr/local/bin
53 | $ mv docker-machine-driver-kvm /usr/local/bin
54 | ```
55 |
56 | ### Installing the KVM packages
57 | Installing KVM itself is what allows virtual machines to run on the local host; download and install the packages that match your operating system.
58 | For the official minikube notes on the KVM driver, see: [KVM driver](https://github.com/kubernetes/minikube/blob/v0.16.0/DRIVERS.md#kvm-driver)
59 | The packages I installed are listed below:
60 | ``` sh
61 | # CentOS
62 | # Install KVM
63 | $ yum install libvirt-daemon-kvm kvm
64 |
65 | # Install the related tools
66 | $ yum install libguestfs libguestfs-tools libvirt
67 |
68 |
69 | # Ubuntu
70 | # Install KVM and the related tools
71 | $ sudo apt install libvirt-bin qemu-kvm
72 | ```
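Before going further, it can be worth confirming that the host actually supports KVM. The checks below are not part of the original setup steps, just standard Linux commands for verifying hardware virtualization:

``` sh
# Count CPU flags for hardware virtualization; KVM needs this to be non-zero
# (vmx on Intel CPUs, svm on AMD CPUs)
grep -E -c '(vmx|svm)' /proc/cpuinfo

# Check that the kvm kernel modules are loaded
# (expect kvm plus kvm_intel or kvm_amd in the output)
lsmod | grep kvm
```

If the first command prints 0, virtualization is disabled in the BIOS/UEFI or unsupported, and the KVM driver will not work.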
73 |
74 | ### Starting the KVM services
75 | After KVM is installed, the related services must be started before virtual machines can run properly:
76 |
77 | ``` sh
78 | $ libvirtd -d
79 | $ systemctl start virtlogd.socket
80 | ```
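To confirm the services came up correctly, you can ask virsh for the lists of defined virtual machines and networks. This is an extra sanity check, not something minikube itself requires:

``` sh
# If libvirtd is up and responding, this prints a (possibly empty) table of VMs
$ virsh list --all

# Show the libvirt networks; the default network should exist and be active
$ virsh net-list --all
```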
81 |
82 | ### Starting minikube
83 | With the installation above complete, minikube can now be started normally. minikube accepts a large number of flags; only two are covered below, with brief explanations. For the full details, read the minikube help, which can be viewed with:
84 | ``` sh
85 | $ minikube -h
86 | ```
87 |
88 | Basic start command
89 | ``` sh
90 | # Start the minikube service normally with an explicit driver; logging disabled
91 | # --vm-driver selects the VM driver. On Linux the default is virtualbox; since VirtualBox causes more installation and operational problems, kvm is used here,
92 | # which also lets us manage the VM with KVM's rich command-line tools
93 | $ minikube start --vm-driver=kvm
94 | ```
95 |
96 | Start command with logging enabled
97 | ``` sh
98 | # Start minikube with logging enabled; the --v flag sets minikube's log level
99 | # --v=0 INFO level logs
100 | # --v=1 WARNING level logs
101 | # --v=2 ERROR level logs
102 | # --v=3 libmachine logging
103 | # --v=7 libmachine --debug level logging
104 | $ minikube start --v=7 --vm-driver=kvm
105 | ```
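Once `minikube start` returns, a quick way to confirm the cluster is healthy is the following. These are standard minikube/kubectl commands, not specific to this setup:

``` sh
# Check the state of the minikube VM and its cluster components
$ minikube status

# kubectl should now be able to reach the cluster; list the single node
$ kubectl get nodes

# Print the address of the Kubernetes master running inside the VM
$ kubectl cluster-info
```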
106 |
107 | ### Useful KVM commands
108 | KVM also comes with many commands; a few are introduced below. For the full list, see virsh -h.
109 | * Start a virtual machine
110 | ``` sh
111 | # Start the previously created virtual machine xxxx
112 | $ virsh start xxxx
113 | ```
114 |
115 | * Suspend a virtual machine
116 | ``` sh
117 | # Suspend the running virtual machine xxxx
118 | $ virsh suspend xxxx
119 | ```
120 |
121 | * Set a virtual machine's memory
122 | ``` sh
123 | # Change the memory allocation
124 | $ virsh setmem xxxxx 512000
125 | ```
126 |
127 | * Resume a suspended virtual machine
128 | ``` sh
129 | $ virsh resume xxxx
130 | ```
131 |
132 | * Edit a virtual machine's configuration file
133 |
134 | As an alternative to the memory command above, you can change such parameters by editing a running virtual machine's configuration file directly, which is often more convenient than the individual commands. KVM stores each virtual machine's configuration as XML, and it can be edited with virsh edit.
135 | ``` sh
136 | # The following command opens an editor on the configuration file; the XML records all of the virtual machine's parameters. Restart the virtual machine after saving for the changes to take effect
137 | $ virsh edit xxxx
138 | ```
139 |
140 | * Other
141 |
142 | When minikube creates a virtual machine through KVM, the file virbr1.status records the corresponding IP information. If an IP conflict occurs, edit the following file so that it contains only one unique IP:
143 | ``` sh
144 | $ vim /var/lib/libvirt/dnsmasq/virbr1.status
145 | ```
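Instead of editing the status file by hand, the DHCP leases behind virbr1 can also be inspected through virsh (this requires a reasonably recent libvirt; the network name shown here is just the common default, so check `virsh net-list` on your host first):

``` sh
# Show the libvirt networks on this host; virbr1 belongs to one of them
$ virsh net-list

# List the DHCP leases on that network (substitute the name reported above);
# each virtual machine should appear exactly once
$ virsh net-dhcp-leases default
```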
146 |
147 | Only the --v and --vm-driver flags have been described above; since minikube has many more, consult the help output or the official notes as needed. This article covers deployment on Linux only; for Windows and macOS, see the [official documentation](https://github.com/kubernetes/minikube/blob/v0.16.0/README.md).
148 |
149 | With the steps above complete, we can use minikube and kubectl to manage and operate our cluster normally.
150 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Vitess Documentation
2 | Chinese-language documentation for Vitess, containing translations of the official documentation along with notes collected during our own development and study, shared here for everyone to learn from and discuss. The official documents are not translated mechanically; where it aids understanding, parts have been adapted based on our own reading so the documents stay easy to follow.
3 | Corrections for any shortcomings are welcome.
4 |
5 | The documentation will be updated continuously...
6 |
7 | # **Table of Contents**
8 | * Overview
9 |   * [Vitess Overview](https://github.com/davygeek/vitessdoc/blob/master/overview/VitessOverview.md)
10 |   * [Scaling MySQL with Vitess](#)
11 |   * [Key Concepts](#)
12 | * Getting Started
13 |   * [Running Vitess on Kubernetes](https://github.com/davygeek/vitessdoc/blob/master/started/GettingStartedKubernetes.md)
14 |   * [Custom Docker Builds](https://github.com/zssky/vitessdoc/blob/master/started/DockerBuild.md)
15 |   * [Running Vitess Locally](#)
16 | * User Guide
17 |   * [Sharding on Kubernetes](https://github.com/zssky/vitessdoc/blob/master/userguide/ShardingKubernetes.md)
18 |
19 |
20 | * Reference Guides
21 | * Practice Notes
22 |   * [Setting up a Kubernetes cluster with minikube](https://github.com/davygeek/vitessdoc/blob/master/MinikubeCluster.md)
23 |   * [Vitess best practices on a Kubernetes cluster](https://github.com/zssky/vitessdoc/blob/master/VitessCluster.md)
24 |
--------------------------------------------------------------------------------
/ReferenceGuides/VitessApi.md:
--------------------------------------------------------------------------------
1 | This document describes Vitess API methods that enable your client application to more easily talk to your storage system to query data. API methods are grouped into the following categories:
2 |
3 | * [Range-based Sharding](#range-based-sharding)
4 | * [Transactions](#transactions)
5 | * [Custom Sharding](#custom-sharding)
6 | * [Map Reduce](#map-reduce)
7 | * [Topology](#topology)
8 | * [v3 API (alpha)](#v3-api-alpha)
9 |
10 |
11 | The following table lists the methods in each group and links to more detail about each method:
12 |
13 |
14 | | Name | Description |
15 | | :-------- | :-------- |
16 | | **Range-based Sharding** | |
17 | | [ExecuteBatchKeyspaceIds](#executebatchkeyspaceids) | ExecuteBatchKeyspaceIds executes the list of queries based on the specified keyspace ids. |
18 | | [ExecuteEntityIds](#executeentityids) | ExecuteEntityIds executes the query based on the specified external id to keyspace id map. |
19 | | [ExecuteKeyRanges](#executekeyranges) | ExecuteKeyRanges executes the query based on the specified key ranges. |
20 | | [ExecuteKeyspaceIds](#executekeyspaceids) | ExecuteKeyspaceIds executes the query based on the specified keyspace ids. |
21 | | [StreamExecuteKeyRanges](#streamexecutekeyranges) | StreamExecuteKeyRanges executes a streaming query based on key ranges. Use this method if the query returns a large number of rows. |
22 | | [StreamExecuteKeyspaceIds](#streamexecutekeyspaceids) | StreamExecuteKeyspaceIds executes a streaming query based on keyspace ids. Use this method if the query returns a large number of rows. |
23 | | **Transactions** | |
24 | | [Begin](#begin) | Begin a transaction. |
25 | | [Commit](#commit) | Commit a transaction. |
26 | | [ResolveTransaction](#resolvetransaction) | ResolveTransaction resolves a transaction. |
27 | | [Rollback](#rollback) | Rollback a transaction. |
28 | | **Custom Sharding** | |
29 | | [ExecuteBatchShards](#executebatchshards) | ExecuteBatchShards executes the list of queries on the specified shards. |
30 | | [ExecuteShards](#executeshards) | ExecuteShards executes the query on the specified shards. |
31 | | [StreamExecuteShards](#streamexecuteshards) | StreamExecuteShards executes a streaming query based on shards. Use this method if the query returns a large number of rows. |
32 | | **Map Reduce** | |
33 | | [SplitQuery](#splitquery) | Split a query into non-overlapping sub queries. |
34 | | **Topology** | |
35 | | [GetSrvKeyspace](#getsrvkeyspace) | GetSrvKeyspace returns a SrvKeyspace object (as seen by this vtgate). This method is provided as a convenient way for clients to take a look at the sharding configuration for a Keyspace. The sharding information should not be used for routing queries (the information may change; use the Execute calls for that). It is convenient for monitoring applications, for instance, or when using custom sharding. |
36 | | **v3 API (alpha)** | |
37 | | [Execute](#execute) | Execute tries to route the query to the right shard. It depends on the query and bind variables to provide enough information in conjunction with the vindexes to route the query. |
38 | | [StreamExecute](#streamexecute) | StreamExecute executes a streaming query based on shards. It depends on the query and bind variables to provide enough information in conjunction with the vindexes to route the query. Use this method if the query returns a large number of rows. |
87 |
88 |
89 | ## Range-based Sharding
90 | ### ExecuteBatchKeyspaceIds
91 |
92 | ExecuteBatchKeyspaceIds executes the list of queries based on the specified keyspace ids.
93 |
94 | #### Request
95 |
96 | ExecuteBatchKeyspaceIdsRequest is the payload to ExecuteBatchKeyspaceIds.
97 |
98 | ##### Parameters
99 |
100 | | Name | Description |
101 | | :-------- | :-------- |
102 | | caller_id <br>[vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
103 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
104 | | queries <br>list <[BoundKeyspaceIdQuery](#boundkeyspaceidquery)> | BoundKeyspaceIdQuery represents a single query request for the specified list of keyspace ids. This is used in a list for ExecuteBatchKeyspaceIdsRequest. |
105 | | tablet_type <br>[topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
106 | | as_transaction <br>bool | as_transaction will execute the queries in this batch in a single transaction per shard, created for this purpose. (this can be seen as adding a 'begin' before and 'commit' after the queries). Only makes sense if tablet_type is master. If set, the Session is ignored. |
107 | | options <br>[query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |
108 |
109 | #### Response
110 |
111 | ExecuteBatchKeyspaceIdsResponse is the returned value from ExecuteBatchKeyspaceIds.
112 |
113 | ##### Properties
114 |
115 | | Name | Description |
116 | | :-------- | :-------- |
117 | | error <br>[vtrpc.RPCError](#vtrpc.rpcerror) | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
118 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
119 | | results <br>list <[query.QueryResult](#query.queryresult)> | QueryResult is returned by Execute and ExecuteStream. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResult have rows set. And as Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows for each QueryResult in QueryResult[1:]). |
120 |
121 | ### ExecuteEntityIds
122 |
123 | ExecuteEntityIds executes the query based on the specified external id to keyspace id map.
124 |
125 | #### Request
126 |
127 | ExecuteEntityIdsRequest is the payload to ExecuteEntityIds.
128 |
129 | ##### Parameters
130 |
131 | | Name | Description |
132 | | :-------- | :-------- |
133 | | caller_id <br>[vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
134 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
135 | | query <br>[query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
136 | | keyspace <br>string | keyspace to target the query to. |
137 | | entity_column_name <br>string | entity_column_name is the column name to use. |
138 | | entity_keyspace_ids <br>list <[EntityId](#executeentityidsrequest.entityid)> | entity_keyspace_ids are pairs of entity_column_name values associated with its corresponding keyspace_id. |
139 | | tablet_type <br>[topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
140 | | not_in_transaction <br>bool | not_in_transaction is deprecated and should not be used. |
141 | | options <br>[query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |
142 |
143 | #### Messages
144 |
145 | ##### ExecuteEntityIdsRequest.EntityId
146 |
147 | ##### Properties
148 |
149 | | Name | Description |
150 | | :-------- | :-------- |
151 | | type <br>[query.Type](#query.type) | Type defines the various supported data types in bind vars and query results. |
152 | | value <br>bytes | value is the value for the entity. Not set if type is NULL_TYPE. |
153 | | keyspace_id <br>bytes | keyspace_id is the associated keyspace_id for the entity. |
154 |
155 | #### Response
156 |
157 | ExecuteEntityIdsResponse is the returned value from ExecuteEntityIds.
158 |
159 | ##### Properties
160 |
161 | | Name | Description |
162 | | :-------- | :-------- |
163 | | error <br>[vtrpc.RPCError](#vtrpc.rpcerror) | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
164 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
165 | | result <br>[query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and ExecuteStream. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResult have rows set. And as Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows for each QueryResult in QueryResult[1:]). |
166 |
167 | ### ExecuteKeyRanges
168 |
169 | ExecuteKeyRanges executes the query based on the specified key ranges.
170 |
171 | #### Request
172 |
173 | ExecuteKeyRangesRequest is the payload to ExecuteKeyRanges.
174 |
175 | ##### Parameters
176 |
177 | | Name | Description |
178 | | :-------- | :-------- |
179 | | caller_id <br>[vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
180 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
181 | | query <br>[query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
182 | | keyspace <br>string | keyspace to target the query to. |
183 | | key_ranges <br>list <[topodata.KeyRange](#topodata.keyrange)> | KeyRange describes a range of sharding keys, when range-based sharding is used. |
184 | | tablet_type <br>[topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
185 | | not_in_transaction <br>bool | not_in_transaction is deprecated and should not be used. |
186 | | options <br>[query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |
187 |
188 | #### Response
189 |
190 | ExecuteKeyRangesResponse is the returned value from ExecuteKeyRanges.
191 |
192 | ##### Properties
193 |
194 | | Name | Description |
195 | | :-------- | :-------- |
196 | | error <br>[vtrpc.RPCError](#vtrpc.rpcerror) | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
197 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
198 | | result <br>[query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and ExecuteStream. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResult have rows set. And as Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows for each QueryResult in QueryResult[1:]). |
199 |
200 | ### ExecuteKeyspaceIds
201 |
202 | ExecuteKeyspaceIds executes the query based on the specified keyspace ids.
203 |
204 | #### Request
205 |
206 | ExecuteKeyspaceIdsRequest is the payload to ExecuteKeyspaceIds.
207 |
208 | ##### Parameters
209 |
210 | | Name | Description |
211 | | :-------- | :-------- |
212 | | caller_id <br>[vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
213 | | session <br>[Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
214 | | query <br>[query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
215 | | keyspace <br>string | keyspace to target the query to. |
216 | | keyspace_ids <br>list <bytes> | keyspace_ids contains the list of keyspace_ids affected by this query. Will be used to find the shards to send the query to. |
217 | | tablet_type <br>[topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
218 | | not_in_transaction <br>bool | not_in_transaction is deprecated and should not be used. |
219 | | options <br>[query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |
220 |
221 | #### Response
222 |
223 | ExecuteKeyspaceIdsResponse is the returned value from ExecuteKeyspaceIds.
224 |
225 | ##### Properties
226 |
227 | | Name |Description |
228 | | :-------- | :--------
229 | | error
[vtrpc.RPCError](#vtrpc.rpcerror)| RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
230 | | session
[Session](#session)| Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
231 | | result
[query.QueryResult](#query.queryresult)| QueryResult is returned by Execute and ExecuteStream. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResult have rows set. And as Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows for each QueryResult in QueryResult[1:]). |
232 |
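The keyspace_ids in the request are raw bytes, and VTGate routes each one to the shard whose key range covers it. The sketch below is a rough, client-side illustration of that mapping, assuming range-named shards such as `-80` and `80-` (hex-encoded `[start, end)` bounds, where an empty bound means unbounded); the real routing happens inside VTGate.

```python
# Rough illustration of mapping a keyspace_id onto range-named shards.
# Shard names like "-80" and "80-" are hex-encoded [start, end) bounds;
# an empty bound means "no limit". Illustrative only -- the real routing
# logic lives server-side in VTGate.

def shard_for_keyspace_id(keyspace_id: bytes, shards: list) -> str:
    for name in shards:
        start_hex, end_hex = name.split("-")
        start = bytes.fromhex(start_hex) if start_hex else b""
        end = bytes.fromhex(end_hex) if end_hex else None
        # Bytewise comparison: start is inclusive, end is exclusive.
        if keyspace_id >= start and (end is None or keyspace_id < end):
            return name
    raise ValueError("no shard covers this keyspace_id")

shards = ["-80", "80-"]
print(shard_for_keyspace_id(bytes.fromhex("1666b40b0c2e4e06"), shards))
```

Note that a keyspace_id equal to a shard's upper bound falls into the next shard, since the upper bound is exclusive.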
### StreamExecuteKeyRanges

StreamExecuteKeyRanges executes a streaming query based on key ranges. Use this method if the query returns a large number of rows.

#### Request

StreamExecuteKeyRangesRequest is the payload to StreamExecuteKeyRanges.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| keyspace | string | keyspace to target the query to. |
| key_ranges | list <[topodata.KeyRange](#topodata.keyrange)> | KeyRange describes a range of sharding keys, when range-based sharding is used. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

StreamExecuteKeyRangesResponse is the returned value from StreamExecuteKeyRanges.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| result | [query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

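As the QueryResult contract above states, a streaming call delivers the field definitions in the first message and the rows in subsequent messages. A client that wants a single combined result can merge the stream like this; plain dicts stand in for the real protobuf messages here:

```python
# Minimal sketch of combining a stream of QueryResult messages into one
# result, per the streaming contract: the first message carries `fields`,
# and subsequent messages carry `rows`. Plain dicts stand in for the
# actual protobuf types.

def collect_stream(results):
    merged = {"fields": [], "rows": []}
    for i, r in enumerate(results):
        if i == 0:
            merged["fields"] = r.get("fields", [])
        merged["rows"].extend(r.get("rows", []))
    return merged

stream = [
    {"fields": ["id", "name"]},
    {"rows": [[1, "a"], [2, "b"]]},
    {"rows": [[3, "c"]]},
]
print(collect_stream(stream))
```

In practice a streaming client would usually process each batch of rows as it arrives rather than buffering the whole result set, which is the point of using the streaming methods for large results.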
### StreamExecuteKeyspaceIds

StreamExecuteKeyspaceIds executes a streaming query based on keyspace ids. Use this method if the query returns a large number of rows.

#### Request

StreamExecuteKeyspaceIdsRequest is the payload to StreamExecuteKeyspaceIds.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| keyspace | string | keyspace to target the query to. |
| keyspace_ids | list <bytes> | keyspace_ids contains the list of keyspace_ids affected by this query. Will be used to find the shards to send the query to. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

StreamExecuteKeyspaceIdsResponse is the returned value from StreamExecuteKeyspaceIds.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| result | [query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

## Transactions

### Begin

Begin a transaction.

#### Request

BeginRequest is the payload to Begin.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| single_db | bool | single_db specifies if the transaction should be restricted to a single database. |

#### Response

BeginResponse is the returned value from Begin.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |

### Commit

Commit a transaction.

#### Request

CommitRequest is the payload to Commit.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| atomic | bool | atomic specifies if the commit should go through the 2PC workflow to ensure atomicity. |

#### Response

CommitResponse is the returned value from Commit.

##### Properties

| Name | Description |
| :-------- | :-------- |

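Because Session objects are invalidated on use and every result carries updated session values, a client must thread the latest session through each call between Begin and Commit. The stub below is a hypothetical stand-in (not a real Vitess client API) that sketches this session-threading pattern:

```python
# Sketch of the session-threading pattern implied by Begin/Commit:
# Begin returns a Session, each call returns an updated Session that
# replaces the previous one, and Commit consumes the final Session.
# FakeVTGate is a hypothetical stand-in, not a real client API.

class FakeVTGate:
    def begin(self):
        # A fresh session cookie; real content is opaque to the user.
        return {"in_transaction": True, "seq": 0}

    def execute(self, session, sql):
        # Returns (result, updated session); the old session is now stale.
        return {"rows": []}, {**session, "seq": session["seq"] + 1}

    def commit(self, session):
        return {"committed_at_seq": session["seq"]}

vtgate = FakeVTGate()
session = vtgate.begin()
for stmt in ("INSERT ...", "UPDATE ..."):
    _, session = vtgate.execute(session, stmt)
print(vtgate.commit(session))
```

The important habit the sketch shows is always replacing the held session with the one returned by the last call; reusing a stale session is an error under this contract.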
### ResolveTransaction

ResolveTransaction resolves a transaction.

#### Request

ResolveTransactionRequest is the payload to ResolveTransaction.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| dtid | string | dtid is the dtid of the transaction to be resolved. |

#### Response

ResolveTransactionResponse is the returned value from ResolveTransaction.

##### Properties

| Name | Description |
| :-------- | :-------- |

### Rollback

Rollback a transaction.

#### Request

RollbackRequest is the payload to Rollback.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |

#### Response

RollbackResponse is the returned value from Rollback.

##### Properties

| Name | Description |
| :-------- | :-------- |

## Custom Sharding

### ExecuteBatchShards

ExecuteBatchShards executes the list of queries on the specified shards.

#### Request

ExecuteBatchShardsRequest is the payload to ExecuteBatchShards.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| queries | list <[BoundShardQuery](#boundshardquery)> | BoundShardQuery represents a single query request for the specified list of shards. This is used in a list for ExecuteBatchShardsRequest. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| as_transaction | bool | as_transaction will execute the queries in this batch in a single transaction per shard, created for this purpose. (This can be seen as adding a 'begin' before and a 'commit' after the queries.) Only makes sense if tablet_type is master. If set, the Session is ignored. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

ExecuteBatchShardsResponse is the returned value from ExecuteBatchShards.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| error | [vtrpc.RPCError](#vtrpc.rpcerror) | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| results | list <[query.QueryResult](#query.queryresult)> | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

### ExecuteShards

ExecuteShards executes the query on the specified shards.

#### Request

ExecuteShardsRequest is the payload to ExecuteShards.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| keyspace | string | keyspace to target the query to. |
| shards | list <string> | shards to target the query to. A DML can only target one shard. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| not_in_transaction | bool | not_in_transaction is deprecated and should not be used. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

ExecuteShardsResponse is the returned value from ExecuteShards.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| error | [vtrpc.RPCError](#vtrpc.rpcerror) | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| result | [query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

### StreamExecuteShards

StreamExecuteShards executes a streaming query based on shards. Use this method if the query returns a large number of rows.

#### Request

StreamExecuteShardsRequest is the payload to StreamExecuteShards.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| keyspace | string | keyspace to target the query to. |
| shards | list <string> | shards to target the query to. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

StreamExecuteShardsResponse is the returned value from StreamExecuteShards.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| result | [query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

## Map Reduce

### SplitQuery

Split a query into non-overlapping sub-queries.

#### Request

SplitQueryRequest is the payload to SplitQuery. SplitQuery takes a "SELECT" query and generates a list of queries called "query-parts". Each query-part consists of the original query with an added WHERE clause that restricts the query-part to operate only on rows whose values in the columns listed in the "split_column" field of the request (see below) are in a particular range. It is guaranteed that the set of rows obtained from executing each query-part on a database snapshot, and merging (without deduping) the results, is equal to the set of rows obtained from executing the original query on the same snapshot, with the rows containing NULL values in any of the split columns excluded. This is typically called by the MapReduce master when reading from Vitess. There, it is desirable that the sets of rows returned by the query-parts have roughly the same size.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| keyspace | string | keyspace to target the query to. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| split_column | list <string> | Each generated query-part will be restricted to rows whose values in the columns listed in this field are in a particular range. The list of columns named here must be a prefix of the list of columns defining some index or primary key of the table referenced in 'query'. For many tables, using the primary key columns (in order) is sufficient, and this is the default if this field is omitted. See the comment on the 'algorithm' field for more restrictions and information. |
| split_count | int64 | You can specify either an estimate of the number of query-parts to generate or an estimate of the number of rows each query-part should return. Thus, exactly one of split_count or num_rows_per_query_part should be nonzero. The non-given parameter is calculated from the given parameter using the formula: split_count * num_rows_per_query_part = table_size, where table_size is an approximation of the number of rows in the table. Note that if "split_count" is given it is regarded as an estimate. The number of query-parts returned may differ slightly (in particular, if it's not a whole multiple of the number of vitess shards). |
| num_rows_per_query_part | int64 | |
| algorithm | query.SplitQueryRequest.Algorithm | The algorithm to use to split the query. The split algorithm is performed on each database shard in parallel. The lists of query-parts generated by the shards are merged and returned to the caller. Two algorithms are supported. EQUAL_SPLITS: if this algorithm is selected, then only the first 'split_column' given is used (or the first primary key column if the 'split_column' field is empty). In the rest of this algorithm's description, we refer to this column as "the split column". The split column must have a numeric type (integral or floating point). The algorithm works by taking the interval [min, max], where min and max are the minimum and maximum values of the split column in the table-shard, respectively, and partitioning it into 'split_count' sub-intervals of equal size. The added WHERE clause of each query-part restricts that part to rows whose value in the split column belongs to a particular sub-interval. This is fast, but requires that the distribution of values of the split column be uniform in [min, max] for the number of rows returned by each query-part to be roughly the same. FULL_SCAN: if this algorithm is used, then the split_column must be the primary key columns (in order). This algorithm performs a full scan of the table-shard referenced in 'query' to get "boundary" rows that are num_rows_per_query_part apart when the table is ordered by the columns listed in 'split_column'. It then restricts each query-part to the rows located between two successive boundary rows. This algorithm supports multiple split columns of any type, but is slower than EQUAL_SPLITS. |
| use_split_query_v2 | bool | Remove this field after this new server code is released to prod. We must keep it for now, so that clients can still send it to the old server code currently in production. |

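The EQUAL_SPLITS behavior described for the 'algorithm' field can be sketched in a few lines: partition [min, max] of the numeric split column into split_count equal sub-intervals and derive one WHERE clause per query-part. This is illustrative only; the real split runs server-side on each shard.

```python
# Sketch of the EQUAL_SPLITS idea: partition [min, max] of a numeric
# split column into `split_count` equal sub-intervals and build the
# WHERE clause that would be appended to each query-part. Lower bounds
# are inclusive and upper bounds exclusive, so the parts do not overlap.

def equal_splits(column, col_min, col_max, split_count):
    width = (col_max - col_min) / split_count
    boundaries = [col_min + i * width for i in range(1, split_count)]
    clauses = []
    lower = None
    for b in boundaries + [None]:  # None marks the open-ended last part
        parts = []
        if lower is not None:
            parts.append(f"{column} >= {lower}")
        if b is not None:
            parts.append(f"{column} < {b}")
        clauses.append(" AND ".join(parts) if parts else "1 = 1")
        lower = b
    return clauses

for clause in equal_splits("user_id", 0, 100, 4):
    print(clause)
```

This also makes the caveat in the description concrete: the parts cover equal value ranges, so they only return similar row counts when the split column's values are roughly uniform over [min, max].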
#### Response

SplitQueryResponse is the returned value from SplitQuery.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| splits | list <[Part](#splitqueryresponse.part)> | splits contains the queries to run to fetch the entire data set. |

#### Messages

##### SplitQueryResponse.KeyRangePart

Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| keyspace | string | keyspace to target the query to. |
| key_ranges | list <[topodata.KeyRange](#topodata.keyrange)> | KeyRange describes a range of sharding keys, when range-based sharding is used. |

##### SplitQueryResponse.Part

Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| key_range_part | [KeyRangePart](#splitqueryresponse.keyrangepart) | key_range_part is set if the query should be executed by ExecuteKeyRanges. |
| shard_part | [ShardPart](#splitqueryresponse.shardpart) | shard_part is set if the query should be executed by ExecuteShards. |
| size | int64 | size is the approximate number of rows this query will return. |

##### SplitQueryResponse.ShardPart

Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| keyspace | string | keyspace to target the query to. |
| shards | list <string> | shards to target the query to. |

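Per the Part message above, each returned part carries either a key_range_part (to be run via ExecuteKeyRanges) or a shard_part (to be run via ExecuteShards). A MapReduce reader's dispatch loop can be sketched as follows; the dicts stand in for the protobuf messages, and the two execute callables are hypothetical placeholders:

```python
# Sketch of dispatching a SplitQueryResponse.Part: a part with
# key_range_part set goes to ExecuteKeyRanges, one with shard_part set
# goes to ExecuteShards. Dicts stand in for the proto messages, and the
# execute_* callables are hypothetical placeholders, not a real client API.

def dispatch_part(part, execute_key_ranges, execute_shards):
    if part.get("key_range_part") is not None:
        krp = part["key_range_part"]
        return execute_key_ranges(part["query"], krp["keyspace"], krp["key_ranges"])
    if part.get("shard_part") is not None:
        sp = part["shard_part"]
        return execute_shards(part["query"], sp["keyspace"], sp["shards"])
    raise ValueError("part has neither key_range_part nor shard_part")

part = {"query": "SELECT ...", "shard_part": {"keyspace": "test", "shards": ["-80"]}}
print(dispatch_part(part, lambda *a: ("keyranges", a), lambda *a: ("shards", a)))
```

The size field on each part (approximate row count) can additionally be used to balance parts across MapReduce workers.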
## Topology

### GetSrvKeyspace

GetSrvKeyspace returns a SrvKeyspace object (as seen by this vtgate). This method is provided as a convenient way for clients to take a look at the sharding configuration for a keyspace. The sharding information should not be used for routing queries (as it may change; use the Execute calls for that), but it is convenient for monitoring applications, for instance, or when using custom sharding.

#### Request

GetSrvKeyspaceRequest is the payload to GetSrvKeyspace.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| keyspace | string | keyspace name to fetch. |

#### Response

GetSrvKeyspaceResponse is the returned value from GetSrvKeyspace.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| srv_keyspace | [topodata.SrvKeyspace](#topodata.srvkeyspace) | SrvKeyspace is a rollup node for the keyspace itself. |

## v3 API (alpha)

### Execute

Execute tries to route the query to the right shard. It depends on the query and bind variables to provide enough information in conjunction with the vindexes to route the query.

#### Request

ExecuteRequest is the payload to Execute.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| not_in_transaction | bool | not_in_transaction is deprecated and should not be used. |
| keyspace | string | keyspace to target the query to. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

ExecuteResponse is the returned value from Execute.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| error | [vtrpc.RPCError](#vtrpc.rpcerror) | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
| session | [Session](#session) | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user. |
| result | [query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

### StreamExecute

StreamExecute executes a streaming query based on shards. It depends on the query and bind variables to provide enough information in conjunction with the vindexes to route the query. Use this method if the query returns a large number of rows.

#### Request

StreamExecuteRequest is the payload to StreamExecute.

##### Parameters

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| caller_id | [vtrpc.CallerID](#vtrpc.callerid) | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes. |
| query | [query.BoundQuery](#query.boundquery) | BoundQuery is a query with its bind variables. |
| tablet_type | [topodata.TabletType](#topodata.tablettype) | TabletType represents the type of a given tablet. |
| keyspace | string | keyspace to target the query to. |
| options | [query.ExecuteOptions](#query.executeoptions) | ExecuteOptions is passed around for all Execute calls. |

#### Response

StreamExecuteResponse is the returned value from StreamExecute.

##### Properties

| Name | Type | Description |
| :-------- | :-------- | :-------- |
| result | [query.QueryResult](#query.queryresult) | QueryResult is returned by Execute and StreamExecute. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have the rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows, for each QueryResult in QueryResult[1:]). |

633 | ## Enums
634 |
635 | ### query.Type
636 |
637 | Type defines the various supported data types in bind vars and query results.
638 |
639 | | Name |Value |Description |
640 | | :-------- | :-------- | :--------
641 | | NULL_TYPE | 0 | NULL_TYPE specifies a NULL type. |
642 | | INT8 | 257 | INT8 specifies a TINYINT type. Properties: 1, IsNumber. |
643 | | UINT8 | 770 | UINT8 specifies a TINYINT UNSIGNED type. Properties: 2, IsNumber, IsUnsigned. |
644 | | INT16 | 259 | INT16 specifies a SMALLINT type. Properties: 3, IsNumber. |
645 | | UINT16 | 772 | UINT16 specifies a SMALLINT UNSIGNED type. Properties: 4, IsNumber, IsUnsigned. |
646 | | INT24 | 261 | INT24 specifies a MEDIUMINT type. Properties: 5, IsNumber. |
647 | | UINT24 | 774 | UINT24 specifies a MEDIUMINT UNSIGNED type. Properties: 6, IsNumber, IsUnsigned. |
648 | | INT32 | 263 | INT32 specifies an INTEGER type. Properties: 7, IsNumber. |
649 | | UINT32 | 776 | UINT32 specifies an INTEGER UNSIGNED type. Properties: 8, IsNumber, IsUnsigned. |
650 | | INT64 | 265 | INT64 specifies a BIGINT type. Properties: 9, IsNumber. |
651 | | UINT64 | 778 | UINT64 specifies a BIGINT UNSIGNED type. Properties: 10, IsNumber, IsUnsigned. |
652 | | FLOAT32 | 1035 | FLOAT32 specifies a FLOAT type. Properties: 11, IsFloat. |
653 | | FLOAT64 | 1036 | FLOAT64 specifies a DOUBLE or REAL type. Properties: 12, IsFloat. |
654 | | TIMESTAMP | 2061 | TIMESTAMP specifies a TIMESTAMP type. Properties: 13, IsQuoted. |
655 | | DATE | 2062 | DATE specifies a DATE type. Properties: 14, IsQuoted. |
656 | | TIME | 2063 | TIME specifies a TIME type. Properties: 15, IsQuoted. |
657 | | DATETIME | 2064 | DATETIME specifies a DATETIME type. Properties: 16, IsQuoted. |
658 | | YEAR | 785 | YEAR specifies a YEAR type. Properties: 17, IsNumber, IsUnsigned. |
659 | | DECIMAL | 18 | DECIMAL specifies a DECIMAL or NUMERIC type. Properties: 18, None. |
660 | | TEXT | 6163 | TEXT specifies a TEXT type. Properties: 19, IsQuoted, IsText. |
661 | | BLOB | 10260 | BLOB specifies a BLOB type. Properties: 20, IsQuoted, IsBinary. |
662 | | VARCHAR | 6165 | VARCHAR specifies a VARCHAR type. Properties: 21, IsQuoted, IsText. |
663 | | VARBINARY | 10262 | VARBINARY specifies a VARBINARY type. Properties: 22, IsQuoted, IsBinary. |
664 | | CHAR | 6167 | CHAR specifies a CHAR type. Properties: 23, IsQuoted, IsText. |
665 | | BINARY | 10264 | BINARY specifies a BINARY type. Properties: 24, IsQuoted, IsBinary. |
666 | | BIT | 2073 | BIT specifies a BIT type. Properties: 25, IsQuoted. |
667 | | ENUM | 2074 | ENUM specifies an ENUM type. Properties: 26, IsQuoted. |
668 | | SET | 2075 | SET specifies a SET type. Properties: 27, IsQuoted. |
669 | | TUPLE | 28 | TUPLE specifies a tuple. This cannot be returned in a QueryResult, but it can be sent as a bind var. Properties: 28, None. |
670 | | GEOMETRY | 2077 | GEOMETRY specifies a GEOMETRY type. Properties: 29, IsQuoted. |
671 | | JSON | 2078 | JSON specifies a JSON type. Properties: 30, IsQuoted. |
672 |
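The numeric values in the table above are not arbitrary: each value packs the type's index (the first number listed under Properties) into the low bits and its property flags into the high bits. A minimal sketch, assuming flag bit constants inferred from the table values themselves (they are not spelled out in this document):

```python
# Property flag bits inferred from the table above: each Type value
# is index | flags, where the index is the first "Properties" number.
FLAGS = {
    "IsNumber": 0x100,
    "IsUnsigned": 0x200,
    "IsFloat": 0x400,
    "IsQuoted": 0x800,
    "IsText": 0x1000,
    "IsBinary": 0x2000,
}

def decompose(type_value):
    """Split a Type value into (index, set of property flag names)."""
    names = {name for name, bit in FLAGS.items() if type_value & bit}
    return type_value & 0xFF, names

# INT8 = 257: index 1, IsNumber
assert decompose(257) == (1, {"IsNumber"})
# VARBINARY = 10262: index 22, IsQuoted + IsBinary
assert decompose(10262) == (22, {"IsQuoted", "IsBinary"})
# DECIMAL = 18: index 18, no flags (Properties: 18, None)
assert decompose(18) == (18, set())
```

Every row in the table checks out against this decomposition, e.g. TEXT = 6163 = 19 + 0x800 + 0x1000 (index 19, IsQuoted, IsText).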
673 | ### topodata.KeyspaceIdType
674 |
675 | KeyspaceIdType describes the type of the sharding key for a range-based sharded keyspace.
676 |
677 | | Name |Value |Description |
678 | | :-------- | :-------- | :--------
679 | | UNSET | 0 | UNSET is the default value, when range-based sharding is not used. |
680 | | UINT64 | 1 | UINT64 is used when the sharding key is a uint64 value. This is represented as 'unsigned bigint' in MySQL. |
681 | | BYTES | 2 | BYTES is used when the sharding key is an array of bytes. This is represented as 'varbinary' in MySQL. |
682 |
683 | ### topodata.TabletType
684 |
685 | TabletType represents the type of a given tablet.
686 |
687 | | Name |Value |Description |
688 | | :-------- | :-------- | :--------
689 | | UNKNOWN | 0 | UNKNOWN is not a valid value. |
690 | | MASTER | 1 | MASTER is the master server for the shard. Only MASTER allows DMLs. |
691 | | REPLICA | 2 | REPLICA is a slave type. It is used to serve live traffic. A REPLICA can be promoted to MASTER. A demoted MASTER will go to REPLICA. |
692 | | RDONLY | 3 | RDONLY (old name) / BATCH (new name) is used to serve traffic for long-running jobs. It is a separate type from REPLICA so long-running queries don't affect web-like traffic. |
693 | | BATCH | 3 | |
694 | | SPARE | 4 | SPARE is a type of server that cannot serve queries, but is available in case an extra server is needed. |
695 | | EXPERIMENTAL | 5 | EXPERIMENTAL is like SPARE, except it can serve queries. This type can be used for usages not planned by Vitess, like online export to another storage engine. |
696 | | BACKUP | 6 | BACKUP is the type a server goes to when taking a backup. No queries can be served in BACKUP mode. |
697 | | RESTORE | 7 | RESTORE is the type a server uses when restoring a backup, at startup time. No queries can be served in RESTORE mode. |
698 | | DRAINED | 8 | DRAINED is the type a server goes into when used by Vitess tools to perform an offline action. It is a serving type (as the tools' processes may need to run queries), but it's not used to route queries from Vitess users. In this state, this tablet is dedicated to the process that uses it. |
699 |
700 | ### vtrpc.ErrorCode
701 |
702 | ErrorCode is the enum values for Errors. Internally, errors should be created with one of these codes. These will then be translated over the wire by various RPC frameworks.
703 |
704 | | Name |Value |Description |
705 | | :-------- | :-------- | :--------
706 | | SUCCESS | 0 | SUCCESS is returned from a successful call. |
707 | | CANCELLED | 1 | CANCELLED means that the context was cancelled (and noticed in the app layer, as opposed to the RPC layer). |
708 | | UNKNOWN_ERROR | 2 | UNKNOWN_ERROR includes: 1. MySQL error codes that we don't explicitly handle. 2. A MySQL response that wasn't as expected. For example, we might expect a MySQL timestamp to be returned in a particular way, but it wasn't. 3. Anything else that doesn't fall into a different bucket. |
709 | | BAD_INPUT | 3 | BAD_INPUT is returned when an end-user either sends SQL that couldn't be parsed correctly, or tries a query that isn't supported by Vitess. |
710 | | DEADLINE_EXCEEDED | 4 | DEADLINE_EXCEEDED is returned when an action takes longer than a given timeout. |
711 | | INTEGRITY_ERROR | 5 | INTEGRITY_ERROR is returned on an integrity error from MySQL, usually due to duplicate primary keys. |
712 | | PERMISSION_DENIED | 6 | PERMISSION_DENIED errors are returned when a user requests access to something that they don't have permissions for. |
713 | | RESOURCE_EXHAUSTED | 7 | RESOURCE_EXHAUSTED is returned when a query exceeds its quota in some dimension and can't be completed due to that. Queries that return RESOURCE_EXHAUSTED should not be retried, as it could be detrimental to the server's health. Examples of errors that will cause the RESOURCE_EXHAUSTED code: 1. TxPoolFull: this is retried server-side, and is only returned as an error if the server-side retries failed. 2. The query is killed due to taking too long. |
714 | | QUERY_NOT_SERVED | 8 | QUERY_NOT_SERVED means that a query could not be served right now. The client can interpret it as: "the tablet that you sent this query to cannot serve the query right now, try a different tablet or try again later." This could be due to various reasons: QueryService is not serving, should not be serving, wrong shard, wrong tablet type, blacklisted table, etc. Clients that receive this error should usually retry the query, but after taking the appropriate steps to make sure that the query will get sent to the correct tablet. |
715 | | NOT_IN_TX | 9 | NOT_IN_TX means that we're not currently in a transaction, but we should be. |
716 | | INTERNAL_ERROR | 10 | INTERNAL_ERRORs are problems that only the server can fix, not the client. These errors are not due to a query itself, but rather due to the state of the system. Generally, we don't expect these errors to go away by themselves, but they may go away after human intervention. Examples of scenarios where INTERNAL_ERROR is returned: 1. Something is not configured correctly internally. 2. A necessary resource is not available, and we don't expect it to become available by itself. 3. A sanity check fails. 4. Some other internal error occurs. Clients should not retry immediately, as there is little chance of success. However, it's acceptable for retries to happen internally, for example to multiple backends, in case only a subset of backends are not functional. |
717 | | TRANSIENT_ERROR | 11 | TRANSIENT_ERROR is used when there is some error that we expect to recover from automatically, often due to a resource limit being temporarily reached. Retrying this error, with an exponential backoff, should succeed. Clients should be able to successfully retry the query on the same backends. Examples of things that can trigger this error: 1. The query has been throttled. 2. VtGate could have a request backlog. |
718 | | UNAUTHENTICATED | 12 | UNAUTHENTICATED errors are returned when a user requests access to something, and we're unable to verify the user's authentication. |
719 |
720 | ## Messages
721 |
722 | ### BoundKeyspaceIdQuery
723 |
724 | BoundKeyspaceIdQuery represents a single query request for the specified list of keyspace ids. This is used in a list for ExecuteBatchKeyspaceIdsRequest.
725 |
726 | #### Properties
727 |
728 | | Name |Description |
729 | | :-------- | :--------
730 | | query <br>[query.BoundQuery](#query.boundquery)| BoundQuery is a query with its bind variables |
731 | | keyspace <br>string| keyspace to target the query to. |
732 | | keyspace_ids <br>list <bytes>| keyspace_ids contains the list of keyspace_ids affected by this query. Will be used to find the shards to send the query to. |
733 |
734 | ### BoundShardQuery
735 |
736 | BoundShardQuery represents a single query request for the specified list of shards. This is used in a list for ExecuteBatchShardsRequest.
737 |
738 | #### Properties
739 |
740 | | Name |Description |
741 | | :-------- | :--------
742 | | query <br>[query.BoundQuery](#query.boundquery)| BoundQuery is a query with its bind variables |
743 | | keyspace <br>string| keyspace to target the query to. |
744 | | shards <br>list <string>| shards to target the query to. A DML can only target one shard. |
745 |
746 | ### Session
747 |
748 | Session objects are session cookies and are invalidated on use. Query results will contain updated session values. Their content should be opaque to the user.
749 |
750 | #### Properties
751 |
752 | | Name |Description |
753 | | :-------- | :--------
754 | | in_transaction <br>bool| |
755 | | shard_sessions <br>list <[ShardSession](#session.shardsession)>| |
756 | | single_db <br>bool| single_db specifies if the transaction should be restricted to a single database. |
757 |
758 | #### Messages
759 |
760 | ##### Session.ShardSession
761 |
762 | Properties
763 |
764 | | Name |Description |
765 | | :-------- | :--------
766 | | target <br>[query.Target](#query.target)| Target describes what the client expects the tablet is. If the tablet does not match, an error is returned. |
767 | | transaction_id <br>int64| |
768 |
769 | ### query.BindVariable
770 |
771 | BindVariable represents a single bind variable in a Query.
772 |
773 | #### Properties
774 |
775 | | Name |Description |
776 | | :-------- | :--------
777 | | type <br>[Type](#query.type)| |
778 | | value <br>bytes| |
779 | | values <br>list <[Value](#query.value)>| Value represents a typed value. |
780 |
781 | ### query.BoundQuery
782 |
783 | BoundQuery is a query with its bind variables
784 |
785 | #### Properties
786 |
787 | | Name |Description |
788 | | :-------- | :--------
789 | | sql <br>string| sql is the SQL query to execute |
790 | | bind_variables <br>map <string, [BindVariable](#query.bindvariable)>| bind_variables is a map of all bind variables to expand in the query |
791 |
792 | ### query.EventToken
793 |
794 | EventToken is a structure that describes a point in time in a replication stream on one shard. The most recent known replication position can be retrieved from vttablet when executing a query. It is also sent with the replication streams from the binlog service.
795 |
796 | #### Properties
797 |
798 | | Name |Description |
799 | | :-------- | :--------
800 | | timestamp <br>int64| timestamp is the MySQL timestamp of the statements. Seconds since Epoch. |
801 | | shard <br>string| The shard name that applied the statements. Note this is not set when streaming from a vttablet. It is only used on the client -> vtgate link. |
802 | | position <br>string| The position on the replication stream after this statement was applied. It is not the transaction ID / GTID, but the position / GTIDSet. |
803 |
804 | ### query.ExecuteOptions
805 |
806 | ExecuteOptions is passed around for all Execute calls.
807 |
808 | #### Properties
809 |
810 | | Name |Description |
811 | | :-------- | :--------
812 | | include_event_token <br>bool| If set, we will try to include an EventToken with the responses. (This field used to be exclude_field_names, which was replaced by the IncludedFields enum below.) |
813 | | compare_event_token <br>[EventToken](#query.eventtoken)| EventToken is a structure that describes a point in time in a replication stream on one shard. The most recent known replication position can be retrieved from vttablet when executing a query. It is also sent with the replication streams from the binlog service. |
814 | | included_fields <br>[IncludedFields](#executeoptions.includedfields)| Controls what fields are returned in Field message responses from mysql, i.e. field name, table name, etc. This is an optimization for high-QPS queries where the client knows what it's getting. |
815 |
816 | #### Enums
817 |
818 | ##### ExecuteOptions.IncludedFields
819 |
820 | | Name |Value |Description |
821 | | :-------- | :-------- | :--------
822 | | TYPE_AND_NAME | 0 | |
823 | | TYPE_ONLY | 1 | |
824 | | ALL | 2 | |
825 |
826 | ### query.Field
827 |
828 | Field describes a single column returned by a query
829 |
830 | #### Properties
831 |
832 | | Name |Description |
833 | | :-------- | :--------
834 | | name <br>string| name of the field as returned by the mysql C API |
835 | | type <br>[Type](#query.type)| vitess-defined type. Conversion function is in the sqltypes package. |
836 | | table <br>string| Remaining fields from the mysql C API. These fields are only populated when ExecuteOptions.included_fields is set to IncludedFields.ALL. |
837 | | org_table <br>string| |
838 | | database <br>string| |
839 | | org_name <br>string| |
840 | | column_length <br>uint32| column_length is really a uint32. All 32 bits can be used. |
841 | | charset <br>uint32| charset is actually a uint16. Only the lower 16 bits are used. |
842 | | decimals <br>uint32| decimals is actually a uint8. Only the lower 8 bits are used. |
843 | | flags <br>uint32| flags is actually a uint16. Only the lower 16 bits are used. |
844 |
845 | ### query.QueryResult
846 |
847 | QueryResult is returned by Execute and ExecuteStream. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows for each QueryResult in QueryResult[1:]).
848 |
849 | #### Properties
850 |
851 | | Name |Description |
852 | | :-------- | :--------
853 | | fields <br>list <[Field](#query.field)>| Field describes a single column returned by a query |
854 | | rows_affected <br>uint64| |
855 | | insert_id <br>uint64| |
856 | | rows <br>list <[Row](#query.row)>| Row is a database row. |
857 | | extras <br>[ResultExtras](#query.resultextras)| ResultExtras contains optional out-of-band information. Usually the extras are requested by adding ExecuteOptions flags. |
858 |
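The streaming contract described above (the first QueryResult carries only the fields, later ones carry only rows) means a client reassembles a full result by concatenating. A minimal sketch, using plain dicts as stand-ins for the protobuf messages:

```python
def merge_stream(results):
    """Combine StreamExecute responses into a single result.

    `results` is an iterable of dicts shaped like QueryResult:
    the first carries "fields", subsequent ones carry "rows".
    """
    merged = {"fields": [], "rows": []}
    for i, r in enumerate(results):
        if i == 0:
            merged["fields"] = r.get("fields", [])
        merged["rows"].extend(r.get("rows", []))
    return merged

stream = [
    {"fields": ["id", "name"]},          # first response: fields only
    {"rows": [[1, "a"], [2, "b"]]},      # later responses: rows only
    {"rows": [[3, "c"]]},
]
assert merge_stream(stream) == {
    "fields": ["id", "name"],
    "rows": [[1, "a"], [2, "b"], [3, "c"]],
}
```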
859 | ### query.ResultExtras
860 |
861 | ResultExtras contains optional out-of-band information. Usually the extras are requested by adding ExecuteOptions flags.
862 |
863 | #### Properties
864 |
865 | | Name |Description |
866 | | :-------- | :--------
867 | | event_token <br>[EventToken](#query.eventtoken)| EventToken is a structure that describes a point in time in a replication stream on one shard. The most recent known replication position can be retrieved from vttablet when executing a query. It is also sent with the replication streams from the binlog service. |
868 | | fresher <br>bool| If set, it means the data returned with this result is fresher than the compare_token passed in the ExecuteOptions. |
869 |
870 | ### query.ResultWithError
871 |
872 | ResultWithError represents a query response in the form of result or error but not both.
873 |
874 | #### Properties
875 |
876 | | Name |Description |
877 | | :-------- | :--------
878 | | error <br>[vtrpc.RPCError](#vtrpc.rpcerror)| RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code. |
879 | | result <br>[query.QueryResult](#query.queryresult)| QueryResult is returned by Execute and ExecuteStream. As returned by Execute, len(fields) is always equal to len(row) (for each row in rows). As returned by StreamExecute, the first QueryResult has the fields set, and subsequent QueryResults have rows set. As with Execute, len(QueryResult[0].fields) is always equal to len(row) (for each row in rows for each QueryResult in QueryResult[1:]). |
880 |
881 | ### query.Row
882 |
883 | Row is a database row.
884 |
885 | #### Properties
886 |
887 | | Name |Description |
888 | | :-------- | :--------
889 | | lengths <br>list <sint64>| lengths contains the length of each value in values. A length of -1 means that the field is NULL. While reading values, you have to accumulate the lengths to know the offset where the next value begins in values. |
890 | | values <br>bytes| values contains a concatenation of all values in the row. |
891 |
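Decoding a Row follows directly from the description above: walk `lengths`, treat -1 as NULL, and slice `values` at the accumulated offset. A minimal sketch:

```python
def decode_row(lengths, values):
    """Split the concatenated `values` bytes using `lengths`.

    A length of -1 marks a NULL field; other lengths are consumed
    from `values` in order, accumulating the offset as we go.
    """
    fields, offset = [], 0
    for length in lengths:
        if length < 0:
            fields.append(None)  # -1 means NULL
        else:
            fields.append(values[offset:offset + length])
            offset += length
    return fields

# Three fields: NULL, b"abc", b"de"
assert decode_row([-1, 3, 2], b"abcde") == [None, b"abc", b"de"]
```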
892 | ### query.StreamEvent
893 |
894 | StreamEvent describes a set of transformations that happened as a single transactional unit on a server. It is streamed back by the Update Stream calls.
895 |
896 | #### Properties
897 |
898 | | Name |Description |
899 | | :-------- | :--------
900 | | statements <br>list <[Statement](#streamevent.statement)>| The statements in this transaction. |
901 | | event_token <br>[EventToken](#query.eventtoken)| EventToken is a structure that describes a point in time in a replication stream on one shard. The most recent known replication position can be retrieved from vttablet when executing a query. It is also sent with the replication streams from the binlog service. |
902 |
903 | #### Messages
904 |
905 | ##### StreamEvent.Statement
906 |
907 | One individual Statement in a transaction.
908 |
909 | Properties
910 |
911 | | Name |Description |
912 | | :-------- | :--------
913 | | category <br>[Category](#streamevent.statement.category)| |
914 | | table_name <br>string| table_name, primary_key_fields and primary_key_values are set for DML. |
915 | | primary_key_fields <br>list <[Field](#query.field)>| Field describes a single column returned by a query |
916 | | primary_key_values <br>list <[Row](#query.row)>| Row is a database row. |
917 | | sql <br>bytes| sql is set for all queries. FIXME(alainjobart) we may not need it for DMLs. |
918 |
919 | #### Enums
920 |
921 | ##### StreamEvent.Statement.Category
922 |
923 | The category of one statement.
924 |
925 | | Name |Value |Description |
926 | | :-------- | :-------- | :--------
927 | | Error | 0 | |
928 | | DML | 1 | |
929 | | DDL | 2 | |
930 |
931 | ### query.Target
932 |
933 | Target describes what the client expects the tablet is. If the tablet does not match, an error is returned.
934 |
935 | #### Properties
936 |
937 | | Name |Description |
938 | | :-------- | :--------
939 | | keyspace <br>string| |
940 | | shard <br>string| |
941 | | tablet_type <br>[topodata.TabletType](#topodata.tablettype)| TabletType represents the type of a given tablet. |
942 |
943 | ### query.Value
944 |
945 | Value represents a typed value.
946 |
947 | #### Properties
948 |
949 | | Name |Description |
950 | | :-------- | :--------
951 | | type <br>[Type](#query.type)| |
952 | | value <br>bytes| |
953 |
954 | ### topodata.KeyRange
955 |
956 | KeyRange describes a range of sharding keys, when range-based sharding is used.
957 |
958 | #### Properties
959 |
960 | | Name |Description |
961 | | :-------- | :--------
962 | | start <br>bytes| |
963 | | end <br>bytes| |
964 |
965 | ### topodata.ShardReference
966 |
967 | ShardReference is used as a pointer from a SrvKeyspace to a Shard
968 |
969 | #### Properties
970 |
971 | | Name |Description |
972 | | :-------- | :--------
973 | | name <br>string| Copied from Shard. |
974 | | key_range <br>[KeyRange](#topodata.keyrange)| KeyRange describes a range of sharding keys, when range-based sharding is used. |
975 |
976 | ### topodata.SrvKeyspace
977 |
978 | SrvKeyspace is a rollup node for the keyspace itself.
979 |
980 | #### Properties
981 |
982 | | Name |Description |
983 | | :-------- | :--------
984 | | partitions
list <[KeyspacePartition](#srvkeyspace.keyspacepartition)>| The partitions this keyspace is serving, per tablet type. |
985 | | sharding_column_name
string| copied from Keyspace |
986 | | sharding_column_type
[KeyspaceIdType](#topodata.keyspaceidtype)| |
987 | | served_from
list <[ServedFrom](#srvkeyspace.servedfrom)>| |
988 |
989 | #### Messages
990 |
991 | ##### SrvKeyspace.KeyspacePartition
992 |
993 | Properties
994 |
995 | | Name |Description |
996 | | :-------- | :--------
997 | | served_type <br>[TabletType](#topodata.tablettype)| The type this partition applies to. |
998 | | shard_references <br>list <[ShardReference](#topodata.shardreference)>| ShardReference is used as a pointer from a SrvKeyspace to a Shard |
999 |
1000 | ##### SrvKeyspace.ServedFrom
1001 |
1002 | ServedFrom indicates a relationship between a TabletType and the keyspace name that's serving it.
1003 |
1004 | Properties
1005 |
1006 | | Name |Description |
1007 | | :-------- | :--------
1008 | | tablet_type <br>[TabletType](#topodata.tablettype)| the tablet type |
1009 | | keyspace <br>string| the keyspace name that's serving it |
1010 |
1011 | ### vtrpc.CallerID
1012 |
1013 | CallerID is passed along RPCs to identify the originating client for a request. It is not meant to be secure, but only informational. The client can put whatever info they want in these fields, and they will be trusted by the servers. The fields will just be used for logging purposes, and to easily find a client. VtGate propagates it to VtTablet, and VtTablet may use this information for monitoring purposes, to display on dashboards, or for blacklisting purposes.
1014 |
1015 | #### Properties
1016 |
1017 | | Name |Description |
1018 | | :-------- | :--------
1019 | | principal <br>string| principal is the effective user identifier. It is usually filled in with whoever made the request to the appserver, if the request came from an automated job or another system component. If the request comes directly from the Internet, or if the Vitess client takes action on its own accord, it is okay for this field to be absent. |
1020 | | component <br>string| component describes the running process of the effective caller. It can for instance be the hostname:port of the servlet initiating the database call, or the container engine ID used by the servlet. |
1021 | | subcomponent <br>string| subcomponent describes a component inside the immediate caller which is responsible for generating this request. Suggested values are a servlet name or an API endpoint name. |
1022 |
1023 | ### vtrpc.RPCError
1024 |
1025 | RPCError is an application-level error structure returned by VtTablet (and passed along by VtGate if appropriate). We use this so the clients don't have to parse the error messages, but instead can depend on the value of the code.
1026 |
1027 | #### Properties
1028 |
1029 | | Name |Description |
1030 | | :-------- | :--------
1031 | | code <br>[ErrorCode](#vtrpc.errorcode)| |
1032 | | message <br>string| |
1033 |
1034 |
--------------------------------------------------------------------------------
/VitessApi.md:
--------------------------------------------------------------------------------
1 | ## Vitess API Reference
2 |
3 | 1. List keyspaces
4 | GET :15000/api/keyspaces/
5 |
6 | 2. Create a keyspace
7 | POST :15000/api/vtctl/
8 |
9 | content-type:application/json
10 | ["CreateKeyspace","gwgggg"]
11 |
12 | 3. Delete a keyspace
13 | POST :15000/api/vtctl/
14 |
15 | content-type:application/json
16 | ["DeleteKeyspace","-recursive","gwgggg"]
17 |
18 | 4. Validate a keyspace
19 | POST :15000/api/vtctl/
20 |
21 | Content-Type:application/json;charset=UTF-8
22 | ["ValidateKeyspace","-ping-tablets","gw_keyspace"]
23 |
24 | ["ValidateSchemaKeyspace","gw_keyspace"]
25 |
26 | ["ValidateVersionKeyspace","test"]
27 |
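All of the vtctl-style calls above share one shape: POST a JSON array of command arguments to the vtctld `/api/vtctl/` endpoint. A minimal Python sketch of building such a request; the host and keyspace name are placeholders, and the actual send is left commented out so the snippet stands alone:

```python
import json
from urllib import request

def vtctl_request(host, args):
    """Build a POST request for the vtctld /api/vtctl/ endpoint.

    `args` is the command as a list, e.g. ["CreateKeyspace", "demo"].
    """
    body = json.dumps(args).encode("utf-8")
    return request.Request(
        "http://%s/api/vtctl/" % host,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = vtctl_request("localhost:15000", ["CreateKeyspace", "demo"])
assert req.get_method() == "POST"
assert json.loads(req.data) == ["CreateKeyspace", "demo"]
# To actually send it (requires a running vtctld):
# resp = request.urlopen(req)
```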
28 | 5. RebuildKeyspaceGraph
29 | POST :15000/api/vtctl/
30 |
31 | Content-Type:application/json;charset=UTF-8
32 | ["RebuildKeyspaceGraph","gw_keyspace"]
33 |
34 |
35 | 6. Create a shard
36 | POST :15000/api/vtctl/
37 |
38 | Content-Type:application/json;charset=UTF-8
39 | ["CreateShard","test/-10"]
40 |
41 | 7. Delete a shard
42 | POST :15000/api/vtctl/
43 |
44 | Content-Type:application/json;charset=UTF-8
45 | ["DeleteShard","-recursive","-even_if_serving","test/-10"]
46 |
47 | 8. List shards
48 | GET :15000/api/shards/gw_keyspace/
49 |
50 | ["-10","10-"]
51 |
52 | 9. List tablets
53 | POST :15000/api/tablets/
54 |
55 | content-type:application/x-www-form-urlencoded
56 | shard=gw_keyspace/-10
57 |
58 |
59 | [
60 | {
61 | "cell": "test",
62 | "uid": 1201
63 | },
64 | {
65 | "cell": "test",
66 | "uid": 1202
67 | },
68 | {
69 | "cell": "test",
70 | "uid": 1203
71 | },
72 | {
73 | "cell": "test",
74 | "uid": 1204
75 | }
76 | ]
77 |
78 | 10. Get the table schema from a tablet
79 | POST :15000/api/vtctl/
80 |
81 | content-type:application/json
82 | ["GetSchema","test-1201"]
83 |
84 | 11. Apply SQL (schema change)
85 | POST :15000/api/schema/apply
86 | Content-Type:application/json;charset=UTF-8
87 |
88 | {"Keyspace":"gw_keyspace","SQL":"CREATE TABLE user_info(\n `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'user id',\n `username` varchar(32) NOT NULL COMMENT 'user name',\n `userpasswd` varchar(32) NOT NULL COMMENT 'user password',\n `email` varchar(32) DEFAULT NULL COMMENT 'user email',\n `remark` varchar(128) DEFAULT NULL COMMENT 'remarks',\n `createtime` timestamp NOT NULL DEFAULT '2017-01-01 00:00:00' COMMENT 'creation time',\n `updatetime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',\n PRIMARY KEY(`id`)\n)ENGINE=InnoDB AUTO_INCREMENT = 1 DEFAULT CHARSET=utf8;"}
89 |
90 |
91 | 12. Initialize the shard master
92 | POST :15000/api/vtctl/
93 |
94 | Content-Type:application/json;charset=UTF-8
95 | ["InitShardMaster", "-force", "gw_keyspace/10-","test-0000001301"]
96 |
97 |
98 | 13. ReparentTablet
99 | POST :15000/api/vtctl/
100 |
101 | Content-Type:application/json;charset=utf-8
102 |
103 | ["ReparentTablet","test-1301"]
104 |
105 |
106 | 14. List Pods
107 | GET /api/v1/namespaces/{namespace}/pods
108 |
109 | 15. Create a Pod
110 | POST /api/v1/namespaces/{namespace}/pods
111 |
112 | 16. Delete a Pod
113 | DELETE /api/v1/namespaces/{namespace}/pods/{name}
114 |
115 | 17. Get Pod details
116 | GET /api/v1/namespaces/{namespace}/pods/{name}
117 |
118 | 18. List Services
119 | GET /api/v1/namespaces/{namespace}/services
120 |
121 | 19. Create a Service
122 | POST /api/v1/namespaces/{namespace}/services
123 |
124 | 20. Delete a Service
125 | DELETE /api/v1/namespaces/{namespace}/services/{name}
126 |
127 | 21. List ReplicationControllers
128 | GET /api/v1/namespaces/{namespace}/replicationcontrollers
129 |
130 | 22. Create a ReplicationController
131 | POST /api/v1/namespaces/{namespace}/replicationcontrollers
132 |
133 | 23. Delete a ReplicationController
134 | DELETE /api/v1/namespaces/{namespace}/replicationcontrollers/{name}
135 |
136 | 24. Get ReplicationController details
137 | GET /api/v1/namespaces/{namespace}/replicationcontrollers/{name}
138 |
139 | 25. List Namespaces
140 | GET /api/v1/namespaces
141 |
142 | 26. Create a Namespace
143 | POST /api/v1/namespaces
144 |
145 | 27. Delete a Namespace
146 | DELETE /api/v1/namespaces/{name}
147 |
148 | 28. Get Namespace details
149 | GET /api/v1/namespaces/{name}
150 |
--------------------------------------------------------------------------------
/VitessCluster.md:
--------------------------------------------------------------------------------
1 | # Vitess Best Practices on a Kubernetes Cluster
2 |
3 | ## Overview
4 | This document explains how to deploy and use Vitess on a Kubernetes cluster. It assumes you already have a working Kubernetes environment; if not, first see [Setting up a Kubernetes cluster with minikube](https://github.com/zssky/vitessdoc/blob/master/MinikubeCluster.md), or follow the [Kubernetes documentation](https://kubernetes.io/docs/) to build a production cluster.
5 |
6 | [Kubernetes documentation (Chinese)](https://www.kubernetes.org.cn/k8s)
7 |
8 | The following is a brief introduction to the basic commands:
9 |
10 | ## Common Cluster Commands
11 | ### kubectl
12 | * Key concepts
13 | [Pods](https://www.kubernetes.org.cn/kubernetes-pod)
14 | [Labels](https://www.kubernetes.org.cn/kubernetes-labels)
15 | [Replication Controller](https://www.kubernetes.org.cn/replication-controller-kubernetes)
16 | [Services](https://www.kubernetes.org.cn/kubernetes-services)
17 | [Volumes](https://www.kubernetes.org.cn/kubernetes-volumes)
18 | [Detailed kubectl command reference](https://www.kubernetes.org.cn/doc-45)
19 |
20 | * List pods
21 | ``` sh
22 | # Returns the list of pods already created in kubernetes; the main columns are:
23 | # NAME READY STATUS RESTARTS AGE
24 | $ kubectl get pod
25 | # NAME READY STATUS RESTARTS AGE
26 | # etcd-global-9002d 1/1 Running 0 2d
27 | # etcd-global-l3ph8 1/1 Running 0 2d
28 | # etcd-global-psj52 1/1 Running 0 2d
29 | ```
30 |
31 | * Show pod details
32 | ``` sh
33 | # Show detailed information for a pod by name, mainly about its containers
34 | $ kubectl describe pod etcd-global-9002d
35 | ```
36 |
37 | * List deployments
38 | ``` sh
39 | # Get the list of deployments
40 | $ kubectl get deployment
41 | ```
42 |
43 | * Delete a deployment
44 | ``` sh
45 | # Delete the deployment named etcd-minikube
46 | $ kubectl delete deployment etcd-minikube
47 | ```
48 |
49 | * Delete containers
50 | ``` sh
51 | # Deleting an rc deletes all containers controlled by that rc
52 | $ kubectl delete rc my-nginx
53 |
54 | # Deleting a svc releases its assigned virtual IP
55 | $ kubectl delete svc my-nginx
56 | ```
57 |
58 | * List Replication Controllers
59 | ``` sh
60 | # Get the list of Replication Controllers
61 | $ kubectl get rc
62 | ```
63 |
64 | * Access an internal port from outside
65 | ``` sh
66 | # expose creates a service that maps a random port on a local node to port 80 in the containers
67 | $ kubectl expose rc my-nginx --port=80 --type=LoadBalancer
68 | ```
69 |
70 | * Show service info
71 | ``` sh
72 | # expose above created a service named my-nginx; query it with:
73 | $ kubectl get svc my-nginx
74 | ```
75 |
76 | * Create a pod from a config file
77 | ``` sh
78 | # Create containers from a *.yaml config file
79 | $ kubectl create -f ./hello-world.yaml
80 | ```
81 |
82 | * Validate a config file
83 | ``` sh
84 | # The --validate flag checks the config file for correctness
85 | $ kubectl create -f ./hello-world.yaml --validate
86 | ```
87 |
88 | * View logs
89 | ``` sh
90 | # View the vttablet container's logs
91 | $ kubectl logs vttablet-100 vttablet
92 |
93 | # View the mysql container's logs in the vttablet pod
94 | $ kubectl logs vttablet-100 mysql
95 | ```
96 |
97 | * Open a shell in a container
98 | ``` sh
99 | # kubectl exec connects directly to the corresponding container
100 | $ kubectl exec vttablet-100 -c vttablet -t -i -- bash -il
101 | ```
102 |
103 | * Show service details
104 | ``` sh
105 | kubectl describe service etcd-global
106 | ```
107 |
108 | * Change the RC replica count
109 | ``` sh
110 | # Scale up or down dynamically by changing the RC replica count
111 | kubectl scale rc xxxx --replicas=3
112 | ```
113 | * List Replica Sets
114 | ``` sh
115 | kubectl get rs
116 | ```
117 |
118 | * List Endpoints
119 | ``` sh
120 | # List Endpoints
121 | # Endpoint => (Pod Ip + ContainerPort)
122 | kubectl get endpoints
123 | ```
124 |
125 | * List namespaces
126 | ``` sh
127 | kubectl get namespace
128 | ```
129 | * Cordon and uncordon a Node
130 | ``` sh
131 | # Cordon a Node: no new Pods will be scheduled on it, but existing Pods are not shut down
132 | kubectl patch node xxx -p '{"spec":{"unschedulable":true}}'
133 |
134 | # Uncordon the Node so Pods can be scheduled on it again
135 | kubectl patch node xxx -p '{"spec":{"unschedulable":false}}'
136 |
137 | # Cordoning can also be done with kubectl replace -f xxx.yaml
138 | ```
139 |
140 | * Add/remove/modify Pod Labels
141 | ``` sh
142 | # Add the label app=vitess to pod xxx
143 | kubectl label pod xxx app=vitess
144 |
145 | # Remove the label app from pod xxx
146 | kubectl label pod xxx app-
147 |
148 | # Modify the label on pod xxx
149 | kubectl label pod xxx app=mysql --overwrite
150 | ```
151 |
152 | * Follow Pod logs
153 | ``` sh
154 | # View a container's logs in a Pod; -f follows the output as it changes
155 | $ kubectl logs -f <pod-name> -c <container-name>
156 | ```
157 | * Inspect a Pod's previous state
158 | ``` sh
159 | # If a container was killed, this shows why it stopped (e.g. an OOM kill)
160 | $ kubectl get pod -o go-template='{{range.status.containerStatuses}}{{"Container Name: "}}{{.name}}{{"\r\nLastState: "}}{{.lastState}}{{end}}' simmemleak-60xbc
161 | ```
162 |
163 | ### Docker
164 | * Pull a container image
165 | ``` sh
166 | # Pull the remote image named test
167 | $ docker pull test
168 | # docker pull vitess/etcd:v2.0.13-lite
169 | # docker pull vitess/lite
170 | ```
171 |
172 | * List containers
173 | ``` sh
174 | # List currently running containers
175 | $ docker ps
176 |
177 | # Output columns:
178 | # CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
179 | ```
180 |
181 | * Log into a container
182 | ``` sh
183 | # Open a shell in a container by its ID
184 | $ docker exec -it <container-id> /bin/bash
185 | # docker exec -it 66f92ed4befb /bin/bash
186 | ```
187 |
188 | * Save a container image
189 | ``` sh
190 | # Save an already-downloaded image to a file; xxx is the image name (REPOSITORY)
191 | $ docker save -o xxx.tar xxx
192 | ```
193 |
194 | * Load an image
195 | ``` sh
196 | # Load an exported image file
197 | $ docker load --input xxx.tar
198 | ```
199 | If there are multiple image files, a script can batch-load them:
200 | ``` sh
201 | $ ls -l | awk -F ' ' '{print "docker load --input="$NF}' | sh
202 | ```
203 |
204 | * Commit a running container to an image
205 | ``` sh
206 | # List docker containers
207 | $ docker ps
208 | #CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
209 | #9bb89f5f488b ce3f89f83ead "/bin/bash" 59 minutes ago Up 59 minutes angry_pasteur
210 |
211 | # Commit container 9bb89f5f488b to an image
212 | $ docker commit 9bb89f5f488b vitesss/bootstrap
213 |
214 | # List images
215 | $ docker images
216 | #REPOSITORY TAG IMAGE ID CREATED SIZE
217 | #vitesss/bootstrap mysql56 376ef8e4540e 4 seconds ago 2.358 GB
218 | ```
219 | * Inspect a docker container
220 | ``` sh
221 | # Query container info, such as its IP address, with:
222 | # docker inspect 9bb89f5f488b
223 | $ docker inspect <container-id>
224 | ```
225 |
226 | ## Deploying Vitess
227 |
228 | This document assumes you already have experience [running Vitess locally](http://vitess.io/getting-started/local-instance.html) and want to deploy Vitess on Kubernetes, so the environment prerequisites are not covered in detail here; if anything is unclear, please consult the official documentation first.
229 |
230 | ### Building and Installing vtctlclient
231 |
232 | The `vtctlclient` tool is used to send commands to Vitess, so we need to install `vtctlclient` before setting up the environment.
233 |
234 | ```sh
235 | $ go get github.com/youtube/vitess/go/cmd/vtctlclient
236 | ```
237 | This command downloads and builds the `vtctlclient` source under `$GOPATH/src/github.com/youtube/vitess/`, and also copies the compiled vtctlclient binary into `$GOPATH/bin`.
238 |
239 | ### Local kubectl
240 |
241 | If you followed the installation docs, kubectl should already be installed locally; we verify this again here to make sure kubectl is in a usable state.
242 | Check that kubectl is installed and on your PATH:
243 |
244 | ```sh
245 | $ which kubectl
246 | ### example output:
247 | # /usr/local/bin/kubectl
248 | ```
249 |
250 | If kubectl is not on your $PATH, set the `KUBECTL` environment variable instead; otherwise the startup scripts will not be able to locate `kubectl`.
251 |
252 |
253 | ``` sh
254 | $ export KUBECTL=/export/working/bin/kubectl
255 | ```
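A common shell pattern for this kind of override is to fall back to whatever `kubectl` resolves to on `$PATH` when the variable is unset; a minimal sketch:

```sh
# Use $KUBECTL if it is already set, otherwise default to "kubectl".
KUBECTL=${KUBECTL:-kubectl}
echo "using: $KUBECTL"
```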
256 |
257 | ### Starting the Vitess cluster
258 |
259 | 1. **Change to your local Vitess source directory**
260 | With the steps above completed, we can try the official Vitess example. Switch to the $GOPATH/src/github.com/youtube/vitess/examples/kubernetes directory:
261 |
262 | ```
263 | $ cd $GOPATH/src/github.com/youtube/vitess/examples/kubernetes
264 | ```
265 |
266 | 2. **Edit the local configuration**
267 |
268 | Run the configure.sh script to generate a config.sh file, which holds your custom cluster settings. For backup storage the example supports two options, file and gcs; here we use the file option.
269 | ``` sh
270 |
271 | vitess/examples/kubernetes$ ./configure.sh
272 | ### example output:
273 | # Vitess Docker image (leave empty for default) []:
274 | # Backup Storage (file, gcs) [gcs]: file
275 | # Root directory for backups (usually an NFS mount): /backup
276 | # NOTE: You must add your NFS mount to the vtctld-controller-template
277 | # and vttablet-pod-template as described in the Kubernetes docs:
278 | # http://kubernetes.io/v1.0/docs/user-guide/volumes.html#nfs
279 | ```
280 | Note: for file-based backups we need to mount a read-write network volume into the vttablet and vtctld pods; any storage
281 | service can be mounted into Kubernetes via NFS (Network File System), which makes backups straightforward.
282 |
283 | 3. **Start the etcd cluster**
284 |
285 | The Vitess [topology service](http://vitess.io/overview/concepts.html#topology-service) stores coordination metadata for all the servers in a Vitess cluster; it keeps this data in a distributed storage system with strong consistency guarantees. In this example we use [etcd](https://github.com/coreos/etcd). Note that we need our own etcd clusters, separate from the one used by Kubernetes itself.
286 |
287 | ``` sh
288 | vitess/examples/kubernetes$ ./etcd-up.sh
289 | ### example output:
290 | # Creating etcd service for global cell...
291 | # service "etcd-global" created
292 | # service "etcd-global-srv" created
293 | # Creating etcd replicationcontroller for global cell...
294 | # replicationcontroller "etcd-global" created
295 | # ...
296 | ```
297 |
298 | This command creates two clusters: one for the [global cell](/user-guide/topology-service.html#global-vs-local) and one for the [local cell](http://vitess.io/overview/concepts.html#cell-data-center). You can check the status of the [pods](http://kubernetes.io/v1.1/docs/user-guide/pods.html) in the cluster by running:
299 |
300 | ``` sh
301 | $ kubectl get pods
302 | ### example output:
303 | # NAME READY STATUS RESTARTS AGE
304 | # etcd-global-8oxzm 1/1 Running 0 1m
305 | # etcd-global-hcxl6 1/1 Running 0 1m
306 | # etcd-global-xupzu 1/1 Running 0 1m
307 | # etcd-test-e2y6o 1/1 Running 0 1m
308 | # etcd-test-m6wse 1/1 Running 0 1m
309 | # etcd-test-qajdj 1/1 Running 0 1m
310 | ```
311 |
312 | 
313 |
314 | The first time a Kubernetes node has to download the Docker images it needs can take quite a while; while the images download, the pod status is Pending.
315 |
316 | **Note:** in this example, every script ending in `-up.sh` has a corresponding script ending in `-down.sh`, which you can use to stop individual components of the Vitess cluster without tearing down the whole cluster. For example, to remove the `etcd` deployment, run:
317 | ``` sh
318 | vitess/examples/kubernetes$ ./etcd-down.sh
319 | ```
320 |
321 | 4. **Start vtctld**
322 | `vtctld` provides interfaces for inspecting the state of the Vitess cluster, and also accepts RPC commands from `vtctlclient` to modify the cluster.
323 |
324 | ``` sh
325 | vitess/examples/kubernetes$ ./vtctld-up.sh
326 | ### example output:
327 | # Creating vtctld ClusterIP service...
328 | # service "vtctld" created
329 | # Creating vtctld replicationcontroller...
330 | # replicationcontroller "vtctld" created
331 | ```
332 |
333 | 5. **Use the vtctld web UI**
334 |
335 | To use vtctld from outside Kubernetes, you need to create a tunnel from your workstation with
336 | [kubectl proxy](http://kubernetes.io/v1.1/docs/user-guide/kubectl/kubectl_proxy.html).
337 |
338 | **Note:** the proxy command runs in the foreground, so open a separate terminal if you want to keep the proxy running.
339 |
340 | ``` sh
341 | $ kubectl proxy --port=8001
342 | ### example output:
343 | # Starting to serve on localhost:8001
344 | ```
345 |
346 | You can then open the vtctld web UI locally:
347 |
348 | http://localhost:8001/api/v1/proxy/namespaces/default/services/vtctld:web/
349 |
350 | A screenshot of the UI:
351 | 
352 |
353 | The same proxy also gives you access to the
354 | [Kubernetes Dashboard](http://kubernetes.io/v1.1/docs/user-guide/ui.html) for monitoring nodes, pods and services:
355 |
356 | http://localhost:8001/ui
357 | A screenshot of the dashboard:
358 | 
359 |
360 | 6. **Start the vttablets**
361 |
362 | A [tablet](http://vitess.io/overview/concepts.html#tablet) is the basic unit of scaling in Vitess. A tablet consists of a `vttablet` process and a `mysqld` process running on the same machine.
363 | On Kubernetes, we realize this coupling by placing the vttablet and mysqld containers together in a single [pod](http://kubernetes.io/v1.1/docs/user-guide/pods.html).
364 |
365 | Run the following script to launch the vttablet pods, which also include mysqld:
366 |
367 | ``` sh
368 | vitess/examples/kubernetes$ ./vttablet-up.sh
369 | ### example output:
370 | # Creating test_keyspace.shard-0 pods in cell test...
371 | # Creating pod for tablet test-0000000100...
372 | # pod "vttablet-100" created
373 | # Creating pod for tablet test-0000000101...
374 | # pod "vttablet-101" created
375 | # Creating pod for tablet test-0000000102...
376 | # pod "vttablet-102" created
377 | # Creating pod for tablet test-0000000103...
378 | # pod "vttablet-103" created
379 | # Creating pod for tablet test-0000000104...
380 | # pod "vttablet-104" created
381 | ```
382 | Shortly after startup, a [keyspace](http://vitess.io/overview/concepts.html#keyspace) named `test_keyspace` with a single shard named `0` will appear in the vtctld web UI. Click the shard name to see the
383 | list of tablets. Once all 5 tablets show up on the shard status page, you can continue to the next step. Note that it is normal for the tablets to be unhealthy at this point, since no databases have been initialized on them yet.
384 |
385 | The first time the tablets are created can take a long time if the [Vitess images](https://hub.docker.com/u/vitess/) have not yet been downloaded on the pods' nodes. You can also check tablet status from the command line with `kvtctl.sh`.
386 |
387 | ``` sh
388 | vitess/examples/kubernetes$ ./kvtctl.sh ListAllTablets test
389 | ### example output:
390 | # test-0000000100 test_keyspace 0 spare 10.64.1.6:15002 10.64.1.6:3306 []
391 | # test-0000000101 test_keyspace 0 spare 10.64.2.5:15002 10.64.2.5:3306 []
392 | # test-0000000102 test_keyspace 0 spare 10.64.0.7:15002 10.64.0.7:3306 []
393 | # test-0000000103 test_keyspace 0 spare 10.64.1.7:15002 10.64.1.7:3306 []
394 | # test-0000000104 test_keyspace 0 spare 10.64.2.6:15002 10.64.2.6:3306 []
395 | ```
396 |
397 | 
398 |
399 | 7. **Initialize the MySQL databases**
400 |
401 | Once all the tablets are up, we can initialize the underlying databases.
402 |
403 | **Note:** many `vtctlclient` commands print no output on success.
404 |
405 | First, designate one of the tablets as the initial master. Vitess automatically connects the mysqld instances of the other slaves so that they start replicating from the master; the default database is created the same way. Since our keyspace is named `test_keyspace`, the MySQL database is named `vt_test_keyspace`.
406 | ``` sh
407 | vitess/examples/kubernetes$ ./kvtctl.sh InitShardMaster -force test_keyspace/0 test-0000000100
408 | ### example output:
409 | # master-elect tablet test-0000000100 is not the shard master, proceeding anyway as -force was used
410 | # master-elect tablet test-0000000100 is not a master in the shard, proceeding anyway as -force was used
411 | ```
412 |
413 | **Note:** since the shard is being started for the first time, the tablets are not yet doing any replication and there is no existing master. The `-force` flag on `InitShardMaster` bypasses the sanity checks that would apply if this shard were not brand new.
414 |
415 | Once the tablets are updated, you will see one **master**, several **replica** and **rdonly** tablets:
416 |
417 | ``` sh
418 | vitess/examples/kubernetes$ ./kvtctl.sh ListAllTablets test
419 | ### example output:
420 | # test-0000000100 test_keyspace 0 master 10.64.1.6:15002 10.64.1.6:3306 []
421 | # test-0000000101 test_keyspace 0 replica 10.64.2.5:15002 10.64.2.5:3306 []
422 | # test-0000000102 test_keyspace 0 replica 10.64.0.7:15002 10.64.0.7:3306 []
423 | # test-0000000103 test_keyspace 0 rdonly 10.64.1.7:15002 10.64.1.7:3306 []
424 | # test-0000000104 test_keyspace 0 rdonly 10.64.2.6:15002 10.64.2.6:3306 []
425 | ```
426 |
427 | **replica** tablets are usually used to serve live web traffic, while **rdonly** tablets are usually used for offline processing, such as batch jobs and backups.
428 | The number of tablets of each [tablet type](http://vitess.io/overview/concepts.html#tablet) can be configured in the `vttablet-up.sh` script.
429 |
430 | 8. **Create a table**
431 |
432 | The `vtctlclient` command can apply a schema change across all tablets in a keyspace. The following command creates the table defined in the file `create_test_table.sql`:
433 |
434 | ``` sh
435 | # Make sure to run this from the examples/kubernetes dir, so it finds the file.
436 | vitess/examples/kubernetes$ ./kvtctl.sh ApplySchema -sql "$(cat create_test_table.sql)" test_keyspace
437 | ```
438 |
439 | The SQL that creates the table looks like this:
440 |
441 | ``` sql
442 | CREATE TABLE messages (
443 | page BIGINT(20) UNSIGNED,
444 | time_created_ns BIGINT(20) UNSIGNED,
445 | message VARCHAR(10000),
446 | PRIMARY KEY (page, time_created_ns)
447 | ) ENGINE=InnoDB
448 | ```
449 |
450 | We can confirm that the table was created on a given tablet by running the following command, where `test-0000000100` is a tablet alias as shown in the
451 | `ListAllTablets` output:
452 |
453 | ``` sh
454 | vitess/examples/kubernetes$ ./kvtctl.sh GetSchema test-0000000100
455 | ### example output:
456 | # {
457 | # "DatabaseSchema": "CREATE DATABASE `{{.DatabaseName}}` /*!40100 DEFAULT CHARACTER SET utf8 */",
458 | # "TableDefinitions": [
459 | # {
460 | # "Name": "messages",
461 | # "Schema": "CREATE TABLE `messages` (\n `page` bigint(20) unsigned NOT NULL DEFAULT '0',\n `time_created_ns` bigint(20) unsigned NOT NULL DEFAULT '0',\n `message` varchar(10000) DEFAULT NULL,\n PRIMARY KEY (`page`,`time_created_ns`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8",
462 | # "Columns": [
463 | # "page",
464 | # "time_created_ns",
465 | # "message"
466 | # ],
467 | # ...
468 | ```
469 |
470 | 9. **Take a backup**
471 |
472 | Now that the initial schema is applied, it is a good time to take the first [backup](http://vitess.io/user-guide/backup-and-restore.html). This backup will be used to automatically bootstrap any additional replicas you run, before they connect themselves to the master and catch up on replication.
473 | If an existing tablet fails and loses its data, it will likewise restore itself automatically from the latest backup and resume replication.
474 |
475 | Select one of the **rdonly** tablets and run the backup on it. We use **rdonly** instead of **replica** because the tablet pauses replication and stops serving during the data copy, in order to take a consistent snapshot.
476 | ``` sh
477 | vitess/examples/kubernetes$ ./kvtctl.sh Backup test-0000000104
478 | ```
479 |
480 | After the backup completes, you can list the backups for the shard:
481 |
482 | ``` sh
483 | vitess/examples/kubernetes$ ./kvtctl.sh ListBackups test_keyspace/0
484 | ### example output:
485 | # 2017-02-21.142940.test-0000000104
486 | ```
487 |
488 | 10. **Initialize the Vitess routing schema**
489 |
490 | In this example we use a single, unsharded database without any special configuration, so we only need to make sure the current configuration is marked as serving.
491 | We do that by running:
492 |
493 | ``` sh
494 | vitess/examples/kubernetes$ ./kvtctl.sh RebuildVSchemaGraph
495 | ```
496 |
497 | (This command prints no output on success.)
498 |
499 | 11. **Start vtgate**
500 |
501 | Vitess uses [vtgate](http://vitess.io/overview/#vtgate) to route each client query to the correct `vttablet`.
502 | On Kubernetes, the `vtgate` service distributes connections across a pool of `vtgate` pods, which are managed by a [replication controller](http://kubernetes.io/v1.1/docs/user-guide/replication-controller.html).
503 |
504 | ``` sh
505 | vitess/examples/kubernetes$ ./vtgate-up.sh
506 | ### example output:
507 | # Creating vtgate service in cell test...
508 | # service "vtgate-test" created
509 | # Creating vtgate replicationcontroller in cell test...
510 | # replicationcontroller "vtgate-test" created
511 | ```
512 | 12. **Wrap-up**
513 |
514 | At this point the whole Vitess environment is up. You can connect to it from the command line for testing, or deploy your own application against it; for reference, see the official [test client](http://vitess.io/getting-started/#test-your-cluster-with-a-client-app).
515 |
516 | With the steps above completed, the database can now be accessed through a Vitess client or the MySQL client.
517 |
518 | ## Sharding the data
519 |
520 | 1. Configure the sharding information
521 | The first thing we need to do is tell Vitess how we want to shard the data. We do that by providing the following VSchema configuration:
522 | ``` json
523 | {
524 | "Sharded": true,
525 | "Vindexes": {
526 | "hash": {
527 | "Type": "hash"
528 | }
529 | },
530 | "Tables": {
531 | "messages": {
532 | "ColVindexes": [
533 | {
534 | "Col": "page",
535 | "Name": "hash"
536 | }
537 | ]
538 | }
539 | }
540 | }
541 | ```
542 |
543 | This configuration says we want to shard the data by a `hash` of the `page` column. In other words, all messages for the same `page` are kept on a single shard,
544 | while the pages themselves are randomly spread across the different shards.
545 |
546 | We can apply the VSchema to Vitess with the following command:
547 |
548 | ``` sh
549 | vitess/examples/kubernetes$ ./kvtctl.sh ApplyVSchema -vschema "$(cat vschema.json)" test_keyspace
550 | ```
551 |
552 |
553 | 2. Bring up tablets for the new shards
554 |
555 | In the unsharded example, we started a single shard named *0* in *test_keyspace*, written as *test_keyspace/0*.
556 | Now we will bring up tablets for two different shards, named *test_keyspace/-80* and *test_keyspace/80-*:
557 |
558 | ``` sh
559 | vitess/examples/kubernetes$ ./sharded-vttablet-up.sh
560 | ### example output:
561 | # Creating test_keyspace.shard--80 pods in cell test...
562 | # ...
563 | # Creating test_keyspace.shard-80- pods in cell test...
564 | # ...
565 | ```
566 |
567 | Since the Guestbook app's sharding key is the page, this results in half of the pages landing on each shard; *0x80* is the midpoint of the [key range](http://vitess.io/user-guide/sharding.html#key-ranges-and-partitions).
568 | The shards cover the following ranges:
569 | [0x00, 0x80) and [0x80, 0xFF]
570 |
571 | During the data migration, the new shards run alongside the old one, but the old shard keeps serving all traffic until we explicitly cut over.
572 | You can watch the tablets come up in the `vtctld` web UI or with `kvtctl.sh ListAllTablets test`; once they are up, each shard should have 5 tablets.
573 | When the tablets are ready, we can initialize replication by electing a master for each new shard:
574 |
575 | ``` sh
576 | vitess/examples/kubernetes$ ./kvtctl.sh InitShardMaster -force test_keyspace/-80 test-0000000200
577 | vitess/examples/kubernetes$ ./kvtctl.sh InitShardMaster -force test_keyspace/80- test-0000000300
578 | ```
579 |
580 | There should now be 15 tablets in total, which you can verify with:
581 |
582 | ``` sh
583 | vitess/examples/kubernetes$ ./kvtctl.sh ListAllTablets test
584 | ### example output:
585 | # test-0000000100 test_keyspace 0 master 10.64.3.4:15002 10.64.3.4:3306 []
586 | # ...
587 | # test-0000000200 test_keyspace -80 master 10.64.0.7:15002 10.64.0.7:3306 []
588 | # ...
589 | # test-0000000300 test_keyspace 80- master 10.64.0.9:15002 10.64.0.9:3306 []
590 | # ...
591 | ```
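To double-check that the 15 tablets are spread 5 per shard, the shard column (the third field) of the `ListAllTablets` output can be tallied with awk; a sketch:

```sh
# Count tablets per shard; field 3 of the ListAllTablets output is the shard name.
./kvtctl.sh ListAllTablets test |
  awk '{count[$3]++} END {for (s in count) print s, count[s]}'
```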
592 |
593 | 3. Copy the data
594 |
595 | The new tablets start out empty, so we need to copy everything from the original shard onto the two new ones, starting with the schema:
596 |
597 | ``` sh
598 | vitess/examples/kubernetes$ ./kvtctl.sh CopySchemaShard test_keyspace/0 test_keyspace/-80
599 | vitess/examples/kubernetes$ ./kvtctl.sh CopySchemaShard test_keyspace/0 test_keyspace/80-
600 | ```
601 |
602 | Next we copy the data. Since the amount of data to copy can be very large, we use a special batch process called *vtworker*, which streams each row from
603 | the single source shard to the correct destination shard according to its *keyspace_id*.
604 |
605 | ``` sh
606 | vitess/examples/kubernetes$ ./sharded-vtworker.sh SplitClone test_keyspace/0
607 | ### example output:
608 | # Creating vtworker pod in cell test...
609 | # pods/vtworker
610 | # Following vtworker logs until termination...
611 | # I0416 02:08:59.952805 9 instance.go:115] Starting worker...
612 | # ...
613 | # State: done
614 | # Success:
615 | # messages: copy done, copied 11 rows
616 | # Deleting vtworker pod...
617 | # pods/vtworker
618 | ```
619 |
620 | Note: we only specified the source shard, *test_keyspace/0*, and not the destinations. The *SplitClone* process determines the destination shards automatically from key range coverage and overlap.
621 | In this case, shard *0* covers the entire range, so the process identifies *-80* and *80-* as the destinations, since together they cover the same range as *0*.
622 |
623 |
624 | To provide a consistent snapshot to copy from, the process takes one *rdonly* tablet (offline processing) in the old shard out of service and uses it as a static data source. The application keeps serving without downtime,
625 | because live traffic is handled by the *replica* and *master* tablets and is not affected. Other batch jobs are also unaffected,
626 | since they are served by the remaining, unpaused *rdonly* tablet.
627 |
628 |
629 | 4. Check filtered replication
630 |
631 | Once the copy from the *rdonly* tablet finishes, *vtworker* turns on [filtered replication](http://vitess.io/user-guide/sharding.html#filtered-replication) from the source shard to each destination shard,
632 | which replays all the changes made since the snapshot was taken.
633 |
634 | Once the destination shards have mostly caught up, filtered replication keeps applying new updates. You can see this by looking at the contents of each shard: add some new messages to various pages of the guestbook app, then observe that
635 | shard *0* sees all the messages, while the new shards each see only the messages routed to them.
636 |
637 | ``` sh
638 | # See what's on shard test_keyspace/0:
639 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000100 "SELECT * FROM messages"
640 | # See what's on shard test_keyspace/-80:
641 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000200 "SELECT * FROM messages"
642 | # See what's on shard test_keyspace/80-:
643 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000300 "SELECT * FROM messages"
644 | ```
645 |
646 | Tip: add a few messages on different pages of the Guestbook to observe how they are routed.
647 |
648 | 5. Check data integrity
649 |
650 | The *vtworker* batch process has another mode that compares the source and destination shards to verify that all the data is present and correct.
651 | The following commands check the data on each destination shard:
652 |
653 | ``` sh
654 | vitess/examples/kubernetes$ ./sharded-vtworker.sh SplitDiff test_keyspace/-80
655 | vitess/examples/kubernetes$ ./sharded-vtworker.sh SplitDiff test_keyspace/80-
656 | ```
657 |
658 | If any discrepancies are found, the program prints them.
659 | If everything checks out, you will see something like this:
660 |
661 | ```
662 | I0416 02:10:56.927313 10 split_diff.go:496] Table messages checks out (4 rows processed, 1072961 qps)
663 | ```
664 |
665 |
666 | 6. Cut over serving traffic
667 |
668 | Now we can switch all serving traffic over to the new shards.
669 | We use the [MigrateServedTypes](http://vitess.io/reference/vtctl.html#migrateservedtypes) command to migrate one
670 | [tablet type](http://vitess.io/overview/concepts.html#tablet) at a time within one [cell](http://vitess.io/overview/concepts.html#cell-data-center);
671 | at any point before the master is migrated, the migration can still be rolled back.
672 |
673 | ``` sh
674 | vitess/examples/kubernetes$ ./kvtctl.sh MigrateServedTypes test_keyspace/0 rdonly
675 | vitess/examples/kubernetes$ ./kvtctl.sh MigrateServedTypes test_keyspace/0 replica
676 | vitess/examples/kubernetes$ ./kvtctl.sh MigrateServedTypes test_keyspace/0 master
677 | ```
678 |
679 |
680 | During the *master* migration, the old master first stops accepting updates; the process then waits for the new shards to fully catch up
681 | via filtered replication before enabling serving on them. Since filtered replication has kept the data nearly up to date already, the app should only see a few seconds of downtime during the master cutover.
682 |
683 | Once the masters are fully migrated, filtered replication stops, and updates are enabled on the new shards while the old shard stops accepting them.
684 | Try it yourself: add some messages to the guestbook pages, then check which databases receive the updates.
685 |
686 | ``` sh
687 | # See what's on shard test_keyspace/0
688 | # (no updates visible since we migrated away from it):
689 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000100 "SELECT * FROM messages"
690 | # See what's on shard test_keyspace/-80:
691 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000200 "SELECT * FROM messages"
692 | # See what's on shard test_keyspace/80-:
693 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000300 "SELECT * FROM messages"
694 | ```
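The three checks above can be wrapped in a small loop; a sketch, reusing the master tablet aliases shown earlier:

```sh
# Run the same query against the master of each shard and label the output.
for tablet in test-0000000100 test-0000000200 test-0000000300; do
  echo "== $tablet =="
  ./kvtctl.sh ExecuteFetchAsDba "$tablet" "SELECT * FROM messages"
done
```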
695 |
696 |
697 |
698 | 7. Tear down the old shard
699 |
700 | Now that all traffic is served by the new shards, we can tear down the old one so its resources can be reclaimed. Run the `vttablet-down.sh` script to shut down
701 | the tablets of the unsharded setup:
702 |
703 | ``` sh
704 | vitess/examples/kubernetes$ ./vttablet-down.sh
705 | ### example output:
706 | # Deleting pod for tablet test-0000000100...
707 | # pods/vttablet-100
708 | # ...
709 | ```
710 | This shuts down the tablets of the old shard. Next, delete the shard's metadata with the following command to keep the topology consistent:
711 |
712 | ``` sh
713 | vitess/examples/kubernetes$ ./kvtctl.sh DeleteShard -recursive test_keyspace/0
714 | ```
715 |
716 | You can inspect the metadata on the **Topology** page or with `kvtctl.sh ListAllTablets test`; shard *0* no longer appears in the output,
717 | which confirms that it was deleted successfully. The same approach can be used whenever an unavailable or idle shard needs to be removed from the system.
718 |
719 |
720 |
721 | ## Further reading
722 | The base environment is built entirely on Kubernetes; the relevant Kubernetes documentation is listed below for reference.
723 |
724 | * [Kubernetes documentation](https://kubernetes.io/docs)
725 | * [Kubernetes documentation in Chinese](https://www.kubernetes.org.cn/k8s)
726 | * [Test client](http://vitess.io/getting-started/#test-your-cluster-with-a-client-app)
727 |
--------------------------------------------------------------------------------
/etcd.md:
--------------------------------------------------------------------------------
1 | # etcd Operations Handbook
2 |
3 | ## 1. List cluster members
4 | ``` sh
5 | # List the cluster members
6 | $ export ETCDCTL_API=3
7 | $ etcdctl --endpoints=http://etcd-global:4001 member list
8 |
9 | # 3c7b0b72a83cb207, started, etcd-global-l2ng6, http://b449bd7.etcd-global-srv.etcd2test.svc.hades.local:7001, http://192.168.81.75:4001
10 | # 446539d2bec101ad, started, etcd-global-dhzvw, http://790c7240.etcd-global-srv.etcd2test.svc.hades.local:7001, http://192.168.81.78:4001
11 | # e5e76d85dd6a2837, started, etcd-global-xkvps, http://add369ac.etcd-global-srv.etcd2test.svc.hades.local:7001, http://192.168.81.74:4001
12 | ```
13 |
14 | ## 2. List available service endpoints
15 | ``` sh
16 | # List the available service endpoints
17 | $ getsrv etcd-server tcp etcd-global-srv
18 | # b449bd7.etcd-global-srv.etcd2test.svc.hades.local.:7001
19 | # 790c7240.etcd-global-srv.etcd2test.svc.hades.local.:7001
20 | # add369ac.etcd-global-srv.etcd2test.svc.hades.local.:7001
21 | ```
22 |
23 |
24 | ## 3. Remove a member
25 | ``` sh
26 | # 446539d2bec101ad, started, etcd-global-dhzvw, http://790c7240.etcd-global-srv.etcd2test.svc.hades.local:7001, http://192.168.81.78:4001
27 | # member_id = 446539d2bec101ad
28 | $ export ETCDCTL_API=3
29 | $ etcdctl --endpoints=http://etcd-global:4001 member remove $member_id
30 |
31 | ```
32 | ## 4. Data backup
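With the etcd v3 API used above, a point-in-time backup can be taken with `etcdctl snapshot`; a sketch (the endpoint and target path are assumptions based on the examples above):

```sh
export ETCDCTL_API=3
# Save a point-in-time snapshot of the keyspace to a local file.
etcdctl --endpoints=http://etcd-global:4001 snapshot save /backup/etcd-global.db
# Print the snapshot metadata (hash, revision, total keys, size) to verify it.
etcdctl snapshot status /backup/etcd-global.db
```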
33 |
34 | ## 5. Clock drift between cluster nodes
35 |
36 | ## 6. Access control
37 |
38 | ## 7. Restarting the cluster from backup data
39 |
40 | ## 8. Failure scenarios
41 | Under normal circumstances, a container removes itself from the cluster when it shuts down. In abnormal cases, for example when the host machine dies outright, the container may exit before it has a chance to deregister itself. In that situation, if a new node needs to join, manually remove the stale member first and then add the new one.
42 |
43 | Testing shows that on a clean shutdown an etcd container removes itself from the cluster automatically: starting from a three-node cluster, shutting one node down removes it from the membership (member remove xxxxx); see the preStop hook in the etcd cluster startup script for details.
44 | With one node gone, the remaining two nodes keep handling requests normally; shutting down another node removes it automatically as well, leaving a single node, which can still serve on its own. This greatly improves the availability of the system:
45 | the cluster keeps working even when some of its nodes are shut down.
46 |
--------------------------------------------------------------------------------
/overview/Concepts.md:
--------------------------------------------------------------------------------
1 | This document defines common Vitess concepts and terminology.
2 |
3 | ## Keyspace
4 |
5 | A *keyspace* is a logical database. In the unsharded case, it maps directly
6 | to a MySQL database name, but it can also map to multiple MySQL databases.
7 |
8 | Reading data from a keyspace is like reading from a MySQL database. However,
9 | depending on the consistency requirements of the read operation, Vitess
10 | might fetch the data from a master database or from a replica. By routing
11 | each query to the appropriate database, Vitess allows your code to be
12 | structured as if it were reading from a single MySQL database.
13 |
14 | When a database is
15 | [sharded](http://en.wikipedia.org/wiki/Shard_(database_architecture)),
16 | a keyspace maps to multiple MySQL databases. In that case, a single query sent
17 | to Vitess will be routed to one or more shards, depending on where the requested
18 | data resides.
19 |
20 | ## Keyspace ID
21 |
22 | The *keyspace ID* is the value that is used to decide on which shard a given
23 | record lives. [Range-based Sharding](http://vitess.io/user-guide/sharding.html#range-based-sharding)
24 | refers to creating shards that each cover a particular range of keyspace IDs.
25 |
26 | Often, the keyspace ID is computed as the hash of some column in your data,
27 | such as the user ID. This would result in randomly spreading users across
28 | the range-based shards.
29 | Using this technique means you can split a given shard by replacing it with two
30 | or more new shards that combine to cover the original range of keyspace IDs,
31 | without having to move any records in other shards.
32 |
33 | Previously, our resharding process required each table to store this value as a
34 | `keyspace_id` column because it was computed by the application. However, this
35 | column is no longer necessary when you allow VTGate to compute the keyspace ID
36 | for you, for example by using a `hash` vindex.
37 |
38 | ## Shard
39 |
40 | A *shard* is a division within a keyspace. A shard typically contains one MySQL
41 | master and many MySQL slaves.
42 |
43 | Each MySQL instance within a shard has the same data (excepting some replication
44 | lag). The slaves can serve read-only traffic (with eventual consistency guarantees),
45 | execute long-running data analysis tools, or perform administrative tasks
46 | (backup, restore, diff, etc.).
47 |
48 | A keyspace that does not use sharding effectively has one shard.
49 | Vitess names the shard `0` by convention. When sharded, a keyspace has `N`
50 | shards with non-overlapping data.
51 |
52 | ### Resharding
53 |
54 | Vitess supports [dynamic resharding](http://vitess.io/user-guide/sharding.html#resharding),
55 | in which the number of shards is changed on a live cluster. This can be either
56 | splitting one or more shards into smaller pieces, or merging neighboring shards
57 | into bigger pieces.
58 |
59 | During dynamic resharding, the data in the source shards is copied into the
60 | destination shards, allowed to catch up on replication, and then compared
61 | against the original to ensure data integrity. Then the live serving
62 | infrastructure is shifted to the destination shards, and the source shards are
63 | deleted.
64 |
65 | ## Tablet
66 |
67 | A *tablet* is a combination of a `mysqld` process and a corresponding `vttablet`
68 | process, usually running on the same machine.
69 |
70 | Each tablet is assigned a *tablet type*, which specifies what role it currently
71 | performs.
72 |
73 | ### Tablet Types
74 |
75 | * **master** - A *replica* tablet that happens to currently be the MySQL master
76 | for its shard.
77 | * **replica** - A MySQL slave that is eligible to be promoted to *master*.
78 | Conventionally, these are reserved for serving live, user-facing
79 | requests (like from the website's frontend).
80 | * **rdonly** - A MySQL slave that cannot be promoted to *master*.
81 | Conventionally, these are used for background processing jobs,
82 | such as taking backups, dumping data to other systems, heavy
83 | analytical queries, MapReduce, and resharding.
84 | * **backup** - A tablet that has stopped replication at a consistent snapshot,
85 | so it can upload a new backup for its shard. After it finishes,
86 | it will resume replication and return to its previous type.
87 | * **restore** - A tablet that has started up with no data, and is in the process
88 | of restoring itself from the latest backup. After it finishes,
89 | it will begin replicating at the GTID position of the backup,
90 | and become either *replica* or *rdonly*.
91 | * **drained** - A tablet that has been reserved by a Vitess background
92 | process (such as rdonly tablets for resharding).
93 |
94 |
95 |
96 | ## Keyspace Graph
97 |
98 | The *keyspace graph* allows Vitess to decide which set of shards to use for a
99 | given keyspace, cell, and tablet type.
100 |
101 | ### Partitions
102 |
103 | During horizontal resharding (splitting or merging shards), there can be shards
104 | with overlapping key ranges. For example, the source shard of a split may serve
105 | `c0-d0` while its destination shards serve `c0-c8` and `c8-d0` respectively.
106 |
107 | Since these shards need to exist simultaneously during the migration,
108 | the keyspace graph maintains a list (called a *partitioning* or just a *partition*)
109 | of shards whose ranges cover all possible keyspace ID values, while being
110 | non-overlapping and contiguous. Shards can be moved in and out of this list to
111 | determine whether they are active.
112 |
113 | The keyspace graph stores a separate partitioning for each `(cell, tablet type)` pair.
114 | This allows migrations to proceed in phases: first migrate *rdonly* and
115 | *replica* requests, one cell at a time, and finally migrate *master* requests.
116 |
117 | ### Served From
118 |
119 | During vertical resharding (moving tables out from one keyspace to form a new
120 | keyspace), there can be multiple keyspaces that contain the same table.
121 |
122 | Since these multiple copies of the table need to exist simultaneously during
123 | the migration, the keyspace graph supports keyspace redirects, called
124 | `ServedFrom` records. That enables a migration flow like this:
125 |
126 | 1. Create `new_keyspace` and set its `ServedFrom` to point to `old_keyspace`.
127 | 1. Update the app to look for the tables to be moved in `new_keyspace`.
128 | Vitess will automatically redirect these requests to `old_keyspace`.
129 | 1. Perform a vertical split clone to copy data to the new keyspace and start
130 | filtered replication.
131 | 1. Remove the `ServedFrom` redirect to begin actually serving from `new_keyspace`.
132 | 1. Drop the now unused copies of the tables from `old_keyspace`.
133 |
134 | There can be a different `ServedFrom` record for each `(cell, tablet type)` pair.
135 | This allows migrations to proceed in phases: first migrate *rdonly* and
136 | *replica* requests, one cell at a time, and finally migrate *master* requests.
137 |
138 | ## Replication Graph
139 |
140 | The *replication graph* identifies the relationships between master
141 | databases and their respective replicas. During a master failover,
142 | the replication graph enables Vitess to point all existing replicas
143 | to a newly designated master database so that replication can continue.
144 |
145 | ## Topology Service
146 |
147 | The *[Topology Service](/user-guide/topology-service.html)*
148 | is a set of backend processes running on different servers.
149 | Those servers store topology data and provide a distributed locking service.
150 |
151 | Vitess uses a plug-in system to support various backends for storing topology
152 | data, which are assumed to provide a distributed, consistent key-value store.
153 | By default, our [local example](http://vitess.io/getting-started/local-instance.html)
154 | uses the ZooKeeper plugin, and the [Kubernetes example](http://vitess.io/getting-started/)
155 | uses etcd.
156 |
157 | The topology service exists for several reasons:
158 |
159 | * It enables tablets to coordinate among themselves as a cluster.
160 | * It enables Vitess to discover tablets, so it knows where to route queries.
161 | * It stores Vitess configuration provided by the database administrator that is
162 | needed by many different servers in the cluster, and that must persist between
163 | server restarts.
164 |
165 | A Vitess cluster has one global topology service, and a local topology service
166 | in each cell. Since *cluster* is an overloaded term, and one Vitess cluster is
167 | distinguished from another by the fact that each has its own global topology
168 | service, we refer to each Vitess cluster as a **toposphere**.
169 |
170 | ### Global Topology
171 |
172 | The global topology stores Vitess-wide data that does not change frequently.
173 | Specifically, it contains data about keyspaces and shards as well as the
174 | master tablet alias for each shard.
175 |
176 | The global topology is used for some operations, including reparenting and
177 | resharding. By design, the global topology server is not used a lot.
178 |
179 | In order to survive any single cell going down, the global topology service
180 | should have nodes in multiple cells, with enough to maintain quorum in the
181 | event of a cell failure.
182 |
183 | ### Local Topology
184 |
185 | Each local topology contains information related to its own cell.
186 | Specifically, it contains data about tablets in the cell, the keyspace graph
187 | for that cell, and the replication graph for that cell.
188 |
189 | The local topology service must be available for Vitess to discover tablets
190 | and adjust routing as tablets come and go. However, no calls to the topology
191 | service are made in the critical path of serving a query at steady state.
192 | That means queries are still served during temporary unavailability of topology.
193 |
194 | ## Cell (Data Center)
195 |
196 | A *cell* is a group of servers and network infrastructure collocated in an area,
197 | and isolated from failures in other cells. It is typically either a full data
198 | center or a subset of a data center, sometimes called a *zone* or *availability zone*.
199 | Vitess gracefully handles cell-level failures, such as when a cell is cut off the network.
200 |
201 | Each cell in a Vitess implementation has a [local topology service](#topology-service),
202 | which is hosted in that cell. The topology service contains most of the
203 | information about the Vitess tablets in its cell.
204 | This enables a cell to be taken down and rebuilt as a unit.
205 |
206 | Vitess limits cross-cell traffic for both data and metadata.
207 | While it may be useful to also have the ability to route read traffic to
208 | individual cells, Vitess currently serves reads only from the local cell.
209 | Writes will go cross-cell when necessary, to wherever the master for that shard
210 | resides.
211 |
--------------------------------------------------------------------------------
/overview/ScalingMySQL.md:
--------------------------------------------------------------------------------
1 | Traditionally, it's been difficult to scale a MySQL-based database to an arbitrary size. Since MySQL lacks the out-of-the-box multi-instance support required to really scale an application, the process can be complex and obscure.
2 |
3 | As the application grows, scripts emerge to back up data, migrate a master database, or run some offline data processing. Complexity creeps into the application layer, which increasingly needs to be aware of database details. And before we know it, any change needs a big engineering effort so we can keep scaling.
4 |
5 | Vitess grew out of YouTube's attempt to break this cycle, and YouTube decided to open source Vitess after realizing that this is a very common problem. Vitess simplifies every aspect of managing a MySQL cluster, allowing easy scaling to any size without complicating your application layer. It ensures your database can keep up when your application takes off, leaving you with a database that is flexible, secure, and easy to mine.
6 |
7 | This document talks about the process of moving from a single small database to a limitless database cluster. It explains how steps in that process influenced Vitess' design, linking to relevant parts of the Vitess documentation along the way. It concludes with tips for designing a new, highly scalable application and database schema.
8 |
9 | ## Getting started
10 |
11 | Vitess sits between your application and your MySQL database. It looks at incoming queries and routes them properly. So, instead of sending a query directly from your application to your database, you send it through Vitess, which understands your database topology and constantly monitors the health of individual database instances.
12 |
13 | While Vitess is designed to manage large, multi-instance databases, it offers features that simplify database setup and management at all stages of your product's lifecycle.
14 |
15 | Starting out, our first step is getting a simple, reliable, durable database cluster in place with a master instance and a couple of replicas. In Vitess terminology, that's a single-shard, single-keyspace database. Once that building block is in place, we can focus on replicating it to scale up.
16 |
17 | ### Planning for scale
18 |
19 | We recommend a number of best practices to facilitate scaling your database as your product evolves. You might not experience the benefits of these actions immediately, but adopting these practices from day one will make it much easier for your database and product to grow:
20 |
21 | * Always keep your database schema under source control and provide unit test coverage of that schema. Also check schema changes into source control and run unit tests against the newly modified schema.
22 | * Think about appropriate sharding keys for your data and structure that data accordingly. Usually, sharding keys are obvious -- e.g. a user ID. However, having your key(s) in place ahead of time is much easier than needing to retrofit your data before you can actually shard it.
23 | * Group tables that share the same sharding key. Similarly, split tables that don’t share the same sharding key into different keyspaces.
24 | * Avoid cross-shard queries (scatter queries). Instead, use MapReduce for offline queries and build Spark data processing pipelines for online queries. Usually, it is faster to extract raw data and then post-process it in the MapReduce framework.
25 | * Plan to have data in multiple data centers and regions. It is easier to migrate to multiple data centers if you've planned for incoming queries to be routed to a region or application server pool that, in turn, connects to the right database pool.
26 |
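The payoff of grouping tables by sharding key is that a given user's rows land on the same shard across all of those tables, so joins between them never need to cross shards. A minimal sketch of the idea, using an md5-modulo placement that stands in for a real sharding function:

```python
import hashlib

# Illustration only: two tables that share user_id as their sharding key
# co-locate a given user's rows on the same shard, so joins between them
# never cross shards. The md5-modulo placement is a stand-in scheme, not
# Vitess' actual hashing.
def shard_for(user_id, num_shards=4):
    digest = hashlib.md5(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

user_prefs_row = {"user_id": 42, "theme": "dark"}
cart_row = {"user_id": 42, "item": "sticker"}

# Both rows share the sharding key, so they land on the same shard.
print(shard_for(user_prefs_row["user_id"]) == shard_for(cart_row["user_id"]))  # True
```
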
27 | ## Step 1: Setting up a database cluster
28 |
29 | At the outset, plan to create a database cluster that has a master instance and a couple of read-only replicas (or slaves). The replicas would be able to take over if the master became unavailable, and they might also handle read-only traffic. You'd also want to schedule regular data backups.
30 |
31 | It's worth noting that master management is a complex and critical challenge for data reliability. At any given time, a shard has only one master instance, and all replica instances replicate from it. Your application -- either a component in your application layer or Vitess, if you are using it -- needs to be able to easily identify the master instance for write operations, recognizing that the master might change from time to time. Similarly, your application, with or without Vitess, should be able to seamlessly adapt to new replicas coming online or old ones being unavailable.
32 |
33 | ### Keep routing logic out of your application
34 |
35 | A core principle underlying Vitess' design is that your database and data management practices should always be ready to support your application's growth. So, you might not yet have an immediate need to store data in multiple data centers, shard your database, or even do regular backups. But when those needs arise, you want to be sure that you'll have an easy path to achieve them. Note that you can run Vitess in a Kubernetes cluster or on local hardware.
36 |
37 | With that in mind, you want to have a plan that allows your database to grow without complicating your application code. For example, if you reshard your database, your application code shouldn't need to change to identify the target shards for a particular query.
38 |
39 | Vitess has several components that keep this complexity out of your application:
40 |
41 | * Each MySQL instance is paired with a **vttablet** process, which provides features like connection pooling, query rewriting, and query de-duping.
42 | * Your application sends queries to **vtgate**, a light proxy that routes traffic to the correct vttablet(s) and then returns consolidated results to the application.
43 | * The **Topology Service** -- Vitess supports Zookeeper and etcd -- maintains configuration data for the database system. Vitess relies on the service to know where to route queries based on both the sharding scheme and the availability of individual MySQL instances.
44 | * The **vtctl** and **vtctld** tools offer command-line and web interfaces to the system.
45 |
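Conceptually, the routing that vtgate performs boils down to a topology lookup: find the shard whose keyspace-ID range covers the row, then pick a healthy vttablet serving that shard. The sketch below illustrates that flow; the topology layout and tablet names are invented for the example and do not reflect Vitess' real data structures:

```python
# Conceptual sketch of what a vtgate-style router does: look up the shard
# whose keyspace-ID range covers the row, then pick a healthy tablet for it.
TOPOLOGY = {
    "shards": [            # (range_start, range_end, shard_name), end exclusive
        (0x00, 0x80, "-80"),
        (0x80, 0x100, "80-"),
    ],
    "tablets": {
        "-80": ["tablet-1001", "tablet-1002"],
        "80-": ["tablet-2001"],
    },
}

def pick_tablet(keyspace_id):
    for start, end, shard in TOPOLOGY["shards"]:
        if start <= keyspace_id < end:
            return TOPOLOGY["tablets"][shard][0]  # e.g. first healthy tablet
    raise ValueError("keyspace id out of range")

print(pick_tablet(0x25))  # tablet-1001
print(pick_tablet(0x9A))  # tablet-2001
```
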
46 |
47 |
48 |
49 |
50 |
51 | Setting up these components directly -- for example, writing your own topology service or your own implementation of vtgate -- would require a lot of scripting specific to a given configuration. It would also yield a system that would be difficult and costly to support. In addition, while any one of the components on its own is useful in limiting complexity, you need all of them to keep your application as simple as possible while also optimizing performance.
52 |
53 | **Optional functionality to implement**
54 |
55 | * *Recommended*. Vitess has basic support for identifying or changing a master, but it doesn't aim to fully address this feature. As such, we recommend using another program, like [Orchestrator](https://github.com/outbrain/orchestrator), to monitor the health of your servers and to change your master database when necessary. (In a sharded database, each shard has a master.)
56 |
57 |
58 | * *Recommended*. You should have a way to monitor your database topology and set up alerts as needed. Vitess components facilitate this monitoring by exporting a lot of runtime variables, like QPS over the last few minutes, error rates, and query latency. The variables are exported in JSON format, and Vitess also supports an InfluxDB plug-in.
59 |
60 |
61 | * *Optional*. Using the Kubernetes scripts as a base, you could run Vitess components with other configuration management systems (like Puppet), cluster managers (like Mesos), or deployment environments (like AWS images).
62 |
63 | **Related Vitess documentation:**
64 |
65 | * [Running Vitess on Kubernetes](http://vitess.io/getting-started/)
66 | * [Running Vitess on a local server](http://vitess.io/getting-started/local-instance.html)
67 | * [Backing up data](http://vitess.io/user-guide/backup-and-restore.html)
68 | * [Reparenting - basic assignment of master instance in Vitess](http://vitess.io/user-guide/reparenting.html)
69 |
70 | ## Step 2: Connect your application to your database
71 |
72 | Obviously, your application needs to be able to call your database. So, we'll jump straight to explaining how you'd modify your application to connect to your database through vtgate.
73 |
74 | ### Using the Vitess connector
75 |
76 | The main protocol for connecting to Vitess is [gRPC](http://www.grpc.io/). The connection lets the application see the database and send queries to it. The queries are virtually identical to the ones the application would send directly to MySQL.
77 |
78 | Vitess supports connections for several languages:
79 |
80 | * **Go**: We provide a sql/database driver.
81 | * **Java**: We provide a library that wraps the gRPC code and a JDBC driver.
82 | * **PHP**: We provide a library that wraps the gRPC code and a PDO driver.
83 | * **Python**: We provide a library that wraps the gRPC code and a PEP-compliant driver.
84 |
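Whatever the language, each driver presents the interface idiomatic to that ecosystem. For Python, that means the PEP 249 (DB-API) pattern shown below; the standard library's sqlite3 module stands in for the Vitess driver so the example runs without a cluster, but the connect/cursor/execute flow is the same:

```python
import sqlite3

# The PEP 249 usage pattern a DB-API-compliant driver follows; sqlite3
# stands in here so the example is runnable without a Vitess cluster.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE messages (id INTEGER, body TEXT)")
cur.execute("INSERT INTO messages VALUES (?, ?)", (1, "hello"))
conn.commit()

cur.execute("SELECT body FROM messages WHERE id = ?", (1,))
row = cur.fetchone()
print(row[0])  # hello
conn.close()
```
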
85 | #### Unit testing database interactions
86 |
87 | The vttest library and executables provide a unit testing environment that lets you start a fake cluster that acts as an exact replica of your production environment for testing purposes. In the fake cluster, a single DB instance hosts all of your shards.
88 |
89 | ### Migrating production data to Vitess
90 |
91 | The easiest way to migrate data to your Vitess database is to take a backup of your existing data, restore it on the Vitess cluster, and go from there. However, that requires some downtime.
92 |
93 | Another, more complicated, approach is a live migration, which requires your application to support both direct MySQL access and Vitess access. In that approach, you'd enable MySQL replication from your source database to the Vitess master database. This would allow you to migrate quickly and with almost no downtime.
94 |
95 | Note that this path is highly dependent on the source setup. Thus, while Vitess provides helper tools, it does not offer a generic way to support this type of migration.
96 |
97 | **Related Vitess documentation:**
98 |
99 | * [Vitess API Reference](http://vitess.io/reference/vitess-api.html)
100 | * [Schema Management](http://vitess.io/user-guide/schema-management.html)
101 | * [Transport Security Model](http://vitess.io/user-guide/transport-security-model.html)
102 |
103 | ## Step 3: Vertical sharding (scaling to multiple keyspaces)
104 |
105 | Typically, the first step in scaling up is vertical sharding, in which you identify groups of tables that belong together and move them to separate keyspaces. A keyspace is a distributed database, and, usually, the databases are unsharded at this point. That said, it's possible that you'll need to horizontally shard your data (step 4) before scaling to multiple keyspaces.
106 |
107 | The benefit of splitting tables into multiple keyspaces is to parallelize access to the data (increased performance), and to prepare each smaller keyspace for horizontal sharding. And, in separating data into multiple keyspaces, you should aim to reach a point where:
108 |
109 | * All data inside a keyspace scales the same way. For example, in an e-commerce application, user data and product data don’t scale the same way. But user preferences and shopping cart data usually scale with the number of users.
110 | * You can choose a sharding key inside a keyspace and associate each row of every table with a value of the sharding key. Step 4 talks more about choosing a sharding key.
111 | * Joins are primarily within keyspaces. (Joins between keyspaces are costly.)
112 | * Transactions involving data in multiple keyspaces, which are also expensive, are uncommon.
113 |
114 | ### Scaling keyspaces with Vitess
115 |
116 | Several vtctl functions -- vtctl is Vitess' command-line tool for managing your database topology -- support features for vertically splitting a keyspace. In this process, a set of tables can be moved from an existing keyspace to a new keyspace with no read downtime and only a few seconds of write downtime.
117 |
118 | **Related Vitess documentation:**
119 |
120 | * [vtctl Reference guide](http://vitess.io/reference/vtctl.html)
121 |
122 | ## Step 4: Horizontal sharding (partitioning your data)
123 |
124 | The next step in scaling your data is horizontal sharding, the process of partitioning your data to improve scalability and performance. A shard is a horizontal partition of the data within a keyspace. Each shard has a master instance and replica instances, but data does not overlap between shards.
125 |
126 | In general, database sharding is most effective when the assigned keyspace IDs are evenly distributed among shards. Keyspace IDs identify the primary entity of a keyspace. For example, a keyspace ID might identify a user, a product, or a purchase.
127 |
128 | Since vanilla MySQL lacks native sharding support, you'd typically need to write sharding code and embed sharding logic in your application to shard your data.
129 |
130 | ### Sharding options in Vitess
131 |
132 | A keyspace in Vitess can use one of three sharding schemes:
133 |
134 | * **Not sharded** (or **unsharded**): The keyspace has one shard, which contains all of the data.
135 | * **Custom**: The keyspace has multiple shards, each of which can be targeted by a different connection pool. The application needs to target statements to the right shards.
136 | * **Range-based**: The application provides a sharding key for each record, and each shard contains a range of sharding keys. Vitess uses the sharding key values to route queries to the right shards and also supports advanced features like dynamic resharding.
137 |
138 | A prerequisite for sharding a keyspace in Vitess is that all of the tables in the keyspace contain a keyspace ID, which is a hashed version of the sharding key. Associating each row with a sharding key was one of the goals mentioned in step 3, but it becomes a requirement once you're ready to shard your data.
139 |
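The keyspace-ID mechanism can be illustrated in a few lines: hash the sharding key to a 64-bit value, then find the shard whose key range covers that value. The md5-based hash and the four shard ranges below are stand-ins for illustration, not Vitess' actual vindex functions:

```python
import hashlib

# Sketch of the keyspace-ID idea: hash the sharding key to a 64-bit value,
# then find the shard whose key range covers it. The md5-based hash here
# only illustrates the concept; it is not Vitess' actual hashing function.
def keyspace_id(sharding_key):
    digest = hashlib.md5(str(sharding_key).encode()).digest()
    return int.from_bytes(digest[:8], "big")

SHARDS = [  # (shard name, end-exclusive upper bound of its 64-bit key range)
    ("-40", 0x4000000000000000),
    ("40-80", 0x8000000000000000),
    ("80-c0", 0xC000000000000000),
    ("c0-", 1 << 64),
]

def shard_for(sharding_key):
    kid = keyspace_id(sharding_key)
    for name, upper in SHARDS:
        if kid < upper:
            return name

# Every table row keyed by the same user maps to the same shard.
print(shard_for(54321) == shard_for(54321))  # True
```
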
140 | Vitess offers robust resharding support, which involves updating the sharding scheme for a keyspace and dynamically reorganizing data to match the new scheme. During resharding, Vitess copies, verifies, and keeps data up-to-date on new shards while existing shards continue serving live read and write traffic. When you're ready to switch over, the migration occurs with just a few seconds of read-only downtime.
141 |
142 | **Related Vitess documentation:**
143 |
144 | * [Sharding](http://vitess.io/user-guide/sharding.html)
145 | * [Horizontal sharding (Codelab)](http://vitess.io/user-guide/horizontal-sharding.html)
146 | * [Sharding in Kubernetes (Codelab)](http://vitess.io/user-guide/sharding-kubernetes.html)
147 |
148 | ## Related tasks
149 |
150 | In addition to the four steps discussed above, you might also want to do some or all of the following as your application matures.
151 |
152 | ### Data processing input / output
153 |
154 | Hadoop is a framework that enables distributed processing of large data sets across clusters of computers using simple programming models.
155 |
156 | Vitess provides a Hadoop InputSource that can be used for any Hadoop MapReduce job or even connected to Spark. The Vitess InputSource takes a simple SQL query, splits that query into small chunks, and parallelizes data reading as much as possible across database instances, shards, etc.
157 |
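The core of that splitting step is simple to sketch: turn one large scan into several range-bounded scans that can be read in parallel. The table and column names below are illustrative:

```python
# Sketch of the "split a query into chunks" idea used by the Vitess Hadoop
# InputSource: break one large scan into per-range scans that can be read
# in parallel across workers.
def split_query(table, pk, min_id, max_id, chunks):
    step = (max_id - min_id + 1 + chunks - 1) // chunks  # ceiling division
    queries = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + step - 1, max_id)
        queries.append(f"SELECT * FROM {table} WHERE {pk} BETWEEN {lo} AND {hi}")
        lo = hi + 1
    return queries

for q in split_query("videos", "id", 1, 100, 4):
    print(q)
```
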
158 | ### Query log analysis
159 |
160 | Database query logs can help you to monitor and improve your application's performance.
161 |
162 | To that end, each vttablet instance provides runtime stats, which can be accessed through the tablet’s web page, for the queries the tablet is running. These stats make it easy to detect slow queries, which are usually hampered by a missing or mismatched table index. Reviewing these queries regularly helps maintain the overall health of your large database installation.
163 |
164 | Each vttablet instance can also provide a stream of all the queries it is running. If the Vitess cluster is colocated with a log cluster, you can dump this data in real time and then run more advanced query analysis.
165 |
--------------------------------------------------------------------------------
/overview/VitessOverview.md:
--------------------------------------------------------------------------------
1 | # What is Vitess
2 | Vitess is a database solution for scaling MySQL. It runs as effectively in public or private cloud architectures as it does on dedicated hardware.
3 | It combines and extends many important MySQL features with the scalability of a NoSQL database. Vitess has been serving all YouTube database traffic since 2011.
4 |
5 |
6 | # Running Vitess on Kubernetes
7 | Kubernetes is an open-source container management system. Vitess can run as a cloud-native distributed database in Kubernetes.
8 | Kubernetes handles scheduling onto nodes in a cluster, manages workloads on those nodes, and groups the containers comprising an application for easy management and discovery.
9 | This provides an open-source environment similar to the one in which Vitess runs at YouTube.
10 |
11 | # Comparisons
12 | The following sections compare Vitess to two common alternatives: vanilla MySQL and NoSQL.
13 |
14 | ## Vitess vs. vanilla MySQL
15 | Vitess improves on vanilla MySQL in several ways:
16 |
17 | Vanilla MySQL|Vitess
18 | :------|:----
19 | Every MySQL connection has a memory overhead that ranges between 256KB and almost 3MB, depending on your MySQL release. As your user base grows, you must add RAM to support additional connections, but the extra RAM does not make queries any faster. Obtaining the connections also carries a significant CPU cost.|Vitess creates very lightweight connections over its gRPC protocol. Vitess' connection pooling feature uses Go's concurrency support to map these lightweight connections to a small pool of MySQL connections, so Vitess can easily handle thousands of connections.
20 | Poorly written queries, such as those that lack a LIMIT clause, can hurt database performance for all users.|Vitess employs a SQL parser that uses a configurable set of rules to rewrite queries that might hurt database performance.
21 | Sharding is a process of partitioning your data to improve scalability and performance. MySQL lacks native sharding support, requiring you to write sharding code and embed sharding logic in your application.|Vitess uses range-based sharding. It supports both horizontal and vertical resharding, completing most data transitions with just a few seconds of read-only downtime. Vitess can even accommodate a custom sharding scheme you already have in place.
22 | A MySQL cluster using replication has one master database and several replicas. If the master fails, one of the replicas is promoted to be the new master; this requires you to manage the database lifecycle and communicate the current system state to your application.|Vitess helps to manage the lifecycle of your database instances. It supports and automatically handles various scenarios, including master failover and data backups.
23 | A MySQL cluster can have custom database configurations for different workloads, like a master database for writes, fast read-only replicas for web clients, slower read-only replicas for batch jobs, and so forth. If the database has horizontal sharding, this setup is repeated for each shard, and the application needs logic to find the right database.|Vitess stores the topology in a consistent data store such as etcd or ZooKeeper, so the cluster view is always up-to-date and consistent for different clients. Vitess also provides a proxy that routes queries efficiently to the most appropriate MySQL instance.
24 |
25 | ## Vitess vs. NoSQL
26 | If you're considering a NoSQL solution primarily because of concerns about the scalability of MySQL, Vitess might be a better choice for your application. While NoSQL provides great support for unstructured data, Vitess still offers several benefits that are not available in NoSQL datastores:
27 |
28 | NoSQL|Vitess
29 | :------|:----
30 | NoSQL databases do not define relationships between database tables, and they only support a subset of the SQL language.|Vitess is not a simple key-value store. It supports complex query semantics such as WHERE clauses, JOINS, aggregation functions, and more.
31 | NoSQL datastores do not support transactions.|Vitess supports transactions within a shard. We are also exploring the feasibility of supporting cross-shard transactions using 2PC.
32 | NoSQL solutions have custom APIs, leading to custom architectures, applications, and tools.|Vitess adds very little variance to MySQL, a database that most people are already accustomed to working with.
33 | NoSQL solutions provide limited support for database indexes compared to MySQL.|Vitess lets you use all of MySQL's indexing functionality to optimize query performance.
34 |
35 | # Features
36 | * Performance
37 |   * Connection pooling - Multiplex front-end connections while optimizing MySQL performance
38 |   * Query de-duping - Reuse the results of an in-flight query for any identical requests received while that query is still executing
39 |   * Transaction manager - Limit the number of concurrent transactions and manage deadlines to optimize overall throughput
40 | * Protection
41 |   * Query rewriting and sanitization - Add limits and avoid non-deterministic updates
42 |   * Query blacklisting - Customize rules to prevent potentially problematic queries from hitting your database
43 |   * Query killer - Terminate queries that take too long to return data
44 |   * Table ACLs - Specify access control lists (ACLs) for tables based on the connected user
45 | * Monitoring
46 |   * Performance analysis - Tools let you monitor, diagnose, and analyze your database performance
47 |   * Streaming queries - Use a list of incoming queries to serve OLAP workloads
48 |   * Update stream - A server streams the list of rows changing in the database, which can be used as a mechanism to propagate changes to other data stores
49 | * Topology management tools
50 |   * Master management tools (to handle reparenting)
51 |   * Web-based management GUI
52 |   * Designed to work in multiple data centers / regions
53 | * Sharding
54 |   * Virtually seamless dynamic resharding
55 |   * Vertical and horizontal sharding support
56 |   * Built-in support for both range-based and application-defined sharding schemes
57 |
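To make the query de-duping feature concrete, here is a minimal sketch of the idea: if an identical query is already in flight, later callers wait for and share its result instead of issuing the query again. Real vttablet logic is more involved; this only illustrates the mechanism:

```python
import threading

# Minimal sketch of query de-duping: if an identical query is already
# executing, later callers wait for and share its result instead of
# hitting MySQL again.
class QueryDeduper:
    def __init__(self, execute):
        self._execute = execute      # the function that actually runs SQL
        self._inflight = {}          # sql -> (done event, result holder)
        self._lock = threading.Lock()

    def run(self, sql):
        with self._lock:
            entry = self._inflight.get(sql)
            if entry is None:
                entry = (threading.Event(), {})
                self._inflight[sql] = entry
                leader = True        # this caller executes the query
            else:
                leader = False       # this caller waits for the leader
        event, holder = entry
        if leader:
            holder["result"] = self._execute(sql)
            with self._lock:
                del self._inflight[sql]
            event.set()
        else:
            event.wait()
        return holder["result"]

calls = []
def fake_execute(sql):
    calls.append(sql)
    return [("row",)]

dedup = QueryDeduper(fake_execute)
print(dedup.run("SELECT 1"))  # [('row',)]
```
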
58 | # Architecture
59 | The Vitess platform consists of a number of server processes, command-line utilities, and web-based utilities, backed by a consistent metadata store.
60 | Depending on the current state of your application, you could arrive at a full Vitess implementation through a number of different process flows. For example, if you're building a service from scratch, your first step with Vitess would be to define your database topology. If, instead, you need to scale an existing database, you'd likely start by deploying a connection proxy.
61 | Vitess tools and servers are designed to help you whether you start with a complete fleet of databases or grow from a minimal implementation over time. For smaller implementations, vttablet features like connection pooling and query rewriting help you get more from your existing hardware, while Vitess' automation tools provide additional benefits for larger implementations.
62 | The diagram below illustrates Vitess' components:
63 |
64 |
65 | ## Topology
66 | The Topology Service is a metadata store that contains information about running servers, the sharding scheme, and the replication graph. The topology is backed by a consistent data store. You can explore the topology using vtctl (command-line) and vtctld (web).
67 | In Kubernetes, the data store is etcd. Vitess source code also ships with Apache ZooKeeper support.
68 |
69 | ## vtgate
70 | vtgate is a light proxy server that routes traffic to the correct vttablet(s) and returns consolidated results back to the client. It is the server to which applications send queries, so the client can be very simple: it only needs to be able to find a vtgate instance.
71 | To route queries, vtgate considers the sharding scheme, the required latency, and the availability of the tablets and their underlying MySQL instances.
72 |
73 | ## vttablet
74 | vttablet is a proxy server that sits in front of a MySQL database; a Vitess implementation has one vttablet for each MySQL instance.
75 | vttablet performs tasks that attempt to maximize throughput while protecting MySQL from harmful queries. Its features include connection pooling, query rewriting, and query de-duping. In addition, vttablet executes management tasks initiated through vtctl, and it provides streaming services used for filtered replication and data exports.
76 | A lightweight Vitess implementation uses vttablet as a smart connection proxy for a single MySQL database. By running vttablet in front of your MySQL database and changing your application to use the Vitess client instead of your MySQL driver, your application benefits from vttablet's connection pooling, query rewriting, and query de-duping features.
77 |
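The connection-pooling idea that vttablet implements can be sketched as follows: many client requests borrow from a small, fixed pool of backend connections and return them when done. `FakeConn` stands in for a real MySQL connection; this illustrates the mechanism, not vttablet's implementation:

```python
import queue

# Minimal sketch of connection pooling: many client requests share a small
# fixed pool of backend connections. FakeConn stands in for a real MySQL
# connection.
class FakeConn:
    def __init__(self, conn_id):
        self.conn_id = conn_id
    def query(self, sql):
        return f"conn{self.conn_id}: {sql}"

class ConnPool:
    def __init__(self, size):
        self._pool = queue.Queue()
        for i in range(size):
            self._pool.put(FakeConn(i))

    def run(self, sql):
        conn = self._pool.get()      # blocks when all connections are busy
        try:
            return conn.query(sql)
        finally:
            self._pool.put(conn)     # return the connection for reuse

pool = ConnPool(size=2)              # 2 backend connections serve many callers
results = [pool.run("SELECT 1") for _ in range(5)]
print(len(results))  # 5
```
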
78 | ## vtctl
79 | vtctl is a command-line tool used to administer a Vitess cluster; it allows a human or an application to easily interact with a Vitess implementation. Using vtctl, you can identify master and replica databases, create tables, initiate failovers, perform sharding (and resharding) operations, and so forth. As vtctl performs operations, it updates the lockserver as needed, and other Vitess servers observe those changes and react accordingly. For example, if you use vtctl to fail over to a new master database, vtgate sees the change and directs subsequent write operations to the new master.
80 |
81 | ## vtctld
82 | vtctld is an HTTP server that lets you browse the information stored in the lockserver. It is useful for troubleshooting or for getting a high-level overview of the servers and their current states.
83 |
84 | ## vtworker
85 | vtworker hosts long-running processes. It supports a plugin architecture and provides libraries so that you can easily choose the tablets to use. Plugins are available for the following types of jobs:
86 | * resharding differ - verify data integrity after resharding
87 | * vertical split differ - verify data integrity after a vertical split
88 | vtworker also lets you easily add other validation procedures. You could do in-shard integrity checks to verify foreign-key-like relationships, or cross-shard integrity checks to verify, for example, that an index table in one keyspace references data in another keyspace.
89 |
90 | ## Other support tools
91 | Vitess also includes the following tools:
92 | * mysqlctl: Manage MySQL instances
93 | * zk: Command-line ZooKeeper client and explorer
94 | * zkctl: Manage ZooKeeper instances
95 |
96 | # History
97 | Vitess has been a fundamental component of YouTube's infrastructure since 2011. This section briefly summarizes the sequence of events that led to Vitess' creation:
98 | 1. YouTube's MySQL database reached a point when peak traffic would soon exceed the database's serving capacity. To temporarily alleviate the problem, YouTube created a master database for write traffic and a replica database for read traffic.
99 | 2. With demand for cat videos at an all-time high, read-only traffic was still high enough to overload the replica database. So YouTube added more replicas, again providing a temporary solution.
100 | 3. Eventually, write traffic became too high for the master database to handle, requiring YouTube to shard its data to handle incoming traffic. (Sharding would also have become necessary if the overall size of the database had become too large for a single MySQL instance.)
101 | 4. YouTube's application layer was modified so that, before executing any database operation, the code could identify the right database shard to receive a particular query.
102 |
103 | Vitess let YouTube remove that logic from the application's source code by introducing a proxy between the application and the database to route and manage database interactions. Since then, YouTube has scaled its user base by a factor of more than 50, greatly increasing its capacity to serve pages, process newly uploaded videos, and more. Even more importantly, Vitess is a platform that continues to scale.
104 | YouTube chose to write Vitess in Go because Go offers a combination of expressiveness and performance. It is almost as expressive as Python and very maintainable, yet its performance is in the same range as Java and, in some cases, approaches that of C++. In addition, the language is extremely well suited for concurrent programming and has a very high quality standard library.
105 |
106 | # Open source first
107 | The open-source version of Vitess is very similar to the version used at YouTube. While there are some changes that let YouTube take advantage of Google's infrastructure, the core functionality is the same. When developing new features, the Vitess team first makes them work in the open-source tree; in some cases, the team then writes a plugin that makes use of Google-specific technology. This approach ensures that the open-source version of Vitess maintains the same level of quality as the internal version.
108 | The vast majority of Vitess development takes place in the open, on GitHub. As such, Vitess is built with extensibility in mind so that you can adjust it to the needs of your infrastructure.
109 |
--------------------------------------------------------------------------------
/res/Kubernetes_ui.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zssky/vitessdoc/127d150358fd6b3b64c5704a01c29aafa9100dea/res/Kubernetes_ui.png
--------------------------------------------------------------------------------
/res/etcd_pods.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zssky/vitessdoc/127d150358fd6b3b64c5704a01c29aafa9100dea/res/etcd_pods.png
--------------------------------------------------------------------------------
/res/kvtctl_list.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zssky/vitessdoc/127d150358fd6b3b64c5704a01c29aafa9100dea/res/kvtctl_list.png
--------------------------------------------------------------------------------
/res/minikube_struct.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zssky/vitessdoc/127d150358fd6b3b64c5704a01c29aafa9100dea/res/minikube_struct.png
--------------------------------------------------------------------------------
/res/vitess_architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zssky/vitessdoc/127d150358fd6b3b64c5704a01c29aafa9100dea/res/vitess_architecture.png
--------------------------------------------------------------------------------
/res/vtctld_web.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zssky/vitessdoc/127d150358fd6b3b64c5704a01c29aafa9100dea/res/vtctld_web.png
--------------------------------------------------------------------------------
/started/DockerBuild.md:
--------------------------------------------------------------------------------
1 | By default, the [Kubernetes configs](https://github.com/youtube/vitess/tree/master/examples/kubernetes)
2 | point to the `vitess/lite` image on [Docker Hub](https://hub.docker.com/u/vitess/).
3 | This image is built periodically from the master branch on GitHub.
4 |
5 | If you want to build your own images, follow these steps:
6 |
7 | 1. Install [Docker](https://www.docker.com/) on your workstation.
8 |
9 |    Our scripts also assume you can run the `docker` command without sudo, which you can arrange by [setting up a docker group](https://docs.docker.com/engine/installation/linux/ubuntulinux/#create-a-docker-group).
10 |
11 | 2. Create an account on [Docker Hub](https://docs.docker.com/docker-hub/) and sign in with `docker login`.
12 |
13 | 3. Go to your `$GOPATH/src/github.com/youtube/vitess` directory.
14 |
15 | 4. Usually you won't need to [build your own bootstrap image](https://github.com/youtube/vitess/blob/master/docker/bootstrap/README.md)
16 |    unless you edit [bootstrap.sh](https://github.com/youtube/vitess/blob/master/bootstrap.sh)
17 |    or [vendor.json](https://github.com/youtube/vitess/blob/master/vendor/vendor.json),
18 |    for example to add new dependencies. If you don't need to rebuild the bootstrap image, you can pull one of the following images instead:
19 |
20 | ```sh
21 | vitess$ docker pull vitess/bootstrap:mysql57 # MySQL Community Edition 5.7
22 | vitess$ docker pull vitess/bootstrap:mysql56 # MySQL Community Edition 5.6
23 | vitess$ docker pull vitess/bootstrap:percona57 # Percona Server 5.7
24 | vitess$ docker pull vitess/bootstrap:percona # Percona Server
25 | vitess$ docker pull vitess/bootstrap:mariadb # MariaDB
26 | ```
27 |
28 | **Note:** To make sure your local images are up-to-date, it's best to run the pull commands above each time before you build.
29 |
30 | 5. Build the `vitess/lite[]` image. This will build the `vitess/base[]` image and then run a script that extracts only the files needed to run Vitess (`vitess/base` contains everything needed for development work).
31 |
32 | Build the image with one of the following commands (`docker_lite` defaults to MySQL 5.7):
33 |
34 | ```sh
35 | vitess$ make docker_lite
36 | vitess$ make docker_lite_mysql56
37 | vitess$ make docker_lite_percona57
38 | vitess$ make docker_lite_percona
39 | vitess$ make docker_lite_mariadb
40 | ```
41 |
42 | 6. Re-tag the image into your personal Docker Hub repository, then push it.
43 | ```sh
44 | vitess$ docker tag -f vitess/lite yourname/vitess
45 | vitess$ docker push yourname/vitess
46 | ```
47 |
48 | **Note:** If you built a non-default flavor, replace `vitess/lite` with the tagged form `vitess/lite:` plus the flavor suffix.
49 |
50 | 7. Change the image in the yaml configs to point to your personal repository:
51 | ```sh
52 | vitess/examples/kubernetes$ sed -i -e 's,image: vitess/lite,image: yourname/vitess:latest,' *.yaml
53 | ```
54 |
55 | Adding `:latest` tells Kubernetes to check for a newer image each time a pod launches, so new pods will pull your newly pushed image.
56 |
57 | Once you've stabilized your image, you'll probably want to replace `:latest`
58 | with a specific label that you change each time you make a new build,
59 | so you can control when pods update.
60 |
61 | 8. Now you can follow the [Vitess on Kubernetes](http://vitess.io/getting-started/) guide to bring up Vitess with your own image.
62 |
--------------------------------------------------------------------------------
/started/GettingStartedKubernetes.md:
--------------------------------------------------------------------------------
1 | This page explains how to run Vitess on a [Kubernetes](http://kubernetes.io) cluster. It also gives the steps to start a Kubernetes cluster with [Google Container Engine](https://cloud.google.com/container-engine/).
2 |
3 | If you already have Kubernetes v1.0+ running on another platform, you can skip the `gcloud` steps. The `kubectl` steps apply to any Kubernetes cluster.
4 |
5 | ## Prerequisites
6 |
7 | To follow this guide, you must locally install Go 1.7+, Vitess' `vtctlclient` tool, and the Google Cloud SDK. The following sections explain how to set these up in your environment.
8 | The Google Cloud SDK is not strictly required: if you run Kubernetes in your own environment, you can skip it.
9 |
10 | ### Install Go 1.7+
11 |
12 | You need to install [Go 1.7+](http://golang.org/doc/install) to build the `vtctlclient` tool, which issues administrative commands to a Vitess cluster.
13 |
14 | After installing Go, make sure your `GOPATH` environment variable points to the root of your workspace. The most common setting is `GOPATH=$HOME/go`,
15 | and the directory must be readable and writable by your non-root user.
16 |
17 |
18 | Also, make sure `$GOPATH/bin` is included in your `$PATH`. More information about setting up a Go workspace is available in [How to Write Go Code](http://golang.org/doc/code.html#Organization).
19 |
20 | ### Build and install vtctlclient
21 |
22 | The `vtctlclient` tool issues commands to Vitess:
23 | ``` sh
24 | $ go get github.com/youtube/vitess/go/cmd/vtctlclient
25 | ```
26 | This command downloads and builds the Vitess source code at:
27 | ``` sh
28 | $GOPATH/src/github.com/youtube/vitess/
29 | ```
30 |
31 | It also copies the compiled `vtctlclient` binary into `$GOPATH/bin`.
32 |
33 | ### Set up Google Compute Engine, Container Engine, and Cloud tools
34 |
35 | **Note:** If you are running Kubernetes elsewhere, skip to [Locate kubectl](#locate-kubectl).
36 |
37 | To run Vitess on Kubernetes using GCE, you need a GCE account with billing enabled. The steps below explain how to set up a project in the Google Developers Console, enable billing, and associate the billing account with the project.
38 |
39 | 1. Log in to the Google Developers Console to [enable billing](https://console.developers.google.com/billing).
40 | 1. Click the **Billing** pane if you are not there already.
41 | 1. Click **New billing account**.
42 | 1. Assign a name to the billing account -- e.g. "Vitess on
43 | Kubernetes." Then click **Continue**. You can sign up
44 | for the [free trial](https://cloud.google.com/free-trial/)
45 | to avoid any charges.
46 |
47 | 1. Create a project in the Google Developers Console that uses
48 | your billing account:
49 | 1. At the top of the Google Developers Console, click the **Projects** dropdown.
50 | 1. Click the Create a Project... link.
51 | 1. Assign a name to your project. Then click the **Create** button.
52 | Your project should be created and associated with your
53 | billing account. (If you have multiple billing accounts,
54 | confirm that the project is associated with the correct account.)
55 | 1. After creating your project, click **API Manager** in the left menu.
56 | 1. Find **Google Compute Engine** and **Google Container Engine API**.
57 | (Both should be listed under "Google Cloud APIs".)
58 | For each, click on it, then click the **"Enable API"** button.
59 |
60 | 1. Follow the [Google Cloud SDK quickstart
61 |    instructions](https://cloud.google.com/sdk/#Quick_Start) to set up
62 | and test the Google Cloud SDK. You will also set your default project
63 | ID while completing the quickstart.
64 |
65 | **Note:** If you skip the quickstart guide because you've previously set up
66 | the Google Cloud SDK, just make sure to set a default project ID by running
67 | the following command. Replace `PROJECT` with the project ID assigned to
68 | your [Google Developers Console](https://console.developers.google.com/)
69 | project. You can [find the
70 | ID](https://cloud.google.com/compute/docs/projects#projectids)
71 | by navigating to the **Overview** page for the project in the Console.
72 |
73 | ``` sh
74 | $ gcloud config set project PROJECT
75 | ```
76 |
77 | 1. Install or update the `kubectl` tool:
78 | ``` sh
79 | $ gcloud components update kubectl
80 | ```
81 |
82 | ### Locate kubectl
83 | Check that `kubectl` is installed and on your `PATH`:
84 |
85 | ``` sh
86 | $ which kubectl
87 | ### example output:
88 | # /usr/local/bin/kubectl
89 | ```
90 |
91 | If `kubectl` isn't on your `PATH`, you can tell our scripts where to find it by setting the `KUBECTL` environment variable:
92 |
93 | ``` sh
94 | $ export KUBECTL=/usr/local/bin/kubectl
95 | ```
96 |
97 | ## Start a container cluster
98 |
99 | **Note:** If you are running Kubernetes elsewhere, skip to [Start a Vitess cluster](#start-a-vitess-cluster).
100 | 1. Set the [zone](https://cloud.google.com/compute/docs/zones#overview)
101 | that your installation will use:
102 | ``` sh
103 | $ gcloud config set compute/zone us-central1-b
104 | ```
105 |
106 | 1. Create a Container Engine cluster:
107 | ``` sh
108 | $ gcloud container clusters create example --machine-type n1-standard-4 --num-nodes 5 --scopes storage-rw
109 | ### example output:
110 | # Creating cluster example...done.
111 | # Created [https://container.googleapis.com/v1/projects/vitess/zones/us-central1-b/clusters/example].
112 | # kubeconfig entry generated for example.
113 | ```
114 |
115 | **注意:** The `--scopes storage-rw` argument is necessary to allow
116 | [built-in backup/restore](http://vitess.io/user-guide/backup-and-restore.html)
117 | to access [Google Cloud Storage](https://cloud.google.com/storage/).
118 |
119 | 1. Create a Cloud Storage bucket:
120 |
121 | To use the Cloud Storage plugin for backups, first create a [bucket](https://cloud.google.com/storage/docs/concepts-techniques#concepts) for Vitess backup data. See the [bucket naming guidelines](https://cloud.google.com/storage/docs/bucket-naming) for details.
122 |
123 | ``` sh
124 | $ gsutil mb gs://my-backup-bucket
125 | ```
126 |
127 | ## Start a Vitess cluster
128 |
129 | 1. **Navigate to your local Vitess source code**
130 |
131 |    This directory was created when you installed `vtctlclient`:
132 |
133 | ``` sh
134 | $ cd $GOPATH/src/github.com/youtube/vitess/examples/kubernetes
135 | ```
136 |
137 | 2. **Configure site-local settings**
138 |
139 |    Run the `configure.sh` script to generate a `config.sh` file, which customizes your cluster settings.
140 |
141 |    Currently, we have out-of-the-box support for storing [backups](http://vitess.io/user-guide/backup-and-restore.html) in [Google Cloud Storage](https://cloud.google.com/storage/).
142 |    If you are using GCS, fill in the fields requested by the configure script, including the name of the bucket you created above.
143 |
144 | ``` sh
145 | vitess/examples/kubernetes$ ./configure.sh
146 | ### example output:
147 | # Backup Storage (file, gcs) [gcs]:
148 | # Google Developers Console Project [my-project]:
149 | # Google Cloud Storage bucket for Vitess backups: my-backup-bucket
150 | # Saving config.sh...
151 | ```
152 |
153 | For other platforms, you'll need to choose the `file` backup storage plugin and mount a read-write network volume into the `vttablet` and `vtctld` pods. For example, you can mount any storage service accessible through NFS into the Kubernetes cluster and then provide that mount path to the configure script here.
154 |
155 | Support for other cloud blob stores, such as Amazon S3, can be added by implementing the [Vitess BackupStorage plugin](https://github.com/youtube/vitess/blob/master/go/vt/mysqlctl/backupstorage/interface.go) interface. If you have a specific plugin request, let us know on the [forum](https://groups.google.com/forum/#!forum/vitess).
156 |
157 | 3. **Start an etcd cluster**
158 |    The Vitess [topology service](http://vitess.io/overview/concepts.html#topology-service) stores coordination data for all of the servers in a Vitess cluster, and it can store this data in one of several consistent storage systems. In this example, we'll use [etcd](https://github.com/coreos/etcd). Note that we need our own etcd clusters, separate from the one used by Kubernetes itself.
159 |
160 | ``` sh
161 | vitess/examples/kubernetes$ ./etcd-up.sh
162 | ### example output:
163 | # Creating etcd service for global cell...
164 | # service "etcd-global" created
165 | # service "etcd-global-srv" created
166 | # Creating etcd replicationcontroller for global cell...
167 | # replicationcontroller "etcd-global" created
168 | # ...
169 | ```
170 |
171 | This command creates two clusters: one for the [global cell](/user-guide/topology-service.html#global-vs-local) and one for the [local cell](http://vitess.io/overview/concepts.html#cell-data-center). You can check the status of the [pods](http://kubernetes.io/v1.1/docs/user-guide/pods.html) in the cluster by running:
172 |
173 | ``` sh
174 | $ kubectl get pods
175 | ### example output:
176 | # NAME READY STATUS RESTARTS AGE
177 | # etcd-global-8oxzm 1/1 Running 0 1m
178 | # etcd-global-hcxl6 1/1 Running 0 1m
179 | # etcd-global-xupzu 1/1 Running 0 1m
180 | # etcd-test-e2y6o 1/1 Running 0 1m
181 | # etcd-test-m6wse 1/1 Running 0 1m
182 | # etcd-test-qajdj 1/1 Running 0 1m
183 | ```
184 | It may take a while for each Kubernetes node to download the Docker images the first time it needs them. While an image is downloading, the pod status will be Pending.
185 |
186 | **Note:** In this example, each script that ends in `-up.sh` has a corresponding `-down.sh` script, which you can use to stop certain components of the Vitess cluster without bringing down the whole cluster. For example, to tear down the `etcd` deployment, run:
187 |
188 | ``` sh
189 | vitess/examples/kubernetes$ ./etcd-down.sh
190 | ```
191 |
192 | 4. **Start vtctld**
193 | The `vtctld` server provides interfaces for inspecting the state of the Vitess cluster, and also accepts RPC commands from `vtctlclient` to modify the cluster.
194 |
195 | ``` sh
196 | vitess/examples/kubernetes$ ./vtctld-up.sh
197 | ### example output:
198 | # Creating vtctld ClusterIP service...
199 | # service "vtctld" created
200 | # Creating vtctld replicationcontroller...
201 | # replicationcontroller "vtctld" created
202 | ```
203 |
204 | 5. **Access the vtctld web UI**
205 |
206 | To access vtctld from outside Kubernetes, you can use
207 | [kubectl proxy](http://kubernetes.io/v1.1/docs/user-guide/kubectl/kubectl_proxy.html) to create an authenticated tunnel on your workstation.
208 |
209 | **Note:** The proxy command runs in the foreground, so you may want to run it in a separate terminal.
210 |
211 | ``` sh
212 | $ kubectl proxy --port=8001
213 | ### example output:
214 | # Starting to serve on localhost:8001
215 | ```
216 |
217 | You can then view the vtctld web UI at this local address:
218 |
219 | http://localhost:8001/api/v1/proxy/namespaces/default/services/vtctld:web/
220 |
221 | You can also use the proxy to access the
222 | [Kubernetes Dashboard](http://kubernetes.io/v1.1/docs/user-guide/ui.html), where you can monitor the status of your nodes, pods, and services:
223 |
224 | http://localhost:8001/ui
225 |
226 | 6. **Send commands to vtctld with vtctlclient**
227 |
228 | Now you can run `vtctlclient` locally to issue commands to the `vtctld` service in your Kubernetes cluster.
229 |
230 | To enable RPC access into the Kubernetes cluster, we'll again use `kubectl` to set up an authenticated tunnel.
231 | Unlike the HTTP proxy we used for the web UI, this time we'll use [port forwarding](http://kubernetes.io/v1.1/docs/user-guide/kubectl/kubectl_port-forward.html) to reach the gRPC port of vtctld.
232 |
233 | Since the tunnel needs to target a particular vtctld pod name, we've provided the `kvtctl.sh` script, which uses `kubectl` to discover the pod name and set up the tunnel before running `vtctlclient`.
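Conceptually, a wrapper like `kvtctl.sh` does something along these lines. This is a simplified sketch, not the script's actual contents: `kubectl` is stubbed out so the sketch is self-contained, and the gRPC port number is an assumption.

``` sh
# Stub kubectl so the sketch runs anywhere; the real script calls the
# actual kubectl binary to look up the vtctld pod name.
kubectl() { echo "vtctld-abc12"; }

pod=$(kubectl get pods)                 # 1. discover the vtctld pod name
echo "kubectl port-forward $pod 15999"  # 2. tunnel to the (assumed) gRPC port
echo "vtctlclient -server localhost:15999 help"  # 3. run the command
```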
234 |
235 | Now, running `kvtctl.sh help` will test your connection to `vtctld` and also list the `vtctlclient` commands you can use to administer the Vitess cluster.
236 |
237 | ``` sh
238 | vitess/examples/kubernetes$ ./kvtctl.sh help
239 | ### example output:
240 | # Available commands:
241 | #
242 | # Tablets:
243 | # InitTablet ...
244 | # ...
245 | ```
246 | You can use the `help` command to get more details about each command:
247 |
248 | ``` sh
249 | vitess/examples/kubernetes$ ./kvtctl.sh help ListAllTablets
250 | ```
251 | For a web-formatted version of the `vtctl help` output, see the [vtctl reference](http://vitess.io/reference/vtctl.html).
252 |
253 | 7. **Start vttablets**
254 |
255 | A [tablet](http://vitess.io/overview/concepts.html#tablet) is the unit of scaling for Vitess. A tablet consists of the `vttablet` and `mysqld` processes running on the same host.
256 | In Kubernetes, we enforce this coupling by putting the vttablet and mysqld containers inside a single [pod](http://kubernetes.io/v1.1/docs/user-guide/pods.html).
257 |
258 | Run the following script to launch the vttablet pods, which also include mysqld:
259 |
260 | ``` sh
261 | vitess/examples/kubernetes$ ./vttablet-up.sh
262 | ### example output:
263 | # Creating test_keyspace.shard-0 pods in cell test...
264 | # Creating pod for tablet test-0000000100...
265 | # pod "vttablet-100" created
266 | # Creating pod for tablet test-0000000101...
267 | # pod "vttablet-101" created
268 | # Creating pod for tablet test-0000000102...
269 | # pod "vttablet-102" created
270 | # Creating pod for tablet test-0000000103...
271 | # pod "vttablet-103" created
272 | # Creating pod for tablet test-0000000104...
273 | # pod "vttablet-104" created
274 | ```
275 | Shortly afterward, you should see a [keyspace](http://vitess.io/overview/concepts.html#keyspace) named `test_keyspace` appear in the vtctld web UI, with a single shard named `0`. Click on the shard name to see the
276 | list of tablets. When all 5 tablets show up on the shard status page, you're ready to continue. Note that it's normal for the tablets to be unhealthy at this point, since no databases have been initialized on them yet.
277 |
278 | It can take some time for the tablets to come up the first time if the pod's node hasn't yet downloaded the [Vitess images](https://hub.docker.com/u/vitess/). You can also check the status of the tablets from the command line with `kvtctl.sh`:
279 |
280 | ``` sh
281 | vitess/examples/kubernetes$ ./kvtctl.sh ListAllTablets test
282 | ### example output:
283 | # test-0000000100 test_keyspace 0 spare 10.64.1.6:15002 10.64.1.6:3306 []
284 | # test-0000000101 test_keyspace 0 spare 10.64.2.5:15002 10.64.2.5:3306 []
285 | # test-0000000102 test_keyspace 0 spare 10.64.0.7:15002 10.64.0.7:3306 []
286 | # test-0000000103 test_keyspace 0 spare 10.64.1.7:15002 10.64.1.7:3306 []
287 | # test-0000000104 test_keyspace 0 spare 10.64.2.6:15002 10.64.2.6:3306 []
288 | ```
289 |
290 | 8. **Initialize MySQL databases**
291 |
292 | Once all the tablets are up, we can initialize the underlying databases.
293 |
294 | **Note:** Many `vtctlclient` commands produce no output on success.
295 |
296 | First, designate one of the tablets to be the initial master. Vitess will automatically connect the other slaves' mysqld instances so that they start replicating from the master's mysqld; this is also when the default database is created. Since our keyspace is named `test_keyspace`, the MySQL database will be named `vt_test_keyspace`.
297 | ``` sh
298 | vitess/examples/kubernetes$ ./kvtctl.sh InitShardMaster -force test_keyspace/0 test-0000000100
299 | ### example output:
300 | # master-elect tablet test-0000000100 is not the shard master, proceeding anyway as -force was used
301 | # master-elect tablet test-0000000100 is not a master in the shard, proceeding anyway as -force was used
302 | ```
303 |
304 | **Note:** Since this is the first time the shard has been started, the tablets are not ready to do any replication yet, and there is no existing master. The `-force` flag on `InitShardMaster` bypasses the sanity check that would apply if this weren't a brand-new shard.
305 |
306 | Once the tablets finish updating, you should see one **master**, and several **replica** and **rdonly** tablets:
307 |
308 | ``` sh
309 | vitess/examples/kubernetes$ ./kvtctl.sh ListAllTablets test
310 | ### example output:
311 | # test-0000000100 test_keyspace 0 master 10.64.1.6:15002 10.64.1.6:3306 []
312 | # test-0000000101 test_keyspace 0 replica 10.64.2.5:15002 10.64.2.5:3306 []
313 | # test-0000000102 test_keyspace 0 replica 10.64.0.7:15002 10.64.0.7:3306 []
314 | # test-0000000103 test_keyspace 0 rdonly 10.64.1.7:15002 10.64.1.7:3306 []
315 | # test-0000000104 test_keyspace 0 rdonly 10.64.2.6:15002 10.64.2.6:3306 []
316 | ```
317 |
318 | The **replica** tablets are used for serving live web traffic, while the **rdonly** tablets are used for offline processing, such as batch jobs and backups.
319 | The number of each [tablet type](http://vitess.io/overview/concepts.html#tablet) can be customized in the `vttablet-up.sh` script.
320 |
321 | 9. **Create a table**
322 |
323 | The `vtctlclient` tool can apply a schema change across all tablets in a keyspace. The following command creates the table defined in the `create_test_table.sql` file:
324 |
325 | ``` sh
326 | # Make sure to run this from the examples/kubernetes dir, so it finds the file.
327 | vitess/examples/kubernetes$ ./kvtctl.sh ApplySchema -sql "$(cat create_test_table.sql)" test_keyspace
328 | ```
329 |
330 | The SQL to create the table looks like this:
331 |
332 | ``` sql
333 | CREATE TABLE messages (
334 | page BIGINT(20) UNSIGNED,
335 | time_created_ns BIGINT(20) UNSIGNED,
336 | message VARCHAR(10000),
337 | PRIMARY KEY (page, time_created_ns)
338 | ) ENGINE=InnoDB
339 | ```
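As the sample output later in this guide shows, `time_created_ns` holds an epoch timestamp in nanoseconds. On Linux, a value in that format can be produced with GNU date; this is just an illustration of the format, since the app generates its own timestamps:

``` sh
# Epoch time in nanoseconds, e.g. 1460771336286560000 (19 digits).
ts=$(date +%s%N)
echo "$ts"
```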
340 |
341 | We can confirm that the table was created on a given tablet by running the following command, where `test-0000000100` is a tablet alias as shown in the
342 | `ListAllTablets` output:
343 |
344 | ``` sh
345 | vitess/examples/kubernetes$ ./kvtctl.sh GetSchema test-0000000100
346 | ### example output:
347 | # {
348 | # "DatabaseSchema": "CREATE DATABASE `{{.DatabaseName}}` /*!40100 DEFAULT CHARACTER SET utf8 */",
349 | # "TableDefinitions": [
350 | # {
351 | # "Name": "messages",
352 | # "Schema": "CREATE TABLE `messages` (\n `page` bigint(20) unsigned NOT NULL DEFAULT '0',\n `time_created_ns` bigint(20) unsigned NOT NULL DEFAULT '0',\n `message` varchar(10000) DEFAULT NULL,\n PRIMARY KEY (`page`,`time_created_ns`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8",
353 | # "Columns": [
354 | # "page",
355 | # "time_created_ns",
356 | # "message"
357 | # ],
358 | # ...
359 | ```
360 |
361 | 10. **Take a backup**
362 |
363 | Now that the initial schema is applied, it's a good time to take the first [backup](http://vitess.io/user-guide/backup-and-restore.html). This backup will be used to automatically restore any additional replicas that you run, before they connect themselves to the master and catch up on replication.
364 |
365 | If an existing tablet goes down and comes back up without its data, it will also automatically restore from the latest backup and resume replication.
366 |
367 | Select one of the **rdonly** tablets and tell it to take a backup. We use a **rdonly** tablet instead of a **replica** because the tablet pauses replication and stops serving during the data copy, in order to create a consistent snapshot.
368 | ``` sh
369 | vitess/examples/kubernetes$ ./kvtctl.sh Backup test-0000000104
370 | ```
371 |
372 | After the backup completes, you can list available backups for the shard:
373 |
374 | ``` sh
375 | vitess/examples/kubernetes$ ./kvtctl.sh ListBackups test_keyspace/0
376 | ### example output:
377 | # 2015-10-21.042940.test-0000000104
378 | ```
379 |
380 | 11. **Initialize Vitess routing**
381 |
382 | In this example, we're just using a single database with no specific configuration, so we only need to make sure the current (empty) configuration is in a serving state.
383 | We can do that by running the following command:
384 |
385 | ``` sh
386 | vitess/examples/kubernetes$ ./kvtctl.sh RebuildVSchemaGraph
387 | ```
388 |
389 | (This command produces no output on success.)
390 |
391 | 12. **Start vtgate**
392 |
393 | Vitess uses [vtgate](http://vitess.io/overview/#vtgate) to route each client query to the correct `vttablet`.
394 | In Kubernetes, a `vtgate` service distributes connections to a pool of `vtgate` pods curated by a [replication controller](http://kubernetes.io/v1.1/docs/user-guide/replication-controller.html).
395 |
396 | ``` sh
397 | vitess/examples/kubernetes$ ./vtgate-up.sh
398 | ### example output:
399 | # Creating vtgate service in cell test...
400 | # service "vtgate-test" created
401 | # Creating vtgate replicationcontroller in cell test...
402 | # replicationcontroller "vtgate-test" created
403 | ```
404 |
405 | ## Testing your cluster with a client app
406 |
407 | The GuestBook app in the example is ported from the
408 | [Kubernetes GuestBook example](https://github.com/kubernetes/kubernetes/tree/master/examples/guestbook-go).
409 | The server-side code has been rewritten in Python to use Vitess as the storage
410 | engine. The client-side code (HTML/JavaScript) has been modified to support
411 | multiple Guestbook pages, which will be useful to demonstrate Vitess sharding in
412 | a later guide.
413 |
414 | ``` sh
415 | vitess/examples/kubernetes$ ./guestbook-up.sh
416 | ### example output:
417 | # Creating guestbook service...
418 | # services "guestbook" created
419 | # Creating guestbook replicationcontroller...
420 | # replicationcontroller "guestbook" created
421 | ```
422 |
423 | As with the `vtctld` service, by default the GuestBook app is not accessible
424 | from outside Kubernetes. In this case, since this is a user-facing frontend,
425 | we set `type: LoadBalancer` in the GuestBook service definition,
426 | which tells Kubernetes to create a public
427 | [load balancer](http://kubernetes.io/v1.1/docs/user-guide/services.html#type-loadbalancer)
428 | using the API for whatever platform your Kubernetes cluster is in.
429 |
430 | You also need to
431 | [allow access through your platform's firewall](http://kubernetes.io/v1.1/docs/user-guide/services-firewalls.html).
432 |
433 | ``` sh
434 | # For example, to open port 80 in the GCE firewall:
435 | $ gcloud compute firewall-rules create guestbook --allow tcp:80
436 | ```
437 |
438 | **Note:** For simplicity, the firewall rule above opens the port on **all**
439 | GCE instances in your project. In a production system, you would likely
440 | limit it to specific instances.
441 |
442 | Then, get the external IP of the load balancer for the GuestBook service:
443 |
444 | ``` sh
445 | $ kubectl get service guestbook
446 | ### example output:
447 | # NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
448 | # guestbook 10.67.242.247 3.4.5.6 80/TCP 1m
449 | ```
450 |
451 | If the `EXTERNAL-IP` is still empty, give it a few minutes to create
452 | the external load balancer and check again.
453 |
454 | Once the pods are running, the GuestBook app should be accessible
455 | from the load balancer's external IP. In the example above, it would be at
456 | `http://3.4.5.6`.
457 |
458 | You can see Vitess' replication capabilities by opening the app in
459 | multiple browser windows, with the same Guestbook page number.
460 | Each new entry is committed to the master database.
461 | In the meantime, JavaScript on the page continuously polls
462 | the app server to retrieve a list of GuestBook entries. The app serves
463 | read-only requests by querying Vitess in 'replica' mode, confirming
464 | that replication is working.
465 |
466 | You can also inspect the data stored by the app:
467 |
468 | ``` sh
469 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000100 "SELECT * FROM messages"
470 | ### example output:
471 | # +------+---------------------+---------+
472 | # | page | time_created_ns | message |
473 | # +------+---------------------+---------+
474 | # | 42 | 1460771336286560000 | Hello |
475 | # +------+---------------------+---------+
476 | ```
477 |
478 | The
479 | [GuestBook source code](https://github.com/youtube/vitess/tree/master/examples/kubernetes/guestbook)
480 | provides more detail about how the app server interacts with Vitess.
481 |
482 | ## Trying out Vitess resharding
483 |
484 | Now that you have a full Vitess stack running, you may want to go on to test [dynamic resharding](http://vitess.io/user-guide/sharding.html#resharding) by following the [Sharding in Kubernetes](http://vitess.io/user-guide/sharding-kubernetes.html) guide.
485 |
486 | If so, you can skip the teardown and cleanup below, since the sharding guide picks up right where this one leaves off. If not, continue with the teardown and cleanup steps below.
487 |
488 | ## Tearing down and cleaning up
489 |
490 | Before stopping the Container Engine cluster, you should tear down the Vitess services. Kubernetes will then take care of cleaning up any entities it created for those services,
491 | such as external load balancers.
492 |
493 | ``` sh
494 | vitess/examples/kubernetes$ ./guestbook-down.sh
495 | vitess/examples/kubernetes$ ./vtgate-down.sh
496 | vitess/examples/kubernetes$ ./vttablet-down.sh
497 | vitess/examples/kubernetes$ ./vtctld-down.sh
498 | vitess/examples/kubernetes$ ./etcd-down.sh
499 | ```
500 |
501 | Then tear down the Container Engine cluster itself, which will stop the virtual
502 | machines running on Compute Engine:
503 |
504 | ``` sh
505 | $ gcloud container clusters delete example
506 | ```
507 |
508 | It's also a good idea to remove any firewall rules you created, unless you plan
509 | to use them again soon:
510 |
511 | ``` sh
512 | $ gcloud compute firewall-rules delete guestbook
513 | ```
514 |
515 | ## Troubleshooting
516 |
517 | ### Server logs
518 |
519 | If a pod enters the `Running` state but the server doesn't respond as expected, you can check the
520 | pod's output with the `kubectl logs` command:
521 |
522 | ``` sh
523 | # show logs for container 'vttablet' within pod 'vttablet-100'
524 | $ kubectl logs vttablet-100 vttablet
525 |
526 | # show logs for container 'mysql' within pod 'vttablet-100'
527 | # Note that this is NOT MySQL error log.
528 | $ kubectl logs vttablet-100 mysql
529 | ```
530 |
531 | Post the logs somewhere and send a link to the [Vitess mailing list](https://groups.google.com/forum/#!forum/vitess) to
532 | get more help.
533 |
534 | ### Shell access
535 |
536 | If you want to poke around inside a container, you can use `kubectl exec` to run
537 | a shell.
538 |
539 | For example, to launch a shell inside the `vttablet` container of the
540 | `vttablet-100` pod:
541 |
542 | ``` sh
543 | $ kubectl exec vttablet-100 -c vttablet -t -i -- bash -il
544 | root@vttablet-100:/# ls /vt/vtdataroot/vt_0000000100
545 | ### example output:
546 | # bin-logs innodb my.cnf relay-logs
547 | # data memcache.sock764383635 mysql.pid slow-query.log
548 | # error.log multi-master.info mysql.sock tmp
549 | ```
550 |
551 | ### Root certificates
552 |
553 | If you see in the logs a message like this:
554 |
555 | ```
556 | x509: failed to load system roots and no roots provided
557 | ```
558 |
559 | It usually means that your Kubernetes nodes are running a host OS
560 | that puts root certificates in a different place than our configuration
561 | expects by default (for example, Fedora). See the comments in the
562 | [etcd controller template](https://github.com/youtube/vitess/blob/master/examples/kubernetes/etcd-controller-template.yaml)
563 | for examples of how to set the right location for your host OS.
564 | You'll also need to adjust the same certificate path settings in the
565 | `vtctld` and `vttablet` templates.
566 |
567 | ### vttablet status pages
568 |
569 | Each `vttablet` serves a set of HTML status pages on its primary port.
570 | The `vtctld` interface provides a **STATUS** link for each tablet.
571 |
572 | If you access the vtctld web UI through the kubectl proxy as described above,
573 | it will automatically link to the vttablets through that same proxy,
574 | giving you access from outside the cluster.
575 |
576 | You can also use the proxy to go directly to a tablet. For example,
577 | to see the status page for the tablet with ID `100`, you could navigate to:
578 |
579 | http://localhost:8001/api/v1/proxy/namespaces/default/pods/vttablet-100:15002/debug/status
580 |
581 | ### Direct connection to mysqld
582 |
583 | Since the `mysqld` within the `vttablet` pod is only meant to be accessed
584 | via vttablet, our default bootstrap settings only allow connections from
585 | localhost.
586 |
587 | If you want to check or manipulate the underlying mysqld, you can issue
588 | simple queries or commands through `vtctlclient` like this:
589 |
590 | ``` sh
591 | # Send a query to tablet 100 in cell 'test'.
592 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000100 "SELECT VERSION()"
593 | ### example output:
594 | # +------------+
595 | # | VERSION() |
596 | # +------------+
597 | # | 5.7.13-log |
598 | # +------------+
599 | ```
600 |
601 | If you need a truly direct connection to mysqld, you can
602 | [launch a shell](#shell-access) inside the mysql container, and then connect with the `mysql`
603 | command-line client:
604 |
605 | ``` sh
606 | $ kubectl exec vttablet-100 -c mysql -t -i -- bash -il
607 | root@vttablet-100:/# export TERM=ansi
608 | root@vttablet-100:/# mysql -S /vt/vtdataroot/vt_0000000100/mysql.sock -u vt_dba
609 | ```
610 |
--------------------------------------------------------------------------------
/userguide/ShardingKubernetes.md:
--------------------------------------------------------------------------------
1 | This guide walks you through the process of sharding an unsharded Vitess [keyspace](http://vitess.io/overview/concepts.html#keyspace)
2 | running on [Kubernetes](http://kubernetes.io/).
3 |
4 | ## Prerequisites
5 |
6 | We assume that you have already completed the [Getting Started on Kubernetes](http://vitess.io/getting-started/) guide, and have left the cluster running.
7 |
8 | ## Overview
9 |
10 | We will follow a process similar to the general [horizontal sharding](http://vitess.io/user-guide/horizontal-sharding.html) guide, except that we'll give the commands specific to a Vitess cluster running on Kubernetes.
11 |
12 | Since Vitess [sharding](http://vitess.io/user-guide/sharding.html) is transparent to the application layer, the [Guestbook](https://github.com/youtube/vitess/tree/master/examples/kubernetes/guestbook)
13 | app will continue to serve throughout the [resharding](http://vitess.io/user-guide/sharding.html#resharding) process, demonstrating that the Vitess cluster keeps serving without downtime while it is resharded.
14 |
15 | ## Configure sharding information
16 |
17 | The first step is to tell Vitess how we want to partition the data. We do this by providing a VSchema definition like the following:
18 | ``` json
19 | {
20 | "Sharded": true,
21 | "Vindexes": {
22 | "hash": {
23 | "Type": "hash"
24 | }
25 | },
26 | "Tables": {
27 | "messages": {
28 | "ColVindexes": [
29 | {
30 | "Col": "page",
31 | "Name": "hash"
32 | }
33 | ]
34 | }
35 | }
36 | }
37 | ```
38 |
39 | This says that we want to shard the data by a hash of the `page` column. In other words, keep each page's messages together on one shard, but spread
40 | the pages randomly across the shards.
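As a toy illustration of that property, the sketch below derives a deterministic keyspace_id from a page number. Note that md5 here is purely an illustrative stand-in and is NOT the algorithm Vitess's `hash` vindex actually uses; the point is only that equal `page` values always map to the same keyspace_id, so a page's messages stay together.

``` sh
# Toy keyspace_id derivation: deterministic, so the same page always
# lands on the same shard. NOT Vitess's real hash vindex.
page_to_ksid() { printf '%s' "$1" | md5sum | cut -c1-16; }

page_to_ksid 42   # same input ...
page_to_ksid 42   # ... same keyspace_id
page_to_ksid 43   # different page, (almost certainly) different keyspace_id
```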
41 |
42 | We can load this VSchema into Vitess with the following command:
43 |
44 | ``` sh
45 | vitess/examples/kubernetes$ ./kvtctl.sh ApplyVSchema -vschema "$(cat vschema.json)" test_keyspace
46 | ```
47 |
48 | ## Bring up tablets for the new shards
49 |
50 | In the unsharded example, we started a shard named *0* in *test_keyspace*, written as *test_keyspace/0*.
51 | Now we'll start tablets for two additional shards, named *test_keyspace/-80* and *test_keyspace/80-*:
52 |
53 | ``` sh
54 | vitess/examples/kubernetes$ ./sharded-vttablet-up.sh
55 | ### example output:
56 | # Creating test_keyspace.shard--80 pods in cell test...
57 | # ...
58 | # Creating test_keyspace.shard-80- pods in cell test...
59 | # ...
60 | ```
61 |
62 | Since the sharding key for the Guestbook app is the page number, this will result in half the pages landing on each shard; *0x80* is the midpoint of the [range of sharding keys](http://vitess.io/user-guide/sharding.html#key-ranges-and-partitions).
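A toy sketch of how keyspace_ids map onto the two ranges (illustrative only, not Vitess code): shard *-80* takes keyspace_ids whose first byte is below 0x80, and shard *80-* takes the rest.

``` sh
# Route a hex keyspace_id to shard -80 or 80- by its first byte.
route() {
  byte=$((0x$(printf '%s' "$1" | cut -c1-2)))
  if [ "$byte" -lt $((0x80)) ]; then echo "-80"; else echo "80-"; fi
}

route 1c2d3e4f   # -> -80
route 9f00aa55   # -> 80-
```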
63 |
64 | During the data transition, the new shards will run alongside the old one, but all traffic will continue to be served by the old shard until we cut over.
65 |
66 | You can check the tablet status in the `vtctld` UI, or in the output of `kvtctl.sh ListAllTablets test`. Once the tablets are up, each new shard should have 5 tablets.
67 |
68 | Once the tablets are up, we can initialize replication by electing the initial master for each new shard:
69 |
70 | ``` sh
71 | vitess/examples/kubernetes$ ./kvtctl.sh InitShardMaster -force test_keyspace/-80 test-0000000200
72 | vitess/examples/kubernetes$ ./kvtctl.sh InitShardMaster -force test_keyspace/80- test-0000000300
73 | ```
74 |
75 | Now there should be a total of 15 tablets, with one master for each shard:
76 |
77 | ``` sh
78 | vitess/examples/kubernetes$ ./kvtctl.sh ListAllTablets test
79 | ### example output:
80 | # test-0000000100 test_keyspace 0 master 10.64.3.4:15002 10.64.3.4:3306 []
81 | # ...
82 | # test-0000000200 test_keyspace -80 master 10.64.0.7:15002 10.64.0.7:3306 []
83 | # ...
84 | # test-0000000300 test_keyspace 80- master 10.64.0.9:15002 10.64.0.9:3306 []
85 | # ...
86 | ```
87 |
88 | ## Copy data from the original shard
89 |
90 | The new tablets start out empty, so we need to copy everything from the original shard to the two new ones, starting with the schema:
91 |
92 | ``` sh
93 | vitess/examples/kubernetes$ ./kvtctl.sh CopySchemaShard test_keyspace/0 test_keyspace/-80
94 | vitess/examples/kubernetes$ ./kvtctl.sh CopySchemaShard test_keyspace/0 test_keyspace/80-
95 | ```
96 |
97 | Next we copy the data. Since the amount of data to copy can be very large, we use a special batch process called *vtworker*, which streams each row from a
98 | single source to multiple destinations, routing it according to its *keyspace_id*.
99 |
100 | ``` sh
101 | vitess/examples/kubernetes$ ./sharded-vtworker.sh SplitClone test_keyspace/0
102 | ### example output:
103 | # Creating vtworker pod in cell test...
104 | # pods/vtworker
105 | # Following vtworker logs until termination...
106 | # I0416 02:08:59.952805 9 instance.go:115] Starting worker...
107 | # ...
108 | # State: done
109 | # Success:
110 | # messages: copy done, copied 11 rows
111 | # Deleting vtworker pod...
112 | # pods/vtworker
113 | ```
114 |
115 | Note that we've only specified the source shard, *test_keyspace/0*. The *SplitClone* process determines the destination shards automatically, based on which key ranges cover the same keys as the source.
116 | In this case, shard *0* covers the entire range, so it identifies *-80* and *80-* as the destination shards, since together they cover the same range.
117 |
118 |
119 | Next, the process will pause replication on one *rdonly* tablet (offline processing) to serve as a consistent snapshot of the data. The app can continue serving without downtime,
120 | since live traffic is handled by *replica* and *master* tablets, which are unaffected. Other batch jobs will also be unaffected,
121 | since they will be served only by the remaining, un-paused *rdonly* tablets.
122 |
123 | ## Check filtered replication
124 |
125 | Once the copy from the paused snapshot finishes, *vtworker* turns on [filtered replication](http://vitess.io/user-guide/sharding.html#filtered-replication) from the source shard to each destination shard.
126 | Filtered replication then catches up on the updates that have been applied since the snapshot was taken.
127 |
128 | Once the destination shards have caught up, they will continue to replicate new updates. You can see this happening by looking at the contents of each shard as you add new messages to various pages in the Guestbook app.
129 | Shard *0* sees all the messages, while the new shards only see the messages for the pages that live on them.
130 |
131 | ``` sh
132 | # See what's on shard test_keyspace/0:
133 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000100 "SELECT * FROM messages"
134 | # See what's on shard test_keyspace/-80:
135 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000200 "SELECT * FROM messages"
136 | # See what's on shard test_keyspace/80-:
137 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000300 "SELECT * FROM messages"
138 | ```
139 |
140 | Try adding some messages to various Guestbook pages, and watch how they get routed.
141 |
142 | ## Check copied data integrity
143 |
144 | The *vtworker* batch process has another mode that compares the source and destination to ensure all the data is present and correct.
145 | The following commands run a diff on each destination shard:
146 |
147 | ``` sh
148 | vitess/examples/kubernetes$ ./sharded-vtworker.sh SplitDiff test_keyspace/-80
149 | vitess/examples/kubernetes$ ./sharded-vtworker.sh SplitDiff test_keyspace/80-
150 | ```
151 |
152 | If any discrepancies are found, they will be printed.
153 | If all the data checks out, you will see something like this:
154 |
155 | ```
156 | I0416 02:10:56.927313 10 split_diff.go:496] Table messages checks out (4 rows processed, 1072961 qps)
157 | ```
158 |
159 | ## Switch over to the new shards
160 |
161 | Now we're ready to switch traffic over to the new shards.
162 | The [MigrateServedTypes](http://vitess.io/reference/vtctl.html#migrateservedtypes) command lets us do this one
163 | [tablet type](http://vitess.io/overview/concepts.html#tablet) at a time, and even one [cell](http://vitess.io/overview/concepts.html#cell-data-center) at a time.
164 | The migration can be rolled back at any point *until* the master is switched over.
165 |
166 | ``` sh
167 | vitess/examples/kubernetes$ ./kvtctl.sh MigrateServedTypes test_keyspace/0 rdonly
168 | vitess/examples/kubernetes$ ./kvtctl.sh MigrateServedTypes test_keyspace/0 replica
169 | vitess/examples/kubernetes$ ./kvtctl.sh MigrateServedTypes test_keyspace/0 master
170 | ```
171 |
172 |
173 | During the *master* migration, the original master will first stop accepting updates. Then the process will wait for the new shard masters to fully catch up on
174 | filtered replication before allowing them to begin serving. Since filtered replication has been following along with live updates, there should only be a few seconds of master unavailability.
175 |
176 | When the master traffic is migrated, filtered replication is stopped. Data updates become visible on the new shards, but no longer on the original shard.
177 | Try it out: add a message to a Guestbook page, then inspect the database contents:
178 |
179 | ``` sh
180 | # See what's on shard test_keyspace/0
181 | # (no updates visible since we migrated away from it):
182 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000100 "SELECT * FROM messages"
183 | # See what's on shard test_keyspace/-80:
184 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000200 "SELECT * FROM messages"
185 | # See what's on shard test_keyspace/80-:
186 | vitess/examples/kubernetes$ ./kvtctl.sh ExecuteFetchAsDba test-0000000300 "SELECT * FROM messages"
187 | ```
188 |
189 | ## Remove the old shard
190 |
191 | Now that all traffic is being served from the new shards, we can remove the old one. To do that, we use the `vttablet-down.sh` script to tear down the
192 | unsharded tablets:
193 |
194 | ``` sh
195 | vitess/examples/kubernetes$ ./vttablet-down.sh
196 | ### example output:
197 | # Deleting pod for tablet test-0000000100...
198 | # pods/vttablet-100
199 | # ...
200 | ```
201 |
202 | Then we can delete the now-empty shard with the following command:
203 |
204 | ``` sh
205 | vitess/examples/kubernetes$ ./kvtctl.sh DeleteShard -recursive test_keyspace/0
206 | ```
207 |
208 | On the **Topology** page, or in the output of `kvtctl.sh ListAllTablets test`, you should see that shard *0* is gone,
209 | confirming that the retired shard has been removed; the same procedure can be used for any shard that is no longer in use.
210 |
211 | ## Clean up
212 |
213 | Before stopping the Container Engine cluster, tear down the Vitess services first; Kubernetes will then clean up any other entities it created for them, such as external load balancers.
214 |
215 | Since the unsharded tablets were already torn down with `./vttablet-down.sh`, use `./sharded-vttablet-down.sh` here to tear down the sharded tablets:
216 |
217 | ``` sh
218 | vitess/examples/kubernetes$ ./guestbook-down.sh
219 | vitess/examples/kubernetes$ ./vtgate-down.sh
220 | vitess/examples/kubernetes$ ./sharded-vttablet-down.sh
221 | vitess/examples/kubernetes$ ./vtctld-down.sh
222 | vitess/examples/kubernetes$ ./etcd-down.sh
223 | ```
224 |
--------------------------------------------------------------------------------
/warning.md:
--------------------------------------------------------------------------------
1 | # Pitfalls hit while working with Vitess
2 |
3 | 1. When vtworker starts, it may initially fail to connect to etcd, presumably because the network is not yet fully initialized. Adding a
4 | sleep of 5-10 seconds in the pod configuration file is needed before the data split can proceed normally.
5 |
6 | 2. When splitting, shard key-range bounds are hexadecimal, so their length must be an even number of digits.
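A quick shell check for that constraint (illustrative):

``` sh
# A shard key-range bound must be hex with an even number of digits
# (i.e. whole bytes).
is_valid_bound() {
  case "$1" in ""|*[!0-9a-fA-F]*) return 1 ;; esac
  [ $(( ${#1} % 2 )) -eq 0 ]
}

is_valid_bound 80 && echo "80 ok"
is_valid_bound 8  || echo "8 is invalid (odd length)"
```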
7 |
8 |
--------------------------------------------------------------------------------