├── .gitignore ├── ARCH └── kafka-benchmark-runner │ ├── 1M-consumer-disk.png │ ├── 1M-consumer-io.png │ ├── 1M-disk.png │ ├── 1M-network.png │ ├── LICENSE │ ├── README.md │ ├── config │ └── server.properties │ ├── docker-compose.yml │ ├── three-replication-info.png │ └── topic-info.png ├── OS └── csbu │ ├── README.md │ ├── interrupt.png │ └── memory-layout.png ├── PL ├── about-cps │ └── README.md ├── about-elixir │ └── README.md ├── about-es6 │ └── README.md ├── about-ruby │ └── README.md └── type-level-programming │ └── README.md ├── README.md ├── SUMMARY.md ├── about-beam-model └── README.md ├── about-ceph └── README.md ├── blockchain.md ├── book.json ├── cap ├── statistic_in_tidb.md ├── tidb.graffle ├── tidb.md └── tikv.graffle ├── effective_cpp └── README.md ├── miscellaneous.md ├── projects └── README.md ├── railway-oriented-programming ├── README.md ├── chain-validate-update_db_send_email.png ├── function_may_raise_error.png ├── imperative-code-return-early.png ├── one-track-input-output.png ├── pipe-chain.png ├── success-failure-railway-1.png ├── success-failure-railway.png ├── success-failure.png └── two-track.png ├── ruby-ecosystem └── README.org ├── sharing └── tidb-in-wacai │ ├── README.md │ ├── image-20180528163849477.png │ └── image-20180528164446453.png ├── storage ├── lsm-tree │ ├── assets │ │ └── images │ │ │ ├── compaction-1.png │ │ │ ├── compaction-2.png │ │ │ ├── compaction-3.png │ │ │ ├── leveled-compaction-1.png │ │ │ ├── leveled-compaction-2.png │ │ │ ├── leveled-compaction-3.png │ │ │ └── leveled-compaction-4.png │ └── compaction-strategy.md ├── tidb │ ├── images │ │ ├── executors.jpg │ │ ├── expression.jpg │ │ ├── sql-core-layer.png │ │ ├── tidb3-all.jpg │ │ ├── tidb3.0.jpg │ │ └── tidb3.graffle │ └── tikv-intro.md └── tikv │ └── CODE_READING.md └── working-with-socket └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | node_modules 2 | _book 3 | -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/1M-consumer-disk.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/ARCH/kafka-benchmark-runner/1M-consumer-disk.png -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/1M-consumer-io.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/ARCH/kafka-benchmark-runner/1M-consumer-io.png -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/1M-disk.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/ARCH/kafka-benchmark-runner/1M-disk.png -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/1M-network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/ARCH/kafka-benchmark-runner/1M-network.png -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 
2 | 3 | Copyright (c) 2016 Jiafeng Cao 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/README.md: -------------------------------------------------------------------------------- 1 | ## Kafka 压力测试 2 | 3 | ### 目的 ### 4 | 5 | 测试消息大小在 100K 至 1M 区间内,测试 Kafka 的写性能,以确认其是否可以支撑 __每天千万级的消息(百K大小)写入__ 需求。 6 | 同时还需要对Kafka 在消息的其他大小维度上的性能有一定了解,使得之后在架构上对 Kafka 的位置有一定了解。 7 | 8 | ### 搭建 Kafka 测试环境 ### 9 | 10 | #### 硬件配置 #### 11 | 12 | CPU: 2 core 13 | 14 | Memory: 8G 15 | 16 | Disk: 1T SATA。 17 | 18 | `fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=8 --size=4G --readwrite=randwrite --runtime=60` 19 | 20 | * 4K, iodepth=1 => 160 iops, 约 640 KB/s 21 | * 1M, iodepth=1 => 40 iops, 约 40000 KB/s 22 | 23 | Network: 公司内网 1Gb。 24 | 25 | 26 | #### 软件配置 #### 27 | 28 | ZK 使用 Docker 启动: 29 | 30 | ``` 31 | $ docker-compose up -d 32 | $ docker-compose ps 33 | ``` 34 | 35 | Kafka 通过 ansible 配置,部署在 Host 上。 版本: 0.10.1.0。 36 | 37 | 在默认的 Broker 设置基础上,修改了如下参数:(参见 [config/server.properties](./config/server.properties)) 38 | 39 | - message.max.bytes: 1000012 => 5242880 40 | - socket.receive.buffer.bytes: 102400 => 1048576 41 | - socket.send.buffer.bytes: 102400 => 1048576 42 | - num.network.threads: 3 => 4 43 | 44 | ### Topic 设置: 3 Replication, 6 partition ### 45 | 46 | 47 | ``` shell 48 | 49 | bin/kafka-topics.sh --create --zookeeper localhost:2181/kafka-test \ 50 | --replication-factor 3 --partitions 6 --topic three-replication 51 | ``` 52 | 53 | ![三副本的Topic信息](./three-replication-info.png) 54 | 55 | ### 测试命令 ### 56 | 57 | * 使用 kafka 自带的 `bin/kafka-producer-perf-test.sh`,主要参数: 58 | * num-records: 总共发送的消息个数 59 | * record-size: 消息大小(byte) 60 | * throughput: 最大的吞吐量 messages/sec,用来限制测试的吞吐量 61 | * producer-props: Producer 的配置(本次的 Benchmark 均使用默认的设置) 62 | 63 | ### 测试过程 ### 64 | 65 | #### Single Producer #### 66 | 67 | 消息大小在 100B, 1KB, 10KB, 100KB, 500KB, 1MB 时的最大吞吐量: 68 | 69 | - 100B: 104857600 records sent, 681716.878828 records/sec (65.01 MB/sec), 360.95 ms avg latency, 2591.00 ms max latency, 192 ms 50th, 1184 ms 95th, 1808 ms 99th, 2462 ms 99.9th 70 | - 1KB: 10485760 records sent, 91319.486175 records/sec (87.09 MB/sec), 333.74 ms avg latency, 3477.00 ms max latency, 201 ms 50th, 1172 ms 95th, 1696 ms 99th, 3367 ms 99.9th. 
71 | - 10KB: 1048576 records sent, 9169.481002 records/sec (87.45 MB/sec), 220.37 ms avg latency, 5273.00 ms max latency, 96 ms 50th, 660 ms 95th, 1417 ms 99th, 5268 ms 99.9th. 72 | - 100KB: 104857 records sent, 939.023517 records/sec (89.55 MB/sec), 353.97 ms avg latency, 2158.00 ms max latency, 138 ms 50th, 1245 ms 95th, 1782 ms 99th, 2130 ms 99.9th. 73 | - 500KB: 20971 records sent, 187.599521 records/sec (89.45 MB/sec), 360.41 ms avg latency, 3365.00 ms max latency, 251 ms 50th, 1070 ms 95th, 1790 ms 99th, 3333 ms 99.9th. 74 | - 1MB: 10485 records sent, 93.135426 records/sec (88.82 MB/sec), 362.09 ms avg latency, 3073.00 ms max latency, 185 ms 50th, 1066 ms 95th, 1673 ms 99th, 2946 ms 99.9th 75 | 76 | 77 | netdata 监测到的网络和磁盘使用情况: 78 | 79 | 80 | ![500K 和 1M时的磁盘使用情况](./1M-disk.png) 81 | 82 | ![500K 和 1M 时的网络IO情况](./1M-network.png) 83 | 84 | 另外,Broker 的 page cache 8G 用了6G 左右。 85 | 86 | * 针对几十K大小的消息,网络和磁盘基本都是满的。 87 | * 瓶颈主要在于磁盘,IO慢导致延迟增加,并且异步复制也会明显变慢。 88 | * 另外,如果磁盘跟的上,带宽很快也会成为瓶颈。 89 | 90 | #### Single Producer, Single Consumer #### 91 | 92 | 93 | netdata 监测到的网络和磁盘使用情况: 94 | 95 | * 前半段是只有 producer 时的情况。后半段是 producer 和 consumer 同时运行时的情况。 96 | * 有 consumer 时,读写各占一半,写延迟增加了一倍。 97 | * 1MB: 10485 records sent, 36.279021 records/sec (34.60 MB/sec), 929.46 ms avg latency, 9295.00 ms max latency, 306 ms 50th, 4292 ms 95th, 6290 ms 99th, 9068 ms 99.9th. 98 | 99 | ![one producer, one consumer时的磁盘情况](./1M-consumer-disk.png) 100 | 101 | ![title="one producer, one consumer时的网络情况"](./1M-consumer-io.png) 102 | 103 | * 因为三台机器上都有 partition,而且还是 three replicas,一边读一边写势必导致读写各自都减半。 104 | * 保持 three replicas 不变,增加 brokers 会缓解读写竞争的情况。 105 | 106 | 107 | 108 | ### 结论 ### 109 | 110 | 大致可以得到: 111 | 112 | 100B, 1KB, 10KB, 100KB, 500KB, 1MB 消息大小时,在保证平均几十 ms的延迟下, 113 | 吞吐量分别在 65w, 9w, 9000, 900, 170, 90 records/sec 左右。 114 | 115 | #### 资源估算 #### 116 | 117 | 1000_0000 / 3600 / 12 = 232 requests/sec 118 | 119 | 高峰期按照平均流量的10倍估算,需要承受最大 2300 rps 的流量,需要 4~30 台 kafka 机器(普通PC配置) 120 | (具体数字可根据线上图片大小做具体配置,看是100K偏多,还是1M偏多)。 121 | 122 | 假设平均每个请求 400KB,高峰期带宽需 400K * 2300 ~= 1GB。 123 | 124 | 每天积累数据 1000_0000 * 400K ~= 4T 左右的数据。 125 | -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/config/server.properties: -------------------------------------------------------------------------------- 1 | # Licensed to the Apache Software Foundation (ASF) under one or more 2 | # contributor license agreements. See the NOTICE file distributed with 3 | # this work for additional information regarding copyright ownership. 4 | # The ASF licenses this file to You under the Apache License, Version 2.0 5 | # (the "License"); you may not use this file except in compliance with 6 | # the License. You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | 16 | # see kafka.server.KafkaConfig for additional details and defaults 17 | 18 | ############################# Server Basics ############################# 19 | 20 | # The id of the broker. This must be set to a unique integer for each broker. 
21 | broker.id=2 22 | 23 | # Switch to enable topic deletion or not, default value is false 24 | #delete.topic.enable=true 25 | 26 | ############################# Socket Server Settings ############################# 27 | 28 | # The address the socket server listens on. It will get the value returned from 29 | # java.net.InetAddress.getCanonicalHostName() if not configured. 30 | # FORMAT: 31 | # listeners = security_protocol://host_name:port 32 | # EXAMPLE: 33 | # listeners = PLAINTEXT://your.host.name:9092 34 | listeners=PLAINTEXT://10.0.9.133:9092 35 | 36 | # Hostname and port the broker will advertise to producers and consumers. If not set, 37 | # it uses the value for "listeners" if configured. Otherwise, it will use the value 38 | # returned from java.net.InetAddress.getCanonicalHostName(). 39 | #advertised.listeners=PLAINTEXT://your.host.name:9092 40 | 41 | # The number of threads handling network requests 42 | num.network.threads=4 43 | 44 | # The number of threads doing disk I/O 45 | num.io.threads=8 46 | 47 | # The send buffer (SO_SNDBUF) used by the socket server 48 | socket.send.buffer.bytes=1048576 49 | 50 | # The receive buffer (SO_RCVBUF) used by the socket server 51 | socket.receive.buffer.bytes=1048576 52 | 53 | # The maximum size of a request that the socket server will accept (protection against OOM) 54 | socket.request.max.bytes=104857600 55 | 56 | 57 | ############################# Log Basics ############################# 58 | 59 | # A comma seperated list of directories under which to store log files 60 | log.dirs=/data/kafka-logs/a,/data/kafka-logs/b,/data/kafka-logs/c 61 | 62 | # The default number of log partitions per topic. More partitions allow greater 63 | # parallelism for consumption, but this will also result in more files across 64 | # the brokers. 65 | num.partitions=6 66 | 67 | # The number of threads per data directory to be used for log recovery at startup and flushing at shutdown. 68 | # This value is recommended to be increased for installations with data dirs located in RAID array. 69 | num.recovery.threads.per.data.dir=1 70 | 71 | ############################# Log Flush Policy ############################# 72 | 73 | # Messages are immediately written to the filesystem but by default we only fsync() to sync 74 | # the OS cache lazily. The following configurations control the flush of data to disk. 75 | # There are a few important trade-offs here: 76 | # 1. Durability: Unflushed data may be lost if you are not using replication. 77 | # 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush. 78 | # 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks. 79 | # The settings below allow one to configure the flush policy to flush data after a period of time or 80 | # every N messages (or both). This can be done globally and overridden on a per-topic basis. 81 | 82 | # The number of messages to accept before forcing a flush of data to disk 83 | #log.flush.interval.messages=10000 84 | 85 | # The maximum amount of time a message can sit in a log before we force a flush 86 | #log.flush.interval.ms=1000 87 | 88 | ############################# Log Retention Policy ############################# 89 | 90 | # The following configurations control the disposal of log segments. The policy can 91 | # be set to delete segments after a period of time, or after a given size has accumulated. 
92 | # A segment will be deleted whenever *either* of these criteria are met. Deletion always happens 93 | # from the end of the log. 94 | 95 | # The minimum age of a log file to be eligible for deletion 96 | log.retention.hours=168 97 | 98 | # A size-based retention policy for logs. Segments are pruned from the log as long as the remaining 99 | # segments don't drop below log.retention.bytes. 100 | #log.retention.bytes=1073741824 101 | 102 | # The maximum size of a log segment file. When this size is reached a new log segment will be created. 103 | log.segment.bytes=536870912 104 | 105 | # The interval at which log segments are checked to see if they can be deleted according 106 | # to the retention policies 107 | log.retention.check.interval.mins=10 108 | 109 | ############################# Zookeeper ############################# 110 | 111 | # Zookeeper connection string (see zookeeper docs for details). 112 | # This is a comma separated host:port pairs, each corresponding to a zk 113 | # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002". 114 | # You can also append an optional chroot string to the urls to specify the 115 | # root directory for all kafka znodes. 116 | zookeeper.connect=10.1.6.25:2181/kafka-test 117 | 118 | # Timeout in ms for connecting to zookeeper 119 | zookeeper.connection.timeout.ms=60000 120 | 121 | message.max.bytes=5242880 122 | -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '2' 2 | services: 3 | zookeeper: 4 | image: zookeeper:3.4 5 | ports: 6 | - 2181:2181 7 | - 2888:2888 8 | - 3888:3888 9 | zk-web: 10 | image: tobilg/zookeeper-webui 11 | ports: 12 | - 8888:8080 13 | environment: 14 | - ZK_DEFAULT_NODE=zookeeper:2181 15 | - USER=admin 16 | - PASSWORD=admin 17 | kafka-manager: 18 | depends_on: 19 | - zookeeper 20 | image: sheepkiller/kafka-manager 21 | ports: 22 | - 9000:9000 23 | environment: 24 | - ZK_HOSTS=zookeeper:2181 25 | - APPLICATION_SECRET=letmein 26 | -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/three-replication-info.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/ARCH/kafka-benchmark-runner/three-replication-info.png -------------------------------------------------------------------------------- /ARCH/kafka-benchmark-runner/topic-info.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/ARCH/kafka-benchmark-runner/topic-info.png -------------------------------------------------------------------------------- /OS/csbu/README.md: -------------------------------------------------------------------------------- 1 | # 外设和总线 2 | 3 | - 外设通过总线和处理器相连。 4 | - 设备通过发送中断信号来告知处理器某些信息的发生。 5 | - 每个设备都会被(谁?)分配一个中断信号,通过操作系统和 BIOS 的某种组合。 6 | - 外设一般会(通过一个物理的中断线)连接到一个 可编程中断控制器(PIC, 7 | 这是主板上的一个独立芯片),PIC 再和处理器传递中断信息。 8 | - PIC 接收中断信号,并将其转变成可供处理器处理的信息。 9 | - 一般来说,操作系统会配置一个中断信号描述表,表中配对了可能的中断信号 10 | 和要跳转到的代码地址(中断处理代码)。 11 | - 中断处理代码的编写是外设驱动和操作系统的职责。 12 | 13 | ## 中断处理 14 | 15 | 外设发起中断信号给中断控制器,中断控制器将信息传递给处理器。 16 | 处理器查看中断信号描述表(由操作系统填充)寻找相应的代码去处理此次中断。 17 | 18 | 大多数驱动会将中断的处理分成两个部分。上层,下层。 19 | 下层接收中断,将 action 入队列,然后返回给处理器。 20 | 上层在CPU空闲时再运行。 21 | 这样防止中断占用CPU。 22 | 23 | ## 状态保存 24 | 25 | 
在进入中断处理代码之前,处理器需要保存当前状态,保证在中断处理结束后, 26 | 还能恢复原样。 27 | 28 | 这个一般是操作系统的责任。除了损失点时间,中断对于正在运行的程序完全透 29 | 明。 30 | 31 | ## interrupt, trap 和 exception 32 | 33 | 处理器自身也可以利用中断机制来处理内部系统信息。比如,访问非法内存,试 34 | 图除0或者其他非法指令。处理器可以抛出异常让操作系统处理。 35 | 36 | also used to trap into the operating system for system calls, as 37 | discussed in the section called “System Calls” and to implement 38 | virtual memory, as discussed in Chapter 6, Virtual Memory. 39 | 40 | 41 | ![](./interrupt.png) 42 | 43 | ## 中断的类型 44 | 45 | - 水平触发 46 | - 边沿触发 47 | 48 | ## non-maskable interrupts (NMI) 49 | 50 | 51 | > NMIs can be useful for implementing things such as 52 | > A system watchdog: 53 | > where a NMI is raised periodically and sets some flag that must be 54 | > acknowledged by the operating system. If the acknowledgement is not 55 | > seen before the next periodic NMI, then system can be considered to be 56 | > not making forward progress. 57 | > Another common usage is for profiling a system. 58 | > A periodic NMI can be raised and used to evaluate what code 59 | > the processor is currently running; over time this builds a profile of 60 | > what code is being run and create a very useful insight into system 61 | > performance. 62 | 63 | 64 | ## IO 空间 65 | 66 | memory mapped IO, where registers on the device are mapped into 67 | memory. 68 | to communicate with the device, you need simply read or write to a 69 | specific address in memory. 70 | 71 | ## DMA(Direct Memory Access) 72 | 73 | DMA:在外设和内存之间直接传输数据。 74 | 75 | 设备驱动程序给设备一片内存区域让其启动DMA传输,然后CPU继续其他工作。 76 | 一旦完成传输,设备就会发送中断信号,提示设备驱动传输完成。 77 | 78 | 79 | ## USB 80 | 81 | TODO 82 | 83 | 84 | # 计算机结构 85 | 86 | ## SMP 87 | 88 | 对称多处理。 89 | CPU一样。共享其他系统资源,比如,内存,磁盘。 90 | 91 | ### Cache Coherency 92 | 93 | 缓存一致性 94 | 95 | CPUs use **snooping** 96 | 97 | 处理器监听一个(其他处理器都连接的)总线上的 cache events,然后更新自己的对应的 cache。 98 | 99 | **MOESI** 100 | 101 | Modified, Owner, Exclusive, Shared, Invalid. 102 | 103 | 104 | ### Hyper-threading 105 | 106 | ### Multi Core 107 | 108 | - have their own L1 cache. 109 | - share bus to memory and other devices. 110 | 111 | 112 | ## Cluster 113 | 114 | ## Non-Uniform Memory Access 115 | 116 | TODO 117 | 118 | ## Memory ordering, locking, and atomic operations 119 | 120 | 内存的一片区域,一个处理器写,另一个处理器读, 121 | 什么时候处理器读的是写过后的值呢? 122 | 123 | 最简单的: strict memory ordering 124 | 125 | 内存栅栏 126 | 127 | - Acquire 语义 128 | - Released 语义 129 | 130 | TODO: 131 | 132 | - http://dreamrunner.org/blog/2014/06/28/qian-tan-memory-reordering/ 133 | - https://en.wikipedia.org/wiki/Memory_ordering 134 | - https://en.wikipedia.org/wiki/Memory_barrier 135 | - https://en.wikipedia.org/wiki/Non-blocking_algorithm 136 | 137 | 138 | 139 | Locking 140 | 141 | TODO 142 | 143 | 144 | # 操作系统 145 | 146 | ## 系统调用 147 | 148 | 系统调用编号 149 | 150 | Application Binary Interface 151 | 152 | TODO 153 | 154 | ## Privileges 155 | 156 | TODO 157 | 158 | Raise Privileges 159 | 160 | 161 | 162 | 163 | # 进程 164 | 165 | ## 简介 166 | 167 | ### Process ID 168 | 169 | ### 内存 170 | 171 | - shared memory 172 | - mmaping a file 173 | 174 | 175 | ### 代码区域和数据区域 176 | 177 | 178 | 179 | ### 栈 180 | 181 | 数据区域的重要部分。 182 | 183 | stack frame。 184 | 185 | hardware has a register to store stack pointer。 186 | 187 | ### 堆 188 | 189 | brk: bottom of heap 190 | 191 | 192 | ### 内存布局 193 | 194 | ![](./memory-layout.png) 195 | 196 | ### 文件描述符 197 | 198 | file descriptors are kept by the kernel individually for each process. 
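下面补一个最小的示意例子(非原文内容,假设 POSIX/Linux 环境,路径 `/etc/hostname` 仅作演示):fd 表按进程维护,fork 时子进程复制父进程的表项,两个表项指向同一个打开文件描述,共享读写偏移量。

``` cpp
// 示意代码(假设在 Linux 上编译运行):演示 fd 表按进程维护。
// fork 后父子进程各有一份 fd 表项,但指向同一个打开文件描述(偏移量共享)。
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = open("/etc/hostname", O_RDONLY); // 内核返回本进程 fd 表中最小可用编号
    if (fork() == 0) {                        // 子进程:复制得到同编号的 fd 表项
        char buf[4];
        ssize_t n = read(fd, buf, sizeof buf);
        std::printf("child: fd=%d, read %zd bytes\n", fd, n);
        _exit(0);
    }
    wait(nullptr);
    char buf[4];
    ssize_t n = read(fd, buf, sizeof buf);    // 偏移量已被子进程推进:两份表项共享同一打开文件描述
    std::printf("parent: fd=%d, read %zd bytes\n", fd, n);
    close(fd);
    return 0;
}
```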
199 | 200 | ### 寄存器 201 | 202 | TODO 203 | 204 | ### 内核状态 205 | 206 | the kernel needs to keep track of a number of elements for each process. 207 | 208 | - 进程状态 209 | runing, disk wait 210 | - 进程优先级 211 | - 统计信息 212 | 213 | 214 | ## 进程树 215 | 216 | init process (pid is 0) 217 | 218 | **pstree** 219 | 220 | ## Fork 和 Exec 221 | 222 | 新进程通过 `fork` 和 `exec` 创建。 223 | 224 | ### Fork 225 | 226 | TODO 227 | 228 | ### Exec 229 | 230 | exec will replace the contents of the currently running process with the information from a program binary. 231 | 232 | ### How Linux handles fork and exec 233 | 234 | - `clone` system call. 235 | - 线程 236 | TODO 237 | - CoW 238 | - init process 239 | - wait, and zombie 240 | 241 | 242 | 243 | ## Context Switching 244 | 245 | TODO 246 | 247 | ## Scheduling 248 | 249 | TODO 250 | 251 | linux O(1) scheduler: 252 | 253 | Bitmap from high priority to low priority 254 | 255 | ## Shell 256 | 257 | TODO 258 | 259 | ## Signals 260 | 261 | Signal: infrastructure between the kernel and processes. 262 | 263 | - SIGINT 264 | - SIGSTOP 265 | - SIGCONT 266 | - SIGABRT 267 | - SIGCHID 268 | - SIGSEGV 269 | - ... 270 | 271 | 272 | # 虚拟内存 273 | 274 | 275 | 276 | 277 | ### 64bit computing 278 | 279 | TODO 280 | 281 | ### Canonical Addresses 282 | 283 | Sign Extension 284 | 285 | 286 | TODO 287 | 288 | ## Pages 289 | 290 | page size(>= 4KiB) 291 | 292 | ## Frames 293 | 294 | Just Pages in Physical Memory 295 | 296 | frame-table: track which frame is being used. 297 | 298 | 299 | ## Page Table 300 | 301 | OS: keep track of which of virtual-page points to which physical frame 302 | 303 | Find the real memory address mapped by virtual memory. 304 | 305 | ## Consequence 306 | 307 | ### Swap ### 308 | 309 | ### mmap ### 310 | 311 | ### disk cache ### 312 | 313 | ## Hardware Support ## 314 | 315 | TLB 316 | 317 | ## Linux Specifics ## 318 | 319 | Three Level Page Table 320 | 321 | ## Hardware Support for Virtual Memory ## 322 | 323 | TODO 324 | 325 | # Toolchain 326 | 327 | TODO 328 | 329 | # Behind the Process 330 | 331 | ## ABI 332 | 333 | lower level interfaces which the compiler, operating system and, to some extent, processor, must agree on to communicate together. 334 | 335 | - Byte Order 336 | - Call Conventions 337 | - 参数传递,register or stack 338 | 339 | -------------------------------------------------------------------------------- /OS/csbu/interrupt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/OS/csbu/interrupt.png -------------------------------------------------------------------------------- /OS/csbu/memory-layout.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/OS/csbu/memory-layout.png -------------------------------------------------------------------------------- /PL/about-cps/README.md: -------------------------------------------------------------------------------- 1 | ### CPS 2 | 3 | 在利用 continuation 编程之前,需要认识一种编程风格:Continuation Passing Style(CPS)。 4 | 在 CPS 中,每一个过程(或者说函数)都接受一个额外的参数,这个参数代表了 *对该过程调用结果的处理* 。 5 | 6 | 举个例子: 7 | 8 | 以下这段代码以递归的形式计算前 n 个数的乘积。 9 | 10 | ``` racket 11 | (define (factorial n) 12 | (if (= n 0) 13 | 1 ; NOT tail-recursive 14 | (* n (factorial (- n 1))))) 15 | ``` 16 | 17 | 如何把它变成 CPS 形式的呢? 
18 | 19 | 首先给 `factorial` 加一个额外的参数 `k`, 20 | 这个 `k` 代表了当 `factorial` 调用结束后要执行的动作。 21 | 22 | ``` racket 23 | ; postfix a & to represent the cps version of a function 24 | (define (factorial& n k) 25 | ???) 26 | ``` 27 | 28 | 接下来的唯一“复杂”的地方就是 `(* n (factorial (- n 1)))` 了。 29 | 这个表达式中,先有 `(- n 1)` 调用,然后是 `factorial`,乘法是最后一步计算。 30 | 需要做的同上:给 `factorial` 调用加上一个 `k` 参数。 31 | 32 | ``` racket 33 | (define (factorial& n k) 34 | (if (= n 0) 35 | 1 36 | (factorial& (- n 1) (lambda (fact) ; use a lambda to represent the computation 37 | (* n fact))))) 38 | ``` 39 | 40 | `k` 在这里用一个 lambda 表示,lambda 封装了 `factorial` 调用之后的动作,并接收调用结果作为参数 `fact`。 41 | 42 | 得到的这个就是 CPS 形式的 `factorial`。 43 | 调用它的时候,需要显式提供一个过程用来接收过程调用结果, 44 | 比如这样: `(factorial& 10 (lambda (x) (display x) (newline)))`。 45 | 46 | 我仿照 [continuation sandwich][continuation_sandwich] 写了一个中文类比: 47 | 48 | > 假设现在你走进**厨房**,想做一碗**西红柿鸡蛋面**给自己吃,或者给女朋友吃。 49 | > 无论是做给谁吃,还是想拿它做其他事情,暂且把它写在纸条上揣进兜里。 50 | > 现在,你从冰箱里拿出来两个鸡蛋,一份面,当然还有可爱的西红柿,然后花个十分钟做好这份面。 51 | > 这个时候,你再从兜里把纸条拿出来,还原当时的情景,看看自己做面是想干什么:依旧在厨房里,想着一碗西红柿鸡蛋面,做好了给女朋友吃。 52 | > 接着这个思路,这个时候你会发现一份热腾腾的西红柿鸡蛋面已经摆在了你的面前,可以该干嘛干嘛了。 53 | 54 | *参数 `k` 和纸条上的想法就是所说的 continuation。 55 | 可能你也会发现 javascript 最为“出名”的 [callback][callback] 也是 continuation。* 56 | 57 | 58 | > 在 Racket 中,`=`、`*` 都是过程调用,严格来说,也需要做 CPS 变换。 59 | 60 | > 下面给出详细的 CPS 转换过程,如果不感兴趣或者已熟谙于心,可以直接跳过。:) 61 | 62 | #### `factorial` 完整的 CPS 变换 63 | 64 | 65 | ``` racket 66 | (define (factorial& n k) 67 | (=& n 0 (lambda (b) 68 | (if b 69 | (k 1) 70 | ???)))) 71 | ``` 72 | 73 | 接下来变换 `(* n (factorial (- n 1)))`。 74 | 前面提到过,这个表达式中,先有 `(- n 1)` 调用,然后是 `factorial`,乘法是最后一步计算。 75 | 依照先后顺序,可以依次作如下变换: 76 | 77 | ``` racket 78 | ; cps of (- n 1) 79 | (define (factorial& n k) 80 | (=& n 0 (lambda (b) 81 | (if b 82 | (k 1) 83 | (-& n 1 (lambda (nm1) 84 | ???))))) 85 | 86 | ; cps of (factorial (- n 1)) 87 | (define (factorial& n k) 88 | (=& n 0 (lambda (b) 89 | (if b 90 | (k 1) 91 | (-& n 1 (lambda (nm1) 92 | (factorial& nm1 (lambda (factor) 93 | ???))))))) 94 | 95 | ; cps of (* n (factorial (- n 1))), and that's it! 96 | (define (factorial& n k) 97 | (=& n 0 (lambda (b) 98 | (if b 99 | (k 1) 100 | (-& n 1 (lambda (nm1) 101 | (factorial& nm1 (lambda (fact) 102 | (*& n fact k)))))))) 103 | ``` 104 | 105 | ### call/cc 106 | 107 | continuation 就是这么个玩意儿,Scheme 提供了 call-with-current-continuation(call/cc) 来利用它编程。 108 | 109 | call/cc 接受一个函数 `f` 作为参数,并把当前的 continuation 打包成函数,传递给 `f`,continuation 只能进行函数应用操作。 110 | 111 | 112 | ``` racket 113 | ; define a function f, take a function argument k 114 | (define (f k) 115 | (k 1) 116 | (display 2)) 117 | 118 | (display (call/cc f)) 119 | 120 | (display 3) 121 | ``` 122 | 123 | 分析下这个程序,先确定调用 `call/cc` 时的 *current continuation*: 124 | 125 | ``` racket 126 | (lambda (x) 127 | (display x) 128 | (display 3)) 129 | ``` 130 | 131 | 参数 `x` 是 `call/cc` 调用的返回值,所谓的 *current continuation* 就是 `call/cc` 之后需要执行的代码块。 132 | 133 | 之后,这样的一个 continuation 就作为参数 `k` 传递给了 `call/cc` 的参数 `f`。 134 | 135 | ``` racket 136 | (f (lambda (x) 137 | (display x) 138 | (display 3))) 139 | ``` 140 | 141 | 在 `f` 的执行流中,`k` 的应用使得 `k` 中的执行流取代了之后的执行流 (即:`(lambda (y) (display 2))`), 142 | 程序的运行由 `k` 中的执行流决定: 143 | 144 | ``` racket 145 | ((lambda (x) 146 | (display x) 147 | (display 3)) 148 | 1) 149 | ``` 150 | 151 | 152 | 也可以保存 `call/cc` 创建出来的 continuation,以便反复使用。 153 | 154 | ``` racket 155 | (define cont #f) 156 | (display (call/cc (lambda (k) (set! 
cont k)))) 157 | 158 | (cont 1) 159 | (cont 2) 160 | (cont 3) 161 | ``` 162 | 163 | 这就是 `call/cc`。 164 | 165 | 166 | ### delimited continuation(定界延续) 167 | 168 | `call/cc` 无法控制 continuation 的边界,`call/cc` 调用之后的执行流都包含在 continuation 之内。 169 | **delimited continuation** 是用来解决这个问题的,它将 continuation 控制在某一个范围内,比如说, 170 | 171 | ``` racket 172 | (reset (display 0) 173 | (display (+ 1 (shift k (begin 174 | (k 0) 175 | (display 3))))) 176 | (display 2)) 177 | (display 4) 178 | ``` 179 | 180 | `shift` 就类似于 `call/cc` 的参数 `f` ,接收一个 continuation `k`, 181 | 而`reset` 则界定了 continuation 能够作用的范围。 182 | 在 `reset` 中调用 `shift` 时,`shift` 中的 continuation 就被限制在 `reset` 决定的范围里。 183 | 184 | 不过这里需要注意一点:`shift` 中,当 `(k 0)` 调用完毕后,执行流会继续往下执行 `(display 3)`,然后从这里跳出 `reset`。 185 | `shift` 的使用使得执行流从 `reset` 转换到 `shift`,并以此结束 `reset`。 186 | 187 | 188 | 189 | ### next... 190 | 191 | 网上有很多相关的资料,我也考过许多,不过 Jim Mcbeath 的[delimited continuations][delimited continuations] 让我真正明白 continuation 做了一件什么事情。 192 | 下一步,想搞清楚为什么要有 continuation,具体实现上对闭包的处理,以及和尾递归优化的关系。 193 | 194 | 195 | 196 | [continuation]: http://en.wikipedia.org/wiki/Continuation 197 | [CPS]: http://en.wikipedia.org/wiki/Continuation-passing_style 198 | [callback]: http://en.wikipedia.org/wiki/Callback_(computer_programming) 199 | [continuation_sandwich]: http://en.wikipedia.org/wiki/Continuation#cite_note-cont-sandwich-3 200 | [call-with-current-continuation]: http://en.wikipedia.org/wiki/Call-with-current-continuation 201 | [delimited continuations]: http://jim-mcbeath.blogspot.com/2010/08/delimited-continuations.html 202 | 203 | [Continuations in Scheme(draft)]: http://phillipwright.info/drafts/continuations.htm 204 | -------------------------------------------------------------------------------- /PL/about-elixir/README.md: -------------------------------------------------------------------------------- 1 | ## 关于 Elixir 2 | 3 | 4 | ### 异常处理 5 | 6 | ### Elixir 中的 try-catch 7 | 8 | Elixir 有三种异常类型: 9 | 10 | - `:error`, 由 `Kernel.raise` 产生 11 | - `:throw`, 由 `Kernel.throw/1` 产生 12 | - `:exit`, 由 `Kernel.exit/1` 产生 13 | 14 | try-catch 中的子句中, 15 | 16 | ```elixir 17 | try do 18 | do_something_that_may_fail(some_arg) 19 | rescue 20 | x in [ArgumentError] -> 21 | IO.puts "Invalid argument given" 22 | catch 23 | class, value -> 24 | IO.puts "#{class} caught #{value}" 25 | else 26 | value -> 27 | IO.puts "Success! The result was #{value}" 28 | after 29 | IO.puts "This is printed regardless if it failed or succeed" 30 | end 31 | ``` 32 | 33 | - `rescue` 部分只能用来处理 `:error` 类型, 34 | 常见的包括: `RuntimeError`、`ArgumentError`、`ErlangError`,也可通过 `Kernel.defexception/1` 自定义类型。 35 | - `catch` 部分对这三种类型都可以处理,类型通过`class` 进行绑定(未指定 class 时,默认是 `:throw`)。 36 | - `else` 部分在未出错的情况下执行。 37 | - `after` 部分无论什么情况都会执行。 38 | 39 | ### Erlang 中的 try-catch 40 | 41 | Erlang 有两个表达式来处理,一个是 `catch`,另一个是 `try-catch`。 42 | 43 | #### Catch me 44 | 45 | `catch` 将表达式中产生的三种异常转换成信息。 46 | 47 | - `error` => {'EXIT', {Reason, Stacktrace}} 48 | - `exit(Term)` => {'EXIT', Term} 49 | - `throw(Term)` => Term 50 | 51 | ``` erlang 52 | %%% catch EXPR 53 | catch 1+a. % => {'EXIT',{badarith,[...]}} 54 | catch exit(invalidargs) %=> {'EXIT', invalidargs} 55 | catch throw(1). % => 1 56 | ``` 57 | 58 | #### Catch me and distinguish me 59 | 60 | `try-catch` 是增强版的 `catch`, 可以拿到异常的类型。 61 | 62 | ``` erlang 63 | %%% try-catch 64 | try EXPR 65 | catch 66 | CLASS:EXCEPTION -> 67 | %% do something with the exception. 
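        %% 例如可按异常类型写成多个分支(示意,元组内容为假设的占位写法):
        %%   throw:Term   -> {throwed, Term};
        %%   error:Reason -> {errored, Reason};
        %%   exit:Reason  -> {exited, Reason}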
68 | end 69 | ``` 70 | 71 | 可以看出,Elixir 和 Erlang 中的 try-catch 基本是相对应的,Erlang 中的 `catch` 没有对应的 Elixir 版本。 72 | 这是可以理解的,毕竟用到 `catch` 的地方一般都可以用 `try-catch` 代替。 73 | 74 | 75 | ### 进程退出 76 | 77 | 总结一下 __进程退出(proccess exiting)__ 这个话题。 78 | 79 | 如下我站在进程 A 的的视角解释它会以哪一种方式狗带。 80 | 81 | #### 进程内部 82 | 83 | 从进程自身来说: 84 | 85 | - 第一种是自然死亡:进程执行完(或者显式的调用 `exit(:normal)`),留下 `:normal` 消息。 86 | - 其次就是自杀:进程调用 `exit(:kill)` 方法结束自己的生命,留下 `:kill` 消息。 87 | - 最后是生病死亡:进程在工作的过程中,不小心出错了,比如:`ArgumentError`、`ArithmeticError`,留下的消息就是这些错误信息。 88 | (死前也可以显示地调用 `exit(other_reason)` 来说明自己死去的原因)。 89 | 90 | #### 进程外部 91 | 92 | 外部环境也可以通过 `Process.exit/2` 方法来杀死进程 A: 93 | 94 | - `exit(pid, :normal)` 对 A 来说是无效的,毕竟任何外部因素无法使一个人自然死亡。 95 | (如果调用这个方法的是 A 自己,就和从进程内部显示调用 `exit(:normal)` 的效果一样) 96 | - `exit(pid, :kill)` 能够成功杀死 A,A 会留下消息 `killed`。 97 | - `exit(pid, other_reason)` 同样也会将病传给了 A,致其死亡,A 留下消息 `other_reason`。 98 | 99 | 内部和外部导致进程 A 死亡的情况差不多,要么自然死亡,要么染病死亡,在要么被人暗杀或者自杀。 100 | 101 | #### 进程防御 102 | 103 | 对于外部环境的暗杀,进程 A 是有自己的防御措施的。 104 | A 通过 `Process.flag(:trap_exit, true)` 给自己加上 `trap_exit` 防御后, 105 | 106 | - 受到外部的 `:normal` 攻击时,A 不但不会受到任何伤害,还会记录该攻击的具体情况(`{:EXIT, _from_who, :normal}`)。 107 | - 受到外部的 `:kill` 攻击时,A 防御失败,最终死亡。 108 | - 受到外部的 `other_reason` 攻击时,A 防御成功,也会记录该攻击的情况(`{:EXIT, _from_who, _other_reason}`)。 109 | 110 | 以上是我理解的进程退出。有任何遗漏或者错误,欢迎讨论。 111 | 112 | 113 | 那么问题来了, `exit(:kill)` 和 `Process.exit(pid, :kill)` 有什么区别? 114 | 想清楚这个问题, __Process Exiting__ 就尘埃落定了,具体可参见 [This PR in Erlang/OTP](https://github.com/erlang/otp/pull/854)。 115 | 116 | 117 | ### Macro 使用 118 | 119 | 第一次用 Elixir 写代码,拿 [stockfighter](https://github.com/lerencao/stockfighter) 练手。 120 | 期间遇到一个编译错误,总是提示 `__struct__ is not defined`, 121 | 排查了一会,发现是 `use HTTPoison.Base` 引起的。 122 | 123 | 在它的 `__using__` 中发现了这样一行代码: `defoverridable Module.definitions_in(__MODULE__)`。 124 | 我的 module 中使用了 `defstruct`, 125 | 所以这行代码会使得 module 中 `__struct__` 方法(用 `defstruct` 时会编译出该方法)被惰性定义, 126 | 用 `%__MODULE__{}` 作 pattern matching 会出现上面提到的错误。 127 | 128 | 起初我很自然地认为这行代码是罪魁祸首,就给 httpoison 提了一个 [PR](https://github.com/edgurgel/httpoison/pull/110)。 129 | 后来再审视的时候,突然想起来,Elixir 中,macro 的使用顺序是很重要的。 130 | 在 using module 之前使用 `defstruct`,`__struct__` 会被 overridable, 131 | 但如果是在之后的话,`defoverridable Module.definitions_in(__MODULE__)` 这行代码在执行时, 132 | `__struct__` 还不存在,就不会被 overridable 了。 133 | 试了试,发现问题确实出在这里。 134 | 135 | 136 | ``` elixir 137 | defmodule FooBar do 138 | defmacro __using__(_) do 139 | quote do 140 | defoverridable Module.definitions_in(__MODULE__) 141 | Module.definitions_in(__MODULE__) |> IO.inspect 142 | end 143 | end 144 | end 145 | 146 | defmodule MyModule do 147 | use FooBar # it should be used first 148 | defstruct bar: nil 149 | def foo(%__MODULE__{bar: bar}), do: bar 150 | end 151 | ``` 152 | 153 | ### Plug 源码 154 | 155 | [Plug](https://github.com/elixir-lang/plug) 156 | 157 | 从 `Plug` 开始,`Plug` 本身是一个 `Application`,里面定义了 `start` 回调方法。 158 | 同时,它还是一个 *behaviour*,用于实现各种各样的 Web 模块。 159 | 160 | 161 | 作为 application 时,`Plug` 依托于 `Plug.Supervisor`,它做了两件事情: 162 | 163 | * 启动 `Plug.Upload`,用于管理文件上传。 164 | * 另外定义了 `Plug.Key` ets。 165 | 166 | #### 作为 `Behaviour` 167 | 168 | `Plug.Conn`: *unset* ---> (*set* | *file*) ---> *sent* | *chunked* 169 | 170 | - `put_status`: ---> *sent* 171 | - `resp`: unsent(*unset* | *set*) ---> *set* 172 | - `send_resp`: *set* --> *set* => `run_before_send` => `adapter.send_resp` --> *sent* 173 | - `send_file`: unsent([`unset`, `set`]) --> *file* => `run_before_send` => `adapter.send_file` --> *sent* 174 | - `send_chunked`: unsent --> *chunked* => 
`run_before_send` => `adapter.send_chunked` --> *chunked* 175 | `chunk`: *chunked* => `adapter.chunk` 176 | -------------------------------------------------------------------------------- /PL/about-es6/README.md: -------------------------------------------------------------------------------- 1 | # ES6 新特性 2 | 3 | 4 | Ref [mozilla's es6 in depth](https://hacks.mozilla.org/category/es6-in-depth/) 5 | 6 | ## for-of loop 7 | 8 | Via https://hacks.mozilla.org/2015/04/es6-in-depth-iterators-and-the-for-of-loop/ 9 | 10 | for-of 只是一个语法糖。 11 | 内部需要借助 iterator,凡是实现 iterator 接口的 object 都可以使用 for-of。 12 | 13 | 所谓的 iterator 接口只是需要 object 包含一个 key 为 `Symbol.iterator` 的函数字段,该函数执行返回 iterator 对象。 14 | 15 | iterator 对象需要实现 `next` 方法,以及两个可选的方法: `return` 和 `throw(exc)`。 16 | 17 | 如果在 for-of 过程中 break 或者 return 或者 exception 了, `return` 方法就会被调用。 18 | 19 | 下面是一个简单的例子:(斐波那契数) 20 | 21 | ``` javascript 22 | class Fibonacci { 23 | constructor() { 24 | this.a = 0; 25 | this.b = 1; 26 | 27 | } 28 | 29 | [Symbol.iterator]() { 30 | return { 31 | next: () => { 32 | const t = this.b; 33 | this.b = this.a + this.b; 34 | this.a = t; 35 | return { done: false, value: this.b - this.a }; 36 | }, 37 | return: () => { 38 | console.log("exit"); 39 | } 40 | }; 41 | } 42 | } 43 | ``` 44 | 45 | 在 for-of 中使用它: 46 | 47 | ``` javascript 48 | for(var fi of new Fibonacci()) { 49 | if(fi > 10) { 50 | break; 51 | } 52 | 53 | console.log(fi); 54 | } 55 | ``` 56 | 57 | [babel](http://babeljs.io) 会将上述代码转义成下面这样: 58 | 59 | ``` javascript 60 | var _iteratorNormalCompletion = true; 61 | var _didIteratorError = false; 62 | var _iteratorError = undefined; 63 | 64 | try { 65 | 66 | for (var _iterator = new Fibonacci()[Symbol.iterator](), _step; !(_iteratorNormalCompletion = (_step = _iterator.next()).done); _iteratorNormalCompletion = true) { 67 | var fi = _step.value; 68 | 69 | if (fi > 10) { 70 | break; 71 | } 72 | 73 | console.log(fi); 74 | } 75 | } catch (err) { 76 | _didIteratorError = true; 77 | _iteratorError = err; 78 | } finally { 79 | try { 80 | if (!_iteratorNormalCompletion && _iterator["return"]) { 81 | _iterator["return"](); 82 | } 83 | } finally { 84 | if (_didIteratorError) { 85 | throw _iteratorError; 86 | } 87 | } 88 | } 89 | ``` 90 | 91 | 可以看到,当 for-of 不是正常结束的时候,`return` 就会执行(当然,如果定义了的话)。 92 | 93 | 至于说 `throw(exc)`, 下一节再看。 94 | 95 | ## 生成器 96 | - https://hacks.mozilla.org/2015/05/es6-in-depth-generators 97 | - https://hacks.mozilla.org/2015/07/es6-in-depth-generators-continued 98 | 99 | 语法: `function* (...args) { ... }` ,称之为 **生成器函数**(generator function)。 100 | 101 | 可以在 _生成器函数_ 中使用 yield 方法达到 iterator 效果。 102 | 103 | 这里的 `yield` 和 ruby 中的 `yield` 并没有本质区别。 104 | 105 | 函数 yield 过程: 106 | 107 | (函数调用开始)start of function --> (`.next()`)yield --> (`.next()`)yield --> .... --> (`.next()`)end of function 108 | 109 | ### `yield` 返回值 110 | 111 | `yield [expression]` 也可以返回一个值: `var v = yield someExpression`。 112 | 这个值从 `.next(value)` 通过参数 `value` 传过来。 113 | 114 | start of function --> (`.next(undefined)`)yield --> (`.next(v)`)yield --> .... --> (`.next(v)`)end of function 115 | 116 | ``` javascript 117 | var g = getSomeGenerator(); 118 | // new created generator is suspended. 119 | // when first call next, the value passed to next is meaningless. 120 | // and the generator run to first yield, and suspended again. 121 | g.next(someValue); 122 | 123 | // when call next again, the value passed to it is the last-encountered yield's return value. 
124 | g.next(anotherValue); 125 | ``` 126 | 127 | ``` javascript 128 | function* range(start, end) { 129 | try { 130 | for(var i = start; i < end; i++) 131 | console.log(yield i); 132 | } 133 | finally { 134 | console.log("clean up"); 135 | } 136 | 137 | } 138 | 139 | var rng = range(0, 10); 140 | 141 | rng.next("what"); // no output 142 | rng.return("what"); // output 'what' 143 | rng.next("what"); // output 'what' 144 | ``` 145 | 146 | ### `.return` 方法 147 | 148 | 前面说过,iterator 在迭代的过程中,可能会 break,会 return,也会 error,这个时候 iterator 的 `return` 方法会被调用来对 iterator 作一些收尾工作,这个收尾工作可能是关闭文件,断开数据库连接等等。 149 | 150 | Generator 作为一种 Iterator,也可以定义 `return` 方法。`try {} finally {}` 中的 `finally` 就是。。(不要问我,我也不知道为啥要这么设计)。 151 | 如果 Generator 中有使用 try-finally,那么finally 块就相当于 `return`。。 152 | 153 | ``` javascript 154 | function* range(start, end) { 155 | try { 156 | for(var i = start; i < end; i++) 157 | console.log(yield i); 158 | } 159 | finally { 160 | console.log("clean up"); 161 | } 162 | 163 | } 164 | 165 | var rng = range(0, 10); 166 | 167 | rng.next("what"); // no output 168 | rng.next("what"); // output 'what' 169 | rng.return(); // output 'clean up' 170 | rng.next("what"); // no output 171 | ``` 172 | 173 | ### `.throw` 方法 174 | 175 | Generator 的调用者也可以直接 `throw(exception)`,这相当于 `yield someExpression` 在执行过程中出错,`exception` 会在 Generator 中抛出。 176 | 177 | ``` javascript 178 | function* range(start, end) { 179 | try { 180 | for(var i = start; i < end; i++) 181 | console.log(yield i); 182 | } catch (e) { 183 | console.log(`error: ${e}`); 184 | } 185 | finally { 186 | console.log("clean up"); 187 | } 188 | 189 | } 190 | 191 | var rng = range(0, 10); 192 | 193 | rng.next("what"); // no output 194 | rng.throw("what"); // output 'error: what' and 'clean up' 195 | rng.next("what"); // no output 196 | ``` 197 | 198 | ### `yield*` 199 | 200 | 往往需要在一个 generator 中调用另一个 generator, 201 | 202 | ``` javascript 203 | function* concat(generatora, generatorb) { 204 | for (var a of generatora) yield a; 205 | for (var b of generatorb) yield b; 206 | } 207 | ``` 208 | 209 | `yield*` 更适用于这种的情况。 210 | 211 | ``` javascript 212 | function* concat(generatora, generatorb) { 213 | yield* a; 214 | yield* b; 215 | } 216 | ``` 217 | 218 | ### TODO:生成器结合 `Promise` 219 | 220 | ## Template String 221 | 222 | 模板字符串功能在其他语言已经很常见了: 223 | 224 | ``` javascript 225 | // javascript 226 | `hello, ${yourname}` 227 | ``` 228 | 229 | ``` scala 230 | // scala 231 | s"hello, $yourname" 232 | ``` 233 | 234 | ``` ruby 235 | # ruby 236 | "hello, #{yourname}" 237 | ``` 238 | 239 | ``` elixir 240 | # elixir 241 | ~s(hello, #{yourname}) 242 | ``` 243 | 244 | 但是,从本质上来说,javascript 的 **template string** 和 elixir 的 [sigils](http://elixir-lang.org/getting-started/sigils.html#custom-sigils) 更类似,可以自定义 template 处理方法。 245 | 246 | ``` javascript 247 | function SaferHTML(templateData) { 248 | var s = templateData[0]; 249 | for (var i = 1; i < arguments.length; i++) { 250 | var arg = String(arguments[i]); 251 | 252 | // Escape special characters in the substitution. 253 | s += arg.replace(/&/g, "&") 254 | .replace(//g, ">"); 256 | 257 | // Don't escape special characters in the template. 258 | s += templateData[i]; 259 | } 260 | return s; 261 | } 262 | 263 | let yourname = ""; 264 | let greeting = SaferHTML`
<p>hello, ${yourname}</p>`; // 当 yourname = "<myname>" 时,结果为 <p>hello, &lt;myname&gt;</p>
265 | ``` 266 | 267 | ### Rest parameters and defaults && Destructuring 268 | 269 | 这两个功能不再细说,其他语言早已经实现了。 270 | 271 | ### Arrow Function 272 | 273 | 类似于 _coffeescript_ 中的函数简写形式。 274 | 值得注意的一点是 _arrow function_ 里面的 `this` 是 context 中的 `this`。 275 | 276 | ### Symbol 277 | 278 | 符号类型是新加的一种类型。 279 | 280 | 其他语言如,Ruby,Elixir 都有这个概念。 281 | 282 | 但是 Symbol 主要是 ES6 为了做兼容而想出来的一种不失优雅的解决方案。 283 | 不过以后应该会派上更大的用场。 284 | 285 | -------------------------------------------------------------------------------- /PL/about-ruby/README.md: -------------------------------------------------------------------------------- 1 | 1. Lack of multi threads. 2 | Concurrency is hard, especially in ruby. Many other languages like Rust, Elixir, Scala, Clojure make multi-core programming easy. These fancy things cannot exists in Ruby because of GIL. Go http://awesome-ruby.com/#awesome-ruby-concurrency, and you can see only three libraries there, and difficult to use. 3 | 2. Lack of basic type constraints. (One thing I like Elixir/Erlang most is optional type definitions.) Type is the new sexy: 4 | - Many languages based on JVM are typed. 5 | - Many languages based on LLVM are typed. 6 | - Many languages based on BEAM are typed. 7 | 3. Too many magic boxes. Yeah, I mean meta-programming. 8 | 4. Pool docs. I hate it. Writing it is hard, Reading it is harder. 9 | 5. No new sexy projects. RoR is too big to fall, and it's going to fall. 10 | 6. The community is less active. 11 | 7. I will not criticize the performance, as Ruby is nothing to do with it. 12 | 8. To be continued. 13 | -------------------------------------------------------------------------------- /PL/type-level-programming/README.md: -------------------------------------------------------------------------------- 1 | ## TLP in Scala 2 | 3 | ### Concepts ### 4 | 5 | - Dependent Type 6 | - Abstract Type 7 | - `type` keyword 8 | 9 | - Phantom Type, used as type constraints but never instantiated. 10 | 11 | - Implicit Parameter/Conversion 12 | 13 | ``` scala 14 | trait Printer[T] { 15 | def print(t: T): String 16 | } 17 | 18 | implicit val sp: Printer[Int] = new Printer[Int] { 19 | def print(i :Int) = i.toString 20 | } 21 | 22 | def foo[T](t: T)(implicit p: Printer[T]) = p.print(t) 23 | ``` 24 | 25 | - Type Class 26 | 27 | ``` scala 28 | trait CanFoo[A] { 29 | def foos(x: A): String 30 | } 31 | 32 | case class Wrapper(wrapped: String) 33 | 34 | object WrapperCanFoo extends CanFoo[Wrapper] { 35 | def foos(x: Wrapper) = x.wrapped 36 | } 37 | ``` 38 | Idea: Provide evidence(`WrapperCanFoo`) that a class(`Wrapper`) satisfies an interface(`CanFoo`). 39 | 40 | add implicit to turn into: 41 | 42 | ``` scala 43 | implicit object WrapperCanFoo extends CanFoo[Wrapper] { 44 | def foos(x: Wrapper) = x.wrapped 45 | } 46 | def foo[A](thing: A)(implicit evidence: CanFoo[A]) = evidence.foos(thing) 47 | foo(Wrapper("hi")) 48 | ``` 49 | 50 | - `implicitly` and `=:=` 51 | 52 | - Aux Pattern 53 | 54 | 55 | ### Reflection ### 56 | 57 | #### Symbol #### 58 | 59 | TypeSymbol: type, class, trait declarations, type parameters. 60 | - ClassSymbol 61 | TermSymbol: val, var, def, object declarations, packages, value parameters. 
62 | - MethodSymbol: def 63 | - ModuleSymbol: object declaration 64 | 65 | 转换到更具化的符号。 66 | 67 | `asClass`, 68 | `asMethod` 69 | 70 | ``` scala 71 | typeOf[String].member(TermName("length")).asMethod 72 | ``` 73 | 74 | #### Type #### 75 | 76 | ``` scala 77 | 78 | universe.typeOf 79 | universe.weakTypeOf 80 | 81 | ``` 82 | 83 | - check subtyping of two types: `<:<`, `weak_<:<` 84 | - check equality of two types: `=:=`(not `==`) 85 | - query for certain members or inner types: `typeOf[T].members`, `typeOf[T].declarations` 86 | 87 | 88 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | SUMMARY.md -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # Summary 2 | 3 | * [blockchain](./blockchain.md) 4 | * [about ceph](./about-ceph/README.md) 5 | * [Ruby is Dying](PL/about-ruby/README.md) 6 | * [Elixir 笔记](PL/about-elixir/README.md) 7 | * [ES6 新特性](PL/about-es6/README.md) 8 | * [关于 CPS](PL/about-cps/README.md) 9 | * [Scala Type Level Programming](PL/type-level-programming/README.md) 10 | * [ROP 错误处理](railway-oriented-programming/README.md) 11 | * [Kafka 压测](ARCH/kafka-benchmark-runner/README.md) 12 | * [CSBU 笔记](OS/csbu/README.md) 13 | * [TiDB](cap/tidb.md) 14 | * [LSM-tree 的 Compaction 策略](storage/lsm-tree/compaction-strategy.md) 15 | * [Beam Model](about-beam-model/README.md) 16 | * [effective cpp](effective_cpp/README.md) 17 | * [个人项目](projects/README.md) 18 | * [其他](miscellaneous.md) 19 | 20 | 21 | -------------------------------------------------------------------------------- /about-beam-model/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ### ParDo ### 4 | 5 | 6 | 7 | ### GroupByKey ### 8 | 9 | 10 | Windowing data 11 | 12 | => GroupByKeyAndWindow 13 | 14 | Support unaligned windowing: 15 | 16 | - unaligned is the general case. Aligned windowing is the special form. 17 | - Two related operations: AssignWindows, MergeWindows. 18 | 19 | #### Assignwindows #### 20 | 21 | creates a new copy of the element in each of the windows to which it has been assigned. 22 | 23 | #### MergeWindows #### 24 | 25 | 26 | ### The Second Problem: When to output result for a window? ### 27 | 28 | Trigger: provide multiple answers (or panes) for any given window. 29 | 30 | 31 | 32 | 33 | - Windowing determines **where in event time** data are grouped together for processing. 34 | - Triggering determines **when in processing time** the results of groupings are emitted as panes. 
35 | 36 | 37 | -------------------------------------------------------------------------------- /about-ceph/README.md: -------------------------------------------------------------------------------- 1 | ## Ceph System ## 2 | 3 | - https://thenewstack.io/understanding-software-defined-storage/ 4 | - https://thenewstack.io/software-defined-storage-ceph-way/ 5 | - https://thenewstack.io/software-defined-storage-with-an-understandable-interface-the-ceph-way-part-three/ 6 | 7 | -------------------------------------------------------------------------------- /blockchain.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "About Blockchain" 3 | date: 2018-01-21T21:59:38+08:00 4 | draft: true 5 | --- 6 | 7 | 8 | #### blockchain ledger 9 | 10 | 总账本:append only transactions 11 | 12 | 13 | #### blockchain ledger precursor 14 | 15 | 16 | [universal construction for lock-free data structures](https://doi.org/10.1145/114005.102808) 17 | 18 | Consensus ensures: 19 | 1. agreement: all honest parties agree on which transaction was selected, 20 | 2. termination: all honest parties eventually learn the selected transaction, and 21 | 3. validity: the selected transaction was actually proposed by some party. 22 | 23 | #### private blockchain ledger #### 24 | 25 | peers are identified. 26 | 27 | #### public blockchain ledger #### 28 | 29 | peers are anonymous. 30 | 31 | #### smart contracts #### 32 | 33 | -------------------------------------------------------------------------------- /book.json: -------------------------------------------------------------------------------- 1 | { 2 | "plugins": ["toc2", "todo", "git-author"], 3 | "pluginsConfig": { 4 | "toc2": { 5 | "addClass": true, 6 | "className": "toc" 7 | }, 8 | "git-author": { 9 | "modifyTpl": "Last modified by {user} {timeStamp}", 10 | "createTpl": "Created by {user} {timeStamp}", 11 | "timeStampFormat": "YYYY/MM/DD" 12 | } 13 | } 14 | } 15 | -------------------------------------------------------------------------------- /cap/statistic_in_tidb.md: -------------------------------------------------------------------------------- 1 | ### Sampling - Reservoir_sampling ### 2 | 3 | ### Count-Min Sketch ### 4 | 5 | ### FM Sketch ### 6 | 7 | -------------------------------------------------------------------------------- /cap/tidb.graffle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/cap/tidb.graffle -------------------------------------------------------------------------------- /cap/tidb.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # TiKV - HBase done right 4 | 5 | ![TiDB Family](https://pingcap.com/images/blog/sparkontikv.png) 6 | 7 | 8 | 9 | - [x] 概览 10 | - [x] Storage - RocksDB 11 | - [ ] Replication - Raft protocol 12 | - [ ] Transaction - 2PC/MVCC 13 | - [ ] Schedule - PD 14 | - [ ] Monitoring - Prometheus && Grafana 15 | - [ ] Testing 16 | - [ ] SQL Layer 17 | 18 | 19 | ---------- 20 | ## 概览 21 | 22 | **HBase 出了什么问题** 23 | 24 | 25 | - 可用性差 26 | - GC 27 | - 恢复时间长 28 | - 无跨行事务 - Mistake of Jeaf Dean 29 | - 决定了它只合适做一个 KV 数据库 30 | 31 | **需要考虑什么** 32 | 33 | 34 | - 一致性:我们是否需要保证整个系统的线性一致性,还是能容忍短时间的数据不一致,只支持最终一致性。 35 | - 稳定性:我们能否保证系统 7 x 24 小时稳定运行。系统的可用性是 4 个 9,还有 5 个 9?如果出现了机器损坏等灾难情况,系统能否做的自动恢复。 36 | - 扩展性:当数据持续增多,能否通过添加机器就自动做到数据再次平衡,并且不影响外部服务。 37 | - 分布式事务:是否需要提供分布式事务支持,事务隔离等级需要支持到什么程度。 38 | ---------- 39 | 40 | 41 | 42 | 
![](http://static.zybuluo.com/zyytop/rmudjvx02boh2g413hhkcqnu/1%E7%9A%84%E5%89%AF%E6%9C%AC.png) 43 | 44 | 45 | **可用性** 46 | 47 | 48 | - Rust 49 | - static language。 50 | - no GC。 51 | - Memory safe,avoid dangling pointer,memory leak。 52 | - Thread safe,no data race。 53 | - package manager。 54 | - C bindings,zero-cost。 55 | - not so easy to learn。 56 | - not so many libraries as Java or C++。 57 | - 多副本(Raft) 58 | - 数据写入只有**大多数副本节点写入成功**才算成功。 59 | - 一个副本节点 down,可以切换到其他副本节点。 60 | 61 | **跨行事务 -** 2PC 62 | 63 | 基于跨行事务,可以实现 SQL-based 的数据库。 64 | 65 | **GRPC API** 66 | 67 | 68 | - Get 69 | - Scan 70 | - BatchGet 71 | - Prewrite 72 | - Commit 73 | 74 | 75 | ---------- 76 | ## 数据存储 77 | ![Storage Stack](https://pingcap.com/images/blog/storage-stack1.png) 78 | 79 | ![Key Space](https://pingcap.com/images/blog/key-space.png) 80 | 81 | ![Storage Stack3](https://pingcap.com/images/blog/storage-stack3.png) 82 | 83 | ---------- 84 | ## 数据复制 85 | 86 | ### 一致性协议 Raft 87 | 88 | > 超级棒的 Slide http://thesecretlivesofdata.com 89 | 90 | ### Raft in TiKV 91 | 92 | ![](https://pingcap.com/images/blog-cn/tikv-architecture.png) 93 | 94 | - 每个 Region 有三个副本(Replica),副本之间构成一个 Raft Group。 95 | 96 | (Replica 分布在不同的 TiKV 节点上,其中 Leader 负责读/写,Follower 负责同步 Leader 发来的 raft log) 97 | 98 | - 请求到 Region Leader 的写请求的数据通过 Raft 协议,在三个副本之间达成一致。 99 | 100 | - 整个 KV 空间由许多个 Raft Group 构成。 101 | 102 | ------ 103 | 104 | ## 数据调度 105 | 106 | ### 为什么调度 107 | 108 | 负载与分布: 109 | 110 | - 保证请求均匀分布到节点上。 111 | - 保证节点存储容量均匀。 112 | - 保证数据访问热点分布均匀。 113 | - 保证副本数量不多不少,并且分布在不同机器上。 114 | - 避免集群 Balance 影响服务。 115 | 116 | 扩容缩容: 117 | 118 | - 增加节点以及下线节点后,保证数据均匀分布。 119 | 120 | 数据容灾: 121 | 122 | - 少数节点失效后,保证服务正常,以及负载和分布均衡。 123 | - 跨机房部署时,某个机房掉线后,保证不丢失数据甚至是保证正常服务。 124 | 125 | ### 调度的基本单元 126 | 127 | - 增加一个 Replica 128 | - 删除一个 Replica 129 | - 将 Raft Leader 从 一个 Replica 转移到另一个 Replica 130 | 131 | ### 集群信息收集 132 | 133 | 收集每个TiKV 节点的信息,以及每个 Region 的状信息。 134 | 135 | - 节点定时上报的信息。 136 | - 磁盘总量,可用磁盘容量。 137 | - 包含的 Region Replica 个数。 138 | - 数据写入速度。 139 | - 是否过载。 140 | - Label 信息。 141 | - 其他... 142 | - Region 的 Raft Leader 上报信息。 143 | - Leader/Follower 所在的节点。 144 | - 掉线的 Replica 个数。 145 | - Region 写入和读取的速度。 146 | - 其他... 147 | 148 | 149 | 150 | **通过管理接口传递进来的外部信息,做更准确的决策。** 151 | 152 | - 主动下线某个节点。 153 | - 主动迁移某些热点 Region。 154 | - 其他... 155 | 156 | ### 调度计划 157 | 158 | - **一个 Region 的 Replica 数量正确** 159 | - **一个 Raft Group 中的多个 Replica 不在同一个==位置==** 160 | - **副本在 Store 之间的分布均匀分配** 161 | - **Leader 数量在 Store 之间均匀分配** 162 | - **访问热点数量在 Store 之间均匀分配** 163 | - **各个 Store 的存储空间占用大致相等** 164 | - **控制调度速度,避免影响在线服务** 165 | - 其他。。(可在 PD 中实现自定义的调度策略) 166 | 167 | ### 调度实现 168 | 169 | 1. PD 不断的通过 Store 或者 Leader 的心跳包收集信息,获得整个集群的详细数据 170 | 2. 每次收到 Region Leader 发来的心跳包时,根据以上信息以及调度策略**==生成调度操作序列==**。 171 | 3. 
通过心跳包的返回消息,将需要进行的操作返回给 Region Leader,并在后面的心跳包中监测执行结果。 172 | 173 | ------ 174 | 175 | ## Transaction && MVCC 176 | 177 | 178 | 179 | ------ 180 | 181 | ## SQL Layer 182 | 183 | ![SQL to Key-Value](https://pingcap.com/images/blog/sql-kv.png) 184 | 185 | 186 | 187 | ------ 188 | 189 | ## 编码格式 190 | 191 | DB 192 | 193 | 194 | - `m + DBs + h + DB:[db_id]` => `TiDBInfo` 195 | - `m + DB:[db_id] + h + Table:[tb_id]` => `TiTableInfo` 196 | 197 | ``` json 198 | { 199 | "id":130, 200 | "db_name":{"O":"global_temp","L":"global_temp"}, 201 | "charset":"utf8", 202 | "collate":"utf8_bin", 203 | "state":5 204 | } 205 | ``` 206 | 207 | ``` json 208 | { 209 | "id": 42, 210 | "name": { 211 | "O": "test", 212 | "L": "test" 213 | }, 214 | "charset": "", 215 | "collate": "", 216 | "cols": [ 217 | { 218 | "id": 1, 219 | "name": { 220 | "O": "c1", 221 | "L": "c1" 222 | }, 223 | "offset": 0, 224 | "origin_default": null, 225 | "default": null, 226 | "type": { 227 | "Tp": 3, 228 | "Flag": 139, 229 | "Flen": 11, 230 | "Decimal": -1, 231 | "Charset": "binary", 232 | "Collate": "binary", 233 | "Elems": null 234 | }, 235 | "state": 5, 236 | "comment": "" 237 | } 238 | ], 239 | "index_info": [], 240 | "fk_info": null, 241 | "state": 5, 242 | "pk_is_handle": true, 243 | "comment": "", 244 | "auto_inc_id": 0, 245 | "max_col_id": 4, 246 | "max_idx_id": 1 247 | } 248 | ``` 249 | 250 | **执行计划落地** 251 | 252 | Table Scan/Index Scan => Selection/TopN/Aggr/Limit 253 | 254 | http://andremouche.github.io/tidb/coprocessor_in_tikv.html 255 | 256 | 257 | ## 代码目录说明 ## 258 | 259 | #### structure #### 260 | 261 | TxStructure 包装 `kv.Retriever` 和 `kv.RetrieverMutator` 来操作 `string`, `list`, `hash` 类型。 262 | 263 | #### meta #### 264 | 265 | Meta: 封装 `TxStructure` 来操作 meta 相关的信息。 266 | Id Allocator: id genertor 封装。 267 | 268 | #### model #### 269 | 270 | schema, table, index 等数据结构 存入底层 kv 时的 json 格式。 271 | 272 | #### table #### 273 | 274 | table,index, column 相关的抽象,用来操作数据。 275 | 276 | #### tablecodec #### 277 | 278 | 数据编码。 279 | 280 | #### terror #### 281 | 282 | 错误码。 283 | 284 | #### types #### 285 | 286 | mysql 类型封装。 287 | 288 | 289 | #### distsql #### 290 | 291 | 对 `kv.Response` 的封装。 292 | 293 | - streaming(`stream.go`) 294 | 调用 `kv.Response` 的 `Next` 方法,返回的数据是 `tipb.StreamResponse`(one chunk a time)。 295 | - select(`distsql.go`) 296 | 调用 `kv.Response` 的 `Next` 方法,返回的数据是 `tipb.SelectResponse`(many chunk)。 297 | 298 | TODO: 299 | 300 | - [ ] implementation of `kv.Response` 301 | -------------------------------------------------------------------------------- /cap/tikv.graffle: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/cap/tikv.graffle -------------------------------------------------------------------------------- /effective_cpp/README.md: -------------------------------------------------------------------------------- 1 | ## 5. Know what functions C++ silently writes and calls 2 | 3 | - default constructor if no constructor defined by user. 4 | - copy constructor if not defined by user. 5 | - assignment operator if not defined by user. 6 | - destructor if not defined by user. 7 | 8 | 9 | ## 6. Explicitly disallow the use of compiler-generated functions you do not want 10 | 11 | ## 7. Declare destructor virtual in polymorphic base classes 12 | - polymorphic base class should declare virtual destructors. 13 | If a class has any virtual functions, it should have a virtual destructor. 
**Execution plan pushdown**

Table Scan/Index Scan => Selection/TopN/Aggr/Limit

http://andremouche.github.io/tidb/coprocessor_in_tikv.html


## Code Directory Guide ##

#### structure ####

TxStructure wraps `kv.Retriever` and `kv.RetrieverMutator` to operate on `string`, `list`, and `hash` types.

#### meta ####

Meta: wraps `TxStructure` to operate on meta information.
Id Allocator: id generator wrapper.

#### model ####

The JSON format in which data structures such as schema, table, and index are stored in the underlying kv store.

#### table ####

Abstractions for tables, indexes, and columns, used to manipulate data.

#### tablecodec ####

Data encoding.

#### terror ####

Error codes.

#### types ####

Wrappers for MySQL types.


#### distsql ####

A wrapper around `kv.Response`.

- streaming (`stream.go`)
  calls `kv.Response`'s `Next` method; the returned data is `tipb.StreamResponse` (one chunk at a time).
- select (`distsql.go`)
  calls `kv.Response`'s `Next` method; the returned data is `tipb.SelectResponse` (many chunks).

TODO:

- [ ] implementation of `kv.Response`
--------------------------------------------------------------------------------
/cap/tikv.graffle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/cap/tikv.graffle
--------------------------------------------------------------------------------
/effective_cpp/README.md:
--------------------------------------------------------------------------------
## 5. Know what functions C++ silently writes and calls

- default constructor if no constructor defined by user.
- copy constructor if not defined by user.
- assignment operator if not defined by user.
- destructor if not defined by user.


## 6. Explicitly disallow the use of compiler-generated functions you do not want

## 7. Declare destructors virtual in polymorphic base classes
- Polymorphic base classes should declare virtual destructors.
  If a class has any virtual functions, it should have a virtual destructor.
- Classes not designed to be base classes or not designed to be used polymorphically
  should not declare virtual destructors.

## 8. Prevent exceptions from leaving destructors
- Destructors should never emit exceptions.
  If functions called in a destructor may throw, the destructor should catch any exceptions,
  then swallow them or terminate the program.
- If class clients need to be able to react to exceptions thrown during an operation,
  the class should provide a regular function that performs the operation.

## 9. Never call virtual functions during construction or destruction

## 12. Copy all parts of an object

- Copying functions should be sure to copy all of an object's data members
  and all of its base class parts.
- Don't try to implement one of the copying functions in terms of the other.
  Instead, put common functionality in a third function that both call.


## 13. Use objects to manage resources

- To prevent resource leaks,
  use RAII objects that acquire resources in their constructors and release them in their destructors.
- Two commonly useful RAII classes are tr1::shared_ptr and auto_ptr.
  tr1::shared_ptr is usually the better choice, because its behavior when copied is intuitive.
  Copying an auto_ptr sets the copied-from pointer to null.


## 14. Think carefully about copying behaviour in resource-managing objects

- Copying an RAII object entails copying the resource it manages,
  so the copying behavior of the resource determines
  the copying behavior of the RAII object.
- Common RAII class copying behaviours are:
  - Disallowing copying
  - Performing reference counting
  - Copying the underlying resource,
    such as the standard string type
  - Transferring ownership of the underlying resource, as ~auto_ptr~ does


## 15. Provide access to raw resources in resource-managing classes

- APIs often require access to raw resources,
  so each RAII class (like shared_ptr) should offer a way to get at the resource it manages.
- Access may be via explicit conversion or implicit conversion.
  In general, explicit conversion is safer,
  but implicit conversion is more convenient for clients.

**shared_ptr** and **auto_ptr** provide both:

- a get method
- overridden pointer dereferencing operators.


## 16. Use the same form in corresponding uses of new and delete

- If you use [] in a new expression, you must use [] in the corresponding delete expression.
- If you don't use [] in a new expression, you mustn't use [] in the corresponding delete expression.
- Be careful with ~typedef~.


## 17. Store newed objects in smart pointers in standalone statements

- Store newed objects in smart pointers in standalone statements.
  Failure to do so can lead to subtle resource leaks when exceptions are thrown.

``` cpp
// ~priority()~ may be called between ~new Widget~ and ~shared_ptr~;
// if it throws at that point, the newly allocated Widget leaks.
processWidget(std::tr1::shared_ptr<Widget>(new Widget), priority());
```


# Designs and Declarations

## 18. Make interfaces easy to use correctly and hard to use incorrectly


- Good interfaces are easy to use correctly and hard to use incorrectly.
  You should strive for these characteristics in all your interfaces.
- Ways to facilitate correct use include consistency in interfaces and
  behavioural compatibility with built-in types.
- Ways to prevent errors include:
  - creating new types,
  - restricting operations on types, constraining object values,
  - and eliminating client resource-management responsibilities.
- ~tr1::shared_ptr~ supports custom deleters.
  This prevents the cross-DLL problem and
  can be used to automatically unlock mutexes.


## 19. Treat class design as type design.

- Before defining a new type, be sure to consider all the issues in this item.


## 20. Prefer pass-by-ref-to-const to pass-by-value

- Prefer pass-by-reference-to-const over pass-by-value.
  It's typically more efficient and it avoids the slicing problem.
- The rule doesn't apply to
  built-in types and STL iterator and function object types.
  For them, pass-by-value is usually appropriate.


## 21. Don't try to return a reference when you must return an object

Never return
- a pointer or reference to a local stack object,
- a reference to a heap-allocated object,
- or a pointer or reference to a local static object
  if there is a chance that more than one such object will be needed.

(Item 4 provides an example of a design where returning a reference to a local static is reasonable, at least in single-threaded environments.)



## 22. Declare data members private

- Declare data members private. It gives clients
  syntactically uniform access to data,
  affords fine-grained access control,
  allows invariants to be enforced,
  and offers class authors implementation flexibility.
- protected is no more encapsulated than public.


## 24. Declare non-member functions when type conversions should apply to all parameters

Why the ctor isn't declared explicit.

## 33. Avoid hiding inherited names

- Names in derived classes hide names in base classes.
  Under public inheritance, this is never desirable.
- To make hidden names visible again, employ using declarations or
  forwarding functions.

## 34. Differentiate between inheritance of interface and inheritance of implementation.


- Pure virtual functions must be redeclared by any concrete class that inherits them,
  and typically have no definition in abstract classes.
  Their purpose is to have derived classes inherit a *function interface* only.

- Simple virtual functions
  provide an implementation that derived classes may override.
  The purpose of declaring a simple virtual function is
  to have derived classes inherit a function interface as well as a default implementation.
  A better way is using pure virtual functions with a protected default implementation.

- Non-virtual functions
  The purpose of declaring a non-virtual function is
  to have derived classes inherit a function interface as well as a mandatory implementation.


Two common mistakes:
- declaring all functions non-virtual
  (non-virtual destructors are problematic, Item 7);
- declaring all member functions virtual.


### TtR
- Inheritance of interface is different from inheritance of implementation.
  Under public inheritance, derived classes always inherit base class interfaces.
- Pure virtual functions specify inheritance of interface only.
- Simple (impure) virtual functions specify inheritance of interface plus inheritance of a default implementation.
- Non-virtual functions specify inheritance of interface plus inheritance of a mandatory implementation.


## 35. Consider alternatives to virtual functions.

- The Template Method Pattern via the Non-Virtual Interface Idiom
- The Strategy Pattern via Function Pointers
- The Classic Strategy Pattern

- Use the non-virtual interface idiom (NVI idiom), a form of the Template Method design pattern that wraps public non-virtual member functions around less accessible virtual functions.
- Replace virtual functions with function pointer data members, a stripped-down manifestation of the Strategy design pattern.
- Replace virtual functions with tr1::function data members, thus allowing use of any callable entity with a signature compatible with what you need. This, too, is a form of the Strategy design pattern.
- Replace virtual functions in one hierarchy with virtual functions in another hierarchy. This is the conventional implementation of the Strategy design pattern.


- Alternatives to virtual functions include the NVI idiom and various forms of the Strategy design pattern. The NVI idiom is itself an example of the Template Method design pattern.
- A disadvantage of moving functionality from a member function to a function outside the class is that the non-member function lacks access to the class's non-public members.
- tr1::function objects act like generalized function pointers. Such objects support all callable entities compatible with a given target signature.

## 36. Never redefine an inherited non-virtual function

Non-virtual functions are statically bound (Item 37),
while virtual functions are dynamically bound.

See also Item 7.

## 37. Never redefine a function's inherited default parameter value.

Never redefine an inherited default parameter value,
because default parameter values are statically bound, while virtual functions
(the only functions you should be redefining) are dynamically bound.

And use the non-virtual interface idiom (Item 35) to stay DRY.


## 38. Model "has-a" or "is-implemented-in-terms-of" through composition.

- Composition has meanings completely different from that of public inheritance.
- In the application domain, composition means has-a. In the implementation domain, it means is-implemented-in-terms-of.

## 39. Use private inheritance judiciously

- Private inheritance means is-implemented-in-terms-of.
  It's usually inferior to composition,
  but it makes sense when a derived class needs access to protected base class members
  or needs to redefine inherited virtual functions.
- Unlike composition, private inheritance can enable the empty base optimization.
  This can be important for library developers who strive to minimize object sizes.


## 40. Use multiple inheritance judiciously
## 45. Generalized version of copy functions

## 49 & 51. operator new and delete
--------------------------------------------------------------------------------
/miscellaneous.md:
--------------------------------------------------------------------------------
- [Dataflow Model](https://paper.dropbox.com/doc/Dataflow-Model-6GCcfZriy8zi8v3MPZghi)
- [Ruby ecosystem](ruby-ecosystem/README.org)
--------------------------------------------------------------------------------
/projects/README.md:
--------------------------------------------------------------------------------
### Personal Projects

- [asdf-sbt](https://github.com/lerencao/asdf-sbt): an asdf plugin for installing sbt.
- [divergent.rb](https://github.com/lerencao/divergent.rb): a Ruby implementation of Try and Maybe, mainly for handling logic errors.
- [etcd.ansible](https://github.com/lerencao/etcd.ansible), [flannel.ansible](https://github.com/lerencao/flannel.ansible), [docker.ansible](https://github.com/lerencao/docker.ansible): Ansible roles for installing Kubernetes-related components.
- [dependency gem](https://github.com/lerencao/dependency): an attempt to implement Scala's self-type feature in Ruby.
- [Sinatra Param Checker Extension](https://github.com/lerencao/sinatra_param_checker): a param checker for the Sinatra framework (built on before filters).
- [raft.ex](https://github.com/lerencao/raft.ex): an attempt to implement the Raft protocol in Elixir. (WIP)
- [peer connection](https://github.com/lerencao/peer_conn): an Elixir implementation for managing multiple conn peers. Each peer is a process managed by a supervisor.
- [devices.ruff](https://github.com/lerencao/devices.ruff): an entry-level [Ruff](https://ruff.io) application.
- [lakeland](https://github.com/lerencao/lakeland): a TCP connection pool in Elixir, based on Ranch.
- [beginners guide to scala](https://github.com/lerencao/guides-to-scala-book): a Chinese translation of [The Neophyte's Guide to Scala](http://danielwestheide.com/scala/neophytes.html).
- [Emacs config](https://github.com/lerencao/emacs.d): configuration for various Emacs plugins, managed with `use-package`.
- [webpack-react-starter](https://github.com/lerencao/webpack-react-starter): a very simple webpack starter with React HMR built in.
--------------------------------------------------------------------------------
/railway-oriented-programming/README.md:
--------------------------------------------------------------------------------
## Error Handling with ROP

### Common error-handling approaches

#### try-catch ####

``` ruby
begin
  ## send post request
rescue NetworkError => e
end
```

#### return code ####

``` ruby
ret = send_post_request()
if ret.code != 0
  return error
else
  parse(ret.data)
end
```

--------

#### The problems ####

Real business logic is complex:

* validating user input
* database access
* file access
* network issues
* ...

If you use the approaches above inside such business logic, the code gets ugly fast: if checks and begin-rescues pile up everywhere.

![return early](imperative-code-return-early.png)

--------

### Enter ROP (Railway Oriented Programming) ###

![success-or-failure](success-failure.png)

=>

![success-or-failure-railway](success-failure-railway.png)
![success-or-failure-railway](success-failure-railway-1.png)

If Validate fails, UpdateDb is not executed;
if UpdateDb fails, SendEmail is not executed.

![pipe chain](pipe-chain.png)

Implementation:

``` ruby
Try = Struct.new(:ok, :data_or_exception)

class Try
  def pipe(&block)
    if !ok
      return self
    end
    block.call(data_or_exception)
  end
end

def validate(req)
  # do your validation
  Try.new(true, validated_req)
rescue ValidateFailed => e
  Try.new(false, e)
end

request = FakeRequest.new()
r = Try.new(true, request)
r.pipe { |req| validate(req) }
 .pipe { |req| update_db(req) }
 .pipe { |db_result| send_email(db_result) }
```


#### Operations that cannot fail ####

``` ruby
class Try
  def map(&block)
    if !ok
      return self
    end
    result = block.call(data_or_exception)
    Try.new(true, result)
  end
end

def trim_name(str)
  str.strip
end


request = FakeRequest.new()
r = Try.new(true, request)
r.map { |req| trim_name(req.name) }
```


![two-track](two-track.png)


### Operations that may raise ###

``` ruby
class Try
  def map(&block)
    if !ok
      return self
    end
    begin
      result = block.call(data_or_exception)
      Try.new(true, result)
    rescue => e
      Try.new(false, e)
    end
  end
end


request = FakeRequest.new()
r = Try.new(true, request)
r.map { |req| function_may_raise_error(req) }
```


![raise error function](function_may_raise_error.png)

#### Side effects ####

Do something meaningful whose return value is not needed.

``` ruby
class Try
  def on_success(&block)
    if !ok
      return self
    end
    block.call(data_or_exception)
    return self
  end
end

def update_db(req)
  # User.save(req)
end


request = FakeRequest.new()
r = Try.new(true, request)
r.on_success { |req| update_db(req) }
```

![dead end railway](one-track-input-output.png)



### Chaining it all together

Chain the operations above:

``` ruby
request = FakeRequest.new()
r = Try.new(true, request)
response = r.pipe { |req| validate(req) }
            .map { |req| get_user(req.name) }
            .on_success { |req| update_db(req) }
            .pipe { |req| send_email(req) }
```

![chain all](chain-validate-update_db_send_email.png)

### Others ###

- Try
- Maybe (sketched below)
- Monad
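A tiny sketch of what the Maybe variant can look like next to Try: the same one-track/two-track idea, except the failure track carries no value at all. Illustrative only, not any particular library's API.

``` ruby
class Maybe
  def self.just(value)
    new(value, true)
  end

  def self.nothing
    new(nil, false)
  end

  def initialize(value, present)
    @value = value
    @present = present
  end

  # Stays on the failure track once the value is gone.
  def map
    return self unless @present
    Maybe.just(yield(@value))
  end

  def or_else(default)
    @present ? @value : default
  end
end

Maybe.just(" Alice ").map { |name| name.strip }.or_else("anonymous") # => "Alice"
Maybe.nothing.map { |name| name.strip }.or_else("anonymous")         # => "anonymous"
```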
--------------------------------------------------------------------------------
/railway-oriented-programming/chain-validate-update_db_send_email.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/chain-validate-update_db_send_email.png
--------------------------------------------------------------------------------
/railway-oriented-programming/function_may_raise_error.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/function_may_raise_error.png
--------------------------------------------------------------------------------
/railway-oriented-programming/imperative-code-return-early.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/imperative-code-return-early.png
--------------------------------------------------------------------------------
/railway-oriented-programming/one-track-input-output.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/one-track-input-output.png
--------------------------------------------------------------------------------
/railway-oriented-programming/pipe-chain.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/pipe-chain.png
--------------------------------------------------------------------------------
/railway-oriented-programming/success-failure-railway-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/success-failure-railway-1.png
--------------------------------------------------------------------------------
/railway-oriented-programming/success-failure-railway.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/success-failure-railway.png
--------------------------------------------------------------------------------
/railway-oriented-programming/success-failure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/success-failure.png
--------------------------------------------------------------------------------
/railway-oriented-programming/two-track.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/railway-oriented-programming/two-track.png
--------------------------------------------------------------------------------
/ruby-ecosystem/README.org:
--------------------------------------------------------------------------------
* Ruby Ecosystem - What every Ruby newbie should know

** Ruby Basics

- Created by *Matz* in 1995.
- For productivity and fun.

*** The basic elements that other languages also have.

Interactive Ruby Shell (irb):
#+BEGIN_SRC ruby
### Data Types

1_000_000 # Integer
"Hello, #{`whoami`}" # String
:SenseTime # Symbol
[1, 2, 3] # Array
%w(one two three) # Also an array
{'one' => 1, 'two' => 2} # Hash
{1 => 'one', 2 => 'two'} # Hash
{one: 1, two: 2} # Hash with Symbol keys
/\d{1,3}\w*/ # Regexp
# Also Set in the Standard Library

### Control Flow

# if, unless
if your_age < 18
  'do whatever you want'
end

unless he_is_a_programmer
  ''
end

# case/when

case resp_code
when 1000, 2000
  # the case
when 3000
  # other case
else
  # the remaining cases
end

# loop, while, until, break, retry, next

loop do
  if some_condition
    break
  end
end

### Exception handling: begin/rescue/ensure

begin
  raise 'Oops'
rescue StandardError => e
  logger.error(e)
ensure
  # release any acquired resources.
end

### Built-in support for
# 1. rational numbers
# 2. complex numbers
# 3. arbitrary-precision arithmetic
2/3r # => (2/3)
Rational(2, 3) # => (2/3)
2+3i # => (2+3i)
Complex(2, 3) # => (2+3i)
2 ** 100 # => 1267650600228229401496703205376

### Strict boolean coercion rules (everything is true except false and nil)
!!nil # => false
!!false # => false
!![] # => true
#+END_SRC


*** Thoroughly object-oriented with inheritance, mixin and metaclass.

- variable scope (global, class, instance, and local)

#+BEGIN_SRC ruby
# global
$this_is_a_global_variable = "Please do not change me, as I am fragile."

# class variable

class Person
  @@this_is_a_class_variable = "SenseTime"
  def initialize(name = "unknown")
    @name = name # instance variable
  end

  def name
    @name
  end

  def age
    always_18 = 18 # local variable
    always_18
  end
end

#+END_SRC

- Inheritance

#+BEGIN_SRC ruby
class A
  def a
    'A'
  end
end

class B < A
  def b
    'B'
  end
end

B.new.a # => 'A'
#+END_SRC

- Mixin, aka composition

#+BEGIN_SRC ruby
module RubySkill
  def metaprogramming
    'Meta Programming'
  end

  def rack
    'Rack'
  end

  def ruby_on_rails
    'Ruby on Rails'
  end

  # any other skills
end

class RubyMaster
  include RubySkill
end

everybody = RubyMaster.new
everybody.metaprogramming # 'Meta Programming'
everybody.rack # 'Rack'


module TestingSkill
  def unit
    'Unit Test'
  end

  def spec
    'Spec Test'
  end
end

class RubyMaster
  include TestingSkill
end


everybody.unit # => "Unit Test"
#+END_SRC


- Class is a first-class object.

#+BEGIN_SRC ruby
# *call methods on a class*
class A
end

def A.i_can_do_this
  'write Ruby'
end

A.i_can_do_this
#+END_SRC

#+BEGIN_SRC ruby
# *Every object has a singleton_class.*

a = A.new

def a.i_can_do_this
  'write Ruby'
end

a.i_can_do_this #=> 'write Ruby'

b = A.new

b.i_can_do_this #=> NoMethodError!
#+END_SRC

#+BEGIN_SRC ruby
a.class # => A
a.singleton_class #=> #<Class:#<A:0x00007f...>>

A.class #=> Class
A.singleton_class #=> #<Class:A>
#+END_SRC

- The Truth: [[http://debbbbie.com/img/metaprogramming_ruby/eigenclass.png][ruby method lookup]]



*** Dynamic reflection and alteration of objects.

#+BEGIN_SRC ruby
# duck typing
def bar(a)
  if a.respond_to?(:baz)
    a.baz
  end
end
#+END_SRC

#+BEGIN_SRC ruby
# all methods that you can call.
[Integer, String, Regexp, Array, Hash, Enumerable].map do |clazz|
  clazz.instance_methods(false).sort
end.each(&:inspect)
#+END_SRC

#+BEGIN_SRC ruby
class Foo
  def bar
    @baz = 'baz'
  end
end

foo = Foo.new
foo.instance_variable_get :@baz # => nil

foo.bar
foo.instance_variable_get :@baz # => 'baz'

foo.instance_variable_set :@baz, 'bar'
foo.instance_variable_get :@baz # => 'bar'
#+END_SRC


*** Lexical closures and a unique block syntax.

#+BEGIN_SRC ruby
[1, 2, 3].each do |x|
  puts x ** 3
end # block

cube = ->(x) { x ** 3 } # lambda
cube[3]
cube.call(3)

gen = ->(times) {
  ->(e) {
    e ** times
  }
}

square = Proc.new { |x| x * x } # proc
square[3]
square.call(3)
#+END_SRC


*** Centralized package management through RubyGems

- just ~gem install any_gem_you_want_to_install~

- and also easy-to-use and powerful extensibility.

#+BEGIN_SRC ruby
class Range
  def overlaps?(other)
    cover?(other.first) || other.cover?(first)
  end
end
#+END_SRC

- *active support*!


*** Lispy and functional parts

- Ruby is eval.

#+BEGIN_SRC ruby
defs = <<~RUBY
  def cube(x)
    x ** 3
  end
RUBY

eval(defs)
cube(3) # => 27

### select!
titles = ['Programming Ruby', 'Ruby Metaprogramming', 'Scala Quick Guide', 'Rust Tutorial']
titles.select! { |t| t =~ /[Rr]uby/ }
titles #=> ['Programming Ruby', 'Ruby Metaprogramming']

### flat_map
class User
  attr_accessor :books
  attr_accessor :company
end

User.find_by(company: 'SenseTime').flat_map do |user|
  user.books
end.select do |book|
  book.title =~ /[rR]uby/
end.uniq
#+END_SRC



** Ruby on Web

*** Rack

Minimal interface between webservers that support Ruby and Ruby
frameworks.

#+BEGIN_SRC ruby
require 'rack'

app = Proc.new do |env|
  ['200', {'Content-Type' => 'text/html'}, ['A barebones rack app.']]
end

Rack::Handler::WEBrick.run app
#+END_SRC

*** HTTP Servers

- Thin
- Unicorn
- Puma
- Passenger 5 (aka Raptor)
- ...

Evented? Multi-process? Multi-threaded?



*** Frameworks

**** Rails

- MVC
- ORM
- The Active* family.
- Many, many Rails gems. (Google 'awesome rails')

**** Sinatra, Padrino

#+BEGIN_SRC ruby
require 'sinatra'
use Rack::CommonLogger

get '/frank-says' do
  'Put this in your pipe & smoke it!'
end
#+END_SRC

**** Hanami

- fullstack
- service-oriented, modular.
- pure Ruby objects, less black magic.


**** Volt (rich web applications: write Ruby and run it in the browser)

Under the hood: it compiles Ruby to JavaScript using the Opal engine!

#+BEGIN_SRC ruby
class User
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def admin?
    @name == 'Admin'
  end
end

user = User.new('Bob')
puts user.name
puts user.admin?
#+END_SRC

---------------------------------------->>

#+BEGIN_SRC javascript
(function(Opal) {
  var self = Opal.top, $scope = Opal, nil = Opal.nil, $breaker = Opal.breaker, $slice = Opal.slice, $klass = Opal.klass, user = nil;

  Opal.add_stubs(['$attr_accessor', '$==', '$new', '$puts', '$name', '$admin?']);
  (function($base, $super) {
    function $User(){};
    var self = $User = $klass($base, $super, 'User', $User);

    var def = self.$$proto, $scope = self.$$scope, TMP_1, TMP_2;

    def.name = nil;
    self.$attr_accessor("name");

    Opal.defn(self, '$initialize', TMP_1 = function ːinitialize(name) {
      var self = this;

      return self.name = name;
    }, TMP_1.$$arity = 1);

    return (Opal.defn(self, '$admin?', TMP_2 = function() {
      var self = this;

      return self.name['$==']("Admin");
    }, TMP_2.$$arity = 0), nil) && 'admin?';
  })($scope.base, null);
  user = $scope.get('User').$new("Bob");
  self.$puts(user.$name());
  return self.$puts(user['$admin?']());
})(Opal);
#+END_SRC


*** Tools

- Test
  - Minitest
  - RSpec
- Code Quality
  - RuboCop
- DB Adapter
  - ActiveRecord
  - Ruby Object Mapper, pure ruby objects.

- Background Jobs
  - Sidekiq. (requires Redis; Ruby)
  - SuckerPunch. (in memory; Ruby)
  - Beanstalkd. (in memory; C)

  Q: Job Queue or Message Queue?

  - RabbitMQ
  - Kafka

  (exactly once, at most once, at least once)

- Concurrency - a mystery in the Ruby ecosystem.

  - GIL
    #+BEGIN_SRC ruby
    array = []

    5.times.map do
      Thread.new do
        1000.times do
          array << nil
        end
      end
    end.each(&:join)

    puts array.size
    #+END_SRC
  - future, promise
    - concurrent-ruby
    #+BEGIN_SRC ruby
    require 'concurrent'

    p = Concurrent::Promise.new do
      # do your heavy work
      sleep(3)
      42
    end.then do |x|
      # another heavy work
      sleep(1)
      x * 2
    end.then do |result|
      result - 10
    end.execute

    # do heavy work in the current thread, like
    sleep(5)

    p.value # => blocks until the value is available: fulfilled or rejected.
    #+END_SRC

  - actor model. Let it Crash!
    - celluloid (example below)
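    For example, a sketch from memory of the celluloid gem's actor API (the `Worker` class and its method are made up for illustration):
    #+BEGIN_SRC ruby
    require 'celluloid/current'

    class Worker
      include Celluloid # each Worker instance becomes an actor with its own thread

      def heavy(n)
        sleep(1)
        n * 2
      end
    end

    w = Worker.new
    w.async.heavy(21)           # fire-and-forget; if it crashes, the actor dies
    future = w.future.heavy(21) # runs in the actor's thread
    future.value                # => 42, blocks until the actor replies
    #+END_SRC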



** Ruby on Others

*** IoT

- artoo

#+BEGIN_SRC ruby
connection :arduino, :adaptor => :firmata, :port => '/dev/ttyACM0'
device :led, :driver => :led, :pin => 13
device :button, :driver => :button, :pin => 2

work do
  on button, :push => proc {led.toggle}
end
#+END_SRC

*** DevOps

- Puppet
- Chef
- Vagrant

** Want More?

Google 'Awesome Ruby'!
--------------------------------------------------------------------------------
/sharing/tidb-in-wacai/README.md:
--------------------------------------------------------------------------------
---
typora-copy-images-to: .
---

# Evaluating and Adopting TiDB at Wacai

### About Wacai

Wacai was founded in June 2009 and was among the first in China to offer a personal bookkeeping and finance tool. It has since grown into a full internet wealth-management platform covering bookkeeping, money management, investing, credit, community, and more.

### What this article covers

1. The difficulties our current tech stack ran into.
2. How we compared and selected a new storage system.
3. How TiDB behaves in our real-world scenarios.
4. What we plan to build around TiDB next.


### The current tech stack

HBase is our main storage system here, providing OLTP capability. The HBase interface is simple, but integrating with it takes real work, and you first have to learn HBase concepts, a burden business teams should be spared. So we put Phoenix on top of it to provide a SQL interface and let teams ship quickly. Before I joined Wacai last year, this setup had run for about a year with essentially no major incidents.
As the business grew, problems started to surface, most of them rooted in Phoenix's own design. For example:

- A failed index write could take HBase down and leave the index data inconsistent.

- The meta table became an access hotspot (by default every request reads the latest meta information).

- Setting `update_cache_frequency` in turn caused meta information to fall out of sync.

- Adding an index even crashed the cluster once.

That is only the tip of the iceberg; we spent an enormous amount of time chasing these problems. Our internal fork drifted further and further from the official release, upgrades became difficult, the community is tiny, and we had few people on it (really just two or three of us; I was the most focused on it, and the more of its code I read, the more Phoenix disappointed me). We realized Phoenix was not a SQL solution we could rely on long term.

Besides, HBase is a CP system, so its availability is nothing to boast about. So at the end of last year, after stabilizing the HBase workloads, we started investigating alternatives.

### Evaluating and choosing a new store

At the start of the evaluation we listed the requirements a candidate had to satisfy:

1. Large capacity with horizontal scaling.
2. Good availability.
3. Distributed transactions, to keep index data consistent.
4. Ideally, a SQL interface.

At the time only two candidates really fit: TiDB and CockroachDB. (Today the list might also include FoundationDB.)

After reading a good number of articles on TiDB and CockroachDB, I leaned toward TiDB, for several reasons:

1. It speaks the MySQL protocol, so the company's existing MySQL tooling plugs in almost seamlessly. CRDB speaks the PG protocol, but our teams were already used to MySQL.
2. TiDB and CRDB chose different approaches to the [global clock](https://pingcap.com/blog/Time-in-Distributed-Systems/), and TiDB's choice fits the company's current situation better.
3. Finally, how well we could support it ourselves. I had read part of the TiKV code before (to learn Rust and Raft) and was confident I could localize most storage-layer problems quickly.

We then ran comparison tests between TiDB and CRDB (both on 1.x versions; concretely, we deployed tidb and crdb on the same machines and used sysbench to measure write and TP query performance). Hands-on, we observed the following:

1. TiDB is somewhat more troublesome to deploy; CRDB has fewer components and fewer configuration parameters. TiDB ships ansible scripts for one-command deployment, but with that many components you still need to pay attention.
2. CRDB was less stable than TiDB. During testing CRDB hit OOMs, and its aggregate-query times varied wildly, sometimes by as much as 2x. TiDB, by contrast, was rock solid: repeated queries performed very consistently. My guess is this has to do with TiDB's storage layer being written in Rust.
3. TiDB (with `sync_log` disabled) won the write benchmark outright, which we had not expected at all. CRDB, whether because of its internals or because it lacks a sync_log-like knob, managed only a few thousand QPS on writes. (The latest CRDB 2.0 claims greatly improved write performance; interested readers can re-test against the new version.)
4. On queries (point and range), the tests showed the two fairly close.

In the end we chose TiDB. TiDB gave us:

1. Raft, which guarantees consistency while raising the availability of the whole system.
2. 2PC transactions, keeping data and indexes consistent.
3. Online, asynchronous schema changes: efficient and 'safe' schema operations, maximally ops-friendly.

These properties solve exactly the weaknesses of Phoenix/HBase (or rather, those problems simply do not exist).

### Rolling out TiDB

We deployed with the official tidb-ansible scripts, adapted a bit to fit our internal ops environment. The deployment:

| Machine                            | pd   | tikv | tidb |
| ---------------------------------- | ---- | ---- | ---- |
| server1 (40C, 252 GB, 6 * 2T SATA) | 1    | 3    | 1    |
| server2 (40C, 252 GB, 6 * 2T SATA) | 1    | 3    | 1    |
| server3 (40C, 252 GB, 6 * 2T SATA) | 1    | 3    | 1    |

After deploying, we faked some read/write traffic with a load script against the tidb cluster, and it ran stably in production for over a month. It went through two rc version upgrades in that period, both fairly smooth. During pd and tikv upgrades, a small number of requests see higher latency due to retry backoff, anywhere from a few hundred milliseconds to a few seconds.

TiDB is now gradually being adopted across our department. An internal device-identification system (index-heavy, with frequent writes and updates) has been switched from Phoenix/HBase to TiDB and has run stably in production for over a month, with tidb holding steady around 500 QPS over roughly 300 GB of data. I will not go into the switchover details here; the end result is a more stable service and trivially easy schema changes, which sped the business up considerably. Other workloads in the department are being migrated as well.

![image-20180528163849477](image-20180528163849477.png)

![image-20180528164446453](image-20180528164446453.png)



In day-to-day use we also hit some small potholes. We ran into one or two MySQL-protocol compatibility issues, nothing important: reported upstream, they were fixed quickly. The one that left an impression on me: we observed TiKV KV Engine Seek latencies of several milliseconds in production. With help from the PingCAP team, tuning the `tikv_gc_concurrency` and `tikv_gc_run_interval` parameters brought things back to normal. Our data is updated frequently, and if gc cannot keep up, accumulated historical versions cause exactly this problem.

### Future plans

We are considering moving some AP workloads onto TiDB. There are two paths:

1. Attach TiSpark to the TiDB cluster for real-time analytical queries. TiSpark makes sure Spark's queries hit TiKV at the lowest priority, so TP traffic is served first. Upside: zero setup cost. Downside: weak isolation; we will judge the actual effect after real-world use.
2. Use binlog replication to build a standby cluster and route all AP queries to it. Upside: physical isolation. Downside: the binlog replication solution is heavyweight and not yet open source, so its availability is unknown.

Further out, we are considering building Redis-protocol storage on top of TiKV. TiKV's stability, scalability, simple KV interface, and distributed transactions make richer storage models possible. After all, our journey is to the sea of stars.

### Thanks

Finally, many thanks to the PingCAP team for the tremendous support they gave us while we adopted TiDB.
--------------------------------------------------------------------------------
/sharing/tidb-in-wacai/image-20180528163849477.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/sharing/tidb-in-wacai/image-20180528163849477.png
--------------------------------------------------------------------------------
/sharing/tidb-in-wacai/image-20180528164446453.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/sharing/tidb-in-wacai/image-20180528164446453.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/compaction-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/compaction-1.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/compaction-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/compaction-2.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/compaction-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/compaction-3.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/leveled-compaction-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/leveled-compaction-1.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/leveled-compaction-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/leveled-compaction-2.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/leveled-compaction-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/leveled-compaction-3.png
--------------------------------------------------------------------------------
/storage/lsm-tree/assets/images/leveled-compaction-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/lsm-tree/assets/images/leveled-compaction-4.png
--------------------------------------------------------------------------------
/storage/lsm-tree/compaction-strategy.md:
--------------------------------------------------------------------------------
## LSM Compaction Strategies

### Size-Tiered Compaction Strategy (STCS)

> via http://www.scylladb.com/2018/01/17/compaction-series-space-amplification/

Memtables are flushed to disk periodically, each flush producing a fairly small sstable.

When the number of these small sstables reaches some threshold (say, 4), the four sstables are compacted into one somewhat larger sstable.

When the somewhat larger sstables in turn reach that count, they are compacted together into an even larger sstable.

> In practice it is more subtle, because compacting several sstables together can yield a smaller result: if the key overlap across the inputs is high (keys are updated often), the merged sstable does not gain many keys.

![](assets/images/compaction-1.png)
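To make the tiering concrete, a toy simulator that tracks sstable sizes only (real compaction also merges duplicate keys, so outputs can shrink, as the note above says):

``` ruby
# Whenever a tier accumulates `threshold` sstables, merge them into one
# sstable on the next tier. Sizes are in MB; keys are not modeled.
def flush(tiers, memtable_bytes, threshold = 4)
  tiers[0] << memtable_bytes
  i = 0
  while i < tiers.size
    if tiers[i].size >= threshold
      merged = tiers[i].sum          # may be smaller if keys overlap
      tiers[i] = []
      (tiers[i + 1] ||= []) << merged
    end
    i += 1
  end
  tiers
end

tiers = [[]]
16.times { flush(tiers, 32) } # sixteen 32MB memtable flushes
tiers # => [[], [], [512]]    # 4 x 32MB -> 128MB, then 4 x 128MB -> 512MB
```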

#### The problems it brings

**Space amplification**

- While a compaction is running, the old sstables cannot be deleted until the new one is fully written, so compaction temporarily occupies up to twice the space of the data actually stored.

![](assets/images/compaction-2.png)


- If keys are updated very frequently, the same key ends up stored many times, in sstables within one tier and across tiers.

![](assets/images/compaction-3.png)

### Leveled Compaction Strategy (LCS)

> via http://www.scylladb.com/2018/01/31/compaction-series-leveled-compaction/

This reduces the **space amplification** of Size-Tiered (ST) compaction, and also reduces **read amplification** (the average number of disk reads needed per read request).



In ST, lower tiers of sstables combine upward into ever larger sstables.

Leveled instead splits each level (except L0) into sstables of **roughly equal size** (hundreds of MB) whose **keys do not overlap**.

1. As memtables flush into L0, once L0 holds enough sstables (the higher the level, the larger the limit, usually growing by powers of 10), those sstables are compacted upward together with all of L1's sstables, producing a new L1 (still equal-sized sstables with non-overlapping keys).

2. If L1 then exceeds its sstable limit (10), one excess sstable from L1 is compacted upward.

   Each L1 sstable covers roughly 1/10 of the whole key range, and each L2 sstable roughly 1/100, so that one L1 sstable overlaps about 10 sstables in L2. Only those ~10 L2 sstables need to be compacted with it.

3. And so on, up to the highest level.

![](assets/images/leveled-compaction-1.png)

#### How it fixes space amplification

In STCS, space amplification has two main causes:

- during compaction, disk usage temporarily doubles;
- frequent updates to the same key leave copies of it in many different sstables.



For the first cause: with LCS, a compaction always involves about 11 sstables, so the extra space it needs is roughly a constant, `11 * size_of_per_sstable`.

![](./assets/images/leveled-compaction-2.png)

For the second cause: with LCS, most of the data lives in the highest level, and within each level sstable keys never overlap.

Say the highest level is L3 with 1000 sstables; the two levels below hold only about 110 sstables in total, so about 90% of the data sits in L3 and at most 10% of it is duplicated (L1 and L2 being updates to data in L3). That is the best case.

The worst case is when L3 has only as many sstables as L2: L3 then holds about 50% of the data, all of which may be duplicated, for a space amplification of roughly 2x.

> (When L3 has fewer sstables than L2, or many more, the duplication rate cannot reach 50%.)

[Optimizing Space Amplification in RocksDB](http://cidrdb.org/cidr2017/papers/p82-dong-cidr17.pdf) notes that this case can be mitigated by **keeping the last level 10x the size of the level before it**.

![](assets/images/leveled-compaction-3.png)

#### The problem it brings

**Write amplification**: the total bytes written to disk per write request.

Every write first hits a commit log; all strategies share that cost, so set it aside.

Now look at compaction (assuming writes never overwrite existing keys):

- STCS: pick several sstables (say `X` bytes in total) to compact; the result is also about X bytes.
- LCS: pick one sstable (say X bytes) and compact it with about 10 sstables on the next level; the result is about `11 * X` bytes.

So in the worst case, per compaction, LCS writes about 11 times as much as STCS.

Of course, some write patterns never reach this worst case: for example, data that is quickly updated again, or workloads that rarely add new data or modify old data. In those cases LCS spends most of its time compacting the low levels (L0 -> L1), and the compacted data does not trigger cascading compactions up into higher levels.

Even so, LCS is a poor fit for most workloads where writes cannot be avoided, let alone write-heavy ones.

Even if writes are only 10% of the total, once they are amplified, say 11x, they can easily eat all the disk bandwidth (rough numbers below). Then:

- read requests slow down;
- if compaction cannot get the disk bandwidth it needs, L0 sstables pile up, slowing reads further (too many sstables means read amplification).
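A back-of-envelope check on that claim, assuming reads are not amplified and each logical write costs about 11x its size in compaction IO (both simplifications):

``` ruby
write_fraction = 0.10                 # writes are 10% of requests
write_amp      = 11                   # worst-case LCS write amplification
write_io = write_fraction * write_amp # => 1.1 units of disk IO
read_io  = 1 - write_fraction         # => 0.9 units
write_io / (write_io + read_io)       # => 0.55: writes take over half the bandwidth
```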


So while LCS fixes STCS's space amplification, it brings much worse write amplification, and in the end both reads and writes suffer.



--------------------------------------------------------------------------------
/storage/tidb/images/executors.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/tidb/images/executors.jpg
--------------------------------------------------------------------------------
/storage/tidb/images/expression.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/tidb/images/expression.jpg
--------------------------------------------------------------------------------
/storage/tidb/images/sql-core-layer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/tidb/images/sql-core-layer.png
--------------------------------------------------------------------------------
/storage/tidb/images/tidb3-all.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/tidb/images/tidb3-all.jpg
--------------------------------------------------------------------------------
/storage/tidb/images/tidb3.0.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/tidb/images/tidb3.0.jpg
--------------------------------------------------------------------------------
/storage/tidb/images/tidb3.graffle:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nanne007/notes/f55cb8e0bd451bec028b2a09d39cff103108b78c/storage/tidb/images/tidb3.graffle
--------------------------------------------------------------------------------
/storage/tidb/tikv-intro.md:
--------------------------------------------------------------------------------


# TiDB

![TiDB Family](https://pingcap.com/images/blog/sparkontikv.png)



- [x] Overview
- [x] Storage - RocksDB
- [x] Replication - Raft protocol
- [x] Schedule - PD
- [x] SQL Layer
- [ ] Transaction - 2PC/MVCC
- [ ] Monitoring - Prometheus && Grafana

------

## Overview

**What went wrong with HBase**

- Poor availability
  - GC
  - long recovery times
- No cross-row transactions - Mistake of Jeff Dean
  - which confines it to being a KV database

**What to consider**

- Consistency: do we need linear consistency across the whole system, or can we tolerate brief inconsistency and support only eventual consistency?
- Stability: can we keep the system running 7 x 24? Is availability four nines, or five? If machines break or disaster strikes, can the system recover automatically?
- Scalability: as data keeps growing, can adding machines rebalance data automatically, without affecting external service?
- Distributed transactions: do we need them, and up to which isolation level?

------



**Availability**

- Rust
  - static language.
  - no GC.
  - memory safe: avoids dangling pointers and memory leaks.
  - thread safe: no data races.
  - package manager.
  - C bindings, zero-cost.
  - not so easy to learn.
  - not as many libraries as Java or C++.
- Multiple replicas (Raft)
  - A write succeeds only after **a majority of the replica nodes have written it**.
  - When one replica node goes down, requests can be switched to another replica.

**Cross-row transactions -** 2PC

Cross-row transactions make a SQL-based database possible.

**GRPC API**

- Get
- Scan
- BatchGet
- Prewrite
- Commit

------

## Data Storage

![Storage Stack](https://pingcap.com/images/blog/storage-stack1.png)

![Key Space](https://pingcap.com/images/blog/key-space.png)

![Storage Stack3](https://pingcap.com/images/blog/storage-stack3.png)

------

## Data Replication

### Raft, the consensus protocol

> A superb slide deck: http://thesecretlivesofdata.com

### Raft in TiKV

![](https://pingcap.com/images/blog-cn/tikv-architecture.png)

- Each Region has three replicas (Replica), which together form a Raft Group.

  (Replicas live on different TiKV nodes; the Leader serves reads/writes, and the Followers replicate the raft log sent by the Leader.)

- For a write request arriving at a Region Leader, the data reaches consensus across the three replicas via the Raft protocol (see the quorum sketch below).

- The whole KV space is made up of many Raft Groups.
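The quorum arithmetic behind "a majority of replicas": with `2f + 1` replicas, the group keeps working while up to `f` of them are down or partitioned away. A one-liner to make that concrete:

``` ruby
# Smallest number of replicas that must acknowledge a write.
def quorum(replicas)
  replicas / 2 + 1
end

quorum(3) # => 2: tolerates 1 lost replica
quorum(5) # => 3: tolerates 2 lost replicas
```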

### About CAP

> Choose Availability or Consistency when Partitioned.

- Under the premise of consistency, make availability as high as possible. [why you should pick strong consistency whenever possible - from the Google Cloud blog](https://cloudplatform.googleblog.com/2018/01/why-you-should-pick-strong-consistency-whenever-possible.html)
- Raft tolerates a minority of nodes going offline (crashing or becoming unreachable).



### Disaster tolerance

- **Single DC, multiple replicas**, each replica on a different node.
  - Tolerates single-node failure.
  - Commits need no cross-DC log replication, so the impact on transaction latency is minimal.
- **Three DCs in one city**, one replica per DC.
  - Tolerates single-DC failure.
  - Commits replicate logs between same-city DCs; the impact on transaction latency is small.
- **Two cities, three DCs**, with 3 or 5 replicas.
  - Tolerates single-DC failure, and even failure of the city hosting a single DC.
  - Commits usually replicate between same-city DCs, with small latency impact; but once one of the two same-city DCs fails, commits must replicate across cities, with a large latency impact.
- **Three cities, three DCs**, usually with 5 replicas.
  - Tolerates city-level failure.
  - Commits replicate logs across cities, with a large impact on transaction latency.

------

## Data Scheduling

### Why schedule

Load and placement:

- Keep requests evenly distributed across nodes.
- Keep storage usage even across nodes.
- Keep data access hotspots evenly distributed.
- Keep the replica count exactly right, with replicas placed on different machines.
- Keep cluster balancing from disturbing the online service.

Scaling out and in:

- After adding or decommissioning nodes, keep the data evenly distributed.

Disaster tolerance:

- When a minority of nodes fail, keep the service running with balanced load and placement.
- In cross-datacenter deployments, when one datacenter goes offline, lose no data and, ideally, keep serving normally.

### Basic scheduling operations

- Add a Replica
- Remove a Replica
- Transfer the Raft Leader from one Replica to another

### Collecting cluster state

Collect information about every TiKV node, plus status information about every Region.

- Information each node reports periodically:
  - total disk size and available disk capacity.
  - number of Region Replicas it holds.
  - data write rate.
  - whether it is overloaded.
  - Label information.
  - etc...
- Information each Region's Raft Leader reports:
  - which nodes the Leader/Followers live on.
  - number of offline Replicas.
  - read and write rates of the Region.
  - etc...

**External information fed in through the admin interface enables more accurate decisions.**

- Proactively decommission a node.
- Proactively migrate certain hot Regions.
- etc...

### Scheduling policies

- **Each Region has the correct number of Replicas**
- **The Replicas of one Raft Group are not in the same ==location==**
- **Replicas are spread evenly across Stores**
- **Leader counts are spread evenly across Stores**
- **Access hotspots are spread evenly across Stores**
- **Storage usage is roughly equal across Stores**
- **Scheduling speed is throttled so the online service is not disturbed**
- etc. (custom scheduling policies can be implemented in PD)

### How scheduling works

1. PD keeps collecting information from Store and Leader heartbeats, giving it a detailed view of the whole cluster.
2. Each time it receives a heartbeat from a Region Leader, it **==generates a sequence of scheduling operations==** based on that view and the policies above.
3. The operations are returned to the Region Leader in the heartbeat response, and their execution is monitored through subsequent heartbeats.

------

## SQL Layer

![SQL to Key-Value](https://pingcap.com/images/blog/sql-kv.png)

TiDB assigns:

- each table a TableID,
- each index an IndexID,
- each row a RowID (if the table has an integer primary key, its value is used as the RowID).
- TableID is unique across the whole cluster; IndexID/RowID are unique within a table; all of these IDs are int64.



Record Data: `t_${table_id}_r_${handle}` => `${v1}${v2}${..}`

Uniq Index Data: `t_${table_id}_i_${index_id}${v1}${v2}` => `${handle}`

Non-Uniq Index Data: `t_${table_id}_i_${index_id}${v1}${v2}${v...}${handle}` => `null`
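A toy version of that mapping. The real keys are produced by `tablecodec` with memcomparable encodings, so the plain string interpolation below is illustrative only:

``` ruby
def record_key(table_id, handle)
  "t_#{table_id}_r_#{handle}"
end

def unique_index_key(table_id, index_id, *values)
  "t_#{table_id}_i_#{index_id}_#{values.join('_')}"
end

def non_unique_index_key(table_id, index_id, handle, *values)
  "t_#{table_id}_i_#{index_id}_#{values.join('_')}_#{handle}"
end

# For a hypothetical table t(id BIGINT PRIMARY KEY, name VARCHAR, UNIQUE KEY(name)):
record_key(42, 7)                    # => "t_42_r_7"        -> the row's values
unique_index_key(42, 1, 'alice')     # => "t_42_i_1_alice"  -> 7 (the handle)
non_unique_index_key(42, 2, 7, 'x')  # => "t_42_i_2_x_7"    -> null
```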

------

## Coprocessor

![img](images/sql-core-layer.png)



Execution plan pushdown:

![img](images/executors.jpg)

![img](images/expression.jpg)

----------

## The Future - TiDB 3.0

![tidb3.0](images/tidb3.0.jpg)

![tidb3-all](images/tidb3-all.jpg)



## TODO: Transaction - 2PC/MVCC

## Encoding Format

DB

- `m + DBs + h + DB:[db_id]` => `TiDBInfo`
- `m + DB:[db_id] + h + Table:[tb_id]` => `TiTableInfo`

```json
{
  "id":130,
  "db_name":{"O":"global_temp","L":"global_temp"},
  "charset":"utf8",
  "collate":"utf8_bin",
  "state":5
}
```

```json
{
  "id": 42,
  "name": {
    "O": "test",
    "L": "test"
  },
  "charset": "",
  "collate": "",
  "cols": [
    {
      "id": 1,
      "name": {
        "O": "c1",
        "L": "c1"
      },
      "offset": 0,
      "origin_default": null,
      "default": null,
      "type": {
        "Tp": 3,
        "Flag": 139,
        "Flen": 11,
        "Decimal": -1,
        "Charset": "binary",
        "Collate": "binary",
        "Elems": null
      },
      "state": 5,
      "comment": ""
    }
  ],
  "index_info": [],
  "fk_info": null,
  "state": 5,
  "pk_is_handle": true,
  "comment": "",
  "auto_inc_id": 0,
  "max_col_id": 4,
  "max_idx_id": 1
}
```



## Code Directory Guide

#### structure

TxStructure wraps `kv.Retriever` and `kv.RetrieverMutator` to operate on `string`, `list`, and `hash` types.

#### meta

Meta: wraps `TxStructure` to operate on meta information.
Id Allocator: id generator wrapper.

#### model

The JSON format in which data structures such as schema, table, and index are stored in the underlying kv store.

#### table

Abstractions for tables, indexes, and columns, used to manipulate data.

#### tablecodec

Data encoding.

#### terror

Error codes.

#### types

Wrappers for MySQL types.

#### distsql

A wrapper around `kv.Response`.

- streaming (`stream.go`)
  calls `kv.Response`'s `Next` method; the returned data is `tipb.StreamResponse` (one chunk at a time).
- select (`distsql.go`)
  calls `kv.Response`'s `Next` method; the returned data is `tipb.SelectResponse` (many chunks).

TODO:

- [ ] implementation of `kv.Response`
--------------------------------------------------------------------------------
/storage/tikv/CODE_READING.md:
--------------------------------------------------------------------------------


storage/mod -> Storage.start

- storage/txn/scheduler
- storage/engine
  - rocksdb
  - raftstore/store/engine




### MVCC TXN



- `lock_key`: Put(key, (lock_type, short_value, primary, start_ts)) => CF_LOCK
- `unlock_key`: Delete(key) => CF_LOCK
- `put_value`: Put(key, Value) => CF_DEFAULT
- `delete_value`: Delete(key) => CF_DEFAULT
- `put_write`: Put(key, Value) => CF_WRITE
- `delete_write`: Delete(key) => CF_WRITE



`get`: MvccReader.get(key)
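A sketch of how these primitives compose into Percolator-style Prewrite/Commit, the flow TiKV's transactions follow. This Ruby stands in for the real Rust scheduler; the three hashes below model the column families, and everything else (conflict handling, rollbacks, short values) is omitted:

```ruby
CF_LOCK, CF_DEFAULT, CF_WRITE = {}, {}, {}

# Prewrite: lock every key and stage its value at start_ts.
def prewrite(mutations, primary, start_ts)
  mutations.each do |key, value|
    raise 'write conflict' if CF_LOCK[key]                   # someone holds a lock
    CF_LOCK[key] = { primary: primary, start_ts: start_ts }  # lock_key
    CF_DEFAULT[[key, start_ts]] = value                      # put_value
  end
end

# Commit: write a commit record pointing back at start_ts, then unlock.
def commit(keys, start_ts, commit_ts)
  keys.each do |key|
    CF_WRITE[[key, commit_ts]] = { start_ts: start_ts }      # put_write
    CF_LOCK.delete(key)                                      # unlock_key
  end
end

prewrite({ 'a' => 1, 'b' => 2 }, 'a', 10)
commit(['a', 'b'], 10, 11)
CF_WRITE # => {["a", 11]=>{:start_ts=>10}, ["b", 11]=>{:start_ts=>10}}
```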


### Store



#### KV Engine DB



##### CF_RAFT

**Region State:**

- Key: `0x01(LOCAL_PREFIX), 0x03(REGION_META_PREFIX), region_id, 0x01(REGION_STATE_SUFFIX)`, scanning up to `0x01(LOCAL_PREFIX), 0x04(REGION_META_PREFIX+1)`.

- Value:

  ```protobuf
  message RegionLocalState {
      PeerState state = 1;
      metapb.Region region = 2;
  }
  ```

**Raft Apply State:**

- Key: `0x01(LOCAL_PREFIX), 0x02(REGION_RAFT_PREFIX), region_id, 0x03(APPLY_STATE_SUFFIX)`

**Snapshot Raft State:**

- Key: `0x01(LOCAL_PREFIX), 0x02(REGION_RAFT_PREFIX), region_id, 0x04(SNAPSHOT_RAFT_STATE_SUFFIX)`

When the peer state in RegionLocalState is Tombstone, clean this data up:

- delete the Region State for that region_id from the KV DB;
- delete the Raft Apply State for that region_id from the KV DB;
- delete the Raft Log for that region id from the Raft DB;
- delete the Raft State for that region id from the Raft DB;
- set the Region State's peer state to tombstone. (?)

When the peer state in RegionLocalState is Applying: ?



#### Raft Engine DB

**Raft State:**

- Key: `0x01(LOCAL_PREFIX), 0x02(REGION_RAFT_PREFIX), region_id, 0x02(RAFT_STATE_SUFFIX)`

- Value:

  ```protobuf
  message RaftLocalState {
      eraftpb.HardState hard_state = 1;
      uint64 last_index = 2;
  }
  ```

**Raft Log:**

- Key: `0x01(LOCAL_PREFIX), 0x02(REGION_RAFT_PREFIX), region_id, 0x01(RAFT_LOG_SUFFIX), raft_index`
--------------------------------------------------------------------------------
/working-with-socket/README.md:
--------------------------------------------------------------------------------
### Buffering

- How much data should one read (or write) at a time?
- Many small reads, or one read of a large amount of data?

#### Write buffer.

What happens when you write data on a TCP socket?

> Buffering allows calls to write to return almost immediately.
> Then, behind the scenes, the kernel can collect all the pending writes,
> group them and optimize when they're sent for maximum performance to avoid flooding the network.
> At the network level, sending many small packets incurs a lot of overhead,
> so the kernel batches small writes together into larger ones.

#### Read buffer.

> When calling `read`, Ruby may actually be able to receive more data than your limit allows.

So the pending data is buffered.

16KB is a reasonable minimum read length.


### Socket Options

- Socket#setsockopt -> setsockopt(2)
- Socket#getsockopt -> getsockopt(2)


reuse_addr.


### Non-blocking IO

- read_nonblock
- write_nonblock
- connect_nonblock
- accept_nonblock



> Given more than one process trying to accept a connection on different copies of the same socket,
> the kernel balances the load and ensures that one, and only one, copy of the socket will be able to accept any particular connection.

### Unix Process

- pid
- ppid
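Pulling these notes together: a small preforking echo server. The 16KB read size, `reuse_addr`, and the shared-listener accept balancing come from the notes above; the rest (port, worker count) is arbitrary.

``` ruby
require 'socket'

server = Socket.new(:INET, :STREAM)
server.setsockopt(:SOCKET, :REUSEADDR, true) # reuse_addr, as above
server.bind(Addrinfo.tcp('127.0.0.1', 4481))
server.listen(128)

4.times do
  fork do                                    # children share the listener;
    loop do                                  # the kernel balances accepts
      conn, _addr = server.accept
      begin
        loop do
          data = conn.readpartial(16 * 1024) # 16KB read buffer
          conn.write(data)
        end
      rescue EOFError
        conn.close
      end
    end
  end
end

Process.wait # the parent (the workers' ppid) just waits on its children
```
--------------------------------------------------------------------------------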