├── 2016_summary ├── 2016_read_book.png ├── 2016_summary.md ├── 2016_xmind.xmind ├── aliyun.png ├── arch.png ├── baby.jpg ├── data_center.png ├── dev.png ├── devops_others.png ├── docker_k8s.png └── zabbix.png ├── Better_RESTful_API.md ├── Istio_Linkerd.md ├── Microservices_Arch9.md ├── Microservices_Authentication_Authorization.md ├── OpenShift_Install.md ├── README.md ├── Systemtap.key ├── Systemtap_QuickStart.md ├── Web_Framework_Benchmark.md ├── api_gateway_open_source.md ├── arch_pearl.md ├── container.md ├── distributed_trace.md ├── distribution_trans.md ├── docker_store_image.md ├── dood_dind_k8s.md ├── go_swagger.md ├── img ├── caas.jpg ├── cf_garden.jpg ├── consistency_compensation.jpg ├── container_timeline.jpg ├── cs_tcc_cancel.jpg ├── cs_tcc_confirm.jpg ├── cs_tcc_exception.jpg ├── cst-gateway.png ├── cst.png ├── dbs.png ├── docker_libcontainer_runc.jpg ├── event_local_table.jpg ├── event_remote_table.jpg ├── mesos_uc.jpg ├── micro_srv_avoid_snow_slide.jpg ├── microservice_consistence.jpg ├── no_docker_runc.jpg ├── runc.png └── sso.png └── what-is-serverless.pdf /2016_summary/2016_read_book.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/2016_read_book.png -------------------------------------------------------------------------------- /2016_summary/2016_summary.md: -------------------------------------------------------------------------------- 1 | # 2016年总结 2 | 3 | ## 1. 引子 4 | 5 | ![img](2016_read_book.png) 6 | 7 | >(绿色星星的表示精读) 8 | 9 | 通过阅读的技术书籍也极好地反应出了2016年工作中涉及到的主要技术和工作的重点: 10 | 11 | 1. 接触Golang,进行了初步的系统学习,重构了Passport服务,为Docker和K8S的开展打了一个基础; 12 | 2. 使用Python作为主要编程语言,编写和维护了几个阿里OSS相关的服务; 13 | 3. 在Devops投入的精力最大,系统学习了Docker + Kubernets,以及其所涉及的相关技术Ansbile、Ceph、Zabbix监控等等;牵制精力最大的主要原因是Devops同事集体离职和自己首次接触Devops,对于各个环节上都是hello world的阶段; 14 | 4. 
由于整体系统采用了微服务架构,在微服务的架构使用、功能的合理划分、分布式事务、分布式调用跟踪等各个方面也遇到了比较大的挑战;
5. 由于陪产假和国庆假连在了一起,有了较多的空余时间,抽空阅读了《从0到1》,读完确实获得了较大的思维拓展;《三体》自不用说,尽管只读了一半,已经让人脑洞大开;

## 2. 工作

### 2.1 R&D && Arch

![](dev.png)

开始入门Golang语言,使用Beego框架重构了Passport服务;重拾Python,学习了Tornado框架以及相关的异步库,负责维护OSS相关的几个服务,通过精读《Python核心编程》,已经可以将Python列入常规编程语言,也逐步熟悉了RabbitMQ和MongoDB,逐渐地也能尝试帮助同事解决一些Python的疑难杂症,果然学习过C/C++,可以秒杀大部分其他语言的相关问题;

存在的不足:由于Golang在公司后台开发中属于非常小众的语言,小到只有Passport这个服务采用了Golang(历史原因),公司也只有自己一个人在使用Golang,Golang在实际项目中的应用有些偏少,没有可以沟通的同事,多数都是自己在自我驱动学习,包括对于Docker和K8S的源码阅读,也都没有进入到公司项目层次;另外自己还要负责Python相关的几个服务,学习Python也牵制了不少的学习精力。

![](arch.png)

整个系统采用微服务架构,每个服务的功能内聚得到了很好的保证,提供了统一的Restful API,不同的服务完全可以根据自己的喜好选择不同的语言和存储,部署和扩容上都灵活了很多,看起来一切都很美,但是分布式的固有复杂性却没有得到解决,分布式事务的处理依然是一个比较头痛的问题,无论采用哪种最终一致性方案,复杂性都比较大,需要重点关注;同时对于微服务之间调用链的可视化跟踪,尽管基于Google的Dapper论文已有相当多的开源方案,但是牵扯到各种语言实现和微服务逻辑,仍然需要投入一定的精力去解决;鉴于此,在微服务采用初期确定框架选型和统一跟踪方案是非常有必要的,可以参见[分布式跟踪系统调研](http://www.do1618.com/archives/757)。

其次微服务的Gateway除了具备服务发现基本功能外,对于安全鉴权、流量限速、流量统计等方面也需要提供相当的支撑,可以参考[开源的Gateway实现](http://www.do1618.com/archives/783)

### 2.2 Devops

Devops上无疑是投入精力最大的,短期内直接从0起步负责Devops的整体工作,到处都是各式各样的新名词新概念,而且还要在短期内达到可维护、可在生产环境中使用的程度,初期面临着比较大的挑战和压力;学习主要围绕以下几个方面:

#### 2.2.1 阿里云相关技术及熟悉

![](aliyun.png)

从阿里的ECS开始,全面熟悉了阿里相关的各种服务,数据类包括RDS、MongoDB和OSS存储;安全类包括安全组加固、先知扫描;涉黄涉政类的绿网使用;再到2016火得不能再火的直播;

总体体会上感觉阿里的管理功能做得非常不方便,工单响应的及时性和准确性上都有待较大的提高。中间遇到过两次安全组不生效、OSS权限设置不能正常工作的情况。

#### 2.2.2 Docker && Kubernetes(K8S)

![](docker_k8s.png)

Docker和K8S无疑是2016年的重中之重,各自的生态圈都有相当多的内容需要去关注和熟悉,且两者都经历着快速的发展,紧跟形势需要大量的精力投入;另一方面,两者的每次方案调整都可能导致CI/CD流程的调整和优化,牵一发而动全身。

Docker的入门使用相对简单,但是在生产环境中使用则需要考虑更多的方面:Dockerfile的优化、Docker Image在CI/CD层次上的组织、Docker Registry的存储、删除、安全和权限控制等各个方面都需要统筹考虑;另外基于Docker化的编译环境梳理、发布过程中的代码 lint、unit
test 和自动化的集成测试都需要针对不同的语言进行定制化维护,同时Jenkins中发布流程的Pipeline也需要Docker化的强力支持,可能用到Docker in Docker或者Docker outside of Docker之类的方案。

Kubernetes则是Google集群神器Borg的开源实现,凝聚了Google大规模集群管理的精华,自动调度、服务故障自动迁移、扩容缩容异常方便,但是从安装到集群维护都是重量级的工作,这也是很多公司不敢采用K8S的一个主要原因;Ansible的自动化安装、Etcd集群的搭建和维护、各种Overlay Network方案、集群监控和日志收集都需要投入相当的精力进行研究。

Kubernetes集群中的两个重头戏是集群监控和日志收集。集群监控方面,官方的公开方案是基于cAdvisor的Agent,采用Heapster进行统一收集,保存到InfluxDB,界面上采用Grafana展示,但是想要在生产环境中稳定运行,还是要注意各版本的兼容性和各种Crash Restart,一不小心就会踩到坑;容器的日志收集采用的是EFK方案,从各种格式日志的统一收集,到聚合,再到查询搜索、报表统计,各个环节都需要进行跟踪,否则会经常出现漏采集日志、日志过多时Fluentd狂占CPU、Kibana搜索导致Memory Overload的情况。

需要完善的地方:

1. EFK未针对Docker服务和K8S中各组件的日志进行收集,导致问题排查的时候仍然需要用Ansible命令去各Node上采集日志分析;
2. 容器长期运行中的日志也未进行统一的策略管理,导致长期运行的容器日志会越写越大,占用磁盘空间;
3. 针对K8S中的各种Event未进行针对性跟踪和告警,需要采用K8S的API Watch方式进行采集和分析,统一到Zabbix告警平台;
4. 对于业务Service的运行状态和连通性测试,只是提供了一个简单的页面,定时生成和刷新,功能比较简单且易用性不强;
5. Flannel的Overlay网络开销和kube-proxy基于iptables的转发,对延时敏感的业务还是有比较大的影响,应该尽量避免,特别是基于iptables的消息转发和封装,都不太利于问题的跟踪和排查;后期可以考虑Calico的三层路由网络;

对于K8S集群的优化还有相当多的工作要做,更多工作的开展就牵扯到对集群内部运作原理和源码层次的熟悉,需要有专门的人力进行持续的跟踪和优化;

#### 2.2.3 DataCenter(缓存集群和数据集群)

![](data_center.png)

数据中心的搭建和维护是整套系统的重点,包括MySQL集群的搭建和测试、MongoDB Cluster的搭建和测试、缓存集群Codis的搭建和测试;基础服务方面还包括RabbitMQ和Kafka;另外还有图数据库Neo4j和大数据Storm集群的搭建。

个人而言自己的重点在MySQL、Codis、RabbitMQ三个上,其他的还停留在简单接触的阶段;

需要完善的是MySQL的高可用、备份和恢复,以及Codis缓存的快速扩容和节点状态Health监控。数据中心的每个方案都需要经过时间的检验和大量的工作去完成,毕竟数据才是企业的源头和发动机。

#### 2.2.4 Zabbix监控系统

![](zabbix.png)

基于开源的Zabbix系统,功能包括了阿里ECS主机的常规监控、Nginx Status监控、MongoDB Status、MySQL Status、Codis(Redis Cluster)和Services Health Check的功能。

需要进一步完善的地方是没有完整实现对Services API的全面检测,导致部分业务不能正常工作的时候不能够第一时间定位和修复。改善的方案是部署一个简单的Client实现核心业务的模拟器,独立于整套环境之外对API响应的状态码和响应时间做针对性的监测,配合Zabbix的告警通知机制,可以方便地修复相关的故障业务。

#### 2.2.5 Devops其他工作

![](devops_others.png)
在Devops其他方面上整理一套基于Supervisord的监控和告警方案;CoreOS的相关内容熟悉,CoreOS现在已经修改名字为了Container Linux,提供了高度的安全性和K8S的高度集成,未来的CAAS角色不可估量;于此同时还维护了dev/test/pro三套环境的日常维护,从当初的各种救火,到现在的基本稳定,一次次的方案调整效果还是相当明显;采取走出去的方针,参见了多次的线下Meetup,不仅开阔了视野也结交了相当多的志同道合的好友。 91 | 92 | 93 | 94 | ### 2.3 Blog 95 | 96 | 2016年坚持着写Blog的习惯,从2月份开始,大概有100篇左右,虽然有些文章的质量有待提高,有总胜于无,一路坚持总有收获。 97 | 98 | 99 | ## 3. 生活 100 | 101 | 在生活2016年无疑是最充实的一年,经历着宝宝在妈妈肚子里一天天长大,一次次的胎动、一次次和打仗一样的孕检、到宝宝呱呱落地、满月、3个月.... 102 | 103 | 2016年的中心就是这个可爱的大宝宝,每次看到宝宝那萌萌的表情,感觉一切的等待和辛苦都是那么值得,那么幸福...... 104 | 105 | ![](baby.jpg) 106 | 107 | 108 | -------------------------------------------------------------------------------- /2016_summary/2016_xmind.xmind: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/2016_xmind.xmind -------------------------------------------------------------------------------- /2016_summary/aliyun.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/aliyun.png -------------------------------------------------------------------------------- /2016_summary/arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/arch.png -------------------------------------------------------------------------------- /2016_summary/baby.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/baby.jpg -------------------------------------------------------------------------------- /2016_summary/data_center.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/data_center.png -------------------------------------------------------------------------------- /2016_summary/dev.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/dev.png -------------------------------------------------------------------------------- /2016_summary/devops_others.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/devops_others.png -------------------------------------------------------------------------------- /2016_summary/docker_k8s.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/docker_k8s.png -------------------------------------------------------------------------------- /2016_summary/zabbix.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/2016_summary/zabbix.png -------------------------------------------------------------------------------- /Better_RESTful_API.md: -------------------------------------------------------------------------------- 1 | # Better RESTful API实践 2 | 3 | ## 1. 
资源采用名词,采用复数形式 4 | 5 | | Resource | GET read | POST create | PUT update | PATCH Partially update | DELETE | 6 | | --------- | ---------------------- | ---------------------------- | ---------------------------------------- | -------------------------------- | ---------------------- | 7 | | /cars | Returns a list of cars | Create a new car | Bulk update of cars | Bulk Partially update of cars | Delete all cars | 8 | | /cars/711 | Returns a specific car | Method not allowed (405) | Updates a specific car [200/204, or 404] | Partially Updates a specific car | Deletes a specific car | 9 | | | 200 or 404 | 201[ Location' header or 404 | 200/204 or 404 | | 200 or 404 | 10 | 11 | 尽量采用复数格式,不是绝对,如果对于全局唯一的配置,则可以采用单数格式。 12 | 13 | 14 | 15 | * Twitter: https://developer.twitter.com/en/docs/api-reference-index 16 | * Facebook: https://developers.facebook.com/docs/graph-api 17 | * LinkedIn: https://developer.linkedin.com/ 18 | * GitHub API: https://developer.github.com/v3/ 19 | 20 | ## 2. 采用子资源关联 21 | 22 | ``` 23 | GET /cars/711/drivers/ Returns a list of drivers for car 711 24 | GET /cars/711/drivers/4 Returns driver #4 for car 711 25 | ``` 26 | 27 | | Action | URI | DESC | 28 | | :----- | :--------------------- | :--------------------------------------- | 29 | | GET | /tickets/12/messages | Retrieves list of messages for ticket #12 | 30 | | GET | /tickets/12/messages/5 | Retrieves message #5 for ticket #12 | 31 | | POST | /tickets/12/messages | Creates a new message in ticket #12 | 32 | | PUT | /tickets/12/messages/5 | Updates message #5 for ticket #12 | 33 | | PATCH | /tickets/12/messages/5 | Partially updates message #5 for ticket #12 | 34 | | DELETE | /tickets/12/messages/5 | Deletes message #5 for ticket #12 | 35 | 36 | 是否需要采用多层次的关联,则需要根据子资源能否脱离父类资源独立存在。 37 | 38 | 39 | 40 | 如果对于某些资源的 **Action** 不能够对应到 **CRUD** 操作, 41 | 42 | 1. 比如对于资源的 activated,操作则可以映射成 PATCH 操作 43 | 2. 
比如对于GitHub上的gist点赞,则可以映射成资源的子属性

```
PUT /gists/:id/star
DELETE /gists/:id/star
```

3. 如果对于多个资源的 search,不能够映射到各个资源上

```
GET /search
```

## 3. 采用 HATEOAS

HATEOAS = **H**ypermedia **a**s **t**he **E**ngine **o**f **A**pplication **S**tate

```json
{
    "id": 711,
    "manufacturer": "bmw",
    "model": "X5",
    "seats": 5,
    "drivers": [
        {
            # "id": "23", no need
            "name": "Stefan Jauker",
            "links": [
                {
                    "rel": "self",
                    "href": "/api/v1/drivers/23"
                }
            ]
        }
    ]
}
```

## 4. 提供过滤、排序、字段选择和分页

**过滤**

```
GET /cars?color=red     Returns a list of red cars
GET /cars?seats<=2      Returns a list of cars with a maximum of 2 seats

gt – Greater than
lt – Less than
eq – Equal to
ge – Greater than or equal to
le – Less than or equal to
```

**排序**

```
GET /cars?sort=-manufacturer,+model
```

**字段选择**

```
GET /cars?fields=manufacturer,model,id,color
```

**分页**

```shell
GET /cars?offset=10&limit=5

# GitHub 存放在 HTTP Header 中,参见:https://developer.github.com/v3/#pagination
Link: ; rel="next",
      ; rel="last",
      ; rel="first",
      ; rel="prev",
```

[Github Traversing with Pagination](https://developer.github.com/v3/guides/traversing-with-pagination/)

Google API 开发中对于分页的处理方式如下(https://cloud.google.com/apis/design/design_patterns):

对于所有的**List**操作都应该支持分页,即使当前的结果集数量非常小,因为后续再为API补加分页支持是一个 **behavior-breaking** 的修改。在**List**操作中增加如下定义字段:

* page_token (string) 字段出现在 List 方法的请求消息中,表明客户端请求具体某一页的数据,该字段的内容必须是url-safe的base64编码,如果包含敏感数据则应该进行加密,服务端必须保证被篡改的page_token不能够访问到非预期暴露的数据:
  * require query parameters to be respecified on follow up requests
  * only reference server-side session
state in the page token
  * encrypt and sign the query parameters in the page token and revalidate and reauthorize these parameters on every call
* total_size (int32)
* page_size (int32),如果 page_size == 0,则由服务端决定返回的数目,当然服务端也可能有自己的限制条件,比如返回得比 page_size 更少;
* next_page_token (string) 字段出现在 List 方法的 Response Message 中,代表下一页数据的 Token,如果该字段为空 "",则表明没有后续的数据;

```protobuf
rpc ListBooks(ListBooksRequest) returns (ListBooksResponse);

message ListBooksRequest {
  string name = 1;
  int32 page_size = 2;
  string page_token = 3;
}

message ListBooksResponse {
  repeated Book books = 1;
  string next_page_token = 2;
}
```

## 5. API 版本号

版本号放在url中,便于日志收集后的过滤,同时版本号用整数 v1、v2,不采用 v2.5 的方式

```
/blog/api/v1
```

## 6. 创建和更新操作返回资源主体信息

如果采用 POST 创建资源,应该返回创建后的资源信息,HTTP Status Code == 201,并在 Header 中按 `Location = "Location" ":" absoluteURI` 填写创建资源后的URL地址;

如果采用 PUT 更新资源,则应该返回更新后的资源字段,或者附加 created_at 或 updated_at 等字段信息;

## 7. 采用HTTP Status Code处理错误

| code | Status                | Message                                  |
| ---- | --------------------- | ---------------------------------------- |
| 200  | OK                    | Everything is working                    |
| 201  | Created               | New resource has been created            |
| 204  | No Content            | The resource was successfully deleted    |
| 304  | Not Modified          | The client can use cached data           |
| 400  | Bad Request           | The request was invalid or cannot be served. The exact error should be explained in the error payload. E.g. "The JSON is not valid" |
| 401  | Unauthorized          | The request requires user authentication |
| 403  | Forbidden             | The server understood the request, but is refusing it or the access is not allowed. |
| 404  | Not Found             | There is no resource behind the URI. |
| 409  | Conflict              | Whenever a resource conflict would be caused by fulfilling the request. Duplicate entries, deleting root objects when cascade-delete not supported are a couple of examples. |
| 422  | Unprocessable Entity  | Should be used if the server cannot process the entity, e.g. if an image cannot be formatted or mandatory fields are missing in the payload. |
| 429  | Too Many Requests     | Too many requests hit the API too quickly. We recommend an exponential backoff of your requests. |
| 500  | Internal Server Error | API developers should avoid this error. If an error occurs in the global catch block, the stacktrace should be logged and not returned as response. |

使用 Error Payload:

```json
{
  "errors": [
    {
      "userMessage": "Sorry, the requested resource does not exist",
      "internalMessage": "No car found in the database",
      "code": 34,
      "more info": "http://dev.mwaysolutions.com/blog/api/v1/errors/12345"
    }
  ]
}
```

```json
{
  "code" : 1234,
  "message" : "Something bad happened :(",
  "description" : "More details about the error here"
}
```

```json
{
  "code" : 1024,
  "message" : "Validation Failed",
  "errors" : [
    {
      "code" : 5432,
      "field" : "first_name",
      "message" : "First name cannot have fancy characters"
    },
    {
      "code" : 5622,
      "field" : "password",
      "message" : "Password cannot be blank"
    }
  ]
}
```

```json
{
  "code":200,
  "status":"success",
  "data":
  {
    "lacksTOS":false,
    "invalidCredentials":false,
    "authToken":"4ee683baa2a3332c3c86026d"
  }
}

{
  "code":401,
  "status":"error",
  "message":"token is invalid",
"data":"UnauthorizedException" 258 | } 259 | ``` 260 | 261 | 262 | 263 | ## 9. 输入和输出尽可能JSON格式 264 | 265 | [XML API vs JSON API](http://www.google.com/trends/explore?q=xml+api#q=xml%20api%2C%20json%20api&cmpt=q),目前JSON API的调用已经大幅度领先于XML;尽可能返回JSON格式的数据,并保证返回的JSON格式数据具备良好的阅读性,例如默认设置参数(?pretty=true),方便与用户展示和调试;返回的JSON格式内容启用压缩功能;JSON格式推荐采用下划线分割page_size,而不是大小写格式pageSize; 266 | 267 | 268 | 269 | ## 10. API 访问限速 270 | 271 | RFC 6585 中 [429 Too Many Requests](http://tools.ietf.org/html/rfc6585#section-4) 定义了该种情况。 272 | 273 | - X-Rate-Limit-Limit - The number of allowed requests in the current period 274 | - X-Rate-Limit-Remaining - The number of remaining requests in the current period 275 | - X-Rate-Limit-Reset - The number of seconds left in the current period 276 | 277 | 278 | ##参考资料 279 | 280 | * [10 Best Practices for Better RESTful API](https://blog.mwaysolutions.com/2014/06/05/10-best-practices-for-better-restful-api/) 281 | * [ReST APIs | Best Practices & Security](https://blog.wishtack.com/rest-apis-best-practices-and-security/) 282 | * [Best Practices for Designing a Pragmatic RESTful API](http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api) 283 | * [ Node.js Restful API tutorial](http://howtocodejs.com/how-to/create-restful-api-node-js/) 284 | * [Google Cloud Plateform](https://cloud.google.com/apis/design/resources) 285 | * [Google Seller Rest Filter](https://developers.google.com/ad-exchange/seller-rest/reporting/filtering) 286 | * [API version should be included in the URL or in a header](http://stackoverflow.com/questions/389169/best-practices-for-api-versioning) 287 | * [GitHub API](https://developer.github.com/v3/) 288 | * [Striper](https://stripe.com/docs/api) 289 | * [XML API vs JSON API](http://www.google.com/trends/explore?q=xml+api#q=xml%20api%2C%20json%20api&cmpt=q) 290 | * [Architectural Styles andthe Design of Network-based Software Architectures](http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm) 291 | * [RESTful Service 
Best Practices](www.restapitutorial.com/media/RESTful_Best_Practices-v1_0.pdf) 292 | * [Richardson Maturity Model](https://martinfowler.com/articles/richardsonMaturityModel.html) -------------------------------------------------------------------------------- /Istio_Linkerd.md: -------------------------------------------------------------------------------- 1 | # Istio && Linkerd 2 | 3 | [TOC] 4 | 5 | ## 1. Istio 6 | 7 | ### 1.1 整体架构 8 | 9 | Istio为希腊语,意思是“启航”。 10 | 11 | ![](https://avatars3.githubusercontent.com/u/23534644?s=200&v=4) 12 | 13 | 主要特点: 14 | 15 | - HTTP、gRPC和TCP网络流量的自动负载均衡; 16 | - 提供了丰富的路由规则,实现细粒度的网络流量行为控制; 17 | - 流量加密、服务间认证,以及强身份声明; 18 | - 全范围(Fleet-wide)策略执行; 19 | - 深度遥测和报告。 20 | 21 | 22 | 23 | 设计目标: 24 | 25 | * Maximize Transparency:最大透明化,可自动注入到服务中,对于服务透明,运维配置既可 26 | 27 | * Incrementality: 能够增量增加新功能,扩展策略系统,完成新的控制策略 28 | 29 | * Portability: 可移植性,能够花费很少的代价在各种生态环境下运行 30 | 31 | * Policy Uniformity: 策略一致性,策略系统作为独特服务来进行维护 32 | 33 | ​ 34 | 35 | 36 | 37 | 从架构上 Istio 分成了两个层面: 38 | 39 | * Control Plane: Pilot/Mixer/Istio-Auth,主要用于控制数据流的控制工作;主要参与者为Google、IBM 40 | * Data Plane: Envoy,对于流经的数据流进行代理,并结合控制层的策略进行后续的动作; 主要参与者为 Lyft (Envoy) 41 | 42 | 43 | Istio 的底层使用了[Envoy](https://lyft.github.io/envoy/)。Envoy 是 Lyft 于2016年9月份开源的一种服务代理和通信总线,已用于生产系统中,"管理了上万台虚拟机间的一百多个服务,每秒可处理近两百万次请求”。在近期的GlueCon 2017大会上,来自IBM的Shriram Rajagopalan和来自Google的Louis Ryan介绍了 Istio 的 [技术细节](https://istio.io/talks/istio_talk_gluecon_2017.pdf)(PDF)。 44 | 45 | Envoy 以 Sidecar 的方式透明部署在应用 Service 前面,为服务提供了服务治理(Service Discovery)、断路器(Circuit Breaker)、流量管理分发(Traffic Management)等;将以前部署于 Service 中的普通规则和通用功能提炼到 Sidecar 中处理,极大地降低了 Service 开发的难度和成本。 46 | 47 | 在线体验 https://www.katacoda.com/courses/istio/deploy-istio-on-kubernetes 48 | 49 | ### 1.2 Pilot 50 | 51 | 核心组件 Pilot 主要用于控制 Envoy 进行流量管控,Pilot 管理Istio服务网格中的 Envoy 代理实例。Pilot 允许指定用于在Envoy代理之间路由流量的规则和故障恢复策略如超时(Timeout),重试(Retries)和熔断器(Circuit Breakers)。 52 | 53 | 通过 Pilot 设置的流量管理规则,服务 Service 可以方便实现基于比例和特定内容信息的流量分发。 54 | 55 | 56 | 57 | Pilot 的整体架构上分为 
`Rules API`、`Envoy API`、`Abstract Model` 和 `Platform Adapter`。

* Rules API: 为使用者提供 Restful 接口,方便通过接口控制流量规则策略
* Envoy API: 负责与各 Envoy 之间通信
* Abstract Model: 为不同平台的 Service Discovery 提供抽象层
* Platform Adapter: 为不同的平台提供适配器,例如基于 K8S 平台,适配器需要通过监听 API Server 上与流量管理相关规则的变化情况,比如 pod 注册信息、ingress 资源、第三方资源等。

流量管控规则可以参见 [Traffic Management Rules](https://istio.io/docs/reference/config/traffic-rules/)。

**Discovery and Load Balancing**

Pilot 作为服务发现和治理组件时,会提供 Restful 接口供 Envoy 获取服务后面对应的实例

```
- GET /v1/registration/:service
- GET /v1/registration/repo/:service_repo_name
- POST /v1/registration/:service
- DELETE /v1/registration/:service/:ip_address
- POST /v1/loadbalancing/:service/:ip_address
```

参见: [lyft discovery](https://github.com/lyft/discovery)

### 1.3 Mixer

Mixer 负责在服务网格上执行访问控制和使用策略,并收集Envoy代理和其他服务的遥测数据。Mixer 旨在改变层之间的界限,以减少系统复杂性,从服务代码中消除策略逻辑,并替代为让运维人员控制。

Mixer 提供三个核心功能:

- **前提条件检查**。允许服务在响应来自服务消费者的传入请求之前验证一些前提条件。前提条件可以包括服务使用者是否被正确认证,是否在服务的白名单上,是否通过ACL检查等等。
- **配额管理**。使服务能够在多个维度上分配和释放配额,配额被用作相对简单的资源管理工具,以便在争夺有限资源时在服务消费者之间提供一些公平性。限速是配额的例子。
- **遥测报告**。使服务能够上报日志和监控。在未来,它还将启用针对服务运营商以及服务消费者的跟踪和计费流。

这些机制的应用是基于一组 [属性](https://istio.io/docs/concepts/policy-and-control/attributes.html) 的,这些属性为每个请求物化到 Mixer 中。在Istio内,Envoy重度依赖Mixer。在网格内运行的服务也可以使用Mixer上报遥测或管理配额。(注意:从Istio pre0.2起,只有Envoy可以调用Mixer。)定义的具体公共属性参见 [Attribute vocabulary](https://istio.io/docs/reference/config/mixer/attribute-vocabulary.html)。

Istio 0.2 引入了新的 Adapter 模型 [Mixer Adapter Model](https://istio.io/blog/mixer-adapter-model.html),每个 Request 需要调用 Mixer 两次:Precondition Check 和 Report;

为了将构建服务的支撑功能与程序解耦合,Mixer 在 Envoy 与支撑服务之间充当了一个中间层,Mixer 本身则采用 Adapter 机制与各种支撑服务进行对接,同时允许运维人员注入和控制策略来管理服务与支撑服务的交互,运维人员可以决定数据流向哪个支撑服务进行认证、计费、监控等各种控制;

Adapters 为 Go 的 pkg
直接链接到 Mixer 的程序内,因此创建自己的 Adapter 也非常容易; 111 | 112 | 113 | 114 | 115 | 116 | **Handlers**: Configuring Adapters 117 | **Templates**: [Adapter Input Schema](https://istio.io/docs/reference/config/mixer/template/), [metric](https://istio.io/docs/reference/config/mixer/template/metric.html) 和 [logentry](https://istio.io/docs/reference/config/mixer/template/logentry.html)为两个常用的模板 118 | **Instances**: Attribute Mapping 119 | **Rules**: Delivering Data to Adapters 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | ### 1.4 istioctl 128 | 129 | istioctl 命令用于创建、修改、删除配置等相关的 istio 系统的资源,包括以下: 130 | 131 | * route-rule 连接到upstream的路由规则(client side) 132 | * ingress-rule ingress 相关的规则 133 | * egress-rule egress 相关的规则 134 | * destination-policy 连接到destination的策略 135 | 136 | 137 | 138 | ### 1.5 部署 Bookinfos on K8S 139 | 140 | [Install](https://istio.io/docs/guides/bookinfo.html) 手动注入方式: 141 | 142 | ``` 143 | /v1/registration/ 144 | 145 | [lyft discovery](https://github.com/lyft/discovery) 146 | ``` 147 | 148 | ```shell 149 | $ kubectl apply -f <(istioctl kube-inject -f samples/bookinfo/kube/bookinfo.yaml) 150 | 151 | $ kubectl get services 152 | $ kubectl get pods 153 | 154 | $ export GATEWAY_URL=$(kubectl get po -n istio-system -l istio=ingress -o 'jsonpath={.items[0].status.hostIP}'):$(kubectl get svc istio-ingress -n istio-system -o 'jsonpath={.spec.ports[0].nodePort}') 155 | 156 | $ echo $GATEWAY_URL 157 | 158 | $ curl -o /dev/null -s -w "%{http_code}\n" http://${GATEWAY_URL}/productpage 159 | ``` 160 | 161 | 162 | 163 | 在线体验 https://www.katacoda.com/courses/istio/deploy-istio-on-kubernetes 164 | 165 | 166 | 167 | ``` 168 | while true; do 169 | > curl -s https://2886795303-80-frugo01.environments.katacoda.com/productpage > /dev/null 170 | > echo -n .; 171 | > sleep 0.2 172 | > done 173 | 174 | ``` 175 | 176 | 177 | 178 | http_proxy=$L5D_INGRESS_LB curl -s http://hello 179 | 180 | 181 | 182 | ### 1.6. 
Proxy

#### 1.6.1 Envoy

> Envoy 是 Lyft 于2016年9月份开源的一种服务代理和通信总线,已用于生产系统中,"管理了上万台虚拟机间的一百多个服务,每秒可处理近两百万次请求"。在近期的GlueCon 2017大会上,来自IBM的Shriram Rajagopalan和来自Google的Louis Ryan介绍了 Istio 的 [技术细节](https://istio.io/talks/istio_talk_gluecon_2017.pdf)(PDF)。

#### 1.6.2 Linkerd

Linkerd 可以作为 Proxy 与 Istio 集成,当然也可以作为单独的产品与 Istio 同等使用。

#### 1.6.3 nginmesh

NGINX [nginmesh](https://github.com/nginmesh/nginmesh) 为 Istio 服务代理模块:模块本身采用 Golang 而非 C 编写,与作为 sidecar 模式运行的开源 NGINX 集成,并声称"占用的空间很小,是具备先进负载均衡算法的高性能代理,支持缓存、SSL 终端、使用 Lua 和 nginScript 的脚本功能、以及具备细粒度访问控制的各种安全功能。"

![nginmesh](https://res.infoq.com/news/2017/09/nginx-platform-service-mesh/zh/resources/121.png)

## 2. Linkerd

buoyant 公司的 blog [A SERVICE MESH FOR KUBERNETES](https://buoyant.io/2016/10/04/a-service-mesh-for-kubernetes-part-i-top-line-service-metrics/) 详细介绍了与 K8S 集成的各种步骤。

Linkerd 是一个提供弹性云端原生应用服务网格的开源项目。其核心是一个透明代理,可以用它来实现一个专用的基础设施层以提供服务间的通信,进而为软件应用提供服务发现、路由、错误处理以及服务可见性等功能,而无需侵入应用内部本身的实现。

Linkerd 启动后会在每个 Node 上通过 DaemonSet 起一个代理,或者以 Sidecar 的方式和程序部署在一起。通过导出 http_proxy 环境变量的方式,为 Application 提供透明的代理。HelloWorld 的 [yaml](https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world.yml) 文件。

**No external LoadBalancer IPs: minikube**

在 Minikube 中 NODE_NAME 不能够获取,具体操作可以参见: https://discourse.linkerd.io/t/flavors-of-kubernetes/53

minikube ip: 192.168.99.100

```shell
$ kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/linkerd.yml
$ kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml

# 通过http代理测试
$ OUTGOING_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="outgoing")].nodePort}')
# L5D_INGRESS_LB=http://192.168.99.100:$OUTGOING_PORT
$ L5D_INGRESS_LB=http://$(minikube ip):$OUTGOING_PORT
$ http_proxy=$L5D_INGRESS_LB curl -s http://hello
$ http_proxy=$L5D_INGRESS_LB curl -s http://world

# or
$ curl -v $L5D_INGRESS_LB/ -H 'Host: hello'

# 管理页面查看
$ ADMIN_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="admin")].nodePort}')
$ open http://$(minikube ip):$ADMIN_PORT
```

由于 Minikube 中不能够获取到真实的 NODE_NAME,因此 [HelloWorld](https://github.com/linkerd/linkerd-examples/tree/master/docker/helloworld) 的样例中通过在本地起一个 K8S Cluster 集群的代理 proxy,而 [hostIP.sh](https://github.com/linkerd/linkerd-examples/blob/master/docker/helloworld/hostIP.sh) 的脚本则通过自身的 POD_NAME 获取对应的IP地址。

```shell
$ cat hostIP.sh
#!/bin/sh

set -e

sleep 10
curl -s "${K8S_API:-localhost:8001}/api/v1/namespaces/$NS/pods/$POD_NAME" | jq '.status.hostIP' | sed 's/"//g'
```

[hello-world-legacy.yml](https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml) 完整的样例如下:(导出环境变量的方法参见:https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/)

```yaml
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: hello
spec:
  replicas: 3
  selector:
    app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      dnsPolicy: ClusterFirst
      containers:
      - name: service
        image: buoyantio/helloworld:0.1.4
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NS
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        command:
        - "/bin/sh"
        - "-c"
        - "http_proxy=`hostIP.sh`:4140 helloworld
-addr=:7777 -text=Hello -target=world" 289 | ports: 290 | - name: service 291 | containerPort: 7777 292 | - name: kubectl 293 | image: buoyantio/kubectl:v1.4.0 294 | args: 295 | - proxy 296 | - "-p" 297 | - "8001" 298 | --- 299 | apiVersion: v1 300 | kind: Service 301 | metadata: 302 | name: hello 303 | spec: 304 | selector: 305 | app: hello 306 | clusterIP: None 307 | ports: 308 | - name: http 309 | port: 7777 310 | --- 311 | apiVersion: v1 312 | kind: ReplicationController 313 | metadata: 314 | name: world-v1 315 | spec: 316 | replicas: 3 317 | selector: 318 | app: world-v1 319 | template: 320 | metadata: 321 | labels: 322 | app: world-v1 323 | spec: 324 | dnsPolicy: ClusterFirst 325 | containers: 326 | - name: service 327 | image: buoyantio/helloworld:0.1.4 328 | env: 329 | - name: POD_IP 330 | valueFrom: 331 | fieldRef: 332 | fieldPath: status.podIP 333 | - name: TARGET_WORLD 334 | value: world 335 | args: 336 | - "-addr=:7778" 337 | ports: 338 | - name: service 339 | containerPort: 7778 340 | --- 341 | apiVersion: v1 342 | kind: Service 343 | metadata: 344 | name: world-v1 345 | spec: 346 | selector: 347 | app: world-v1 348 | clusterIP: None 349 | ports: 350 | - name: http 351 | port: 7778 352 | ``` 353 | 354 | 355 | 356 | 删除 DaemonSet 357 | 358 | ```shell 359 | $ kubectl delete daemonset -l app=l5d 360 | ``` 361 | 362 | 363 | 364 | 查看监控指标: 365 | 366 | ```shell 367 | $ curl http://192.168.99.100:32407/admin/metrics.json\?pretty=1 368 | 369 | # promethus相关的参数 370 | $ http://192.168.99.100:32407/admin/metrics/prometheus 371 | ``` 372 | 373 | 374 | 375 | Application的配置需要添加一下相关配置: 376 | 377 | ``` 378 | env: 379 | - name: NODE_NAME 380 | valueFrom: 381 | fieldRef: 382 | fieldPath: spec.nodeName 383 | - name: http_proxy 384 | value: $(NODE_NAME):4140 385 | ``` 386 | 387 | 其中: 388 | 389 | - `$(NODE_NAME):4140` (outgoing) for HTTP 4041(incoming) 4042 (ingress) 390 | - `$(NODE_NAME):4240` for HTTP/2 391 | - `$(NODE_NAME):4340` for gRPC 392 | 393 | 394 | 395 | 统计相关的服务: 
396 | 397 | [Prometheus](https://prometheus.io/), [InfluxDB](https://www.influxdata.com/), and [StatsD](https://github.com/etsy/statsd) 398 | 399 | http://121.196.214.67:30173/?router=outgoing 400 | 401 | 402 | 403 | http_proxy=http://121.196.214.67:31393/ curl -s http://hello 404 | 405 | 406 | 407 | ## 3. 参考 408 | 409 | 1. [Service Mesh:下一代微服务](https://mp.weixin.qq.com/s?__biz=MzA3MDg4Nzc2NQ==&mid=2652136254&idx=1&sn=bba9bbd24ac8e5c1f6ef5d1125a6975b&chksm=84d53304b3a2ba12f88675c1bf51973aa1210d174da9e6c2ddcd1f3c84ec7e25987b3bce1071&mpshare=1&scene=1&srcid=1020GPmfbEVP9QDNlZBHg47I&pass_ticket=a%2B3t43zt60SHoI6fLsq80dbx%2FKCTnp9%2Bg1DgmORXY0hwwje1mB3uFmK9f9%2BSNZ2v#rd): QCON 2017 上海站的演讲,系统介绍Service Mesh技术 410 | 2. [ 服务网格新生代--Istio](https://mp.weixin.qq.com/s?__biz=MzA3MDg4Nzc2NQ==&mid=2652136078&idx=1&sn=b261631ffe4df0638c448b0c71497021&chksm=84d532b4b3a2bba2c1ed22a62f4845eb9b6f70f92ad9506036200f84220d9af2e28639a22045&mpshare=1&scene=1&srcid=0922JYb4MpqpQCauaT9B4Xrx&pass_ticket=F8CjNuTDg%2Fskt94bwJ%2B1yiPKpHJhaaRYpxDCqtNGMrMGkGsZDLF5EW1HCByba35u#rd): 介绍isito的文章 411 | 3. [Envoy Docs](https://www.envoyproxy.io/docs/envoy/latest/) 412 | 4. [Istio Lab](https://www.katacoda.com/courses/istio) 413 | 5. [Monitoring Microservices with Weavescope](https://www.youtube.com/watch?v=aQcXOajWwE4) 414 | 6. http://blog.fleeto.us/content/istio-de-zi-dong-zhu-ru istio 的自动注入 415 | 7. [Microservices Patterns With Envoy Sidecar ](http://blog.christianposta.com/microservices/01-microservices-patterns-with-envoy-proxy-part-i-circuit-breaking/) 416 | 8. [istio 三日谈](https://www.kubernetes.org.cn/2449.html) 417 | 9. [KUBERNETES-NATIVE API GATEWAY FOR MICROSERVICES BUILT ON THE **ENVOY PROXY**](https://www.getambassador.io/) 418 | 10. [Docker应用的可视化监控管理](http://blog.csdn.net/horsefoot/article/details/51749528) 419 | 11. [[Generating code](https://blog.golang.org/generate)](https://blog.golang.org/generate) 420 | 12. 
[Linkerd中文文档](https://linkerd.doczh.cn/doc/overview/) [官方文档](https://linkerd.io/) 421 | 13. [A SERVICE MESH FOR KUBERNETES](https://buoyant.io/2016/10/04/a-service-mesh-for-kubernetes-part-i-top-line-service-metrics/) 介绍了Linkerd 在 k8s上的集成测试方案 422 | 14. [CNCF Landscape](https://github.com/cncf/landscape) 423 | 15. [Go kit for Microservice](https://gokit.io/) 424 | 16. [NGINX 发布微服务平台、OpenShift Ingress Controller 和Service Mesh预览版](http://www.infoq.com/cn/news/2017/09/nginx-platform-service-mesh) -------------------------------------------------------------------------------- /Microservices_Arch9.md: -------------------------------------------------------------------------------- 1 | # “微服务”架构 9大特征解读 2 | 3 | > 2011年5月在威尼斯附近的一个软件架构工作坊中,大家开始讨论“微服务”这个术语,因为这个词可以描述参会者们在架构领域进行探索时所见到的一种通用的架构风格。2012年5月,这群参会者决定将“微服务”作为描述这种架构风格的最贴切的名字。在2012年3月波兰的克拉科夫市举办的“33rd Degree”技术大会上,本文作者之一James在其[“Microservices - Java, the Unix Way”](http://2012.33degree.org/talk/show/67)演讲中以案例的形式谈到了这些微服务的观点,与此同时,Fred George也表达了[同样的观点](http://www.slideshare.net/fredgeorge/micro-service-architecure)。Netflix公司的Adrian Cockcroft将这种方法描述为“细粒度的SOA”,并且作为先行者和本文下面所提到的众人已经着手在Web领域进行了实践——Joe Walnes, Dan North, Evan Botcher 和 Graham Tackley。 4 | 5 | ![Monoliths_VS_MicroServices](https://martinfowler.com/articles/microservices/images/sketch.png) 6 | 7 | 以一组 **services** 的方式来建造 **applications** ,起源于 Unix 的[设计哲学原则](https://my.oschina.net/u/184206/blog/370218),详见[《Unix编程艺术》](http://download.csdn.net/download/peiyingli/2562620)。[Microservices Resource Guide](https://martinfowler.com/microservices/) 8 | 9 | 10 | ![](https://martinfowler.com/bliki/images/microservice-verdict/productivity.png) 11 | 12 | ​ 图 单体开发与微服务开发的复杂度对比 13 | 14 | ### Services 的优点: 15 | 16 | * 不同的技术栈独立开发,甚至不同的团队开发 17 | 18 | * 独立部署 19 | 20 | * 独立扩展 21 | 22 | * 提供了清晰的边界 23 | 24 | 25 | 26 | 27 | ## 1. 
**组件化与服务化** 28 | 29 | 一直以来我们开发系统的时候,都希望能够基于可插拔式的 **component** 来建立系统,追求功能的 “**内聚**” 和 “**解耦**” 的目的;就像我们在现实生活中采用拼积木方式来组装不同的玩具一样,每个积木都具备清楚的功能和边界, [OSGi (Open Service Gateway Initiative)](https://www.osgi.org/) 规范做了非常好的尝试,Java开发者可以基于规范构建模块化、动态性、热插拔性更好的系统;知名C++开源网络框架 [ACE](http://www.cs.wustl.edu/~schmidt/ACE-overview.html) 中的 [“ACE Service Configurator”](http://www.cs.wustl.edu/~schmidt/PDF/O-Service-Configurator.pdf) 模式也实现了类似的功能。 30 | 31 | 从软件复用的层次来讲,软件复用方式可以依次为: **libraries** **->** **components** **->** **services**,其中以 **libraries** 与 **components** 一般情况下受制于相同技术栈,由于技术层次的耦合性,比较难以发现或者定义出比较清晰的边界;通过 **services** 方式的共享灵活性最大,调用成本也最高,但是容易形成比较清晰的边界,具备了 "**Services** 的优点" 章节描述的独有优势。 32 | 33 | 对于 **Services** 调用方式的功能演进,则可以通过内聚的服务边界和服务协议方面的演进机制来提供最大的兼容性;由于 **Services** 的通信成本更加昂贵,因此 API Interface 定义必须是 **粗粒度**,更多的接口面向业务并聚合多次调用,参见[使用DDD来构建你的REST API,而不是CRUD](https://mp.weixin.qq.com/s/251ql2WhDi-InUgVtIQ6_Q),如果简单地将系统原有调用方式修改成远端的调用,则会让当前程序的代码逻辑更加凌乱,不易于维护(将数据从服务器发送给客户端时,应该一次性的发送所有内容,让客户端能完成当前的任务),参见[软件组件:粗粒度与细粒度](https://www.ibm.com/developerworks/cn/webservices/ws-soa-granularity/)。 34 | 35 | 36 | 37 | ## 2. **围绕 "业务功能” 组织团队** 38 | 39 | **康威定律**: *设计的结构与该组织的沟通结构相一致。* 40 | 41 | > Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.
42 | > 43 | > -- Melvyn Conway, 1967 44 | 45 | ![组织架构变化](https://martinfowler.com/articles/microservices/images/conways-law.png) 46 | 47 | 如果按照传统的职能性技术栈来组织微服务开发,那么会经常出现跨多个团队的变更和延期,而且也会导致业务逻辑的组织散落在不同的组织代码库中,因此在面对微服务开发过程中,需要根据实际情况按照业务功能的逻辑来划分不同的而研发团队,从而形成 **敏捷性组织**[《架构即未来》3.3.3 章节],例如按照浏览服务、用户服务、结账服务等不同的业务场景独立划分为研发团队,团队内部形成一个小的独立生态圈,包括独立的产品、研发和Devops人员。团队成员的规模,可以 [Amazon's Two Pizz Team](https://www.qianzhan.com/people/detail/268/141021-26c5bf4e.html) 的模式。 48 | 49 | ![服务边界的变化](https://martinfowler.com/articles/microservices/images/PreferFunctionalStaffOrganization.png) 50 | 51 | 由于按照业务功能来组织开发,组织的交流边界也会从原有的职能性转变层业务功能边界来进行,更加容易定义出团队的边界与责任。 52 | 53 | 54 | 55 | ## 3. **产品而非项目** 56 | 57 | 微服务的 ”敏捷组织团队“ 可以摆脱传统的开发与维护脱节的问题,”一个团队在一个产品的整个生命周期中都应该保持对其拥有“, 这种模式下因为业务相关的产品有一个团队负责,对于产品的整个生命周期管理(包括线上的日常运维数据的关注等),能够很好提升团队的创新性和团队开发的责任心,更加容易产出高质量的优秀产品。Amazon's 理念 ["you build, you run it"](https://queue.acm.org/detail.cfm?id=1142065)。 58 | 59 | 60 | 61 | ## 4. **”智能端点“与”哑巴管道“** 62 | 63 | 当在不同的进程之间构建各种通信结构时,我们已经看到许多产品和方法,来强调将大量的智能特性纳入通信机制本身。这种状况的一个典型例子,就是 **“SOA架构下企业服务总线”** (Enterprise Service Bus, ESB)。ESB产品经常包括高度智能的设施,来进行消息的路由、编制(choreography)、转换,并应用业务规则。 64 | 65 | 微服务社区主张采用另一种做法:**智能端点** (Smart Endpoints)和**哑巴管道**(Dumb Pipes)。使用微服务所构建的各个应用的目标,都是尽可能地实现 **“高内聚和低耦合”** —他们拥有自己的领域逻辑,并且更多地是像经典Unix的 **“过滤器”** (filter)那样来工作—即接收一个请求,酌情对其应用业务逻辑,并产生一个响应。这些应用通过使用一些简单的REST风格的协议来进行编制,而不去使用诸如下面这些复杂的协议,即"WS-编制"(WS-Choreography)、BPEL或通过位于中心的工具来进行编排(orchestration)。 66 | 67 | 微服务最常用的两种协议是:带有资源API的HTTP “Request-Response” 协议和轻量级的消息发送协议,如 RabbitMQ或ZeroMQ。 68 | 69 | **”智能端点“** 表达的意思是强化各端点服务的调用和编排消息的能力,**哑巴管道** 说表达的意思则是弱化通信机制的业务性与灵活性,定位在于消息路由与传递,功能更加简单。 70 | 71 | > Be of the web, not behind the web 72 | > 73 | > ​ [-- Ian Robinson](https://www.amazon.com/gp/product/0596805829?ie=UTF8&tag=martinfowlerc-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0596805829) 74 | 75 | 将一个单块系统改造为若干微服务的最大问题,在于对通信模式的改变。仅仅将内存中的方法调用转换为RPC调用这样天真的做法,会导致微服务之间产生繁琐的通信,使得系统表现变糟。取而代之的是,需要用更粗粒度的协议来替代细粒度的服务间通信。 76 | 77 | 78 | 79 | 
## 5. **“去中心化”地治理技术** 80 | 81 | 总结:可以灵活选择不同的技术实现不同的问题,但是同步考虑公司内部与团队的技术积累与沉淀,选择更加合理的技术栈进行开发。 82 | 83 | > Experience shows that this approach is constricting - not every problem is a nail and not every solution a hammer. 84 | 85 | 使用中心化的方式来对开发进行治理,其中一个后果,就是趋向于在单一技术平台上制定标准。经验表明,这种做法会带来局限性——不是每一个问题都是钉子,不是每一个方案都是锤子。我们更喜欢根据工作的不同来选用合理的工具。尽管那些单块应用系统能在一定程度上利用不同的编程语言,但是这并不常见。 86 | 87 | 如果能将单块应用的那些组件拆分成多个服务,那么在构建每个服务时,就可以有选择不同技术栈的机会。想要使用Node.js来搞出一个简单的报表页面?尽管去搞。想用C++来做一个特别出彩儿的近乎实时的组件?没有问题。想要换一种不同风格的数据库,来更好地适应一个组件的读取数据的行为? 88 | 89 | 不同的语言都有着各自的擅长范围和领域,例如实时性要求高的C++,机器学习或者数据处理则可以用Python,云计算相关领域Golang,一般的企业应用有一定性能要求的则可以选择Node.js和Java,中小型的性能略低的可以采用Ruby和PHP等,技术栈的选择需要根据团队的人员储备与当前招聘相关的需求结合起来,不盲目最新,也不能一直固定模式,通过为服务化为试用各种新的技术栈提供了一个良好的契机。 90 | 91 | 92 | 93 | [编程语言排行榜](https://www.tiobe.com/tiobe-index/) 和 [九种编程语言大对比](https://zhuanlan.zhihu.com/p/20887949?refer=mengmengzhou) 94 | 95 | ![](http://img.mp.sohu.com/upload/20170808/82a4cee4c25b41cca895a65d24f1ff2c_th.png) 96 | 97 | 98 | 99 | ## 6. **“去中心化”地管理数据** 100 | 101 | 由于微服务采用 **“分而治之”** 的思想,因此每个微服务都可以自己选择技术栈和存储;我们以前开发的一款社交软件所选择的数据库就有MySQL、Mongodb、Neo4j、Redis等多种,和数据建模模型有关也和各自技术栈编程语言有关,在于前期的构建过程中还是比较顺利,但是等到了上线以后的数据维护上,就面临了不少挑战,由于涉及不同的数据库,数据维护和备份机制等不同,也导致研发人员的注意力分散,方案的不通用性也会导致后期的维护成本增加,因此该点的考虑还是需要根据当前研发团队的技术擅长度和深度两者结合,避免过于扩大,增加后期的技术债和维护的投入成本。 102 | 103 | DDD(Domain-Driven Design) 将一个复杂的领域划分为多个限界上下文,并且将其相互之间的关系用图画出来。这一划分过程对于单块和微服务架构两者都是有用的,而且就像前面有关“业务功能”一节中所讨论的那样,在服务和各个限界上下文之间所存在的自然的联动关系,能有助于澄清和强化这种划分。 104 | 105 | ![去中心化的数据管理](https://martinfowler.com/articles/microservices/images/decentralised-data.png) 106 | 107 | 108 | 109 | ## 7. 
**“基础设施”自动化** 110 | 111 | ![CI/CD](https://martinfowler.com/articles/microservices/images/basic-pipeline.png) 112 | 113 | 114 | 115 | 由于微服务的运行在各自的系统空间内,且经过服务拆分后服务的数目数倍于原有的传统的发布方式,因此带给运维的压力也比传统方式下要大得多;由于多服务之间存在版本依赖、配合敏捷开发的高频发布、发布后的问题排查与监控都相比于传统运维了有了质的区分,甚至有时候一天发布几十次服务的情况发生,在没有自动化流程的辅助则是非常难以实现的。 116 | 117 | 微服务与Docker技术的相互促进发展,Docker技术为微服务的自动化发布提供了良好的解决方案,也促使了 Devops 文化的不断演进,更多可以参见 [《一篇文了解DevOps:从概念、关键问题、兴起到实现需求》](https://mp.weixin.qq.com/s?__biz=MjM5MDE0Mjc4MA%3D%3D&mid=2650994236&idx=1&sn=d488ae3d66328eb4344eea421ca679be&chksm=bdbf0e6f8ac88779d4bc011a7d4c40f0501c19227128276385f4e739ebacc53440f2a1169f3f)。 118 | 119 | 整体的目的则是通过打造CI/CD平台,为微服务提供一个自动化的基础设置。 120 | 121 | ![](https://martinfowler.com/articles/microservices/images/micro-deployment.png) 122 | 123 | 124 | 125 | ## 8. **“容错”设计** [待补充] 126 | 127 | 使用各个微服务来替代组件,其结果是各个应用程序需要设计成能够容忍这些服务所出现的故障。如果服务提供方不可用,那么任何对该服务的调用都会出现故障。客户端要尽可能优雅地应对这种情况。与一个单块设计相比,这是一个劣势。因为这会引人额外的复杂性来处理这种情况。为此,各个微服务团队在不断地反思:这些服务故障是如何影响用户体验的。 128 | 129 | 因为各个服务可以在任何时候发生故障,所以下面两件事就变得很重要,即能够快速地检测出故障,而且在可能的情况下能够自动恢复服务。各个微服务的应用都将大量的精力放到了应用程序的实时监控上,来检查“架构元素指标”(例如数据库每秒收到多少请求)和“业务相关指标”(例如系统每分钟收到多少订单)。当系统某个地方出现问题,语义监控系统能提供一个预警,来触发开发团队进行后续的跟进和调查工作。 130 | 131 | 由于 service 之间的相互调用,导致服务调用链的复杂性,很容易在一个 service 出现问题的时候,出现链式的错误,将故障放大,从而导致雪崩的现象出现。 132 | 133 | 因此从服务设计的角度讲需要进行一下措施: 134 | 135 | 1. 系统过载保护,当系统出现了过载的情况下,及时丢弃过载数据,返回服务出错状态; 136 | 2. 断路器方式,对于出现问题的 service 进行短路,并持续跟踪,等待恢复后重新进行; 137 | 138 | 可以参考 [B站微服务实践](https://github.com/gopherchina/conference/tree/master/2017) 139 | 140 | 141 | 142 | ## 9. 
**“演进式”设计** [待补充] 143 | 144 | 可替换性组件,非常利于微服务架构的演进。演进式设计承认难以对边界进行正确定位,所以它将工作的重点放到了易于对边界进行重构之上。 145 | 146 | 那些微服务的从业者们,通常具有演进式设计的背景,而且通常将服务的分解,视作一个额外的工具,来让应用开发人员能够控制应用系统中的变化,而无须减少变化的发生。变化控制并不一定意味着要减少变化——在正确的态度和工具的帮助下,就能在软件中让变化发生得频繁、快速且经过了良好的控制。 147 | 148 | 每当试图要将软件系统分解为各个组件时,就会面临这样的决策,即如何进行切分——我们决定切分应用系统时应该遵循的原则是什么?一个组件的关键属性,是具有独立更换和升级的特点**[13]**——这意味着,需要寻找这些点,即想象着能否在其中一个点上重写该组件,而无须影响该组件的其他合作组件。事实上,许多做微服务的团队会更进一步,他们明确地预期许多服务将来会报废,而不是守着这些服务做长期演进。 149 | 150 | 英国卫报网站是一个好例子。原先该网站是一个以单块系统的方式来设计和构建的应用系统,然而它已经开始向微服务方向进行演进了。原先的单块系统依旧是该网站的核心,但是在添加新特性时他们愿意以构建一些微服务的方式来进行添加,而这些微服务会去调用原先那个单块系统的API。当在开发那些本身就带有临时性特点的新特性时,这种方法就特别方便,例如开发那些报道一个体育赛事的专门页面。当使用一些快速的开发语言时,像这样的网站页面就能被快速地整合起来。而一旦赛事结束,这样页面就可以被删除。在一个金融机构中,我们已经看到了一些相似的做法,即针对一个市场机会,一些新的服务可以被添加进来。然后在几个月甚至几周之后,这些新服务就作废了。 151 | 152 | 这种强调可更换性的特点,是模块化设计一般性原则的一个特例,通过“变化模式”(pattern of change)**[14]**来驱动进行模块化的实现。大家都愿意将那些能在同时发生变化的东西,放到同一个模块中。系统中那些很少发生变化的部分,应该被放到不同的服务中,以区别于那些当前正在经历大量变动(churn)的部分。如果发现需要同时反复变更两个服务时,这就是它们两个需要被合并的一个信号。 153 | 154 | 把一个个组件放入一个个服务中,增大了作出更加精细的软件发布计划的机会。对于一个单块系统,任何变化都需要做一次整个应用系统的全量构建和部署。然而,对于一个个微服务来说,只需要重新部署修改过的那些服务就够了。这能简化并加快发布过程。但缺点是:必须要考虑当一个服务发生变化时,依赖它并对其进行消费的其他服务将无法工作。传统的集成方法是试图使用版本化来解决这个问题。但在微服务世界中,大家更喜欢将版本化作为最后万不得已的手段(http://martinfowler.com/articles/enterpriseREST.html#versioning)来使用。我们可以通过下述方法来避免许多版本化的工作,即把各个服务设计得尽量能够容错,来应对其所依赖的服务所发生的变化。 155 | 156 | 157 | 158 | ## 微服务之测试(补充) 159 | 160 | ### 测试金字塔 161 | 162 | ![测试金字塔](https://www.testwo.com/attachments/12697/1479217406521.png) 163 | 164 | **UI(End-to-End Test)** **Service** **Unit** 165 | 166 | **Unit Tes**t: 聚焦于函数内部的测试 167 | 168 | **Service Tests**: Mocking or Stubbing 169 | 170 | [“Testing Strategies in a Microservice Architecture”](https://martinfowler.com/articles/microservice-testing/) 171 | 172 | ## 参考资料 173 | 174 | * [微服务架构9大特征中文版](https://mp.weixin.qq.com/s?__biz=MjM5MjEwNTEzOQ==&mid=401500724&idx=1&sn=4e42fa2ffcd5732ae044fe6a387a1cc3#rd) 175 | * [基于Spring Boot和Spring Cloud实现微服务架构学习](http://blog.csdn.net/enweitech/article/details/52582918) 176 | * 
[39本架构类书籍](http://download.csdn.net/album/detail/4093/1/1) 177 | * [用gomock进行mock测试](https://segmentfault.com/a/1190000009894570) 178 | * [Go语言用mock server模拟调用(httptest)](http://www.cnblogs.com/aguncn/p/7102675.html) 179 | * [mock.js-无需等待,让前端独立于后端进行开发](https://cnodejs.org/topic/53f718218f44dfa3511af923) 180 | * [Mocks Aren't Stubs](https://www.martinfowler.com/articles/mocksArentStubs.html) 181 | 182 | -------------------------------------------------------------------------------- /Microservices_Authentication_Authorization.md: -------------------------------------------------------------------------------- 1 | # Microservices Authentication and Authorization 2 | 3 | 4 | 5 | ## 1. Protocol 6 | 7 | 1. OAuth2 - RFC 6749 8 | 9 | > “The OAuth 2.0 authorization framework enables a third-party application to obtain 10 | > limited access to an HTTP service, either on behalf of a resource owner by 11 | > orchestrating an approval interaction between the resource owner and the HTTP 12 | > service, or by allowing the third-party application to obtain access on its own 13 | > behalf.” 14 | 15 | 2. OAuth2 - Bearer Token Usage - RFC 6750 16 | 17 | > “This specification describes how to use bearer tokens in HTTP requests to access 18 | > OAuth 2.0 protected resources. Any party in possession of a bearer token (a 19 | > "bearer") can use it to get access to the associated resources 20 | > (without demonstrating possession of a cryptographic key). To prevent misuse, 21 | > bearer tokens need to be protected from disclosure in storage and in transport.” 22 | 23 | 3. OpenID Connect (OIDC) 24 | 25 | > “OpenID 26 | > Connect 1.0 is a simple identity layer on top of the OAuth 2.0 protocol.
It 27 | > enables Clients to verify the identity of the End-User based on the 28 | > authentication performed by an Authorization Server, as well as to obtain basic 29 | > profile information about the End-User in an interoperable and REST-like manner.” 30 | 31 | 32 | 33 | ### Tokens: Type/Format 34 | 35 | 1. **AccessToken** 36 | 37 | * Part of Oauth, presented with each transaction 38 | * can be opaque or JWT 39 | * can be stateful or stateless 40 | * Shorter TTL 41 | 42 | 2. **RefreshToken** 43 | 44 | * Partof Oauth,received along with first access token after authentication to the authserver 45 | * Used to request a new access token from the auth server, no credentials required 46 | * LongerTTL 47 | * Must be stored securely 48 | 49 | 3. **ID Token** 50 | 51 | * Partof OIDC 52 | * ContainsIdentity information about authenticated user 53 | * Receivedin addition to the 2 oauth tokens 54 | * Mustbe JWT 55 | * LongerTTL 56 | 57 | 4. **JSON Web Tokens(JWT)** 58 | 59 | * Token format specified by OpenID Connect for the Identity Token 60 | 61 | * Multiple levels of security possible (JWE, JWS, JOSE) 62 | 63 | * Usually stateless 64 | 65 | ​ 66 | 67 | ## 2. Tokens: Performance vs. 
Security 68 | 69 | 70 | **Stateful** 71 | 72 | * Sessions stored on server 73 | * Token is opaque 74 | * Tokens must be validated with the server 75 | * Server handles authorization 76 | * Better logout 77 | 78 | 79 | 80 | **Stateless** 81 | 82 | * Sessions not stored on server 83 | 84 | * Token may be introspected 85 | 86 | * Tokens validated locally 87 | 88 | * Microservice must handle authorization 89 | 90 | * Tokens difficult to revoke before TTL 91 | 92 | ​ 93 | 94 | | Token | Performance | Security | 95 | | ------------------------ | ----------- | --------- | 96 | | State | Stateless | Stateful | 97 | | Encrypt JWT Body | No | Yes | 98 | | Validate w/Auth server | No | Yes | 99 | | Validate all tokens | No | Yes | 100 | | TTL’s | Longer | Shorter | 101 | 102 | 103 | 104 | Decode JWT: https://jwt.io/ 105 | 106 | ```sequence 107 | Title: OAuth Bearer Token - stateful 108 | mservice1 -> authserver: 1. RequestToken(Client Credentials) 109 | authserver -> mservice1: 2. Response(access token, refresh token, metadata) 110 | mservice1 -> mservice2: 3. SrvRequest(access token) 111 | mservice2 -> authserver: 4. Token Validation Request(Client Credentials, access token) 112 | authserver -> mservice2: 5. Response (token_expires) 113 | mservice2 -> mservice1: 6.
Response {payload data} 114 | 115 | 116 | ``` 117 | 118 | ```sequence 119 | title: Tier 1 and 2 microservices - stateless 120 | ExternalConsumer -> Tier1_application: Request protected app 121 | Tier1_application -> ExternalConsumer: 302 Redirect- Auth-Server 122 | ExternalConsumer -> AuthServer: 123 | AuthServer -> ExternalConsumer: 302 Redirect-w/auth code 124 | Tier1_application -> AuthServer: Auth Code 125 | AuthServer -> Tier1_application: {access token, refresh token, ID Token metadata} 126 | Tier1_application -> AuthServer: Request Token(Client Credentials) 127 | AuthServer -> Tier1_application: Response (access token, refresh token, metadata) 128 | Tier1_application -> Tier2_application: Service Request(consumer access token, \nconsumer IDToken, service access token) 129 | Tier2_application -> Tier2_application: Stateless token validated 130 | Tier2_application -> Tier1_application: Response(data payload) 131 | Tier1_application -> ExternalConsumer: (data payload) 132 | 133 | 134 | ``` 135 | 136 | 137 | 138 | ## 3. Authentication Ways 139 | 140 | ### 3.1 Use SSO solutions 141 | 142 | ![sso](img/sso.png) 143 | 144 | 1. User requests access 145 | 2. Not authenticated 146 | 3. User authenticates with SSO Server 147 | 4. Authentication successful, grant token 148 | 5. User uses token 149 | 6. Application uses token to get user details 150 | 7. Auth Server returns user details 151 | 152 | ### 3.2 Distributed session 153 | 154 | ![dbs](img/dbs.png) 155 | 156 | 1. User requests access 157 | 2. Not Authenticated 158 | 3. User authenticates with Auth Server 159 | 4. Authentication successful 160 | * Write state to Distributed Session Store 161 | * User X is logged in 162 | * Sets TTL 163 | * Set Session ID on client side 164 | 5. User uses Session ID 165 | 6. μService reads distributed Session Store 166 | * Refresh TTL 167 | 168 | ### 3.3 Client-side token 169 | 170 | ![cst](img/cst.png) 171 | 172 | 1. User requests access 173 | 2. Not authenticated 174 | 3.
User authenticates with AuthServer 175 | 4. Authentication successful 176 | * Set ID token on the client side 177 | * Self-contained 178 | * Signed 179 | * TTL 180 | 5. Services understand ID token 181 | * Can parse user ID 182 | * Can verify token 183 | * Check signature 184 | * Check TTL 185 | 186 | 187 | 188 | ### 3.4 Client-side token + API Gateway 189 | 190 | ![cst-gateway](img/cst-gateway.png) 191 | 192 | 193 | 194 | 1. User requests access 195 | 2. Not authenticated 196 | 3. User authenticates with AuthServer 197 | 4. Authentication successful 198 | * Set ID token on the client side 199 | * Self-contained 200 | * Signed 201 | * TTL 202 | 5. API Gateway translates to opaque token 203 | 6. API Gateway resolves to ID token 204 | 7. Services understand ID token 205 | * Can parse user ID 206 | * Can verify token 207 | * Check signature 208 | * Check TTL 209 | 210 | #### 3.5 Summary 211 | 212 | | | SSO | Distributed Session | JWT | API GW | 213 | | --------------------- | ---- | ------------------- | ---- | ------ | 214 | | Security | ✔ | ✔ | ! | ✔ | 215 | | Secret sharing | ✔ | ✘ | ! | ! | 216 | | Statelessness | ✘ | ✔ | ✔ | ! | 217 | | SPOF @ service switch | ✘ | ! | ✔ | ✔ | 218 | | Bottlenecks | ! | ✘ | ✔ | ! | 219 | | Transparent | ✔ | ✔ | ✔ | ✔ | 220 | | Logout | ✘ | ✔ | ! | ✔ | 221 | | Technologies | ✔ | ✘ | ✔ | ! | 222 | | Integration | ✔ | ✘ | ✔ | ✔ | 223 | | Implementation | ✘ | ! 
| ✔ | ✘ | 224 | 225 | -------------------------------------------------------------------------------- /OpenShift_Install.md: -------------------------------------------------------------------------------- 1 | # OpenShift Origin Single Node 安装 2 | 3 | [openshift origin](https://github.com/openshift/origin/) 是用来支持 [openshift](https://www.openshift.org/) 产品的一个上游社区项目,是一套围绕 Docker 容器和 Kubernetes 集群技术、用来进行应用生命周期管理的 DevOps 工具,它提供了一个完整的开源容器应用平台。 4 | 5 | 与 openshift 类似的产品有 [Pivotal](http://www.36dsj.com/archives/tag/pivotal) 公司 ([中国官网](https://pivotal.io/cn))的 [CloudFoundry](https://www.cloudfoundry.org/) 和 [Rancher](https://www.cnrancher.com/),国内 Rancher 的接受度远高于 Openshift。 6 | 7 | 8 | 9 | * Pivotal公司是由EMC和VMware联合成立的一家子公司,主要产品有 [CloudFoundry](https://www.cloudfoundry.org/)/[Spring](https://spring.io/)(SpringBoot/SpringCloud)。 10 | 11 | * Rancher是全球唯一提供Kubernetes、Mesos和Swarm三种调度工具的企业级分发版和商业技术支持的容器管理平台。 12 | 13 | * [Deis](https://deis.com/) Build powerful, open source tools that make it easy for teams to create and manage applications on Kubernetes. Projects: Workflow/Helm/Steward 14 | 15 | * [Eldarion](https://eldarion.cloud/) Hands-free DevOps powered by Kubernetes. Eldarion uniquely offers expert, white-glove DevOps services together with a new, highly-scalable PaaS.
16 | 17 | ​ 18 | 19 | **扩展**:[**awesome-kubernetes**](https://github.com/ramitsurana/awesome-kubernetes) 20 | 21 | **Kubernetes Platform as a Service providers** 22 | 23 | - [Deis Workflow](https://deis.com/) - [Deprecated Public Releases](https://deis.com/blog/2017/deis-workflow-final-release/) 24 | - [Kel](http://www.kelproject.com/) 25 | - [WSO2](http://wso2.com/) 26 | - [Kumoru](https://medium.com/@kumoru_io) - [Deprecated](https://www.youtube.com/watch?v=_5XQmE7rx9o) - Not Official 27 | - [Rancher](http://rancher.com/running-kubernetes-aws-rancher/) 28 | - [OpenShift Origin](https://www.openshift.org/) 29 | - [Eldarion Cloud](http://eldarion.cloud/) 30 | - [IBM Bluemix Container Service](https://www.ibm.com/cloud-computing/bluemix/containers) 31 | 32 | ![](http://soft.dog/images/openshift/openshift_architecture.png) 33 | 34 | 35 | 36 | [在线文档地址](https://docs.openshift.org/index.html): https://docs.openshift.org/index.html [OpenShift Origin 3.6 Documentation](https://docs.openshift.org/3.6/welcome/index.html) 37 | 38 | 版本关系图: 39 | 40 | ![](http://soft.dog/images/openshift/openshift_relationship.jpg) 41 | 42 | **Docker K8S 与Openshift版本对应关系图:** (kubernets创建于 2014-06-06) 43 | 44 | ![](https://www.duyidong.com/images/container_timeline.png) 45 | 46 | 备注: Centos 7.2 版本,已经安装了 docker 版本 47 | 48 | ```shell 49 | # docker version 50 | Client: 51 | Version: 17.03.1-ce 52 | API version: 1.27 53 | Go version: go1.7.5 54 | Git commit: c6d412e 55 | Built: Mon Mar 27 17:05:44 2017 56 | OS/Arch: linux/amd64 57 | 58 | Server: 59 | Version: 17.03.1-ce 60 | API version: 1.27 (minimum version 1.12) 61 | Go version: go1.7.5 62 | Git commit: c6d412e 63 | Built: Mon Mar 27 17:05:44 2017 64 | OS/Arch: linux/amd64 65 | Experimental: false 66 | 67 | # 如果未安装则可用一下命令安装 68 | # yum install docker 69 | # systemctl start docker 70 | ``` 71 | 72 | 73 | 74 | 安装步骤: 75 | 76 | ```shell 77 | # hostnamectl 78 | Static hostname: teambition 79 | Icon name: computer-vm 80 | Chassis: vm 81 | Machine ID: 
99a0b5c74c754c0c8d94775a4d1a753a 82 | Boot ID: e4d29bee23a64794ac9c13408846e90a 83 | Virtualization: vmware 84 | Operating System: CentOS Linux 7 (Core) 85 | CPE OS Name: cpe:/o:centos:centos:7 86 | Kernel: Linux 3.10.0-514.el7.x86_64 87 | Architecture: x86-64 88 | 89 | # yum makecache fast 90 | # setenforce 0 91 | # yum list all | grep openshift # OpenShift 3.6的版本,集成的K8s 1.6 92 | # yum info centos-release-openshift-origin36.noarch 93 | # yum install centos-release-openshift-origin36 94 | # yum install centos-release-openshift-origin36.noarch 95 | # rpm -ql centos-release-openshift-origin36.noarch 96 | # yum install origin 97 | # openshift start 98 | # openshift version 99 | openshift v3.6.1+008f2d5 100 | kubernetes v1.6.1+5115d708d7 101 | etcd 3.2.1 102 | 103 | ``` 104 | 105 | 登录: 106 | 107 | [https://masterip:8443](https://masterip:8443/) 信任默认证书,默认用户名为 test:test 108 | 109 | 110 | 111 | **启动过程中错误处理:** 112 | 113 | misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"](https://github.com/kubernetes/kubernetes/issues/43805) 114 | 115 | ```shell 116 | $ systemctl cat docker 117 | $ vim /usr/lib/systemd/system/docker.service 118 | ExecStart=/usr/bin/dockerd --exec-opt native.cgroupdriver=systemd 119 | $ systemctl daemon-reload 120 | $ systemctl restart docker 121 | ``` 122 | 123 | 124 | 125 | ## 参考 126 | 127 | 1. [OpenShift介绍](http://www.chenshake.com/openshift%e4%bb%8b%e7%bb%8d/) 128 | 2. [安装 OpenShift Origin](http://soft.dog/2017/07/27/install-openshift-origin/) 129 | 3. [Installing Kubernetes 1.6.4 on centos 7](https://gettech1.wordpress.com/2017/06/13/installing-kubernetes-1-6-4-on-centos-7/) 130 | 4. [PaaS 平台(一) -- Openshift 介绍](https://www.duyidong.com/2017/06/14/kubernetes-and-openshift/) 131 | 5. [PaaS 平台(三)-- Openshift 使用](https://www.duyidong.com/2017/06/15/openshift-quick-start/) 132 | 6. 
[Who Manages Cloud IaaS, PaaS, and SaaS Services](https://mycloudblog7.wordpress.com/2013/06/19/who-manages-cloud-iaas-paas-and-saas-services/) -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Micro-services Arch 2 | 3 | * IAS2017PPT https://pan.baidu.com/s/1slWhCD3:9cmd 4 | 5 | 6 | ![](https://blogs.gartner.com/smarterwithgartner/files/2017/08/Emerging-Technology-Hype-Cycle-for-2017_Infographic_R6A.jpg) 7 | -------------------------------------------------------------------------------- /Systemtap.key: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/Systemtap.key -------------------------------------------------------------------------------- /Systemtap_QuickStart.md: -------------------------------------------------------------------------------- 1 | # Systemtap快速入门 2 | 3 | 4 | 5 | ![Trace](http://www.brendangregg.com/perf_events/perf_events_map.png) 6 | 7 | ## 1. 介绍 8 | 9 | ### 1.1 Hello World 10 | 11 | Systemtap 为开发者或者管理者提供了一种深度检测运行 Linux 操作系统的能力,使用者通过编写一定的简单脚本,就可以获取内核级别的调试信息。 12 | 13 | systemtap 脚本中基本元素是 `event`, 用户为特定 `event` 编写相对应的 `handler` 进行相关动作处理。 14 | 15 | `event` 可能包括以下类型: 16 | 17 | - 调用函数或者调用函数结束 18 | - 定时器触发 19 | - systemtap 会话开始或者结束 20 | 21 | systemtap 的工作机制是通过将用户编写的脚本程序转化成 c 代码,将 c 代码编译成内核模块,挂载到系统内部的方式来进行数据的获取。 22 | 23 | ![](http://hi.csdn.net/attachment/201104/23/0_1303533510e2Tx.gif) 24 | 25 | 以 `root` 用户运行或者以 `stapdev` group 中具备`sudo` 权限的用户运行。 26 | 27 | ```bash 28 | # cat hello-world.stp 29 | 30 | probe begin 31 | { 32 | print("hello world!\n") 33 | exit() 34 | } 35 | 36 | # stap hello-world.stp 37 | hello world!
38 | ``` 39 | 40 | 监控系统调用 syscall.open 41 | ```shell 42 | # cat strace-open.stp 43 | 44 | probe syscall.open 45 | { 46 | # stap -e script -c cmd -x pid 47 | # stap -vv -e 'probe syscall.open {printf("name: %s args: %s [var: %s parms: %s]\n", name, argstr, $$vars, $$parms)}' 48 | 49 | printf("%s(%d) open (%s)\n", execname(), pid(), argstr) 50 | } 51 | 52 | probe timer.s(4) 53 | { 54 | exit() 55 | } 56 | 57 | # stap strace-open.stp 58 | redis-server(1511) open ("/proc/1511/stat", O_RDONLY) 59 | vmtoolsd(576) open ("/proc/meminfo", O_RDONLY) 60 | ...... 61 | ``` 62 | 63 | 64 | 65 | ```shell 66 | # stap -L "syscall.*" 67 | # stap -L 'kernel.function("inet_csk_accept")' 68 | # stap -L 'kernel.function("inet_csk_accept").return' 69 | # stap -L "syscall.open" 70 | syscall.open filename:string mode:long __nr:long name:string flags:long argstr:string $filename:long int $flags:long int $mode:long int 71 | 72 | > filename:string open file name 73 | > mode:long open file mode 74 | > __nr:long number of params ?? 75 | > flags:long open file flags 76 | 77 | > name:string syscall name: open here 78 | > argstr:string alias for print 79 | 80 | # 对于syscall 还定义了一些 aliases, 主要是用于调测: 81 | # argstr: A pretty-printed form of the entire argument list, without parentheses. 82 | # name: The name of the system call. 83 | # retstr: For return probes, a pretty-printed form of the system call result. 
84 | # $$parms filename=0x7fff3f2cf330 flags=0x0 mode=0x7fff3f2cf33f 85 | 86 | # stap -e 'probe syscall.open{printf("argstr is %s, __nr is %d\n", argstr, __nr)}' 87 | # stap -e 'probe syscall.open{printf("filename is %s, name is %s, flags is 0x%x, mode is 0x%x\n", filename, name, flags, mode)}' 88 | # stap -e 'probe syscall.open{printf("filename is 0x%x, $flags is 0x%x, $mode is 0x%x\n", $filename, $flags, $mode)}' 89 | ``` 90 | 91 | #### stap 命令行参数 92 | 93 | - -v [vvv] 94 | 95 | 输出更加详细的信息,v越多信息越详细 96 | 97 | - `-o filename` 98 | 99 | 将屏幕输出到指定文件中,可以结合 `-S` 控制日志的大小 100 | 101 | - `-S size,count` 102 | 103 | 用于控制输出文件的大小和数目,`size` 单位是 `M` ,日志文件后面会包含一个序号的后缀 104 | 105 | - `-x process ID` 106 | 107 | 设定脚本中 `target()` 的值, 主要用于监控特定进程的事件类型 108 | 109 | ```shell 110 | # stap -x 1585 -vv -e 'probe syscall.open {if (target() == pid()) {printf("name: %s args: %s\n", name, a 111 | ``` 112 | 113 | - `-c command` 114 | 115 | 运行指定的命令,同时设定了 `target()` 函数指向了运行命令的 `pid`, 运行命令必须是全路径 116 | 117 | ```shell 118 | # stap -vv -e 'probe syscall.open {printf("name: %s args: %s\n", name, argstr)}' -c "/bin/ls" 119 | ``` 120 | 121 | - `-e 'script'` 122 | 123 | ```shell 124 | # stap -e 'probe begin{ println("Hello"); exit() }' 125 | ``` 126 | 127 | - `-F` 128 | 129 | 运行后台模式,可以随时切换过去 130 | 131 | `stap` can also be instructed to run scripts from the standard input using the `-` switch. To illustrate: 132 | 133 | ```shell 134 | # echo "probe timer.s(1) {exit()}" | stap - 135 | ``` 136 | 137 | 整体架构: 138 | 139 | ![](http://hi.csdn.net/attachment/201104/23/0_1303533747D3xY.gif) 140 | 141 | 流程: 142 | 143 | ![](http://officialblog-wordpress.stor.sinaapp.com/uploads/2013/11/figure2.gif) 144 | 145 | ### 1.2 Probe Points 146 | 147 | The library of scripts that comes with systemtap, each called a `tapset`, `man stapprobes`获取更详细的帮助。 148 | 149 | ```shell 150 | probe PROBEPOINT [, PROBEPOINT] { [STMT ...] 
} 151 | ``` 152 | 153 | Probe: 154 | 155 | - synchronous event(contextual data) 156 | - asynchronous timer 157 | 158 | Probe Points: 159 | 160 | | `begin` | The startup of the systemtap session. | 161 | | ---------------------------------------- | ---------------------------------------- | 162 | | `end` | The end of the systemtap session. | 163 | | `error` | The *error* probe point is similar to the *end* probe, except that each such probe handler run when the session ends after errors have occurred. In such cases, "end" probes are skipped, but each "error" probe is still attempted. This kind of probe can be used to clean up or emit a "final gasp". It may also be numerically parametrized to set a sequence. | 164 | | `never` | Its probe handler is never run, though its statements are analyzed for symbol / type correctness as usual. This probe point may be useful in conjunction with optional probes. | 165 | | `kernel.function("sys_open")` | The entry to the function named `sys_open` in the kernel. | 166 | | `syscall.close.return` | The return from the `close` system call. | 167 | | `module("ext3").statement(0xdeadbeef)` | The addressed instruction in the `ext3` filesystem driver. | 168 | | `timer.ms(200)` | A timer that fires every 200 milliseconds. | 169 | | `timer.profile` | A timer that fires periodically on every CPU. | 170 | | `perf.hw.cache_misses` | A particular number of CPU cache misses have occurred. | 171 | | `procfs("status").read` | A process trying to read a synthetic file. | 172 | | `process("a.out").statement("*@main.c:200")` | Line 200 of the `a.out` program. | 173 | | `kernel.function("*@net/socket.c").call` | Any function defined in `net/socket.c` | 174 | 175 | 176 | 177 | > probe points 可以使用 `aliases`或者`suffix`, 例如 `syscall.read.return.maxactive(10)` == `kernel.function("sys_read").return.maxactive(10)`;probe points 后面可以添加 `?` 表示可选,如果发生展开错误则忽略。 178 | 179 | > However, a probe point may be followed by a "?" 
character, to indicate that it is optional, and that no error should result if it fails to resolve. Optionalness passes down through all levels of alias/wildcard expansion. Alternately, a probe point may be followed by a "!" character, to indicate that it is both optional and sufficient. (Think vaguely of the Prolog cut operator.) If it does resolve, then no further probe points in the same comma-separated list will be resolved. Therefore, the "!" sufficiency mark only makes sense in a list of probe point alternatives. [man](https://sourceware.org/systemtap/man/stapprobes.3stap.html) 180 | 181 | 182 | 183 | | **DWARF** | NON-DWARF | SYMBOL-TABLE | 184 | | ---------------------------- | ------------------------- | ------------------- | 185 | | kernel.function, .statement | kernel.mark | kernel.function*** | 186 | | module.function, .statement | process.mark, process.plt | module.function*** | 187 | | process.function, .statement | begin, end, error, never | process.function*** | 188 | | process.mark*** | timer | | 189 | | .function.callee | perf | | 190 | | | procfs | | 191 | 192 | | **AUTO-GENERATED-DWARF** | kernel.statement.absolute | | 193 | | ------------------------ | -------------------------- | ---- | 194 | | | kernel.data | | 195 | | kernel.trace | kprobe.function | | 196 | | | process.statement.absolute | | 197 | | | process.begin, .end | | 198 | | | netfilter | | 199 | | | java | | 200 | 201 | Here is a list of DWARF probe points currently supported: 202 | 203 | ``` 204 | kernel.function(PATTERN) 205 | kernel.function(PATTERN).call 206 | kernel.function(PATTERN).callee(PATTERN) 207 | kernel.function(PATTERN).callee(PATTERN).return 208 | kernel.function(PATTERN).callee(PATTERN).call 209 | kernel.function(PATTERN).callees(DEPTH) 210 | kernel.function(PATTERN).return 211 | kernel.function(PATTERN).inline 212 | kernel.function(PATTERN).label(LPATTERN) 213 | module(MPATTERN).function(PATTERN) 214 | module(MPATTERN).function(PATTERN).call 215 | 
module(MPATTERN).function(PATTERN).callee(PATTERN) 216 | module(MPATTERN).function(PATTERN).callee(PATTERN).return 217 | module(MPATTERN).function(PATTERN).callee(PATTERN).call 218 | module(MPATTERN).function(PATTERN).callees(DEPTH) 219 | module(MPATTERN).function(PATTERN).return 220 | module(MPATTERN).function(PATTERN).inline 221 | module(MPATTERN).function(PATTERN).label(LPATTERN) 222 | kernel.statement(PATTERN) 223 | kernel.statement(PATTERN).nearest 224 | kernel.statement(ADDRESS).absolute 225 | module(MPATTERN).statement(PATTERN) 226 | process("PATH").function("NAME") 227 | process("PATH").statement("*@FILE.c:123") 228 | process("PATH").library("PATH").function("NAME") 229 | process("PATH").library("PATH").statement("*@FILE.c:123") 230 | process("PATH").library("PATH").statement("*@FILE.c:123").nearest 231 | process("PATH").function("*").return 232 | process("PATH").function("myfun").label("foo") 233 | process("PATH").function("foo").callee("bar") 234 | process("PATH").function("foo").callee("bar").return 235 | process("PATH").function("foo").callee("bar").call 236 | process("PATH").function("foo").callees(DEPTH) 237 | process(PID).function("NAME") 238 | process(PID).function("myfun").label("foo") 239 | process(PID).plt("NAME") 240 | process(PID).plt("NAME").return 241 | process(PID).statement("*@FILE.c:123") 242 | process(PID).statement("*@FILE.c:123").nearest 243 | process(PID).statement(ADDRESS).absolute 244 | ``` 245 | 246 | > SYSCALL and ND_SYSCALL, Generally, a pair of probes are defined for each normal system call as listed in the syscalls(2) manual page, one for entry and one for return. Those system calls that never return do not have a corresponding .return probe. The nd_* family of probes are about the same, except it uses non-DWARF based searching mechanisms, which may result in a lower quality of symbolic context data (parameters), and may miss some system calls. 
You may want to try them first, in case kernel debugging information is not immediately available. 247 | > 248 | > ND_SYSCALL **non-DWARF**, 主要是基于没有 **DWARF** 调试信息的情况下,采用更加底层的符号来处理。 249 | > 250 | > 对于系统调用,除了调用函数的参数,还有以下通用参数可以使用: 251 | > 252 | > **argstr** 253 | > 254 | > A pretty-printed form of the entire argument list, without parentheses. 255 | > 256 | > **name** 257 | > 258 | > The name of the system call. 259 | > 260 | > **retstr** 261 | > 262 | > For return probes, a pretty-printed form of the system-call result. 263 | 264 | 265 | 266 | #### [CONTEXT VARIABLES](https://sourceware.org/systemtap/man/stapprobes.3stap.html) 267 | 268 | - $var 269 | 270 | refers to an in-scope variable "var". If it's an integer-like type, it will be cast to a 64-bit int for systemtap script use. String-like pointers (char *) may be copied to systemtap string values using the *kernel_string* or *user_string* functions. 271 | 272 | - @var("varname") 273 | 274 | an alternative syntax for *$varname* 275 | 276 | - @var("varname@src/file.c") 277 | 278 | refers to the global (either file local or external) variable *varname* defined when the file *src/file.c* was compiled. The CU in which the variable is resolved is the first CU in the module of the probe point which matches the given file name at the end and has the shortest file name path (e.g. given *@var(foo@bar/baz.c)* and CUs with file name paths *src/sub/module/bar/baz.c* and *src/bar/baz.c* the second CU will be chosen to resolve the (file) global variable *foo* 279 | 280 | - $var->field traversal via a structure's or a pointer's field. This 281 | 282 | generalized indirection operator may be repeated to follow more levels. Note that the *.* operator is not used for plain structure members, only*->* for both purposes. (This is because "." is reserved for string concatenation.) 
283 | 284 | - $return 285 | 286 | is available in return probes only for functions that are declared with a return value, which can be determined using @defined($return). 287 | 288 | - $var[N] 289 | 290 | indexes into an array. The index given with a literal number or even an arbitrary numeric expression. 291 | 292 | A number of operators exist for such basic context variable expressions: 293 | 294 | - $$vars 295 | 296 | expands to a character string that is equivalent to`sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x", parm1, ..., parmN, var1, ..., varN)`for each variable in scope at the probe point. Some values may be printed as *=?* if their run-time location cannot be found. 297 | 298 | - $$locals 299 | 300 | expands to a subset of $$vars for only local variables. 301 | 302 | - $$parms 303 | 304 | expands to a subset of $$vars for only function parameters. 305 | 306 | - $$return 307 | 308 | is available in return probes only. It expands to a string that is equivalent to sprintf("return=%x", $return) if the probed function has a return value, or else an empty string. 309 | 310 | - & $EXPR 311 | 312 | expands to the address of the given context variable expression, if it is addressable. 313 | 314 | - @defined($EXPR) 315 | 316 | expands to 1 or 0 iff the given context variable expression is resolvable, for use in conditionals such as`@defined($foo->bar) ? 
$foo->bar : 0` 317 | 318 | - $EXPR$ 319 | 320 | expands to a string with all of $EXPR's members, equivalent to `sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}", $EXPR->a, $EXPR->b)` 321 | 322 | - $EXPR$$ 323 | 324 | expands to a string with all of $var's members and submembers, equivalent to `sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}", $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])` 325 | 326 | 327 | > 对于 $$vars 和 $$parms 可以通过添加 suffix `$` 或者 `$$` 打印出对应结构体中详细的信息。 328 | 329 | 330 | return: 331 | 332 | In addition, arbitrary entry-time expressions can also be saved for ".return" probes using the *@entry(expr)* operator. For example, one can compute the elapsed time of a function: 333 | 334 | ``` 335 | probe kernel.function("do_filp_open").return { println( gettimeofday_us() - @entry(gettimeofday_us()) ) } 336 | ``` 337 | 338 | 339 | 340 | ### 1.3 Print info 341 | 342 | | `tid()` | The id of the current thread. | 343 | | -------------------- | ---------------------------------------- | 344 | | `pid()` | The process (task group) id of the current thread. | 345 | | `uid()` | The id of the current user. | 346 | | `execname()` | The name of the current process. | 347 | | `cpu()` | The current cpu number. | 348 | | `gettimeofday_s()` | Number of seconds since epoch. | 349 | | `get_cycles()` | Snapshot of hardware cycle counter. | 350 | | `pp()` | A string describing the probe point being currently handled. | 351 | | `ppfunc()` | If known, the function name in which this probe was placed. | 352 | | `$$vars` | If available, a pretty-printed listing of all local variables in scope. | 353 | | `print_backtrace()` | If possible, print a kernel backtrace. | 354 | | `print_ubacktrace()` | If possible, print a user-space backtrace. 
| 355 | | thread_indent() | Given an indentation delta parameter, it stores internally an indentation counter for each thread (`tid()`), and returns a string with some generic trace data plus an appropriate number of indentation spaces. | 356 | 357 | 358 | 359 | ```shell 360 | # cat thread_indent.stp 361 | 362 | probe kernel.function("*@net/socket.c") 363 | { 364 | printf ("%s -> %s\n", thread_indent(1), probefunc()) 365 | } 366 | probe kernel.function("*@net/socket.c").return 367 | { 368 | printf ("%s <- %s\n", thread_indent(-1), probefunc()) 369 | } 370 | ``` 371 | 372 | 373 | 374 | ## 2. 安装 375 | 376 | 安装前需要准备 kernel development tools 和 debugging data: 377 | 378 | [CentOS7安装systemtap](http://www.hi-roy.com/2016/07/27/CentOS7%E5%AE%89%E8%A3%85systemtap/) 379 | 380 | 381 | ```shell 382 | # yum install systemtap systemtap-runtime 383 | # yum install kernel-devel-`uname -r` kernel-debuginfo-common-`uname -r` kernel-debuginfo-`uname -r` 384 | ``` 385 | 386 | 387 | 388 | ## 3. Analysis 389 | 390 | ### 3.1 Basic constructs 391 | 392 | ``` 393 | if (EXPR) STATEMENT [else STATEMENT] 394 | 395 | while (EXPR) STATEMENT 396 | 397 | for (A; B; C) STATEMENT 398 | ``` 399 | 400 | 注释: 401 | 402 | > \# this is a comment 403 | > /* this is a comment */ 404 | > // this is a comment 405 | 406 | ``` 407 | if (uid() > 100) 408 | if (execname() == "sed") 409 | if (cpu() == 0 && gettimeofday_s() > 1140498000) 410 | "hello" . " " . "world" 字符串拼接 411 | ``` 412 | 413 | 414 | | foo = gettimeofday_s() | foo is a number | 415 | | ------------------------------ | --------------------------- | 416 | | bar = "/usr/bin/" . 
execname() | bar is a string | 417 | | c++ | c is a number | 418 | | s = sprint(2345) | s becomes the string "2345" | 419 | 420 | 421 | 422 | 默认情况下,`variable` 是 local, 全局的变量需要使用 `global` 进行声明。 423 | 424 | 425 | 426 | ### 3.2 Target variables 427 | 428 | ### 3.3 Function 429 | 430 | `Function` 定义不要求顺序;全局变量也不要求定义的顺序; 431 | 432 | ```c 433 | function makedev(major, minor) 434 | { 435 | return major << 20 | minor 436 | } 437 | ``` 438 | 439 | 440 | 441 | ### 3.4 Arrays 442 | 443 | systemtap 中的数组实际上是 `hashtable` 结构, 必须要声明为**全局变量**。 444 | 445 | ```c 446 | global stat # 默认大小最大 2048 个, MAXMAPENTRIES 447 | global stat[4096] # 保留4096个空间 448 | ``` 449 | 450 | 451 | 452 | `Array` 常用的操作是设置变量和查找, 使用 `awk`的语法,` array_name[key] = value`, 其中 `key` 可以为多个变量的组合,最大为 9 个,中间使用 `,`分割,例如: `array_name[key1_1,key1_2] = value` 453 | 454 | | `if ([4,"hello"] in foo) { }` | membership test | 455 | | ----------------------------- | ---------------------------- | 456 | | `delete times[tid()]` | deletion of a single element | 457 | | `delete times` | deletion of all elements | 458 | 459 | 460 | 461 | 对于 Arrays,可以使用 `foreach` 来进行遍历,同时支持使用 `+`或者`-`进行排序。 462 | 463 | | `foreach (x = [a,b] in foo) { fuss_with(x) }` | simple loop in arbitrary sequence | 464 | | ---------------------------------------- | ---------------------------------------- | 465 | | `foreach ([a,b] in foo+ limit 5) { }` | loop in increasing sequence of value, stop after 5 | 466 | | `foreach ([a-,b] in foo) { }` | loop in decreasing sequence of first key | 467 | 468 | 469 | 470 | ### 3.5 Aggregates 471 | 472 | 主要用于数据统计,添加操作类型为 "<<<" 473 | 474 | 475 | 476 | ```shell 477 | a <<< delta_timestamp 478 | writes[execname()] <<< count 479 | ``` 480 | 481 | 482 | 483 | | @avg(a) | the average of all the values accumulated into `a` | 484 | | ------------------------------- | ---------------------------------------- | 485 | | print(@hist_linear(a,0,100,10)) | print an ``ascii art'' linear histogram of the same data stream, bounds 0…100, bucket 
width is 10 | 486 | | @count(writes["zsh"]) | the number of times ``zsh'' ran the probe handler | 487 | | print(@hist_log(writes["zsh"])) | print an ``ascii art'' logarithmic histogram of the same data stream | 488 | 489 | 490 | 491 | ### 3.6 Safety 492 | 493 | 1. systemtap 会对运行的 `probe handler` 的运行时间进行限制,避免死循环或者无限递归; 494 | 2. 没有动态内存分配,所有的数组、函数上下文和缓存都会在 `session` 初始化的时候一次性初始化; 495 | 3. 空指针或者0除数等危险的操作,在生成C语言代码过程中都会被严格检查; 496 | 4. 对于转换器或者编译器的 bug,可以通过 `-p3` 选项检查生成的 C 语言代码。 497 | 498 | stp 脚本安装的目录 `/usr/share/systemtap/tapset`,可以使用 `-I` 来包含用户定义的附加目录。 499 | 500 | 501 | 502 | ``` 503 | # stap -p1 -vv -e 'probe begin{}' > /dev/null 504 | Created temporary directory "/tmp/stapKbwUbu" 505 | Session arch: x86_64 release: 3.10.0-514.el7.x86_64 506 | Searched for library macro files: "/usr/share/systemtap/tapset/linux/*.stpm", found: 5, processed: 5 507 | Searched for library macro files: "/usr/share/systemtap/tapset/*.stpm", found: 8, processed: 8 508 | Searched: "/usr/share/systemtap/tapset/linux/x86_64/*.stp", found: 3, processed: 3 509 | Searched: "/usr/share/systemtap/tapset/linux/*.stp", found: 71, processed: 71 510 | Searched: "/usr/share/systemtap/tapset/x86_64/*.stp", found: 5, processed: 5 511 | Searched: "/usr/share/systemtap/tapset/*.stp", found: 35, processed: 35 512 | Pass 1: parsed user script and 127 library scripts using 230472virt/43356res/3236shr/40484data kb, in 320usr/20sys/311real ms. 513 | Running rm -rf /tmp/stapKbwUbu 514 | Spawn waitpid result (0x0): 0 515 | Removed temporary directory "/tmp/stapKbwUbu" 516 | ``` 517 | 518 | 519 | 520 | 嵌入C语言代码: 521 | 522 | stp 文件中可以嵌入 C 语言代码,嵌入的代码可以包含头文件和函数实现,使用 `%{` 和 `%}` 包裹: 523 | 524 | ```shell 525 | function <name>:<type> ( <arg>:<type>, ... ) %{ <C code> %} 526 | ``` 527 | 528 | 529 | Using a single `$` or a double `$$` suffix provides a shallow or deep string representation of the variable data type. 
Using a single `$`, as in `$var$`, will provide a string that only includes the values of all basic type values of fields of the variable structure type but not any nested complex type values (which will be represented with `{...}`). Using a double `$$`, as in `@var("var")$$` will provide a string that also includes all values of nested data types. 530 | 531 | `$$vars` expands to a character string that is equivalent to `sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x", $parm1, ..., $parmN, $var1, ..., $varN)` 532 | 533 | `$$locals` expands to a character string that is equivalent to `sprintf("var1=%x ... varN=%x", $var1, ..., $varN)` 534 | 535 | `$$parms` expands to a character string that is equivalent to `sprintf("parm1=%x ... parmN=%x", $parm1, ..., $parmN)` 536 | 537 | 538 | 源码目录:`/usr/src/debug/kernel-3.10.0-514.el7/linux-3.10.0-514.el7.x86_64/` 539 | 540 | 可以为 `probe` 定义别名,官方提供的函数库位于 `/usr/share/systemtap/tapset/linux`,官方样例位于 `/usr/share/doc/systemtap-client-3.0/examples`,例如: 541 | 542 | ``` 543 | probe netdev.receive 544 | = kernel.function("netif_receive_skb") 545 | { 546 | dev_name = kernel_string($skb->dev->name) 547 | length = $skb->len 548 | protocol = $skb->protocol 549 | truesize = $skb->truesize 550 | } 551 | 552 | # used 553 | probe netdev.receive 554 | { 555 | ifrecv[pid(), dev_name, execname(), uid()] <<< length 556 | } 557 | ``` 558 | 559 | 560 | [who_sent_it.stp](https://sourceware.org/systemtap/examples/network/who_sent_it.stp) 561 | 562 | ```shell 563 | #! /usr/bin/env stap 564 | 565 | # Print a trace of threads sending IP packets (UDP or TCP) to a given 566 | # destination port and/or address. Default is unfiltered. 
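# 运行示例(假设的调用方式;the_dport / the_daddr 即下面定义的全局变量,
# 可在启动时用 -G 覆盖,例如只监控发往 8.8.8.8:53 的 DNS 请求):
#   stap who_sent_it.stp -G the_dport=53 -G the_daddr=8.8.8.8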
567 | 568 | global the_dport = 0 # override with -G the_dport=53 569 | global the_daddr = "" # override with -G the_daddr=127.0.0.1 570 | 571 | probe netfilter.ip.local_out { 572 | if ((the_dport == 0 || the_dport == dport) && 573 | (the_daddr == "" || the_daddr == daddr)) 574 | printf("%s[%d] sent packet to %s:%d\n", execname(), tid(), daddr, dport) 575 | } 576 | ``` 577 | 578 | 579 | 580 | [para-callgraph.stp](https://sourceware.org/systemtap/examples/general/para-callgraph.txt) 581 | 582 | > thread_indent() The generic data included in the returned string includes a **timestamp** (number of microseconds since the first call to `thread_indent()` by the thread), a **process name**, and the **thread ID.** 583 | 584 | ```shell 585 | # 586 | # stap para-callgraph.stp 'process("/bin/ls").function("*")' \ 587 | # 'process("/bin/ls").function("main")' -c "/bin/ls > /dev/null" 588 | 589 | #! /usr/bin/env stap 590 | 591 | function trace(entry_p, extra) { 592 | %( $# > 1 %? if (tid() in trace) %) 593 | printf("%s%s%s %s\n", 594 | thread_indent (entry_p), 595 | (entry_p>0?"->":"<-"), 596 | ppfunc (), 597 | extra) 598 | } 599 | 600 | 601 | %( $# > 1 %? 602 | global trace 603 | probe $2.call { 604 | trace[tid()] = 1 605 | } 606 | probe $2.return { 607 | delete trace[tid()] 608 | } 609 | %) 610 | 611 | probe $1.call { trace(1, $$parms) } 612 | probe $1.return { trace(-1, $$return) } 613 | ``` 614 | 615 | 616 | 617 | https://sourceware.org/systemtap/langref/Probe_points.html 618 | 619 | A probe point may be followed by a question mark (?) character, to indicate that it is optional, and that no error should result if it fails to expand. This effect passes down through all levels of alias or wildcard expansion.The following is the general syntax.`kernel.function("no_such_function") ?` 620 | 621 | 622 | 623 | **DWARF** is a debugging file format used by many compilers and debuggers to support source level debugging. 
It addresses the requirements of a number of procedural languages, such as C, C++, and Fortran, and is designed to be extensible to other languages. DWARF is architecture independent and applicable to any processor or operating system. It is widely used on Unix, Linux and other operating systems, as well as in stand-alone environments. 624 | 625 | 626 | 627 | A new operator, `@entry`, is available for automatically saving an expression at entry time for use in a `.return` probe. 628 | 629 | 630 | 631 | #### Syscall probes 632 | syscall.NAME 633 | syscall.NAME.return 634 | 635 | * argstr: A pretty-printed form of the entire argument list, without parentheses. 636 | * name: The name of the system call. 637 | * retstr: For return probes, a pretty-printed form of the system call result. 638 | 639 | 640 | #### Special probe points 641 | begin 642 | end 643 | error 644 | 645 | 646 | 647 | Pointer typecasting: 648 | 649 | @cast(p, "type_name"[, "module"])->member 650 | 651 | ```shell 652 | @cast(pointer, "task_struct", "kernel")->parent->tgid 653 | 654 | # 采用头文件方式,用 <> 分开 655 | @cast(tv, "timeval", "<sys/time.h>")->tv_sec 656 | @cast(task, "task_struct", "kernel")->tgid 657 | ``` 658 | 659 | 660 | 661 | 命令行参数: 662 | 663 | $1 … $n 表示传入的参数是整数常量 664 | 665 | @1 … @n 表示传入的是字符串常量 666 | 667 | 668 | 669 | ```shell 670 | # stap example.stp '5+5' mystring 671 | probe begin { printf("%d, %s\n", $1, @2) } 672 | ``` 673 | 674 | 675 | 676 | 条件编译: 677 | 678 | ```shell 679 | %( CONDITION %? TRUE-TOKENS %) 680 | %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %) 681 | ``` 682 | From [tcp_trace.stp](https://sourceware.org/systemtap/examples/network/tcp_trace.stp) 683 | ``` 684 | %( kernel_v > "2.6.24" %? 
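# 条件编译:仅当内核版本高于 2.6.24 时,才会展开并编译下面这个探测点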
685 | probe kernel.function("tcp_set_state") 686 | { 687 | sk = $sk 688 | new_state = $state 689 | TCP_CLOSE = 7 690 | TCP_CLOSE_WAIT = 8 691 | key = filter_key(sk) 692 | if ( key && ((new_state == TCP_CLOSE)||(new_state == TCP_CLOSE_WAIT))){ 693 | if (state_flg && state[key]) print_close(key,new_state); 694 | clean_up(key); 695 | } 696 | } 697 | %) 698 | ``` 699 | 700 | 701 | 702 | 宏定义: 703 | 704 | ``` 705 | @define NAME %( BODY %) 706 | @define NAME(PARAM_1, PARAM_2, ...) %( BODY %) 707 | ``` 708 | 709 | 710 | 711 | ##### tapset::json 712 | The JSON tapset provides probes, functions, and macros to generate 713 | a JSON metadata and data file. The JSON metadata file is located in 714 | /proc/systemtap/MODULE/metadata.json. The JSON data file is located 715 | in /proc/systemtap/MODULE/data.json. The JSON data file is updated 716 | with current data every time the file is read. 717 | 718 | 719 | 720 | ``` 721 | /proc/systemtap/MODULE/metadata.json 722 | ``` 723 | 724 | From [iotime.stp](https://sourceware.org/systemtap/examples/io/iotime.stp) 725 | 726 | ``` 727 | function timestamp:long() { return gettimeofday_us() - start } 728 | function proc:string() { return sprintf("%d (%s)", pid(), execname()) } 729 | probe begin { start = gettimeofday_us() } 730 | 731 | probe syscall.open.return { 732 | filename = user_string($filename) 733 | if ($return != -1) { 734 | filehandles[pid(), $return] = filename 735 | } else { 736 | printf("%d %s access %s fail\n", timestamp(), proc(), filename) 737 | } 738 | } 739 | ``` 740 | 741 | 742 | ```shell 743 | man -k tapset:: # man pages for tapsets 744 | man -k probe:: # man pages for individual probe points. 
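# 例如,查看某个具体 probe / function 的手册页(假设对应的 man 手册已随 systemtap 安装):
man probe::netdev.receive
man function::gettimeofday_us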
745 | man -k function:: # man -k function: 746 | stap -L 'kernel.trace("net*")' 747 | ``` 748 | 749 | 750 | 751 | 内核tracepoint: 752 | 753 | ```shell 754 | # yum install perf 755 | # perf list tracepoint | head 756 | ``` 757 | 758 | 759 | **stap 运行的时候可以通过 -G 设置全局变量** 760 | 761 | [strace.stp](https://sourceware.org/systemtap/examples/process/strace.stp) 使用 nd_syscall 设置。 762 | 763 | 监控连入tcp连接 764 | 765 | ```shell 766 | #! /usr/bin/env stap 767 | probe begin { 768 | printf("%6s %16s %6s %6s %16s\n", 769 | "UID", "CMD", "PID", "PORT", "IP_SOURCE") 770 | } 771 | probe kernel.function("tcp_accept").return?, 772 | kernel.function("inet_csk_accept").return? { 773 | sock = $return 774 | if (sock != 0) 775 | printf("%6d %16s %6d %6d %16s\n", uid(), execname(), pid(), 776 | inet_get_local_port(sock), inet_get_ip_source(sock)) 777 | } 778 | ``` 779 | 780 | 781 | 782 | ```shell 783 | cat tcp_state.stp 784 | global tcp_state; 785 | 786 | global filter_rip 787 | global filter_rport 788 | 789 | probe begin 790 | { 791 | filter_rip = ipv4_pton(@1) 792 | filter_rport = $2; 793 | 794 | init_tcp_state() 795 | printf("Start stat now....\n") 796 | } 797 | 798 | function ipv4_pton:long (addr:string) 799 | { 800 | i=32; 801 | ip=0; 802 | ips=addr; 803 | while(strlen(byte = tokenize(ips, ".")) != 0) { 804 | i-=8; 805 | ips=""; 806 | 807 | j=strtol(byte,10); 808 | ip=ip+(j<skc_state; 852 | str = state_num2str(new_state); 853 | 854 | printf("%d %s:%d %s %d\n", 855 | gettimeofday_ms(), 856 | ip_ntop(htonl(laddr)), 857 | lport, 858 | str, restranmit); 859 | } 860 | } 861 | 862 | probe kernel.function("tcp_retransmit_skb") 863 | { 864 | print_state($sk,1) 865 | } 866 | 867 | %( kernel_v > "2.6.24" %? 
868 | probe kernel.function("tcp_set_state") 869 | { 870 | print_state($sk,0) 871 | } 872 | %) 873 | 874 | ``` 875 | 876 | 877 | 878 | 样例演示如何从 `struct file *` 中获取 `file_name`,参考 [systemtap targets](https://zhengheng.me/2015/02/11/systemtap-targets/) 879 | 880 | ```shell 881 | probe kernel.function("vfs_read").return 882 | { 883 | if(execname() != "stapio") 884 | printf("%s[%ld], %ld, %s %s\n", execname(), pid(), $file->f_path->dentry->d_inode->i_ino, kernel_string($file->f_path->dentry->d_name->name), kernel_string($file->f_path->dentry->d_iname)) 885 | } 886 | probe timer.s(2) 887 | { 888 | exit() 889 | } 890 | ``` 891 | 892 | 893 | 894 | 可选与尝试 probe point 895 | 896 | `?` 定义可选探测点,`!` 定义尝试探测点。可选探测点即使无法解析也不会报错,而是直接忽略;尝试探测点除了可选之外还是"充分"的:一旦解析成功,同一逗号分隔列表中后续的探测点就不再解析。 897 | 898 | 899 | 900 | 跟踪 `process` 的 `statement` 901 | 902 | ```c 903 | #include <stdio.h> 904 | #include <string.h> 905 | 906 | typedef struct _str 907 | { 908 | int len; 909 | const char *data; 910 | } str_t; 911 | 912 | typedef struct _student 913 | { 914 | int id; 915 | str_t name; 916 | } student_t; 917 | 918 | 919 | int main() 920 | { 921 | student_t *ptr = NULL; 922 | student_t one; 923 | 924 | one.id = 1; 925 | one.name.len = (int)strlen("dave"); 926 | one.name.data = "dave"; 927 | 928 | ptr = &one; 929 | 930 | printf("%d -> %s\n", ptr->id, ptr->name.data); 931 | 932 | ptr->id = 2; 933 | ptr->name.len = (int)strlen("davad"); 934 | ptr->name.data = "davad"; 935 | printf("%d -> %s\n", ptr->id, ptr->name.data); 936 | 937 | return 0; 938 | } 939 | 940 | ``` 941 | 942 | 943 | 944 | ```shell 945 | # gcc -o ptr -O2 -g ptr.c 946 | ``` 947 | 948 | ```shell 949 | # stap -L 'process("/root/stp/ptr").statement("main@ptr.c:*")' 950 | ``` 951 | 952 | ```shell 953 | #!/usr/bin/env stap 954 | 955 | probe process("/root/stp/ptr").statement("main@ptr.c:28") 956 | { 957 | printf("vars %s, student name: %s\n", $$locals$$, user_string($ptr->name->data)); 958 | } 959 | ``` 960 | 961 | ```shell 962 | # stap ptr.stp -c ./ptr 963 | 1 -> dave 964 | 2 -> 
davad 965 | vars one={.id=1, .name={.len=4, .data="dave"}} ptr={.id=1, .name={.len=4, .data="dave"}}, student name: dave 966 | ``` 967 | 968 | 通过 `stap` 脚本修改变量测试 969 | 970 | ```shell 971 | # cat ptr_modify.ptr 972 | #!/usr/bin/env stap 973 | 974 | probe process("/root/stp/ptr").statement("main@ptr.c:28") 975 | { 976 | $ptr->name->len = 7; 977 | printf("vars %s, student name: %s\n", $$locals$$, user_string($ptr->name->data)); 978 | } 979 | ``` 980 | 981 | 如果需要修改某个变量的值,需要添加 `-g` 参数运行。 982 | ```shell 983 | # stap -g ptr_modify.ptr -c ./ptr 984 | 1 -> dave 985 | 2 -> davad 986 | vars one={.id=1, .name={.len=7, .data="dave"}} ptr={.id=1, .name={.len=7, .data="dave"}}, student name: dave 987 | ``` 988 | 989 | 990 | 991 | 生成火焰图: 992 | 993 | 编译的时候需要添加 `-g -fno-omit-frame-pointer` 994 | 995 | ``` 996 | # git clone https://github.com/brendangregg/FlameGraph.git 997 | # perf record -F 99 -p pid -ag -- sleep 60 998 | # perf script > out.perf 999 | # /opt/FlameGraph/stackcollapse-perf.pl out.perf > out.folded 1000 | # /opt/FlameGraph/flamegraph.pl out.folded > cpu.svg 1001 | ``` 1002 | 1003 | 突破systemtap定义变量的限制: 1004 | $ sudo stap -DMAXMAPENTRIES=10240 test.stp 1005 | 1006 | #### Links 1007 | 1008 | 1. [System tap 官方文档](https://sourceware.org/systemtap/documentation.html) 1009 | - [Systemtap Tutorial](https://sourceware.org/systemtap/tutorial/tutorial.html) 1010 | - [SystemTap_Beginners_Guide](https://sourceware.org/systemtap/SystemTap_Beginners_Guide/) 1011 | - [SystemTap Language Reference](https://sourceware.org/systemtap/langref/) 1012 | - [SystemTap Tapset Reference Manual](https://sourceware.org/systemtap/tapsets/index.html) 1013 | - [man](https://sourceware.org/systemtap/man/) 1014 | 2. [System Best Examples](https://sourceware.org/systemtap/examples/) [TCP_Trace.stp](https://sourceware.org/systemtap/examples/network/tcp_trace.stp) 1015 | 3. 
[Centos7 SYSTEMTAP BEGINNERS GUIDE](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/SystemTap_Beginners_Guide/index.html) 1016 | 4. [CentOS7安装systemtap](http://www.hi-roy.com/2016/07/27/CentOS7%E5%AE%89%E8%A3%85systemtap/) 1017 | 5. [openresty-systemtap-toolkit](https://github.com/openresty/openresty-systemtap-toolkit/) 1018 | 6. [Systemtap笔记](http://nanxiao.me/tag/systemtap/) 1019 | 7. [Linux 内核的tcp连接状态分析](http://gmd20.blog.163.com/blog/static/168439232014741166246/) 1020 | 8. [Systemtap examples, Network - 4 Monitoring TCP Packets](http://www.cnblogs.com/zengkefu/p/6372276.html) 1021 | 9. [systemtap的网络监控脚本](http://m.blog.itpub.net/15480802/viewspace-762002/) 1022 | 10. [Go 火焰图](http://lihaoquan.me/2017/1/1/Profiling-and-Optimizing-Go-using-go-torch.html) 1023 | 11. [工欲性能调优,必先利其器(2)- 火焰图](https://pingcap.com/blog-tangliu-tool-%7C%7C-zh) 1024 | 12. [OpenResty火焰图](https://moonbingbing.gitbooks.io/openresty-best-practices/flame_graph.html) [openresty-systemtap-toolkit](https://github.com/openresty/openresty-systemtap-toolkit) 1025 | 13. [各个版本的Linux kernel](http://elixir.free-electrons.com/linux/latest/source) 1026 | 14. [用 perf 和 SystemTap 跟踪 MongoDB 访问超时](https://zhuanlan.zhihu.com/p/22572231) 1027 | 15. Perf 相关资料 1028 | * [perf example](http://www.brendangregg.com/perf.html) 1029 | * [Perf -- Linux下的系统性能调优工具,第 1 部分](https://www.ibm.com/developerworks/cn/linux/l-cn-perf1/index.html) 1030 | * [Perf -- Linux下的系统性能调优工具,第 2 部分](https://www.ibm.com/developerworks/cn/linux/l-cn-perf2/index.html) 1031 | * [**perf-tools**](https://github.com/brendangregg/perf-tools) 1032 | 16. [动态追踪技术漫谈](https://openresty.org/posts/dynamic-tracing/) 1033 | 17. [Architecture of systemtap: a Linux trace/probe tool](https://sourceware.org/systemtap/archpaper.pdf) 1034 | 18. 
[SystemTap使用技巧【一】](http://blog.csdn.net/wangzuxi/article/details/42849053) 1035 | 19. [SystemTap使用技巧【二】](http://blog.csdn.net/wangzuxi/article/details/42976577) 1036 | 1037 | 1038 | 1039 | 1040 | 1041 | [systemtap变量范围限制]: http://blog.yufeng.info/archives/1213 "突破systemtap脚本对资源使用的限制" 1042 | 1043 | 1044 | 1045 | 1046 | 1047 | 1048 | 1049 | 1050 | -------------------------------------------------------------------------------- /Web_Framework_Benchmark.md: -------------------------------------------------------------------------------- 1 | # Web Framework Benchmark 2 | 3 | ## 1. 介绍 4 | 网站 [https://www.techempower.com](https://www.techempower.com) 目前已经对常用的 web 框架进行了 13 轮测试,框架覆盖了常见的各种语言,排名靠前的主流语言分别为 C/C++/Scala/Java/Go。第13轮的blog:[framework-benchmarks-round-13](https://www.techempower.com/blog/2016/11/16/framework-benchmarks-round-13/) 5 | 6 | 该网站的测试结果对于我们选择语言和框架具备非常好的参考价值。 测试的源码Github地址:[FrameworkBenchmarks](https://github.com/TechEmpower/FrameworkBenchmarks) 7 | 8 | 测试的 Framework 均按可用于生产环境的配置运行,测试过程中禁止打印日志。 参见:[Source And Requirements](https://www.techempower.com/benchmarks/#section=code&hw=ph&test=fortune) 9 | 10 | 测试的种类包括: 11 | 12 | 1. JSON serialization 13 | 2. Single database query 14 | 3. Multiple database queries 15 | 4. Fortunes 16 | 5. Database updates 17 | 6. Plaintext 18 | 19 | 20 | ## 2. 测试环境 21 | 22 | [environments](https://www.techempower.com/benchmarks/#section=environment) 23 | 24 | 1. **ServerCentral** 25 | 26 | Physical hardware environment for Rounds 13 and beyond. Provided by ServerCentral. Dell R910 (4x 10-Core E7-4850 CPUs) application server; Dell R710 (2x 4-Core E5520 CPUs) database server; switched 10-gigabit Ethernet 27 | 28 | 2. **Azure** 29 | 30 | Cloud environment for Rounds 13 and beyond. Microsoft Azure D3v2 instances; switched gigabit Ethernet. 31 | 32 | 3. **Peak** 33 | 34 | Physical hardware environment for Rounds 9 through 12. 
Dell R720xd dual Xeon E5-2660 v2 (40 HT cores) with 32 GB memory; database servers equipped with SSDs in RAID; switched 10-gigabit Ethernet. 35 | 36 | 4. **i7** 37 | 38 | Physical hardware environment for Rounds 1 through 8. Sandy Bridge Core i7-2600K workstations with 8 GB memory (early 2011 vintage); database server equipped with Samsung 840 Pro SSD; switched gigabit Ethernet 39 | 40 | 5. **AWS** 41 | 42 | Cloud environment for Rounds 1 through 12. Amazon EC2 c3.large instances (2 vCPU each); switched gigabit Ethernet (m1.large was used through Round 9). 43 | 44 | ## 3. 测试结果展现 45 | 46 | ![res](http://www.do1618.com/wp-content/uploads/2016/12/web_framework_benchmark.png) 47 | 48 | ## 4. 其他相关资源 49 | 50 | jakubkulhan 提供了 Node.js、Spray、Erlang、Http-kit、Warp、Tornado、Puma 等的 benchmark: [hit-server-bench](https://github.com/jakubkulhan/hit-server-bench) 51 | 52 | ### Requests per second 53 | 54 | | Environment | Req/s (c=1) | Req/s (c=10) | Req/s (c=50) | Req/s (c=100) | 55 | |----------------------|------------:|-------------:|-------------:|--------------:| 56 | | Node.js | 10215.47 | 38447.10 | 51362.60 | 52722.19 | 57 | | Spray | 10681.03 | 41761.57 | 50912.04 | 52746.62 | 58 | | Erlang | 24.70 | 246.98 | 1240.38 | 2474.18 | 59 | | Http-kit | 8789.63 | 61355.26 | 73457.67 | 73475.39 | 60 | | Warp | 12467.62 | 41547.37 | 53475.75 | 53864.91 | 61 | | Tornado | 2356.82 | 4590.33 | 4422.08 | 4155.61 | 62 | | Puma | 10048.54 | 19289.60 | 16280.00 | 16698.86 | 63 | | Netty | 28409.44 | 116605.01 | 186109.06 | 180001.91 | 64 | | Spark (Jetty) | 10982.23 | 44533.38 | 31624.97 | 57425.96 | 65 | | Spring Boot (Tomcat) | 2652.85 | 13450.45 | 26336.17 | 31288.19 | 66 | | Reactor | 12899.17 | 50113.60 | 66310.16 | 65718.33 | 67 | | PHP-FPM | 2907.34 | 9968.78 | 9004.46 | 9425.08 | 68 | | HHVM | 1323.99 | 6434.97 | 9192.72 | 9743.09 | 69 | | ReactPHP (PHP) | 1636.49 | 9454.17 | 13676.61 | 12061.91 | 70 | | ReactPHP (HHVM) | 2087.04 | 12185.55 | 16151.42 | 12036.11 | 71 | 72 | 73 
| ### Average latency 74 | 75 | | Environment | Latency (c=1) | Latency (c=10) | Latency (c=50) | Latency (c=100) | 76 | |----------------------|--------------:|---------------:|---------------:|----------------:| 77 | | Node.js | 83.59us | 253.57us | 1.03ms | 2.19ms | 78 | | Spray | 686.19us | 0.92ms | 2.71ms | 2.59ms | 79 | | Erlang | 40.60ms | 40.58ms | 40.35ms | 40.43ms | 80 | | Http-kit | 98.01us | 770.13us | 683.12us | 1.36ms | 81 | | Warp | 116.25us | 333.56us | 1.30ms | 2.18ms | 82 | | Tornado | 414.73us | 2.28ms | 12.21ms | 24.95ms | 83 | | Puma | 96.37us | 510.70us | 2.83ms | 5.88ms | 84 | | Netty | 53.17us | 113.52us | 635.84us | 1.18ms | 85 | | Spark (Jetty) | 2.04ms | 617.83us | 6.23ms | 7.94ms | 86 | | Spring Boot (Tomcat) | 7.09ms | 815.26us | 2.25ms | 3.81ms | 87 | | Reactor | 2.19ms | 3.93ms | 6.72ms | 8.49ms | 88 | | PHP-FPM | 355.04us | 1.26ms | 6.37ms | 11.60ms | 89 | | HHVM | 824.73us | 1.53ms | 5.62ms | 10.47ms | 90 | | ReactPHP (PHP) | 626.27us | 1.05ms | 3.81ms | 10.83ms | 91 | | ReactPHP (HHVM) | 481.31us | 811.58us | 3.21ms | 15.36ms | 92 | 93 | 94 | -------------------------------------------------------------------------------- /api_gateway_open_source.md: -------------------------------------------------------------------------------- 1 | # 微服务API Gateway 2 | 3 | ## 1. 介绍 4 | 5 | API 网关是微服务架构中一个不可或缺的部分。 API 网关是对外提供的服务,是系统的入口,所有的外部系统接入系统都需要通过 API 网关。 6 | 7 | ## 2. 
常见开源API Gateway 8 | 9 | ### 2.1 Tyk 10 | 11 | [Github地址](https://github.com/TykTechnologies/tyk) go 有企业版和社区版 12 | 13 | [Docker安装测试](https://tyk.io/tyk-documentation/get-started/with-tyk-on-premise/installation/docker/docker-quickstart/) 14 | 15 | ### 2.2 Kong 16 | 17 | [Github地址](https://github.com/Mashape/kong) nginx + lua (openresty) 18 | 19 | 当前版本:Kong 0.9.5 [文档](https://getkong.org/docs/) 20 | 21 | #### 启动测试 22 | 23 | 可以采用 docker-compose 方式快速启动测试,由于过程中需要连接到外网,因此需要有网络连接和科学上网。 24 | 25 | # git clone https://github.com/Mashape/docker-kong.git 26 | # cd docker-kong/compose 27 | # docker-compose up 28 | ……(启动输出省略) 29 | 30 | # curl http://127.0.0.1:8001 31 | 返回 JSON 格式的网关信息 32 | 33 | 34 | #### 添加API 35 | 36 | curl -i -X POST \ 37 | --url http://localhost:8001/apis/ \ 38 | --data 'name=baidu' \ 39 | --data 'upstream_url=http://www.baidu.com/' \ 40 | --data 'request_host=www.baidu.com' 41 | 42 | 返回结果: 43 | 44 | { 45 | "upstream_url":"http://www.baidu.com/", 46 | "created_at":1481521300000, 47 | "id":"fe5f3e33-e877-4aff-83ca-f55addee8eec", 48 | "name":"baidu", 49 | "preserve_host":false, 50 | "strip_request_path":false, 51 | "request_host":"www.baidu.com" 52 | } 53 | 54 | 55 | #### 测试添加的API 56 | 57 | curl -i -X GET \ 58 | --url http://localhost:8000/ \ 59 | --header 'Host: www.baidu.com' 60 | 61 | #### 查看添加的相关信息 62 | 63 | # http://127.0.0.1:8001/apis/ 64 | { 65 | data: [ 66 | { 67 | upstream_url: "http://www.baidu.com/", 68 | created_at: 1481521300000, 69 | id: "fe5f3e33-e877-4aff-83ca-f55addee8eec", 70 | name: "baidu", 71 | preserve_host: false, 72 | strip_request_path: false, 73 | request_host: "www.baidu.com" 74 | } 75 | ], 76 | 77 | total: 1 78 | } 79 | 80 | ### 2.3 api-umbrella 81 | 82 | [官网](https://apiumbrella.io/) nginx + lua (openresty) 83 | 84 | [Github地址](https://github.com/NREL/api-umbrella) 85 | 86 | [apiumbrella分析--Revisiting, speeding up, and simplifying API Umbrella's architecture](https://github.com/NREL/api-umbrella/issues/86) 87 | 88 | 
[apiumbralla同类产品分析](https://github.com/NREL/api-umbrella/issues/159) 89 | 90 | ### 2.4 apiaxle 91 | 92 | [官网](http://apiaxle.com/) node.js 93 | 94 | [Github地址](https://github.com/apiaxle/apiaxle) 95 | 96 | ### 2.5 Netflix zuul 97 | 98 | [GitHub地址](https://github.com/Netflix/zuul) 99 | 100 | ### 2.6 WSO2 API Manager 101 | [官网](http://wso2.com/products/api-manager/) OpenSource, Java 102 | 103 | ![架构](http://b.content.wso2.com/sites/all/product-pages/images/apim-overview.png) 104 | 105 | ### 2.7 clydeio 106 | 107 | 108 | [Github](https://github.com/clydeio/clydeio) node.js 貌似更新不频繁 109 | 110 | 111 | 112 | 113 | ## 参考资料 114 | 1. [Pattern: API Gateway](http://microservices.io/patterns/apigateway.html) 115 | 2. [Manage your Web API with an API Gateway](http://www.ippon.tech/blog/api-gateway/) 116 | 3. [Mashape开源API网关——Kong](http://www.infoq.com/cn/news/2015/04/kong/) 117 | 4. [Are there any open source API Gateways?](https://www.quora.com/Are-there-any-open-source-API-Gateways) 118 | 5. [Taking A Fresh Look At What Open Source API Management Architecture Is Available](https://apievangelist.com/2014/10/05/taking-a-fresh-look-at-what-open-source-api-management-architecture-is-available/) 119 | 6. [How Mashape Manages Over 15,000 APIs & Microservices](https://stackshare.io/mashape/how-mashape-manages-over-15000-apis-and-microservices) 120 | 7. 
[Get Kong折騰筆記](http://cocoaspice.logdown.com/posts/424513-getkong-toss-notes) -------------------------------------------------------------------------------- /arch_pearl.md: -------------------------------------------------------------------------------- 1 | * [浅谈命令查询职责分离(CQRS)模式](http://www.cnblogs.com/yangecnu/p/Introduction-CQRS.html) 2 | * [CQRS revisited](https://lostechies.com/gabrielschenker/2015/04/07/cqrs-revisited/) 3 | * [CQRS applied](https://lostechies.com/gabrielschenker/2015/04/12/cqrs-applied/) 4 | * [在洋葱(Onion)架构中实现领域驱动设计](http://www.infoq.com/cn/news/2014/11/ddd-onion-architecture) 5 | * [Understanding Onion Architecture](http://blog.thedigitalgroup.com/chetanv/2015/07/06/understanding-onion-architecture/) 6 | * [The Onion Architecture](http://jeffreypalermo.com/blog/the-onion-architecture-part-1/) 7 | * [Hexagonal architecture](http://alistair.cockburn.us/Hexagonal+architecture) 8 | * [The Clean Architecture](https://8thlight.com/blog/uncle-bob/2012/08/13/the-clean-architecture.html) 9 | 10 | 11 | 12 | ![](https://lostechies.com/gabrielschenker/files/2015/04/Old-architecture1.png) 13 | 14 | ![](https://lostechies.com/gabrielschenker/files/2015/04/Event-sourcing.png) 15 | 16 | ![](https://8thlight.com/blog/assets/posts/2012-08-13-the-clean-architecture/CleanArchitecture.jpg) 17 | 18 | ![](http://jeffreypalermo.com/files/media/image/WindowsLiveWriter/TheOnionArchitecturepart1_70A9/image%7B0%7D%5B59%5D.png) 19 | 20 | ![](http://alistair.cockburn.us/get/2304) 21 | 22 | 23 | 24 | [技术架构演变的全景图- 从单体式到云原生](https://mp.weixin.qq.com/s?__biz=MzAxNjgyODc5OA==&mid=2650730188&idx=1&sn=9414d894e5d3cbb7058d1a5e03a946d0&chksm=83e49eddb49317cb8f5c69df6dfdc26eaad6eac852adf0e4d7ac3181286e97989e4176daac4d&mpshare=1&scene=1&srcid=1224c2OstR5LymrPWpnDKEs5#rd) 包括微服务2015-2018的流行趋势图 25 | 26 | ![](https://mmbiz.qpic.cn/mmbiz_jpg/C98a4cEgAfLt2nrM4VdP5BpKUGfILwtfRxaibS8muFAr6Opia24T9jYZF8dYUwjt3KwyJThaf4SdJiciaaLse9lK7A/?wx_fmt=jpeg&tp=webp&wxfrom=5&wx_lazy=1) 27 | 28 | 29 | 
30 | ![](https://jimmysong.io/kubernetes-handbook/images/kubernetes-high-level-component-archtecture.jpg) 31 | 32 | 33 | 34 | 下图是[Bilgin Ibryam](https://developers.redhat.com/blog/author/bibryam/)给出的微服务中应该关心的主题,图片来自[RedHat Developers](https://developers.redhat.com/blog/2016/12/09/spring-cloud-for-microservices-compared-to-kubernetes/)。 35 | 36 | ![](https://raw.githubusercontent.com/rootsongjc/kubernetes-handbook/master/images/microservices-concerns.jpg) 37 | 38 | CNCF(云原生计算基金会)给出了云原生应用的三大特征: 39 | 容器化包装:软件应用的进程应该包装在容器中独立运行。 40 | 动态管理:通过集中式的编排调度系统来动态的管理和调度。 41 | 微服务化:明确服务间的依赖,互相解耦。 42 | 43 | 44 | 45 | From: https://www.gartner.com/smarterwithgartner/top-trends-in-the-gartner-hype-cycle-for-emerging-technologies-2017/ 46 | 47 | [Awesome Cloud Native](https://github.com/rootsongjc/awesome-cloud-native) 48 | 49 | ![](https://blogs.gartner.com/smarterwithgartner/files/2017/08/Emerging-Technology-Hype-Cycle-for-2017_Infographic_R6A.jpg) 50 | 51 | 52 | 53 | ![](http://5b0988e595225.cdn.sohucs.com/images/20171011/e8da96d00a264ef88e026119e1f02af6.jpeg) 54 | 55 | ![](http://scholarsupdate.hi2net.com/upload2013/2015121718124767.jpg) 56 | 57 | > “快鱼吃慢鱼”是思科CEO钱伯斯的名言,他认为“在Internet经济下,大公司不一定打败小公司,但是快的一定会打败慢的。Internet与工业革命的不同点之一是,你不必占有大量资金,哪里有机会,资本就很快会在哪里重新组合。速度会转换为市场份额、利润率和经验”。“快鱼吃慢鱼”强调了对市场机会和客户需求的快速反应,但决不是追求盲目扩张和仓促出击,正相反,真正的快鱼追求的不仅是快,更是“准”,因为只有准确的把握住市场的脉搏,了解未来技术或服务的方向后,快速出击进行收购才是必要而有效的。 58 | > 59 | > https://baike.baidu.com/item/%E5%BF%AB%E9%B1%BC%E5%90%83%E6%85%A2%E9%B1%BC 60 | 61 | 62 | 63 | 精益大师Mary Poppendick提出的问题了——“如果只是改变了应用的一行代码,您的组织需要多长时间才能把应用部署到线上?“答案是几分钟或几秒钟。 64 | 65 | 云原生应用架构在快速变动的需求、稳定性、可用性和耐久性之间寻求平衡。 66 | 67 | 云原生应用程序架构还通过诸如API网关之类的设计模式来支持移动优先开发的概念,API网关将服务聚合负担转移回服务器端。 68 | 69 | From: https://jimmysong.io/migrating-to-cloud-native-application-architectures/chapter1/why-cloud-native-application-architectures.html 70 | 71 | [到了检讨“快鱼吃慢鱼”理论的时候](https://www.huxiu.com/article/3194.html) Apple TV 72 | 73 | > *Technology is exciting and challenging, but lessons learned from 
the industry are not to start there, but with your business mission, vision, and your people.* 74 | 75 | 76 | 77 | > [Pivotal](https://pivotal.io/) 是云原生应用的提出者,并推出了 [Pivotal Cloud Foundry](https://pivotal.io/platform) 云原生应用平台和 [Spring](https://spring.io/) 开源 Java 开发框架,成为云原生应用架构中先驱者和探路者。 78 | 79 | ### What are Cloud-Native Applications? 80 | 81 | > > “One of the things we've learned is that if you can't get it to market more quickly, there is no doubt that the market will have changed and no matter how well you've engineered it or built it or deployed it or trained your folks, it's not going to be quite right because it's just a little too late.“ 82 | > 83 | > **James McGlennon** 84 | > Executive VP and CIO, Liberty Mutual Insurance Group ![From](https://pivotal.io/cloud-native) 85 | 86 | 87 | 88 | > The [Cloud Native Computing Foundation (CNCF)](https://cncf.io/) has defined cloud native as: 89 | > - Containerized 90 | > - Distributed Management and Orchestration 91 | > - Micro-services Architecture 92 | 93 | ![](https://d1fto35gcfffzn.cloudfront.net/images/topics/cloudnative/diagram-cloud-native.png) 94 | 95 | From: https://pivotal.io/cloud-native 96 | 97 | 98 | 99 | ### [Cloud Native Reference architecture]() 100 | 101 | ![](https://www.cncf.io/wp-content/uploads/2017/05/1.png) 102 | 103 | From: https://www.cncf.io/blog/2017/05/15/developing-cloud-native-applications/ 104 | 105 | *OSS*:Operation support system 106 | 107 | *BSS*:Business support system 108 | 109 | 110 | 111 | # The Reactive Manifesto 112 | 113 | ![](https://www.reactivemanifesto.org/images/reactive-traits.svg) 114 | 115 | From: https://www.reactivemanifesto.org/ 116 | 117 | 118 | 119 | 1. [What are Cloud-Native Applications?](https://pivotal.io/cloud-native) [Developing Cloud Native Applications](https://www.cncf.io/blog/2017/05/15/developing-cloud-native-applications/) 120 | 2. 
[Comparing PaaS and Container Platforms By Wikibon](https://www.openshift.com/container-platform/compare-openshift.html) 121 | 3. [The Journey to Cloud Native](https://blogs.cisco.com/cloud/the-journey-to-cloud-native) 122 | 4. [迁移到云原生应用架构ebook](https://jimmysong.io/migrating-to-cloud-native-application-architectures/) [Migrating to Cloud-Native Application Architectures](http://www.oreilly.com/programming/free/migrating-cloud-native-application-architectures.csp) 123 | 5. [2016 Container Report By Cloud Foundry](https://www.cloudfoundry.org/container-report-2016/) 124 | 6. [2017 Container Report By Cloud Foundry](https://cloudfoundry.org/container-report-2017/?utm_source=pr&utm_campaign=cr17&utm_content=cff) 125 | 7. [The Reactive Manifesto](https://www.reactivemanifesto.org/) -------------------------------------------------------------------------------- /container.md: -------------------------------------------------------------------------------- 1 | # 容器知识点梳理 2 | ## Docker不能解决的问题 3 | 4 | 环境一致性,配置相关的问题 5 | Devops 6 | 发布流程和联合发布 7 | 8 | ## 容器引擎: 9 | 1. Docker Engine 10 | 2. Coreos Rocket(Rkt), 11 | 3. 
Cloud Foundry Garden 12 | 13 | ==> 14 | 15 | OCI(Open Container Initiative) RunC [OCF(Open Container Format)] 16 | 17 | 测试和开发环境用Docker,但是生产环境中最好用RunC。 18 | 19 | ## 演进 20 | 总体演进图: 21 | 22 | ![caas](img/caas.jpg) 23 | 24 | ![timeline](img/container_timeline.jpg) 25 | 26 | ### Libcontainer -> RunC (Docker 1.11) 27 | 插件主要包括: 安全认证、存储卷管理、网络、IP池 28 | 29 | ![Runc](img/docker_libcontainer_runc.jpg) 30 | 31 | ##### CaaS生态厂商通过容器抽象、RunC和插件来重组自己的容器技术堆栈示意图 32 | ![Runc_no_docker](img/no_docker_runc.jpg) 33 | 34 | ##### Kubernetes通过CRI-O取代Docker容器管理引擎架构 35 | CRI-O (CRI=Container Runtime Interface O=OCI) 36 | ![Runc](./img/runc.png) 37 | 38 | ##### Mesos通过UnifiedContainer取代Docker容器管理引擎 39 | 40 | ![messos_uc](img/mesos_uc.jpg) 41 | 42 | ##### Cloud Foundry通过Garden取代Docker容器管理引擎 43 | ![cf_garden](img/cf_garden.jpg) 44 | 45 | 46 | Docker网络解决方案包括Docker CNM(Container Network Model)和CoreOS CNI(Container Networking Interface)。 47 | 48 | CNM是由 Docker 提出的规范,现在已经被Cisco Contiv, Kuryr, Open Virtual Networking (OVN), Project Calico, VMware 和 Weave 这些公司和项目所采纳。 49 | 50 | ![CNM](http://img.dockerinfo.net/2016/11/20161124135614.jpg) 51 | ![CNM interface](http://img.dockerinfo.net/2016/11/20161124135713.jpg) 52 | 53 | CNI是由CoreOS提出的一个容器网络规范,已采纳该规范的包括Apache Mesos, Cloud Foundry, Kubernetes, Kurma 和 rkt。另外 Contiv Networking, Project Calico 和 Weave 这些项目也为CNI提供插件。 54 | 55 | ![CNI](http://img.dockerinfo.net/2016/11/20161124135721.jpg) 56 | 57 | 扩展阅读: 58 | 59 | * [容器网络聚焦:CNM和CNI](http://www.dockerinfo.net/3772.html) 60 | * [为什么Kubernetes不使用libnetwork](http://www.infoq.com/cn/articles/why-Kubernetes-doesnt-use-libnetwork) 61 | 62 | 63 | 经典链接: 64 | 65 | * [容器,你还只用Docker吗?(上)](http://mp.weixin.qq.com/s?__biz=MzA5OTAyNzQ2OA==&mid=2649692608&idx=1&sn=ae5dd2d8ee1dcc3d1d96a9fa79816d66&chksm=889326a3bfe4afb5759f47ad71f007c1274a12067528ff503b5c54f2f5cf25758ef0c5f5a433&scene=21#wechat_redirect) 66 | * 
[容器,你还只用Docker吗?(下)](http://mp.weixin.qq.com/s?__biz=MzA5OTAyNzQ2OA==&mid=2649692634&idx=1&sn=23d4b8583402ed3bff5037ad988ae4b2&chksm=889326b9bfe4afaff42d07d5d02129ab1d8d2335e59aafa48710c4735c29429863bbeb29b94b&mpshare=1&scene=1&srcid=1130CBfSe89p25Ophga834vk#rd) -------------------------------------------------------------------------------- /distributed_trace.md: -------------------------------------------------------------------------------- 1 | # 分布式调用跟踪 2 | 3 | ## 1. Google 4 | [Google 论文翻译](http://bigbully.github.io/Dapper-translation/) [英语原文](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf) 5 | 6 | 具体设计目标: 7 | 8 | 1. 低消耗:跟踪系统对在线服务的影响应该做到足够小。在一些高度优化过的服务中,即使一点点损耗也会很容易被察觉到,而且有可能迫使在线服务的部署团队不得不将跟踪系统关停。 9 | 10 | 2. 应用级的透明:对于应用的程序员来说,是不需要知道有跟踪系统这回事的。如果一个跟踪系统想生效,就必须依赖应用的开发者主动配合,那么这个跟踪系统就太脆弱了,往往会由于跟踪系统在应用中植入代码的bug或疏忽导致应用出问题,这样就无法满足对跟踪系统“无所不在的部署”这个需求。对于当下像Google这样快节奏的开发环境来说,这一点尤其重要。 11 | 12 | 3. 延展性:对于Google至少未来几年的服务和集群规模,监控系统都应该能完全把控住。 13 | 14 | 做到真正的应用级别的透明,应该是当下最具挑战性的设计目标,我们把核心跟踪代码做得很轻巧,然后把它植入到那些无所不在的公共组件中,比如线程调用、控制流以及RPC库。 15 | 16 | 简单实用的分布式跟踪的实现,就是为服务器上每一次发送和接收动作来收集跟踪标识符(message identifiers)和时间戳(timestamped events)。 17 | 18 | 两种解决方案: 19 | 20 | 1. 黑盒方案(black-box):假定除了上述信息之外没有额外的可用信息,使用统计回归技术来推断两者之间的关系。优点是轻便,但需要更多的数据才能获得足够的精度,因为它依赖于统计推断。 21 | 22 | 2. 
标注方案(annotation-based):依赖于应用程序或中间件明确地标记一个全局ID,从而连接每一条记录和发起者的请求,需要代码植入。将代码植入限制在一个很小的通用组件库中,从而使监测系统对应用的开发人员做到有效透明。 23 | 24 | ![Dapper跟踪树中短暂的关联关系](http://bigbully.github.io/Dapper-translation/images/img2.png) 25 | 26 | ![一个单独的span的细节图](http://bigbully.github.io/Dapper-translation/images/img3.png) 27 | 28 | 任何一个span可以包含来自不同主机的信息,这些也要记录下来。事实上,每一个RPC span可以包含客户端和服务器两个过程的注释,使得链接两个主机的span成为模型中所说的span。由于客户端和服务器上的时间戳来自不同的主机,必须考虑到时间偏差。分析工具利用了这个事实:RPC客户端发送一个请求之后,服务器端才能接收到,对于响应也是一样的(服务器先响应,然后客户端才能接收到这个响应)。这样一来,服务器端RPC的时间戳就有了一个上限和下限。 29 | 30 | ![Dapper收集管道总览](http://bigbully.github.io/Dapper-translation/images/img5.png) 31 | 32 | Dapper的跟踪记录和收集管道的过程分为三个阶段。首先,span数据写入(1)本地日志文件中。然后Dapper的守护进程和收集组件把这些数据从生产环境的主机中拉出来(2),最终写到(3)Dapper的Bigtable仓库中。一次跟踪被设计成Bigtable中的一行,每一列相当于一个span。Bigtable支持的稀疏表格布局正适合这种情况,因为每一次跟踪可以有任意多个span。跟踪数据收集(即从应用中的二进制数据传输到中央仓库所花费的时间)的延迟中位数少于15秒。第98百分位的延迟(The 98th percentile latency)往往随着时间的推移呈现双峰型;大约75%的时间,第98百分位的延迟时间小于2分钟,但是另外大约25%的时间,它可以增长到几个小时。 33 | 34 | ![](http://bigbully.github.io/Dapper-translation/images/img6.png) 35 | 36 | ## 2. 
Twitter Zipkin 37 | ### 2.1 综述 38 | 39 | 40 | [zipkin官方](http://zipkin.io/) [Github地址](https://github.com/openzipkin/) [DockerHub](https://hub.docker.com/u/openzipkin/) 41 | 42 | ![总体图](http://zipkin.io/public/img/web-screenshot.png) 43 | 44 | [目前支持的lib库语言](http://zipkin.io/pages/existing_instrumentations.html) 45 | 46 | 官方:go/java/js/ruby/scala 等 47 | 社区:c#/go/java/python 等 48 | 49 | ### 2.2 运行 50 | 51 | 使用Docker运行Zipkin: 52 | 53 | # git clone https://github.com/openzipkin/docker-zipkin 54 | # docker-compose up 55 | 56 | 57 | ### 2.3 测试 58 | 59 | 采用 Zipkin go lib进行测试:[zipkin-go-opentracing](https://github.com/openzipkin/zipkin-go-opentracing) 60 | 61 | # git clone https://github.com/openzipkin/zipkin-go-opentracing.git 62 | # cd zipkin-go-opentracing/examples/cli_with_2_services/cli && go build 63 | # cd zipkin-go-opentracing/examples/cli_with_2_services/svc1/cmd && go build 64 | # cd zipkin-go-opentracing/examples/cli_with_2_services/svc2/cmd && go build 65 | 66 | 分别启动svc1和svc2,然后运行cli 67 | 68 | 登录到Zipkin后显示效果如下: 69 | 70 | ![Summary](http://www.do1618.com/wp-content/uploads/2016/12/zipkin_cli_summary.png) 71 | 72 | ![one_record](http://www.do1618.com/wp-content/uploads/2016/12/zipkin_one.png) 73 | 74 | ![detail](http://www.do1618.com/wp-content/uploads/2016/12/zipkin_one_detail.png) 75 | 76 | ![dep](http://www.do1618.com/wp-content/uploads/2016/12/zipkin_dep.png) 77 | 78 | 79 | ## 3. 大众点评 CAT 80 | 81 | ### 3.1 介绍 82 | 83 | [Github地址](https://github.com/dianping/cat) 84 | 85 | CAT是基于Java开发的实时应用监控平台,包括实时应用监控和业务监控,eBay的CAL是其原型。 86 | 87 | CAT支持的监控消息类型包括: 88 | 89 | 1. Transaction 适合记录跨越系统边界的程序访问行为,比如远程调用、数据库调用,也适合执行时间较长的业务逻辑监控,Transaction用来记录一段代码的执行时间和次数。 90 | 91 | 2. Event 用来记录一件事发生的次数,比如记录系统异常,它和Transaction相比缺少了时间的统计,开销比Transaction要小。 92 | 93 | 3. Heartbeat 表示程序内定期产生的统计信息, 如CPU%, MEM%, 连接池状态, 系统负载等。 94 | 95 | 4. Metric 用于记录业务指标,指标可能包含对一个指标记录次数、记录平均值、记录总和,业务指标最低统计粒度为1分钟。 96 | 97 | 5. 
Trace 用于记录基本的trace信息,类似于log4j的info信息,这些信息仅用于查看一些相关信息 98 | 99 | CAT 在提供实时应用监控和业务监控的基础上,提供了 消息树 的功能能够提供完整的调用关系和序列。 100 | 101 | CAT监控系统将每次URL、Service的请求内部执行情况都封装为一个完整的消息树、消息树可能包括Transaction、Event、Heartbeat、Metric和Trace信息。 102 | 103 | ![完整消息树](https://camo.githubusercontent.com/9056837192598119dbd8f4559a74aef9a58bb186/68747470733a2f2f7261772e6769746875622e636f6d2f6469616e70696e672f6361742f6d61737465722f6361742d686f6d652f7372632f6d61696e2f7765626170702f696d616765732f6c6f6776696577416c6c30312e706e67) 104 | 105 | 目前客户端语言支持: Java和.net 106 | 107 | [透过CAT,来看分布式实时监控系统的设计与实现](http://www.tuicool.com/articles/z2u2Mn) 108 | [大众点评CAT架构分析](http://blog.csdn.net/szwandcj/article/details/51025669) 109 | 110 | 111 | ## 4. 淘宝 鹰眼 112 | 113 | 基于HFS开发,目前未开源。 114 | 115 | ![鹰眼整体结构](http://images2015.cnblogs.com/blog/524341/201607/524341-20160727210517138-692387667.png) 116 | 117 | [相关连接1](https://github.com/Percy0601/log-extension) 118 | [鹰眼下的淘宝](http://club.alibabatech.org/resource_detail.htm?topicId=102) 119 | 120 | ## 5. 京东 Hydra 121 | 122 | [Github链接](https://github.com/odenny/hydra) 123 | 124 | Hydra是java开发的分布式跟踪系统。可以接入各种基础组件,完成对业务系统的跟踪。已接入的基础组件是阿里开源的分布式服务框架Dubbo。 125 | 126 | Hydra可以针对并发量和数据量的大小选择(需要手动配置),是否使用消息中间件,使用Hbase或是Mysql存储跟踪数据。 127 | 128 | Hydra自身提供跟踪数据展现功能,基于angularJS和D3.js 129 | 130 | ## 6. 新浪 Watchman 131 | 132 | [微博平台的链路追踪及服务质量保障系统——Watchman系统](http://www.infoq.com/cn/articles/weibo-watchman) 133 | 134 | [亿级用户下的新浪微博平台架构(pdf)](http://qndbp.qiniudn.com/weibo.pdf) 135 | [亿级用户下的新浪微博平台架构(html)](https://linux.cn/article-4715-1.html) 136 | 137 | ## 7. 唯品会 Microscope 138 | 139 | [唯品会Microscope——大规模分布式系统的跟踪、监控、告警平台](http://blog.csdn.net/alex19881006/article/details/24381109) 140 | 141 | 未看到开源信息。 142 | 143 | ## 8. 窝窝网 Tracing 144 | 145 | 未有详细信息 146 | 147 | ## 9. eBay Centralized Activity Logging (CAL) 148 | 149 | [eBay架构演进](http://www.addsimplicity.com/downloads/eBaySDForum2006-11-29.pdf) 150 | 151 | 152 | ## 常用 RPC 框架 153 | 154 | 155 | 1. [gRPC](http://www.grpc.io/) 156 | 2. 
[Thrift](https://thrift.apache.org/) 157 | 3. [Hessian](http://hessian.caucho.com/index.xtp) 158 | 4. [Dubbo](http://dubbo.io/) 阿里开源的高性能Java服务框架 159 | 5. [Hprose](http://www.hprose.com/) 高性能远程对象服务引擎 160 | 6. [ICE](https://zeroc.com/products/ice) 161 | 162 | 参考: 163 | [RPC框架性能基本比较测试](http://www.useopen.net/blog/2015/rpc-performance.html) 164 | 165 | 166 | ## 参考链接 167 | 168 | 1. [老司机的微服务架构实现,照亮你的人生](http://mp.weixin.qq.com/s?__biz=MzI3MzEzMDI1OQ==&mid=2651815259&idx=1&sn=52c3a730dbbcda8b878de718c79a17e0&chksm=f0dc2b27c7aba23177fa9370028fcc5ddf75edc9ea5cb7bb4f359d74b5aa6d9b5b9af2b66ef6&scene=21#wechat_redirect) 169 | 2. [分布式跟踪系统调研](http://www.zenlife.tk/distributed-tracing.md) 170 | 3. [分布式行为追踪系统-Zipkin](http://nezhazheng.com/2014/01/14/try-zipkin.html) 171 | 4. [阿里RPC开源框架](http://dubbo.io/) 172 | -------------------------------------------------------------------------------- /distribution_trans.md: -------------------------------------------------------------------------------- 1 | # 分布式事务一致性 2 | 3 | ## 1. 概述 4 | 5 | ![微服务最终一致性](img/microservice_consistence.jpg) 6 | 7 | 为了保证各种异常情况都能得到处理,系统最终还应该设置对账系统进行一致性检查。 8 | 9 | ## 2. 模式详解 10 | ### 2.1 可靠事件模式 11 | 12 | 可靠事件模式是通过消息机制的传递来保证事件的最终处理一致性,主要有两点要求: 13 | 14 | 1. 可靠的消息传递机制,保证产生的事件消息能够准确地到达后续处理业务,至少有一次的传递。 15 | 2. 避免重复事件消息处理:后续业务作为消费者可能消费到重复的事件消息,要保证重复消息处理的幂等性。 16 | 17 | 一般来说对于消息传递过程需要添加一个中间事件表,以保证业务崩溃或者网络出错的情况下,未能够投递的事件消息可以重新投递或者取消,可以采用本地事件表和外部事件表两种方式。 18 | 19 | 本地事件表: 20 | 21 | ![local](img/event_local_table.jpg) 22 | 23 | 外部事件表: 24 | 25 | ![remote](img/event_remote_table.jpg) 26 | 27 | ### 2.2 补偿模式 28 | 29 | 增加一个协调服务(补偿框架),协调服务负责生成全局唯一的业务流水号,并将业务处理流程中的数据保存到相关表中(大表或者关联表);对于出现异常的业务,根据业务表定位出需要补偿的范围,并启动补偿业务流程,通过重试来保证补偿过程的完整。 30 | 31 | 对于重试机制设置合适的策略: 32 | 1. 重置重试,业务失败 33 | 2. 立即重试,罕见错误(数据格式或者网络出错) 34 | 3. 
等待重试,系统繁忙 http500 35 | 36 | 37 | ![2](img/consistency_compensation.jpg) 38 | 39 | 备注: 因为补偿模式确定补偿范围的复杂性和资源不具备隔离性,因此应当从业务层面给予规避,尽量提供出其他备选方案而不是进行补偿。 40 | 41 | ### 2.3 TCC(Try-Confirm-Cancel)模式 42 | TCC模式是优化的补偿模式。TCC模式在一定程度上弥补了补偿模式中资源不具备隔离性的缺陷,在TCC模式中直到明确的confirm动作,所有的业务操作都是隔离的(由业务层面保证)。另外工作服务可以通过指定try操作的超时时间,主动的cancel预留的业务资源,从而实现自治的微服务。 43 | 44 | ![tcc_confirm](img/cs_tcc_confirm.jpg) 45 | 46 | ![tcc_cancel](img/cs_tcc_cancel.jpg) 47 | 48 | ![tcc_exception](img/cs_tcc_exception.jpg) 49 | 50 | 服务提交confirm失败(比如网络故障),那么就会出现不一致,一般称为heuristic exception。 51 | 需要说明的是为保证业务成功率,业务服务向TCC服务框架提交confirm以及TCC服务框架向工作服务提交confirm/cancel时都要支持重试,这也就要confirm/cancel的实现必须具有幂等性。如果业务服务向TCC服务框架提交confirm/cancel失败,不会导致不一致,因为服务最后都会超时而取消。 52 | 53 | 另外heuristic exception是不可杜绝的,但是可以通过设置合适的超时时间,以及重试频率和监控措施使得出现这个异常的可能性降低到很小。如果出现了heuristic exception是可以通过人工的手段补救的。 54 | 55 | 相关链接: 56 | 57 | 1. [微服务架构下的数据一致性保证(一)](http://mp.weixin.qq.com/s?__biz=MzI5MDEzMzg5Nw==&mid=2660392782&idx=1&sn=d28e43bf6f7cf140eed9fffcf2f29e86&mpshare=1&scene=1&srcid=03125Ta3vkcVeYzePBZ4HYba#rd) 58 | 2. [微服务架构下的数据一致性保证(二)](http://mp.weixin.qq.com/s?__biz=MzI5MDEzMzg5Nw==&mid=2660392867&idx=1&sn=7f751483271fbe2b25d103df1eb45977&mpshare=1&scene=1&srcid=0311P03RSqbzRmzCG2IN8Nak#rd) 59 | 3. [微服务架构下的数据一致性保证(三):补偿模式](http://mp.weixin.qq.com/s?__biz=MzI5MDEzMzg5Nw==&mid=2660392948&idx=1&sn=11602f1258af8bbf88322558aa8a2f21&mpshare=1&scene=1&srcid=0311lBAsg4FfdDRAKXJeUxYo#rd) -------------------------------------------------------------------------------- /docker_store_image.md: -------------------------------------------------------------------------------- 1 | # Docker Image Storage 2 | 3 | ## 1. 
Docker 基本信息 4 | 5 | 安装Docker后目录结构, 采用overlay driver 6 | 7 | 17.03.1-ce 安装需要CentOS 7.3以上版本。 8 | 9 | 10 | 11 | ``` 12 | # cat /etc/redhat-release 13 | CentOS Linux release 7.3.1611 (Core) 14 | ``` 15 | 16 | ``` 17 | # docker version 18 | Client: 19 | Version: 17.03.1-ce 20 | API version: 1.27 21 | Go version: go1.7.5 22 | Git commit: c6d412e 23 | Built: Mon Mar 27 17:05:44 2017 24 | OS/Arch: linux/amd64 25 | 26 | Server: 27 | Version: 17.03.1-ce 28 | API version: 1.27 (minimum version 1.12) 29 | Go version: go1.7.5 30 | Git commit: c6d412e 31 | Built: Mon Mar 27 17:05:44 2017 32 | OS/Arch: linux/amd64 33 | Experimental: false 34 | 35 | ``` 36 | 37 | 38 | ``` 39 | # docker info 40 | 41 | Containers: 0 42 | Running: 0 43 | Paused: 0 44 | Stopped: 0 45 | Images: 1 46 | Server Version: 17.03.1-ce 47 | Storage Driver: overlay ------- overlay (overlay2 kernel >= 4.0) 48 | ------------------------ 49 | Backing Filesystem: xfs 50 | Supports d_type: true 51 | Logging Driver: json-file 52 | Cgroup Driver: cgroupfs 53 | Plugins: 54 | Volume: local 55 | Network: bridge host macvlan null overlay 56 | Swarm: inactive 57 | Runtimes: runc 58 | Default Runtime: runc 59 | Init Binary: docker-init 60 | containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc 61 | runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe 62 | init version: 949e6fa 63 | Security Options: 64 | seccomp 65 | Profile: default 66 | Kernel Version: 3.10.0-514.el7.x86_64 67 | Operating System: CentOS Linux 7 (Core) 68 | OSType: linux 69 | Architecture: x86_64 70 | CPUs: 1 71 | Total Memory: 976.5 MiB 72 | Name: localhost.localdomain 73 | ID: 6OUR:TNDP:NVJK:M2PM:GPC3:WGTR:EWVB:2GQT:2HHJ:3WYJ:E6HV:L5EG 74 | Docker Root Dir: /var/lib/docker 75 | Debug Mode (client): false 76 | Debug Mode (server): false 77 | Registry: https://index.docker.io/v1/ 78 | Experimental: false 79 | Insecure Registries: 80 | 127.0.0.0/8 81 | Live Restore Enabled: false 82 | 83 | ``` 84 | 85 | ## 2. 
Overlay 86 | 87 | ![overlay](https://docs.docker.com/engine/userguide/storagedriver/images/overlay_constructs.jpg) 88 | 89 | 因为 overlay driver 工作在单个层上,因此 90 | multi-layered 的实现需要采用hard link,在overlay2中已经natively支持multiple lower OverlayFS layers,最大到128层,因此在某些命令上效率更高。 91 | 92 | > While the overlay driver only works with a single lower OverlayFS layer and hence requires hard links for implementation of multi-layered images, the overlay2 driver natively supports multiple lower OverlayFS layers (up to 128). 93 | > 94 | > Hence the overlay2 driver offers better performance for layer-related docker commands (e.g. docker build and docker commit), and consumes fewer inodes than the overlay driver. 95 | > 96 | > Inode limits. Use of the overlay storage driver can cause excessive inode consumption. This is especially so as the number of images and containers on the Docker host grows. A Docker host with a large number of images and lots of started and stopped containers can quickly run out of inodes. The overlay2 does not have such an issue. 
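上面引用提到 overlay 驱动依靠硬链接(hard link)在多层镜像之间复用文件,因而可能大量消耗 inode;硬链接共享同一 inode 这一行为,可以在 Linux 上用下面几条命令直观验证(文件名仅为演示假设):

```shell
# 在临时目录演示硬链接:两个路径指向同一个 inode,数据只占一份
cd "$(mktemp -d)"
echo hello > layer_file
ln layer_file layer_file.link          # 创建硬链接,不新增 inode
stat -c %i layer_file layer_file.link  # 两行输出相同的 inode 号
rm -f layer_file
cat layer_file.link                    # 删除原路径后数据仍可读:hello
```

注:`stat -c %i` 是 GNU coreutils 的用法;overlay2 因为原生支持多 lower 层,不再需要为每层建大量硬链接,所以没有 inode 耗尽的问题。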
97 | 98 | Image From [Docker and OverlayFS in practice](https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/#image-layering-and-sharing-with-overlayfs-overlay) 99 | 100 | ``` 101 | # mount | grep overlay 102 | /dev/sda3 on /var/lib/docker/overlay type xfs (rw,relatime,seclabel,attr2,inode64,noquota) 103 | ``` 104 | 105 | Link 106 | 107 | * [Select a storage driver](https://docs.docker.com/engine/userguide/storagedriver/selectadriver/) 108 | * [image/spec/v1.2.md](https://github.com/docker/docker/blob/master/image/spec/v1.2.md) 109 | * [The new stored format of Docker image on disk and Distribution](http://hustcat.github.io/docker-image-new-format/) 110 | 111 | ``` 112 | tree /var/lib/docker/ 113 | /var/lib/docker/ 114 | |-- containers 115 | |-- image 116 | | `-- overlay 117 | | |-- distribution 118 | | |-- imagedb 119 | | | |-- content 120 | | | | `-- sha256 121 | | | `-- metadata 122 | | | `-- sha256 123 | | |-- layerdb 124 | | `-- repositories.json 125 | |-- network 126 | | `-- files 127 | | `-- local-kv.db 128 | |-- overlay 129 | |-- plugins 130 | | |-- storage 131 | | | `-- blobs 132 | | | `-- tmp 133 | | `-- tmp 134 | |-- swarm 135 | |-- tmp 136 | |-- trust 137 | `-- volumes 138 | `-- metadata.db 139 | ``` 140 | 141 | ## 3. Docker Image 142 | ### 3.1 Pull busybox Image 143 | 144 | ``` 145 | docker pull busybox 146 | Using default tag: latest 147 | Trying to pull repository docker.io/library/busybox ... 
148 | latest: Pulling from docker.io/library/busybox 149 | 7520415ce762: Pull complete 150 | Digest: sha256:32f093055929dbc23dec4d03e09dfe971f5973a9ca5cf059cbfb644c206aa83f 151 | Status: Downloaded newer image for docker.io/busybox:latest 152 | ``` 153 | 154 | ### 3.2 Pull Image后文件结构 155 | 156 | 省略了与第一次tree相同的内容 157 | 158 | ``` 159 | |-- containers 160 | tree /var/lib/docker/ 161 | /var/lib/docker/ 162 | |-- containers 163 | |-- image 164 | | `-- overlay 165 | | |-- distribution 166 | | | |-- diffid-by-digest hashes of compressed data 167 | | | | `-- sha256 [7520415ce762: Pull complete] 见 docker pull busybox 168 | | | | `-- 7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3 169 | | | | sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9 170 | | | `-- v2metadata-by-diffid 171 | | | `-- sha256 ③ hashes of uncompressed data 172 | | | `-- c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9 173 | | | [{"Digest":"sha256:7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3", --> diffid-by-digest 174 | | | "SourceRepository":"docker.io/library/busybox","HMAC":""}] 175 | | |-- imagedb 176 | | | |-- content 177 | | | | `-- sha256 [IMAGE ID] ① 文件包含了 image json数据,详见下文 SHA256 hash of its configuration JSON 178 | | | | 、 该文件名称为内容的 sha256sum 值,可以通过linux sha256sum 命令进行验证 179 | | | | `-- 00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff 180 | | | `-- metadata 181 | | | `-- sha256 182 | | |-- layerdb 183 | | | |-- sha256 [Layer ID --> v2metadata-by-diffid] 184 | | | | `-- c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9 ④ [ChainID] 只有一层等于 Layer DiffID 否则 ChainID(layerN) = SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN)) 185 | | | | |-- cache-id ⑤ [ccb47fc4077d37cb1c22c1db317b014347807d3cb5d41e2437a623788b438f5e] 186 | | | | |-- diff [sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9] 187 | | | | |-- size [1109996] 188 | | | | `-- tar-split.json.gz Layers must be packed 
and unpacked reproducibly to avoid changing the layer ID, for example by using tar-split to save the tar headers. 189 | | | `-- tmp 190 | | `-- repositories.json 191 | |-- overlay [保存真正数据的目录] 192 | | `-- ccb47fc4077d37cb1c22c1db317b014347807d3cb5d41e2437a623788b438f5e ⑥ 193 | | `-- root ⑦ 真正的文件 194 | | |-- bin 195 | | | |-- arp 196 | | | ............ 197 | | | 198 | | |-- etc 199 | | | |-- group 200 | | | |-- localtime 201 | | | |-- passwd 202 | | | `-- shadow 203 | | |-- home 204 | | |-- root 205 | | |-- tmp 206 | | |-- usr 207 | | | `-- sbin 208 | | `-- var 209 | | |-- spool 210 | | | `-- mail 211 | | `-- www 212 | ``` 213 | 214 | 215 | ``` 216 | docker images --no-trunc 217 | REPOSITORY TAG IMAGE ID CREATED SIZE 218 | busybox latest sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff 4 weeks ago 1.11 MB 219 | ``` 220 | 221 | ``` 222 | docker rmi docker.io/busybox:latest 223 | Untagged: docker.io/busybox:latest 224 | Deleted: sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff 225 | Deleted: sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9 226 | ``` 227 | 228 | ``` 229 | docker pull busybox 230 | Using default tag: latest 231 | latest: Pulling from library/busybox 232 | Digest: sha256:32f093055929dbc23dec4d03e09dfe971f5973a9ca5cf059cbfb644c206aa83f [tag digest]--> repositories.json 233 | Status: Image is up to date for busybox:latest 234 | ``` 235 | 236 | ``` 237 | # cat repositories.json | python -m json.tool 238 | { 239 | "Repositories": { 240 | "busybox": { 241 | "busybox:latest": "sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff", 242 | "busybox@sha256:32f093055929dbc23dec4d03e09dfe971f5973a9ca5cf059cbfb644c206aa83f": "sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff" 243 | } 244 | } 245 | } 246 | 247 | ``` 248 | 249 | 250 | ``` 251 | # cat image/overlay/imagedb/content/sha256/00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff | python -m 
json.tool 252 | { 253 | "architecture": "amd64", 254 | "config": { 255 | "AttachStderr": false, 256 | "AttachStdin": false, 257 | "AttachStdout": false, 258 | "Cmd": [ 259 | "sh" 260 | ], 261 | "Domainname": "", 262 | "Entrypoint": null, 263 | "Env": [ 264 | "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" 265 | ], 266 | "Hostname": "1295ff10ed92", 267 | "Image": "sha256:0d7e86beb406ca2ff3418fa5db5e25dd6f60fe7265d68a9a141a2aed005b1ae7", 268 | "Labels": {}, 269 | "OnBuild": null, 270 | "OpenStdin": false, 271 | "StdinOnce": false, 272 | "Tty": false, 273 | "User": "", 274 | "Volumes": null, 275 | "WorkingDir": "" 276 | }, 277 | "container": "d12e9fb4928df60ac71b4b47d56b9b6aec383cccceb3b9275029959403ab4f73", 278 | "container_config": { 279 | "AttachStderr": false, 280 | "AttachStdin": false, 281 | "AttachStdout": false, 282 | "Cmd": [ 283 | "/bin/sh", 284 | "-c", 285 | "#(nop) ", 286 | "CMD [\"sh\"]" 287 | ], 288 | "Domainname": "", 289 | "Entrypoint": null, 290 | "Env": [ 291 | "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" 292 | ], 293 | "Hostname": "1295ff10ed92", 294 | "Image": "sha256:0d7e86beb406ca2ff3418fa5db5e25dd6f60fe7265d68a9a141a2aed005b1ae7", 295 | "Labels": {}, 296 | "OnBuild": null, 297 | "OpenStdin": false, 298 | "StdinOnce": false, 299 | "Tty": false, 300 | "User": "", 301 | "Volumes": null, 302 | "WorkingDir": "" 303 | }, 304 | "created": "2017-03-09T18:28:04.586987216Z", 305 | "docker_version": "1.12.6", 306 | "history": [ 307 | { 308 | "created": "2017-03-09T18:28:03.975884948Z", 309 | "created_by": "/bin/sh -c #(nop) ADD file:c9ecd8ff00c653fb652ad5a0a9215e1f467f0cd9933653b8a2e5e475b68597ab in / " 310 | }, 311 | { 312 | "created": "2017-03-09T18:28:04.586987216Z", 313 | "created_by": "/bin/sh -c #(nop) CMD [\"sh\"]", 314 | "empty_layer": true 315 | } 316 | ], 317 | "os": "linux", 318 | "rootfs": { 319 | "diff_ids": [ ② --> v2metadata-by-diffid/sha256/xxxxx 320 | 
"sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9" 321 | ], 322 | "type": "layers" 323 | } 324 | } 325 | ``` 326 | 327 | See: 328 | 329 | * [Docker 简析 1 - 镜像与容器](http://ryerh.com/2017/03/17/docker-inside-section-1-image-and-container.html) 330 | * [docker pull命令实现与镜像存储(3)](http://blog.csdn.net/idwtwt/article/details/53493745) 331 | * [10张图带你深入理解Docker容器和镜像](http://dockone.io/article/783) [english](http://merrigrove.blogspot.jp/2015/10/visualizing-docker-containers-and-images.html) 332 | 333 | 334 | ## 4. Docker Container 335 | 336 | ### 4.1 Create Container 337 | 338 | ``` 339 | # docker create -it --name busybox busybox 340 | 6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d 341 | ``` 342 | 343 | The resulting file changes are as follows: 344 | 345 | ``` 346 | tree 347 | |-- containers 348 | | `-- 6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d 349 | | |-- checkpoints [empty] 350 | | |-- config.v2.json 351 | | `-- hostconfig.json 352 | |-- image 353 | | `-- overlay 354 | | |-- layerdb 355 | | | |-- mounts 356 | | | | `-- 6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d 357 | | | | |-- init-id [acb73196bd7d563a3a167b61e564a6597d63dad559c0731343d030d5707f06a5-init] 358 | | | | |-- mount-id [acb73196bd7d563a3a167b61e564a6597d63dad559c0731343d030d5707f06a5] 359 | | | | `-- parent [sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9] 360 | |-- overlay 361 | | |-- acb73196bd7d563a3a167b61e564a6597d63dad559c0731343d030d5707f06a5 362 | | | |-- lower-id [ccb47fc4077d37cb1c22c1db317b014347807d3cb5d41e2437a623788b438f5e] 363 | | | |-- merged [empty] 364 | | | |-- upper 365 | | | | |-- dev 366 | | | | | |-- console 367 | | | | | |-- pts 368 | | | | | `-- shm 369 | | | | |-- etc 370 | | | | | |-- hostname [empty] 371 | | | | | |-- hosts [empty] 372 | | | | | |-- mtab -> /proc/mounts 373 | | | | | `-- resolv.conf [empty] 374 | | | | |-- proc 375 | | | | `-- sys 376 | | | `-- work 377 | | | `-- work 378 | | |-- 
acb73196bd7d563a3a167b61e564a6597d63dad559c0731343d030d5707f06a5-init 379 | | | |-- lower-id [ccb47fc4077d37cb1c22c1db317b014347807d3cb5d41e2437a623788b438f5e] 380 | | | |-- merged 381 | | | |-- upper 382 | | | | |-- dev 383 | | | | | |-- console 384 | | | | | |-- pts 385 | | | | | `-- shm 386 | | | | |-- etc 387 | | | | | |-- hostname [empty] 388 | | | | | |-- hosts [empty] 389 | | | | | |-- mtab -> /proc/mounts 390 | | | | | `-- resolv.conf [empty] 391 | | | | |-- proc 392 | | | | `-- sys 393 | | | `-- work 394 | | | `-- work 395 | 396 | 397 | 70 directories, 406 files 398 | ``` 399 | 400 | 401 | ``` 402 | # cat containers/6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d/config.v2.json | python -m json.tool 403 | { 404 | "AppArmorProfile": "", 405 | "Args": [], 406 | "Config": { 407 | "AttachStderr": true, 408 | "AttachStdin": true, 409 | "AttachStdout": true, 410 | "Cmd": [ 411 | "sh" 412 | ], 413 | "Domainname": "", 414 | "Entrypoint": null, 415 | "Env": [ 416 | "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" 417 | ], 418 | "Hostname": "6b89258c98ed", 419 | "Image": "busybox", 420 | "Labels": {}, 421 | "OnBuild": null, 422 | "OpenStdin": true, 423 | "StdinOnce": true, 424 | "Tty": true, 425 | "User": "", 426 | "Volumes": null, 427 | "WorkingDir": "" 428 | }, 429 | "Created": "2017-04-08T11:50:37.794940171Z", 430 | "Driver": "overlay", 431 | "HasBeenManuallyStopped": false, 432 | "HasBeenStartedBefore": false, 433 | "HostnamePath": "", 434 | "HostsPath": "", 435 | "ID": "6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d", 436 | "Image": "sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff", 437 | "LogPath": "", 438 | "Managed": false, 439 | "MountLabel": "", 440 | "MountPoints": {}, 441 | "Name": "/busybox", 442 | "NetworkSettings": { 443 | "Bridge": "", 444 | "HairpinMode": false, 445 | "HasSwarmEndpoint": false, 446 | "IsAnonymousEndpoint": false, 447 | "LinkLocalIPv6Address": "", 448 | 
"LinkLocalIPv6PrefixLen": 0, 449 | "Networks": { 450 | "bridge": { 451 | "Aliases": null, 452 | "EndpointID": "", 453 | "Gateway": "", 454 | "GlobalIPv6Address": "", 455 | "GlobalIPv6PrefixLen": 0, 456 | "IPAMConfig": null, 457 | "IPAMOperational": false, 458 | "IPAddress": "", 459 | "IPPrefixLen": 0, 460 | "IPv6Gateway": "", 461 | "Links": null, 462 | "MacAddress": "", 463 | "NetworkID": "" 464 | } 465 | }, 466 | "Ports": null, 467 | "SandboxID": "", 468 | "SandboxKey": "", 469 | "SecondaryIPAddresses": null, 470 | "SecondaryIPv6Addresses": null, 471 | "Service": null 472 | }, 473 | "NoNewPrivileges": false, 474 | "Path": "sh", 475 | "ProcessLabel": "", 476 | "ResolvConfPath": "", 477 | "RestartCount": 0, 478 | "SeccompProfile": "", 479 | "SecretReferences": null, 480 | "ShmPath": "", 481 | "State": { 482 | "Dead": false, 483 | "Error": "", 484 | "ExitCode": 0, 485 | "FinishedAt": "0001-01-01T00:00:00Z", 486 | "Health": null, 487 | "OOMKilled": false, 488 | "Paused": false, 489 | "Pid": 0, 490 | "RemovalInProgress": false, 491 | "Restarting": false, 492 | "Running": false, 493 | "StartedAt": "0001-01-01T00:00:00Z" 494 | }, 495 | "StreamConfig": {} 496 | } 497 | 498 | ``` 499 | 500 | ``` 501 | # cat containers/6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d/hostconfig.json | python -m json.tool 502 | { 503 | "AutoRemove": false, 504 | "Binds": null, 505 | "BlkioDeviceReadBps": null, 506 | "BlkioDeviceReadIOps": null, 507 | "BlkioDeviceWriteBps": null, 508 | "BlkioDeviceWriteIOps": null, 509 | "BlkioWeight": 0, 510 | "BlkioWeightDevice": null, 511 | "CapAdd": null, 512 | "CapDrop": null, 513 | "Cgroup": "", 514 | "CgroupParent": "", 515 | "ConsoleSize": [ 516 | 0, 517 | 0 518 | ], 519 | "ContainerIDFile": "", 520 | "CpuCount": 0, 521 | "CpuPercent": 0, 522 | "CpuPeriod": 0, 523 | "CpuQuota": 0, 524 | "CpuRealtimePeriod": 0, 525 | "CpuRealtimeRuntime": 0, 526 | "CpuShares": 0, 527 | "CpusetCpus": "", 528 | "CpusetMems": "", 529 | "Devices": [], 530 | 
"DiskQuota": 0, 531 | "Dns": [], 532 | "DnsOptions": [], 533 | "DnsSearch": [], 534 | "ExtraHosts": null, 535 | "GroupAdd": null, 536 | "IOMaximumBandwidth": 0, 537 | "IOMaximumIOps": 0, 538 | "IpcMode": "", 539 | "Isolation": "", 540 | "KernelMemory": 0, 541 | "Links": [], 542 | "LogConfig": { 543 | "Config": {}, 544 | "Type": "json-file" 545 | }, 546 | "Memory": 0, 547 | "MemoryReservation": 0, 548 | "MemorySwap": 0, 549 | "MemorySwappiness": -1, 550 | "NanoCpus": 0, 551 | "NetworkMode": "default", 552 | "OomKillDisable": false, 553 | "OomScoreAdj": 0, 554 | "PidMode": "", 555 | "PidsLimit": 0, 556 | "PortBindings": {}, 557 | "Privileged": false, 558 | "PublishAllPorts": false, 559 | "ReadonlyRootfs": false, 560 | "RestartPolicy": { 561 | "MaximumRetryCount": 0, 562 | "Name": "no" 563 | }, 564 | "Runtime": "runc", 565 | "SecurityOpt": null, 566 | "ShmSize": 67108864, 567 | "UTSMode": "", 568 | "Ulimits": null, 569 | "UsernsMode": "", 570 | "VolumeDriver": "", 571 | "VolumesFrom": null 572 | } 573 | ``` 574 | 575 | ### 4.2 Start Container 576 | 577 | ``` 578 | # docker start busybox 579 | busybox 580 | ``` 581 | 582 | ``` 583 | # docker ps 584 | CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 585 | 6b89258c98ed busybox "sh" 20 minutes ago Up 3 seconds busybox 586 | 587 | ``` 588 | 589 | The file-system changes: 590 | 591 | ``` 592 | # tree 593 | . 594 | |-- containers 595 | | `-- 6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d 596 | | |-- 6b89258c98ed9f2622aae7bc5f285f179e1f342a774043313634e1642ae84c4d-json.log 597 | | |-- hostname [6b89258c98ed] 598 | | |-- hosts [default hosts] 599 | | |-- resolv.conf [search localdomain\n nameserver 172.16.132.2] 600 | | |-- resolv.conf.hash [ad04558079be07f81ba7495b46cf93dc224f00df70c812d520bb28ca193b503a] 601 | | `-- shm 602 | 603 | |-- overlay 604 | | |-- acb73196bd7d563a3a167b61e564a6597d63dad559c0731343d030d5707f06a5 605 | | | |-- lower-id 606 | | | |-- merged [now populated] 607 | | | | |-- bin 608 | | | | | ...... 
root filesystem; this directory now contains all of the files of the merged root 609 | | | | 610 | ``` 611 | 612 | 613 | ## 5. Docker Registry 614 | 615 | ### 5.1 Image Storage Structure 616 | ``` 617 | # docker push 172.16.132.8:5000/busybox 618 | The push refers to a repository [172.16.132.8:5000/busybox] 619 | c0de73ac9968: Pushed 620 | latest: digest: sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd size: 505 621 | ``` 622 | 623 | ``` 624 | latest: Pulling from 172.16.132.8:5000/busybox 625 | Digest: sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd 626 | Status: Downloaded newer image for 172.16.132.8:5000/busybox:latest 627 | ``` 628 | 629 | ``` 630 | # tree 631 | `-- docker 632 | `-- registry 633 | `-- v2 634 | |-- blobs 635 | | `-- sha256 636 | | |-- 00 637 | | | `-- 00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff 638 | | | `-- data [image json] 639 | | |-- 75 640 | | | `-- 7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3 641 | | | `-- data [gzip->tar->root filesystem] 642 | | `-- 92 643 | | `-- 92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd 644 | | `-- data 645 | `-- repositories 646 | `-- busybox 647 | |-- _layers 648 | | `-- sha256 649 | | |-- 00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff 650 | | | `-- link [sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff] 651 | | `-- 7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3 652 | | `-- link [sha256:7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3] 653 | |-- _manifests 654 | | |-- revisions 655 | | | `-- sha256 656 | | | `-- 92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd 657 | | | `-- link [sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd] 658 | | `-- tags 659 | | `-- latest 660 | | |-- current 661 | | | `-- link [sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd] 662 | | `-- index 663 | | `-- sha256 664 | | `-- 
92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd 665 | | `-- link [sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd] 666 | `-- _uploads 667 | ``` 668 | 669 | ``` 670 | # cat blobs/sha256/00/00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff/data 671 | 672 | { 673 | "architecture": "amd64", 674 | "config": { 675 | "AttachStderr": false, 676 | "AttachStdin": false, 677 | "AttachStdout": false, 678 | "Cmd": [ 679 | "sh" 680 | ], 681 | "Domainname": "", 682 | "Entrypoint": null, 683 | "Env": [ 684 | "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" 685 | ], 686 | "Hostname": "1295ff10ed92", 687 | "Image": "sha256:0d7e86beb406ca2ff3418fa5db5e25dd6f60fe7265d68a9a141a2aed005b1ae7", 688 | "Labels": {}, 689 | "OnBuild": null, 690 | "OpenStdin": false, 691 | "StdinOnce": false, 692 | "Tty": false, 693 | "User": "", 694 | "Volumes": null, 695 | "WorkingDir": "" 696 | }, 697 | "container": "d12e9fb4928df60ac71b4b47d56b9b6aec383cccceb3b9275029959403ab4f73", 698 | "container_config": { 699 | "AttachStderr": false, 700 | "AttachStdin": false, 701 | "AttachStdout": false, 702 | "Cmd": [ 703 | "/bin/sh", 704 | "-c", 705 | "#(nop) ", 706 | "CMD [\"sh\"]" 707 | ], 708 | "Domainname": "", 709 | "Entrypoint": null, 710 | "Env": [ 711 | "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" 712 | ], 713 | "Hostname": "1295ff10ed92", 714 | "Image": "sha256:0d7e86beb406ca2ff3418fa5db5e25dd6f60fe7265d68a9a141a2aed005b1ae7", 715 | "Labels": {}, 716 | "OnBuild": null, 717 | "OpenStdin": false, 718 | "StdinOnce": false, 719 | "Tty": false, 720 | "User": "", 721 | "Volumes": null, 722 | "WorkingDir": "" 723 | }, 724 | "created": "2017-03-09T18:28:04.586987216Z", 725 | "docker_version": "1.12.6", 726 | "history": [ 727 | { 728 | "created": "2017-03-09T18:28:03.975884948Z", 729 | "created_by": "/bin/sh -c #(nop) ADD file:c9ecd8ff00c653fb652ad5a0a9215e1f467f0cd9933653b8a2e5e475b68597ab in / " 730 | }, 731 | { 732 
| "created": "2017-03-09T18:28:04.586987216Z", 733 | "created_by": "/bin/sh -c #(nop) CMD [\"sh\"]", 734 | "empty_layer": true 735 | } 736 | ], 737 | "os": "linux", 738 | "rootfs": { 739 | "diff_ids": [ 740 | "sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9" 741 | ], 742 | "type": "layers" 743 | } 744 | } 745 | ``` 746 | 747 | ``` 748 | # ls -hl blobs/sha256/75/7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3/data 749 | -rw-r--r-- 1 root root 662K Apr 9 02:36 data 750 | 751 | file data 752 | data: gzip compressed data 753 | 754 | mkdir test && cp data test/ && cd test && mv data data.gz && gzip -d data.gz && tar xvf data 755 | 756 | # ls -hl # shows that data stores gzip(tar(rootfs)) 757 | total 1.3M 758 | drwxr-xr-x 2 root root 8.0K Mar 8 16:05 bin 759 | -rw-r--r-- 1 root root 1.3M Apr 9 02:51 data 760 | drwxr-xr-x 2 adm sys 6 Mar 8 16:05 dev 761 | drwxr-xr-x 2 root root 60 Mar 8 16:05 etc 762 | drwxr-xr-x 2 nobody nobody 6 Mar 8 16:05 home 763 | drwxr-xr-x 2 root root 6 Mar 8 16:05 root 764 | drwxrwxrwt 2 root root 6 Mar 8 16:05 tmp 765 | drwxr-xr-x 3 root root 17 Mar 8 16:05 usr 766 | drwxr-xr-x 4 root root 28 Mar 8 16:05 var 767 | ``` 768 | 769 | 770 | ``` 771 | # cat v2/blobs/sha256/92/92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd/data 772 | { 773 | "schemaVersion": 2, 774 | "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 775 | "config": { 776 | "mediaType": "application/octet-stream", 777 | "size": 1465, 778 | "digest": "sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff" 779 | }, 780 | "layers": [ 781 | { 782 | "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 783 | "size": 677607, 784 | "digest": "sha256:7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3" 785 | } 786 | ] 787 | } 788 | ``` 789 | 790 | ### 5.2 Image Pull API 791 | 792 | ``` 793 | GET /v2/busybox/manifests/latest HTTP/1.1 794 | Host: 172.16.132.8:5000 795 | User-Agent: docker/17.03.1-ce 
go/go1.7.5 git-commit/c6d412e kernel/4.9.13-moby os/linux arch/amd64 UpstreamClient(Docker-Client/17.03.1-ce \(darwin\)) 796 | Accept: application/vnd.docker.distribution.manifest.v2+json 797 | Accept: application/vnd.docker.distribution.manifest.list.v2+json 798 | Accept: application/vnd.docker.distribution.manifest.v1+prettyjws 799 | Accept: application/json 800 | Accept-Encoding: gzip 801 | Connection: close 802 | 803 | HTTP/1.1 200 OK 804 | Content-Length: 505 805 | Content-Type: application/vnd.docker.distribution.manifest.v2+json 806 | Docker-Content-Digest: sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd 807 | Docker-Distribution-Api-Version: registry/2.0 808 | Etag: "sha256:92b7c19467bf868e52949d7e5404e309486ba9ee7eb4b88b882747ee1522d3cd" 809 | X-Content-Type-Options: nosniff 810 | Date: Sun, 09 Apr 2017 10:29:55 GMT 811 | Connection: close 812 | 813 | { 814 | "schemaVersion": 2, 815 | "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 816 | "config": { 817 | "mediaType": "application/octet-stream", 818 | "size": 1465, 819 | // Image ID 820 | "digest": "sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff" 821 | }, 822 | "layers": [ 823 | { 824 | "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 825 | "size": 677607, 826 | // Layer diff data 827 | "digest": "sha256:7520415ce76232cdd62ecc345cea5ea44f5b6b144dc62351f2cd2b08382532a3" 828 | } 829 | ] 830 | } 831 | ``` 832 | 833 | Links: 834 | 835 | * [Image Manifest V 2, Schema 2](https://docs.docker.com/registry/spec/manifest-v2-2/#image-manifest-field-descriptions) 836 | * [Docker Registry HTTP API V2](https://docs.docker.com/registry/spec/api/) 837 | * [开启docker之旅](http://blog.suconghou.cn/post/using-docker/) 838 | * [Docker Reference](https://docs.docker.com/reference/) 839 | 840 | 841 | 842 | ### 5.3 Create New Image 843 | 844 | update: 2017.04.24 845 | 846 | 847 | $ cat Dockerfile 848 | FROM busybox 849 | 850 | ADD hello.txt / 851
| 852 | $ cat hello.txt 853 | hello 854 | 855 | $ docker build -t busybox:v1.0 . 856 | 857 | $ docker images 858 | busybox v1.0 7dbf4402b79c 13 minutes ago 1.11 MB 859 | 860 | 861 | 862 | |-- imagedb 863 | | |-- content 864 | | | `-- sha256 865 | | | |-- 00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff 866 | | | `-- 7dbf4402b79c68a58c11d2343f6bd59a22ee52980b3f5d5be29990fa49426bbc newly added: the new image ID 867 | | `-- metadata 868 | | `-- sha256 869 | | `-- 7dbf4402b79c68a58c11d2343f6bd59a22ee52980b3f5d5be29990fa49426bbc 870 | | `-- parent [sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff] 871 | 872 | 873 | |-- layerdb 874 | | |-- sha256 875 | | | |-- 3e1127ba68b26d89fe62c41714bb5ff55a73cc58f8caa399156c074642646ae5 CHAIN_ID 876 | | | | |-- cache-id [22c2ea6bd5847202f9c959538aced6b49ecaa963a72ac8a535c4003375a3a2db] 877 | | | | |-- diff [sha256:794b01691d296c35552ed723fcdc865625149137a7f19f565fed5b123919687f] DIFFID 878 | | | | |-- parent [sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9] PARENT CHAINID 879 | | | | |-- size 880 | | | | `-- tar-split.json.gz 881 | 882 | 883 | 3e1127ba68b26d89fe62c41714bb5ff55a73cc58f8caa399156c074642646ae5 884 | = sha256(parent_chainid + " " + diffid) 885 | = sha256("sha256:c0de73ac99683640bc8f8de5cda9e0e2fc97fe53d78c9fd60ea69b31303efbc9 sha256:794b01691d296c35552ed723fcdc865625149137a7f19f565fed5b123919687f") 886 | -------------------------------------------------------------------------------- /dood_dind_k8s.md: -------------------------------------------------------------------------------- 1 | # DooD or DinD On K8S 2 | 3 | Compared with a plain Docker Container, a Pod in K8S has the following advantages: 4 | 5 | 1. Each Pod has its own address that is routable inside the Cluster 6 | 7 | 2. Containers running in a single Pod share the same network namespace and can reach each other via localhost 8 | 9 | 3. After a Pod finishes, it can be reclaimed by the Cluster 10 | 11 | 12 | 13 | When using Docker for CI/CD, two techniques are commonly used: 14 | 15 | 1. 
**DinD** (Docker-in-Docker): run Docker inside a Container that Docker itself launched with privileged rights, and do version builds or other CI/CD work there; it is fully isolated from the host and is mainly used for tests where multiple environments run at the same time; 16 | 17 | 2. **DooD** (Docker-outside-of-Docker): run a Container that inherits the host's docker.sock, so that any Container it creates becomes a sibling managed by the host's Docker Daemon; this allows it to reuse the host's images and other resources, and it is mainly used for building versions, building images, and pushing images; 18 | 19 | ```shell 20 | $ docker run -d -v /var/run/docker.sock:/var/run/docker.sock \ 21 | -v $(which docker):/usr/bin/docker -p 8080:8080 myjenk 22 | ``` 23 | 24 | 25 | 26 | 27 | 28 | ## 1. Pods and DooD 29 | 30 | ![](http://30ux233xk6rt3h0hse1xnq9f-wpengine.netdna-ssl.com/wp-content/uploads/2017/03/dood-1-1.png) 31 | 32 | 33 | 34 | **Docker Outside Docker** [Running Docker in Jenkins (in Docker)](http://container-solutions.com/running-docker-in-jenkins-in-docker/) 35 | 36 | ```yaml 37 | apiVersion: v1 38 | kind: Pod 39 | metadata: 40 | name: dood 41 | spec: 42 | containers: 43 | - name: docker-cmds 44 | image: docker:1.12.6 45 | command: ['docker', 'run', '-p', '80:80', 'httpd:latest'] 46 | resources: 47 | requests: 48 | cpu: 10m 49 | memory: 256Mi 50 | volumeMounts: 51 | - mountPath: /var/run 52 | name: docker-sock 53 | volumes: 54 | - name: docker-sock 55 | hostPath: 56 | path: /var/run 57 | ``` 58 | 59 | 60 | 61 | ## 2. Pods and DinD 62 | 63 | Running Docker inside Docker achieves complete isolation from the host environment, but it also has a few issues: 64 | 65 | 1. The outer Container must be granted privileged rights (docker run --privileged) 66 | 2. Problems caused by stacking Storage Drivers: the file system the outer Docker runs on (EXT4/BTRFS/AUFS..) and the one the inner Docker uses (EXT4/BTRFS/Device Mapper) may not work when stacked, e.g. AUFS cannot run on top of AUFS; BTRFS on top of BTRFS may appear to work at first but fails with nested volumes; Device Mapper is not namespaced, so multiple Docker instances on the same machine can interfere with each other's Images and Devices; note: ``these issues may already have been fixed as Docker has evolved``. 67 | 3. Images on the host cannot be shared; every run has to pull the needed Images again 68 | 69 | ![](http://30ux233xk6rt3h0hse1xnq9f-wpengine.netdna-ssl.com/wp-content/uploads/2017/03/dind.png) 70 | 71 | 72 | **Pods and DinD** [Using Docker-in-Docker for your CI or testing environment? 
Think twice.](https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/) 73 | 74 | ```yaml 75 | apiVersion: v1 76 | kind: Pod 77 | metadata: 78 | name: dind 79 | spec: 80 | containers: 81 | - name: docker-cmds 82 | image: docker:1.12.6 83 | command: ['docker', 'run', '-p', '80:80', 'httpd:latest'] 84 | resources: 85 | requests: 86 | cpu: 10m 87 | memory: 256Mi 88 | env: 89 | - name: DOCKER_HOST 90 | value: tcp://localhost:2375 91 | - name: dind-daemon 92 | image: docker:1.12.6-dind 93 | resources: 94 | requests: 95 | cpu: 20m 96 | memory: 512Mi 97 | securityContext: 98 | privileged: true 99 | volumeMounts: 100 | - name: docker-graph-storage 101 | mountPath: /var/lib/docker 102 | volumes: 103 | - name: docker-graph-storage 104 | emptyDir: {} 105 | ``` 106 | 107 | 108 | ## 3. References 109 | 110 | 1. [A Case for Docker-in-Docker on Kubernetes One](https://applatix.com/case-docker-docker-kubernetes-part/) 111 | 2. [A Case for Docker-in-Docker on Kubernetes Two](https://applatix.com/case-docker-docker-kubernetes-part-2/) -------------------------------------------------------------------------------- /go_swagger.md: -------------------------------------------------------------------------------- 1 | # go-swagger 2 | 3 | ## 1. 
The Swagger Specification 4 | 5 | The official site of Swagger is [swagger.io](http://swagger.io/). 6 | 7 | The official introduction: 8 | > 9 | > THE WORLD'S MOST POPULAR API FRAMEWORK 10 | > 11 | > Swagger is a powerful open source framework backed by a large ecosystem of tools that helps you design, build, document, and consume your RESTful APIs. 12 | > 13 | 14 | The Swagger specification and framework were started in 2010, driven by Wordnik (the world's largest online English dictionary, later renamed Reverb) as an internal tool. In 2015 SmartBear acquired the open-source Swagger API specification from Reverb. 15 | 16 | In November 2015, SmartBear, the company maintaining the Swagger specification, announced that a new organization, the Open API Initiative, had been founded under the sponsorship of the Linux Foundation, with members including Google, IBM and Microsoft. At the same time, SmartBear donated the Swagger specification to this organization. [1] 17 | 18 | Swagger is a complete specification with mature tool support; it makes it easy to generate API documentation and to implement RESTful APIs quickly and conveniently. The framework provides the Swagger specification (later called the Open API) for RESTful API documents written in JSON or YAML (a human-friendly superset of JSON). 19 | 20 | Versions 1.0, 1.1 and 1.2 were released between 2010 and 2014, each a small improvement over the previous one. Thanks to the continuous effort of Swagger's open working group (formed in May 2014), Swagger 2.0 was finally released in September 2014; 2.0 was the first major revision of the specification. [2] 21 | 22 | Most mainstream languages now provide Swagger support; for the list of supported languages see [Tools and Integrations](http://swagger.io/open-source-integrations/) 23 | 24 | > Swagger editing tools help you create API documents easily and ensure that they conform to the OpenAPI specification. For example, with the [Swagger Editor](http://editor.swagger.io/#/) you can create or import an API document and browse it in an interactive environment. The display pane on the right shows the formatted document, reflecting the edits you make in the code editor in the left pane. The code editor flags all format errors. Each pane can be expanded and collapsed. 25 | This is how the Swagger Editor UI looks after you import the leads.yaml definition: [3] 26 | 27 | ![png](http://www.ibm.com/developerworks/cn/web/wa-use-swagger-to-document-and-define-restful-apis/image001.png) 28 | 29 | Besides the [Swagger Editor](http://editor.swagger.io/#/), [swagger-ui](https://github.com/swagger-api/swagger-ui) is another well-known tool that provides interactive online API documentation. 30 | 31 | 32 | ## 2. 
go-swagger 33 | 34 | GitHub: [go-swagger/go-swagger](https://github.com/go-swagger/go-swagger); the project is sponsored by VMware. Official documentation: [goswagger.io](https://goswagger.io/) 35 | 36 | 37 | ### 2.1 Installation 38 | 39 | Versions 0.5.0 and 0.6.0 have some pkg compatibility problems, so the most stable version at the moment is 0.8.0. **brew install** installs 0.5.0, so installing the static binary is the recommended approach. 40 | 41 |     $ latestv=$(curl -s https://api.github.com/repos/go-swagger/go-swagger/releases/latest | jq -r .tag_name) 42 |     $ curl -o /usr/local/bin/swagger -L'#' https://github.com/go-swagger/go-swagger/releases/download/$latestv/swagger_$(echo `uname`|tr '[:upper:]' '[:lower:]')_amd64 43 |     $ chmod +x /usr/local/bin/swagger 44 | 45 | Then install the go-swagger sources: 46 | 47 |     $ go get -u github.com/go-swagger/go-swagger 48 | 49 | For a quick-start tutorial see [Tutorials](https://goswagger.io/tutorial/todo-list.html); the complete to-do-list server code is in [github.com/go-swagger/go-swagger/examples/tutorials/todo-list/server-complete/cmd/todo-list-server](https://github.com/go-swagger/go-swagger/tree/master/examples/tutorials/todo-list/server-complete). You can then generate the corresponding code and test it. 50 | 51 |     $ cd tutorials/todo-list/server-complete/cmd 52 |     $ go build && ./main # note: go build may report missing open-source libraries; install them with go get 53 | 54 | If the server address is http://127.0.0.1:9090/, the API documentation can be tested at http://127.0.0.1:9090/docs. The default documentation style is Redoc; for the Swagger style, [swagger-ui](https://github.com/swagger-api/swagger-ui) makes UI-based testing convenient. 55 | 56 | Download swagger-ui from the official site, open swagger-ui/dist/index.html directly, then enter http://127.0.0.1:9090/swagger.json to access the API. 57 | 58 | Because of a [cross-origin problem](https://github.com/go-swagger/go-swagger/issues/481), the server-side restapi/configure_todo_list.go code needs to be changed: 59 | 60 |     import "github.com/rs/cors" 61 | 62 |     func setupGlobalMiddleware(handler http.Handler) http.Handler { 63 |         handleCORS := cors.Default().Handler // allow requests with the cors package defaults 64 |         return handleCORS(handler) 65 |     } 66 | 67 | Then enter the URL in the swagger-ui page again. 68 | 69 | ![example](http://www.do1618.com/wp-content/uploads/2016/12/to_do_list_api.png) 70 | 71 | ## References 72 | 73 | 1. 
[OpenAPI Specification](https://en.wikipedia.org/wiki/OpenAPI_Specification) 74 | 2. [通过Swagger进行API设计,与Tony Tam的一次对话](http://www.infoq.com/cn/articles/swagger-interview-tony-tam) 75 | 3. [使用 Swagger 文档化和定义 RESTful API](http://www.ibm.com/developerworks/cn/web/wa-use-swagger-to-document-and-define-restful-apis/index.html) -------------------------------------------------------------------------------- /img/caas.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/caas.jpg -------------------------------------------------------------------------------- /img/cf_garden.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/cf_garden.jpg -------------------------------------------------------------------------------- /img/consistency_compensation.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/consistency_compensation.jpg -------------------------------------------------------------------------------- /img/container_timeline.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/container_timeline.jpg -------------------------------------------------------------------------------- /img/cs_tcc_cancel.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/cs_tcc_cancel.jpg -------------------------------------------------------------------------------- /img/cs_tcc_confirm.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/cs_tcc_confirm.jpg -------------------------------------------------------------------------------- /img/cs_tcc_exception.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/cs_tcc_exception.jpg -------------------------------------------------------------------------------- /img/cst-gateway.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/cst-gateway.png -------------------------------------------------------------------------------- /img/cst.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/cst.png -------------------------------------------------------------------------------- /img/dbs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/dbs.png -------------------------------------------------------------------------------- /img/docker_libcontainer_runc.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/docker_libcontainer_runc.jpg -------------------------------------------------------------------------------- /img/event_local_table.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/event_local_table.jpg -------------------------------------------------------------------------------- /img/event_remote_table.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/event_remote_table.jpg -------------------------------------------------------------------------------- /img/mesos_uc.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/mesos_uc.jpg -------------------------------------------------------------------------------- /img/micro_srv_avoid_snow_slide.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/micro_srv_avoid_snow_slide.jpg -------------------------------------------------------------------------------- /img/microservice_consistence.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/microservice_consistence.jpg -------------------------------------------------------------------------------- /img/no_docker_runc.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/no_docker_runc.jpg -------------------------------------------------------------------------------- /img/runc.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/runc.png -------------------------------------------------------------------------------- /img/sso.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/img/sso.png -------------------------------------------------------------------------------- /what-is-serverless.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DavadDi/micro_services_arch/e76e72c5be69fa66be58d5223c591ff597125957/what-is-serverless.pdf --------------------------------------------------------------------------------