├── Auth ├── Auth-README.md ├── Session │ └── Session-README.md └── JWT │ └── JWT-README.md ├── .gitignore ├── Cache ├── Cache-README.md ├── Cache-Delete-Cache.md └── Cache-Set-Cache.md ├── Question ├── Question-README.md └── Question-Benchmark.md ├── Code ├── Code-README.md └── Code-Array.md ├── Database ├── Database-README.md ├── Database-MySQL-README.md ├── Database-MySQL-Server.md ├── Database-MySQL-Transaction.md ├── Database-MySQL-DataType.md ├── Database-MySQL-Lock-Table.md ├── Database-MySQL-Code-SQL.md ├── Database-Data-Permutations.md ├── Database-MySQL-SELECT-SQL-Explain-Test.md └── Database-MySQL-Index.md ├── Reference ├── Reference-README.md ├── Reference-Books.md ├── Reference-Video.md ├── Reference-Tool.md └── Reference-Article.md ├── Architecture-Evolution ├── Architecture-Evolution-README.md ├── Architecture-Evolution-Partial-Page-Cache.md ├── Architecture-Evolution-Add-Database-Server.md ├── Architecture-Evolution-Static-Page-Cache.md ├── Architecture-Evolution-High-Useability-Service-Structure.md ├── Architecture-Evolution-Data-Distribute.md ├── Architecture-Evolution-Add-Web-Server.md ├── Architecture-Evolution-Add-More-Web-Server.md ├── Architecture-Evolution-Break-Up.md ├── Architecture-Evolution-Set-Database-Read-Write-Server.md ├── Architecture-Evolution-All-Together.md └── Architecture-Evolution-Dynamic-Data-Cache.md ├── book.json ├── README.md ├── SUMMARY.md ├── File └── File-README.md ├── High-Scaling-Websites-Structure-Concept.md └── LICENSE /Auth/Auth-README.md: -------------------------------------------------------------------------------- 1 | # 授權 2 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /_book/ 2 | /node_modules/ 3 | -------------------------------------------------------------------------------- /Cache/Cache-README.md: -------------------------------------------------------------------------------- 1 | # 快取 2 | 3 | 在這裡會介紹怎麼樣用快取去加快資料存取的效率 4 | -------------------------------------------------------------------------------- /Question/Question-README.md: -------------------------------------------------------------------------------- 1 | # 常見問題 2 | 3 | 這裏會解說一些效能相關的問題 4 | -------------------------------------------------------------------------------- /Code/Code-README.md: -------------------------------------------------------------------------------- 1 | # 程式 2 | 3 | 在這裡會介紹程式中要怎麼控制去快速的處理資料,提高程式的存取效率 4 | -------------------------------------------------------------------------------- /Database/Database-README.md: -------------------------------------------------------------------------------- 1 | # 資料庫 2 | 3 | 在這裡會介紹一些如何提高查詢資料庫效率的相關技巧 4 | -------------------------------------------------------------------------------- /Reference/Reference-README.md: -------------------------------------------------------------------------------- 1 | # 參考資料 2 | 3 | 這裏會紀錄一些相關的參考資料,可以去做進一步的學習及查詢 4 | -------------------------------------------------------------------------------- /Database/Database-MySQL-README.md: -------------------------------------------------------------------------------- 1 | # MySQL 2 | 3 | 在這裡會介紹一些在 MySQL 提高存取效率的一些技巧及方法 4 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-README.md: -------------------------------------------------------------------------------- 1 | # 網站架構演進 2 | 3 | 這裏會介紹從小型網站到大型網站整個的架構,是怎麼規劃及演進的。 4 | -------------------------------------------------------------------------------- /Reference/Reference-Books.md: -------------------------------------------------------------------------------- 1 | # 參考書籍 2 | 3 | ## MySQL 4 | * [High Performance MySQL, 2nd Edition原文書免費下載](http://it-ebooks.info/book/256/) 5 | -------------------------------------------------------------------------------- /Reference/Reference-Video.md: -------------------------------------------------------------------------------- 1 | # 參考影片 2 | 3 | ## 流量分散式處理 4 | * [Load Balancing Stateless vs. Stateful Applications](https://www.youtube.com/watch?v=NrMM3s7Mbjo) 5 | -------------------------------------------------------------------------------- /Reference/Reference-Tool.md: -------------------------------------------------------------------------------- 1 | # 參考工具 2 | 3 | ## JavaScript 4 | * [jsPerf — JavaScript performance playground](http://jsperf.com/) 5 | 6 | ## Database 7 | 8 | ### 連線池 9 | * [Database Connection Pool Library | Libzdb](http://www.tildeslash.com/libzdb/#) 10 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Partial-Page-Cache.md: -------------------------------------------------------------------------------- 1 | # 頁面片段暫存 2 | 3 | 在我們將靜態頁面暫存起來後,減少了很多對資料庫不必要的存取,`回應時間` 提升了了很多。 4 | 5 | 但我們也會想看看能不能在動態產生的頁面中,對於部分的很少變動的資料也做暫存的處理,這樣的話就可以減少部分動態網頁頁面不必要的存取了。 6 | 7 | ***架構圖:*** 8 | 9 | ![頁面片段暫存](http://i.imgur.com/YfClhCO.png) 10 | 11 | ## 參考資料 12 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 13 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Add-Database-Server.md: -------------------------------------------------------------------------------- 1 | # 增加 Database Server 2 | 3 | 在網站成長之後一段時間後,發現資料庫在查詢(SELECT)及資料異動(INSERT、UPDATE、DELETE)之間,因為查詢處理變得太多,使得互相競爭資料庫的資源非常嚴重,進而導致系統回應時間變得很慢。 4 | 5 | 這時候我們可能會想要把不同的資料表分散在不同的資料庫當中,分散查詢處理的資源,而會需要更改程式讀取資料庫的邏輯架構。 6 | 7 | ***架構圖:*** 8 | 9 | ![增加 Database Server](http://i.imgur.com/zxlXQLx.png) 10 | 11 | 12 | ## 參考資料 13 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 14 | -------------------------------------------------------------------------------- /Database/Database-MySQL-Server.md: -------------------------------------------------------------------------------- 1 | # 伺服器 2 | 3 | ## 設定 4 | 5 | 6 | ### max_allowed_packet:允許最大封包 7 | 8 | 在INSERT或UPDATE的時候,因為封包資料過大,導致錯誤發生 9 | 10 | ### max_connect_errors:最大連線錯誤 11 | 12 | 設 4294967295,可以避免當 client (像是 php) 發生太多錯誤時被 block 住。 13 | 14 | > default 0,請設定一個最大的值即可,32位元的系統 最大是 4bytes 4294967295、64位元的系統最大是 18446744073709547520 15 | 16 | ## 參考資料 17 | * [note: mysql 跟效能有關的設定 @ mini box 迷你小盒子 :: nidBox親子盒子](http://mini.nidbox.com/diary/read/7451092) 18 | -------------------------------------------------------------------------------- /book.json: -------------------------------------------------------------------------------- 1 | { 2 | "plugins": ["disqus","ga", "adsense"], 3 | "pluginsConfig": { 4 | "disqus": { 5 | "shortName": "kejyun-high-scaling-websites-structure" 6 | }, 7 | "ga": { 8 | "token": "UA-31063691-11" 9 | }, 10 | "adsense": { 11 | "client": "ca-pub-1990193713845546", 12 | "slot": "2858259611", 13 | "format": "auto", 14 | "element": ".page-inner section", 15 | "position": "top" 16 | } 17 | } 18 | } -------------------------------------------------------------------------------- /Auth/Session/Session-README.md: -------------------------------------------------------------------------------- 1 | # Session 2 | 3 | ## Sticky Session / Cookie stickiness 4 | 5 | 當服務有超過 1 台以上的主機時,在主機前面會使用 Load balancer 去做 Request 的分流,假設有 3 台主機,同一個使用者的 Session 就會分散在不同的主機上,會導致使用者在不同主機的 Session 資料不一樣,可能使用者在主機 1 登入狀態,在主機 2 卻是登出的狀態。 6 | 7 | 可以使用 Sticky Session,讓同一個使用者的 Request 都指向固定一台實體主機,這樣可保持使用者登入的狀態 8 | 9 | 10 | ## 參考資料 11 | * [Sticky and NON-Sticky sessions - Stack Overflow](https://stackoverflow.com/questions/10494431/sticky-and-non-sticky-sessions) 12 | * [New Elastic Load Balancing Feature: Sticky Sessions | AWS News Blog](https://aws.amazon.com/blogs/aws/new-elastic-load-balancing-feature-sticky-sessions/) 13 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Static-Page-Cache.md: -------------------------------------------------------------------------------- 1 | # 前端靜態頁面暫存 2 | 3 | 4 | 當越來越多人來存取你的應用時,你會發現你服務的 `反應時間` 又開始變慢了,你可以會發現 `Web` 機器效能的使用沒有 `Database` 那麼大,就會發現現在的瓶頸應該是卡在 `Database` 有太多人去進行存取了。 5 | 6 | 我麼這時候會試著將頁面中很少異動的頁面(大約 1~2 天才會更新的頁面),做成靜態頁面的快取,在撈取資料庫資料時把資料產生成靜態 HTML 檔案,當下次再次的讀取相同資料時,則直接將靜態的 HTML 回傳,減少資料庫的存取,提高存取資料庫的效率(有需要再去進行查詢)。 7 | 8 | ***架構圖:*** 9 | 10 | ![前端靜態頁面暫存](http://i.imgur.com/20Zr88T.png) 11 | 12 | ## 注意事項 13 | 14 | ### 整頁靜態頁不可頻繁的修改 15 | 16 | 因為會將整個頁面都做是要給使用者看到的整個結果頁,所以頁面的資料若時常修改的話,則勢必要一直重新產生新的靜態頁面,但這樣就失去了做靜態頁面暫存的意義了 17 | 18 | ## 參考資料 19 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 20 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-High-Useability-Service-Structure.md: -------------------------------------------------------------------------------- 1 | # 高可用性的服務架構 2 | 3 | 透過不斷的增加 Web Server 就可以提高使用者訪問量,但這樣的架構非常的龐大,當要對這樣的架構進行改動時會相當的不方便,可用性變得不高。 4 | 5 | 而且部署機器和維護也必變得相當麻煩,龐大的應用服務架構要在 N 台機器上進行複製、啟動都需要花很多的時間,當機器出問題時也很難立即找出拿台機器出了問題,更有可能是某個應用服務的程式出現 Bug,導致整個站掛掉都沒辦法使用了。 6 | 7 | 在優化調校時也比較難操作,因為部署的機器每一台都需要進襲調校,沒辦法只進行針對性的調校。 8 | 9 | 餘是為了解決這樣的問題,就發展出大型的分佈式應用,而這樣的架構可能會遇到許多挑戰: 10 | 11 | * 分布式應用需要提供高效能且穩定的架構 12 | 13 | * 將龐大的應用拆分出來需要耗費很長的時間,而且需要對業務的整理和系統依賴關係進行控制 14 | 15 | * 如何維護(依賴管理、執行狀況管理、錯誤追踪、調校優化、監控和示警等)好這個龐大的分佈式應用。 16 | 17 | 完成這一步後,系統的架構差不多會到相對穩定的階段,也會使用大量便宜的機器去支撐突如其來的訪問流量 18 | 19 | 20 | # 參考資料 21 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 22 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Data-Distribute.md: -------------------------------------------------------------------------------- 1 | # 分散式資料 2 | 3 | 在做資料庫分散之後,若單一資料表的資料太多,我們會想要用分散資料庫的概念,對資料表中的資料進行分散式處理,而會再需要更改程式讀取資料庫及資料表的邏輯架構。 4 | 5 | 而在這我們就需要規劃資料分散在多個資料庫及資料表的架構,要怎麼去存取,我們通常會設計一個統一的資料庫及資料表資源分散的統一架構,去進行管理資源的存放及存取方式。 6 | 7 | 像是 Facebook 每天有好幾百萬千萬的文章資訊會被發表出來,我們可能會對各個使用者做文章存取的資源分配,將使用者的資源分配記錄記在統一個資源管理架構中做記錄,紀錄的資訊可能會是 User A 的文章資料存放在 DB_1 及 TABLE_2,而 User B 的文章資料存放在 DB_3 及 TABLE_1,當要存取個別使用者的資料資料,根據資源管理分配的的方式,分別去不同的資料庫及資料表撈取資料。 8 | 9 | 而當需要使用分散式資料的情況下,可能也會發現因為資料量過大,導致快取同步的方式出了問題,讓快取不能只存放在本地端而用同步的方式進行分享,可能需要採用分散式快取的方式去快取資料。 10 | 11 | ***架構圖:*** 12 | 13 | ![增加 Database Server](http://i.imgur.com/ky9Y29v.png) 14 | 15 | 16 | ## 參考資料 17 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 18 | -------------------------------------------------------------------------------- /Question/Question-Benchmark.md: -------------------------------------------------------------------------------- 1 | # 壓力測試 2 | 3 | 在我們寫完程式並上了伺服器後,若我們的網站需要服務很多人,我們可能會想要這個服務能夠服務多少人,通常我們會對伺服器進行壓力測試。 4 | 5 | 壓力測試的軟體有很多,我比較常用的是 `Apache AB test` ,簡單的了解伺服器在多少連線下,處理請求(Request)的速度及效能,而同樣的機器在不同的程式處理之下,會有不同的效能,所以沒辦法統一的斷定哪一種機器只能服務多少人,若程式寫的好的話,可能同時可以應付 4、500 人同使請求,但也有程式可能因為效能不佳,僅能服務 4、50 人(伺服器回應時間標準一樣),所以要對自己的應用服務做壓力測試後才知道狀況。 6 | 7 | 也有很多其他的壓力測試的軟體,只要找出一套適合自己的壓測工具就可以了。 8 | 9 | ## 參考資料 10 | * [ab - Apache HTTP server benchmarking tool](http://httpd.apache.org/docs/2.2/programs/ab.html) 11 | * [The Will Will Web | 使用 ApacheBench 進行網站的壓力測試](http://blog.miniasp.com/post/2008/06/30/Using-ApacheBench-ab-to-to-Web-stress-test.aspx) 12 | * [Web server benchmarking - Wikipedia, the free encyclopedia](http://en.wikipedia.org/wiki/Web_server_benchmarking) 13 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Add-Web-Server.md: -------------------------------------------------------------------------------- 1 | # 增加 Web Server 2 | 3 | 隨著網站應用使用者增加,我們會發現 Web Server 機器的壓力變得比較高了,這個時候就會開始考慮增加 Web Server 了。增加 Web Server 除了可以降低機器的壓力外,同時也可以在 Web Server 掛掉後,有備援機器可以使用,避免整個服務完全中斷。 4 | 5 | 在增加 Web Server 之後,會碰到一些問題: 6 | 7 | * 如何將流量平均分配到不同的 Web Server 8 | 9 | > 通常我們會使用 Apache 的負載平衡方法,或是 LVS 相關的負載平衡方法去分配這些流量 10 | 11 | * 如何讓 Web Server 的資料同步 12 | 13 | > 像是使用者的 session 在不同機器要如何同步,可能會考慮使用資料庫、cookie 或同步 Session 的機制 14 | 15 | * 快取(cache)資料同步 16 | 17 | > 像是快取會員的個人資料,可能會使用快取同步的機制,或者是分散式快取的方式 18 | 19 | * 上傳的檔案同步 20 | 21 | > 可能會使用檔案共享的方式 22 | 23 | ***架構圖:*** 24 | 25 | ![增加 Web Server](http://i.imgur.com/RA6cJgk.png) 26 | 27 | 28 | ## 參考資料 29 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 30 | -------------------------------------------------------------------------------- /Auth/JWT/JWT-README.md: -------------------------------------------------------------------------------- 1 | # JWT(JSON Web Token) 2 | 3 | ## 參考資料 4 | * [JSON Web Tokens - jwt.io](https://jwt.io/) 5 | * [Authentication 方案优化探索(JWT, Session, Refresh Token, etc.) - 后端 - 掘金](https://juejin.im/entry/5974088ef265da6c2c7f5c52) 6 | * [在 Laravel 實現自動 Refresh JWT 機制](https://blog.albert-chen.com/laravel-auto-refresh-jwt/) 7 | * [\[Day-32\] (實作)用JWT取代傳統Session來驗證使用者身份 - iT 邦幫忙::一起幫忙解決難題,拯救 IT 人的一天](https://ithelp.ithome.com.tw/articles/10196759) 8 | * [\[Day-33\] (實作)使用JWT來存取API內容(上) - iT 邦幫忙::一起幫忙解決難題,拯救 IT 人的一天](https://ithelp.ithome.com.tw/articles/10196967) 9 | * [\[Day-34\] (實作)使用JWT來存取API內容(下) - iT 邦幫忙::一起幫忙解決難題,拯救 IT 人的一天](https://ithelp.ithome.com.tw/articles/10197054) 10 | * [Stop using JWT for sessions - joepie91's Ramblings](http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/) 11 | -------------------------------------------------------------------------------- /Database/Database-MySQL-Transaction.md: -------------------------------------------------------------------------------- 1 | # 交易 2 | 3 | 4 | ## 控制InnoDB Transaction 5 | 6 | ### 關閉自動提交: 7 | 8 | ```sql 9 | SET autocommit=0; 10 | ``` 11 | 12 | ### 開啟自動提交: 13 | 14 | ```sql 15 | SET autocommit=1; 16 | ``` 17 | 18 | 在 MySQL InnoDB 預設所有的資料異動都是 `Transaction(交易)`,當與資料庫做連線的時候,InnoDB 會採用`自動提交(autocommit)`的方式,所以除非使用 `BEGIN;` 及 `COMMIT;` 將異動語法包覆起來***設定為同一個 Transaction,否則任何一個資料異動的語法(INSERT、DELETE、UPDATE)會認為是一個獨立的 Transaction***。 19 | 20 | 所以資料表每次做資料異動的時候會一直`提交 (COMMIT)` 去更新日誌,若有 1000筆 獨立的SQL要執行就會被 COMMIT 1000 次,所以在下異動語法之前,可以使用指令 `SET autocommit=0;` 關閉自動提交,等異動完成後,再使用指令 `SET autocommit=1;` 開啟自動提交。 21 | 22 | ```sql 23 | SET autocommit=0; 24 | 異動(INSERT、DELETE、UPDATE)大量資料SQL語法 25 | SET autocommit=1; 26 | ``` 27 | 28 | ## 參考資料 29 | * [KeJyun學習日誌: MySQL效率調校](http://blog.kejyun.com/2012/12/MySQL-Efficiency-Adjustment.html) 30 | -------------------------------------------------------------------------------- /Code/Code-Array.md: -------------------------------------------------------------------------------- 1 | # 陣列 2 | 3 | ## 鍵值(key) 4 | 5 | 在撈取資料庫的資料後,我們會把資料存在陣列中,等之後比對資料時可以當作查詢之用,若可以的話建議陣列的`鍵值(key)`要用可以辨識資料的值。 6 | 7 | 像是使用者的資料可以用使用者編號當鍵值,若之後要直接取特定使用者的資料,直接就可以指定使用者編號去取得使用者的資料: 8 | 9 | ```php 10 | 1, 23 | 'user_id' => 1, 24 | 'message' => 'Hi' 25 | ], 26 | [ 27 | 'id' => 2, 28 | 'user_id' => 2, 29 | 'message' => 'hello' 30 | ], 31 | [ 32 | 'id' => 3, 33 | 'user_id' => 1, 34 | 'message' => 'How are you?' 35 | ] 36 | ]; 37 | 38 | // 取得留言使用者姓名 39 | foreach ($messages as &$message) { 40 | $message['user_name'] = $users[$message['user_id']]['user_name']; 41 | } 42 | ``` 43 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Add-More-Web-Server.md: -------------------------------------------------------------------------------- 1 | # 增加更多的 Web Server 2 | 3 | 在做完分散式資料庫及分散式資料表之後,又開始看到流量暴增的的時候,發現系統的回應時間又開始變慢了,你可能會看到 Web Server(Apache、Nginx...etc)那邊阻塞了很多的請求,而導致每個存取的使用者需要時間等待而導致回應時間增加了,這個時候我們就可能需要增加更多的 Web Server 去應付這些大量的請求。 4 | 5 | 在做完這些工作之後,會進到一個像是可以無限擴充的階段,網站流量增加時,就是不斷的增加 Server ($$$) 去應付大量的流量。 6 | 7 | 在做 Web Server 負載平衡時可能會遇到下列問題: 8 | 9 | * 基本的負載平衡(Apache 負載平衡、LVS...etc)無法應付巨量的請求量 10 | 11 | 如果經費允許的話(都流量暴增了,會不想要投入更多的 $ 去讓自己的事業擴大嗎 XD),可以採購負載平衡的硬體設備 12 | 13 | > e.g. [F5 Load Balancer](https://f5.com/glossary/load-balancer)、[Netsclar Load Balancer](http://www.citrix.com.tw/products/netscaler-application-delivery-controller/overview.html)...etc 14 | 15 | 若真的經濟不允許,可以將應用從邏輯上做分類,不同類型的應用分散給不同的負載平衡群集處理。 16 | 17 | * 訊息同步的文件分享方法出現瓶頸 18 | 19 | 這時可以根據不同的網站業務需求去做不同的分散式文件系統 20 | 21 | ## 參考資料 22 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 23 | * [F5 Load Balancer](https://f5.com/glossary/load-balancer) 24 | * [Netsclar Load Balancer](http://www.citrix.com.tw/products/netscaler-application-delivery-controller/overview.html) 25 | -------------------------------------------------------------------------------- /Cache/Cache-Delete-Cache.md: -------------------------------------------------------------------------------- 1 | # 刪除快取 2 | 3 | 我們設定了快取希望使用者存取資料的時候能夠加快回應時間,但資料可能會因為使用者的`刪除`或`修改`而產生了異動,若我們需要讓使用者能夠讀取到最新的資料,我們通常會手動的將快取刪除,讓使用者下次讀取的話能夠去資料庫讀取到最新的資料,在重新設定一次新的快取資料。 4 | 5 | ## 以部落格文章快取為例 6 | 7 | ### 更新部落格文章資料 8 | 9 | ```php 10 | 同時上線非每日上限,同日上線指的是同一時先有多個人同時跟伺服器要資源 27 | 28 | ### 需要維持一定的服務可用度 29 | 30 | 當服務需要提供比較多人去做存取時,我們就必須要提供一定水準的服務可用度,將反應時間保持在 1 ~ 2 秒內的範圍。 31 | 32 | ### 資料定期備份 33 | 34 | 雖然服務還在剛起步的階段,但是資料還是非常重要的公司資產,若只有一台 Database,我們還是需要定期的將資料備份出來(最少 1 天備援一次),避免機器掛掉的機會。 35 | 36 | 37 | ## 參考資料 38 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 39 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Set-Database-Read-Write-Server.md: -------------------------------------------------------------------------------- 1 | # Database 讀寫分離 2 | 3 | 當流量又開始暴增時,會發現原本無限制擴充 Server ($$$)的架構會讓資料庫連線的資源不敷使用。 4 | 5 | 因為而在做資料異動的時候,若有要更新範圍(range)的資料,很容易把資料進行鎖定,進而彼此影響到查詢上面的效能。 6 | 在 [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99)中,大部份 80% 人都是在看文章比較多(讀取資料:SELECT),只有少部分 20% 的人或意見領袖,才會發表文章表示看法(異動資料:INSERT、UPDATE、DELETE),而在做資料異動的時候很有可能會對資料進行鎖定,進而去影響讀取的速度。若是要異動範圍(range)的資料,很容易把資料進行鎖定,更會影響到彼此查詢上面的效能。 7 | 8 | 這時候我們就會想要把資料庫做`讀寫分離`,所以像是 Facebook、部落格之類的媒體,大多會把資料庫做`讀寫分離`,使用者做異動的行為會去主資料庫(Master)去做寫入的動作,然後從資料庫(Slave)在定期的去同步資料庫的內容,而其他大部份的讀者在讀取資料時,都去讀取從資料庫(Slave)的資料,這樣就不會有因為資料異動而導致資料被鎖定,造成讀取變慢的問題。 9 | 10 | 而在主從架構中,可以是有很多台從的資料庫(Slave),透過分散式處理可以將不同的連線分配給不同的從資料庫(Slave),讓每一台從的資料庫平均分配差不多的連線量,因為需要處理的連線減少了,進而讓每個查詢都能夠在短時間都能夠回應查詢結果,提高系統的可用度。 11 | 12 | > 資料庫主從架構中,`主(Master)`資料庫用來做`寫入資料異動(INSERT、UPDATE、DELETE)`的動作,`從(Slave)`資料庫用來做`讀取(SELECT)`的動作。 13 | 14 | 15 | 而若有些很少在查詢的資料,像是 `Log 紀錄`或是在做 `分析的資料`,這類的資料從放在可靠性較高的資料庫(MySQL、Postgres、Oracle...etc)中,對於我們來說是比較浪費的,因為他會佔用存取資料庫的資源,這類的資料我們可能可以考慮用一些 NoSQL 的方案去存取這些資料,讓重要的資料存放在可靠性的資料庫就好,提高資料庫的可用性。 16 | 17 | # 參考資料 18 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 19 | * [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99) 20 | -------------------------------------------------------------------------------- /Cache/Cache-Set-Cache.md: -------------------------------------------------------------------------------- 1 | # 設定快取 2 | 3 | 對於同樣不常異動的資料,若有大量使用者存取,我們通常為把資料設為快取(可以存放在記憶體或檔案中),在下次有其他使用者要來存取相同資料的時候,不去資料庫重新撈取資料,直接將快取的資料回傳給使用者,可以減少資料庫的存取,增加系統處理使用者請求的效能。 4 | 5 | ## 以部落格文章快取為例 6 | 7 | 我們會透過文章編號存取部落格文章,像是 `/posts/1` 或 `/posts/2`,以 PHP Laravel Framework 為例,我們可能透過這樣的方式去處理存取部落格文章: 8 | 9 | ```php 10 | 在可以用`CHAR(20)`去儲存資料時,就不要用`CHAR(200)` 8 | 9 | > 在可以用`VARCHAR(20)`去儲存資料時,就不要用`VARCHAR(200)` 10 | 11 | > 在可以用`TINYINT`去儲存資料時,就不要用`INT` 12 | 13 | > 在可以用`TEXT`去儲存資料時,就不要用`LONGTEXT` 14 | 15 | > 在可以用`BLOB`去儲存資料時,就不要用`LONGBLOB` 16 | 17 | 18 | ## 越簡單越好 19 | 20 | > 整數(INT)> 固定字串(CHAR)> 變動字串(VARCHAR)> 文字(TEXT) 21 | 22 | 越簡單的資料類型,資料庫所需要用來建立索引的效率越好,因為資料類型越複雜代表資料的排列組合越多,所以需要更大的索引及計算去取得資料 23 | 24 | > 在可以用整數(INT)去儲存資料時,就不要用固定字串(CHAR) 25 | 26 | > 在可以用固定字串(CHAR)去儲存資料時,就不要用變動字串(VARCHAR) 27 | 28 | > 在可以用變動字串(VARCHAR)去儲存資料時,就不要用文字(TEXT) 29 | 30 | 31 | ## 盡量不使用NULL 32 | 33 | 若非必要儲存NULL的資料,否則要盡可能的把資料欄位設定為NOT NULL,資料庫很難最佳化有NULL資料欄位的查詢,可以NULL的資料欄位需要更多的儲存空間,資料庫還需要對其進行特殊處理,而當有NULL資料欄位使用`索引(INDEX)`的時候,每一條的索引紀錄必需要額外紀錄資料,導致查詢時索引的效率降低。 34 | 35 | 若真的要儲存NULL,在`不影響原有的資料`的情況下,可以考慮用`0`、`特殊值`...等等之類的值去代替,可以用來區別是否為NULL 36 | 37 | 38 | ## 結論 39 | 40 | 一切的資料類型都只能看自己應用的需求去決定,如果沒辦法還是要用比較複雜的資料類型,那還是必須要用,資料的完整性比任何的東西都重要多了,效率就想辦法用`增加機器`或者是`優化資料表結構`...等等的其他方式去達成,不要為了效率增加而強迫使用特定的資料類型,這樣反而是因噎廢食。 41 | 42 | 43 | ## 參考資料 44 | * [KeJyun學習日誌: 提高存取MySQL效率小技巧](http://blog.kejyun.com/2012/12/Tips-For-Use-MySQL-With-High-Performance.html) 45 | -------------------------------------------------------------------------------- /Database/Database-MySQL-Lock-Table.md: -------------------------------------------------------------------------------- 1 | # 鎖表 (Lock Table) 2 | 3 | ## 定義 4 | 5 | 在要更新資料表的資料時,`MySQL`會將表資料鎖定無法讀取,直到資料異動完畢,`MyISAM` 預設支援 `Table-level lock`,而 `InnoDB` 預設支援 `Row-level lock` 6 | 7 | ### Table-level lock 8 | 資料表資料異動時,將「`整個資料表(Table)`」都鎖定住無法讀取 9 | 10 | ### Row-level lock 11 | 資料表資料異動時,將「`要更新的資料列(row)`」都鎖定住無法讀取 12 | 13 | ## 注意事項 14 | 15 | 在使用 `Row-level lock` 時必需要 `明確指定要異動資料的主鍵(Primary Key)`,否則將會改用 `Table-level lock` 去做資料表的異動 16 | 17 | ## 範例 18 | 19 | 假設有 user 資料表,裡面有 id 與 name 的欄位,id 是主鍵 20 | 21 | | SQL | Table lock | Row lock | No lock | 備註 22 | | ------------- |:-------------:|:-------------:|:-------------:|:-------------:| 23 | | SELECT * FROM user WHERE id='1' FOR UPDATE; | - | v | - | 明確指定主鍵,並且有此筆資料,row lock | 24 | | SELECT * FROM user WHERE id='-1' FOR UPDATE; | - | - | v | 明確指定主鍵,若查無此筆資料,無 lock | 25 | | SELECT * FROM user WHERE name='KeJyun' FOR UPDATE; | v | - | - | 無主鍵,table lock | 26 | | SELECT * FROM user WHERE id<>'1' FOR UPDATE; | v | - | - | 主鍵不明確,table lock | 27 | | SELECT * FROM user WHERE id LIKE '3' FOR UPDATE; | v | - | - | 主鍵不明確,table lock | 28 | 29 | ## 備註 30 | 31 | FOR UPDATE 僅適用於 InnoDB,且必須在交易區塊(BEGIN/COMMIT)中才能生效。 32 | 33 | ## 參考資料 34 | * [KeJyun學習日誌: MySQL鎖表(Lock Table)Table-level與Row-level比較](http://blog.kejyun.com/2012/12/Compare-Tabel-And-Row-Level-Lock-On-MySQL.html) 35 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # High Scaling Websites Structure Learning Notes 大型網站架構學習筆記 2 | 3 | ## 作者 : KeJyun 4 | 5 | 本書會介紹一些在建構大型網站中常用到的技巧或方法 6 | 7 | ### 書籍網址 8 | 9 | | 項目 | 網址 | 10 | |---|---| 11 | | Github Page | http://kejyun.github.io/high-scaling-websites-structure-learning-notes/ | 12 | | GitBook | http://kejyuntw.gitbooks.io/high-scaling-websites-structure-learning-notes/ | 13 | 14 | ### 聯絡資訊 15 | 16 | | 項目 | 網址 | 17 | |---|---| 18 | | Email | kejyun@gmail.com | 19 | | LinkedIn | https://tw.linkedin.com/in/kejyun | 20 | | Github | https://github.com/kejyun | 21 | | Facebook | http://fb.me/kejyunTaiwan | 22 | | Blog | http://blog.kejyun.com | 23 | 24 | 25 | 26 | ## 所有 KeJyun 著作 27 | * [High Scaling Websites Structure Learning Notes 大型網站架構學習筆記](http://kejyuntw.gitbooks.io/high-scaling-websites-structure-learning-notes/) 28 | * [Laravel 4 學習筆記](http://kejyuntw.gitbooks.io/laravel-4-learning-notes/) 29 | * [Laravel 5 學習筆記](http://kejyuntw.gitbooks.io/laravel-5-learning-notes/) 30 | * [SEO 學習筆記](http://kejyuntw.gitbooks.io/seo-learning-notes/) 31 | * [gulp 學習筆記](http://kejyuntw.gitbooks.io/gulp-learning-notes/) 32 | * [Web Developer Learning Resource 網頁開發學習資源](http://kejyuntw.gitbooks.io/web-developer-learning-resource/) 33 | * [Mac OSX 新手入門](http://kejyuntw.gitbooks.io/mac-osx-for-newbie/) 34 | * [Ruby on Rails 學習筆記](http://kejyuntw.gitbooks.io/ruby-on-rails-learning-note/) -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-All-Together.md: -------------------------------------------------------------------------------- 1 | # Web 與 Database 在一起 2 | 3 | 把 Web 或 Database 弄在同一部機器這樣的方式,通常是用在程式開發的「本地測試機」中,或者通常不會`同時`拿來服務太多的人的應用(共同上線人數),像是一般基本的公司形象網站、個人介紹頁...等等不會有太多人同時存取的資料懷鏡時才會使用這樣的架構。 4 | 5 | 除了基本的 Apache(Nginx)及 MySQL(Postgres),視情況需要設置會把其他的服務也放在同一台機器,像是 Memcached、Redis、Node.js。 6 | 7 | ***架構圖:*** 8 | 9 | ![Web 與 Database 在一起](http://i.imgur.com/nofyGyI.png) 10 | 11 | ## 使用時機 12 | 13 | ### 同時上線人數約 10 ~ 15 人左右 14 | 15 | ***人數為 KeJyun 過去經驗大概估算的人數,沒有經過實際測試僅參考用*** 16 | 17 | 當一台機器需要乘載多種服務時,同樣的主機資源要做很多事情,當然處理的效率通常都比較低 18 | 19 | 以飲料店來當例子,1 個服務生可以同時服務 3 位客人(主機同時有 3 個人發出請求),同時替客人做點餐結帳、做飲料的工作,但是當客人越來越多時來到 15 人的時候,因為人的能力有限(主機的效能有限),所以後面很多人就需要排隊等前面幾位客人拿到飲料後才能繼續服務,所以就需要等待排隊(主機回應時間拉長),所以要看自己的經濟規模(同時乘載規模)去聘請適當人數的員工(規劃使用不同的主機架構) 20 | 21 | > 同時上線非每日上限,同日上線指的是同一時先有多個人同時跟伺服器要資源 22 | 23 | ### 服務的可用度不需要太高 24 | 25 | 大部份的使用者可能最高能夠忍受等待[一貶眼的時間](http://news.networkmagazine.com.tw/news/2012/03/02/38107/)(約 250 毫秒 = 0.25 秒),但是這個標準是在產品一秒鐘幾十萬上下時才需要達到的可用度標準(像是股票交易、電子商務網站、社群網站...等等),一般只是測試用或是形象的網站不需要那麼高的可用度,能夠在 1 ~ 2 秒內回應都是在可接受的範圍(當然不能太久超過 4、5 秒以上,使用者會以為網站掛掉了沒有回應) 26 | 27 | ### 資料有做異地備援 28 | 29 | 資料可以說是整個企業的資產及生命,當資料自己有做異地備援的話,就可以用這樣的方式,不然主機太操很容易掛掉(就像人一樣加班太久沒休息,就很容易生病),裡面的資料可能會因為沒辦法救回,所以若是有做資料備份,不怕資料不見的狀況下再去用這樣的架構去做服務。 30 | 31 | 32 | ## 參考資料 33 | * [你願意花多久等待網站回應? 一貶眼都嫌太久了](http://news.networkmagazine.com.tw/news/2012/03/02/38107/) 34 | * [瞭解網站速度 - Analytics (分析) 說明](https://support.google.com/analytics/answer/2383341?hl=zh-Hant) 35 | -------------------------------------------------------------------------------- /Reference/Reference-Article.md: -------------------------------------------------------------------------------- 1 | # 參考文章 2 | 3 | ## 概念 4 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 5 | * [10+個高流量網站架構的效能提升解決方案](http://www.qa-knowhow.com/?p=2314) 6 | * [100 open source Big Data architecture papers for data professionals.](https://www.linkedin.com/pulse/100-open-source-big-data-architecture-papers-anil-madan) 7 | 8 | ## 資料庫 9 | 10 | ### 連線池 11 | * [Connection pool - Wikipedia, the free encyclopedia](http://en.wikipedia.org/wiki/Connection_pool) 12 | 13 | ### MySQL 14 | * [KeJyun學習日誌: 在MySQL中MyISAM與InnoDB資料庫引擎比較](http://blog.kejyun.com/2012/12/Compare-MyISAM-And-InnoDB-Database-Engine-In-MySQL.html) 15 | 16 | ## 實際案例 17 | * [Stack Overflow: The Architecture - 2016 Edition](http://nickcraver.com/blog/2016/02/17/stack-overflow-the-architecture-2016-edition/) 18 | * [《英雄联盟》玩家聊天服务的持久层演进](http://bbs.gameres.com/thread_485619.html) 19 | * [KKBOX Case Study – Amazon Web Services (AWS)](http://aws.amazon.com/tw/solutions/case-studies/kkbox/) 20 | 21 | ## HTTP Cache 22 | * [循序漸進理解 HTTP Cache 機制 | TechBridge 技術共筆部落格](http://blog.techbridge.cc/2017/06/17/cache-introduction/) 23 | 24 | 25 | ## Python 26 | * [Python + Django 如何支撑了 7 亿月活用户的 Instagram?](http://python.jobbole.com/87814/) 27 | * [Sharding & IDs at Instagram](https://engineering.instagram.com/sharding-ids-at-instagram-1cf5a71e5a5c) 28 | * [Dismissing Python Garbage Collection at Instagram](https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172) 29 | -------------------------------------------------------------------------------- /Architecture-Evolution/Architecture-Evolution-Dynamic-Data-Cache.md: -------------------------------------------------------------------------------- 1 | # 動態資料快取 2 | 3 | `產生的靜態頁面暫存`可以應付整個頁面很少改變的網頁,但是若網頁中有一部分的資料遭到異動,為了讓使用者取得到最新的資料,我們勢必要清除頁面並重新撈取資料庫的資料去產生新的`靜態頁面暫存`。 4 | 5 | 但對於資料層來說,其實只有`少部分的資料`有異動需要重新讀取,若我們需要重新去資料庫撈取`所有`頁面需要的資料,再去兜頁面所需要的資訊,勢必是浪費資料庫資源的,還記得我在[概念章節](High-Scaling-Websites-Structure-Concept.md)中有提到說 `資料庫存取是很昂貴的`,過多的查詢,導致資料庫的存取效率降低,且資料庫的的連線因為系統的限制,所以也沒辦法無限制的增加連線數量。 6 | 7 | 所以我們會想要把一些很少異動的資料去做快取(Cache)預存下來,當下一位使用者來索取相同的資料時,則不去資料庫進行查詢,直接將之前預存的資料丟回給使用者,提高系統反應的時間及資料存取的效率,快取的資料會依照我們設定資料的過期時間,當超過過期時間後,再重複去資料庫撈取資料,看看有沒有異動。 8 | 9 | ***架構圖:*** 10 | 11 | ![動態資料快取](http://i.imgur.com/sbORt7G.png) 12 | 13 | ## 注意事項 14 | 15 | ### 快取時間 16 | 17 | 快取的失效時間通常會依我們的應用去做設定,也因為系統資源的關係(記憶體大小、硬碟容量大小...等等)的原因,沒有辦法將快取設定為永久存放,像是做討論區的文章,通常熱門的文章在短時間會被重複的讀取,但熱度可能會隨著時間做遞減,可能大約 3 天的時間文章就會變得越少人存取,當沒有人存取的時候,我們就會希望將快取清除,避免佔住系統資源(記憶體或硬碟容量),所以我們可能會針對討論區的文章做 3 天時間的快取,等超過 3 天 後,快取會自動地將過期的快取資料做清除,除非又有人再次讀取該篇文章,才會再做一次快取(失效時間再次設為 3 天),一直重複這樣的動作,等到資料不再被存取,就只會保留在資料庫的洪流當中。 18 | 19 | ### 手動清除快取 20 | 21 | 在部落格發表的文章被讀取時,我們會對文章進行快取,但有時文章可能因為作者「打錯字」或者「需要補充資料」,導致文章需要被異動更新,為了能夠讓快取有最新被異動過的資料,通常我們會手動的清除這篇文章的快取,這樣就可以確保下一個存取這篇文章的讀者,一定會拿到最新的文章資料。 22 | 23 | ### 資料分散式快取 24 | 25 | 快取的資料通常是用 `key` 及 `value` 的方式去紀錄資料,而我們通常會把資料做片段的快取,像是部落格文章 A 的快取我們通常會存放在 `key` 為 `blog_post:post_id_A` 的快取資料中,以此類推,文章 B 的快取 `key` 為 `blog_post:post_id_B`,用這樣的方式對每個不同的小資源去做快取,而不是將所有部落格文章 A 及 B 存放在一整個快取 `key` 為 `blog_post:all` 中,因為當使用者異動文章 A 的時候,若我們要刪除快取的資料則只需要刪除文章 A 的快取即可,而不需要把其他不必要刪除的文章快取也一併刪除了,提高快取使用的效率,通常要看自己的應用及使用者存取資料的方式,需要用哪一種方式的快取比較適合,每一種應用都有適合自己的快取方法。 26 | 27 | 28 | ## 參考資料 29 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html) 30 | -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # 摘要 2 | 3 | * [介紹](README.md) 4 | * [概念](High-Scaling-Websites-Structure-Concept.md) 5 | * [資料庫](Database/Database-README.md) 6 | * [資料排列組合](Database/Database-Data-Permutations.md) 7 | * [MySQL](Database/Database-MySQL-README.md) 8 | * [資料類型](Database/Database-MySQL-DataType.md) 9 | * [程式語法](Database/Database-MySQL-Code-SQL.md) 10 | * [索引](Database/Database-MySQL-Index.md) 11 | * [鎖表](Database/Database-MySQL-Lock-Table.md) 12 | * [交易](Database/Database-MySQL-Transaction.md) 13 | * [伺服器](Database/Database-MySQL-Server.md) 14 | * [SELECT 語法效能測試](Database/Database-MySQL-SELECT-SQL-Explain-Test.md) 15 | * [程式](Code/Code-README.md) 16 | * [陣列](Code/Code-Array.md) 17 | * [快取](Cache/Cache-README.md) 18 | * [設定](Cache/Cache-Set-Cache.md) 19 | * [刪除](Cache/Cache-Delete-Cache.md) 20 | * [檔案](File/File-README.md) 21 | * [網站架構演進](Architecture-Evolution/Architecture-Evolution-README.md) 22 | * [Web 與 Database 在一起](Architecture-Evolution/Architecture-Evolution-All-Together.md) 23 | * [Web 與 Database 分手](Architecture-Evolution/Architecture-Evolution-Break-Up.md) 24 | * [前端靜態頁面暫存](Architecture-Evolution/Architecture-Evolution-Static-Page-Cache.md) 25 | * [頁面片段暫存](Architecture-Evolution/Architecture-Evolution-Partial-Page-Cache.md) 26 | * [動態資料快取](Architecture-Evolution/Architecture-Evolution-Dynamic-Data-Cache.md) 27 | * [增加 Web Server](Architecture-Evolution/Architecture-Evolution-Add-Web-Server.md) 28 | * [增加 Database Server](Architecture-Evolution/Architecture-Evolution-Add-Database-Server.md) 29 | * [分散式資料](Architecture-Evolution/Architecture-Evolution-Data-Distribute.md) 30 | * [增加更多的 Web Server](Architecture-Evolution/Architecture-Evolution-Add-More-Web-Server.md) 31 | * [Database 讀寫分離](Architecture-Evolution/Architecture-Evolution-Set-Database-Read-Write-Server.md) 32 | * [高可用性的服務架構](Architecture-Evolution/Architecture-Evolution-High-Useability-Service-Structure.md) 33 | * [授權](Auth/Auth-README.md) 34 | * [Session](Auth/Session/Session-README.md) 35 | * [JWT(JSON Web Token)](Auth/JWT/JWT-README.md) 36 | * [常見問題](Question/Question-README.md) 37 | * [壓力測試](Question/Question-Benchmark.md) 38 | * [參考資料](Reference/Reference-README.md) 39 | * [書籍](Reference/Reference-Books.md) 40 | * [文章](Reference/Reference-Article.md) 41 | * [影片](Reference/Reference-Video.md) 42 | * [工具](Reference/Reference-Tool.md) 43 | -------------------------------------------------------------------------------- /Database/Database-MySQL-Code-SQL.md: -------------------------------------------------------------------------------- 1 | # 程式語法 2 | 3 | ## 指定使用索引 4 | 5 | ```sql 6 | SELECT id 7 | FROM data USE INDEX(type) 8 | WHERE type=12345 AND level > 3 9 | ORDER BY id 10 | ``` 11 | 12 | 在下SQL語法的時候,有時某些語法使用某些索引執行效率會比較好,可是有時候MySQL沒辦法替我們選擇一個最適合的索引,導致執行的效率很慢(slow query),所以我們必須使用USE INDEX去指定執行效率好的索引,以提升效率。 13 | 14 | ## 少用 JOIN,多用幾次 SELECT 撈取大量資料 15 | 16 | ```sql 17 | SELECT user.id , user.name , post.id , post.content 18 | FROM user , post 19 | WHERE user.id = post.user_id ...... 20 | ``` 21 | 22 | 我們在撈取使用者的文章資訊的時候,我們可能會用 JOIN 去撈取我們要的資料,這樣很直覺,只是當使用者資料有`10000`筆,而文章有`99999`筆,像這種有大量資料時候的話,使用 JOIN 對於資料存取真的是惡夢,因為 JOIN 過後表示會有 `10000x99999=999990000` 筆資料,然後再從這麼大量的資料中去撈取 WHERE 判斷式中指定的資料,在資料庫伺服器記憶體不夠的時候鐵定會炸掉。 23 | 24 | 解決方式是我們可以分批撈取使用者的資料,以及文章的資料 25 | 26 | ```sql 27 | SELECT user.id , user.name 28 | FROM user 29 | ``` 30 | 31 | ```sql 32 | SELECT id , content , user_id 33 | FROM post 34 | ``` 35 | 36 | 在使用者資料資料撈取出來之後,使用迴圈將使用者存成陣列,但是「陣列的索引」是使用可以識別的「使用者編號(user.id)」當作索引值。 37 | 38 | ```php 39 | 只要字串長度大於 13 時,我們擁有的資料排列組合就會超過 BigInt 的 `18446744073709551615` 排列組合 68 | 69 | 在使用字串當做鍵值時,在我們每次新增資料手動產生字串鍵值時,我們都要再去資料庫檢查看看此鍵值是否存在,再新增進去,且我們需要額外做新增資料造成 Duplicate Key 的例外狀況處理 70 | 71 | > 在我們字串很長時雖然發生碰撞的機率很低,但為了系統資料的安全完整性,還是需要做檢查的步驟,並在頻繁被通知到鍵值衝突的狀況時,就要考慮增加字串長度,減少鍵值碰撞機率了 72 | 73 | ## 參考資料 74 | * [MySQL資料型態描述與比較](http://blog.kejyun.com/2012/09/MySQL-Field-Data-Type-Description-And-Compare.html) 75 | * [Store UUID in an optimized way](https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/) 76 | * [《江南Style》如何衝破YouTube的計數器](http://www.cw.com.tw/article/article.action?id=5063028) 77 | * [資料庫索引數據結構及主鍵設計(b+tree)(part 1)](http://www.slideshare.net/yftzeng/btreepart-1) 78 | -------------------------------------------------------------------------------- /Database/Database-MySQL-SELECT-SQL-Explain-Test.md: -------------------------------------------------------------------------------- 1 | # Explain 做 SQL SELECT 語法效能測試 2 | 3 | 在MySQL我們在使用 SELECT 做撈取資料的時候,有時候常常會效能低落,撈取資料需要很長的時間,有時候是 SQL 語法下得不好導致沒有使用到正確的索引去撈資料,我們這個時候就必須要檢查我們下的 SQL 語法到底有哪些地方需要改善,我建立的 comment 的資料表並新增幾筆假資料去做示範 4 | 5 | ```sql 6 | -- 建立資料表 7 | 8 | -- 留言 9 | CREATE TABLE IF NOT EXISTS `comment` ( 10 | `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '編號', 11 | `content` varchar(50) COLLATE utf8_unicode_ci NOT NULL COMMENT '留言', 12 | PRIMARY KEY (`id`) 13 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ; 14 | 15 | -- 使用者 16 | CREATE TABLE IF NOT EXISTS `user` ( 17 | `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '編號', 18 | `name` varchar(30) COLLATE utf8_unicode_ci NOT NULL COMMENT '姓名', 19 | PRIMARY KEY (`id`) 20 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=3 ; 21 | 22 | -- 使用者的留言 23 | CREATE TABLE IF NOT EXISTS `user_comment` ( 24 | `user_id` int(10) unsigned NOT NULL COMMENT '使用者編號', 25 | `comment_id` int(10) unsigned NOT NULL COMMENT '評論編號', 26 | PRIMARY KEY (`user_id`,`comment_id`) 27 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; 28 | 29 | 30 | 31 | -- 新增資料 32 | 33 | -- 留言 34 | INSERT INTO `comment` (`id`, `content`) VALUES 35 | (1, '留言1'), 36 | (2, '留言2'); 37 | 38 | -- 使用者 39 | INSERT INTO `user` (`id`, `name`) VALUES 40 | (1, '使用者1'), 41 | (2, '使用者2'); 42 | 43 | -- 使用者的留言 44 | INSERT INTO `user_comment` (`user_id`, `comment_id`) VALUES 45 | (1, 1), 46 | (1, 2); 47 | 48 | 49 | 50 | 51 | -- 解釋MySQL語法效能 52 | 53 | -- 撈取留言資料 54 | EXPLAIN SELECT * FROM `comment` WHERE id` =2; 55 | 56 | -- 撈取使用者的留言資料 57 | EXPLAIN SELECT * 58 | FROM `comment` c, `user` u, `user_comment` uc 59 | WHERE u.`id` = uc.`user_id` 60 | AND uc.`comment_id` = c.`id` 61 | ``` 62 | 63 | ## 解釋 MySQL 語法效能:撈取留言資料 64 | 65 | ![Explain MySQL SQL](http://3.bp.blogspot.com/-YagV7ODyhPI/UNLeKtvO_uI/AAAAAAAAC8k/v4NrdlhZnCo/s1600/2012-12-20_174153.png) 66 | 67 | ## 解釋MySQL語法效能:撈取留言資料 68 | 69 | ![解釋MySQL語法效能:撈取留言資料](http://4.bp.blogspot.com/-7EtC6Ep0cbY/UNLjeYCIEPI/AAAAAAAAC88/chZ1hCB5rrs/s1600/2012-12-20_180223.png) 70 | 71 | 72 | 而EXPLAIN後的資料有下面這些欄位 73 | 74 | ### select_type 75 | ### table 76 | 77 | 關連到的資料表 78 | 79 | ### type 80 | 81 | 使用關聯查詢的類型(效率由好至壞排序) 82 | 83 | * System 84 | * const 85 | * eq_ref 86 | * ref 87 | * fulltext 88 | * ref_or_null 89 | * index_merge 90 | * unique_subquery 91 | * index_subquery 92 | * range 93 | * index 94 | * ALL 95 | 96 | ### possible_keys 97 | 98 | 可能使用到的索引,從WHERE語法選擇出一個適合的欄位 99 | 100 | ### key 101 | 102 | 實際使用到的索引,如果為NULL,則是沒有使用索引 103 | 104 | ### key_len 105 | 106 | 使用索引的長度,長度越短 準確性越高 107 | 108 | ### ref 109 | 110 | 顯示那一列的索引被使用,一般是一個常數(const) 111 | 112 | ### rows 113 | 114 | MySQL用來返回資料的筆數,可以簡單的把rows視為執行效能,越少越好 115 | 116 | ### Extra 117 | 118 | MySQL用來解析額外的查詢訊息 119 | 120 | 121 | - Distinct 122 | 123 | 當MySQL找到相關連的資料時,就不再搜尋。 124 | 125 | - Not exists 126 | 127 | MySQL優化 LEFT JOIN,一旦找到符合的LEFT JOIN資料後,就不再搜尋。 128 | 129 | - Range checked for each Record(index map:#) 130 | 131 | 無法找到理想的索引。此為最慢的使用索引。 132 | 133 | - Using filesort 134 | 135 | 當出現這個值時,表示此SELECT語法需要優化。因為MySQL必須進行額外的步驟來進行查詢。 136 | 137 | - Using index 138 | 139 | 返回的資料是從索引中資料,而不是從實際的資料中返回,當返回的資料都出現在索引中的資料時就會發生此情況。 140 | 141 | - Using temporary 142 | 143 | 同Using filesort,表示此SELECT語法需要進行優化。此為MySQL必須建立一個暫時的資料表(Table)來儲存結果,此情況會發生在針對不同的資料進行ORDER BY,而不是GROUP BY。 144 | 145 | - Using where 146 | 147 | 使用WHERE語法中的欄位來返回結果。 148 | 149 | - System 150 | 151 | system資料表,此為const連接類型的特殊情況。 152 | 153 | - Const 154 | 155 | 資料表中的一個記錄的最大值能夠符合這個查詢。因為只有一行,這個值就是常數,因為MySQL會先讀這個值然後把它當做常數。 156 | 157 | - eq_ref 158 | 159 | MySQL在連接查詢時,會從最前面的資料表,對每一個記錄的聯合,從資料表中讀取一個記錄,在查詢時會使用索引為主鍵或唯一鍵的全部。 160 | 161 | - ref 162 | 163 | 只有在查詢使用了非唯一鍵或主鍵時才會發生。 164 | 165 | - range 166 | 167 | 使用索引返回一個範圍的結果。例如:使用大於>或小於<查詢時發生。 168 | 169 | - index 170 | 171 | 此為針對索引中的資料進行查詢。 172 | 173 | - ALL 174 | 175 | 針對每一筆記錄進行完全掃描,此為最壞的情況,應該盡量避免。 176 | 177 | ## 結論 178 | 179 | MySQL 的 Explain 可以分析大部份的 SQL 語法效能,但有些語法像是 WHERE IN 則會被歸類為 range 的語法,但實際上則是 Using Where,所以確切的語法分析要再看看文件真正的用法去決定 180 | 181 | ## 參考資料 182 | * [KeJyun學習日誌: 在MySQL使用Explain做SQL SELECT語法效能測試](http://blog.kejyun.com/2012/12/Using-EXPLAIN-SQL-To-Analysis-Efficient-On-MySQL.html) 183 | -------------------------------------------------------------------------------- /High-Scaling-Websites-Structure-Concept.md: -------------------------------------------------------------------------------- 1 | # 大型網站架構概念 2 | 3 | 當一個網站需要服務很多很多很多人的時候,我們就需要好的存取資料的架構,讓使用者能夠很快的拿到他想要拿的資料,這樣伺服器(Web & Database)才可以有更多「`時間`」及「`空間`」去服務更多的人。 4 | 5 | 你可能會聽到有一些在提高伺服器存取效率的關鍵字,像是「資料快取」、「資料庫索引」、「資料庫讀寫分離」、「資料分散式處理」...等等之類的方式。 6 | 7 | ## 提高效能存取的概念 8 | 9 | ### 不常異動的資料做快取 10 | 11 | #### 後端資料 12 | 13 | 以部落格文章為例,作者發表一篇文章後,除了文章內有一些小小的錯誤或錯字需要編輯修改,否則大部份的時間文章都是不會去異動的。 14 | 15 | 在動態去資料庫存取本篇文章時,我們可能會使用這樣的 SQL 語法去存取: 16 | 17 | ```sql 18 | SELECT * FROM posts WHERE post_id = '部落格文章編號' Limit 1; 19 | ``` 20 | 21 | 當每個使用者要來看這篇文章時,都需要透過這樣的 SQL 指令去撈取這篇文章的資訊(標題、內文...blabla),如果文章讀者不多時,偶爾這樣取資料庫撈資料還撐得過去,但像是比較熱門被轉載的文章,每分鐘可能有好幾百幾千人要去看這篇文章的資料,對於資料庫就是執行 1000 次這樣相同的語法去取得相同的資料(要記得,資料庫的存取是很昂貴的)。 22 | 23 | 為了減少資料庫存取次數,所以在程式方面我們會將這些不長異動的資料在第一次從資料庫撈取之後,把它`快取`下來,可能用 `Memcached`、`Redis` 或 `File` 的方式將資料預先存下來,然後再設定資料的過期時間,當超過過期時間後,再重複去資料庫撈取資料。 24 | 25 | 這樣可以減少資料庫的運算資源,使用者也可以很快地拿到他們想要看的資料,提高了伺服器對於資料的存取量。 26 | 27 | #### 前端資料 28 | 29 | 我們在網站上常常需要用 JavaScript 去完成我們要的 UI 操作效果,但是這些 JavaScript 除非操作方式有做大幅的異動,否則我們很少會修改這些 JavaScript 檔案(在 CSS 檔案部分也相同)。 30 | 31 | 除非有做特別的設定,再不重新整理畫面的前提下,瀏覽器都會幫我們的 JavaScript 及 CSS 檔案去做快取,載入過一次後就存放在使用者自己的電腦上,等到下次我們需要使用相同的 JavaScript 或 CSS 檔案時,瀏覽器會直接取用使用者電腦本地端的檔案,而不會去遠端伺服器再次拿取 JavaScript 或 CSS 檔案。 32 | 33 | 我們已可以利用瀏覽器的這個特性,等到 JavaScript 或 CSS 檔案做變更時,再透過`版本號`或`不同的檔名`去讓瀏覽器讀取最新的 JavaScript 或 CSS 檔案: 34 | 35 | ***版本號*** 36 | 37 | ```html 38 | 39 | 40 | 41 | 42 | 43 | ``` 44 | 45 | ***不同檔名*** 46 | 47 | > 這個部分通常是用後端程式去控制 JavaScript 及 CSS 現在的版本檔名 48 | 49 | ```html 50 | 51 | 52 | 53 | 54 | 55 | ``` 56 | 57 | ### 索引 58 | 59 | 對於需要常用來查詢的資料做好索引,可以加快查詢的效率 60 | 61 | ### 資料庫讀寫分離 62 | 63 | 以部落格文章為例,在 [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99)中,大部份 80% 人都是在看文章比較多(讀取資料:SELECT),只有少部分 20% 的人或意見領袖,才會發表文章表示看法(異動資料:INSERT、UPDATE、DELETE),而在做資料異動的時候很有可能會對資料進行鎖定,進而去影響讀取的速度。 64 | 65 | > 除了有關交易(Transaction)的資料在 SELECT 的時候才有可能對資料進行鎖定,像是購票或購買限量商品時,會把撈取出來加入購物車的資料先行鎖定 66 | 67 | 所以像是 Facebook、部落格之類的媒體,大多會把資料庫做讀寫分離,使用者做異動的行為會去主資料庫(Master)去做寫入的動作,然後從資料庫(Slave)在定期的去同步資料庫的內容,而其他大部份的讀者在讀取資料時,都去讀取從資料庫(Slave)的資料,這樣就不會有因為資料異動而導致資料被鎖定,造成讀取變慢的問題。 68 | 69 | 而在主從架構中,可以是有很多台從的資料庫(Slave),透過分散式處理可以將不同的連線分配給不同的從資料庫(Slave),讓每一台從的資料庫平均分配差不多的連線量,因為需要處理的連線減少了,進而讓每個查詢都能夠在短時間都能夠回應查詢結果,提高系統的可用度。 70 | 71 | 72 | --- 73 | 74 | 75 | ## 重要觀念 76 | 77 | ### 資料庫存取是很昂貴的 78 | 79 | 每下一段資料庫 SQL 語法去撈取資料,資料庫就要對資料去進行比對運算,當資料很龐大,SQL 語法又太複雜導致運算很繁瑣時,資料庫需要進行運算的時間又會變得更久,所以建立好的索引(Index)及下好的 SQL 語法是很重要的工作 80 | 81 | > 雖然資料都撈得出來,但是撈得漂亮,撈得快也是一種藝術啊~ 82 | 83 | ### 與資料庫建立的連線是很昂貴的資源 84 | 85 | 我們要對資料庫進行操作就需要先與資料庫進行連線,但是資料庫的連線建立是很花時間的,時間越久會導致回應的時間也會跟著變長,然而資料庫的連線數又不能無限制的擴張(需要看主機效能的乘載量),所以連線的資源就相對的珍貴。 86 | 87 | 我們通常會用連線池(Connection Pool)的方式去維持資料庫的連線,建立連線後做完查詢就將連線丟到連線池內,供下一個要做查詢的人使用,下一個要做查詢的人發現有可用的連線,就不用花費時間重新的去建立與資料庫的連線,直接使用就有的連線即可,直到這個連線時間超過資料庫最高的限制失效斷線為止,這樣就能提高資料庫連線使用的效率。 88 | 89 | ### 對於需要當作查詢的資料必須建立索引 90 | 91 | 我們會依照我們想要撈取的資料下 `WHERE`、`GROUP BY`、`ORDER BY` ...等等的條件去撈取我們想要撈取的資料,如果不建立索引資料庫也是有辦法將資料撈出來,只是速度會比較慢,當這些條件資料有預先做索引時,資料庫可以很快地利用索引去完成查詢的動作 92 | 93 | > 索引原理就是將資料透過資料結構(e.g. [B-tree]((http://zh.wikipedia.org/wiki/B%E6%A0%91)), Hash)預先做排序的存放,使用索引時資料庫可以很快的資料放在哪一個地方,以達到快速撈取資料的目的 94 | 95 | #### ***以書籍為例*** 96 | 97 | 索引就像書籍最前面的目錄一樣,可以很快地告訴我們哪一段章節的資料在第幾頁,所以我們就可以很快地翻閱到我們想看的資料,如果書籍沒有做目錄(索引)的話,我們要找到特定章節的資料時,我們就需要從頭到尾翻閱去查看看我們要看的資料到底在哪一頁,這對於書籍的讀者來說是相當浪費時間的一件事。 98 | 99 | *詳細的索引說明會在稍後章節提到* 100 | 101 | 102 | ## 結論 103 | 104 | 大部份提高存取效率的作法大多是: 105 | 106 | ### 設計好的程式邏輯 107 | 108 | 在做簡單的資料撈取,能夠用更快的方式(減少迴圈、快速比對資料...blabla)去產生我們要的資料,當然能夠加快請求的速度,但是在現在硬體處理的速度越來越快的情況下,程式的邏輯沒有太複雜下(複雜的可能像是演算法),大部份的資料處理都是很快的,所以大部份的瓶頸都是卡在資料庫資料的撈取邏輯、資料存放方式、資料撈取方式以及索引建置的邏輯。 109 | 110 | ### 減少資料庫的存取 111 | 112 | 資料能夠快取不要重複撈取就不要重複撈取,可以把資料存放在記憶體的快取中,若是整個頁面都很少異動,也可以把整個畫面(View)的 HTML 去做快取,把資料組合成 HTML 的動作都省下來了。 113 | 114 | ### 降低資料表資料量 115 | 116 | 當一個資料表只有 100 筆或是到 1000 筆資料的時候,資料庫大部份都的處理速度都是很快的(只有幾毫秒而已,人感覺不出來差異),但是當一個資料表的資料量有好幾百萬或好幾千萬筆資料,要從這麼多的資料去撈取出想要的資料,需要耗費的運算資源就會更多,所以讓資料能夠分散儲存,降低每個資料表的資料量,這樣也可以大大提高資料庫存取的效率。 117 | 118 | #### ***以書籍為例*** 119 | 120 | 想像一下一本書有 100 頁或是到 1000 頁的量,我們在透過書籍目錄去查詢我們想要看的資料,應該速度不會差太多,但是當一本書有好幾百萬頁(現實上可能會分很多冊,像百科全書),我們需要看的書籍目錄也會比較多(索引表很大),所以也需要花更多時間去查詢我們要的資料。 121 | 122 | ### 資料庫分散式處理 123 | 124 | 當資料異動或讀取的資料量過大時,資料的異動會影響到讀取的效能,所以我們就需要將資料做讀寫分離。 125 | 126 | ### 建立良好的資料庫索引 127 | 128 | 將用來做資料撈取判斷的資料做好索引,而索引的順序及 WHERE 條件順序會影響索引的使用,這個部分會在之後提到。 129 | 130 | ## 參考資料 131 | * [B樹 - 維基百科,自由的百科全書](http://zh.wikipedia.org/wiki/B%E6%A0%91) 132 | * [全文檢索 - 維基百科,自由的百科全書](http://zh.wikipedia.org/wiki/%E5%85%A8%E6%96%87%E6%AA%A2%E7%B4%A2) 133 | * [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99) 134 | -------------------------------------------------------------------------------- /Database/Database-MySQL-Index.md: -------------------------------------------------------------------------------- 1 | # 索引 2 | 3 | ## 索引順序 4 | 5 | 在所有的資料庫內,建立索引是提昇資料庫資料存取效率的很重要的方式,但是錯誤的`索引順序`或是 `SQL 語法` 都可能讓查詢語法變成了 slow query。 6 | 7 | 假設我們有一個部落格文章的資料表,裡頭存放所有使用者的部落格文章,資料表資料如下: 8 | 9 | ### posts 10 | 11 | |post_id | post_status | author | content | 12 | | ------------- |:-------------:|:-------------:|:-------------:| 13 | | 1 | 正常 | kejyun | 文章 1 | 14 | | 2 | 刪除 | kejyun | 文章 2 | 15 | | 3 | 正常 | kejyun | 文章 3 | 16 | | 4 | 正常 | jimmy | 文章 4 | 17 | | 5 | 刪除 | jimmy | 文章 5 | 18 | 19 | 如果我們要撈取 `作者為 kejyun` 且 `狀態正常` 的文章,我們可能會用下列語法去撈取 20 | 21 | ### SQL 語法 1 22 | 23 | ```sql 24 | SELECT * 25 | FROM posts 26 | WHERE post_status = '正常' 27 | AND author = 'kejyun' 28 | ``` 29 | 30 | 或者用這個語法撈取 31 | 32 | ### SQL 語法 2 33 | 34 | ```sql 35 | SELECT * 36 | FROM posts 37 | WHERE author = 'kejyun' 38 | AND post_status = '正常' 39 | ``` 40 | 41 | 這兩句 SQL 語法都可以撈出我們想要的結果,但是對於不同的索引執行的效率卻是差異很多 42 | 43 | 如果我們的索引是 `post_status`、`author` 的順序,`SQL 語法 1` 則會正確的使用索引來做查詢,執行的效率會很快。 44 | 45 | 但是對於`SQL 語法 2`,會因為找不到適合該語法的索引去做查詢,所以會變成不使用索引,而對整個資料表作完整的檢索,去查找出資料來。 46 | 47 | #### 主要原因是 48 | 49 | > 在 SQL 語法 WHERE 的順序會影響查詢索引的順序 50 | 51 | > 同樣的索引資料在不同的順序,表示為不同的索引,所以 `post_status、author` 與 `author、post_status` 這兩個索引雖然是使用相同的資料欄位,但因為順序不同所帶來的查詢效果也不同 52 | 53 | > 同一句 SQL 語法只能使用單一個索引,所以不同的索引沒辦法共用 54 | 55 | 資料庫沒有這麼人工智慧,會去判斷兩個 SQL 語法可以使用相同的索引去查找出相同的資料,他會依照程式設計師給的語法,依序去查找是否有可用的索引。 56 | 57 | #### 分析 SQL 語法 1 58 | 59 | 在 `SQL 語法 1` 中,我們 WHERE 的第 1 個條件是 `post_status = '正常'`,所以資料庫會去查找有沒有索引開頭是使用 `post_status` 的索引。 60 | 61 | 若有該索引,則再判斷 WHERE 的第 2 個條件 `author = 'kejyun'`,所以資料庫會去查找有沒有索引開頭是使用 `post_status` 的索引,且該索引的第 2 個索引是使用 `author` 的索引。 62 | 63 | 資料庫會一直判斷比較 SQL 語法 WHERE 條件與索引之間的順序關係,直到沒辦法匹配後,後面的沒辦法匹配索引則使用完整比對的方式去進行查詢 64 | 65 | #### 分析 SQL 語法 2 66 | 67 | 在 `SQL 語法 2` 中,我們 WHERE 的第 1 個條件是 `author = 'kejyun'` 但我們資料庫沒有建立索引開頭是使用 `author` 的索引,所以只能對資料庫使用完整比對。 68 | 69 | ### 原理 70 | 71 | 索引就像是書籍的目錄一樣,若以童話故事書為例,我們書中會收錄世界各國的故事,且每個故事有他自己的類型,像是奇幻、驚悚、傳說、神話。 72 | 73 | 如果我們以國家當做大標題(e.g. 第 1 索引),以故事類型當作小標題(e.g. 第 2 索引),那麼書籍目錄會像: 74 | 75 | - 台灣 76 | - 奇幻 77 | - 驚悚 78 | - 傳說 79 | - 神話 80 | - 日本 81 | - 奇幻 82 | - 驚悚 83 | - 傳說 84 | - 神話 85 | - 歐美 86 | - 奇幻 87 | - 驚悚 88 | - 傳說 89 | - 神話 90 | - 印度 91 | - 奇幻 92 | - 驚悚 93 | - 傳說 94 | - 神話 95 | - etc... 96 | 97 | 如果我們要找`台灣`且類型為`傳說`的童話故事,我們的目光會先移動到台灣的區段,然後在這個區塊下找到類型為`傳說`的故事 98 | 99 | - 台灣 100 | - 奇幻 101 | - 驚悚 102 | - ***傳說*** 103 | - 神話 104 | 105 | 106 | 但是如果我們要找所有類型為`傳說`的故事中,發生在`台灣`的故事,依照步驟我們會希望先把所有`傳說`的故事先列出來,再從這個目錄下去找屬於`台灣`的故事,所以我們希望會看到像這樣的目錄(索引): 107 | 108 | - 傳說 109 | - ***台灣*** 110 | - 日本 111 | - 歐美 112 | - 印度 113 | 114 | 但是在這本故事書的目錄(索引)中,我們沒有看到這樣的目錄結構,所以我們沒辦法透過目錄快速的找到我們要看的所有類型為`傳說`的章節資料,只好從頭到尾的去翻閱整本書,直到找到全部我們`傳說`章節的故事,然後再從這些找出來的`傳說`故事中,再去區別出哪些為`台灣`的故事。 115 | 116 | 所以目錄(索引)的規則就是希望看故事書的人要怎麼快速找到他想要的東西,當沒有我們可以參考的目錄的話,就像資料庫沒有可參考的索引一樣,就會找得比較慢(但還是找得出來)。 117 | 118 | 119 | ### 結論 120 | 121 | 因為索引順序的不同,以及 SQL WHERE 條件順序的不同,會使得資料庫在使用索引進行查詢有不同的效率,所以要謹慎的使用索引及 SQL 語法,才能達到高效率的查詢結果。 122 | 123 | 124 | ## 控制索引更新 125 | 126 | 127 | ### 關閉索引更新: 128 | 129 | ```sql 130 | ALTER TABLE table_name DISABLE KEYS; 131 | ``` 132 | ### 開啟索引更新: 133 | 134 | ```sql 135 | ALTER TABLE table_name ENABLE KEYS; 136 | ``` 137 | 138 | MySQL在新增(INSERT)、刪除(DELETE)、更新(UPDATE)的時候會去更新現有的索引表,而更新索引表也需要花費一些時間,當異動一筆資料的時候,索引表也做一次的異動,但當在做大量資料異動的時候,例如異動1000筆資料,索引表也需要異動1000次,而其實我們只需要最後一次(最新)的異動就好了,前面的999次都是不需要做的索引表異動更新,所以在異動大量資料前,可以使用指令 `ALTER TABLE table_name DISABLE KEYS;` 關閉索引更新,等異動完成後,再使用指令 `ALTER TABLE table_name ENABLE KEYS;` 開啟索引更新。 139 | 140 | ```sql 141 | ALTER TABLE users DISABLE KEYS; 142 | 異動(INSERT、DELETE、UPDATE)大量資料SQL語法 143 | ALTER TABLE users ENABLE KEYS; 144 | ``` 145 | 146 | ## 自定義Hash Index做字串完整比對 147 | 148 | 我們知道在對`字串(CHAR或VARCHAR)`去做查找的時候效率會遠比對`整數(INT)`查找還慢,因CRC32對字串做校驗後會回傳`整數的校驗碼`,我們在資料表增加一個整數型態欄位,儲存要比對字串的校驗碼。 149 | 150 | ### crc32 151 | 152 | > 建立str 的 32 位迴圈冗余校驗碼多項式。這通常用於檢查傳輸的資料是否完整。 153 | 154 | > 由於 PHP 的整數是帶符號的,許多 crc32 校驗碼將返回負整數,因此你需要使用 sprintf() 或 printf() 的「%u」格式符來取得表示無符號 crc32 校驗碼的字串。 155 | 在原本實作email登入時會對email欄位做索引,所以會先去查找email字串欄位的資料,之後再去比對密碼是否正確,但若資料過多字串比對的效率會降低很多 156 | 157 | ```sql 158 | SELECT id,name,email 159 | FROM user 160 | WHERE email = "kejyun@gmail.com" 161 | AND password = "xxx" 162 | ``` 163 | 164 | 我們加入了 `emailcrc` 的欄位去儲存對 email 字串的校驗碼,再查找 email 字串欄位的資料前,先透過crc整數校驗碼快速過濾掉不可能的資料,之後再從少數的資料中做 email 字串欄位字串比對,如果資料量很大,這樣的效率會提升很多。 165 | 166 | ```sql 167 | SELECT id,name,email 168 | FROM user 169 | WHERE emailcrc = CRC32("kejyun@gmail.com") 170 | AND email="kejyun@gmail.com" 171 | AND password="xxx" 172 | ``` 173 | 174 | 這邊要注意的是沒辦法只使用 crc 校驗碼去當作唯一的條件,`不同的字串可能會出現相同的校驗碼`,所以最後還是要對你要比對的字串做比對,避免查詢發生錯誤。 175 | 176 | |email | emailcrc | 177 | | ------------- |:-------------:| 178 | |kejyun1@gmail.com | 1234567890 | 179 | |kejyun2@gmail.com | 1234567890 | 180 | 181 | 182 | ## 參考資料 183 | * [KeJyun學習日誌: 提高存取MySQL效率小技巧](http://blog.kejyun.com/2012/12/Tips-For-Use-MySQL-With-High-Performance.html) 184 | * [KeJyun學習日誌: MySQL效率調校](http://blog.kejyun.com/2012/12/MySQL-Efficiency-Adjustment.html) 185 | * [MySQL Indexing: Best Practices Slide PDF](http://www.percona.com/files/presentations/WEBINAR-MySQL-Indexing-Best-Practices.pdf) 186 | * [Tools and Techniques for Index Design PDF Slide](http://www.percona.com/files/presentations/WEBINAR-tools-and-techniques-for-index-design.pdf) 187 | * [EXPLAIN Demystified PDF slide](http://www.percona.com/files/presentations/WEBINAR2012-02-Explain-Demystified.pdf) 188 | * [Optimizing MySQL Configuration PDF Slide](http://www.percona.com/files/presentations/percona-live/london-2012/PLUK2012-optimizing-mysql-configuration.pdf) 189 | * [PHP手冊 - crc32](http://por.tw/Website_Design/PHP5/function.crc32.html) 190 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 651 | 652 | If the program does terminal interaction, make it output a short 653 | notice like this when it starts in an interactive mode: 654 | 655 | Copyright (C) 656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 657 | This is free software, and you are welcome to redistribute it 658 | under certain conditions; type `show c' for details. 659 | 660 | The hypothetical commands `show w' and `show c' should show the appropriate 661 | parts of the General Public License. Of course, your program's commands 662 | might be different; for a GUI interface, you would use an "about box". 663 | 664 | You should also get your employer (if you work as a programmer) or school, 665 | if any, to sign a "copyright disclaimer" for the program, if necessary. 666 | For more information on this, and how to apply and follow the GNU GPL, see 667 | . 668 | 669 | The GNU General Public License does not permit incorporating your program 670 | into proprietary programs. If your program is a subroutine library, you 671 | may consider it more useful to permit linking proprietary applications with 672 | the library. If this is what you want to do, use the GNU Lesser General 673 | Public License instead of this License. But first, please read 674 | . 675 | --------------------------------------------------------------------------------