├── Auth
    ├── Auth-README.md
    ├── Session
    │   └── Session-README.md
    └── JWT
    │   └── JWT-README.md
├── .gitignore
├── Cache
    ├── Cache-README.md
    ├── Cache-Delete-Cache.md
    └── Cache-Set-Cache.md
├── Question
    ├── Question-README.md
    └── Question-Benchmark.md
├── Code
    ├── Code-README.md
    └── Code-Array.md
├── Database
    ├── Database-README.md
    ├── Database-MySQL-README.md
    ├── Database-MySQL-Server.md
    ├── Database-MySQL-Transaction.md
    ├── Database-MySQL-DataType.md
    ├── Database-MySQL-Lock-Table.md
    ├── Database-MySQL-Code-SQL.md
    ├── Database-Data-Permutations.md
    ├── Database-MySQL-SELECT-SQL-Explain-Test.md
    └── Database-MySQL-Index.md
├── Reference
    ├── Reference-README.md
    ├── Reference-Books.md
    ├── Reference-Video.md
    ├── Reference-Tool.md
    └── Reference-Article.md
├── Architecture-Evolution
    ├── Architecture-Evolution-README.md
    ├── Architecture-Evolution-Partial-Page-Cache.md
    ├── Architecture-Evolution-Add-Database-Server.md
    ├── Architecture-Evolution-Static-Page-Cache.md
    ├── Architecture-Evolution-High-Useability-Service-Structure.md
    ├── Architecture-Evolution-Data-Distribute.md
    ├── Architecture-Evolution-Add-Web-Server.md
    ├── Architecture-Evolution-Add-More-Web-Server.md
    ├── Architecture-Evolution-Break-Up.md
    ├── Architecture-Evolution-Set-Database-Read-Write-Server.md
    ├── Architecture-Evolution-All-Together.md
    └── Architecture-Evolution-Dynamic-Data-Cache.md
├── book.json
├── README.md
├── SUMMARY.md
├── File
    └── File-README.md
├── High-Scaling-Websites-Structure-Concept.md
└── LICENSE


/Auth/Auth-README.md:
--------------------------------------------------------------------------------
1 | # 授權
2 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | /_book/
2 | /node_modules/
3 | 


--------------------------------------------------------------------------------
/Cache/Cache-README.md:
--------------------------------------------------------------------------------
1 | # 快取
2 | 
3 | 在這裡會介紹怎麼樣用快取去加快資料存取的效率
4 | 


--------------------------------------------------------------------------------
/Question/Question-README.md:
--------------------------------------------------------------------------------
1 | # 常見問題
2 | 
3 | 這裏會解說一些效能相關的問題
4 | 


--------------------------------------------------------------------------------
/Code/Code-README.md:
--------------------------------------------------------------------------------
1 | # 程式
2 | 
3 | 在這裡會介紹程式中要怎麼控制去快速的處理資料，提高程式的存取效率
4 | 


--------------------------------------------------------------------------------
/Database/Database-README.md:
--------------------------------------------------------------------------------
1 | # 資料庫
2 | 
3 | 在這裡會介紹一些如何提高查詢資料庫效率的相關技巧
4 | 


--------------------------------------------------------------------------------
/Reference/Reference-README.md:
--------------------------------------------------------------------------------
1 | # 參考資料
2 | 
3 | 這裏會紀錄一些相關的參考資料，可以去做進一步的學習及查詢
4 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-README.md:
--------------------------------------------------------------------------------
1 | # MySQL
2 | 
3 | 在這裡會介紹一些在 MySQL 提高存取效率的一些技巧及方法
4 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-README.md:
--------------------------------------------------------------------------------
1 | # 網站架構演進
2 | 
3 | 這裏會介紹從小型網站到大型網站整個的架構，是怎麼規劃及演進的。
4 | 


--------------------------------------------------------------------------------
/Reference/Reference-Books.md:
--------------------------------------------------------------------------------
1 | # 參考書籍
2 | 
3 | ## MySQL
4 | * [High Performance MySQL, 2nd Edition原文書免費下載](http://it-ebooks.info/book/256/)
5 | 


--------------------------------------------------------------------------------
/Reference/Reference-Video.md:
--------------------------------------------------------------------------------
1 | # 參考影片
2 | 
3 | ## 流量分散式處理
4 | * [Load Balancing Stateless vs. Stateful Applications](https://www.youtube.com/watch?v=NrMM3s7Mbjo)
5 | 


--------------------------------------------------------------------------------
/Reference/Reference-Tool.md:
--------------------------------------------------------------------------------
 1 | # 參考工具
 2 | 
 3 | ## JavaScript
 4 | * [jsPerf — JavaScript performance playground](http://jsperf.com/)
 5 | 
 6 | ## Database
 7 | 
 8 | ### 連線池
 9 | * [Database Connection Pool Library | Libzdb](http://www.tildeslash.com/libzdb/#)
10 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Partial-Page-Cache.md:
--------------------------------------------------------------------------------
 1 | # 頁面片段暫存
 2 | 
 3 | 在我們將靜態頁面暫存起來後，減少了很多對資料庫不必要的存取，`回應時間` 提升了了很多。
 4 | 
 5 | 但我們也會想看看能不能在動態產生的頁面中，對於部分的很少變動的資料也做暫存的處理，這樣的話就可以減少部分動態網頁頁面不必要的存取了。
 6 | 
 7 | ***架構圖：***
 8 | 
 9 | ![頁面片段暫存](http://i.imgur.com/YfClhCO.png)
10 | 
11 | ## 參考資料
12 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
13 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Add-Database-Server.md:
--------------------------------------------------------------------------------
 1 | # 增加 Database Server
 2 | 
 3 | 在網站成長之後一段時間後，發現資料庫在查詢（SELECT）及資料異動（INSERT、UPDATE、DELETE）之間，因為查詢處理變得太多，使得互相競爭資料庫的資源非常嚴重，進而導致系統回應時間變得很慢。
 4 | 
 5 | 這時候我們可能會想要把不同的資料表分散在不同的資料庫當中，分散查詢處理的資源，而會需要更改程式讀取資料庫的邏輯架構。
 6 | 
 7 | ***架構圖：***
 8 | 
 9 | ![增加 Database Server](http://i.imgur.com/zxlXQLx.png)
10 | 
11 | 
12 | ## 參考資料
13 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
14 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-Server.md:
--------------------------------------------------------------------------------
 1 | # 伺服器
 2 | 
 3 | ## 設定
 4 | 
 5 | 
 6 | ### max_allowed_packet：允許最大封包
 7 | 
 8 | 在INSERT或UPDATE的時候，因為封包資料過大，導致錯誤發生
 9 | 
10 | ### max_connect_errors：最大連線錯誤
11 | 
12 | 設 4294967295，可以避免當 client (像是 php) 發生太多錯誤時被 block 住。
13 | 
14 | > default 0，請設定一個最大的值即可，32位元的系統 最大是 4bytes 4294967295、64位元的系統最大是 18446744073709547520
15 | 
16 | ## 參考資料
17 | * [note: mysql 跟效能有關的設定 @ mini box 迷你小盒子 :: nidBox親子盒子](http://mini.nidbox.com/diary/read/7451092)
18 | 


--------------------------------------------------------------------------------
/book.json:
--------------------------------------------------------------------------------
 1 | {
 2 |   "plugins": ["disqus","ga", "adsense"],
 3 |   "pluginsConfig": {
 4 |     "disqus": {
 5 |       "shortName": "kejyun-high-scaling-websites-structure"
 6 |     },
 7 |     "ga": {
 8 |       "token": "UA-31063691-11"
 9 |     },
10 |     "adsense": {
11 |         "client": "ca-pub-1990193713845546",
12 |         "slot": "2858259611",
13 |         "format": "auto",
14 |         "element": ".page-inner section",
15 |         "position": "top"
16 |     }
17 |   }
18 | }


--------------------------------------------------------------------------------
/Auth/Session/Session-README.md:
--------------------------------------------------------------------------------
 1 | # Session
 2 | 
 3 | ## Sticky Session / Cookie stickiness
 4 | 
 5 | 當服務有超過 1 台以上的主機時，在主機前面會使用 Load balancer 去做 Request 的分流，假設有 3 台主機，同一個使用者的 Session 就會分散在不同的主機上，會導致使用者在不同主機的 Session 資料不一樣，可能使用者在主機 1 登入狀態，在主機 2 卻是登出的狀態。
 6 | 
 7 | 可以使用 Sticky Session，讓同一個使用者的 Request 都指向固定一台實體主機，這樣可保持使用者登入的狀態
 8 | 
 9 | 
10 | ## 參考資料
11 | * [Sticky and NON-Sticky sessions - Stack Overflow](https://stackoverflow.com/questions/10494431/sticky-and-non-sticky-sessions)
12 | * [New Elastic Load Balancing Feature: Sticky Sessions | AWS News Blog](https://aws.amazon.com/blogs/aws/new-elastic-load-balancing-feature-sticky-sessions/)
13 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Static-Page-Cache.md:
--------------------------------------------------------------------------------
 1 | # 前端靜態頁面暫存
 2 | 
 3 | 
 4 | 當越來越多人來存取你的應用時，你會發現你服務的 `反應時間` 又開始變慢了，你可以會發現 `Web` 機器效能的使用沒有 `Database` 那麼大，就會發現現在的瓶頸應該是卡在 `Database` 有太多人去進行存取了。
 5 | 
 6 | 我麼這時候會試著將頁面中很少異動的頁面（大約 1~2 天才會更新的頁面），做成靜態頁面的快取，在撈取資料庫資料時把資料產生成靜態 HTML 檔案，當下次再次的讀取相同資料時，則直接將靜態的 HTML 回傳，減少資料庫的存取，提高存取資料庫的效率（有需要再去進行查詢）。
 7 | 
 8 | ***架構圖：***
 9 | 
10 | ![前端靜態頁面暫存](http://i.imgur.com/20Zr88T.png)
11 | 
12 | ## 注意事項
13 | 
14 | ### 整頁靜態頁不可頻繁的修改
15 | 
16 | 因為會將整個頁面都做是要給使用者看到的整個結果頁，所以頁面的資料若時常修改的話，則勢必要一直重新產生新的靜態頁面，但這樣就失去了做靜態頁面暫存的意義了
17 | 
18 | ## 參考資料
19 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
20 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-High-Useability-Service-Structure.md:
--------------------------------------------------------------------------------
 1 | # 高可用性的服務架構
 2 | 
 3 | 透過不斷的增加 Web Server 就可以提高使用者訪問量，但這樣的架構非常的龐大，當要對這樣的架構進行改動時會相當的不方便，可用性變得不高。
 4 | 
 5 | 而且部署機器和維護也必變得相當麻煩，龐大的應用服務架構要在 N 台機器上進行複製、啟動都需要花很多的時間，當機器出問題時也很難立即找出拿台機器出了問題，更有可能是某個應用服務的程式出現 Bug，導致整個站掛掉都沒辦法使用了。
 6 | 
 7 | 在優化調校時也比較難操作，因為部署的機器每一台都需要進襲調校，沒辦法只進行針對性的調校。
 8 | 
 9 | 餘是為了解決這樣的問題，就發展出大型的分佈式應用，而這樣的架構可能會遇到許多挑戰：
10 | 
11 | * 分布式應用需要提供高效能且穩定的架構
12 | 
13 | * 將龐大的應用拆分出來需要耗費很長的時間，而且需要對業務的整理和系統依賴關係進行控制
14 | 
15 | * 如何維護（依賴管理、執行狀況管理、錯誤追踪、調校優化、監控和示警等）好這個龐大的分佈式應用。
16 | 
17 | 完成這一步後，系統的架構差不多會到相對穩定的階段，也會使用大量便宜的機器去支撐突如其來的訪問流量
18 | 
19 | 
20 | # 參考資料
21 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
22 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Data-Distribute.md:
--------------------------------------------------------------------------------
 1 | # 分散式資料
 2 | 
 3 | 在做資料庫分散之後，若單一資料表的資料太多，我們會想要用分散資料庫的概念，對資料表中的資料進行分散式處理，而會再需要更改程式讀取資料庫及資料表的邏輯架構。
 4 | 
 5 | 而在這我們就需要規劃資料分散在多個資料庫及資料表的架構，要怎麼去存取，我們通常會設計一個統一的資料庫及資料表資源分散的統一架構，去進行管理資源的存放及存取方式。
 6 | 
 7 | 像是 Facebook 每天有好幾百萬千萬的文章資訊會被發表出來，我們可能會對各個使用者做文章存取的資源分配，將使用者的資源分配記錄記在統一個資源管理架構中做記錄，紀錄的資訊可能會是 User A 的文章資料存放在 DB_1 及 TABLE_2，而 User B 的文章資料存放在 DB_3 及 TABLE_1，當要存取個別使用者的資料資料，根據資源管理分配的的方式，分別去不同的資料庫及資料表撈取資料。
 8 | 
 9 | 而當需要使用分散式資料的情況下，可能也會發現因為資料量過大，導致快取同步的方式出了問題，讓快取不能只存放在本地端而用同步的方式進行分享，可能需要採用分散式快取的方式去快取資料。
10 | 
11 | ***架構圖：***
12 | 
13 | ![增加 Database Server](http://i.imgur.com/ky9Y29v.png)
14 | 
15 | 
16 | ## 參考資料
17 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
18 | 


--------------------------------------------------------------------------------
/Question/Question-Benchmark.md:
--------------------------------------------------------------------------------
 1 | # 壓力測試
 2 | 
 3 | 在我們寫完程式並上了伺服器後，若我們的網站需要服務很多人，我們可能會想要這個服務能夠服務多少人，通常我們會對伺服器進行壓力測試。
 4 | 
 5 | 壓力測試的軟體有很多，我比較常用的是 `Apache AB test` ，簡單的了解伺服器在多少連線下，處理請求（Request）的速度及效能，而同樣的機器在不同的程式處理之下，會有不同的效能，所以沒辦法統一的斷定哪一種機器只能服務多少人，若程式寫的好的話，可能同時可以應付 4、500 人同使請求，但也有程式可能因為效能不佳，僅能服務 4、50 人（伺服器回應時間標準一樣），所以要對自己的應用服務做壓力測試後才知道狀況。
 6 | 
 7 | 也有很多其他的壓力測試的軟體，只要找出一套適合自己的壓測工具就可以了。
 8 | 
 9 | ## 參考資料
10 | * [ab - Apache HTTP server benchmarking tool](http://httpd.apache.org/docs/2.2/programs/ab.html)
11 | * [The Will Will Web | 使用 ApacheBench 進行網站的壓力測試](http://blog.miniasp.com/post/2008/06/30/Using-ApacheBench-ab-to-to-Web-stress-test.aspx)
12 | * [Web server benchmarking - Wikipedia, the free encyclopedia](http://en.wikipedia.org/wiki/Web_server_benchmarking)
13 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Add-Web-Server.md:
--------------------------------------------------------------------------------
 1 | # 增加 Web Server
 2 | 
 3 | 隨著網站應用使用者增加，我們會發現 Web Server 機器的壓力變得比較高了，這個時候就會開始考慮增加 Web Server 了。增加 Web Server 除了可以降低機器的壓力外，同時也可以在 Web Server 掛掉後，有備援機器可以使用，避免整個服務完全中斷。
 4 | 
 5 | 在增加 Web Server 之後，會碰到一些問題：
 6 | 
 7 | * 如何將流量平均分配到不同的 Web Server
 8 | 
 9 | > 通常我們會使用 Apache 的負載平衡方法，或是 LVS 相關的負載平衡方法去分配這些流量
10 | 
11 | * 如何讓 Web Server 的資料同步
12 | 
13 | > 像是使用者的 session 在不同機器要如何同步，可能會考慮使用資料庫、cookie 或同步 Session 的機制
14 | 
15 | * 快取（cache）資料同步
16 | 
17 | > 像是快取會員的個人資料，可能會使用快取同步的機制，或者是分散式快取的方式
18 | 
19 | * 上傳的檔案同步
20 | 
21 | > 可能會使用檔案共享的方式
22 | 
23 | ***架構圖：***
24 | 
25 | ![增加 Web Server](http://i.imgur.com/RA6cJgk.png)
26 | 
27 | 
28 | ## 參考資料
29 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
30 | 


--------------------------------------------------------------------------------
/Auth/JWT/JWT-README.md:
--------------------------------------------------------------------------------
 1 | # JWT（JSON Web Token）
 2 | 
 3 | ## 參考資料
 4 | * [JSON Web Tokens - jwt.io](https://jwt.io/)
 5 | * [Authentication 方案优化探索（JWT, Session, Refresh Token, etc.） - 后端 - 掘金](https://juejin.im/entry/5974088ef265da6c2c7f5c52)
 6 | * [在 Laravel 實現自動 Refresh JWT 機制](https://blog.albert-chen.com/laravel-auto-refresh-jwt/)
 7 | * [\[Day-32\] (實作)用JWT取代傳統Session來驗證使用者身份 - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天](https://ithelp.ithome.com.tw/articles/10196759)
 8 | * [\[Day-33\] (實作)使用JWT來存取API內容(上) - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天](https://ithelp.ithome.com.tw/articles/10196967)
 9 | * [\[Day-34\] (實作)使用JWT來存取API內容(下) - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天](https://ithelp.ithome.com.tw/articles/10197054)
10 | * [Stop using JWT for sessions - joepie91's Ramblings](http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/)
11 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-Transaction.md:
--------------------------------------------------------------------------------
 1 | # 交易
 2 | 
 3 | 
 4 | ## 控制InnoDB Transaction
 5 | 
 6 | ### 關閉自動提交：
 7 | 
 8 | ```sql
 9 | SET autocommit=0;
10 | ```
11 | 
12 | ### 開啟自動提交：
13 | 
14 | ```sql
15 | SET autocommit=1;
16 | ```
17 | 
18 | 在 MySQL InnoDB 預設所有的資料異動都是 `Transaction（交易）`，當與資料庫做連線的時候，InnoDB 會採用`自動提交（autocommit）`的方式，所以除非使用 `BEGIN;` 及 `COMMIT;` 將異動語法包覆起來***設定為同一個  Transaction，否則任何一個資料異動的語法（INSERT、DELETE、UPDATE）會認為是一個獨立的 Transaction***。
19 | 
20 | 所以資料表每次做資料異動的時候會一直`提交 (COMMIT)` 去更新日誌，若有 1000筆 獨立的SQL要執行就會被 COMMIT 1000 次，所以在下異動語法之前，可以使用指令 `SET autocommit=0;` 關閉自動提交，等異動完成後，再使用指令 `SET autocommit=1;` 開啟自動提交。
21 | 
22 | ```sql
23 | SET autocommit=0;
24 | 異動（INSERT、DELETE、UPDATE）大量資料SQL語法
25 | SET autocommit=1;
26 | ```
27 | 
28 | ## 參考資料
29 | * [KeJyun學習日誌: MySQL效率調校](http://blog.kejyun.com/2012/12/MySQL-Efficiency-Adjustment.html)
30 | 


--------------------------------------------------------------------------------
/Code/Code-Array.md:
--------------------------------------------------------------------------------
 1 | # 陣列
 2 | 
 3 | ## 鍵值（key）
 4 | 
 5 | 在撈取資料庫的資料後，我們會把資料存在陣列中，等之後比對資料時可以當作查詢之用，若可以的話建議陣列的`鍵值（key）`要用可以辨識資料的值。
 6 | 
 7 | 像是使用者的資料可以用使用者編號當鍵值，若之後要直接取特定使用者的資料，直接就可以指定使用者編號去取得使用者的資料：
 8 | 
 9 | ```php
10 | <?php
11 | 
12 | $users[$user_id] = $user_info;
13 | ```
14 | 
15 | 像是我們要取所有使用者的留言，留言資料中有使用者編號，不同的留言可能來自同一個使用者，但我們都要取用他的使用者名稱，這樣我們就可以很方便的取得該使用者的資料。
16 | 
17 | ```php
18 | <?php
19 | 
20 | $messages = [
21 |   [
22 |     'id'      => 1,
23 |     'user_id' => 1,
24 |     'message' => 'Hi'
25 |   ],
26 |   [
27 |     'id'      => 2,
28 |     'user_id' => 2,
29 |     'message' => 'hello'
30 |   ],
31 |   [
32 |     'id'      => 3,
33 |     'user_id' => 1,
34 |     'message' => 'How are you?'
35 |   ]
36 | ];
37 | 
38 | // 取得留言使用者姓名
39 | foreach ($messages as &$message) {
40 |   $message['user_name'] = $users[$message['user_id']]['user_name'];
41 | }
42 | ```
43 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Add-More-Web-Server.md:
--------------------------------------------------------------------------------
 1 | # 增加更多的 Web Server
 2 | 
 3 | 在做完分散式資料庫及分散式資料表之後，又開始看到流量暴增的的時候，發現系統的回應時間又開始變慢了，你可能會看到 Web Server（Apache、Nginx...etc）那邊阻塞了很多的請求，而導致每個存取的使用者需要時間等待而導致回應時間增加了，這個時候我們就可能需要增加更多的 Web Server 去應付這些大量的請求。
 4 | 
 5 | 在做完這些工作之後，會進到一個像是可以無限擴充的階段，網站流量增加時，就是不斷的增加 Server （$$$） 去應付大量的流量。
 6 | 
 7 | 在做 Web Server 負載平衡時可能會遇到下列問題：
 8 | 
 9 | * 基本的負載平衡（Apache 負載平衡、LVS...etc）無法應付巨量的請求量
10 | 
11 | 如果經費允許的話（都流量暴增了，會不想要投入更多的 $ 去讓自己的事業擴大嗎 XD），可以採購負載平衡的硬體設備
12 | 
13 | > e.g. [F5 Load Balancer](https://f5.com/glossary/load-balancer)、[Netsclar Load Balancer](http://www.citrix.com.tw/products/netscaler-application-delivery-controller/overview.html)...etc
14 | 
15 | 若真的經濟不允許，可以將應用從邏輯上做分類，不同類型的應用分散給不同的負載平衡群集處理。
16 | 
17 | * 訊息同步的文件分享方法出現瓶頸
18 | 
19 | 這時可以根據不同的網站業務需求去做不同的分散式文件系統
20 | 
21 | ## 參考資料
22 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
23 | * [F5 Load Balancer](https://f5.com/glossary/load-balancer)
24 | * [Netsclar Load Balancer](http://www.citrix.com.tw/products/netscaler-application-delivery-controller/overview.html)
25 | 


--------------------------------------------------------------------------------
/Cache/Cache-Delete-Cache.md:
--------------------------------------------------------------------------------
 1 | # 刪除快取
 2 | 
 3 | 我們設定了快取希望使用者存取資料的時候能夠加快回應時間，但資料可能會因為使用者的`刪除`或`修改`而產生了異動，若我們需要讓使用者能夠讀取到最新的資料，我們通常會手動的將快取刪除，讓使用者下次讀取的話能夠去資料庫讀取到最新的資料，在重新設定一次新的快取資料。
 4 | 
 5 | ## 以部落格文章快取為例
 6 | 
 7 | ### 更新部落格文章資料
 8 | 
 9 | ```php
10 | <?php
11 | 
12 | // 快取 key 依照不同的文章編號，可以是 blog_post:1 或 blog_post:2
13 | $cache_key = 'blog_post:' . $post_id;
14 | 
15 | $results = DB::update('UPDATE `posts` SET `content`= ? WHERE `post_id` = ?', [$new_content, $post_id]);
16 | 
17 | if ($result) {
18 |   // 選擇1: 直接設定文章快取，讓下一位使用者再次存取相同文章時可直接取用快取
19 |   Cache::put($cache_key, $new_content, $cache_minutes);
20 | 
21 |   // 選擇2: 刪除快取，等下一位使用者讀取文章時在去設定快取
22 |   Cache::forget($cache_key);
23 | }
24 | 
25 | return $result;
26 | ```
27 | 
28 | ### 刪除部落格文章資料
29 | 
30 | ```php
31 | <?php
32 | 
33 | // 快取 key 依照不同的文章編號，可以是 blog_post:1 或 blog_post:2
34 | $cache_key = 'blog_post:' . $post_id;
35 | 
36 | $results = DB::delete('DELETE FROM `posts` WHERE `post_id` = ?', [$post_id]);
37 | 
38 | if ($result) {
39 |   // 刪除快取，讓下一位使用者不要再讀取已被刪除的文章
40 |   Cache::forget($cache_key);
41 | }
42 | 
43 | return $result;
44 | ```
45 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Break-Up.md:
--------------------------------------------------------------------------------
 1 | # Web 與 Database 分手
 2 | 
 3 | 當服務越來越多人需要去同時存取時，原本的 `Web` 與 `Database 在一起` 的架構會變得不敷使用了，網站的 `反應時間` 會變得越來越慢，伺服器的 `CPU` 及 `記憶體` 的消耗也會變得越來越高。
 4 | 
 5 | 而且服務彼此還會互相搶機器的資源，若機器發生問題導致資料庫的資料也出現問題，會造成的麻煩會變得更大，於是我們就會想要把 Web 與 Database 的機器相互分離。
 6 | 
 7 | 當用這樣的架構之後，你會發現效率會比之前提高很多，服務彼此不會搶資源了，但是就是要付出 $ 去添購機器摟（當可以用 $ 解決的問題都是小問題）
 8 | 
 9 | ***架構圖：***
10 | 
11 | ![Web 與 Database 分手](http://i.imgur.com/bEOTyVv.png)
12 | 
13 | 
14 | ## 使用時機
15 | 
16 | ### 同時上線人數約 20 ~ 40 人左右
17 | 
18 | ***人數為 KeJyun 過去經驗大概估算的人數，沒有經過實際測試僅參考用***
19 | 
20 | 當我們的服務能夠尤多台機器去分攤時，機器彼此的處理效率當然就會提高許多，可以乘載的人數也會提高
21 | 
22 | 以飲料店來當例子，1 個服務生可以同時服務 3 位客人（主機同時有 3 個人發出請求），但是若有 2 位服務生就可以同時服務 6 位客人（主機同時有 6 個人發出請求），
23 | 將替客人做點餐結帳、做飲料的工作做分工處理，一為服務生同時可以替 6 位客人做點餐結帳的工作，另一位則可以同時替 6 為客人做飲料，整個服務可以乘載的量就會提高很多，
24 | 讓每個人專心做一件事情，效率也會提高很多
25 | 
26 | > 同時上線非每日上限，同日上線指的是同一時先有多個人同時跟伺服器要資源
27 | 
28 | ### 需要維持一定的服務可用度
29 | 
30 | 當服務需要提供比較多人去做存取時，我們就必須要提供一定水準的服務可用度，將反應時間保持在 1 ~ 2 秒內的範圍。
31 | 
32 | ### 資料定期備份
33 | 
34 | 雖然服務還在剛起步的階段，但是資料還是非常重要的公司資產，若只有一台 Database，我們還是需要定期的將資料備份出來（最少 1 天備援一次），避免機器掛掉的機會。
35 | 
36 | 
37 | ## 參考資料
38 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
39 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Set-Database-Read-Write-Server.md:
--------------------------------------------------------------------------------
 1 | # Database 讀寫分離
 2 | 
 3 | 當流量又開始暴增時，會發現原本無限制擴充 Server （$$$）的架構會讓資料庫連線的資源不敷使用。
 4 | 
 5 | 因為而在做資料異動的時候，若有要更新範圍（range）的資料，很容易把資料進行鎖定，進而彼此影響到查詢上面的效能。
 6 | 在 [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99)中，大部份 80% 人都是在看文章比較多（讀取資料：SELECT），只有少部分 20% 的人或意見領袖，才會發表文章表示看法（異動資料：INSERT、UPDATE、DELETE），而在做資料異動的時候很有可能會對資料進行鎖定，進而去影響讀取的速度。若是要異動範圍（range）的資料，很容易把資料進行鎖定，更會影響到彼此查詢上面的效能。
 7 | 
 8 | 這時候我們就會想要把資料庫做`讀寫分離`，所以像是 Facebook、部落格之類的媒體，大多會把資料庫做`讀寫分離`，使用者做異動的行為會去主資料庫（Master）去做寫入的動作，然後從資料庫（Slave）在定期的去同步資料庫的內容，而其他大部份的讀者在讀取資料時，都去讀取從資料庫（Slave）的資料，這樣就不會有因為資料異動而導致資料被鎖定，造成讀取變慢的問題。
 9 | 
10 | 而在主從架構中，可以是有很多台從的資料庫（Slave），透過分散式處理可以將不同的連線分配給不同的從資料庫（Slave），讓每一台從的資料庫平均分配差不多的連線量，因為需要處理的連線減少了，進而讓每個查詢都能夠在短時間都能夠回應查詢結果，提高系統的可用度。
11 | 
12 | > 資料庫主從架構中，`主（Master）`資料庫用來做`寫入資料異動（INSERT、UPDATE、DELETE）`的動作，`從（Slave）`資料庫用來做`讀取（SELECT）`的動作。
13 | 
14 | 
15 | 而若有些很少在查詢的資料，像是 `Log 紀錄`或是在做 `分析的資料`，這類的資料從放在可靠性較高的資料庫（MySQL、Postgres、Oracle...etc）中，對於我們來說是比較浪費的，因為他會佔用存取資料庫的資源，這類的資料我們可能可以考慮用一些 NoSQL 的方案去存取這些資料，讓重要的資料存放在可靠性的資料庫就好，提高資料庫的可用性。
16 | 
17 | # 參考資料
18 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
19 | * [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99)
20 | 


--------------------------------------------------------------------------------
/Cache/Cache-Set-Cache.md:
--------------------------------------------------------------------------------
 1 | # 設定快取
 2 | 
 3 | 對於同樣不常異動的資料，若有大量使用者存取，我們通常為把資料設為快取（可以存放在記憶體或檔案中），在下次有其他使用者要來存取相同資料的時候，不去資料庫重新撈取資料，直接將快取的資料回傳給使用者，可以減少資料庫的存取，增加系統處理使用者請求的效能。
 4 | 
 5 | ## 以部落格文章快取為例
 6 | 
 7 | 我們會透過文章編號存取部落格文章，像是 `/posts/1` 或 `/posts/2`，以 PHP Laravel Framework 為例，我們可能透過這樣的方式去處理存取部落格文章：
 8 | 
 9 | ```php
10 | <?php
11 | 
12 | // 快取 key 依照不同的文章編號，可以是 blog_post:1 或 blog_post:2
13 | $cache_key = 'blog_post:' . $post_id;
14 | 
15 | // 快取時間為 60 分鐘
16 | //（在 Laravel 的快取方法時間是以分鐘為計，若是原生的快取方法則是以秒數為計，所以要存放 60 分鐘則必須設定為 60 * 60 = 3600）
17 | $cache_minutes = 60;
18 | 
19 | if (Cache::has($cache_key)) {
20 |   return Cache::get($cache_key);
21 | }
22 | 
23 | $results = DB::select('SELECT * FROM `posts` WHERE `post_id` = ?', [$post_id]);
24 | 
25 | if ($result !== null) {
26 |   // 若有找到該文章資料，則設定快取，讓下一位使用者再次存取相同文章時可直接取用快取
27 |   Cache::put($cache_key, $results, $cache_minutes);
28 | }
29 | 
30 | return $result;
31 | ```
32 | 
33 | 我們會先判斷部落格文章的快取是否存在，若有則直接回傳快取資料，若沒有則重新去資料庫撈取文章資料，在撈取完資料後則設定該文章資料的快取，這樣的話之後有其他使用者要存取相同的部落格文章，則直接從快取拿資料就好，不用再到資料庫去撈取資料，減少資料庫的存取，以提升快取的存取效率。
34 | 
35 | 快取的失效時間必須要依你自己的應用去設定，因為快取大多是放在的記憶體當中（Memcached、Redis）這樣 I/O 存取速度才會快，但是記憶體是`有限`的昂貴資源，除非資料需要一直做快取讓使用者存取，否則快取通常不會設定過長的失效時間，希望失效後自動清除並釋放記憶體資源給其他的快取或應用做使用。
36 | 
37 | 若 $ 夠多，能夠買夠多的記憶體，將快取失效時間設為 1 個月，甚至是更大也沒差，端看自己的`硬體狀況`及`資料需要存放狀況`去決定快取的時間。
38 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-DataType.md:
--------------------------------------------------------------------------------
 1 | # 資料類型
 2 | 
 3 | ## 越小越好
 4 | 
 5 | 要使用能夠用來正確儲存資料的最小類型的資料型態，要確保不會低估要儲存的資料大小，因為當資料可以確定用比較少的資料去儲存的時候，所`使用的空間`也會更少，所`需要的索引空間`也會越少，取得資料所需要的`計算時間`也會越短。
 6 | 
 7 | > 在可以用`CHAR(20)`去儲存資料時，就不要用`CHAR(200)`
 8 | 
 9 | > 在可以用`VARCHAR(20)`去儲存資料時，就不要用`VARCHAR(200)`
10 | 
11 | > 在可以用`TINYINT`去儲存資料時，就不要用`INT`
12 | 
13 | > 在可以用`TEXT`去儲存資料時，就不要用`LONGTEXT`
14 | 
15 | > 在可以用`BLOB`去儲存資料時，就不要用`LONGBLOB`
16 | 
17 | 
18 | ## 越簡單越好
19 | 
20 |  > 整數（INT）> 固定字串（CHAR）> 變動字串（VARCHAR）> 文字（TEXT）
21 | 
22 | 越簡單的資料類型，資料庫所需要用來建立索引的效率越好，因為資料類型越複雜代表資料的排列組合越多，所以需要更大的索引及計算去取得資料
23 | 
24 | > 在可以用整數（INT）去儲存資料時，就不要用固定字串（CHAR）
25 | 
26 | > 在可以用固定字串（CHAR）去儲存資料時，就不要用變動字串（VARCHAR）
27 | 
28 | > 在可以用變動字串（VARCHAR）去儲存資料時，就不要用文字（TEXT）
29 | 
30 | 
31 | ## 盡量不使用NULL
32 | 
33 | 若非必要儲存NULL的資料，否則要盡可能的把資料欄位設定為NOT NULL，資料庫很難最佳化有NULL資料欄位的查詢，可以NULL的資料欄位需要更多的儲存空間，資料庫還需要對其進行特殊處理，而當有NULL資料欄位使用`索引（INDEX）`的時候，每一條的索引紀錄必需要額外紀錄資料，導致查詢時索引的效率降低。
34 | 
35 | 若真的要儲存NULL，在`不影響原有的資料`的情況下，可以考慮用`0`、`特殊值`...等等之類的值去代替，可以用來區別是否為NULL
36 | 
37 | 
38 | ## 結論
39 | 
40 | 一切的資料類型都只能看自己應用的需求去決定，如果沒辦法還是要用比較複雜的資料類型，那還是必須要用，資料的完整性比任何的東西都重要多了，效率就想辦法用`增加機器`或者是`優化資料表結構`...等等的其他方式去達成，不要為了效率增加而強迫使用特定的資料類型，這樣反而是因噎廢食。
41 | 
42 | 
43 | ## 參考資料
44 | * [KeJyun學習日誌: 提高存取MySQL效率小技巧](http://blog.kejyun.com/2012/12/Tips-For-Use-MySQL-With-High-Performance.html)
45 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-Lock-Table.md:
--------------------------------------------------------------------------------
 1 | # 鎖表 (Lock Table)
 2 | 
 3 | ## 定義
 4 | 
 5 | 在要更新資料表的資料時，`MySQL`會將表資料鎖定無法讀取，直到資料異動完畢，`MyISAM` 預設支援 `Table-level lock`，而 `InnoDB` 預設支援 `Row-level lock`
 6 | 
 7 | ### Table-level lock
 8 | 資料表資料異動時，將「`整個資料表（Table）`」都鎖定住無法讀取
 9 | 
10 | ### Row-level lock
11 | 資料表資料異動時，將「`要更新的資料列（row）`」都鎖定住無法讀取
12 | 
13 | ## 注意事項
14 | 
15 | 在使用 `Row-level lock` 時必需要 `明確指定要異動資料的主鍵（Primary Key）`，否則將會改用 `Table-level lock` 去做資料表的異動
16 | 
17 | ## 範例
18 | 
19 | 假設有 user 資料表，裡面有 id 與 name 的欄位，id 是主鍵
20 | 
21 | | SQL	| Table lock | Row lock |	No lock	| 備註
22 | | ------------- |:-------------:|:-------------:|:-------------:|:-------------:|
23 | | SELECT * FROM user WHERE id='1' FOR UPDATE; | - | v | - | 明確指定主鍵，並且有此筆資料，row lock |
24 | | SELECT * FROM user WHERE id='-1' FOR UPDATE; | - | - | v | 明確指定主鍵，若查無此筆資料，無 lock |
25 | | SELECT * FROM user WHERE name='KeJyun' FOR UPDATE; | v | - | - |  無主鍵，table lock |
26 | | SELECT * FROM user WHERE id<>'1' FOR UPDATE; | v | - | - | 主鍵不明確，table lock |
27 | | SELECT * FROM user WHERE id LIKE '3' FOR UPDATE; | v | - | - | 主鍵不明確，table lock |
28 | 
29 | ## 備註
30 | 
31 | FOR UPDATE 僅適用於 InnoDB，且必須在交易區塊(BEGIN/COMMIT)中才能生效。
32 | 
33 | ## 參考資料
34 | * [KeJyun學習日誌: MySQL鎖表(Lock Table)Table-level與Row-level比較](http://blog.kejyun.com/2012/12/Compare-Tabel-And-Row-Level-Lock-On-MySQL.html)
35 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # High Scaling Websites Structure Learning Notes 大型網站架構學習筆記
 2 | 
 3 | ## 作者 : KeJyun
 4 | 
 5 | 本書會介紹一些在建構大型網站中常用到的技巧或方法
 6 | 
 7 | ### 書籍網址
 8 | 
 9 | | 項目  | 網址 |
10 | |---|---|
11 | |  Github Page | http://kejyun.github.io/high-scaling-websites-structure-learning-notes/  |
12 | |  GitBook | http://kejyuntw.gitbooks.io/high-scaling-websites-structure-learning-notes/  |
13 | 
14 | ### 聯絡資訊
15 | 
16 | | 項目  | 網址 |
17 | |---|---|
18 | |  Email | kejyun@gmail.com  |
19 | |  LinkedIn | https://tw.linkedin.com/in/kejyun  |
20 | |  Github | https://github.com/kejyun  |
21 | |  Facebook | http://fb.me/kejyunTaiwan  |
22 | |  Blog | http://blog.kejyun.com  |
23 | 
24 | 
25 | 
26 | ## 所有 KeJyun 著作
27 | * [High Scaling Websites Structure Learning Notes 大型網站架構學習筆記](http://kejyuntw.gitbooks.io/high-scaling-websites-structure-learning-notes/)
28 | * [Laravel 4 學習筆記](http://kejyuntw.gitbooks.io/laravel-4-learning-notes/)
29 | * [Laravel 5 學習筆記](http://kejyuntw.gitbooks.io/laravel-5-learning-notes/)
30 | * [SEO 學習筆記](http://kejyuntw.gitbooks.io/seo-learning-notes/)
31 | * [gulp 學習筆記](http://kejyuntw.gitbooks.io/gulp-learning-notes/)
32 | * [Web Developer Learning Resource 網頁開發學習資源](http://kejyuntw.gitbooks.io/web-developer-learning-resource/)
33 | * [Mac OSX 新手入門](http://kejyuntw.gitbooks.io/mac-osx-for-newbie/)
34 | * [Ruby on Rails 學習筆記](http://kejyuntw.gitbooks.io/ruby-on-rails-learning-note/)


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-All-Together.md:
--------------------------------------------------------------------------------
 1 | # Web 與 Database 在一起
 2 | 
 3 | 把 Web 或 Database 弄在同一部機器這樣的方式，通常是用在程式開發的「本地測試機」中，或者通常不會`同時`拿來服務太多的人的應用（共同上線人數），像是一般基本的公司形象網站、個人介紹頁...等等不會有太多人同時存取的資料懷鏡時才會使用這樣的架構。
 4 | 
 5 | 除了基本的 Apache（Nginx）及 MySQL（Postgres），視情況需要設置會把其他的服務也放在同一台機器，像是 Memcached、Redis、Node.js。
 6 | 
 7 | ***架構圖：***
 8 | 
 9 | ![Web 與 Database 在一起](http://i.imgur.com/nofyGyI.png)
10 | 
11 | ## 使用時機
12 | 
13 | ### 同時上線人數約 10 ~ 15 人左右
14 | 
15 | ***人數為 KeJyun 過去經驗大概估算的人數，沒有經過實際測試僅參考用***
16 | 
17 | 當一台機器需要乘載多種服務時，同樣的主機資源要做很多事情，當然處理的效率通常都比較低
18 | 
19 | 以飲料店來當例子，1 個服務生可以同時服務 3 位客人（主機同時有 3 個人發出請求），同時替客人做點餐結帳、做飲料的工作，但是當客人越來越多時來到 15 人的時候，因為人的能力有限（主機的效能有限），所以後面很多人就需要排隊等前面幾位客人拿到飲料後才能繼續服務，所以就需要等待排隊（主機回應時間拉長），所以要看自己的經濟規模（同時乘載規模）去聘請適當人數的員工（規劃使用不同的主機架構）
20 | 
21 | > 同時上線非每日上限，同日上線指的是同一時先有多個人同時跟伺服器要資源
22 | 
23 | ### 服務的可用度不需要太高
24 | 
25 | 大部份的使用者可能最高能夠忍受等待[一貶眼的時間](http://news.networkmagazine.com.tw/news/2012/03/02/38107/)（約 250 毫秒 = 0.25 秒），但是這個標準是在產品一秒鐘幾十萬上下時才需要達到的可用度標準（像是股票交易、電子商務網站、社群網站...等等），一般只是測試用或是形象的網站不需要那麼高的可用度，能夠在 1 ~ 2 秒內回應都是在可接受的範圍（當然不能太久超過 4、5 秒以上，使用者會以為網站掛掉了沒有回應）
26 | 
27 | ### 資料有做異地備援
28 | 
29 | 資料可以說是整個企業的資產及生命，當資料自己有做異地備援的話，就可以用這樣的方式，不然主機太操很容易掛掉（就像人一樣加班太久沒休息，就很容易生病），裡面的資料可能會因為沒辦法救回，所以若是有做資料備份，不怕資料不見的狀況下再去用這樣的架構去做服務。
30 | 
31 | 
32 | ## 參考資料
33 | * [你願意花多久等待網站回應？ 一貶眼都嫌太久了](http://news.networkmagazine.com.tw/news/2012/03/02/38107/)
34 | * [瞭解網站速度 - Analytics (分析) 說明](https://support.google.com/analytics/answer/2383341?hl=zh-Hant)
35 | 


--------------------------------------------------------------------------------
/Reference/Reference-Article.md:
--------------------------------------------------------------------------------
 1 | # 參考文章
 2 | 
 3 | ## 概念
 4 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
 5 | * [10+個高流量網站架構的效能提升解決方案](http://www.qa-knowhow.com/?p=2314)
 6 | * [100 open source Big Data architecture papers for data professionals.](https://www.linkedin.com/pulse/100-open-source-big-data-architecture-papers-anil-madan)
 7 | 
 8 | ## 資料庫
 9 | 
10 | ### 連線池
11 | * [Connection pool - Wikipedia, the free encyclopedia](http://en.wikipedia.org/wiki/Connection_pool)
12 | 
13 | ### MySQL
14 | * [KeJyun學習日誌: 在MySQL中MyISAM與InnoDB資料庫引擎比較](http://blog.kejyun.com/2012/12/Compare-MyISAM-And-InnoDB-Database-Engine-In-MySQL.html)
15 | 
16 | ## 實際案例
17 | * [Stack Overflow: The Architecture - 2016 Edition](http://nickcraver.com/blog/2016/02/17/stack-overflow-the-architecture-2016-edition/)
18 | * [《英雄联盟》玩家聊天服务的持久层演进](http://bbs.gameres.com/thread_485619.html)
19 | * [KKBOX Case Study – Amazon Web Services (AWS)](http://aws.amazon.com/tw/solutions/case-studies/kkbox/)
20 | 
21 | ## HTTP Cache
22 | * [循序漸進理解 HTTP Cache 機制 | TechBridge 技術共筆部落格](http://blog.techbridge.cc/2017/06/17/cache-introduction/)
23 | 
24 | 
25 | ## Python
26 | * [Python + Django 如何支撑了 7 亿月活用户的 Instagram？](http://python.jobbole.com/87814/)
27 |     * [Sharding & IDs at Instagram](https://engineering.instagram.com/sharding-ids-at-instagram-1cf5a71e5a5c)
28 |     * [Dismissing Python Garbage Collection at Instagram](https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172)
29 | 


--------------------------------------------------------------------------------
/Architecture-Evolution/Architecture-Evolution-Dynamic-Data-Cache.md:
--------------------------------------------------------------------------------
 1 | # 動態資料快取
 2 | 
 3 | `產生的靜態頁面暫存`可以應付整個頁面很少改變的網頁，但是若網頁中有一部分的資料遭到異動，為了讓使用者取得到最新的資料，我們勢必要清除頁面並重新撈取資料庫的資料去產生新的`靜態頁面暫存`。
 4 | 
 5 | 但對於資料層來說，其實只有`少部分的資料`有異動需要重新讀取，若我們需要重新去資料庫撈取`所有`頁面需要的資料，再去兜頁面所需要的資訊，勢必是浪費資料庫資源的，還記得我在[概念章節](High-Scaling-Websites-Structure-Concept.md)中有提到說 `資料庫存取是很昂貴的`，過多的查詢，導致資料庫的存取效率降低，且資料庫的的連線因為系統的限制，所以也沒辦法無限制的增加連線數量。
 6 | 
 7 | 所以我們會想要把一些很少異動的資料去做快取（Cache）預存下來，當下一位使用者來索取相同的資料時，則不去資料庫進行查詢，直接將之前預存的資料丟回給使用者，提高系統反應的時間及資料存取的效率，快取的資料會依照我們設定資料的過期時間，當超過過期時間後，再重複去資料庫撈取資料，看看有沒有異動。
 8 | 
 9 | ***架構圖：***
10 | 
11 | ![動態資料快取](http://i.imgur.com/sbORt7G.png)
12 | 
13 | ## 注意事項
14 | 
15 | ### 快取時間
16 | 
17 | 快取的失效時間通常會依我們的應用去做設定，也因為系統資源的關係（記憶體大小、硬碟容量大小...等等）的原因，沒有辦法將快取設定為永久存放，像是做討論區的文章，通常熱門的文章在短時間會被重複的讀取，但熱度可能會隨著時間做遞減，可能大約 3 天的時間文章就會變得越少人存取，當沒有人存取的時候，我們就會希望將快取清除，避免佔住系統資源（記憶體或硬碟容量），所以我們可能會針對討論區的文章做 3 天時間的快取，等超過 3 天 後，快取會自動地將過期的快取資料做清除，除非又有人再次讀取該篇文章，才會再做一次快取（失效時間再次設為 3 天），一直重複這樣的動作，等到資料不再被存取，就只會保留在資料庫的洪流當中。
18 | 
19 | ### 手動清除快取
20 | 
21 | 在部落格發表的文章被讀取時，我們會對文章進行快取，但有時文章可能因為作者「打錯字」或者「需要補充資料」，導致文章需要被異動更新，為了能夠讓快取有最新被異動過的資料，通常我們會手動的清除這篇文章的快取，這樣就可以確保下一個存取這篇文章的讀者，一定會拿到最新的文章資料。
22 | 
23 | ### 資料分散式快取
24 | 
25 | 快取的資料通常是用 `key` 及 `value` 的方式去紀錄資料，而我們通常會把資料做片段的快取，像是部落格文章 A 的快取我們通常會存放在 `key` 為 `blog_post:post_id_A` 的快取資料中，以此類推，文章 B 的快取 `key` 為 `blog_post:post_id_B`，用這樣的方式對每個不同的小資源去做快取，而不是將所有部落格文章 A 及 B 存放在一整個快取 `key` 為 `blog_post:all` 中，因為當使用者異動文章 A 的時候，若我們要刪除快取的資料則只需要刪除文章 A 的快取即可，而不需要把其他不必要刪除的文章快取也一併刪除了，提高快取使用的效率，通常要看自己的應用及使用者存取資料的方式，需要用哪一種方式的快取比較適合，每一種應用都有適合自己的快取方法。
26 | 
27 | 
28 | ## 參考資料
29 | * [一步步構建大型網站架構- 架構設計- | 九街| 白開水的博客](http://www.9streets.cn/art-php-489.html)
30 | 


--------------------------------------------------------------------------------
/SUMMARY.md:
--------------------------------------------------------------------------------
 1 | # 摘要
 2 | 
 3 | * [介紹](README.md)
 4 | * [概念](High-Scaling-Websites-Structure-Concept.md)
 5 | * [資料庫](Database/Database-README.md)
 6 |     * [資料排列組合](Database/Database-Data-Permutations.md)
 7 |     * [MySQL](Database/Database-MySQL-README.md)
 8 |         * [資料類型](Database/Database-MySQL-DataType.md)
 9 |         * [程式語法](Database/Database-MySQL-Code-SQL.md)
10 |         * [索引](Database/Database-MySQL-Index.md)
11 |         * [鎖表](Database/Database-MySQL-Lock-Table.md)
12 |         * [交易](Database/Database-MySQL-Transaction.md)
13 |         * [伺服器](Database/Database-MySQL-Server.md)
14 |         * [SELECT 語法效能測試](Database/Database-MySQL-SELECT-SQL-Explain-Test.md)
15 | * [程式](Code/Code-README.md)
16 |     * [陣列](Code/Code-Array.md)
17 | * [快取](Cache/Cache-README.md)
18 |     * [設定](Cache/Cache-Set-Cache.md)
19 |     * [刪除](Cache/Cache-Delete-Cache.md)
20 | * [檔案](File/File-README.md)
21 | * [網站架構演進](Architecture-Evolution/Architecture-Evolution-README.md)
22 |     * [Web 與 Database 在一起](Architecture-Evolution/Architecture-Evolution-All-Together.md)
23 |     * [Web 與 Database 分手](Architecture-Evolution/Architecture-Evolution-Break-Up.md)
24 |     * [前端靜態頁面暫存](Architecture-Evolution/Architecture-Evolution-Static-Page-Cache.md)
25 |     * [頁面片段暫存](Architecture-Evolution/Architecture-Evolution-Partial-Page-Cache.md)
26 |     * [動態資料快取](Architecture-Evolution/Architecture-Evolution-Dynamic-Data-Cache.md)
27 |     * [增加 Web Server](Architecture-Evolution/Architecture-Evolution-Add-Web-Server.md)
28 |     * [增加 Database Server](Architecture-Evolution/Architecture-Evolution-Add-Database-Server.md)
29 |     * [分散式資料](Architecture-Evolution/Architecture-Evolution-Data-Distribute.md)
30 |     * [增加更多的 Web Server](Architecture-Evolution/Architecture-Evolution-Add-More-Web-Server.md)
31 |     * [Database 讀寫分離](Architecture-Evolution/Architecture-Evolution-Set-Database-Read-Write-Server.md)
32 |     * [高可用性的服務架構](Architecture-Evolution/Architecture-Evolution-High-Useability-Service-Structure.md)
33 | * [授權](Auth/Auth-README.md)
34 |     * [Session](Auth/Session/Session-README.md)
35 |     * [JWT（JSON Web Token）](Auth/JWT/JWT-README.md)
36 | * [常見問題](Question/Question-README.md)
37 |     * [壓力測試](Question/Question-Benchmark.md)
38 | * [參考資料](Reference/Reference-README.md)
39 |     * [書籍](Reference/Reference-Books.md)
40 |     * [文章](Reference/Reference-Article.md)
41 |     * [影片](Reference/Reference-Video.md)
42 |     * [工具](Reference/Reference-Tool.md)
43 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-Code-SQL.md:
--------------------------------------------------------------------------------
  1 | # 程式語法
  2 | 
  3 | ## 指定使用索引
  4 | 
  5 | ```sql
  6 | SELECT id
  7 | FROM data USE INDEX(type)
  8 | WHERE type=12345 AND level > 3
  9 | ORDER BY id
 10 | ```
 11 | 
 12 | 在下SQL語法的時候，有時某些語法使用某些索引執行效率會比較好，可是有時候MySQL沒辦法替我們選擇一個最適合的索引，導致執行的效率很慢（slow query），所以我們必須使用USE INDEX去指定執行效率好的索引，以提升效率。
 13 | 
 14 | ## 少用 JOIN，多用幾次 SELECT 撈取大量資料
 15 | 
 16 | ```sql
 17 | SELECT user.id , user.name , post.id , post.content
 18 | FROM user , post
 19 | WHERE user.id = post.user_id ......
 20 | ```
 21 | 
 22 | 我們在撈取使用者的文章資訊的時候，我們可能會用 JOIN 去撈取我們要的資料，這樣很直覺，只是當使用者資料有`10000`筆，而文章有`99999`筆，像這種有大量資料時候的話，使用 JOIN 對於資料存取真的是惡夢，因為 JOIN 過後表示會有 `10000x99999=999990000` 筆資料，然後再從這麼大量的資料中去撈取 WHERE 判斷式中指定的資料，在資料庫伺服器記憶體不夠的時候鐵定會炸掉。
 23 | 
 24 | 解決方式是我們可以分批撈取使用者的資料，以及文章的資料
 25 | 
 26 | ```sql
 27 | SELECT user.id , user.name
 28 | FROM user
 29 | ```
 30 | 
 31 | ```sql
 32 | SELECT id , content , user_id
 33 | FROM post
 34 | ```
 35 | 
 36 | 在使用者資料資料撈取出來之後，使用迴圈將使用者存成陣列，但是「陣列的索引」是使用可以識別的「使用者編號（user.id）」當作索引值。
 37 | 
 38 | ```php
 39 | <?php
 40 | $user['user.id'] = array();
 41 | ```
 42 | 
 43 | 在撈取文章資料後，必須將撈取的文章指定給該文章的作者（使用者），而我們撈取的使用有撈取「使用者編號（user.id）」，所以可以透過這個資料，將文章存到使用者資料下的文章陣列。
 44 | 
 45 | ```php
 46 | <?php
 47 | $user[$post['user_id']]['post'] = $post_array;
 48 | ```
 49 | 
 50 | 透過分次撈取，本來的 `10000x99999=999990000` 筆資料，就會變成 `10000+99999=109999` 筆資料，記憶體消耗降低極大下，也可以達到同樣的效果
 51 | 
 52 | 
 53 | 
 54 | ### 後記
 55 | 
 56 | JOIN 在資料庫幾乎是必學的語法，讓資料表的正規化，避免資料冗余，在查詢的時候也可以透過 JOIN 組合出想要的資料，所以 JOIN 對於資料庫正規化有它的用途，但資料庫的發展的初期中，還沒有像現在有用到巨量的資料（Google、Facebook、Twitter ... etc），所以在流量&資料量小的時候，資料庫在 JOIN 時的查詢回應速度通常差沒幾毫秒 （0.01~0.05毫秒），所以都在可接受的範圍，但現在大型網站一天都好幾千萬的存取量，假如每天有 1000 萬的存取次數，而每個地方都慢了 0.05 毫秒，總共就會慢了 `10,000,000 x 0.05 = 500,000 毫秒 = 500 秒`，在我們要求每次回應時間最差都要在 1 ~ 2 秒內回應，若是像 Google、Facebook 這樣的大型服務，要求可能都要在 0.1 秒以內，這樣慢的時間可能會讓使用者等到不耐煩，若 $$ 多多加機器僅能簡單應付巨量級的查詢，沒辦法解決 JOIN 所造成的查詢瓶頸，所以在巨量級資料的情況下，幾乎很少會使用 JOIN 的方式去查詢所有的資料出來，天下武功，無堅不摧，唯快不破。
 57 | 
 58 | ## 更新或刪除大量範圍的資料，請用主鍵去更新
 59 | 
 60 | 我們可能會想要一次異動大範圍的資料，若以社群網站的`通知（Notification）`來說，我們希望在使用者點選通知列表後，將所有的通知從未讀狀態變更為已讀，我們可以很容易地想到用這樣的 SQL 語法去進行更新：
 61 | 
 62 | ```sql
 63 | UPDATE `notification` SET status="已讀" WHERE created_at < '現在時間';
 64 | ```
 65 | 
 66 | 若我們要刪除大量的資料時，我們也可能用這樣的 SQL 語法去刪除資料：
 67 | 
 68 | ```sql
 69 | DELETE FROM `backup` WHERE created_at < '現在時間';
 70 | ```
 71 | 
 72 | 但是在這樣的條件為範圍的語法，資料庫會需要一筆一筆的去比對資料是否為設定條件的範圍，若為`交易`的資料，資料庫會將資料進行鎖定禁止讀取，有可能鎖定一些非我們要處理的資料，若處理時間過久，可能會導致撈取資料的反應時間過久。
 73 | 
 74 | 所以通常為了避免`鎖定不必要資料`的情況發生，我們可以試著先把資料撈取出來，再透過`明確的主鍵`去指定哪些異動的資料需要被鎖定，而不影響其他的資料。
 75 | 
 76 | ### 更新資料（較佳作法）
 77 | 
 78 | ```sql
 79 | -- 先撈取資料
 80 | SELECT `notification` WHERE status="未讀" AND created_at < '現在時間';
 81 | 
 82 | -- 指定要更新資料的主鍵
 83 | UPDATE `notification` SET status="已讀" WHERE  id IN (1, 2, 3);
 84 | ```
 85 | 
 86 | ### 刪除資料（較佳作法）
 87 | 
 88 | ```sql
 89 | -- 先撈取資料
 90 | SELECT `backup` WHERE created_at < '現在時間';
 91 | 
 92 | -- 指定要刪除資料的主鍵
 93 | DELETE FROM `backup` WHERE id IN (1, 2, 3);
 94 | ```
 95 | 
 96 | 在服務會有大量的人撈取資料的時候，這樣可以避免不必要的資料鎖定，也可以讓存取速度加快喔～
 97 | 
 98 | ## 參考資料
 99 | * [KeJyun學習日誌: 提高存取MySQL效率小技巧](http://blog.kejyun.com/2012/12/Tips-For-Use-MySQL-With-High-Performance.html)
100 | 


--------------------------------------------------------------------------------
/File/File-README.md:
--------------------------------------------------------------------------------
 1 | # 檔案
 2 | 
 3 | 我們可能會需要產生圖片或檔案供使用者去下載，這些讓使用者可以公開存取檔案的原則大概有這些：
 4 | 
 5 | * 隨機檔名
 6 | * 同一目錄下不要有太多檔案
 7 | * 資料庫僅儲存圖片檔名，不要紀錄完整路徑
 8 | * 使用 CDN 提高負載平衡及降低頻寬成本
 9 | 
10 | # 隨機檔名
11 | 
12 | 很多人可能會圖方便，讓檔案用可以猜出規則的方式做儲存，例如有人可能在儲存會員的照片，是用會員的編號當作圖片檔名，如果會員編號是整數的欄位，則就更容易讓人家猜出來，更方便地去解析你主機的所有會員照片，像是：
13 | 
14 | ```
15 | 100000001.png
16 | 100000002.png
17 | 100000003.png
18 | ...
19 | 100999999.png
20 | ```
21 | 
22 | 所以通常我們會將圖片的名稱依照會員編號的規則去做加密，我們可能可以用 md5 去做加密的動作，這樣圖片檔名可能會像是：
23 | 
24 | ```
25 | d69d3401d01a8ceed4549434e5ad9f40.png (`100000001` md5)
26 | 5a5ea04157ce4d020f65c3dd950f4fa3.png (`100000002` md5)
27 | 154666111ab87ce13405f384faeb5428.png (`100000003` md5)
28 | ...
29 | e0b50eb5c4add211417977e1f79364e0.png (`100999999` md5)
30 | ```
31 | 
32 | 這樣圖片的檔名看起來隨機了許多，比較不容易被猜出來，但是對比較厲害的人，可能會想要拿你的會員編號自己做 md5，所以還是可以看出規則，這時候我會試著在 md5 的會員編號字串加自定義的 salt 字串像是 `KeJyun`，這樣出來的圖片檔名會像這樣：
33 | 
34 | ```
35 | a3bce35e2b4e57911869b1bf5d040d71.png (`100000001KeJyun` md5)
36 | 5121a0fabb52c26c01817ee65df683a8.png (`100000002KeJyun` md5)
37 | 2ec12af8e6e011a0f5256b61db2145c3.png (`100000003KeJyun` md5)
38 | ...
39 | 467bd65bf6d4864adcc633caf2a70f7d.png (`100999999KeJyun` md5)
40 | ```
41 | 
42 | 這樣除非對方知道我們使用哪一個 salt 去做加密，否則就很難被猜出來
43 | 
44 | ## 同一目錄下不要有太多檔案
45 | 
46 | 我們存放會員大頭照的路徑可能會像是： `http://www.kejyun.com/users/images/a3bce35e2b4e57911869b1bf5d040d71.png`
47 | 
48 | 在上面的例子我們把所有的會員照片都存放在 `users/images` 的目錄下，對於系統程式，我們可以很快的指定路徑去讀取該檔案出來，但如果對於系統端的管理者，要去管理這些檔案時就會造成很大的困擾，想像你有 1000 萬個會員照片，所有的會員照片都放在同一個目錄下面，光是要使用 `ls`指令去列出那個目錄的所有照片，對於系統來說就是一大夢靨...
49 | 
50 | 所以我們通常會對檔案做資料夾的分層處理，像是 `a3bce35e2b4e57911869b1bf5d040d71.png` 檔案，我們可能取其前 3 個字元 `a3b` 去做資料夾的分層，所以我們圖片會份在 `users/images/a3b/a3bce35e2b4e57911869b1bf5d040d71.png` 目錄下，分層資料夾的 3 個字元的[資料排列組合](../Database/Database-Data-Permutations.md)為 `46,656` 種，所以若我們有 `1000 萬`個會員照片，我們可以預期照片會被平均的放在到各個分層資料夾，每個分層資料夾約 `214 張`照片 `（10000000 / 46625 = 214.33）`，這樣對於系統在做檔案的索引及瀏覽也會比較快。
51 | 
52 | 資料夾的分層要如何分層，完全取決於你系統專案資料數量的需求，若你預期你的檔案會超過 1 億種，那麼分層方式我們可能會分成 2 層，依照上面規則，照片可能會被放到 `users/images/a3bc/e35/a3bce35e2b4e57911869b1bf5d040d71.png` 下面，要怎麼分層處理完全取決於你的需求狀況，沒有一定的規則！
53 | 
54 | 
55 | ## 資料庫僅儲存圖片檔名，不要紀錄完整路徑
56 | 
57 | 我們的檔案可能會隨著系統使用人數變多，需要轉換成不同的管理方式，所以檔案在做大型專案時，很有可能會隨時被移動位置的，所以千萬不要將完整的圖片路徑存到資料表欄位中，否則以後在做檔案效能最佳化調校時，資料表檔案路徑還需要重新的異動更新，會是很大的難題，可能的瓶頸可能會卡在：
58 | * 專案不能只為了調整圖片路徑而關站維護
59 | * 資料量很大，更新檔案路徑很慢，可能需要 3~7 天以上去更新資料表的資料
60 | * 若更新失敗，或者是又需要重調整架構時，所耗費時間會更長
61 | 
62 | 所以我們在資料表只要紀錄檔案的檔名就好，所有與架構有關的資料通通由程式去控制產生，程式只要改一下，檔案的存取架構隨時都可以容易的做異動
63 | 
64 | 
65 | ## 使用 CDN 提高負載平衡及降低頻寬成本
66 | 
67 | 我們的主機可能會使用 AWS、Linode 之類的雲端服務，這些的主機可能是放在日本或美國，我們希望使用者去距離使用者自己比較近的 CDN 主機，去存取這些不常被異動的靜態資料，這樣可以提高原本主機本身的負載流量，將流量做分散式的處理。
68 | 
69 | 我們的圖片可能放在像是這樣的網址下:
70 | 
71 | ```
72 | http://cdn1.kejyun.com/users/images/a3bc/e35/a3bce35e2b4e57911869b1bf5d040d71.png
73 | http://cdn2.kejyun.com/users/images/a3bc/e35/a3bce35e2b4e57911869b1bf5d040d71.png
74 | http://cdn3.kejyun.com/users/images/a3bc/e35/a3bce35e2b4e57911869b1bf5d040d71.png
75 | ...
76 | http://cdn9.kejyun.com/users/images/a3bc/e35/a3bce35e2b4e57911869b1bf5d040d71.png
77 | ```
78 | 
79 | 這些不同的主機存放的圖片都與原主機的 `http://www.kejyun.com/users/images/a3bc/e35/a3bce35e2b4e57911869b1bf5d040d71.png` 相同，只是我們把同樣的靜態檔案分流複製到各個 CDN 主機上，這樣就不用讓原主機承載過高的頻寬吞吐量了！
80 | 
81 | ## 參考網站
82 | * [內容傳遞網路 CDN - Wiki](http://zh.wikipedia.org/wiki/%E5%85%A7%E5%AE%B9%E5%82%B3%E9%81%9E%E7%B6%B2%E8%B7%AF)
83 | * [CloudFlare](https://www.cloudflare.com/)
84 | 


--------------------------------------------------------------------------------
/Database/Database-Data-Permutations.md:
--------------------------------------------------------------------------------
 1 | # 資料排列組合
 2 | 
 3 | 在使用 MySQL 或 PostgreSQL 資料庫的時候，我們常常會使用 Auto Increment 的整數去當作我們的鍵值，Int 總共有 `4294967295` 種排列組合方式，而 BigInt 則總共有 `18446744073709551615` 種排列組合方式
 4 | 
 5 | 使用 Auto Increment 的好處是：
 6 | 
 7 | * 可以避免每次的鍵值都是不同的，不需要再做額外的欄位資料檢查是否重複
 8 | * 可以依照整數順序去判斷資料新增的順序
 9 | 
10 | 但這樣也有壞處：
11 | 
12 | * 資料被刪除後，空出來的鍵值很難被管理，會造成不必要的浪費
13 |   * e.g. Queue 資料
14 | * 總資料量被限制，當我們預期資料很有可能超過整數欄位的限制時，資料量超過時還需要處理溢位問題
15 |   * e.g. 很多的 Queue 資料
16 | 
17 | 所以若預期資料量很大時，我們會傾向使用字串去當作資料表的鍵值，單一整數的資料排列組合有 10 種 `（0-9）`，而單一字串的資料排列組合有 36 種`（0-9 A-Z）`，雖然整數欄位在資料表搜尋的速度比字串快很多，但是這個是沒辦法的必要之惡。
18 | 
19 | 
20 | ## 字串排列組合
21 | 
22 | 在資料庫儲存的字串資料是不區分大小寫的，所以若有一個 Primary 或 Unique 的鍵值，裡面是沒辦法同時儲存 `AA`、`Aa`、`aA` 或 `aa` 的，所以我們字串鍵值的排列組合只能用不區分大小寫 36 種 `(0-9 A-Z)` 排列組合去計算。
23 | 
24 | |字串長度 |  排列組合 (不區分大小寫) |
25 | |---|---|
26 | | 1	 |  36 |
27 | | 2	 |  1,296 |
28 | | 3	 |  46,656 |
29 | | 4	 |  1,679,616 |
30 | | 5	 |  60,466,176 |
31 | | 6	 |  2,176,782,336 |
32 | | 7	 |  78,364,164,096 |
33 | | 8	 |  2,821,109,907,456 |
34 | | 9	 |  101,559,956,668,416 |
35 | | 10 |	3,656,158,440,062,980 |
36 | | 11 |	131,621,703,842,267,000 |
37 | | 12 |	4,738,381,338,321,620,000 |
38 | | 13 |	170,581,728,179,578,000,000 |
39 | | 14 |	6,140,942,214,464,820,000,000 |
40 | | 15 |	221,073,919,720,733,000,000,000 |
41 | | 16 |	7,958,661,109,946,400,000,000,000 |
42 | | 17 |	286,511,799,958,070,000,000,000,000 |
43 | | 18 |	10,314,424,798,490,500,000,000,000,000 |
44 | | 19 |	371,319,292,745,659,000,000,000,000,000 |
45 | | 20 |	13,367,494,538,843,700,000,000,000,000,000 |
46 | | 21 |	481,229,803,398,374,000,000,000,000,000,000 |
47 | | 22 |	17,324,272,922,341,500,000,000,000,000,000,000 |
48 | | 23 |	623,673,825,204,293,000,000,000,000,000,000,000 |
49 | | 24 |	22,452,257,707,354,600,000,000,000,000,000,000,000 |
50 | | 25 |	808,281,277,464,764,000,000,000,000,000,000,000,000 |
51 | | 26 |	29,098,125,988,731,500,000,000,000,000,000,000,000,000 |
52 | | 27 |	1,047,532,535,594,330,000,000,000,000,000,000,000,000,000 |
53 | | 28 |	37,711,171,281,396,000,000,000,000,000,000,000,000,000,000 |
54 | | 29 |	1,357,602,166,130,260,000,000,000,000,000,000,000,000,000,000 |
55 | | 30 |	48,873,677,980,689,300,000,000,000,000,000,000,000,000,000,000 |
56 | | 31 |	1,759,452,407,304,810,000,000,000,000,000,000,000,000,000,000,000 |
57 | | 32 |	63,340,286,662,973,300,000,000,000,000,000,000,000,000,000,000,000 |
58 | | 33 |	2,280,250,319,867,040,000,000,000,000,000,000,000,000,000,000,000,000 |
59 | | 34 |	82,089,011,515,213,400,000,000,000,000,000,000,000,000,000,000,000,000 |
60 | | 35 |	2,955,204,414,547,680,000,000,000,000,000,000,000,000,000,000,000,000,000 |
61 | | 36 |	106,387,358,923,717,000,000,000,000,000,000,000,000,000,000,000,000,000,000 |
62 | | 37 |	3,829,944,921,253,800,000,000,000,000,000,000,000,000,000,000,000,000,000,000 |
63 | | 38 |	137,878,017,165,137,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 |
64 | | 39 |	4,963,608,617,944,920,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 |
65 | | 40 |	178,689,910,246,017,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 |
66 | 
67 | > 只要字串長度大於 13 時，我們擁有的資料排列組合就會超過 BigInt 的 `18446744073709551615` 排列組合
68 | 
69 | 在使用字串當做鍵值時，在我們每次新增資料手動產生字串鍵值時，我們都要再去資料庫檢查看看此鍵值是否存在，再新增進去，且我們需要額外做新增資料造成 Duplicate Key 的例外狀況處理
70 | 
71 | > 在我們字串很長時雖然發生碰撞的機率很低，但為了系統資料的安全完整性，還是需要做檢查的步驟，並在頻繁被通知到鍵值衝突的狀況時，就要考慮增加字串長度，減少鍵值碰撞機率了
72 | 
73 | ## 參考資料
74 | * [MySQL資料型態描述與比較](http://blog.kejyun.com/2012/09/MySQL-Field-Data-Type-Description-And-Compare.html)
75 | * [Store UUID in an optimized way](https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/)
76 | * [《江南Style》如何衝破YouTube的計數器](http://www.cw.com.tw/article/article.action?id=5063028)
77 | * [資料庫索引數據結構及主鍵設計(b+tree)(part 1)](http://www.slideshare.net/yftzeng/btreepart-1)
78 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-SELECT-SQL-Explain-Test.md:
--------------------------------------------------------------------------------
  1 | # Explain 做 SQL SELECT 語法效能測試
  2 | 
  3 | 在MySQL我們在使用 SELECT 做撈取資料的時候，有時候常常會效能低落，撈取資料需要很長的時間，有時候是 SQL 語法下得不好導致沒有使用到正確的索引去撈資料，我們這個時候就必須要檢查我們下的 SQL 語法到底有哪些地方需要改善，我建立的 comment 的資料表並新增幾筆假資料去做示範
  4 | 
  5 | ```sql
  6 | -- 建立資料表
  7 | 
  8 | -- 留言
  9 | CREATE TABLE IF NOT EXISTS `comment` (
 10 |   `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '編號',
 11 |   `content` varchar(50) COLLATE utf8_unicode_ci NOT NULL COMMENT '留言',
 12 |   PRIMARY KEY (`id`)
 13 | ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;
 14 | 
 15 | -- 使用者
 16 | CREATE TABLE IF NOT EXISTS `user` (
 17 |   `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '編號',
 18 |   `name` varchar(30) COLLATE utf8_unicode_ci NOT NULL COMMENT '姓名',
 19 |   PRIMARY KEY (`id`)
 20 | ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=3 ;
 21 | 
 22 | -- 使用者的留言
 23 | CREATE TABLE IF NOT EXISTS `user_comment` (
 24 |   `user_id` int(10) unsigned NOT NULL COMMENT '使用者編號',
 25 |   `comment_id` int(10) unsigned NOT NULL COMMENT '評論編號',
 26 |   PRIMARY KEY (`user_id`,`comment_id`)
 27 | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
 28 | 
 29 | 
 30 | 
 31 | -- 新增資料
 32 | 
 33 | -- 留言
 34 | INSERT INTO `comment` (`id`, `content`) VALUES
 35 | (1, '留言1'),
 36 | (2, '留言2');
 37 | 
 38 | -- 使用者
 39 | INSERT INTO `user` (`id`, `name`) VALUES
 40 | (1, '使用者1'),
 41 | (2, '使用者2');
 42 | 
 43 | -- 使用者的留言
 44 | INSERT INTO `user_comment` (`user_id`, `comment_id`) VALUES
 45 | (1, 1),
 46 | (1, 2);
 47 | 
 48 | 
 49 | 
 50 | 
 51 | -- 解釋MySQL語法效能
 52 | 
 53 | -- 撈取留言資料
 54 | EXPLAIN SELECT * FROM  `comment` WHERE  id` =2;
 55 | 
 56 | -- 撈取使用者的留言資料
 57 | EXPLAIN SELECT *
 58 | FROM  `comment` c,  `user` u,  `user_comment` uc
 59 | WHERE u.`id` = uc.`user_id`
 60 | AND uc.`comment_id` = c.`id`
 61 | ```
 62 | 
 63 | ## 解釋 MySQL 語法效能：撈取留言資料
 64 | 
 65 | ![Explain MySQL SQL](http://3.bp.blogspot.com/-YagV7ODyhPI/UNLeKtvO_uI/AAAAAAAAC8k/v4NrdlhZnCo/s1600/2012-12-20_174153.png)
 66 | 
 67 | ## 解釋MySQL語法效能：撈取留言資料
 68 | 
 69 | ![解釋MySQL語法效能：撈取留言資料](http://4.bp.blogspot.com/-7EtC6Ep0cbY/UNLjeYCIEPI/AAAAAAAAC88/chZ1hCB5rrs/s1600/2012-12-20_180223.png)
 70 | 
 71 | 
 72 | 而EXPLAIN後的資料有下面這些欄位
 73 | 
 74 | ### select_type
 75 | ### table
 76 | 
 77 | 關連到的資料表
 78 | 
 79 | ### type
 80 | 
 81 | 使用關聯查詢的類型(效率由好至壞排序)
 82 | 
 83 | * System
 84 | * const
 85 | * eq_ref
 86 | * ref
 87 | * fulltext
 88 | * ref_or_null
 89 | * index_merge
 90 | * unique_subquery
 91 | * index_subquery
 92 | * range
 93 | * index
 94 | * ALL
 95 | 
 96 | ### possible_keys
 97 | 
 98 | 可能使用到的索引，從WHERE語法選擇出一個適合的欄位
 99 | 
100 | ### key
101 | 
102 | 實際使用到的索引，如果為NULL，則是沒有使用索引
103 | 
104 | ### key_len
105 | 
106 | 使用索引的長度，長度越短 準確性越高
107 | 
108 | ### ref
109 | 
110 | 顯示那一列的索引被使用，一般是一個常數(const)
111 | 
112 | ### rows
113 | 
114 | MySQL用來返回資料的筆數，可以簡單的把rows視為執行效能，越少越好
115 | 
116 | ### Extra
117 | 
118 | MySQL用來解析額外的查詢訊息
119 | 
120 | 
121 | - Distinct
122 | 
123 | 當MySQL找到相關連的資料時，就不再搜尋。
124 | 
125 | - Not exists
126 | 
127 | MySQL優化 LEFT JOIN，一旦找到符合的LEFT JOIN資料後，就不再搜尋。
128 | 
129 | - Range checked for each Record(index map:#)
130 | 
131 | 無法找到理想的索引。此為最慢的使用索引。
132 | 
133 | - Using filesort
134 | 
135 | 當出現這個值時，表示此SELECT語法需要優化。因為MySQL必須進行額外的步驟來進行查詢。
136 | 
137 | - Using index
138 | 
139 | 返回的資料是從索引中資料，而不是從實際的資料中返回，當返回的資料都出現在索引中的資料時就會發生此情況。
140 | 
141 | - Using temporary
142 | 
143 | 同Using filesort，表示此SELECT語法需要進行優化。此為MySQL必須建立一個暫時的資料表(Table)來儲存結果，此情況會發生在針對不同的資料進行ORDER BY，而不是GROUP BY。
144 | 
145 | - Using where
146 | 
147 | 使用WHERE語法中的欄位來返回結果。
148 | 
149 | - System
150 | 
151 | system資料表，此為const連接類型的特殊情況。
152 | 
153 | - Const
154 | 
155 | 資料表中的一個記錄的最大值能夠符合這個查詢。因為只有一行，這個值就是常數，因為MySQL會先讀這個值然後把它當做常數。
156 | 
157 | - eq_ref
158 | 
159 | MySQL在連接查詢時，會從最前面的資料表，對每一個記錄的聯合，從資料表中讀取一個記錄，在查詢時會使用索引為主鍵或唯一鍵的全部。
160 | 
161 | - ref
162 | 
163 | 只有在查詢使用了非唯一鍵或主鍵時才會發生。
164 | 
165 | - range
166 | 
167 | 使用索引返回一個範圍的結果。例如：使用大於>或小於<查詢時發生。
168 | 
169 | - index
170 | 
171 | 此為針對索引中的資料進行查詢。
172 | 
173 | - ALL
174 | 
175 | 針對每一筆記錄進行完全掃描，此為最壞的情況，應該盡量避免。
176 | 
177 | ## 結論
178 | 
179 | MySQL 的 Explain 可以分析大部份的 SQL 語法效能，但有些語法像是 WHERE IN 則會被歸類為 range 的語法，但實際上則是 Using Where，所以確切的語法分析要再看看文件真正的用法去決定
180 | 
181 | ## 參考資料
182 | * [KeJyun學習日誌: 在MySQL使用Explain做SQL SELECT語法效能測試](http://blog.kejyun.com/2012/12/Using-EXPLAIN-SQL-To-Analysis-Efficient-On-MySQL.html)
183 | 


--------------------------------------------------------------------------------
/High-Scaling-Websites-Structure-Concept.md:
--------------------------------------------------------------------------------
  1 | # 大型網站架構概念
  2 | 
  3 | 當一個網站需要服務很多很多很多人的時候，我們就需要好的存取資料的架構，讓使用者能夠很快的拿到他想要拿的資料，這樣伺服器（Web & Database）才可以有更多「`時間`」及「`空間`」去服務更多的人。
  4 | 
  5 | 你可能會聽到有一些在提高伺服器存取效率的關鍵字，像是「資料快取」、「資料庫索引」、「資料庫讀寫分離」、「資料分散式處理」...等等之類的方式。
  6 | 
  7 | ## 提高效能存取的概念
  8 | 
  9 | ### 不常異動的資料做快取
 10 | 
 11 | #### 後端資料
 12 | 
 13 | 以部落格文章為例，作者發表一篇文章後，除了文章內有一些小小的錯誤或錯字需要編輯修改，否則大部份的時間文章都是不會去異動的。
 14 | 
 15 | 在動態去資料庫存取本篇文章時，我們可能會使用這樣的 SQL 語法去存取：
 16 | 
 17 | ```sql
 18 | SELECT * FROM posts WHERE post_id = '部落格文章編號' Limit 1;
 19 | ```
 20 | 
 21 | 當每個使用者要來看這篇文章時，都需要透過這樣的 SQL 指令去撈取這篇文章的資訊（標題、內文...blabla），如果文章讀者不多時，偶爾這樣取資料庫撈資料還撐得過去，但像是比較熱門被轉載的文章，每分鐘可能有好幾百幾千人要去看這篇文章的資料，對於資料庫就是執行 1000 次這樣相同的語法去取得相同的資料（要記得，資料庫的存取是很昂貴的）。
 22 | 
 23 | 為了減少資料庫存取次數，所以在程式方面我們會將這些不長異動的資料在第一次從資料庫撈取之後，把它`快取`下來，可能用 `Memcached`、`Redis` 或 `File` 的方式將資料預先存下來，然後再設定資料的過期時間，當超過過期時間後，再重複去資料庫撈取資料。
 24 | 
 25 | 這樣可以減少資料庫的運算資源，使用者也可以很快地拿到他們想要看的資料，提高了伺服器對於資料的存取量。
 26 | 
 27 | #### 前端資料
 28 | 
 29 | 我們在網站上常常需要用 JavaScript 去完成我們要的 UI 操作效果，但是這些 JavaScript 除非操作方式有做大幅的異動，否則我們很少會修改這些 JavaScript 檔案（在 CSS 檔案部分也相同）。
 30 | 
 31 | 除非有做特別的設定，再不重新整理畫面的前提下，瀏覽器都會幫我們的 JavaScript 及 CSS 檔案去做快取，載入過一次後就存放在使用者自己的電腦上，等到下次我們需要使用相同的 JavaScript 或 CSS 檔案時，瀏覽器會直接取用使用者電腦本地端的檔案，而不會去遠端伺服器再次拿取 JavaScript 或 CSS 檔案。
 32 | 
 33 | 我們已可以利用瀏覽器的這個特性，等到 JavaScript 或 CSS 檔案做變更時，再透過`版本號`或`不同的檔名`去讓瀏覽器讀取最新的 JavaScript 或 CSS 檔案：
 34 | 
 35 | ***版本號***
 36 | 
 37 | ```html
 38 | <script src="kejyun.app.js?v=1.0.0"></script>
 39 | <script src="kejyun.app.js?v=1.0.1"></script>
 40 | 
 41 | <link rel="stylesheet" href="kejyun.app.css?v=1.0.0">
 42 | <link rel="stylesheet" href="kejyun.app.css?v=1.0.1">
 43 | ```
 44 | 
 45 | ***不同檔名***
 46 | 
 47 | > 這個部分通常是用後端程式去控制 JavaScript 及 CSS 現在的版本檔名
 48 | 
 49 | ```html
 50 | <script src="kejyun.app-asdfghjkl.js"></script>
 51 | <script src="kejyun.app-qwertyuio.js"></script>
 52 | 
 53 | <link rel="stylesheet" href="kejyun.app-asdfghjkl.css">
 54 | <link rel="stylesheet" href="kejyun.app-qwertyuio.css">
 55 | ```
 56 | 
 57 | ### 索引
 58 | 
 59 | 對於需要常用來查詢的資料做好索引，可以加快查詢的效率
 60 | 
 61 | ### 資料庫讀寫分離
 62 | 
 63 | 以部落格文章為例，在 [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99)中，大部份 80% 人都是在看文章比較多（讀取資料：SELECT），只有少部分 20% 的人或意見領袖，才會發表文章表示看法（異動資料：INSERT、UPDATE、DELETE），而在做資料異動的時候很有可能會對資料進行鎖定，進而去影響讀取的速度。
 64 | 
 65 | > 除了有關交易（Transaction）的資料在 SELECT 的時候才有可能對資料進行鎖定，像是購票或購買限量商品時，會把撈取出來加入購物車的資料先行鎖定
 66 | 
 67 | 所以像是 Facebook、部落格之類的媒體，大多會把資料庫做讀寫分離，使用者做異動的行為會去主資料庫（Master）去做寫入的動作，然後從資料庫（Slave）在定期的去同步資料庫的內容，而其他大部份的讀者在讀取資料時，都去讀取從資料庫（Slave）的資料，這樣就不會有因為資料異動而導致資料被鎖定，造成讀取變慢的問題。
 68 | 
 69 | 而在主從架構中，可以是有很多台從的資料庫（Slave），透過分散式處理可以將不同的連線分配給不同的從資料庫（Slave），讓每一台從的資料庫平均分配差不多的連線量，因為需要處理的連線減少了，進而讓每個查詢都能夠在短時間都能夠回應查詢結果，提高系統的可用度。
 70 | 
 71 | 
 72 | ---
 73 | 
 74 | 
 75 | ## 重要觀念
 76 | 
 77 | ### 資料庫存取是很昂貴的
 78 | 
 79 | 每下一段資料庫 SQL 語法去撈取資料，資料庫就要對資料去進行比對運算，當資料很龐大，SQL 語法又太複雜導致運算很繁瑣時，資料庫需要進行運算的時間又會變得更久，所以建立好的索引（Index）及下好的 SQL 語法是很重要的工作
 80 | 
 81 | > 雖然資料都撈得出來，但是撈得漂亮，撈得快也是一種藝術啊～
 82 | 
 83 | ### 與資料庫建立的連線是很昂貴的資源
 84 | 
 85 | 我們要對資料庫進行操作就需要先與資料庫進行連線，但是資料庫的連線建立是很花時間的，時間越久會導致回應的時間也會跟著變長，然而資料庫的連線數又不能無限制的擴張（需要看主機效能的乘載量），所以連線的資源就相對的珍貴。
 86 | 
 87 | 我們通常會用連線池（Connection Pool）的方式去維持資料庫的連線，建立連線後做完查詢就將連線丟到連線池內，供下一個要做查詢的人使用，下一個要做查詢的人發現有可用的連線，就不用花費時間重新的去建立與資料庫的連線，直接使用就有的連線即可，直到這個連線時間超過資料庫最高的限制失效斷線為止，這樣就能提高資料庫連線使用的效率。
 88 | 
 89 | ### 對於需要當作查詢的資料必須建立索引
 90 | 
 91 | 我們會依照我們想要撈取的資料下 `WHERE`、`GROUP BY`、`ORDER BY` ...等等的條件去撈取我們想要撈取的資料，如果不建立索引資料庫也是有辦法將資料撈出來，只是速度會比較慢，當這些條件資料有預先做索引時，資料庫可以很快地利用索引去完成查詢的動作
 92 | 
 93 | > 索引原理就是將資料透過資料結構（e.g. [B-tree]((http://zh.wikipedia.org/wiki/B%E6%A0%91)), Hash）預先做排序的存放，使用索引時資料庫可以很快的資料放在哪一個地方，以達到快速撈取資料的目的
 94 | 
 95 | #### ***以書籍為例***
 96 | 
 97 | 索引就像書籍最前面的目錄一樣，可以很快地告訴我們哪一段章節的資料在第幾頁，所以我們就可以很快地翻閱到我們想看的資料，如果書籍沒有做目錄（索引）的話，我們要找到特定章節的資料時，我們就需要從頭到尾翻閱去查看看我們要看的資料到底在哪一頁，這對於書籍的讀者來說是相當浪費時間的一件事。
 98 | 
 99 | *詳細的索引說明會在稍後章節提到*
100 | 
101 | 
102 | ## 結論
103 | 
104 | 大部份提高存取效率的作法大多是：
105 | 
106 | ### 設計好的程式邏輯
107 | 
108 | 在做簡單的資料撈取，能夠用更快的方式（減少迴圈、快速比對資料...blabla）去產生我們要的資料，當然能夠加快請求的速度，但是在現在硬體處理的速度越來越快的情況下，程式的邏輯沒有太複雜下（複雜的可能像是演算法），大部份的資料處理都是很快的，所以大部份的瓶頸都是卡在資料庫資料的撈取邏輯、資料存放方式、資料撈取方式以及索引建置的邏輯。
109 | 
110 | ### 減少資料庫的存取
111 | 
112 | 資料能夠快取不要重複撈取就不要重複撈取，可以把資料存放在記憶體的快取中，若是整個頁面都很少異動，也可以把整個畫面（View）的 HTML 去做快取，把資料組合成 HTML 的動作都省下來了。
113 | 
114 | ### 降低資料表資料量
115 | 
116 | 當一個資料表只有 100 筆或是到 1000 筆資料的時候，資料庫大部份都的處理速度都是很快的（只有幾毫秒而已，人感覺不出來差異），但是當一個資料表的資料量有好幾百萬或好幾千萬筆資料，要從這麼多的資料去撈取出想要的資料，需要耗費的運算資源就會更多，所以讓資料能夠分散儲存，降低每個資料表的資料量，這樣也可以大大提高資料庫存取的效率。
117 | 
118 | #### ***以書籍為例***
119 | 
120 | 想像一下一本書有 100 頁或是到 1000 頁的量，我們在透過書籍目錄去查詢我們想要看的資料，應該速度不會差太多，但是當一本書有好幾百萬頁（現實上可能會分很多冊，像百科全書），我們需要看的書籍目錄也會比較多（索引表很大），所以也需要花更多時間去查詢我們要的資料。
121 | 
122 | ### 資料庫分散式處理
123 | 
124 | 當資料異動或讀取的資料量過大時，資料的異動會影響到讀取的效能，所以我們就需要將資料做讀寫分離。
125 | 
126 | ### 建立良好的資料庫索引
127 | 
128 | 將用來做資料撈取判斷的資料做好索引，而索引的順序及 WHERE 條件順序會影響索引的使用，這個部分會在之後提到。
129 | 
130 | ## 參考資料
131 | * [B樹 - 維基百科，自由的百科全書](http://zh.wikipedia.org/wiki/B%E6%A0%91)
132 | * [全文檢索 - 維基百科，自由的百科全書](http://zh.wikipedia.org/wiki/%E5%85%A8%E6%96%87%E6%AA%A2%E7%B4%A2)
133 | * [80/20 法則](http://wiki.mbalib.com/zh-tw/80/20%E6%B3%95%E5%88%99)
134 | 


--------------------------------------------------------------------------------
/Database/Database-MySQL-Index.md:
--------------------------------------------------------------------------------
  1 | # 索引
  2 | 
  3 | ## 索引順序
  4 | 
  5 | 在所有的資料庫內，建立索引是提昇資料庫資料存取效率的很重要的方式，但是錯誤的`索引順序`或是 `SQL 語法` 都可能讓查詢語法變成了 slow query。
  6 | 
  7 | 假設我們有一個部落格文章的資料表，裡頭存放所有使用者的部落格文章，資料表資料如下：
  8 | 
  9 | ### posts
 10 | 
 11 | |post_id  | post_status   | author | content  |
 12 | | ------------- |:-------------:|:-------------:|:-------------:|
 13 | | 1  |	正常  | kejyun | 文章 1 |
 14 | | 2  |	刪除  | kejyun | 文章 2 |
 15 | | 3  |	正常  | kejyun | 文章 3 |
 16 | | 4  |	正常  | jimmy | 文章 4 |
 17 | | 5  |	刪除  | jimmy | 文章 5 |
 18 | 
 19 | 如果我們要撈取 `作者為 kejyun` 且 `狀態正常` 的文章，我們可能會用下列語法去撈取
 20 | 
 21 | ### SQL 語法 1
 22 | 
 23 | ```sql
 24 | SELECT *
 25 | FROM posts
 26 | WHERE post_status = '正常'
 27 | AND author = 'kejyun'
 28 | ```
 29 | 
 30 | 或者用這個語法撈取
 31 | 
 32 | ### SQL 語法 2
 33 | 
 34 | ```sql
 35 | SELECT *
 36 | FROM posts
 37 | WHERE author = 'kejyun'
 38 | AND post_status = '正常'
 39 | ```
 40 | 
 41 | 這兩句 SQL 語法都可以撈出我們想要的結果，但是對於不同的索引執行的效率卻是差異很多
 42 | 
 43 | 如果我們的索引是 `post_status`、`author` 的順序，`SQL 語法 1` 則會正確的使用索引來做查詢，執行的效率會很快。
 44 | 
 45 | 但是對於`SQL 語法 2`，會因為找不到適合該語法的索引去做查詢，所以會變成不使用索引，而對整個資料表作完整的檢索，去查找出資料來。
 46 | 
 47 | #### 主要原因是
 48 | 
 49 | > 在 SQL 語法 WHERE 的順序會影響查詢索引的順序
 50 | 
 51 | > 同樣的索引資料在不同的順序，表示為不同的索引，所以 `post_status、author` 與 `author、post_status` 這兩個索引雖然是使用相同的資料欄位，但因為順序不同所帶來的查詢效果也不同
 52 | 
 53 | > 同一句 SQL 語法只能使用單一個索引，所以不同的索引沒辦法共用
 54 | 
 55 | 資料庫沒有這麼人工智慧，會去判斷兩個 SQL 語法可以使用相同的索引去查找出相同的資料，他會依照程式設計師給的語法，依序去查找是否有可用的索引。
 56 | 
 57 | #### 分析 SQL 語法 1
 58 | 
 59 | 在 `SQL 語法 1` 中，我們 WHERE 的第 1 個條件是 `post_status = '正常'`，所以資料庫會去查找有沒有索引開頭是使用 `post_status` 的索引。
 60 | 
 61 | 若有該索引，則再判斷 WHERE 的第 2 個條件 `author = 'kejyun'`，所以資料庫會去查找有沒有索引開頭是使用 `post_status` 的索引，且該索引的第 2 個索引是使用 `author` 的索引。
 62 | 
 63 | 資料庫會一直判斷比較 SQL 語法 WHERE 條件與索引之間的順序關係，直到沒辦法匹配後，後面的沒辦法匹配索引則使用完整比對的方式去進行查詢
 64 | 
 65 | #### 分析 SQL 語法 2
 66 | 
 67 | 在 `SQL 語法 2` 中，我們 WHERE 的第 1 個條件是 `author = 'kejyun'` 但我們資料庫沒有建立索引開頭是使用 `author` 的索引，所以只能對資料庫使用完整比對。
 68 | 
 69 | ### 原理
 70 | 
 71 | 索引就像是書籍的目錄一樣，若以童話故事書為例，我們書中會收錄世界各國的故事，且每個故事有他自己的類型，像是奇幻、驚悚、傳說、神話。
 72 | 
 73 | 如果我們以國家當做大標題（e.g. 第 1 索引），以故事類型當作小標題（e.g. 第 2 索引），那麼書籍目錄會像：
 74 | 
 75 |   - 台灣
 76 |     - 奇幻
 77 |     - 驚悚
 78 |     - 傳說
 79 |     - 神話
 80 |   - 日本
 81 |     - 奇幻
 82 |     - 驚悚
 83 |     - 傳說
 84 |     - 神話
 85 |   - 歐美
 86 |     - 奇幻
 87 |     - 驚悚
 88 |     - 傳說
 89 |     - 神話
 90 |   - 印度
 91 |     - 奇幻
 92 |     - 驚悚
 93 |     - 傳說
 94 |     - 神話
 95 |   - etc...
 96 | 
 97 | 如果我們要找`台灣`且類型為`傳說`的童話故事，我們的目光會先移動到台灣的區段，然後在這個區塊下找到類型為`傳說`的故事
 98 | 
 99 | - 台灣
100 |   - 奇幻
101 |   - 驚悚
102 |   - ***傳說***
103 |   - 神話
104 | 
105 | 
106 | 但是如果我們要找所有類型為`傳說`的故事中，發生在`台灣`的故事，依照步驟我們會希望先把所有`傳說`的故事先列出來，再從這個目錄下去找屬於`台灣`的故事，所以我們希望會看到像這樣的目錄（索引）：
107 | 
108 |   - 傳說
109 |     - ***台灣***
110 |     - 日本
111 |     - 歐美
112 |     - 印度
113 | 
114 | 但是在這本故事書的目錄（索引）中，我們沒有看到這樣的目錄結構，所以我們沒辦法透過目錄快速的找到我們要看的所有類型為`傳說`的章節資料，只好從頭到尾的去翻閱整本書，直到找到全部我們`傳說`章節的故事，然後再從這些找出來的`傳說`故事中，再去區別出哪些為`台灣`的故事。
115 | 
116 | 所以目錄（索引）的規則就是希望看故事書的人要怎麼快速找到他想要的東西，當沒有我們可以參考的目錄的話，就像資料庫沒有可參考的索引一樣，就會找得比較慢（但還是找得出來）。
117 | 
118 | 
119 | ### 結論
120 | 
121 | 因為索引順序的不同，以及 SQL WHERE 條件順序的不同，會使得資料庫在使用索引進行查詢有不同的效率，所以要謹慎的使用索引及 SQL 語法，才能達到高效率的查詢結果。
122 | 
123 | 
124 | ## 控制索引更新
125 | 
126 | 
127 | ### 關閉索引更新：
128 | 
129 | ```sql
130 | ALTER TABLE table_name DISABLE KEYS;
131 | ```
132 | ### 開啟索引更新：
133 | 
134 | ```sql
135 | ALTER TABLE table_name ENABLE KEYS;
136 | ```
137 | 
138 | MySQL在新增（INSERT）、刪除（DELETE）、更新（UPDATE）的時候會去更新現有的索引表，而更新索引表也需要花費一些時間，當異動一筆資料的時候，索引表也做一次的異動，但當在做大量資料異動的時候，例如異動1000筆資料，索引表也需要異動1000次，而其實我們只需要最後一次（最新）的異動就好了，前面的999次都是不需要做的索引表異動更新，所以在異動大量資料前，可以使用指令 `ALTER TABLE table_name DISABLE KEYS;` 關閉索引更新，等異動完成後，再使用指令 `ALTER TABLE table_name ENABLE KEYS;` 開啟索引更新。
139 | 
140 | ```sql
141 | ALTER TABLE users DISABLE KEYS;
142 | 異動（INSERT、DELETE、UPDATE）大量資料SQL語法
143 | ALTER TABLE users ENABLE KEYS;
144 | ```
145 | 
146 | ## 自定義Hash Index做字串完整比對
147 | 
148 | 我們知道在對`字串（CHAR或VARCHAR）`去做查找的時候效率會遠比對`整數（INT）`查找還慢，因CRC32對字串做校驗後會回傳`整數的校驗碼`，我們在資料表增加一個整數型態欄位，儲存要比對字串的校驗碼。
149 | 
150 | ### crc32
151 | 
152 | > 建立str 的 32 位迴圈冗余校驗碼多項式。這通常用於檢查傳輸的資料是否完整。
153 | 
154 | > 由於 PHP 的整數是帶符號的，許多 crc32 校驗碼將返回負整數，因此你需要使用 sprintf() 或 printf() 的「%u」格式符來取得表示無符號 crc32 校驗碼的字串。
155 |  在原本實作email登入時會對email欄位做索引，所以會先去查找email字串欄位的資料，之後再去比對密碼是否正確，但若資料過多字串比對的效率會降低很多
156 | 
157 | ```sql
158 | SELECT id,name,email
159 | FROM user
160 | WHERE email = "kejyun@gmail.com"
161 | AND password = "xxx"
162 | ```
163 | 
164 | 我們加入了 `emailcrc` 的欄位去儲存對 email 字串的校驗碼，再查找 email 字串欄位的資料前，先透過crc整數校驗碼快速過濾掉不可能的資料，之後再從少數的資料中做 email 字串欄位字串比對，如果資料量很大，這樣的效率會提升很多。
165 | 
166 | ```sql
167 | SELECT id,name,email
168 | FROM user
169 | WHERE emailcrc = CRC32("kejyun@gmail.com")
170 | AND email="kejyun@gmail.com"
171 | AND password="xxx"
172 | ```
173 | 
174 | 這邊要注意的是沒辦法只使用 crc 校驗碼去當作唯一的條件，`不同的字串可能會出現相同的校驗碼`，所以最後還是要對你要比對的字串做比對，避免查詢發生錯誤。
175 | 
176 | |email  | emailcrc  |
177 | | ------------- |:-------------:|
178 | |kejyun1@gmail.com  |	1234567890  |
179 | |kejyun2@gmail.com  | 1234567890  |
180 | 
181 | 
182 | ## 參考資料
183 | * [KeJyun學習日誌: 提高存取MySQL效率小技巧](http://blog.kejyun.com/2012/12/Tips-For-Use-MySQL-With-High-Performance.html)
184 | * [KeJyun學習日誌: MySQL效率調校](http://blog.kejyun.com/2012/12/MySQL-Efficiency-Adjustment.html)
185 | * [MySQL Indexing: Best Practices Slide PDF](http://www.percona.com/files/presentations/WEBINAR-MySQL-Indexing-Best-Practices.pdf)
186 | * [Tools and Techniques for Index Design PDF Slide](http://www.percona.com/files/presentations/WEBINAR-tools-and-techniques-for-index-design.pdf)
187 | * [EXPLAIN Demystified PDF slide](http://www.percona.com/files/presentations/WEBINAR2012-02-Explain-Demystified.pdf)
188 | * [Optimizing MySQL Configuration PDF Slide](http://www.percona.com/files/presentations/percona-live/london-2012/PLUK2012-optimizing-mysql-configuration.pdf)
189 | * [PHP手冊 - crc32](http://por.tw/Website_Design/PHP5/function.crc32.html)
190 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 |                     GNU GENERAL PUBLIC LICENSE
  2 |                        Version 3, 29 June 2007
  3 | 
  4 |  Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
  5 |  Everyone is permitted to copy and distribute verbatim copies
  6 |  of this license document, but changing it is not allowed.
  7 | 
  8 |                             Preamble
  9 | 
 10 |   The GNU General Public License is a free, copyleft license for
 11 | software and other kinds of works.
 12 | 
 13 |   The licenses for most software and other practical works are designed
 14 | to take away your freedom to share and change the works.  By contrast,
 15 | the GNU General Public License is intended to guarantee your freedom to
 16 | share and change all versions of a program--to make sure it remains free
 17 | software for all its users.  We, the Free Software Foundation, use the
 18 | GNU General Public License for most of our software; it applies also to
 19 | any other work released this way by its authors.  You can apply it to
 20 | your programs, too.
 21 | 
 22 |   When we speak of free software, we are referring to freedom, not
 23 | price.  Our General Public Licenses are designed to make sure that you
 24 | have the freedom to distribute copies of free software (and charge for
 25 | them if you wish), that you receive source code or can get it if you
 26 | want it, that you can change the software or use pieces of it in new
 27 | free programs, and that you know you can do these things.
 28 | 
 29 |   To protect your rights, we need to prevent others from denying you
 30 | these rights or asking you to surrender the rights.  Therefore, you have
 31 | certain responsibilities if you distribute copies of the software, or if
 32 | you modify it: responsibilities to respect the freedom of others.
 33 | 
 34 |   For example, if you distribute copies of such a program, whether
 35 | gratis or for a fee, you must pass on to the recipients the same
 36 | freedoms that you received.  You must make sure that they, too, receive
 37 | or can get the source code.  And you must show them these terms so they
 38 | know their rights.
 39 | 
 40 |   Developers that use the GNU GPL protect your rights with two steps:
 41 | (1) assert copyright on the software, and (2) offer you this License
 42 | giving you legal permission to copy, distribute and/or modify it.
 43 | 
 44 |   For the developers' and authors' protection, the GPL clearly explains
 45 | that there is no warranty for this free software.  For both users' and
 46 | authors' sake, the GPL requires that modified versions be marked as
 47 | changed, so that their problems will not be attributed erroneously to
 48 | authors of previous versions.
 49 | 
 50 |   Some devices are designed to deny users access to install or run
 51 | modified versions of the software inside them, although the manufacturer
 52 | can do so.  This is fundamentally incompatible with the aim of
 53 | protecting users' freedom to change the software.  The systematic
 54 | pattern of such abuse occurs in the area of products for individuals to
 55 | use, which is precisely where it is most unacceptable.  Therefore, we
 56 | have designed this version of the GPL to prohibit the practice for those
 57 | products.  If such problems arise substantially in other domains, we
 58 | stand ready to extend this provision to those domains in future versions
 59 | of the GPL, as needed to protect the freedom of users.
 60 | 
 61 |   Finally, every program is threatened constantly by software patents.
 62 | States should not allow patents to restrict development and use of
 63 | software on general-purpose computers, but in those that do, we wish to
 64 | avoid the special danger that patents applied to a free program could
 65 | make it effectively proprietary.  To prevent this, the GPL assures that
 66 | patents cannot be used to render the program non-free.
 67 | 
 68 |   The precise terms and conditions for copying, distribution and
 69 | modification follow.
 70 | 
 71 |                        TERMS AND CONDITIONS
 72 | 
 73 |   0. Definitions.
 74 | 
 75 |   "This License" refers to version 3 of the GNU General Public License.
 76 | 
 77 |   "Copyright" also means copyright-like laws that apply to other kinds of
 78 | works, such as semiconductor masks.
 79 | 
 80 |   "The Program" refers to any copyrightable work licensed under this
 81 | License.  Each licensee is addressed as "you".  "Licensees" and
 82 | "recipients" may be individuals or organizations.
 83 | 
 84 |   To "modify" a work means to copy from or adapt all or part of the work
 85 | in a fashion requiring copyright permission, other than the making of an
 86 | exact copy.  The resulting work is called a "modified version" of the
 87 | earlier work or a work "based on" the earlier work.
 88 | 
 89 |   A "covered work" means either the unmodified Program or a work based
 90 | on the Program.
 91 | 
 92 |   To "propagate" a work means to do anything with it that, without
 93 | permission, would make you directly or secondarily liable for
 94 | infringement under applicable copyright law, except executing it on a
 95 | computer or modifying a private copy.  Propagation includes copying,
 96 | distribution (with or without modification), making available to the
 97 | public, and in some countries other activities as well.
 98 | 
 99 |   To "convey" a work means any kind of propagation that enables other
100 | parties to make or receive copies.  Mere interaction with a user through
101 | a computer network, with no transfer of a copy, is not conveying.
102 | 
103 |   An interactive user interface displays "Appropriate Legal Notices"
104 | to the extent that it includes a convenient and prominently visible
105 | feature that (1) displays an appropriate copyright notice, and (2)
106 | tells the user that there is no warranty for the work (except to the
107 | extent that warranties are provided), that licensees may convey the
108 | work under this License, and how to view a copy of this License.  If
109 | the interface presents a list of user commands or options, such as a
110 | menu, a prominent item in the list meets this criterion.
111 | 
112 |   1. Source Code.
113 | 
114 |   The "source code" for a work means the preferred form of the work
115 | for making modifications to it.  "Object code" means any non-source
116 | form of a work.
117 | 
118 |   A "Standard Interface" means an interface that either is an official
119 | standard defined by a recognized standards body, or, in the case of
120 | interfaces specified for a particular programming language, one that
121 | is widely used among developers working in that language.
122 | 
123 |   The "System Libraries" of an executable work include anything, other
124 | than the work as a whole, that (a) is included in the normal form of
125 | packaging a Major Component, but which is not part of that Major
126 | Component, and (b) serves only to enable use of the work with that
127 | Major Component, or to implement a Standard Interface for which an
128 | implementation is available to the public in source code form.  A
129 | "Major Component", in this context, means a major essential component
130 | (kernel, window system, and so on) of the specific operating system
131 | (if any) on which the executable work runs, or a compiler used to
132 | produce the work, or an object code interpreter used to run it.
133 | 
134 |   The "Corresponding Source" for a work in object code form means all
135 | the source code needed to generate, install, and (for an executable
136 | work) run the object code and to modify the work, including scripts to
137 | control those activities.  However, it does not include the work's
138 | System Libraries, or general-purpose tools or generally available free
139 | programs which are used unmodified in performing those activities but
140 | which are not part of the work.  For example, Corresponding Source
141 | includes interface definition files associated with source files for
142 | the work, and the source code for shared libraries and dynamically
143 | linked subprograms that the work is specifically designed to require,
144 | such as by intimate data communication or control flow between those
145 | subprograms and other parts of the work.
146 | 
147 |   The Corresponding Source need not include anything that users
148 | can regenerate automatically from other parts of the Corresponding
149 | Source.
150 | 
151 |   The Corresponding Source for a work in source code form is that
152 | same work.
153 | 
154 |   2. Basic Permissions.
155 | 
156 |   All rights granted under this License are granted for the term of
157 | copyright on the Program, and are irrevocable provided the stated
158 | conditions are met.  This License explicitly affirms your unlimited
159 | permission to run the unmodified Program.  The output from running a
160 | covered work is covered by this License only if the output, given its
161 | content, constitutes a covered work.  This License acknowledges your
162 | rights of fair use or other equivalent, as provided by copyright law.
163 | 
164 |   You may make, run and propagate covered works that you do not
165 | convey, without conditions so long as your license otherwise remains
166 | in force.  You may convey covered works to others for the sole purpose
167 | of having them make modifications exclusively for you, or provide you
168 | with facilities for running those works, provided that you comply with
169 | the terms of this License in conveying all material for which you do
170 | not control copyright.  Those thus making or running the covered works
171 | for you must do so exclusively on your behalf, under your direction
172 | and control, on terms that prohibit them from making any copies of
173 | your copyrighted material outside their relationship with you.
174 | 
175 |   Conveying under any other circumstances is permitted solely under
176 | the conditions stated below.  Sublicensing is not allowed; section 10
177 | makes it unnecessary.
178 | 
179 |   3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180 | 
181 |   No covered work shall be deemed part of an effective technological
182 | measure under any applicable law fulfilling obligations under article
183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184 | similar laws prohibiting or restricting circumvention of such
185 | measures.
186 | 
187 |   When you convey a covered work, you waive any legal power to forbid
188 | circumvention of technological measures to the extent such circumvention
189 | is effected by exercising rights under this License with respect to
190 | the covered work, and you disclaim any intention to limit operation or
191 | modification of the work as a means of enforcing, against the work's
192 | users, your or third parties' legal rights to forbid circumvention of
193 | technological measures.
194 | 
195 |   4. Conveying Verbatim Copies.
196 | 
197 |   You may convey verbatim copies of the Program's source code as you
198 | receive it, in any medium, provided that you conspicuously and
199 | appropriately publish on each copy an appropriate copyright notice;
200 | keep intact all notices stating that this License and any
201 | non-permissive terms added in accord with section 7 apply to the code;
202 | keep intact all notices of the absence of any warranty; and give all
203 | recipients a copy of this License along with the Program.
204 | 
205 |   You may charge any price or no price for each copy that you convey,
206 | and you may offer support or warranty protection for a fee.
207 | 
208 |   5. Conveying Modified Source Versions.
209 | 
210 |   You may convey a work based on the Program, or the modifications to
211 | produce it from the Program, in the form of source code under the
212 | terms of section 4, provided that you also meet all of these conditions:
213 | 
214 |     a) The work must carry prominent notices stating that you modified
215 |     it, and giving a relevant date.
216 | 
217 |     b) The work must carry prominent notices stating that it is
218 |     released under this License and any conditions added under section
219 |     7.  This requirement modifies the requirement in section 4 to
220 |     "keep intact all notices".
221 | 
222 |     c) You must license the entire work, as a whole, under this
223 |     License to anyone who comes into possession of a copy.  This
224 |     License will therefore apply, along with any applicable section 7
225 |     additional terms, to the whole of the work, and all its parts,
226 |     regardless of how they are packaged.  This License gives no
227 |     permission to license the work in any other way, but it does not
228 |     invalidate such permission if you have separately received it.
229 | 
230 |     d) If the work has interactive user interfaces, each must display
231 |     Appropriate Legal Notices; however, if the Program has interactive
232 |     interfaces that do not display Appropriate Legal Notices, your
233 |     work need not make them do so.
234 | 
235 |   A compilation of a covered work with other separate and independent
236 | works, which are not by their nature extensions of the covered work,
237 | and which are not combined with it such as to form a larger program,
238 | in or on a volume of a storage or distribution medium, is called an
239 | "aggregate" if the compilation and its resulting copyright are not
240 | used to limit the access or legal rights of the compilation's users
241 | beyond what the individual works permit.  Inclusion of a covered work
242 | in an aggregate does not cause this License to apply to the other
243 | parts of the aggregate.
244 | 
245 |   6. Conveying Non-Source Forms.
246 | 
247 |   You may convey a covered work in object code form under the terms
248 | of sections 4 and 5, provided that you also convey the
249 | machine-readable Corresponding Source under the terms of this License,
250 | in one of these ways:
251 | 
252 |     a) Convey the object code in, or embodied in, a physical product
253 |     (including a physical distribution medium), accompanied by the
254 |     Corresponding Source fixed on a durable physical medium
255 |     customarily used for software interchange.
256 | 
257 |     b) Convey the object code in, or embodied in, a physical product
258 |     (including a physical distribution medium), accompanied by a
259 |     written offer, valid for at least three years and valid for as
260 |     long as you offer spare parts or customer support for that product
261 |     model, to give anyone who possesses the object code either (1) a
262 |     copy of the Corresponding Source for all the software in the
263 |     product that is covered by this License, on a durable physical
264 |     medium customarily used for software interchange, for a price no
265 |     more than your reasonable cost of physically performing this
266 |     conveying of source, or (2) access to copy the
267 |     Corresponding Source from a network server at no charge.
268 | 
269 |     c) Convey individual copies of the object code with a copy of the
270 |     written offer to provide the Corresponding Source.  This
271 |     alternative is allowed only occasionally and noncommercially, and
272 |     only if you received the object code with such an offer, in accord
273 |     with subsection 6b.
274 | 
275 |     d) Convey the object code by offering access from a designated
276 |     place (gratis or for a charge), and offer equivalent access to the
277 |     Corresponding Source in the same way through the same place at no
278 |     further charge.  You need not require recipients to copy the
279 |     Corresponding Source along with the object code.  If the place to
280 |     copy the object code is a network server, the Corresponding Source
281 |     may be on a different server (operated by you or a third party)
282 |     that supports equivalent copying facilities, provided you maintain
283 |     clear directions next to the object code saying where to find the
284 |     Corresponding Source.  Regardless of what server hosts the
285 |     Corresponding Source, you remain obligated to ensure that it is
286 |     available for as long as needed to satisfy these requirements.
287 | 
288 |     e) Convey the object code using peer-to-peer transmission, provided
289 |     you inform other peers where the object code and Corresponding
290 |     Source of the work are being offered to the general public at no
291 |     charge under subsection 6d.
292 | 
293 |   A separable portion of the object code, whose source code is excluded
294 | from the Corresponding Source as a System Library, need not be
295 | included in conveying the object code work.
296 | 
297 |   A "User Product" is either (1) a "consumer product", which means any
298 | tangible personal property which is normally used for personal, family,
299 | or household purposes, or (2) anything designed or sold for incorporation
300 | into a dwelling.  In determining whether a product is a consumer product,
301 | doubtful cases shall be resolved in favor of coverage.  For a particular
302 | product received by a particular user, "normally used" refers to a
303 | typical or common use of that class of product, regardless of the status
304 | of the particular user or of the way in which the particular user
305 | actually uses, or expects or is expected to use, the product.  A product
306 | is a consumer product regardless of whether the product has substantial
307 | commercial, industrial or non-consumer uses, unless such uses represent
308 | the only significant mode of use of the product.
309 | 
310 |   "Installation Information" for a User Product means any methods,
311 | procedures, authorization keys, or other information required to install
312 | and execute modified versions of a covered work in that User Product from
313 | a modified version of its Corresponding Source.  The information must
314 | suffice to ensure that the continued functioning of the modified object
315 | code is in no case prevented or interfered with solely because
316 | modification has been made.
317 | 
318 |   If you convey an object code work under this section in, or with, or
319 | specifically for use in, a User Product, and the conveying occurs as
320 | part of a transaction in which the right of possession and use of the
321 | User Product is transferred to the recipient in perpetuity or for a
322 | fixed term (regardless of how the transaction is characterized), the
323 | Corresponding Source conveyed under this section must be accompanied
324 | by the Installation Information.  But this requirement does not apply
325 | if neither you nor any third party retains the ability to install
326 | modified object code on the User Product (for example, the work has
327 | been installed in ROM).
328 | 
329 |   The requirement to provide Installation Information does not include a
330 | requirement to continue to provide support service, warranty, or updates
331 | for a work that has been modified or installed by the recipient, or for
332 | the User Product in which it has been modified or installed.  Access to a
333 | network may be denied when the modification itself materially and
334 | adversely affects the operation of the network or violates the rules and
335 | protocols for communication across the network.
336 | 
337 |   Corresponding Source conveyed, and Installation Information provided,
338 | in accord with this section must be in a format that is publicly
339 | documented (and with an implementation available to the public in
340 | source code form), and must require no special password or key for
341 | unpacking, reading or copying.
342 | 
343 |   7. Additional Terms.
344 | 
345 |   "Additional permissions" are terms that supplement the terms of this
346 | License by making exceptions from one or more of its conditions.
347 | Additional permissions that are applicable to the entire Program shall
348 | be treated as though they were included in this License, to the extent
349 | that they are valid under applicable law.  If additional permissions
350 | apply only to part of the Program, that part may be used separately
351 | under those permissions, but the entire Program remains governed by
352 | this License without regard to the additional permissions.
353 | 
354 |   When you convey a copy of a covered work, you may at your option
355 | remove any additional permissions from that copy, or from any part of
356 | it.  (Additional permissions may be written to require their own
357 | removal in certain cases when you modify the work.)  You may place
358 | additional permissions on material, added by you to a covered work,
359 | for which you have or can give appropriate copyright permission.
360 | 
361 |   Notwithstanding any other provision of this License, for material you
362 | add to a covered work, you may (if authorized by the copyright holders of
363 | that material) supplement the terms of this License with terms:
364 | 
365 |     a) Disclaiming warranty or limiting liability differently from the
366 |     terms of sections 15 and 16 of this License; or
367 | 
368 |     b) Requiring preservation of specified reasonable legal notices or
369 |     author attributions in that material or in the Appropriate Legal
370 |     Notices displayed by works containing it; or
371 | 
372 |     c) Prohibiting misrepresentation of the origin of that material, or
373 |     requiring that modified versions of such material be marked in
374 |     reasonable ways as different from the original version; or
375 | 
376 |     d) Limiting the use for publicity purposes of names of licensors or
377 |     authors of the material; or
378 | 
379 |     e) Declining to grant rights under trademark law for use of some
380 |     trade names, trademarks, or service marks; or
381 | 
382 |     f) Requiring indemnification of licensors and authors of that
383 |     material by anyone who conveys the material (or modified versions of
384 |     it) with contractual assumptions of liability to the recipient, for
385 |     any liability that these contractual assumptions directly impose on
386 |     those licensors and authors.
387 | 
388 |   All other non-permissive additional terms are considered "further
389 | restrictions" within the meaning of section 10.  If the Program as you
390 | received it, or any part of it, contains a notice stating that it is
391 | governed by this License along with a term that is a further
392 | restriction, you may remove that term.  If a license document contains
393 | a further restriction but permits relicensing or conveying under this
394 | License, you may add to a covered work material governed by the terms
395 | of that license document, provided that the further restriction does
396 | not survive such relicensing or conveying.
397 | 
398 |   If you add terms to a covered work in accord with this section, you
399 | must place, in the relevant source files, a statement of the
400 | additional terms that apply to those files, or a notice indicating
401 | where to find the applicable terms.
402 | 
403 |   Additional terms, permissive or non-permissive, may be stated in the
404 | form of a separately written license, or stated as exceptions;
405 | the above requirements apply either way.
406 | 
407 |   8. Termination.
408 | 
409 |   You may not propagate or modify a covered work except as expressly
410 | provided under this License.  Any attempt otherwise to propagate or
411 | modify it is void, and will automatically terminate your rights under
412 | this License (including any patent licenses granted under the third
413 | paragraph of section 11).
414 | 
415 |   However, if you cease all violation of this License, then your
416 | license from a particular copyright holder is reinstated (a)
417 | provisionally, unless and until the copyright holder explicitly and
418 | finally terminates your license, and (b) permanently, if the copyright
419 | holder fails to notify you of the violation by some reasonable means
420 | prior to 60 days after the cessation.
421 | 
422 |   Moreover, your license from a particular copyright holder is
423 | reinstated permanently if the copyright holder notifies you of the
424 | violation by some reasonable means, this is the first time you have
425 | received notice of violation of this License (for any work) from that
426 | copyright holder, and you cure the violation prior to 30 days after
427 | your receipt of the notice.
428 | 
429 |   Termination of your rights under this section does not terminate the
430 | licenses of parties who have received copies or rights from you under
431 | this License.  If your rights have been terminated and not permanently
432 | reinstated, you do not qualify to receive new licenses for the same
433 | material under section 10.
434 | 
435 |   9. Acceptance Not Required for Having Copies.
436 | 
437 |   You are not required to accept this License in order to receive or
438 | run a copy of the Program.  Ancillary propagation of a covered work
439 | occurring solely as a consequence of using peer-to-peer transmission
440 | to receive a copy likewise does not require acceptance.  However,
441 | nothing other than this License grants you permission to propagate or
442 | modify any covered work.  These actions infringe copyright if you do
443 | not accept this License.  Therefore, by modifying or propagating a
444 | covered work, you indicate your acceptance of this License to do so.
445 | 
446 |   10. Automatic Licensing of Downstream Recipients.
447 | 
448 |   Each time you convey a covered work, the recipient automatically
449 | receives a license from the original licensors, to run, modify and
450 | propagate that work, subject to this License.  You are not responsible
451 | for enforcing compliance by third parties with this License.
452 | 
453 |   An "entity transaction" is a transaction transferring control of an
454 | organization, or substantially all assets of one, or subdividing an
455 | organization, or merging organizations.  If propagation of a covered
456 | work results from an entity transaction, each party to that
457 | transaction who receives a copy of the work also receives whatever
458 | licenses to the work the party's predecessor in interest had or could
459 | give under the previous paragraph, plus a right to possession of the
460 | Corresponding Source of the work from the predecessor in interest, if
461 | the predecessor has it or can get it with reasonable efforts.
462 | 
463 |   You may not impose any further restrictions on the exercise of the
464 | rights granted or affirmed under this License.  For example, you may
465 | not impose a license fee, royalty, or other charge for exercise of
466 | rights granted under this License, and you may not initiate litigation
467 | (including a cross-claim or counterclaim in a lawsuit) alleging that
468 | any patent claim is infringed by making, using, selling, offering for
469 | sale, or importing the Program or any portion of it.
470 | 
471 |   11. Patents.
472 | 
473 |   A "contributor" is a copyright holder who authorizes use under this
474 | License of the Program or a work on which the Program is based.  The
475 | work thus licensed is called the contributor's "contributor version".
476 | 
477 |   A contributor's "essential patent claims" are all patent claims
478 | owned or controlled by the contributor, whether already acquired or
479 | hereafter acquired, that would be infringed by some manner, permitted
480 | by this License, of making, using, or selling its contributor version,
481 | but do not include claims that would be infringed only as a
482 | consequence of further modification of the contributor version.  For
483 | purposes of this definition, "control" includes the right to grant
484 | patent sublicenses in a manner consistent with the requirements of
485 | this License.
486 | 
487 |   Each contributor grants you a non-exclusive, worldwide, royalty-free
488 | patent license under the contributor's essential patent claims, to
489 | make, use, sell, offer for sale, import and otherwise run, modify and
490 | propagate the contents of its contributor version.
491 | 
492 |   In the following three paragraphs, a "patent license" is any express
493 | agreement or commitment, however denominated, not to enforce a patent
494 | (such as an express permission to practice a patent or covenant not to
495 | sue for patent infringement).  To "grant" such a patent license to a
496 | party means to make such an agreement or commitment not to enforce a
497 | patent against the party.
498 | 
499 |   If you convey a covered work, knowingly relying on a patent license,
500 | and the Corresponding Source of the work is not available for anyone
501 | to copy, free of charge and under the terms of this License, through a
502 | publicly available network server or other readily accessible means,
503 | then you must either (1) cause the Corresponding Source to be so
504 | available, or (2) arrange to deprive yourself of the benefit of the
505 | patent license for this particular work, or (3) arrange, in a manner
506 | consistent with the requirements of this License, to extend the patent
507 | license to downstream recipients.  "Knowingly relying" means you have
508 | actual knowledge that, but for the patent license, your conveying the
509 | covered work in a country, or your recipient's use of the covered work
510 | in a country, would infringe one or more identifiable patents in that
511 | country that you have reason to believe are valid.
512 | 
513 |   If, pursuant to or in connection with a single transaction or
514 | arrangement, you convey, or propagate by procuring conveyance of, a
515 | covered work, and grant a patent license to some of the parties
516 | receiving the covered work authorizing them to use, propagate, modify
517 | or convey a specific copy of the covered work, then the patent license
518 | you grant is automatically extended to all recipients of the covered
519 | work and works based on it.
520 | 
521 |   A patent license is "discriminatory" if it does not include within
522 | the scope of its coverage, prohibits the exercise of, or is
523 | conditioned on the non-exercise of one or more of the rights that are
524 | specifically granted under this License.  You may not convey a covered
525 | work if you are a party to an arrangement with a third party that is
526 | in the business of distributing software, under which you make payment
527 | to the third party based on the extent of your activity of conveying
528 | the work, and under which the third party grants, to any of the
529 | parties who would receive the covered work from you, a discriminatory
530 | patent license (a) in connection with copies of the covered work
531 | conveyed by you (or copies made from those copies), or (b) primarily
532 | for and in connection with specific products or compilations that
533 | contain the covered work, unless you entered into that arrangement,
534 | or that patent license was granted, prior to 28 March 2007.
535 | 
536 |   Nothing in this License shall be construed as excluding or limiting
537 | any implied license or other defenses to infringement that may
538 | otherwise be available to you under applicable patent law.
539 | 
540 |   12. No Surrender of Others' Freedom.
541 | 
542 |   If conditions are imposed on you (whether by court order, agreement or
543 | otherwise) that contradict the conditions of this License, they do not
544 | excuse you from the conditions of this License.  If you cannot convey a
545 | covered work so as to satisfy simultaneously your obligations under this
546 | License and any other pertinent obligations, then as a consequence you may
547 | not convey it at all.  For example, if you agree to terms that obligate you
548 | to collect a royalty for further conveying from those to whom you convey
549 | the Program, the only way you could satisfy both those terms and this
550 | License would be to refrain entirely from conveying the Program.
551 | 
552 |   13. Use with the GNU Affero General Public License.
553 | 
554 |   Notwithstanding any other provision of this License, you have
555 | permission to link or combine any covered work with a work licensed
556 | under version 3 of the GNU Affero General Public License into a single
557 | combined work, and to convey the resulting work.  The terms of this
558 | License will continue to apply to the part which is the covered work,
559 | but the special requirements of the GNU Affero General Public License,
560 | section 13, concerning interaction through a network will apply to the
561 | combination as such.
562 | 
563 |   14. Revised Versions of this License.
564 | 
565 |   The Free Software Foundation may publish revised and/or new versions of
566 | the GNU General Public License from time to time.  Such new versions will
567 | be similar in spirit to the present version, but may differ in detail to
568 | address new problems or concerns.
569 | 
570 |   Each version is given a distinguishing version number.  If the
571 | Program specifies that a certain numbered version of the GNU General
572 | Public License "or any later version" applies to it, you have the
573 | option of following the terms and conditions either of that numbered
574 | version or of any later version published by the Free Software
575 | Foundation.  If the Program does not specify a version number of the
576 | GNU General Public License, you may choose any version ever published
577 | by the Free Software Foundation.
578 | 
579 |   If the Program specifies that a proxy can decide which future
580 | versions of the GNU General Public License can be used, that proxy's
581 | public statement of acceptance of a version permanently authorizes you
582 | to choose that version for the Program.
583 | 
584 |   Later license versions may give you additional or different
585 | permissions.  However, no additional obligations are imposed on any
586 | author or copyright holder as a result of your choosing to follow a
587 | later version.
588 | 
589 |   15. Disclaimer of Warranty.
590 | 
591 |   THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592 | APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596 | PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597 | IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599 | 
600 |   16. Limitation of Liability.
601 | 
602 |   IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610 | SUCH DAMAGES.
611 | 
612 |   17. Interpretation of Sections 15 and 16.
613 | 
614 |   If the disclaimer of warranty and limitation of liability provided
615 | above cannot be given local legal effect according to their terms,
616 | reviewing courts shall apply local law that most closely approximates
617 | an absolute waiver of all civil liability in connection with the
618 | Program, unless a warranty or assumption of liability accompanies a
619 | copy of the Program in return for a fee.
620 | 
621 |                      END OF TERMS AND CONDITIONS
622 | 
623 |             How to Apply These Terms to Your New Programs
624 | 
625 |   If you develop a new program, and you want it to be of the greatest
626 | possible use to the public, the best way to achieve this is to make it
627 | free software which everyone can redistribute and change under these terms.
628 | 
629 |   To do so, attach the following notices to the program.  It is safest
630 | to attach them to the start of each source file to most effectively
631 | state the exclusion of warranty; and each file should have at least
632 | the "copyright" line and a pointer to where the full notice is found.
633 | 
634 |     <one line to give the program's name and a brief idea of what it does.>
635 |     Copyright (C) <year>  <name of author>
636 | 
637 |     This program is free software: you can redistribute it and/or modify
638 |     it under the terms of the GNU General Public License as published by
639 |     the Free Software Foundation, either version 3 of the License, or
640 |     (at your option) any later version.
641 | 
642 |     This program is distributed in the hope that it will be useful,
643 |     but WITHOUT ANY WARRANTY; without even the implied warranty of
644 |     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
645 |     GNU General Public License for more details.
646 | 
647 |     You should have received a copy of the GNU General Public License
648 |     along with this program.  If not, see <https://www.gnu.org/licenses/>.
649 | 
650 | Also add information on how to contact you by electronic and paper mail.
651 | 
652 |   If the program does terminal interaction, make it output a short
653 | notice like this when it starts in an interactive mode:
654 | 
655 |     <program>  Copyright (C) <year>  <name of author>
656 |     This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657 |     This is free software, and you are welcome to redistribute it
658 |     under certain conditions; type `show c' for details.
659 | 
660 | The hypothetical commands `show w' and `show c' should show the appropriate
661 | parts of the General Public License.  Of course, your program's commands
662 | might be different; for a GUI interface, you would use an "about box".
663 | 
664 |   You should also get your employer (if you work as a programmer) or school,
665 | if any, to sign a "copyright disclaimer" for the program, if necessary.
666 | For more information on this, and how to apply and follow the GNU GPL, see
667 | <https://www.gnu.org/licenses/>.
668 | 
669 |   The GNU General Public License does not permit incorporating your program
670 | into proprietary programs.  If your program is a subroutine library, you
671 | may consider it more useful to permit linking proprietary applications with
672 | the library.  If this is what you want to do, use the GNU Lesser General
673 | Public License instead of this License.  But first, please read
674 | <https://www.gnu.org/licenses/why-not-lgpl.html>.
675 | 


--------------------------------------------------------------------------------