├── README.md
├── pom.xml
└── src
    └── main
        └── java
            └── threadpool
                ├── PoolSizeCalculator.java
                └── SimplePoolSizeCaculator.java


/README.md:
--------------------------------------------------------------------------------
  1 | 
  2 | # Reasonably estimating the size of a Java thread pool and its queue capacity
  3 | ## Analysis of the principles
  4 | >Original article: http://ifeve.com/how-to-calculate-threadpool-size/
  5 | 
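  6 | Start with a naive estimate: suppose a system must sustain a TPS (Transactions Per Second, or Tasks Per Second) of at least 20, each transaction is handled by one thread, and a thread takes 4 s on average to process one transaction. The question then becomes:
  7 | 
  8 | How large must the thread pool be so that 20 transactions can be completed every second?
  9 | 
 10 | The calculation is simple: each thread delivers 0.25 TPS, so reaching 20 TPS takes 20/0.25 = 80 threads.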
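
The arithmetic above can be written as a small sketch. The class and method names (`NaiveEstimate`, `threadsFor`) are made up for illustration and are not part of this repository.

```java
public class NaiveEstimate {

    /**
     * Naive estimate: threads needed = target TPS * seconds per transaction,
     * rounded up. The number of CPU cores is ignored entirely.
     */
    static int threadsFor(double targetTps, double secondsPerTransaction) {
        return (int) Math.ceil(targetTps * secondsPerTransaction);
    }

    public static void main(String[] args) {
        // 20 TPS at 4 s per transaction, as in the example above
        System.out.println(threadsFor(20, 4));
    }
}
```

Multiplying by the time per transaction is the same as dividing by the per-thread throughput (20 * 4 = 20 / 0.25).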
 11 | 
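 12 | This estimate is clearly naive because it ignores the number of CPUs. A typical server has 16 or 32 CPU cores; 80 threads on such a machine are bound to incur a lot of unnecessary context-switching overhead.
 13 | 
 14 | A second method is simple, though it is unclear whether it is workable (N is the total number of CPU cores):
 15 | 
 16 | - For a CPU-bound application, set the thread pool size to N+1;
 17 | - For an IO-bound application, set the thread pool size to 2N+1.
 18 | 
 19 | If the server runs only this one application and the application has only this one thread pool, the estimate may be reasonable; verify it with your own tests.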
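
A minimal sketch of this rule of thumb, assuming N is taken from `Runtime.availableProcessors()`. The class and method names are hypothetical and do not appear in this repository.

```java
public class RuleOfThumb {

    /** CPU-bound workloads: N + 1 threads. */
    static int cpuBound(int cores) {
        return cores + 1;
    }

    /** IO-bound workloads: 2N + 1 threads. */
    static int ioBound(int cores) {
        return 2 * cores + 1;
    }

    public static void main(String[] args) {
        // Assumption: the JVM sees all cores of the machine
        // (containers and CPU limits may report fewer).
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound pool size: " + cpuBound(n));
        System.out.println("IO-bound pool size: " + ioBound(n));
    }
}
```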
 20 | 
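 21 | The third method is an estimation formula found in material on server performance and IO optimization:
 22 | 
 23 | >Optimal number of threads = ((thread wait time + thread CPU time) / thread CPU time) * number of CPUs
 24 | 
 25 | For example, if each thread spends 0.5 s on the CPU on average and 1.5 s waiting (time not spent on the CPU, such as IO), and there are 8 CPU cores, the formula gives ((0.5+1.5)/0.5)*8 = 32. The formula can be rewritten as:
 26 | 
 27 | >Optimal number of threads = (ratio of thread wait time to thread CPU time + 1) * number of CPUs
 28 | 
 29 | This leads to a conclusion:
 30 | 
 31 | - The larger the share of thread CPU time, the fewer threads are needed.
 32 | - The larger the share of thread wait time, the more threads are needed.
 33 | 
 34 | The previous estimate is consistent with this conclusion.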
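
The formula and its two conclusions can be checked numerically. The following is only a sketch: `WaitComputeEstimate` and `optimalThreads` are illustrative names, and rounding to the nearest integer is an assumption made here, not something the formula prescribes.

```java
public class WaitComputeEstimate {

    /**
     * Optimal threads = (waitTime / cpuTime + 1) * cores.
     * Both times must use the same unit; cpuTime must be positive.
     */
    static long optimalThreads(double waitTime, double cpuTime, int cores) {
        if (cpuTime <= 0) {
            throw new IllegalArgumentException("cpuTime must be positive");
        }
        return Math.round((waitTime / cpuTime + 1) * cores);
    }

    public static void main(String[] args) {
        // Example from the text: 0.5 s CPU, 1.5 s wait, 8 cores
        System.out.println(optimalThreads(1.5, 0.5, 8));
        // No waiting at all (purely CPU-bound): one thread per core
        System.out.println(optimalThreads(0.0, 0.5, 8));
        // Equal wait and CPU time: two threads per core
        System.out.println(optimalThreads(0.5, 0.5, 8));
    }
}
```

With no wait time the result is N, and with equal wait and CPU time it is 2N, which is close to the N+1 and 2N+1 rules of thumb.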
 35 | 
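 36 | The fastest part of a system is the CPU, so the CPU sets the upper bound on system throughput, and more CPU capacity raises that bound. By the weakest-link effect, however, real throughput cannot be computed from the CPU alone. To raise throughput, start with the system's bottlenecks (such as network latency and IO):
 37 | 
 38 | - Parallelize the bottleneck operations as much as possible, for example with multi-threaded downloading
 39 | - Strengthen the bottleneck itself, for example by replacing blocking IO with NIO
 40 | 
 41 | The first point relates to Amdahl's law, which defines the speedup obtained when a serial system is parallelized:
 42 | 
 43 | >Speedup = system time before optimization / system time after optimization
 44 | 
 45 | The larger the speedup, the more effective the parallelization.
 46 | 
 47 | Amdahl's law also relates the degree of parallelism, the number of CPUs, and the speedup:
 48 | 
 49 | Let Speedup be the speedup, F the serial fraction (the fraction of the code that executes serially), and N the number of CPUs:
 50 | 
 51 | >Speedup <= 1 / (F + (1-F)/N)
 52 | 
 53 | When N is large enough, the smaller the serial fraction F, the larger the speedup; as N grows, the bound approaches 1/F.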
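
To see how the bound behaves, here is a short sketch that evaluates Amdahl's formula for a fixed serial fraction and a growing number of CPUs. The class name `AmdahlSpeedup` and the sample value F = 0.10 are illustrative choices.

```java
public class AmdahlSpeedup {

    /** Upper bound on the speedup for serial fraction f on n processors. */
    static double maxSpeedup(double serialFraction, int processors) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }

    public static void main(String[] args) {
        // With 10% serial code the bound can never exceed 1 / 0.10 = 10,
        // no matter how many CPUs are added.
        for (int n : new int[] {1, 4, 16, 64, 1024}) {
            System.out.printf("F=0.10, N=%4d -> speedup <= %.2f%n",
                    n, maxSpeedup(0.10, n));
        }
    }
}
```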
 54 | 
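 55 | Is a thread pool always more efficient than a single thread?
 56 | 
 57 | No. Redis, for example, is single-threaded, yet it is very efficient: basic operations reach the order of 100,000 per second.
 58 | 
 59 | From the threading point of view, part of the reason is that multithreading incurs context-switch overhead, which a single thread avoids.
 60 | 
 61 | The more fundamental reason Redis is fast is that nearly all of its operations are in memory, and in that case a single thread can use the CPU very efficiently. Multithreading generally suits workloads with a substantial share of IO and network operations.
 62 | 
 63 | **So even when the estimates above look reasonable, they may not be so in practice. Each must be checked against the real characteristics of the system (IO-bound, CPU-bound, or purely in-memory) and the hardware environment (CPU, memory, disk read/write speed, network conditions, and so on), and refined through repeated experiments until it reaches a value that fits reality.**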
 64 | 
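 65 | ## Source code walkthrough
 66 | ### `PoolSizeCalculator` class
 67 | #### `calculateBoundaries` method
 68 | Entry point; calculates the thread pool size and the queue capacity.
 69 | 
 70 | Takes two parameters: the target CPU utilization and the total memory allotted to the queue (bytes).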
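 71 | #### `calculateMemoryUsage` method
 72 | Calculates the memory footprint of a single task as follows:
 73 | 1. Trigger GC manually
 74 | 2. Measure the used memory, m0
 75 | 3. Create a queue and put 1000 tasks into it
 76 | 4. Trigger GC again
 77 | 5. Measure the used memory, m1
 78 | 6. `(m1 - m0) / 1000` is the size of each task
 79 | >Reference: <https://www.javaspecialists.eu/archive/Issue029.html>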
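 80 | #### `calculateOptimalCapacity` method
 81 | Calculates the queue capacity as **total queue memory** / **memory per task**.
 82 | 
 83 | Takes one parameter: the total memory allotted to the queue.
 84 | 
 85 | #### `start` method
 86 | Runs the task repeatedly for *3 seconds* so that the CPU time actually consumed over that interval can be measured.
 87 | >Reference: <https://www.javaspecialists.eu/archive/Issue124.html>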
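 88 | #### `calculateOptimalThreadCount` method
 89 | Calculates the thread pool size.
 90 | Formula: number of CPU cores * target CPU utilization * (1 + thread wait time / thread CPU time)
 91 | #### `collectGarbage` method
 92 | Triggers GC manually in a loop.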
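 93 | ### `SimplePoolSizeCaculator` class
 94 | An implementation of `PoolSizeCalculator` that calculates the thread pool size and queue capacity for an *IO-bound* task with a *target CPU utilization of 1* and a *total queue memory of about 100 KB*.
 95 | #### `AsyncIOTask` class
 96 | An example of an IO-bound task.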
 97 | 
 98 | ## Usage
 99 | ```bash
100 | # Download
101 | git clone https://github.com/sunshanpeng/dark_magic.git
102 | # Build
103 | cd dark_magic && mvn package
104 | # Run
105 | java -jar target/dark_magic-1.0-SNAPSHOT.jar
106 | ```
107 | Sample console output
108 | ```text
109 | Target queue memory usage (bytes): 100000
110 | createTask() produced threadpool.AsyncIOTask which took 40 bytes in a queue
111 | Formula: 100000 / 40
112 | * Recommended queue capacity (bytes): 2500
113 | Number of CPU: 4
114 | Target utilization: 1
115 | Elapsed time (nanos): 3000000000
116 | Compute time (nanos): 125000000
117 | Wait time (nanos): 2875000000
118 | Formula: 4 * 1 * (1 + 2875000000 / 125000000)
119 | * Optimal thread count: 96
120 | ```
121 | > If the queue memory size and the task are left unchanged, the queue capacity will most likely always be 2500.
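
One way the two recommended numbers might be applied is sketched below. This is not part of the repository: `PoolFromEstimate` and `newPool` are hypothetical names, the values 96 and 2500 are taken from the sample output above, and `CallerRunsPolicy` is an assumed choice for what happens when the queue is full.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolFromEstimate {

    /** Builds a fixed-size pool with a bounded work queue. */
    static ThreadPoolExecutor newPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,              // core size == max size
                0L, TimeUnit.MILLISECONDS,     // no keep-alive for extra threads
                new LinkedBlockingQueue<Runnable>(queueCapacity),
                // Assumption: when the queue is full, run the task on the
                // submitting thread instead of rejecting it.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        // Values from the sample output: 96 threads, queue capacity 2500
        ThreadPoolExecutor pool = newPool(96, 2500);
        pool.execute(new Runnable() {
            public void run() {
                System.out.println("task ran on " + Thread.currentThread().getName());
            }
        });
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The numbers should be re-measured on the target machine with the real task, since both depend on the hardware and on how the task behaves.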


--------------------------------------------------------------------------------
/pom.xml:
--------------------------------------------------------------------------------
 1 | <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 2 |          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 3 |     <modelVersion>4.0.0</modelVersion>
 4 | 
 5 |     <groupId>threadpool</groupId>
 6 |     <artifactId>dark_magic</artifactId>
 7 |     <version>1.0-SNAPSHOT</version>
 8 |     <packaging>jar</packaging>
 9 |     <name>dark_magic</name>
10 | 
11 |     <properties>
12 |         <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
13 |     </properties>
14 | 
15 |     <build>
16 |         <plugins>
17 |             <plugin>
18 |                 <!-- Build an executable JAR -->
19 |                 <groupId>org.apache.maven.plugins</groupId>
20 |                 <artifactId>maven-jar-plugin</artifactId>
21 |                 <version>3.1.0</version>
22 |                 <configuration>
23 |                     <archive>
24 |                         <manifest>
25 |                             <addClasspath>true</addClasspath>
26 |                             <classpathPrefix>lib/</classpathPrefix>
27 |                             <mainClass>threadpool.SimplePoolSizeCaculator</mainClass>
28 |                         </manifest>
29 |                     </archive>
30 |                 </configuration>
31 |             </plugin>
32 |             <plugin>
33 |                 <groupId>org.apache.maven.plugins</groupId>
34 |                 <artifactId>maven-compiler-plugin</artifactId>
35 |                 <configuration>
36 |                     <source>6</source>
37 |                     <target>6</target>
38 |                 </configuration>
39 |             </plugin>
40 |         </plugins>
41 |     </build>
42 | 
43 | </project>
44 | 


--------------------------------------------------------------------------------
/src/main/java/threadpool/PoolSizeCalculator.java:
--------------------------------------------------------------------------------
  1 | package threadpool;
  2 | 
  3 | import java.math.BigDecimal;
  4 | import java.math.RoundingMode;
  5 | import java.util.Timer;
  6 | import java.util.TimerTask;
  7 | import java.util.concurrent.BlockingQueue;
  8 | 
  9 | /**
 10 |  * A class that calculates the optimal thread pool boundaries. It takes the
 11 |  * desired target utilization and the desired work queue memory consumption as
 12 |  * input and returns thread count and work queue capacity.
 13 |  *
 14 |  * @author Niklas Schlimm
 15 |  */
 16 | public abstract class PoolSizeCalculator {
 17 | 
 18 |     /**
 19 |      * The sample queue size to calculate the size of a single {@link Runnable}
 20 |      * element.
 21 |      */
 22 |     private static final int SAMPLE_QUEUE_SIZE = 1000;
 23 | 
 24 |     /**
 25 |      * Accuracy of test run. It must finish within 20ms of the testTime
 26 |      * otherwise we retry the test. This could be configurable.
 27 |      */
 28 |     private static final int EPSYLON = 20;
 29 | 
 30 |     /**
 31 |      * Control variable for the CPU time investigation.
 32 |      */
 33 |     private volatile boolean expired;
 34 | 
 35 |     /**
 36 |      * Time (millis) of the test run in the CPU time calculation.
 37 |      */
 38 |     private final long elapsed = 3000;
 39 | 
 40 |     /**
 41 |      * Calculates the boundaries of a thread pool for a given {@link Runnable}.
 42 |      *
 43 |      * @param targetUtilization the desired utilization of the CPUs (0 <= targetUtilization <= 1)
 44 |      * @param targetQueueSizeBytes the desired maximum work queue size of the thread pool (bytes)
 45 |      */
 46 |     void calculateBoundaries(BigDecimal targetUtilization, BigDecimal targetQueueSizeBytes) {
 47 |         calculateOptimalCapacity(targetQueueSizeBytes);
 48 |         Runnable task = createTask();
 49 |         start(task);
 50 |         start(task); // warm up phase
 51 |         long cputime = getCurrentThreadCPUTime();
 52 |         start(task); // test interval
 53 |         cputime = getCurrentThreadCPUTime() - cputime;
 54 |         long waitTime = (elapsed * 1000000) - cputime;
 55 |         calculateOptimalThreadCount(cputime, waitTime, targetUtilization);
 56 |     }
 57 | 
 58 |     private void calculateOptimalCapacity(BigDecimal targetQueueSizeBytes) {
 59 |         long mem = calculateMemoryUsage();
 60 |         BigDecimal queueCapacity = targetQueueSizeBytes.divide(new BigDecimal(mem),
 61 |                 RoundingMode.HALF_UP);
 62 |         System.out.println("Target queue memory usage (bytes): "
 63 |                 + targetQueueSizeBytes);
 64 |         System.out.println("createTask() produced " + createTask().getClass().getName() + " which took " + mem + " bytes in a queue");
 65 |         System.out.println("Formula: " + targetQueueSizeBytes + " / " + mem);
 66 |         System.out.println("* Recommended queue capacity (bytes): " + queueCapacity);
 67 |     }
 68 | 
 69 |     /**
 70 |      * Brian Goetz' optimal thread count formula, see 'Java Concurrency in
 71 |      * Practice' (chapter 8.2).
 72 |      * @param cpu
 73 |      *            cpu time consumed by considered task
 74 |      * @param wait
 75 |      *            wait time of considered task
 76 |      * @param targetUtilization
 77 |      *            target utilization of the system
 78 |      */
 79 |     private void calculateOptimalThreadCount(long cpu, long wait,
 80 |                                              BigDecimal targetUtilization) {
 81 |         BigDecimal computeTime = new BigDecimal(cpu);
 82 |         BigDecimal waitTime = new BigDecimal(wait);
 83 |         BigDecimal numberOfCPU = new BigDecimal(Runtime.getRuntime()
 84 |                 .availableProcessors());
 85 |         BigDecimal optimalthreadcount = numberOfCPU.multiply(targetUtilization)
 86 |                 .multiply(new BigDecimal(1).add(waitTime.divide(computeTime,
 87 |                         RoundingMode.HALF_UP)));
 88 |         System.out.println("Number of CPU: " + numberOfCPU);
 89 |         System.out.println("Target utilization: " + targetUtilization);
 90 |         System.out.println("Elapsed time (nanos): " + (elapsed * 1000000));
 91 |         System.out.println("Compute time (nanos): " + cpu);
 92 |         System.out.println("Wait time (nanos): " + wait);
 93 |         System.out.println("Formula: " + numberOfCPU + " * "
 94 |                 + targetUtilization + " * (1 + " + waitTime + " / "
 95 |                 + computeTime + ")");
 96 |         System.out.println("* Optimal thread count: " + optimalthreadcount);
 97 |     }
 98 | 
 99 |     /**
100 |      * Runs the {@link Runnable} over a period defined in {@link #elapsed}.
101 |      * Based on Heinz Kabutz' ideas
102 |      * (http://www.javaspecialists.eu/archive/Issue124.html).
103 |      *
104 |      * @param task
105 |      *            the runnable under investigation
106 |      */
107 |     public void start(Runnable task) {
108 |         long start = 0;
109 |         int runs = 0;
110 |         do {
111 |             if (++runs > 10) {
112 |                 throw new IllegalStateException("Test not accurate");
113 |             }
114 |             expired = false;
115 |             start = System.currentTimeMillis();
116 |             Timer timer = new Timer();
117 |             timer.schedule(new TimerTask() {
118 |                 public void run() {
119 |                     expired = true;
120 |                 }
121 |             }, elapsed);
122 |             while (!expired) {
123 |                 task.run();
124 |             }
125 |             start = System.currentTimeMillis() - start;
126 |             timer.cancel();
127 |         } while (Math.abs(start - elapsed) > EPSYLON);
128 |         collectGarbage(3);
129 |     }
130 | 
131 |     private void collectGarbage(int times) {
132 |         for (int i = 0; i < times; i++) {
133 |             System.gc();
134 |             try {
135 |                 Thread.sleep(10);
136 |             } catch (InterruptedException e) {
137 |                 Thread.currentThread().interrupt();
138 |                 break;
139 |             }
140 |         }
141 |     }
142 | 
143 |     /**
144 |      * Calculates the memory usage of a single element in a work queue. Based on
145 |      * Heinz Kabutz' ideas
146 |      * (http://www.javaspecialists.eu/archive/Issue029.html).
147 |      *
148 |      * @return memory usage of a single {@link Runnable} element in the thread
149 |      * pools work queue
150 |      */
151 |     private long calculateMemoryUsage() {
152 |         BlockingQueue<Runnable> queue = createWorkQueue(SAMPLE_QUEUE_SIZE);
153 |         for (int i = 0; i < SAMPLE_QUEUE_SIZE; i++) {
154 |             queue.add(createTask());
155 |         }
156 |         // The queue filled above is a warm-up pass only; both values are
157 |         // measured below, after garbage collection.
158 |         long mem0;
159 |         long mem1;
160 |         queue = null;
161 |         collectGarbage(15);
162 |         mem0 = Runtime.getRuntime().totalMemory()
163 |                 - Runtime.getRuntime().freeMemory();
164 |         queue = createWorkQueue(SAMPLE_QUEUE_SIZE);
165 |         for (int i = 0; i < SAMPLE_QUEUE_SIZE; i++) {
166 |             queue.add(createTask());
167 |         }
168 |         collectGarbage(15);
169 |         mem1 = Runtime.getRuntime().totalMemory()
170 |                 - Runtime.getRuntime().freeMemory();
171 |         return (mem1 - mem0) / SAMPLE_QUEUE_SIZE;
172 |     }
173 | 
174 |     /**
175 |      * Create your runnable task here.
176 |      *
177 |      * @return an instance of your runnable task under investigation
178 |      */
179 |     protected abstract Runnable createTask();
180 | 
181 |     /**
182 |      * Return an instance of the queue used in the thread pool.
183 |      *
184 |      * @return queue instance
185 |      */
186 |     protected abstract BlockingQueue<Runnable> createWorkQueue(int capacity);
187 | 
188 |     /**
189 |      * Calculate current cpu time. Various frameworks may be used here,
190 |      * depending on the operating system in use. (e.g.
191 |      * http://www.hyperic.com/products/sigar). The more accurate the CPU time
192 |      * measurement, the more accurate the results for thread count boundaries.
193 |      *
194 |      * @return current cpu time of current thread
195 |      */
196 |     protected abstract long getCurrentThreadCPUTime();
197 | 
198 | }


--------------------------------------------------------------------------------
/src/main/java/threadpool/SimplePoolSizeCaculator.java:
--------------------------------------------------------------------------------
 1 | package threadpool;
 2 | 
 3 | import java.io.BufferedReader;
 4 | import java.io.IOException;
 5 | import java.io.InputStreamReader;
 6 | import java.lang.management.ManagementFactory;
 7 | import java.math.BigDecimal;
 8 | import java.net.HttpURLConnection;
 9 | import java.net.URL;
10 | import java.util.concurrent.BlockingQueue;
11 | import java.util.concurrent.LinkedBlockingQueue;
12 | 
13 | public class SimplePoolSizeCaculator extends PoolSizeCalculator {
14 | 
15 |     @Override
16 |     protected Runnable createTask() {
17 |         return new AsyncIOTask();
18 |     }
19 | 
20 |     @Override
21 |     protected BlockingQueue<Runnable> createWorkQueue(int capacity) {
22 |         return new LinkedBlockingQueue<Runnable>(capacity);
23 |     }
24 | 
25 |     @Override
26 |     protected long getCurrentThreadCPUTime() {
27 |         //the total CPU time for the current thread in nanoseconds
28 |         return ManagementFactory.getThreadMXBean().getCurrentThreadCpuTime();
29 |     }
30 | 
31 |     public static void main(String[] args) {
32 |         PoolSizeCalculator poolSizeCalculator = new SimplePoolSizeCaculator();
33 |         poolSizeCalculator.calculateBoundaries(new BigDecimal(1.0), new BigDecimal(100000));
34 |     }
35 | 
36 | }
37 | 
38 | /**
 39 |  * A custom asynchronous IO task
40 |  * @author Will
41 |  *
42 |  */
43 | class AsyncIOTask implements Runnable {
44 | 
45 |     @Override
46 |     public void run() {
47 |         HttpURLConnection connection = null;
48 |         BufferedReader reader = null;
49 |         try {
50 |             URL url = new URL("http://baidu.com");
51 | 
52 |             connection = (HttpURLConnection) url.openConnection();
53 |             connection.connect();
54 |             reader = new BufferedReader(new InputStreamReader(
55 |                     connection.getInputStream()));
56 | 
57 |             String line;
58 |             StringBuilder stringBuilder;
59 |             while ((line = reader.readLine()) != null) {
60 |                 stringBuilder = new StringBuilder();
61 |                 stringBuilder.append(line);
62 |             }
63 |         }
64 | 
65 |         catch (IOException e) {
66 | 
67 |         } finally {
68 |             if(reader != null) {
69 |                 try {
70 |                     reader.close();
71 |                 }
72 |                 catch(Exception e) {
73 | 
74 |                 }
75 |             }
76 |             if (connection != null)
 77 |                 connection.disconnect();
78 |         }
79 | 
80 |     }
81 | 
82 | }


--------------------------------------------------------------------------------