├── README.md ├── kafka_junit_tests.iml ├── lib └── collectd-api.jar ├── pom.xml ├── src ├── main │ └── java │ │ └── com │ │ └── mapr │ │ └── sample │ │ ├── Tick.java │ │ └── TickPojo.java └── test │ ├── R │ └── draw-speed-graphs.r │ ├── java │ └── com │ │ └── mapr │ │ └── sample │ │ ├── MessageSizeSpeedTest.java │ │ ├── ThreadCountSpeedTest.java │ │ ├── TopicCountGridSearchTest.java │ │ └── TypeFormatSpeedTest.java │ └── resources │ ├── producer.props │ └── sample-tick-01.txt ├── target ├── classes │ └── com │ │ └── mapr │ │ └── demo │ │ └── finserv │ │ └── Tick.class └── test-classes │ ├── com │ └── mapr │ │ └── demo │ │ └── finserv │ │ ├── ThreadCountSpeedTest$1.class │ │ ├── ThreadCountSpeedTest$Sender.class │ │ ├── ThreadCountSpeedTest.class │ │ ├── Tick2Test.class │ │ └── TopicCountGridSearchTest.class │ ├── offset_producer.props │ ├── producer.props │ └── sample-tick-01.txt └── thread.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | This project contains JUnit tests for tuning Kafka configurations.
4 |
5 | # What is the purpose of this project?
6 |
7 | [Apache Kafka](http://kafka.apache.org) is a distributed streaming platform. It lets you publish and subscribe to streams of data like a messaging system. You can also use it to store streams of data in a distributed cluster and process those streams in real-time. However, sometimes it can be challenging to publish or consume data at a rate that keeps up with real-time. Optimizing the speed of your producers or consumers involves knowing what specific values to use for a variety of performance-related variables.
8 |
9 | One method of tuning these parameters is to run a series of incremental unit tests designed to measure throughput over a range of values for a single parameter. However, determining which configurations produce the best possible Kafka performance can be a time-consuming process of trial and error. Automating that process with parameterized JUnit tests is an excellent way to optimize Kafka without guesswork and without wasting time.
10 |
11 | ## What is JUnit?
12 |
13 | [JUnit](https://en.wikipedia.org/wiki/JUnit) is a unit testing framework for the Java programming language and is by far the most popular framework for developing test cases in Java.
14 |
15 | # What is in this project?
16 |
17 | This project includes JUnit tests designed to find which Kafka configurations will maximize the speed at which messages can be published to a Kafka stream. These unit tests don't so much test anything as produce speed data, so that different configurations of Kafka producers can be adjusted to get optimal performance under different conditions.
18 |
19 | The following unit tests are included:
20 |
21 | 1. *MessageSizeSpeedTest* measures producer throughput for a variety of message sizes. This test shows how much throughput declines as message sizes increase.
22 |
23 | 2. *ThreadCountSpeedTest* measures producer throughput for a variety of thread counts and topic quantities. This test shows how much throughput declines as the producer sends to an increasing quantity of topics.
24 |
25 | 3. *TopicCountGridSearchTest* explores the effect of the number of output topics, buffer size, threading, and so on.
26 |
27 | 4. *TypeFormatSpeedTest* measures how fast messages can be converted from POJO or JSON data format to Kafka's native byte array format. This is useful for illustrating the speed penalty you pay in Kafka serialization for using complex data types. All four tests follow the same parameterized JUnit pattern, sketched below.
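Here is a minimal sketch of that shared pattern (the class name, parameter values, and method body are illustrative, not part of this project):

```java
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

import java.util.Arrays;

// Minimal sketch of the parameterized speed-test pattern used in src/test/java.
@RunWith(Parameterized.class)
public class ExampleSpeedTest {

    // Each Object[] row becomes one complete run of every @Test method.
    @Parameterized.Parameters(name = "{index}: threads={0}, topics={1}")
    public static Iterable<Object[]> data() {
        return Arrays.asList(new Object[][]{
                {1, 50}, {2, 50}, {5, 100}
        });
    }

    private final int threadCount;
    private final int topicCount;

    // JUnit invokes this constructor once per parameter row.
    public ExampleSpeedTest(int threadCount, int topicCount) {
        this.threadCount = threadCount;
        this.topicCount = topicCount;
    }

    @Test
    public void testThroughput() {
        // Publish messages for a fixed interval with this row's settings and
        // append the measured throughput to a CSV file, as the real tests do.
    }
}
```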
28 |
29 | # How do I compile and run this project?
30 |
31 | ## Prerequisites
32 |
33 | Download and run this code on a Kafka or MapR cluster.
34 |
35 | Install a JDK and Maven if you haven't already.
36 |
37 | If you want to graph your test results, install Rscript, too.
38 |
39 | Start the Kafka and Zookeeper services.
40 |
41 | Update bootstrap.servers in src/test/resources/producer.props to point to the Kafka service.
42 |
43 | ## Compile and Run
44 |
45 | This project has been prepared to run on either MapR or vanilla Kafka clusters.
46 |
47 | To run it on a MapR cluster, check out the `mapr` branch and run Maven, like this:
48 |
49 | ```
50 | git checkout mapr
51 | mvn package
52 | ```
53 |
54 | To run it on a vanilla Kafka cluster, check out the `kafka` branch and run Maven, like this:
55 |
56 | ```
57 | git checkout kafka
58 | mvn package
59 | ```
60 |
61 | After Maven completes, test data will be saved to three new files: `size-count.csv`, `thread-count.csv`, and `topic-count.csv`.
62 |
63 | To run a single unit test, use a command like `mvn -e -Dtest=MessageSizeSpeedTest test`.
64 |
65 | You can graph the performance results like this:
66 |
67 | ```Rscript src/test/R/draw-speed-graphs.r```
68 |
69 | Open the resulting .png image files to see your results. Here is an example of performance data graphed from the TopicCountGridSearchTest test:
70 |
71 | ![Producer Throughput on a 3-node Kafka cluster](thread.png?raw=true "Producer Throughput on a 3-node Kafka cluster")
72 |
73 |
74 |
75 | ## Caveats
76 |
77 | Sometimes these tests require a lot of memory. You'll know you've run out of heap if you see a "queue full" exception. If that happens, edit pom.xml and increase the JVM heap via the `-Xmx` value in the surefire plugin's `argLine`, as shown below.
78 |
79 | Also, make sure you don't run out of disk space. In zookeeper.properties (under the config directory, wherever you installed Kafka), make sure `dataDir` points to a drive with plenty of space.
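For reference, the heap setting lives in the maven-surefire-plugin configuration in pom.xml; the relevant fragment looks like this (the 2g values are this project's defaults — raise them as needed):

```
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.19.1</version>
  <configuration>
    <!-- Heap for the forked test JVM; increase -Xmx if you see "queue full" exceptions. -->
    <argLine>-Xms2g -Xmx2g</argLine>
  </configuration>
</plugin>
```

And in zookeeper.properties, set `dataDir` with a line such as the following (the path here is only an example):

```
dataDir=/data/zookeeper
```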
--------------------------------------------------------------------------------
/kafka_junit_tests.iml:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/lib/collectd-api.jar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/lib/collectd-api.jar
--------------------------------------------------------------------------------
/pom.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <project xmlns="http://maven.apache.org/POM/4.0.0">
3 |   <modelVersion>4.0.0</modelVersion>
4 |
5 |   <groupId>mapr.com</groupId>
6 |   <artifactId>kafka_junit_tests</artifactId>
7 |   <version>1.0</version>
8 |
9 |   <packaging>jar</packaging>
10 |
11 |   <properties>
12 |     <maven.compiler.source>1.8</maven.compiler.source>
13 |     <maven.compiler.target>1.8</maven.compiler.target>
14 |     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
15 |   </properties>
16 |
17 |   <repositories>
18 |     <repository>
19 |       <id>apache-releases</id>
20 |       <url>https://repository.apache.org/content/groups/public</url>
21 |     </repository>
22 |   </repositories>
23 |
24 |   <dependencies>
25 |     <dependency>
26 |       <groupId>org.apache.spark</groupId>
27 |       <artifactId>spark-core_2.10</artifactId>
28 |       <version>1.6.0</version>
29 |     </dependency>
30 |     <dependency>
31 |       <groupId>org.apache.kafka</groupId>
32 |       <artifactId>kafka-clients</artifactId>
33 |       <version>0.10.0.1</version>
34 |     </dependency>
35 |     <dependency>
36 |       <groupId>com.fasterxml.jackson.core</groupId>
37 |       <artifactId>jackson-databind</artifactId>
38 |       <version>2.4.0</version>
39 |     </dependency>
40 |     <dependency>
41 |       <groupId>com.googlecode.json-simple</groupId>
42 |       <artifactId>json-simple</artifactId>
43 |       <version>1.1</version>
44 |     </dependency>
45 |     <dependency>
46 |       <groupId>junit</groupId>
47 |       <artifactId>junit</artifactId>
48 |       <version>4.12</version>
49 |       <scope>test</scope>
50 |     </dependency>
51 |   </dependencies>
52 |
53 |   <build>
54 |     <plugins>
55 |       <plugin>
56 |         <groupId>org.apache.maven.plugins</groupId>
57 |         <artifactId>maven-compiler-plugin</artifactId>
58 |         <configuration>
59 |           <source>1.8</source>
60 |           <target>1.8</target>
61 |         </configuration>
62 |       </plugin>
63 |       <plugin>
64 |         <artifactId>maven-assembly-plugin</artifactId>
65 |         <version>2.4</version>
66 |         <configuration>
67 |           <archive>
68 |             <manifest>
69 |               <mainClass>com.mapr.sample.Run</mainClass>
70 |             </manifest>
71 |           </archive>
72 |           <descriptorRefs>
73 |             <descriptorRef>jar-with-dependencies</descriptorRef>
74 |           </descriptorRefs>
75 |         </configuration>
76 |         <executions>
77 |           <execution>
78 |             <id>make-assembly</id>
79 |             <phase>package</phase>
80 |             <goals>
81 |               <goal>single</goal>
82 |             </goals>
83 |           </execution>
84 |         </executions>
85 |       </plugin>
86 |       <plugin>
87 |         <groupId>org.apache.maven.plugins</groupId>
88 |         <artifactId>maven-surefire-plugin</artifactId>
89 |         <version>2.19.1</version>
90 |         <configuration>
91 |           <argLine>-Xms2g -Xmx2g</argLine>
92 |         </configuration>
93 |       </plugin>
94 |     </plugins>
95 |     <finalName>kafka_junit_tests</finalName>
96 |   </build>
97 | </project>
--------------------------------------------------------------------------------
/src/main/java/com/mapr/sample/Tick.java:
--------------------------------------------------------------------------------
1 | package com.mapr.sample;
2 |
3 | import com.fasterxml.jackson.annotation.JsonProperty;
4 | import com.google.common.base.Charsets;
5 |
6 | import java.io.IOException;
7 | import java.io.Serializable;
8 | import java.util.ArrayList;
9 | import java.util.Calendar;
10 | import java.util.GregorianCalendar;
11 | import java.util.List;
12 |
13 | /**
14 | * A Tick is a data structure containing a single tick record
15 | * that avoids parsing the underlying bytes for as long as possible.
16 | *

17 | * By using annotations, it also supports fast serialization to JSON.
18 | */
19 | public class Tick implements Serializable {
20 | private byte[] data;
21 |
22 | public Tick(byte[] data) {
23 | this.data = data;
24 | }
25 |
26 | public Tick(String data) {
27 | this.data = data.getBytes(Charsets.ISO_8859_1);
28 | }
29 |
30 | public byte[] getData() { return this.data; }
31 |
32 | @JsonProperty("date")
33 | public String getDate() {
34 | return new String(data, 0, 9);
35 | }
36 |
37 | public long getTimeInMillis() {
38 | // NYSE TAQ records do not reference year, month, day. So, we'll hard code, for now.
39 | Calendar timestamp = new GregorianCalendar(2013, Calendar.DECEMBER, 1); // Calendar months are zero-based
40 | timestamp.set(Calendar.HOUR_OF_DAY, Integer.valueOf(new String(data, 0, 2))); // timestamps use a 24-hour clock
41 | timestamp.set(Calendar.MINUTE, Integer.valueOf(new String(data, 2, 2)));
42 | timestamp.set(Calendar.SECOND, Integer.valueOf(new String(data, 4, 2)));
43 | timestamp.set(Calendar.MILLISECOND, Integer.valueOf(new String(data, 6, 3)));
44 | return timestamp.getTimeInMillis();
45 | }
46 |
47 | @JsonProperty("exchange")
48 | public String getExchange() {
49 | return new String(data, 9, 1);
50 | }
51 |
52 | @JsonProperty("symbol-root")
53 | public String getSymbolRoot() {
54 | return trim(10, 6);
55 | }
56 |
57 | @JsonProperty("symbol-suffix")
58 | public String getSymbolSuffix() {
59 | return trim(16, 10);
60 | }
61 |
62 | @JsonProperty("sale-condition")
63 | public String getSaleCondition() {
64 | return trim(26, 4);
65 | }
66 |
67 | @JsonProperty("trade-volume")
68 | public double getTradeVolume() {
69 | return digitsAsInt(30, 9);
70 | }
71 |
72 | @JsonProperty("trade-price")
73 | public double getTradePrice() {
74 | return digitsAsDouble(39, 11, 4);
75 | }
76 |
77 | @JsonProperty("trade-stop-stock-indicator") public String getTradeStopStockIndicator() { return new String(data, 50, 1); }
78 |
79 | @JsonProperty("trade-correction-indicator")
80 | public String getTradeCorrectionIndicator() {
81 | return new String(data, 51, 2);
82 | }
83 |
84 | @JsonProperty("trade-sequence-number")
85 | public String getTradeSequenceNumber() {
86 | return new String(data, 53, 16);
87 | }
88 |
89 | @JsonProperty("trade-source")
90 | public String getTradeSource() {
91 | return new String(data, 69, 1);
92 | }
93 |
94 | @JsonProperty("trade-reporting-facility")
95 | public String getTradeReportingFacility() {
96 | return new String(data, 70, 1);
97 | }
98 |
99 | @JsonProperty("sender")
100 | public String getSender() {
101 | return new String(data, 71, 4);
102 | }
103 |
104 | @JsonProperty("receiver-list")
105 | public List<String> getReceivers() {
106 | List<String> receivers = new ArrayList<>();
107 | for (int i = 0; data.length >= 79 + i * 4; i++) {
108 | receivers.add(new String(data, 75 + i * 4, 4));
109 | }
110 | return receivers;
111 | }
112 |
113 | private double digitsAsDouble(int start, int length, int decimals) {
114 | double r = digitsAsInt(start, length);
115 | for (int i = 0; i < decimals; i++) {
116 | r = r / 10;
117 | }
118 | return r;
119 | }
120 |
121 | private int digitsAsInt(int start, int length) {
122 | int r = 0;
123 | for (int i = start; i < start + length; i++) {
124 | if (data[i] != ' ') {
125 | r = r * 10 + data[i] - '0';
126 | }
127 | }
128 | return r;
129 | }
130 |
131 | private String trim(int start, int length) {
132 | int i = start;
133 | int j = start + length - 1; // index of the last byte in the field
134 | while (i < start + length && data[i] == ' ') {
135 | i++;
136 | }
137 | while ((j - i) > 0 && data[j] == ' ') {
138 | j--;
139 | }
140 | return new String(data, i, j - i + 1);
141 | }
142 |
143 | // These hooks must be private for Java serialization to invoke them.
144 | private void writeObject(java.io.ObjectOutputStream out) throws IOException {
145 | out.writeInt(data.length);
146 | out.write(data);
147 | }
148 |
149 | private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException {
150 | int length = in.readInt();
151 | data = new byte[length];
152 | // readFully blocks until the whole array is filled, or throws EOFException,
153 | // unlike read(byte[]), which may legitimately return fewer bytes.
154 | in.readFully(data);
155 | }
156 | }
--------------------------------------------------------------------------------
/src/main/java/com/mapr/sample/TickPojo.java:
--------------------------------------------------------------------------------
1 | package com.mapr.sample;
2 |
3 | import java.io.Serializable;
4 |
5 | public class TickPojo implements Serializable {
6 |
7 | public String getDate() {
8 | return date;
9 | }
10 |
11 | public void setDate(String date) {
12 | this.date = date;
13 | }
14 |
15 | public String getExchange() {
16 | return exchange;
17 | }
18 |
19 | public void setExchange(String exchange) {
20 | this.exchange = exchange;
21 | }
22 |
23 | public String getSymbolroot() {
24 | return symbolroot;
25 | }
26 |
27 | public void setSymbolroot(String symbolroot) {
28 | this.symbolroot = symbolroot;
29 | }
30 |
31 | public String getSymbolsuffix() {
32 | return symbolsuffix;
33 | }
34 |
35 | public void setSymbolsuffix(String symbolsuffix) {
36 | this.symbolsuffix = symbolsuffix;
37 | }
38 |
39 | public String getTradeVolume() {
40 | return tradeVolume;
41 | }
42 |
43 | public String getSaleCondition() {
44 | return saleCondition;
45 | }
46 |
47 | public void setSaleCondition(String saleCondition) {
48 | this.saleCondition = saleCondition;
49 | }
50 |
51 | public void
setTradeVolume(String tradeVolume) { 52 | this.tradeVolume = tradeVolume; 53 | } 54 | 55 | public String getTradePrice() { 56 | return tradePrice; 57 | } 58 | 59 | public void setTradePrice(String tradePrice) { 60 | this.tradePrice = tradePrice; 61 | } 62 | 63 | public String getTradeStopStockIndicator() { 64 | return tradeStopStockIndicator; 65 | } 66 | 67 | public void setTradeStopStockIndicator(String tradeStopStockIndicator) { 68 | this.tradeStopStockIndicator = tradeStopStockIndicator; 69 | } 70 | 71 | public String getTradeCorrectionIndicator() { 72 | return tradeCorrectionIndicator; 73 | } 74 | 75 | public void setTradeCorrectionIndicator(String tradeCorrectionIndicator) { 76 | this.tradeCorrectionIndicator = tradeCorrectionIndicator; 77 | } 78 | 79 | public String getTradeSequenceNumber() { 80 | return tradeSequenceNumber; 81 | } 82 | 83 | public void setTradeSequenceNumber(String tradeSequenceNumber) { 84 | this.tradeSequenceNumber = tradeSequenceNumber; 85 | } 86 | 87 | public String getTradeSource() { 88 | return tradeSource; 89 | } 90 | 91 | public void setTradeSource(String tradeSource) { 92 | this.tradeSource = tradeSource; 93 | } 94 | 95 | public String getTradeReportingFacility() { 96 | return tradeReportingFacility; 97 | } 98 | 99 | public void setTradeReportingFacility(String tradeReportingFacility) { 100 | this.tradeReportingFacility = tradeReportingFacility; 101 | } 102 | 103 | String date; 104 | String exchange; 105 | String symbolroot; 106 | String symbolsuffix; 107 | String saleCondition; 108 | String tradeVolume; 109 | String tradePrice; 110 | String tradeStopStockIndicator; 111 | String tradeCorrectionIndicator; 112 | String tradeSequenceNumber; 113 | String tradeSource; 114 | String tradeReportingFacility; 115 | String sender; 116 | 117 | public String getSender() { 118 | return sender; 119 | } 120 | 121 | public void setSender(String sender) { 122 | this.sender = sender; 123 | } 124 | 125 | String[] receivers; 126 | 127 | public String[] getReceivers() { 128 | return receivers; 129 | } 130 | 131 | public void setReceivers(String[] receivers) { 132 | this.receivers = receivers; 133 | } 134 | } 135 | -------------------------------------------------------------------------------- /src/test/R/draw-speed-graphs.r: -------------------------------------------------------------------------------- 1 | png(file="topics.png", width=800, height=500, pointsize=16) 2 | x = read.csv("topic-count.csv") 3 | boxplot(batchRate/1e6 ~ topicCount + batchSize, x, 4 | xlab=c("Topics"), ylab="Millions of Messages / second", ylim=c(0,2), 5 | col=rainbow(3)[ceiling((1:16)/4)], xaxt='n') 6 | axis(1,labels=as.character(rep(c(100,300,1000,2000),4)), at=(1:16), las=3) 7 | legend(x=10,y=1.9,legend=c(0,16384,65536), col=rainbow(3), fill=rainbow(3), title="batch.size") 8 | abline(v=4.5, col='lightgray') 9 | abline(v=8.5, col='lightgray') 10 | dev.off() 11 | 12 | 13 | 14 | png(file="thread.png", width=800, height=500, pointsize=16) 15 | x = read.csv("thread-count.csv") 16 | boxplot(batchRate/1e6 ~ topicCount + threadCount, x, ylim=c(0,2.1), 17 | ylab="Millions of messages / second", xlab="Topics", 18 | col=rainbow(6)[ceiling((1:36)/6)], xaxt='n') 19 | axis(1,labels=as.character(rep(c(50,100,200,500,1000,2000),6)), at=(1:36), las=3) 20 | legend(x=32,y=2.1,legend=c(1,2,5,10,15,20), col=rainbow(6), fill=rainbow(6), title="Threads") 21 | abline(v=6.5, col='lightgray') 22 | abline(v=12.5, col='lightgray') 23 | abline(v=18.5, col='lightgray') 24 | abline(v=24.5, col='lightgray') 25 | abline(v=30.5, 
col='lightgray')
26 | dev.off()
27 |
28 |
29 | png(file="size-count.png", width=800, height=500, pointsize=16)
30 | x = read.csv("size-count.csv")
31 | boxplot(batchRate/1e6 ~ topicCount + messageSize, x, ylim=c(0,1.0),
32 | ylab="Millions of messages / second", xlab="Message Size (bytes)",
33 | col=rainbow(6)[ceiling((1:36)/6)], xaxt='n')
34 | axis(1,labels=as.character(rep(c(100,500,1000,2000,10000,50000,100000),7)), at=(1:49), las=3)
35 | title("Message Size vs Throughput (Messages)")
36 | dev.off()
37 |
38 | png(file="size-count-bytes.png", width=800, height=500, pointsize=16)
39 | x = read.csv("size-count.csv")
40 | boxplot(batchRate*messageSize/1e6 ~ messageSize, x,
41 | ylab="Throughput (MB/Second)", xlab="Message Size (Bytes)", ylim=c(0,200),
42 | col=rainbow(6)[ceiling((1:36)/6)], xaxt='n')
43 | axis(1,labels=as.character(rep(c(100,500,1000,2000,10000,50000,100000),7)), at=(1:49), las=3)
44 | title("Message Size vs Throughput (MBs)")
45 | dev.off()
46 |
47 |
--------------------------------------------------------------------------------
/src/test/java/com/mapr/sample/MessageSizeSpeedTest.java:
--------------------------------------------------------------------------------
1 | package com.mapr.sample;
2 |
3 | import com.google.common.collect.Lists;
4 | import com.google.common.io.Resources;
5 | import org.apache.kafka.clients.producer.KafkaProducer;
6 | import org.apache.kafka.clients.producer.ProducerRecord;
7 | import org.junit.AfterClass;
8 | import org.junit.BeforeClass;
9 | import org.junit.Test;
10 | import org.junit.runner.RunWith;
11 | import org.junit.runners.Parameterized;
12 |
13 | import java.io.File;
14 | import java.io.FileNotFoundException;
15 | import java.io.IOException;
16 | import java.io.PrintWriter;
17 | import java.util.Arrays;
18 | import java.util.List;
19 | import java.util.Properties;
20 | import java.util.Random;
21 | import java.util.concurrent.*;
22 |
23 | /**
24 | * Tests the effect of message size on producer throughput
25 | */
26 | @RunWith(Parameterized.class)
27 | public class MessageSizeSpeedTest {
28 | // STREAM is unused for vanilla Kafka tests
29 | private static final String STREAM = "/mapr/ian.cluster.com/user/mapr/taq";
30 | private static final double TIMEOUT = 30; // seconds
31 | private static final int BATCH_SIZE = 1000000; // The unit of measure for throughput is "batch size" per second
32 | // e.g.
Throughput = X "millions of messages" per sec
33 |
34 | @BeforeClass
35 | public static void openDataFile() throws FileNotFoundException {
36 | data = new PrintWriter(new File("size-count.csv"));
37 | data.printf("messageSize, threadCount, topicCount, i, t, rate, dt, batchRate\n");
38 | }
39 |
40 | @AfterClass
41 | public static void closeDataFile() {
42 | data.close();
43 | }
44 |
45 | private static PrintWriter data;
46 |
47 | @Parameterized.Parameters(name = "{index}: messageSize={0}, topics={1}")
48 | public static Iterable<Object[]> data() {
49 | return Arrays.asList(new Object[][]{
50 | {10, 1}, {100, 1}, {500, 1}
51 | });
52 | }
53 |
54 | private int threadCount = 2; // number of concurrent Kafka producers to run
55 | private int topicCount; // number of Kafka topics in our stream
56 | private int messageSize; // size of each message sent into Kafka
57 |
58 | private static final ProducerRecord<String, byte[]> end = new ProducerRecord<>("end", null);
59 |
60 | public MessageSizeSpeedTest(int messageSize, int topicCount) {
61 | this.messageSize = messageSize;
62 | this.topicCount = topicCount;
63 | }
64 |
65 | private static class Sender extends Thread {
66 | private final KafkaProducer<String, byte[]> producer;
67 | private final BlockingQueue<ProducerRecord<String, byte[]>> queue;
68 |
69 | private Sender(KafkaProducer<String, byte[]> producer, BlockingQueue<ProducerRecord<String, byte[]>> queue) {
70 | this.producer = producer;
71 | this.queue = queue;
72 | }
73 |
74 | @Override
75 | public void run() {
76 | try {
77 | ProducerRecord<String, byte[]> rec = queue.take();
78 | while (rec != end) {
79 | // Here's where the sender thread sends a message.
80 | // Since we're not supplying a callback, the send will be done asynchronously.
81 | // The outgoing message will go to a local buffer which is not necessarily FIFO,
82 | // but sending messages out-of-order does not matter since we're just trying to
83 | // test throughput in this class.
84 | producer.send(rec);
85 | rec = queue.take();
86 | }
87 | } catch (InterruptedException e) {
88 | System.out.printf("%s: Interrupted\n", this.getName());
89 | }
90 | }
91 | }
92 |
93 | @Test
94 | public void testThreads() throws Exception {
95 | System.out.printf("messageSize = %d, topicCount = %d\n", messageSize, topicCount);
96 |
97 | // Create new topic names. Kafka will automatically create these topics if they don't already exist.
98 | List<String> ourTopics = Lists.newArrayList();
99 | for (int i = 0; i < topicCount; i++) {
100 | /*
101 | * Use this line to run the test on MapR:
102 | */
103 | // ourTopics.add(String.format("%s:t-%05d", STREAM, i));
104 | /*
105 | * Use this line to run the test on vanilla Kafka:
106 | */
107 | ourTopics.add(String.format("t-%05d", i)); // Topic names will look like, "t-00874".
108 | }
109 |
110 | // Create a message containing random bytes. We'll send this message over and over again
111 | // in our performance test, below.
112 | Random rand = new Random();
113 | byte[] buf = new byte[messageSize];
114 | rand.nextBytes(buf);
115 | Tick message = new Tick(buf);
116 |
117 | // Create a pool of sender threads.
118 | ExecutorService pool = Executors.newFixedThreadPool(threadCount);
119 |
120 | // We need some way to give each sender messages to publish.
121 | // We'll do that via this list of queues.
122 | List<BlockingQueue<ProducerRecord<String, byte[]>>> queues = Lists.newArrayList();
123 | for (int i = 0; i < threadCount; i++) {
124 | // We use BlockingQueue to buffer messages for each sender.
125 | // We use this type not for concurrency reasons (although it is thread safe) but
126 | // rather because it provides an efficient way for senders to take messages if
127 | // they're available and for us to generate those messages (see below).
128 | BlockingQueue<ProducerRecord<String, byte[]>> q = new ArrayBlockingQueue<>(1000);
129 | queues.add(q);
130 | // spawn each thread with a reference to "q", which we'll add messages to later.
131 | pool.submit(new Sender(getProducer(), q));
132 | }
133 |
134 | double t0 = System.nanoTime() * 1e-9;
135 | double batchStart = 0;
136 |
137 | // -------- Generate Messages for each Sender --------
138 | // Generate BATCH_SIZE messages at a time and send each one to a random sender thread.
139 | // The batch size was defined above as containing 1 million messages.
140 | // We want to send as many messages as possible until a timeout has been reached.
141 | // The timeout was defined above as 30 seconds.
142 | // We'll break out of this loop when that timeout occurs.
143 | for (int i = 0; i >= 0 && i < Integer.MAX_VALUE; ) {
144 | // Send each message in our batch (of 1 million messages) to a random topic.
145 | for (int j = 0; j < BATCH_SIZE; j++) {
146 | // Get a random topic (but always assign it to the same sender thread)
147 | String topic = ourTopics.get(rand.nextInt(topicCount));
148 | // The topic hashcode works in the sense that equal topics always have equal hashes.
149 | // So this will ensure that a topic will always be populated by the same sender thread.
150 | // We want to load balance senders without using round robin, because with round robin
151 | // all senders would have to send to all topics, and we've found that it's much faster
152 | // to minimize the number of topics each kafka producer sends to.
153 | // By using this hashcode we can maintain affinity between Kafka topic and sender thread.
154 | int qid = topic.hashCode() % threadCount;
155 | if (qid < 0) {
156 | qid += threadCount;
157 | }
158 | try {
159 | // Put a message to be published in the queue belonging to the sender we just selected.
160 | // That sender will automatically send this message as soon as possible.
161 | queues.get(qid).put(new ProducerRecord<>(topic, message.getData()));
162 | } catch (Exception e) {
163 | // BlockingQueue might throw an IllegalStateException if the queue fills up.
164 | e.printStackTrace();
165 | }
166 | }
167 | i += BATCH_SIZE;
168 | double t = System.nanoTime() * 1e-9 - t0;
169 | double dt = t - batchStart;
170 | batchStart = t;
171 | // i = total number of messages sent so far
172 | // t = total elapsed time
173 | // i/t = overall throughput (messages sent per second)
174 | // dt = elapsed time for this batch
175 | // BATCH_SIZE/dt = messages sent per second for this batch
176 | data.printf("%d, %d,%d,%d,%.3f,%.1f,%.3f,%.1f\n", messageSize, threadCount, topicCount, i, t, i / t, dt, BATCH_SIZE / dt);
177 | data.flush();
178 | if (t > TIMEOUT) {
179 | break;
180 | }
181 | }
182 | // We cleanly shut down each producer thread by sending the predefined "end" message,
183 | // then shut down the threads in the pool after giving them a few seconds to see that
184 | // end message.
185 | for (int i = 0; i < threadCount; i++) {
186 | queues.get(i).add(end);
187 | }
188 | pool.shutdown();
189 | pool.awaitTermination(10, TimeUnit.SECONDS);
190 | }
191 |
192 | KafkaProducer<String, byte[]> getProducer() throws IOException {
193 | Properties props = new Properties();
194 | props.load(Resources.getResource("producer.props").openStream());
195 | // Properties reference:
196 | // https://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
197 | // props.put("batch.size", 16384);
198 | // props.put("linger.ms", 1);
199 | // props.put("buffer.memory", 33554432);
200 |
201 | return new KafkaProducer<>(props);
202 | }
203 | }
--------------------------------------------------------------------------------
/src/test/java/com/mapr/sample/ThreadCountSpeedTest.java:
--------------------------------------------------------------------------------
1 | package com.mapr.sample;
2 |
3 | import com.google.common.collect.Lists;
4 | import com.google.common.io.Resources;
5 | import org.apache.kafka.clients.producer.KafkaProducer;
6 | import org.apache.kafka.clients.producer.ProducerRecord;
7 | import org.junit.AfterClass;
8 | import org.junit.BeforeClass;
9 | import org.junit.Test;
10 | import org.junit.runner.RunWith;
11 | import org.junit.runners.Parameterized;
12 |
13 | import java.io.File;
14 | import java.io.FileNotFoundException;
15 | import java.io.IOException;
16 | import java.io.PrintWriter;
17 | import java.util.*;
18 | import java.util.concurrent.*;
19 |
20 | /**
21 | * Tests the effect of threading on message transmission to lots of topics
22 | */
23 | @RunWith(Parameterized.class)
24 | public class ThreadCountSpeedTest {
25 | private static final double TIMEOUT = 30; // seconds
26 | private static final int BATCH_SIZE = 1000000; // The unit of measure for throughput is "batch size" per second
27 | // e.g.
Throughput = X "millions of messages" per sec
28 |
29 | @BeforeClass
30 | public static void openDataFile() throws FileNotFoundException {
31 | data = new PrintWriter(new File("thread-count.csv"));
32 | data.printf("threadCount, topicCount, i, t, rate, dt, batchRate\n");
33 | }
34 |
35 | @AfterClass
36 | public static void closeDataFile() {
37 | data.close();
38 | }
39 |
40 | private static PrintWriter data;
41 |
42 | @Parameterized.Parameters(name = "{index}: threads={0}, topics={1}")
43 | public static Iterable<Object[]> data() {
44 | return Arrays.asList(new Object[][]{
45 | {1, 50}, {2, 50}, {5, 50},
46 | {1, 100}, {2, 100}, {5, 100}
47 | });
48 |
49 | // return Arrays.asList(new Object[][]{
50 | // {1, 50}, {2, 50}, {5, 50}, {10, 50}, {15, 50}, {20, 50},
51 | // {1, 100}, {2, 100}, {5, 100}, {10, 100}, {15, 100}, {20, 100},
52 | // {1, 200}, {2, 200}, {5, 200}, {10, 200}, {15, 200}, {20, 200},
53 | // {1, 500}, {2, 500}, {5, 500}, {10, 500}, {15, 500}, {20, 500},
54 | // {1, 1000}, {2, 1000}, {5, 1000}, {10, 1000}, {15, 1000}, {20, 1000},
55 | // {1, 2000}, {2, 2000}, {5, 2000}, {10, 2000}, {15, 2000}, {20, 2000}
56 | // });
57 | }
58 |
59 | private int threadCount; // number of concurrent Kafka producers to run
60 | private int topicCount; // number of Kafka topics in our stream
61 | private int messageSize = 100; // size of each message sent into Kafka
62 |
63 | private static final ProducerRecord<String, byte[]> end = new ProducerRecord<>("end", null);
64 |
65 | public ThreadCountSpeedTest(int threadCount, int topicCount) {
66 | this.threadCount = threadCount;
67 | this.topicCount = topicCount;
68 | }
69 |
70 | private static class Sender extends Thread {
71 | private final KafkaProducer<String, byte[]> producer;
72 | private final BlockingQueue<ProducerRecord<String, byte[]>> queue;
73 |
74 | private Sender(KafkaProducer<String, byte[]> producer, BlockingQueue<ProducerRecord<String, byte[]>> queue) {
75 | this.producer = producer;
76 | this.queue = queue;
77 | }
78 |
79 | @Override
80 | public void run() {
81 | try {
82 | ProducerRecord<String, byte[]> rec = queue.take();
83 | while (rec != end) {
84 | // Here's where the sender thread sends a message.
85 | // Since we're not supplying a callback, the send will be done asynchronously.
86 | // The outgoing message will go to a local buffer which is not necessarily FIFO,
87 | // but sending messages out-of-order does not matter since we're just trying to
88 | // test throughput in this class.
89 | producer.send(rec);
90 | rec = queue.take();
91 | }
92 | } catch (InterruptedException e) {
93 | System.out.printf("%s: Interrupted\n", this.getName());
94 | }
95 | }
96 | }
97 |
98 | @Test
99 | public void testThreads() throws Exception {
100 | System.out.printf("threadCount = %d, topicCount = %d\n", threadCount, topicCount);
101 |
102 | // Create new topic names. Kafka will automatically create these topics if they don't already exist.
103 | List<String> ourTopics = Lists.newArrayList();
104 | for (int i = 0; i < topicCount; i++) {
105 | // Topic names will look like, "t-00874"
106 | ourTopics.add(String.format("t-%05d", i));
107 | }
108 |
109 | // Create a message containing random bytes. We'll send this message over and over again
110 | // in our performance test, below.
111 | Random rand = new Random();
112 | byte[] buf = new byte[messageSize];
113 | rand.nextBytes(buf);
114 | Tick message = new Tick(buf);
115 |
116 | // Create a pool of sender threads.
117 | ExecutorService pool = Executors.newFixedThreadPool(threadCount);
118 |
119 | // We need some way to give each sender messages to publish.
120 | // We'll do that via this list of queues.
121 | List<BlockingQueue<ProducerRecord<String, byte[]>>> queues = Lists.newArrayList();
122 | for (int i = 0; i < threadCount; i++) {
123 | // We use BlockingQueue to buffer messages for each sender.
124 | // We use this type not for concurrency reasons (although it is thread safe) but
125 | // rather because it provides an efficient way for senders to take messages if
126 | // they're available and for us to generate those messages (see below).
127 | BlockingQueue<ProducerRecord<String, byte[]>> q = new ArrayBlockingQueue<>(1000);
128 | queues.add(q);
129 | // spawn each thread with a reference to "q", which we'll add messages to later.
130 | pool.submit(new Sender(getProducer(), q));
131 | }
132 |
133 | double t0 = System.nanoTime() * 1e-9;
134 | double batchStart = 0;
135 |
136 | // -------- Generate Messages for each Sender --------
137 | // Generate BATCH_SIZE messages at a time and send each one to a random sender thread.
138 | // The batch size was defined above as containing 1 million messages.
139 | // We want to send as many messages as possible until a timeout has been reached.
140 | // The timeout was defined above as 30 seconds.
141 | // We'll break out of this loop when that timeout occurs.
142 | for (int i = 0; i >= 0 && i < Integer.MAX_VALUE; ) {
143 | // Send each message in our batch (of 1 million messages) to a random topic.
144 | for (int j = 0; j < BATCH_SIZE; j++) {
145 | // Get a random topic (but always assign it to the same sender thread)
146 | String topic = ourTopics.get(rand.nextInt(topicCount));
147 | // The topic hashcode works in the sense that equal topics always have equal hashes.
148 | // So this will ensure that a topic will always be populated by the same sender thread.
149 | // We want to load balance senders without using round robin, because with round robin
150 | // all senders would have to send to all topics, and we've found that it's much faster
151 | // to minimize the number of topics each kafka producer sends to.
152 | // By using this hashcode we can maintain affinity between Kafka topic and sender thread.
153 | int qid = topic.hashCode() % threadCount;
154 | if (qid < 0) {
155 | qid += threadCount;
156 | }
157 | try {
158 | // Put a message to be published in the queue belonging to the sender we just selected.
159 | // That sender will automatically send this message as soon as possible.
160 | queues.get(qid).put(new ProducerRecord<>(topic, message.getData()));
161 | } catch (Exception e) {
162 | // BlockingQueue might throw an IllegalStateException if the queue fills up.
163 | e.printStackTrace();
164 | }
165 | }
166 | i += BATCH_SIZE;
167 | double t = System.nanoTime() * 1e-9 - t0;
168 | double dt = t - batchStart;
169 | batchStart = t;
170 | // i = total number of messages sent so far
171 | // t = total elapsed time
172 | // i/t = overall throughput (messages sent per second)
173 | // dt = elapsed time for this batch
174 | // BATCH_SIZE/dt = messages sent per second for this batch
175 | data.printf("%d,%d,%d,%.3f,%.1f,%.3f,%.1f\n", threadCount, topicCount, i, t, i / t, dt, BATCH_SIZE / dt);
176 | data.flush();
177 | if (t > TIMEOUT) {
178 | break;
179 | }
180 | }
181 | // We cleanly shut down each producer thread by sending the predefined "end" message,
182 | // then shut down the threads in the pool after giving them a few seconds to see that
183 | // end message.
184 | for (int i = 0; i < threadCount; i++) {
185 | queues.get(i).add(end);
186 | }
187 | pool.shutdown();
188 | pool.awaitTermination(10, TimeUnit.SECONDS);
189 | }
190 |
191 | KafkaProducer<String, byte[]> getProducer() throws IOException {
192 | Properties props = new Properties();
193 | props.load(Resources.getResource("producer.props").openStream());
194 | // Properties reference:
195 | // https://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html
196 | // props.put("batch.size", 16384);
197 | // props.put("linger.ms", 1);
198 | // props.put("buffer.memory", 33554432);
199 |
200 | return new KafkaProducer<>(props);
201 | }
202 | }
--------------------------------------------------------------------------------
/src/test/java/com/mapr/sample/TopicCountGridSearchTest.java:
--------------------------------------------------------------------------------
1 | package com.mapr.sample;
2 |
3 | import com.google.common.collect.Lists;
4 | import com.google.common.io.Resources;
5 | import org.apache.kafka.clients.producer.KafkaProducer;
6 | import org.apache.kafka.clients.producer.ProducerRecord;
7 | import org.junit.AfterClass;
8 | import org.junit.BeforeClass;
9 | import org.junit.Test;
10 | import org.junit.runner.RunWith;
11 | import org.junit.runners.Parameterized;
12 |
13 | import java.io.File;
14 | import java.io.FileNotFoundException;
15 | import java.io.IOException;
16 | import java.io.PrintWriter;
17 | import java.util.Arrays;
18 | import java.util.List;
19 | import java.util.Properties;
20 | import java.util.Random;
21 |
22 | /**
23 | * Performance tests intended to explore the effect of the number of output topics, buffer size,
24 | * threading, and so on.
25 | */
26 | @RunWith(Parameterized.class)
27 | public class TopicCountGridSearchTest {
28 |
29 | @Parameterized.Parameters(name = "{index}: batchSize={0}, topics={1}, messageSize={2}")
30 | public static Iterable<Object[]> data() {
31 | return Arrays.asList(new Object[][]{
32 | {0, 100, 100}, {0, 300, 100},
33 | {16384, 100, 100}, {16384, 300, 100}
34 | });
35 | }
36 |
37 | @BeforeClass
38 | public static void openDataFile() throws FileNotFoundException {
39 | data = new PrintWriter(new File("topic-count.csv"));
40 | data.printf("batchSize, topicCount, messageSize, i, t, rate, dt, batchRate\n");
41 | }
42 |
43 | @AfterClass
44 | public static void closeDataFile() {
45 | data.close();
46 | }
47 |
48 | private static PrintWriter data;
49 |
50 | private int batchSize;
51 | private int topicCount;
52 | private int messageSize;
53 |
54 | public TopicCountGridSearchTest(int batchSize, int topicCount, int messageSize) {
55 | this.batchSize = batchSize;
56 | this.topicCount = topicCount;
57 | this.messageSize = messageSize;
58 | }
59 |
60 | @Test
61 | public void testSpeed() throws IOException {
62 | System.out.printf("batchSize = %d, topicCount = %d\n", batchSize, topicCount);
63 |
64 | List<String> ourTopics = Lists.newArrayList();
65 | for (int i = 0; i < topicCount; i++) {
66 | ourTopics.add(String.format("t-%05d", i));
67 | }
68 | Random rand = new Random();
69 |
70 | byte[] buf = new byte[messageSize];
71 | rand.nextBytes(buf);
72 | Tick message = new Tick(buf);
73 |
74 | KafkaProducer<String, byte[]> producer = getProducer();
75 |
76 | double t0 = System.nanoTime() * 1e-9;
77 | double batchStart = 0;
78 | double timeout = 15;
79 |
80 | int batch = 500000;
81 |
82 | for (int i = 0; i < 1e8; ) {
83 | for (int j = 0; j < batch; j++) {
84 | String topic = ourTopics.get(rand.nextInt(topicCount));
85 | producer.send(new ProducerRecord<>(topic, message.getData()));
86 | }
87 |
double t = System.nanoTime() * 1e-9 - t0;
88 | double dt = t - batchStart;
89 | i += batch;
90 | batchStart = t;
91 | data.printf("%d,%d,%d,%d,%.3f,%.1f,%.3f,%.1f\n", batchSize, topicCount, messageSize, i, t, i / t, dt, batch / dt);
92 | data.flush();
93 | if (t > timeout) {
94 | break;
95 | }
96 | }
97 | }
98 |
99 | KafkaProducer<String, byte[]> getProducer() throws IOException {
100 | Properties p = new Properties();
101 | p.load(Resources.getResource("producer.props").openStream());
102 |
103 | if (batchSize > 0) {
104 | p.setProperty("batch.size", String.valueOf(batchSize));
105 | }
106 | return new KafkaProducer<>(p);
107 | }
108 | }
--------------------------------------------------------------------------------
/src/test/java/com/mapr/sample/TypeFormatSpeedTest.java:
--------------------------------------------------------------------------------
1 | package com.mapr.sample;
2 |
3 |
4 | /* DESCRIPTION:
5 | * This JUnit test compares how fast we can serialize data objects of various
6 | * formats. We simulate Kafka serialization by reading string data from a
7 | * file, casting it to a data record of a specific type (e.g. POJO,
8 | * JsonObject, or JSON annotated Byte Array), then writing the object back out
9 | * to a file. Essentially, we're simulating Kafka streams as file streams and
10 | * measuring how long it takes to convert a string data record to a Java object
11 | * that encapsulates the record's data fields.
12 | *
13 | * USAGE:
14 | * mvn -e -Dtest=TypeFormatSpeedTest test
15 | */
16 |
17 | import com.fasterxml.jackson.databind.ObjectMapper;
18 | import com.google.common.base.Charsets;
19 | import com.google.common.io.Resources;
20 | import org.junit.Test;
21 |
22 | import java.io.*;
23 | import java.text.ParseException;
24 | import java.util.LinkedList;
25 | import java.util.List;
26 |
27 | import static org.junit.Assert.assertEquals;
28 | import org.json.simple.JSONObject;
29 | import org.json.simple.JSONArray;
30 |
31 | public class TypeFormatSpeedTest {
32 |
33 | public static final double N = 1e6;
34 |
35 | @Test
36 | public void testJsonSpeed() throws Exception {
37 |
38 | List<String> data = Resources.readLines(Resources.getResource("sample-tick-01.txt"), Charsets.ISO_8859_1);
39 |
40 | double t0 = System.nanoTime() * 1e-9;
41 | File tempFile = File.createTempFile("foo", "data");
42 | tempFile.deleteOnExit();
43 | try (ObjectOutputStream out = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(tempFile), 1_000_000))) {
44 | for (int i = 0; i < N; i++) {
45 | int j = i % data.size();
46 | JSONObject tick = parse_json(data.get(j));
47 | out.writeObject(tick);
48 | }
49 | }
50 | double t = System.nanoTime() * 1e-9 - t0;
51 | System.out.printf("[testJsonSpeed] t = %.3f us, %.2f records/s\n", t / N * 1e6, N / t);
52 | }
53 |
54 | private static JSONObject parse_json(String record) throws ParseException {
55 | // TODO: handle corrupted messages or messages with missing fields gracefully
56 | if (record.length() < 71) {
57 | throw new ParseException("Expected line to be at least 71 characters, but got " + record.length(), record.length());
58 | }
59 |
60 | JSONObject trade_info = new JSONObject();
61 | trade_info.put("date", record.substring(0, 9));
62 | trade_info.put("exchange", record.substring(9, 10));
63 | trade_info.put("symbol root", record.substring(10, 16).trim());
64 | trade_info.put("symbol suffix", record.substring(16, 26).trim());
65 | trade_info.put("saleCondition", record.substring(26, 30).trim());
66 | trade_info.put("tradeVolume", record.substring(30, 39));
67 | trade_info.put("tradePrice", record.substring(39, 46) + "." + record.substring(46, 50));
68 | trade_info.put("tradeStopStockIndicator", record.substring(50, 51));
69 | trade_info.put("tradeCorrectionIndicator", record.substring(51, 53));
70 | trade_info.put("tradeSequenceNumber", record.substring(53, 69));
71 | trade_info.put("tradeSource", record.substring(69, 70));
72 | trade_info.put("tradeReportingFacility", record.substring(70, 71));
73 | if (record.length() >= 74) {
74 | trade_info.put("sender", record.substring(71, 75));
75 |
76 | JSONArray receiver_list = new JSONArray();
77 | int i = 0;
78 | while (record.length() >= 78 + i) {
79 | receiver_list.add(record.substring(75 + i, 79 + i));
80 | i += 4;
81 | }
82 | trade_info.put("receivers", receiver_list);
83 | }
84 | return trade_info;
85 | }
86 |
87 | @Test
88 | public void testPojoSpeed() throws Exception {
89 | List<String> data = Resources.readLines(Resources.getResource("sample-tick-01.txt"), Charsets.ISO_8859_1);
90 |
91 | double t0 = System.nanoTime() * 1e-9;
92 | File tempFile = File.createTempFile("foo", "data");
93 | tempFile.deleteOnExit();
94 | try (ObjectOutputStream out = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(tempFile), 1_000_000))) {
95 | for (int i = 0; i < N; i++) {
96 | int j = i % data.size();
97 | TickPojo tick = parse_pojo(data.get(j));
98 | out.writeObject(tick);
99 | }
100 | }
101 | double t = System.nanoTime() * 1e-9 - t0;
102 | System.out.printf("[testPojoSpeed] t = %.3f us, %.2f records/s\n", t / N * 1e6, N / t);
103 | }
104 |
105 | private static TickPojo parse_pojo(String record) throws ParseException {
106 | // TODO: handle corrupted messages or messages with missing fields gracefully
107 | if (record.length() < 71) {
108 | throw new ParseException("Expected line to be at least 71 characters, but got " + record.length(), record.length());
109 | }
110 |
111 | TickPojo trade_info = new TickPojo();
112 | trade_info.setDate(record.substring(0, 9));
113 | trade_info.setExchange(record.substring(9, 10));
114 | trade_info.setSymbolroot(record.substring(10, 16).trim());
115 | trade_info.setSymbolsuffix(record.substring(16, 26).trim());
116 | trade_info.setSaleCondition(record.substring(26, 30).trim());
117 | trade_info.setTradeVolume(record.substring(30, 39));
118 | trade_info.setTradePrice(record.substring(39, 46) + "." + record.substring(46, 50));
119 | trade_info.setTradeStopStockIndicator(record.substring(50, 51));
120 | trade_info.setTradeCorrectionIndicator(record.substring(51, 53));
121 | trade_info.setTradeSequenceNumber(record.substring(53, 69));
122 | trade_info.setTradeSource(record.substring(69, 70));
123 | trade_info.setTradeReportingFacility(record.substring(70, 71));
124 | if (record.length() >= 74) {
125 | trade_info.setSender(record.substring(71, 75));
126 |
127 | List<String> receiver_list = new LinkedList<>();
128 | int i = 0;
129 | while (record.length() >= 78 + i) {
130 | receiver_list.add(record.substring(75 + i, 79 + i));
131 | i += 4;
132 | }
133 | trade_info.setReceivers(receiver_list.toArray(new String[receiver_list.size()]));
134 | }
135 | return trade_info;
136 | }
137 |
138 | @Test
139 | public void testByteSpeed() throws Exception {
140 | List<String> data = Resources.readLines(Resources.getResource("sample-tick-01.txt"), Charsets.ISO_8859_1);
141 |
142 | double t0 = System.nanoTime() * 1e-9;
143 | File tempFile = File.createTempFile("foo", "data");
144 | tempFile.deleteOnExit();
145 | try (ObjectOutputStream out = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(tempFile), 1_000_000))) {
146 | for (int i = 0; i < N; i++) {
147 | int j = i % data.size();
148 | Tick tick = new Tick(data.get(j));
149 | out.writeObject(tick);
150 | }
151 | }
152 | double t = System.nanoTime() * 1e-9 - t0;
153 | System.out.printf("[testByteSpeed] t = %.3f us, %.2f records/s\n", t / N * 1e6, N / t);
154 | }
155 | }
--------------------------------------------------------------------------------
/src/test/resources/producer.props:
--------------------------------------------------------------------------------
1 | bootstrap.servers=localhost:9092
2 | key.serializer=org.apache.kafka.common.serialization.StringSerializer
3 | value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
4 | block.on.buffer.full=true
5 | default.replication.factor=3
6 | batch.size=16384
7 | acks=all
--------------------------------------------------------------------------------
/src/test/resources/sample-tick-01.txt:
--------------------------------------------------------------------------------
1 | 080845201DAA T 00000082500000105600N0000000070800001CT100110051009
2 | 080845201DAA T 00000026900000106000N0000000070800002CT10001007
3 | 080845201PAA T 00000001500000089700N0000000070800003C 100110091006
4 | 080845201DAA T 00000039700000105300N0000000070800004CT1002100610071006
5 | 080845201PAA T 00000001200000091400N0000000070800005C 10001007
6 | 080845201DAA T 00000151100000089800N0000000070800006CT100310091007
7 | 080845201DAA T 00000009500000104400N0000000070800007CT100410061007
8 | 080845201PAA T 00000001400000087600N0000000070800008C 1000100710071008
9 | 080845201DAA T 00000140700000087700N0000000070800009CT1000100910051009
10 | 080845201DAA T 00000008200000091500N0000000070800010CT1000100710071007
11 | 080845201PAA T 00000001400000089000N0000000070800011C 10011006
12 | 080845201DAA T 00000019000000103200N0000000070800012CT1002100710051006
13 | 080845201PAA T 00000005000000103900N0000000070800013C 10041005
14 | 080845201DAA T 00000000400000088400N0000000070800014CT1001100710051006
15 |
--------------------------------------------------------------------------------
/target/classes/com/mapr/demo/finserv/Tick.class:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/target/classes/com/mapr/demo/finserv/Tick.class -------------------------------------------------------------------------------- /target/test-classes/com/mapr/demo/finserv/ThreadCountSpeedTest$1.class: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/target/test-classes/com/mapr/demo/finserv/ThreadCountSpeedTest$1.class -------------------------------------------------------------------------------- /target/test-classes/com/mapr/demo/finserv/ThreadCountSpeedTest$Sender.class: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/target/test-classes/com/mapr/demo/finserv/ThreadCountSpeedTest$Sender.class -------------------------------------------------------------------------------- /target/test-classes/com/mapr/demo/finserv/ThreadCountSpeedTest.class: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/target/test-classes/com/mapr/demo/finserv/ThreadCountSpeedTest.class -------------------------------------------------------------------------------- /target/test-classes/com/mapr/demo/finserv/Tick2Test.class: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/target/test-classes/com/mapr/demo/finserv/Tick2Test.class -------------------------------------------------------------------------------- /target/test-classes/com/mapr/demo/finserv/TopicCountGridSearchTest.class: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/target/test-classes/com/mapr/demo/finserv/TopicCountGridSearchTest.class -------------------------------------------------------------------------------- /target/test-classes/offset_producer.props: -------------------------------------------------------------------------------- 1 | # batch.size=16384 2 | key.serializer=org.apache.kafka.common.serialization.StringSerializer 3 | value.serializer=org.apache.kafka.common.serialization.StringSerializer 4 | block.on.buffer.full=true -------------------------------------------------------------------------------- /target/test-classes/producer.props: -------------------------------------------------------------------------------- 1 | bootstrap.servers=ubuntu:9092 2 | key.serializer=org.apache.kafka.common.serialization.StringSerializer 3 | value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer 4 | block.on.buffer.full=true 5 | default.replication.factor=3 6 | acks=0 7 | -------------------------------------------------------------------------------- /target/test-classes/sample-tick-01.txt: -------------------------------------------------------------------------------- 1 | 080845201DAA T 00000082500000105600N0000000070800001CT100110051009 2 | 080845201DAA T 00000026900000106000N0000000070800002CT10001007 3 | 080845201PAA T 00000001500000089700N0000000070800003C 100110091006 4 | 080845201DAA T 00000039700000105300N0000000070800004CT1002100610071006 5 | 080845201PAA T 
00000001200000091400N0000000070800005C 10001007 6 | 080845201DAA T 00000151100000089800N0000000070800006CT100310091007 7 | 080845201DAA T 00000009500000104400N0000000070800007CT100410061007 8 | 080845201PAA T 00000001400000087600N0000000070800008C 1000100710071008 9 | 080845201DAA T 00000140700000087700N0000000070800009CT1000100910051009 10 | 080845201DAA T 00000008200000091500N0000000070800010CT1000100710071007 11 | 080845201PAA T 00000001400000089000N0000000070800011C 10011006 12 | 080845201DAA T 00000019000000103200N0000000070800012CT1002100710051006 13 | 080845201PAA T 00000005000000103900N0000000070800013C 10041005 14 | 080845201DAA T 00000000400000088400N0000000070800014CT1001100710051006 15 | -------------------------------------------------------------------------------- /thread.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iandow/kafka_junit_tests/5071bd5758cfcf02a9bee0fd78d184ff08b9ac9c/thread.png --------------------------------------------------------------------------------