├── README.md ├── code ├── cmedecoder │ ├── override.q │ ├── parse.q │ ├── schema.q │ ├── tall_book.q │ ├── util.q │ ├── wide_book.q │ └── write.q ├── common │ └── logging.q ├── msgs │ ├── incremental_refresh.q │ ├── security_definition.q │ └── security_status.q └── processes │ └── cmedecoder.q ├── config └── settings │ └── cmedecoder.q ├── sample └── sample_20170101.log ├── setenv.sh └── spec ├── FIX44.xml ├── cust_enums.csv ├── cust_fields.csv └── xml2json.py /README.md: -------------------------------------------------------------------------------- 1 | # CME Datamine 2 | 3 | ## Parsing Historical Data from the Chicago Mercantile Exchange 4 | 5 | The Chicago Mercantile Exchange (CME) is an American financial and commodity derivative exchange, offering a highly liquid market for futures and options on currencies, interest rates, indices and commodities. It is the largest exchange of futures and options in the world, and as such contains a wealth of useful market data. 6 | 7 | The CME makes available for purchase historical and realtime market data in the form of FIX/FAST format messages from its CME Globex trading platform. These messages are used to track the level-aggregated status of orders (i.e. level 2 data), trades executed, book updates, and information on individual securities and security groups. 8 | 9 | The Globex data available historically can be of use in transaction cost analysis of previously executed trades, inter-market comparison of derivative prices and liquidity, feeding algorithmic trading systems, or determination of the market risk of holdings. However, first it must be converted from the raw FIX messages provided by the CME Globex Market Data Platform, which may contain complex level 2 market information in multiple sections of a single message, into a more useful, manageable format. 
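Concretely, each raw message is a single line of `tag=value` pairs delimited by the SOH byte (`\x01`, rendered as `^A` in the sample messages in this document). As a quick, language-agnostic illustration of that format (not part of this codebase, which performs the equivalent split in q with `"I=\001"0:x`), a message line can be turned into a tag-to-value dictionary:

```python
# Illustrative only: split one raw FIX line into a tag -> value dict.
SOH = "\x01"

def parse_fix(line: str) -> dict:
    # strip any trailing delimiter, split into fields, then split each on "="
    pairs = (field.split("=", 1) for field in line.strip(SOH).split(SOH))
    return {int(tag): value for tag, value in pairs}

msg = "35=X\x0149=SAMPLE\x01268=1\x01279=0\x01270=10270.0"
print(parse_fix(msg)[35])   # prints: X
```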
10 | 11 | This codebase presents a method for processing historical data from the CME, building and maintaining an order book, and writing this data to disk in a variety of approaches suited to query efficiency, space efficiency, or a balance between the two. Our example data set is FX futures contracts of 11 major currency pairs, but this method should equally apply to all historical CME market data. 12 | 13 | ## Requirements 14 | 15 | 16 | Basic knowledge of the q programming language and Linux commands is assumed. 17 | This system has been tested on KDB v3.4 (release date 2016.12.23) on an x86_64 system running Ubuntu 14.04. 18 | 19 | ### Software Requirements 20 | 21 | 22 | - KDB v3.4+ 23 | - Python 24 | - zcat (optional, needed for gzipped logfiles) 25 | 26 | ### Data Requirements 27 | 28 | 29 | This software is designed to process CME MDP 3.0 FIX historical data (FIX Version 4.4); this data is not provided by this software and can be obtained from [CME Group](https://datamine.cmegroup.com/) 30 | 31 | A sample file is available [here](https://github.com/jonathonmcmurray/cme/blob/master/sample/sample.log). 32 | 33 | 34 | ## Capabilities 35 | 36 | 37 | ### The CME parser is designed to complete the following tasks: 38 | 39 | 40 | - Parse CME MDP 3.0 FIX messages from both extracted and gzipped files on disk. 41 | - Store these messages as raw Quote, Trade and Security Definition tables (with the raw CME FIX data fields). 42 | - Build an order book from the Quote table with as many levels as the Security Definition states. 43 | - Derive more traditional user-friendly Quote and Trade tables from the above tables. 44 | - Save these tables to disk. 45 | 46 | 47 | ### Features of the CME parser: 48 | 49 | 50 | - The parser is designed to process files as quickly as possible so that the data is available soon after it is published. 51 | - Reads data in from the file in chunks to minimize memory usage while parsing data.
52 | - The codebase is reasonably flexible to small changes in the CME specification, in case of updates to the CME MDP format. 53 | - Allows for flexibility in how the data is stored on disk. 54 | - The Tall (narrow) Book table takes up less space on disk, but has less granularity in the data. 55 | - The Wide Book table takes up more space on disk, but can retain more data about the order and state of the book at a given time. 56 | - Optionally store raw CME messages as quote, trade and security definition tables on disk. 57 | - The process can be started easily from the command line and passed paths of files in either explicit or wildcard format. 58 | - The process can be monitored by reading log files split into out (information messages), err (errors) and usage (inter-process communication). 59 | 60 | ### Limitations: 61 | 62 | 63 | - The parser can only process raw and gzipped CME MDP 3.0 FIX log files; if the data is compressed in a different format, the files will need to be manually decompressed. 64 | - The parser cannot process real-time data; it is designed to parse CME FIX MDP 3.0 historical datamine logfiles. If you would like information on how to parse, store and query real-time CME data, please contact [AquaQ](mailto:info@aquaq.co.uk). 65 | 66 | ## Getting Started: 67 | 68 | - The following bash commands give directions on downloading TorQ and our FIX message package. The FIX package will be placed on top of the base TorQ package. 69 | 70 | 1. Make a directory to clone the git repos into, and a directory to deploy the system to. 71 | 72 | ~/cme$ mkdir git deploy 73 | ~/cme$ ls 74 | deploy git 75 | 76 | 2. Change to the git directory and clone the FIX parser and TorQ repositories. 77 | 78 | ~/cme$ cd git 79 | ~/cme/git$ git clone https://github.com/AquaQAnalytics/TorQ-CME.git 80 | ~/cme/git$ git clone https://github.com/AquaQAnalytics/TorQ.git 81 | ~/cme/git$ ls 82 | TorQ-CME TorQ 83 | 84 | 3. Change to the deploy directory and copy the contents of TorQ into it.
85 | 86 | ~/cme/git$ cd ../deploy/ 87 | ~/cme/deploy$ cp -r ../git/TorQ/* ./ 88 | 89 | 4. Copy the contents of the FIX parser repo into the same directory, allowing overwrites. 90 | 91 | ~/cme/deploy$ cp -r ../git/TorQ-CME/* ./ 92 | 93 | You should have a combination of each directory's contents included in the deploy directory: 94 | 95 | ~/cme/deploy$ ls 96 | aquaq-torq-brochure.pdf code config decoder.q docs html lib LICENSE logs mkdocs.yml README.md sample setenv.sh spec tests torq.q 97 | 98 | 99 | The processing of files is called in a similar manner to other TorQ processes (note environment variables must be set with setenv.sh below): 100 | ``` 101 | ~/cme/deploy$ . setenv.sh 102 | ~/cme/deploy$ cmedecoder -files sample/sample_20170101.log 103 | ``` 104 | 105 | `cmedecoder` is an alias defined in setenv.sh for convenience. The expanded version of the same command is shown below: 106 | 107 | ``` 108 | ~/cme/deploy$ q torq.q -load code/processes/cmedecoder.q -proctype cmedecoder -procname cmedecoder -files sample/sample_20170101.log 109 | ``` 110 | The above will process the sample logfile provided and save the data to `hdb`. 111 | To load the hdb, simply run `q hdb` from your TorQ directory. 112 | 113 | ## Column Override 114 | Different CME datasets will contain different information and different tags. If you are loading a file that is missing an expected column, you can use the code/cmedecoder/override.q script to fix this. This script allows users a place to add custom fields before the entries are inserted into the associated tables. 115 | 116 | Add the missing column to the missingfields function, along with the value you would like it to be populated with.
117 | ``` 118 | missingfields:{[x]if[not `TransactTime in key x;x[`TransactTime]:x[`SendingTime]]; 119 | if[not `MatchEventIndicator in key x;x[`MatchEventIndicator]:0x0]; 120 | x}; 121 | ``` 122 | 123 | ## Data Handling 124 | The FIX message categories within the CME needed to maintain market information are "d" and "X" - security definition and market data incremental refresh, respectively. The security information includes the standard FIX header, and then identifies the instrument and its features, including those used in maintaining the book (MarketDepth:264, used to maintain book depth, and DisplayFactor:9787, used to convert the FIX message prices to real market values). These messages may contain multiple repeated blocks, e.g. for multiple underlying securities in spread instruments, which must be accounted for in processing. An example definition message is shown below. 125 | ``` 126 | 1128=9^A9=511^A35=d^A49=SAMPLE^A75=20161009^A34=1281^A52=20030124045450030397440^A5799=00000000^A980=A^A779=20161009160533273752621^A1180=314^A1300=62^A55=6SH0^A48=24929^A22=8^A200=202003^A1151=6S^A6937=6S^A167=FUT^A461=FFCXSX^A9779=N^A462=4^A207=XCME^A15=USD^A1142=F^A562=1^A1140=9999^A969=1.0^A1146=0.0^A9787=1.0E-4^A1141=1^A1022=GBX^A264=10^A864=2^A865=5^A1145=20150316-21:49:00.000000000^A865=7^A1145=20200316-14:16:00.000000000^A870=1^A871=24^A872=00000000000001000010000000001111^A996=CHF^A1147=125000^A1149=11501.0^A1148=10701.0^A1143=60.0^A1150=11101.0^A731=00000011^A10=240^A60=20030124045450030397440 127 | ``` 128 | 129 | A market data incremental refresh message contains information on quotes and trades executed, including multiple repeated blocks (NoMDEntries: 268) which contain the market actions resulting in an event, e.g. multiple book level updates to account for a trade eliminating multiple orders. 
The information in the repeated blocks is separated out in this case and the surrounding information (time, security, etc.) duplicated, while keeping MsgSeqNum:34 and RptSeq:83, which allow tracking of event ordering. Each full message can then be pushed to the appropriate location based on the MDEntryType:269, which indicates the entry type (0 - bid; 1 - offer; 2 - trade; ... ), and the order book can be maintained based on the changes indicated in this message type. An example market data incremental refresh message is shown below. 130 | 131 | ``` 132 | 1128=9^A9=180^A35=X^A49=SAMPLE^A75=20161011^A34=2344^A52=20010525125902582648128^A60=20010525125902582648128^A5799=10000100^A268=1^A279=0^A269=1^A48=173595^A55=6SZ6^A83=354045^A270=10270.0^A271=3^A346=1^A1023=3^A10=086 133 | ``` 134 | ## Case Study 135 | 136 | To create the book, define four schemas to store the data parsed from the raw FIX message format: 137 | 138 | 1. Market Data Security Status (msgType = f) 139 | ``` 140 | q)meta rawstatus 141 | c | t f a 142 | ---------------------| ----- 143 | MsgSeqNum | i 144 | TransactTime | p 145 | TradingDate | d 146 | MatchEventIndicator | i 147 | SecurityGroup | s 148 | SecurityTradingStatus| s 149 | HaltReasonChar | s 150 | SecurityTradingEvent | s 151 | ``` 152 | 153 | 2. Quote (Market Data Incremental Refresh where MDEntryType = 0/1) 154 | ``` 155 | q)meta rawquote 156 | c | t f a 157 | -------------------| ----- 158 | date | d 159 | Symbol | s p 160 | TradeDate | d 161 | MsgSeqNum | i 162 | TransactTime | p 163 | MatchEventIndicator| i 164 | MDUpdateAction | s 165 | MDEntryType | s 166 | SecurityID | i 167 | RptSeq | i 168 | MDEntryPx | f 169 | MDEntrySize | f 170 | NumberOfOrders | i 171 | MDPriceLevel | i 172 | ``` 173 | 174 | 3.
Trade (Market Data Incremental Refresh where MDEntryType = 2) 175 | ``` 176 | q)meta rawtrade 177 | c | t f a 178 | -------------------| ----- 179 | date | d 180 | Symbol | s p 181 | TradeDate | d 182 | MsgSeqNum | i 183 | TransactTime | p 184 | MatchEventIndicator| i 185 | MDUpdateAction | s 186 | SecurityID | i 187 | RptSeq | i 188 | MDEntryPx | f 189 | MDEntrySize | f 190 | NumberOfOrders | i 191 | AggressorSide | s 192 | ``` 193 | 194 | 4. Security Definition (msgType = d) 195 | ``` 196 | q)meta rawdefinitions 197 | c | t f a 198 | --------------------| ----- 199 | TradeDate | d 200 | LastUpdateTime | p 201 | MatchEventIndicator | i 202 | SecurityUpdateAction| s 203 | MarketSegmentID | i 204 | Symbol | s 205 | SecurityID | i 206 | MaturityMonthYear | m 207 | SecurityGroup | s 208 | SecurityType | s 209 | UnderlyingProduct | i 210 | SecurityExchange | s 211 | Currency | s 212 | MarketDepth | i 213 | DisplayFactor | f 214 | ``` 215 | 216 | Once data has been parsed and placed in the appropriate tables it is possible to generate a book of quotes and trades. Depending on user requirements, there are scripts to build both a wide book and a tall book. 217 | 218 | The wide book format stores a nested list of prices and sizes up to the maximum market depth at each point in time for the data. The user may then query the data over a time range or an exact time to generate a view of the book at that point. 219 | 220 | In contrast, the tall book stores only what has changed on each update for the appropriate side. The table is thus smaller, since only the level which has been changed (and those below in the case of a "NEW" or "DELETE" MDUpdateAction) on a single side must be changed with each message. 
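To make the tall-book update semantics concrete, here is a minimal Python sketch (hypothetical, not the repo's q implementation) of one side of a book held as a best-first list of (price, size) levels, where a "NEW" shifts lower levels down and a "DELETE" shifts them back up:

```python
# Hypothetical sketch of one side of a tall book.
# levels: list of (price, size) tuples ordered best-first; level is 1-based.

def apply_update(levels, action, level, price=None, size=None):
    i = level - 1
    if action == "NEW":            # insert new level; lower levels shift down
        levels.insert(i, (price, size))
    elif action == "CHANGE":       # update an existing level in place
        levels[i] = (price, size)
    elif action == "DELETE":       # remove level; lower levels shift up
        levels.pop(i)
    return levels

book = [(10215.0, 1), (10214.0, 23)]
apply_update(book, "NEW", 1, 10216.0, 5)   # new best level pushes others down
apply_update(book, "DELETE", 3)            # old level 2 (now level 3) removed
```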
A sample of the tall book, showing a single entry at level 3, and an appropriate query to return a book for a single sym at a certain time are shown below: 221 | 222 | ``` 223 | ~/deploy$ q torq.q -load code/processes/cmedecoder.q -proctype cmedecoder -procname cmedecoder -files sample/sample_20170101.log -debug -tallbook 224 | ... 225 | ... 226 | ... 227 | q)book 228 | date time sym side level orders size price msgseq rptseq matchevent 229 | ---------------------------------------------------------------------------------------------------- 230 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 3 1 3 10270 2344 354045 132 231 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 4 2344 354045 132 232 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 5 2344 354045 132 233 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 6 2344 354045 132 234 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 7 2344 354045 132 235 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 8 2344 354045 132 236 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 9 2344 354045 132 237 | 2017.01.01 2017.01.01D01:10:58.905415920 6SZ6 OFFER 10 2344 354045 132 238 | ..
239 | 240 | q)select by side,level from book where date=2017.01.01, time<=07:05:00.0, sym=`6SZ6 241 | side level| date time sym orders size price msgseq rptseq matchevent 242 | -----------| ---------------------------------------------------------------------------------------- 243 | BID 1 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 1 1 10215 2060 358920 132 244 | BID 2 | 2017.01.01 2017.01.01D00:14:21.855551221 6SZ6 13 23 10214 2611 358963 132 245 | BID 3 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 11 21 10213 2060 358920 132 246 | BID 4 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 11 21 10212 2060 358920 132 247 | BID 5 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 11 21 10211 2060 358920 132 248 | BID 6 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 12 35 10210 2060 358920 132 249 | BID 7 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 8 44 10209 2060 358920 132 250 | BID 8 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 11 36 10208 2060 358920 132 251 | BID 9 | 2017.01.01 2017.01.01D00:31:33.384676725 6SZ6 9 26 10207 2060 358920 132 252 | BID 10 | 2017.01.01 2017.01.01D03:32:25.585326361 6SZ6 5 21 10205 3737 358923 132 253 | .. 254 | ``` 255 | 256 | Similarly, a sample of the wide book is shown below 257 | 258 | ``` 259 | ~/deploy$ q torq.q -load decoder.q -proctype decoder -procname decoder -files sample/sample_20170101.log -debug 260 | ... 261 | ... 262 | ... 263 | ~/deploy$ q hdb 264 | q)10 sublist select from book 265 | date sym time bprice bsize aprice .. 266 | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.. 267 | 2017.01.01 6SH7 2017.01.01D12:47:06.545756971 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 3 5 6 4 4 4 4 3 36 11 1.0272 1.0273 1.0274 1.0275 1.0276 1.0277 1... 
268 | 2017.01.01 6SH7 2017.01.01D04:07:46.761071942 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 1.0258 5 6 4 37 4 3 11 36 11 53 1.0272 1.0273 1.0274 1.0275 1.0276 1.0277 1... 269 | 2017.01.01 6SH7 2017.01.01D01:18:39.077345929 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 6 6 4 4 37 3 3 11 36 11 1.0274 1.0275 1.0276 1.0277 1.0278 1.0279 1... 270 | 2017.01.01 6SH7 2017.01.01D00:07:02.427603074 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 5 5 3 3 3 11 3 11 3 3 1.0272 1.0274 1.0275 1.0276 1.0277 1.0278 1... 271 | 2017.01.01 6SH7 2017.01.01D04:24:36.588750868 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 5 6 4 4 37 4 3 11 36 11 1.0272 1.0273 1.0274 1.0275 1.0276 1.0277 1... 272 | 2017.01.01 6SH7 2017.01.01D09:32:24.001122966 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 3 5 6 4 4 4 4 3 36 11 1.0272 1.0273 1.0274 1.0275 1.0276 1.0277 1... 273 | 2017.01.01 6SH7 2017.01.01D10:32:35.416123137 1.0269 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 3 3 5 5 3 3 11 3 11 3 1.0274 1.0275 1.0276 1.0277 1.0278 1.0279 1... 274 | 2017.01.01 6SH7 2017.01.01D11:05:35.588591217 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 1.0258 1.0257 5 5 3 3 3 11 3 11 3 3 1.0272 1.0273 1.0274 1.0275 1.0276 1.0277 1... 275 | 2017.01.01 6SH7 2017.01.01D05:57:25.396454768 1.0268 1.0267 1.0266 1.0265 1.0264 1.0263 1.0262 1.0261 1.026 1.0259 5 6 4 4 37 12 3 11 36 3 1.0274 1.0275 1.0276 1.0277 1.0278 1.0279 1... 276 | ``` 277 | 278 | It should be noted that building an order book time series from raw FIX messages is open to interpretation. The manner in which events are aggregated ( by Sequence Number ) should be investigated. 
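As one illustration of that interpretation question, a simple policy is to treat all updates sharing a MsgSeqNum as a single match event and keep only the final state each event produces. A hypothetical Python sketch of that aggregation (the field names here are illustrative, not the repo's schema):

```python
# One possible aggregation policy (an assumption, not this repo's definitive
# rule): collapse all updates from a single MsgSeqNum into one book snapshot,
# keeping only the last state produced by that sequence number.
from itertools import groupby

updates = [
    {"msgseq": 2344, "level": 3, "price": 10270.0, "size": 3},
    {"msgseq": 2344, "level": 3, "price": 10270.0, "size": 2},  # supersedes above
    {"msgseq": 2345, "level": 1, "price": 10272.0, "size": 7},
]

# updates arrive ordered by msgseq, so consecutive grouping is sufficient
snapshots = [list(g)[-1] for _, g in groupby(updates, key=lambda u: u["msgseq"])]
print([s["msgseq"] for s in snapshots])   # prints: [2344, 2345]
```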
279 | 280 | https://github.com/AquaQAnalytics/TorQ 281 | -------------------------------------------------------------------------------- /code/cmedecoder/override.q: -------------------------------------------------------------------------------- 1 | // the following code allows users a place to add custom fields before the entries are inserted into the associated tables 2 | \d .cme 3 | 4 | / Simple Override set up to allow custom fields to be added. 5 | / This file should be customized to users needs. 6 | overridedict:enlist[`]!enlist[{x}]; // empty dict for override function (key: msgtype) 7 | override:{[msg]overridedict[msg`MsgType][msg]}; // lookup override function based on msgtype & apply 8 | 9 | / handle missing fields for incr refresh 10 | missingfields:{[x] 11 | if[not `TransactTime in key x;x[`TransactTime]:x[`SendingTime]]; // if no TransactTime, use SendingTime 12 | if[not `MatchEventIndicator in key x;x[`MatchEventIndicator]:0x0]; // if no MEI, use 0x0 13 | :x; // return updated msg 14 | }; 15 | 16 | overridedict[`MARKET_DATA_INCREMENTAL_REFRESH]:missingfields; // add missingfields function as override for incr refresh msgs 17 | -------------------------------------------------------------------------------- /code/cmedecoder/parse.q: -------------------------------------------------------------------------------- 1 | // parse FIX spec into tables for use in processing 2 | 3 | / initialise tables from spec 4 | .parse.init:{[] 5 | system"cd ",getenv[`TORQHOME],"/spec"; // cd into spec directory for reading in files etc 6 | fix:.j.k raze system"python xml2json.py -t xml2json FIX44.xml"; // convert XML to JSON with python script and parse into variable fix 7 | jn:`$("@number";"@name";"@type";"@msgtype";"@enum";"@description";"value"); // list of JSON field names 8 | 9 | / fields 10 | .fix.fields:flip "ISS"$'flip `number`name`fixtype xcol jn[0 1 2]#/:fix[`fix][`fields][`field]; // generate table of FIX fields with tag number (@number), field name (@name) 
and data type (@type) 11 | .fix.fields:(`number xkey .fix.fields) uj `number xkey ("ISS";enlist ",")0:`:cust_fields.csv; // manually add custom CME fields 12 | update number:`u#number from `.fix.fields; // apply `u attribute to tag number, for speed up 13 | 14 | / enumerations 15 | c:flip[c] where 00) 16 | .fix.enums:flip `name`enums`values!flip (`$c[;0]),'.[c;(::;1;jn[4 5])]; // from each enum, extract @enum and @description, join to field name cast to sym 17 | upd:select name," "vs'enums," "vs'values from ("S**";enlist ",")0:`:cust_enums.csv; // read custom enumerations from csv, split enums & values 18 | .fix.enums:raze@''`name xgroup .fix.enums,upd; // group together records based on name, raze together to combine upd with .fix.enums 19 | update name:`u#name from `.fix.enums; // apply `u attribute to name, for speed up 20 | 21 | system"cd ",getenv[`TORQHOME]; // cd back to top level directory 22 | 23 | / dictionary of functions to parse data types 24 | .fix.typefuncs:(!/) flip 2 cut // define dictionary in convenient list format below 25 | ( 26 | `LENGTH; {"I"$x}; 27 | `STRING; {x}; 28 | `SEQNUM; {"I"$x}; 29 | //`UTCTIMESTAMP; {("D"$8#x)+"T"$8_x}; 30 | `UTCTIMESTAMP; {"P"$((8#x),"D",8_x)}; 31 | `LOCALMKTDATE; {"D"$x}; 32 | `INT; {"I"$x}; 33 | `CHAR; {`$x}; 34 | `CURRENCY; {`$x}; 35 | `MONTHYEAR; {`month$"D"$(x,"01")}; 36 | `EXCHANGE; {`$x}; 37 | `QTY; {"F"$x}; 38 | `NUMINGROUP; {"I"$x}; 39 | `AMT; {"F"$x}; 40 | `FLOAT; {"F"$x}; 41 | `PRICE; {"F"$x}; 42 | `BOOLEANLIST; {`byte$$[0=lvl,side=sd) upsert (lvl;sd;px;sz)) where level > .raw.dfltlvl^exec last MarketDepth from .raw.definitions where Symbol = sym; 18 | action=`DELETE; 19 | update level-1 from (delete from state where level=lvl,side=sd) where level>lvl,side=sd; 20 | action=`DELETETHRU; 21 | delete from state where side=sd; 22 | /action=`DELETEFROM 23 | update level-lvl from (delete from state where level<=lvl,side=sd) where level>lvl,side=sd 24 | ]}\[([level:();side:()] 
price:();size:());MDUpdateAction;MDEntryPx;MDPriceLevel;MDEntrySize;MDEntryType;MatchEventIndicator;Symbol] 25 | by Symbol 26 | from update SecurityDesc^Symbol from t; 27 | 28 | / delete temporary book column 29 | // t:0!select by MsgSeqNum,Symbol from delete book from t; 30 | t:0!select by TransactTime,Symbol from delete book from t; 31 | / To not aggregate events in this case please comment line above and uncomment line below. 32 | / t:0!delete book from t; 33 | `..book upsert ?[t;();0b;.schema.qtfieldmaps] lj `sym xcol select underlying:first SecurityGroup by Symbol from .raw.definitions 34 | } 35 | -------------------------------------------------------------------------------- /code/cmedecoder/write.q: -------------------------------------------------------------------------------- 1 | // writing tables to disk 2 | 3 | / sort table by column & apply attribute to that column 4 | apply_attr:{[tbl;dt;c] 5 | c xasc dir:hsym `$"/" sv (dbdir;string dt;tbl); // sort table on disk by passed column 6 | @[dir; c; `p#] // apply `p attribute 7 | }; 8 | 9 | / write the data down partitioned on date with a `p# attribute on symcol 10 | write_partitioned:{[tbl;dt] 11 | c:first a where (a:cols tbl) like\: "*[Ss]ym*"; // find sym/Symbol column 12 | n:$[tbl like ".raw*";c xcols select from tbl where TradeDate=dt; // if raw table, date is TradeDate column 13 | c xcols select from tbl where date=dt]; // if processed, date is date column 14 | .lg.o[`endofday;"Saving ", string tbl]; 15 | tn:(string tbl) except "."; // name for saving = table name without "." 
16 | (hsym `$"/" sv (dbdir;string dt;tn;"")) upsert .Q.en[hsym `$dbdir] n; // enumerate and upsert, appending to existing partition if present 17 | apply_attr[tn;dt;c] // sort by sym/Symbol & apply `p attribute 18 | }; 19 | 20 | / write the data down splayed to a directory 21 | write_splay:{[tbl;dt] 22 | n:select from tbl; // select full table 23 | .lg.o[`endofday;"Saving ",string tbl]; 24 | tn:(string tbl) except "."; // name for saving = table name without "." 25 | (hsym `$"/" sv (dbdir;tn;"")) set .Q.en[hsym `$dbdir] n // enumerate and set, overwriting old version 26 | }; 27 | 28 | / call appropriate write function based on table name for each supplied date 29 | write_method:{[d;x] 30 | $[.schema.savetype[x]~`splay; // check save type, defined in code/cmedecoder/schema.q 31 | write_splay[x]'[d]; // write splayed table 32 | write_partitioned[x]'[d] // write partitioned table 33 | ] 34 | }; 35 | 36 | writedown:{ 37 | dbdir::getenv[`DBDIR]; // setting db directory pathways 38 | .lg.o[`writedown;"Writing to disk"]; 39 | x:((` sv' ``raw,/:tables[`.raw]),tables[]) except `heartbeat`logmsg`df; // get list of tables in the .raw & root namespace 40 | d:(union/) {exec distinct date from x} each `book`trade; // extract the date(s) from the book & trade tables 41 | write_method[d]'[x]; // write each table for each date 42 | .lg.o[`writedown;"Successfully saved to disk"]; 43 | } 44 | -------------------------------------------------------------------------------- /code/common/logging.q: -------------------------------------------------------------------------------- 1 | // overwrite torq logging to colour log messages 2 | \d .lg 3 | 4 | colours:(`ERROR`ERR`WRN`WARN`INF!("\033[1;31m";"\033[1;31m";"\033[0;33m";"\033[0;33m";"\033[0m")); 5 | / overrides .lg.format included in torq to add console colours for error and warn 6 | format:{[loglevel;proctype;proc;id;message]((colours loglevel), "|" sv (string .proc.cp[];string .z.h;string proctype;string proc;string loglevel;string
id;message)),"\033[0m"} 7 | -------------------------------------------------------------------------------- /code/msgs/incremental_refresh.q: -------------------------------------------------------------------------------- 1 | // functions for handling incremental refresh messages 2 | 3 | / header & cut keys for incremental refresh 4 | .fix.incr.headerkeys:`TradeDate`MsgSeqNum`SendingTime`TransactTime`MatchEventIndicator`NoMDEntries 5 | .fix.incr.cutkey:`MDUpdateAction 6 | 7 | \d .cme 8 | 9 | / process a single quote 10 | singlequote:{[msg] 11 | .raw.quote,:(cols .raw.quote)#(first each flip 0#.raw.quote),msg; // pull out relevant fields, fix types and column names, upsert to global quote table 12 | } 13 | 14 | / process a single trade 15 | singletrade:{[msg] 16 | .raw.trade,:(cols .raw.trade)#(first each flip 0#.raw.trade),msg; // pull out relevant fields, fix types and column names, upsert to global trade table 17 | } 18 | 19 | / dictionary of handlers for incremental message MDEntryTypes 20 | .fix.incr.handlers:`BID`OFFER`IMPLIED_BID`IMPLIED_OFFER`TRADE!(.cme.singlequote;.cme.singlequote;.cme.singlequote;.cme.singlequote;.cme.singletrade); 21 | 22 | / process a single incremental refresh message - pass to quote or trade handler, as applicable 23 | singleincr:{[msg] 24 | f:$[msg[`MDEntryType] in key .fix.incr.handlers; // get handler function, default to recording EntryType 25 | .fix.incr.handlers[msg[`MDEntryType]]; // if there's a handler function, use it 26 | {.raw.unhandled,:x[`MDEntryType]} // else record the EntryType in list of unhandled types 27 | ]; 28 | f msg; // apply returned function to message 29 | } 30 | 31 | / process MarketDataIncrementalRefresh message - convert to single messages and pass to handler 32 | MARKET_DATA_INCREMENTAL_REFRESH:{[msg] 33 | header:{[x;y](key[x] inter key[y])#y}[msg;] .fix.incr.headerkeys!msg .fix.incr.headerkeys; // extract header for this message 34 | c:where .fix.incr.cutkey=key msg; // determine where to cut to
extract individual quotes/trades 35 | msgs:header,/:(c cut key msg)!'c cut value msg; // generate list of single quotes/trades 36 | singleincr each msgs; // pass to handler for single messages 37 | } 38 | -------------------------------------------------------------------------------- /code/msgs/security_definition.q: -------------------------------------------------------------------------------- 1 | // functions to handle security definition messages 2 | 3 | \d .cme 4 | 5 | / process SecurityDefinition msgs into definitions table 6 | SECURITY_DEFINITION:{[msg] 7 | `.raw.definitions upsert .Q.en[hsym `$getenv[`DBDIR]] enlist (cols .raw.definitions)#(first each flip 0#.raw.definitions),msg; // join msg to typed null dict (ensure correct cols), enumerate & upsert 8 | } 9 | 10 | -------------------------------------------------------------------------------- /code/msgs/security_status.q: -------------------------------------------------------------------------------- 1 | // functions to handle security status messages 2 | 3 | \d .cme 4 | 5 | SECURITY_STATUS:{[msg] 6 | `.raw.status upsert .Q.en[hsym `$getenv[`DBDIR]] enlist (cols .raw.status)#(first each flip 0#.raw.status),msg // join msg to typed null dict (ensure correct cols), enumerate & upsert 7 | } 8 | -------------------------------------------------------------------------------- /code/processes/cmedecoder.q: -------------------------------------------------------------------------------- 1 | \d .cme 2 | 3 | .cme.book:$[`tallbook in key .proc.params;.cme.tallbook;.cme.widebook]; // determine book function to use from process params 4 | .proc.loaddir[getenv[`KDBCODE],"/msgs/"]; // load per-message type scripts 5 | 6 | / process one message from log (i.e.
one line from text file) 7 | msg:{ 8 | / generate dictionary from message, with correct tags & properly typed values 9 | msg:(!/) { 10 | d:.fix.fields each x[;0]; // get field name & type from tag number 11 | enum:0!([] name:d`name)#.fix.enums; // check if field has enumerations 12 | a:enum[`values]@'enum[`enums]?'x[;1]; // get list of un-enumerated values 13 | val:?[""~/:a;x[;1];a]; // if enumeration exists, use it, else use original value from msg 14 | val:.fix.typefuncs[d`fixtype]@'val; // fix field value type 15 | (d[`name];val) // list of name-value pairs 16 | } flip "I=\001"0:x; // split message into key-value pairs for processing 17 | 18 | $[msg[`MsgType] in key .cme; // check if msghandler exists 19 | [msg:override[msg]; // apply any override function defined in code/cmedecoder/override.q for this msgtype 20 | @[value;(.cme[msg[`MsgType]];msg); // if handler exists, pass & catch errors 21 | {[msg;x] // on error, display error message 22 | .lg.w[`msg] each .util.strdict msg; // show failed msg as warning (error will exit process by default) 23 | .lg.e[`msg;"Error parsing message: ",x];}[msg] // show error message (exit process by default) 24 | ] 25 | ]; 26 | [.lg.w[`msg;"Missing msg handler: ",string msg[`MsgType]] // if no handler, display warning about missing handler 27 | .lg.w[`msg] each .util.strdict msg // also display failed message as warning 28 | ] 29 | ]; 30 | } 31 | 32 | / extract gz file to pipe & process 33 | pipegz:{[gzfile] 34 | .lg.o[`pipegz;"Unzipping and piping to fifo"]; 35 | system"rm -f fifo && mkfifo fifo"; // remove any existing fifo, make a new one 36 | system"zcat ",(1_ string gzfile)," > fifo &"; // use zcat to extract to fifo 37 | .lg.o[`pipegz;"Unzipped, parsing"]; 38 | @[.Q.fps[{msg each x}];`:fifo; // use .Q.fps to process file from fifo, catch error & display msg 39 | {.lg.e[`.proc.pipegz;"Reading from fifo failed, possible corrupt gz file: ",x]}]; 40 | system"rm -f fifo"; // remove fifo when done with it 41 | } 42 | 
if[not `files in key .Q.opt .z.x; // Checks if the -files tag is applied properly 43 | .lg.w[`files;"-files tag is missing"] 44 | ]; 45 | 46 | 47 | / process one log file 48 | logfile:{[logfile] 49 | if[()~key hsym logfile; // check file exists 50 | .lg.e[`logfile;"Logfile: ",(string logfile)," not found"]; // error message if not 51 | :() // return early, nothing to do 52 | ]; 53 | .lg.o[`logfile;"Processing file: ",(string logfile)," with size: ",.util.fmtsize hcount hsym logfile]; 54 | $[logfile like "*.gz"; // check if file is gz compressed 55 | pipegz[logfile]; // pass compressed files to pipegz 56 | .Q.fs[{msg each x}] hsym logfile; // for uncompressed files, process directly with .Q.fs 57 | ]; 58 | .lg.o[`logfile;"Finished processing file: ",string logfile]; 59 | } 60 | 61 | \d . 62 | 63 | .schema.init[] // set up empty schemas for processing 64 | .parse.init[] // parse FIX spec & create tables/dicts for use in processing 65 | 66 | .lg.o[`load;"Attempting to load existing definitions & status tables"]; 67 | sym:@[get;hsym `$getenv[`DBDIR],"/sym"; // attempt to load sym file 68 | {.lg.w[`load;"Failed to load sym file"]}] // warn if unable 69 | .raw.dfltlvl:10 // default price level for the case where .raw.definitions is empty 70 | .raw.definitions:select from @[get;hsym `$getenv[`DBDIR],"/rawdefinitions/"; // attempt to load existing definitions table for further updates 71 | {.lg.w[`load;"No definitions table found"];.schema.definitions}] // warn if unable 72 | .raw.status:select from @[get;hsym `$getenv[`DBDIR],"/rawstatus/"; // attempt to load existing status table for further updates 73 | {.lg.w[`load;"No status table found"];.schema.status}] // warn if unable 74 | 75 | if[`files in key .proc.params; // if files are passed in cmd line args, begin processing 76 | .cme.logfile each hsym `$.proc.params[`files]; // process each file in turn 77 | if[0 = count .raw.definitions; // if no definitions after processing files, won't be able to make accurate book, warn 78
| .lg.w[`definition;"No definitions table found. Cannot build accurate book"] 79 | ]; 80 | 81 | if[0 = count .raw.quote; // check that the .raw.quote table is populated before building the order book 82 | .lg.w[`rawquote;".raw.quote table is empty"] 83 | ]; 84 | 85 | 86 | .cme.book .raw.quote; // process raw quote table into book table 87 | df:`sym xcol select underlying:first SecurityGroup,first DisplayFactor by Symbol from .raw.definitions; // get underlying and display factor from definitions table 88 | trade:?[.raw.trade;();0b;.schema.trfieldmaps] lj df; // join underlying & display factor to user-friendly trade table 89 | trade:delete DisplayFactor from update price*DisplayFactor from trade; // apply DisplayFactor to the price column, then drop it 90 | writedown[]; // save tables to disk 91 | ]; 92 | 93 | if[not `debug in key .proc.params; // if not running in debug mode, exit on completion 94 | exit 0; 95 | ]; 96 | 97 | / 98 | Example Usage 99 | 100 | > q torq.q -load code/processes/cmedecoder.q -proctype cmedecoder -procname cmedecoder -files sample/sample_20170101.log 101 | > q torq.q -load code/processes/cmedecoder.q -proctype cmedecoder -procname cmedecoder -files /tmp/CME/CME_DATA/xcme_md_6s_fut_20161012-r-00447.gz 102 | -------------------------------------------------------------------------------- /config/settings/cmedecoder.q: -------------------------------------------------------------------------------- 1 | .proc.loadprocesscode:1b 2 | -------------------------------------------------------------------------------- /setenv.sh: -------------------------------------------------------------------------------- 1 | # if running the kdb+tick example, change these to full paths 2 | # some of the kdb+tick processes will change directory, and these will no longer be valid 3 | export TORQHOME=${PWD} 4 | export KDBCONFIG=${TORQHOME}/config 5 | export KDBCODE=${TORQHOME}/code 6 | export KDBLOG=${TORQHOME}/logs 7 | export KDBHTML=${TORQHOME}/html 8 | export KDBLIB=${TORQHOME}/lib 9 | 
export DBDIR=${TORQHOME}/hdb 10 | 11 | # if using the email facility, modify the library path for the email lib depending on OS 12 | # e.g. linux: 13 | # export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$KDBLIB/l[32|64] 14 | # e.g. osx: 15 | # export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:$KDBLIB/m[32|64] 16 | 17 | alias cmedecoder='q ${TORQHOME}/torq.q -load ${TORQHOME}/code/processes/cmedecoder.q -proctype cmedecoder -procname cmedecoder' 18 | -------------------------------------------------------------------------------- /spec/cust_enums.csv: -------------------------------------------------------------------------------- 1 | name,enums,values 2 | MDEntryType,N O E F J e g,SESSION_HIGH_BID SESSION_LOW_OFFER IMPLIED_BID IMPLIED_ASK EMPTY_BOOK ELECTRONIC_VOLUME THRESHOLD 3 | SecurityGroup,6A 6B 6C 6E 6J 6N 6S,AUDUSD GBPUSD USDCAD EURUSD USDJPY NZDUSD USDCHF 4 | MDUpdateAction,3 4,DELETE_THRU DELETE_FROM 5 | HaltReasonChar,0 1 2 3 4 5 6,GROUP_SCHEDULE SURVEILLANCE_INTERVENTION MARKET_EVENT INSTRUMENT_ACTIVATION INSTRUMENT_EXPIRATION UNKNOWN RECOVERY_IN_PROGRESS 6 | SecurityTradingStatus,4 24 25 26 103,CLOSE PRE_CROSS CROSS POST_CLOSE NO_CHANGE 7 | AggressorSide,0 1 2,NONE BUY SELL 8 | SecurityTradingEvent,0 1 4 5 6,NO_EVENT NO_CANCEL CHANGE_OF_TRADING_SESSION IMPLIED_MATCHING_IN IMPLIED_MATCHING_OFF 9 | SecurityUpdateAction,A D M,ADD DELETE MODIFY 10 | -------------------------------------------------------------------------------- /spec/cust_fields.csv: -------------------------------------------------------------------------------- 1 | number,name,fixtype 2 | 1128,ApplVerID,INT 3 | 5799,MatchEventIndicator,BOOLEANLIST 4 | 1151,SecurityGroup,SYMBOL 5 | 980,SecurityUpdateAction,SYMBOL 6 | 1180,ApplID,STRING 7 | 1300,MarketSegmentID,INT 8 | 6937,Asset,STRING 9 | 9779,UserDefinedInstrument,STRING 10 | 1142,MatchAlgorithm,STRING 11 | 1140,MaxTradeVol,QTY 12 | 969,MinPriceIncrement,QTY 13 | 1146,MinPriceIncrementAmount,AMT 14 | 9787,DisplayFactor,FLOAT 15 | 
1141,NoMdFeedTypes,INT 16 | 1022,MDFeedType,STRING 17 | 1145,EventTime,UTCTIMESTAMP 18 | 996,UnitOfMeasure,STRING 19 | 1147,UnitOfMeasureQty,FLOAT 20 | 1149,HighLimitPrice,PRICE 21 | 1148,LowLimitPrice,PRICE 22 | 1143,MaxPriceVariation,PRICE 23 | 1150,TradingReferencePrice,PRICE 24 | 5796,TradingReferenceDate,EPOCHDATE 25 | 1023,MDPriceLevel,INT 26 | 55,Symbol,SYMBOL 27 | 48,SecurityID,INT 28 | 269,MDEntryType,SYMBOL 29 | 167,SecurityType,SYMBOL 30 | 35,MsgType,SYMBOL 31 | 731,SettlPriceType,BOOLEANLIST 32 | 286,OpenCloseSettlFlag,SYMBOL 33 | 326,SecurityTradingStatus,SYMBOL 34 | 5797,AggressorSide,SYMBOL 35 | 1174,SecurityTradingEvent,SYMBOL 36 | 107,SecurityDesc,SYMBOL 37 | -------------------------------------------------------------------------------- /spec/xml2json.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | """xml2json.py Convert XML to JSON 4 | 5 | Relies on ElementTree for the XML parsing. This is based on 6 | pesterfish.py but uses a different XML->JSON mapping. 7 | The XML->JSON mapping is described at 8 | http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html 9 | 10 | Rewritten to a command line utility by Hay Kranen < github.com/hay > with 11 | contributions from George Hamilton (gmh04) and Dan Brown (jdanbrown) 12 | 13 | XML                              JSON 14 | <e/>                             "e": null 15 | <e>text</e>                      "e": "text" 16 | <e name="value" />               "e": { "@name": "value" } 17 | <e name="value">text</e>         "e": { "@name": "value", "#text": "text" } 18 | <e> <a>text</a> <b>text</b> </e> "e": { "a": "text", "b": "text" } 19 | <e> <a>text</a> <a>text</a> </e> "e": { "a": ["text", "text"] } 20 | <e> text <a>text</a> </e>        "e": { "#text": "text", "a": "text" } 21 | 22 | This is very similar to the mapping used for Yahoo Web Services 23 | (http://developer.yahoo.com/common/json.html#xml). 24 | 25 | This is a mess in that it is so unpredictable -- it requires lots of testing 26 | (e.g. to see if values are lists or strings or dictionaries). For use 27 | in Python this could be vastly cleaner. 
Think about whether the internal 28 | form can be more self-consistent while maintaining good external 29 | characteristics for the JSON. 30 | 31 | Look at the Yahoo version closely to see how it works. Maybe can adopt 32 | that completely if it makes more sense... 33 | 34 | R. White, 2006 November 6 35 | """ 36 | 37 | import json 38 | import optparse 39 | import sys 40 | import os 41 | 42 | import xml.etree.cElementTree as ET 43 | 44 | 45 | def strip_tag(tag): 46 | strip_ns_tag = tag 47 | split_array = tag.split('}') 48 | if len(split_array) > 1: 49 | strip_ns_tag = split_array[1] 50 | tag = strip_ns_tag 51 | return tag 52 | 53 | 54 | def elem_to_internal(elem, strip_ns=1, strip=1): 55 | """Convert an Element into an internal dictionary (not JSON!).""" 56 | 57 | d = {} 58 | elem_tag = elem.tag 59 | if strip_ns: 60 | elem_tag = strip_tag(elem.tag) 61 | else: 62 | for key, value in list(elem.attrib.items()): 63 | d['@' + key] = value 64 | 65 | # loop over subelements to merge them 66 | for subelem in elem: 67 | v = elem_to_internal(subelem, strip_ns=strip_ns, strip=strip) 68 | 69 | tag = subelem.tag 70 | if strip_ns: 71 | tag = strip_tag(subelem.tag) 72 | 73 | value = v[tag] 74 | 75 | try: 76 | # add to existing list for this tag 77 | d[tag].append(value) 78 | except AttributeError: 79 | # turn existing entry into a list 80 | d[tag] = [d[tag], value] 81 | except KeyError: 82 | # add a new non-list entry 83 | d[tag] = value 84 | text = elem.text 85 | tail = elem.tail 86 | if strip: 87 | # ignore leading and trailing whitespace 88 | if text: 89 | text = text.strip() 90 | if tail: 91 | tail = tail.strip() 92 | 93 | if tail: 94 | d['#tail'] = tail 95 | 96 | if d: 97 | # use #text element if other attributes exist 98 | if text: 99 | d["#text"] = text 100 | else: 101 | # text is the value if no attributes 102 | d = text or None 103 | return {elem_tag: d} 104 | 105 | 106 | def internal_to_elem(pfsh, factory=ET.Element): 107 | 108 | """Convert an internal dictionary (not 
JSON!) into an Element. 109 | 110 | Whatever Element implementation we could import will be 111 | used by default; if you want to use something else, pass the 112 | Element class as the factory parameter. 113 | """ 114 | 115 | attribs = {} 116 | text = None 117 | tail = None 118 | sublist = [] 119 | tag = list(pfsh.keys()) 120 | if len(tag) != 1: 121 | raise ValueError("Illegal structure with multiple tags: %s" % tag) 122 | tag = tag[0] 123 | value = pfsh[tag] 124 | if isinstance(value, dict): 125 | for k, v in list(value.items()): 126 | if k[:1] == "@": 127 | attribs[k[1:]] = v 128 | elif k == "#text": 129 | text = v 130 | elif k == "#tail": 131 | tail = v 132 | elif isinstance(v, list): 133 | for v2 in v: 134 | sublist.append(internal_to_elem({k: v2}, factory=factory)) 135 | else: 136 | sublist.append(internal_to_elem({k: v}, factory=factory)) 137 | else: 138 | text = value 139 | e = factory(tag, attribs) 140 | for sub in sublist: 141 | e.append(sub) 142 | e.text = text 143 | e.tail = tail 144 | return e 145 | 146 | 147 | def elem2json(elem, options, strip_ns=1, strip=1): 148 | 149 | """Convert an ElementTree or Element into a JSON string.""" 150 | 151 | if hasattr(elem, 'getroot'): 152 | elem = elem.getroot() 153 | 154 | if options.pretty: 155 | return json.dumps(elem_to_internal(elem, strip_ns=strip_ns, strip=strip), sort_keys=True, indent=4, separators=(',', ': ')) 156 | else: 157 | return json.dumps(elem_to_internal(elem, strip_ns=strip_ns, strip=strip)) 158 | 159 | 160 | def json2elem(json_data, factory=ET.Element): 161 | 162 | """Convert a JSON string into an Element. 163 | 164 | Whatever Element implementation we could import will be used by 165 | default; if you want to use something else, pass the Element class 166 | as the factory parameter. 
167 | """ 168 | 169 | return internal_to_elem(json.loads(json_data), factory) 170 | 171 | 172 | def xml2json(xmlstring, options, strip_ns=1, strip=1): 173 | 174 | """Convert an XML string into a JSON string.""" 175 | 176 | elem = ET.fromstring(xmlstring) 177 | return elem2json(elem, options, strip_ns=strip_ns, strip=strip) 178 | 179 | 180 | def json2xml(json_data, factory=ET.Element): 181 | 182 | """Convert a JSON string into an XML string. 183 | 184 | Whatever Element implementation we could import will be used by 185 | default; if you want to use something else, pass the Element class 186 | as the factory parameter. 187 | """ 188 | if not isinstance(json_data, dict): 189 | json_data = json.loads(json_data) 190 | 191 | elem = internal_to_elem(json_data, factory) 192 | return ET.tostring(elem) 193 | 194 | 195 | def main(): 196 | p = optparse.OptionParser( 197 | description='Converts XML to JSON or the other way around. Reads from standard input by default, or from file if given.', 198 | prog='xml2json', 199 | usage='%prog -t xml2json -o file.json [file]' 200 | ) 201 | p.add_option('--type', '-t', help="'xml2json' or 'json2xml'", default="xml2json") 202 | p.add_option('--out', '-o', help="Write to OUT instead of stdout") 203 | p.add_option( 204 | '--strip_text', action="store_true", 205 | dest="strip_text", help="Strip text for xml2json") 206 | p.add_option( 207 | '--pretty', action="store_true", 208 | dest="pretty", help="Format JSON output so it is easier to read") 209 | p.add_option( 210 | '--strip_namespace', action="store_true", 211 | dest="strip_ns", help="Strip namespace for xml2json") 212 | p.add_option( 213 | '--strip_newlines', action="store_true", 214 | dest="strip_nl", help="Strip newlines for xml2json") 215 | options, arguments = p.parse_args() 216 | 217 | inputstream = sys.stdin 218 | if len(arguments) == 1: 219 | try: 220 | inputstream = open(arguments[0]) 221 | except: 222 | sys.stderr.write("Problem reading '{0}'\n".format(arguments[0])) 223 | 
p.print_help() 224 | sys.exit(-1) 225 | 226 | input = inputstream.read() 227 | 228 | strip = 0 229 | strip_ns = 0 230 | if options.strip_text: 231 | strip = 1 232 | if options.strip_ns: 233 | strip_ns = 1 234 | if options.strip_nl: 235 | input = input.replace('\n', '').replace('\r','') 236 | if (options.type == "xml2json"): 237 | out = xml2json(input, options, strip_ns, strip) 238 | else: 239 | out = json2xml(input) 240 | 241 | if (options.out): 242 | file = open(options.out, 'w') 243 | file.write(out) 244 | file.close() 245 | else: 246 | print(out) 247 | 248 | if __name__ == "__main__": 249 | main() 250 | 251 | --------------------------------------------------------------------------------
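The XML->JSON mapping documented in `spec/xml2json.py` can be exercised with a short standalone sketch. The `elem_to_internal` below is a simplified re-implementation of the docstring's mapping rules (attributes become `@`-prefixed keys, element text becomes `#text`, repeated child tags collapse into lists) for illustration only: it omits the namespace stripping and `#tail` handling of the real script, so it is not a drop-in replacement.

```python
import json
import xml.etree.ElementTree as ET


def elem_to_internal(elem):
    """Simplified sketch of the docstring's XML->JSON mapping rules."""
    d = {}
    # attributes map to "@"-prefixed keys
    for key, value in elem.attrib.items():
        d['@' + key] = value
    # repeated child tags collapse into a list
    for sub in elem:
        v = elem_to_internal(sub)[sub.tag]
        if sub.tag in d:
            if not isinstance(d[sub.tag], list):
                d[sub.tag] = [d[sub.tag]]
            d[sub.tag].append(v)
        else:
            d[sub.tag] = v
    text = (elem.text or '').strip()
    if d:
        # element text goes under "#text" when attributes or children exist
        if text:
            d['#text'] = text
    else:
        # text-only elements map to a plain string; empty elements to null
        d = text or None
    return {elem.tag: d}


# one row from the docstring's mapping table
print(json.dumps(elem_to_internal(ET.fromstring('<e name="value">text</e>'))))
# -> {"e": {"@name": "value", "#text": "text"}}
```

This mirrors the try/except accumulation in the real `elem_to_internal`, but uses an explicit `isinstance` check, which reads more directly in a sketch.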