├── LICENSE ├── README.md ├── config.go ├── datapool.go ├── images ├── ewma.png ├── min-distribution-of-time-by-resource.png └── percentiles.png ├── logmetrics_collector_transform.conf ├── logtail.go ├── main └── logmetrics_collector.go ├── metrilyx2.internals.json ├── metrilyx2.psstat.json ├── parsertest.go ├── parsertest └── logmetrics_parsertest.go ├── syslog_helper.go ├── transform.go ├── tsdpusher.go └── utils ├── etc ├── init.d │ └── logmetrics_collector └── sysconfig │ └── logmetrics_collector └── logmetrics_collector.spec /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015, Mathieu Payeur Levallois 2 | 3 | Permission to use, copy, modify, and/or distribute this software for any 4 | purpose with or without fee is hereby granted, provided that the above 5 | copyright notice and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES 8 | WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF 9 | MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR 10 | ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES 11 | WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN 12 | ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF 13 | OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 14 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

logmetrics-collector

2 | 
3 | logmetrics-collector is aimed at parsing log files containing performance data, computing statistics and outputting them to TSD or tcollector while using limited resources. See the following for a quick summary of the ideas behind this kind of statistical metric aggregation:
4 | - http://pivotallabs.com/139-metrics-metrics-everywhere/
5 | - http://metrics.codahale.com/getting-started/ - Note that logmetrics-collector only implements a subset of this library: Meter and Histogram.
6 | - http://dimacs.rutgers.edu/~graham/pubs/papers/fwddecay.pdf: Paper on the method used for histogram generation.
7 | 
8 | Big thanks to rcrowley for his go-metrics library!
9 | 
10 | Here are 3 examples of the data it generates:
11 | Exponentially weighted moving average of the call count over 1, 5 and 15 minutes (like the Unix load average seen in uptime) + call count:
12 | 
13 | ![](images/ewma.png)
14 | 
15 | Percentiles on execution times + call count:
16 | 
17 | ![](images/percentiles.png)
18 | 
19 | Distribution of minimum time spent by resource for calls + total time spent (sum across all resources):
20 | 
21 | ![](images/min-distribution-of-time-by-resource.png)
22 | 
23 | 
24 | 

Features list

25 | 
26 | - Generic: everything is in the configuration file.
27 | - Not real-time based: only the time seen in the log files is used for everything.
28 | - Handles log lag gracefully.
29 | - Can push old logs to TSD with accuracy. (If TSD can accept it, see limitations.)
30 | - Handles TSD slowness or absence gracefully through buffering and blocking.
31 |   - If it can't push data to TSD it will wait until it can, progressively blocking its internal functions up to blocking the file tailer. Once TSD becomes available again it will continue parsing the logs where it was and push what it held in memory.
32 | - Pushes clean data to TSD: no duplicate keys, and order is always respected.
33 |   - If a key has been updated but hasn't changed in value it will still be pushed, mostly for precision on old log imports. In real-time use tcollector will deal with that case to limit the number of points sent.
34 | - Scales CPU- and network-wise.
35 |   - Can use X threads for statistical computation and regexp matching. (See goroutines.)
36 |   - Can use X pusher threads to TSD for better throughput.
37 | - Low resource usage.
38 |   - This is directly dependent on the configuration used, the number of keys tracked and the activity in the logs.
39 | - Easy to deploy: a single statically compiled binary.
40 | - Integrated pprof output. See -P and http://blog.golang.org/profiling-go-programs.
41 | 
42 | 

Configuration

43 | 
44 | Here's a simple configuration for a fictional service. Comments inline. It's in json-like yaml.
45 | ```
46 | {
47 |   # Log group, you can define multiple of these
48 |   rest.api: {
49 |     # Glob expression of the files to tail
50 |     files: [ "/var/log/rest_*.perf.log" ],
51 | 
52 |     # Regular expression used to extract fields from the logs.
53 |     # Spaces are stripped, comments are stripped, a literal "\n" is necessary at the end of each line.
54 |     # Multiple expressions can be defined but match groups must remain the same
55 |     re: [
56 |       '([A-z]{3}\s+\d+\s+\d+:\d+:\d+)\s+ # Date 1 \n
57 |       (([a-z])+\d+\.\S+)\s+ # server 2, class 3,\n
58 |       rest_([a-z]+)\.api:.* # rest type 4 \n
59 |       \[c:(\S+)\].* # call 5 \n
60 |       \(([0-9]+)\)\s+ # call time 6 \n
61 |       \[bnt:([0-9]+)/([0-9]+)\]\s+ # bnt calls 7, bnt time 8 \n
62 |       \[sql:([0-9]+)/([0-9]+)\]\s+ # sql calls 9, sql time 10 \n
63 |       \[membase:([0-9]+)/([0-9]+)\]\s+ # membase calls 11, membase time 12 \n
64 |       \[memcache:([0-9]+)/([0-9]+)\]\s+ # memcache calls 13, memcache time 14 \n
65 |       \[other:([0-9]+)/([0-9]+)\].* # other calls 15, other time 16 \n'],
66 | 
67 |     # Validation of the previous regexp.
68 |     expected_matches: 16,
69 | 
70 |     # How to parse the log date
71 |     date: {
72 |       # Match group number from the regexp where the date is found
73 |       position: 1,
74 | 
75 |       # Format of the date. See http://godoc.org/time#pkg-constants for format
76 |       format: "Jan 2 15:04:05" #rfc3164
77 |     },
78 | 
79 |     # Prefix used in key name generation
80 |     key_prefix: 'rest.api',
81 | 
82 |     # Tag lookup against regexp match group position.
83 |     tags: {call: 5,
84 |       host: 2,
85 |       class: 3
86 |     },
87 | 
88 |     # Metrics definition. Only meter and histogram are supported
89 |     metrics: {
90 |       meter: [
91 |         { # Key suffix to use for this type of metric.
92 |           # The metric itself will then append multiple endings to this.
93 |           key_suffix: "executions",
94 | 
95 |           # float or int. Defaults to int
96 |           format: "int",
97 | 
98 |           # Multiply the value by this.
 Defaults to 1. Useful for time in float.
 99 |           multiply: 1,
100 | 
101 |           # Regexp match groups where to use this metric type + tag(s) to append for it
102 |           reference: [
103 |             [0, "resource=local"], # When pos=0, simply inc counter by 1
104 |             [7, "resource=bnt"],
105 |             [9, "resource=sql"],
106 |             [11, "resource=membase"],
107 |             [13, "resource=memcache"],
108 |             [15, "resource=other"]
109 |           ]
110 |         }
111 |       ],
112 |       histogram: [
113 |         { key_suffix: "execution_time.ms",
114 |           reference: [
115 |             # Operations add or sub can be applied to the value. Here we subtract all the
116 |             # resource accesses from the total time so we only have the time spent on the server.
117 |             [6, "resource=local", {sub: [8,10,12,14,16]}],
118 |             [6, "resource=total"],
119 |             [8, "resource=bnt"],
120 |             [10, "resource=sql"],
121 |             [12, "resource=membase"],
122 |             [14, "resource=memcache"],
123 |             [16, "resource=other"]
124 |           ]
125 |         }
126 |       ]
127 |     },
128 | 
129 |     # Histogram sampler parameters. See Exponential Decay in http://dimacs.rutgers.edu/~graham/pubs/papers/fwddecay.pdf
130 |     # Size of the samplers; has the most effect on memory used when using histograms. Defaults to 256.
131 |     histogram_size: 256,
132 |     histogram_alpha_decay: 0.15,
133 |     histogram_rescale_threshold_min: 60,
134 | 
135 |     # Interval in sec for EWMA calculation when no new data has been parsed for a key. Defaults to 30.
136 |     ewma_interval: 30,
137 | 
138 |     # Enable removal of metrics that haven't been updated for X amount of time. Defaults to false.
139 |     stale_removal: true,
140 | 
141 |     # Metric will be dropped if no new update has been received within that time
142 |     stale_treshold_min: 15,
143 | 
144 |     # Send metric even if it hasn't changed. Useful for near real-time graphs. Defaults to false.
145 |     send_duplicates: true,
146 | 
147 |     # Split workload over multiple goroutines to scale across cpus
148 |     goroutines: 1,
149 | 
150 |     # Poll the file instead of using inotify. Defaults to false.
151 |     # Last I checked inotify leaks FDs on log rotation, so use true.
152 |     poll_file: true,
153 | 
154 |     # Push data to TSD every X seconds. Defaults to 15.
155 |     interval: 15,
156 | 
157 |     # Log a warning when the regexp fails. Useful for performance-only logs. Defaults to false.
158 |     warn_on_regex_fail: true,
159 | 
160 |     # Log a warning when a metric operation fails (result lower than 0). Defaults to false.
161 |     warn_on_operation_fail: false,
162 | 
163 |     # Log a warning when an out-of-order time is seen in the logs. Defaults to false.
164 |     warn_on_out_of_order_time: true,
165 | 
166 |     # Parse logs from the start. Allows pushing old logs; otherwise parsing starts at the current end of the file. Defaults to false.
167 |     parse_from_start: false
168 |   },
169 | 
170 |   # Section for general settings
171 |   settings: {
172 |     # Interval at which to look for new log files
173 |     poll_interval: 5,
174 | 
175 |     log_facility: "local3",
176 | 
177 |     # Information on where to send TSD keys
178 |     push_port: 4242,
179 |     push_host: "tsd.mynetwork",
180 | 
181 |     # tcp or udp. tsd is tcp, tcollector is udp.
182 |     push_proto: "tcp",
183 | 
184 |     # tsd or tcollector
185 |     push_type: "tsd",
186 | 
187 |     # Number of parallel senders.
188 |     push_number: 1,
189 | 
190 |     # Seconds to wait before retrying to send data when unable to contact tsd/tcollector
191 |     push_wait: 5,
192 | 
193 |     # Seconds between internal stats pushes
194 |     stats_interval: 60
195 |   }
196 | }
197 | ```
198 | Example of a line this configuration would parse:
199 | ```
200 | Feb 8 04:02:26 rest1.mynetwork rest_sales.api: [INFO] [performance] (http-2350-92) [c:session.addItem] [s:d9ea09bf2612060d9] [r:141915] (34) [bnt:1/28] [sql:2/1] [membase:0/0] [memcache:4/2] [other:0/0]
201 | ```
202 | 
203 | With this configuration and the level of information we have in the performance logs of this fictional application we can extract the following information:
204 | - Meter information: Simple counter going up by the value parsed on each line.
205 |   - Call count
206 |   - Resource call count: bnt, sql, membase, memcache, other
207 | - Histogram information: Distribution of the values parsed.
208 |   - Call time
209 |   - Resource call time: bnt, sql, membase, memcache, other
210 | 
211 | For each of these we compute:
212 | - Meter: Adds up a counter, pushes it as a key and computes:
213 |   - Exponentially weighted moving averages over 1, 5 and 15 minutes. (Like the Unix load average.)
214 | - Histogram: Samples the timers parsed and computes:
215 |   - Min, max and average.
216 |   - Percentiles 50, 75, 95, 99 and 99.9.
217 |   - Standard deviation.
218 | 
219 | All this information is specific to a single host. A single call generates 13 stats * 6 resources = 78 keys every "interval".
220 | 
221 | 

Keys generated

222 | 223 |

Meter

224 | 
225 | - `<key_prefix>.<key_suffix>.count`: Sum of the values.
226 | - `<key_prefix>.<key_suffix>.rate._1min`: Moving average over 1 minute.
227 | - `<key_prefix>.<key_suffix>.rate._5min`: Moving average over 5 minutes.
228 | - `<key_prefix>.<key_suffix>.rate._15min`: Moving average over 15 minutes.
229 | 
230 | Example:
231 | ```
232 | rest.api.executions.count 1391745780 4 call=getUser host=api1.mynetwork class=api
233 | rest.api.executions.rate._1min 1391745780 0.100 call=getUser host=api1.mynetwork class=api
234 | rest.api.executions.rate._5min 1391745780 0.050 call=getUser host=api1.mynetwork class=api
235 | rest.api.executions.rate._15min 1391745780 0.003 call=getUser host=api1.mynetwork class=api
236 | ```
237 | 

Histogram

239 | 
240 | - `<key_prefix>.<key_suffix>.min`: Minimum in the sample.
241 | - `<key_prefix>.<key_suffix>.max`: Maximum in the sample.
242 | - `<key_prefix>.<key_suffix>.mean`: Average of the sample.
243 | - `<key_prefix>.<key_suffix>.std_dev`: Standard deviation.
244 | - `<key_prefix>.<key_suffix>.p50`: Percentile 50.
245 | - `<key_prefix>.<key_suffix>.p75`: Percentile 75.
246 | - `<key_prefix>.<key_suffix>.p95`: Percentile 95.
247 | - `<key_prefix>.<key_suffix>.p99`: Percentile 99.
248 | - `<key_prefix>.<key_suffix>.p999`: Percentile 99.9.
249 | - `<key_prefix>.<key_suffix>.sample_size`: Current sample size. For sampler tuning, likely to disappear. Badly named as well; the time unit appears right before it in the key.
250 | 
251 | Example:
252 | ```
253 | rest.api.execution_time.ms.min 1391745767 11 call=getUser host=api1.mynetwork class=api
254 | rest.api.execution_time.ms.max 1391745767 16 call=getUser host=api1.mynetwork class=api
255 | rest.api.execution_time.ms.mean 1391745767 13 call=getUser host=api1.mynetwork class=api
256 | rest.api.execution_time.ms.std-dev 1391745767 2 call=getUser host=api1.mynetwork class=api
257 | rest.api.execution_time.ms.p50 1391745767 12 call=getUser host=api1.mynetwork class=api
258 | rest.api.execution_time.ms.p75 1391745767 16 call=getUser host=api1.mynetwork class=api
259 | rest.api.execution_time.ms.p95 1391745767 16 call=getUser host=api1.mynetwork class=api
260 | rest.api.execution_time.ms.p99 1391745767 16 call=getUser host=api1.mynetwork class=api
261 | rest.api.execution_time.ms.p999 1391745767 16 call=getUser host=api1.mynetwork class=api
262 | rest.api.execution_time.ms.sample_size 1391745767 3 call=getUser host=api1.mynetwork class=api
263 | ```
264 | 
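The percentiles above come from an exponentially decaying sample (the forward-decay paper linked at the top). A toy sketch of the priority-sampling idea, with invented names and none of the periodic rescaling the real sampler performs:

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// decaySample keeps a fixed-size, recency-biased sample: each value
// gets priority exp(alpha*age)/u for uniform random u, and only the
// `size` highest-priority values are kept. Illustrative sketch of the
// forward-decay technique, not the go-timemetrics implementation.
type weighted struct {
	value    int64
	priority float64
}

type decaySample struct {
	size   int
	alpha  float64
	values []weighted
}

// update inserts a value observed ageSec seconds after the landmark.
func (s *decaySample) update(v int64, ageSec float64) {
	p := math.Exp(s.alpha*ageSec) / rand.Float64()
	s.values = append(s.values, weighted{v, p})
	if len(s.values) > s.size {
		// Evict the lowest-priority entry to keep the sample bounded.
		sort.Slice(s.values, func(i, j int) bool { return s.values[i].priority > s.values[j].priority })
		s.values = s.values[:s.size]
	}
}

// percentile reports the p-quantile (0..1) of the retained values.
func (s *decaySample) percentile(p float64) int64 {
	vals := make([]int64, len(s.values))
	for i, w := range s.values {
		vals[i] = w.value
	}
	sort.Slice(vals, func(i, j int) bool { return vals[i] < vals[j] })
	return vals[int(p*float64(len(vals)-1))]
}

func main() {
	s := &decaySample{size: 256, alpha: 0.15}
	for i := int64(1); i <= 100; i++ {
		s.update(i, float64(i)) // newer values get exponentially higher priority
	}
	fmt.Println("p50 =", s.percentile(0.5))
}
```

The config's histogram_size and histogram_alpha_decay map onto size and alpha here: a bigger size costs more memory, a bigger alpha biases the sample harder toward recent values.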

Internal processing metrics

266 | 
267 | It will also push internal processing stats under the following keys and tags:
268 | - logmetrics_collector.tail.line_read: Number of lines read
269 |   - log_group: log_group name
270 |   - filename: filename tailed
271 | - logmetrics_collector.tail.line_matched: Number of lines matched by the regex
272 |   - log_group: log_group name
273 |   - filename: filename tailed
274 | - logmetrics_collector.tail.byte_read: Number of bytes read from the file
275 |   - log_group: log_group name
276 |   - filename: filename tailed
277 | - logmetrics_collector.data_pool.key_tracked: Number of keys tracked by a data pool.
278 |   - log_group: log_group name
279 |   - log_group_number: log_group number when multiple goroutines are used
280 | - logmetrics_collector.data_pool.key_stalled: Number of keys that have been recognized as stale and removed.
281 |   - log_group: log_group name
282 |   - log_group_number: log_group number when multiple goroutines are used
283 | - logmetrics_collector.pusher.key_sent
284 |   - pusher_number: pusher number when multiple ones are used.
285 | - logmetrics_collector.pusher.byte_sent
286 |   - pusher_number: pusher number when multiple ones are used.
287 | 
288 | Additionally, all internal processing keys have the current host's hostname tag added.
289 | 

Understanding stale/dupes options

291 | 
292 | To get OpenTSDB to aggregate data in a meaningful way while keeping the process' footprint as low as possible, a few options have been added:
293 | - stale_removal: If you use certain regexp match groups as tags and you know a good number of them will not show up that often, you might want to flush them from logmetrics' memory. The upside is that it's easier to keep cpu/memory usage on the low side; the downside is that some aggregations might suffer from disappearing tags. Your call!
294 | - stale_treshold_min: How many minutes without an update until the metric is considered stale and is flushed.
295 | - send_duplicates: With enough diversity in tag content there comes a point where a large number of keys won't be updated every "interval", which makes OpenTSDB's aggregation go wild, especially avg. Sending keys only when they are updated also has the side effect of generating an interpolated line between the last value and the new one, which can in fact be very misleading, especially on counts. This option makes logmetrics send duplicate data with the latest timestamp instead of only when that specific metric was last updated. Makes for nicer live graphs.
296 | - live_poll: logmetrics aims at dealing both with old log files for forensic purposes and with live logs, all without looking at the actual system time, since that would make dealing with lagging logs/processes/etc. trickier than it should be. Thus the mechanism that decides whether a key can be flushed, or whether the time interval has passed and it's time to send metrics, ended up needing to be different. So if you're pushing old logs set this to false; on live logs, true!
297 | 
298 | All these options do not necessarily play nice together; here are some recipes:
299 | 
300 | Live log parsing with stale metric flushing and duplicate sending:
301 | - stale_removal: true
302 | - stale_treshold_min: 10
303 | - send_duplicates: true
304 | - live_poll: true
305 | - parse_from_start: false
306 | 
307 | Live log parsing without stale metric flushing but with duplicate sending:
308 | - stale_removal: false
309 | - send_duplicates: true
310 | - live_poll: true
311 | - parse_from_start: false
312 | 
313 | Live log parsing without stale metric flushing nor duplicate sending:
314 | - stale_removal: false
315 | - send_duplicates: false
316 | - live_poll: true
317 | - parse_from_start: false
318 | 
319 | Old log parsing:
320 | - stale_removal: false
321 | - send_duplicates: false
322 | - live_poll: false
323 | - parse_from_start: true
324 | 
325 | Note that neither stale_removal nor send_duplicates should be used when parsing old logs; the behaviour isn't defined over multiple log files.
326 | 
327 | 
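As a concrete example, the first recipe maps onto a log group definition like this (a fragment only; the required files, re, date and metrics sections are omitted, and the values are the ones suggested above):

```
rest.api: {
    # ... files, re, date, key_prefix, tags, metrics ...

    stale_removal: true,
    stale_treshold_min: 10,
    send_duplicates: true,
    live_poll: true,
    parse_from_start: false
}
```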

Transform

329 | 
330 | To deal with more unruly log files there's a way to modify match groups before using them. Check logmetrics_collector_transform.conf for an example of a config file that parses apache logs with url cleanup so the url can be used as a tag.
331 | 
332 | 

Internal structure

333 | 
334 | One of the reasons Go was used is its concept of goroutines and channels. A goroutine is a lightweight thread managed by Go itself; Go schedules/unschedules these onto real system threads for execution. They are very cheap, using only 8k of memory each. A Go program can choose the number of real threads it uses, so a program whose workload is distributed over multiple goroutines scales easily to any number of processors.
335 | 
336 | Channels are a way to communicate between goroutines that's safer and easier to read than mutexes. Think of them as pipes between threads that can be buffered, where operations are atomic and will block on write when full or on read when empty. In the case of logmetrics-collector this enables stream processing of the log lines. Thus we have:
337 | 
338 | - 1 goroutine that checks for log files which weren't previously tailed and starts a file-tailing goroutine when one is detected.
339 | - 1 goroutine per file to read each line and apply the regexp to extract data. These send the regexp matches to one of their datapools via a channel.
340 |   - If the datapool is blocked, the tailer waits until it can send the data before it continues parsing its log.
341 | - X goroutines per datapool. These are associated with a specific log group and receive information from the log tailers, collect it, generate stats and keys from it, and send the data via another channel to the TSD sender once enough time has passed according to the logs.
342 | - X goroutines for TSD senders, enabling multiple output streams if a single one isn't sending fast enough.
343 | 
344 | *FIXME* make a diagram of this
345 | 
346 | 
347 | 
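The tailer -> datapool -> pusher flow above can be sketched with goroutines and channels like this (a minimal, self-contained illustration, not the actual code; the names and the "perf" match are made up):

```go
package main

import (
	"fmt"
	"strings"
)

// A minimal sketch of the tailer -> datapool -> pusher pipeline:
// each stage is a goroutine, each arrow a buffered channel, and a
// full channel naturally blocks the upstream stage (backpressure).
func tailer(lines []string, out chan<- string) {
	for _, l := range lines {
		out <- l // blocks when the datapool lags behind
	}
	close(out)
}

func datapool(in <-chan string, out chan<- string) {
	count := 0
	for line := range in {
		if strings.Contains(line, "perf") { // stand-in for the regexp match
			count++
		}
	}
	out <- fmt.Sprintf("metric.count %d", count)
	close(out)
}

func main() {
	tailed := make(chan string, 1000) // same buffering idea as the tail_data channels
	pushed := make(chan string, 1)

	go tailer([]string{"perf a", "noise", "perf b"}, tailed)
	go datapool(tailed, pushed)

	for key := range pushed { // the pusher end of the pipeline
		fmt.Println(key)
	}
}
```

The blocking described earlier falls out of the channel semantics for free: if the pusher stalls, `pushed` fills, the datapool blocks, `tailed` fills, and the tailer stops reading the file until TSD is reachable again.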

Compiling

348 | Since I moved away from the default regexp package in favor of libpcre (for a vast performance improvement), compiling logmetrics_collector for deployment is a little trickier than your average Go project. You will need libpcre-devel version 8+. Included under utils is the specfile used to build a logmetrics_collector rpm. Follow it and you should be able to statically link the libpcre of your choice.
349 | 
350 | 
351 | 

Todo

352 | 353 | List of tasks pending related to logmetrics-collector. 354 | 355 | - Persist file tailer position and datapool to disk to enable downtime without side effect. 356 | - Push internal metrics to TSD: Mem usage, GC info, key/data sent, data pool size, line parsed, etc. -Half-done 357 | - Clean up go-metrics remains. 358 | - Clean up data structures. - 1/3 done 359 | - Tests! 360 | 361 | 362 | Many thanks to Kieren Hynd for the idea and input! 363 | -------------------------------------------------------------------------------- /config.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import ( 4 | "errors" 5 | "fmt" 6 | "io/ioutil" 7 | "log" 8 | "log/syslog" 9 | "os" 10 | "strings" 11 | "time" 12 | 13 | "github.com/mathpl/golang-pkg-pcre/src/pkg/pcre" 14 | "launchpad.net/~niemeyer/goyaml/beta" 15 | ) 16 | 17 | type Config struct { 18 | pollInterval int 19 | pushPort int 20 | pushWait int 21 | pushHost string 22 | pushProto string 23 | pushType string 24 | pushNumber int 25 | stats_interval int 26 | logFacility syslog.Priority 27 | 28 | logGroups map[string]*logGroup 29 | } 30 | 31 | //type match struct { 32 | // str string 33 | // matcher *pcre.Regexp 34 | //} 35 | // 36 | //func (m *match) apply(str string) (bool, string) { 37 | // result := m.matcher.MatcherString(str, 0) 38 | // 39 | // return result.Matches(), "" 40 | //} 41 | 42 | type keyExtract struct { 43 | tag string 44 | metric_type string 45 | key_suffix string 46 | format string 47 | never_stale bool 48 | multiply int 49 | divide int 50 | 51 | operations map[string][]int 52 | } 53 | 54 | type logGroup struct { 55 | name string 56 | globFiles []string 57 | filename_match string 58 | filename_match_re *pcre.Regexp 59 | re []*pcre.Regexp 60 | strRegexp []string 61 | expected_matches int 62 | hostname string 63 | 64 | date_position int 65 | date_format string 66 | 67 | key_prefix string 68 | tags map[string]interface{} 69 | metrics 
map[int][]keyExtract 70 | transform map[int]transform 71 | 72 | histogram_size int 73 | histogram_alpha_decay float64 74 | histogram_rescale_threshold_min int 75 | ewma_interval int 76 | stale_removal bool 77 | stale_treshold_min int 78 | send_duplicates bool 79 | 80 | goroutines int 81 | interval int 82 | poll_file bool 83 | live_poll bool 84 | 85 | fail_operation_warn bool 86 | fail_regex_warn bool 87 | out_of_order_time_warn bool 88 | log_stale_metrics bool 89 | parse_from_start bool 90 | 91 | //Channels 92 | tail_data []chan lineResult 93 | } 94 | 95 | func (lg *logGroup) getNbTags() int { 96 | return len(lg.tags) 97 | } 98 | 99 | func (lg *logGroup) getNbKeys() int { 100 | i := 0 101 | for _, metrics := range lg.metrics { 102 | i += len(metrics) 103 | } 104 | return i 105 | } 106 | 107 | func (conf *Config) GetPusherNumber() int { 108 | return conf.pushNumber 109 | } 110 | 111 | func (conf *Config) GetTsdTarget() string { 112 | return fmt.Sprintf("%s:%d", conf.pushHost, conf.pushPort) 113 | } 114 | 115 | func (conf *Config) GetSyslogFacility() syslog.Priority { 116 | return conf.logFacility 117 | } 118 | 119 | func (lg *logGroup) CreateDataPool(channel_number int, tsd_pushers []chan []string, tsd_channel_number int) *datapool { 120 | var dp datapool 121 | dp.Bye = make(chan bool) 122 | dp.duplicateSent = make(map[string]time.Time) 123 | 124 | dp.channel_number = channel_number 125 | dp.tail_data = lg.tail_data[channel_number] 126 | 127 | dp.data = make(map[string]*tsdPoint) 128 | 129 | dp.tsd_channel_number = tsd_channel_number 130 | dp.tsd_push = tsd_pushers[tsd_channel_number] 131 | 132 | dp.last_time_file = make(map[string]fileInfo) 133 | 134 | dp.lg = lg 135 | 136 | dp.compileTagOrder() 137 | 138 | return &dp 139 | } 140 | 141 | func getHostname() string { 142 | //Get hostname 143 | hostname, err := os.Hostname() 144 | if err != nil { 145 | log.Fatalf("Unable to get hostname: ", err) 146 | } 147 | 148 | return hostname 149 | } 150 | 151 | func 
cleanSre2(log_group_name string, re string) (string, *pcre.Regexp, error) { 152 | //Little hack to support extended style regex. Removes comments, spaces en endline 153 | noSpacesRe := strings.Replace(re, " ", "", -1) 154 | splitRe := strings.Split(noSpacesRe, "\\n") 155 | 156 | var rebuiltRe []string 157 | for _, l := range splitRe { 158 | noComments := strings.Split(l, "#") 159 | rebuiltRe = append(rebuiltRe, string(noComments[0])) 160 | } 161 | cleanRe := strings.Join(rebuiltRe, "") 162 | 163 | //Try to compile the regex 164 | if compiledRe, err := pcre.Compile(cleanRe, 0); err == nil { 165 | return cleanRe, &compiledRe, nil 166 | } else { 167 | return "", nil, errors.New(err.Message) 168 | } 169 | } 170 | 171 | func parseMetrics(conf map[interface{}]interface{}) map[int][]keyExtract { 172 | keyExtracts := make(map[int][]keyExtract) 173 | 174 | for metric_type, metrics := range conf { 175 | for _, n := range metrics.([]interface{}) { 176 | m := n.(map[interface{}]interface{}) 177 | 178 | key_suffix := m["key_suffix"].(string) 179 | 180 | var format string 181 | var multiply int 182 | var divide int 183 | var never_stale bool 184 | 185 | if format_key, ok := m["format"]; ok == true { 186 | format = format_key.(string) 187 | } else { 188 | format = "int" 189 | } 190 | if multiply_key, ok := m["multiply"]; ok == true { 191 | multiply = multiply_key.(int) 192 | if multiply == 0 { 193 | log.Fatalf("A 'multiply' transform cannot be zero") 194 | } 195 | } else { 196 | multiply = 1 197 | } 198 | if divide_key, ok := m["divide"]; ok == true { 199 | divide = divide_key.(int) 200 | if divide < 1 { 201 | log.Fatalf("A 'divide' transform cannot be zero") 202 | } 203 | } else { 204 | divide = 1 205 | } 206 | if never_stale_key, ok := m["never_stale"]; ok == true { 207 | never_stale = never_stale_key.(bool) 208 | } else { 209 | never_stale = false 210 | } 211 | 212 | for _, val := range m["reference"].([]interface{}) { 213 | position := val.([]interface{})[0].(int) 214 | tag 
:= val.([]interface{})[1].(string) 215 | 216 | operations := make(map[string][]int) 217 | if len(val.([]interface{})) > 2 { 218 | operations_struct := val.([]interface{})[2].(map[interface{}]interface{}) 219 | 220 | for op, opvals := range operations_struct { 221 | //Make sure we only accept operation we can perform 222 | if op != "add" && op != "sub" { 223 | log.Fatalf("Operation %s no supported", op) 224 | } 225 | 226 | for _, opval := range opvals.([]interface{}) { 227 | operations[op.(string)] = append(operations[op.(string)], opval.(int)) 228 | } 229 | } 230 | } 231 | 232 | newKey := keyExtract{tag: tag, metric_type: metric_type.(string), key_suffix: key_suffix, 233 | format: format, multiply: multiply, divide: divide, never_stale: never_stale, operations: operations} 234 | keyExtracts[position] = append(keyExtracts[position], newKey) 235 | } 236 | } 237 | } 238 | 239 | return keyExtracts 240 | } 241 | 242 | func LoadConfig(configFile string) Config { 243 | byteConfig, err := ioutil.ReadFile(configFile) 244 | if err != nil { 245 | log.Print(err) 246 | os.Exit(1) 247 | } 248 | 249 | var rawCfg interface{} 250 | err = goyaml.Unmarshal(byteConfig, &rawCfg) 251 | if err != nil { 252 | log.Print(err) 253 | os.Exit(1) 254 | } 255 | 256 | settings := rawCfg.(map[interface{}]interface{})["settings"] 257 | 258 | var cfg Config 259 | cfg.logGroups = make(map[string]*logGroup) 260 | 261 | //Settings 262 | for key, val := range settings.(map[interface{}]interface{}) { 263 | switch v := val.(type) { 264 | case int: 265 | switch key { 266 | case "poll_interval": 267 | cfg.pollInterval = v 268 | case "push_port": 269 | cfg.pushPort = v 270 | case "push_wait": 271 | cfg.pushWait = v 272 | case "push_number": 273 | cfg.pushNumber = v 274 | case "stats_interval": 275 | cfg.stats_interval = v 276 | 277 | default: 278 | log.Fatalf("Unknown key settings.%s", key) 279 | } 280 | 281 | case string: 282 | switch key { 283 | case "log_facility": 284 | //Lookup 285 | if facility, found 
:= facilityStrings[v]; found == true { 286 | cfg.logFacility = syslog.LOG_INFO | facility 287 | } else { 288 | log.Fatalf("Unable to map log_facility: %s", v) 289 | } 290 | case "push_host": 291 | cfg.pushHost = v 292 | case "push_proto": 293 | cfg.pushProto = v 294 | case "push_type": 295 | cfg.pushType = v 296 | 297 | default: 298 | log.Fatalf("Unknown key settings.%s", key) 299 | } 300 | 301 | default: 302 | log.Fatalf("Unknown key settings.%s", key) 303 | } 304 | } 305 | 306 | //Some default vals 307 | if cfg.pollInterval == 0 { 308 | cfg.pollInterval = 15 309 | } 310 | if cfg.logFacility == 0 { 311 | cfg.logFacility = syslog.LOG_LOCAL0 312 | } 313 | if cfg.pushHost == "" { 314 | cfg.pushHost = "localhost" 315 | } 316 | if cfg.pushProto == "" { 317 | cfg.pushProto = "udp" 318 | } 319 | if cfg.pushType == "" { 320 | cfg.pushType = "tcollector" 321 | } 322 | if cfg.pushNumber == 0 { 323 | cfg.pushNumber = 1 324 | } 325 | if cfg.stats_interval == 0 { 326 | cfg.stats_interval = 60 327 | } 328 | 329 | //Log_groups configs 330 | for name, group_content := range rawCfg.(map[interface{}]interface{}) { 331 | //Skip settings, already parsed 332 | if name == "settings" { 333 | continue 334 | } 335 | 336 | var lg logGroup 337 | 338 | lg.name = name.(string) 339 | lg.tags = make(map[string]interface{}) 340 | 341 | //Process content 342 | for key, val := range group_content.(map[interface{}]interface{}) { 343 | switch v := val.(type) { 344 | case string: 345 | switch key { 346 | case "key_prefix": 347 | lg.key_prefix = v 348 | 349 | case "filename_match": 350 | lg.filename_match = v 351 | re := pcre.MustCompile(v, 0) 352 | lg.filename_match_re = &re 353 | 354 | default: 355 | log.Fatalf("Unknown key %s.%s", name, key) 356 | } 357 | 358 | case int: 359 | switch key { 360 | case "interval": 361 | lg.interval = v 362 | case "ewma_interval": 363 | lg.ewma_interval = v 364 | case "expected_matches": 365 | lg.expected_matches = v 366 | case "histogram_size": 367 | 
lg.histogram_size = v 368 | case "goroutines": 369 | lg.goroutines = v 370 | case "histogram_rescale_threshold_min": 371 | lg.histogram_rescale_threshold_min = v 372 | case "stale_treshold_min": 373 | lg.stale_treshold_min = v 374 | 375 | default: 376 | log.Fatalf("Unknown key %s.%s", name, key) 377 | } 378 | 379 | case float64: 380 | switch key { 381 | case "histogram_alpha_decay": 382 | lg.histogram_alpha_decay = v 383 | 384 | default: 385 | log.Fatalf("Unknown key %s.%s", name, key) 386 | } 387 | 388 | case bool: 389 | switch key { 390 | case "warn_on_regex_fail": 391 | lg.fail_regex_warn = v 392 | case "parse_from_start": 393 | lg.parse_from_start = v 394 | case "warn_on_operation_fail": 395 | lg.fail_operation_warn = v 396 | case "warn_on_out_of_order_time": 397 | lg.out_of_order_time_warn = v 398 | case "poll_file": 399 | lg.poll_file = v 400 | case "live_poll": 401 | lg.live_poll = v 402 | case "stale_removal": 403 | lg.stale_removal = v 404 | case "send_duplicates": 405 | lg.send_duplicates = v 406 | case "log_stale_metrics": 407 | lg.log_stale_metrics = v 408 | 409 | default: 410 | log.Fatalf("Unknown key %s.%s", name, key) 411 | } 412 | 413 | case []interface{}: 414 | switch key { 415 | case "re": 416 | var err error 417 | lg.re = make([]*pcre.Regexp, len(v)) 418 | lg.strRegexp = make([]string, len(v)) 419 | for i, re := range v { 420 | if lg.strRegexp[i], lg.re[i], err = cleanSre2(lg.name, re.(string)); err != nil { 421 | log.Fatal(err) 422 | } 423 | } 424 | case "files": 425 | for _, file := range v { 426 | lg.globFiles = append(lg.globFiles, file.(string)) 427 | } 428 | 429 | default: 430 | log.Fatalf("Unknown key %s.%s", name, key) 431 | } 432 | 433 | case map[interface{}]interface{}: 434 | switch key { 435 | case "tags": 436 | for tag, pos := range v { 437 | switch pos.(type) { 438 | case int: 439 | lg.tags[tag.(string)] = pos.(int) 440 | case string: 441 | lg.tags[tag.(string)] = pos.(string) 442 | default: 443 | log.Fatalf("Unexpected type for tags 
section, key %s: %T", tag, pos) 444 | } 445 | } 446 | 447 | case "metrics": 448 | lg.metrics = parseMetrics(v) 449 | 450 | case "date": 451 | for date_name, date_val := range v { 452 | if date_name.(string) == "position" { 453 | lg.date_position = date_val.(int) 454 | } else if date_name.(string) == "format" { 455 | lg.date_format = date_val.(string) 456 | } else { 457 | log.Fatalf("Unknown key %s.date.%s", name, date_name) 458 | } 459 | } 460 | 461 | case "transform": 462 | lg.transform = parseTransform(v) 463 | 464 | default: 465 | log.Fatalf("Unknown key %s.%s", name, key) 466 | } 467 | 468 | default: 469 | log.Fatalf("Unknown key %s.%s", name, key) 470 | } 471 | } 472 | 473 | //Defaults 474 | if lg.goroutines == 0 { 475 | lg.goroutines = 1 476 | } 477 | if lg.histogram_alpha_decay == 0 { 478 | lg.histogram_alpha_decay = 0.15 479 | } 480 | if lg.histogram_size == 0 { 481 | lg.histogram_size = 256 482 | } 483 | if lg.histogram_rescale_threshold_min == 0 { 484 | lg.histogram_rescale_threshold_min = 60 485 | } 486 | if lg.ewma_interval == 0 { 487 | lg.ewma_interval = 30 488 | } 489 | if lg.stale_treshold_min == 0 { 490 | lg.stale_treshold_min = 60 491 | } 492 | 493 | //Init channels 494 | lg.tail_data = make([]chan lineResult, lg.goroutines) 495 | for i := 0; i < lg.goroutines; i++ { 496 | lg.tail_data[i] = make(chan lineResult, 1000) 497 | } 498 | lg.hostname = getHostname() 499 | 500 | cfg.logGroups[name.(string)] = &lg 501 | } 502 | 503 | return cfg 504 | } 505 | -------------------------------------------------------------------------------- /datapool.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "sort" 7 | "strconv" 8 | "strings" 9 | "time" 10 | 11 | "github.com/mathpl/go-timemetrics" 12 | ) 13 | 14 | type dataPoint struct { 15 | name string 16 | value int64 17 | never_stale bool 18 | metric_type string 19 | } 20 | 21 | type dataPointTime struct { 22 | name 
string 23 | time int64 24 | } 25 | 26 | type tsdPoint struct { 27 | data timemetrics.Metric 28 | filename string 29 | last_push time.Time 30 | last_crunched_push time.Time 31 | never_stale bool 32 | } 33 | 34 | type fileInfo struct { 35 | lastUpdate time.Time 36 | last_push time.Time 37 | } 38 | 39 | type datapool struct { 40 | data map[string]*tsdPoint 41 | duplicateSent map[string]time.Time 42 | tsd_push chan []string 43 | tail_data chan lineResult 44 | 45 | channel_number int 46 | tsd_channel_number int 47 | 48 | tag_order []string 49 | 50 | lg *logGroup 51 | 52 | total_keys int 53 | total_stale int 54 | last_time_file map[string]fileInfo 55 | 56 | Bye chan bool 57 | } 58 | 59 | func (dp *datapool) compileTagOrder() { 60 | tag_order := make([]string, dp.lg.getNbTags()) 61 | i := 0 62 | for tagname, _ := range dp.lg.tags { 63 | tag_order[i] = tagname 64 | i++ 65 | } 66 | sort.Strings(tag_order) 67 | 68 | dp.tag_order = tag_order 69 | } 70 | 71 | func (dp *datapool) extractTags(data []string) []string { 72 | //General tags 73 | tags := make([]string, dp.lg.getNbTags()) 74 | for cnt, tagname := range dp.tag_order { 75 | 76 | tag_value := "" 77 | pos_or_value := dp.lg.tags[tagname] 78 | 79 | switch pos_or_string := pos_or_value.(type) { 80 | case int: 81 | tag_value = data[pos_or_string] 82 | case string: 83 | tag_value = pos_or_string 84 | } 85 | 86 | tags[cnt] = fmt.Sprintf("%s=%s", tagname, tag_value) 87 | } 88 | 89 | return tags 90 | } 91 | 92 | func build_replace_map(data []string) map[string]string { 93 | r := make(map[string]string) 94 | 95 | for pos, s := range data { 96 | r[fmt.Sprintf("%d", pos)] = s 97 | } 98 | 99 | return r 100 | } 101 | 102 | func (dp *datapool) applyTransforms(match_groups []string) []string { 103 | transformed_matches := make([]string, len(match_groups)) 104 | 105 | for pos, data := range match_groups { 106 | if transform, ok := dp.lg.transform[pos]; ok { 107 | transformed_matches[pos] = transform.apply(data) 108 | } else { 109 | 
transformed_matches[pos] = data 110 | } 111 | } 112 | 113 | return transformed_matches 114 | } 115 | 116 | func (dp *datapool) getKeys(data []string) ([]dataPoint, time.Time) { 117 | y := time.Now().Year() 118 | 119 | tags := dp.extractTags(data) 120 | 121 | nbKeys := dp.lg.getNbKeys() 122 | dataPoints := make([]dataPoint, nbKeys) 123 | 124 | //Time 125 | t, err := time.Parse(dp.lg.date_format, data[dp.lg.date_position]) 126 | if err != nil { 127 | log.Print(err) 128 | var nt time.Time 129 | return nil, nt 130 | } 131 | 132 | //Patch in year if missing - rfc3164 133 | if t.Year() == 0 { 134 | t = time.Date(y, t.Month(), t.Day(), t.Hour(), t.Minute(), 135 | t.Second(), t.Nanosecond(), t.Location()) 136 | } 137 | 138 | //Make a first pass extracting the data, applying float->int conversion with the multiplier/divisor 139 | values := make([]int64, dp.lg.expected_matches+1) 140 | for position, keyTypes := range dp.lg.metrics { 141 | for _, keyType := range keyTypes { 142 | if position == 0 { 143 | values[position] = 1 144 | } else { 145 | var val int64 146 | var err error 147 | if keyType.format == "float" { 148 | var val_float float64 149 | if val_float, err = strconv.ParseFloat(data[position], 64); err == nil { val = int64(val_float) // default when no multiply/divide is set 150 | if keyType.multiply > 1 { 151 | val = int64(val_float * float64(keyType.multiply)) 152 | } 153 | if keyType.divide > 1 { 154 | val = int64(val_float / float64(keyType.divide)) 155 | } 156 | } 157 | } else { 158 | if val, err = strconv.ParseInt(data[position], 10, 64); err == nil { 159 | if keyType.multiply > 1 { 160 | val = val * int64(keyType.multiply) 161 | } 162 | if keyType.divide > 1 { 163 | val = val / int64(keyType.divide) 164 | } 165 | } 166 | } 167 | 168 | if err != nil { 169 | log.Printf("Unable to extract data from value match, %s: %s", err, data[position]) 170 | var nt time.Time 171 | return nil, nt 172 | } else { 173 | values[position] = val 174 | } 175 | } 176 | } 177 | } 178 | 179 | // Second pass applies operations and creates datapoints 180 | var i = 0 
181 | for position, val := range values { 182 | //Is the value a metric? 183 | for _, keyType := range dp.lg.metrics[position] { 184 | //Key name 185 | key := fmt.Sprintf("%s.%s.%s %s %s", dp.lg.key_prefix, keyType.key_suffix, "%s %d %s", strings.Join(tags, " "), keyType.tag) 186 | 187 | //Do we need to do any operation on this val? 188 | for op, opvalues := range keyType.operations { 189 | for _, op_position := range opvalues { 190 | //log.Printf("%s %d on pos %d, current val: %d", op, op_position, position, val) 191 | if op_position != 0 { 192 | switch op { 193 | case "add": 194 | val += values[op_position] 195 | 196 | case "sub": 197 | val -= values[op_position] 198 | } 199 | } 200 | } 201 | } 202 | 203 | if val < 0 && dp.lg.fail_operation_warn { 204 | log.Printf("Values cannot be negative after applying operation. Offending line: %s", data[0]) 205 | var nt time.Time 206 | return nil, nt 207 | } 208 | 209 | dataPoints[i] = dataPoint{name: key, value: val, metric_type: keyType.metric_type, never_stale: keyType.never_stale} 210 | i++ 211 | } 212 | } 213 | 214 | return dataPoints, t 215 | } 216 | 217 | func (dp *datapool) getStatsKey(timePush time.Time) []string { 218 | line := make([]string, 2) 219 | line[0] = fmt.Sprintf("logmetrics_collector.data_pool.key_tracked %d %d host=%s log_group=%s log_group_number=%d", timePush.Unix(), dp.total_keys, dp.lg.hostname, dp.lg.name, dp.tsd_channel_number) 220 | line[1] = fmt.Sprintf("logmetrics_collector.data_pool.key_staled %d %d host=%s log_group=%s log_group_number=%d", timePush.Unix(), dp.total_stale, dp.lg.hostname, dp.lg.name, dp.tsd_channel_number) 221 | 222 | return line 223 | } 224 | 225 | func (dp *datapool) start() { 226 | log.Printf("Datapool[%s:%d] started. 
Pushing keys to TsdPusher[%d]", dp.lg.name, dp.channel_number, dp.tsd_channel_number) 227 | 228 | var last_time_pushed *time.Time 229 | var lastTimeStatsPushed time.Time 230 | for { 231 | select { 232 | case line_result := <-dp.tail_data: 233 | 234 | transformed_matches := dp.applyTransforms(line_result.matches) 235 | 236 | data_points, point_time := dp.getKeys(transformed_matches) 237 | 238 | if currentFileInfo, ok := dp.last_time_file[line_result.filename]; ok { 239 | if currentFileInfo.lastUpdate.Before(point_time) { 240 | currentFileInfo.lastUpdate = point_time 241 | } 242 | } else { 243 | dp.last_time_file[line_result.filename] = fileInfo{lastUpdate: point_time} 244 | } 245 | 246 | //To start things off 247 | if last_time_pushed == nil { 248 | last_time_pushed = &point_time 249 | } 250 | 251 | for _, data_point := range data_points { 252 | //New metrics, add 253 | if _, ok := dp.data[data_point.name]; !ok { 254 | switch data_point.metric_type { 255 | case "histogram": 256 | s := timemetrics.NewExpDecaySample(point_time, dp.lg.histogram_size, dp.lg.histogram_alpha_decay, dp.lg.histogram_rescale_threshold_min) 257 | dp.data[data_point.name] = &tsdPoint{data: timemetrics.NewHistogram(s, dp.lg.stale_treshold_min), 258 | filename: line_result.filename} 259 | case "counter": 260 | dp.data[data_point.name] = &tsdPoint{data: timemetrics.NewCounter(point_time, dp.lg.stale_treshold_min), 261 | filename: line_result.filename} 262 | case "meter": 263 | dp.data[data_point.name] = &tsdPoint{data: timemetrics.NewMeter(point_time, dp.lg.ewma_interval, dp.lg.stale_treshold_min), 264 | filename: line_result.filename} 265 | default: 266 | log.Fatalf("Unexpected metric type %s!", data_point.metric_type) 267 | } 268 | } 269 | 270 | //Make sure data is ordered or we risk sending duplicate data 271 | if dp.data[data_point.name].last_push.After(point_time) && dp.lg.out_of_order_time_warn { 272 | log.Printf("Non-ordered data detected in log file. 
Its key already had an update at %s in the future. Offending line: %s", 273 | dp.data[data_point.name].last_push, line_result.matches[0]) 274 | } 275 | 276 | dp.data[data_point.name].data.Update(point_time, data_point.value) 277 | dp.data[data_point.name].filename = line_result.filename 278 | dp.data[data_point.name].never_stale = data_point.never_stale 279 | } 280 | 281 | //Support for log playback - Push when time has passed in the logs, not in real time 282 | run_push_keys := false 283 | if dp.lg.live_poll && point_time.Sub(*last_time_pushed) >= time.Duration(dp.lg.interval)*time.Second { 284 | run_push_keys = true 285 | } else if !dp.lg.stale_removal { 286 | // Check for each file individually 287 | for _, fileInfo := range dp.last_time_file { 288 | if point_time.Sub(fileInfo.last_push) >= time.Duration(dp.lg.interval)*time.Second { 289 | run_push_keys = true 290 | break 291 | } 292 | } 293 | } 294 | 295 | if run_push_keys { 296 | var nb_stale int 297 | dp.total_keys, nb_stale = dp.pushKeys(point_time) 298 | dp.total_stale += nb_stale 299 | 300 | //Push stats as well? 301 | if point_time.Sub(lastTimeStatsPushed) > time.Duration(dp.lg.interval)*time.Second { 302 | dp.tsd_push <- dp.getStatsKey(point_time) 303 | lastTimeStatsPushed = point_time 304 | } 305 | 306 | last_time_pushed = &point_time 307 | } 308 | case <-dp.Bye: 309 | log.Printf("Datapool[%s:%d] stopped.", dp.lg.name, dp.channel_number) 310 | return 311 | } 312 | } 313 | } 314 | 315 | func (dp *datapool) pushKeys(point_time time.Time) (int, int) { 316 | nbKeys := 0 317 | nbStale := 0 318 | for tsd_key, tsdPoint := range dp.data { 319 | pointData := tsdPoint.data 320 | currentFileInfo := dp.last_time_file[tsdPoint.filename] 321 | 322 | if dp.lg.stale_removal && pointData.Stale(point_time) && !tsdPoint.never_stale { 323 | if dp.lg.log_stale_metrics { 324 | log.Printf("Deleting stale metric. 
Last update: %s Current time: %s Metric: %s", pointData.GetMaxTime(), point_time, tsd_key) 325 | } 326 | 327 | //Push the zeroed-out key one last time to stabilize aggregated data 328 | pointData.ZeroOut() 329 | delete(dp.data, tsd_key) 330 | nbStale += pointData.NbKeys() 331 | } else { 332 | nbKeys += pointData.NbKeys() 333 | } 334 | 335 | // pointData.lastUpdate.After(tsdPoint.last_push) 336 | updateToSend := pointData.PushKeysTime(tsdPoint.last_push) 337 | 338 | var keys []string 339 | if updateToSend { 340 | tsdPoint.last_push = pointData.GetMaxTime() 341 | currentFileInfo.last_push = tsdPoint.last_push 342 | 343 | // always take the log file timestamp 344 | keys = pointData.GetKeys(point_time, tsd_key, false) 345 | } else if dp.lg.send_duplicates { 346 | var dup_time time.Time 347 | if _, ok := dp.duplicateSent[tsd_key]; ok { 348 | dup_time = dp.duplicateSent[tsd_key].Add(time.Second * time.Duration(dp.lg.interval)) 349 | } else { 350 | dup_time = pointData.GetMaxTime().Add(time.Second * time.Duration(dp.lg.interval)) 351 | } 352 | 353 | dp.duplicateSent[tsd_key] = dup_time 354 | keys = pointData.GetKeys(dup_time, tsd_key, true) 355 | } 356 | 357 | dp.tsd_push <- keys 358 | 359 | if currentFileInfo.last_push.After(dp.last_time_file[tsdPoint.filename].last_push) { 360 | dp.last_time_file[tsdPoint.filename] = currentFileInfo 361 | } 362 | } 363 | 364 | return nbKeys, nbStale 365 | } 366 | 367 | func StartDataPools(config *Config, tsd_pushers []chan []string) (dps []*datapool) { 368 | //Start a datapool goroutine per log group 369 | nb_tsd_push := 0 370 | dps = make([]*datapool, 0) 371 | for _, lg := range config.logGroups { 372 | for i := 0; i < lg.goroutines; i++ { 373 | dp := lg.CreateDataPool(i, tsd_pushers, nb_tsd_push) 374 | go dp.start() 375 | dps = append(dps, dp) 376 | 377 | nb_tsd_push = (nb_tsd_push + 1) % config.GetPusherNumber() 378 | } 379 | } 380 | 381 | return dps 382 | } 383 | 
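The first pass in `getKeys` above normalizes every matched log field to an `int64`, optionally scaling it with a `multiply` or `divide` factor (e.g. microsecond response times divided by 1000 to feed a millisecond histogram), so the timemetrics histograms only ever deal in integers. A minimal standalone sketch of that conversion (`parseValue` is a hypothetical helper for illustration, not part of the collector):

```go
package main

import (
	"fmt"
	"strconv"
)

// parseValue sketches the first-pass extraction: a matched log field is
// parsed according to its declared format ("float" or int) and normalized
// to int64, applying the optional multiplier/divisor cumulatively.
func parseValue(field, format string, multiply, divide int) (int64, error) {
	if format == "float" {
		f, err := strconv.ParseFloat(field, 64)
		if err != nil {
			return 0, err
		}
		if multiply > 1 {
			f *= float64(multiply)
		}
		if divide > 1 {
			f /= float64(divide)
		}
		return int64(f), nil
	}
	v, err := strconv.ParseInt(field, 10, 64)
	if err != nil {
		return 0, err
	}
	if multiply > 1 {
		v *= int64(multiply)
	}
	if divide > 1 {
		v /= int64(divide)
	}
	return v, nil
}

func main() {
	ms, _ := parseValue("1.250", "float", 1000, 1) // 1.250 s -> 1250 ms
	kib, _ := parseValue("4096", "int", 1, 1024)   // 4096 bytes -> 4 KiB
	fmt.Println(ms, kib)
}
```

Keeping everything integer after this point is what lets the datapool update counters, meters and exp-decay histogram samples with a single `Update(time, int64)` call per data point.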
-------------------------------------------------------------------------------- /images/ewma.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mathpl/logmetrics/75a6835c9f3fe3f7620a5eee7455997c25e70827/images/ewma.png -------------------------------------------------------------------------------- /images/min-distribution-of-time-by-resource.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mathpl/logmetrics/75a6835c9f3fe3f7620a5eee7455997c25e70827/images/min-distribution-of-time-by-resource.png -------------------------------------------------------------------------------- /images/percentiles.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mathpl/logmetrics/75a6835c9f3fe3f7620a5eee7455997c25e70827/images/percentiles.png -------------------------------------------------------------------------------- /logmetrics_collector_transform.conf: -------------------------------------------------------------------------------- 1 | { 2 | apache_reverse_proxy: { 3 | files: [ "/var/log/apache/httpd_*.log" ], 4 | 5 | # Filename-based match groups are appended to the end of the line match groups (re) 6 | filename_match: 'httpd_(\w+)\.log$', 7 | 8 | # Apache logs defined with: 9 | # LogFormat "%h\t%l\t%u\t%{%d/%b/%Y:%H:%M:%S %z}t\t%r\t%>s\t%b\t%{Referer}i\t%{User-Agent}i\t%D" 10 | re: [ 11 | '^\S+\t # Client IP \n 12 | \S+\t # Login \n 13 | \S+\t # Remote user \n 14 | (\S+\s\S+)\t # Date 1 \n 15 | (GET|POST|HEAD|COOK)\s+ # HTTP verb 2 \n 16 | (\S+)\s+ # HTTP path 3 \n 17 | HTTP/\d+\.\d+\t # HTTP version \n 18 | (\d+)\t # HTTP final status 4 \n 19 | (\S+)\t # Response size 5 \n 20 | .*\t # Referer \n 21 | .*\t # User-agent \n 22 | (\S+) # Response time in us (%D) 6 \n' 23 | ], 24 | expected_matches: 6, 25 | 26 | date: { 27 | position: 1, 28 | format: "2/Jan/2006:15:04:05 -0700
29 | }, 30 | 31 | # Transformation to apply to match groups. Applied sequentially before anything else. 32 | transform: { 33 | 3: { 34 | # Stop after the first successful replace operation. 35 | replace_only_one: true, 36 | 37 | # Print log lines that get the default value assigned by match_or_default 38 | log_default_assign: false, 39 | 40 | operations: [ 41 | [ 'replace', '^.*/bid', '/bid/' ], # /erroneous/path/bid?params -> /bid/ 42 | [ 'replace', '^/(\w+)/.*$', '/@@1@@/' ], # /buy/now.php -> /buy/ 43 | [ 'replace', '^/search.php', '/search/' ], 44 | [ 'replace', '^/[^/]*$', '/' ], # /favicon.ico -> / 45 | 46 | # If the match fails, use the default value instead 47 | [ 'match_or_default', '^/(|(|bid|buy|search|list|category|sitemap|info)/)$', 'other' ] # Whitelist 48 | ] 49 | }, 50 | 4: { 51 | operations: [ 52 | [ 'replace', '^-$', '0'] # Equivalent to s/^-$/0/ 53 | ] 54 | }, 55 | 5: { 56 | operations: [ 57 | [ 'replace', '^-$', '0'] 58 | ] 59 | }, 60 | 6: { 61 | operations: [ 62 | [ 'replace', '^-$', '0'] 63 | ] 64 | } 65 | }, 66 | 67 | key_prefix: 'apache', 68 | # General tag-position lookup 69 | tags: {verb: 2, 70 | path: 3, 71 | status: 4, 72 | site: 7, 73 | product: "myCoolProduct" 74 | }, 75 | 76 | metrics: { 77 | meter: [ 78 | { key_suffix: "executions", 79 | reference: [ 80 | [0, ""] #When pos=0, simply increment the counter by 1 81 | ] 82 | } 83 | ], 84 | histogram: [ 85 | { key_suffix: "response_size.byte", 86 | reference: [ 87 | [5, ""] 88 | ] 89 | }, 90 | { key_suffix: "response_time.ms", 91 | divide: 1000, 92 | reference: [ 93 | [6, ""] 94 | ] 95 | } 96 | ] 97 | }, 98 | 99 | histogram_size: 256, 100 | histogram_alpha_decay: 0.15, 101 | histogram_rescale_threshold_min: 10, 102 | 103 | #Minimum interval between EWMA calculations 104 | ewma_interval: 5, 105 | 106 | # Enable removal of metrics that haven't been updated for X amount of time. Defaults to false. 
107 | stale_removal: true, 108 | 109 | # Metric will be dropped if no new update has been received within that time 110 | stale_treshold_min: 10, 111 | 112 | # Send metric even if it hasn't changed. Useful for near real-time graphs. Defaults to false. 113 | send_duplicates: true, 114 | 115 | #Split workload across multiple goroutines to scale across CPUs 116 | goroutines: 1, 117 | 118 | #Poll the file instead of using inotify 119 | poll_file: true, 120 | 121 | #Push data to TSD at this interval, in seconds 122 | interval: 15, 123 | warn_on_regex_fail: true, 124 | warn_on_operation_fail: true, 125 | warn_on_out_of_order_time: false, 126 | log_stale_metrics: true, 127 | parse_from_start: true #Dev setting 128 | }, 129 | 130 | settings: { 131 | poll_interval: 15, 132 | 133 | log_facility: "local3", 134 | 135 | push_port: 4242, 136 | push_host: "tsd.my.network", 137 | push_proto: "tcp", 138 | push_type: "tsd", 139 | # Number of parallel pushers. 140 | push_number: 1, 141 | 142 | # Seconds to wait before retrying when unable to push 143 | push_wait: 15, 144 | 145 | #Number of seconds between logging pushed stats. 
To be moved to tsd 146 | stats_interval: 60, 147 | } 148 | } 149 | -------------------------------------------------------------------------------- /logtail.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "path/filepath" 7 | "time" 8 | 9 | "github.com/mathpl/tail" 10 | ) 11 | 12 | type tailer struct { 13 | ts tailStats 14 | filename string 15 | channel_number int 16 | tsd_pusher chan []string 17 | 18 | lg *logGroup 19 | 20 | Bye chan bool 21 | } 22 | 23 | type tailStats struct { 24 | line_read int64 25 | byte_read int64 26 | line_match int64 27 | last_report time.Time 28 | hostname string 29 | filename string 30 | log_group string 31 | interval int 32 | } 33 | 34 | type lineResult struct { 35 | filename string 36 | matches []string 37 | } 38 | 39 | func (ts *tailStats) isTimeForStats() bool { 40 | return time.Now().Sub(ts.last_report) > time.Duration(ts.interval)*time.Second 41 | } 42 | 43 | func (ts *tailStats) incLineMatch() { 44 | ts.line_match++ 45 | } 46 | 47 | func (ts *tailStats) incLine(line string) { 48 | ts.line_read++ 49 | ts.byte_read += int64(len(line)) 50 | } 51 | 52 | func (ts *tailStats) getTailStatsKey() []string { 53 | t := time.Now() 54 | 55 | ts.last_report = t 56 | 57 | line := make([]string, 3) 58 | line[0] = fmt.Sprintf("logmetrics_collector.tail.line_read %d %d host=%s log_group=%s filename=%s", t.Unix(), ts.line_read, ts.hostname, ts.log_group, ts.filename) 59 | line[1] = fmt.Sprintf("logmetrics_collector.tail.byte_read %d %d host=%s log_group=%s filename=%s", t.Unix(), ts.byte_read, ts.hostname, ts.log_group, ts.filename) 60 | line[2] = fmt.Sprintf("logmetrics_collector.tail.line_matched %d %d host=%s log_group=%s filename=%s", t.Unix(), ts.line_match, ts.hostname, ts.log_group, ts.filename) 61 | 62 | return line 63 | 64 | } 65 | 66 | func (t *tailer) tailFile() { 67 | t.ts = tailStats{last_report: time.Now(), hostname: getHostname(), 68 | 
filename: t.filename, log_group: t.lg.name, interval: t.lg.interval} 69 | 70 | maxMatches := t.lg.expected_matches + 1 71 | 72 | var filename_matches []string 73 | if t.lg.filename_match_re != nil { 74 | m := t.lg.filename_match_re.MatcherString(t.filename, 0) 75 | filename_matches = m.ExtractString()[1:] 76 | } 77 | 78 | //Seek to the end of the file unless configured to parse from the start 79 | seekParam := 2 80 | if t.lg.parse_from_start { 81 | seekParam = 0 82 | } 83 | 84 | loc := tail.SeekInfo{0, seekParam} 85 | 86 | maxLineSize := 2048 87 | 88 | tail, err := tail.TailFile(t.filename, tail.Config{Location: &loc, Follow: true, ReOpen: true, Poll: t.lg.poll_file, MaxLineSize: maxLineSize}) 89 | 90 | if err != nil { 91 | log.Fatalf("Unable to tail %s: %s", t.filename, err) 92 | return 93 | } 94 | log.Printf("Tailing %s data to datapool[%s:%d]", t.filename, t.lg.name, t.channel_number) 95 | 96 | line_overflow := false 97 | 98 | //FIXME: Bug in ActiveTail can get partial lines 99 | for { 100 | select { 101 | case line, ok := <-tail.Lines: 102 | if !ok { 103 | err := tail.Err() 104 | if err != nil { 105 | log.Printf("Tail on %s ended with error: %v", t.filename, err) 106 | } else { 107 | log.Printf("Tail on %s ended early", t.filename) 108 | } 109 | return 110 | } 111 | if line == nil { 112 | log.Printf("tail.Lines on %s returned a nil line; this should not happen", t.filename) 113 | continue 114 | } 115 | 116 | //Skip the remainder of lines longer than maxLineSize 117 | if line_overflow { 118 | line_overflow = (len(line.Text) == maxLineSize) 119 | continue 120 | } 121 | 122 | line_overflow = (len(line.Text) == maxLineSize) 123 | 124 | //Test out all the regexp, pick the first one that matches 125 | match_one := false 126 | for _, re := range t.lg.re { 127 | m := re.MatcherString(line.Text, 0) 128 | matches := m.ExtractString() 129 | if len(matches) == maxMatches { 130 | match_one = true 131 | if filename_matches != nil { 132 | matches = append(matches, filename_matches[:]...) 
133 | } 134 | 135 | results := lineResult{t.filename, matches} 136 | t.lg.tail_data[t.channel_number] <- results 137 | t.ts.incLineMatch() 138 | break 139 | } 140 | } 141 | 142 | t.ts.incLine(line.Text) 143 | 144 | if t.lg.fail_regex_warn && !match_one { 145 | log.Printf("Regexp match failed on %s, expected %d matches: %s", t.filename, maxMatches, line.Text) 146 | } 147 | 148 | if (t.ts.line_read%100) == 0 && t.ts.isTimeForStats() { 149 | t.tsd_pusher <- t.ts.getTailStatsKey() 150 | } 151 | case <-t.Bye: 152 | log.Printf("Tailer for %s stopped.", t.filename) 153 | return 154 | } 155 | } 156 | } 157 | 158 | type filenamePoller struct { 159 | lg *logGroup 160 | poll_interval int 161 | tsd_pushers []chan []string 162 | push_number int 163 | 164 | Bye chan bool 165 | } 166 | 167 | func (fp *filenamePoller) startFilenamePoller() { 168 | log.Printf("Filename poller for %s started", fp.lg.name) 169 | log.Printf("Using the following regexp for log group %s: %s", fp.lg.name, fp.lg.strRegexp) 170 | 171 | rescanFiles := make(chan bool, 1) 172 | go func() { 173 | rescanFiles <- true 174 | for { 175 | time.Sleep(time.Duration(fp.poll_interval) * time.Second) 176 | rescanFiles <- true 177 | } 178 | }() 179 | 180 | currentFiles := make(map[string]bool) 181 | channel_number := 0 182 | pusher_channel_number := 0 183 | 184 | allTailers := make([]*tailer, 0) 185 | for { 186 | select { 187 | case <-rescanFiles: 188 | newFiles := make(map[string]bool) 189 | for _, glob := range fp.lg.globFiles { 190 | files, err := filepath.Glob(glob) 191 | if err != nil { 192 | log.Fatalf("Unable to find files for log group %s: %s", fp.lg.name, err) 193 | } 194 | 195 | for _, v := range files { 196 | newFiles[v] = true 197 | } 198 | } 199 | 200 | //Check only the diff, missing files will automatically be dropped 201 | //by their goroutine 202 | for file, _ := range newFiles { 203 | if ok := currentFiles[file]; ok { 204 | delete(newFiles, file) 205 | } 206 | } 207 | 208 | //Start tailing new files! 
209 | for file, _ := range newFiles { 210 | bye := make(chan bool) 211 | t := tailer{filename: file, channel_number: channel_number, lg: fp.lg, Bye: bye, 212 | tsd_pusher: fp.tsd_pushers[pusher_channel_number]} 213 | go t.tailFile() 214 | allTailers = append(allTailers, &t) 215 | 216 | channel_number = (channel_number + 1) % fp.lg.goroutines 217 | pusher_channel_number = (pusher_channel_number + 1) % fp.push_number 218 | 219 | currentFiles[file] = true 220 | } 221 | case <-fp.Bye: 222 | for _, t := range allTailers { 223 | go func() { t.Bye <- true }() 224 | } 225 | log.Printf("Filename poller for %s stopped", fp.lg.name) 226 | return 227 | } 228 | } 229 | } 230 | 231 | func StartTails(config *Config, tsd_pushers []chan []string) []*filenamePoller { 232 | filenamePollers := make([]*filenamePoller, 0) 233 | for _, logGroup := range config.logGroups { 234 | bye := make(chan bool) 235 | f := filenamePoller{lg: logGroup, poll_interval: config.pollInterval, tsd_pushers: tsd_pushers, push_number: config.GetPusherNumber(), Bye: bye} 236 | filenamePollers = append(filenamePollers, &f) 237 | go f.startFilenamePoller() 238 | } 239 | 240 | return filenamePollers 241 | } 242 | -------------------------------------------------------------------------------- /main/logmetrics_collector.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "flag" 5 | "log" 6 | "log/syslog" 7 | "os" 8 | "os/signal" 9 | "runtime" 10 | "runtime/pprof" 11 | "syscall" 12 | "time" 13 | 14 | "github.com/mathpl/logmetrics" 15 | ) 16 | 17 | var configFile = flag.String("c", "/etc/logmetrics_collector.conf", "Full path to config file.") 18 | var threads = flag.Int("j", 1, "Thread count.") 19 | var logToConsole = flag.Bool("d", false, "Print to console.") 20 | var doNotSend = flag.Bool("D", false, "Print data instead of sending over network.") 21 | var profile = flag.String("P", "", "Create a pprof file with this filename.") 22 | 23 | func 
main() { 24 | //Process execution flags 25 | flag.Parse() 26 | 27 | var pf *os.File 28 | if *profile != "" { 29 | var err error 30 | pf, err = os.Create(*profile) 31 | if err != nil { 32 | log.Fatal(err) 33 | } 34 | 35 | log.Print("Starting profiler") 36 | pprof.StartCPUProfile(pf) 37 | } 38 | 39 | //Channel to stop the program 40 | stop := make(chan bool) 41 | 42 | //Signal handling 43 | sigc := make(chan os.Signal, 1) 44 | signal.Notify(sigc, 45 | syscall.SIGINT, 46 | syscall.SIGTERM, 47 | syscall.SIGQUIT) 48 | go func() { 49 | s := <-sigc 50 | log.Printf("Received signal: %s", s) 51 | stop <- true 52 | }() 53 | 54 | //Set the number of real threads to start 55 | runtime.GOMAXPROCS(*threads) 56 | 57 | //Config 58 | config := logmetrics.LoadConfig(*configFile) 59 | 60 | //Logger 61 | logger, err := syslog.New(config.GetSyslogFacility(), "logmetrics_collector") 62 | if err != nil { 63 | log.Fatal(err) 64 | } 65 | defer logger.Close() 66 | 67 | if !*logToConsole { 68 | log.SetOutput(logger) 69 | } else { 70 | log.SetFlags(0) 71 | } 72 | 73 | //Start the out channels 74 | tsd_pushers := make([]chan []string, config.GetPusherNumber()) 75 | for i := 0; i < config.GetPusherNumber(); i++ { 76 | tsd_pushers[i] = make(chan []string, 1000) 77 | } 78 | 79 | //Start log tails 80 | fps := logmetrics.StartTails(&config, tsd_pushers) 81 | 82 | //Start datapools 83 | dps := logmetrics.StartDataPools(&config, tsd_pushers) 84 | 85 | //Start TSD pusher 86 | ps := logmetrics.StartTsdPushers(&config, tsd_pushers, *doNotSend) 87 | 88 | //Block until we're told to stop 89 | <-stop 90 | 91 | log.Print("Stopping all goroutines...") 92 | 93 | //Stop file checkers 94 | for _, fp := range fps { 95 | fp.Bye <- true 96 | } 97 | 98 | //Stop data pools 99 | for _, dp := range dps { 100 | dp.Bye <- true 101 | } 102 | 103 | //Stop tsd pushers 104 | for _, ps := range ps { 105 | ps.Bye <- true 106 | } 107 | 108 | if *profile != "" { 109 | pprof.StopCPUProfile() 110 | pf.Close() 111 | 
log.Print("Stopped profiler") 112 | time.Sleep(time.Duration(10 * time.Second)) 113 | } 114 | 115 | log.Print("All stopped") 116 | } 117 | -------------------------------------------------------------------------------- /metrilyx2.internals.json: -------------------------------------------------------------------------------- 1 | {"_id": "logmetrics_collector_internals", "layout": [[[{"graphs": [{"name": "Line read", "series": [{"alias": "logmetrics_collector.tail.line_read", "yTransform": "", "query": {"aggregator": "sum", "metric": "logmetrics_collector.tail.line_read", "rate": true, "tags": {"cluster": "*"}}}], "thresholds": {"info": 999999999999999, "warning": 10000000000000000, "danger": 1000000000000000000}, "graphType": "line", "_id": "8b9eb73462c64043aba338041e98cae0", "size": "medium"}, {"name": "Line matched", "series": [{"alias": "logmetrics_collector.tail.match", "yTransform": "", "query": {"aggregator": "sum", "metric": "logmetrics_collector.tail.match", "rate": true, "tags": {"cluster": "*"}}}], "thresholds": {"info": 999999999999999, "warning": 10000000000000000, "danger": 1000000000000000000}, "graphType": "line", "_id": "579d3705ec754a4c8c8e9f7c1bec9c5b", "size": "medium"}], "name": "Tailer", "orientation": "vertical"}], [{"graphs": [{"name": "", "series": [{"alias": "logmetrics_collector.data_pool.key_tracked", "yTransform": "", "query": {"aggregator": "sum", "metric": "logmetrics_collector.data_pool.key_tracked", "rate": false, "tags": {"cluster": "*"}}}], "thresholds": {"info": 999999999999999, "warning": 10000000000000000, "danger": 1000000000000000000}, "graphType": "line", "_id": "745bac72022b4f398df3aa5583c809b4", "size": "medium"}], "name": "Datapool", "orientation": "vertical"}, {"graphs": [{"name": "Key sent", "series": [{"alias": "logmetrics_collector.pusher.key_sent", "yTransform": "", "query": {"aggregator": "sum", "metric": "logmetrics_collector.pusher.key_sent", "rate": true, "tags": {"cluster": "*"}}}], "thresholds": {"info": 
999999999999999, "warning": 10000000000000000, "danger": 1000000000000000000}, "graphType": "line", "_id": "b9fe6be0dc28401dba11bf799c46928f", "size": "medium"}], "name": "Pusher", "orientation": "vertical"}]]], "name": "logmetrics internals", "description": ""} -------------------------------------------------------------------------------- /metrilyx2.psstat.json: -------------------------------------------------------------------------------- 1 | {"_id": "logmetrics_collector_psstat", "layout": [[[{"graphs": [{"name": "Read IO", "series": [{"alias": "proc.stat.ps.io", "yTransform": "", "query": {"aggregator": "sum", "metric": "proc.stat.ps.io", "rate": true, "tags": {"cluster": "*", "image": "logmetrics_collector", "type": "rchar"}}}], "thresholds": {"info": "400000", "warning": "600000", "danger": "800000"}, "graphType": "line", "_id": "291627ff03c6412e97414456d3a7919b", "size": "medium"}, {"name": "Write IO", "series": [{"alias": "proc.stat.ps.io", "yTransform": "", "query": {"aggregator": "sum", "metric": "proc.stat.ps.io", "rate": true, "tags": {"cluster": "*", "image": "logmetrics_collector", "type": "wchar"}}}], "thresholds": {"info": "100000", "warning": "150000", "danger": "200000"}, "graphType": "line", "_id": "76e7e1d3ea48447a9a46c1695aa3c536", "size": "medium"}], "name": "IO", "orientation": "horizontal"}]], [[{"graphs": [{"name": "Mem usage", "series": [{"alias": "proc.stat.ps.mem", "yTransform": "", "query": {"aggregator": "sum", "metric": "proc.stat.ps.mem", "rate": false, "tags": {"image": "logmetrics_collector", "type": "resident", "host": "*"}}}], "thresholds": {"info": "100000000", "warning": "150000000", "danger": "200000000"}, "graphType": "line", "_id": "b40572956ff14b369b9e9d68024f7cbd", "size": "medium"}, {"name": "FD open", "series": [{"alias": "proc.stat.ps.fd", "yTransform": "", "query": {"aggregator": "sum", "metric": "proc.stat.ps.fd", "rate": false, "tags": {"image": "logmetrics_collector", "type": "open", "host": "*"}}}], 
"thresholds": {"info": "125", "warning": "150", "danger": "200"}, "graphType": "line", "_id": "75a75d3dd1c84ad98664f09a97616e32", "size": "medium"}], "name": "Resources", "orientation": "horizontal"}, {"graphs": [{"name": "Process CPU", "series": [{"alias": "proc.stat.ps.pcpu", "yTransform": "", "query": {"aggregator": "sum", "metric": "proc.stat.ps.pcpu", "rate": false, "tags": {"image": "logmetrics_collector", "host": "*"}}}], "thresholds": {"info": 999999999999999, "warning": 10000000000000000, "danger": 1000000000000000000}, "graphType": "line", "_id": "228b463f7f21497e99d1d0a0cd201883", "size": "medium"}], "name": "CPU", "orientation": "vertical"}]]], "name": "logmetrics psstat", "description": ""} -------------------------------------------------------------------------------- /parsertest.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import ( 4 | "bufio" 5 | "fmt" 6 | "log" 7 | "os" 8 | "path/filepath" 9 | "time" 10 | ) 11 | 12 | type readStats struct { 13 | line_read int64 14 | line_matched int64 15 | byte_pushed int64 16 | last_report time.Time 17 | } 18 | 19 | func (f *readStats) inc(matched bool, data_read int) { 20 | f.line_read++ 21 | if matched { 22 | f.line_matched++ 23 | } 24 | f.byte_pushed += int64(data_read) 25 | } 26 | 27 | func (f *readStats) getStats() string { 28 | line_sec := int(f.line_read / int64(time.Now().Sub(f.last_report)/time.Second)) 29 | match_sec := int(f.line_matched / int64(time.Now().Sub(f.last_report)/time.Second)) 30 | mbyte_sec := float64(f.byte_pushed) / 1024 / 1024 / float64(time.Now().Sub(f.last_report)/time.Second) 31 | 32 | f.line_read = 0 33 | f.line_matched = 0 34 | f.byte_pushed = 0 35 | f.last_report = time.Now() 36 | 37 | return fmt.Sprintf("%d line/s %d match/s %.3f Mb/s.", 38 | line_sec, match_sec, mbyte_sec) 39 | } 40 | 41 | func (f *readStats) isTimeForStats(interval int) bool { 42 | return (time.Now().Sub(f.last_report) > 
time.Duration(interval)*time.Second) 43 | } 44 | 45 | func parserTest(filename string, lg *logGroup, perfInfo bool) { 46 | maxMatches := lg.expected_matches + 1 47 | 48 | file, err := os.Open(filename) 49 | if err != nil { 50 | log.Fatalf("Unable to open %s: %s", filename, err) 51 | return 52 | } 53 | 54 | scanner := bufio.NewScanner(file) 55 | 56 | log.Printf("Parsing %s", filename) 57 | 58 | read_stats := readStats{last_report: time.Now()} 59 | for scanner.Scan() { 60 | line := scanner.Text() 61 | 62 | //Test all the regexps, flag the line if any of them matches 63 | match_one := false 64 | for _, re := range lg.re { 65 | m := re.MatcherString(line, 0) 66 | matches := m.ExtractString() 67 | if len(matches) == maxMatches { 68 | 69 | match_one = true 70 | } 71 | } 72 | 73 | read_stats.inc(match_one, len(line)) 74 | 75 | if lg.fail_regex_warn && !match_one { 76 | log.Printf("Regexp match failed on %s, expected %d matches: %s", filename, maxMatches, line) 77 | } 78 | 79 | if read_stats.isTimeForStats(1) { 80 | log.Print(read_stats.getStats()) 81 | } 82 | } 83 | 84 | log.Printf("Finished parsing %s.", filename) 85 | } 86 | 87 | func startlogGroupParserTest(logGroup *logGroup, perfInfo bool) { 88 | 89 | newFiles := make(map[string]bool) 90 | for _, glob := range logGroup.globFiles { 91 | files, err := filepath.Glob(glob) 92 | if err != nil { 93 | log.Fatalf("Unable to find files for log group %s: %s", logGroup.name, err) 94 | } 95 | 96 | for _, v := range files { 97 | newFiles[v] = true 98 | } 99 | } 100 | 101 | //Parse each file found
102 | for file := range newFiles { 103 | parserTest(file, logGroup, perfInfo) 104 | } 105 | 106 | } 107 | 108 | func StartParserTest(config *Config, selectedlogGroup string, perfInfo bool) { 109 | for logGroupName, logGroup := range config.logGroups { 110 | if selectedlogGroup == "" || logGroupName == selectedlogGroup { 111 | startlogGroupParserTest(logGroup, perfInfo) 112 | } 113 | } 114 | } 115 | -------------------------------------------------------------------------------- /parsertest/logmetrics_parsertest.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "flag" 5 | //"log" 6 | "runtime" 7 | "github.com/mathpl/logmetrics" 8 | ) 9 | 10 | var configFile = flag.String("c", "/etc/logmetrics_collector.conf", "Full path to config file.") 11 | var logGroup = flag.String("l", "", "Log group to test. (Default: all)") 12 | var perfInfo = flag.Bool("p", false, "Print parser performance info. (Default: false)") 13 | var threads = flag.Int("j", 1, "Thread count.") 14 | 15 | func main() { 16 | //Process execution flags 17 | flag.Parse() 18 | 19 | //Set the number of real threads to start 20 | runtime.GOMAXPROCS(*threads) 21 | 22 | //Config 23 | config := logmetrics.LoadConfig(*configFile) 24 | 25 | //Start log parsing 26 | logmetrics.StartParserTest(&config, *logGroup, *perfInfo) 27 | } 28 | -------------------------------------------------------------------------------- /syslog_helper.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import "log/syslog" 4 | 5 | var facilityStrings = map[string]syslog.Priority{ 6 | "kern": syslog.LOG_KERN, 7 | "user": syslog.LOG_USER, 8 | "mail": syslog.LOG_MAIL, 9 | "daemon": syslog.LOG_DAEMON, 10 | "auth": syslog.LOG_AUTH, 11 | "syslog": syslog.LOG_SYSLOG, 12 | "lpr": syslog.LOG_LPR, 13 | "news": syslog.LOG_NEWS, 14 | "uucp": syslog.LOG_UUCP, 15 | "cron": syslog.LOG_CRON, 16 | "authpriv":
syslog.LOG_AUTHPRIV, 17 | "ftp": syslog.LOG_FTP, 18 | "local0": syslog.LOG_LOCAL0, 19 | "local1": syslog.LOG_LOCAL1, 20 | "local2": syslog.LOG_LOCAL2, 21 | "local3": syslog.LOG_LOCAL3, 22 | "local4": syslog.LOG_LOCAL4, 23 | "local5": syslog.LOG_LOCAL5, 24 | "local6": syslog.LOG_LOCAL6, 25 | "local7": syslog.LOG_LOCAL7, 26 | } 27 | -------------------------------------------------------------------------------- /transform.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import ( 4 | "bytes" 5 | "log" 6 | "os" 7 | 8 | "github.com/mathpl/golang-pkg-pcre/src/pkg/pcre" 9 | "github.com/metakeule/replacer" 10 | ) 11 | 12 | type transform struct { 13 | replace_only_one bool 14 | log_default_assign bool 15 | 16 | ops []interface{} 17 | } 18 | 19 | type replace struct { 20 | str string 21 | repl []byte 22 | matcher *pcre.Regexp 23 | replacer replacer.Replacer 24 | } 25 | 26 | type match_or_default struct { 27 | str string 28 | default_val string 29 | matcher *pcre.Regexp 30 | } 31 | 32 | func (r *replace) init(regexp string, template string) { 33 | matcher := pcre.MustCompile(regexp, 0) 34 | r.matcher = &matcher 35 | 36 | r.replacer = replacer.New() 37 | r.replacer.Parse([]byte(template)) 38 | } 39 | 40 | func (m *match_or_default) init(regexp string, default_val string) { 41 | matcher := pcre.MustCompile(regexp, 0) 42 | m.matcher = &matcher 43 | m.default_val = default_val 44 | } 45 | 46 | func (t *transform) apply(data string) string { 47 | //Track matches across ops so replace_only_one can short-circuit 48 | got_match := false 49 | for _, operation := range t.ops { 50 | switch op := operation.(type) { 51 | case replace: 52 | if !t.replace_only_one || !got_match { 53 | m := op.matcher.MatcherString(data, 0) 54 | if m.Matches() { 55 | got_match = true 56 | var buf bytes.Buffer 57 | replace_map := build_replace_map(m.ExtractString()) 58 | op.replacer.Replace(&buf, replace_map) 59 | data = buf.String() 60 | } 61 | } 62 | case match_or_default:
63 | m := op.matcher.Matcher([]byte(data), 0) 64 | if !m.Matches() { 65 | if t.log_default_assign { 66 | log.Printf("Assigning default value to: %s", data) 67 | } 68 | data = op.default_val 69 | } 70 | } 71 | } 72 | 73 | return data 74 | } 75 | 76 | func parseTransform(conf map[interface{}]interface{}) map[int]transform { 77 | transforms := make(map[int]transform) 78 | 79 | for position, setting := range conf { 80 | switch s := setting.(type) { 81 | case map[interface{}]interface{}: 82 | var transform transform 83 | 84 | var ok bool 85 | if transform.replace_only_one, ok = s["replace_only_one"].(bool); !ok { 86 | transform.replace_only_one = false 87 | } 88 | if transform.log_default_assign, ok = s["log_default_assign"].(bool); !ok { 89 | transform.log_default_assign = false 90 | } 91 | 92 | var operations []interface{} 93 | if operations, ok = s["operations"].([]interface{}); ok { 94 | for _, args := range operations { 95 | 96 | var str_args []string 97 | // Convert to []string 98 | for _, arg := range args.([]interface{}) { 99 | str_args = append(str_args, arg.(string)) 100 | } 101 | 102 | switch str_args[0] { 103 | case "replace": 104 | var r replace 105 | r.init(str_args[1], str_args[2]) 106 | transform.ops = append(transform.ops, r) 107 | case "match_or_default": 108 | var m match_or_default 109 | m.init(str_args[1], str_args[2]) 110 | transform.ops = append(transform.ops, m) 111 | } 112 | } 113 | } else { 114 | log.Print("No operation under transform group.") 115 | os.Exit(1) 116 | } 117 | 118 | transforms[position.(int)] = transform 119 | } 120 | } 121 | 122 | return transforms 123 | } 124 | -------------------------------------------------------------------------------- /tsdpusher.go: -------------------------------------------------------------------------------- 1 | package logmetrics 2 | 3 | import ( 4 | "fmt" 5 | "log" 6 | "net" 7 | "time" 8 | ) 9 | 10 | type pusher struct { 11 | cfg *Config 12 | tsd_push chan []string 13 | do_not_send bool 14 |
channel_number int 15 | hostname string 16 | key_push_stats keyPushStats 17 | 18 | Bye chan bool 19 | } 20 | 21 | type keyPushStats struct { 22 | key_pushed int64 23 | byte_pushed int64 24 | last_report time.Time 25 | hostname string 26 | interval int 27 | pusher_number int 28 | } 29 | 30 | func (f *keyPushStats) inc(data_written int) { 31 | f.key_pushed++ 32 | f.byte_pushed += int64(data_written) 33 | } 34 | 35 | func (f *keyPushStats) getLine() []string { 36 | t := time.Now() 37 | 38 | f.last_report = t 39 | 40 | line := make([]string, 2) 41 | line[0] = fmt.Sprintf("logmetrics_collector.pusher.key_sent %d %d host=%s pusher_number=%d", t.Unix(), f.key_pushed, f.hostname, f.pusher_number) 42 | line[1] = fmt.Sprintf("logmetrics_collector.pusher.byte_sent %d %d host=%s pusher_number=%d", t.Unix(), f.byte_pushed, f.hostname, f.pusher_number) 43 | 44 | return line 45 | } 46 | 47 | func (f *keyPushStats) isTimeForStats() bool { 48 | return time.Now().Sub(f.last_report) > time.Duration(f.interval)*time.Second 49 | } 50 | 51 | func writeLine(config *Config, do_not_send bool, conn net.Conn, line string) (int, net.Conn) { 52 | if config.pushType == "tsd" { 53 | line = ("put " + line + "\n") 54 | } else { 55 | line = line + "\n" //no "put" prefix, but still newline-delimited 56 | } 57 | 58 | byte_line := []byte(line) 59 | byte_written := len(byte_line) 60 | 61 | var err error 62 | if do_not_send { 63 | fmt.Print(line) 64 | } else { 65 | for { 66 | //Reconnect if needed 67 | if conn == nil { 68 | target := config.GetTsdTarget() 69 | log.Printf("Reconnecting to %s", target) 70 | 71 | if conn, err = net.Dial(config.pushProto, target); err != nil { 72 | log.Printf("Unable to reconnect: %s", err) 73 | time.Sleep(time.Duration(config.pushWait) * time.Second) 74 | } 75 | } 76 | 77 | if conn != nil { 78 | _, err = conn.Write(byte_line) 79 | 80 | if err != nil { 81 | log.Printf("Error writing data: %s", err) 82 | conn.Close() 83 | conn = nil 84 | time.Sleep(time.Duration(config.pushWait) * time.Second) 85 | } else { 86 | break
87 | } 88 | } 89 | 90 | } 91 | } 92 | 93 | return byte_written, conn 94 | } 95 | 96 | func (p *pusher) start() { 97 | log.Printf("TsdPusher[%d] started. Pushing keys to %s:%d over %s in %s format", p.channel_number, p.cfg.pushHost, 98 | p.cfg.pushPort, p.cfg.pushProto, p.cfg.pushType) 99 | 100 | p.key_push_stats = keyPushStats{last_report: time.Now(), hostname: p.hostname, interval: p.cfg.stats_interval, pusher_number: p.channel_number} 101 | 102 | var conn net.Conn 103 | for { 104 | select { 105 | case keys := <-p.tsd_push: 106 | for _, line := range keys { 107 | var bytes_written int 108 | bytes_written, conn = writeLine(p.cfg, p.do_not_send, conn, line) 109 | 110 | p.key_push_stats.inc(bytes_written) 111 | 112 | //Stats on key pushed, limit checks with modulo (now() is a syscall) 113 | if (p.key_push_stats.key_pushed%100) == 0 && p.key_push_stats.isTimeForStats() { 114 | for _, local_line := range p.key_push_stats.getLine() { 115 | bytes_written, conn = writeLine(p.cfg, p.do_not_send, conn, local_line) 116 | p.key_push_stats.inc(bytes_written) 117 | } 118 | } 119 | } 120 | case <-p.Bye: 121 | log.Printf("TsdPusher[%d] stopped.", p.channel_number) 122 | return 123 | } 124 | } 125 | } 126 | 127 | func StartTsdPushers(config *Config, tsd_pushers []chan []string, do_not_send bool) []*pusher { 128 | if config.pushPort == 0 { 129 | return nil 130 | } 131 | 132 | hostname := getHostname() 133 | 134 | allPushers := make([]*pusher, 0) 135 | for i, _ := range tsd_pushers { 136 | channel_number := i 137 | 138 | tsd_push := tsd_pushers[channel_number] 139 | bye := make(chan bool) 140 | p := pusher{cfg: config, tsd_push: tsd_push, hostname: hostname, do_not_send: do_not_send, channel_number: channel_number, Bye: bye} 141 | go p.start() 142 | allPushers = append(allPushers, &p) 143 | } 144 | 145 | return allPushers 146 | } 147 | -------------------------------------------------------------------------------- /utils/etc/init.d/logmetrics_collector: 
-------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # logmetrics_collector Startup script for the logmetrics_collector monitoring agent 4 | # 5 | # chkconfig: 2345 15 85 6 | # description: logmetrics_collector is an agent that collects and reports \ 7 | # monitoring data from logs for OpenTSDB. 8 | # processname: logmetrics_collector 9 | # pidfile: /var/run/logmetrics_collector.pid 10 | # 11 | ### BEGIN INIT INFO 12 | # Provides: logmetrics_collector 13 | # Required-Start: $local_fs $remote_fs $network $named 14 | # Required-Stop: $local_fs $remote_fs $network 15 | # Short-Description: start and stop logmetrics_collector log monitoring agent 16 | # Description: logmetrics_collector is an agent that collects and reports 17 | # monitoring data from logs for OpenTSDB. 18 | ### END INIT INFO 19 | 20 | # Source function library. 21 | . /etc/init.d/functions 22 | 23 | LOGMETRICS_COLLECTOR=${LOGMETRICS_COLLECTOR-/usr/local/bin/logmetrics_collector} 24 | PIDFILE=${PIDFILE-/var/run/logmetrics_collector.pid} 25 | 26 | prog=logmetrics_collector 27 | if [ -f /etc/sysconfig/$prog ]; then 28 | . /etc/sysconfig/$prog 29 | fi 30 | 31 | if [ -z "$LOGMETRICS_OPTIONS" ]; then 32 | LOGMETRICS_OPTIONS="-c /etc/logmetrics_collector.conf" 33 | fi 34 | 35 | sanity_check() { 36 | for i in "$PIDFILE"; do 37 | # If the file doesn't exist, check that we have write access to its parent 38 | # directory to be able to create it. 39 | test -e "$i" || i=`dirname "$i"` 40 | test -w "$i" || { 41 | echo >&2 "error: Cannot write to $i" 42 | return 4 43 | } 44 | done 45 | 46 | if [ -z "$LOGMETRICS_USER" ]; then 47 | echo >&2 "error: No \$LOGMETRICS_USER set" 48 | return 4 49 | fi 50 | } 51 | 52 | start() { 53 | echo -n $"Starting $prog: " 54 | sanity_check || return $? 55 | 56 | PID=`cat $PIDFILE 2>/dev/null` 57 | kill -0 $PID 2>/dev/null 58 | if [ $? 
-eq 0 ]; then 59 | echo "Already running"; 60 | return; 61 | fi 62 | 63 | #Perl to the rescue, drop privileges before starting. 64 | nohup perl -e "(undef, undef, \$uid, \$gid ) = getpwnam('$LOGMETRICS_USER'); 65 | $)=\$gid;$>=\$uid; 66 | close(STDOUT); close(STDERR); close(STDIN); 67 | open(LOG,'|/usr/bin/logger -p daemon.info -t logmetrics_collector -i'); 68 | open(STDOUT, '>&LOG'); open(STDERR, '>&LOG'); 69 | exec '$LOGMETRICS_COLLECTOR $LOGMETRICS_OPTIONS';" 2>&1 >/dev/null & 70 | 71 | PID=$! 72 | echo $PID > $PIDFILE 73 | sleep 1 74 | 75 | kill -0 $PID 2>/dev/null 76 | if [ $? -eq 0 ]; then 77 | success 78 | RETVAL=0 79 | else 80 | failure 81 | RETVAL=1 82 | fi 83 | echo 84 | } 85 | 86 | # When stopping logmetrics_collector, allow a delay of ~15 seconds before SIGKILLing the 87 | # process so as to give enough time for logmetrics_collector to SIGKILL any errant 88 | # collectors. 89 | stop() { 90 | echo -n $"Stopping $prog: " 91 | sanity_check || return $? 92 | killproc -p $PIDFILE -d 15 $LOGMETRICS_COLLECTOR 93 | RETVAL=$? 94 | echo 95 | } 96 | 97 | # See how we were called. 98 | case "$1" in 99 | start) start;; 100 | stop) stop;; 101 | status) 102 | status -p $PIDFILE $LOGMETRICS_COLLECTOR 103 | RETVAL=$?
104 | ;; 105 | restart|force-reload|reload) stop && start;; 106 | condrestart|try-restart) 107 | if status -p $PIDFILE $LOGMETRICS_COLLECTOR >&/dev/null; then 108 | stop && start 109 | fi 110 | ;; 111 | *) 112 | echo $"Usage: $prog {start|stop|status|restart|force-reload|reload|condrestart|try-restart}" 113 | RETVAL=2 114 | esac 115 | 116 | exit $RETVAL 117 | -------------------------------------------------------------------------------- /utils/etc/sysconfig/logmetrics_collector: -------------------------------------------------------------------------------- 1 | LOGMETRICS_USER="nobody" 2 | LOGMETRICS_OPTIONS="-j 1 -c /etc/logmetrics_collector.conf" 3 | -------------------------------------------------------------------------------- /utils/logmetrics_collector.spec: -------------------------------------------------------------------------------- 1 | %define name logmetrics_collector 2 | %define tester_name logmetrics_parsertest 3 | %define path /usr/local 4 | %define version 0.4 5 | %define release 9 6 | %define app_path src/github.com/mathpl/logmetrics 7 | %define pcre_version 8.32 8 | 9 | Name: %{name} 10 | Version: %{version} 11 | Release: %{release} 12 | Summary: Log file metrics collector and statistical aggregator for OpenTSDB 13 | Group: System/Monitoring 14 | License: GPL 15 | Source0: /source/%{name}/%{name}-%{version}.src.tgz 16 | Source1: /source/%{name}/pcre-8.32.tar.gz 17 | Source2: /source/%{name} 18 | Requires: tcollector 19 | BuildRequires: go-devel = 1.2 20 | BuildRoot: /build/%{name}-%{version}-%{release} 21 | AutoReqProv: no 22 | 23 | %description 24 | Parses log files containing performance data, computes statistics and outputs them 25 | to TSD or tcollector while using limited resources.
26 | 27 | %prep 28 | rm -rf $RPM_BUILD_DIR/%{name}-%{version}-%{release} 29 | mkdir -p $RPM_BUILD_DIR/%{name}-%{version}-%{release} 30 | tar xvzf %{SOURCE0} -C $RPM_BUILD_DIR/%{name}-%{version}-%{release} 31 | tar xvzf %{SOURCE1} -C $RPM_BUILD_DIR/ 32 | cp %{SOURCE2} $RPM_BUILD_DIR/%{name}-%{version}-%{release}/ 33 | 34 | #%post 35 | #if [ "$1" = 1 ]; then 36 | # chkconfig --add logmetrics_collector 37 | # chkconfig logmetrics_collector on 38 | # service logmetrics_collector start 39 | #fi 40 | 41 | #%preun 42 | #if [ "$1" = 0 ]; then 43 | # service logmetrics_collector stop 44 | # chkconfig logmetrics_collector off 45 | # chkconfig --del logmetrics_collector 46 | #fi 47 | 48 | %build 49 | #First build pcre to enable static linking 50 | cd $RPM_BUILD_DIR/pcre-%{pcre_version}/ 51 | CFLAGS="-fPIC" CXXFLAGS="-fPIC" ./configure 52 | make 53 | 54 | export GOPATH=$RPM_BUILD_DIR/%{name}-%{version}-%{release} \ 55 | GOROOT=/usr/local/go 56 | 57 | cd $RPM_BUILD_DIR/%{name}-%{version}-%{release}/%{app_path}/main 58 | cp $RPM_BUILD_DIR/pcre-%{pcre_version}/.libs/*.a . 59 | CGO_LDFLAGS="-lpcre -L`pwd`" CGO_CFLAGS="-I$RPM_BUILD_DIR/pcre-%{pcre_version}/" \ 60 | /usr/local/go/bin/go build -o %{name} logmetrics_collector.go 61 | 62 | cd $RPM_BUILD_DIR/%{name}-%{version}-%{release}/%{app_path}/parsertest 63 | cp $RPM_BUILD_DIR/pcre-%{pcre_version}/.libs/*.a . 
64 | CGO_LDFLAGS="-lpcre -L`pwd`" CGO_CFLAGS="-I$RPM_BUILD_DIR/pcre-%{pcre_version}/" \ 65 | /usr/local/go/bin/go build -o %{tester_name} %{tester_name}.go 66 | 67 | %install 68 | %{__mkdir_p} ${RPM_BUILD_ROOT}/usr/local/bin/ 69 | %{__cp} $RPM_BUILD_DIR/%{name}-%{version}-%{release}/%{app_path}/main/%{name} ${RPM_BUILD_ROOT}/usr/local/bin/ 70 | %{__cp} $RPM_BUILD_DIR/%{name}-%{version}-%{release}/%{app_path}/parsertest/%{tester_name} ${RPM_BUILD_ROOT}/usr/local/bin/ 71 | %{__mkdir_p} ${RPM_BUILD_ROOT}/etc/init.d/ 72 | %{__cp} $RPM_BUILD_DIR/%{name}-%{version}-%{release}/%{name} ${RPM_BUILD_ROOT}/etc/init.d/ 73 | 74 | %files 75 | %defattr(0755,root,root,-) 76 | /usr/local/bin/%{name} 77 | /usr/local/bin/%{tester_name} 78 | /etc/init.d/%{name} 79 | 80 | %changelog 81 | * Mon Jul 28 2014 Mathieu Payeur - 0.4-9 82 | - Better initscript. 83 | 84 | * Tue May 09 2014 Mathieu Payeur - 0.4-8 85 | - Logging timestamps with stale logging. 86 | 87 | * Tue May 09 2014 Mathieu Payeur - 0.4-7 88 | - Better naming for the stale metrics printing support. 89 | 90 | * Tue May 09 2014 Mathieu Payeur - 0.4-6 91 | - Support for printing stale metrics. 92 | 93 | * Tue May 09 2014 Mathieu Payeur - 0.4-5 94 | - Logic error with stale removal. Yet again! 95 | 96 | * Tue May 09 2014 Mathieu Payeur - 0.4-4 97 | - Logic error with stale removal. Again! 98 | 99 | * Tue May 08 2014 Mathieu Payeur - 0.4-3 100 | - Logic error with stale removal. 101 | 102 | * Tue May 08 2014 Mathieu Payeur - 0.4-2 103 | - Better config options for stale metrics. Sending duplicate metrics now an option. 104 | 105 | * Tue May 07 2014 Mathieu Payeur - 0.4-1 106 | - Stale value support for realtime metrics. 107 | 108 | * Tue Apr 15 2014 Mathieu Payeur - 0.3-10 109 | - Fix for duplicate keys sent. 110 | 111 | * Tue Apr 15 2014 Mathieu Payeur - 0.3-9 112 | - Fix for EWMA stale metric mechanic. 113 | 114 | * Tue Apr 15 2014 Mathieu Payeur - 0.3-8 115 | - EWMA tuning phase 2.
116 | 117 | * Tue Apr 14 2014 Mathieu Payeur - 0.3-7 118 | - EWMA tuning. 119 | 120 | * Tue Apr 14 2014 Mathieu Payeur - 0.3-6 121 | - Configurable stale threshold for metrics. 122 | 123 | * Tue Apr 14 2014 Mathieu Payeur - 0.3-5 124 | - Internals stats output more often, inotify/polling configurable. 125 | 126 | * Tue Apr 11 2014 Mathieu Payeur - 0.3-4 127 | - Fixes for inotify. 128 | 129 | * Tue Apr 11 2014 Mathieu Payeur - 0.3-3 130 | - Now pushes internal processing stats to TSD. 131 | 132 | * Tue Apr 10 2014 Mathieu Payeur - 0.3-2 133 | - pcre now statically linked. 134 | 135 | * Tue Apr 9 2014 Mathieu Payeur - 0.3-1 136 | - Replacing native go regex by pcre bindings. 137 | 138 | * Tue Apr 02 2014 Mathieu Payeur - 0.2-17 139 | - Fix in init script to properly drop privileges. 140 | 141 | * Tue Apr 02 2014 Mathieu Payeur - 0.2-16 142 | - Removing CentOS 5.8 req. 143 | 144 | * Tue Apr 02 2014 Mathieu Payeur - 0.2-15 145 | - Config parameter name change. 146 | 147 | * Tue Apr 02 2014 Mathieu Payeur - 0.2-14 148 | - Option for time out of order warnings. 149 | 150 | * Tue Mar 31 2014 Mathieu Payeur - 0.2-13 151 | - Left debugging things in. 152 | 153 | * Tue Mar 31 2014 Mathieu Payeur - 0.2-11 154 | - Slight mistake in setuid script. 155 | 156 | * Tue Mar 31 2014 Mathieu Payeur - 0.2-10 157 | - Better init script. 158 | 159 | * Tue Mar 31 2014 Mathieu Payeur - 0.2-9 160 | - Cleanup of user setuid... no more. 161 | 162 | * Tue Mar 31 2014 Mathieu Payeur - 0.2-8 163 | - EWMA tuning + support for multiple values from single math group. 164 | 165 | * Tue Mar 26 2014 Mathieu Payeur - 0.2-7 166 | - Bugfix on EWMA tuning. 167 | 168 | * Tue Mar 26 2014 Mathieu Payeur - 0.2-6 169 | - New parameters for EWMA generation tuning. 170 | 171 | * Tue Mar 26 2014 Mathieu Payeur - 0.2-5 172 | - Another EWMA refresh bugfix. 173 | 174 | * Tue Mar 26 2014 Mathieu Payeur - 0.2-3 175 | - EWMA refresh bugfix. 176 | 177 | * Tue Mar 26 2014 Mathieu Payeur - 0.2-1 178 | - Float support.
179 | 180 | * Tue Mar 24 2014 Mathieu Payeur - 0.1-4 181 | - Now named logmetrics_collector 182 | 183 | * Tue Mar 24 2014 Mathieu Payeur - 0.1-3 184 | - better init script. 185 | 186 | * Tue Mar 24 2014 Mathieu Payeur - 0.1-2 187 | - A few fixes. 188 | 189 | * Tue Mar 24 2014 Mathieu Payeur - 0.1-1 190 | - Initial specfile. 191 | 192 | --------------------------------------------------------------------------------