├── graph_concept.png
├── injuries_sample.png
├── LICENCE
├── README.md
└── attack_parser.py

/graph_concept.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timbennett/dayz_log_parser/HEAD/graph_concept.png

--------------------------------------------------------------------------------
/injuries_sample.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timbennett/dayz_log_parser/HEAD/injuries_sample.png

--------------------------------------------------------------------------------
/LICENCE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Tim Bennett

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# DayZ Log File Parser
Extracts injury and kill data from DayZ .ADM/.clog log files.

**Requirements:** [Pandas](http://pandas.pydata.org/) (pip install pandas). Works on Python 2 and 3.

**Usage:** python attack_parser.py *name_of_log.clog*

**Output:**

* injuries_name_of_log.csv: A timestamped list of player-on-player violence.
    * Timestamp: dd/mm/yyyy hh:mm:ss (someone please let me know if it picks up your local date format)
    * Attacker/Victim: Player tag. May be shared with other players.
    * Attacker_ID/Victim_ID: Steam64 player ID. Guaranteed to refer to a unique player.
    * Attack_Type: SHOT with a projectile or HIT with a melee weapon.
    * Weapon: the naughty thing
    * Body_Part: the impact area. (Can someone clarify whether there's a difference between RightArm and rightarm, e.g. upper and lower?)
* kills_name_of_log.csv: A timestamped list of player-caused kills.
    * Timestamp: dd/mm/yyyy hh:mm:ss
    * Killer/Killer_ID/Victim/Victim_ID: as above
    * Kill type: 'Direct' or 'Blood loss' (see discussion below)
    * Elapsed time: For 'Blood loss' deaths, the number of seconds between the killer's last injury to the victim and the victim's eventual death from blood loss. Always zero for direct kills.

Example:

![Sample injuries csv output](injuries_sample.png)

**Discussion:**

My innovation over other log parsers is to detect when a player dies by blood loss, and guesstimate whether that blood loss was due to another player. With direct kills (e.g. headshots) you get a log line like:

    Player "Alice"(id=76561190000000001) has been killed by player "Bob"(id=76561190000000002)

But if Bob causes Alice to bleed out, you'll only get this:

    "Alice(uid=76561190000000001) DIED Blood <= 0"

So after Bob injures Alice, the script starts a timer. If Alice dies of blood loss within 300 seconds, Bob gets credit for the kill. (This is a trade-off: she might take longer than 300 seconds to die, though this is unusual, while a longer window would increase the risk of crediting Bob for a kill when Alice escapes but is then attacked and killed by a zombie instead.)

I also de-duplicate multiple references to the same injury. DayZ seems to log high-damage injuries over multiple lines as shock/blood damage is applied (usually one line per 500 damage).

Please let me know if you find this useful or would like to extract other intelligence from your server log files.

**Goals:**

* Fine-tune blood loss death detection.
* Add support for crediting all involved parties on a kill, or for crediting kill assists.
* Add support for simultaneous player count, player session length etc.
* Add support for grouping multiple injuries into a single engagement (use case: awarding points for causing injuries, without awarding more points for rapid fire versus bolt-action weapons).
* Attribute weapons to kills, not just injuries.
* Graphs, webservice, etc (volunteers wanted!):

![Concept graph output](graph_concept.png)

**Acknowledgements:** Regex code from [C222/DayZ-Obituaries](https://github.com/C222/DayZ-Obituaries/)

--------------------------------------------------------------------------------
/attack_parser.py:
--------------------------------------------------------------------------------
import re
import time
import datetime
from datetime import timedelta
import pandas as pd
import sys
import csv
#import gzip # use gzip if you're reading a zipped log file

# Search patterns
# courtesy of https://github.com/C222/DayZ-Obituaries/blob/master/regex_parse.py
kill = '((?:(?:[0-1][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?(?:am|AM|pm|PM))?).*?Player.*?(".*?").*?id=(\\d+).*?has been killed by.*?player.*?(".*?").*?id=(\\d+)'
day = 'AdminLog started on ((?:(?:[1]{1}\\d{1}\\d{1}\\d{1})|(?:[2]{1}\\d{3}))[-:\\/.](?:[0]?[1-9]|[1][012])[-:\\/.](?:(?:[0-2]?\\d{1})|(?:[3][01]{1})))(?![\\d]) at ((?:(?:[0-1][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?(?:am|AM|pm|PM))?)'
injury = '((?:(?:[0-1][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?(?:am|AM|pm|PM))?).*?"(.*?)\\(uid=(\\d+)\\).*?(SHOT|HIT) (.*?)\\(uid=(\\d+)\\) by (.*?) into (.*?)\\."'
timestamp = '^((?:(?:[0-1][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?(?:am|AM|pm|PM))?)'
blood_death = '((?:(?:[0-1][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?(?:am|AM|pm|PM))?).*?"(.*?)\\(uid=(\\d+)\\) (DIED Blood <= 0\")'
# \\s (not \s) so the string contains a valid escape, matching the patterns above
damage = '((?:(?:[0-1][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?(?:am|AM|pm|PM))?).*?"(.*?)\\(uid=(\\d+)\\) STATUS S::([-\\d\\.]+) B::([-\\d\\.]+) H::([-\\d\\.]+)'

# Compiled regexen
kill_c = re.compile(kill,re.IGNORECASE|re.DOTALL)
day_c = re.compile(day,re.IGNORECASE|re.DOTALL)
injury_c = re.compile(injury,re.IGNORECASE|re.DOTALL)
timestamp_c = re.compile(timestamp,re.IGNORECASE|re.DOTALL)
blood_death_c = re.compile(blood_death,re.IGNORECASE|re.DOTALL)
damage_c = re.compile(damage,re.IGNORECASE|re.DOTALL)

# Tracking variables
last_injury_time = {} # records victim's timestamp and attacker details after each player-inflicted injury

# Output variables
kill_output = [] # contains individual kill event details
injury_output = [] # data related to players' injuries
blood_death_output = [] # deaths from blood loss, not direct damage

# Time related
current_day = ""
current_time = ""

# Functions
def check_increment_date(current_timestamp, this_timestamp):
    '''
    If a log file passes midnight, push the timestamp onto the next day. Pass it
    current_timestamp and a new timestamp (this_timestamp) and it
    checks whether this_timestamp is earlier in the 24 hour day than
    current_timestamp; if so, we've gone past midnight (e.g. 23:59:59 > 00:00:00)
    and need to set current_timestamp to this_timestamp + 1 day.
    Note: current_day itself is never advanced, so only one midnight crossing
    per "AdminLog started" line is handled.
    '''
    if this_timestamp < current_timestamp:
        return this_timestamp + timedelta(days=1)
    else:
        return this_timestamp

lines = [] # we'll read all lines into this list initially


#with gzip.open('zipped_big_log_file.ADM.gz','rb') as f: # remember to also import gzip if you're doing this
with open(sys.argv[1], 'rb') as f:
    # read all lines into the list of lines (personal choice; I find it easier to read previous/subsequent lines this way)
    # the file is opened in binary mode, so on Python 3 str(line) stores the bytes repr (b'...');
    # the regexes below use search(), so matching is unaffected
    for line in f:
        lines.append(str(line))
print("{} lines ingested.".format(len(lines)))

for line_number, line in enumerate(lines):
    # main processing loop examines every line sequentially, deciding if it's one of:
    # day_line: timestamp on server startup, used to set the day-month-year in timestamps
    # injury_line: where one player injures another in a body part with a weapon
    # kill_line: where one player dies and the killer is named
    # blood_death_line: where a player dies of blood loss and the cause is ambiguous without further investigation

    if line_number % 1000 == 0: # progress counter
        print("processed {} lines".format(line_number))

    day_line = day_c.search(line)
    if day_line:
        # construct timestamp from the regex match
        current_day = day_line.group(1)
        current_time = day_line.group(2)
        current_timestamp = datetime.datetime.strptime(current_day+" "+current_time,"%Y-%m-%d %H:%M:%S")

    injury_line = injury_c.search(line)
    if injury_line:

        # check if date needs incrementing
        this_timestamp = datetime.datetime.strptime(current_day+" "+injury_line.group(1),"%Y-%m-%d %H:%M:%S")
        current_timestamp = check_increment_date(current_timestamp, this_timestamp)

        # Regex groups: 1:"Timestamp",2:"Attacker",3:"Attacker_ID",4:"Attack_type",5:"Victim",6:"Victim_ID",7:"Weapon",8:"Body_part"
        # injury_output will get turned into a dataframe later
        injury_output.append([current_timestamp,
                              injury_line.group(2),
                              injury_line.group(3),
                              injury_line.group(4),
                              injury_line.group(5),
                              injury_line.group(6),
                              injury_line.group(7),
                              injury_line.group(8).lower() # lower case because eg RightArm and rightarm exist but AFAIK are duplicates;
                                                           # (however if they refer to upper and lower arm, this would be a wrong decision)
                             ])
        # set the timestamp and perpetrator of the victim_id's most recent injury
        # if they die within 5 minutes from blood loss, the most recent attacker gets credit
        # (this check happens in 'if blood_death_line' conditional below)
        last_injury_time[injury_line.group(6)] = [current_timestamp, injury_line.group(2), injury_line.group(3), injury_line.group(5)]

    kill_line = kill_c.search(line)
    if kill_line:
        this_timestamp = datetime.datetime.strptime(current_day+" "+kill_line.group(1),"%Y-%m-%d %H:%M:%S")
        current_timestamp = check_increment_date(current_timestamp, this_timestamp)

        # Regex groups: 1: time 2: victim name, 3:victim id 4:killer name, 5:killer id
        if kill_line.group(3) in last_injury_time: # remove victim from last_injury_time; not needed to determine killer
            del last_injury_time[kill_line.group(3)]

        # the kill regex captures names together with their surrounding double quotes;
        # strip them so names match the unquoted ones in the injury and blood-loss rows
        kill_output.append([current_timestamp,
                            kill_line.group(4).strip('"'),
                            kill_line.group(5),
                            kill_line.group(2).strip('"'),
                            kill_line.group(3),
                            'Direct',
                            0
                           ])

    blood_death_line = blood_death_c.search(line)
    if blood_death_line:
        # group 1: timestamp; group 2: victim; group 3: victim_id
        # update the current_timestamp
        this_timestamp = datetime.datetime.strptime(current_day+" "+blood_death_line.group(1),"%Y-%m-%d %H:%M:%S")
        current_timestamp = check_increment_date(current_timestamp, this_timestamp)
        # check if the victim had an injury timer and how long ago it was:
        if blood_death_line.group(3) in last_injury_time:
            delta = (current_timestamp - last_injury_time[blood_death_line.group(3)][0]).total_seconds() # check how long since the injury occurred
            if delta < 300:
                # award kill to a killer only if it's within 5 minutes; basically decided to do this
                # as a stopgap because longer windows increase the odds of attributing a zombie attack
                # to another player; this could be avoided by only keeping a player on the last_injury_time list
                # while they are in a bleeding state, but that is possibly complicated by other factors, so this
                # is a compromise value.
                killer = last_injury_time[blood_death_line.group(3)][1]
                killer_id = last_injury_time[blood_death_line.group(3)][2]
                kill_output.append([current_timestamp,
                                    killer,
                                    killer_id,
                                    blood_death_line.group(2),
                                    blood_death_line.group(3),
                                    'Blood loss',
                                    delta
                                   ])
            #else: # uncomment to do something for deaths as a result of other blood loss
            #    print("{} Bloodloss {} {}".format(current_timestamp,blood_death_line.group(2),delta))
        if blood_death_line.group(3) in last_injury_time: # they're dead; we don't need to track last_injury_time till they're injured again
            del last_injury_time[blood_death_line.group(3)]

# create dataframes for output
injuries_df = pd.DataFrame(injury_output, columns=['Timestamp','Attacker','Attacker_ID','Attack_Type','Victim','Victim_ID','Weapon','Body_Part'])
injuries_df = injuries_df.drop_duplicates() # log files contain frequent multiple entries for a single high-damage attack; this condenses them into one attack
# keep the IDs as strings; this is meant to help when opening output CSVs in Excel...
# but it still treats quoted numeric strings as numbers :(
injuries_df[['Attacker_ID','Victim_ID']] = injuries_df[['Attacker_ID','Victim_ID']].astype(str)

kills_df = pd.DataFrame(kill_output, columns=['Timestamp','Killer','Killer_ID','Victim','Victim_ID','Kill type','Elapsed time'])

injuries_df.to_csv("injuries_{}.csv".format(sys.argv[1]),index=False, quoting=csv.QUOTE_NONNUMERIC)
print('Saved "injuries_{}.csv"'.format(sys.argv[1]))
kills_df.to_csv("kills_{}.csv".format(sys.argv[1]),index=False)
print('Saved "kills_{}.csv"'.format(sys.argv[1]))
print('Processing complete.')

--------------------------------------------------------------------------------
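
The blood-loss attribution described in the README discussion (start a timer on each injury, credit the last attacker if the victim bleeds out within 300 seconds) can be tried in isolation from the log parsing. The following is a minimal sketch, not the code in attack_parser.py: the function name `credit_blood_loss_kills`, the tuple-based event format and the `BLOOD_LOSS_WINDOW` constant are assumptions made for this illustration, and plain player names stand in for the Steam64 IDs that the script uses as keys.

```python
from datetime import datetime, timedelta

# Hypothetical name for the 300-second window used by attack_parser.py.
BLOOD_LOSS_WINDOW = timedelta(seconds=300)


def credit_blood_loss_kills(events, window=BLOOD_LOSS_WINDOW):
    """Return (killer, victim, elapsed_seconds) for attributable bleed-outs.

    `events` is a time-ordered list of tuples in a made-up format:
        ("injury", timestamp, attacker, victim)
        ("direct_kill", timestamp, victim)
        ("bleed_out", timestamp, victim)
    """
    last_injury = {}  # victim -> (timestamp, attacker) of most recent injury
    kills = []
    for event in events:
        kind, when = event[0], event[1]
        if kind == "injury":
            attacker, victim = event[2], event[3]
            last_injury[victim] = (when, attacker)
        elif kind == "direct_kill":
            # The killer is named in the log, so the timer is just cleared.
            last_injury.pop(event[2], None)
        elif kind == "bleed_out":
            victim = event[2]
            # pop() clears the timer whether or not the kill is credited.
            injured_at, attacker = last_injury.pop(victim, (None, None))
            if injured_at is not None and when - injured_at < window:
                elapsed = (when - injured_at).total_seconds()
                kills.append((attacker, victim, elapsed))
    return kills


t0 = datetime(2017, 1, 1, 12, 0, 0)
events = [
    ("injury", t0, "Bob", "Alice"),
    ("injury", t0 + timedelta(seconds=10), "Bob", "Carol"),
    ("bleed_out", t0 + timedelta(seconds=120), "Alice"),  # 120 s after injury
    ("bleed_out", t0 + timedelta(seconds=410), "Carol"),  # 400 s after injury
]
print(credit_blood_loss_kills(events))  # [('Bob', 'Alice', 120.0)]
```

Alice's death falls inside the window and is credited to Bob, while Carol's death 400 seconds after her injury is left unattributed, which is the trade-off the README describes. Passing a different `window` makes it easy to see how the cut-off changes which deaths are credited.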