├── CSVtoREADME.py
├── LICENSE
├── README.md
├── img
│   ├── autoreengine.png
│   └── biprominer.png
├── tools.csv
├── tools.ods
├── tools_based-on-Duchêne-et-al-2017.ods
└── tools_based-on-Sija-et-al-2018.ods


/CSVtoREADME.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | 
  3 | '''
  4 |     Copyright (C) 2020 PRE-list
  5 | 
  6 |     This program is free software: you can redistribute it and/or modify
  7 |     it under the terms of the GNU General Public License as published by
  8 |     the Free Software Foundation, either version 3 of the License, or
  9 |     (at your option) any later version.
 10 | 
 11 |     This program is distributed in the hope that it will be useful,
 12 |     but WITHOUT ANY WARRANTY; without even the implied warranty of
 13 |     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 14 |     GNU General Public License for more details.
 15 | 
 16 |     You should have received a copy of the GNU General Public License
 17 |     along with this program.  If not, see <http://www.gnu.org/licenses/>.
 18 | '''
 19 | 
 20 | 
 21 | import argparse
 22 | import csv
 23 | 
 24 | 
 25 | def checkmark(char):
 26 |     if char == "x":
 27 |         return "&#10004;"
 28 |     else:
 29 |         return char
 30 | 
 31 | def print_table(f, papers, columns, checkm=False):
 32 |     """
 33 |     Input:
 34 |         f: file handle to write to
 35 |         papers: list of (citation number, paper row) tuples
 36 |         columns: list of column header entries
 37 |         checkm: if true, replace "x" cells with a checkmark sign
 38 |     Output:
 39 |         table based on column headers printed to file
 40 |     """
 41 | 
 42 |     # write table header
 43 |     f.write("| Name | Year |")
 44 | 
 45 |     for item in columns:
 46 |         f.write(" " + item + " |")
 47 | 
 48 |     f.write("\n")
 49 |     f.write("|------|------|")
 50 | 
 51 |     for item in columns:
 52 |         # put in as many '-' as item is long + 2 for space on beginning and end
 53 |         f.write((len(item) + 2) * "-" + "|")
 54 | 
 55 |     f.write("\n")
 56 | 
 57 |     # write table rows with content
 58 |     if len(columns) > 1:
 59 |         # TODO do sorting
 60 |         for cite, paper in papers:
 61 |             f.write("| " + paper['Name'] + " [[" + str(cite) + "]](#" + str(cite) + ") | " + \
 62 |                     paper['Year'] + " |")
 63 |             for item in columns:
 64 |                 if checkm:
 65 |                     f.write(" " + checkmark(paper[item]) + " |")
 66 |                 else:
 67 |                     f.write(" " + paper[item] + " |")
 68 |             f.write("\n")
 69 |     else:
 70 |         for cite, paper in papers:
 71 |             if paper[columns[0]]:  # single column: skip papers with an empty cell
 72 |                 f.write("| " + paper['Name'] + " [[" + str(cite) + "]](#" + str(cite) + ") | " + \
 73 |                         paper['Year'] + " | " + paper[columns[0]] + " |\n")
 74 | 
 75 | 
 76 | def main(delimiter):
 77 | 
 78 |     papers = []
 79 |     with open('tools.csv', newline='', encoding='utf-8') as f:
 80 |         reader = csv.DictReader(f, delimiter=delimiter, quotechar='"')
 81 |         for row in reader:
 82 |             papers.append(row)
 83 | 
 84 |     papers = list(enumerate(sorted(papers, key=lambda k: k['Year']), start=1))
 85 | 
 86 |     with open('README.md', 'w', encoding='utf-8') as f:
 87 |         f.write("PRE-list\n")
 88 |         f.write("========\n\n")
 89 |         f.write("List of (automatic) protocol reverse engineering tools/methods/approaches for " + \
 90 |                 "network protocols<br/>\n\n")
 91 | 
 92 |         f.write("This is a collection of " + str(len(papers)) + " scientific papers about " + \
 93 |                 "(automatic) protocol reverse engineering (PRE) methods and tools. " + \
 94 |                 "The papers are categorized into different groups so that it is easier " + \
 95 |                 "to get an overview of existing solutions based on the problem you want to " + \
 96 |                 "tackle.<br/>\n\n" + \
 97 |                 "The collection is based on the following three surveys and got " + \
 98 |                 "extended afterwards:\n\n" + \
 99 |                 "* J. Narayan, S. K. Shukla, and T. C. Clancy, “A Survey of Automatic Protocol Reverse Engineering Tools,” ACM Computing Surveys, vol. 48, no. 3, pp. 1–26, Feb. 2016, doi: 10.1145/2840724. [PDF](https://www.researchgate.net/profile/Sandeep_Shukla6/publication/287106642_A_Survey_of_Automatic_Protocol_Reverse_Engineering_Tools/links/5773948208ae1b18a7ddff91/A-Survey-of-Automatic-Protocol-Reverse-Engineering-Tools.pdf)\n" + \
100 |                 "* J. Duchêne, C. Le Guernic, E. Alata, V. Nicomette, and M. Kaâniche, “State of the art of network protocol reverse engineering tools,” Journal of Computer Virology and Hacking Techniques, vol. 14, no. 1, pp. 53–68, Feb. 2018, doi: 10.1007/s11416-016-0289-8. [PDF](https://hal.inria.fr/hal-01496958/document)\n" + \
101 |                 "* B. D. Sija, Y.-H. Goo, K.-S. Shim, H. Hasanova, and M.-S. Kim, “A Survey of Automatic Protocol Reverse Engineering Approaches, Methods, and Tools on the Inputs and Outputs View,” Security and Communication Networks, vol. 2018, pp. 1–17, 2018, doi: 10.1155/2018/8370341. [PDF](https://downloads.hindawi.com/journals/scn/2018/8370341.pdf)\n\n" + \
102 |                 "Furthermore, there is a very extensive survey that focuses on the methods " + \
103 |                 "and approaches of PRE tools that are based on network traces. The work of " + \
104 |                 "Kleber et al. is an excellent starting point to see what has already been tried " + \
105 |                 "and for which use cases a method works best.\n\n" + \
106 |                 "* S. Kleber, L. Maile, and F. Kargl, “Survey of Protocol Reverse Engineering Algorithms: Decomposition of Tools for Static Traffic Analysis,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 526–561, 2019, doi: 10.1109/COMST.2018.2867544. [PDF](https://oparu.uni-ulm.de/xmlui/bitstream/handle/123456789/11078/COMST2867544.pdf)\n\n" + \
107 |                 "Please help extend this collection by adding papers to `tools.ods`.\n\n")
108 | 
109 |         f.write("\n# Table of Contents\n\n")
110 | 
111 |         f.write("* [Overview](#overview-)\n")
112 |         f.write("* [Input and Output](#input-and-output-)\n")
113 |         f.write("* [Tested protocols](#tested-protocols-)\n")
114 |         f.write("* [Source code](#source-code-)\n")
115 |         f.write("* [References](#references-)\n\n")
116 | 
117 |         # Overview
118 |         f.write("\n# Overview [&uarr;](#table-of-contents)\n\n")
119 |         print_table(f, papers, ['Approach used'])
120 | 
121 |         # Input and Output
122 |         f.write("\n# Input and Output [&uarr;](#table-of-contents)\n\n")
123 |         f.write("NetT: input is a network trace (e.g. pcap)<br />\n" + \
124 |                 "ExeT: input is an execution trace (code/binary at hand)<br />\n" + \
125 |                 "PF: output is protocol format (describing the syntax)<br />\n" + \
126 |                 "PFSM: output is protocol finite state machine (describing semantic/sequential " + \
127 |                 "logic)<br />\n\n")
128 |         print_table(f, papers, ['NetT', 'ExeT', 'PF', 'PFSM', 'Other Output'], checkm=True)
129 | 
130 |         # Tested protocols
131 |         f.write("\n# Tested protocols [&uarr;](#table-of-contents)\n\n")
132 |         print_table(f, papers, ['Text-based', 'Binary-based', 'Hybrid', 'Other Protocols'])
133 | 
134 |         # Source Code
135 |         f.write("\n# Source Code [&uarr;](#table-of-contents)\n\n")
136 |         f.write("Most papers do not provide the code used in the research. For the following " + \
137 |                 "papers, (example) code exists.<br/>\n")
138 |         print_table(f, papers, ['Source Code'])
139 | 
140 |         # References
141 |         f.write("\n# References [&uarr;](#table-of-contents)\n\n")
142 |         for cite, paper in papers:
143 |             f.write('#### [' + str(cite) + ']\n')
144 |             f.write(paper['Paper(s)'])
145 |             if paper['Link to paper']:
146 |                 f.write(" [PDF](" + paper['Link to paper'] + ")")
147 |             f.write("\n")
148 | 
149 | 
150 | if __name__ == '__main__':
151 | 
152 |     parser = argparse.ArgumentParser(description="Transform tools.csv into a README.md for the GitHub repo")
153 | 
154 |     parser.add_argument("--delimiter", default=";",
155 |                         help="Delimiter used in .csv file", metavar="DELIM")
156 |     # TODO add argument for csv file and check if file exists
157 | 
158 |     args = parser.parse_args()
159 | 
160 |     main(args.delimiter)
161 | 

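The table rendering in `CSVtoREADME.py` above can be tried without touching `tools.csv` or `README.md`. The following is a minimal sketch, not the script itself: `SAMPLE_CSV` and `render_table` are hypothetical names introduced for illustration, and the two sample rows are made up to show how an `x` cell becomes a checkmark entity.

```python
import csv
import io

# Hypothetical two-row sample; the real script reads tools.csv from disk.
SAMPLE_CSV = (
    "Name;Year;NetT;ExeT\n"
    "Polyglot;2007;;x\n"
    "PIP;2004;x;\n"
)


def checkmark(cell):
    # Same rule as in the script: an "x" becomes an HTML checkmark entity.
    return "&#10004;" if cell == "x" else cell


def render_table(out, papers, columns):
    # Header row, then a separator row sized to each column title.
    out.write("| Name | Year |" + "".join(" " + c + " |" for c in columns) + "\n")
    out.write("|------|------|" + "".join("-" * (len(c) + 2) + "|" for c in columns) + "\n")
    # One row per paper, with a citation anchor next to the name.
    for cite, paper in papers:
        out.write("| " + paper["Name"] + " [[" + str(cite) + "]](#" + str(cite) + ") | "
                  + paper["Year"] + " |")
        for c in columns:
            out.write(" " + checkmark(paper[c]) + " |")
        out.write("\n")


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV), delimiter=";", quotechar='"'))
# Sort by year and number the papers starting at 1, as main() does.
papers = list(enumerate(sorted(rows, key=lambda r: r["Year"]), start=1))

buf = io.StringIO()
render_table(buf, papers, ["NetT", "ExeT"])
table = buf.getvalue()
print(table)
```

Writing into an `io.StringIO` buffer instead of a file keeps the sketch side-effect free; the citation numbers follow the sorted order, so the 2004 paper gets `[1]` even though it is listed second in the sample.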

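The TODO in the argument parsing of `CSVtoREADME.py` (an argument for the CSV file plus an existence check) could be approached roughly as follows. This is only a sketch under assumptions: the `--csv` option and the helper names `build_parser` and `parse_args` are hypothetical and not part of the original script.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(
        description="Transform a CSV file into a README.md")
    parser.add_argument("--delimiter", default=";", metavar="DELIM",
                        help="Delimiter used in .csv file")
    # Hypothetical option; the original script hard-codes 'tools.csv'.
    parser.add_argument("--csv", default="tools.csv", metavar="FILE",
                        help="Path to the input .csv file")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # parser.error() prints a usage message and exits with status 2.
    if not os.path.isfile(args.csv):
        parser.error("CSV file not found: " + args.csv)
    return args
```

With this in place, `main` would need to accept the path as well, for example `main(args.delimiter, args.csv)`, which is again an assumed signature rather than the existing one.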
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 | Creative Commons Legal Code
  2 | 
  3 | CC0 1.0 Universal
  4 | 
  5 |     CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
  6 |     LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
  7 |     ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
  8 |     INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
  9 |     REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
 10 |     PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
 11 |     THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
 12 |     HEREUNDER.
 13 | 
 14 | Statement of Purpose
 15 | 
 16 | The laws of most jurisdictions throughout the world automatically confer
 17 | exclusive Copyright and Related Rights (defined below) upon the creator
 18 | and subsequent owner(s) (each and all, an "owner") of an original work of
 19 | authorship and/or a database (each, a "Work").
 20 | 
 21 | Certain owners wish to permanently relinquish those rights to a Work for
 22 | the purpose of contributing to a commons of creative, cultural and
 23 | scientific works ("Commons") that the public can reliably and without fear
 24 | of later claims of infringement build upon, modify, incorporate in other
 25 | works, reuse and redistribute as freely as possible in any form whatsoever
 26 | and for any purposes, including without limitation commercial purposes.
 27 | These owners may contribute to the Commons to promote the ideal of a free
 28 | culture and the further production of creative, cultural and scientific
 29 | works, or to gain reputation or greater distribution for their Work in
 30 | part through the use and efforts of others.
 31 | 
 32 | For these and/or other purposes and motivations, and without any
 33 | expectation of additional consideration or compensation, the person
 34 | associating CC0 with a Work (the "Affirmer"), to the extent that he or she
 35 | is an owner of Copyright and Related Rights in the Work, voluntarily
 36 | elects to apply CC0 to the Work and publicly distribute the Work under its
 37 | terms, with knowledge of his or her Copyright and Related Rights in the
 38 | Work and the meaning and intended legal effect of CC0 on those rights.
 39 | 
 40 | 1. Copyright and Related Rights. A Work made available under CC0 may be
 41 | protected by copyright and related or neighboring rights ("Copyright and
 42 | Related Rights"). Copyright and Related Rights include, but are not
 43 | limited to, the following:
 44 | 
 45 |   i. the right to reproduce, adapt, distribute, perform, display,
 46 |      communicate, and translate a Work;
 47 |  ii. moral rights retained by the original author(s) and/or performer(s);
 48 | iii. publicity and privacy rights pertaining to a person's image or
 49 |      likeness depicted in a Work;
 50 |  iv. rights protecting against unfair competition in regards to a Work,
 51 |      subject to the limitations in paragraph 4(a), below;
 52 |   v. rights protecting the extraction, dissemination, use and reuse of data
 53 |      in a Work;
 54 |  vi. database rights (such as those arising under Directive 96/9/EC of the
 55 |      European Parliament and of the Council of 11 March 1996 on the legal
 56 |      protection of databases, and under any national implementation
 57 |      thereof, including any amended or successor version of such
 58 |      directive); and
 59 | vii. other similar, equivalent or corresponding rights throughout the
 60 |      world based on applicable law or treaty, and any national
 61 |      implementations thereof.
 62 | 
 63 | 2. Waiver. To the greatest extent permitted by, but not in contravention
 64 | of, applicable law, Affirmer hereby overtly, fully, permanently,
 65 | irrevocably and unconditionally waives, abandons, and surrenders all of
 66 | Affirmer's Copyright and Related Rights and associated claims and causes
 67 | of action, whether now known or unknown (including existing as well as
 68 | future claims and causes of action), in the Work (i) in all territories
 69 | worldwide, (ii) for the maximum duration provided by applicable law or
 70 | treaty (including future time extensions), (iii) in any current or future
 71 | medium and for any number of copies, and (iv) for any purpose whatsoever,
 72 | including without limitation commercial, advertising or promotional
 73 | purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
 74 | member of the public at large and to the detriment of Affirmer's heirs and
 75 | successors, fully intending that such Waiver shall not be subject to
 76 | revocation, rescission, cancellation, termination, or any other legal or
 77 | equitable action to disrupt the quiet enjoyment of the Work by the public
 78 | as contemplated by Affirmer's express Statement of Purpose.
 79 | 
 80 | 3. Public License Fallback. Should any part of the Waiver for any reason
 81 | be judged legally invalid or ineffective under applicable law, then the
 82 | Waiver shall be preserved to the maximum extent permitted taking into
 83 | account Affirmer's express Statement of Purpose. In addition, to the
 84 | extent the Waiver is so judged Affirmer hereby grants to each affected
 85 | person a royalty-free, non transferable, non sublicensable, non exclusive,
 86 | irrevocable and unconditional license to exercise Affirmer's Copyright and
 87 | Related Rights in the Work (i) in all territories worldwide, (ii) for the
 88 | maximum duration provided by applicable law or treaty (including future
 89 | time extensions), (iii) in any current or future medium and for any number
 90 | of copies, and (iv) for any purpose whatsoever, including without
 91 | limitation commercial, advertising or promotional purposes (the
 92 | "License"). The License shall be deemed effective as of the date CC0 was
 93 | applied by Affirmer to the Work. Should any part of the License for any
 94 | reason be judged legally invalid or ineffective under applicable law, such
 95 | partial invalidity or ineffectiveness shall not invalidate the remainder
 96 | of the License, and in such case Affirmer hereby affirms that he or she
 97 | will not (i) exercise any of his or her remaining Copyright and Related
 98 | Rights in the Work or (ii) assert any associated claims and causes of
 99 | action with respect to the Work, in either case contrary to Affirmer's
100 | express Statement of Purpose.
101 | 
102 | 4. Limitations and Disclaimers.
103 | 
104 |  a. No trademark or patent rights held by Affirmer are waived, abandoned,
105 |     surrendered, licensed or otherwise affected by this document.
106 |  b. Affirmer offers the Work as-is and makes no representations or
107 |     warranties of any kind concerning the Work, express, implied,
108 |     statutory or otherwise, including without limitation warranties of
109 |     title, merchantability, fitness for a particular purpose, non
110 |     infringement, or the absence of latent or other defects, accuracy, or
111 |     the present or absence of errors, whether or not discoverable, all to
112 |     the greatest extent permissible under applicable law.
113 |  c. Affirmer disclaims responsibility for clearing rights of other persons
114 |     that may apply to the Work or any use thereof, including without
115 |     limitation any person's Copyright and Related Rights in the Work.
116 |     Further, Affirmer disclaims responsibility for obtaining any necessary
117 |     consents, permissions or other rights required for any use of the
118 |     Work.
119 |  d. Affirmer understands and acknowledges that Creative Commons is not a
120 |     party to this document and has no duty or obligation with respect to
121 |     this CC0 or use of the Work.
122 | 
123 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | PRE-list
  2 | ========
  3 | 
  4 | List of (automatic) protocol reverse engineering tools/methods/approaches for network protocols<br/>
  5 | 
  6 | This is a collection of 71 scientific papers about (automatic) protocol reverse engineering (PRE) methods and tools. The papers are categorized into different groups so that it is easier to get an overview of existing solutions based on the problem you want to tackle.<br/>
  7 | 
  8 | The collection is based on the following three surveys and got extended afterwards:
  9 | 
 10 | * J. Narayan, S. K. Shukla, and T. C. Clancy, “A Survey of Automatic Protocol Reverse Engineering Tools,” ACM Computing Surveys, vol. 48, no. 3, pp. 1–26, Feb. 2016, doi: 10.1145/2840724. [PDF](https://www.researchgate.net/profile/Sandeep_Shukla6/publication/287106642_A_Survey_of_Automatic_Protocol_Reverse_Engineering_Tools/links/5773948208ae1b18a7ddff91/A-Survey-of-Automatic-Protocol-Reverse-Engineering-Tools.pdf)
 11 | * J. Duchêne, C. Le Guernic, E. Alata, V. Nicomette, and M. Kaâniche, “State of the art of network protocol reverse engineering tools,” Journal of Computer Virology and Hacking Techniques, vol. 14, no. 1, pp. 53–68, Feb. 2018, doi: 10.1007/s11416-016-0289-8. [PDF](https://hal.inria.fr/hal-01496958/document)
 12 | * B. D. Sija, Y.-H. Goo, K.-S. Shim, H. Hasanova, and M.-S. Kim, “A Survey of Automatic Protocol Reverse Engineering Approaches, Methods, and Tools on the Inputs and Outputs View,” Security and Communication Networks, vol. 2018, pp. 1–17, 2018, doi: 10.1155/2018/8370341. [PDF](https://downloads.hindawi.com/journals/scn/2018/8370341.pdf)
 13 | 
 14 | Furthermore, there is a very extensive survey that focuses on the methods and approaches of PRE tools that are based on network traces. The work of Kleber et al. is an excellent starting point to see what has already been tried and for which use cases a method works best.
 15 | 
 16 | * S. Kleber, L. Maile, and F. Kargl, “Survey of Protocol Reverse Engineering Algorithms: Decomposition of Tools for Static Traffic Analysis,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 526–561, 2019, doi: 10.1109/COMST.2018.2867544. [PDF](https://oparu.uni-ulm.de/xmlui/bitstream/handle/123456789/11078/COMST2867544.pdf)
 17 | 
 18 | Please help extend this collection by adding papers to `tools.ods`.
 19 | 
 20 | 
 21 | # Table of Contents
 22 | 
 23 | * [Overview](#overview-)
 24 | * [Input and Output](#input-and-output-)
 25 | * [Tested protocols](#tested-protocols-)
 26 | * [Source code](#source-code-)
 27 | * [References](#references-)
 28 | 
 29 | 
 30 | # Overview [&uarr;](#table-of-contents)
 31 | 
 32 | | Name | Year | Approach used |
 33 | |------|------|---------------|
 34 | | PIP [[1]](#1) | 2004 | Keyword detection and Sequence alignment based on Needleman and Wunsch 1970 and Smith and Waterman 1981; this approach was applied and extended by many following papers |
 35 | | GAPA [[2]](#2) | 2005 | Protocol analyzer and open language that uses the protocol analyzer specification Spec → it is meant to be integrated in monitoring and analyzing tools |
 36 | | ScriptGen [[3]](#3) | 2005 | Grouping and clustering messages, then finding edges between clusters so that messages can be replayed once a similar message arrives |
 37 | | RolePlayer [[4]](#4) | 2006 | Byte-wise sequence alignment (find variable fields in messages) and clustering with FSM simplification |
 38 | | Ma et al. [[5]](#5) | 2006 | Please review |
 39 | | FFE/x86 [[6]](#6) | 2006 | Please review |
 40 | | Replayer [[7]](#7) | 2006 | Please review |
 41 | | Discoverer [[8]](#8) | 2007 | Tokenization of messages, recursive clustering to find formats, merge similar formats |
 42 | | Polyglot [[9]](#9) | 2007 | Dynamic taint-analysis |
 43 | | PEXT [[10]](#10) | 2007 | Message clustering for creating FSM graph and simplify FSM graph |
 44 | | Rosetta [[11]](#11) | 2007 | Please review |
 45 | | AutoFormat [[12]](#12) | 2008 | Dynamic taint-analysis |
 46 | | Tupni [[13]](#13) | 2008 | Dynamic taint-analysis; look for loops to identify boundaries within messages |
 47 | | Boosting [[14]](#14) | 2008 | Please review |
 48 | | ConfigRE [[15]](#15) | 2008 | Please review |
 49 | | ReFormat [[16]](#16) | 2009 | Dynamic taint-analysis, especially targeting encrypted protocols by looking for bitwise and arithmetic operations |
 50 | | Prospex [[17]](#17) | 2009 | Dynamic taint-analysis followed by message clustering; optionally provides fuzzing candidates for the Peach fuzzer |
 51 | | Xiao et al. [[18]](#18) | 2009 | Please review |
 52 | | Trifilo et al. [[19]](#19) | 2009 | Measure byte-wise variances in aligned messages |
 53 | | Antunes and Neves [[20]](#20) | 2009 | Please review |
 54 | | Dispatcher [[21]](#21) | 2009 | Dynamic taint-analysis (successor of Polyglot using sent instead of received messages) |
 55 | | Fuzzgrind [[22]](#22) | 2009 | Please review |
 56 | | REWARDS [[23]](#23) | 2010 | Please review |
 57 | | MACE [[24]](#24) | 2010 | Please review |
 58 | | Whalen et al. [[25]](#25) | 2010 | Please review |
 59 | | AutoFuzz [[26]](#26) | 2010 | Please review |
 60 | | ReverX [[27]](#27) | 2011 | Speech recognition (thus only for text-based protocols) to find carriage returns and spaces, afterwards looking for frequencies of keywords; multiple partial FSMs are merged and simplified to get PFSM |
 61 | | Veritas [[28]](#28) | 2011 | Identifying keywords, clustering and transition probability → probabilistic protocol state machine |
 62 | | Biprominer [[29]](#29) | 2011 | Statistical analysis including three phases, learning phase, labeling phase and transition probability model building phase. See [this figure](img/biprominer.png). |
 63 | | ASAP [[30]](#30) | 2011 | Please review |
 64 | | Howard [[31]](#31) | 2011 | Please review |
 65 | | ProDecoder [[32]](#32) | 2012 | Successor of Biprominer that also addresses text-based protocols; two phases are used: first apply Biprominer, then use Needleman-Wunsch for alignment |
 66 | | Zhang et al. [[33]](#33) | 2012 | Please review |
 67 | | Netzob [[34]](#34) | 2012 | See [this figure](https://github.com/netzob/netzob/blob/4a72c0cbd6d1e7b997b2b8ad170b7a38e400dfca/netzob/doc/documentation/source/netzob_archi.png) |
 68 | | PRISMA [[35]](#35) | 2012 | Please review, follow-up paper/project to ASAP |
 69 | | ARTISTE [[36]](#36) | 2012 | Please review |
 70 | | Wang et al. [[37]](#37) | 2013 | Capturing of data, identifying frames and inferring the format by looking at the frequency of frames and doing association analysis (using Apriori and FP-Growth). |
 71 | | Laroche et al. [[38]](#38) | 2013 | Please review |
 72 | | AutoReEngine [[39]](#39) | 2013 | Apriori Algorithm (based on Agrawal/Srikant 1994). Identify fields and keywords by considering the number of occurrences. Message formats are considered as series of keywords. State machines are derived from labeled messages or frequent subsequences. See [this figure](img/autoreengine.png) for clarification. |
 73 | | Dispatcher2 [[40]](#40) | 2013 | Please review |
 74 | | ProVeX [[41]](#41) | 2013 | Identify Botnet traffic and try to infer the botnet type by using signatures |
 75 | | Meng et al. [[42]](#42) | 2014 | Please review |
 76 | | AFL [[43]](#43) | 2014 | Please review |
 77 | | Proword [[44]](#44) | 2014 | Please review |
 78 | | ProGraph [[45]](#45) | 2015 | Please review |
 79 | | FieldHunter [[46]](#46) | 2015 | Please review |
 80 | | RS Cluster [[47]](#47) | 2015 | Please review |
 81 | | UPCSS [[48]](#48) | 2015 | Please review |
 82 | | ARGOS [[49]](#49) | 2015 | Please review |
 83 | | PULSAR [[50]](#50) | 2015 | Reverse engineer network protocols with the aim of fuzzing them with this knowledge |
 84 | | Li et al. [[51]](#51) | 2015 | Please review |
 85 | | Cai et al. [[52]](#52) | 2016 | Please review |
 86 | | WASp [[53]](#53) | 2016 | Pcap files are provided with context information (i.e. known MAC address), then grouping and analysing (looking for CRC, N-gram, Entropy, Features, Ranges), afterwards report creation based on scoring. |
 87 | | PRE-Bin [[54]](#54) | 2016 | Please review |
 88 | | Xiao et al. [[55]](#55) | 2016 | Please review |
 89 | | PowerShell [[56]](#56) | 2017 | Please review |
 90 | | ProPrint [[57]](#57) | 2017 | Please review |
 91 | | ProHacker [[58]](#58) | 2017 | Please review |
 92 | | Esoul and Walkinshaw [[59]](#59) | 2017 | Please review |
 93 | | PREUGI [[60]](#60) | 2017 | Please review |
 94 | | NEMESYS [[61]](#61) | 2018 | Please review |
 95 | | Goo et al. [[62]](#62) | 2019 | Apriori-based: finding “frequent contiguous common subsequences” via a new Contiguous Sequential Pattern (CSP) algorithm which is based on Generalized Sequential Pattern (GSP) and other Apriori algorithms. CSP is used three times hierarchically to extract different information/fields based on previous results. |
 96 | | Universal Radio Hacker [[63]](#63) | 2019 | Physical layer based analysis of proprietary wireless protocols considering wireless specific properties like Received Signal Strength Indicator (RSSI) and using statistical methods |
 97 | | Luo et al. [[64]](#64) | 2019 | From abstract: “[…] this study proposes a type-aware approach to message clustering guided by type information. The approach regards a message as a combination of n-grams, and it employs the Latent Dirichlet Allocation (LDA) model to characterize messages with types and n-grams via inferring the type distribution of each message.” |
 98 | | Sun et al. [[65]](#65) | 2019 | Please review |
 99 | | Yang et al. [[66]](#66) | 2020 | Using deep-learning (LSTM-FCN) for reversing binary protocols |
100 | | Sun et al. [[67]](#67) | 2020 | “To measure format similarity of unknown protocol messages in a proper granularity, we propose relative measurements, Token Format Distance (TFD) and Message Format Distance (MFD), based on core rules of Augmented Backus-Naur Form (ABNF).” The Silhouette Coefficient and Dunn Index are used in the clustering process, and the density-based clustering algorithm DBSCAN is used to cluster the messages. |
101 | | Shim et al. [[68]](#68) | 2020 | Follow up on Goo et al. 2019 |
102 | | IPART [[69]](#69) | 2020 | Uses an extended voting expert algorithm to infer boundaries of fields; otherwise uses three phases: tokenizing, classifying and clustering. |
103 | | NEMETYL [[70]](#70) | 2020 | Please review |
104 | | NetPlier [[71]](#71) | 2021 | Probabilistic method for network trace based protocol reverse engineering. |
105 | 
106 | # Input and Output [&uarr;](#table-of-contents)
107 | 
108 | NetT: input is a network trace (e.g. pcap)<br />
109 | ExeT: input is an execution trace (code/binary at hand)<br />
110 | PF: output is protocol format (describing the syntax)<br />
111 | PFSM: output is protocol finite state machine (describing semantic/sequential logic)<br />
112 | 
113 | | Name | Year | NetT | ExeT | PF | PFSM | Other Output |
114 | |------|------|------|------|----|------|--------------|
115 | | PIP [[1]](#1) | 2004 | &#10004; |  |  |  | Keywords/fields |
116 | | GAPA [[2]](#2) | 2005 |  | &#10004; | &#10004; | &#10004; |  |
117 | | ScriptGen [[3]](#3) | 2005 | &#10004; |  |  |  | Dialogs/scripts (for replaying) |
118 | | RolePlayer [[4]](#4) | 2006 | &#10004; |  |  |  | Dialogs/scripts |
119 | | Ma et al. [[5]](#5) | 2006 | &#10004; |  |  |  | App-identification |
120 | | FFE/x86 [[6]](#6) | 2006 |  | &#10004; |  |  |  |
121 | | Replayer [[7]](#7) | 2006 |  | &#10004; |  |  |  |
122 | | Discoverer [[8]](#8) | 2007 | &#10004; |  | &#10004; |  |  |
123 | | Polyglot [[9]](#9) | 2007 |  | &#10004; | &#10004; |  |  |
124 | | PEXT [[10]](#10) | 2007 |  | &#10004; |  | &#10004; |  |
125 | | Rosetta [[11]](#11) | 2007 |  | &#10004; |  |  |  |
126 | | AutoFormat [[12]](#12) | 2008 |  | &#10004; | &#10004; |  |  |
127 | | Tupni [[13]](#13) | 2008 |  | &#10004; | &#10004; |  |  |
128 | | Boosting [[14]](#14) | 2008 | &#10004; |  |  |  | Field(s) |
129 | | ConfigRE [[15]](#15) | 2008 |  | &#10004; |  |  |  |
130 | | ReFormat [[16]](#16) | 2009 |  | &#10004; | &#10004; |  |  |
131 | | Prospex [[17]](#17) | 2009 | &#10004; | &#10004; | &#10004; | &#10004; |  |
132 | | Xiao et al. [[18]](#18) | 2009 |  | &#10004; |  | &#10004; |  |
133 | | Trifilo et al. [[19]](#19) | 2009 | &#10004; |  |  | &#10004; |  |
134 | | Antunes and Neves [[20]](#20) | 2009 | &#10004; |  |  | &#10004; |  |
135 | | Dispatcher [[21]](#21) | 2009 |  | &#10004; |  |  | C&C malware |
136 | | Fuzzgrind [[22]](#22) | 2009 |  | &#10004; |  |  |  |
137 | | REWARDS [[23]](#23) | 2010 |  | &#10004; |  |  |  |
138 | | MACE [[24]](#24) | 2010 |  | &#10004; |  |  |  |
139 | | Whalen et al. [[25]](#25) | 2010 | &#10004; |  | &#10004; |  |  |
140 | | AutoFuzz [[26]](#26) | 2010 | &#10004; |  | &#10004; | &#10004; |  |
141 | | ReverX [[27]](#27) | 2011 | &#10004; |  | &#10004; | &#10004; |  |
142 | | Veritas [[28]](#28) | 2011 | &#10004; |  |  | &#10004; |  |
143 | | Biprominer [[29]](#29) | 2011 | &#10004; |  | &#10004; | &#10004; |  |
144 | | ASAP [[30]](#30) | 2011 | &#10004; |  |  |  | Semantics |
145 | | Howard [[31]](#31) | 2011 |  | &#10004; |  |  |  |
146 | | ProDecoder [[32]](#32) | 2012 | &#10004; |  | &#10004; |  |  |
147 | | Zhang et al. [[33]](#33) | 2012 | &#10004; |  |  | &#10004; |  |
148 | | Netzob [[34]](#34) | 2012 | &#10004; | &#10004; | &#10004; | &#10004; |  |
149 | | PRISMA [[35]](#35) | 2012 | &#10004; |  |  |  |  |
150 | | ARTISTE [[36]](#36) | 2012 |  | &#10004; |  |  |  |
151 | | Wang et al. [[37]](#37) | 2013 | &#10004; |  | &#10004; |  |  |
152 | | Laroche et al. [[38]](#38) | 2013 | &#10004; |  |  | &#10004; |  |
153 | | AutoReEngine [[39]](#39) | 2013 | &#10004; |  | &#10004; | &#10004; |  |
154 | | Dispatcher2 [[40]](#40) | 2013 |  | &#10004; |  |  | C&C malware |
155 | | ProVeX [[41]](#41) | 2013 | &#10004; |  |  |  | Signatures |
156 | | Meng et al. [[42]](#42) | 2014 | &#10004; |  |  | &#10004; |  |
157 | | AFL [[43]](#43) | 2014 |  | &#10004; |  |  |  |
158 | | Proword [[44]](#44) | 2014 |  |  |  |  |  |
159 | | ProGraph [[45]](#45) | 2015 | &#10004; |  | &#10004; |  |  |
160 | | FieldHunter [[46]](#46) | 2015 | &#10004; |  |  |  | Fields |
161 | | RS Cluster [[47]](#47) | 2015 | &#10004; |  |  |  | Grouped-messages |
162 | | UPCSS [[48]](#48) | 2015 | &#10004; |  |  |  | Proto-classification |
163 | | ARGOS [[49]](#49) | 2015 |  | &#10004; |  |  |  |
164 | | PULSAR [[50]](#50) | 2015 |  |  |  |  |  |
165 | | Li et al. [[51]](#51) | 2015 | &#10004; |  | &#10004; |  |  |
166 | | Cai et al. [[52]](#52) | 2016 | &#10004; |  | &#10004; |  |  |
167 | | WASp [[53]](#53) | 2016 | &#10004; |  | &#10004; |  | scored analysis reports, spoofing candidates |
168 | | PRE-Bin [[54]](#54) | 2016 | &#10004; |  | &#10004; |  |  |
169 | | Xiao et al. [[55]](#55) | 2016 | &#10004; |  | &#10004; |  |  |
170 | | PowerShell [[56]](#56) | 2017 | &#10004; |  |  |  | Dialogs/scripts |
171 | | ProPrint [[57]](#57) | 2017 | &#10004; |  |  |  | Fingerprints |
172 | | ProHacker [[58]](#58) | 2017 | &#10004; |  |  |  | Keywords |
173 | | Esoul and Walkinshaw [[59]](#59) | 2017 |  |  |  |  |  |
174 | | PREUGI [[60]](#60) | 2017 | &#10004; |  |  | &#10004; |  |
175 | | NEMESYS [[61]](#61) | 2018 | &#10004; |  | &#10004; |  |  |
176 | | Goo et al. [[62]](#62) | 2019 | &#10004; |  | &#10004; | &#10004; |  |
177 | | Universal Radio Hacker [[63]](#63) | 2019 | &#10004; |  | &#10004; |  |  |
178 | | Luo et al. [[64]](#64) | 2019 |  |  |  |  |  |
179 | | Sun et al. [[65]](#65) | 2019 |  |  |  |  |  |
180 | | Yang et al. [[66]](#66) | 2020 | &#10004; |  | &#10004; |  |  |
181 | | Sun et al. [[67]](#67) | 2020 |  |  |  |  |  |
182 | | Shim et al. [[68]](#68) | 2020 | &#10004; |  | &#10004; |  |  |
183 | | IPART [[69]](#69) | 2020 | &#10004; |  | &#10004; |  |  |
184 | | NEMETYL [[70]](#70) | 2020 | &#10004; |  | &#10004; |  |  |
185 | | NetPlier [[71]](#71) | 2021 | &#10004; |  |  |  |  |
186 | 
187 | # Tested protocols [&uarr;](#table-of-contents)
188 | 
189 | | Name | Year | Text-based | Binary-based | Hybrid | Other Protocols |
190 | |------|------|------------|--------------|--------|-----------------|
191 | | PIP [[1]](#1) | 2004 | HTTP |  |  |  |
192 | | GAPA [[2]](#2) | 2005 | HTTP |  |  |  |
193 | | ScriptGen [[3]](#3) | 2005 | HTTP | NetBIOS |  | DCE |
194 | | RolePlayer [[4]](#4) | 2006 | HTTP, FTP, SMTP, NFS, TFTP | DNS, BitTorrent, QQ, NetBIOS | SMB, CIFS |  |
195 | | Ma et al. [[5]](#5) | 2006 | HTTP, FTP, SMTP, HTTPS (TCP-Protos) | DNS, NetBIOS, SrvLoc (UDP-Protos) |  |  |
196 | | FFE/x86 [[6]](#6) | 2006 |  |  |  |  |
197 | | Replayer [[7]](#7) | 2006 |  |  |  |  |
198 | | Discoverer [[8]](#8) | 2007 | HTTP | RPC | SMB, CIFS |  |
199 | | Polyglot [[9]](#9) | 2007 | HTTP, Samba, ICQ | DNS, IRC |  |  |
200 | | PEXT [[10]](#10) | 2007 | FTP |  |  |  |
201 | | Rosetta [[11]](#11) | 2007 |  |  |  |  |
202 | | AutoFormat [[12]](#12) | 2008 | HTTP, SIP | DHCP, RIP, OSPF | SMB, CIFS |  |
203 | | Tupni [[13]](#13) | 2008 | HTTP, FTP | RPC, DNS, TFTP |  | WMF, BMP, JPG, PNG, TIF |
204 | | Boosting [[14]](#14) | 2008 |  | DNS |  |  |
205 | | ConfigRE [[15]](#15) | 2008 |  |  |  |  |
206 | | ReFormat [[16]](#16) | 2009 | HTTP, MIME | IRC |  | One unknown protocol |
207 | | Prospex [[17]](#17) | 2009 | SMTP, SIP | SMB |  | Agobot (C&C) |
208 | | Xiao et al. [[18]](#18) | 2009 | HTTP, FTP, SMTP |  |  |  |
209 | | Trifilo et al. [[19]](#19) | 2009 |  | TCP, DHCP, ARP, KAD |  |  |
210 | | Antunes and Neves [[20]](#20) | 2009 | FTP |  |  |  |
211 | | Dispatcher [[21]](#21) | 2009 | HTTP, FTP, ICQ | DNS |  |  |
212 | | Fuzzgrind [[22]](#22) | 2009 |  |  |  |  |
213 | | REWARDS [[23]](#23) | 2010 |  |  |  |  |
214 | | MACE [[24]](#24) | 2010 |  |  |  |  |
215 | | Whalen et al. [[25]](#25) | 2010 |  |  |  |  |
216 | | AutoFuzz [[26]](#26) | 2010 |  |  |  |  |
217 | | ReverX [[27]](#27) | 2011 | FTP |  |  |  |
218 | | Veritas [[28]](#28) | 2011 | SMTP | PPLIVE, XUNLEI |  |  |
219 | | Biprominer [[29]](#29) | 2011 |  | XUNLEI, QQLive, SopCast |  |  |
220 | | ASAP [[30]](#30) | 2011 | HTTP, FTP, IRC, TFTP |  |  |  |
221 | | Howard [[31]](#31) | 2011 |  |  |  |  |
222 | | ProDecoder [[32]](#32) | 2012 | SMTP, SIP | SMB |  |  |
223 | | Zhang et al. [[33]](#33) | 2012 | HTTP, SNMP, ISAKMP |  |  |  |
224 | | Netzob [[34]](#34) | 2012 | FTP, Samba | SMB |  | Unknown P2P & VoIP protocol |
225 | | PRISMA [[35]](#35) | 2012 |  |  |  |  |
226 | | ARTISTE [[36]](#36) | 2012 |  |  |  |  |
227 | | Wang et al. [[37]](#37) | 2013 | ICMP | ARP |  |  |
228 | | Laroche et al. [[38]](#38) | 2013 | FTP | DHCP |  |  |
229 | | AutoReEngine [[39]](#39) | 2013 | HTTP, FTP, SMTP, POP3 | DNS, NetBIOS |  |  |
230 | | Dispatcher2 [[40]](#40) | 2013 | HTTP, FTP, ICQ | DNS | SMB |  |
231 | | ProVeX [[41]](#41) | 2013 | HTTP, SMTP, IMAP | DNS, VoIP, XMPP |  | Malware Family Protocols |
232 | | Meng et al. [[42]](#42) | 2014 |  | TCP, ARP |  |  |
233 | | AFL [[43]](#43) | 2014 |  |  |  |  |
234 | | Proword [[44]](#44) | 2014 |  |  |  |  |
235 | | ProGraph [[45]](#45) | 2015 | HTTP | DNS, BitTorrent, WeChat |  |  |
236 | | FieldHunter [[46]](#46) | 2015 | MSNP | DNS |  | SopCast, Ramnit |
237 | | RS Cluster [[47]](#47) | 2015 | FTP, SMTP, POP3, HTTPS | DNS, XunLei, BitTorrent, BitSpirit, QQ, eMule |  | MSSQL, Kugoo, PPTV |
238 | | UPCSS [[48]](#48) | 2015 | HTTP, FTP, SMTP, POP3, IMAP | DNS, SSL, SSH | SMB |  |
239 | | ARGOS [[49]](#49) | 2015 |  |  |  |  |
240 | | PULSAR [[50]](#50) | 2015 |  |  |  |  |
241 | | Li et al. [[51]](#51) | 2015 |  |  |  |  |
242 | | Cai et al. [[52]](#52) | 2016 | HTTP, SSDP | DNS, BitTorrent, QQ, NetBIOS |  |  |
243 | | WASp [[53]](#53) | 2016 |  |  |  | IEEE 802.15.4 proprietary protocols, Smart plug & PSD systems |
244 | | PRE-Bin [[54]](#54) | 2016 |  |  |  |  |
245 | | Xiao et al. [[55]](#55) | 2016 |  |  |  |  |
246 | | PowerShell [[56]](#56) | 2017 |  | ARP, OSPF, DHCP, STP |  | CDP/DTP/VTP, HSRP, LLDP, LLMNR, mDNS, NBNS, VRRP |
247 | | ProPrint [[57]](#57) | 2017 |  |  |  |  |
248 | | ProHacker [[58]](#58) | 2017 |  |  |  |  |
249 | | Esoul and Walkinshaw [[59]](#59) | 2017 |  |  |  |  |
250 | | PREUGI [[60]](#60) | 2017 |  |  |  |  |
251 | | NEMESYS [[61]](#61) | 2018 |  |  |  |  |
252 | | Goo et al. [[62]](#62) | 2019 | HTTP | DNS |  |  |
253 | | Universal Radio Hacker [[63]](#63) | 2019 |  |  |  | proprietary wireless protocols of IoT devices |
254 | | Luo et al. [[64]](#64) | 2019 |  |  |  |  |
255 | | Sun et al. [[65]](#65) | 2019 |  |  |  |  |
256 | | Yang et al. [[66]](#66) | 2020 |  | IPv4, TCP |  |  |
257 | | Sun et al. [[67]](#67) | 2020 |  |  |  |  |
258 | | Shim et al. [[68]](#68) | 2020 | FTP | Modbus/TCP, Ethernet/IP |  |  |
259 | | IPART [[69]](#69) | 2020 |  | Modbus, IEC104, Ethernet/IP |  |  |
260 | | NEMETYL [[70]](#70) | 2020 |  |  |  |  |
261 | | NetPlier [[71]](#71) | 2021 |  |  |  |  |
262 | 
263 | # Source Code [&uarr;](#table-of-contents)
264 | 
265 | Most papers do not provide the code used in the research. For the following papers, (example) code is available.<br/>
266 | | Name | Year | Source Code |
267 | |------|------|-------------|
268 | | PIP [[1]](#1) | 2004 | https://web.archive.org/web/20090416234849/http://4tphi.net/~awalters/PI/PI.html |
269 | | ReverX [[27]](#27) | 2011 | https://github.com/jasantunes/reverx |
270 | | Netzob [[34]](#34) | 2012 | https://github.com/netzob/netzob |
271 | | PRISMA [[35]](#35) | 2012 | https://github.com/tammok/PRISMA/ |
272 | | PULSAR [[50]](#50) | 2015 | https://github.com/hgascon/pulsar |
273 | | NEMESYS [[61]](#61) | 2018 | https://github.com/vs-uulm/nemesys |
274 | | Universal Radio Hacker [[63]](#63) | 2019 | https://github.com/jopohl/urh |
275 | | NetPlier [[71]](#71) | 2021 | https://github.com/netplier-tool/NetPlier/ |
276 | 
277 | # References [&uarr;](#table-of-contents)
278 | 
279 | #### [1]
280 | M. Beddoe, “The protocol informatics project,” 2004, http://www.4tphi.net/~awalters/PI/PI.html. [PDF](http://www.4tphi.net/~awalters/PI/pi.pdf)
281 | #### [2]
282 | N. Borisov, D. J. Brumley, H. J. Wang, J. Dunagan, P. Joshi, and C. Guo, “Generic application-level protocol analyzer and its language,” MSR Technical Report MSR-TR-2005-133, 2005. [PDF](http://www.academia.edu/download/31148072/2005-133.pdf)
283 | #### [3]
284 | C. Leita, K. Mermoud, and M. Dacier, “ScriptGen: an automated script generation tool for Honeyd,” in Proceedings of the 21st Annual Computer Security Applications Conference (ACSAC ’05), pp. 203–214, Tucson, Ariz, USA, December 2005. [PDF](https://www.researchgate.net/profile/Marc_Dacier/publication/4207362_ScriptGen_an_automated_script_generation_tool_for_Honeyd/links/02e7e52f3691fd9dbb000000/ScriptGen-an-automated-script-generation-tool-for-Honeyd.pdf)
285 | #### [4]
286 | W. Cui, V. Paxson, N. C. Weaver, and R. H. Katz, “Protocol-independent adaptive replay of application dialog,” in Proceedings of the 13th Symposium on Network and Distributed System Security (NDSS ’06), 2006. [PDF](https://www.ndss-symposium.org/wp-content/uploads/2017/09/Protocol-Independent-Adaptive-Replay-of-Application-Dialog-Weidong-Cui.pdf)
287 | #### [5]
288 | J. Ma, K. Levchenko, C. Kreibich, S. Savage, and G. Voelker, “Automatic protocol inference: unexpected means of identifying protocols,” UCSD Computer Science Technical Report CS2006-0850, 2006. [PDF](http://www.academia.edu/download/38268/2q7zdptm5klmgg2h0g9.pdf)
289 | #### [6]
290 | Lim, J., Reps, T., Liblit, B.: Extracting output formats from executables. In: 13th Working Conference on Reverse Engineering, 2006. WCRE ’06, pp. 167–178. IEEE, Benevento (2006). doi:10.1109/WCRE.2006.29 [PDF](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.138.3603&rep=rep1&type=pdf)
291 | #### [7]
292 | Cui, W., Paxson, V., Weaver, N., Katz, R.H.: Protocol-independent adaptive replay of application dialog. In: Proceedings of the 13th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2006). http://research.microsoft.com/apps/pubs/default.aspx?id=153197
293 | #### [8]
294 | W. Cui, J. Kannan, and H. J. Wang, “Discoverer: Automatic protocol reverse engineering from network traces,” in USENIX security symposium, 2007, pp. 1–14. [PDF](https://www.usenix.org/event/sec07/tech/full_papers/cui/cui.pdf)
295 | #### [9]
296 | J. Caballero, H. Yin, Z. Liang, and D. Song, “Polyglot: automatic extraction of protocol message format using dynamic binary analysis,” in Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS ’07), pp. 317–329, ACM, November 2007. [PDF](https://people.eecs.berkeley.edu/~dawnsong/papers/2007%20p317-caballero.pdf)
297 | #### [10]
298 | M. Shevertalov and S. Mancoridis, “A reverse engineering tool for extracting protocols of networked applications,” in Proceedings of the 14th Working Conference on Reverse Engineering (WCRE ’07), pp. 229–238, October 2007. [PDF](http://www.cs.drexel.edu/~spiros/papers/WCRE07.pdf)
299 | #### [11]
300 | Caballero, J., Song, D.: Rosetta: Extracting Protocol Semantics Using Binary Analysis with Applications to Protocol Replay and NAT Rewriting. Technical Report CMU-CyLab-07-014, Carnegie Mellon University, Pittsburgh (2007)
301 | #### [12]
302 | Z. Lin, X. Jiang, D. Xu, and X. Zhang, “Automatic protocol format reverse engineering through context-aware monitored execution,” in Proceedings of the 15th Symposium on Network and  Distributed System Security (NDSS ’08), February 2008. [PDF](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.2651&rep=rep1&type=pdf)
303 | #### [13]
304 | W. Cui, M. Peinado, K. Chen, H. J. Wang, and L. Irun-Briz, “Tupni: automatic reverse engineering of input formats,” in Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS ’08), pp. 391–402, ACM, Alexandria, Va, USA, October 2008. [PDF](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tupni-ccs08.pdf)
305 | #### [14]
306 | K. Gopalratnam, S. Basu, J. Dunagan, and H. J. Wang, “Automatically extracting fields from unknown network protocols,” in Proceedings of the 15th Symposium on Network and Distributed System Security (NDSS ’08), 2008. [PDF](http://www.nicemice.net/helen/papers/sysml-Gopalratnam.pdf)
307 | #### [15]
308 | Wang, R., Wang, X., Zhang, K., Li, Z.: Towards automatic reverse engineering of software security configurations. In: Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS ’08, pp. 245–256. ACM, Limerick (2008). doi:10.1145/1455770.1455802
309 | #### [16]
310 | Z. Wang, X. Jiang, W. Cui, X. Wang, and M. Grace, “ReFormat: automatic reverse engineering of encrypted messages,” in Computer Security—ESORICS 2009. ESORICS 2009, M. Backes and P. Ning, Eds., vol. 5789 of Lecture Notes in Computer Science, pp. 200–215, Springer, Berlin, Germany, 2009. [PDF](https://link.springer.com/content/pdf/10.1007/978-3-642-04444-1_13.pdf)
311 | #### [17]
312 | P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, “Prospex: protocol specification extraction,” in Proceedings of the 30th IEEE Symposium on Security and Privacy, pp. 110–125, Berkeley, Calif, USA, May 2009. [PDF](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.720.3272&rep=rep1&type=pdf)
313 | #### [18]
314 | M.-M. Xiao, S.-Z. Yu, and Y. Wang, “Automatic network protocol automaton extraction,” in Proceedings of the 3rd International Conference on Network and System Security (NSS ’09), pp. 336–343, October 2009.
315 | #### [19]
316 | A. Trifilo, S. Burschka, and E. Biersack, “Traffic to protocol reverse engineering,” in Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–8, July 2009. [PDF](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.2189&rep=rep1&type=pdf)
317 | #### [20]
318 | J. Antunes and N. Neves, “Building an automaton towards reverse protocol engineering,” 2009, http://www.di.fc.ul.pt/~nuno/PAPERS/INFORUM09.pdf.
319 | #### [21]
320 | J. Caballero, P. Poosankam, C. Kreibich, and D. Song, “Dispatcher: enabling active botnet infiltration using automatic protocol reverse-engineering,” in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS ’09), pp. 621–634, ACM, Chicago, Ill, USA, November 2009. [PDF](https://people.eecs.berkeley.edu/~dawnsong/papers/2009%20Dispatcher.pdf)
321 | #### [22]
322 | Campana, G.: Fuzzgrind: an automatic fuzzing tool. In: Hack. lu. Hack. lu, Luxembourg (2009)
323 | #### [23]
324 | Lin, Z., Zhang, X., Xu, D.: Automatic reverse engineering of data structures from binary execution. In: Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2010)
325 | #### [24]
326 | Cho, C.Y., Babić, D., Shin, E.C.R., Song, D.: Inference and analysis of formal models of botnet command and control protocols. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS ’10, pp. 426–439. ACM, New York, NY (2010). doi:10.1145/1866307.1866355
327 | Cho, C.Y., Babić, D., Poosankam, P., Chen, K.Z., Wu, E.X., Song, D.: MACE: model-inference-assisted concolic exploration for protocol and vulnerability discovery. In: Proceedings of the 20th USENIX Conference on Security, SEC’11, p. 19. USENIX Association, Berkeley, CA (2011)
328 | #### [25]
329 | S. Whalen, M. Bishop, and J. P. Crutchfield, “Hidden Markov Models for Automated Protocol Learning,” in Security and Privacy in Communication Networks, vol. 50, S. Jajodia and J. Zhou, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 415–428.  [PDF](http://nob.cs.ucdavis.edu/bishop/papers/2010-securecomm/markov.pdf)
330 | #### [26]
331 | S. Gorbunov and A. Rosenbloom, “Autofuzz: Automated network protocol fuzzing framework,” IJCSNS, vol. 10, no. 8, p. 239, 2010.  [PDF](http://people.csail.mit.edu/sergeyg/publications/autofuzz.pdf)
332 | #### [27]
333 | J. Antunes, N. Neves, and P. Verissimo, “Reverse engineering of protocols from network traces,” in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE ’11), pp. 169–178, October 2011. [PDF](https://www.researchgate.net/profile/Joao_Antunes3/publication/221200255_Reverse_Engineering_of_Protocols_from_Network_Traces/links/0fcfd50c3eb9574ac4000000.pdf)
334 | #### [28]
335 | Y. Wang, Z. Zhang, D. D. Yao, B. Qu, and L. Guo, “Inferring protocol state machine from network traces: a probabilistic approach,” in Proceedings of the 9th Applied Cryptography and Network Security International Conference (ACNS ’11), pp. 1–18, 2011. [PDF](https://link.springer.com/content/pdf/10.1007/978-3-642-21554-4_1.pdf)
336 | #### [29]
337 | Y. Wang, X. Li, J. Meng, Y. Zhao, Z. Zhang, and L. Guo, “Biprominer: automatic mining of binary protocol features,” in Proceedings of the 12th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT ’11), pp. 179–184, October 2011.
338 | #### [30]
339 | T. Krueger, N. Krämer, and K. Rieck, “Asap: automatic semantics-aware analysis of network payloads,” in Proceedings of the ECML/PKDD, 2011. [PDF](https://oar.tib.eu/jspui/bitstream/123456789/2288/1/664831966.pdf)
340 | #### [31]
341 | Slowinska, A., Stancescu, T., Bos, H.: Howard: a dynamic excavator for reverse engineering data structures. In: Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2011)
342 | #### [32]
343 | Y. Wang, X. Yun, M. Z. Shafiq et al., “A semantics aware approach to automated reverse engineering unknown protocols,” in Proceedings of the 20th IEEE International Conference on Network Protocols (ICNP ’12), pp. 1–10, IEEE, Austin, Tex, USA, November 2012. [PDF](https://yaogroup.cs.vt.edu/papers/ICNP-12.pdf)
344 | #### [33]
345 | Z. Zhang, Q.-Y. Wen, and W. Tang, “Mining protocol state machines by interactive grammar inference,” in Proceedings of the 2012 3rd International Conference on Digital Manufacturing and Automation (ICDMA ’12), pp. 524–527, August 2012.
346 | #### [34]
347 | G. Bossert, F. Guihéry, and G. Hiet, “Towards automated protocol reverse engineering using semantic information,” in Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, Kyoto, Japan, June 2014.
348 | G. Bossert and F. Guihéry, “Reverse and simulate your enemy botnet C&C,” in Proceedings of the Mapping a P2P Botnet with Netzob, Black Hat 2012, Abu Dhabi, UAE, December 2012. [PDF](https://www.amossys.fr/upload/publication.pdf)
349 | #### [35]
350 | Krueger, T., Gascon, H., Krämer, N., Rieck, K.: Learning stateful models for network honeypots. In: Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence, AISec ’12, pp. 37–48. ACM, New York, NY (2012). [PDF](https://hugogascon.com/publications/2012a-aisec.pdf)
351 | #### [36]
352 | Caballero, J., Grieco, G., Marron, M., Lin, Z., Urbina, D.: ARTISTE: Automatic Generation of Hybrid Data Structure Signatures from Binary Code Executions. Technical Report TR-IMDEA-SW-2012-001, IMDEA Software Institute, Madrid (2012)
353 | #### [37]
354 | Y. Wang, N. Zhang, Y.-M. Wu, B.-B. Su, and Y.-J. Liao, “Protocol formats reverse engineering based on association rules in wireless environment,” in Proceedings of the 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’13), pp. 134–141, Melbourne, Australia, July 2013.
355 | #### [38]
356 | P. Laroche, A. Burrows, and A. N. Zincir-Heywood, “How far an evolutionary approach can go for protocol state analysis and discovery,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC ’13), pp. 3228–3235, June 2013.
357 | #### [39]
358 | J.-Z. Luo and S.-Z. Yu, “Position-based automatic reverse engineering of network protocols,” Journal of Network and Computer Applications, vol. 36, no. 3, pp. 1070–1077, 2013.
359 | #### [40]
360 | J. Caballero and D. Song, “Automatic protocol reverse-engineering: message format extraction and field semantics inference,” Computer Networks, vol. 57, no. 2, pp. 451–474, 2013. [PDF](http://www.academia.edu/download/47267446/j.comnet.2012.08.00320160715-7025-1uns1ji.pdf)
361 | #### [41]
362 | C. Rossow and C. J. Dietrich, “PROVEX: detecting botnets with encrypted command and control channels,” in Detection of Intrusions and Malware, and Vulnerability Assessment, Springer, 2013. [PDF](https://chrisdietri.ch/files/provex-dimva2013.pdf)
363 | #### [42]
364 | F. Meng, Y. Liu, C. Zhang, T. Li, and Y. Yue, “Inferring protocol state machine for binary communication protocol,” in Proceedings of the IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA ’14), pp. 870–874, September 2014.
365 | #### [43]
366 | Zalewski, M.: American Fuzzy Lop. http://lcamtuf.coredump.cx/afl/technical_details.txt
367 | #### [44]
368 | Z. Zhang, Z. Zhang, P. P. C. Lee, Y. Liu, and G. Xie, “ProWord: An unsupervised approach to protocol feature word extraction,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Apr. 2014, pp. 1393–1401, doi: 10.1109/INFOCOM.2014.6848073.  [PDF](http://adslab.cse.cuhk.edu.hk/pubs/infocom14proword.pdf)
369 | #### [45]
370 | Q. Huang, P. P. C. Lee, and Z. Zhang, “Exploiting intrapacket dependency for fine-grained protocol format inference,” in Proceedings of the 14th IFIP Networking Conference (NETWORKING ’15), Toulouse, France, May 2015.
371 | #### [46]
372 | I. Bermudez, A. Tongaonkar, M. Iliofotou, M. Mellia, and M. M. Munafo, “Automatic protocol field inference for deeper protocol understanding,” in Proceedings of the 14th IFIP Networking Conference (Networking ’15), pp. 1–9, May 2015. [PDF](http://dl.ifip.org/db/conf/networking/networking2015/1570062733.pdf)
373 | #### [47]
374 | J.-Z. Luo, S.-Z. Yu, and J. Cai, “Capturing uncertainty information and categorical characteristics for network payload grouping in protocol reverse engineering,” Mathematical Problems in Engineering, vol. 2015, Article ID 962974, 9 pages, 2015.
375 | #### [48]
376 | R. Lin, O. Li, Q. Li, and Y. Liu, “Unknown network protocol classification method based on semi supervised learning,” in Proceedings of the IEEE International Conference on Computer and Communications (ICCC ’15), pp. 300–308, Chengdu, China, October 2015.
377 | #### [49]
378 | Zeng, J., Lin, Z.: Towards automatic inference of kernel object semantics from binary code. In: 18th International Symposium, RAID 2015, vol. 9404, pp. 538–561. Springer, Kyoto (2015). doi:10.1007/978-3-319-26362-5
379 | #### [50]
380 | H. Gascon, C. Wressnegger, F. Yamaguchi, D. Arp, and K. Rieck, “Pulsar: Stateful Black-Box Fuzzing of Proprietary Network Protocols,” in Security and Privacy in Communication Networks, vol. 164, B. Thuraisingham, X. Wang, and V. Yegneswaran, Eds. Cham: Springer International Publishing, 2015, pp. 330–347.  [PDF](http://user.cs.uni-goettingen.de/~krieck/docs/2015-securecomm.pdf)
381 | #### [51]
382 | H. Li, B. Shuai, J. Wang, and C. Tang, “Protocol Reverse Engineering Using LDA and Association Analysis,” in 2015 11th International Conference on Computational Intelligence and Security (CIS), Shenzhen, China, Dec. 2015, pp. 312–316, doi: 10.1109/CIS.2015.83. 
383 | #### [52]
384 | J. Cai, J. Luo, and F. Lei, “Analyzing network protocols of application layer using hidden Semi-Markov model,” Mathematical Problems in Engineering, vol. 2016, Article ID 9161723, 14 pages, 2016.
385 | #### [53]
386 | K. Choi, Y. Son, J. Noh, H. Shin, J. Choi, and Y. Kim, “Dissecting customized protocols: automatic analysis for customized protocols based on IEEE 802.15.4,” in Proceedings of the 9th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pp. 183–193, Darmstadt, Germany, July 2016. [PDF](https://koasas.kaist.ac.kr/bitstream/10203/215875/1/choi_wisec2016.pdf)
387 | #### [54]
388 | S. Tao, H. Yu, and Q. Li, “Bit‐oriented format extraction approach for automatic binary protocol reverse engineering,” IET Communications, vol. 10, no. 6, pp. 709–716, Apr. 2016, doi: 10.1049/iet-com.2015.0797.  [PDF](https://www.researchgate.net/profile/Si_Yu_Tao/publication/298803896_Bit-oriented_format_extraction_approach_for_automatic_binary_protocol_reverse_engineering/links/5cef30e64585153c3da53f0e/Bit-oriented-format-extraction-approach-for-automatic-binary-protocol-reverse-engineering.pdf)
389 | #### [55]
390 | M.-M. Xiao, S.-L. Zhang, and Y.-P. Luo, “Automatic network protocol message format analysis,” IFS, vol. 31, no. 4, pp. 2271–2279, Sep. 2016, doi: 10.3233/JIFS-169067. 
391 | #### [56]
392 | D. R. Fletcher Jr., Identifying Vulnerable Network Protocols with PowerShell, SANS Institute Reading Room site, 2017.
393 | #### [57]
394 | Y. Wang, X. Yun, Y. Zhang, L. Chen, and G. Wu, “A nonparametric approach to the automated protocol fingerprint inference,” Journal of Network and Computer Applications, vol. 99, pp. 1–9, 2017.
395 | #### [58]
396 | Y. Wang, X. Yun, Y. Zhang, L. Chen, and T. Zang, “Rethinking robust and accurate application protocol identification,” Computer Networks, vol. 129, pp. 64–78, 2017.
397 | #### [59]
398 | O. Esoul and N. Walkinshaw, “Using Segment-Based Alignment to Extract Packet Structures from Network Traces,” in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), Prague, Czech Republic, Jul. 2017, pp. 398–409, doi: 10.1109/QRS.2017.49.  [PDF](https://leicester.figshare.com/articles/Using_Segment-Based_Alignment_to_Extract_Packet_Structures_from_Network_Traces/10236467/files/18473123.pdf)
399 | #### [60]
400 | M.-M. Xiao and Y.-P. Luo, “Automatic protocol reverse engineering using grammatical inference,” IFS, vol. 32, no. 5, pp. 3585–3594, Apr. 2017, doi: 10.3233/JIFS-169294. 
401 | #### [61]
402 | S. Kleber, H. Kopp, and F. Kargl, “NEMESYS: Network message syntax reverse engineering by analysis of the intrinsic structure of individual messages,” 2018.  [PDF](https://www.usenix.org/system/files/conference/woot18/woot18-paper-kleber.pdf)
403 | #### [62]
404 | Y.-H. Goo, K.-S. Shim, M.-S. Lee, and M.-S. Kim, “Protocol Specification Extraction Based on Contiguous Sequential Pattern Algorithm,” IEEE Access, vol. 7, pp. 36057–36074, 2019, doi: 10.1109/ACCESS.2019.2905353.  [PDF](https://ieeexplore.ieee.org/iel7/6287639/6514899/08667834.pdf)
405 | #### [63]
406 | J. Pohl and A. Noack, “Universal radio hacker: A suite for analyzing and attacking stateful wireless protocols,” Baltimore, MD, Aug. 2018, [Online]. Available: https://www.usenix.org/conference/woot18/presentation/pohl. 
407 | J. Pohl and A. Noack, “Automatic wireless protocol reverse engineering,” Santa Clara, CA, Aug. 2019, [Online]. Available: https://www.usenix.org/conference/woot19/presentation/pohl.  [PDF](https://www.usenix.org/system/files/conference/woot18/woot18-paper-pohl.pdf)
408 | #### [64]
409 | X. Luo, D. Chen, Y. Wang, and P. Xie, “A Type-Aware Approach to Message Clustering for Protocol Reverse Engineering,” Sensors, vol. 19, no. 3, p. 716, Feb. 2019, doi: 10.3390/s19030716.  [PDF](https://www.mdpi.com/1424-8220/19/3/716/pdf)
410 | #### [65]
411 | F. Sun, S. Wang, C. Zhang, and H. Zhang, “Unsupervised field segmentation of unknown protocol messages,” Computer Communications, vol. 146, pp. 121–130, Oct. 2019, doi: 10.1016/j.comcom.2019.06.013. 
412 | #### [66]
413 | C. Yang, C. Fu, Y. Qian, Y. Hong, G. Feng, and L. Han, “Deep Learning-Based Reverse Method of Binary Protocol,” in Security and Privacy in Digital Economy, vol. 1268, S. Yu, P. Mueller, and J. Qian, Eds. Singapore: Springer Singapore, 2020, pp. 606–624. 
414 | #### [67]
415 | F. Sun, S. Wang, C. Zhang, and H. Zhang, “Clustering of unknown protocol messages based on format comparison,” Computer Networks, vol. 179, p. 107296, Oct. 2020, doi: 10.1016/j.comnet.2020.107296. 
416 | #### [68]
417 | K. Shim, Y. Goo, M. Lee, and M. Kim, “Clustering method in protocol reverse engineering for industrial protocols,” International Journal of Network Management, Jun. 2020, doi: 10.1002/nem.2126.  [PDF](https://nmlab.korea.ac.kr/publication/published.papers/2020/2020.06_Clustering_method_for_ICS-APRE-IJNM.pdf)
418 | #### [69]
419 | X. Wang, K. Lv, and B. Li, “IPART: an automatic protocol reverse engineering tool based on global voting expert for industrial protocols,” International Journal of Parallel, Emergent and Distributed Systems, vol. 35, no. 3, pp. 376–395, May 2020, doi: 10.1080/17445760.2019.1655740. 
420 | #### [70]
421 | S. Kleber, R. W. van der Heijden, and F. Kargl, “Message Type Identification of Binary Network Protocols using Continuous Segment Similarity,” in IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Jul. 2020, pp. 2243–2252. doi: 10.1109/INFOCOM41043.2020.9155275.  [PDF](https://arxiv.org/pdf/2002.03391)
422 | #### [71]
423 | Ye, Yapeng, Zhuo Zhang, Fei Wang, Xiangyu Zhang, and Dongyan Xu. “NetPlier: Probabilistic Network Protocol Reverse Engineering from Message Traces.” In NDSS. 2021. [PDF](https://www.ndss-symposium.org/wp-content/uploads/ndss2021_4A-5_24531_paper.pdf)
424 | 


--------------------------------------------------------------------------------
/img/autoreengine.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/techge/PRE-list/9c18f7c8f00978ec70ba285a5f13b461ede12e93/img/autoreengine.png


--------------------------------------------------------------------------------
/img/biprominer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/techge/PRE-list/9c18f7c8f00978ec70ba285a5f13b461ede12e93/img/biprominer.png


--------------------------------------------------------------------------------
/tools.csv:
--------------------------------------------------------------------------------
 1 | Name;Year;Paper(s);DOI;Link to paper;Approach used;NetT;ExeT;PF;PFSM;Other Output;Text-based;Binary-based;Hybrid;Other Protocols;Source Code
 2 | Discoverer;2007;W. Cui, J. Kannan, and H. J. Wang, “Discoverer: Automatic protocol reverse engineering from network traces.,” in USENIX security symposium, 2007, pp. 1–14. ;;https://www.usenix.org/event/sec07/tech/full_papers/cui/cui.pdf;Tokenization of messages, recursive clustering to find formats, merge similar formats;x;;x;;;HTTP;RPC;SMB, CIFS;;
 3 | Polyglot;2007;J. Caballero, H. Yin, Z. Liang, and D. Song, “Polyglot: automatic extraction of protocol message format using dynamic binary analysis,” in Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS ’07), pp. 317–329, ACM, November 2007.;https://doi.org/10.1145/1315245.1315286;https://people.eecs.berkeley.edu/~dawnsong/papers/2007%20p317-caballero.pdf;Dynamic taint-analysis;;x;x;;;HTTP, Samba, ICQ;DNS, IRC;;;
 4 | AutoFormat;2008;Z. Lin, X. Jiang, D. Xu, and X. Zhang, “Automatic protocol format reverse engineering through context-aware monitored execution,” in Proceedings of the 15th Symposium on Network and  Distributed System Security (NDSS ’08), February 2008.;;https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.2651&rep=rep1&type=pdf;Dynamic taint-analysis;;x;x;;;HTTP, SIP;DHCP, RIP, OSPF;SMB, CIFS;;
 5 | Tupni;2008;W. Cui, M. Peinado, K. Chen, H. J. Wang, and L. Irun-Briz, “Tupni: automatic reverse engineering of input formats,” in Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS ’08), pp. 391–402, ACM, Alexandria, Va, USA, October 2008.;10.1145/1455770.1455820;https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tupni-ccs08.pdf;"Dynamic taint-analysis; look for loops to identify boundaries within messages";;x;x;;;HTTP, FTP;RPC, DNS, TFTP;;WMF, BMP, JPG, PNG, TIF;
 6 | ReFormat;2009;Z. Wang, X. Jiang, W. Cui, X. Wang, and M. Grace, “ReFormat: automatic reverse engineering of encrypted messages,” in Computer Security—ESORICS 2009. ESORICS 2009, M. Backes and P. Ning, Eds., vol. 5789 of Lecture Notes in Computer Science, pp. 200–215, Springer, Berlin, Germany, 2009.;10.1007/978-3-642-04444-1_13;https://link.springer.com/content/pdf/10.1007/978-3-642-04444-1_13.pdf;Dynamic taint-analysis, especially targeting encrypted protocols by looking for bitwise and arithmetic operations;;x;x;;;HTTP, MIME;IRC;;One unknown protocol;
 7 | Prospex;2009;P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, “Prospex: protocol specification extraction,” in Proceedings of the 30th IEEE Symposium on Security and Privacy, pp. 110–125, Berkeley, Calif, USA, May 2009.;10.1109/SP.2009.14;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.720.3272&rep=rep1&type=pdf;Dynamic taint-analysis followed by message clustering, optionally provides fuzzing candidates for the Peach fuzzer;x;x;x;x;;SMTP, SIP;SMB;;Agobot (C&C);
 8 | ProDecoder;2012;Y. Wang, X. Yun, M. Z. Shafiq et al., “A semantics aware approach to automated reverse engineering unknown protocols,” in Proceedings of the 20th IEEE International Conference on Network Protocols (ICNP ’12), pp. 1–10, IEEE, Austin, Tex, USA, November 2012.;10.1109/ICNP.2012.6459963;https://yaogroup.cs.vt.edu/papers/ICNP-12.pdf;"Successor of Biprominer which also addresses text-based protocols; two phases are used: first apply Biprominer, second use Needleman-Wunsch for alignment";x;;x;;;SMTP, SIP;SMB;;;
 9 | Wang et al.;2013;Y. Wang, N. Zhang, Y.-M. Wu, B.-B. Su, and Y.-J. Liao, “Protocol formats reverse engineering based on association rules in wireless environment,” in Proceedings of the 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’13), pp. 134–141, Melbourne, Australia, July 2013.;10.1109/TrustCom.2013.21;;Capturing of data, identifying frames and inferring the format by looking at the frequency of frames and doing association analysis (using Apriori and FP-Growth).;x;;x;;;ICMP;ARP;;;
10 | ProGraph;2015;Q. Huang, P. P. C. Lee, and Z. Zhang, “Exploiting intrapacket dependency for fine-grained protocol format inference,” in Proceedings of the 14th IFIP Networking Conference (NETWORKING ’15), Toulouse, France, May 2015.;;;Please review;x;;x;;;HTTP;DNS, BitTorrent, WeChat;;;
11 | Cai et al.;2016;J. Cai, J. Luo, and F. Lei, “Analyzing network protocols of application layer using hidden Semi-Markov model,” Mathematical Problems in Engineering, vol. 2016, Article ID 9161723, 14 pages, 2016.;;;Please review;x;;x;;;HTTP, SSDP;DNS, BitTorrent, QQ, NetBios;;;
12 | WASp;2016;K. Choi, Y. Son, J. Noh, H. Shin, J. Choi, and Y. Kim, “Dissecting customized protocols: automatic analysis for customized protocols based on IEEE 802.15.4,” in Proceedings of the 9th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pp. 183–193, Darmstadt, Germany, July 2016.;10.1145/2939918.2939921;https://koasas.kaist.ac.kr/bitstream/10203/215875/1/choi_wisec2016.pdf;Pcap files are provided with context information (i.e. known MAC address), then grouping and analysing (looking for CRC, N-gram, Entropy, Features, Ranges), afterwards report creation based on scoring.;x;;x;;scored analysis reports, spoofing candidates;;;;IEEE 802.15.4 proprietary protocols, Smart plug & PSD systems;
13 | PEXT;2007;M. Shevertalov and S. Mancoridis, “A reverse engineering tool for extracting protocols of networked applications,” in Proceedings of the 14th Working Conference on Reverse Engineering (WCRE ’07), pp. 229–238, October 2007.;;http://www.cs.drexel.edu/~spiros/papers/WCRE07.pdf;Message clustering for creating FSM graph and simplify FSM graph;;x;;x;;FTP;;;;
14 | Xiao et al.;2009;M.-M. Xiao, S.-Z. Yu, and Y. Wang, “Automatic network protocol automaton extraction,” in Proceedings of the 3rd International Conference on Network and System Security (NSS ’09), pp. 336–343, October 2009.;;;Please review;;x;;x;;HTTP, FTP, SMTP;;;;
15 | Trifilo et al.;2009;A. Trifilo, S. Burschka, and E. Biersack, “Traffic to protocol reverse engineering,” in Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–8, July 2009.;;https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.2189&rep=rep1&type=pdf;Measure byte-wise variances in aligned messages;x;;;x;;;TCP, DHCP, ARP, KAD;;;
16 | Antunes and Neves;2009;J. Antunes and N. Neves, “Building an automaton towards reverse protocol engineering,” 2009, http://www.di.fc.ul.pt/∼nuno/PAPERS/INFORUM09.pdf.;;;Please review;x;;;x;;FTP;;;;
17 | ReverX;2011;J. Antunes, N. Neves, and P. Verissimo, “Reverse engineering of protocols from network traces,” in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE ’11), pp. 169–178, October 2011.;10.1109/WCRE.2011.28;https://www.researchgate.net/profile/Joao_Antunes3/publication/221200255_Reverse_Engineering_of_Protocols_from_Network_Traces/links/0fcfd50c3eb9574ac4000000.pdf;"Speech recognition (thus only for text-based protocols) to find carriage returns and spaces, afterwards looking for frequencies of keywords; multiple partial FSMs are merged and simplified to get PFSM";x;;x;x;;FTP;;;;https://github.com/jasantunes/reverx
18 | Veritas;2011;Y. Wang, Z. Zhang, D. D. Yao, B. Qu, and L. Guo, “Inferring protocol state machine from network traces: a probabilistic approach,” in Proceedings of the 9th Applied Cryptography and Network Security International Conference (ACNS ’11), pp. 1–18, 2011.;10.1007/978-3-642-21554-4_1;https://link.springer.com/content/pdf/10.1007/978-3-642-21554-4_1.pdf;Identifying keywords, clustering and transition probability → probabilistic protocol state machine;x;;;x;;SMTP;PPLIVE, XUNLEI;;;
19 | Zhang et al.;2012;Z. Zhang, Q.-Y. Wen, and W. Tang, “Mining protocol state machines by interactive grammar inference,” in Proceedings of the 2012 3rd International Conference on Digital Manufacturing and Automation (ICDMA ’12), pp. 524–527, August 2012.;;;Please review;x;;;x;;HTTP, SNMP, ISAKMP;;;;
20 | Laroche et al.;2013;P. Laroche, A. Burrows, and A. N. Zincir-Heywood, “How far an evolutionary approach can go for protocol state analysis and discovery,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC ’13), pp. 3228–3235, June 2013.;;;Please review;x;;;x;;FTP;DHCP;;;
21 | Meng et al.;2014;F. Meng, Y. Liu, C. Zhang, T. Li, and Y. Yue, “Inferring protocol state machine for binary communication protocol,” in Proceedings of the IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA ’14), pp. 870–874, September 2014.;10.1109/WARTIA.2014.6976411;;Please review;x;;;x;;;TCP, ARP;;;
22 | GAPA;2005;N. Borisov, D. J. Brumley, H. J. Wang, J. Dunagan, P. Joshi, and C. Guo, “Generic application-level protocol analyzer and its language,” MSR Technical Report MSR-TR-2005-133, 2005.;;http://www.academia.edu/download/31148072/2005-133.pdf;Protocol analyzer and open language that uses the protocol analyzer specification Spec → it is meant to be integrated in monitoring and analyzing tools;;x;x;x;;HTTP;;;;
23 | Biprominer;2011;Y. Wang, X. Li, J. Meng, Y. Zhao, Z. Zhang, and L. Guo, “Biprominer: automatic mining of binary protocol features,” in Proceedings of the 12th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT ’11), pp. 179–184, October 2011.;10.1109/PDCAT.2011.25;;Statistical analysis including three phases, learning phase, labeling phase and transition probability model building phase. See [this figure](img/biprominer.png).;x;;x;x;;;XUNLEI, QQLive, SopCast;;;
24 | Netzob;2012;"G. Bossert, F. Guihéry, and G. Hiet, “Towards automated protocol reverse engineering using semantic information,” in Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, Kyoto, Japan, June 2014.
25 | G. Bossert and F. Guihéry, “Reverse and simulate your enemy botnet C&C,” in Proceedings of the Mapping a P2P Botnet with Netzob, Black Hat 2012, Abu Dhabi, UAE, December 2012.";10.1145/2590296.2590346;https://www.amossys.fr/upload/publication.pdf;See [this figure](https://github.com/netzob/netzob/blob/4a72c0cbd6d1e7b997b2b8ad170b7a38e400dfca/netzob/doc/documentation/source/netzob_archi.png);x;x;x;x;;FTP, Samba;SMB;;Unknown P2P & VoIP protocol;https://github.com/netzob/netzob
26 | AutoReEngine;2013;J.-Z. Luo and S.-Z. Yu, “Position-based automatic reverse engineering of network protocols,” Journal of Network and Computer Applications, vol. 36, no. 3, pp. 1070–1077, 2013.;10.1016/j.jnca.2013.01.013;;Apriori Algorithm (based on Agrawal/Srikant 1994). Identify fields and keywords by considering the amount of occurrences. Message formats are considered as series of keywords. State machines are derived from labeled messages or frequent subsequences. See [this figure](img/autoreengine.png) for clarification.;x;;x;x;;HTTP, FTP, SMTP, POP3;DNS, NetBIOS;;;
27 | ScriptGen;2005;C. Leita, K. Mermoud, and M. Dacier, “ScriptGen: an automated script generation tool for Honeyd,” in Proceedings of the 21st Annual Computer Security Applications Conference (ACSAC ’05), pp. 203–214, Tucson, Ariz, USA, December 2005.;10.1109/CSAC.2005.49;https://www.researchgate.net/profile/Marc_Dacier/publication/4207362_ScriptGen_an_automated_script_generation_tool_for_Honeyd/links/02e7e52f3691fd9dbb000000/ScriptGen-an-automated-script-generation-tool-for-Honeyd.pdf;Grouping and clustering messages, find edges from clusters to clusters for being able to replay messages once a similar message arrives;x;;;;Dialogs/scripts (for replaying);HTTP;NetBIOS;;DCE;
28 | RolePlayer;2006;W. Cui, V. Paxson, N. C. Weaver, and R. H. Katz, “Protocol-independent adaptive replay of application dialog,” in Proceedings of the 13th Symposium on Network and Distributed System Security (NDSS ’06), 2006.;;https://www.ndss-symposium.org/wp-content/uploads/2017/09/Protocol-Independent-Adaptive-Replay-of-Application-Dialog-Weidong-Cui.pdf;Byte-wise sequence alignment (find variable fields in messages) and clustering with FSM simplification;x;;;;Dialogs/scripts;HTTP, FTP, SMTP, NFS, TFTP;DNS, BitTorrent, QQ, NetBios;SMB, CIFS;;
29 | Ma et al.;2006;J. Ma, K. Levchenko, C. Kreibich, S. Savage, and G. Voelker, “Automatic protocol inference: unexpected means of identifying protocols,” UCSD Computer Science Technical Report CS2006-0850, 2006.;10.1145/1177080.1177123;http://www.academia.edu/download/38268/2q7zdptm5klmgg2h0g9.pdf;Please review;x;;;;App-identification;HTTP, FTP, SMTP, HTTPS (TCP-Protos);DNS, NetBIOS, SrvLoc (UDP-Protos);;;
30 | Boosting;2008;K. Gopalratnam, S. Basu, J. Dunagan, and H. J. Wang, “Automatically extracting fields from unknown network protocols,” in Proceedings of the 15th Symposium on Network and Distributed System Security (NDSS ’08), 2008.;;http://www.nicemice.net/helen/papers/sysml-Gopalratnam.pdf;Please review;x;;;;Field(s);;DNS;;;
31 | Dispatcher;2009;J. Caballero, P. Poosankam, C. Kreibich, and D. Song, “Dispatcher: enabling active botnet infiltration using automatic protocol reverse-engineering,” in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS ’09), pp. 621–634, ACM, Chicago, Ill, USA, November 2009.;10.1145/1653662.1653737;https://people.eecs.berkeley.edu/~dawnsong/papers/2009%20Dispatcher.pdf;Dynamic taint-analysis (successor of Polyglot using sent instead of received messages);;x;;;C&C malware;HTTP, FTP, ICQ;DNS;;;
32 | ASAP;2011;T. Krueger, N. Krämer, and K. Rieck, “Asap: automatic semantics-aware analysis of network payloads,” in Proceedings of the ECML/PKDD, 2011.;;https://oar.tib.eu/jspui/bitstream/123456789/2288/1/664831966.pdf;Please review;x;;;;Semantics;HTTP, FTP, IRC, TFTP;;;;
33 | Dispatcher2;2013;J. Caballero and D. Song, “Automatic protocol reverse-engineering: message format extraction and field semantics inference,” Computer Networks, vol. 57, no. 2, pp. 451–474, 2013.;10.1016/j.comnet.2012.08.003;http://www.academia.edu/download/47267446/j.comnet.2012.08.00320160715-7025-1uns1ji.pdf;Please review;;x;;;C&C malware;HTTP, FTP, ICQ;DNS;SMB;;
34 | ProVeX;2013;C. Rossow and C. J. Dietrich, “PROVEX: detecting botnets with encrypted command and control channels,” in Detection of Intrusions and Malware, and Vulnerability Assessment, Springer, 2013.;;https://chrisdietri.ch/files/provex-dimva2013.pdf;Identify Botnet traffic and try to infer the botnet type by using signatures;x;;;;Signatures;HTTP, SMTP, IMAP;DNS, VoIP, XMPP;;Malware Family Protocols;
35 | PIP;2004;M. Beddoe, “The protocol informatics project,” 2004, http://www.4tphi.net/∼awalters/PI/PI.html.;;http://www.4tphi.net/~awalters/PI/pi.pdf;"Keyword detection and Sequence alignment based on Needleman and Wunsch 1970 and Smith and Waterman 1981; this approach was applied and extended by many following papers";x;;;;Keywords/ fields;HTTP;;;;https://web.archive.org/web/20090416234849/http://4tphi.net/~awalters/PI/PI.html
36 | FieldHunter;2015;I. Bermudez, A. Tongaonkar, M. Iliofotou, M. Mellia, and M. M. Munafo, “Automatic protocol field inference for deeper protocol understanding,” in Proceedings of the 14th IFIP Networking Conference (Networking ’15), pp. 1–9, May 2015.;10.1109/IFIPNetworking.2015.7145307;http://dl.ifip.org/db/conf/networking/networking2015/1570062733.pdf;Please review;x;;;;Fields;MSNP;DNS;;SopCast, Ramnit;
37 | RS Cluster;2015;J.-Z. Luo, S.-Z. Yu, and J. Cai, “Capturing uncertainty information and categorical characteristics for network payload grouping in protocol reverse engineering,” Mathematical Problems in Engineering, vol. 2015, Article ID 962974, 9 pages, 2015.;;;Please review;x;;;;Grouped-messages;FTP, SMTP, POP3, HTTPS;DNS, XunLei, BitTorrent, BitSpirit, QQ, eMule;;MSSQL, Kugoo, PPTV;
38 | UPCSS;2015;R. Lin, O. Li, Q. Li, and Y. Liu, “Unknown network protocol classification method based on semi supervised learning,” in Proceedings of the IEEE International Conference on Computer and Communications (ICCC ’15), pp. 300–308, Chengdu, China, October 2015.;10.1109/CompComm.2015.7387586;;Please review;x;;;;Proto-classification;HTTP, FTP, SMTP, POP3, IMAP;DNS, SSL, SSH;SMB;;
39 | PowerShell;2017;D. R. Fletcher Jr., Identifying Vulnerable Network Protocols with PowerShell, SANS Institute Reading Room site, 2017.;;;Please review;x;;;;Dialogs/scripts;;ARP, OSPF, DHCP, STP;;CDP/DTP/VTP, HSRP, LLDP, LLMNR, mDNS, NBNS, VRRP;
40 | ProPrint;2017;Y. Wang, X. Yun, Y. Zhang, L. Chen, and G. Wu, “A nonparametric approach to the automated protocol fingerprint inference,” Journal of Network and Computer Applications, vol. 99, pp. 1–9, 2017.;10.1016/j.jnca.2017.10.009;;Please review;x;;;;Fingerprints;;;;;
41 | ProHacker;2017;Y. Wang, X. Yun, Y. Zhang, L. Chen, and T. Zang, “Rethinking robust and accurate application protocol identification,” Computer Networks, vol. 129, pp. 64–78, 2017.;10.1016/j.comnet.2017.09.006;;Please review;x;;;;Keywords;;;;;
42 | FFE/x86;2006;Lim, J., Reps, T., Liblit, B.: Extracting output formats from executables. In: 13th Working Conference on Reverse Engineering, 2006. WCRE ’06, pp. 167–178. IEEE, Benevento (2006). doi:10.1109/WCRE.2006.29;10.1109/WCRE.2006.29;http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.138.3603&rep=rep1&type=pdf;Please review;;x;;;;;;;;
43 | Replayer;2006;Cui, W., Paxson, V., Weaver, N., Katz, R.H.: Protocol-independent adaptive replay of application dialog. In: Proceedings of the 13th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2006). http://research.microsoft.com/apps/pubs/default.aspx?id=153197;;;Please review;;x;;;;;;;;
44 | Rosetta;2007;Caballero, J., Song, D.: Rosetta: Extracting Protocol Semantics Using Binary Analysis with Applications to Protocol Replay and NAT Rewriting. Technical Report CMU-CyLab-07-014, Carnegie Mellon University, Pittsburgh (2007);;;Please review;;x;;;;;;;;
45 | ConfigRE;2008;Wang, R., Wang, X., Zhang, K., Li, Z.: Towards automatic reverse engineering of software security configurations. In: Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS ’08, pp. 245–256. ACM, Limerick (2008). doi:10.1145/1455770.1455802;;;Please review;;x;;;;;;;;
46 | Fuzzgrind;2009;Campana, G.: Fuzzgrind: an automatic fuzzing tool. In: Hack. lu. Hack. lu, Luxembourg (2009);;;Please review;;x;;;;;;;;
47 | REWARDS;2010;Lin, Z., Zhang, X., Xu, D.: Automatic reverse engineering of data structures from binary execution. In: Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2010);;;Please review;;x;;;;;;;;
48 | MACE;2010;"Cho, C.Y., Babić, D., Shin, E.C.R., Song, D.: Inference and analysis of formal models of botnet command and control protocols. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS ’10, pp. 426–439. ACM, New York, NY (2010). doi:10.1145/1866307.1866355
49 | Cho, C.Y., Babić, D., Poosankam, P., Chen, K.Z., Wu, E.X., Song, D.: MACE: model-inference-assisted concolic exploration for protocol and vulnerability discovery. In: Proceedings of the 20th USENIX Conference on Security, SEC’11, p. 19. USENIX Association, Berkeley, CA (2011)";;;Please review;;x;;;;;;;;
50 | Howard;2011;Slowinska, A., Stancescu, T., Bos, H.: Howard: a dynamic excavator for reverse engineering data structures. In: Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2011);;;Please review;;x;;;;;;;;
51 | PRISMA;2012;Krueger, T., Gascon, H., Krämer, N., Rieck, K.: Learning stateful models for network honeypots. In: Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence, AISec ’12, pp. 37–48. ACM, New York, NY (2012).;10.1145/2381896.2381904;https://hugogascon.com/publications/2012a-aisec.pdf;Please review, follow-up paper/project to ASAP;x;;;;;;;;;https://github.com/tammok/PRISMA/
52 | ARTISTE;2012;Caballero, J., Grieco, G., Marron, M., Lin, Z., Urbina, D.: ARTISTE: Automatic Generation of Hybrid Data Structure Signatures from Binary Code Executions. Technical Report TR-IMDEA-SW-2012-001, IMDEA Software Institute, Madrid (2012);;;Please review;;x;;;;;;;;
53 | AFL;2014;Zalewski, M.: American Fuzzy Lop. http://lcamtuf.coredump.cx/afl/technical_details.txt;;;Please review;;x;;;;;;;;
54 | ARGOS;2015;Zeng, J., Lin, Z.: Towards automatic inference of kernel object semantics from binary code. In: 18th International Symposium, RAID 2015, vol. 9404, pp. 538–561. Springer, Kyoto (2015). doi:10.1007/978-3-319-26362-5;;;Please review;;x;;;;;;;;
55 | PULSAR;2015;H. Gascon, C. Wressnegger, F. Yamaguchi, D. Arp, and K. Rieck, “Pulsar: Stateful Black-Box Fuzzing of Proprietary Network Protocols,” in Security and Privacy in Communication Networks, vol. 164, B. Thuraisingham, X. Wang, and V. Yegneswaran, Eds. Cham: Springer International Publishing, 2015, pp. 330–347. ;10.1007/978-3-319-28865-9_18;http://user.cs.uni-goettingen.de/~krieck/docs/2015-securecomm.pdf;Reverse engineer network protocols with the aim to fuzz them with this knowledge;;;;;;;;;;https://github.com/hgascon/pulsar
56 | Yang et al.;2020;C. Yang, C. Fu, Y. Qian, Y. Hong, G. Feng, and L. Han, “Deep Learning-Based Reverse Method of Binary Protocol,” in Security and Privacy in Digital Economy, vol. 1268, S. Yu, P. Mueller, and J. Qian, Eds. Singapore: Springer Singapore, 2020, pp. 606–624. ;10.1007/978-981-15-9129-7_42;;Using deep-learning (LSTM-FCN) for reversing binary protocols;x;;x;;;;IPv4, TCP;;;
57 | Sun et al.;2020;F. Sun, S. Wang, C. Zhang, and H. Zhang, “Clustering of unknown protocol messages based on format comparison,” Computer Networks, vol. 179, p. 107296, Oct. 2020, doi: 10.1016/j.comnet.2020.107296. ;10.1016/j.comnet.2020.107296;;"""To measure format similarity of unknown protocol messages in a proper granularity, we propose relative measurements, Token Format Distance (TFD) and Message Format Distance (MFD), based on core rules of Augmented Backus-Naur Form (ABNF)."" The Silhouette Coefficient and Dunn Index are used in the clustering process, and the density-based clustering algorithm DBSCAN is used to cluster the messages";;;;;;;;;;
58 | Goo et al.;2019;Y.-H. Goo, K.-S. Shim, M.-S. Lee, and M.-S. Kim, “Protocol Specification Extraction Based on Contiguous Sequential Pattern Algorithm,” IEEE Access, vol. 7, pp. 36057–36074, 2019, doi: 10.1109/ACCESS.2019.2905353. ;10.1109/ACCESS.2019.2905353;https://ieeexplore.ieee.org/iel7/6287639/6514899/08667834.pdf;Apriori based: Finding „frequent contiguous common subsequences“ via new Contiguous Sequential Pattern (CSP) algorithm which is based on Generalized Sequential Pattern (GSP) and other Apriori algorithms. CSP is used three times hierarchically to extract different information/fields based on previous results.;x;;x;x;;HTTP;DNS;;;
59 | Shim et al.;2020;K. Shim, Y. Goo, M. Lee, and M. Kim, “Clustering method in protocol reverse engineering for industrial protocols,” International Journal of Network Management, Jun. 2020, doi: 10.1002/nem.2126. ;10.1002/nem.2126;https://nmlab.korea.ac.kr/publication/published.papers/2020/2020.06_Clustering_method_for_ICS-APRE-IJNM.pdf;Follow up on Goo et al. 2019;x;;x;;;FTP;Modbus/TCP, Ethernet/IP;;;
60 | IPART;2020;X. Wang, K. Lv, and B. Li, “IPART: an automatic protocol reverse engineering tool based on global voting expert for industrial protocols,” International Journal of Parallel, Emergent and Distributed Systems, vol. 35, no. 3, pp. 376–395, May 2020, doi: 10.1080/17445760.2019.1655740. ;10.1080/17445760.2019.1655740;;Using an extended voting expert algorithm to infer boundaries of fields, otherwise using three phases, which are tokenizing, classifying and clustering.;x;;x;;;;Modbus, IEC104, Ethernet/IP;;;
61 | Universal Radio Hacker;2019;"J. Pohl and A. Noack, “Universal radio hacker: A suite for analyzing and attacking stateful wireless protocols,” Baltimore, MD, Aug. 2018, [Online]. Available: https://www.usenix.org/conference/woot18/presentation/pohl. 
62 | J. Pohl and A. Noack, “Automatic wireless protocol reverse engineering,” Santa Clara, CA, Aug. 2019, [Online]. Available: https://www.usenix.org/conference/woot19/presentation/pohl. ";;https://www.usenix.org/system/files/conference/woot18/woot18-paper-pohl.pdf;Physical layer based analysis of proprietary wireless protocols considering wireless specific properties like Received Signal Strength Indicator (RSSI) and using statistical methods;x;;x;;;;;;proprietary wireless protocols of IoT devices;https://github.com/jopohl/urh
63 | Proword;2014;Z. Zhang, Z. Zhang, P. P. C. Lee, Y. Liu, and G. Xie, “ProWord: An unsupervised approach to protocol feature word extraction,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Apr. 2014, pp. 1393–1401, doi: 10.1109/INFOCOM.2014.6848073. ;10.1109/INFOCOM.2014.6848073;http://adslab.cse.cuhk.edu.hk/pubs/infocom14proword.pdf;Please review;;;;;;;;;;
64 | Luo et al.;2019;X. Luo, D. Chen, Y. Wang, and P. Xie, “A Type-Aware Approach to Message Clustering for Protocol Reverse Engineering,” Sensors, vol. 19, no. 3, p. 716, Feb. 2019, doi: 10.3390/s19030716. ;10.3390/s19030716;https://www.mdpi.com/1424-8220/19/3/716/pdf;From abstract: “[…] this study proposes a type-aware approach to message clustering guided by type information. The approach regards a message as a combination of n-grams, and it employs the Latent Dirichlet Allocation (LDA) model to characterize messages with types and n-grams via inferring the type distribution of each message.”;;;;;;;;;;
65 | Esoul and Walkinshaw;2017;O. Esoul and N. Walkinshaw, “Using Segment-Based Alignment to Extract Packet Structures from Network Traces,” in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), Prague, Czech Republic, Jul. 2017, pp. 398–409, doi: 10.1109/QRS.2017.49. ;10.1109/QRS.2017.49;https://leicester.figshare.com/articles/Using_Segment-Based_Alignment_to_Extract_Packet_Structures_from_Network_Traces/10236467/files/18473123.pdf;Please review;;;;;;;;;;
66 | Sun et al.;2019;F. Sun, S. Wang, C. Zhang, and H. Zhang, “Unsupervised field segmentation of unknown protocol messages,” Computer Communications, vol. 146, pp. 121–130, Oct. 2019, doi: 10.1016/j.comcom.2019.06.013. ;10.1016/j.comcom.2019.06.013;;Please review;;;;;;;;;;
67 | Whalen et al.;2010;S. Whalen, M. Bishop, and J. P. Crutchfield, “Hidden Markov Models for Automated Protocol Learning,” in Security and Privacy in Communication Networks, vol. 50, S. Jajodia and J. Zhou, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 415–428. ;10.1007/978-3-642-16161-2_24;http://nob.cs.ucdavis.edu/bishop/papers/2010-securecomm/markov.pdf;Please review;x;;x;;;;;;;
68 | Li et al.;2015;H. Li, B. Shuai, J. Wang, and C. Tang, “Protocol Reverse Engineering Using LDA and Association Analysis,” in 2015 11th International Conference on Computational Intelligence and Security (CIS), Shenzhen, China, Dec. 2015, pp. 312–316, doi: 10.1109/CIS.2015.83. ;10.1109/CIS.2015.83;;Please review;x;;x;;;;;;;
69 | PRE-Bin;2016;S. Tao, H. Yu, and Q. Li, “Bit‐oriented format extraction approach for automatic binary protocol reverse engineering,” IET Communications, vol. 10, no. 6, pp. 709–716, Apr. 2016, doi: 10.1049/iet-com.2015.0797. ;10.1049/iet-com.2015.0797;https://www.researchgate.net/profile/Si_Yu_Tao/publication/298803896_Bit-oriented_format_extraction_approach_for_automatic_binary_protocol_reverse_engineering/links/5cef30e64585153c3da53f0e/Bit-oriented-format-extraction-approach-for-automatic-binary-protocol-reverse-engineering.pdf;Please review;x;;x;;;;;;;
70 | Xiao et al.;2016;M.-M. Xiao, S.-L. Zhang, and Y.-P. Luo, “Automatic network protocol message format analysis,” IFS, vol. 31, no. 4, pp. 2271–2279, Sep. 2016, doi: 10.3233/JIFS-169067. ;10.3233/JIFS-169067;;Please review;x;;x;;;;;;;
71 | NEMESYS;2018;S. Kleber, H. Kopp, and F. Kargl, “{NEMESYS}: Network message syntax reverse engineering by analysis of the intrinsic structure of individual messages,” 2018. ;;https://www.usenix.org/system/files/conference/woot18/woot18-paper-kleber.pdf;Please review;x;;x;;;;;;;https://github.com/vs-uulm/nemesys
72 | PREUGI;2017;M.-M. Xiao and Y.-P. Luo, “Automatic protocol reverse engineering using grammatical inference,” IFS, vol. 32, no. 5, pp. 3585–3594, Apr. 2017, doi: 10.3233/JIFS-169294. ;10.3233/JIFS-169294;;Please review;x;;;x;;;;;;
73 | AutoFuzz;2010;S. Gorbunov and A. Rosenbloom, “Autofuzz: Automated network protocol fuzzing framework,” IJCSNS, vol. 10, no. 8, p. 239, 2010. ;;people.csail.mit.edu/sergeyg/publications/autofuzz.pdf;Please review;x;;x;x;;;;;;
74 | NEMETYL;2020;S. Kleber, R. W. van der Heijden, and F. Kargl, “Message Type Identification of Binary Network Protocols using Continuous Segment Similarity,” in IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Jul. 2020, pp. 2243–2252. doi: 10.1109/INFOCOM41043.2020.9155275. ;10.1109/INFOCOM41043.2020.9155275;https://arxiv.org/pdf/2002.03391;Please review;x;;x;;;;;;;
75 | NetPlier;2021;Ye, Yapeng, Zhuo Zhang, Fei Wang, Xiangyu Zhang, and Dongyan Xu. “NetPlier: Probabilistic Network Protocol Reverse Engineering from Message Traces.” In NDSS. 2021.;;https://www.ndss-symposium.org/wp-content/uploads/ndss2021_4A-5_24531_paper.pdf;Probabilistic method for network trace based protocol reverse engineering.;x;;;;;;;;;https://github.com/netplier-tool/NetPlier/
76 | 


--------------------------------------------------------------------------------
/tools.ods:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/techge/PRE-list/9c18f7c8f00978ec70ba285a5f13b461ede12e93/tools.ods


--------------------------------------------------------------------------------
/tools_based-on-Duchêne-et-al-2017.ods:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/techge/PRE-list/9c18f7c8f00978ec70ba285a5f13b461ede12e93/tools_based-on-Duchêne-et-al-2017.ods


--------------------------------------------------------------------------------
/tools_based-on-Sija-et-al-2018.ods:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/techge/PRE-list/9c18f7c8f00978ec70ba285a5f13b461ede12e93/tools_based-on-Sija-et-al-2018.ods


--------------------------------------------------------------------------------