├── README.md
├── packages.md
├── event_name_structure.md
├── ideas.md
└── cmv.md


/README.md:
--------------------------------------------------------------------------------
1 | # docs
2 | Documentation and Project Planning
3 | 
4 | ### [Possibly useful Go packages](packages.md)
5 | ### [Brain dump of ideas](ideas.md)
6 | 


--------------------------------------------------------------------------------
/packages.md:
--------------------------------------------------------------------------------
 1 | # Packages
 2 | 
 3 | ## Core
 4 | 
 5 | * Main event bus: https://github.com/asaskevich/EventBus
 6 | * Scripting and extending functionality (TBD): https://github.com/robertkrimen/otto
 7 | * Sized WaitGroups: https://github.com/remeh/sizedwaitgroup
 8 | 
 9 | ## Modules
10 | 
11 | * Screenshots, browser instrumentation, etc: https://github.com/chromedp/chromedp
12 | * Bruteforcing, takeovers, other DNS stuff: https://github.com/miekg/dns
13 | 


--------------------------------------------------------------------------------
/event_name_structure.md:
--------------------------------------------------------------------------------
 1 | # Proposal for event name structure
 2 | 
 3 | Event names are what Chimaera agents use to exchange data with each other. Since there will likely be a lot of different events in Chimaera, it's important that we agree on a structure that is clean and predictable, to keep it manageable and easy to work with.
 4 | 
 5 | ## Proposal
 6 | 
 7 |     <state>:<master-category>:<category>[:<sub-category1>[...:<sub-categoryN>]]
 8 | 
 9 | * `<state>` **(required)**: One of:
10 |  * `new`: The event data is new
11 |  * `changed`: The event data has changed
12 |  * `removed`: The event data has been removed
13 | * `<master-category>` **(required)**: The main category that the following categories belong to
14 | * `<category>`, `<sub-categoryX>`: Increasingly precise categories for the data type
15 | 
16 | ## Examples
17 | 
18 | * `new:host:ip:v4`
19 | * `new:host:ip:v6`
20 | * `new:host:name` (e.g. target.com, admin.target.com, vpn.corp.target.com)
21 | * `new:host:ip:port:tcp`
22 | * `new:host:ip:port:udp`
23 | * `new:host:ip:port:tcp:screenshot` (screenshot of a web page or RDP session)
24 | * `new:host:ip:geolocation`
25 | * `new:range:cidr`
26 | 
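
The proposed structure can be checked mechanically. Below is a minimal sketch in Go, assuming a hypothetical `EventName` type and `ParseEventName` helper (neither exists in Chimaera yet) and treating `<category>` as required, as the pattern above suggests.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// EventName is a hypothetical parsed form of the proposed naming scheme.
type EventName struct {
	State          string
	MasterCategory string
	Categories     []string // <category> followed by any sub-categories
}

// validStates lists the three states named in the proposal.
var validStates = map[string]bool{"new": true, "changed": true, "removed": true}

// ParseEventName splits a name on ":" and validates it against the proposal.
func ParseEventName(name string) (EventName, error) {
	parts := strings.Split(name, ":")
	if len(parts) < 3 {
		return EventName{}, errors.New("need at least <state>:<master-category>:<category>")
	}
	if !validStates[parts[0]] {
		return EventName{}, fmt.Errorf("unknown state %q", parts[0])
	}
	for _, p := range parts {
		if p == "" {
			return EventName{}, errors.New("event name contains an empty segment")
		}
	}
	return EventName{State: parts[0], MasterCategory: parts[1], Categories: parts[2:]}, nil
}

func main() {
	ev, err := ParseEventName("new:host:ip:port:tcp")
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.State, ev.MasterCategory, ev.Categories) // new host [ip port tcp]
}
```

Keeping the categories as a slice would let a subscriber match on a prefix (for example everything under `host:ip`), though whether the bus should support that is still an open question.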


--------------------------------------------------------------------------------
/ideas.md:
--------------------------------------------------------------------------------
 1 | # Ideas
 2 | 
 3 | ## Core
 4 | 
 5 | - **Modules:** The tool will be structured into small modules that will perform a specific task on data. The modules might produce more data, or they might enrich data, but they should be atomic and perform one thing well
 6 | - **Event based core:** Modules will consume and emit events on a bus. One module
 7 | might emit an event about a new subdomain being discovered, which can be consumed
 8 | by other modules which will perform arbitrary actions on this newly discovered
 9 | subdomain, e.g. scan for open ports. Information on open ports will also be emitted
10 | as an event on the bus in order to be picked up by other modules and so on.
11 | - **Event bus orchestration:** In order to avoid duplication of work between modules, we will need some kind of state component which modules can use to *reserve* incoming events on a first come, first served basis
12 | - **Event ancestry:** A hierarchy of events should somehow be maintained in order to be able to tell how a specific piece of data was found, e.g. `new:domain{admin.target.com}` --> `new:port{8080,admin.target.com}` --> `new:endpoint{/robots.txt,8080,admin.target.com}` --> . . .
13 | Having information on the parent/child relationship between events/data would make it possible to do some interesting stuff with data graphs, etc.
14 | 
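
The *reserve* idea from the event bus orchestration point could be sketched as follows. This is only an illustration in Go under stated assumptions: the `Reservations` type, its `Reserve` method, and the event key format are all hypothetical, and the state is kept in memory rather than in any shared store.

```go
package main

import (
	"fmt"
	"sync"
)

// Reservations is a hypothetical in-memory state component that lets
// modules claim an event on a first come, first served basis.
type Reservations struct {
	mu     sync.Mutex
	owners map[string]string // event key -> module that reserved it
}

// NewReservations returns an empty reservation table.
func NewReservations() *Reservations {
	return &Reservations{owners: make(map[string]string)}
}

// Reserve reports whether module is the first to claim eventKey.
// Later callers get false and should skip the event.
func (r *Reservations) Reserve(eventKey, module string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, taken := r.owners[eventKey]; taken {
		return false
	}
	r.owners[eventKey] = module
	return true
}

func main() {
	r := NewReservations()
	key := "new:host:name{admin.target.com}"
	fmt.Println(r.Reserve(key, "portscan-a")) // true
	fmt.Println(r.Reserve(key, "portscan-b")) // false
}
```

A mutex-guarded map keeps the check and the claim atomic, so two modules racing for the same event cannot both win.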
15 | ## Features
16 | 
17 | - **Subdomain discovery:** Subdomains could be discovered on domains through various methods, both passive and active
18 | - **Port scanning:** Discovered hosts could be port scanned for a configurable list of ports
19 | - **Host Discovery:** Discover live IP addresses within a specified IP range
20 | - **Website screenshots:** Discovered web servers could be screenshotted for easy analysis
21 | - **Remote Desktop screenshots:** Discovered RDP hosts could be screenshotted for easy analysis
22 | - **Grouping of similar sites:** Similar sites could be grouped together by analysing returned HTTP responses in a *fuzzy* manner to tolerate smaller differences
23 | - **Banner grabbing:** Banners from discovered ports could be captured and potentially fingerprinted
24 | - **WHOIS lookup:** Discovered hosts and domains could be looked up in WHOIS to gather information on network range and owner, etc.
25 | - **Host geolocation:** Discovered hosts could be geolocated through a web service or by looking up in an internal database
26 | - **Module aggressiveness rating:** Modules could have an *aggressiveness* rating associated with them in order for users to specify how aggressive/noisy or stealthy they want their assessment to be. Some modules perform passive things, while others perform port scanning and other direct interaction with target infrastructure which might not always be desirable
27 | - **Scoping of IP ranges and domains:** The tool has the potential to *run wild* which can be dangerous on professional engagements with a defined scope. It should be possible to limit the tool's reach to specific IP ranges and domains
28 | - **Config File Discovery:** Discovered hosts can be searched for interesting endpoints and files leading to easy wins.
29 | - **Platform Identification:** Discovered hosts should be checked against a database (or similar) to identify the software running on them.
30 | - **Known exploit scanning:** (TBD) Discover common vulnerabilities that may be present based on the host.
31 | - **SSL Checks:** Check for common SSL/TLS vulnerabilities such as Heartbleed, POODLE, etc.
32 | - **VHostScan:** Check for virtual hosts on discovered HTTP websites.
33 | - **Email enumeration:** Enumerate email addresses for a domain, and search for breached accounts and developers' Stack Overflow accounts
34 | 
35 | ## Challenges
36 | 
37 | ## Reference
38 | 


--------------------------------------------------------------------------------
/cmv.md:
--------------------------------------------------------------------------------
  1 | ### Chimaera Workspace Database Format Design Specifications
  2 | 
  3 | Written By - Chimaera Team
  4 | 
  5 | ---
  6 | Chimaera works on top of a graph database that links all sorts of events together in a meaningful way. This gives Chimaera the flexibility to link events related to a host together, and because it is a custom solution, it can be built for speed. Working from disk allows us to save the state of the program at any moment and resume it later.
  7 | 
  8 | The Chimaera workspace format uses JSON as its basic structural format. Each Chimaera workspace begins with a root node, which usually contains a hostname, an IP address, or an IP range. Each workspace can contain multiple ranges. For the sake of this document, we assume two root nodes:
  9 | 
 10 | ```
 11 | 1) NodeType = NodeTypeHostname, NodeValue = google.com
 12 | 2) NodeType = NodeTypeIP, NodeValue = 10.10.10.70
 13 | ```
 14 | 
 15 | Each root node has its own directory, which holds its child nodes. When a Chimaera workspace is decoded, each node corresponds to a folder, so a trailing slash (`/`) in the diagrams below marks a node that is a folder. Chimaera workspace files use the `.cmv` extension.
 16 | 
 17 | The workspace might look something like this for this example - 
 18 | 
 19 | ```
 20 | example.cmv/
 21 | 		   |- google.com/
 22 | 		   |- 10.10.10.70/
 23 | 		   |- database.json
 24 | ```
 25 | 
 26 | As shown, `google.com` and `10.10.10.70` are the two root nodes present. The **database.json** file acts as a schema for the whole workspace. It contains miscellaneous details about the structure of the workspace.
 27 | 
 28 | The database.json file can look like this for example - 
 29 | 
 30 | ```
 31 | {
 32 |     "nodes":{
 33 |         "google.com":{
 34 |             "Type":"Hostname"
 35 |         },
 36 |         "10.10.10.70":{
 37 |             "Type":"IP"
 38 |         }
 39 |     },
 40 | 	
 41 |     "about":{
 42 |         "version":"1.0",
 43 |         "timestamp":"1528881960"
 44 |     }
 45 | }
 46 | ```
 47 | 
 48 | - The **nodes** contain details on the root nodes currently present in the workspace. 
 49 | - The **about** section contains miscellaneous information about the Chimaera program, such as its version, the platform version, etc.
 50 | 
 51 | > Note - This structure is just a basic description of what we are attempting to accomplish. It can be extended further in many ways.
 52 | 
 53 | Each of these nodes can contain subnodes, and each of those subnodes can contain sub-sub-nodes which provide further details about them. Let's take the example of the Hostname block.
 54 | 
 55 | Reading the hostname node, we see it has three subnodes of type Subdomain, which are subdomains of the current domain. Each of those three subnodes stores the IP addresses that correspond to it inside its own details JSON file. In this way, we maintain the relationship between domains and the subdomains found for them. Additional data such as WHOIS info, subdomain takeovers, config files, ports, etc. is also stored in the sub-node's details JSON file.
 56 | 
 57 | ```
 58 | google.com/
 59 |     |- corp.google.com/
 60 |     |    |- corp.google.com.json (Details file)
 61 |     |- mail.google.com/
 62 |     |    |- mail.google.com.json
 63 |     |- game.google.com/
 64 |     |    |- game.google.com.json
 65 |     |- 10.19.16.75/
 66 |     |    |- 10.19.16.75.json
 67 |     |- google.com.json
 68 | ```
 69 | 
 70 | 
 71 | 
 72 | Each of these JSON files contains details about a sub-node, which can be a subdomain or something else depending on the context of the root node. `google.com.json` is a particularly important file, as it contains information on all sub-nodes found.
 73 | 
 74 | ```
 75 | > cat google.com.json
 76 | {
 77 |     "nodes":{
 78 |         "corp.google.com":{
 79 |             "Type":"Subdomain"
 80 |         },
 81 |         "mail.google.com":{
 82 |             "Type":"Subdomain"
 83 |         },
 84 |         "game.google.com":{
 85 |             "Type":"Subdomain"
 86 |         },
 87 |         "10.19.16.75":{
 88 |             "Type":"IP"
 89 |         }
 90 |     }
 91 | }
 92 | ```
 93 | 
 94 | Further changes can be made to this structure as needed. Next, we move to the sub-nodes found. Let's take a look at the JSON file inside one of them.
 95 | 
 96 | ```
 97 | > cat corp.google.com.json
 98 | {
 99 |     "nodes":{
100 |         "10.19.12.26":{
101 |             "Type":"IP"
102 |         },
103 |         "10.19.12.27":{
104 |             "Type":"IP"
105 |         },
106 |         "corp.google.com/.git":{
107 |             "Type":"Content"
108 |         }
109 |     }
110 | }
111 | ```
112 | 
113 | Here, the first two entries in the nodes are the IP addresses that **corp.google.com** resolves to.
114 | 
115 | > Citation needed - How do we maintain relationships between IP addresses and hostnames? I propose we store unique ones as sub-nodes, and for hostnames that have an IP resolving to them, we simply add a sub-sub-node inside the sub-node JSON file. :p
116 | 
117 | 
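
The `database.json` layout described in this file could be read in Go roughly as follows. This is a sketch under assumptions, not a settled design: the `Database`, `Node`, and `About` types and the `ParseDatabase` helper are hypothetical, and `nodes` is assumed to be a JSON object keyed by node value, as in the `corp.google.com.json` example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node holds the details recorded for one node (hypothetical type).
type Node struct {
	Type string `json:"Type"`
}

// About mirrors the "about" section of database.json.
type About struct {
	Version   string `json:"version"`
	Timestamp string `json:"timestamp"`
}

// Database is a hypothetical Go view of database.json.
type Database struct {
	Nodes map[string]Node `json:"nodes"` // keyed by node value, e.g. "google.com"
	About About           `json:"about"`
}

// sample reuses the values from the example in this document.
const sample = `{
    "nodes": {
        "google.com":  {"Type": "Hostname"},
        "10.10.10.70": {"Type": "IP"}
    },
    "about": {"version": "1.0", "timestamp": "1528881960"}
}`

// ParseDatabase decodes the raw contents of a database.json file.
func ParseDatabase(data []byte) (Database, error) {
	var db Database
	err := json.Unmarshal(data, &db)
	return db, err
}

func main() {
	db, err := ParseDatabase([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(db.Nodes["google.com"].Type, db.About.Version) // Hostname 1.0
}
```

Because the per-node files such as `google.com.json` share the same `nodes` shape, the same `Node` map could probably be reused when walking the workspace tree.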


--------------------------------------------------------------------------------