├── fig
    └── DiFF-RF.jpg
├── documentation
    ├── DiFF-RF_API.pdf
    └── DiFF-RF_API.html
├── version.py
├── setup.py
├── README.md
├── testDiFF_RF_Donuts.py
├── DiFF_RF.py
└── LICENSE


/fig/DiFF-RF.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pfmarteau/DiFF-RF/HEAD/fig/DiFF-RF.jpg


--------------------------------------------------------------------------------
/documentation/DiFF-RF_API.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pfmarteau/DiFF-RF/HEAD/documentation/DiFF-RF_API.pdf


--------------------------------------------------------------------------------
/version.py:
--------------------------------------------------------------------------------
 1 | """DiFF Random Forest version 0.1"""
 2 | 
 3 | version_tag = (0, 1, 0)
 4 | __version__ = '.'.join(map(str, version_tag[:3]))
 5 | 
 6 | if len(version_tag) > 3:
 7 |     __version__ = '%s-%s' % (__version__, version_tag[3])
 8 | 
 9 | 
10 | 


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | import sys
 2 | import os
 3 | from setuptools import setup
 4 | prjdir = os.path.dirname(__file__)
 5 | 
 6 | def read(filename):
 7 |     return open(os.path.join(prjdir, filename)).read()
 8 | 
 9 | extra_link_args = []
10 | libraries = []
11 | library_dirs = []
12 | include_dirs = []
13 | exec(open('version.py').read())
14 | setup(
15 |     name='DiFF_RF',
16 |     version=__version__,
17 |     author='Pierre-F. Marteau, from Matias Carrasco code @ https://github.com/xhan0909/isolation_forest',
18 |     author_email='pierre-francois.marteau@irisa.fr',
19 |     scripts=[],
20 |     py_modules=['DiFF_RF','version'],
21 |     packages=[],
22 |     license='GPLv3',
23 |     description='Distance based ensemble of random partitioning trees for anomaly detection',
24 |     long_description=read('README.md'),
25 |     #url='https://github.com/mgckind/iso_forest',
26 | )
27 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # DiFF-RF: forest of random partitioning trees for point-wise and collective anomaly detection
 2 | ![](/fig/DiFF-RF.jpg)
 3 | This code is a simple implementation of the DiFF-RF algorithm described in this [draft paper](https://arxiv.org/abs/2006.16801), a semi-supervised approach for detecting point-wise or collective anomalies (outliers) given a dataset of 'normal' instances. It combines a distance to a leaf centroid with a frequency-of-visit mechanism at leaf level to build point-wise and collective anomaly scores. It addresses a drawback identified in the Isolation Forest (IF) algorithm and is reported in the paper to generally outperform IF and other state-of-the-art anomaly detection methods on a large and diverse set of application datasets.
 4 | 
 5 | 
 6 | This code is derived from the one provided by Xiao Han as an implementation of the Isolation Forest algorithm, available at [github.com/xhan0909](https://github.com/xhan0909)
 7 | 
 8 | 
 9 | ## Requirements
10 | It supports Python 3.6+ (the code relies on `random.choices` and f-strings).
11 | 
12 | No extra requirement is needed apart from numpy.
13 | 
14 | ## Installation
15 |     $ sudo python3 setup.py install
16 | 
17 |     (or $ python3 setup.py install --user)
18 | 
19 | 
20 | ## Usage
21 | 
22 | A running example using the 'donuts' data is given in the file testDiFF_RF_Donuts.py
23 | 
24 | The API documentation (html or pdf) is available in documentation/DiFF-RF_API.(html/pdf)
25 | (generated using $ pdoc3 DiFF_RF.py --html --force)
26 | 
27 | Typical usage (close to the sklearn API) is as follows:
28 | 
29 |     # Creation of the data structure
30 |     diff_rf = DiFF_TreeEnsemble(sample_size=sample_size, n_trees=ntrees)
31 |     # Fit DiFF-RF on (normal) data: X_train is a 2D numpy array of shape (n_obs, n_features); n_jobs is the number of worker processes
32 |     diff_rf.fit(X_train, n_jobs=8)
33 |     # Get the anomaly scores for test data X_test
34 |     point_wise_scores, visiting_frequency_scores, collective_scores = diff_rf.anomaly_score(X_test,alpha=alpha0)
35 | 
36 | ### Launching the test code (requires matplotlib and sklearn)
37 |     $ python3 -i testDiFF_RF_Donuts.py
38 | 
39 | ### Creating an instance of the donut dataset (normal data) and the anomaly clusters (red and green clusters)
40 |     >>> createDonutData(contamin=0)
41 | 
42 | ### Creating and evaluating DiFF-RF
43 |     >>> computeDiff_RF(ntrees=512, sample_size_ratio=32)
44 | 
45 | ### Some of the outputs are saved on disk
46 | Data is saved in the *PKL* subdirectory
47 | Figures are saved in the *FIG* subdirectory
48 | 
49 |     
50 | Please cite the above-mentioned paper if you use this code.
51 | 
52 |     @article{marteau:hal-02882548,
53 |         TITLE = {{Random Partitioning Forest for Point-Wise and Collective Anomaly Detection - Application to Network Intrusion Detection}},
54 |         AUTHOR = {Marteau, Pierre-Fran{\c c}ois},
55 |         URL = {https://hal.archives-ouvertes.fr/hal-02882548},
56 |         JOURNAL = {{IEEE Transactions on Information Forensics and Security}},
57 |         PUBLISHER = {{Institute of Electrical and Electronics Engineers}},
58 |         PAGES = {1-16},
59 |         YEAR = {2021},
60 |         MONTH = Jan,
61 |         DOI = {10.1109/TIFS.2021.3050605},
62 |         KEYWORDS = {Machine Learning ; Semisupervised Learning ; NIDS ; Random Forest ; Anomaly Detection ; Random Partitioning Trees ; Semi- supervised Learning ; Intrusion Detection},
63 |         PDF = {https://hal.archives-ouvertes.fr/hal-02882548v2/file/DiFF-RF-v3.pdf},
64 |         HAL_ID = {hal-02882548},
65 |         HAL_VERSION = {v2},
66 |         }
67 | 
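
The README describes two leaf-level ingredients: a distance to a centroid and a frequency of visit. The plain-Python sketch below mirrors, for a single point and a single leaf, the formulas found in `similarityScore` and `walk_tree` of DiFF_RF.py. The helper names (`similarity_to_centroid`, `visit_frequency_ratio`) and all numeric values are illustrative assumptions and are not part of the package API.

```python
def similarity_to_centroid(x, centroid, std, alpha):
    """2 ** (-alpha * mean squared standardized distance to the centroid)."""
    d = len(x)
    sq = sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, centroid, std))
    return 2.0 ** (-alpha * sq / d)


def visit_frequency_ratio(leaf_train_size, sample_size, leaf_test_count, n_test):
    """Smoothed training visit frequency of a leaf divided by its test visit frequency."""
    return ((leaf_train_size + 1) / sample_size) / ((1 + leaf_test_count) / n_test)


# Hypothetical leaf: centroid at the origin, unit standard deviation per feature.
centroid, std = [0.0, 0.0], [1.0, 1.0]

z_near = similarity_to_centroid([0.0, 0.0], centroid, std, alpha=1.0)  # point on the centroid
z_far = similarity_to_centroid([2.0, 2.0], centroid, std, alpha=1.0)   # point far from it

# Hypothetical counts: 7 training points in the leaf out of a sample of 32,
# and 3 test points reaching the leaf out of 16 test points.
f = visit_frequency_ratio(leaf_train_size=7, sample_size=32,
                          leaf_test_count=3, n_test=16)

print(z_near, z_far, f)
```

In DiFF_RF.py these per-tree quantities (the similarity, the frequency ratio and their product) are averaged over all trees to produce the three score vectors returned by `anomaly_score`.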


--------------------------------------------------------------------------------
/testDiFF_RF_Donuts.py:
--------------------------------------------------------------------------------
  1 | __author__ = 'P-F.Marteau, June 2020'
  2 | 
  3 | import matplotlib
  4 | import matplotlib.pyplot as plt
  5 | import numpy as np
  6 | import time
  7 | import pickle
  8 | import pathlib
  9 |     
 10 | from sklearn.metrics import roc_curve, auc
 11 | from sklearn.ensemble import IsolationForest
 12 | 
 13 | from DiFF_RF import DiFF_TreeEnsemble
 14 | 
 15 | plt.gcf().subplots_adjust(bottom=0.15)
 16 | matplotlib.rcParams.update({'font.size': 22})
 17 | 
 18 |     
 19 | def gen_tore_vecs(dims, number, rmin, rmax):
 20 |     vecs = np.random.uniform(low=-1, size=(number,dims))
 21 |     radius = rmin + np.random.sample(number) * (rmax-rmin)
 22 |     mags = np.sqrt((vecs*vecs).sum(axis=-1))
 23 |     # How to distribute the magnitude to the vectors
 24 |     for i in range(number):
 25 |         vecs[i,:] = vecs[i, :] / mags[i] *radius[i]
 26 |     return vecs[:,0], vecs[:,1]
 27 | 
 28 | 
 29 | def createDonutData(contamin=0):
 30 |     print('build donnuts data')
 31 |     Nobjs = 1000
 32 |     xn, yn = gen_tore_vecs(2, Nobjs, 1.5, 4)
 33 |     Xn = np.array([xn, yn]).T
 34 | 
 35 |     Nobjsb = 1000
 36 |     mean = [0, 0]
 37 |     cov = [[.5, 0], [0, .5]]  # diagonal covariance
 38 |     xb, yb = np.random.multivariate_normal(mean, cov, Nobjsb).T
 39 |     Xb = np.array([xb, yb]).T
 40 | 
 41 |     Nobjst = 1000
 42 |     xnt, ynt = gen_tore_vecs(2, Nobjst, 1.5, 4)
 43 |     Xnt = np.array([xnt, ynt]).T
 44 |     
 45 | 
 46 |     # create cluster of anomalies
 47 |     mean = [3., 3.]
 48 |     cov = [[.25, 0], [0, .25]]  # diagonal covariance
 49 |     Nobjsa = 1000
 50 |     xa, ya = np.random.multivariate_normal(mean, cov, Nobjsa).T
 51 |     Xa = np.array([xa, ya]).T
 52 |     
 53 |     Xab=np.concatenate([Xa,Xb])
 54 | 
 55 |     pathlib.Path('./PKL').mkdir(parents=True, exist_ok=True) 
 56 |     f = open('PKL/donnutsDataProblem.pkl', 'wb')
 57 |     pickle.dump([Xn, Xnt, Xa, Xb, Xab], f)
 58 |     f.close()
 59 | 
 60 | 
 61 | def computeDiff_RF(ntrees=1024, sample_size_ratio=.33, alpha0=.1):
 62 |     # load data
 63 |     f = open('PKL/donnutsDataProblem.pkl', 'rb')
 64 |     [Xn, Xnt, Xa, Xb, Xab] = pickle.load(f)
 65 |     f.close()
 66 |     
 67 |     if sample_size_ratio >1:
 68 |         sample_size=int(sample_size_ratio)  # values > 1 are taken as an absolute sample size
 69 |     else:
 70 |         sample_size=int(sample_size_ratio*len(Xn))
 71 | 
 72 |     xn=Xn[:,0]
 73 |     yn=Xn[:,1]
 74 |     xa=Xa[:,0]
 75 |     ya=Xa[:,1]
 76 | 
 77 |     xb=Xb[:,0]
 78 |     yb=Xb[:,1]
 79 | 
 80 |     pathlib.Path('./FIG').mkdir(parents=True, exist_ok=True) 
 81 |     # plotting the donnuts data
 82 |     plt.figure(1)
 83 |     plt.plot(xn, yn, 'bo', markersize=10)
 84 |     plt.savefig('FIG/clustersDonnuts0.pdf')
 85 | 
 86 |     nn=len(Xa)
 87 |     plt.figure(2)
 88 |     plt.plot(xn, yn, 'bo', xa[0:nn], ya[0:nn], 'rs')
 89 |     plt.savefig('FIG/clustersDonnuts1.pdf')
 90 | 
 91 |     plt.figure(3)
 92 |     plt.plot(xn, yn, 'bo', xa[0:nn], ya[0:nn], 'rs', xb[0:nn], yb[0:nn], 'gd')
 93 |     plt.xticks(size=14)
 94 |     plt.yticks(size=14)
 95 |     plt.savefig('FIG/clustersDonnuts2.pdf')
 96 | 
 97 |     # Build the forest on normal data only (no anomaly labels are used)
 98 |     print('building the Diff_RF ...')
 99 | 
100 |     diff_rf = DiFF_TreeEnsemble(sample_size=sample_size, n_trees=ntrees)
101 |     fit_start = time.time()
102 |     diff_rf.fit(Xn, n_jobs=8)
103 |     fit_stop = time.time()
104 |     fit_time = fit_stop - fit_start
105 |     print(f"fit time {fit_time:3.2f}s")
106 |     n_nodes = sum([t.n_nodes for t in diff_rf.trees])
107 |     print(f"{n_nodes} total nodes in {ntrees} trees")
108 |     
109 |     XT=np.concatenate([Xnt,Xab])
110 |     
111 |     sc_di,sc_ff,sc_diff_rf = diff_rf.anomaly_score(XT,alpha=alpha0)
112 |     sc_diff_rf=np.array(sc_diff_rf)
113 |     sc_ff=np.array(sc_ff)
114 |     sc_di=np.array(sc_di)
115 |     sc_ff=(sc_ff-sc_ff.min())/(sc_ff.max()-sc_ff.min())
116 |     sc_di=(sc_di-sc_di.min())/(sc_di.max()-sc_di.min())
117 |     sc_diff_rf=(sc_diff_rf-sc_diff_rf.min())/(sc_diff_rf.max()-sc_diff_rf.min())
118 | 
119 |     plt.figure(1000)
120 |     xn=XT[:,0]
121 |     yn=XT[:,1]
122 |     plt.scatter(xn, yn, marker='o', c=sc_ff, cmap='viridis')
123 |     plt.colorbar()
124 |     plt.xticks(size=14)
125 |     plt.yticks(size=14)
126 |     plt.title('DiFF_RF (visiting frequency score) Heat Map')
127 |     plt.savefig('FIG/HeatMap_DiFF_RF_freqScore.pdf')
128 |     
129 |     plt.figure(1001)
130 |     xn=XT[:,0]
131 |     yn=XT[:,1]
132 |     plt.scatter(xn, yn, marker='o', c=sc_diff_rf, cmap='viridis')
133 |     plt.colorbar()
134 |     plt.xticks(size=14)
135 |     plt.yticks(size=14)
136 |     plt.title('DiFF_RF (collective anomaly score) Heat Map')
137 |     plt.savefig('FIG/HeatMap_DiFF_RF_collectiveScore.pdf')
138 |     
139 |     plt.figure(1002)
140 |     xn=XT[:,0]
141 |     yn=XT[:,1]
142 |     plt.scatter(xn, yn, marker='o', c=(sc_di), cmap='viridis')
143 |     plt.colorbar()
144 |     plt.xticks(size=14)
145 |     plt.yticks(size=14)
146 |     plt.title('DiFF_RF (point-wise anomaly score) Heat Map')
147 |     plt.savefig('FIG/HeatMap_DiFF_RF_pointWiseScore.pdf')
148 | 
149 |     cif = IsolationForest(n_estimators=ntrees, max_samples=sample_size, bootstrap=False, n_jobs=12)
150 |     cif.fit(Xn)
151 |     sc_if = -cif.decision_function(XT)
152 |     sc_if=(sc_if-sc_if.min())/(sc_if.max()-sc_if.min())
153 |     plt.figure(1003)
154 |     xn=XT[:,0]
155 |     yn=XT[:,1]
156 |     plt.scatter(xn, yn, marker='o', c=sc_if, cmap='viridis')
157 |     plt.colorbar()
158 |     plt.xticks(size=14)
159 |     plt.yticks(size=14)
160 |     plt.title('Isolation Forest Heat Map')
161 |     plt.savefig('FIG/HeatMap_IF.pdf')
162 |     plt.show()
163 |     
164 |     y_true = np.array([-1] * len(Xnt) + [1] * len(Xab))
165 |     fpr_IF, tpr_IF, thresholds = roc_curve(y_true, sc_if)
166 |     aucIF=auc(fpr_IF, tpr_IF)
167 |     fpr_D, tpr_D, thresholds = roc_curve(y_true, sc_di)
168 |     aucD=auc(fpr_D, tpr_D)
169 |     fpr_F, tpr_F, thresholds = roc_curve(y_true, sc_ff)
170 |     aucF=auc(fpr_F, tpr_F)
171 |     fpr_DF, tpr_DF, thresholds = roc_curve(y_true, sc_diff_rf)
172 |     aucDF=auc(fpr_DF, tpr_DF)
173 |     print("Isolation Forest AUC=", aucIF)
174 |     print("DiFF_RF (point-wise anomaly score) AUC=", aucD)
175 |     print("DiFF_RF (frequency of visit scoring only) AUC=", aucF)
176 |     print("DiFF_RF (collective anomaly score) AUC=", aucDF)
177 | 
178 | if __name__ == '__main__': 
179 |     # create donnuts data
180 |     createDonutData(contamin=0)
181 | 
182 |     # build and test IF and DiFF-RF
183 |     computeDiff_RF(ntrees=256, sample_size_ratio=.25, alpha0=1)
184 | 
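
The test script rescales each score vector to [0, 1] and then compares the detectors by ROC AUC, with label -1 for normal test points and 1 for anomalies. The sketch below illustrates both steps in plain Python on made-up scores. It uses the pairwise (rank-based) formulation of AUC, which gives the same value as the area under the ROC curve but is not how scikit-learn computes it. The names `min_max` and `rank_auc` and the data are hypothetical.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; assumes the scores are not all equal."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def rank_auc(y_true, scores):
    """Probability that an anomaly (label 1) gets a higher score than a
    normal point (label -1), counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: three normal points followed by two anomalies.
y_true = [-1, -1, -1, 1, 1]
raw = [0.2, 1.0, 0.6, 1.8, 0.8]

scaled = min_max(raw)
auc_raw = rank_auc(y_true, raw)
auc_scaled = rank_auc(y_true, scaled)
print(auc_raw, auc_scaled)
```

Because min-max rescaling preserves the ordering of the scores, it leaves the AUC unchanged; in the script it mainly serves to put the heat maps on a common color scale.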


--------------------------------------------------------------------------------
/DiFF_RF.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | Created on Tue Mar 24 12:19:32 2020
  5 | 
  6 | @author: Pierre-François Marteau (https://people.irisa.fr/Pierre-Francois.Marteau/)
  7 | """
  8 | 
  9 | # Inspired from an implementation of the isolation forest algorithm provided at
 10 | # https://github.com/xhan0909/isolation_forest
 11 | 
 12 | import numpy as np
 13 | import time, sys
 14 | from functools import partial
 15 | from multiprocessing import Pool
 16 | 
 17 | import random as rn
 18 | 
 19 | def getSplit(X):
 20 |     """
 21 |     Randomly selects a split value from set of scalar data 'X'.
 22 |     Returns the split value.
 23 |     
 24 |     Parameters
 25 |     ----------
 26 |     X : array 
 27 |         Array of scalar values
 28 |     Returns
 29 |     -------
 30 |     float
 31 |         split value
 32 |     """
 33 |     xmin = X.min()
 34 |     xmax = X.max()
 35 |     return np.random.uniform(xmin, xmax)
 36 | 
 37 | def similarityScore(S, node, alpha):
 38 |     """
 39 |     Given a set of instances S falling into node and a value alpha >= 0,
 40 |     returns for every element x in S the weighted similarity score between x
 41 |     and the centroid M of the training instances of the node (node.M)
 42 |     
 43 |     Parameters
 44 |     ----------
 45 |     S : array  of instances
 46 |         Array  of instances that fall into a node
 47 |     node: a DiFF tree node
 48 |         S is the set of instances "falling" into the node
 49 |     alpha: float
 50 |         alpha is the distance scaling hyper-parameter
 51 |     Returns
 52 |     -------
 53 |     array
 54 |         the array of similarity values between the instances in S and the mean of training instances falling in node
 55 | 
 56 |     """
 57 |     d = np.shape(S)[1]  # number of features
 58 |     if len(S) > 0:
 59 |         # U = 2**(-alpha * mean squared standardized distance to node.M)
 60 |         U = (S-node.M)/node.Mstd # normalize using the standard deviation vector to the mean
 61 |         U = (2)**(-alpha*(np.sum(U*U/d, axis=1)))
 62 |     else:
 63 |         U = 0
 64 | 
 65 |     return U
 66 | 
 67 | 
 68 | def EE(hist):
 69 |     """
 70 |     given a list of positive values as a histogram drawn from any information source,
 71 |     returns the empirical entropy of its discrete probability function.
 72 |     
 73 |     Parameters
 74 |     ----------
 75 |     hist: array 
 76 |         histogram
 77 |     Returns
 78 |     -------
 79 |     float
 80 |         empirical entropy estimated from the histogram
 81 | 
 82 |     """
 83 |     h = np.asarray(hist, dtype=np.float64)
 84 |     if h.sum() <= 0 or (h < 0).any():
 85 |         return 0
 86 |     h = h/h.sum()
 87 |     return -(h*np.ma.log2(h)).sum()
 88 | 
 89 | 
 90 | def weightFeature(s, nbins):
 91 |     '''
 92 |     Given a list of values corresponding to a feature dimension, returns a weight (in [0,1]) that is 
 93 |     one minus the normalized empirical entropy, a way to characterize the importance of the feature dimension. 
 94 |     
 95 |     Parameters
 96 |     ----------
 97 |     s: array 
 98 |         list of scalar values corresponding to a feature dimension
 99 |     nbins: int
100 |         the number of bins used to discretize the feature dimension using a histogram.
101 |     Returns
102 |     -------
103 |     float
104 |         the importance weight for feature s.
105 |     '''
106 |     wmin=.02
107 |     mins=s.min()
108 |     maxs=s.max()
109 |     if not np.isfinite(mins) or not np.isfinite(maxs) or np.abs(mins- maxs)<1e-300:
110 |         return 1e-4
111 | 
112 |     hist, bin_edges = np.histogram(s, bins=nbins)
113 |     ent = EE(hist)
114 |     ent = ent/np.log2(nbins)
115 |     if np.isfinite(ent):
116 |          return max(1-ent, wmin)
117 |     else:
118 |          return wmin
119 | 
120 | 
121 | def walk_tree(forest, node, treeIdx, obsIdx, X, featureDistrib, depth=0, alpha=1e-2):
122 |     '''
123 |     Recursive function that walks a tree from an already fitted forest and stores the per-tree
124 |     leaf scores of the new observations in forest.LD, forest.LF and forest.LDF.
125 |     
126 |     Parameters
127 |     ----------
128 |     forest : DiFF_TreeEnsemble 
129 |         A fitted forest of DiFF trees
130 |     node: DiFF Tree node
131 |         the current node
132 |     treeIdx: int
133 |         index of the tree that is being walked.
134 |     obsIdx: array
135 |         1D boolean array of length n_obs. True/False if the obs has reached / has not reached the node.
136 |     X: nD array. 
137 |         array of observations/instances.
138 |     featureDistrib, depth, alpha: 1D array, int, float
139 |         feature weights (not used during the walk), current depth, and distance scaling hyper-parameter.
140 |     Returns
141 |     -------
142 |     None
143 |     '''
144 | 
145 |     if isinstance(node, LeafNode):
146 |         Xnode = X[obsIdx]
147 |         f = ((node.size+1)/forest.sample_size) / ((1+len(Xnode))/forest.XtestSize)
148 |         if alpha == 0:
149 |             forest.LD[obsIdx, treeIdx] = 0
150 |             forest.LF[obsIdx, treeIdx] = -f
151 |             forest.LDF[obsIdx, treeIdx] = f  # 2**0 == 1 when alpha == 0, so z*f reduces to f
152 |         else:
153 |             z = similarityScore(Xnode, node, alpha)
154 |             forest.LD[obsIdx, treeIdx] = z
155 |             forest.LF[obsIdx, treeIdx] = -f
156 |             forest.LDF[obsIdx, treeIdx] = z*f
157 | 
158 |     else:
159 | 
160 |         idx = (X[:, node.splitAtt] <= node.splitValue) * obsIdx
161 |         walk_tree(forest, node.left, treeIdx, idx, X, featureDistrib, depth + 1, alpha=alpha)
162 | 
163 |         idx = (X[:, node.splitAtt] > node.splitValue) * obsIdx
164 |         walk_tree(forest, node.right, treeIdx, idx, X, featureDistrib, depth + 1, alpha=alpha)
165 | 
166 | 
167 | def create_tree(X, featureDistrib, sample_size, max_height):
168 |     '''
169 |     Creates a DiFF tree using a sample of size sample_size of the original data.
170 |         
171 |     Parameters
172 |     ----------
173 |     X: nD array. 
174 |         nD array with the observations. Dimensions should be (n_obs, n_features).
175 |     sample_size: int
176 |         Size of the sample from which a DiFF tree is built.
177 |     max_height: int
178 |         Maximum height of the tree.
179 |     Returns
180 |     -------
181 |     the root node of the fitted DiFF tree
182 |     '''
183 |     rows = np.random.choice(len(X), sample_size, replace=False)
184 |     featureDistrib = np.array(featureDistrib)
185 |     return DiFF_Tree(max_height).fit(X[rows, :], featureDistrib)
186 | 
187 | 
188 | class DiFF_TreeEnsemble:
189 |     '''
190 |     DiFF Forest.
191 |     Even though all the methods are thought to be public the main functionality of the class is given by:
192 |     - __init__
193 |     - fit
194 |     - anomaly_score / predict
195 |     '''
196 |     def __init__(self, sample_size: int, n_trees: int = 10):
197 |         '''
198 |         Creates the DiFF-RF object.
199 |         
200 |         Parameters
201 |         ----------
202 |         sample_size: int. 
203 |             size of the sample randomly drawn from the train instances to build each DiFF tree.  
204 |         n_trees: int
205 |             The number of trees in the forest
206 |         Returns
207 |         -------
208 |             None
209 |         '''
210 | 
211 |         self.sample_size = sample_size
212 |         self.n_trees = n_trees
213 |         self.alpha=1.0
214 |         np.random.seed(int(time.time()))
215 |         rn.seed(int(time.time()))
216 | 
217 | 
218 |     def fit(self, X: (np.ndarray), n_jobs: int = 1):
219 |         """
220 |         Fits the algorithm into a model.
221 |         Given a 2D matrix of observations, create an ensemble of DiFF_Tree
222 |         objects and store their root nodes in a list: self.trees.  X is expected
223 |         to be a numpy ndarray (DataFrames are not converted).
224 |         Uses parallel computing.
225 |         
226 |         Parameters
227 |         ----------
228 |         X: nD array. 
229 |             nD array with the train instances. Dimensions should be (n_obs, n_features).  
230 |         n_jobs: int
231 |             number of parallel jobs that will be launched
232 |         Returns
233 |         -------
234 |             the object itself.
235 |         """
236 |         self.X = X
237 |         self.path_normFactor = np.sqrt(len(X))
238 | 
239 |         self.sample_size = min(self.sample_size, len(X))
240 | 
241 |         limit_height = 1.0*np.ceil(np.log2(self.sample_size))
242 | 
243 |         featureDistrib = []
244 |         nbins = int(len(X)/8)+2
245 |         for i in range(np.shape(X)[1]):
246 |             featureDistrib.append(weightFeature(X[:, i], nbins))
247 |         featureDistrib = np.array(featureDistrib)
248 |         featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
249 |         self.featureDistrib = featureDistrib
250 | 
251 |         create_tree_partial = partial(create_tree,
252 |                                       featureDistrib=self.featureDistrib,
253 |                                       sample_size=self.sample_size,
254 |                                       max_height=limit_height)
255 | 
256 |         with Pool(n_jobs) as p:
257 |             self.trees = p.map(create_tree_partial,
258 |                                [X for _ in range(self.n_trees)]
259 |                                )
260 |         return self
261 | 
262 | 
263 |     def walk(self, X: np.ndarray) -> None:
264 |         """
265 |         Given a nD matrix of observations, X, walk every tree in self.trees
266 |         and compute, for each instance x_i and each tree, the distance,
267 |         frequency and collective scores.  The results are stored in
268 |         self.LD, self.LF and self.LDF, arrays of shape (len(X), n_trees);
269 |         nothing is returned.
270 |         
271 |         Parameters
272 |         ----------
273 |         X: nD array. 
274 |             nD array with the instances to be tested. Dimensions should be (n_obs, n_features).   
275 |         Returns
276 |         -------
277 |             None
278 |         """
279 | 
280 |         self.L = np.zeros((len(X), self.n_trees))
281 |         self.LD = np.zeros((len(X), self.n_trees))
282 |         self.LF = np.zeros((len(X), self.n_trees))
283 |         self.LDF = np.zeros((len(X), self.n_trees))
284 | 
285 |         for treeIdx, itree in enumerate(self.trees):
286 |             obsIdx = np.ones(len(X)).astype(bool)
287 |             walk_tree(self, itree, treeIdx, obsIdx, X, self.featureDistrib, alpha=self.alpha)
288 | 
289 | 
290 |     def anomaly_score(self, X: np.ndarray, alpha=1) -> tuple:
291 |         """
292 |         Given a nD matrix of observations, X, compute the anomaly scores
293 |         for instances in X, returning 3 1D arrays of anomaly scores
294 |         
295 |         Parameters
296 |         ----------
297 |         X: nD array. 
298 |             nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
299 |         alpha: float
300 |             scaling distance hyper-parameter.
301 |         Returns
302 |         -------
303 |         scD, scF, scDF: 1d arrays
304 |             respectively the distance scores (point-wise anomaly score), the frequency of visit scores and the collective anomaly scores
305 |         """
306 |         self.XtestSize = len(X)
307 |         self.alpha = alpha
308 | 
309 |         # Evaluate the scores for each of the observations.
310 |         self.walk(X)
311 | 
312 |         # Average the per-tree leaf scores (self.LD, self.LF, self.LDF) over the trees
313 |         scD = -self.LD.mean(1)
314 |         scF = self.LF.mean(1)
315 |         scDF = -self.LDF.mean(1)
316 | 
317 |         return scD, scF, scDF
318 |     
319 | 
320 |     def predict_from_anomaly_scores(self, scores: np.ndarray, threshold: float) -> np.ndarray:
321 |         """
322 |         Given an array of scores and a score threshold, return an array of
323 |         the predictions: 1 for any score >= the threshold and 0 otherwise.
324 |         
325 |         Parameters
326 |         ----------
327 |         scores: 1D array. 
328 |             1D array of scores. Dimensions should be (n_obs,).   
329 |         threshold: float
330 |             Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies.
331 |         Returns
332 |         -------
333 |         1D array
334 |             The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
335 | 
336 |         :param scores: 1D array. Scores produced by the random forest.
337 |         :param threshold: Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies.
338 |         :return: Return predictions
339 |         """
340 |         out = scores >= threshold
341 |         return out*1
342 |     
343 | 
344 |     def predict(self, X: np.ndarray, threshold: float, score_type: int=2) -> np.ndarray:
345 |         """
346 |         A shorthand for calling anomaly_score() and predict_from_anomaly_scores().
347 |         
348 |         Parameters
349 |         ----------
350 |         X: nD array. 
351 |             nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
352 |         threshold: float
353 |             Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies.
354 |         score_type: int. 0: distance score, 1: frequency of visit score, 2: collective anomaly score
355 |         Returns
356 |         -------
357 |         1D array
358 |             The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
359 |         """
360 |         if score_type>2 or score_type<0:
361 |             # raise instead of terminating the interpreter from library code
362 |             raise ValueError("score_type should be 0 (distance score), 1 (frequency of visit score) or 2 (collective anomaly score)")
363 |         scores = self.anomaly_score(X)
364 |         return self.predict_from_anomaly_scores(scores[score_type], threshold)
365 | 
366 | 
367 | class DiFF_Tree:
368 |     '''
369 |     Construct a tree via randomized splits with maximum height height_limit.
370 |     '''
371 |     def __init__(self, height_limit):
372 |         '''
373 |         Parameters
374 |         ----------
375 |         height_limit: int
376 |             Maximum height of the tree.
377 |         Returns
378 |         -------
379 |         None
380 |         '''
381 |         self.height_limit = height_limit
382 | 
383 |     def fit(self, X: np.ndarray, featureDistrib: np.array):
384 |         """
385 |         Given a 2D matrix of observations, create a DiFF tree. Set field
386 |         self.root to the root of that tree and return it.
387 |         
388 |         Parameters
389 |         ----------
390 |         X: nD array. 
391 |             nD array with the observations. Dimensions should be (n_obs, n_features).        
392 |         featureDistrib: 1D array
393 |             The distribution weight affected to each dimension
394 |         Returns
395 |         -------
396 |         A DIFF tree root.
397 |         """
398 |         self.root = InNode(X, self.height_limit, featureDistrib, len(X), 0)
399 | 
400 |         return self.root
401 | 
402 | 
403 | class InNode:
404 |     '''
405 |     Node of the tree that is not a leaf node.
406 |     The functionality of the class is:
407 |     - Draw a split dimension at random according to the feature
408 |         weights, and a split value uniformly within its range.
409 |     - Partition the observations according to the
410 |     split and send them along to two child nodes.
411 |     The method usually has a higher complexity than doing it for every point.
412 |     But because it's using NumPy it's more efficient time-wise.
413 |     '''
414 |     def __init__(self, X, height_limit, featureDistrib, sample_size, current_height):
415 |         '''
416 |         Parameters
417 |         ----------
418 |         X: nD array. 
419 |             nD array with the training instances that have reached the node.
420 |         height_limit: int
421 |             Maximum height of the tree.
422 |         featureDistrib: 1D array. 
423 |             feature weights used at parent level to randomly select a dimension; recomputed locally when the node holds more than 32 instances. 
424 |         sample_size: int
425 |             Size of the sample used to build the tree.
426 |         current_height: int
427 |             Current height of the tree.
428 |         Returns
429 |         -------
430 |             None
431 |         '''
432 | 
433 |         self.size = len(X)
434 |         self.height = current_height+1
435 |         n_obs, n_features = X.shape
436 |         next_height = current_height + 1
437 |         limit_not_reached = height_limit > next_height
438 | 
439 |         if len(X) > 32:
440 |             featureDistrib = []
441 |             nbins = int(len(X)/8)+2
442 |             for i in range(np.shape(X)[1]):
443 |                 featureDistrib.append(weightFeature(X[:, i], nbins))
444 |             featureDistrib = np.array(featureDistrib)
445 |             featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
446 | 
447 |         self.featureDistrib = featureDistrib
448 | 
449 |         cols = np.arange(np.shape(X)[1], dtype='int')
450 | 
451 |         self.splitAtt = rn.choices(cols, weights=featureDistrib)[0]
452 |         splittingCol = X[:, self.splitAtt]
453 |         self.splitValue = getSplit(splittingCol)
454 |         idx = splittingCol <= self.splitValue
455 | 
456 |         # left child: training instances at or below the split value
457 | 
458 |         X_aux = X[idx, :]
459 | 
460 |         self.left = (InNode(X_aux, height_limit, featureDistrib, sample_size, next_height)
461 |                      if limit_not_reached and X_aux.shape[0] > 5 and (np.any(X_aux.max(0) != X_aux.min(0))) else LeafNode(
462 |                          X_aux, next_height, X, sample_size))
463 | 
464 |         idx = np.invert(idx)
465 |         X_aux = X[idx, :]
466 |         self.right = (InNode(X_aux, height_limit, featureDistrib, sample_size, next_height)
467 |                       if limit_not_reached and X_aux.shape[0] > 5 and (np.any(X_aux.max(0) != X_aux.min(0))) else LeafNode(
468 |                           X_aux, next_height, X, sample_size))
469 | 
470 |         self.n_nodes = 1 + self.left.n_nodes + self.right.n_nodes
471 | 
472 | 
473 | class LeafNode:
474 |     '''
475 |     Leaf node
476 |     The base functionality is storing the mean and standard deviation of the observations in that node.
477 |     We also evaluate the frequency of visit for training data.
478 |     '''
479 |     def __init__(self, X, height, Xp, sample_size):
480 |         '''
481 |         Parameters
482 |         ----------
483 |         X: nD array. 
484 |             nD array with the training instances falling into the leaf node.    
485 |         height: int
486 |             Current height of the tree.
487 |         Xp: nD array. 
488 |             nD array with the training instances falling into the parent node.    
489 |         sample_size: int
490 |             Size of the sample used to build the tree.
491 |         Returns
492 |         -------
493 |             None
494 |         '''
495 |         self.height = height+1
496 |         self.size = len(X)
497 |         self.n_nodes = 1
498 |         self.freq = self.size/sample_size
499 |         self.freqs = 0
500 | 
501 |         if len(X) != 0:
502 |             self.M = np.mean(X, axis=0)
503 |             if len(X) > 10:
504 |                 self.Mstd = np.std(X, axis=0)
505 |                 self.Mstd[self.Mstd == 0] = 1e-2
506 |             else:
507 |                 self.Mstd = np.ones(np.shape(X)[1])
508 |         else:
509 |             self.M = np.mean(Xp, axis=0)
510 |             if len(Xp) > 10:
511 |                 self.Mstd = np.std(Xp, axis=0)
512 |                 self.Mstd[self.Mstd == 0] = 1e-2
513 |             else:
514 |                 self.Mstd = np.ones(np.shape(Xp)[1])
515 | 

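The statistics stored by `LeafNode` can be reproduced in isolation. The sketch below is not part of DiFF_RF.py: `leaf_summary` and its `min_points` and `std_floor` parameters are hypothetical names chosen for illustration, with defaults taken from the constants in `LeafNode.__init__` (more than 10 points, floor of 1e-2). It assumes `X` and `Xp` are 2D NumPy arrays with the same number of columns.

```python
import numpy as np


def leaf_summary(X, Xp, sample_size, min_points=10, std_floor=1e-2):
    """Hypothetical helper mirroring LeafNode.__init__.

    Returns (mean, std, freq) for a leaf holding the rows of X, whose
    parent node held the rows of Xp.
    """
    # An empty leaf borrows its statistics from the parent's instances.
    ref = X if len(X) != 0 else Xp
    mean = np.mean(ref, axis=0)
    if len(ref) > min_points:
        std = np.std(ref, axis=0)
        # Constant features would give a zero spread; use a small floor.
        std[std == 0] = std_floor
    else:
        # Too few points for a per-feature estimate: use unit spread.
        std = np.ones(ref.shape[1])
    # Fraction of the tree's training sample that landed in this leaf.
    freq = len(X) / sample_size
    return mean, std, freq


# Usage: an empty leaf falls back to the parent's mean and unit spread.
parent = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
empty = parent[parent[:, 0] > 10]  # shape (0, 2)
mean, std, freq = leaf_summary(empty, parent, sample_size=3)
```

In this usage the leaf is empty, so the mean comes from the three parent rows, the spread defaults to ones because the parent has 10 or fewer rows, and the visit frequency is zero.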

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 |                     GNU GENERAL PUBLIC LICENSE
  2 |                        Version 3, 29 June 2007
  3 | 
  4 |  Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
  5 |  Everyone is permitted to copy and distribute verbatim copies
  6 |  of this license document, but changing it is not allowed.
  7 | 
  8 |                             Preamble
  9 | 
 10 |   The GNU General Public License is a free, copyleft license for
 11 | software and other kinds of works.
 12 | 
 13 |   The licenses for most software and other practical works are designed
 14 | to take away your freedom to share and change the works.  By contrast,
 15 | the GNU General Public License is intended to guarantee your freedom to
 16 | share and change all versions of a program--to make sure it remains free
 17 | software for all its users.  We, the Free Software Foundation, use the
 18 | GNU General Public License for most of our software; it applies also to
 19 | any other work released this way by its authors.  You can apply it to
 20 | your programs, too.
 21 | 
 22 |   When we speak of free software, we are referring to freedom, not
 23 | price.  Our General Public Licenses are designed to make sure that you
 24 | have the freedom to distribute copies of free software (and charge for
 25 | them if you wish), that you receive source code or can get it if you
 26 | want it, that you can change the software or use pieces of it in new
 27 | free programs, and that you know you can do these things.
 28 | 
 29 |   To protect your rights, we need to prevent others from denying you
 30 | these rights or asking you to surrender the rights.  Therefore, you have
 31 | certain responsibilities if you distribute copies of the software, or if
 32 | you modify it: responsibilities to respect the freedom of others.
 33 | 
 34 |   For example, if you distribute copies of such a program, whether
 35 | gratis or for a fee, you must pass on to the recipients the same
 36 | freedoms that you received.  You must make sure that they, too, receive
 37 | or can get the source code.  And you must show them these terms so they
 38 | know their rights.
 39 | 
 40 |   Developers that use the GNU GPL protect your rights with two steps:
 41 | (1) assert copyright on the software, and (2) offer you this License
 42 | giving you legal permission to copy, distribute and/or modify it.
 43 | 
 44 |   For the developers' and authors' protection, the GPL clearly explains
 45 | that there is no warranty for this free software.  For both users' and
 46 | authors' sake, the GPL requires that modified versions be marked as
 47 | changed, so that their problems will not be attributed erroneously to
 48 | authors of previous versions.
 49 | 
 50 |   Some devices are designed to deny users access to install or run
 51 | modified versions of the software inside them, although the manufacturer
 52 | can do so.  This is fundamentally incompatible with the aim of
 53 | protecting users' freedom to change the software.  The systematic
 54 | pattern of such abuse occurs in the area of products for individuals to
 55 | use, which is precisely where it is most unacceptable.  Therefore, we
 56 | have designed this version of the GPL to prohibit the practice for those
 57 | products.  If such problems arise substantially in other domains, we
 58 | stand ready to extend this provision to those domains in future versions
 59 | of the GPL, as needed to protect the freedom of users.
 60 | 
 61 |   Finally, every program is threatened constantly by software patents.
 62 | States should not allow patents to restrict development and use of
 63 | software on general-purpose computers, but in those that do, we wish to
 64 | avoid the special danger that patents applied to a free program could
 65 | make it effectively proprietary.  To prevent this, the GPL assures that
 66 | patents cannot be used to render the program non-free.
 67 | 
 68 |   The precise terms and conditions for copying, distribution and
 69 | modification follow.
 70 | 
 71 |                        TERMS AND CONDITIONS
 72 | 
 73 |   0. Definitions.
 74 | 
 75 |   "This License" refers to version 3 of the GNU General Public License.
 76 | 
 77 |   "Copyright" also means copyright-like laws that apply to other kinds of
 78 | works, such as semiconductor masks.
 79 | 
 80 |   "The Program" refers to any copyrightable work licensed under this
 81 | License.  Each licensee is addressed as "you".  "Licensees" and
 82 | "recipients" may be individuals or organizations.
 83 | 
 84 |   To "modify" a work means to copy from or adapt all or part of the work
 85 | in a fashion requiring copyright permission, other than the making of an
 86 | exact copy.  The resulting work is called a "modified version" of the
 87 | earlier work or a work "based on" the earlier work.
 88 | 
 89 |   A "covered work" means either the unmodified Program or a work based
 90 | on the Program.
 91 | 
 92 |   To "propagate" a work means to do anything with it that, without
 93 | permission, would make you directly or secondarily liable for
 94 | infringement under applicable copyright law, except executing it on a
 95 | computer or modifying a private copy.  Propagation includes copying,
 96 | distribution (with or without modification), making available to the
 97 | public, and in some countries other activities as well.
 98 | 
 99 |   To "convey" a work means any kind of propagation that enables other
100 | parties to make or receive copies.  Mere interaction with a user through
101 | a computer network, with no transfer of a copy, is not conveying.
102 | 
103 |   An interactive user interface displays "Appropriate Legal Notices"
104 | to the extent that it includes a convenient and prominently visible
105 | feature that (1) displays an appropriate copyright notice, and (2)
106 | tells the user that there is no warranty for the work (except to the
107 | extent that warranties are provided), that licensees may convey the
108 | work under this License, and how to view a copy of this License.  If
109 | the interface presents a list of user commands or options, such as a
110 | menu, a prominent item in the list meets this criterion.
111 | 
112 |   1. Source Code.
113 | 
114 |   The "source code" for a work means the preferred form of the work
115 | for making modifications to it.  "Object code" means any non-source
116 | form of a work.
117 | 
118 |   A "Standard Interface" means an interface that either is an official
119 | standard defined by a recognized standards body, or, in the case of
120 | interfaces specified for a particular programming language, one that
121 | is widely used among developers working in that language.
122 | 
123 |   The "System Libraries" of an executable work include anything, other
124 | than the work as a whole, that (a) is included in the normal form of
125 | packaging a Major Component, but which is not part of that Major
126 | Component, and (b) serves only to enable use of the work with that
127 | Major Component, or to implement a Standard Interface for which an
128 | implementation is available to the public in source code form.  A
129 | "Major Component", in this context, means a major essential component
130 | (kernel, window system, and so on) of the specific operating system
131 | (if any) on which the executable work runs, or a compiler used to
132 | produce the work, or an object code interpreter used to run it.
133 | 
134 |   The "Corresponding Source" for a work in object code form means all
135 | the source code needed to generate, install, and (for an executable
136 | work) run the object code and to modify the work, including scripts to
137 | control those activities.  However, it does not include the work's
138 | System Libraries, or general-purpose tools or generally available free
139 | programs which are used unmodified in performing those activities but
140 | which are not part of the work.  For example, Corresponding Source
141 | includes interface definition files associated with source files for
142 | the work, and the source code for shared libraries and dynamically
143 | linked subprograms that the work is specifically designed to require,
144 | such as by intimate data communication or control flow between those
145 | subprograms and other parts of the work.
146 | 
147 |   The Corresponding Source need not include anything that users
148 | can regenerate automatically from other parts of the Corresponding
149 | Source.
150 | 
151 |   The Corresponding Source for a work in source code form is that
152 | same work.
153 | 
154 |   2. Basic Permissions.
155 | 
156 |   All rights granted under this License are granted for the term of
157 | copyright on the Program, and are irrevocable provided the stated
158 | conditions are met.  This License explicitly affirms your unlimited
159 | permission to run the unmodified Program.  The output from running a
160 | covered work is covered by this License only if the output, given its
161 | content, constitutes a covered work.  This License acknowledges your
162 | rights of fair use or other equivalent, as provided by copyright law.
163 | 
164 |   You may make, run and propagate covered works that you do not
165 | convey, without conditions so long as your license otherwise remains
166 | in force.  You may convey covered works to others for the sole purpose
167 | of having them make modifications exclusively for you, or provide you
168 | with facilities for running those works, provided that you comply with
169 | the terms of this License in conveying all material for which you do
170 | not control copyright.  Those thus making or running the covered works
171 | for you must do so exclusively on your behalf, under your direction
172 | and control, on terms that prohibit them from making any copies of
173 | your copyrighted material outside their relationship with you.
174 | 
175 |   Conveying under any other circumstances is permitted solely under
176 | the conditions stated below.  Sublicensing is not allowed; section 10
177 | makes it unnecessary.
178 | 
179 |   3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180 | 
181 |   No covered work shall be deemed part of an effective technological
182 | measure under any applicable law fulfilling obligations under article
183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184 | similar laws prohibiting or restricting circumvention of such
185 | measures.
186 | 
187 |   When you convey a covered work, you waive any legal power to forbid
188 | circumvention of technological measures to the extent such circumvention
189 | is effected by exercising rights under this License with respect to
190 | the covered work, and you disclaim any intention to limit operation or
191 | modification of the work as a means of enforcing, against the work's
192 | users, your or third parties' legal rights to forbid circumvention of
193 | technological measures.
194 | 
195 |   4. Conveying Verbatim Copies.
196 | 
197 |   You may convey verbatim copies of the Program's source code as you
198 | receive it, in any medium, provided that you conspicuously and
199 | appropriately publish on each copy an appropriate copyright notice;
200 | keep intact all notices stating that this License and any
201 | non-permissive terms added in accord with section 7 apply to the code;
202 | keep intact all notices of the absence of any warranty; and give all
203 | recipients a copy of this License along with the Program.
204 | 
205 |   You may charge any price or no price for each copy that you convey,
206 | and you may offer support or warranty protection for a fee.
207 | 
208 |   5. Conveying Modified Source Versions.
209 | 
210 |   You may convey a work based on the Program, or the modifications to
211 | produce it from the Program, in the form of source code under the
212 | terms of section 4, provided that you also meet all of these conditions:
213 | 
214 |     a) The work must carry prominent notices stating that you modified
215 |     it, and giving a relevant date.
216 | 
217 |     b) The work must carry prominent notices stating that it is
218 |     released under this License and any conditions added under section
219 |     7.  This requirement modifies the requirement in section 4 to
220 |     "keep intact all notices".
221 | 
222 |     c) You must license the entire work, as a whole, under this
223 |     License to anyone who comes into possession of a copy.  This
224 |     License will therefore apply, along with any applicable section 7
225 |     additional terms, to the whole of the work, and all its parts,
226 |     regardless of how they are packaged.  This License gives no
227 |     permission to license the work in any other way, but it does not
228 |     invalidate such permission if you have separately received it.
229 | 
230 |     d) If the work has interactive user interfaces, each must display
231 |     Appropriate Legal Notices; however, if the Program has interactive
232 |     interfaces that do not display Appropriate Legal Notices, your
233 |     work need not make them do so.
234 | 
235 |   A compilation of a covered work with other separate and independent
236 | works, which are not by their nature extensions of the covered work,
237 | and which are not combined with it such as to form a larger program,
238 | in or on a volume of a storage or distribution medium, is called an
239 | "aggregate" if the compilation and its resulting copyright are not
240 | used to limit the access or legal rights of the compilation's users
241 | beyond what the individual works permit.  Inclusion of a covered work
242 | in an aggregate does not cause this License to apply to the other
243 | parts of the aggregate.
244 | 
245 |   6. Conveying Non-Source Forms.
246 | 
247 |   You may convey a covered work in object code form under the terms
248 | of sections 4 and 5, provided that you also convey the
249 | machine-readable Corresponding Source under the terms of this License,
250 | in one of these ways:
251 | 
252 |     a) Convey the object code in, or embodied in, a physical product
253 |     (including a physical distribution medium), accompanied by the
254 |     Corresponding Source fixed on a durable physical medium
255 |     customarily used for software interchange.
256 | 
257 |     b) Convey the object code in, or embodied in, a physical product
258 |     (including a physical distribution medium), accompanied by a
259 |     written offer, valid for at least three years and valid for as
260 |     long as you offer spare parts or customer support for that product
261 |     model, to give anyone who possesses the object code either (1) a
262 |     copy of the Corresponding Source for all the software in the
263 |     product that is covered by this License, on a durable physical
264 |     medium customarily used for software interchange, for a price no
265 |     more than your reasonable cost of physically performing this
266 |     conveying of source, or (2) access to copy the
267 |     Corresponding Source from a network server at no charge.
268 | 
269 |     c) Convey individual copies of the object code with a copy of the
270 |     written offer to provide the Corresponding Source.  This
271 |     alternative is allowed only occasionally and noncommercially, and
272 |     only if you received the object code with such an offer, in accord
273 |     with subsection 6b.
274 | 
275 |     d) Convey the object code by offering access from a designated
276 |     place (gratis or for a charge), and offer equivalent access to the
277 |     Corresponding Source in the same way through the same place at no
278 |     further charge.  You need not require recipients to copy the
279 |     Corresponding Source along with the object code.  If the place to
280 |     copy the object code is a network server, the Corresponding Source
281 |     may be on a different server (operated by you or a third party)
282 |     that supports equivalent copying facilities, provided you maintain
283 |     clear directions next to the object code saying where to find the
284 |     Corresponding Source.  Regardless of what server hosts the
285 |     Corresponding Source, you remain obligated to ensure that it is
286 |     available for as long as needed to satisfy these requirements.
287 | 
288 |     e) Convey the object code using peer-to-peer transmission, provided
289 |     you inform other peers where the object code and Corresponding
290 |     Source of the work are being offered to the general public at no
291 |     charge under subsection 6d.
292 | 
293 |   A separable portion of the object code, whose source code is excluded
294 | from the Corresponding Source as a System Library, need not be
295 | included in conveying the object code work.
296 | 
297 |   A "User Product" is either (1) a "consumer product", which means any
298 | tangible personal property which is normally used for personal, family,
299 | or household purposes, or (2) anything designed or sold for incorporation
300 | into a dwelling.  In determining whether a product is a consumer product,
301 | doubtful cases shall be resolved in favor of coverage.  For a particular
302 | product received by a particular user, "normally used" refers to a
303 | typical or common use of that class of product, regardless of the status
304 | of the particular user or of the way in which the particular user
305 | actually uses, or expects or is expected to use, the product.  A product
306 | is a consumer product regardless of whether the product has substantial
307 | commercial, industrial or non-consumer uses, unless such uses represent
308 | the only significant mode of use of the product.
309 | 
310 |   "Installation Information" for a User Product means any methods,
311 | procedures, authorization keys, or other information required to install
312 | and execute modified versions of a covered work in that User Product from
313 | a modified version of its Corresponding Source.  The information must
314 | suffice to ensure that the continued functioning of the modified object
315 | code is in no case prevented or interfered with solely because
316 | modification has been made.
317 | 
318 |   If you convey an object code work under this section in, or with, or
319 | specifically for use in, a User Product, and the conveying occurs as
320 | part of a transaction in which the right of possession and use of the
321 | User Product is transferred to the recipient in perpetuity or for a
322 | fixed term (regardless of how the transaction is characterized), the
323 | Corresponding Source conveyed under this section must be accompanied
324 | by the Installation Information.  But this requirement does not apply
325 | if neither you nor any third party retains the ability to install
326 | modified object code on the User Product (for example, the work has
327 | been installed in ROM).
328 | 
329 |   The requirement to provide Installation Information does not include a
330 | requirement to continue to provide support service, warranty, or updates
331 | for a work that has been modified or installed by the recipient, or for
332 | the User Product in which it has been modified or installed.  Access to a
333 | network may be denied when the modification itself materially and
334 | adversely affects the operation of the network or violates the rules and
335 | protocols for communication across the network.
336 | 
337 |   Corresponding Source conveyed, and Installation Information provided,
338 | in accord with this section must be in a format that is publicly
339 | documented (and with an implementation available to the public in
340 | source code form), and must require no special password or key for
341 | unpacking, reading or copying.
342 | 
343 |   7. Additional Terms.
344 | 
345 |   "Additional permissions" are terms that supplement the terms of this
346 | License by making exceptions from one or more of its conditions.
347 | Additional permissions that are applicable to the entire Program shall
348 | be treated as though they were included in this License, to the extent
349 | that they are valid under applicable law.  If additional permissions
350 | apply only to part of the Program, that part may be used separately
351 | under those permissions, but the entire Program remains governed by
352 | this License without regard to the additional permissions.
353 | 
354 |   When you convey a copy of a covered work, you may at your option
355 | remove any additional permissions from that copy, or from any part of
356 | it.  (Additional permissions may be written to require their own
357 | removal in certain cases when you modify the work.)  You may place
358 | additional permissions on material, added by you to a covered work,
359 | for which you have or can give appropriate copyright permission.
360 | 
361 |   Notwithstanding any other provision of this License, for material you
362 | add to a covered work, you may (if authorized by the copyright holders of
363 | that material) supplement the terms of this License with terms:
364 | 
365 |     a) Disclaiming warranty or limiting liability differently from the
366 |     terms of sections 15 and 16 of this License; or
367 | 
368 |     b) Requiring preservation of specified reasonable legal notices or
369 |     author attributions in that material or in the Appropriate Legal
370 |     Notices displayed by works containing it; or
371 | 
372 |     c) Prohibiting misrepresentation of the origin of that material, or
373 |     requiring that modified versions of such material be marked in
374 |     reasonable ways as different from the original version; or
375 | 
376 |     d) Limiting the use for publicity purposes of names of licensors or
377 |     authors of the material; or
378 | 
379 |     e) Declining to grant rights under trademark law for use of some
380 |     trade names, trademarks, or service marks; or
381 | 
382 |     f) Requiring indemnification of licensors and authors of that
383 |     material by anyone who conveys the material (or modified versions of
384 |     it) with contractual assumptions of liability to the recipient, for
385 |     any liability that these contractual assumptions directly impose on
386 |     those licensors and authors.
387 | 
388 |   All other non-permissive additional terms are considered "further
389 | restrictions" within the meaning of section 10.  If the Program as you
390 | received it, or any part of it, contains a notice stating that it is
391 | governed by this License along with a term that is a further
392 | restriction, you may remove that term.  If a license document contains
393 | a further restriction but permits relicensing or conveying under this
394 | License, you may add to a covered work material governed by the terms
395 | of that license document, provided that the further restriction does
396 | not survive such relicensing or conveying.
397 | 
398 |   If you add terms to a covered work in accord with this section, you
399 | must place, in the relevant source files, a statement of the
400 | additional terms that apply to those files, or a notice indicating
401 | where to find the applicable terms.
402 | 
403 |   Additional terms, permissive or non-permissive, may be stated in the
404 | form of a separately written license, or stated as exceptions;
405 | the above requirements apply either way.
406 | 
407 |   8. Termination.
408 | 
409 |   You may not propagate or modify a covered work except as expressly
410 | provided under this License.  Any attempt otherwise to propagate or
411 | modify it is void, and will automatically terminate your rights under
412 | this License (including any patent licenses granted under the third
413 | paragraph of section 11).
414 | 
415 |   However, if you cease all violation of this License, then your
416 | license from a particular copyright holder is reinstated (a)
417 | provisionally, unless and until the copyright holder explicitly and
418 | finally terminates your license, and (b) permanently, if the copyright
419 | holder fails to notify you of the violation by some reasonable means
420 | prior to 60 days after the cessation.
421 | 
422 |   Moreover, your license from a particular copyright holder is
423 | reinstated permanently if the copyright holder notifies you of the
424 | violation by some reasonable means, this is the first time you have
425 | received notice of violation of this License (for any work) from that
426 | copyright holder, and you cure the violation prior to 30 days after
427 | your receipt of the notice.
428 | 
429 |   Termination of your rights under this section does not terminate the
430 | licenses of parties who have received copies or rights from you under
431 | this License.  If your rights have been terminated and not permanently
432 | reinstated, you do not qualify to receive new licenses for the same
433 | material under section 10.
434 | 
435 |   9. Acceptance Not Required for Having Copies.
436 | 
437 |   You are not required to accept this License in order to receive or
438 | run a copy of the Program.  Ancillary propagation of a covered work
439 | occurring solely as a consequence of using peer-to-peer transmission
440 | to receive a copy likewise does not require acceptance.  However,
441 | nothing other than this License grants you permission to propagate or
442 | modify any covered work.  These actions infringe copyright if you do
443 | not accept this License.  Therefore, by modifying or propagating a
444 | covered work, you indicate your acceptance of this License to do so.
445 | 
446 |   10. Automatic Licensing of Downstream Recipients.
447 | 
448 |   Each time you convey a covered work, the recipient automatically
449 | receives a license from the original licensors, to run, modify and
450 | propagate that work, subject to this License.  You are not responsible
451 | for enforcing compliance by third parties with this License.
452 | 
453 |   An "entity transaction" is a transaction transferring control of an
454 | organization, or substantially all assets of one, or subdividing an
455 | organization, or merging organizations.  If propagation of a covered
456 | work results from an entity transaction, each party to that
457 | transaction who receives a copy of the work also receives whatever
458 | licenses to the work the party's predecessor in interest had or could
459 | give under the previous paragraph, plus a right to possession of the
460 | Corresponding Source of the work from the predecessor in interest, if
461 | the predecessor has it or can get it with reasonable efforts.
462 | 
463 |   You may not impose any further restrictions on the exercise of the
464 | rights granted or affirmed under this License.  For example, you may
465 | not impose a license fee, royalty, or other charge for exercise of
466 | rights granted under this License, and you may not initiate litigation
467 | (including a cross-claim or counterclaim in a lawsuit) alleging that
468 | any patent claim is infringed by making, using, selling, offering for
469 | sale, or importing the Program or any portion of it.
470 | 
471 |   11. Patents.
472 | 
473 |   A "contributor" is a copyright holder who authorizes use under this
474 | License of the Program or a work on which the Program is based.  The
475 | work thus licensed is called the contributor's "contributor version".
476 | 
477 |   A contributor's "essential patent claims" are all patent claims
478 | owned or controlled by the contributor, whether already acquired or
479 | hereafter acquired, that would be infringed by some manner, permitted
480 | by this License, of making, using, or selling its contributor version,
481 | but do not include claims that would be infringed only as a
482 | consequence of further modification of the contributor version.  For
483 | purposes of this definition, "control" includes the right to grant
484 | patent sublicenses in a manner consistent with the requirements of
485 | this License.
486 | 
487 |   Each contributor grants you a non-exclusive, worldwide, royalty-free
488 | patent license under the contributor's essential patent claims, to
489 | make, use, sell, offer for sale, import and otherwise run, modify and
490 | propagate the contents of its contributor version.
491 | 
492 |   In the following three paragraphs, a "patent license" is any express
493 | agreement or commitment, however denominated, not to enforce a patent
494 | (such as an express permission to practice a patent or covenant not to
495 | sue for patent infringement).  To "grant" such a patent license to a
496 | party means to make such an agreement or commitment not to enforce a
497 | patent against the party.
498 | 
499 |   If you convey a covered work, knowingly relying on a patent license,
500 | and the Corresponding Source of the work is not available for anyone
501 | to copy, free of charge and under the terms of this License, through a
502 | publicly available network server or other readily accessible means,
503 | then you must either (1) cause the Corresponding Source to be so
504 | available, or (2) arrange to deprive yourself of the benefit of the
505 | patent license for this particular work, or (3) arrange, in a manner
506 | consistent with the requirements of this License, to extend the patent
507 | license to downstream recipients.  "Knowingly relying" means you have
508 | actual knowledge that, but for the patent license, your conveying the
509 | covered work in a country, or your recipient's use of the covered work
510 | in a country, would infringe one or more identifiable patents in that
511 | country that you have reason to believe are valid.
512 | 
513 |   If, pursuant to or in connection with a single transaction or
514 | arrangement, you convey, or propagate by procuring conveyance of, a
515 | covered work, and grant a patent license to some of the parties
516 | receiving the covered work authorizing them to use, propagate, modify
517 | or convey a specific copy of the covered work, then the patent license
518 | you grant is automatically extended to all recipients of the covered
519 | work and works based on it.
520 | 
521 |   A patent license is "discriminatory" if it does not include within
522 | the scope of its coverage, prohibits the exercise of, or is
523 | conditioned on the non-exercise of one or more of the rights that are
524 | specifically granted under this License.  You may not convey a covered
525 | work if you are a party to an arrangement with a third party that is
526 | in the business of distributing software, under which you make payment
527 | to the third party based on the extent of your activity of conveying
528 | the work, and under which the third party grants, to any of the
529 | parties who would receive the covered work from you, a discriminatory
530 | patent license (a) in connection with copies of the covered work
531 | conveyed by you (or copies made from those copies), or (b) primarily
532 | for and in connection with specific products or compilations that
533 | contain the covered work, unless you entered into that arrangement,
534 | or that patent license was granted, prior to 28 March 2007.
535 | 
536 |   Nothing in this License shall be construed as excluding or limiting
537 | any implied license or other defenses to infringement that may
538 | otherwise be available to you under applicable patent law.
539 | 
540 |   12. No Surrender of Others' Freedom.
541 | 
542 |   If conditions are imposed on you (whether by court order, agreement or
543 | otherwise) that contradict the conditions of this License, they do not
544 | excuse you from the conditions of this License.  If you cannot convey a
545 | covered work so as to satisfy simultaneously your obligations under this
546 | License and any other pertinent obligations, then as a consequence you may
547 | not convey it at all.  For example, if you agree to terms that obligate you
548 | to collect a royalty for further conveying from those to whom you convey
549 | the Program, the only way you could satisfy both those terms and this
550 | License would be to refrain entirely from conveying the Program.
551 | 
552 |   13. Use with the GNU Affero General Public License.
553 | 
554 |   Notwithstanding any other provision of this License, you have
555 | permission to link or combine any covered work with a work licensed
556 | under version 3 of the GNU Affero General Public License into a single
557 | combined work, and to convey the resulting work.  The terms of this
558 | License will continue to apply to the part which is the covered work,
559 | but the special requirements of the GNU Affero General Public License,
560 | section 13, concerning interaction through a network will apply to the
561 | combination as such.
562 | 
563 |   14. Revised Versions of this License.
564 | 
565 |   The Free Software Foundation may publish revised and/or new versions of
566 | the GNU General Public License from time to time.  Such new versions will
567 | be similar in spirit to the present version, but may differ in detail to
568 | address new problems or concerns.
569 | 
570 |   Each version is given a distinguishing version number.  If the
571 | Program specifies that a certain numbered version of the GNU General
572 | Public License "or any later version" applies to it, you have the
573 | option of following the terms and conditions either of that numbered
574 | version or of any later version published by the Free Software
575 | Foundation.  If the Program does not specify a version number of the
576 | GNU General Public License, you may choose any version ever published
577 | by the Free Software Foundation.
578 | 
579 |   If the Program specifies that a proxy can decide which future
580 | versions of the GNU General Public License can be used, that proxy's
581 | public statement of acceptance of a version permanently authorizes you
582 | to choose that version for the Program.
583 | 
584 |   Later license versions may give you additional or different
585 | permissions.  However, no additional obligations are imposed on any
586 | author or copyright holder as a result of your choosing to follow a
587 | later version.
588 | 
589 |   15. Disclaimer of Warranty.
590 | 
591 |   THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592 | APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596 | PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597 | IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599 | 
600 |   16. Limitation of Liability.
601 | 
602 |   IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610 | SUCH DAMAGES.
611 | 
612 |   17. Interpretation of Sections 15 and 16.
613 | 
614 |   If the disclaimer of warranty and limitation of liability provided
615 | above cannot be given local legal effect according to their terms,
616 | reviewing courts shall apply local law that most closely approximates
617 | an absolute waiver of all civil liability in connection with the
618 | Program, unless a warranty or assumption of liability accompanies a
619 | copy of the Program in return for a fee.
620 | 
621 |                      END OF TERMS AND CONDITIONS
622 | 
623 |             How to Apply These Terms to Your New Programs
624 | 
625 |   If you develop a new program, and you want it to be of the greatest
626 | possible use to the public, the best way to achieve this is to make it
627 | free software which everyone can redistribute and change under these terms.
628 | 
629 |   To do so, attach the following notices to the program.  It is safest
630 | to attach them to the start of each source file to most effectively
631 | state the exclusion of warranty; and each file should have at least
632 | the "copyright" line and a pointer to where the full notice is found.
633 | 
634 |     <one line to give the program's name and a brief idea of what it does.>
635 |     Copyright (C) <year>  <name of author>
636 | 
637 |     This program is free software: you can redistribute it and/or modify
638 |     it under the terms of the GNU General Public License as published by
639 |     the Free Software Foundation, either version 3 of the License, or
640 |     (at your option) any later version.
641 | 
642 |     This program is distributed in the hope that it will be useful,
643 |     but WITHOUT ANY WARRANTY; without even the implied warranty of
644 |     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
645 |     GNU General Public License for more details.
646 | 
647 |     You should have received a copy of the GNU General Public License
648 |     along with this program.  If not, see <https://www.gnu.org/licenses/>.
649 | 
650 | Also add information on how to contact you by electronic and paper mail.
651 | 
652 |   If the program does terminal interaction, make it output a short
653 | notice like this when it starts in an interactive mode:
654 | 
655 |     <program>  Copyright (C) <year>  <name of author>
656 |     This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657 |     This is free software, and you are welcome to redistribute it
658 |     under certain conditions; type `show c' for details.
659 | 
660 | The hypothetical commands `show w' and `show c' should show the appropriate
661 | parts of the General Public License.  Of course, your program's commands
662 | might be different; for a GUI interface, you would use an "about box".
663 | 
664 |   You should also get your employer (if you work as a programmer) or school,
665 | if any, to sign a "copyright disclaimer" for the program, if necessary.
666 | For more information on this, and how to apply and follow the GNU GPL, see
667 | <https://www.gnu.org/licenses/>.
668 | 
669 |   The GNU General Public License does not permit incorporating your program
670 | into proprietary programs.  If your program is a subroutine library, you
671 | may consider it more useful to permit linking proprietary applications with
672 | the library.  If this is what you want to do, use the GNU Lesser General
673 | Public License instead of this License.  But first, please read
674 | <https://www.gnu.org/licenses/why-not-lgpl.html>.
675 | 


--------------------------------------------------------------------------------
/documentation/DiFF-RF_API.html:
--------------------------------------------------------------------------------
   1 | <!doctype html>
   2 | <html lang="en">
   3 | <head>
   4 | <meta charset="utf-8">
   5 | <meta name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1" />
   6 | <meta name="generator" content="pdoc 0.8.3" />
   7 | <title>DiFF_RF API documentation</title>
   8 | <meta name="description" content="Created on Tue Mar 24 12:19:32 2020 …" />
   9 | <link rel="preload stylesheet" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/10up-sanitize.css/11.0.1/sanitize.min.css" integrity="sha256-PK9q560IAAa6WVRRh76LtCaI8pjTJ2z11v0miyNNjrs=" crossorigin>
  10 | <link rel="preload stylesheet" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/10up-sanitize.css/11.0.1/typography.min.css" integrity="sha256-7l/o7C8jubJiy74VsKTidCy1yBkRtiUGbVkYBylBqUg=" crossorigin>
  11 | <link rel="stylesheet preload" as="style" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/github.min.css" crossorigin>
  12 | <style>:root{--highlight-color:#fe9}.flex{display:flex !important}body{line-height:1.5em}#content{padding:20px}#sidebar{padding:30px;overflow:hidden}#sidebar > *:last-child{margin-bottom:2cm}.http-server-breadcrumbs{font-size:130%;margin:0 0 15px 0}#footer{font-size:.75em;padding:5px 30px;border-top:1px solid #ddd;text-align:right}#footer p{margin:0 0 0 1em;display:inline-block}#footer p:last-child{margin-right:30px}h1,h2,h3,h4,h5{font-weight:300}h1{font-size:2.5em;line-height:1.1em}h2{font-size:1.75em;margin:1em 0 .50em 0}h3{font-size:1.4em;margin:25px 0 10px 0}h4{margin:0;font-size:105%}h1:target,h2:target,h3:target,h4:target,h5:target,h6:target{background:var(--highlight-color);padding:.2em 0}a{color:#058;text-decoration:none;transition:color .3s ease-in-out}a:hover{color:#e82}.title code{font-weight:bold}h2[id^="header-"]{margin-top:2em}.ident{color:#900}pre code{background:#f8f8f8;font-size:.8em;line-height:1.4em}code{background:#f2f2f1;padding:1px 4px;overflow-wrap:break-word}h1 code{background:transparent}pre{background:#f8f8f8;border:0;border-top:1px solid #ccc;border-bottom:1px solid #ccc;margin:1em 0;padding:1ex}#http-server-module-list{display:flex;flex-flow:column}#http-server-module-list div{display:flex}#http-server-module-list dt{min-width:10%}#http-server-module-list p{margin-top:0}.toc ul,#index{list-style-type:none;margin:0;padding:0}#index code{background:transparent}#index h3{border-bottom:1px solid #ddd}#index ul{padding:0}#index h4{margin-top:.6em;font-weight:bold}@media (min-width:200ex){#index .two-column{column-count:2}}@media (min-width:300ex){#index .two-column{column-count:3}}dl{margin-bottom:2em}dl dl:last-child{margin-bottom:4em}dd{margin:0 0 1em 3em}#header-classes + dl > dd{margin-bottom:3em}dd dd{margin-left:2em}dd p{margin:10px 0}.name{background:#eee;font-weight:bold;font-size:.85em;padding:5px 10px;display:inline-block;min-width:40%}.name:hover{background:#e0e0e0}dt:target .name{background:var(--highlight-color)}.name > 
span:first-child{white-space:nowrap}.name.class > span:nth-child(2){margin-left:.4em}.inherited{color:#999;border-left:5px solid #eee;padding-left:1em}.inheritance em{font-style:normal;font-weight:bold}.desc h2{font-weight:400;font-size:1.25em}.desc h3{font-size:1em}.desc dt code{background:inherit}.source summary,.git-link-div{color:#666;text-align:right;font-weight:400;font-size:.8em;text-transform:uppercase}.source summary > *{white-space:nowrap;cursor:pointer}.git-link{color:inherit;margin-left:1em}.source pre{max-height:500px;overflow:auto;margin:0}.source pre code{font-size:12px;overflow:visible}.hlist{list-style:none}.hlist li{display:inline}.hlist li:after{content:',\2002'}.hlist li:last-child:after{content:none}.hlist .hlist{display:inline;padding-left:1em}img{max-width:100%}td{padding:0 .5em}.admonition{padding:.1em .5em;margin-bottom:1em}.admonition-title{font-weight:bold}.admonition.note,.admonition.info,.admonition.important{background:#aef}.admonition.todo,.admonition.versionadded,.admonition.tip,.admonition.hint{background:#dfd}.admonition.warning,.admonition.versionchanged,.admonition.deprecated{background:#fd4}.admonition.error,.admonition.danger,.admonition.caution{background:lightpink}</style>
  13 | <style media="screen and (min-width: 700px)">@media screen and (min-width:700px){#sidebar{width:30%;height:100vh;overflow:auto;position:sticky;top:0}#content{width:70%;max-width:100ch;padding:3em 4em;border-left:1px solid #ddd}pre code{font-size:1em}.item .name{font-size:1em}main{display:flex;flex-direction:row-reverse;justify-content:flex-end}.toc ul ul,#index ul{padding-left:1.5em}.toc > ul > li{margin-top:.5em}}</style>
  14 | <style media="print">@media print{#sidebar h1{page-break-before:always}.source{display:none}}@media print{*{background:transparent !important;color:#000 !important;box-shadow:none !important;text-shadow:none !important}a[href]:after{content:" (" attr(href) ")";font-size:90%}a[href][title]:after{content:none}abbr[title]:after{content:" (" attr(title) ")"}.ir a:after,a[href^="javascript:"]:after,a[href^="#"]:after{content:""}pre,blockquote{border:1px solid #999;page-break-inside:avoid}thead{display:table-header-group}tr,img{page-break-inside:avoid}img{max-width:100% !important}@page{margin:0.5cm}p,h2,h3{orphans:3;widows:3}h1,h2,h3,h4,h5,h6{page-break-after:avoid}}</style>
  15 | <script defer src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js" integrity="sha256-eOgo0OtLL4cdq7RdwRUiGKLX9XsIJ7nGhWEKbohmVAQ=" crossorigin></script>
  16 | <script>window.addEventListener('DOMContentLoaded', () => hljs.initHighlighting())</script>
  17 | </head>
  18 | <body>
  19 | <main>
  20 | <article id="content">
  21 | <header>
  22 | <h1 class="title">Module <code>DiFF_RF</code></h1>
  23 | </header>
  24 | <section id="section-intro">
  25 | <p>Created on Tue Mar 24 12:19:32 2020</p>
  26 | <p>@author: Pierre-François Marteau (<a href="https://people.irisa.fr/Pierre-Francois.Marteau/">https://people.irisa.fr/Pierre-Francois.Marteau/</a>)</p>
  27 | <details class="source">
  28 | <summary>
  29 | <span>Expand source code</span>
  30 | </summary>
  31 | <pre><code class="python">#!/usr/bin/env python3
  32 | # -*- coding: utf-8 -*-
  33 | &#34;&#34;&#34;
  34 | Created on Tue Mar 24 12:19:32 2020
  35 | 
  36 | @author: Pierre-François Marteau (https://people.irisa.fr/Pierre-Francois.Marteau/)
  37 | &#34;&#34;&#34;
  38 | 
  39 | # Inspired from an implementation of the isolation forest algorithm provided at
  40 | # https://github.com/xhan0909/isolation_forest
  41 | 
  42 | import numpy as np
  43 | import time
  44 | from functools import partial
  45 | from multiprocessing import Pool
  46 | 
  47 | import random as rn
  48 | 
  49 | def getSplit(X):
  50 |     &#34;&#34;&#34;
  51 |     Randomly selects a split value from set of scalar data &#39;X&#39;.
  52 |     Returns the split value.
  53 |     
  54 |     Parameters
  55 |     ----------
  56 |     X : array 
  57 |         Array of scalar values
  58 |     Returns
  59 |     -------
  60 |     float
  61 |         split value
  62 |     &#34;&#34;&#34;
  63 |     xmin = X.min()
  64 |     xmax = X.max()
  65 |     return np.random.uniform(xmin, xmax)
  66 | 
  67 | def similarityScore(S, node, alpha):
  68 |     &#34;&#34;&#34;
  69 |     Given a set of instances S falling into node and a value alpha &gt;= 0,
  70 |     returns, for each element x in S, the weighted similarity score between x
  71 |     and the centroid node.M of the training instances that fell into node.
  72 |     
  73 |     Parameters
  74 |     ----------
  75 |     S : array of instances
  76 |         Array of instances that fall into a node
  77 |     node: a DiFF tree node
  78 |         S is the set of instances &#34;falling&#34; into the node
  79 |     alpha: float
  80 |         alpha is the distance scaling hyper-parameter
  81 |     Returns
  82 |     -------
  83 |     array
  84 |         the array of similarity values between the instances in S and the mean of training instances falling in node
  85 | 
  86 |     &#34;&#34;&#34;
  87 |     d = np.shape(S)[1]
  88 |     if len(S) &gt; 0:
  89 |         d = np.shape(S)[1]
  90 |         U = (S-node.M)/node.Mstd # center on the node mean, then scale by the standard deviation vector
  91 |         U = (2)**(-alpha*(np.sum(U*U/d, axis=1)))
  92 |     else:
  93 |         U = 0
  94 | 
  95 |     return U
  96 | 
  97 | 
  98 | def EE(hist):
  99 |     &#34;&#34;&#34;
 100 |     Given a list of non-negative values forming a histogram drawn from any information source,
 101 |     returns the empirical entropy (in bits) of its discrete probability function.
 102 |     
 103 |     Parameters
 104 |     ----------
 105 |     hist: array 
 106 |         histogram
 107 |     Returns
 108 |     -------
 109 |     float
 110 |         empirical entropy estimated from the histogram
 111 | 
 112 |     &#34;&#34;&#34;
 113 |     h = np.asarray(hist, dtype=np.float64)
 114 |     if h.sum() &lt;= 0 or (h &lt; 0).any():
 115 |         return 0
 116 |     h = h/h.sum()
 117 |     return -(h*np.ma.log2(h)).sum()
 118 | 
 119 | 
 120 | def weightFeature(s, nbins):
 121 |     &#39;&#39;&#39;
 122 |     Given a list of values corresponding to a feature dimension, returns a weight (in [0,1]) that is 
 123 |     one minus the normalized empirical entropy, a way to characterize the importance of the feature dimension. 
 124 |     
 125 |     Parameters
 126 |     ----------
 127 |     s: array 
 128 |         list of scalar values corresponding to a feature dimension
 129 |     nbins: int
 130 |         the number of bins used to discretize the feature dimension using a histogram.
 131 |     Returns
 132 |     -------
 133 |     float
 134 |         the importance weight for feature s.
 135 |     &#39;&#39;&#39;
 136 |     if s.min() == s.max():
 137 |         return 0
 138 |     hist = np.histogram(s, bins=nbins, density=True)
 139 |     ent = EE(hist[0])
 140 |     ent = ent/np.log2(nbins)
 141 |     return 1-ent
 142 | 
 143 | 
 144 | def walk_tree(forest, node, treeIdx, obsIdx, X, featureDistrib, depth=0, alpha=1e-2):
 145 |     &#39;&#39;&#39;
 146 |     Recursive function that walks a tree from an already fitted forest and, at each leaf, stores the
 147 |     distance, frequency and collective scores of the observations in forest.LD, forest.LF and forest.LDF.
 148 |     
 149 |     Parameters
 150 |     ----------
 151 |     forest : DiFF_RF 
 152 |         A fitted forest of DiFF trees
 153 |     node: DiFF Tree node
 154 |         the current node
 155 |     treeIdx: int
 156 |         index of the tree that is being walked.
 157 |     obsIdx: array
 158 |         1D array of length n_obs. 1/0 if the obs has reached / has not reached the node.
 159 |     X: nD array. 
 160 |         array of observations/instances.
 161 |     depth: int
 162 |         current depth.
 163 |     Returns
 164 |     -------
 165 |     None
 166 |     &#39;&#39;&#39;
 167 | 
 168 |     if isinstance(node, LeafNode):
 169 |         Xnode = X[obsIdx]
 170 |         f = ((node.size+1)/forest.sample_size) / ((1+len(Xnode))/forest.XtestSize)
 171 |         if alpha == 0:
 172 |             forest.LD[obsIdx, treeIdx] = 0
 173 |             forest.LF[obsIdx, treeIdx] = -f
 174 |             forest.LDF[obsIdx, treeIdx] = -f
 175 |         else:
 176 |             z = similarityScore(Xnode, node, alpha)
 177 |             forest.LD[obsIdx, treeIdx] = z
 178 |             forest.LF[obsIdx, treeIdx] = -f
 179 |             forest.LDF[obsIdx, treeIdx] = z*f
 180 | 
 181 |     else:
 182 | 
 183 |         idx = (X[:, node.splitAtt] &lt;= node.splitValue) * obsIdx
 184 |         walk_tree(forest, node.left, treeIdx, idx, X, featureDistrib, depth + 1, alpha=alpha)
 185 | 
 186 |         idx = (X[:, node.splitAtt] &gt; node.splitValue) * obsIdx
 187 |         walk_tree(forest, node.right, treeIdx, idx, X, featureDistrib, depth + 1, alpha=alpha)
 188 | 
 189 | 
 190 | def create_tree(X, featureDistrib, sample_size, max_height):
 191 |     &#39;&#39;&#39;
 192 |     Creates a DiFF tree using a sample of size sample_size of the original data.
 193 |         
 194 |     Parameters
 195 |     ----------
 196 |     X: nD array. 
 197 |         nD array with the observations. Dimensions should be (n_obs, n_features).
 198 |     sample_size: int
 199 |         Size of the sample from which a DiFF tree is built.
 200 |     max_height: int
 201 |         Maximum height of the tree.
 202 |     Returns
 203 |     -------
 204 |     a DiFF tree
 205 |     &#39;&#39;&#39;
 206 |     rows = np.random.choice(len(X), sample_size, replace=False)
 207 |     featureDistrib = np.array(featureDistrib)
 208 |     return DiFF_Tree(max_height).fit(X[rows, :], featureDistrib)
 209 | 
 210 | 
 211 | class DiFF_TreeEnsemble:
 212 |     &#39;&#39;&#39;
 213 |     DiFF Forest.
 214 |     Even though all the methods are intended to be public, the main functionality of the class is given by:
 215 |     - __init__
 216 |     - fit
 217 |     - predict
 218 |     &#39;&#39;&#39;
 219 |     def __init__(self, sample_size: int, n_trees: int = 10):
 220 |         &#39;&#39;&#39;
 221 |         Creates the DiFF-RF object.
 222 |         
 223 |         Parameters
 224 |         ----------
 225 |         sample_size: int. 
 226 |             size of the sample randomly drawn from the train instances to build each DiFF tree.  
 227 |         n_trees: int
 228 |             The number of trees in the forest
 229 |         Returns
 230 |         -------
 231 |             None
 232 |         &#39;&#39;&#39;
 233 | 
 234 |         self.sample_size = sample_size
 235 |         self.n_trees = n_trees
 236 |         self.alpha=1.0
 237 |         np.random.seed(int(time.time()))
 238 |         rn.seed(int(time.time()))
 239 | 
 240 | 
 241 |     def fit(self, X: (np.ndarray), n_jobs: int = 4):
 242 |         &#34;&#34;&#34;
 243 |         Fits the model to the training data.
 244 |         Given a 2D matrix of observations, create an ensemble of DiFF trees
 245 |         and store their roots in a list: self.trees.  X is expected to be an
 246 |         ndarray; no DataFrame conversion is performed here.
 247 |         Trees are built in parallel with a multiprocessing Pool.
 248 |         
 249 |         Parameters
 250 |         ----------
 251 |         X: nD array. 
 252 |             nD array with the train instances. Dimensions should be (n_obs, n_features).  
 253 |         n_jobs: int
 254 |             number of parallel jobs that will be launched
 255 |         Returns
 256 |         -------
 257 |             the object itself.
 258 |         &#34;&#34;&#34;
 259 |         self.X = X
 260 |         self.path_normFactor = np.sqrt(len(X))
 261 | 
 262 |         self.sample_size = min(self.sample_size, len(X))
 263 | 
 264 |         limit_height = 1.0*np.ceil(np.log2(self.sample_size))
 265 | 
 266 |         featureDistrib = []
 267 |         nbins = int(len(X)/8)+2
 268 |         for i in range(np.shape(X)[1]):
 269 |             featureDistrib.append(weightFeature(X[:, i], nbins))
 270 |         featureDistrib = np.array(featureDistrib)
 271 |         featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
 272 |         self.featureDistrib = featureDistrib
 273 | 
 274 |         create_tree_partial = partial(create_tree,
 275 |                                       featureDistrib=self.featureDistrib,
 276 |                                       sample_size=self.sample_size,
 277 |                                       max_height=limit_height)
 278 | 
 279 |         with Pool(n_jobs) as p:
 280 |             self.trees = p.map(create_tree_partial,
 281 |                                [X for _ in range(self.n_trees)]
 282 |                                )
 283 |         return self
 284 | 
 285 | 
 286 |     def walk(self, X: np.ndarray) -&gt; None:
 287 |         &#34;&#34;&#34;
 288 |         Given a 2D matrix of observations, X, walk every tree in self.trees
 289 |         and fill self.LD, self.LF and self.LDF with the distance, frequency
 290 |         and collective anomaly scores of the instances in X.  Each of these
 291 |         arrays has shape (len(X), n_trees): one row per instance and one
 292 |         column per tree.  Nothing is returned.
 293 |         
 294 |         Parameters
 295 |         ----------
 296 |         X: nD array. 
 297 |             nD array with the instances to be tested. Dimensions should be (n_obs, n_features).   
 298 |         Returns
 299 |         -------
 300 |             None
 301 |         &#34;&#34;&#34;
 302 | 
 303 |         self.L = np.zeros((len(X), self.n_trees))
 304 |         self.LD = np.zeros((len(X), self.n_trees))
 305 |         self.LF = np.zeros((len(X), self.n_trees))
 306 |         self.LDF = np.zeros((len(X), self.n_trees))
 307 | 
 308 |         for treeIdx, itree in enumerate(self.trees):
 309 |             obsIdx = np.ones(len(X)).astype(bool)
 310 |             walk_tree(self, itree, treeIdx, obsIdx, X, self.featureDistrib, alpha=self.alpha)
 311 | 
 312 | 
 313 |     def anomaly_score(self, X: np.ndarray, alpha=1) -&gt; tuple:
 314 |         &#34;&#34;&#34;
 315 |         Given a 2D matrix of observations, X, compute the anomaly scores
 316 |         for the instances in X, returning three 1D arrays of anomaly scores.
 317 |         
 318 |         Parameters
 319 |         ----------
 320 |         X: nD array. 
 321 |             nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
 322 |         alpha: float
 323 |             scaling distance hyper-parameter.
 324 |         Returns
 325 |         -------
 326 |         scD, scF, scDF: 1D arrays
 327 |             respectively the distance scores (point-wise anomaly scores), the frequency-of-visit scores and the collective anomaly scores
 328 |         &#34;&#34;&#34;
 329 |         self.XtestSize = len(X)
 330 |         self.alpha = alpha
 331 | 
 332 |         # Walk the trees to fill self.LD, self.LF and self.LDF.
 333 |         self.walk(X)
 334 | 
 335 |         # Average the per-tree scores over the trees (axis 1)
 336 |         if self.sample_size &gt; 2:
 337 |             scD = -self.LD.mean(1)
 338 |         elif self.sample_size == 2:
 339 |             scD = -self.LD.mean(1)
 340 |         else:
 341 |             scD = 0
 342 | 
 343 |         scF = self.LF.mean(1)
 344 |         scDF = -self.LDF.mean(1)
 345 |         return scD, scF, scDF
 346 |     
 347 | 
 348 |     def predict_from_anomaly_scores(self, scores: np.ndarray, threshold: float) -&gt; np.ndarray:
 349 |         &#34;&#34;&#34;
 350 |         Given an array of scores and a score threshold, return an array of
 351 |         the predictions: 1 for any score &gt;= the threshold and 0 otherwise.
 352 |         
 353 |         Parameters
 354 |         ----------
 355 |         scores: 1D array. 
 356 |             1D array of scores. Dimensions should be (n_obs,).
 357 |         threshold: float
 358 |             Threshold for considering an observation an anomaly: the higher the threshold, the fewer the anomalies.
 359 |         Returns
 360 |         -------
 361 |         1D array
 362 |             The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
 363 | 
 364 |         :param scores: 1D array. Scores produced by the random forest.
 365 |         :param threshold: Threshold for considering an observation an anomaly: the higher the threshold, the fewer the anomalies.
 366 |         :return: Return predictions
 367 |         &#34;&#34;&#34;
 368 |         out = scores &gt;= threshold
 369 |         return out*1
 370 |     
 371 | 
 372 |     def predict(self, X: np.ndarray, threshold: float) -&gt; np.ndarray:
 373 |         &#34;&#34;&#34;
 374 |         A shorthand for calling anomaly_score() and predict_from_anomaly_scores().
 375 |         
 376 |         Parameters
 377 |         ----------
 378 |         X: nD array. 
 379 |             nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
 380 |         threshold: float
 381 |             Threshold for considering an observation an anomaly: the higher the threshold, the fewer the anomalies.
 382 |         Returns
 383 |         -------
 384 |         1D array
 385 |             The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
 386 |         &#34;&#34;&#34;
 387 | 
 388 |         _, _, scDF = self.anomaly_score(X)  # returns three score arrays; thresholding the collective score is an assumption
 389 |         return self.predict_from_anomaly_scores(scDF, threshold)
 390 | 
 391 | 
 392 | class DiFF_Tree:
 393 |     &#39;&#39;&#39;
 394 |     Construct a tree via randomized splits with maximum height height_limit.
 395 |     &#39;&#39;&#39;
 396 |     def __init__(self, height_limit):
 397 |         &#39;&#39;&#39;
 398 |         Parameters
 399 |         ----------
 400 |         height_limit: int
 401 |             Maximum height of the tree.
 402 |         Returns
 403 |         -------
 404 |         None
 405 |         &#39;&#39;&#39;
 406 |         self.height_limit = height_limit
 407 | 
 408 |     def fit(self, X: np.ndarray, featureDistrib: np.array):
 409 |         &#34;&#34;&#34;
 410 |         Given a 2D matrix of observations, create a DiFF tree. Set field
 411 |         self.root to the root of that tree and return it.
 412 |         
 413 |         Parameters
 414 |         ----------
 415 |         X: nD array. 
 416 |             nD array with the observations. Dimensions should be (n_obs, n_features).        
 417 |         featureDistrib: 1D array
 418 |             The distribution weight assigned to each dimension
 419 |         Returns
 420 |         -------
 421 |         The root of the DiFF tree.
 422 |         &#34;&#34;&#34;
 423 |         self.root = InNode(X, self.height_limit, featureDistrib, len(X), 0)
 424 | 
 425 |         return self.root
 426 | 
 427 | 
 428 | class InNode:
 429 |     &#39;&#39;&#39;
 430 |     Node of the tree that is not a leaf node.
 431 |     The functionality of the class is:
 432 |     - Draw a random split: a dimension chosen with probabilities
 433 |         featureDistrib and a split value drawn uniformly in its range.
 434 |     - Partition the observations according to the
 435 |     split and send them along to two different child nodes.
 436 |     Partitioning whole arrays usually has a higher complexity than handling each point separately,
 437 |     but because it relies on NumPy it is more efficient time-wise.
 438 |     &#39;&#39;&#39;
 439 |     def __init__(self, X, height_limit, featureDistrib, sample_size, current_height):
 440 |         &#39;&#39;&#39;
 441 |         Parameters
 442 |         ----------
 443 |         X: nD array. 
 444 |             nD array with the training instances that have reached the node.
 445 |         height_limit: int
 446 |             Maximum height of the tree.
 447 |         featureDistrib: 1D array. 
 448 |             distribution used at parent level to randomly select a dimension (feature). 
 449 |         sample_size: int
 450 |             Size of the sample used to build the tree.
 451 |         current_height: int
 452 |             Current height of the tree.
 453 |         Returns
 454 |         -------
 455 |             None
 456 |         &#39;&#39;&#39;
 457 | 
 458 |         self.size = len(X)
 459 |         self.height = current_height+1
 460 |         n_obs, n_features = X.shape
 461 |         next_height = current_height + 1
 462 |         limit_not_reached = height_limit &gt; next_height
 463 | 
 464 |         if len(X) &gt; 32:
 465 |             featureDistrib = []
 466 |             nbins = int(len(X)/8)+2
 467 |             for i in range(np.shape(X)[1]):
 468 |                 featureDistrib.append(weightFeature(X[:, i], nbins))
 469 |             featureDistrib = np.array(featureDistrib)
 470 |             featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
 471 | 
 472 |         self.featureDistrib = featureDistrib
 473 | 
 474 |         cols = np.arange(np.shape(X)[1], dtype=&#39;int&#39;)
 475 | 
 476 |         self.splitAtt = rn.choices(cols, weights=featureDistrib)[0]
 477 |         splittingCol = X[:, self.splitAtt]
 478 |         self.splitValue = getSplit(splittingCol)
 479 |         idx = splittingCol &lt;= self.splitValue
 480 | 
 483 |         X_aux = X[idx, :]
 484 | 
 485 |         self.left = (InNode(X_aux, height_limit, featureDistrib, sample_size, next_height)
 486 |                      if limit_not_reached and X_aux.shape[0] &gt; 5 and (np.any(X_aux.max(0) != X_aux.min(0))) else LeafNode(
 487 |                          X_aux, next_height, X, sample_size))
 488 | 
 489 |         idx = np.invert(idx)
 490 |         X_aux = X[idx, :]
 491 |         self.right = (InNode(X_aux, height_limit, featureDistrib, sample_size, next_height)
 492 |                       if limit_not_reached and X_aux.shape[0] &gt; 5 and (np.any(X_aux.max(0) != X_aux.min(0))) else LeafNode(
 493 |                           X_aux, next_height, X, sample_size))
 494 | 
 495 |         self.n_nodes = 1 + self.left.n_nodes + self.right.n_nodes
 496 | 
 497 | 
 498 | class LeafNode:
 499 |     &#39;&#39;&#39;
 500 |     Leaf node
 501 |     The base functionality is storing the mean and standard deviation of the observations in that node.
 502 |     We also evaluate the frequency of visit for training data.
 503 |     &#39;&#39;&#39;
 504 |     def __init__(self, X, height, Xp, sample_size):
 505 |         &#39;&#39;&#39;
 506 |         Parameters
 507 |         ----------
 508 |         X: nD array. 
 509 |             nD array with the training instances falling into the leaf node.    
 510 |         height: int
 511 |             Current height of the tree.
 512 |         Xp: nD array. 
 513 |             nD array with the training instances falling into the parent node.    
 514 |         sample_size: int
 515 |             Size of the sample used to build the tree.
 516 |         Returns
 517 |         -------
 518 |             None
 519 |         &#39;&#39;&#39;
 520 |         self.height = height+1
 521 |         self.size = len(X)
 522 |         self.n_nodes = 1
 523 |         self.freq = self.size/sample_size
 524 |         self.freqs = 0
 525 | 
 526 |         if len(X) != 0:
 527 |             self.M = np.mean(X, axis=0)
 528 |             if len(X) &gt; 10:
 529 |                 self.Mstd = np.std(X, axis=0)
 530 |                 self.Mstd[self.Mstd == 0] = 1e-2
 531 |             else:
 532 |                 self.Mstd = np.ones(np.shape(X)[1])
 533 |         else:
 534 |             self.M = np.mean(Xp, axis=0)
 535 |             if len(Xp) &gt; 10:
 536 |                 self.Mstd = np.std(Xp, axis=0)
 537 |                 self.Mstd[self.Mstd == 0] = 1e-2
 538 |             else:
 539 |                 self.Mstd = np.ones(np.shape(X)[1])</code></pre>
 540 | </details>
 541 | </section>
 542 | <section>
 543 | </section>
 544 | <section>
 545 | </section>
 546 | <section>
 547 | <h2 class="section-title" id="header-functions">Functions</h2>
 548 | <dl>
 549 | <dt id="DiFF_RF.EE"><code class="name flex">
 550 | <span>def <span class="ident">EE</span></span>(<span>hist)</span>
 551 | </code></dt>
 552 | <dd>
 553 | <div class="desc"><p>given a list of positive values as a histogram drawn from any information source,
 554 | returns the empirical entropy of its discrete probability function.</p>
 555 | <h2 id="parameters">Parameters</h2>
 556 | <dl>
 557 | <dt><strong><code>hist</code></strong> :&ensp;<code>array </code></dt>
 558 | <dd>histogram</dd>
 559 | </dl>
 560 | <h2 id="returns">Returns</h2>
 561 | <dl>
 562 | <dt><code>float</code></dt>
 563 | <dd>empirical entropy estimated from the histogram</dd>
 564 | </dl></div>
 565 | <details class="source">
 566 | <summary>
 567 | <span>Expand source code</span>
 568 | </summary>
 569 | <pre><code class="python">def EE(hist):
 570 |     &#34;&#34;&#34;
 571 |     given a list of positive values as a histogram drawn from any information source,
 572 |     returns the empirical entropy of its discrete probability function.
 573 |     
 574 |     Parameters
 575 |     ----------
 576 |     hist: array 
 577 |         histogram
 578 |     Returns
 579 |     -------
 580 |     float
 581 |         empirical entropy estimated from the histogram
 582 | 
 583 |     &#34;&#34;&#34;
 584 |     h = np.asarray(hist, dtype=np.float64)
 585 |     if h.sum() &lt;= 0 or (h &lt; 0).any():
 586 |         return 0
 587 |     h = h/h.sum()
 588 |     return -(h*np.ma.log2(h)).sum()</code></pre>
 589 | </details>
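<p>As a worked illustration of the entropy computed above, the following sketch re-implements the same steps in a standalone helper. The name <code>empirical_entropy</code> and the two example histograms are assumptions made for this example, not part of the DiFF_RF module.</p>

```python
import numpy as np

def empirical_entropy(hist):
    # Hypothetical standalone helper that follows the same steps as EE.
    h = np.asarray(hist, dtype=np.float64)
    if h.sum() <= 0 or (h < 0).any():
        return 0.0
    p = h / h.sum()                      # normalize counts to probabilities
    # np.ma.log2 masks the zero entries, so empty bins contribute nothing
    return float(-(p * np.ma.log2(p)).sum())

uniform = empirical_entropy([5, 5, 5, 5])    # four equally likely bins
peaked = empirical_entropy([20, 0, 0, 0])    # all mass in one bin
print(uniform, peaked)
```

<p>A uniform histogram over four bins yields log2(4) = 2 bits, while a histogram concentrated in a single bin yields zero entropy.</p>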
 590 | </dd>
 591 | <dt id="DiFF_RF.create_tree"><code class="name flex">
 592 | <span>def <span class="ident">create_tree</span></span>(<span>X, featureDistrib, sample_size, max_height)</span>
 593 | </code></dt>
 594 | <dd>
 595 | <div class="desc"><p>Creates a DiFF tree using a sample of size sample_size drawn without replacement from the original data.</p>
 596 | <h2 id="parameters">Parameters</h2>
 597 | <dl>
 598 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
 599 | <dd>nD array with the observations. Dimensions should be (n_obs, n_features).</dd>
 600 | <dt><strong><code>sample_size</code></strong> :&ensp;<code>int</code></dt>
 601 | <dd>Size of the sample from which a DiFF tree is built.</dd>
 602 | <dt><strong><code>max_height</code></strong> :&ensp;<code>int</code></dt>
 603 | <dd>Maximum height of the tree.</dd>
 604 | </dl>
 605 | <h2 id="returns">Returns</h2>
 606 | <dl>
 607 | <dt><code>a DiFF tree</code></dt>
 608 | <dd>&nbsp;</dd>
 609 | </dl></div>
 610 | <details class="source">
 611 | <summary>
 612 | <span>Expand source code</span>
 613 | </summary>
 614 | <pre><code class="python">def create_tree(X, featureDistrib, sample_size, max_height):
 615 |     &#39;&#39;&#39;
 616 |     Creates a DiFF tree using a sample of size sample_size drawn without replacement from the original data.
 617 |         
 618 |     Parameters
 619 |     ----------
 620 |     X: nD array. 
 621 |         nD array with the observations. Dimensions should be (n_obs, n_features).
 622 |     sample_size: int
 623 |         Size of the sample from which a DiFF tree is built.
 624 |     max_height: int
 625 |         Maximum height of the tree.
 626 |     Returns
 627 |     -------
 628 |     a DiFF tree
 629 |     &#39;&#39;&#39;
 630 |     rows = np.random.choice(len(X), sample_size, replace=False)
 631 |     featureDistrib = np.array(featureDistrib)
 632 |     return DiFF_Tree(max_height).fit(X[rows, :], featureDistrib)</code></pre>
 633 | </details>
 634 | </dd>
 635 | <dt id="DiFF_RF.getSplit"><code class="name flex">
 636 | <span>def <span class="ident">getSplit</span></span>(<span>X)</span>
 637 | </code></dt>
 638 | <dd>
 639 | <div class="desc"><p>Randomly draws a split value, uniformly between the minimum and maximum of the scalar data 'X'.
 640 | Returns the split value.</p>
 641 | <h2 id="parameters">Parameters</h2>
 642 | <dl>
 643 | <dt><strong><code>X</code></strong> :&ensp;<code>array </code></dt>
 644 | <dd>Array of scalar values</dd>
 645 | </dl>
 646 | <h2 id="returns">Returns</h2>
 647 | <dl>
 648 | <dt><code>float</code></dt>
 649 | <dd>split value</dd>
 650 | </dl></div>
 651 | <details class="source">
 652 | <summary>
 653 | <span>Expand source code</span>
 654 | </summary>
 655 | <pre><code class="python">def getSplit(X):
 656 |     &#34;&#34;&#34;
 657 |     Randomly draws a split value, uniformly between the minimum and maximum of the scalar data &#39;X&#39;.
 658 |     Returns the split value.
 659 |     
 660 |     Parameters
 661 |     ----------
 662 |     X : array 
 663 |         Array of scalar values
 664 |     Returns
 665 |     -------
 666 |     float
 667 |         split value
 668 |     &#34;&#34;&#34;
 669 |     xmin = X.min()
 670 |     xmax = X.max()
 671 |     return np.random.uniform(xmin, xmax)</code></pre>
 672 | </details>
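<p>The sketch below shows how such a split value partitions one feature column into the two sides used by an internal node. It uses a seeded <code>np.random.default_rng</code> generator for reproducibility, which is an assumption of this example (the module itself calls <code>np.random.uniform</code>); the column values are made up.</p>

```python
import numpy as np

rng = np.random.default_rng(0)               # seeded generator: an assumption for reproducibility
col = np.array([0.2, 1.5, 3.1, 4.8, 9.0])    # made-up values of one feature

# Draw the split uniformly between the column minimum and maximum.
split = rng.uniform(col.min(), col.max())

left = col[col <= split]    # instances sent to the left child
right = col[col > split]    # instances sent to the right child
print(split, left, right)
```

<p>Because the split lies between the minimum and the maximum, the left side always contains at least the minimum value.</p>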
 673 | </dd>
 674 | <dt id="DiFF_RF.similarityScore"><code class="name flex">
 675 | <span>def <span class="ident">similarityScore</span></span>(<span>S, node, alpha)</span>
 676 | </code></dt>
 677 | <dd>
 678 | <div class="desc"><p>Given a set of instances S falling into node and a value alpha &gt;=0,
 679 | returns for each element x in S the weighted similarity score between x
 680 | and the centroid M of the training instances stored in the node (node.M)</p>
 681 | <h2 id="parameters">Parameters</h2>
 682 | <dl>
 683 | <dt><strong><code>S</code></strong> :&ensp;<code>array of instances</code></dt>
 684 | <dd>Array of instances that fall into a node</dd>
 687 | <dt><strong><code>node</code></strong> :&ensp;<code>a DiFF tree node</code></dt>
 688 | <dd>S is the set of instances "falling" into the node</dd>
 689 | <dt><strong><code>alpha</code></strong> :&ensp;<code>float</code></dt>
 690 | <dd>alpha is the distance scaling hyper-parameter</dd>
 691 | </dl>
 692 | <h2 id="returns">Returns</h2>
 693 | <dl>
 694 | <dt><code>array</code></dt>
 695 | <dd>the array of similarity values between the instances in S and the mean of training instances falling in node</dd>
 696 | </dl></div>
 697 | <details class="source">
 698 | <summary>
 699 | <span>Expand source code</span>
 700 | </summary>
 701 | <pre><code class="python">def similarityScore(S, node, alpha):
 702 |     &#34;&#34;&#34;
 703 |     Given a set of instances S falling into node and a value alpha &gt;=0,
 704 |     returns for each element x in S the weighted similarity score between x
 705 |     and the centroid M of the training instances stored in the node (node.M)
 706 |     
 707 |     Parameters
 708 |     ----------
 709 |     S : array of instances
 710 |         Array of instances that fall into a node
 711 |     node: a DiFF tree node
 712 |         S is the set of instances &#34;falling&#34; into the node
 713 |     alpha: float
 714 |         alpha is the distance scaling hyper-parameter
 715 |     Returns
 716 |     -------
 717 |     array
 718 |         the array of similarity values between the instances in S and the mean of training instances falling in node
 719 | 
 720 |     &#34;&#34;&#34;
 721 |     d = np.shape(S)[1]  # number of features
 722 |     if len(S) &gt; 0:
 724 |         U = (S-node.M)/node.Mstd  # standardize with the node mean and standard deviation
 725 |         U = (2)**(-alpha*(np.sum(U*U/d, axis=1)))  # 2^(-alpha * mean squared standardized distance)
 726 |     else:
 727 |         U = 0
 728 | 
 729 |     return U</code></pre>
 730 | </details>
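<p>A small numeric sketch of the score above, 2^(-alpha * mean over features of ((x - M)/Mstd)^2). The <code>SimpleNamespace</code> object is a hypothetical stand-in for a leaf node exposing <code>M</code> and <code>Mstd</code>; all values are made up for illustration.</p>

```python
import numpy as np
from types import SimpleNamespace

# Hypothetical stand-in for a leaf node: only the M and Mstd attributes are needed.
node = SimpleNamespace(M=np.array([0.0, 0.0]), Mstd=np.array([1.0, 2.0]))

S = np.array([[0.0, 0.0],    # exactly at the centroid
              [1.0, 2.0],    # one standard deviation away in each feature
              [3.0, 6.0]])   # three standard deviations away in each feature
alpha = 1.0

d = S.shape[1]
U = (S - node.M) / node.Mstd                          # standardized offsets
scores = 2.0 ** (-alpha * np.sum(U * U / d, axis=1))  # similarity in (0, 1]
print(scores)
```

<p>The similarity is 1 at the centroid and halves each time the mean squared standardized distance grows by 1/alpha.</p>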
 731 | </dd>
 732 | <dt id="DiFF_RF.walk_tree"><code class="name flex">
 733 | <span>def <span class="ident">walk_tree</span></span>(<span>forest, node, treeIdx, obsIdx, X, featureDistrib, depth=0, alpha=0.01)</span>
 734 | </code></dt>
 735 | <dd>
 736 | <div class="desc"><p>Recursive function that walks a tree from an already fitted forest and stores, for the new
 737 | observations, the per-tree distance, frequency and collective scores in forest.LD, forest.LF and forest.LDF.</p>
 738 | <h2 id="parameters">Parameters</h2>
 739 | <dl>
 740 | <dt><strong><code>forest</code></strong> :&ensp;<code><a title="DiFF_RF" href="#DiFF_RF">DiFF_RF</a> </code></dt>
 741 | <dd>A fitted forest of DiFF trees</dd>
 742 | <dt><strong><code>node</code></strong> :&ensp;<code>DiFF Tree node</code></dt>
 743 | <dd>the current node</dd>
 744 | <dt><strong><code>treeIdx</code></strong> :&ensp;<code>int</code></dt>
 745 | <dd>index of the tree that is being walked.</dd>
 746 | <dt><strong><code>obsIdx</code></strong> :&ensp;<code>array</code></dt>
 747 | <dd>1D array of length n_obs. 1/0 if the obs has reached / has not reached the node.</dd>
 748 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
 749 | <dd>array of observations/instances.</dd>
 750 | <dt><strong><code>depth</code></strong> :&ensp;<code>int</code></dt>
 751 | <dd>current depth.</dd>
 752 | </dl>
 753 | <h2 id="returns">Returns</h2>
 754 | <dl>
 755 | <dt><code>None</code></dt>
 756 | <dd>&nbsp;</dd>
 757 | </dl></div>
 758 | <details class="source">
 759 | <summary>
 760 | <span>Expand source code</span>
 761 | </summary>
 762 | <pre><code class="python">def walk_tree(forest, node, treeIdx, obsIdx, X, featureDistrib, depth=0, alpha=1e-2):
 763 |     &#39;&#39;&#39;
 764 |     Recursive function that walks a tree from an already fitted forest and stores, for the new
 765 |     observations, the per-tree distance, frequency and collective scores in forest.LD, forest.LF and forest.LDF.
 766 |     
 767 |     Parameters
 768 |     ----------
 769 |     forest : DiFF_RF 
 770 |         A fitted forest of DiFF trees
 771 |     node: DiFF Tree node
 772 |         the current node
 773 |     treeIdx: int
 774 |         index of the tree that is being walked.
 775 |     obsIdx: array
 776 |         1D array of length n_obs. 1/0 if the obs has reached / has not reached the node.
 777 |     X: nD array. 
 778 |         array of observations/instances.
 779 |     depth: int
 780 |         current depth.
 781 |     Returns
 782 |     -------
 783 |     None
 784 |     &#39;&#39;&#39;
 785 | 
 786 |     if isinstance(node, LeafNode):
 787 |         Xnode = X[obsIdx]
 788 |         f = ((node.size+1)/forest.sample_size) / ((1+len(Xnode))/forest.XtestSize)
 789 |         if alpha == 0:
 790 |             forest.LD[obsIdx, treeIdx] = 0
 791 |             forest.LF[obsIdx, treeIdx] = -f
 792 |             forest.LDF[obsIdx, treeIdx] = -f
 793 |         else:
 794 |             z = similarityScore(Xnode, node, alpha)
 795 |             forest.LD[obsIdx, treeIdx] = z
 796 |             forest.LF[obsIdx, treeIdx] = -f
 797 |             forest.LDF[obsIdx, treeIdx] = z*f
 798 | 
 799 |     else:
 800 | 
 801 |         idx = (X[:, node.splitAtt] &lt;= node.splitValue) * obsIdx
 802 |         walk_tree(forest, node.left, treeIdx, idx, X, featureDistrib, depth + 1, alpha=alpha)
 803 | 
 804 |         idx = (X[:, node.splitAtt] &gt; node.splitValue) * obsIdx
 805 |         walk_tree(forest, node.right, treeIdx, idx, X, featureDistrib, depth + 1, alpha=alpha)</code></pre>
 806 | </details>
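<p>The frequency term <code>f</code> computed at a leaf compares how often the leaf was visited during training with how often it is visited at test time, with add-one smoothing on both counts. The sketch below isolates that arithmetic; the function name <code>visit_frequency_ratio</code> and the numbers are assumptions made for this example.</p>

```python
def visit_frequency_ratio(node_size, sample_size, n_test_in_leaf, test_size):
    """Smoothed training visit frequency divided by smoothed test visit frequency."""
    train_freq = (node_size + 1) / sample_size
    test_freq = (1 + n_test_in_leaf) / test_size
    return train_freq / test_freq

# Leaf visited in the same proportion at training and test time: (8/256) / (4/128)
balanced = visit_frequency_ratio(7, 256, 3, 128)
# Leaf visited far more often at test time than during training: (8/256) / (41/128)
crowded = visit_frequency_ratio(7, 256, 40, 128)
print(balanced, crowded)
```

<p>A ratio well below 1 indicates a leaf that receives many more test instances than the training sample suggested.</p>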
 807 | </dd>
 808 | <dt id="DiFF_RF.weightFeature"><code class="name flex">
 809 | <span>def <span class="ident">weightFeature</span></span>(<span>s, nbins)</span>
 810 | </code></dt>
 811 | <dd>
 812 | <div class="desc"><p>Given a list of values corresponding to a feature dimension, returns a weight (in [0,1]) that is
 813 | one minus the normalized empirical entropy, a way to characterize the importance of the feature dimension. </p>
 814 | <h2 id="parameters">Parameters</h2>
 815 | <dl>
 816 | <dt><strong><code>s</code></strong> :&ensp;<code>array </code></dt>
 817 | <dd>list of scalar values corresponding to a feature dimension</dd>
 818 | <dt><strong><code>nbins</code></strong> :&ensp;<code>int</code></dt>
 819 | <dd>the number of bins used to discretize the feature dimension using a histogram.</dd>
 820 | </dl>
 821 | <h2 id="returns">Returns</h2>
 822 | <dl>
 823 | <dt><code>float</code></dt>
 824 | <dd>the importance weight for feature s.</dd>
 825 | </dl></div>
 826 | <details class="source">
 827 | <summary>
 828 | <span>Expand source code</span>
 829 | </summary>
 830 | <pre><code class="python">def weightFeature(s, nbins):
 831 |     &#39;&#39;&#39;
 832 |     Given a list of values corresponding to a feature dimension, returns a weight (in [0,1]) that is 
 833 |     one minus the normalized empirical entropy, a way to characterize the importance of the feature dimension. 
 834 |     
 835 |     Parameters
 836 |     ----------
 837 |     s: array 
 838 |         list of scalar values corresponding to a feature dimension
 839 |     nbins: int
 840 |         the number of bins used to discretize the feature dimension using a histogram.
 841 |     Returns
 842 |     -------
 843 |     float
 844 |         the importance weight for feature s.
 845 |     &#39;&#39;&#39;
 846 |     if s.min() == s.max():
 847 |         return 0
 848 |     hist = np.histogram(s, bins=nbins, density=True)
 849 |     ent = EE(hist[0])
 850 |     ent = ent/np.log2(nbins)
 851 |     return 1-ent</code></pre>
 852 | </details>
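<p>To make the weighting concrete, the sketch below applies the same steps (histogram, empirical entropy, normalization by log2(nbins)) to three made-up features. The helper names <code>entropy_bits</code> and <code>feature_weight</code> are hypothetical and only mirror the logic shown above.</p>

```python
import numpy as np

def entropy_bits(hist):
    # Hypothetical helper mirroring EE: entropy (in bits) of a histogram.
    h = np.asarray(hist, dtype=np.float64)
    if h.sum() <= 0 or (h < 0).any():
        return 0.0
    p = h / h.sum()
    return float(-(p * np.ma.log2(p)).sum())

def feature_weight(s, nbins):
    # Hypothetical helper mirroring weightFeature: 1 - normalized entropy.
    s = np.asarray(s, dtype=np.float64)
    if s.min() == s.max():
        return 0.0                      # constant feature carries no weight
    counts, _ = np.histogram(s, bins=nbins, density=True)
    return 1.0 - entropy_bits(counts) / np.log2(nbins)

w_flat = feature_weight(np.arange(16), nbins=4)           # evenly spread values
w_peaked = feature_weight([0.0] * 15 + [1.0], nbins=4)    # one isolated value
w_constant = feature_weight(np.ones(16), nbins=4)         # no spread at all
print(w_flat, w_peaked, w_constant)
```

<p>An evenly spread feature gets a weight close to 0, whereas a feature whose values pile up in one bin with a few stragglers gets a weight close to 1, so it is selected more often for splitting.</p>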
 853 | </dd>
 854 | </dl>
 855 | </section>
 856 | <section>
 857 | <h2 class="section-title" id="header-classes">Classes</h2>
 858 | <dl>
 859 | <dt id="DiFF_RF.DiFF_Tree"><code class="flex name class">
 860 | <span>class <span class="ident">DiFF_Tree</span></span>
 861 | <span>(</span><span>height_limit)</span>
 862 | </code></dt>
 863 | <dd>
 864 | <div class="desc"><p>Construct a tree via randomized splits with maximum height height_limit.</p>
 865 | <h2 id="parameters">Parameters</h2>
 866 | <dl>
 867 | <dt><strong><code>height_limit</code></strong> :&ensp;<code>int</code></dt>
 868 | <dd>Maximum height of the tree.</dd>
 869 | </dl>
 870 | <h2 id="returns">Returns</h2>
 871 | <dl>
 872 | <dt><code>None</code></dt>
 873 | <dd>&nbsp;</dd>
 874 | </dl></div>
 875 | <details class="source">
 876 | <summary>
 877 | <span>Expand source code</span>
 878 | </summary>
 879 | <pre><code class="python">class DiFF_Tree:
 880 |     &#39;&#39;&#39;
 881 |     Construct a tree via randomized splits with maximum height height_limit.
 882 |     &#39;&#39;&#39;
 883 |     def __init__(self, height_limit):
 884 |         &#39;&#39;&#39;
 885 |         Parameters
 886 |         ----------
 887 |         height_limit: int
 888 |             Maximum height of the tree.
 889 |         Returns
 890 |         -------
 891 |         None
 892 |         &#39;&#39;&#39;
 893 |         self.height_limit = height_limit
 894 | 
 895 |     def fit(self, X: np.ndarray, featureDistrib: np.ndarray):
 896 |         &#34;&#34;&#34;
 897 |         Given a 2D matrix of observations, create a DiFF tree. Set field
 898 |         self.root to the root of that tree and return it.
 899 |         
 900 |         Parameters
 901 |         ----------
 902 |         X: nD array. 
 903 |             nD array with the observations. Dimensions should be (n_obs, n_features).        
 904 |         featureDistrib: 1D array
 905 |             The distribution weight assigned to each dimension
 906 |         Returns
 907 |         -------
 908 |         A DIFF tree root.
 909 |         &#34;&#34;&#34;
 910 |         self.root = InNode(X, self.height_limit, featureDistrib, len(X), 0)
 911 | 
 912 |         return self.root</code></pre>
 913 | </details>
 914 | <h3>Methods</h3>
 915 | <dl>
 916 | <dt id="DiFF_RF.DiFF_Tree.fit"><code class="name flex">
 917 | <span>def <span class="ident">fit</span></span>(<span>self, X: numpy.ndarray, featureDistrib: numpy.ndarray)</span>
 918 | </code></dt>
 919 | <dd>
 920 | <div class="desc"><p>Given a 2D matrix of observations, create a DiFF tree. Set field
 921 | self.root to the root of that tree and return it.</p>
 922 | <h2 id="parameters">Parameters</h2>
 923 | <dl>
 924 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
 925 | <dd>nD array with the observations. Dimensions should be (n_obs, n_features).</dd>
 926 | <dt><strong><code>featureDistrib</code></strong> :&ensp;<code>1D array</code></dt>
 927 | <dd>The distribution weight assigned to each dimension</dd>
 928 | </dl>
 929 | <h2 id="returns">Returns</h2>
 930 | <p>A DIFF tree root.</p></div>
 931 | <details class="source">
 932 | <summary>
 933 | <span>Expand source code</span>
 934 | </summary>
 935 | <pre><code class="python">def fit(self, X: np.ndarray, featureDistrib: np.ndarray):
 936 |     &#34;&#34;&#34;
 937 |     Given a 2D matrix of observations, create a DiFF tree. Set field
 938 |     self.root to the root of that tree and return it.
 939 |     
 940 |     Parameters
 941 |     ----------
 942 |     X: nD array. 
 943 |         nD array with the observations. Dimensions should be (n_obs, n_features).        
 944 |     featureDistrib: 1D array
 945 |         The distribution weight assigned to each dimension
 946 |     Returns
 947 |     -------
 948 |     A DIFF tree root.
 949 |     &#34;&#34;&#34;
 950 |     self.root = InNode(X, self.height_limit, featureDistrib, len(X), 0)
 951 | 
 952 |     return self.root</code></pre>
 953 | </details>
 954 | </dd>
 955 | </dl>
 956 | </dd>
 957 | <dt id="DiFF_RF.DiFF_TreeEnsemble"><code class="flex name class">
 958 | <span>class <span class="ident">DiFF_TreeEnsemble</span></span>
 959 | <span>(</span><span>sample_size: int, n_trees: int = 10)</span>
 960 | </code></dt>
 961 | <dd>
 962 | <div class="desc"><p>DiFF Forest.
 963 | Even though all the methods are public, the main functionality of the class is given by
 964 | <code>__init__</code>, <code>fit</code> and <code>predict</code>.</p>
 967 | <p>Creates the DiFF-RF object.</p>
 968 | <h2 id="parameters">Parameters</h2>
 969 | <dl>
 970 | <dt><strong><code>sample_size</code></strong> :&ensp;<code>int. </code></dt>
 971 | <dd>size of the sample randomly drawn from the train instances to build each DiFF tree.</dd>
 972 | <dt><strong><code>n_trees</code></strong> :&ensp;<code>int</code></dt>
 973 | <dd>The number of trees in the forest</dd>
 974 | </dl>
 975 | <h2 id="returns">Returns</h2>
 976 | <pre><code>None
 977 | </code></pre></div>
 978 | <details class="source">
 979 | <summary>
 980 | <span>Expand source code</span>
 981 | </summary>
 982 | <pre><code class="python">class DiFF_TreeEnsemble:
 983 |     &#39;&#39;&#39;
 984 |     DiFF Forest.
 985 |     Even though all the methods are public, the main functionality of the class is given by:
 986 |     - __init__
 987 |     - fit
 988 |     - predict
 989 |     &#39;&#39;&#39;
 990 |     def __init__(self, sample_size: int, n_trees: int = 10):
 991 |         &#39;&#39;&#39;
 992 |         Creates the DiFF-RF object.
 993 |         
 994 |         Parameters
 995 |         ----------
 996 |         sample_size: int. 
 997 |             size of the sample randomly drawn from the train instances to build each DiFF tree.  
 998 |         n_trees: int
 999 |             The number of trees in the forest
1000 |         Returns
1001 |         -------
1002 |             None
1003 |         &#39;&#39;&#39;
1004 | 
1005 |         self.sample_size = sample_size
1006 |         self.n_trees = n_trees
1007 |         self.alpha=1.0
1008 |         np.random.seed(int(time.time()))
1009 |         rn.seed(int(time.time()))
1010 | 
1011 | 
1012 |     def fit(self, X: (np.ndarray), n_jobs: int = 4):
1013 |         &#34;&#34;&#34;
1014 |         Fits the model to the training data.
1015 |         Given a 2D matrix of observations, create an ensemble of DiFF trees
1016 |         and store their root nodes in a list: self.trees.  X is expected to be
1017 |         an ndarray.
1018 |         Uses parallel computing.
1019 |         
1020 |         Parameters
1021 |         ----------
1022 |         X: nD array. 
1023 |             nD array with the train instances. Dimensions should be (n_obs, n_features).  
1024 |         n_jobs: int
1025 |             number of parallel jobs that will be launched
1026 |         Returns
1027 |         -------
1028 |             the object itself.
1029 |         &#34;&#34;&#34;
1030 |         self.X = X
1031 |         self.path_normFactor = np.sqrt(len(X))
1032 | 
1033 |         self.sample_size = min(self.sample_size, len(X))
1034 | 
1035 |         limit_height = 1.0*np.ceil(np.log2(self.sample_size))
1036 | 
1037 |         featureDistrib = []
1038 |         nbins = int(len(X)/8)+2
1039 |         for i in range(np.shape(X)[1]):
1040 |             featureDistrib.append(weightFeature(X[:, i], nbins))
1041 |         featureDistrib = np.array(featureDistrib)
1042 |         featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
1043 |         self.featureDistrib = featureDistrib
1044 | 
1045 |         create_tree_partial = partial(create_tree,
1046 |                                       featureDistrib=self.featureDistrib,
1047 |                                       sample_size=self.sample_size,
1048 |                                       max_height=limit_height)
1049 | 
1050 |         with Pool(n_jobs) as p:
1051 |             self.trees = p.map(create_tree_partial,
1052 |                                [X for _ in range(self.n_trees)]
1053 |                                )
1054 |         return self
1055 | 
1056 | 
1057 |     def walk(self, X: np.ndarray) -&gt; None:
1058 |         &#34;&#34;&#34;
1059 |         Given a 2D matrix of observations, X, walk every tree in self.trees
1060 |         and store the per-tree distance, frequency and collective anomaly
1061 |         values of each x_i in self.LD, self.LF and self.LDF, each an
1062 |         ndarray of shape (len(X), n_trees). self.L is allocated but left
1063 |         at zero: no path length is computed.
1064 |         
1065 |         Parameters
1066 |         ----------
1067 |         X: nD array. 
1068 |             nD array with the instances to be tested. Dimensions should be (n_obs, n_features).   
1069 |         Returns
1070 |         -------
1071 |             None
1072 |         &#34;&#34;&#34;
1073 | 
1074 |         self.L = np.zeros((len(X), self.n_trees))
1075 |         self.LD = np.zeros((len(X), self.n_trees))
1076 |         self.LF = np.zeros((len(X), self.n_trees))
1077 |         self.LDF = np.zeros((len(X), self.n_trees))
1078 | 
1079 |         for treeIdx, itree in enumerate(self.trees):
1080 |             obsIdx = np.ones(len(X)).astype(bool)
1081 |             walk_tree(self, itree, treeIdx, obsIdx, X, self.featureDistrib, alpha=self.alpha)
1082 | 
1083 | 
1084 |     def anomaly_score(self, X: np.ndarray, alpha=1) -&gt; np.ndarray:
1085 |         &#34;&#34;&#34;
1086 |         Given a nD matrix of observations, X, compute the anomaly scores
1087 |         for instances in X, returning 3 1D arrays of anomaly scores
1088 |         
1089 |         Parameters
1090 |         ----------
1091 |         X: nD array. 
1092 |             nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
1093 |         alpha: float
1094 |             scaling distance hyper-parameter.
1095 |         Returns
1096 |         -------
1097 |         scD, scF, scDF: 1d arrays
1098 |             respectively the distance scores (point-wise anomaly score), the frequency of visit scores and the collective anomaly scores
1099 |         &#34;&#34;&#34;
1100 |         self.XtestSize = len(X)
1101 |         self.alpha = alpha
1102 | 
1103 |         # Walk the trees to fill self.LD, self.LF and self.LDF.
1104 |         self.walk(X)
1105 | 
1106 |         # Average the per-tree values into one score per observation.
1107 |         if self.sample_size &gt; 2:
1108 |             scD = -self.LD.mean(1)
1109 |         elif self.sample_size == 2:
1110 |             scD = -self.LD.mean(1)
1111 |         else:
1112 |             scD = 0
1113 | 
1114 |         scF = self.LF.mean(1)
1115 |         scDF = -self.LDF.mean(1)
1116 |         return scD, scF, scDF
1117 |     
1118 | 
1119 |     def predict_from_anomaly_scores(self, scores: np.ndarray, threshold: float) -&gt; np.ndarray:
1120 |         &#34;&#34;&#34;
1121 |         Given an array of scores and a score threshold, return an array of
1122 |         the predictions: 1 for any score &gt;= the threshold and 0 otherwise.
1123 |         
1124 |         Parameters
1125 |         ----------
1126 |         scores: 1D array. 
1127 |             1D array of scores. Dimensions should be (n_obs,).   
1128 |         threshold: float
1129 |             Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies.
1130 |         Returns
1131 |         -------
1132 |         1D array
1133 |             The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
1138 |         &#34;&#34;&#34;
1139 |         out = scores &gt;= threshold
1140 |         return out*1
1141 |     
1142 | 
1143 |     def predict(self, X: np.ndarray, threshold: float) -&gt; np.ndarray:
1144 |         &#34;&#34;&#34;
1145 |         A shorthand for calling anomaly_score() and predict_from_anomaly_scores().
1146 |         
1147 |         Parameters
1148 |         ----------
1149 |         X: nD array. 
1150 |             nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
1151 |         threshold: float
1152 |             Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies.
1153 |         Returns
1154 |         -------
1155 |         1D array
1156 |             The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
1157 |         &#34;&#34;&#34;
1158 | 
1159 |         _, _, scores = self.anomaly_score(X)  # anomaly_score returns (scD, scF, scDF); keeping only scDF is an assumption
1160 |         return self.predict_from_anomaly_scores(scores, threshold)</code></pre>
1161 | </details>
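<p>Since <code>anomaly_score</code> returns three arrays while <code>predict_from_anomaly_scores</code> takes a single 1D array of scores, one of the three has to be selected before thresholding. The sketch below illustrates that step on made-up score values (not output of the library); the helper name <code>predict_from_scores</code> and the choice of a 75% quantile as threshold are assumptions made for this example.</p>

```python
import numpy as np

def predict_from_scores(scores, threshold):
    # Hypothetical helper: 1 where the score reaches the threshold, 0 otherwise.
    return (np.asarray(scores) >= threshold) * 1

# Made-up collective scores for four observations (higher means more anomalous).
sc_df = np.array([-0.9, -0.2, -0.05, -0.6])

# One possible way to pick a threshold: flag the top quarter of the scores.
threshold = np.quantile(sc_df, 0.75)
labels = predict_from_scores(sc_df, threshold)
print(threshold, labels)
```

<p>Raising the threshold flags fewer observations, which matches the description of the <code>threshold</code> parameter above.</p>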
1162 | <h3>Methods</h3>
1163 | <dl>
1164 | <dt id="DiFF_RF.DiFF_TreeEnsemble.anomaly_score"><code class="name flex">
1165 | <span>def <span class="ident">anomaly_score</span></span>(<span>self, X: numpy.ndarray, alpha=1) ‑> numpy.ndarray</span>
1166 | </code></dt>
1167 | <dd>
1168 | <div class="desc"><p>Given a nD matrix of observations, X, compute the anomaly scores
1169 | for instances in X, returning 3 1D arrays of anomaly scores</p>
1170 | <h2 id="parameters">Parameters</h2>
1171 | <dl>
1172 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
1173 | <dd>nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).</dd>
1174 | <dt><strong><code>alpha</code></strong> :&ensp;<code>float</code></dt>
1175 | <dd>scaling distance hyper-parameter.</dd>
1176 | </dl>
1177 | <h2 id="returns">Returns</h2>
1178 | <dl>
1179 | <dt><strong><code>scD</code></strong>, <strong><code>scF</code></strong>, <strong><code>scDF</code></strong> :&ensp;<code>1d arrays</code></dt>
1180 | <dd>respectively the distance scores (point-wise anomaly score), the frequency of visit scores and the collective anomaly scores</dd>
1181 | </dl></div>
1182 | <details class="source">
1183 | <summary>
1184 | <span>Expand source code</span>
1185 | </summary>
1186 | <pre><code class="python">def anomaly_score(self, X: np.ndarray, alpha=1) -&gt; np.ndarray:
1187 |     &#34;&#34;&#34;
1188 |     Given a nD matrix of observations, X, compute the anomaly scores
1189 |     for instances in X, returning 3 1D arrays of anomaly scores
1190 |     
1191 |     Parameters
1192 |     ----------
1193 |     X: nD array. 
1194 |         nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
1195 |     alpha: float
1196 |         scaling distance hyper-parameter.
1197 |     Returns
1198 |     -------
1199 |     scD, scF, scDF: 1d arrays
1200 |         respectively the distance scores (point-wise anomaly score), the frequency of visit scores and the collective anomaly scores
1201 |     &#34;&#34;&#34;
1202 |     self.XtestSize = len(X)
1203 |     self.alpha = alpha
1204 | 
1205 |     # Walk the trees to fill self.LD, self.LF and self.LDF.
1206 |     self.walk(X)
1207 | 
1208 |     # Average the per-tree values into one score per observation.
1209 |     if self.sample_size &gt; 2:
1210 |         scD = -self.LD.mean(1)
1211 |     elif self.sample_size == 2:
1212 |         scD = -self.LD.mean(1)
1213 |     else:
1214 |         scD = 0
1215 | 
1216 |     scF = self.LF.mean(1)
1217 |     scDF = -self.LDF.mean(1)
1218 |     return scD, scF, scDF</code></pre>
1219 | </details>
1220 | </dd>
1221 | <dt id="DiFF_RF.DiFF_TreeEnsemble.fit"><code class="name flex">
1222 | <span>def <span class="ident">fit</span></span>(<span>self, X: numpy.ndarray, n_jobs: int = 4)</span>
1223 | </code></dt>
1224 | <dd>
1225 | <div class="desc"><p>Fits the model to the training data.
1226 | Given a 2D matrix of observations, create an ensemble of DiFF trees
1227 | and store their root nodes in a list: self.trees.
1228 | X is expected to be an ndarray.
1230 | Uses parallel computing.</p>
1231 | <h2 id="parameters">Parameters</h2>
1232 | <dl>
1233 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
1234 | <dd>nD array with the train instances. Dimensions should be (n_obs, n_features).</dd>
1235 | <dt><strong><code>n_jobs</code></strong> :&ensp;<code>int</code></dt>
1236 | <dd>number of parallel jobs that will be launched</dd>
1237 | </dl>
1238 | <h2 id="returns">Returns</h2>
1239 | <pre><code>the object itself.
1240 | </code></pre></div>
1241 | <details class="source">
1242 | <summary>
1243 | <span>Expand source code</span>
1244 | </summary>
1245 | <pre><code class="python">def fit(self, X: (np.ndarray), n_jobs: int = 4):
1246 |     &#34;&#34;&#34;
1247 |     Fits the model.
1248 |     Given a 2D matrix of observations, builds an ensemble of trees
1249 |     and stores them in a list: self.trees.  X must be an ndarray, so
1250 |     convert DataFrames to ndarray objects before calling fit.
1251 |     The trees are built in parallel.
1252 |     
1253 |     Parameters
1254 |     ----------
1255 |     X: nD array. 
1256 |         nD array with the train instances. Dimensions should be (n_obs, n_features).  
1257 |     n_jobs: int
1258 |         number of parallel jobs that will be launched
1259 |     Returns
1260 |     -------
1261 |         the object itself.
1262 |     &#34;&#34;&#34;
1263 |     self.X = X
1264 |     self.path_normFactor = np.sqrt(len(X))
1265 | 
1266 |     self.sample_size = min(self.sample_size, len(X))
1267 | 
1268 |     limit_height = 1.0*np.ceil(np.log2(self.sample_size))
1269 | 
1270 |     featureDistrib = []
1271 |     nbins = int(len(X)/8)+2
1272 |     for i in range(np.shape(X)[1]):
1273 |         featureDistrib.append(weightFeature(X[:, i], nbins))
1274 |     featureDistrib = np.array(featureDistrib)
1275 |     featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
1276 |     self.featureDistrib = featureDistrib
1277 | 
1278 |     create_tree_partial = partial(create_tree,
1279 |                                   featureDistrib=self.featureDistrib,
1280 |                                   sample_size=self.sample_size,
1281 |                                   max_height=limit_height)
1282 | 
1283 |     with Pool(n_jobs) as p:
1284 |         self.trees = p.map(create_tree_partial,
1285 |                            [X for _ in range(self.n_trees)]
1286 |                            )
1287 |     return self</code></pre>
1288 | </details>
1289 | </dd>
1290 | <dt id="DiFF_RF.DiFF_TreeEnsemble.predict"><code class="name flex">
1291 | <span>def <span class="ident">predict</span></span>(<span>self, X: numpy.ndarray, threshold: float) ‑> numpy.ndarray</span>
1292 | </code></dt>
1293 | <dd>
1294 | <div class="desc"><p>A shorthand for calling anomaly_score() and predict_from_anomaly_scores().</p>
1295 | <h2 id="parameters">Parameters</h2>
1296 | <dl>
1297 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
1298 | <dd>nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).</dd>
1299 | <dt><strong><code>threshold</code></strong> :&ensp;<code>float</code></dt>
1300 | <dd>Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies are flagged.</dd>
1301 | </dl>
1302 | <h2 id="returns">Returns</h2>
1303 | <dl>
1304 | <dt><code>1D array</code></dt>
1305 | <dd>The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.</dd>
1306 | </dl></div>
1307 | <details class="source">
1308 | <summary>
1309 | <span>Expand source code</span>
1310 | </summary>
1311 | <pre><code class="python">def predict(self, X: np.ndarray, threshold: float) -&gt; np.ndarray:
1312 |     &#34;&#34;&#34;
1313 |     A shorthand for calling anomaly_score() and predict_from_anomaly_scores().
1314 |     
1315 |     Parameters
1316 |     ----------
1317 |     X: nD array. 
1318 |         nD array with the tested observations to be predicted. Dimensions should be (n_obs, n_features).   
1319 |     threshold: float
1320 |         Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies are flagged.
1321 |     Returns
1322 |     -------
1323 |     1D array
1324 |         The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
1325 |     &#34;&#34;&#34;
1326 | 
1327 |     scD, scF, scDF = self.anomaly_score(X)  # anomaly_score() returns three arrays, not one
1328 |     return self.predict_from_anomaly_scores(scDF, threshold)  # assumption: threshold the collective score</code></pre>
1329 | </details>
1330 | </dd>
1331 | <dt id="DiFF_RF.DiFF_TreeEnsemble.predict_from_anomaly_scores"><code class="name flex">
1332 | <span>def <span class="ident">predict_from_anomaly_scores</span></span>(<span>self, scores: numpy.ndarray, threshold: float) ‑> numpy.ndarray</span>
1333 | </code></dt>
1334 | <dd>
1335 | <div class="desc"><p>Given an array of scores and a score threshold, return an array of
1336 | the predictions: 1 for any score &gt;= the threshold and 0 otherwise.</p>
1337 | <h2 id="parameters">Parameters</h2>
1338 | <dl>
1339 | <dt><strong><code>scores</code></strong> :&ensp;<code>1D array. </code></dt>
1340 | <dd>1D array of scores. Dimensions should be (n_obs,).</dd>
1341 | <dt><strong><code>threshold</code></strong> :&ensp;<code>float</code></dt>
1342 | <dd>Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies are flagged.</dd>
1343 | </dl>
1344 | <h2 id="returns">Returns</h2>
1345 | <dl>
1346 | <dt><code>1D array</code></dt>
1347 | <dd>The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.</dd>
1348 | </dl>
1351 | </div>
1352 | <details class="source">
1353 | <summary>
1354 | <span>Expand source code</span>
1355 | </summary>
1356 | <pre><code class="python">def predict_from_anomaly_scores(self, scores: np.ndarray, threshold: float) -&gt; np.ndarray:
1357 |     &#34;&#34;&#34;
1358 |     Given an array of scores and a score threshold, return an array of
1359 |     the predictions: 1 for any score &gt;= the threshold and 0 otherwise.
1360 |     
1361 |     Parameters
1362 |     ----------
1363 |     scores: 1D array. 
1364 |         1D array of scores. Dimensions should be (n_obs,).
1365 |     threshold: float
1366 |         Threshold for considering an observation an anomaly; the higher the threshold, the fewer anomalies are flagged.
1367 |     Returns
1368 |     -------
1369 |     1D array
1370 |         The prediction array corresponding to 1/0 if anomaly/not anomaly respectively.
1375 |     &#34;&#34;&#34;
1376 |     out = scores &gt;= threshold
1377 |     return out*1</code></pre>
1378 | </details>
1379 | </dd>
1380 | <dt id="DiFF_RF.DiFF_TreeEnsemble.walk"><code class="name flex">
1381 | <span>def <span class="ident">walk</span></span>(<span>self, X: numpy.ndarray) ‑> None</span>
1382 | </code></dt>
1383 | <dd>
1384 | <div class="desc"><p>Given a 2D matrix of observations, X, walk every instance of X
1385 | through every tree in self.trees and record, per instance and per tree,
1386 | the path length and the distance, frequency and collective scores.
1387 | The results are stored in self.L, self.LD, self.LF and self.LDF,
1388 | each an ndarray of shape (len(X), n_trees);
1389 | nothing is returned.
1390 | Averaging over the trees is done in anomaly_score().</p>
1391 | <h2 id="parameters">Parameters</h2>
1392 | <dl>
1393 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
1394 | <dd>nD array with the instances to be tested. Dimensions should be (n_obs, n_features).</dd>
1395 | </dl>
1396 | <h2 id="returns">Returns</h2>
1397 | <pre><code>None
1398 | </code></pre></div>
1399 | <details class="source">
1400 | <summary>
1401 | <span>Expand source code</span>
1402 | </summary>
1403 | <pre><code class="python">def walk(self, X: np.ndarray) -&gt; None:
1404 |     &#34;&#34;&#34;
1405 |     Given a 2D matrix of observations, X, walk every instance of X
1406 |     through every tree in self.trees and record, per instance and per
1407 |     tree, the path length and the distance, frequency and collective
1408 |     scores.  Results are stored in self.L, self.LD, self.LF and self.LDF,
1409 |     each of shape (len(X), n_trees); nothing is returned.
1410 |     
1411 |     Parameters
1412 |     ----------
1413 |     X: nD array. 
1414 |         nD array with the instances to be tested. Dimensions should be (n_obs, n_features).   
1415 |     Returns
1416 |     -------
1417 |         None
1418 |     &#34;&#34;&#34;
1419 | 
1420 |     self.L = np.zeros((len(X), self.n_trees))
1421 |     self.LD = np.zeros((len(X), self.n_trees))
1422 |     self.LF = np.zeros((len(X), self.n_trees))
1423 |     self.LDF = np.zeros((len(X), self.n_trees))
1424 | 
1425 |     for treeIdx, itree in enumerate(self.trees):
1426 |         obsIdx = np.ones(len(X)).astype(bool)
1427 |         walk_tree(self, itree, treeIdx, obsIdx, X, self.featureDistrib, alpha=self.alpha)</code></pre>
1428 | </details>
1429 | </dd>
1430 | </dl>
1431 | </dd>
1432 | <dt id="DiFF_RF.InNode"><code class="flex name class">
1433 | <span>class <span class="ident">InNode</span></span>
1434 | <span>(</span><span>X, height_limit, featureDistrib, sample_size, current_height)</span>
1435 | </code></dt>
1436 | <dd>
1437 | <div class="desc"><p>Node of the tree that is not a leaf node.
1438 | The functionality of the class is:
1439 | - Do the best split from a sample of randomly chosen
1440 | dimensions and split points.
1441 | - Partition the space of observations according to the
1442 | split and send them along to two different nodes.
1443 | The method usually has a higher complexity than doing it for every point.
1444 | But because it's using NumPy it's more efficient time-wise.</p>
1445 | <h2 id="parameters">Parameters</h2>
1446 | <dl>
1447 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
1448 | <dd>nD array with the training instances that have reached the node.</dd>
1449 | <dt><strong><code>height_limit</code></strong> :&ensp;<code>int</code></dt>
1450 | <dd>Maximum height of the tree.</dd>
1451 | <dt><strong><code>featureDistrib</code></strong> :&ensp;<code>1D array. </code></dt>
1452 | <dd>distribution used at the parent level to randomly select a dimension (feature); it is recomputed from X when the node holds more than 32 instances.</dd>
1453 | <dt><strong><code>sample_size</code></strong> :&ensp;<code>int</code></dt>
1454 | <dd>Size of the sample used to build the tree.</dd>
1455 | <dt><strong><code>current_height</code></strong> :&ensp;<code>int</code></dt>
1456 | <dd>Current height of the tree.</dd>
1457 | </dl>
1458 | <h2 id="returns">Returns</h2>
1459 | <pre><code>None
1460 | </code></pre></div>
1461 | <details class="source">
1462 | <summary>
1463 | <span>Expand source code</span>
1464 | </summary>
1465 | <pre><code class="python">class InNode:
1466 |     &#39;&#39;&#39;
1467 |     Node of the tree that is not a leaf node.
1468 |     The functionality of the class is:
1469 |     - Do the best split from a sample of randomly chosen
1470 |         dimensions and split points.
1471 |     - Partition the space of observations according to the
1472 |     split and send them along to two different nodes.
1473 |     The method usually has a higher complexity than doing it for every point.
1474 |     But because it&#39;s using NumPy it&#39;s more efficient time-wise.
1475 |     &#39;&#39;&#39;
1476 |     def __init__(self, X, height_limit, featureDistrib, sample_size, current_height):
1477 |         &#39;&#39;&#39;
1478 |         Parameters
1479 |         ----------
1480 |         X: nD array. 
1481 |             nD array with the training instances that have reached the node.
1482 |         height_limit: int
1483 |             Maximum height of the tree.
1484 |         featureDistrib: 1D array. 
1485 |             distribution used at the parent level to randomly select a dimension (feature); it is recomputed from X when the node holds more than 32 instances.
1486 |         sample_size: int
1487 |             Size of the sample used to build the tree.
1488 |         current_height: int
1489 |             Current height of the tree.
1490 |         Returns
1491 |         -------
1492 |             None
1493 |         &#39;&#39;&#39;
1494 | 
1495 |         self.size = len(X)
1496 |         self.height = current_height+1
1497 |         n_obs, n_features = X.shape
1498 |         next_height = current_height + 1
1499 |         limit_not_reached = height_limit &gt; next_height
1500 | 
1501 |         if len(X) &gt; 32:
1502 |             featureDistrib = []
1503 |             nbins = int(len(X)/8)+2
1504 |             for i in range(np.shape(X)[1]):
1505 |                 featureDistrib.append(weightFeature(X[:, i], nbins))
1506 |             featureDistrib = np.array(featureDistrib)
1507 |             featureDistrib = featureDistrib/(np.sum(featureDistrib)+1e-5)
1508 | 
1509 |         self.featureDistrib = featureDistrib
1510 | 
1511 |         cols = np.arange(np.shape(X)[1], dtype=&#39;int&#39;)
1512 | 
1513 |         self.splitAtt = rn.choices(cols, weights=featureDistrib)[0]
1514 |         splittingCol = X[:, self.splitAtt]
1515 |         self.splitValue = getSplit(splittingCol)
1516 |         idx = splittingCol &lt;= self.splitValue
1519 | 
1520 |         X_aux = X[idx, :]
1521 | 
1522 |         self.left = (InNode(X_aux, height_limit, featureDistrib, sample_size, next_height)
1523 |                      if limit_not_reached and X_aux.shape[0] &gt; 5 and (np.any(X_aux.max(0) != X_aux.min(0))) else LeafNode(
1524 |                          X_aux, next_height, X, sample_size))
1525 | 
1526 |         idx = np.invert(idx)
1527 |         X_aux = X[idx, :]
1528 |         self.right = (InNode(X_aux, height_limit, featureDistrib, sample_size, next_height)
1529 |                       if limit_not_reached and X_aux.shape[0] &gt; 5 and (np.any(X_aux.max(0) != X_aux.min(0))) else LeafNode(
1530 |                           X_aux, next_height, X, sample_size))
1531 | 
1532 |         self.n_nodes = 1 + self.left.n_nodes + self.right.n_nodes</code></pre>
1533 | </details>
1534 | </dd>
1535 | <dt id="DiFF_RF.LeafNode"><code class="flex name class">
1536 | <span>class <span class="ident">LeafNode</span></span>
1537 | <span>(</span><span>X, height, Xp, sample_size)</span>
1538 | </code></dt>
1539 | <dd>
1540 | <div class="desc"><p>Leaf node
1541 | The base functionality is storing the mean and standard deviation of the observations in that node.
1542 | We also evaluate the frequency of visit for training data.</p>
1543 | <h2 id="parameters">Parameters</h2>
1544 | <dl>
1545 | <dt><strong><code>X</code></strong> :&ensp;<code>nD array. </code></dt>
1546 | <dd>nD array with the training instances falling into the leaf node.</dd>
1547 | <dt><strong><code>height</code></strong> :&ensp;<code>int</code></dt>
1548 | <dd>Current height of the tree.</dd>
1549 | <dt><strong><code>Xp</code></strong> :&ensp;<code>nD array. </code></dt>
1550 | <dd>nD array with the training instances falling into the parent node.</dd>
1551 | <dt><strong><code>sample_size</code></strong> :&ensp;<code>int</code></dt>
1552 | <dd>Size of the sample used to build the tree.</dd>
1553 | </dl>
1554 | <h2 id="returns">Returns</h2>
1555 | <pre><code>None
1556 | </code></pre></div>
1557 | <details class="source">
1558 | <summary>
1559 | <span>Expand source code</span>
1560 | </summary>
1561 | <pre><code class="python">class LeafNode:
1562 |     &#39;&#39;&#39;
1563 |     Leaf node
1564 |     The base functionality is storing the mean and standard deviation of the observations in that node.
1565 |     We also evaluate the frequency of visit for training data.
1566 |     &#39;&#39;&#39;
1567 |     def __init__(self, X, height, Xp, sample_size):
1568 |         &#39;&#39;&#39;
1569 |         Parameters
1570 |         ----------
1571 |         X: nD array. 
1572 |             nD array with the training instances falling into the leaf node.    
1573 |         height: int
1574 |             Current height of the tree.
1575 |         Xp: nD array. 
1576 |             nD array with the training instances falling into the parent node.    
1577 |         sample_size: int
1578 |             Size of the sample used to build the tree.
1579 |         Returns
1580 |         -------
1581 |             None
1582 |         &#39;&#39;&#39;
1583 |         self.height = height+1
1584 |         self.size = len(X)
1585 |         self.n_nodes = 1
1586 |         self.freq = self.size/sample_size
1587 |         self.freqs = 0
1588 | 
1589 |         if len(X) != 0:
1590 |             self.M = np.mean(X, axis=0)
1591 |             if len(X) &gt; 10:
1592 |                 self.Mstd = np.std(X, axis=0)
1593 |                 self.Mstd[self.Mstd == 0] = 1e-2
1594 |             else:
1595 |                 self.Mstd = np.ones(np.shape(X)[1])
1596 |         else:
1597 |             self.M = np.mean(Xp, axis=0)
1598 |             if len(Xp) &gt; 10:
1599 |                 self.Mstd = np.std(Xp, axis=0)
1600 |                 self.Mstd[self.Mstd == 0] = 1e-2
1601 |             else:
1602 |                 self.Mstd = np.ones(np.shape(X)[1])</code></pre>
1603 | </details>
1604 | </dd>
1605 | </dl>
1606 | </section>
1607 | </article>
1608 | <nav id="sidebar">
1609 | <h1>Index</h1>
1610 | <div class="toc">
1611 | <ul></ul>
1612 | </div>
1613 | <ul id="index">
1614 | <li><h3><a href="#header-functions">Functions</a></h3>
1615 | <ul class="two-column">
1616 | <li><code><a title="DiFF_RF.EE" href="#DiFF_RF.EE">EE</a></code></li>
1617 | <li><code><a title="DiFF_RF.create_tree" href="#DiFF_RF.create_tree">create_tree</a></code></li>
1618 | <li><code><a title="DiFF_RF.getSplit" href="#DiFF_RF.getSplit">getSplit</a></code></li>
1619 | <li><code><a title="DiFF_RF.similarityScore" href="#DiFF_RF.similarityScore">similarityScore</a></code></li>
1620 | <li><code><a title="DiFF_RF.walk_tree" href="#DiFF_RF.walk_tree">walk_tree</a></code></li>
1621 | <li><code><a title="DiFF_RF.weightFeature" href="#DiFF_RF.weightFeature">weightFeature</a></code></li>
1622 | </ul>
1623 | </li>
1624 | <li><h3><a href="#header-classes">Classes</a></h3>
1625 | <ul>
1626 | <li>
1627 | <h4><code><a title="DiFF_RF.DiFF_Tree" href="#DiFF_RF.DiFF_Tree">DiFF_Tree</a></code></h4>
1628 | <ul class="">
1629 | <li><code><a title="DiFF_RF.DiFF_Tree.fit" href="#DiFF_RF.DiFF_Tree.fit">fit</a></code></li>
1630 | </ul>
1631 | </li>
1632 | <li>
1633 | <h4><code><a title="DiFF_RF.DiFF_TreeEnsemble" href="#DiFF_RF.DiFF_TreeEnsemble">DiFF_TreeEnsemble</a></code></h4>
1634 | <ul class="">
1635 | <li><code><a title="DiFF_RF.DiFF_TreeEnsemble.anomaly_score" href="#DiFF_RF.DiFF_TreeEnsemble.anomaly_score">anomaly_score</a></code></li>
1636 | <li><code><a title="DiFF_RF.DiFF_TreeEnsemble.fit" href="#DiFF_RF.DiFF_TreeEnsemble.fit">fit</a></code></li>
1637 | <li><code><a title="DiFF_RF.DiFF_TreeEnsemble.predict" href="#DiFF_RF.DiFF_TreeEnsemble.predict">predict</a></code></li>
1638 | <li><code><a title="DiFF_RF.DiFF_TreeEnsemble.predict_from_anomaly_scores" href="#DiFF_RF.DiFF_TreeEnsemble.predict_from_anomaly_scores">predict_from_anomaly_scores</a></code></li>
1639 | <li><code><a title="DiFF_RF.DiFF_TreeEnsemble.walk" href="#DiFF_RF.DiFF_TreeEnsemble.walk">walk</a></code></li>
1640 | </ul>
1641 | </li>
1642 | <li>
1643 | <h4><code><a title="DiFF_RF.InNode" href="#DiFF_RF.InNode">InNode</a></code></h4>
1644 | </li>
1645 | <li>
1646 | <h4><code><a title="DiFF_RF.LeafNode" href="#DiFF_RF.LeafNode">LeafNode</a></code></h4>
1647 | </li>
1648 | </ul>
1649 | </li>
1650 | </ul>
1651 | </nav>
1652 | </main>
1653 | <footer id="footer">
1654 | <p>Generated by <a href="https://pdoc3.github.io/pdoc"><cite>pdoc</cite> 0.8.3</a>.</p>
1655 | </footer>
1656 | </body>
1657 | </html>


--------------------------------------------------------------------------------