├── .gitignore ├── DESCRIPTION.rst ├── FATS ├── Base.py ├── DESCRIPTION.rst ├── Feature.py ├── FeatureFunctionLib.py ├── PreprocessLC.py ├── README.rst ├── __init__.py ├── alignLC.py ├── featureFunction.py ├── import_lc_cluster.py ├── import_lightcurve.py ├── lomb.py └── test_library.py ├── LICENSE ├── README.rst ├── requirements.txt └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | 5 | # C extensions 6 | *.so 7 | 8 | # Distribution / packaging 9 | .Python 10 | env/ 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | *.egg-info/ 23 | .installed.cfg 24 | *.egg 25 | 26 | # PyInstaller 27 | # Usually these files are written by a python script from a template 28 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 29 | *.manifest 30 | *.spec 31 | 32 | # Installer logs 33 | pip-log.txt 34 | pip-delete-this-directory.txt 35 | 36 | # Unit test / coverage reports 37 | htmlcov/ 38 | .tox/ 39 | .coverage 40 | .coverage.* 41 | .cache 42 | nosetests.xml 43 | coverage.xml 44 | *,cover 45 | 46 | # Translations 47 | *.mo 48 | *.pot 49 | 50 | # Django stuff: 51 | *.log 52 | 53 | # Sphinx documentation 54 | docs/_build/ 55 | 56 | # PyBuilder 57 | target/ 58 | -------------------------------------------------------------------------------- /DESCRIPTION.rst: -------------------------------------------------------------------------------- 1 | Feature Analysis for Time Series 2 | ================================ 3 | 4 | In time-domain astronomy, data gathered from the telescopes is usually represented in the form of light-curves. These are time series that show the brightness variation of an object through a period of time (for a visual representation see video below). Based on the variability characteristics of the light-curves, celestial objects can be classified into different groups (quasars, long period variables, eclipsing binaries, etc.) and consequently be studied in depth independentely. 5 | 6 | In order to characterize this variability, some of the existing methods use machine learning algorithms that build their decision on the light-curves features. Features, the topic of the following work, are numerical descriptors that aim to characterize and distinguish the different variability classes. They can go from basic statistical measures such as the mean or the standard deviation, to complex time-series characteristics such as the autocorrelation function. 7 | 8 | In this package we present a library with a compilation of some of the existing light-curve features. The main goal is to create a collaborative and open tool where every user can characterize or analyze an astronomical photometric database while also contributing to the library by adding new features. However, it is important to highlight that this library is not restricted to the astronomical field and could also be applied to any kind of time series. 9 | 10 | Our vision is to be capable of analyzing and comparing light-curves from all the available astronomical catalogs in a standard and universal way. This would facilitate and make more efficient tasks as modelling, classification, data cleaning, outlier detection and data analysis in general. 
Consequently, when studying light-curves, astronomers and data analysts would be on the same wavelength and would not have the necessity to find a way of comparing or matching different features. In order to achieve this goal, the library should be run in every existent survey (MACHO, EROS, OGLE, Catalina, Pan-STARRS, etc) and future surveys (LSST) and the results should be ideally shared in the same open way as this library. 11 | 12 | ------------------------- 13 | Usage examples 14 | 15 | 16 | 17 | ------------------------- 18 | What's new 19 | 20 | 21 | This is the description file for the project. 22 | 23 | The file should use UTF-8 encoding and be written using ReStructured Text. It 24 | will be used to generate the project webpage on PyPI, and should be written for 25 | that purpose. 26 | 27 | Typical contents for this file would include an overview of the project, basic 28 | usage examples, etc. Generally, including the project changelog in here is not 29 | a good idea, although a simple "What's New" section for the most recent version 30 | may be appropriate. 31 | -------------------------------------------------------------------------------- /FATS/Base.py: -------------------------------------------------------------------------------- 1 | import os,sys,time 2 | import numpy as np 3 | class Base: 4 | def __init__(self): 5 | self.category='all' 6 | def fit(self, data): 7 | return self 8 | -------------------------------------------------------------------------------- /FATS/DESCRIPTION.rst: -------------------------------------------------------------------------------- 1 | Feature Analysis for Time Series 2 | ================================ 3 | 4 | In time-domain astronomy, data gathered from the telescopes is usually represented in the form of light-curves. These are time series that show the brightness variation of an object through a period of time (for a visual representation see video below). Based on the variability characteristics of the light-curves, celestial objects can be classified into different groups (quasars, long period variables, eclipsing binaries, etc.) and consequently be studied in depth independentely. 5 | 6 | In order to characterize this variability, some of the existing methods use machine learning algorithms that build their decision on the light-curves features. Features, the topic of the following work, are numerical descriptors that aim to characterize and distinguish the different variability classes. They can go from basic statistical measures such as the mean or the standard deviation, to complex time-series characteristics such as the autocorrelation function. 7 | 8 | In this package we present a library with a compilation of some of the existing light-curve features. The main goal is to create a collaborative and open tool where every user can characterize or analyze an astronomical photometric database while also contributing to the library by adding new features. However, it is important to highlight that this library is not restricted to the astronomical field and could also be applied to any kind of time series. 9 | 10 | Our vision is to be capable of analyzing and comparing light-curves from all the available astronomical catalogs in a standard and universal way. This would facilitate and make more efficient tasks as modelling, classification, data cleaning, outlier detection and data analysis in general. 
Consequently, when studying light-curves, astronomers and data analysts would be on the same wavelength and would not have the necessity to find a way of comparing or matching different features. In order to achieve this goal, the library should be run in every existent survey (MACHO, EROS, OGLE, Catalina, Pan-STARRS, etc) and future surveys (LSST) and the results should be ideally shared in the same open way as this library. 11 | 12 | ------------------------- 13 | Usage examples 14 | 15 | 16 | 17 | ------------------------- 18 | What's new 19 | 20 | 21 | This is the description file for the project. 22 | 23 | The file should use UTF-8 encoding and be written using ReStructured Text. It 24 | will be used to generate the project webpage on PyPI, and should be written for 25 | that purpose. 26 | 27 | Typical contents for this file would include an overview of the project, basic 28 | usage examples, etc. Generally, including the project changelog in here is not 29 | a good idea, although a simple "What's New" section for the most recent version 30 | may be appropriate. 31 | -------------------------------------------------------------------------------- /FATS/Feature.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import time 4 | import inspect 5 | 6 | import numpy as np 7 | import pandas as pd 8 | import matplotlib.pyplot as plt 9 | 10 | import featureFunction 11 | 12 | 13 | class FeatureSpace: 14 | """ 15 | This Class is a wrapper class, to allow user select the 16 | features based on the available time series vectors (magnitude, time, 17 | error, second magnitude, etc.) or specify a list of features. 18 | 19 | __init__ will take in the list of the available data and featureList. 20 | 21 | User could only specify the available time series vectors, which will 22 | output all the features that need this data to be calculated. 23 | 24 | User could only specify featureList, which will output 25 | all the features in the list. 26 | 27 | User could specify a list of the available time series vectors and 28 | featureList, which will output all the features in the List that 29 | use the available data. 30 | 31 | Additional parameters are used for individual features. 
32 | Format is featurename = [parameters] 33 | 34 | usage: 35 | data = np.random.randint(0,10000, 100000000) 36 | # automean is the featurename and [0,0] is the parameter for the feature 37 | a = FeatureSpace(category='all', automean=[0,0]) 38 | print a.featureList 39 | a=a.calculateFeature(data) 40 | print a.result(method='array') 41 | print a.result(method='dict') 42 | 43 | """ 44 | def __init__(self, Data=None, featureList=None, excludeList=[], **kwargs): 45 | self.featureFunc = [] 46 | self.featureList = [] 47 | self.featureOrder = [] 48 | self.featureList = [] 49 | 50 | self.sort = False 51 | 52 | if Data is not None: 53 | self.Data = Data 54 | 55 | if self.Data == 'all': 56 | if featureList == None: 57 | 58 | if excludeList == None: 59 | for name, obj in inspect.getmembers(featureFunction): 60 | if inspect.isclass(obj) and name != 'Base': 61 | if obj.__module__.endswith('FeatureFunctionLib'): 62 | # if set(obj().Data).issubset(self.Data): 63 | self.featureOrder.append((inspect.getsourcelines(obj)[-1:])[0]) 64 | self.featureList.append(name) 65 | else: 66 | for name, obj in inspect.getmembers(featureFunction): 67 | if inspect.isclass(obj) and name != 'Base' and not name in excludeList: 68 | if obj.__module__.endswith('FeatureFunctionLib'): 69 | # if set(obj().Data).issubset(self.Data): 70 | self.featureOrder.append((inspect.getsourcelines(obj)[-1:])[0]) 71 | self.featureList.append(name) 72 | 73 | else: 74 | for feature in featureList: 75 | for name, obj in inspect.getmembers(featureFunction): 76 | if name != 'Base': 77 | if inspect.isclass(obj) and feature == name: 78 | self.featureList.append(name) 79 | 80 | else: 81 | 82 | if featureList is None: 83 | for name, obj in inspect.getmembers(featureFunction): 84 | if inspect.isclass(obj) and name != 'Base' and not name in excludeList: 85 | if obj.__module__.endswith('FeatureFunctionLib'): 86 | if name in kwargs.keys(): 87 | if set(obj(kwargs[name]).Data).issubset(self.Data): 88 | self.featureOrder.append((inspect.getsourcelines(obj)[-1:])[0]) 89 | self.featureList.append(name) 90 | 91 | else: 92 | if set(obj().Data).issubset(self.Data): 93 | self.featureOrder.append((inspect.getsourcelines(obj)[-1:])[0]) 94 | self.featureList.append(name) 95 | else: 96 | print "Warning: the feature", name, "could not be calculated because", obj().Data, "are needed." 97 | else: 98 | 99 | for feature in featureList: 100 | for name, obj in inspect.getmembers(featureFunction): 101 | if name != 'Base': 102 | if inspect.isclass(obj) and feature == name: 103 | if set(obj().Data).issubset(self.Data): 104 | self.featureList.append(name) 105 | else: 106 | print "Warning: the feature", name, "could not be calculated because", obj().Data, "are needed." 107 | 108 | if self.featureOrder != []: 109 | self.sort = True 110 | self.featureOrder = np.argsort(self.featureOrder) 111 | self.featureList = [self.featureList[i] for i in self.featureOrder] 112 | self.idx = np.argsort(self.featureList) 113 | 114 | else: 115 | self.featureList = featureList 116 | 117 | m = featureFunction 118 | 119 | for item in self.featureList: 120 | if item in kwargs.keys(): 121 | try: 122 | a = getattr(m, item)(kwargs[item]) 123 | except: 124 | print "error in feature " + item 125 | sys.exit(1) 126 | else: 127 | try: 128 | a = getattr(m, item)() 129 | except: 130 | print " could not find feature " + item 131 | # discuss -- should we exit? 
132 | sys.exit(1) 133 | try: 134 | self.featureFunc.append(a.fit) 135 | except: 136 | print "could not initilize " + item 137 | 138 | def calculateFeature(self, data): 139 | self._X = np.asarray(data) 140 | self.__result = [] 141 | for f in self.featureFunc: 142 | self.__result.append(f(self._X)) 143 | return self 144 | 145 | def result(self, method='array'): 146 | if method == 'array': 147 | if self.sort == True: 148 | return [np.asarray(self.__result)[i] for i in self.idx] 149 | else: 150 | return np.asarray(self.__result) 151 | elif method == 'dict': 152 | if self.sort == True: 153 | return dict(zip([self.featureList[i] for i in self.idx], [np.asarray(self.__result)[i] for i in self.idx])) 154 | else: 155 | return dict(zip(self.featureList, np.asarray(self.__result))) 156 | elif method == 'features': 157 | if self.sort == True: 158 | return [self.featureList[i] for i in self.idx] 159 | else: 160 | return self.featureList 161 | else: 162 | return self.__result 163 | 164 | -------------------------------------------------------------------------------- /FATS/FeatureFunctionLib.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import time 4 | import math 5 | import bisect 6 | 7 | import numpy as np 8 | import pandas as pd 9 | from scipy import stats 10 | from scipy.optimize import minimize 11 | from scipy.optimize import curve_fit 12 | from statsmodels.tsa import stattools 13 | from scipy.interpolate import interp1d 14 | 15 | from Base import Base 16 | import lomb 17 | 18 | 19 | class Amplitude(Base): 20 | """Half the difference between the maximum and the minimum magnitude""" 21 | 22 | def __init__(self): 23 | self.Data = ['magnitude'] 24 | 25 | def fit(self, data): 26 | magnitude = data[0] 27 | N = len(magnitude) 28 | sorted_mag = np.sort(magnitude) 29 | 30 | return (np.median(sorted_mag[-int(math.ceil(0.05 * N)):]) - 31 | np.median(sorted_mag[0:int(math.ceil(0.05 * N))])) / 2.0 32 | # return sorted_mag[10] 33 | 34 | 35 | class Rcs(Base): 36 | """Range of cumulative sum""" 37 | 38 | def __init__(self): 39 | self.Data = ['magnitude'] 40 | 41 | def fit(self, data): 42 | magnitude = data[0] 43 | sigma = np.std(magnitude) 44 | N = len(magnitude) 45 | m = np.mean(magnitude) 46 | s = np.cumsum(magnitude - m) * 1.0 / (N * sigma) 47 | R = np.max(s) - np.min(s) 48 | return R 49 | 50 | 51 | class StetsonK(Base): 52 | def __init__(self): 53 | self.Data = ['magnitude', 'error'] 54 | 55 | def fit(self, data): 56 | magnitude = data[0] 57 | error = data[2] 58 | 59 | mean_mag = (np.sum(magnitude/(error*error)) / 60 | np.sum(1.0 / (error * error))) 61 | 62 | N = len(magnitude) 63 | sigmap = (np.sqrt(N * 1.0 / (N - 1)) * 64 | (magnitude - mean_mag) / error) 65 | 66 | K = (1 / np.sqrt(N * 1.0) * 67 | np.sum(np.abs(sigmap)) / np.sqrt(np.sum(sigmap ** 2))) 68 | 69 | return K 70 | 71 | 72 | class Meanvariance(Base): 73 | """variability index""" 74 | def __init__(self): 75 | self.Data = ['magnitude'] 76 | 77 | def fit(self, data): 78 | magnitude = data[0] 79 | return np.std(magnitude) / np.mean(magnitude) 80 | 81 | 82 | class Autocor_length(Base): 83 | 84 | def __init__(self, lags=100): 85 | self.Data = ['magnitude'] 86 | self.nlags = lags 87 | 88 | def fit(self, data): 89 | 90 | magnitude = data[0] 91 | AC = stattools.acf(magnitude, nlags=self.nlags) 92 | k = next((index for index, value in 93 | enumerate(AC) if value < np.exp(-1)), None) 94 | 95 | while k is None: 96 | self.nlags = self.nlags + 100 97 | AC = stattools.acf(magnitude, 
nlags=self.nlags) 98 | k = next((index for index, value in 99 | enumerate(AC) if value < np.exp(-1)), None) 100 | 101 | return k 102 | 103 | 104 | class SlottedA_length(Base): 105 | 106 | def __init__(self, T=-99): 107 | """ 108 | lc: MACHO lightcurve in a pandas DataFrame 109 | k: lag (default: 1) 110 | T: tau (slot size in days. default: 4) 111 | """ 112 | self.Data = ['magnitude', 'time'] 113 | 114 | SlottedA_length.SAC = [] 115 | 116 | self.T = T 117 | 118 | def slotted_autocorrelation(self, data, time, T, K, 119 | second_round=False, K1=100): 120 | 121 | slots = np.zeros((K, 1)) 122 | i = 1 123 | 124 | # make time start from 0 125 | time = time - np.min(time) 126 | 127 | # subtract mean from mag values 128 | m = np.mean(data) 129 | data = data - m 130 | 131 | prod = np.zeros((K, 1)) 132 | pairs = np.subtract.outer(time, time) 133 | pairs[np.tril_indices_from(pairs)] = 10000000 134 | 135 | ks = np.int64(np.floor(np.abs(pairs) / T + 0.5)) 136 | 137 | # We calculate the slotted autocorrelation for k=0 separately 138 | idx = np.where(ks == 0) 139 | prod[0] = ((sum(data ** 2) + sum(data[idx[0]] * 140 | data[idx[1]])) / (len(idx[0]) + len(data))) 141 | slots[0] = 0 142 | 143 | # We calculate it for the rest of the ks 144 | if second_round is False: 145 | for k in np.arange(1, K): 146 | idx = np.where(ks == k) 147 | if len(idx[0]) != 0: 148 | prod[k] = sum(data[idx[0]] * data[idx[1]]) / (len(idx[0])) 149 | slots[i] = k 150 | i = i + 1 151 | else: 152 | prod[k] = np.infty 153 | else: 154 | for k in np.arange(K1, K): 155 | idx = np.where(ks == k) 156 | if len(idx[0]) != 0: 157 | prod[k] = sum(data[idx[0]] * data[idx[1]]) / (len(idx[0])) 158 | slots[i - 1] = k 159 | i = i + 1 160 | else: 161 | prod[k] = np.infty 162 | np.trim_zeros(prod, trim='b') 163 | 164 | slots = np.trim_zeros(slots, trim='b') 165 | return prod / prod[0], np.int64(slots).flatten() 166 | 167 | def fit(self, data): 168 | magnitude = data[0] 169 | time = data[1] 170 | N = len(time) 171 | 172 | if self.T == -99: 173 | deltaT = time[1:] - time[:-1] 174 | sorted_deltaT = np.sort(deltaT) 175 | self.T = sorted_deltaT[int(N * 0.05)+1] 176 | 177 | K = 100 178 | 179 | [SAC, slots] = self.slotted_autocorrelation(magnitude, time, self.T, K) 180 | # SlottedA_length.SAC = SAC 181 | # SlottedA_length.slots = slots 182 | 183 | SAC2 = SAC[slots] 184 | SlottedA_length.autocor_vector = SAC2 185 | 186 | k = next((index for index, value in 187 | enumerate(SAC2) if value < np.exp(-1)), None) 188 | 189 | while k is None: 190 | K = K+K 191 | 192 | if K > (np.max(time) - np.min(time)) / self.T: 193 | break 194 | else: 195 | [SAC, slots] = self.slotted_autocorrelation(magnitude, 196 | time, self.T, K, 197 | second_round=True, 198 | K1=K/2) 199 | SAC2 = SAC[slots] 200 | k = next((index for index, value in 201 | enumerate(SAC2) if value < np.exp(-1)), None) 202 | 203 | return slots[k] * self.T 204 | 205 | def getAtt(self): 206 | # return SlottedA_length.SAC, SlottedA_length.slots 207 | return SlottedA_length.autocor_vector 208 | 209 | 210 | class StetsonK_AC(SlottedA_length): 211 | 212 | def __init__(self): 213 | 214 | self.Data = ['magnitude', 'time', 'error'] 215 | 216 | def fit(self, data): 217 | 218 | try: 219 | 220 | a = StetsonK_AC() 221 | # [autocor_vector, slots] = a.getAtt() 222 | autocor_vector = a.getAtt() 223 | 224 | # autocor_vector = autocor_vector[slots] 225 | N_autocor = len(autocor_vector) 226 | sigmap = (np.sqrt(N_autocor * 1.0 / (N_autocor - 1)) * 227 | (autocor_vector - np.mean(autocor_vector)) / 228 | np.std(autocor_vector)) 229 | 
230 | K = (1 / np.sqrt(N_autocor * 1.0) * 231 | np.sum(np.abs(sigmap)) / np.sqrt(np.sum(sigmap ** 2))) 232 | 233 | return K 234 | 235 | except: 236 | 237 | print "error: please run SlottedA_length first to generate values for StetsonK_AC " 238 | 239 | 240 | class StetsonL(Base): 241 | def __init__(self): 242 | self.Data = ['magnitude', 'time', 'error', 'magnitude2', 'error2'] 243 | 244 | def fit(self, data): 245 | 246 | aligned_magnitude = data[4] 247 | aligned_magnitude2 = data[5] 248 | aligned_error = data[7] 249 | aligned_error2 = data[8] 250 | 251 | N = len(aligned_magnitude) 252 | 253 | mean_mag = (np.sum(aligned_magnitude/(aligned_error*aligned_error)) / 254 | np.sum(1.0 / (aligned_error * aligned_error))) 255 | mean_mag2 = (np.sum(aligned_magnitude2/(aligned_error2*aligned_error2)) / 256 | np.sum(1.0 / (aligned_error2 * aligned_error2))) 257 | 258 | sigmap = (np.sqrt(N * 1.0 / (N - 1)) * 259 | (aligned_magnitude[:N] - mean_mag) / 260 | aligned_error) 261 | 262 | sigmaq = (np.sqrt(N * 1.0 / (N - 1)) * 263 | (aligned_magnitude2[:N] - mean_mag2) / 264 | aligned_error2) 265 | sigma_i = sigmap * sigmaq 266 | 267 | J = (1.0 / len(sigma_i) * 268 | np.sum(np.sign(sigma_i) * np.sqrt(np.abs(sigma_i)))) 269 | 270 | K = (1 / np.sqrt(N * 1.0) * 271 | np.sum(np.abs(sigma_i)) / np.sqrt(np.sum(sigma_i ** 2))) 272 | 273 | return J * K / 0.798 274 | 275 | 276 | class Con(Base): 277 | """Index introduced for selection of variable starts from OGLE database. 278 | 279 | 280 | To calculate Con, we counted the number of three consecutive measurements 281 | that are out of 2sigma range, and normalized by N-2 282 | Pavlos not happy 283 | """ 284 | def __init__(self, consecutiveStar=3): 285 | self.Data = ['magnitude'] 286 | 287 | self.consecutiveStar = consecutiveStar 288 | 289 | def fit(self, data): 290 | 291 | magnitude = data[0] 292 | N = len(magnitude) 293 | if N < self.consecutiveStar: 294 | return 0 295 | sigma = np.std(magnitude) 296 | m = np.mean(magnitude) 297 | count = 0 298 | 299 | for i in xrange(N - self.consecutiveStar + 1): 300 | flag = 0 301 | for j in xrange(self.consecutiveStar): 302 | if(magnitude[i + j] > m + 2 * sigma or magnitude[i + j] < m - 2 * sigma): 303 | flag = 1 304 | else: 305 | flag = 0 306 | break 307 | if flag: 308 | count = count + 1 309 | return count * 1.0 / (N - self.consecutiveStar + 1) 310 | 311 | 312 | # class VariabilityIndex(Base): 313 | 314 | # # Eta. Removed, it is not invariant to time sampling 315 | # ''' 316 | # The index is the ratio of mean of the square of successive difference to 317 | # the variance of data points 318 | # ''' 319 | # def __init__(self): 320 | # self.category='timeSeries' 321 | 322 | 323 | # def fit(self, data): 324 | 325 | # N = len(data) 326 | # sigma2 = np.var(data) 327 | 328 | # return 1.0/((N-1)*sigma2) * np.sum(np.power(data[1:] - data[:-1] , 2) 329 | # ) 330 | 331 | 332 | class Color(Base): 333 | """Average color for each MACHO lightcurve 334 | mean(B1) - mean(B2) 335 | """ 336 | def __init__(self): 337 | self.Data = ['magnitude', 'time', 'magnitude2'] 338 | 339 | def fit(self, data): 340 | magnitude = data[0] 341 | magnitude2 = data[3] 342 | return np.mean(magnitude) - np.mean(magnitude2) 343 | 344 | # The categories of the following featurs should be revised 345 | 346 | 347 | class Beyond1Std(Base): 348 | """Percentage of points beyond one st. dev. 
from the weighted 349 | (by photometric errors) mean 350 | """ 351 | 352 | def __init__(self): 353 | self.Data = ['magnitude', 'error'] 354 | 355 | def fit(self, data): 356 | 357 | magnitude = data[0] 358 | error = data[2] 359 | n = len(magnitude) 360 | 361 | weighted_mean = np.average(magnitude, weights=1 / error ** 2) 362 | 363 | # Standard deviation with respect to the weighted mean 364 | 365 | var = sum((magnitude - weighted_mean) ** 2) 366 | std = np.sqrt((1.0 / (n - 1)) * var) 367 | 368 | count = np.sum(np.logical_or(magnitude > weighted_mean + std, 369 | magnitude < weighted_mean - std)) 370 | 371 | return float(count) / n 372 | 373 | 374 | class SmallKurtosis(Base): 375 | """Small sample kurtosis of the magnitudes. 376 | 377 | See http://www.xycoon.com/peakedness_small_sample_test_1.htm 378 | """ 379 | 380 | def __init__(self): 381 | self.category = 'basic' 382 | self.Data = ['magnitude'] 383 | 384 | def fit(self, data): 385 | magnitude = data[0] 386 | n = len(magnitude) 387 | mean = np.mean(magnitude) 388 | std = np.std(magnitude) 389 | 390 | S = sum(((magnitude - mean) / std) ** 4) 391 | 392 | c1 = float(n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3)) 393 | c2 = float(3 * (n - 1) ** 2) / ((n - 2) * (n - 3)) 394 | 395 | return c1 * S - c2 396 | 397 | 398 | class Std(Base): 399 | """Standard deviation of the magnitudes""" 400 | 401 | def __init__(self): 402 | self.Data = ['magnitude'] 403 | 404 | def fit(self, data): 405 | magnitude = data[0] 406 | return np.std(magnitude) 407 | 408 | 409 | class Skew(Base): 410 | """Skewness of the magnitudes""" 411 | 412 | def __init__(self): 413 | self.Data = ['magnitude'] 414 | 415 | def fit(self, data): 416 | magnitude = data[0] 417 | return stats.skew(magnitude) 418 | 419 | 420 | class StetsonJ(Base): 421 | """Stetson (1996) variability index, a robust standard deviation""" 422 | 423 | def __init__(self): 424 | self.Data = ['magnitude', 'time', 'error', 'magnitude2', 'error2'] 425 | 426 | def fit(self, data): 427 | aligned_magnitude = data[4] 428 | aligned_magnitude2 = data[5] 429 | aligned_error = data[7] 430 | aligned_error2 = data[8] 431 | N = len(aligned_magnitude) 432 | 433 | mean_mag = (np.sum(aligned_magnitude/(aligned_error*aligned_error)) / 434 | np.sum(1.0 / (aligned_error * aligned_error))) 435 | 436 | mean_mag2 = (np.sum(aligned_magnitude2 / (aligned_error2*aligned_error2)) / 437 | np.sum(1.0 / (aligned_error2 * aligned_error2))) 438 | 439 | sigmap = (np.sqrt(N * 1.0 / (N - 1)) * 440 | (aligned_magnitude[:N] - mean_mag) / 441 | aligned_error) 442 | sigmaq = (np.sqrt(N * 1.0 / (N - 1)) * 443 | (aligned_magnitude2[:N] - mean_mag2) / 444 | aligned_error2) 445 | sigma_i = sigmap * sigmaq 446 | 447 | J = (1.0 / len(sigma_i) * np.sum(np.sign(sigma_i) * 448 | np.sqrt(np.abs(sigma_i)))) 449 | 450 | return J 451 | 452 | 453 | class MaxSlope(Base): 454 | """ 455 | Examining successive (time-sorted) magnitudes, the maximal first difference 456 | (value of delta magnitude over delta time) 457 | """ 458 | 459 | def __init__(self): 460 | self.Data = ['magnitude', 'time'] 461 | 462 | def fit(self, data): 463 | 464 | magnitude = data[0] 465 | time = data[1] 466 | slope = np.abs(magnitude[1:] - magnitude[:-1]) / (time[1:] - time[:-1]) 467 | np.max(slope) 468 | 469 | return np.max(slope) 470 | 471 | 472 | class MedianAbsDev(Base): 473 | 474 | def __init__(self): 475 | self.category = 'basic' 476 | self.Data = ['magnitude'] 477 | 478 | def fit(self, data): 479 | magnitude = data[0] 480 | median = np.median(magnitude) 481 | 482 | devs = (abs(magnitude - 
median)) 483 | 484 | return np.median(devs) 485 | 486 | 487 | class MedianBRP(Base): 488 | """Median buffer range percentage 489 | 490 | Fraction (<= 1) of photometric points within amplitude/10 491 | of the median magnitude 492 | """ 493 | 494 | def __init__(self): 495 | self.Data = ['magnitude'] 496 | 497 | def fit(self, data): 498 | magnitude = data[0] 499 | median = np.median(magnitude) 500 | amplitude = (np.max(magnitude) - np.min(magnitude)) / 10 501 | n = len(magnitude) 502 | 503 | count = np.sum(np.logical_and(magnitude < median + amplitude, 504 | magnitude > median - amplitude)) 505 | 506 | return float(count) / n 507 | 508 | 509 | class PairSlopeTrend(Base): 510 | """ 511 | Considering the last 30 (time-sorted) measurements of source magnitude, 512 | the fraction of increasing first differences minus the fraction of 513 | decreasing first differences. 514 | """ 515 | 516 | def __init__(self): 517 | self.Data = ['magnitude'] 518 | 519 | def fit(self, data): 520 | magnitude = data[0] 521 | data_last = magnitude[-30:] 522 | 523 | return (float(len(np.where(np.diff(data_last) > 0)[0]) - 524 | len(np.where(np.diff(data_last) <= 0)[0])) / 30) 525 | 526 | 527 | class FluxPercentileRatioMid20(Base): 528 | 529 | def __init__(self): 530 | self.Data = ['magnitude'] 531 | 532 | def fit(self, data): 533 | magnitude = data[0] 534 | sorted_data = np.sort(magnitude) 535 | lc_length = len(sorted_data) 536 | 537 | F_60_index = int(math.ceil(0.60 * lc_length)) 538 | F_40_index = int(math.ceil(0.40 * lc_length)) 539 | F_5_index = int(math.ceil(0.05 * lc_length)) 540 | F_95_index = int(math.ceil(0.95 * lc_length)) 541 | 542 | F_40_60 = sorted_data[F_60_index] - sorted_data[F_40_index] 543 | F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index] 544 | F_mid20 = F_40_60 / F_5_95 545 | 546 | return F_mid20 547 | 548 | 549 | class FluxPercentileRatioMid35(Base): 550 | 551 | def __init__(self): 552 | self.Data = ['magnitude'] 553 | 554 | def fit(self, data): 555 | magnitude = data[0] 556 | sorted_data = np.sort(magnitude) 557 | lc_length = len(sorted_data) 558 | 559 | F_325_index = int(math.ceil(0.325 * lc_length)) 560 | F_675_index = int(math.ceil(0.675 * lc_length)) 561 | F_5_index = int(math.ceil(0.05 * lc_length)) 562 | F_95_index = int(math.ceil(0.95 * lc_length)) 563 | 564 | F_325_675 = sorted_data[F_675_index] - sorted_data[F_325_index] 565 | F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index] 566 | F_mid35 = F_325_675 / F_5_95 567 | 568 | return F_mid35 569 | 570 | 571 | class FluxPercentileRatioMid50(Base): 572 | 573 | def __init__(self): 574 | self.Data = ['magnitude'] 575 | 576 | def fit(self, data): 577 | magnitude = data[0] 578 | sorted_data = np.sort(magnitude) 579 | lc_length = len(sorted_data) 580 | 581 | F_25_index = int(math.ceil(0.25 * lc_length)) 582 | F_75_index = int(math.ceil(0.75 * lc_length)) 583 | F_5_index = int(math.ceil(0.05 * lc_length)) 584 | F_95_index = int(math.ceil(0.95 * lc_length)) 585 | 586 | F_25_75 = sorted_data[F_75_index] - sorted_data[F_25_index] 587 | F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index] 588 | F_mid50 = F_25_75 / F_5_95 589 | 590 | return F_mid50 591 | 592 | 593 | class FluxPercentileRatioMid65(Base): 594 | 595 | def __init__(self): 596 | self.Data = ['magnitude'] 597 | 598 | def fit(self, data): 599 | magnitude = data[0] 600 | sorted_data = np.sort(magnitude) 601 | lc_length = len(sorted_data) 602 | 603 | F_175_index = int(math.ceil(0.175 * lc_length)) 604 | F_825_index = int(math.ceil(0.825 * lc_length)) 605 | F_5_index = 
int(math.ceil(0.05 * lc_length)) 606 | F_95_index = int(math.ceil(0.95 * lc_length)) 607 | 608 | F_175_825 = sorted_data[F_825_index] - sorted_data[F_175_index] 609 | F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index] 610 | F_mid65 = F_175_825 / F_5_95 611 | 612 | return F_mid65 613 | 614 | 615 | class FluxPercentileRatioMid80(Base): 616 | 617 | def __init__(self): 618 | self.Data = ['magnitude'] 619 | 620 | def fit(self, data): 621 | magnitude = data[0] 622 | sorted_data = np.sort(magnitude) 623 | lc_length = len(sorted_data) 624 | 625 | F_10_index = int(math.ceil(0.10 * lc_length)) 626 | F_90_index = int(math.ceil(0.90 * lc_length)) 627 | F_5_index = int(math.ceil(0.05 * lc_length)) 628 | F_95_index = int(math.ceil(0.95 * lc_length)) 629 | 630 | F_10_90 = sorted_data[F_90_index] - sorted_data[F_10_index] 631 | F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index] 632 | F_mid80 = F_10_90 / F_5_95 633 | 634 | return F_mid80 635 | 636 | 637 | class PercentDifferenceFluxPercentile(Base): 638 | 639 | def __init__(self): 640 | self.Data = ['magnitude'] 641 | 642 | def fit(self, data): 643 | magnitude = data[0] 644 | median_data = np.median(magnitude) 645 | 646 | sorted_data = np.sort(magnitude) 647 | lc_length = len(sorted_data) 648 | F_5_index = int(math.ceil(0.05 * lc_length)) 649 | F_95_index = int(math.ceil(0.95 * lc_length)) 650 | F_5_95 = sorted_data[F_95_index] - sorted_data[F_5_index] 651 | 652 | percent_difference = F_5_95 / median_data 653 | 654 | return percent_difference 655 | 656 | 657 | class PercentAmplitude(Base): 658 | 659 | def __init__(self): 660 | self.Data = ['magnitude'] 661 | 662 | def fit(self, data): 663 | magnitude = data[0] 664 | median_data = np.median(magnitude) 665 | distance_median = np.abs(magnitude - median_data) 666 | max_distance = np.max(distance_median) 667 | 668 | percent_amplitude = max_distance / median_data 669 | 670 | return percent_amplitude 671 | 672 | 673 | class LinearTrend(Base): 674 | 675 | def __init__(self): 676 | self.Data = ['magnitude', 'time'] 677 | 678 | def fit(self, data): 679 | magnitude = data[0] 680 | time = data[1] 681 | regression_slope = stats.linregress(time, magnitude)[0] 682 | 683 | return regression_slope 684 | 685 | 686 | class Eta_color(Base): 687 | 688 | def __init__(self): 689 | 690 | self.Data = ['magnitude', 'time', 'magnitude2'] 691 | 692 | def fit(self, data): 693 | aligned_magnitude = data[4] 694 | aligned_magnitude2 = data[5] 695 | aligned_time = data[6] 696 | N = len(aligned_magnitude) 697 | B_Rdata = aligned_magnitude - aligned_magnitude2 698 | 699 | w = 1.0 / np.power(aligned_time[1:] - aligned_time[:-1], 2) 700 | w_mean = np.mean(w) 701 | 702 | N = len(aligned_time) 703 | sigma2 = np.var(B_Rdata) 704 | 705 | S1 = sum(w * (B_Rdata[1:] - B_Rdata[:-1]) ** 2) 706 | S2 = sum(w) 707 | 708 | eta_B_R = (w_mean * np.power(aligned_time[N - 1] - 709 | aligned_time[0], 2) * S1 / (sigma2 * S2 * N ** 2)) 710 | 711 | return eta_B_R 712 | 713 | 714 | class Eta_e(Base): 715 | 716 | def __init__(self): 717 | 718 | self.Data = ['magnitude', 'time'] 719 | 720 | def fit(self, data): 721 | 722 | magnitude = data[0] 723 | time = data[1] 724 | w = 1.0 / np.power(np.subtract(time[1:], time[:-1]), 2) 725 | w_mean = np.mean(w) 726 | 727 | N = len(time) 728 | sigma2 = np.var(magnitude) 729 | 730 | S1 = sum(w * (magnitude[1:] - magnitude[:-1]) ** 2) 731 | S2 = sum(w) 732 | 733 | eta_e = (w_mean * np.power(time[N - 1] - 734 | time[0], 2) * S1 / (sigma2 * S2 * N ** 2)) 735 | 736 | return eta_e 737 | 738 | 739 | class Mean(Base): 740 
| 741 | def __init__(self): 742 | 743 | self.Data = ['magnitude'] 744 | 745 | def fit(self, data): 746 | magnitude = data[0] 747 | B_mean = np.mean(magnitude) 748 | 749 | return B_mean 750 | 751 | 752 | class Q31(Base): 753 | 754 | def __init__(self): 755 | 756 | self.Data = ['magnitude'] 757 | 758 | def fit(self, data): 759 | magnitude = data[0] 760 | return np.percentile(magnitude, 75) - np.percentile(magnitude, 25) 761 | 762 | 763 | class Q31_color(Base): 764 | 765 | def __init__(self): 766 | 767 | self.Data = ['magnitude', 'time', 'magnitude2'] 768 | 769 | def fit(self, data): 770 | aligned_magnitude = data[4] 771 | aligned_magnitude2 = data[5] 772 | N = len(aligned_magnitude) 773 | b_r = aligned_magnitude[:N] - aligned_magnitude2[:N] 774 | 775 | return np.percentile(b_r, 75) - np.percentile(b_r, 25) 776 | 777 | 778 | class AndersonDarling(Base): 779 | 780 | def __init__(self): 781 | 782 | self.Data = ['magnitude'] 783 | 784 | def fit(self, data): 785 | 786 | magnitude = data[0] 787 | ander = stats.anderson(magnitude)[0] 788 | return 1 / (1.0 + np.exp(-10 * (ander - 0.3))) 789 | 790 | 791 | class PeriodLS(Base): 792 | 793 | def __init__(self, ofac=6.): 794 | 795 | self.Data = ['magnitude', 'time'] 796 | self.ofac = ofac 797 | 798 | def fit(self, data): 799 | 800 | magnitude = data[0] 801 | time = data[1] 802 | 803 | global new_time 804 | global prob 805 | global period 806 | 807 | fx, fy, nout, jmax, prob = lomb.fasper(time, magnitude, self.ofac, 100.) 808 | period = fx[jmax] 809 | T = 1.0 / period 810 | new_time = np.mod(time, 2 * T) / (2 * T) 811 | 812 | return T 813 | 814 | 815 | class Period_fit(Base): 816 | 817 | def __init__(self): 818 | 819 | self.Data = ['magnitude', 'time'] 820 | 821 | def fit(self, data): 822 | 823 | try: 824 | return prob 825 | except: 826 | print "error: please run PeriodLS first to generate values for Period_fit" 827 | 828 | 829 | class Psi_CS(Base): 830 | 831 | def __init__(self): 832 | 833 | self.Data = ['magnitude', 'time'] 834 | 835 | def fit(self, data): 836 | 837 | try: 838 | magnitude = data[0] 839 | time = data[1] 840 | folded_data = magnitude[np.argsort(new_time)] 841 | sigma = np.std(folded_data) 842 | N = len(folded_data) 843 | m = np.mean(folded_data) 844 | s = np.cumsum(folded_data - m) * 1.0 / (N * sigma) 845 | R = np.max(s) - np.min(s) 846 | 847 | return R 848 | except: 849 | print "error: please run PeriodLS first to generate values for Psi_CS" 850 | 851 | 852 | class Psi_eta(Base): 853 | 854 | def __init__(self): 855 | 856 | self.Data = ['magnitude', 'time'] 857 | 858 | def fit(self, data): 859 | 860 | # folded_time = np.sort(new_time) 861 | try: 862 | magnitude = data[0] 863 | folded_data = magnitude[np.argsort(new_time)] 864 | 865 | # w = 1.0 / np.power(folded_time[1:]-folded_time[:-1] ,2) 866 | # w_mean = np.mean(w) 867 | 868 | # N = len(folded_time) 869 | # sigma2=np.var(folded_data) 870 | 871 | # S1 = sum(w*(folded_data[1:]-folded_data[:-1])**2) 872 | # S2 = sum(w) 873 | 874 | # Psi_eta = w_mean * np.power(folded_time[N-1]-folded_time[0],2) * S1 / 875 | # (sigma2 * S2 * N**2) 876 | 877 | N = len(folded_data) 878 | sigma2 = np.var(folded_data) 879 | 880 | Psi_eta = (1.0 / ((N - 1) * sigma2) * 881 | np.sum(np.power(folded_data[1:] - folded_data[:-1], 2))) 882 | 883 | return Psi_eta 884 | except: 885 | print "error: please run PeriodLS first to generate values for Psi_eta" 886 | 887 | 888 | class CAR_sigma(Base): 889 | 890 | def __init__(self): 891 | 892 | self.Data = ['magnitude', 'time', 'error'] 893 | 894 | def CAR_Lik(self, 
parameters, t, x, error_vars): 895 | 896 | sigma = parameters[0] 897 | tau = parameters[1] 898 | # b = parameters[1] #comment it to do 2 pars estimation 899 | # tau = params(1,1); 900 | # sigma = sqrt(2*var(x)/tau); 901 | 902 | b = np.mean(x) / tau 903 | epsilon = 1e-300 904 | cte_neg = -np.infty 905 | num_datos = np.size(x) 906 | 907 | Omega = [] 908 | x_hat = [] 909 | a = [] 910 | x_ast = [] 911 | 912 | # Omega = np.zeros((num_datos,1)) 913 | # x_hat = np.zeros((num_datos,1)) 914 | # a = np.zeros((num_datos,1)) 915 | # x_ast = np.zeros((num_datos,1)) 916 | 917 | # Omega[0]=(tau*(sigma**2))/2. 918 | # x_hat[0]=0. 919 | # a[0]=0. 920 | # x_ast[0]=x[0] - b*tau 921 | 922 | Omega.append((tau * (sigma ** 2)) / 2.) 923 | x_hat.append(0.) 924 | a.append(0.) 925 | x_ast.append(x[0] - b * tau) 926 | 927 | loglik = 0. 928 | 929 | for i in range(1, num_datos): 930 | 931 | a_new = np.exp(-(t[i] - t[i - 1]) / tau) 932 | x_ast.append(x[i] - b * tau) 933 | x_hat.append( 934 | a_new * x_hat[i - 1] + 935 | (a_new * Omega[i - 1] / (Omega[i - 1] + error_vars[i - 1])) * 936 | (x_ast[i - 1] - x_hat[i - 1])) 937 | 938 | Omega.append( 939 | Omega[0] * (1 - (a_new ** 2)) + ((a_new ** 2)) * Omega[i - 1] * 940 | (1 - (Omega[i - 1] / (Omega[i - 1] + error_vars[i - 1])))) 941 | 942 | # x_ast[i]=x[i] - b*tau 943 | # x_hat[i]=a_new*x_hat[i-1] + (a_new*Omega[i-1]/(Omega[i-1] + 944 | # error_vars[i-1]))*(x_ast[i-1]-x_hat[i-1]) 945 | # Omega[i]=Omega[0]*(1-(a_new**2)) + ((a_new**2))*Omega[i-1]* 946 | # ( 1 - (Omega[i-1]/(Omega[i-1]+ error_vars[i-1]))) 947 | 948 | loglik_inter = np.log( 949 | ((2 * np.pi * (Omega[i] + error_vars[i])) ** -0.5) * 950 | (np.exp(-0.5 * (((x_hat[i] - x_ast[i]) ** 2) / 951 | (Omega[i] + error_vars[i]))) + epsilon)) 952 | 953 | loglik = loglik + loglik_inter 954 | 955 | if(loglik <= cte_neg): 956 | print('CAR lik se fue a inf') 957 | return None 958 | 959 | # the minus one is to perfor maximization using the minimize function 960 | return -loglik 961 | 962 | def calculateCAR(self, time, data, error): 963 | 964 | x0 = [10, 0.5] 965 | bnds = ((0, 100), (0, 100)) 966 | # res = minimize(self.CAR_Lik, x0, args=(LC[:,0],LC[:,1],LC[:,2]) , 967 | # method='nelder-mead',bounds = bnds) 968 | 969 | res = minimize(self.CAR_Lik, x0, args=(time, data, error), 970 | method='powell', bounds=bnds) 971 | # options={'disp': True} 972 | sigma = res.x[0] 973 | global tau 974 | tau = res.x[1] 975 | return sigma 976 | 977 | # def getAtt(self): 978 | # return CAR_sigma.tau 979 | 980 | def fit(self, data): 981 | # LC = np.hstack((self.time , data.reshape((self.N,1)), self.error)) 982 | 983 | N = len(data[0]) 984 | magnitude = data[0].reshape((N, 1)) 985 | time = data[1].reshape((N, 1)) 986 | error = data[2].reshape((N, 1)) ** 2 987 | 988 | a = self.calculateCAR(time, magnitude, error) 989 | 990 | return a 991 | 992 | 993 | class CAR_tau(Base): 994 | 995 | def __init__(self): 996 | 997 | self.Data = ['magnitude', 'time', 'error'] 998 | 999 | def fit(self, data): 1000 | 1001 | try: 1002 | return tau 1003 | except: 1004 | print "error: please run CAR_sigma first to generate values for CAR_tau" 1005 | 1006 | 1007 | class CAR_mean(Base): 1008 | 1009 | def __init__(self): 1010 | 1011 | self.Data = ['magnitude', 'time', 'error'] 1012 | 1013 | def fit(self, data): 1014 | 1015 | magnitude = data[0] 1016 | 1017 | try: 1018 | return np.mean(magnitude) / tau 1019 | except: 1020 | print "error: please run CAR_sigma first to generate values for CAR_mean" 1021 | 1022 | 1023 | class Freq1_harmonics_amplitude_0(Base): 1024 | def 
__init__(self): 1025 | self.Data = ['magnitude', 'time'] 1026 | 1027 | def fit(self, data): 1028 | magnitude = data[0] 1029 | time = data[1] 1030 | 1031 | time = time - np.min(time) 1032 | 1033 | global A 1034 | global PH 1035 | global scaledPH 1036 | A = [] 1037 | PH = [] 1038 | scaledPH = [] 1039 | 1040 | def model(x, a, b, c, Freq): 1041 | return a*np.sin(2*np.pi*Freq*x)+b*np.cos(2*np.pi*Freq*x)+c 1042 | 1043 | for i in range(3): 1044 | 1045 | wk1, wk2, nout, jmax, prob = lomb.fasper(time, magnitude, 6., 100.) 1046 | 1047 | fundamental_Freq = wk1[jmax] 1048 | 1049 | # fit to a_i sin(2pi f_i t) + b_i cos(2 pi f_i t) + b_i,o 1050 | 1051 | # a, b are the parameters we care about 1052 | # c is a constant offset 1053 | # f is the fundamental Frequency 1054 | def yfunc(Freq): 1055 | def func(x, a, b, c): 1056 | return a*np.sin(2*np.pi*Freq*x)+b*np.cos(2*np.pi*Freq*x)+c 1057 | return func 1058 | 1059 | Atemp = [] 1060 | PHtemp = [] 1061 | popts = [] 1062 | 1063 | for j in range(4): 1064 | popt, pcov = curve_fit(yfunc((j+1)*fundamental_Freq), time, magnitude) 1065 | Atemp.append(np.sqrt(popt[0]**2+popt[1]**2)) 1066 | PHtemp.append(np.arctan(popt[1] / popt[0])) 1067 | popts.append(popt) 1068 | 1069 | A.append(Atemp) 1070 | PH.append(PHtemp) 1071 | 1072 | for j in range(4): 1073 | magnitude = np.array(magnitude) - model(time, popts[j][0], popts[j][1], popts[j][2], (j+1)*fundamental_Freq) 1074 | 1075 | for ph in PH: 1076 | scaledPH.append(np.array(ph) - ph[0]) 1077 | 1078 | return A[0][0] 1079 | 1080 | 1081 | class Freq1_harmonics_amplitude_1(Base): 1082 | def __init__(self): 1083 | 1084 | self.Data = ['magnitude', 'time'] 1085 | 1086 | def fit(self, data): 1087 | 1088 | try: 1089 | return A[0][1] 1090 | except: 1091 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1092 | 1093 | 1094 | class Freq1_harmonics_amplitude_2(Base): 1095 | def __init__(self): 1096 | 1097 | self.Data = ['magnitude', 'time'] 1098 | 1099 | def fit(self, data): 1100 | 1101 | try: 1102 | return A[0][2] 1103 | except: 1104 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1105 | 1106 | 1107 | class Freq1_harmonics_amplitude_3(Base): 1108 | def __init__(self): 1109 | 1110 | self.Data = ['magnitude', 'time'] 1111 | 1112 | def fit(self, data): 1113 | try: 1114 | return A[0][3] 1115 | except: 1116 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1117 | 1118 | 1119 | class Freq2_harmonics_amplitude_0(Base): 1120 | def __init__(self): 1121 | 1122 | self.Data = ['magnitude', 'time'] 1123 | 1124 | def fit(self, data): 1125 | try: 1126 | return A[1][0] 1127 | except: 1128 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1129 | 1130 | 1131 | class Freq2_harmonics_amplitude_1(Base): 1132 | def __init__(self): 1133 | 1134 | self.Data = ['magnitude', 'time'] 1135 | 1136 | def fit(self, data): 1137 | try: 1138 | return A[1][1] 1139 | except: 1140 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1141 | 1142 | 1143 | class Freq2_harmonics_amplitude_2(Base): 1144 | def __init__(self): 1145 | 1146 | self.Data = ['magnitude', 'time'] 1147 | 1148 | def fit(self, data): 1149 | try: 1150 | return A[1][2] 1151 | except: 1152 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1153 | 1154 | 1155 | class Freq2_harmonics_amplitude_3(Base): 1156 | def 
__init__(self): 1157 | 1158 | self.Data = ['magnitude', 'time'] 1159 | 1160 | def fit(self, data): 1161 | try: 1162 | return A[1][3] 1163 | except: 1164 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1165 | 1166 | 1167 | class Freq3_harmonics_amplitude_0(Base): 1168 | def __init__(self): 1169 | 1170 | self.Data = ['magnitude', 'time'] 1171 | 1172 | def fit(self, data): 1173 | try: 1174 | return A[2][0] 1175 | except: 1176 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1177 | 1178 | 1179 | class Freq3_harmonics_amplitude_1(Base): 1180 | def __init__(self): 1181 | 1182 | self.Data = ['magnitude', 'time'] 1183 | 1184 | def fit(self, data): 1185 | try: 1186 | return A[2][1] 1187 | except: 1188 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1189 | 1190 | 1191 | class Freq3_harmonics_amplitude_2(Base): 1192 | def __init__(self): 1193 | 1194 | self.Data = ['magnitude', 'time'] 1195 | 1196 | def fit(self, data): 1197 | try: 1198 | return A[2][2] 1199 | except: 1200 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1201 | 1202 | 1203 | class Freq3_harmonics_amplitude_3(Base): 1204 | def __init__(self): 1205 | 1206 | self.Data = ['magnitude', 'time'] 1207 | 1208 | def fit(self, data): 1209 | try: 1210 | return A[2][3] 1211 | except: 1212 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1213 | 1214 | 1215 | class Freq1_harmonics_rel_phase_0(Base): 1216 | def __init__(self): 1217 | 1218 | self.Data = ['magnitude', 'time'] 1219 | 1220 | def fit(self, data): 1221 | try: 1222 | return scaledPH[0][0] 1223 | except: 1224 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1225 | 1226 | 1227 | class Freq1_harmonics_rel_phase_1(Base): 1228 | def __init__(self): 1229 | 1230 | self.Data = ['magnitude', 'time'] 1231 | 1232 | def fit(self, data): 1233 | try: 1234 | return scaledPH[0][1] 1235 | except: 1236 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1237 | 1238 | 1239 | class Freq1_harmonics_rel_phase_2(Base): 1240 | def __init__(self): 1241 | 1242 | self.Data = ['magnitude', 'time'] 1243 | 1244 | def fit(self, data): 1245 | try: 1246 | return scaledPH[0][2] 1247 | except: 1248 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1249 | 1250 | 1251 | class Freq1_harmonics_rel_phase_3(Base): 1252 | def __init__(self): 1253 | 1254 | self.Data = ['magnitude', 'time'] 1255 | 1256 | def fit(self, data): 1257 | try: 1258 | return scaledPH[0][3] 1259 | except: 1260 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1261 | 1262 | 1263 | class Freq2_harmonics_rel_phase_0(Base): 1264 | def __init__(self): 1265 | self.category = 'timeSeries' 1266 | 1267 | self.Data = ['magnitude', 'time'] 1268 | 1269 | def fit(self, data): 1270 | try: 1271 | return scaledPH[1][0] 1272 | except: 1273 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1274 | 1275 | 1276 | class Freq2_harmonics_rel_phase_1(Base): 1277 | def __init__(self): 1278 | 1279 | self.Data = ['magnitude', 'time'] 1280 | 1281 | def fit(self, data): 1282 | try: 1283 | return scaledPH[1][1] 1284 | except: 1285 | print "error: please run Freq1_harmonics_amplitude_0 first to generate 
values for all harmonics" 1286 | 1287 | 1288 | class Freq2_harmonics_rel_phase_2(Base): 1289 | def __init__(self): 1290 | 1291 | self.Data = ['magnitude', 'time'] 1292 | 1293 | def fit(self, data): 1294 | try: 1295 | return scaledPH[1][2] 1296 | except: 1297 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1298 | 1299 | 1300 | class Freq2_harmonics_rel_phase_3(Base): 1301 | def __init__(self): 1302 | 1303 | self.Data = ['magnitude', 'time'] 1304 | 1305 | def fit(self, data): 1306 | try: 1307 | return scaledPH[1][3] 1308 | except: 1309 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1310 | 1311 | 1312 | class Freq3_harmonics_rel_phase_0(Base): 1313 | def __init__(self): 1314 | 1315 | self.Data = ['magnitude', 'time'] 1316 | 1317 | def fit(self, data): 1318 | try: 1319 | return scaledPH[2][0] 1320 | except: 1321 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1322 | 1323 | 1324 | class Freq3_harmonics_rel_phase_1(Base): 1325 | def __init__(self): 1326 | 1327 | self.Data = ['magnitude', 'time'] 1328 | 1329 | def fit(self, data): 1330 | try: 1331 | return scaledPH[2][1] 1332 | except: 1333 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1334 | 1335 | 1336 | class Freq3_harmonics_rel_phase_2(Base): 1337 | def __init__(self): 1338 | 1339 | self.Data = ['magnitude', 'time'] 1340 | 1341 | def fit(self, data): 1342 | try: 1343 | return scaledPH[2][2] 1344 | except: 1345 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1346 | 1347 | 1348 | class Freq3_harmonics_rel_phase_3(Base): 1349 | def __init__(self): 1350 | 1351 | self.Data = ['magnitude', 'time'] 1352 | 1353 | def fit(self, data): 1354 | try: 1355 | return scaledPH[2][3] 1356 | except: 1357 | print "error: please run Freq1_harmonics_amplitude_0 first to generate values for all harmonics" 1358 | 1359 | 1360 | class Gskew(Base): 1361 | """Median-based measure of the skew""" 1362 | 1363 | def __init__(self): 1364 | self.Data = ['magnitude'] 1365 | 1366 | def fit(self, data): 1367 | magnitude = np.array(data[0]) 1368 | median_mag = np.median(magnitude) 1369 | F_3_value = np.percentile(magnitude, 3) 1370 | F_97_value = np.percentile(magnitude, 97) 1371 | 1372 | return (np.median(magnitude[magnitude <= F_3_value]) + 1373 | np.median(magnitude[magnitude >= F_97_value]) 1374 | - 2*median_mag) 1375 | 1376 | 1377 | class StructureFunction_index_21(Base): 1378 | 1379 | def __init__(self): 1380 | self.Data = ['magnitude', 'time'] 1381 | 1382 | def fit(self, data): 1383 | magnitude = data[0] 1384 | time = data[1] 1385 | 1386 | global m_21 1387 | global m_31 1388 | global m_32 1389 | 1390 | Nsf = 100 1391 | Np = 100 1392 | sf1 = np.zeros(Nsf) 1393 | sf2 = np.zeros(Nsf) 1394 | sf3 = np.zeros(Nsf) 1395 | f = interp1d(time, magnitude) 1396 | 1397 | time_int = np.linspace(np.min(time), np.max(time), Np) 1398 | mag_int = f(time_int) 1399 | 1400 | for tau in np.arange(1, Nsf): 1401 | sf1[tau-1] = np.mean(np.power(np.abs(mag_int[0:Np-tau] - mag_int[tau:Np]) , 1.0)) 1402 | sf2[tau-1] = np.mean(np.abs(np.power(np.abs(mag_int[0:Np-tau] - mag_int[tau:Np]) , 2.0))) 1403 | sf3[tau-1] = np.mean(np.abs(np.power(np.abs(mag_int[0:Np-tau] - mag_int[tau:Np]) , 3.0))) 1404 | sf1_log = np.log10(np.trim_zeros(sf1)) 1405 | sf2_log = np.log10(np.trim_zeros(sf2)) 1406 | sf3_log = np.log10(np.trim_zeros(sf3)) 1407 | 1408 | m_21, b_21 
= np.polyfit(sf1_log, sf2_log, 1) 1409 | m_31, b_31 = np.polyfit(sf1_log, sf3_log, 1) 1410 | m_32, b_32 = np.polyfit(sf2_log, sf3_log, 1) 1411 | 1412 | return m_21 1413 | 1414 | 1415 | class StructureFunction_index_31(Base): 1416 | def __init__(self): 1417 | 1418 | self.Data = ['magnitude', 'time'] 1419 | 1420 | def fit(self, data): 1421 | try: 1422 | return m_31 1423 | except: 1424 | print "error: please run StructureFunction_index_21 first to generate values for all Structure Function" 1425 | 1426 | 1427 | class StructureFunction_index_32(Base): 1428 | def __init__(self): 1429 | 1430 | self.Data = ['magnitude', 'time'] 1431 | 1432 | def fit(self, data): 1433 | try: 1434 | return m_32 1435 | except: 1436 | print "error: please run StructureFunction_index_21 first to generate values for all Structure Function" 1437 | -------------------------------------------------------------------------------- /FATS/PreprocessLC.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | class Preprocess_LC: 4 | 5 | def __init__(self, data, mjd, error): 6 | 7 | self.N = len(mjd) 8 | self.m = np.mean(error) 9 | self.mjd = mjd 10 | self.data = data 11 | self.error = error 12 | 13 | def Preprocess(self): 14 | 15 | mjd_out = [] 16 | data_out = [] 17 | error_out = [] 18 | 19 | for i in xrange(len(self.data)): 20 | 21 | if self.error[i] < (3 * self.m) and (np.absolute(self.data[i] - np.mean(self.data)) / np.std(self.data)) < 5 : 22 | 23 | mjd_out.append(self.mjd[i]) 24 | data_out.append(self.data[i]) 25 | error_out.append(self.error[i]) 26 | 27 | 28 | data_out = np.asarray(data_out) 29 | mjd_out = np.asarray(mjd_out) 30 | error_out = np.asarray(error_out) 31 | 32 | return [data_out, mjd_out, error_out] 33 | -------------------------------------------------------------------------------- /FATS/README.rst: -------------------------------------------------------------------------------- 1 | FATS: Feature Analysis for Time Series 2 | ============================== 3 | 4 | Summary: Compilation of some of the existing light-curve features. 5 | 6 | Authors: Isadora Nun and Pavlos Protopapas 7 | 8 | Contributors: Karim Pichara, Rahul Dave, Daniel Acuña, Nicolás Castro, Cristobal Mackenzie, Andrés Riveros and Ming Zhu 9 | 10 | ----------------------------------------------------- 11 | 12 | Description: In time-domain astronomy, data gathered from the telescopes is usually represented in the form of light-curves. These are time series that show the brightness variation of an object through a period of time (for a visual representation see video below). Based on the variability characteristics of the light-curves, celestial objects can be classified into different groups (quasars, long period variables, eclipsing binaries, etc.) and consequently be studied in depth independently. 13 | 14 | In order to characterize this variability, some of the existing methods use machine learning algorithms that build their decision on the light-curves features. Features, the topic of the following work, are numerical descriptors that aim to characterize and distinguish the different variability classes. They can go from basic statistical measures such as the mean or the standard deviation, to complex time-series characteristics such as the autocorrelation function. 15 | 16 | In this document we present a library with a compilation of some of the existing light-curve features. 
The main goal is to create a collaborative and open tool where every user can characterize or analyze an astronomical photometric database while also contributing to the library by adding new features. However, it is important to highlight that this library is not restricted to the astronomical field and could also be applied to any kind of time series. 17 | 18 | Our vision is to be capable of analyzing and comparing light-curves from all the available astronomical catalogs in a standard and universal way. This would facilitate and make more efficient tasks as modeling, classification, data cleaning, outlier detection and data analysis in general. Consequently, when studying light-curves, astronomers and data analysts would be on the same wavelength and would not have the necessity to find a way of comparing or matching different features. In order to achieve this goal, the library should be run in every existent survey (MACHO, EROS, OGLE, Catalina, Pan-STARRS, etc) and future surveys (LSST) and the results should be ideally shared in the same open way as this library. 19 | 20 | --------------------------------------------------------- 21 | 22 | An extended explanation of the package is available at http://isadoranun.github.io/tsfeat/FeaturesDocumentation.html -------------------------------------------------------------------------------- /FATS/__init__.py: -------------------------------------------------------------------------------- 1 | from .import_lightcurve import ReadLC_MACHO 2 | from .Feature import FeatureSpace 3 | from .alignLC import Align_LC 4 | from .PreprocessLC import Preprocess_LC 5 | -------------------------------------------------------------------------------- /FATS/alignLC.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def Align_LC(mjd, mjd2, data, data2, error, error2): 5 | 6 | if len(data2) > len(data): 7 | 8 | new_data2 = [] 9 | new_error2 = [] 10 | new_mjd2 = [] 11 | new_mjd = np.copy(mjd) 12 | new_error = np.copy(error) 13 | new_data = np.copy(data) 14 | count = 0 15 | 16 | for index in xrange(len(data)): 17 | 18 | where = np.where(mjd2 == mjd[index]) 19 | 20 | if np.array_equal(where[0], []) is False: 21 | 22 | new_data2.append(data2[where]) 23 | new_error2.append(error2[where]) 24 | new_mjd2.append(mjd2[where]) 25 | else: 26 | new_mjd = np.delete(new_mjd, index - count) 27 | new_error = np.delete(new_error, index - count) 28 | new_data = np.delete(new_data, index - count) 29 | count = count + 1 30 | 31 | new_data2 = np.asarray(new_data2).flatten() 32 | new_error2 = np.asarray(new_error2).flatten() 33 | 34 | 35 | else: 36 | 37 | new_data = [] 38 | new_error = [] 39 | new_mjd = [] 40 | new_mjd2 = np.copy(mjd2) 41 | new_error2 = np.copy(error2) 42 | new_data2 = np.copy(data2) 43 | count = 0 44 | for index in xrange(len(data2)): 45 | where = np.where(mjd == mjd2[index]) 46 | 47 | if np.array_equal(where[0], []) is False: 48 | new_data.append(data[where]) 49 | new_error.append(error[where]) 50 | new_mjd.append(mjd[where]) 51 | else: 52 | new_mjd2 = np.delete(new_mjd2, (index - count)) 53 | new_error2 = np.delete(new_error2, (index - count)) 54 | new_data2 = np.delete(new_data2, (index - count)) 55 | count = count + 1 56 | 57 | new_data = np.asarray(new_data).flatten() 58 | new_mjd = np.asarray(new_mjd).flatten() 59 | new_error = np.asarray(new_error).flatten() 60 | 61 | return new_data, new_data2, new_mjd, new_error, new_error2 62 | #return new_mjd, new_data, new_error, new_mjd2, new_data2, 
new_error2 63 | -------------------------------------------------------------------------------- /FATS/featureFunction.py: -------------------------------------------------------------------------------- 1 | import os,sys,time 2 | import numpy as np 3 | import pandas as pd 4 | import matplotlib.pyplot as plt 5 | import Base 6 | from FeatureFunctionLib import * -------------------------------------------------------------------------------- /FATS/import_lc_cluster.py: -------------------------------------------------------------------------------- 1 | 2 | #from Feature import FeatureSpace 3 | import numpy as np 4 | 5 | class ReadLC_MACHO: 6 | 7 | 8 | def __init__(self,lc): 9 | 10 | self.content1=lc 11 | 12 | def ReadLC(self): 13 | 14 | data = [] 15 | mjd = [] 16 | error = [] 17 | # Opening the blue band 18 | #fid = open(self.id,'r') 19 | 20 | self.content1 = self.content1[3:] 21 | 22 | 23 | for i in xrange(len(self.content1)): 24 | if not self.content1[i]: 25 | break 26 | else: 27 | content = self.content1[i].split(' ') 28 | mjd.append(float(content[0])) 29 | data.append(float(content[1])) 30 | error.append(float(content[2])) 31 | 32 | # Opening the red band 33 | 34 | return [data, mjd, error] 35 | 36 | -------------------------------------------------------------------------------- /FATS/import_lightcurve.py: -------------------------------------------------------------------------------- 1 | 2 | #from Feature import FeatureSpace 3 | import numpy as np 4 | 5 | class ReadLC_MACHO: 6 | 7 | 8 | def __init__(self,id): 9 | 10 | self.id=id 11 | 12 | def ReadLC(self): 13 | 14 | # Opening the blue band 15 | fid = open(self.id,'r') 16 | 17 | saltos_linea = 3 18 | delimiter = ' ' 19 | for i in range(0,saltos_linea): 20 | fid.next() 21 | LC = [] 22 | 23 | for lines in fid: 24 | str_line = lines.strip().split() 25 | floats = map(float, str_line) 26 | #numbers = (number for number in str_line.split()) 27 | LC.append(floats) 28 | 29 | LC = np.asarray(LC) 30 | 31 | data = LC[:,1] 32 | error = LC[:,2] 33 | mjd = LC[:,0] 34 | 35 | # Opening the red band 36 | 37 | return [data, mjd, error] 38 | 39 | -------------------------------------------------------------------------------- /FATS/lomb.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | """ Fast algorithm for spectral analysis of unevenly sampled data 3 | 4 | The Lomb-Scargle method performs spectral analysis on unevenly sampled 5 | data and is known to be a powerful way to find, and test the 6 | significance of, weak periodic signals. The method has previously been 7 | thought to be 'slow', requiring of order 10(2)N(2) operations to analyze 8 | N data points. We show that Fast Fourier Transforms (FFTs) can be used 9 | in a novel way to make the computation of order 10(2)N log N. Despite 10 | its use of the FFT, the algorithm is in no way equivalent to 11 | conventional FFT periodogram analysis. 12 | 13 | Keywords: 14 | DATA SAMPLING, FAST FOURIER TRANSFORMATIONS, 15 | SPECTRUM ANALYSIS, SIGNAL PROCESSING 16 | 17 | Example: 18 | > import numpy 19 | > import lomb 20 | > x = numpy.arange(10) 21 | > y = numpy.sin(x) 22 | > fx,fy, nout, jmax, prob = lomb.fasper(x,y, 6., 6.) 23 | 24 | Reference: 25 | Press, W. H. & Rybicki, G. B. 1989 26 | ApJ vol. 338, p. 277-280. 
27 | Fast algorithm for spectral analysis of unevenly sampled data 28 | bib code: 1989ApJ...338..277P 29 | 30 | """ 31 | from numpy import * 32 | from numpy.fft import * 33 | 34 | def __spread__(y, yy, n, x, m): 35 | """ 36 | Given an array yy(0:n-1), extirpolate (spread) a value y into 37 | m actual array elements that best approximate the "fictional" 38 | (i.e., possible noninteger) array element number x. The weights 39 | used are coefficients of the Lagrange interpolating polynomial 40 | Arguments: 41 | y : 42 | yy : 43 | n : 44 | x : 45 | m : 46 | Returns: 47 | 48 | """ 49 | nfac=[0,1,1,2,6,24,120,720,5040,40320,362880] 50 | if m > 10. : 51 | print 'factorial table too small in spread' 52 | return 53 | 54 | ix=long(x) 55 | if x == float(ix): 56 | yy[ix]=yy[ix]+y 57 | else: 58 | ilo = long(x-0.5*float(m)+1.0) 59 | ilo = min( max( ilo , 1 ), n-m+1 ) 60 | ihi = ilo+m-1 61 | nden = nfac[m] 62 | fac=x-ilo 63 | for j in range(ilo+1,ihi+1): fac = fac*(x-j) 64 | yy[ihi] = yy[ihi] + y*fac/(nden*(x-ihi)) 65 | for j in range(ihi-1,ilo-1,-1): 66 | nden=(nden/(j+1-ilo))*(j-ihi) 67 | yy[j] = yy[j] + y*fac/(nden*(x-j)) 68 | 69 | def fasper(x,y,ofac,hifac, MACC=4): 70 | """ function fasper 71 | Given abscissas x (which need not be equally spaced) and ordinates 72 | y, and given a desired oversampling factor ofac (a typical value 73 | being 4 or larger). this routine creates an array wk1 with a 74 | sequence of nout increasing frequencies (not angular frequencies) 75 | up to hifac times the "average" Nyquist frequency, and creates 76 | an array wk2 with the values of the Lomb normalized periodogram at 77 | those frequencies. The arrays x and y are not altered. This 78 | routine also returns jmax such that wk2(jmax) is the maximum 79 | element in wk2, and prob, an estimate of the significance of that 80 | maximum against the hypothesis of random noise. A small value of prob 81 | indicates that a significant periodic signal is present. 82 | 83 | Reference: 84 | Press, W. H. & Rybicki, G. B. 1989 85 | ApJ vol. 338, p. 277-280. 86 | Fast algorithm for spectral analysis of unevenly sampled data 87 | (1989ApJ...338..277P) 88 | 89 | Arguments: 90 | X : Abscissas array, (e.g. an array of times). 91 | Y : Ordinates array, (e.g. corresponding counts). 92 | Ofac : Oversampling factor. 93 | Hifac : Hifac * "average" Nyquist frequency = highest frequency 94 | for which values of the Lomb normalized periodogram will 95 | be calculated. 96 | 97 | Returns: 98 | Wk1 : An array of Lomb periodogram frequencies. 99 | Wk2 : An array of corresponding values of the Lomb periodogram. 100 | Nout : Wk1 & Wk2 dimensions (number of calculated frequencies) 101 | Jmax : The array index corresponding to the MAX( Wk2 ). 102 | Prob : False Alarm Probability of the largest Periodogram value 103 | MACC : Number of interpolation points per 1/4 cycle 104 | of highest frequency 105 | 106 | History: 107 | 02/23/2009, v1.0, MF 108 | Translation of IDL code (orig. Numerical recipies) 109 | """ 110 | #Check dimensions of input arrays 111 | n = long(len(x)) 112 | if n != len(y): 113 | print 'Incompatible arrays.' 114 | return 115 | 116 | nout = 0.5*ofac*hifac*n 117 | if round(nout) != nout: 118 | print("Warning: nout is not an integer and will be rounded down.") 119 | nout = int(nout) 120 | nfreqt = long(ofac*hifac*n*MACC) #Size the FFT as next power 121 | nfreq = 64L # of 2 above nfreqt. 
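    # The loop below doubles nfreq until it reaches the first power of two at
    # or above nfreqt, so the workspace length ndim = 2*nfreq stays FFT-friendly.
    # For example, n = 100 points with ofac = 4, hifac = 1 and MACC = 4 give
    # nfreqt = 1600, and nfreq grows 64 -> 128 -> ... -> 2048.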
122 | 123 | while nfreq < nfreqt: 124 | nfreq = 2*nfreq 125 | 126 | ndim = long(2*nfreq) 127 | 128 | #Compute the mean, variance 129 | ave = y.mean() 130 | ##sample variance because the divisor is N-1 131 | var = ((y-y.mean())**2).sum()/(len(y)-1) 132 | # and range of the data. 133 | xmin = x.min() 134 | xmax = x.max() 135 | xdif = xmax-xmin 136 | 137 | #extirpolate the data into the workspaces 138 | wk1 = zeros(ndim, dtype='complex') 139 | wk2 = zeros(ndim, dtype='complex') 140 | 141 | fac = ndim/(xdif*ofac) 142 | fndim = ndim 143 | ck = ((x-xmin)*fac) % fndim 144 | ckk = (2.0*ck) % fndim 145 | 146 | for j in range(0L, n): 147 | __spread__(y[j]-ave,wk1,ndim,ck[j],MACC) 148 | __spread__(1.0,wk2,ndim,ckk[j],MACC) 149 | 150 | #Take the Fast Fourier Transforms 151 | wk1 = ifft( wk1 )*len(wk1) 152 | wk2 = ifft( wk2 )*len(wk1) 153 | 154 | wk1 = wk1[1:nout+1] 155 | wk2 = wk2[1:nout+1] 156 | rwk1 = wk1.real 157 | iwk1 = wk1.imag 158 | rwk2 = wk2.real 159 | iwk2 = wk2.imag 160 | 161 | df = 1.0/(xdif*ofac) 162 | 163 | #Compute the Lomb value for each frequency 164 | hypo2 = 2.0 * abs( wk2 ) 165 | hc2wt = rwk2/hypo2 166 | hs2wt = iwk2/hypo2 167 | 168 | cwt = sqrt(0.5+hc2wt) 169 | swt = sign(hs2wt)*(sqrt(0.5-hc2wt)) 170 | den = 0.5*n+hc2wt*rwk2+hs2wt*iwk2 171 | cterm = (cwt*rwk1+swt*iwk1)**2./den 172 | sterm = (cwt*iwk1-swt*rwk1)**2./(n-den) 173 | 174 | wk1 = df*(arange(nout, dtype='float')+1.) 175 | wk2 = (cterm+sterm)/(2.0*var) 176 | pmax = wk2.max() 177 | jmax = wk2.argmax() 178 | 179 | 180 | #Significance estimation 181 | #expy = exp(-wk2) 182 | #effm = 2.0*(nout)/ofac 183 | #sig = effm*expy 184 | #ind = (sig > 0.01).nonzero() 185 | #sig[ind] = 1.0-(1.0-expy[ind])**effm 186 | 187 | #Estimate significance of largest peak value 188 | expy = exp(-pmax) 189 | effm = 2.0*(nout)/ofac 190 | prob = effm*expy 191 | 192 | if prob > 0.01: 193 | prob = 1.0-(1.0-expy)**effm 194 | 195 | return wk1,wk2,nout,jmax,prob 196 | 197 | def getSignificance(wk1, wk2, nout, ofac): 198 | """ returns the peak false alarm probabilities 199 | Hence the lower is the probability and the more significant is the peak 200 | """ 201 | expy = exp(-wk2) 202 | effm = 2.0*(nout)/ofac 203 | sig = effm*expy 204 | ind = (sig > 0.01).nonzero() 205 | sig[ind] = 1.0-(1.0-expy[ind])**effm 206 | return sig -------------------------------------------------------------------------------- /FATS/test_library.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | from Feature import FeatureSpace 4 | import numpy as np 5 | from import_lc_cluster import ReadLC_MACHO 6 | from PreprocessLC import Preprocess_LC 7 | from alignLC import Align_LC 8 | import os.path 9 | import tarfile 10 | import sys 11 | import pandas as pd 12 | import pytest 13 | 14 | @pytest.fixture 15 | def white_noise(): 16 | data = np.random.normal(size=10000) 17 | mjd=np.arange(10000) 18 | error = np.random.normal(loc=0.01, scale =0.8, size=10000) 19 | second_data = np.random.normal(size=10000) 20 | mjd2=np.arange(10000) 21 | error2 = np.random.normal(loc=0.01, scale =0.8, size=10000) 22 | aligned_data = data 23 | aligned_second_data = second_data 24 | aligned_mjd = mjd 25 | lc = np.array([data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd]) 26 | return lc 27 | 28 | @pytest.fixture 29 | def periodic_lc(): 30 | N=100 31 | mjd_periodic = np.arange(N) 32 | Period = 20 33 | cov = np.zeros([N,N]) 34 | mean = np.zeros(N) 35 | for i in np.arange(N): 36 | for j in np.arange(N): 37 | cov[i,j] = np.exp( 
-(np.sin( (np.pi/Period) *(i-j))**2)) 38 | data_periodic=np.random.multivariate_normal(mean, cov) 39 | lc = np.array([data_periodic, mjd_periodic]) 40 | return lc 41 | 42 | 43 | @pytest.fixture 44 | def uniform_lc(): 45 | mjd_uniform=np.arange(1000000) 46 | data_uniform=np.random.uniform(size=1000000) 47 | lc = np.array([data_uniform, mjd_uniform]) 48 | return lc 49 | 50 | @pytest.fixture 51 | def random_walk(): 52 | N = 10000 53 | alpha = 1. 54 | sigma = 0.5 55 | data_rw = np.zeros([N,1]) 56 | data_rw[0] = 1 57 | time_rw = xrange(1, N) 58 | for t in time_rw: 59 | data_rw[t] = alpha * data_rw[t-1] + np.random.normal(loc=0.0, scale=sigma) 60 | time_rw = np.array(range(0,N)) + 1 * np.random.uniform(size=N) 61 | data_rw = data_rw.squeeze() 62 | lc = np.array([data_rw, time_rw]) 63 | return lc 64 | 65 | # def test_Amplitude(white_noise): 66 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 67 | 68 | # a = FeatureSpace(featureList=['Amplitude']) 69 | # a=a.calculateFeature(white_noise[0]) 70 | 71 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 72 | 73 | # def test_Autocor(white_noise): 74 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 75 | 76 | # a = FeatureSpace(featureList=['Autocor'] ) 77 | # a=a.calculateFeature(white_noise[0]) 78 | 79 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 80 | 81 | # def test_Automean(white_noise): 82 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 83 | 84 | # a = FeatureSpace(featureList=['Automean'] , Automean=[0,0]) 85 | # a=a.calculateFeature(white_noise[0]) 86 | 87 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 88 | 89 | # def test_B_R(white_noise): 90 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 91 | 92 | # a = FeatureSpace(featureList=['B_R'] , B_R=second_data) 93 | # a=a.calculateFeature(white_noise[0]) 94 | 95 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 96 | 97 | def test_Beyond1Std(white_noise): 98 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 99 | 100 | a = FeatureSpace(featureList=['Beyond1Std']) 101 | a=a.calculateFeature(white_noise) 102 | 103 | assert(a.result(method='array') >= 0.30 and a.result(method='array') <= 0.40) 104 | 105 | def test_Mean(white_noise): 106 | 107 | a = FeatureSpace(featureList=['Mean']) 108 | a=a.calculateFeature(white_noise) 109 | 110 | assert(a.result(method='array') >= -0.1 and a.result(method='array') <= 0.1) 111 | 112 | # def test_CAR(white_noise): 113 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 114 | 115 | # a = FeatureSpace(featureList=['CAR_sigma', 'CAR_tau', 'CAR_tmean'] , CAR_sigma=[mjd, error]) 116 | # a=a.calculateFeature(white_noise[0]) 117 | 118 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 119 | 120 | 121 | def test_Con(white_noise): 122 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 123 | 124 | a = FeatureSpace(featureList=['Con'] , Con=1) 125 | a=a.calculateFeature(white_noise) 126 | 127 | assert(a.result(method='array') >= 0.04 and a.result(method='array') <= 0.05) 128 | 129 | def test_Eta_color(white_noise): 130 | # data, mjd, error, second_data, aligned_data, 
aligned_second_data, aligned_mjd = white_noise() 131 | 132 | a = FeatureSpace(featureList=['Eta_color']) 133 | a=a.calculateFeature(white_noise) 134 | 135 | assert(a.result(method='array') >= 1.9 and a.result(method='array') <= 2.1) 136 | 137 | def test_Eta_e(white_noise): 138 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 139 | 140 | a = FeatureSpace(featureList=['Eta_e']) 141 | a=a.calculateFeature(white_noise) 142 | 143 | assert(a.result(method='array') >= 1.9 and a.result(method='array') <= 2.1) 144 | 145 | def test_FluxPercentile(white_noise): 146 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 147 | 148 | a = FeatureSpace(featureList=['FluxPercentileRatioMid20','FluxPercentileRatioMid35','FluxPercentileRatioMid50','FluxPercentileRatioMid65','FluxPercentileRatioMid80'] ) 149 | a=a.calculateFeature(white_noise) 150 | 151 | assert(a.result(method='array')[0] >= 0.145 and a.result(method='array')[0] <= 0.160) 152 | assert(a.result(method='array')[1] >= 0.260 and a.result(method='array')[1] <= 0.290) 153 | assert(a.result(method='array')[2] >= 0.350 and a.result(method='array')[2] <= 0.450) 154 | assert(a.result(method='array')[3] >= 0.540 and a.result(method='array')[3] <= 0.580) 155 | assert(a.result(method='array')[4] >= 0.760 and a.result(method='array')[4] <= 0.800) 156 | 157 | 158 | 159 | def test_LinearTrend(white_noise): 160 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 161 | 162 | a = FeatureSpace(featureList=['LinearTrend']) 163 | a=a.calculateFeature(white_noise) 164 | 165 | assert(a.result(method='array') >= -0.1 and a.result(method='array') <= 0.1) 166 | 167 | # def test_MaxSlope(white_noise): 168 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 169 | 170 | # a = FeatureSpace(featureList=['MaxSlope'] , MaxSlope=mjd) 171 | # a=a.calculateFeature(white_noise[0]) 172 | 173 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 174 | 175 | def test_Meanvariance(uniform_lc): 176 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 177 | 178 | a = FeatureSpace(featureList=['Meanvariance']) 179 | a=a.calculateFeature(uniform_lc) 180 | 181 | assert(a.result(method='array') >= 0.575 and a.result(method='array') <= 0.580) 182 | 183 | def test_MedianAbsDev(white_noise): 184 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 185 | 186 | a = FeatureSpace(featureList=['MedianAbsDev']) 187 | a=a.calculateFeature(white_noise) 188 | 189 | assert(a.result(method='array') >= 0.630 and a.result(method='array') <= 0.700) 190 | 191 | # def test_MedianBRP(white_noise): 192 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 193 | 194 | # a = FeatureSpace(featureList=['MedianBRP'] , MaxSlope=mjd) 195 | # a=a.calculateFeature(white_noise[0]) 196 | 197 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 198 | 199 | def test_PairSlopeTrend(white_noise): 200 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 201 | 202 | a = FeatureSpace(featureList=['PairSlopeTrend']) 203 | a=a.calculateFeature(white_noise) 204 | 205 | assert(a.result(method='array') >= -0.25 and a.result(method='array') <= 0.25) 206 | 207 | # def test_PercentAmplitude(white_noise): 208 | # # 
data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 209 | 210 | # a = FeatureSpace(featureList=['PercentAmplitude']) 211 | # a=a.calculateFeature(white_noise[0]) 212 | 213 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 214 | 215 | # def test_PercentDifferenceFluxPercentile(white_noise): 216 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 217 | 218 | # a = FeatureSpace(featureList=['PercentDifferenceFluxPercentile']) 219 | # a=a.calculateFeature(white_noise[0]) 220 | 221 | # assert(a.result(method='array') >= 0.043 and a.result(method='array') <= 0.046) 222 | 223 | def test_Period_Psi(periodic_lc): 224 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 225 | 226 | a = FeatureSpace(featureList=['PeriodLS', 'Period_fit','Psi_CS','Psi_eta']) 227 | a=a.calculateFeature(periodic_lc) 228 | # print a.result(method='array'), len(periodic_lc[0]) 229 | assert(a.result(method='array')[0] >= 19 and a.result(method='array')[0] <= 21) 230 | 231 | def test_Q31(white_noise): 232 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 233 | 234 | a = FeatureSpace(featureList=['Q31']) 235 | a=a.calculateFeature(white_noise) 236 | assert(a.result(method='array') >= 1.30 and a.result(method='array') <= 1.38) 237 | 238 | # def test_Q31B_R(white_noise): 239 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 240 | 241 | # a = FeatureSpace(featureList=['Q31B_R'], Q31B_R = [aligned_second_data, aligned_data]) 242 | # a=a.calculateFeature(white_noise[0]) 243 | 244 | def test_Rcs(white_noise): 245 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 246 | 247 | a = FeatureSpace(featureList=['Rcs']) 248 | a=a.calculateFeature(white_noise) 249 | assert(a.result(method='array') >= 0 and a.result(method='array') <= 0.1) 250 | 251 | def test_Skew(white_noise): 252 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 253 | 254 | a = FeatureSpace(featureList=['Skew']) 255 | a=a.calculateFeature(white_noise) 256 | assert(a.result(method='array') >= -0.1 and a.result(method='array') <= 0.1) 257 | 258 | 259 | # def test_SlottedA(white_noise): 260 | # # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 261 | 262 | # a = FeatureSpace(featureList=['SlottedA'], SlottedA = [mjd, 1]) 263 | # a=a.calculateFeature(white_noise[0]) 264 | 265 | def test_SmallKurtosis(white_noise): 266 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 267 | 268 | a = FeatureSpace(featureList=['SmallKurtosis']) 269 | a=a.calculateFeature(white_noise) 270 | assert(a.result(method='array') >= -0.2 and a.result(method='array') <= 0.2) 271 | 272 | 273 | def test_Std(white_noise): 274 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 275 | 276 | a = FeatureSpace(featureList=['Std']) 277 | a=a.calculateFeature(white_noise) 278 | 279 | assert(a.result(method='array') >= 0.9 and a.result(method='array') <= 1.1) 280 | 281 | 282 | def test_Stetson(white_noise): 283 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 284 | 285 | a = FeatureSpace(featureList=['SlottedA_length','StetsonK', 'StetsonK_AC', 'StetsonJ', 'StetsonL']) 286 | 
a=a.calculateFeature(white_noise) 287 | 288 | assert(a.result(method='array')[1] >= 0.790 and a.result(method='array')[1] <= 0.85) 289 | assert(a.result(method='array')[2] >= 0.20 and a.result(method='array')[2] <= 0.45) 290 | assert(a.result(method='array')[3] >= -0.1 and a.result(method='array')[3] <= 0.1) 291 | assert(a.result(method='array')[4] >= -0.1 and a.result(method='array')[4] <= 0.1) 292 | 293 | 294 | def test_Gskew(white_noise): 295 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 296 | 297 | a = FeatureSpace(featureList=['Gskew']) 298 | a=a.calculateFeature(white_noise) 299 | assert(a.result(method='array') >= -0.2 and a.result(method='array') <= 0.2) 300 | 301 | def test_StructureFunction(random_walk): 302 | # data, mjd, error, second_data, aligned_data, aligned_second_data, aligned_mjd = white_noise() 303 | 304 | a = FeatureSpace(featureList=['StructureFunction_index_21', 'StructureFunction_index_31', 305 | 'StructureFunction_index_32']) 306 | a = a.calculateFeature(random_walk) 307 | assert(a.result(method='array')[0] >= 1.520 and a.result(method='array')[0] <= 2.067) 308 | assert(a.result(method='array')[1] >= 1.821 and a.result(method='array')[1] <= 3.162) 309 | assert(a.result(method='array')[2] >= 1.243 and a.result(method='array')[2] <= 1.562) 310 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2015 Isadora Nun 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | 23 | -------------------------------------------------------------------------------- /README.rst: -------------------------------------------------------------------------------- 1 | FATS: Feature Analysis for Time Series 2 | ============================== 3 | 4 | Summary: Compilation of some of the existing light-curve features. 5 | 6 | Authors: Isadora Nun and Pavlos Protopapas 7 | 8 | Contributors: Karim Pichara, Rahul Dave, Daniel Acuña, Nicolás Castro, Cristobal Mackenzie, Andrés Riveros and Ming Zhu 9 | 10 | ----------------------------------------------------- 11 | 12 | Installation: Clone this repository and do `python setup.py install`. 13 | 14 | Or `pip install FATS` for the latest stable version. 
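A minimal quick-start sketch, modeled on the synthetic fixtures in ``FATS/test_library.py`` (the seven-row input order, the feature names and the ``result(method='array')`` call are all taken from those tests; the package sources target Python 2)::

    import numpy as np
    import FATS

    # Synthetic two-band light curve, in the row order used by the
    # white_noise fixture in FATS/test_library.py:
    # [magnitude, time, error, magnitude2,
    #  aligned_magnitude, aligned_magnitude2, aligned_time]
    n = 10000
    magnitude = np.random.normal(size=n)
    time = np.arange(n)
    error = np.random.normal(loc=0.01, scale=0.8, size=n)
    magnitude2 = np.random.normal(size=n)
    lc = np.array([magnitude, time, error, magnitude2,
                   magnitude, magnitude2, time])

    # Compute a few statistical features and read them back as an array.
    fs = FATS.FeatureSpace(featureList=['Mean', 'Std', 'Beyond1Std'])
    fs = fs.calculateFeature(lc)
    print(fs.result(method='array'))

For a white-noise input like this one, the test suite expects the mean near 0, the standard deviation near 1 and Beyond1Std roughly between 0.30 and 0.40.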
15 | 16 | ----------------------------------------------------- 17 | 18 | Description: In time-domain astronomy, data gathered from the telescopes is usually represented in the form of light-curves. These are time series that show the brightness variation of an object through a period of time (for a visual representation see video below). Based on the variability characteristics of the light-curves, celestial objects can be classified into different groups (quasars, long period variables, eclipsing binaries, etc.) and consequently be studied in depth independently. 19 | 20 | In order to characterize this variability, some of the existing methods use machine learning algorithms that build their decision on the light-curves features. Features, the topic of the following work, are numerical descriptors that aim to characterize and distinguish the different variability classes. They can go from basic statistical measures such as the mean or the standard deviation, to complex time-series characteristics such as the autocorrelation function. 21 | 22 | In this document we present a library with a compilation of some of the existing light-curve features. The main goal is to create a collaborative and open tool where every user can characterize or analyze an astronomical photometric database while also contributing to the library by adding new features. However, it is important to highlight that this library is not restricted to the astronomical field and could also be applied to any kind of time series. 23 | 24 | Our vision is to be capable of analyzing and comparing light-curves from all the available astronomical catalogs in a standard and universal way. This would facilitate and make more efficient tasks as modeling, classification, data cleaning, outlier detection and data analysis in general. Consequently, when studying light-curves, astronomers and data analysts would be on the same wavelength and would not have the necessity to find a way of comparing or matching different features. In order to achieve this goal, the library should be run in every existent survey (MACHO, EROS, OGLE, Catalina, Pan-STARRS, etc) and future surveys (LSST) and the results should be ideally shared in the same open way as this library. 
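For light curves stored on disk, a sketch of the reading, cleaning and alignment helpers shipped with the package; the file names are hypothetical placeholders, while the call signatures follow ``FATS/import_lightcurve.py``, ``FATS/PreprocessLC.py`` and ``FATS/alignLC.py``::

    import FATS

    # Hypothetical paths to the blue and red bands of the same object.
    blue = FATS.ReadLC_MACHO('lc_B.mjd')
    red = FATS.ReadLC_MACHO('lc_R.mjd')

    # ReadLC() skips the three header lines and returns [magnitude, time, error].
    data, mjd, error = blue.ReadLC()
    data2, mjd2, error2 = red.ReadLC()

    # Sigma-clip each band: points with error >= 3*mean(error) or more than
    # 5 sigma away from the mean magnitude are dropped (see PreprocessLC.py).
    data, mjd, error = FATS.Preprocess_LC(data, mjd, error).Preprocess()
    data2, mjd2, error2 = FATS.Preprocess_LC(data2, mjd2, error2).Preprocess()

    # Keep only the observation times the two bands share.
    aligned_data, aligned_data2, aligned_mjd, aligned_error, aligned_error2 = \
        FATS.Align_LC(mjd, mjd2, data, data2, error, error2)

The aligned vectors can then be stacked together with the per-band series into the seven-row array expected by ``FeatureSpace``, as in the quick-start example above.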
25 | 26 | --------------------------------------------------------- 27 | 28 | An extended explanation of the package is available at http://isadoranun.github.io/tsfeat/FeaturesDocumentation.html 29 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib==2.1.0 2 | pandas==0.13.1 3 | statsmodels==0.8.0 4 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages # Always prefer setuptools over distutils 2 | from codecs import open # To use a consistent encoding 3 | import os 4 | 5 | 6 | REPO_DIR = os.path.dirname(os.path.realpath(__file__)) 7 | 8 | 9 | # Get the long description from the relevant file 10 | # with open(path.join(here, 'DESCRIPTION.rst'), encoding='utf-8') as f: 11 | # long_description = f.read() 12 | def readme(): 13 | with open('README.rst') as f: 14 | return f.read() 15 | 16 | 17 | def get_requirements(): 18 | """Parses and returns installation requirements.""" 19 | path = os.path.join(REPO_DIR, "requirements.txt") 20 | return [line.strip() for line in open(path).readlines() 21 | if not line.startswith("#")] 22 | 23 | setup( 24 | name='FATS', 25 | 26 | # Versions should comply with PEP440. For a discussion on single-sourcing 27 | # the version across setup.py and the project code, see 28 | # https://packaging.python.org/en/latest/development.html#single-sourcing-the-version 29 | version='1.3.6', 30 | 31 | description='Library with compilation of features for time series', 32 | long_description=readme(), 33 | # The project's main homepage. 34 | url=' http://isadoranun.github.io/tsfeat/FeaturesDocumentation.html', 35 | 36 | download_url = 'https://github.com/isadoranun/tsfeat', 37 | 38 | # Author details 39 | author='Isadora Nun', 40 | author_email='isadoranun@seas.harvard.edu', 41 | 42 | # Choose your license 43 | license='MIT licence', 44 | 45 | # See https://pypi.python.org/pypi?%3Aaction=list_classifiers 46 | classifiers=[ 47 | # How mature is this project? Common values are 48 | # 3 - Alpha 49 | # 4 - Beta 50 | # 5 - Production/Stable 51 | 'Development Status :: 3 - Alpha', 52 | 53 | # Indicate who your project is intended for 54 | 'Intended Audience :: Science/Research', 55 | 'Topic :: Scientific/Engineering :: Astronomy', 56 | 'Topic :: Software Development :: Libraries :: Python Modules', 57 | 58 | 59 | # Pick your license as you wish (should match "license" above) 60 | # 'License :: OSI Approved :: MIT License', 61 | 62 | # Specify the Python versions you support here. In particular, ensure 63 | # that you indicate whether you support Python 2, Python 3 or both. 64 | 'Programming Language :: Python :: 2', 65 | 'Programming Language :: Python :: 2.6', 66 | 'Programming Language :: Python :: 2.7', 67 | 'Programming Language :: Python :: 3', 68 | 'Programming Language :: Python :: 3.2', 69 | 'Programming Language :: Python :: 3.3', 70 | 'Programming Language :: Python :: 3.4', 71 | ], 72 | 73 | # What does your project relate to? 74 | keywords='times series features, light curves', 75 | 76 | # You can just specify the packages manually here if your project is 77 | # simple. Or you can use find_packages(). 
78 | # packages=find_packages(exclude=['contrib', 'docs', 'tests*']), 79 | 80 | packages = ['FATS'], 81 | 82 | include_package_data=True, 83 | 84 | zip_safe=False, 85 | 86 | # List run-time dependencies here. These will be installed by pip when your 87 | # project is installed. For an analysis of "install_requires" vs pip's 88 | # requirements files see: 89 | # https://packaging.python.org/en/latest/technical.html#install-requires-vs-requirements-files 90 | install_requires=get_requirements() 91 | 92 | # List additional groups of dependencies here (e.g. development dependencies). 93 | # You can install these using the following syntax, for example: 94 | # $ pip install -e .[dev,test] 95 | 96 | # isa 97 | # extras_require = { 98 | # 'dev': ['check-manifest'], 99 | # 'test': ['coverage'], 100 | # }, 101 | 102 | # If there are data files included in your packages that need to be 103 | # installed, specify them here. If using Python 2.6 or less, then these 104 | # have to be included in MANIFEST.in as well. 105 | 106 | # isa 107 | # package_data={ 108 | # 'sample': ['package_data.dat'], 109 | # }, 110 | 111 | # Although 'package_data' is the preferred approach, in some case you may 112 | # need to place data files outside of your packages. 113 | # see http://docs.python.org/3.4/distutils/setupscript.html#installing-additional-files 114 | # In this case, 'data_file' will be installed into '/my_data' 115 | 116 | # isa 117 | # data_files=[('my_data', ['data/data_file'])], 118 | 119 | # To provide executable scripts, use entry points in preference to the 120 | # "scripts" keyword. Entry points provide cross-platform support and allow 121 | # pip to create the appropriate form of executable for the target platform. 122 | 123 | # isa 124 | # entry_points={ 125 | # 'console_scripts': [ 126 | # 'sample=sample:main', 127 | # ], 128 | # }, 129 | ) --------------------------------------------------------------------------------