├── .gitignore
├── CHANGELOG.rst
├── LICENSE.txt
├── MANIFEST.in
├── README.rst
├── TODO.rst
├── contributors.txt
├── requirements.txt
├── setup.py
└── tlseparation
    ├── __init__.py
    ├── classification
        ├── __init__.py
        ├── classes_reference.py
        ├── classify_wood.py
        ├── gmm.py
        ├── path_detection.py
        ├── point_features.py
        └── wlseparation.py
    ├── scripts
        ├── __init__.py
        ├── automated_separation.py
        └── post_processing.py
    └── utility
        ├── __init__.py
        ├── cloud_analysis.py
        ├── clustering.py
        ├── data_utils.py
        ├── downsampling.py
        ├── filtering.py
        ├── knnsearch.py
        ├── peakdetect.py
        ├── shortpath.py
        └── voxels.py


/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | env/
 12 | build/
 13 | develop-eggs/
 14 | dist/
 15 | downloads/
 16 | eggs/
 17 | .eggs/
 18 | lib/
 19 | lib64/
 20 | parts/
 21 | sdist/
 22 | var/
 23 | wheels/
 24 | *.egg-info/
 25 | .installed.cfg
 26 | *.egg
 27 | 
 28 | # PyInstaller
 29 | #  Usually these files are written by a python script from a template
 30 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 31 | *.manifest
 32 | *.spec
 33 | 
 34 | # Installer logs
 35 | pip-log.txt
 36 | pip-delete-this-directory.txt
 37 | 
 38 | # Unit test / coverage reports
 39 | htmlcov/
 40 | .tox/
 41 | .coverage
 42 | .coverage.*
 43 | .cache
 44 | nosetests.xml
 45 | coverage.xml
 46 | *.cover
 47 | .hypothesis/
 48 | 
 49 | # Translations
 50 | *.mo
 51 | *.pot
 52 | 
 53 | # Django stuff:
 54 | *.log
 55 | local_settings.py
 56 | 
 57 | # Flask stuff:
 58 | instance/
 59 | .webassets-cache
 60 | 
 61 | # Scrapy stuff:
 62 | .scrapy
 63 | 
 64 | # PyBuilder
 65 | target/
 66 | 
 67 | # Jupyter Notebook
 68 | .ipynb_checkpoints
 69 | 
 70 | # pyenv
 71 | .python-version
 72 | 
 73 | # celery beat schedule file
 74 | celerybeat-schedule
 75 | 
 76 | # SageMath parsed files
 77 | *.sage.py
 78 | 
 79 | # dotenv
 80 | .env
 81 | 
 82 | # virtualenv
 83 | .venv
 84 | venv/
 85 | ENV/
 86 | 
 87 | # Spyder project settings
 88 | .spyderproject
 89 | .spyproject
 90 | .spyderworkspace
 91 | 
 92 | # Rope project settings
 93 | .ropeproject
 94 | 
 95 | # mkdocs documentation
 96 | /site
 97 | 
 98 | # mypy
 99 | .mypy_cache/
100 | 


--------------------------------------------------------------------------------
/CHANGELOG.rst:
--------------------------------------------------------------------------------
  1 | v1.3.2
  2 | ------
  3 | - Added a new clustering module containing a 'connected_component' approach.
  4 | - Added two cluster based filtering to be used along 'clustering.connected_component'.
  5 | - Added new script module for automated post-processing.
  6 | 
  7 | v1.3.1
  8 | ------
  9 | - Bug fix in 'generic_tree' script. Now 'path_detect_frequency' also uses the voxel size defined in the main script.
 10 | 
 11 | v1.3
 12 | ----
 13 | - Major bump in version to point out operational status after series of minor improvements.
 14 | 
 15 | v1.2.2.7 
 16 | --------
 17 | - Minor changes mainly to update for a new stable version.
 18 | 
 19 | v1.2.2.6 
 20 | --------
 21 | - Removed 'future_code' from the package. These codes will be kept aside until they are ready to be added back into the package.
 22 | - Completely removed all references for *HDBSCAN* which caused import errors.
 23 | - Renamed *automated_separation.large_tree_5* to *automated_separation.generic_tree*.
 24 | 
 25 | v1.2.2.5 
 26 | --------
 27 | - Changed *remove_duplicates* function to allow indices output.
 28 | - Temporarily removed *continuous_clustering* module until further improvements.
 29 | - Replaced HDBSCAN for DBSCAN in the entire package. This aims to make installation simpler and avoid incompatibilities.
 30 | - Set full_matrices to False in *svd_evals* to improve processing efficiency (reduced processing time and memory usage).
 31 | - Added new autometed separation script *large_tree_5*.
 32 | - Removed old automated separation scripts: *large_tree_1* and *large_tree_2*.
 33 | - Added new filters: *plane_filter*, *cluster_filter* and *feature_filter*.
 34 | - Added new path detection script, *path_detect_frequency*.
 35 | 
 36 | v1.2.2.4
 37 | --------
 38 | - Corrected automated calculation of parameter cf_rad in *large_tree_3*.
 39 | - Added new gmm_nclasses parameter to *large_tree_3*.
 40 | 
 41 | v1.2.2.3
 42 | --------
 43 | - Changed *voxel_path_detect* parameters to speed up processing.
 44 | - Added maximum iterations to *detect_main_pathways* to avoid infinite loops or long processing times.
 45 | 
 46 | v1.2.2.2
 47 | --------
 48 | - Bug fixes in *automated_separation.large_tree_3*.
 49 | 
 50 | v1.2.2.1
 51 | --------
 52 | - Fixed base point index in *continuity_filter*.
 53 | - Added new voxelization wrapped around *detect_main_pathways* that aims to speed up the processing.
 54 | - Added new *automated_separation* script, *large_tree_3*.
 55 | 
 56 | v1.2.1.7
 57 | --------
 58 | - Changed clustering in filtering.cluster_filter from DBSCAN to HDBSCAN in order to improve memory efficiency.
 59 | - Minor adjustments in automated_separation.large_tree_1.
 60 | - Created new knn optimization function to detect knn values automatically.
 61 | - Added block processing to *subset_nbrs*.e
 62 | - Minor fixes for improvement on continuity_filter stability. 
 63 | - Added new automated separation script, automated_separation.large_tree_2.
 64 | - Corrected class_filter application on large_tree_1 and large_tree_2.
 65 | - Fixed class_filter input target values (finished changing valid values from 1 or 2 to 0 or 1).
 66 | - Added a new final filtering step to large_tree_2 using detect_main_pathways.
 67 | 
 68 | v1.2.1.6
 69 | --------
 70 | - Minor fixes.
 71 | 
 72 | v1.2.1.5
 73 | --------
 74 | - Added verbose option to some modules.
 75 | - Changed docstrings style to numpydoc.
 76 | - Added default class_ref DataFrame as a built-in object. User now has the option to use this new default or continue to load a .csv file.
 77 | - Added voxels.py module to create voxels from point clouds.
 78 | - Added voxelization step in automated_separation.large_tree_1 to improve performance in path_detection.
 79 | 
 80 | 
 81 | v1.2.1.4
 82 | --------
 83 | - Fixed imports. Now, to access any low level function, one has to go through the proper module hierarchy.
 84 | 
 85 | v1.2.1.3
 86 | --------
 87 | - Changed approach of relative import. Removed all sys.path.append statements and adopted double dots (..) for parent folder imports.
 88 | 
 89 | v1.2.1.2
 90 | --------
 91 | 
 92 | - Fixed bug in classification.__init__.py failing to import *wlseparate_ref* as this function no longer exists;
 93 | - Updated documentation strings for Sphinx;		
 94 | 
 95 | v1.2.1.1
 96 | --------
 97 | This versions has enough important modifications to get a new subversion number, starting the 1.2 phase.
 98 | 
 99 | Some of the changes included in this version are:
100 | 
101 | - Changed *geodescriptors* function name to *knn_features*;
102 | - Updated version number in all files and setup.py;
103 | - Changed *point_features.eigen* (now called knn_evals) name to accommodate for radius and knn options;
104 | - Merged *array_majority* and *array_majority_rad* into the same function. Use kwargs to make it easier to parse arguments;
105 | - Merged *class_filter* and *class_filter_rad* into the same function. Use kwargs to make it easier to parse arguments;
106 | - Changed *point_compare* module name to *data_utils*;
107 | - Revised version of *path_detection*;
108 | - Changed new output configuration to *wlseparate_abs* and *wlseparate_ref_voting*;
109 | - Removed *wlseparate_ref* as it's redundant. Same function can be run by using a single 'knn' parameter value in *wlseparate_ref_voting*;
110 | - Changed *filtering* outputs. Now all functions (except for continuity_filter) output arrays of indices instead of points coordinates.;
111 | - Revised documentation for the whole package. Now, all docstrings are compatible with Sphinx;
112 | 
113 | v1.1.4
114 | ------
115 | Corrected list of required packages.
116 | 
117 | v1.1.3
118 | ------
119 | Added new option for automated separation (auto_separation_2).
120 | Renamed old separation.py to auto_separation_1.py.
121 | Added classificaition probability output to gmm.py.
122 | Added classification probability filter to separation. Now all points below some probability threshold will be left unclassified.
123 | Added new wlseparate method to auto_separation_2, based on a voting scheme.
124 | 
125 | 


--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
  1 |                     GNU GENERAL PUBLIC LICENSE
  2 |                        Version 3, 29 June 2007
  3 | 
  4 |  Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
  5 |  Everyone is permitted to copy and distribute verbatim copies
  6 |  of this license document, but changing it is not allowed.
  7 | 
  8 |                             Preamble
  9 | 
 10 |   The GNU General Public License is a free, copyleft license for
 11 | software and other kinds of works.
 12 | 
 13 |   The licenses for most software and other practical works are designed
 14 | to take away your freedom to share and change the works.  By contrast,
 15 | the GNU General Public License is intended to guarantee your freedom to
 16 | share and change all versions of a program--to make sure it remains free
 17 | software for all its users.  We, the Free Software Foundation, use the
 18 | GNU General Public License for most of our software; it applies also to
 19 | any other work released this way by its authors.  You can apply it to
 20 | your programs, too.
 21 | 
 22 |   When we speak of free software, we are referring to freedom, not
 23 | price.  Our General Public Licenses are designed to make sure that you
 24 | have the freedom to distribute copies of free software (and charge for
 25 | them if you wish), that you receive source code or can get it if you
 26 | want it, that you can change the software or use pieces of it in new
 27 | free programs, and that you know you can do these things.
 28 | 
 29 |   To protect your rights, we need to prevent others from denying you
 30 | these rights or asking you to surrender the rights.  Therefore, you have
 31 | certain responsibilities if you distribute copies of the software, or if
 32 | you modify it: responsibilities to respect the freedom of others.
 33 | 
 34 |   For example, if you distribute copies of such a program, whether
 35 | gratis or for a fee, you must pass on to the recipients the same
 36 | freedoms that you received.  You must make sure that they, too, receive
 37 | or can get the source code.  And you must show them these terms so they
 38 | know their rights.
 39 | 
 40 |   Developers that use the GNU GPL protect your rights with two steps:
 41 | (1) assert copyright on the software, and (2) offer you this License
 42 | giving you legal permission to copy, distribute and/or modify it.
 43 | 
 44 |   For the developers' and authors' protection, the GPL clearly explains
 45 | that there is no warranty for this free software.  For both users' and
 46 | authors' sake, the GPL requires that modified versions be marked as
 47 | changed, so that their problems will not be attributed erroneously to
 48 | authors of previous versions.
 49 | 
 50 |   Some devices are designed to deny users access to install or run
 51 | modified versions of the software inside them, although the manufacturer
 52 | can do so.  This is fundamentally incompatible with the aim of
 53 | protecting users' freedom to change the software.  The systematic
 54 | pattern of such abuse occurs in the area of products for individuals to
 55 | use, which is precisely where it is most unacceptable.  Therefore, we
 56 | have designed this version of the GPL to prohibit the practice for those
 57 | products.  If such problems arise substantially in other domains, we
 58 | stand ready to extend this provision to those domains in future versions
 59 | of the GPL, as needed to protect the freedom of users.
 60 | 
 61 |   Finally, every program is threatened constantly by software patents.
 62 | States should not allow patents to restrict development and use of
 63 | software on general-purpose computers, but in those that do, we wish to
 64 | avoid the special danger that patents applied to a free program could
 65 | make it effectively proprietary.  To prevent this, the GPL assures that
 66 | patents cannot be used to render the program non-free.
 67 | 
 68 |   The precise terms and conditions for copying, distribution and
 69 | modification follow.
 70 | 
 71 |                        TERMS AND CONDITIONS
 72 | 
 73 |   0. Definitions.
 74 | 
 75 |   "This License" refers to version 3 of the GNU General Public License.
 76 | 
 77 |   "Copyright" also means copyright-like laws that apply to other kinds of
 78 | works, such as semiconductor masks.
 79 | 
 80 |   "The Program" refers to any copyrightable work licensed under this
 81 | License.  Each licensee is addressed as "you".  "Licensees" and
 82 | "recipients" may be individuals or organizations.
 83 | 
 84 |   To "modify" a work means to copy from or adapt all or part of the work
 85 | in a fashion requiring copyright permission, other than the making of an
 86 | exact copy.  The resulting work is called a "modified version" of the
 87 | earlier work or a work "based on" the earlier work.
 88 | 
 89 |   A "covered work" means either the unmodified Program or a work based
 90 | on the Program.
 91 | 
 92 |   To "propagate" a work means to do anything with it that, without
 93 | permission, would make you directly or secondarily liable for
 94 | infringement under applicable copyright law, except executing it on a
 95 | computer or modifying a private copy.  Propagation includes copying,
 96 | distribution (with or without modification), making available to the
 97 | public, and in some countries other activities as well.
 98 | 
 99 |   To "convey" a work means any kind of propagation that enables other
100 | parties to make or receive copies.  Mere interaction with a user through
101 | a computer network, with no transfer of a copy, is not conveying.
102 | 
103 |   An interactive user interface displays "Appropriate Legal Notices"
104 | to the extent that it includes a convenient and prominently visible
105 | feature that (1) displays an appropriate copyright notice, and (2)
106 | tells the user that there is no warranty for the work (except to the
107 | extent that warranties are provided), that licensees may convey the
108 | work under this License, and how to view a copy of this License.  If
109 | the interface presents a list of user commands or options, such as a
110 | menu, a prominent item in the list meets this criterion.
111 | 
112 |   1. Source Code.
113 | 
114 |   The "source code" for a work means the preferred form of the work
115 | for making modifications to it.  "Object code" means any non-source
116 | form of a work.
117 | 
118 |   A "Standard Interface" means an interface that either is an official
119 | standard defined by a recognized standards body, or, in the case of
120 | interfaces specified for a particular programming language, one that
121 | is widely used among developers working in that language.
122 | 
123 |   The "System Libraries" of an executable work include anything, other
124 | than the work as a whole, that (a) is included in the normal form of
125 | packaging a Major Component, but which is not part of that Major
126 | Component, and (b) serves only to enable use of the work with that
127 | Major Component, or to implement a Standard Interface for which an
128 | implementation is available to the public in source code form.  A
129 | "Major Component", in this context, means a major essential component
130 | (kernel, window system, and so on) of the specific operating system
131 | (if any) on which the executable work runs, or a compiler used to
132 | produce the work, or an object code interpreter used to run it.
133 | 
134 |   The "Corresponding Source" for a work in object code form means all
135 | the source code needed to generate, install, and (for an executable
136 | work) run the object code and to modify the work, including scripts to
137 | control those activities.  However, it does not include the work's
138 | System Libraries, or general-purpose tools or generally available free
139 | programs which are used unmodified in performing those activities but
140 | which are not part of the work.  For example, Corresponding Source
141 | includes interface definition files associated with source files for
142 | the work, and the source code for shared libraries and dynamically
143 | linked subprograms that the work is specifically designed to require,
144 | such as by intimate data communication or control flow between those
145 | subprograms and other parts of the work.
146 | 
147 |   The Corresponding Source need not include anything that users
148 | can regenerate automatically from other parts of the Corresponding
149 | Source.
150 | 
151 |   The Corresponding Source for a work in source code form is that
152 | same work.
153 | 
154 |   2. Basic Permissions.
155 | 
156 |   All rights granted under this License are granted for the term of
157 | copyright on the Program, and are irrevocable provided the stated
158 | conditions are met.  This License explicitly affirms your unlimited
159 | permission to run the unmodified Program.  The output from running a
160 | covered work is covered by this License only if the output, given its
161 | content, constitutes a covered work.  This License acknowledges your
162 | rights of fair use or other equivalent, as provided by copyright law.
163 | 
164 |   You may make, run and propagate covered works that you do not
165 | convey, without conditions so long as your license otherwise remains
166 | in force.  You may convey covered works to others for the sole purpose
167 | of having them make modifications exclusively for you, or provide you
168 | with facilities for running those works, provided that you comply with
169 | the terms of this License in conveying all material for which you do
170 | not control copyright.  Those thus making or running the covered works
171 | for you must do so exclusively on your behalf, under your direction
172 | and control, on terms that prohibit them from making any copies of
173 | your copyrighted material outside their relationship with you.
174 | 
175 |   Conveying under any other circumstances is permitted solely under
176 | the conditions stated below.  Sublicensing is not allowed; section 10
177 | makes it unnecessary.
178 | 
179 |   3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180 | 
181 |   No covered work shall be deemed part of an effective technological
182 | measure under any applicable law fulfilling obligations under article
183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184 | similar laws prohibiting or restricting circumvention of such
185 | measures.
186 | 
187 |   When you convey a covered work, you waive any legal power to forbid
188 | circumvention of technological measures to the extent such circumvention
189 | is effected by exercising rights under this License with respect to
190 | the covered work, and you disclaim any intention to limit operation or
191 | modification of the work as a means of enforcing, against the work's
192 | users, your or third parties' legal rights to forbid circumvention of
193 | technological measures.
194 | 
195 |   4. Conveying Verbatim Copies.
196 | 
197 |   You may convey verbatim copies of the Program's source code as you
198 | receive it, in any medium, provided that you conspicuously and
199 | appropriately publish on each copy an appropriate copyright notice;
200 | keep intact all notices stating that this License and any
201 | non-permissive terms added in accord with section 7 apply to the code;
202 | keep intact all notices of the absence of any warranty; and give all
203 | recipients a copy of this License along with the Program.
204 | 
205 |   You may charge any price or no price for each copy that you convey,
206 | and you may offer support or warranty protection for a fee.
207 | 
208 |   5. Conveying Modified Source Versions.
209 | 
210 |   You may convey a work based on the Program, or the modifications to
211 | produce it from the Program, in the form of source code under the
212 | terms of section 4, provided that you also meet all of these conditions:
213 | 
214 |     a) The work must carry prominent notices stating that you modified
215 |     it, and giving a relevant date.
216 | 
217 |     b) The work must carry prominent notices stating that it is
218 |     released under this License and any conditions added under section
219 |     7.  This requirement modifies the requirement in section 4 to
220 |     "keep intact all notices".
221 | 
222 |     c) You must license the entire work, as a whole, under this
223 |     License to anyone who comes into possession of a copy.  This
224 |     License will therefore apply, along with any applicable section 7
225 |     additional terms, to the whole of the work, and all its parts,
226 |     regardless of how they are packaged.  This License gives no
227 |     permission to license the work in any other way, but it does not
228 |     invalidate such permission if you have separately received it.
229 | 
230 |     d) If the work has interactive user interfaces, each must display
231 |     Appropriate Legal Notices; however, if the Program has interactive
232 |     interfaces that do not display Appropriate Legal Notices, your
233 |     work need not make them do so.
234 | 
235 |   A compilation of a covered work with other separate and independent
236 | works, which are not by their nature extensions of the covered work,
237 | and which are not combined with it such as to form a larger program,
238 | in or on a volume of a storage or distribution medium, is called an
239 | "aggregate" if the compilation and its resulting copyright are not
240 | used to limit the access or legal rights of the compilation's users
241 | beyond what the individual works permit.  Inclusion of a covered work
242 | in an aggregate does not cause this License to apply to the other
243 | parts of the aggregate.
244 | 
245 |   6. Conveying Non-Source Forms.
246 | 
247 |   You may convey a covered work in object code form under the terms
248 | of sections 4 and 5, provided that you also convey the
249 | machine-readable Corresponding Source under the terms of this License,
250 | in one of these ways:
251 | 
252 |     a) Convey the object code in, or embodied in, a physical product
253 |     (including a physical distribution medium), accompanied by the
254 |     Corresponding Source fixed on a durable physical medium
255 |     customarily used for software interchange.
256 | 
257 |     b) Convey the object code in, or embodied in, a physical product
258 |     (including a physical distribution medium), accompanied by a
259 |     written offer, valid for at least three years and valid for as
260 |     long as you offer spare parts or customer support for that product
261 |     model, to give anyone who possesses the object code either (1) a
262 |     copy of the Corresponding Source for all the software in the
263 |     product that is covered by this License, on a durable physical
264 |     medium customarily used for software interchange, for a price no
265 |     more than your reasonable cost of physically performing this
266 |     conveying of source, or (2) access to copy the
267 |     Corresponding Source from a network server at no charge.
268 | 
269 |     c) Convey individual copies of the object code with a copy of the
270 |     written offer to provide the Corresponding Source.  This
271 |     alternative is allowed only occasionally and noncommercially, and
272 |     only if you received the object code with such an offer, in accord
273 |     with subsection 6b.
274 | 
275 |     d) Convey the object code by offering access from a designated
276 |     place (gratis or for a charge), and offer equivalent access to the
277 |     Corresponding Source in the same way through the same place at no
278 |     further charge.  You need not require recipients to copy the
279 |     Corresponding Source along with the object code.  If the place to
280 |     copy the object code is a network server, the Corresponding Source
281 |     may be on a different server (operated by you or a third party)
282 |     that supports equivalent copying facilities, provided you maintain
283 |     clear directions next to the object code saying where to find the
284 |     Corresponding Source.  Regardless of what server hosts the
285 |     Corresponding Source, you remain obligated to ensure that it is
286 |     available for as long as needed to satisfy these requirements.
287 | 
288 |     e) Convey the object code using peer-to-peer transmission, provided
289 |     you inform other peers where the object code and Corresponding
290 |     Source of the work are being offered to the general public at no
291 |     charge under subsection 6d.
292 | 
293 |   A separable portion of the object code, whose source code is excluded
294 | from the Corresponding Source as a System Library, need not be
295 | included in conveying the object code work.
296 | 
297 |   A "User Product" is either (1) a "consumer product", which means any
298 | tangible personal property which is normally used for personal, family,
299 | or household purposes, or (2) anything designed or sold for incorporation
300 | into a dwelling.  In determining whether a product is a consumer product,
301 | doubtful cases shall be resolved in favor of coverage.  For a particular
302 | product received by a particular user, "normally used" refers to a
303 | typical or common use of that class of product, regardless of the status
304 | of the particular user or of the way in which the particular user
305 | actually uses, or expects or is expected to use, the product.  A product
306 | is a consumer product regardless of whether the product has substantial
307 | commercial, industrial or non-consumer uses, unless such uses represent
308 | the only significant mode of use of the product.
309 | 
310 |   "Installation Information" for a User Product means any methods,
311 | procedures, authorization keys, or other information required to install
312 | and execute modified versions of a covered work in that User Product from
313 | a modified version of its Corresponding Source.  The information must
314 | suffice to ensure that the continued functioning of the modified object
315 | code is in no case prevented or interfered with solely because
316 | modification has been made.
317 | 
318 |   If you convey an object code work under this section in, or with, or
319 | specifically for use in, a User Product, and the conveying occurs as
320 | part of a transaction in which the right of possession and use of the
321 | User Product is transferred to the recipient in perpetuity or for a
322 | fixed term (regardless of how the transaction is characterized), the
323 | Corresponding Source conveyed under this section must be accompanied
324 | by the Installation Information.  But this requirement does not apply
325 | if neither you nor any third party retains the ability to install
326 | modified object code on the User Product (for example, the work has
327 | been installed in ROM).
328 | 
329 |   The requirement to provide Installation Information does not include a
330 | requirement to continue to provide support service, warranty, or updates
331 | for a work that has been modified or installed by the recipient, or for
332 | the User Product in which it has been modified or installed.  Access to a
333 | network may be denied when the modification itself materially and
334 | adversely affects the operation of the network or violates the rules and
335 | protocols for communication across the network.
336 | 
337 |   Corresponding Source conveyed, and Installation Information provided,
338 | in accord with this section must be in a format that is publicly
339 | documented (and with an implementation available to the public in
340 | source code form), and must require no special password or key for
341 | unpacking, reading or copying.
342 | 
343 |   7. Additional Terms.
344 | 
345 |   "Additional permissions" are terms that supplement the terms of this
346 | License by making exceptions from one or more of its conditions.
347 | Additional permissions that are applicable to the entire Program shall
348 | be treated as though they were included in this License, to the extent
349 | that they are valid under applicable law.  If additional permissions
350 | apply only to part of the Program, that part may be used separately
351 | under those permissions, but the entire Program remains governed by
352 | this License without regard to the additional permissions.
353 | 
354 |   When you convey a copy of a covered work, you may at your option
355 | remove any additional permissions from that copy, or from any part of
356 | it.  (Additional permissions may be written to require their own
357 | removal in certain cases when you modify the work.)  You may place
358 | additional permissions on material, added by you to a covered work,
359 | for which you have or can give appropriate copyright permission.
360 | 
361 |   Notwithstanding any other provision of this License, for material you
362 | add to a covered work, you may (if authorized by the copyright holders of
363 | that material) supplement the terms of this License with terms:
364 | 
365 |     a) Disclaiming warranty or limiting liability differently from the
366 |     terms of sections 15 and 16 of this License; or
367 | 
368 |     b) Requiring preservation of specified reasonable legal notices or
369 |     author attributions in that material or in the Appropriate Legal
370 |     Notices displayed by works containing it; or
371 | 
372 |     c) Prohibiting misrepresentation of the origin of that material, or
373 |     requiring that modified versions of such material be marked in
374 |     reasonable ways as different from the original version; or
375 | 
376 |     d) Limiting the use for publicity purposes of names of licensors or
377 |     authors of the material; or
378 | 
379 |     e) Declining to grant rights under trademark law for use of some
380 |     trade names, trademarks, or service marks; or
381 | 
382 |     f) Requiring indemnification of licensors and authors of that
383 |     material by anyone who conveys the material (or modified versions of
384 |     it) with contractual assumptions of liability to the recipient, for
385 |     any liability that these contractual assumptions directly impose on
386 |     those licensors and authors.
387 | 
388 |   All other non-permissive additional terms are considered "further
389 | restrictions" within the meaning of section 10.  If the Program as you
390 | received it, or any part of it, contains a notice stating that it is
391 | governed by this License along with a term that is a further
392 | restriction, you may remove that term.  If a license document contains
393 | a further restriction but permits relicensing or conveying under this
394 | License, you may add to a covered work material governed by the terms
395 | of that license document, provided that the further restriction does
396 | not survive such relicensing or conveying.
397 | 
398 |   If you add terms to a covered work in accord with this section, you
399 | must place, in the relevant source files, a statement of the
400 | additional terms that apply to those files, or a notice indicating
401 | where to find the applicable terms.
402 | 
403 |   Additional terms, permissive or non-permissive, may be stated in the
404 | form of a separately written license, or stated as exceptions;
405 | the above requirements apply either way.
406 | 
407 |   8. Termination.
408 | 
409 |   You may not propagate or modify a covered work except as expressly
410 | provided under this License.  Any attempt otherwise to propagate or
411 | modify it is void, and will automatically terminate your rights under
412 | this License (including any patent licenses granted under the third
413 | paragraph of section 11).
414 | 
415 |   However, if you cease all violation of this License, then your
416 | license from a particular copyright holder is reinstated (a)
417 | provisionally, unless and until the copyright holder explicitly and
418 | finally terminates your license, and (b) permanently, if the copyright
419 | holder fails to notify you of the violation by some reasonable means
420 | prior to 60 days after the cessation.
421 | 
422 |   Moreover, your license from a particular copyright holder is
423 | reinstated permanently if the copyright holder notifies you of the
424 | violation by some reasonable means, this is the first time you have
425 | received notice of violation of this License (for any work) from that
426 | copyright holder, and you cure the violation prior to 30 days after
427 | your receipt of the notice.
428 | 
429 |   Termination of your rights under this section does not terminate the
430 | licenses of parties who have received copies or rights from you under
431 | this License.  If your rights have been terminated and not permanently
432 | reinstated, you do not qualify to receive new licenses for the same
433 | material under section 10.
434 | 
435 |   9. Acceptance Not Required for Having Copies.
436 | 
437 |   You are not required to accept this License in order to receive or
438 | run a copy of the Program.  Ancillary propagation of a covered work
439 | occurring solely as a consequence of using peer-to-peer transmission
440 | to receive a copy likewise does not require acceptance.  However,
441 | nothing other than this License grants you permission to propagate or
442 | modify any covered work.  These actions infringe copyright if you do
443 | not accept this License.  Therefore, by modifying or propagating a
444 | covered work, you indicate your acceptance of this License to do so.
445 | 
446 |   10. Automatic Licensing of Downstream Recipients.
447 | 
448 |   Each time you convey a covered work, the recipient automatically
449 | receives a license from the original licensors, to run, modify and
450 | propagate that work, subject to this License.  You are not responsible
451 | for enforcing compliance by third parties with this License.
452 | 
453 |   An "entity transaction" is a transaction transferring control of an
454 | organization, or substantially all assets of one, or subdividing an
455 | organization, or merging organizations.  If propagation of a covered
456 | work results from an entity transaction, each party to that
457 | transaction who receives a copy of the work also receives whatever
458 | licenses to the work the party's predecessor in interest had or could
459 | give under the previous paragraph, plus a right to possession of the
460 | Corresponding Source of the work from the predecessor in interest, if
461 | the predecessor has it or can get it with reasonable efforts.
462 | 
463 |   You may not impose any further restrictions on the exercise of the
464 | rights granted or affirmed under this License.  For example, you may
465 | not impose a license fee, royalty, or other charge for exercise of
466 | rights granted under this License, and you may not initiate litigation
467 | (including a cross-claim or counterclaim in a lawsuit) alleging that
468 | any patent claim is infringed by making, using, selling, offering for
469 | sale, or importing the Program or any portion of it.
470 | 
471 |   11. Patents.
472 | 
473 |   A "contributor" is a copyright holder who authorizes use under this
474 | License of the Program or a work on which the Program is based.  The
475 | work thus licensed is called the contributor's "contributor version".
476 | 
477 |   A contributor's "essential patent claims" are all patent claims
478 | owned or controlled by the contributor, whether already acquired or
479 | hereafter acquired, that would be infringed by some manner, permitted
480 | by this License, of making, using, or selling its contributor version,
481 | but do not include claims that would be infringed only as a
482 | consequence of further modification of the contributor version.  For
483 | purposes of this definition, "control" includes the right to grant
484 | patent sublicenses in a manner consistent with the requirements of
485 | this License.
486 | 
487 |   Each contributor grants you a non-exclusive, worldwide, royalty-free
488 | patent license under the contributor's essential patent claims, to
489 | make, use, sell, offer for sale, import and otherwise run, modify and
490 | propagate the contents of its contributor version.
491 | 
492 |   In the following three paragraphs, a "patent license" is any express
493 | agreement or commitment, however denominated, not to enforce a patent
494 | (such as an express permission to practice a patent or covenant not to
495 | sue for patent infringement).  To "grant" such a patent license to a
496 | party means to make such an agreement or commitment not to enforce a
497 | patent against the party.
498 | 
499 |   If you convey a covered work, knowingly relying on a patent license,
500 | and the Corresponding Source of the work is not available for anyone
501 | to copy, free of charge and under the terms of this License, through a
502 | publicly available network server or other readily accessible means,
503 | then you must either (1) cause the Corresponding Source to be so
504 | available, or (2) arrange to deprive yourself of the benefit of the
505 | patent license for this particular work, or (3) arrange, in a manner
506 | consistent with the requirements of this License, to extend the patent
507 | license to downstream recipients.  "Knowingly relying" means you have
508 | actual knowledge that, but for the patent license, your conveying the
509 | covered work in a country, or your recipient's use of the covered work
510 | in a country, would infringe one or more identifiable patents in that
511 | country that you have reason to believe are valid.
512 | 
513 |   If, pursuant to or in connection with a single transaction or
514 | arrangement, you convey, or propagate by procuring conveyance of, a
515 | covered work, and grant a patent license to some of the parties
516 | receiving the covered work authorizing them to use, propagate, modify
517 | or convey a specific copy of the covered work, then the patent license
518 | you grant is automatically extended to all recipients of the covered
519 | work and works based on it.
520 | 
521 |   A patent license is "discriminatory" if it does not include within
522 | the scope of its coverage, prohibits the exercise of, or is
523 | conditioned on the non-exercise of one or more of the rights that are
524 | specifically granted under this License.  You may not convey a covered
525 | work if you are a party to an arrangement with a third party that is
526 | in the business of distributing software, under which you make payment
527 | to the third party based on the extent of your activity of conveying
528 | the work, and under which the third party grants, to any of the
529 | parties who would receive the covered work from you, a discriminatory
530 | patent license (a) in connection with copies of the covered work
531 | conveyed by you (or copies made from those copies), or (b) primarily
532 | for and in connection with specific products or compilations that
533 | contain the covered work, unless you entered into that arrangement,
534 | or that patent license was granted, prior to 28 March 2007.
535 | 
536 |   Nothing in this License shall be construed as excluding or limiting
537 | any implied license or other defenses to infringement that may
538 | otherwise be available to you under applicable patent law.
539 | 
540 |   12. No Surrender of Others' Freedom.
541 | 
542 |   If conditions are imposed on you (whether by court order, agreement or
543 | otherwise) that contradict the conditions of this License, they do not
544 | excuse you from the conditions of this License.  If you cannot convey a
545 | covered work so as to satisfy simultaneously your obligations under this
546 | License and any other pertinent obligations, then as a consequence you may
547 | not convey it at all.  For example, if you agree to terms that obligate you
548 | to collect a royalty for further conveying from those to whom you convey
549 | the Program, the only way you could satisfy both those terms and this
550 | License would be to refrain entirely from conveying the Program.
551 | 
552 |   13. Use with the GNU Affero General Public License.
553 | 
554 |   Notwithstanding any other provision of this License, you have
555 | permission to link or combine any covered work with a work licensed
556 | under version 3 of the GNU Affero General Public License into a single
557 | combined work, and to convey the resulting work.  The terms of this
558 | License will continue to apply to the part which is the covered work,
559 | but the special requirements of the GNU Affero General Public License,
560 | section 13, concerning interaction through a network will apply to the
561 | combination as such.
562 | 
563 |   14. Revised Versions of this License.
564 | 
565 |   The Free Software Foundation may publish revised and/or new versions of
566 | the GNU General Public License from time to time.  Such new versions will
567 | be similar in spirit to the present version, but may differ in detail to
568 | address new problems or concerns.
569 | 
570 |   Each version is given a distinguishing version number.  If the
571 | Program specifies that a certain numbered version of the GNU General
572 | Public License "or any later version" applies to it, you have the
573 | option of following the terms and conditions either of that numbered
574 | version or of any later version published by the Free Software
575 | Foundation.  If the Program does not specify a version number of the
576 | GNU General Public License, you may choose any version ever published
577 | by the Free Software Foundation.
578 | 
579 |   If the Program specifies that a proxy can decide which future
580 | versions of the GNU General Public License can be used, that proxy's
581 | public statement of acceptance of a version permanently authorizes you
582 | to choose that version for the Program.
583 | 
584 |   Later license versions may give you additional or different
585 | permissions.  However, no additional obligations are imposed on any
586 | author or copyright holder as a result of your choosing to follow a
587 | later version.
588 | 
589 |   15. Disclaimer of Warranty.
590 | 
591 |   THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592 | APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596 | PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597 | IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599 | 
600 |   16. Limitation of Liability.
601 | 
602 |   IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610 | SUCH DAMAGES.
611 | 
612 |   17. Interpretation of Sections 15 and 16.
613 | 
614 |   If the disclaimer of warranty and limitation of liability provided
615 | above cannot be given local legal effect according to their terms,
616 | reviewing courts shall apply local law that most closely approximates
617 | an absolute waiver of all civil liability in connection with the
618 | Program, unless a warranty or assumption of liability accompanies a
619 | copy of the Program in return for a fee.
620 | 
621 |                      END OF TERMS AND CONDITIONS
622 | 


--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include LICENSE.txt README.rst CHANGES.txt contributors.txt requirements.txt
2 | include tlseparation/config/example_config.txt
3 | 


--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
 1 | TLSeparation
 2 | ============
 3 | 
 4 | TLSeparation is a Python library for material separation from tree/forests 3d point clouds.
 5 | 
 6 | Some features included in this package are:
 7 | 
 8 | * Automated scripts to perform separation from single tree data.
 9 | * Very extensible, modules and functions that can be imported to build a custom workflow.
10 | * Separation functions based on topology and geometric arrangement of points.
11 | * Filtering module to improve classification results.
12 | 
13 | This is still a work in progress, requiring some polishing to improve user-friendliness, but the core modules are sound and tested.
14 | 
15 | The TLSeparation library is being developed as part of my PhD research, supervised by Dr. Mat Disney, in the Department of Geography at University College London (UCL). My research 
16 | is funded through Science Without Borders from the National Council of Technological and Scientific Development (10.13039/501100003593) – Brazil (Process number 233849/2014-9). 
17 | 
18 | Any questions or suggestions, feel free to contact me using one of the following e-mails: matheus.boni.vicari@gmail.com or matheus.vicari.15@ucl.ac.uk
19 | 


--------------------------------------------------------------------------------
/TODO.rst:
--------------------------------------------------------------------------------
 1 | ===================================
 2 | To-Do list for TLSeparation project
 3 | ===================================
 4 | 
 5 | 
 6 | Python package
 7 | ~~~~~~~~~~~~~~
 8 | - Improve *continuity_filter*;
 9 | - Add logging options;
10 | - Change path frequency detection threshold to [np.max(np.log(c)) / 2];
11 | - Change radius thresholds from frequency path detection;
12 | 


--------------------------------------------------------------------------------
/contributors.txt:
--------------------------------------------------------------------------------
1 | Matheus Boni Vicari
2 |  (matheus.boni.vicari@gmail.com or matheus.vicari.15@ucl.ac.uk)
3 | 
4 | Phil Wilkes
5 |  (p.wilkes@ucl.ac.uk)


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | scipy==0.19.0
2 | pandas==0.19.2
3 | numpy==1.22.0
4 | networkx==1.11
5 | scikit_learn==0.19.1
6 | 


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | from setuptools import setup, find_packages
29 | 
30 | 
31 | def readme():
32 |     with open('README.rst') as f:
33 |         return f.read()
34 | 
35 | with open('requirements.txt') as f:
36 |     required = f.read().splitlines()
37 | 
38 | setup(
39 |     name="tlseparation",
40 |     version="1.3.2",
41 |     author='Matheus Boni Vicari',
42 |     author_email='matheus.boni.vicari@gmail.com',
43 |     packages=find_packages(),
44 |     entry_points={
45 | 	},
46 |     url='https://github.com/TLSeparation/source',
47 |     license='LICENSE.txt',
48 |     description='Performs the wood/leaf separation from\
49 |  3D point clouds generated using Terrestrial LiDAR\
50 |  Scanners.',
51 |     long_description=readme(),
52 |     classifiers=['Programming Language :: Python',
53 |                  'Topic :: Scientific/Engineering'],
54 |     keywords='wood/leaf separation TLS point cloud LiDAR',
55 |     install_requires=required,
56 |     # ...
57 | )
58 | 


--------------------------------------------------------------------------------
/tlseparation/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2018, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | __all__ = ['classification', 'utility', 'scripts']
29 | 
30 | from . import classification
31 | from . import utility
32 | from . import scripts
33 | 


--------------------------------------------------------------------------------
/tlseparation/classification/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | from .point_features import (knn_features, curvature)
29 | from .gmm import (classify, class_select_abs, class_select_ref)
30 | from .path_detection import (detect_main_pathways, voxel_path_detection,
31 |                              path_detect_frequency, get_base)
32 | from .wlseparation import (wlseparate_abs, wlseparate_ref_voting, fill_class)
33 | from .classify_wood import (reference_classification,
34 |                             threshold_classification)
35 | from .classes_reference import DefaultClass
36 | 


--------------------------------------------------------------------------------
/tlseparation/classification/classes_reference.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari", "Phil Wilkes"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | 
29 | import numpy as np
30 | from pandas import DataFrame
31 | 
32 | 
33 | class DefaultClass:
34 | 
35 |     """
36 |     Defines a default reference class to be used in classification of
37 |     tree point clouds.
38 | 
39 |     """
40 | 
41 |     def __init__(self):
42 |         self.ref_table = DataFrame(np.array([['leaf', 1, 0, 0, 0, 0, 0],
43 |                                              ['twig', 0, 1, 0, 0, 0.5, 1],
44 |                                              ['trunk', 0, 0, 1, 1, 0.5, 1]]),
45 |                                    columns=['class', 0, 1, 2, 3, 4, 5])
46 | 


--------------------------------------------------------------------------------
/tlseparation/classification/classify_wood.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | from .classes_reference import DefaultClass
 30 | from .wlseparation import wlseparate_abs, wlseparate_ref_voting
 31 | from ..utility.filtering import class_filter
 32 | 
 33 | 
 34 | def reference_classification(point_cloud, knn_list, n_classes=4,
 35 |                              prob_threshold=0.95):
 36 | 
 37 |     """
 38 |     Classifies wood material points from a point cloud. This function
 39 |     uses *wlseparate_ref_voting* to perform the basic classification and then
 40 |     apply *class_filter* to filter out potentially misclassified wood points.
 41 | 
 42 |     Parameters
 43 |     ----------
 44 |     point_cloud: numpy.ndarray
 45 |         2D (n x 3) array containing n points in 3D space (x, y, z).
 46 |     knn_list: list
 47 |         List of knn values to be used iteratively in the voting separation.
 48 |     n_classes: int
 49 |         Number of intermediate classes. Minimum classes should be 3, but
 50 |         default value is set to 4 in order to accommodate for noise/outliers
 51 |         classes.
 52 |     prob_threshold: float
 53 |         Classification probability threshold to filter classes. This aims to
 54 |         avoid selecting points that are not confidently enough assigned to
 55 |         any given class. Default is 0.95.
 56 | 
 57 |     Returns
 58 |     -------
 59 |     wood_points: numpy.ndarray
 60 |         2D (nw x 3) array containing n wood points in 3D space (x, y, z).
 61 | 
 62 |     """
 63 | 
 64 |     # Defining reference class table.
 65 |     class_file = DefaultClass().ref_table
 66 | 
 67 |     # Classifying point cloud using wlseparate_ref_voting. The output will
 68 |     # be a combination of classes indices, vote counts and probabilities.
 69 |     ids, count, prob = wlseparate_ref_voting(point_cloud, knn_list, class_file,
 70 |                                              n_classes=n_classes)
 71 |     # Selecting indices, probabilities and votes count for wood classes
 72 |     # (twig and trunk).
 73 |     twig_mask = ids['twig']
 74 |     twig_prob = prob['twig']
 75 |     twig_count = count['twig']
 76 |     # Selecting only twig points with a high probability and vote count.
 77 |     twig = twig_mask[(twig_prob >= prob_threshold) &
 78 |                      (twig_count >= np.max(twig_count) - 1)]
 79 |     trunk_mask = ids['trunk']
 80 |     trunk_prob = prob['trunk']
 81 |     trunk_count = count['trunk']
 82 |     # Selecting only trunk points with a high probability and vote count.
 83 |     trunk = trunk_mask[(trunk_prob >= prob_threshold) &
 84 |                        (trunk_count >= np.max(trunk_count) - 1)]
 85 | 
 86 |     # Creating boolean mask with the same number of entries as input
 87 |     # point cloud. Entries of points classified as wood are set to True.
 88 |     class_mask = np.zeros(point_cloud.shape[0], dtype=bool)
 89 |     class_mask[twig] = True
 90 |     class_mask[trunk] = True
 91 | 
 92 |     # Stacking wood and not wood points and applying class_filter.
 93 |     temp_arr = np.vstack((point_cloud[class_mask], point_cloud[~class_mask]))
 94 |     k = int(np.min(knn_list))
 95 |     wood_ids, not_wood_ids = class_filter(point_cloud[class_mask],
 96 |                                           point_cloud[~class_mask], 0, knn=k)
 97 | 
 98 |     return temp_arr[wood_ids]
 99 | 
100 | 
101 | def threshold_classification(point_cloud, knn, n_classes=3,
102 |                              prob_threshold=0.95):
103 | 
104 |     """
105 |     Classifies wood material points from a point cloud. This function
106 |     uses *wlseparate_abs* to perform the basic classification and then
107 |     apply *class_filter* to filter out potentially misclassified wood points.
108 | 
109 |     Parameters
110 |     ----------
111 |     point_cloud : numpy.ndarray
112 |         2D (n x 3) array containing n points in 3D space (x, y, z).
113 |     knn : int
114 |         Number of neighbors to select around each point. Used to describe
115 |         local point arrangement.
116 |     n_classes: int
117 |         Number of intermediate classes. Default is 3.
118 |     prob_threshold: float
119 |         Classification probability threshold to filter classes. This aims to
120 |         avoid selecting points that are not confidently enough assigned to
121 |         any given class. Default is 0.95.
122 | 
123 |     Returns
124 |     -------
125 |     wood_points: numpy.ndarray
126 |         2D (nw x 3) array containing n wood points in 3D space (x, y, z).
127 | 
128 |     """
129 | 
130 |     # Running wlseparate_abs to classify the input point cloud into wood and
131 |     # leaf classes.
132 |     ids, prob = wlseparate_abs(point_cloud, knn, n_classes)
133 |     # Selecting wood indices and probabilities.
134 |     wood_mask = ids['wood']
135 |     wood_prob = prob['wood']
136 |     # Filtering out wood points with classification probability lower than
137 |     # threshold.
138 |     wood = wood_mask[wood_prob >= prob_threshold]
139 | 
140 |     # Creating boolean mask with the same number of entries as input
141 |     # point cloud. Entries of points classified as wood are set to True.
142 |     class_mask = np.zeros(point_cloud.shape[0], dtype=bool)
143 |     class_mask[wood] = True
144 | 
145 |     # Stacking wood and not wood points and applying class_filter.
146 |     temp_arr = np.vstack((point_cloud[class_mask],
147 |                           point_cloud[~class_mask]))
148 |     wood_ids, not_wood_ids = class_filter(point_cloud[class_mask],
149 |                                           point_cloud[~class_mask], 0,
150 |                                           knn=int(knn))
151 | 
152 |     return temp_arr[wood_ids]
153 | 


--------------------------------------------------------------------------------
/tlseparation/classification/gmm.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | from sklearn.mixture import GaussianMixture as GMM
 30 | 
 31 | 
 32 | def classify(variables, n_classes):
 33 | 
 34 |     """
 35 |     Function to perform the classification of a dataset using sklearn's
 36 |     Gaussian Mixture Models with Expectation Maximization.
 37 | 
 38 |     Parameters
 39 |     ----------
 40 |     variables : array
 41 |         N-dimensional array (m x n) containing a set of parameters (n)
 42 |         over a set of observations (m).
 43 |     n_classes : int
 44 |         Number of classes to assign the input variables.
 45 | 
 46 |     Returns
 47 |     -------
 48 |     classes : list
 49 |         List of classes labels for each observation from the input variables.
 50 |     means : array
 51 |         N-dimensional array (c x n) of each class (c) parameter space means
 52 |         (n).
 53 |     probability : array
 54 |         Probability of samples belonging to every class in the classification.
 55 |         Sum of sample-wise probability should be 1.
 56 | 
 57 |     """
 58 | 
 59 |     # Initialize a GMM classifier with n_classes and fit variables to it.
 60 |     gmm = GMM(n_components=n_classes)
 61 |     gmm.fit(variables)
 62 | 
 63 |     return gmm.predict(variables), gmm.means_, gmm.predict_proba(variables)
 64 | 
 65 | 
 66 | def class_select_ref(classes, cm, classes_ref):
 67 | 
 68 |     """
 69 |     Selects from the classification results which classes are wood and which
 70 |     are leaf.
 71 | 
 72 |     Parameters
 73 |     ----------
 74 |     classes : list
 75 |         List of classes labels for each observation from the input variables.
 76 |     cm : array
 77 |         N-dimensional array (c x n) of each class (c) parameter space mean
 78 |         valuess (n).
 79 |     classes_ref : array
 80 |         Reference classes values.
 81 | 
 82 |     Returns
 83 |     -------
 84 |     mask : array
 85 |         List of booleans where True represents wood points and False
 86 |         represents leaf points.
 87 | 
 88 |     """
 89 | 
 90 |     # Initializing array of class ids.
 91 |     class_ids = np.zeros([cm.shape[0]])
 92 | 
 93 |     # Looping over each index in the classes means array.
 94 |     for c in range(cm.shape[0]):
 95 |         # Setting initial minimum distance value.
 96 |         mindist = np.inf
 97 |         # Looping over indices in classes reference values.
 98 |         for i in range(classes_ref.shape[0]):
 99 |             # Calculating distance of current class mean parameters and
100 |             # current reference paramenters.
101 |             d = np.linalg.norm(cm[c] - classes_ref[i])
102 |             # Checking if current distance is smaller than previous distance
103 |             # if so, assign current reference index to current class index.
104 |             if d < mindist:
105 |                 class_ids[c] = i
106 |                 mindist = d
107 | 
108 |     # Assigning final classes values to new classes.
109 |     new_classes = np.zeros([classes.shape[0]])
110 |     for i in range(new_classes.shape[0]):
111 |         new_classes[i] = class_ids[classes[i]]
112 | 
113 |     return new_classes
114 | 
115 | 
116 | def class_select_abs(classes, cm, nbrs_idx, feature=5, threshold=0.5):
117 | 
118 |     """
119 |     Select from GMM classification results which classes are wood and which
120 |     are leaf based on a absolute value threshold from a single feature in
121 |     the parameter space.
122 | 
123 |     Parameters
124 |     ----------
125 |     classes : list or array
126 |         Classes labels for each observation from the input variables.
127 |     cm : array
128 |         N-dimensional array (c x n) of each class (c) parameter space mean
129 |         valuess (n).
130 |     nbrs_idx : array
131 |         Nearest Neighbors indices relative to every point of the array that
132 |         originated the classes labels.
133 |     feature : int
134 |         Column index of the feature to use as constraint.
135 |     threshold : float
136 |         Threshold value to mask classes. All classes with means >= threshold
137 |         are masked as true.
138 | 
139 |     Returns
140 |     -------
141 |     mask : list
142 |         List of booleans where True represents wood points and False
143 |         represents leaf points.
144 | 
145 |     """
146 | 
147 |     # Calculating the ratio of first 3 components of the classes means (cm).
148 |     # These components are the basic geometric descriptors.
149 |     if np.max(np.sum(cm, axis=1)) >= threshold:
150 | 
151 |         class_id = np.argmax(cm[:, feature])
152 | 
153 |         # Masking classes based on the criterias set above. Mask will present
154 |         # True for wood points and False for leaf points.
155 |         mask = classes == class_id
156 | 
157 |     else:
158 |         mask = []
159 | 
160 |     return mask
161 | 


--------------------------------------------------------------------------------
/tlseparation/classification/path_detection.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari", "Phil Wilkes"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | 
 29 | import datetime
 30 | import numpy as np
 31 | from sklearn.neighbors import NearestNeighbors
 32 | from ..utility.shortpath import (array_to_graph, extract_path_info)
 33 | from ..utility.voxels import voxelize_cloud
 34 | from ..utility.downsampling import (downsample_cloud, upsample_cloud)
 35 | from ..utility.filtering import radius_filter
 36 | from ..utility.knnsearch import set_nbrs_rad
 37 | 
 38 | 
 39 | def path_detect_frequency(point_cloud, downsample_size,
 40 |                           frequency_threshold):
 41 | 
 42 |     """
 43 |     Detects points from major paths in a graph generated from a point cloud.
 44 |     The detection is performed by comparing the frequency of all paths that
 45 |     each node is present. Nodes with frequency larger than threshold are
 46 |     selected as detected. In order to fill pathways regions with low nodes
 47 |     density, neighboring points within downsampling_size * 1.5 distance are
 48 |     also set as detected.
 49 | 
 50 |     Parameters
 51 |     ----------
 52 |     point_cloud : numpy.ndarray
 53 |         2D (n x 3) array containing n points in 3D space (x, y, z).
 54 |     downsample_size : float
 55 |         Distance threshold used to group (downsample) the input point cloud.
 56 |         Simplificaton of the cloud by downsampling, improves the results and
 57 |         processing times.
 58 |     frequency_threshold : float
 59 |         Minimum path frequency for a node to be selected as part of major
 60 |         pathways.
 61 | 
 62 |     Returns
 63 |     -------
 64 |     path_points: numpy.ndarray
 65 |         2D (np x 3) array containing n points in 3D space (x, y, z) that
 66 |         belongs to major pathways in the point cloud.
 67 | 
 68 |     """
 69 | 
 70 |     # Downsampling point cloud. The function returns downsampled indices
 71 |     # (down_ids) and a set of original neighboring indices around each
 72 |     # downsampled points (up_ids) that can be later used to revert the
 73 |     # downsampling.
 74 |     down_ids, up_ids = downsample_cloud(point_cloud, downsample_size,
 75 |                                         return_indices=True,
 76 |                                         return_neighbors=True)
 77 |     # Obtaining downsampled cloud base index (lowest point in the cloud).
 78 |     base_id = np.argmin(point_cloud[down_ids, 2])
 79 |     # Generating networkx graph from point cloud.
 80 |     G = array_to_graph(point_cloud[down_ids], base_id, 3, 100,
 81 |                        downsample_size * 1.77, 0.02)
 82 |     # Extracting shortest path information from graph: nodes indices
 83 |     # (nodes_ids), shortest path distance (D) and list of nodes in each
 84 |     # nodes' paths (path_dict).
 85 |     nodes_ids, D, path_dict = extract_path_info(G, base_id,
 86 |                                                 return_path=True)
 87 |     # Selecting nodes coordinates
 88 |     nodes = point_cloud[list(down_ids)][list(nodes_ids)]
 89 |     # Unpacking all indices in path_dict and appending them to path_ids_list.
 90 |     path_ids_list = []
 91 |     for k, v in path_dict.items():
 92 |         path_ids_list.append(v)
 93 |     # Flattening path_ids_list.
 94 |     path_ids_list = [j for i in path_ids_list for j in i]
 95 | 
 96 |     # Calculating counts of paths (c) that go through each unique node (u).
 97 |     u, c = np.unique(path_ids_list, return_counts=True)
 98 |     # Masking nodes with log(c) larger than treshold.
 99 |     mask = np.log(c) >= frequency_threshold
100 |     # Filtering isolated nodes.
101 |     mask_radius = radius_filter(nodes[mask], 0.2, 3)
102 |     # Selecting neighboring nodes.
103 |     nbrs_idx = set_nbrs_rad(point_cloud, nodes[mask][mask_radius],
104 |                             downsample_size * 1.5, False)
105 | 
106 |     # Flattening list of neighboring nodes and upscaling results to
107 |     # original point cloud.
108 |     ids = [j for i in nbrs_idx for j in i]
109 |     ids = np.unique(ids)
110 |     ups_ids = upsample_cloud(ids, up_ids)
111 | 
112 |     return point_cloud[ups_ids]
113 | 
114 | 
115 | def voxel_path_detection(point_cloud, voxel_size, k_retrace, knn,
116 |                          nbrs_threshold, verbose=False):
117 | 
118 |     """
119 |     Applies detect_main_pathways but with a voxelization option to speed up
120 |     processing.
121 | 
122 |     Parameters
123 |     ----------
124 |     point_cloud : array
125 |         Three-dimensional point cloud of a single tree to perform the
126 |         wood-leaf separation. This should be a n-dimensional array (m x n)
127 |         containing a set of coordinates (n) over a set of points (m).
128 |     voxel_size: float
129 |         Voxel dimensions' size.
130 |     k_retrace : int
131 |         Number of steps in the graph to retrace back to graph's base. Every
132 |         node in graph will be moved  k_retrace steps from the extremities
133 |         towards to base.
134 |     knn : int
135 |         Number of neighbors to fill gaps in detected paths. The larger the
136 |         better. A large knn will increase memory usage. Recommended value
137 |         between 50 and 150.
138 |     nbrs_threshold : float
139 |         Maximum distance to valid neighboring points used to fill gaps in
140 |         detected paths.
141 |     verbose: bool
142 |         Option to set verbose on/off.
143 | 
144 |     Returns
145 |     -------
146 |     path_mask : array
147 |         Boolean mask where 'True' represents points detected as part of the
148 |         main pathways and 'False' represents points not part of the pathways.
149 | 
150 |     Raises
151 |     ------
152 |     AssertionError:
153 |         point_cloud has the wrong shape or number of dimensions.
154 |     """
155 | 
156 |     # Making sure input point cloud has the right shape and number of
157 |     # dimensions.
158 |     assert point_cloud.ndim == 2, "point_cloud must be an array with 2\
159 |  dimensions, n_points x 3 (x, y, z)."
160 |     assert point_cloud.shape[1] == 3, "point_cloud must be a 3D point cloud.\
161 |  Make sure it has the shape n_points x 3 (x, y, z)."
162 | 
163 |     # Voxelizing point cloud.
164 |     if verbose:
165 |         print(str(datetime.datetime.now()) + ' | >>> voxelizing point cloud, \
166 | with a voxel size of %s' % voxel_size)
167 |     vox = voxelize_cloud(point_cloud, voxel_size=voxel_size)
168 |     vox_coords = np.asarray(list(vox.keys()))
169 | 
170 |     # Running detect_main_pathways over voxels' coordinates.
171 |     if verbose:
172 |         print(str(datetime.datetime.now()) + ' | >>> running \
173 | detect_main_pathways with %s number of steps retraced' % k_retrace)
174 |     path_mask_voxel = detect_main_pathways(vox_coords, k_retrace, knn,
175 |                                            nbrs_threshold, verbose=verbose)
176 |     # Re-indexing point_cloud indices from voxels coordinates detected as
177 |     # part of the path.
178 |     path_ids = np.unique([j for i in vox_coords[path_mask_voxel] for
179 |                           j in vox[tuple(i)]])
180 |     path_mask = np.zeros(point_cloud.shape[0], dtype=bool)
181 |     path_mask[path_ids] = True
182 | 
183 |     return path_mask
184 | 
185 | 
186 | def detect_main_pathways(point_cloud, k_retrace, knn, nbrs_threshold,
187 |                          verbose=False, max_iter=100):
188 | 
189 |     """
190 |     Detects the main pathways of an unordered 3D point cloud. Set as true
191 |     all points detected as part of all detected pathways that down to the
192 |     base of the graph.
193 | 
194 |     Parameters
195 |     ----------
196 |     point_cloud : array
197 |         Three-dimensional point cloud of a single tree to perform the
198 |         wood-leaf separation. This should be a n-dimensional array (m x n)
199 |         containing a set of coordinates (n) over a set of points (m).
200 |     k_retrace : int
201 |         Number of steps in the graph to retrace back to graph's base. Every
202 |         node in graph will be moved  k_retrace steps from the extremities
203 |         towards to base.
204 |     knn : int
205 |         Number of neighbors to fill gaps in detected paths. The larger the
206 |         better. A large knn will increase memory usage. Recommended value
207 |         between 50 and 150.
208 |     nbrs_threshold : float
209 |         Maximum distance to valid neighboring points used to fill gaps in
210 |         detected paths.
211 |     verbose: bool
212 |         Option to set verbose on/off.
213 | 
214 |     Returns
215 |     -------
216 |     path_mask : array
217 |         Boolean mask where 'True' represents points detected as part of the
218 |         main pathways and 'False' represents points not part of the pathways.
219 | 
220 |     Raises
221 |     ------
222 |     AssertionError:
223 |         point_cloud has the wrong shape or number of dimensions.
224 | 
225 |     """
226 | 
227 |     # Making sure input point cloud has the right shape and number of
228 |     # dimensions.
229 |     assert point_cloud.ndim == 2, "point_cloud must be an array with 2\
230 |  dimensions, n_points x 3 (x, y, z)."
231 |     assert point_cloud.shape[1] == 3, "point_cloud must be a 3D point cloud.\
232 |  Make sure it has the shape n_points x 3 (x, y, z)."
233 | 
234 |     # Getting root index (base_id) from point cloud.
235 |     base_id = np.argmin(point_cloud[:, 2])
236 | 
237 |     # Generating graph from point cloud and extracting shortest path
238 |     # information.
239 |     if verbose:
240 |         print(str(datetime.datetime.now()) + ' | >>> generating graph from \
241 | point cloud and extracting shortest path information')
242 |     G = array_to_graph(point_cloud, base_id, 3, knn, nbrs_threshold, 0.02)
243 |     nodes_ids, D, path_list = extract_path_info(G, base_id,
244 |                                                 return_path=True)
245 |     # Obtaining nodes coordinates from shortest path information.
246 |     nodes = point_cloud[list(nodes_ids)]
247 |     # Converting list of shortest path distances to array.
248 |     D = np.asarray(list(D))
249 | 
250 |     # Retracing path for nodes in G. This step aims to detect only major
251 |     # pathways in G. For a tree, these paths are expected to represent
252 |     # branches and trunk.
253 |     new_id = np.zeros(nodes.shape[0], dtype='int')
254 |     for key, values in path_list.items():
255 |         if len(values) >= k_retrace:
256 |             new_id[key] = values[len(values) - k_retrace]
257 |         else:
258 |             new_id[key] = values[0]
259 | 
260 |     # Getting unique indices after retracing path_list.
261 |     ids = np.unique(new_id)
262 | 
263 |     # Generating array of all indices from 'arr' and all indices to process
264 |     # 'idx'.
265 |     idx_base = np.arange(point_cloud.shape[0], dtype=int)
266 |     idx = np.arange(point_cloud.shape[0], dtype=int)
267 | 
268 |     # Initializing NearestNeighbors search and searching for all 'knn'
269 |     # neighboring points arround each point in 'arr'.
270 |     if verbose:
271 |         print(str(datetime.datetime.now()) + ' | >>> initializing \
272 | NearestNeighbors search and searching for all knn neighboring points \
273 | arround each point in arr')
274 |     nbrs = NearestNeighbors(n_neighbors=knn, metric='euclidean',
275 |                             leaf_size=15, n_jobs=-1).fit(point_cloud)
276 |     distances, indices = nbrs.kneighbors(point_cloud)
277 |     indices = indices.astype(int)
278 | 
279 |     # Initializing variables for current ids being processed (current_idx)
280 |     # and all ids already processed (processed_idx).
281 |     current_idx = ids
282 |     processed_idx = ids
283 | 
284 |     # Looping while there are still indices in current_idx to process.
285 |     if verbose:
286 |         print(str(datetime.datetime.now()) + ' | >>> looping while there \
287 | are still indices in current_idx to process')
288 |     iteration = 0
289 |     while (len(current_idx) > 0) & (iteration <= max_iter):
290 | 
291 |         # Selecting NearestNeighbors indices and distances for current
292 |         # indices being processed.
293 |         nn = indices[current_idx]
294 |         dd = distances[current_idx]
295 | 
296 |         # Masking out indices already contained in processed_idx.
297 |         mask1 = np.in1d(nn, processed_idx, invert=True).reshape(nn.shape)
298 |         # Masking neighboring points that are withing threshold distance.
299 |         mask2 = dd < nbrs_threshold
300 |         # mask1 AND mask2. This will mask only indices that are part of
301 |         # the graph and within threshold distance.
302 |         mask = np.logical_and(mask1, mask2)
303 | 
304 |         # Initializing temporary list of nearest neighbors. This list
305 |         # is latter used to accumulate points that will be added to
306 |         # processed points list.
307 |         nntemp = []
308 | 
309 |         # Looping over current indices's set of nn points and selecting
310 |         # knn points that hasn't been added/processed yet (mask1).
311 |         for i, (n, d) in enumerate(zip(nn, dd)):
312 |             nn_idx = n[mask[i]][1:]
313 | 
314 |             # Checking if current neighbor has an accumulated distance
315 |             # shorter than central node (n[0]) minus some distance based
316 |             # on nbrs_threshold. This penalisation aims to restrict potential
317 |             # neighbors to those more likely to be along an actual path. This
318 |             # would remove points placed along the sides of a path.
319 |             for ni in nn_idx:
320 |                 if D[ni] <= D[n[0]] - (nbrs_threshold / 3):
321 |                     nntemp.append(ni)
322 | 
323 |         # Obtaining an unique array of points currently being processed.
324 |         current_idx = np.unique(nntemp)
325 |         # Updating array of processed indices with indices processed within
326 |         # current iteration (current_idx).
327 |         processed_idx = np.append(processed_idx, current_idx)
328 |         processed_idx = np.unique(processed_idx).astype(int)
329 | 
330 |         # Generating list of remaining proints to process.
331 |         idx = idx_base[np.in1d(idx_base, processed_idx, invert=True)]
332 | 
333 |         # Increasing one iteration step.
334 |         iteration += 1
335 | 
336 |     # Just in case of not having detected all points in the desired paths, run
337 |     # another last iteration.
338 | 
339 |     # Getting NearestNeighbors indices and distance for all indices
340 |     # that remain to be processed.
341 |     idx2 = indices[idx]
342 |     dist2 = distances[idx]
343 | 
344 |     # Masking indices in idx2 that have already been processed. The
345 |     # idea is to connect remaining points to existing graph nodes.
346 |     mask1 = np.in1d(idx2, processed_idx).reshape(idx2.shape)
347 |     # Masking neighboring points that are withing threshold distance.
348 |     mask2 = dist2 < nbrs_threshold
349 |     # mask1 AND mask2. This will mask only indices that are part of
350 |     # the graph and within threshold distance.
351 |     mask = np.logical_and(mask1, mask2)
352 | 
353 |     # Getting unique array of indices that match the criteria from
354 |     # mask1 and mask2.
355 |     temp_idx = np.unique(np.where(mask)[0])
356 |     # Assigns remaining indices (idx) matched in temp_idx to
357 |     # current_idx.
358 |     n_idx = idx[temp_idx]
359 | 
360 |     # Selecting NearestNeighbors indices and distances for current
361 |     # indices being processed.
362 |     nn = indices[n_idx]
363 |     dd = distances[n_idx]
364 | 
365 |     # Masking points in nn that have already been processed.
366 |     # This is the oposite approach as above, where points that are
367 |     # still not in the graph are desired. Now, to make sure the
368 |     # continuity of the graph is kept, join current remaining indices
369 |     # to indices already in G.
370 |     mask = np.in1d(nn, processed_idx, invert=True).reshape(nn.shape)
371 | 
372 |     # Initializing temporary list of nearest neighbors. This list
373 |     # is latter used to accumulate points that will be added to
374 |     # processed points list.
375 |     nntemp = []
376 | 
377 |     # Looping over current indices's set of nn points and selecting
378 |     # knn points that have alreay been added/processed (mask).
379 |     # Also, to ensure continuity over next iteration, select another
380 |     # kpairs points from indices that haven't been processed (~mask).
381 |     if verbose:
382 |         print(str(datetime.datetime.now()) + ' | >>> looping over current \
383 | indicess set of nn points and selecting knn points that have alreay been \
384 | added/processed (mask)')
385 |     for i, n in enumerate(nn):
386 |         nn_idx = n[mask[i]][1:]
387 | 
388 |         # Checking if current neighbor has an accumulated distance
389 |         # shorter than central node (n[0]).
390 |         for ni in nn_idx:
391 |             if D[ni] <= D[n[0]] - (nbrs_threshold / 3):
392 |                 nntemp.append(ni)
393 | 
394 |         nn_idx = n[~mask[i]][1:]
395 | 
396 |         # Checking if current neighbor has an accumulated distance
397 |         # shorter than central node (n[0]).
398 |         for ni in nn_idx:
399 |             if D[ni] <= D[n[0]] - (nbrs_threshold / 3):
400 |                 nntemp.append(ni)
401 | 
402 |     current_idx = np.unique(nntemp)
403 | 
404 |     # Appending current_idx to processed_idx.
405 |     processed_idx = np.append(processed_idx, current_idx)
406 |     processed_idx = np.unique(processed_idx).astype(int)
407 | 
408 |     # Generating final path mask and setting processed indices as True.
409 |     path_mask = np.zeros(point_cloud.shape[0], dtype=bool)
410 |     path_mask[processed_idx] = True
411 | 
412 |     return path_mask
413 | 
414 | 
415 | def get_base(point_cloud, base_height):
416 | 
417 |     """
418 |     Get the base of a point cloud based on a certain height from the bottom.
419 | 
420 |     Parameters
421 |     ----------
422 |     point_cloud : array
423 |         Three-dimensional point cloud of a single tree to perform the
424 |         wood-leaf separation. This should be a n-dimensional array (m x n)
425 |         containing a set of coordinates (n) over a set of points (m).
426 |     base_height : float
427 |         Height of the base slice to mask.
428 | 
429 |     Returns
430 |     -------
431 |     mask : array
432 |         Base slice masked as True.
433 | 
434 |     """
435 | 
436 |     return point_cloud[:, 2] <= base_height
437 | 


--------------------------------------------------------------------------------
/tlseparation/classification/point_features.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | 
 30 | 
 31 | def curvature(arr, nbrs_idx):
 32 | 
 33 |     """
 34 |     Calculates pointwise curvature of a point cloud.
 35 | 
 36 |     Parameters
 37 |     ----------
 38 |     arr : array
 39 |         Three-dimensional (m x n) array of a point cloud, where the
 40 |         coordinates are represented in the columns (n) and the points are
 41 |         represented in the rows (m).
 42 |     nbr_idx : array
 43 |         N-dimensional array of indices from a nearest neighbors search of the
 44 |         point cloud in 'arr', where the rows (m) represents the points in
 45 |         'arr' and the columns represents the indices of the nearest neighbors
 46 |         from 'arr'.
 47 | 
 48 |     Returns
 49 |     -------
 50 |     c : numpy.ndarray
 51 |         1D (m x 1) array containing the curvature of each point in 'arr'.
 52 | 
 53 |     """
 54 | 
 55 |     # Allocating eigenvalues (evals) as array with shape n_points x 3 filled
 56 |     # with zeros.
 57 |     evals = np.zeros([arr.shape[0], 3], dtype=float)
 58 | 
 59 |     # Looping over each set of neighbors in nbrs_idx.
 60 |     for i, nids in enumerate(nbrs_idx):
 61 |         # Checking if local neighborhood of points contains more than 3
 62 |         # points. Otherwise, the calculation of eigenvalues/eigenvectors
 63 |         # is not possible.
 64 |         if arr[nids].shape[0] > 3:
 65 |             # Calculates ith eigenvalues using svd_evals.
 66 |             evals[i] = svd_evals(arr[nids])
 67 | 
 68 |     # Calculating curvature.
 69 |     c = evals[:, 2] / np.sum(evals, axis=1)
 70 | 
 71 |     return c
 72 | 
 73 | 
 74 | def knn_features(arr, nbr_idx, block_size=200000):
 75 | 
 76 |     """
 77 |     Calculates geometric descriptors: salient features and tensor features
 78 |     from an array and an indexing with fixed numbers of neighbors.
 79 | 
 80 |     Parameters
 81 |     ----------
 82 |     arr : array
 83 |         Three-dimensional (m x n) array of a point cloud, where the
 84 |         coordinates are represented in the columns (n) and the points are
 85 |         represented in the rows (m).
 86 |     nbr_idx : array
 87 |         N-dimensional array of indices from a nearest neighbors search of the
 88 |         point cloud in 'arr', where the rows (m) represents the points in
 89 |         'arr' and the columns represents the indices of the nearest neighbors
 90 |         from 'arr'.
 91 | 
 92 |     Returns
 93 |     -------
 94 |     features : array
 95 |         N-dimensional array (m x 6) of the calculated geometric descriptors.
 96 |         Where the rows (m) represent the points from 'arr' and the columns
 97 |         represents the features.
 98 | 
 99 |     """
100 | 
101 |     # Making sure block_size is limited by at most the number of points in
102 |     # arr.
103 |     if block_size > arr.shape[0]:
104 |         block_size = arr.shape[0]
105 | 
106 |     # Creating block of ids.
107 |     ids = np.arange(arr.shape[0])
108 |     ids = np.array_split(ids, int(arr.shape[0] / block_size))
109 | 
110 |     # Making sure nbr_idx has the correct data type.
111 |     nbr_idx = nbr_idx.astype(int)
112 | 
113 |     # Allocating s.
114 |     s = np.zeros([arr.shape[0], 3], dtype=float)
115 | 
116 |     # Looping over blocks of ids to calculating eigenvalues for the
117 |     # neighborhood around each point in arr.
118 |     for i in ids:
119 |         # Calculating the eigenvalues.
120 |         s[i] = knn_evals(arr[nbr_idx[i]])
121 | 
122 |     # Calculating the ratio of the eigenvalues.
123 |     ratio = (s.T / np.sum(s, axis=1)).T
124 | 
125 |     # Calculating the salient features and tensor features from the
126 |     # eigenvalues ratio.
127 |     features = calc_features(ratio)
128 | 
129 |     # Replacing the 'nan' values for 0.
130 |     features[np.isnan(features)] = 0
131 | 
132 |     return features
133 | 
134 | 
135 | def knn_evals(arr_stack):
136 | 
137 |     """
138 |     Calculates eigenvalues of a stack of arrays.
139 | 
140 |     Parameters
141 |     ----------
142 |     arr_stack : array
143 |         N-dimensional array (l x m x n) containing a stack of data, where the
144 |         rows (m) represents the points coordinates, the columns (n) represents
145 |         the axis coordinates and the layer (l) represents the stacks of points.
146 | 
147 |     Returns
148 |     -------
149 |     evals : array
150 |         N-dimensional array (l x n) of eigenvalues calculated from
151 |         'arr_stack'. The rows (l) represents the stack layers of points in
152 |         'arr_stack' and the columns (n) represent the parameters in
153 |         'arr_stack'.
154 | 
155 |     """
156 | 
157 |     # Calculating the covariance of the stack of arrays.
158 |     cov = vectorized_app(arr_stack)
159 | 
160 |     # Calculating the eigenvalues using Singular Value Decomposition (svd).
161 |     evals = np.linalg.svd(cov, compute_uv=False)
162 | 
163 |     return evals
164 | 
165 | 
166 | def calc_features(e):
167 | 
168 |     """
169 |     Calculates the geometric features using a set of eigenvalues, based on Ma
170 |     et al. [#]_ and Wang et al. [#]_.
171 | 
172 |     Parameters
173 |     ----------
174 |     e : array
175 |         N-dimensional array (m x 3) containing sets of 3 eigenvalues per
176 |         row (m).
177 | 
178 |     Returns
179 |     -------
180 |     features : array
181 |         N-dimensional array (m x 6) containing the calculated geometric
182 |         features from 'e'.
183 | 
184 |     References
185 |     ----------
186 |     ..  [#] Ma et al., 2015. Improved Salient Feature-Based Approach for
187 |             Automatically Separating Photosynthetic and Nonphotosynthetic
188 |             Components Within Terrestrial Lidar Point Cloud Data of Forest
189 |             Canopies.
190 |     ..  [#] Wang et al., 2015. A Multiscale and Hierarchical Feature Extraction
191 |             Method for Terrestrial Laser Scanning Point Cloud Classification.
192 | 
193 |     """
194 | 
195 |     # Calculating salient features.
196 |     e1 = e[:, 2]
197 |     e2 = e[:, 0] - e[:, 1]
198 |     e3 = e[:, 1] - e[:, 2]
199 | 
200 |     # Calculating tensor features.
201 |     t1 = (e[:, 1] - e[:, 2]) / e[:, 0]
202 |     t2 = ((e[:, 0] * np.log(e[:, 0])) + (e[:, 1] * np.log(e[:, 1])) +
203 |           (e[:, 2] * np.log(e[:, 2])))
204 |     t3 = (e[:, 0] - e[:, 1]) / e[:, 0]
205 | 
206 |     return np.vstack(([e1, e2, e3, t1, t2, t3])).T
207 | 
208 | 
209 | def vectorized_app(arr_stack):
210 | 
211 |     """
212 |     Function to calculate the covariance of a stack of arrays. This function
213 |     uses einstein summation to make the covariance calculation more efficient.
214 |     Based on a reply from the user Divakar [#]_ at stackoverflow.
215 | 
216 |     Parameters
217 |     ----------
218 |     arr_stack : array
219 |         N-dimensional array (l x m x n) containing a stack of data, where the
220 |         rows (m) represents the points coordinates, the columns (n) represents
221 |         the axis coordinates and the layer (l) represents the stacks of
222 |         points.
223 | 
224 |     Returns
225 |     -------
226 |     cov : array
227 |         N-dimensional array (l x n x n) of covariance values calculated from
228 |         'arr_stack'. Each layer (l) contains a (n x n) covariance matrix
229 |         calculated from the layers (l) in 'arr_stack'.
230 | 
231 |     References
232 |     ----------
233 |     ..  [#] Divakar, 2016. http://stackoverflow.com/questions/35756952/\
234 | quickly-compute-eigenvectors-for-each-element-of-an-array-in-\
235 | python.
236 | 
237 |     """
238 | 
239 |     # Centralizing the data around the mean.
240 |     diffs = arr_stack - arr_stack.mean(1, keepdims=True)
241 | 
242 |     # Using the einstein summation of the centered data in regard to the array
243 |     # stack shape to return the covariance of each array in the stack.
244 |     return np.einsum('ijk,ijl->ikl', diffs, diffs)/arr_stack.shape[1]
245 | 
246 | 
247 | def svd_evals(arr):
248 | 
249 |     """
250 |     Calculates eigenvalues of an array using SVD.
251 | 
252 |     Parameters
253 |     ----------
254 |     arr : array
255 |         nxm numpy.ndarray where n is the number of samples and m is the number
256 |         of dimensions.
257 | 
258 |     Returns
259 |     -------
260 |     evals : array
261 |         1xm numpy.ndarray containing the calculated eigenvalues in decrescent
262 |         order.
263 | 
264 |     """
265 | 
266 |     # Calculating centroid coordinates of points in 'arr'.
267 |     centroid = np.average(arr, axis=0)
268 | 
269 |     # Running SVD on centered points from 'arr'.
270 |     _, evals, evecs = np.linalg.svd(arr - centroid, full_matrices=False)
271 | 
272 |     return evals
273 | 


--------------------------------------------------------------------------------
/tlseparation/classification/wlseparation.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | 
 29 | import numpy as np
 30 | import pandas as pd
 31 | from sklearn.neighbors import NearestNeighbors
 32 | from ..utility.knnsearch import set_nbrs_knn
 33 | from ..classification.point_features import knn_features
 34 | from ..classification.gmm import (classify, class_select_abs,
 35 |                                   class_select_ref)
 36 | 
 37 | 
 38 | def fill_class(arr1, arr2, noclass, k):
 39 | 
 40 |     """
 41 |     Assigns noclass entries to either arr1 or arr2, depending on
 42 |     neighborhood majority analisys.
 43 | 
 44 |     Parameters
 45 |     ----------
 46 |     arr1 : array
 47 |         Point coordinates for entries of the first class.
 48 |     arr2 : array
 49 |         Point coordinates for entries of the second class.
 50 |     noclass : array
 51 |         Point coordinates for noclass entries.
 52 |     k : int
 53 |         Number of neighbors to use in the neighborhood majority analysis.
 54 | 
 55 |     Returns
 56 |     -------
 57 |     arr1 : array
 58 |         Point coordinates for entries of the first class.
 59 |     arr2 : array
 60 |         Point coordinates for entries of the second class.
 61 | 
 62 |     """
 63 | 
 64 |     # Stacking arr1 and arr2. This will be fitted in the NearestNeighbors
 65 |     # search in order to define local majority and assign classes to
 66 |     # noclass.
 67 |     arr = np.vstack((arr1, arr2))
 68 | 
 69 |     # Generating classes labels with the same shapes as arr1, arr2 and,
 70 |     # after stacking, arr.
 71 |     class_1 = np.full(arr1.shape[0], 1, dtype=np.int)
 72 |     class_2 = np.full(arr2.shape[0], 2, dtype=np.int)
 73 |     classes = np.hstack((class_1, class_2)).T
 74 | 
 75 |     # Performin NearestNeighbors search to detect local sets of points.
 76 |     nbrs = NearestNeighbors(leaf_size=25, n_jobs=-1).fit(arr)
 77 |     indices = nbrs.kneighbors(noclass, n_neighbors=k, return_distance=False)
 78 | 
 79 |     # Allocating output variable.
 80 |     new_class = np.zeros(noclass.shape[0])
 81 | 
 82 |     # Selecting subset of classes based on the neighborhood expressed by
 83 |     # indices.
 84 |     class_ = classes[indices]
 85 | 
 86 |     # Looping over all points in indices.
 87 |     for i in range(len(indices)):
 88 | 
 89 |         # Counting the number of occurrences of each value in the ith instance
 90 |         # of class_.
 91 |         unique, count = np.unique(class_[i, :], return_counts=True)
 92 |         # Appending the majority class into the output variable.
 93 |         new_class[i] = unique[np.argmax(count)]
 94 | 
 95 |     # Stacking new points to arr1 and arr2.
 96 |     arr1 = np.vstack((arr1, noclass[new_class == 1]))
 97 |     arr2 = np.vstack((arr2, noclass[new_class == 2]))
 98 | 
 99 |     # Making sure all points were processed and assigned a class.
100 |     assert ((arr1.shape[0] + arr2.shape[0]) ==
101 |             (arr.shape[0] + noclass.shape[0]))
102 | 
103 |     return arr1, arr2
104 | 
105 | 
106 | def wlseparate_ref_voting(arr, knn_lst, class_file, n_classes=3):
107 | 
108 |     """
109 |     Classifies a point cloud (arr) into two main classes, wood and leaf.
110 |     Altough this function does not output a noclass category, it still
111 |     filters out results based on classification confidence interval in the
112 |     voting process (if lower than prob_threshold, then voting is not used
113 |     for current point and knn value).
114 | 
115 |     The final class selection is based a voting scheme applied to a similar
116 |     approach of wlseparate_ref. In this case, the function iterates over a
117 |     series of knn values and apply the reference distance criteria to select
118 |     wood and leaf classes.
119 | 
120 |     Each knn class result is accumulated in a list and in the end a voting
121 |     is applied. For each point, if the number of times it was classified as
122 |     wood is larger than threhsold, the final class is set to wood. Otherwise
123 |     it is set as leaf.
124 | 
125 |     Class selection will mask points according to their class mean distance
126 |     to reference classes. The closes reference class gets assignes to each
127 |     intermediate class.
128 | 
129 |     Parameters
130 |     ----------
131 |     arr : array
132 |         Three-dimensional point cloud of a single tree to perform the
133 |         wood-leaf separation. This should be a n-dimensional array (m x n)
134 |         containing a set of coordinates (n) over a set of points (m).
135 |     knn_lst : list
136 |         List of knn values to use in the search to constitue local subsets of
137 |         points around each point in 'arr'. It can be a single knn value, as
138 |         long as it has list data type.
139 |     class_file : pandas dataframe or str
140 |         Dataframe or path to reference classes file.
141 |     n_classes : int
142 |         Number of classes to use in the Gaussian Mixture Classification.
143 | 
144 |     Returns
145 |     -------
146 |     class_dict : dict
147 |         Dictionary containing indices for all classes in class_ref. Classes
148 |         are labeled according to classes names in class_file.
149 |     count_dict : dict
150 |         Dictionary containin votes count for all classes in class_ref. Classes
151 |         are labeled according to classes names in class_file.
152 |     prob_dict : dict
153 |         Dictionary containing probabilities for all classes in class_ref.
154 |         Classes are labeled according to classes names in class_file.
155 | 
156 |     """
157 | 
158 |     # Making sure 'knn_lst' is of list type.
159 |     if type(knn_lst) != list:
160 |         knn_lst = [knn_lst]
161 | 
162 |     # Initializing voting accumulator and class probability arrays.
163 |     vt = np.full([arr.shape[0], len(knn_lst)], -1, dtype=int)
164 |     prob = np.full([arr.shape[0], len(knn_lst)], -1, dtype=float)
165 | 
166 |     # Generating a base set of indices and distances around each point.
167 |     # This step uses the largest value in knn_lst to make further searches,
168 |     # with smaller values of knn, more efficient.
169 |     idx_base = set_nbrs_knn(arr, arr, np.max(knn_lst), return_dist=False)
170 | 
171 |     # Reading in class reference values from file.
172 |     if isinstance(class_file, str):
173 |         class_table = pd.read_csv(class_file)
174 |         print(class_table)
175 |     elif isinstance(class_file, pd.core.frame.DataFrame):
176 |         class_table = class_file
177 |     else:
178 |         raise Exception('class file should be a pandas dataframe or file path')
179 |     class_ref = np.asarray(class_table.iloc[:, 1:]).astype(float)
180 | 
181 |     # Looping over values of knn in knn_lst.
182 |     for i, k in enumerate(knn_lst):
183 |         # Subseting indices and distances based on initial knn search and
184 |         # current knn value (k).
185 |         idx_1 = idx_base[:, :k+1]
186 | 
187 |         # Calculating the geometric descriptors.
188 |         gd_1 = knn_features(arr, idx_1)
189 | 
190 |         # Classifying the points based on the geometric descriptors.
191 |         classes_1, cm_1, proba_1 = classify(gd_1, n_classes)
192 |         cm_1 = ((cm_1 - np.min(cm_1, axis=0)) /
193 |                 (np.max(cm_1, axis=0) - np.min(cm_1, axis=0)))
194 | 
195 |         # Selecting which classes represent classes from classes reference
196 |         # file.
197 |         new_classes = class_select_ref(classes_1, cm_1, class_ref)
198 | 
199 |         # Appending results to vt temporary list.
200 |         vt[:, i] = new_classes.astype(int)
201 |         prob[:, i] = np.max(proba_1, axis=1)
202 | 
203 |     # Performing the voting scheme (majority selection) for each point.
204 |     # Initializing final_* variables to store class number, vote counts and
205 |     # class provability.
206 |     final_class = np.full([arr.shape[0]], -1, dtype=int)
207 |     final_count = np.full([arr.shape[0]], -1, dtype=int)
208 |     final_prob = np.full([arr.shape[0]], -1, dtype=float)
209 |     # Iterating over class votes (vt) and their probabilities (prob).
210 |     for i, (v, p) in enumerate(zip(vt, prob)):
211 |         # Counting votes of each class.
212 |         unique, count = np.unique(v, return_counts=True)
213 |         # Appending to final_* arrays the most voted class, the total number
214 |         # of votes this class received and it's classficiation probability.
215 |         final_class[i] = unique[np.argmax(count)]
216 |         final_count[i] = count[np.argmax(count)]
217 |         # Masking entries that received a vote for the most voted class.
218 |         final_class_mask = v == final_class[i]
219 |         # Averaging over all classification probabilities for all votes of
220 |         # the most voted class.
221 |         final_prob[i] = np.mean(p[final_class_mask])
222 | 
223 |     # Selecting classes labels from entries in class_ref.
224 |     # Generating indices array to help in future indexing.
225 |     idx = np.arange(arr.shape[0], dtype=int)
226 |     # Initializing dictionaires for output variables.
227 |     class_dict = {}
228 |     count_dict = {}
229 |     prob_dict = {}
230 |     # Looping over each unique class in final_class.
231 |     for c in np.unique(final_class).astype(int):
232 |         # Selecting all indices for points that were classfied as
233 |         # belonging to current class.
234 |         class_idx = idx[final_class == c]
235 |         # Selecting all vote counts for points that were classfied as
236 |         # belonging to current class. Only gets votes of most voted class for
237 |         # each point.
238 |         class_count = final_count[final_class == c]
239 |         # Selecting all classification probabilities for points that were
240 |         # classfied as belonging to current class. Only gets probability of
241 |         # most voted class for each point.
242 |         class_prob = final_prob[final_class == c]
243 |         # Assigining current class indices, votes and probability to
244 |         # output dictionaries. Current key name is set as selected class name
245 |         # from class_ref.
246 |         class_dict[class_table.iloc[c, :]['class']] = class_idx
247 |         count_dict[class_table.iloc[c, :]['class']] = class_count
248 |         prob_dict[class_table.iloc[c, :]['class']] = class_prob
249 | 
250 |     return class_dict, count_dict, prob_dict
251 | 
252 | 
253 | def wlseparate_abs(arr, knn, knn_downsample=1, n_classes=3):
254 | 
255 |     """
256 |     Classifies a point cloud (arr) into three main classes, wood, leaf and
257 |     noclass.
258 | 
259 |     The final class selection is based on the absolute value of the last
260 |     geometric feature (see point_features module).
261 |     Points will be only classified as wood or leaf if their classification
262 |     probability is higher than prob_threshold. Otherwise, points are
263 |     assigned to noclass.
264 | 
265 |     Class selection will mask points with feature value larger than a given
266 |     threshold as wood and the remaining points as leaf.
267 | 
268 |     Parameters
269 |     ----------
270 |     arr : array
271 |         Three-dimensional point cloud of a single tree to perform the
272 |         wood-leaf separation. This should be a n-dimensional array (m x n)
273 |         containing a set of coordinates (n) over a set of points (m).
274 |     knn : int
275 |         Number of nearest neighbors to search to constitue the local subset of
276 |         points around each point in 'arr'.
277 |     knn_downsample : float
278 |         Downsample factor (0, 1) for the knn parameter. If less than 1, a
279 |         sample of size (knn * knn_downsample) will be selected from the
280 |         nearest neighbors indices. This option aims to maintain the spatial
281 |         representation of the local subsets of points, but reducing overhead
282 |         in memory and processing time.
283 |     n_classes : int
284 |         Number of classes to use in the Gaussian Mixture Classification.
285 | 
286 |     Returns
287 |     -------
288 |     class_indices : dict
289 |         Dictionary containing indices for wood and leaf classes.
290 |     class_probability : dict
291 |         Dictionary containing probabilities for wood and leaf classes.
292 | 
293 |     """
294 | 
295 |     # Generating the indices array of the 'k' nearest neighbors (knn) for all
296 |     # points in arr.
297 |     idx_1 = set_nbrs_knn(arr, arr, knn, return_dist=False)
298 | 
299 |     # If downsample fraction value is set to lower than 1. Apply downsampling
300 |     # on knn indices.
301 |     if knn_downsample < 1:
302 |         n_samples = np.int(idx_1.shape[1] * knn_downsample)
303 |         idx_f = np.zeros([idx_1.shape[0], n_samples + 1])
304 |         idx_f[:, 0] = idx_1[:, 0]
305 |         for i in range(idx_f.shape[0]):
306 |             idx_f[i, 1:] = np.random.choice(idx_1[i, 1:], n_samples,
307 |                                             replace=False)
308 |         idx_1 = idx_f.astype(int)
309 | 
310 |     # Calculating geometric descriptors.
311 |     gd_1 = knn_features(arr, idx_1)
312 | 
313 |     # Classifying the points based on the geometric descriptors.
314 |     classes_1, cm_1, proba_1 = classify(gd_1, n_classes)
315 | 
316 |     # Selecting which classes represent wood and leaf. Wood classes are masked
317 |     # as True and leaf classes as False.
318 |     mask_1 = class_select_abs(classes_1, cm_1, idx_1)
319 | 
320 |     # Generating set of indices of entries in arr. This will be part of the
321 |     # output.
322 |     arr_ids = np.arange(0, arr.shape[0], 1, dtype=int)
323 | 
324 |     # Creating output class indices dictionary and class probabilities
325 |     # dictionary.
326 |     # mask represent wood points, (~) not mask represent leaf points.
327 |     class_indices = {}
328 |     class_probability = {}
329 |     try:
330 |         class_indices['wood'] = arr_ids[mask_1]
331 |         class_probability['wood'] = np.max(proba_1, axis=1)[mask_1]
332 |     except:
333 |         class_indices['wood'] = []
334 |         class_probability['wood'] = []
335 |     try:
336 |         class_indices['leaf'] = arr_ids[~mask_1]
337 |         class_probability['leaf'] = np.max(proba_1, axis=1)[~mask_1]
338 |     except:
339 |         class_indices['leaf'] = []
340 |         class_probability['leaf'] = []
341 | 
342 |     return class_indices, class_probability
343 | 


--------------------------------------------------------------------------------
/tlseparation/scripts/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | from .automated_separation import large_tree_3, large_tree_4, generic_tree, nopath_generic_tree
29 | from .post_processing import isolated_clusters
30 | 


--------------------------------------------------------------------------------
/tlseparation/scripts/automated_separation.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari", "Phil Wilkes"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | import datetime
 30 | from ..classification import (wlseparate_abs, wlseparate_ref_voting,
 31 |                               threshold_classification,
 32 |                               reference_classification, path_detect_frequency,
 33 |                               voxel_path_detection, get_base, DefaultClass)
 34 | from ..utility import (get_diff, remove_duplicates, radius_filter,
 35 |                        class_filter, cluster_filter, continuity_filter,
 36 |                        feature_filter, plane_filter,
 37 |                        detect_nn_dist, cluster_features, cluster_size,
 38 |                        connected_component)
 39 | 
 40 | 
 41 | def large_tree_3(arr, class_file=[], knn_lst=[20, 40, 60, 80], gmm_nclasses=4,
 42 |                  class_prob_threshold=0.95, cont_filt=True, cf_rad=None,
 43 |                  verbose=False):
 44 | 
 45 |     """
 46 |     Run an automated separation of a single tree point cloud.
 47 | 
 48 |     Parameters
 49 |     ----------
 50 |     arr : array
 51 |         Three-dimensional point cloud of a single tree to perform the
 52 |         wood-leaf separation. This should be a n-dimensional array (m x n)
 53 |         containing a set of coordinates (n) over a set of points (m).
 54 |     class_file : str
 55 |         Path to classes reference values file. This file will be loaded and
 56 |         its reference values are used to select wood and leaf classes.
 57 |     knn_lst: list
 58 |         Set of knn values to use in the neighborhood search in classification
 59 |         steps. This variable will be directly used in a step containing
 60 |         the function wlseparate_ref_voting  and its minimum value will be used
 61 |         in another step containing wlseparate_abs (both from
 62 |         classification.wlseparate). These values are directly dependent of
 63 |         point density and were defined based on a medium point density
 64 |         scenario (mean distance between points aroun 0.05m). Therefore, for
 65 |         higher density point clouds it's recommended the use of larger knn
 66 |         values for optimal results.
 67 |     gmm_nclasses: int
 68 |         Number of classes to use in Gaussian Mixture Classification. Default
 69 |         is 4.
 70 |     cont_filt : boolean
 71 |         Option to select if continuity_filter should be applied to wood and
 72 |         leaf point clouds. Default is True.
 73 |     class_prob_threshold : float
 74 |         Classification probability threshold to filter classes. This aims to
 75 |         avoid selecting points that are not confidently enough assigned to
 76 |         any given class. Default is 0.95.
 77 |     cf_rad : float
 78 |         Continuity filter search radius.
 79 |     verbose : bool
 80 |         Option to set (or not) verbose output.
 81 | 
 82 |     Returns
 83 |     -------
 84 |     wood_final : array
 85 |         Wood point cloud.
 86 |     leaf_final : array
 87 |         Leaf point cloud.
 88 | 
 89 |     """
 90 | 
 91 |     # Checking input class_file, if it's an empty list, use default values.
 92 |     if len(class_file) == 0:
 93 |         class_file = DefaultClass().ref_table
 94 | 
 95 |     ###########################################################################
 96 |     # Making sure input array has only 3 dimensions and no duplicated points.
 97 |     if verbose:
 98 |         print(str(datetime.datetime.now()) + ' | removing duplicates')
 99 |     arr = remove_duplicates(arr[:, :3])
100 | 
101 |     # Calculating recommended distance between neighboring points.
102 |     if verbose:
103 |         print(str(datetime.datetime.now()) + ' | calculating recommended \
104 | distance between neighboring points')
105 |     nndist = detect_nn_dist(arr, 10, 0.5)
106 |     # Checking if no input was given to cf_rad and if so, calculate it from
107 |     # nndist.
108 |     if cf_rad is None:
109 |         cf_rad = nndist * 0.66
110 | 
111 |     if verbose:
112 |         print(str(datetime.datetime.now()) + ' | nndist: %s' % nndist)
113 | 
114 |     # Setting up knn value based on the minimum value from knn_lst.
115 |     knn = np.min(knn_lst)
116 | 
117 |     ###########################################################################
118 |     # Obtaining mask of points from a slice of points located at the base of
119 |     # the tree.
120 |     if verbose:
121 |         print(str(datetime.datetime.now()) + ' | obtaining mask of points \
122 | from a slice of points located at the base of the tree')
123 | 
124 |     try:
125 |         base_mask = get_base(arr, 0.5)
126 |         base_ids = np.where(base_mask)[0]
127 |     except:
128 |         base_ids = []
129 |         print('Failed to obtain base_mask.')
130 | 
131 |     # Masking points most likely to be part of the trunk and larger branches.
132 |     if verbose:
133 |         print(str(datetime.datetime.now()) + ' | masking points most likely \
134 | to be part of the trunk and larger branches')
135 |     try:
136 |         trunk_mask = voxel_path_detection(arr, 0.1, 40, 100, 0.15, True)
137 |         # Obtaining indices of points that are part of the trunk (trunk_ids)
138 |         # and not part of the trunk (not_trunk_ids).
139 |         # trunk.
140 |         trunk_ids = np.where(trunk_mask)[0].astype(int)
141 |         not_trunk_ids = np.where(~trunk_mask)[0].astype(int)
142 |     except:
143 |         trunk_ids = []
144 |         print('Failed to obtain trunk_mask.')
145 | 
146 |     ###########################################################################
147 |     try:
148 |         if verbose:
149 |             print(str(datetime.datetime.now()) + ' | performing absolute \
150 | threshold separation on points not detected as trunk (not_trunk_ids)')
151 |         # Performing absolute threshold separation on points not detected
152 |         # as trunk (not_trunk_ids).
153 |         ids_1, prob_1 = wlseparate_abs(arr[not_trunk_ids], knn,
154 |                                        n_classes=gmm_nclasses)
155 | 
156 |         # Obtaining wood_1 ids and classification probability.
157 |         if verbose:
158 |             print(str(datetime.datetime.now()) + ' | obtaining wood_1 ids \
159 | and classification probability')
160 |         wood_1_mask = not_trunk_ids[ids_1['wood']]
161 |         wood_1_prob = prob_1['wood']
162 |         # Filtering out points that were classified with a probability lower
163 |         # than class_prob_threshold.
164 |         if verbose:
165 |             print(str(datetime.datetime.now()) + ' | filtering out points \
166 | that were classified with a probability lower than class_prob_threshold')
167 |         wood_1 = wood_1_mask[wood_1_prob >= class_prob_threshold]
168 | 
169 |         try:
170 |             # Applying class_filter to remove wood_1 points that are more
171 |             # likely to be part of a leaf point cloud (not_wood_1).
172 |             if verbose:
173 |                 print(str(datetime.datetime.now()) + ' | \
174 | applying class_filter to remove wood_1 points that are more likely to be \
175 | part of a leaf point cloud (not_wood_1)')
176 |             # Setting up a boolean mask of wood_1 and not_wood_1 points.
177 |             wood_1_bool = np.zeros(arr.shape[0], dtype=bool)
178 |             wood_1_bool[wood_1] = True
179 | 
180 |             # Obtaining wood_1 filtered point indices.
181 |             if verbose:
182 |                 print(str(datetime.datetime.now()) + ' | obtaining wood_1 \
183 | filtered point indices')
184 |             wood_1_1_mask, _ = class_filter(arr[wood_1_bool],
185 |                                             arr[~wood_1_bool], 0, knn=10)
186 |             wood_1_1_mask = np.where(wood_1_1_mask)[0]
187 |             wood_1_1 = wood_1[wood_1_1_mask]
188 |         except:
189 |             wood_1_1 = wood_1
190 | 
191 |     except:
192 |         # In case absolute threshold separation fails, set wood_1_1 as an
193 |         # empty list.
194 |         wood_1_1 = []
195 |         if verbose:
196 |             print(str(datetime.datetime.now()) + ' | absolute threshold \
197 | separation failed, setting wood_1_1 as an empty list')
198 |     ###########################################################################
199 |     try:
200 |         # Performing reference class voting separation on the whole input point
201 |         # cloud.
202 |         # Running reference class voting separation.
203 |         if verbose:
204 |             print(str(datetime.datetime.now()) + ' | running reference class \
205 | voting separation')
206 |         ids_2, count_2, prob_2 = wlseparate_ref_voting(arr[not_trunk_ids],
207 |                                                        knn_lst, class_file,
208 |                                                        n_classes=gmm_nclasses)
209 | 
210 |         # Obtaining indices and classification probabilities for classes
211 |         # twig and trunk (both components of wood points).
212 |         twig_2_mask = not_trunk_ids[ids_2['twig']]
213 |         twig_2_prob = prob_2['twig']
214 | 
215 |         # Masking twig and trunk classes by classification probability
216 |         # threshold.
217 |         twig_2_prob_mask = twig_2_prob >= class_prob_threshold
218 | 
219 |         # Obtaining twig_2 and trunk_2 vote counts, which are the number of
220 |         # votes that each point in twig_2 and trunk_2 received to be
221 |         # classified as such.
222 |         twig_2_count = count_2['twig']
223 | 
224 |         # Filtering twig_2 and trunk_2 by a minimun number of votes. Point
225 |         # indices with number of votes smaller than the defined threshold
226 |         # are left out.
227 |         twig_2 = twig_2_mask[twig_2_count >= 2][twig_2_prob_mask]
228 | 
229 |         try:
230 |             # Applying class_filter on filtered twig point cloud.
231 |             if verbose:
232 |                 print(str(datetime.datetime.now()) + ' | applying \
233 | class_filter on filtered twig point cloud')
234 |             # Setting up a boolean mask of twig_2 and not_twig_2 points.
235 |             twig_2_bool = np.zeros(arr.shape[0], dtype=bool)
236 |             twig_2_bool[twig_2] = True
237 |             twig_2_1_mask, _ = class_filter(arr[twig_2_bool],
238 |                                             arr[~twig_2_bool], 0, knn=10)
239 |             twig_2_1_mask = np.where(twig_2_1_mask)[0]
240 |             twig_2_1 = twig_2[twig_2_1_mask]
241 | 
242 |             # Applying radius_filter on filtered twig point cloud.
243 |             if verbose:
244 |                 print(str(datetime.datetime.now()) + ' | applying \
245 | radius_filter on filtered twig point cloud')
246 |             twig_2_2_mask = radius_filter(arr[twig_2_1], 0.05, 5)
247 |             twig_2_2 = twig_2_1[twig_2_2_mask]
248 | 
249 |         except:
250 |             twig_2_2 = twig_2
251 | 
252 |     except:
253 |         # In case voting separation fails, set twig_2_2 as an empty list.
254 |         twig_2_2 = []
255 |         if verbose:
256 |             print(str(datetime.datetime.now()) + ' | reference class \
257 | separation failed, setting twig_2_2 as an empty list')
258 |     ###########################################################################
259 |     # Stacking all clouds part of the wood portion.
260 |     wood_ids = np.hstack((base_ids, trunk_ids, twig_2_2, wood_1_1))
261 |     wood_ids = np.unique(wood_ids).astype(int)
262 | 
263 |     # Selecting initial set of wood and leaf points.
264 |     wood = arr[wood_ids]
265 | 
266 |     ###########################################################################
267 | 
268 |     # Applying path filter to remove small clusters of leaves at the tips of
269 |     # the branches.
270 |     if verbose:
271 |          print(str(datetime.datetime.now()) + ' | running path filtering \
272 | on wood points')
273 |     try:
274 |         path_filter_mask = voxel_path_detection(wood, 0.1, 8, 100, 0.15,
275 |                                                 verbose=True)
276 |         wood_filt_1 = wood[path_filter_mask]
277 |         leaf_filt_1 = get_diff(arr, wood_filt_1)
278 |     except:
279 |         if verbose:
280 |              print(str(datetime.datetime.now()) + ' | failed running path \
281 | filtering')
282 |         wood_filt_1 = wood
283 |         leaf_filt_1 = get_diff(arr, wood_filt_1)
284 |     ###########################################################################
285 |     if cont_filt:
286 |         # Applying continuity filter in an attempt to close gaps in the wood
287 |         # point cloud (i.e. misclassified leaf points in between portions of
288 |         # wood points).
289 |         if verbose:
290 |             print(str(datetime.datetime.now()) + ' | applying continuity \
291 | filter in an attempt to close gaps in the wood point cloud')
292 |         try:
293 |             wood_filt_2, leaf_filt_2 = continuity_filter(wood_filt_1,
294 |                                                          leaf_filt_1,
295 |                                                          rad=cf_rad)
296 | 
297 |             # Applying path filter agin to clean up data after continuity filter.
298 |             if verbose:
299 |                  print(str(datetime.datetime.now()) + ' | running path \
300 | filtering on wood points')
301 |             try:
302 |                 path_filter_mask_2 = voxel_path_detection(wood_filt_2, 0.1,
303 |                                                           4, 100, 0.15,
304 |                                                           verbose=True)
305 |                 wood_filt_2 = wood_filt_2[path_filter_mask_2]
306 |             except:
307 |                 if verbose:
308 |                      print(str(datetime.datetime.now()) + ' | failed running \
309 | path filtering')
310 |                 wood_filt_2 = wood_filt_2
311 | 
312 |         except:
313 |             wood_filt_2 = wood_filt_1
314 |     else:
315 |         wood_filt_2 = wood_filt_1
316 | 
317 |     ###########################################################################
318 |     # After filtering wood points, add back smaller branches to fill in
319 |     # the tips lost by path filtering.
320 |     wood_final = np.vstack((wood_filt_2, arr[wood_1_1]))
321 |     wood_final = remove_duplicates(wood_final)
322 |     # Obtaining leaf point cloud from the difference between input cloud 'arr'
323 |     # and wood points.
324 |     leaf_final = get_diff(arr, wood_final)
325 | 
326 |     ###########################################################################
327 | 
328 |     return wood_final, leaf_final
329 | 
330 | 
331 | def large_tree_4(arr, class_file=[], knn_lst=[20, 40, 60, 80], gmm_nclasses=4,
332 |                  class_prob_threshold=0.95, cont_filt=True, cf_rad=None,
333 |                  verbose=False):
334 | 
335 |     """
336 |     Run an automated separation of a single tree point cloud.
337 | 
338 |     Parameters
339 |     ----------
340 |     arr : array
341 |         Three-dimensional point cloud of a single tree to perform the
342 |         wood-leaf separation. This should be a n-dimensional array (m x n)
343 |         containing a set of coordinates (n) over a set of points (m).
344 |     class_file : str
345 |         Path to classes reference values file. This file will be loaded and
346 |         its reference values are used to select wood and leaf classes.
347 |     knn_lst: list
348 |         Set of knn values to use in the neighborhood search in classification
349 |         steps. This variable will be directly used in a step containing
350 |         the function wlseparate_ref_voting  and its minimum value will be used
351 |         in another step containing wlseparate_abs (both from
352 |         classification.wlseparate). These values are directly dependent of
353 |         point density and were defined based on a medium point density
354 |         scenario (mean distance between points aroun 0.05m). Therefore, for
355 |         higher density point clouds it's recommended the use of larger knn
356 |         values for optimal results.
357 |     gmm_nclasses: int
358 |         Number of classes to use in Gaussian Mixture Classification. Default
359 |         is 4.
360 |     cont_filt : boolean
361 |         Option to select if continuity_filter should be applied to wood and
362 |         leaf point clouds. Default is True.
363 |     class_prob_threshold : float
364 |         Classification probability threshold to filter classes. This aims to
365 |         avoid selecting points that are not confidently enough assigned to
366 |         any given class. Default is 0.95.
367 |     cf_rad : float
368 |         Continuity filter search radius.
369 |     verbose : bool
370 |         Option to set (or not) verbose output.
371 | 
372 |     Returns
373 |     -------
374 |     wood_final : array
375 |         Wood point cloud.
376 |     leaf_final : array
377 |         Leaf point cloud.
378 | 
379 |     """
380 | 
381 |     # Checking input class_file, if it's an empty list, use default values.
382 |     if len(class_file) == 0:
383 |         class_file = DefaultClass().ref_table
384 | 
385 |     ###########################################################################
386 |     # Making sure input array has only 3 dimensions and no duplicated points.
387 |     if verbose:
388 |         print(str(datetime.datetime.now()) + ' | removing duplicates')
389 |     arr = remove_duplicates(arr[:, :3])
390 | 
391 |     # Calculating recommended distance between neighboring points.
392 |     if verbose:
393 |         print(str(datetime.datetime.now()) + ' | calculating recommended \
394 | distance between neighboring points')
395 |     nndist = detect_nn_dist(arr, 10, 0.5)
396 |     # Checking if no input was given to cf_rad and if so, calculate it from
397 |     # nndist.
398 |     if cf_rad is None:
399 |         cf_rad = nndist * 0.66
400 | 
401 |     if verbose:
402 |         print(str(datetime.datetime.now()) + ' | nndist: %s' % nndist)
403 | 
404 |     # Setting up knn value based on the minimum value from knn_lst.
405 |     knn = np.min(knn_lst)
406 | 
407 |     ###########################################################################
408 |     # Obtaining mask of points from a slice of points located at the base of
409 |     # the tree.
410 |     if verbose:
411 |         print(str(datetime.datetime.now()) + ' | obtaining mask of points \
412 | from a slice of points located at the base of the tree')
413 | 
414 |     try:
415 |         base_mask = get_base(arr, 0.5)
416 |         base_ids = np.where(base_mask)[0]
417 |     except:
418 |         base_ids = []
419 |         print('Failed to obtain base_mask.')
420 | 
421 |     # Masking points most likely to be part of the trunk and larger branches.
422 |     if verbose:
423 |         print(str(datetime.datetime.now()) + ' | masking points most likely \
424 | to be part of the trunk and larger branches')
425 |     try:
426 |         trunk_mask = voxel_path_detection(arr, 0.1, 40, 100, 0.15, True)
427 |         # Obtaining indices of points that are part of the trunk (trunk_ids)
428 |         # and not part of the trunk (not_trunk_ids).
429 |         # trunk.
430 |         trunk_ids = np.where(trunk_mask)[0].astype(int)
431 |         not_trunk_ids = np.where(~trunk_mask)[0].astype(int)
432 |     except:
433 |         trunk_ids = []
434 |         print('Failed to obtain trunk_mask.')
435 | 
436 |     ###########################################################################
437 |     try:
438 |         if verbose:
439 |             print(str(datetime.datetime.now()) + ' | performing absolute \
440 | threshold separation on points not detected as trunk (not_trunk_ids)')
441 |         # Performing absolute threshold separation on points not detected
442 |         # as trunk (not_trunk_ids).
443 |         ids_1, prob_1 = wlseparate_abs(arr[not_trunk_ids], knn,
444 |                                        n_classes=gmm_nclasses)
445 | 
446 |         # Obtaining wood_1 ids and classification probability.
447 |         if verbose:
448 |             print(str(datetime.datetime.now()) + ' | obtaining wood_1 ids \
449 | and classification probability')
450 |         wood_1_mask = not_trunk_ids[ids_1['wood']]
451 |         wood_1_prob = prob_1['wood']
452 |         # Filtering out points that were classified with a probability lower
453 |         # than class_prob_threshold.
454 |         if verbose:
455 |             print(str(datetime.datetime.now()) + ' | filtering out points \
456 | that were classified with a probability lower than class_prob_threshold')
457 |         wood_1 = wood_1_mask[wood_1_prob >= class_prob_threshold]
458 | 
459 |         try:
460 |             # Applying class_filter to remove wood_1 points that are more
461 |             # likely to be part of a leaf point cloud (not_wood_1).
462 |             if verbose:
463 |                 print(str(datetime.datetime.now()) + ' | \
464 | applying class_filter to remove wood_1 points that are more likely to be \
465 | part of a leaf point cloud (not_wood_1)')
466 |             # Setting up a boolean mask of wood_1 and not_wood_1 points.
467 |             wood_1_bool = np.zeros(arr.shape[0], dtype=bool)
468 |             wood_1_bool[wood_1] = True
469 | 
470 |             # Obtaining wood_1 filtered point indices.
471 |             if verbose:
472 |                 print(str(datetime.datetime.now()) + ' | obtaining wood_1 \
473 | filtered point indices')
474 |             wood_1_1_mask, _ = class_filter(arr[wood_1_bool],
475 |                                             arr[~wood_1_bool], 0, knn=10)
476 |             wood_1_1_mask = np.where(wood_1_1_mask)[0]
477 |             wood_1_1 = wood_1[wood_1_1_mask]
478 |         except:
479 |             wood_1_1 = wood_1
480 | 
481 |     except:
482 |         # In case absolute threshold separation fails, set wood_1_1 as an
483 |         # empty list.
484 |         wood_1_1 = []
485 |         if verbose:
486 |             print(str(datetime.datetime.now()) + ' | absolute threshold \
487 | separation failed, setting wood_1_1 as an empty list')
488 |     ###########################################################################
489 |     try:
490 |         # Performing reference class voting separation on the whole input point
491 |         # cloud.
492 |         # Running reference class voting separation.
493 |         if verbose:
494 |             print(str(datetime.datetime.now()) + ' | running reference class \
495 | voting separation')
496 |         ids_2, count_2, prob_2 = wlseparate_ref_voting(arr[not_trunk_ids],
497 |                                                        knn_lst, class_file,
498 |                                                        n_classes=gmm_nclasses)
499 | 
500 |         # Obtaining indices and classification probabilities for classes
501 |         # twig and trunk (both components of wood points).
502 |         twig_2_mask = not_trunk_ids[ids_2['twig']]
503 |         twig_2_prob = prob_2['twig']
504 | 
505 |         # Masking twig and trunk classes by classification probability
506 |         # threshold.
507 |         twig_2_prob_mask = twig_2_prob >= class_prob_threshold
508 | 
509 |         # Obtaining twig_2 and trunk_2 vote counts, which are the number of
510 |         # votes that each point in twig_2 and trunk_2 received to be
511 |         # classified as such.
512 |         twig_2_count = count_2['twig']
513 | 
514 |         # Filtering twig_2 and trunk_2 by a minimun number of votes. Point
515 |         # indices with number of votes smaller than the defined threshold
516 |         # are left out.
517 |         twig_2 = twig_2_mask[twig_2_count >= 2][twig_2_prob_mask]
518 | 
519 |         try:
520 |             # Applying class_filter on filtered twig point cloud.
521 |             if verbose:
522 |                 print(str(datetime.datetime.now()) + ' | applying \
523 | class_filter on filtered twig point cloud')
524 |             # Setting up a boolean mask of twig_2 and not_twig_2 points.
525 |             twig_2_bool = np.zeros(arr.shape[0], dtype=bool)
526 |             twig_2_bool[twig_2] = True
527 |             twig_2_1_mask, _ = class_filter(arr[twig_2_bool],
528 |                                             arr[~twig_2_bool], 0, knn=10)
529 |             twig_2_1_mask = np.where(twig_2_1_mask)[0]
530 |             twig_2_1 = twig_2[twig_2_1_mask]
531 | 
532 |             # Applying radius_filter on filtered twig point cloud.
533 |             if verbose:
534 |                 print(str(datetime.datetime.now()) + ' | applying \
535 | radius_filter on filtered twig point cloud')
536 |             twig_2_2_mask = radius_filter(arr[twig_2_1], 0.05, 5)
537 |             twig_2_2 = twig_2_1[twig_2_2_mask]
538 | 
539 |         except:
540 |             twig_2_2 = twig_2
541 | 
542 |     except:
543 |         # In case voting separation fails, set twig_2_2 as an empty list.
544 |         twig_2_2 = []
545 |         if verbose:
546 |             print(str(datetime.datetime.now()) + ' | reference class \
547 | separation failed, setting twig_2_2 as an empty list')
548 |     ###########################################################################
549 |     # Stacking all clouds part of the wood portion.
550 |     wood_ids = np.hstack((base_ids, trunk_ids, twig_2_2, wood_1_1))
551 |     wood_ids = np.unique(wood_ids).astype(int)
552 | 
553 |     # Selecting initial set of wood and leaf points.
554 |     wood = arr[wood_ids]
555 | 
556 |     mask_plane = plane_filter(wood, 0.05, 0.02)
557 |     mask_feature = feature_filter(wood, 4, -1, 30)
558 |     temp_mask = np.logical_and(mask_plane, mask_feature)
559 |     mask_cluster = cluster_filter(wood, 0.05, 0.2)
560 |     final_mask = np.logical_and(temp_mask, mask_cluster)
561 |     wood = wood[final_mask]
562 |     leaf = get_diff(arr, wood)
563 | 
564 | 
565 |     ###########################################################################
566 | 
567 |     # Applying path filter to remove small clusters of leaves at the tips of
568 |     # the branches.
569 |     if verbose:
570 |          print(str(datetime.datetime.now()) + ' | running path filtering \
571 | on wood points')
572 |     try:
573 |         path_filter_mask = voxel_path_detection(wood, 0.1, 8, 100, 0.15,
574 |                                                 verbose=True)
575 |         wood_filt_1 = wood[path_filter_mask]
576 |         leaf_filt_1 = get_diff(arr, wood_filt_1)
577 |     except:
578 |         if verbose:
579 |              print(str(datetime.datetime.now()) + ' | failed running path \
580 | filtering')
581 |         wood_filt_1 = wood
582 |         leaf_filt_1 = get_diff(arr, wood_filt_1)
583 |     ###########################################################################
584 |     if cont_filt:
585 |         # Applying continuity filter in an attempt to close gaps in the wood
586 |         # point cloud (i.e. misclassified leaf points in between portions of
587 |         # wood points).
588 |         if verbose:
589 |             print(str(datetime.datetime.now()) + ' | applying continuity \
590 | filter in an attempt to close gaps in the wood point cloud')
591 |         try:
592 |             wood_filt_2, leaf_filt_2 = continuity_filter(wood_filt_1,
593 |                                                          leaf_filt_1,
594 |                                                          rad=cf_rad)
595 | 
596 |             # Applying path filter agin to clean up data after continuity filter.
597 |             if verbose:
598 |                  print(str(datetime.datetime.now()) + ' | running path \
599 | filtering on wood points')
600 |             try:
601 |                 path_filter_mask_2 = voxel_path_detection(wood_filt_2, 0.1,
602 |                                                           4, 100, 0.15,
603 |                                                           verbose=True)
604 |                 wood_filt_2 = wood_filt_2[path_filter_mask_2]
605 |             except:
606 |                 if verbose:
607 |                      print(str(datetime.datetime.now()) + ' | failed running \
608 | path filtering')
609 |                 wood_filt_2 = wood_filt_2
610 | 
611 |         except:
612 |             wood_filt_2 = wood_filt_1
613 |     else:
614 |         wood_filt_2 = wood_filt_1
615 | 
616 |     ###########################################################################
617 |     # After filtering wood points, add back smaller branches to fill in
618 |     # the tips lost by path filtering.
619 |     wood_final = np.vstack((wood_filt_2, arr[wood_1_1]))
620 |     wood_final = remove_duplicates(wood_final)
621 |     # Obtaining leaf point cloud from the difference between input cloud 'arr'
622 |     # and wood points.
623 |     leaf_final = get_diff(arr, wood_final)
624 | 
625 |     ###########################################################################
626 | 
627 |     return wood_final, leaf_final
628 | 
629 | 
630 | def generic_tree(arr, knn_list=[40, 50, 80, 100, 120], voxel_size=0.05,
631 |                  retrace_steps=40):
632 | 
633 |     """
634 |     Run an automated separation of a single tree point cloud.
635 | 
636 |     Parameters
637 |     ----------
638 |     arr : array
639 |         Three-dimensional point cloud of a single tree to perform the
640 |         wood-leaf separation. This should be a n-dimensional array (m x n)
641 |         containing a set of coordinates (n) over a set of points (m).
642 |     knn_lst: list
643 |         Set of knn values to use in the neighborhood search in classification
644 |         steps. This variable will be directly used in a step containing
645 |         the function reference_classification  and its minimum and maximum
646 |         values will be used in a different step with threshold_classification
647 |         (both from classification.classify_wood). These values are directl
648 |         dependent of point density and were defined based on a medium point
649 |         density scenario (mean distance between points aroun 0.05m).
650 |         Therefore, for higher density point clouds it's recommended the use of
651 |         larger knn values for optimal results.
652 |     verbose : bool
653 |         Option to set (or not) verbose output.
654 | 
655 |     Returns
656 |     -------
657 |     wood_final : array
658 |         Wood point cloud.
659 |     leaf_final : array
660 |         Leaf point cloud.
661 | 
662 |     """
663 | 
664 |     # Running voxel_path_detection to detect main pathways (trunk and
665 |     # low order branches) in a tree point cloud. This step generates a
666 |     # graph from the point cloud and retrace n retrace_steps towards the
667 |     # root of the tree.
668 |     path_mask = voxel_path_detection(arr, voxel_size, retrace_steps, 100,
669 |                                      voxel_size * 1.77, False)
670 |     # Filtering path_mask points by feature threshold. In this case,
671 |     # feature 4 has a very distinctive pattern for wood and leaf. Usually
672 |     # the threshold is around -0.9 to -1.
673 |     path_mask_feature = feature_filter(arr[path_mask], 4, -0.9,
674 |                                        np.min(knn_list))
675 |     # Selecting filtered points in path_mask.
676 |     path_retrace_arr = arr[path_mask][path_mask_feature]
677 |     # Running path_detect_frequency to detect main pathways (trunk and
678 |     # low order branches) in a tree point cloud. This step generates a
679 |     # graph from the point cloud and select nodes with high frequency
680 |     # of paths passing through.
681 |     path_frequency_arr = path_detect_frequency(arr, voxel_size, 6)
682 |     # Running threshold_classification to detect small branches.
683 |     wood_abs = threshold_classification(arr, np.min(knn_list))
684 |     # Running reference_classification to detect both trunk, medium branches
685 |     # and small branches.
686 |     wood_vote = reference_classification(arr, knn_list)
687 |     # Stacking classified wood points.
688 |     wood1 = np.vstack((wood_abs, wood_vote))
689 |     # Obtaining leaf points by the difference set between wood and initial
690 |     # point clouds.
691 |     leaf1 = get_diff(arr, wood1)
692 |     # Obtaining larger branches that might have been missed in previous
693 |     # steps. The basic idea is to use a much larger knn value.
694 |     wood_abs_2 = threshold_classification(leaf1, np.max(knn_list) * 2)
695 |     # If wood_abs_2 has more than 10 points, do a cluster filtering to
696 |     # remove cluster with round/flat shapes.
697 |     if len(wood_abs_2) >= 10:
698 |         mask_cluster_2 = cluster_filter(wood_abs_2, 0.06, 0.6)
699 |         wood_abs_2 = wood_abs_2[mask_cluster_2]
700 |     # Obtaining small branches that might have been missed in previous
701 |     # steps. To detect small features, the ideal approach is to use a
702 |     # small neighborhood of points.
703 |     wood_abs_3 = threshold_classification(leaf1, np.min(knn_list))
704 |     # Stacking all wood points classified through Gaussian Mixture/EM.
705 |     wood2 = np.vstack((wood1, wood_abs_2, wood_abs_3))
706 |     # Removing duplicated points.
707 |     wood2 = remove_duplicates(wood2)
708 |     # Applying plane filter to remove points in a plane-ish neighborhood
709 |     # of points. These plane points are more likely to be part of leaves.
710 |     mask_plane = plane_filter(wood2, 0.03, 0.02)
711 |     # Stacking final wood points from GMM classification and path
712 |     # detection.
713 |     wood_final = np.vstack((path_frequency_arr, path_retrace_arr,
714 |                             wood2[mask_plane]))
715 |     # Removes duplicate points and obtains final leaf points from
716 |     # the difference set between initial and final wood point clouds.
717 |     wood_final = remove_duplicates(wood_final)
718 |     leaf_final = get_diff(arr, wood_final)
719 | 
720 |     return wood_final, leaf_final
721 | 
722 | 
723 | def nopath_generic_tree(arr, knn_list=[40, 50, 80, 100, 120]):
724 | 
725 |     """
726 |     Run an automated separation of a single tree point cloud.
727 | 
728 |     Parameters
729 |     ----------
730 |     arr : array
731 |         Three-dimensional point cloud of a single tree to perform the
732 |         wood-leaf separation. This should be a n-dimensional array (m x n)
733 |         containing a set of coordinates (n) over a set of points (m).
734 |     knn_lst: list
735 |         Set of knn values to use in the neighborhood search in classification
736 |         steps. This variable will be directly used in a step containing
737 |         the function reference_classification  and its minimum and maximum
738 |         values will be used in a different step with threshold_classification
739 |         (both from classification.classify_wood). These values are directl
740 |         dependent of point density and were defined based on a medium point
741 |         density scenario (mean distance between points aroun 0.05m).
742 |         Therefore, for higher density point clouds it's recommended the use of
743 |         larger knn values for optimal results.
744 | 
745 |     Returns
746 |     -------
747 |     wood_final : array
748 |         Wood point cloud.
749 |     leaf_final : array
750 |         Leaf point cloud.
751 | 
752 |     """
753 | 
754 |     # Running threshold_classification to detect small branches.
755 |     wood_abs = threshold_classification(arr, np.min(knn_list))
756 |     # Running reference_classification to detect both trunk, medium branches
757 |     # and small branches.
758 |     wood_vote = reference_classification(arr, knn_list)
759 |     # Stacking classified wood points.
760 |     wood1 = np.vstack((wood_abs, wood_vote))
761 |     # Obtaining leaf points by the difference set between wood and initial
762 |     # point clouds.
763 |     leaf1 = get_diff(arr, wood1)
764 |     # Obtaining larger branches that might have been missed in previous
765 |     # steps. The basic idea is to use a much larger knn value.
766 |     wood_abs_2 = threshold_classification(leaf1, np.max(knn_list) * 2)
767 |     # If wood_abs_2 has more than 10 points, do a cluster filtering to
768 |     # remove cluster with round/flat shapes.
769 |     if len(wood_abs_2) >= 10:
770 |         mask_cluster_2 = cluster_filter(wood_abs_2, 0.06, 0.6)
771 |         wood_abs_2 = wood_abs_2[mask_cluster_2]
772 |     # Obtaining small branches that might have been missed in previous
773 |     # steps. To detect small features, the ideal approach is to use a
774 |     # small neighborhood of points.
775 |     wood_abs_3 = threshold_classification(leaf1, np.min(knn_list))
776 |     # Stacking all wood points classified through Gaussian Mixture/EM.
777 |     wood2 = np.vstack((wood1, wood_abs_2, wood_abs_3))
778 |     # Removing duplicated points.
779 |     wood2 = remove_duplicates(wood2)
780 |     # Applying plane filter to remove points in a plane-ish neighborhood
781 |     # of points. These plane points are more likely to be part of leaves.
782 |     mask_plane = plane_filter(wood2, 0.03, 0.02)
783 |     # Stacking final wood points from GMM classification and path
784 |     # detection.
785 |     wood_final = wood2[mask_plane]
786 |     # Removes duplicate points and obtains final leaf points from
787 |     # the difference set between initial and final wood point clouds.
788 |     wood_final = remove_duplicates(wood_final)
789 |     leaf_final = get_diff(arr, wood_final)
790 | 
791 |     return wood_final, leaf_final
792 | 


--------------------------------------------------------------------------------
/tlseparation/scripts/post_processing.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | from ..utility import (cluster_features, cluster_size,
29 |                        connected_component)
30 | 
31 | 
32 | def isolated_clusters(arr, voxel_size=0.05, size_threshold=0.3,
33 |                       feature_threshold=0.6, min_pts=10):
34 |     
35 |     """
36 |     Performs a connected component analysis to cluster points from a point
37 |     cloud and filters them these clusters based on size and shape (geometric
38 |     feature).
39 |     
40 |     Parameters
41 |     ----------
42 |     arr : array
43 |         Three-dimensional (m x n) array of a point cloud, where the
44 |         coordinates are represented in the columns (n) and the points are
45 |         represented in the rows (m).
46 |     voxel_size: float
47 |         Distance used to generate voxels from point cloud in order to 
48 |         perform the connected component analysis in 3D space.
49 |     size_threshold : int/float
50 |         Minimum size, on any dimension, for a cluster to be set as
51 |         valid (True)
52 |     feature_threshold : float
53 |         Minimum feature value for the cluster to be set as elongated (True).
54 |     min_pts : int
55 |         Minimum number of points for the cluster to be set as valid (True).
56 |         
57 |     Returns
58 |     -------
59 |     filter_mask : array
60 |         1D mask array setting True for valid poins in 'arr' and False
61 |         otherwise.
62 |         
63 |     """
64 |     
65 |     # Clustering points in 'arr' using connected_components.
66 |     labels = connected_component(arr, voxel_size)
67 |     # Filtering clustered points based on cluster size.
68 |     filter_mask1 = cluster_size(arr, labels, size_threshold)
69 |     # Filtering clustered points based on cluster geometric feature.
70 |     filter_mask2 = cluster_features(arr, labels, feature_threshold,
71 |                                      min_pts=min_pts)
72 |     # Joining filter masks to generate the final mask.
73 |     filter_mask = (filter_mask1 + filter_mask2).astype(bool)
74 |     
75 |     return arr[filter_mask], arr[~filter_mask]
76 |     


--------------------------------------------------------------------------------
/tlseparation/utility/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | from .shortpath import (array_to_graph, extract_path_info)
29 | from .knnsearch import *
30 | from .data_utils import *
31 | from .filtering import *
32 | from .cloud_analysis import *
33 | from .voxels import *
34 | from .downsampling import *
35 | from .clustering import *
36 | 


--------------------------------------------------------------------------------
/tlseparation/utility/cloud_analysis.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | from .knnsearch import (set_nbrs_knn, set_nbrs_rad)
 30 | from .peakdetect import peakdet
 31 | 
 32 | 
 33 | def detect_optimal_knn(arr, rad_lst=[0.1, 0.2, 0.3], sample_size=10000):
 34 | 
 35 |     """
 36 |     Detects optimal values for knn in order to facilitate material separation.
 37 | 
 38 |     Parameters
 39 |     ----------
 40 |     arr: array
 41 |         Set of 3D points.
 42 |     rad_lst: list
 43 |         Set of radius values to generate samples of neighborhoods. This is
 44 |         used to select points to calculate a number of neighboring points
 45 |         distribution from the point cloud.
 46 |     sample_size: int
 47 |         Number of points in arr to process in order to genrate a distribution.
 48 | 
 49 |     Returns
 50 |     -------
 51 |     knn_lst: list
 52 |         Set of k-nearest neighbors values.
 53 | 
 54 |     """
 55 | 
 56 |     # Generating sample indices.
 57 |     sids = np.random.choice(np.arange(arr.shape[0]), sample_size,
 58 |                             replace=False)
 59 | 
 60 |     # Obtaining nearest neighbors' indices and distance for sampled points.
 61 |     # This process is done just once, with the largest value of radius in
 62 |     # rad_lst. Later on, it is possible to subsample indices by limiting
 63 |     # their distances for a smaller radius.
 64 |     dist, ids = set_nbrs_rad(arr, arr[sids], np.max(rad_lst), True)
 65 | 
 66 |     # Initializing empty list to store knn values.
 67 |     knn_lst = []
 68 | 
 69 |     # Looping over each radius value.
 70 |     for r in rad_lst:
 71 |         # Counting number of points inside radius r.
 72 |         n_pts = [len(i[d <= r]) for i, d in zip(ids, dist)]
 73 | 
 74 |         # Binning n_pts into a histogram.
 75 |         y, x = np.histogram(n_pts)
 76 | 
 77 |         # Detecting peaks of accumulated points from n_pts.
 78 |         maxtab, mintab = peakdet(y, 100)
 79 |         maxtab = np.array(maxtab)
 80 | 
 81 |         # Appending knn values relative to peaks detected in n_pts.
 82 |         knn_lst.append(x[maxtab[:, 0]])
 83 | 
 84 |     # Flattening nested lists into a final list of knn values.
 85 |     knn_lst = [i for j in knn_lst for i in j]
 86 | 
 87 |     return knn_lst
 88 | 
 89 | 
 90 | def detect_rad_nn(arr, rad):
 91 | 
 92 |     """
 93 |     Calculates an average of number of neighbors based on a fixed radius
 94 |     around each point in a point cloud.
 95 | 
 96 |     Parameters
 97 |     ----------
 98 |     arr : array
 99 |         Three-dimensional (m x n) array of a point cloud, where the
100 |         coordinates are represented in the columns (n) and the points are
101 |         represented in the rows (m).
102 |     rad : float
103 |         Radius distance to select neighboring points.
104 | 
105 |     Returns
106 |     -------
107 |     mean_knn : int
108 |         Average number of points inside a radius 'rad' around each point in
109 |         'arr'.
110 | 
111 |     """
112 | 
113 |     # Performin Nearest Neighbors search for the whole point cloud.
114 |     indices = set_nbrs_rad(arr, arr, rad, return_dist=False)
115 | 
116 |     # Counting number of points around each point in 'arr'.
117 |     indices_len = [len(i) for i in indices]
118 | 
119 |     # Calculates a mean of all neighboring point counts.
120 |     mean_knn = np.mean(indices_len).astype(int)
121 | 
122 |     return mean_knn
123 | 
124 | 
125 | def detect_nn_dist(arr, knn, sigma=1):
126 | 
127 |     """
128 |     Calcuates the optimum distance among neighboring points.
129 | 
130 |     Parameters
131 |     ----------
132 |     arr : array
133 |         N-dimensional array (m x n) containing a set of parameters (n) over
134 |         a set of observations (m).
135 |     knn : int
136 |         Number of nearest neighbors to search to constitue the local subset
137 |         of points around each point in 'arr'.
138 | 
139 |     Returns
140 |     -------
141 |     dist : float
142 |         Optimal distance among neighboring points.
143 | 
144 |     """
145 | 
146 |     dist, indices = set_nbrs_knn(arr, arr, knn)
147 | 
148 |     return np.mean(dist[:, 1:]) + (np.std(dist[:, 1:]) * sigma)
149 | 


--------------------------------------------------------------------------------
/tlseparation/utility/clustering.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | import numpy as np
29 | from sklearn.cluster import DBSCAN
30 | 
31 | def connected_component(arr, voxel_size):
32 |     
33 |     """
34 |     Performs a connected component analysis to cluster points from a point
35 |     cloud. 
36 |     
37 |     Parameters
38 |     ----------
39 |     arr : array
40 |         Three-dimensional (m x n) array of a point cloud, where the
41 |         coordinates are represented in the columns (n) and the points are
42 |         represented in the rows (m).
43 |     voxel_size: float
44 |         Distance used to generate voxels from point cloud in order to 
45 |         perform the connected component analysis in 3D space.
46 |         
47 |     Returns
48 |     -------
49 |     point_labels : array
50 |         1D array with cluster labels assigned to each point from the input
51 |         point cloud.
52 |         
53 |     """
54 |     
55 |     # Generate voxels central coordinates.
56 |     voxel_coords = (arr / voxel_size).astype(int)
57 |     # Initialize voxels and fills them based on the voxel coordinates for
58 |     # each point.
59 |     voxels = {}
60 |     for i, v in enumerate(voxel_coords):
61 |         if tuple(v) in voxels:
62 |             voxels[tuple(v)].append(i)
63 |         else:
64 |             voxels[tuple(v)] = [i]
65 |         
66 |     # Running DBSCAN on the voxels created from the input point cloud. This
67 |     # step takes advantage of the integer coordinates to cluster voxels
68 |     # in a similar approach used in a classic connected components.
69 |     db = DBSCAN(eps=1, min_samples=1, algorithm='kd_tree', metric='chebyshev',
70 |                 n_jobs=-1).fit(voxel_coords)
71 |     labels = db.labels_
72 |     
73 |     # Assigning voxel cluster labels to each voxel's respective points.
74 |     point_labels = np.full(arr.shape[0], -1, dtype=int)
75 |     for l in np.unique(labels):
76 |         mask = l == labels
77 |         for c in voxel_coords[mask]:
78 |             point_labels[voxels[tuple(c)]] = l
79 |             
80 |     return point_labels
81 | 


--------------------------------------------------------------------------------
/tlseparation/utility/data_utils.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | import pandas as pd
 30 | from .knnsearch import set_nbrs_knn
 31 | 
 32 | 
 33 | def get_diff(arr1, arr2):
 34 | 
 35 |     """
 36 |     Performs the intersection of two arrays, returning the entries not
 37 |     intersected between arr1 and arr2.
 38 | 
 39 |     Parameters
 40 |     ----------
 41 |     arr1 : array
 42 |         N-dimensional array of points to intersect.
 43 |     arr2 : array
 44 |         N-dimensional array of points to intersect.
 45 | 
 46 |     Returns
 47 |     -------
 48 |     arr : array
 49 |         Difference array between 'arr1' and 'arr2'.
 50 | 
 51 |     """
 52 | 
 53 |     # Asserting that both arrays have the same number of columns.
 54 |     assert arr1.shape[1] == arr2.shape[1]
 55 | 
 56 |     # Stacking both arrays.
 57 |     arr3 = np.vstack((arr1, arr2))
 58 | 
 59 |     # Creating a pandas.DataFrame from the stacked array.
 60 |     df = pd.DataFrame(arr3)
 61 | 
 62 |     # Removing duplicate points and keeping only points that have only a
 63 |     # single occurrence in the stacked array.
 64 |     diff = df.drop_duplicates(keep=False)
 65 | 
 66 |     return np.asarray(diff)
 67 | 
 68 | 
 69 | def remove_duplicates(arr, return_ids=False):
 70 | 
 71 |     """
 72 |     Removes duplicated rows from an array.
 73 | 
 74 |     Parameters
 75 |     ----------
 76 |     arr : array
 77 |         N-dimensional array (m x n) containing a set of parameters (n) over
 78 |         a set of observations (m).
 79 |     return_ids: bool
 80 |         Option to return indices of duplicated entries instead of new array
 81 |         with unique entries.
 82 | 
 83 |     Returns
 84 |     -------
 85 |     unique : array
 86 |         N-dimensional array (m* x n) containing a set of unique parameters (n)
 87 |         over a set of unique observations (m*).
 88 | 
 89 |     """
 90 | 
 91 |     # Setting the pandas.DataFrame from the array (arr) data.
 92 |     df = pd.DataFrame({'x': arr[:, 0],
 93 |                        'y': arr[:, 1], 'z': arr[:, 2]})
 94 | 
 95 |     if return_ids:
 96 |         # Using the duplicated function to mask duplicate points from df.
 97 |         return np.where(df.duplicated((['x', 'y', 'z'])))[0]
 98 | 
 99 |     else:
100 |         # Using the drop_duplicates function to remove duplicate points
101 |         # from df.
102 |         unique = df.drop_duplicates(['x', 'y', 'z'])
103 | 
104 |         return np.asarray(unique).astype(float)
105 | 
106 | 
107 | def apply_nn_value(base, arr, attr):
108 | 
109 |     """
110 |     Upscales a set of attributes from a base array to another denser array.
111 | 
112 |     Parameters
113 |     ----------
114 |     base : array
115 |         Base array to which the attributes to upscale were originaly matched.
116 |     arr : array
117 |         Target array to which the attributes will be upscaled.
118 |     attr : array
119 |         Attributes to upscale.
120 | 
121 |     Returns
122 |     -------
123 |     new_attr : array
124 |         Upscales attributes.
125 | 
126 |     Raises
127 |     ------
128 |     AssertionError:
129 |         length (number of samples) of "base" and "attr" must be equal.
130 | 
131 |     """
132 | 
133 |     assert base.shape[0] == attr.shape[0], '"base" and "attr" must have the\
134 |  same number of samples.'
135 | 
136 |     # Obtaining the closest in base for each point in arr.
137 |     idx = set_nbrs_knn(base, arr, 1, return_dist=False)
138 | 
139 |     # Making sure idx has the right type, int, for indexing.
140 |     idx = idx.astype(int)
141 | 
142 |     # Applying base's attribute (attr) to points in arr.
143 |     newattr = attr[idx]
144 | 
145 |     return np.reshape(newattr, newattr.shape[0])
146 | 
147 | 
148 | def entries_to_remove(entries, d):
149 | 
150 |     """
151 |     Function to remove selected entries (key and respective values) from
152 |     a given dict.
153 |     Based on a reply from the user mattbornski [#]_ at stackoverflow.
154 | 
155 |     Parameters
156 |     ----------
157 |     entries : array
158 |         Set of entried to be removed.
159 |     d : dict
160 |         Dictionary to apply the entried removal.
161 | 
162 |     References
163 |     ----------
164 |     ..  [#] mattbornski, 2012. http://stackoverflow.com/questions/8995611/\
165 | removing-multiple-keys-from-a-dictionary-safely
166 | 
167 |     """
168 | 
169 |     for k in entries:
170 |         d.pop(k, None)
171 | 


--------------------------------------------------------------------------------
/tlseparation/utility/downsampling.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | from scipy.spatial.distance import cdist
 30 | 
 31 | 
 32 | def downsample_cloud(point_cloud, downsample_size, return_indices=False,
 33 |                      return_neighbors=False):
 34 | 
 35 |     """
 36 |     Downsamples a point cloud by voxelizing it and selecting points closest
 37 |     to the median coordinate of all points inside each voxel. The remaining
 38 |     points can be stored and returned as a dictrionary for later
 39 |     use in upsampling back to original input data.
 40 | 
 41 |     Parameters
 42 |     ----------
 43 |     point_cloud : numpy.ndarray
 44 |         Three-dimensional (m x n) array of a point cloud, where the
 45 |         coordinates are represented in the columns (n) and the points are
 46 |         represented in the rows (m).
 47 |     downsample_size : float
 48 |         Size of the voxels used to sample points into groups and select the
 49 |         most central point from. Note that this will not be the final points
 50 |         distance from each other, but an approximation.
 51 |     return_indices : bool
 52 |         Option to return results as downsampled array (False) or the
 53 |         indices of downsampled points from original point cloud (True).
 54 |     return_neighbors : bool
 55 |         Option to return original neighbors of downsampled points (True) or
 56 |         not (False). This information can be used to upsample back the
 57 |         downsampled indices.
 58 | 
 59 |     """
 60 | 
 61 |     # Voxelizing input point cloud by truncating coordinates based on
 62 |     # downsample_size.
 63 |     voxels_ids = (point_cloud / downsample_size).astype(int)
 64 |     voxels = {}
 65 | 
 66 |     # Looping over each point voxel index. Adds each point index to its
 67 |     # voxel key (vid).
 68 |     for i, vid in enumerate(voxels_ids):
 69 |         if tuple(vid) in voxels:
 70 |             voxels[tuple(vid)].append(i)
 71 |         else:
 72 |             voxels[tuple(vid)] = [i]
 73 | 
 74 |     # If return_neighbors is set to True, initialize neighbors_ids dictionary.
 75 |     if return_neighbors:
 76 |         neighbors_ids = {}
 77 | 
 78 |     # Initializing point cloud downsampled indices as array of zeros with
 79 |     # length equal to number of voxels.
 80 |     pc_downsample_ids = np.zeros(len(voxels.keys()), dtype=int)
 81 |     # Looping over each pair of voxel indices and point indices.
 82 |     for i, (vid, pids) in enumerate(voxels.items()):
 83 |         # Calculating median coordinates of points inside current voxel.
 84 |         median_coord = np.median(point_cloud[pids], axis=0)
 85 |         # Calculating distance of every point inside current voxel to
 86 |         # their median.
 87 |         dist = cdist(point_cloud[pids], median_coord.reshape([1, 3]))
 88 |         # Sorting indices by distance and selecting closest point as
 89 |         # representative of current voxel's center. Assign selected point's
 90 |         # index to current index of pc_downsample_ids.
 91 |         sort_ids = np.argsort(dist.T)
 92 |         pids = np.array(pids).flatten()
 93 |         pc_downsample_ids[i] = pids[sort_ids[0][0]]
 94 |         # If set to return neighbors indices, assign all remaining points
 95 |         # indices to selected center index in neighbors_ids.
 96 |         if return_neighbors:
 97 |             neighbors_ids[pc_downsample_ids[i]] = pids[sort_ids[0]]
 98 | 
 99 |     if return_indices:
100 |         if return_neighbors:
101 |             return pc_downsample_ids, neighbors_ids
102 |         else:
103 |             return pc_downsample_ids
104 |     else:
105 |         if return_neighbors:
106 |             return point_cloud[pc_downsample_ids], neighbors_ids
107 |         else:
108 |             return point_cloud[pc_downsample_ids]
109 | 
110 | 
111 | def upsample_cloud(upsample_ids, neighbors_dict):
112 | 
113 |     """
114 |     Upsample cloud based on downsampling information from 'downsample_cloud'.
115 |     This function will loop over each 'upsample_ids' and retrieve its
116 |     original neighboring points stored in 'neighbors_dict'.
117 | 
118 |     Parameters
119 |     ----------
120 |     upsample_ids : list
121 |         List of indices in 'neighbors_dict' to upsample.
122 |     neighbors_dict : dict
123 |         Neighbors information provided by 'downsample_cloud' containing
124 |         all the original neighboring points to each point in the downsampled
125 |         cloud.
126 | 
127 |     Returns
128 |     -------
129 |     upsampled_indices : numpy.ndarray
130 |         Upsampled points from original point cloud.
131 | 
132 |     """
133 | 
134 |     # Looping over each index in upsample_ids and retrieving its
135 |     # original neighbors indices.
136 |     ids = [neighbors_dict[i] for i in upsample_ids if i in neighbors_dict]
137 | 
138 |     return np.unique([i for j in ids for i in j])
139 | 


--------------------------------------------------------------------------------
/tlseparation/utility/filtering.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | from .knnsearch import (set_nbrs_knn, set_nbrs_rad)
 30 | from .data_utils import (get_diff, remove_duplicates)
 31 | from .shortpath import (array_to_graph, extract_path_info)
 32 | from sklearn.neighbors import NearestNeighbors
 33 | from sklearn.cluster import DBSCAN
 34 | from ..classification.point_features import (svd_evals, knn_features,
 35 |                                              curvature)
 36 | 
 37 | def cluster_size(arr, labels, min_size):
 38 |     
 39 |     """
 40 |     Filters a set of connected components by maximum size on any dimension.
 41 | 
 42 |     Parameters
 43 |     ----------
 44 |     arr : array
 45 |         Three-dimensional (m x n) array of a point cloud, where the
 46 |         coordinates are represented in the columns (n) and the points are
 47 |         represented in the rows (m).
 48 |     labels : array
 49 |         1D array with cluster labels assigned to each point from the input
 50 |         point cloud.
 51 |     min_size : int/float
 52 |         Minimum size, on any dimension, for a cluster to be set as
 53 |         valid (True)
 54 |         
 55 |     Returns
 56 |     -------
 57 |     filter_mask : array
 58 |         1D mask array setting True for valid poins in 'arr' and False
 59 |         otherwise.
 60 |         
 61 |     """
 62 |     
 63 |     # Initializes mask.
 64 |     filter_mask = np.zeros(labels.shape[0], dtype=int)
 65 |     # Loops over each cluster label.
 66 |     for l in np.unique(labels):
 67 |         # Masks indices of current cluster.
 68 |         mask = l == labels
 69 |         # Selects point from current cluster.
 70 |         cluster_points = arr[mask]
 71 |         # Calculates the size of the current cluster in all dimensions.
 72 |         cluster_size = (np.max(cluster_points, axis=0).astype(float) -
 73 |                         np.min(cluster_points, axis=0))
 74 |         # Checks if cluster has at least one dimension with size larger then
 75 |         # min_size and, if so, assign a True (1) value to filter mask for
 76 |         # points that are part of the current cluster.
 77 |         if np.max(cluster_size) > min_size:
 78 |             filter_mask[mask] = 1
 79 |         
 80 |     return filter_mask.astype(bool)
 81 | 
 82 | 
 83 | def cluster_features(arr, labels, feature_threshold, min_pts=10):
 84 |     
 85 |     """
 86 |     Filters a set of connected components by a geometric feature threshold.
 87 |     This feature (n 2 in the separation methodology) is used to describe
 88 |     elongated shapes. If the shape of the cluster is elongated enough
 89 |     (i.e. feature value larger than threshold) the points belonging to this
 90 |     cluster are masked as True.
 91 | 
 92 |     Parameters
 93 |     ----------
 94 |     arr : array
 95 |         Three-dimensional (m x n) array of a point cloud, where the
 96 |         coordinates are represented in the columns (n) and the points are
 97 |         represented in the rows (m).
 98 |     labels : array
 99 |         1D array with cluster labels assigned to each point from the input
100 |         point cloud.
101 |     feature_threshold : float
102 |         Minimum feature value for the cluster to be set as elongated (True).
103 |     min_pts : int
104 |         Minimum number of points for the cluster to be set as valid (True).
105 |         
106 |     Returns
107 |     -------
108 |     filter_mask : array
109 |         1D mask array setting True for valid poins in 'arr' and False
110 |         otherwise.    
111 |         
112 |     """
113 |     
114 |     # Initializes arrays for mask and eigenvalues ratio.    
115 |     filter_mask = np.zeros(labels.shape[0], dtype=int)
116 |     evals_ratio = np.zeros([arr.shape[0], 3])
117 |     # Loops over each cluster label.
118 |     for l in np.unique(labels):
119 |         # Masks indices of current cluster.
120 |         mask = l == labels
121 |         # Check if current cluster has at least points, otherwise is not
122 |         # possible to estimate its eigenvalues
123 |         if np.sum(mask) >= 3:
124 |             # Selects point from current cluster.
125 |             cluster_points = arr[mask]         
126 |             if cluster_points.shape[0] >= min_pts:
127 |                 # Calculating centroid coordinates of points in
128 |                 # 'cluster_points'.
129 |                 centroid = np.average(cluster_points, axis=0)
130 |                 # Running SVD on centered points from 'cluster_points'.
131 |                 _, evals, _ = np.linalg.svd(cluster_points - centroid,
132 |                                             full_matrices=False)
133 |                 # Calculating eigenvalues ratio and assigning to the
134 |                 # respective indices of current points to evals_ratio.
135 |                 evals_ratio[mask] = evals / np.sum(evals)
136 | 
137 |         else:
138 |             pass
139 |         
140 |     # Calculating geometric feature.
141 |     feature = evals_ratio[:, 0] - evals_ratio[:, 1]
142 |     # Checking feature values against threshold and masking feature values
143 |     # larger than threshold as True.
144 |     filter_mask = feature >= feature_threshold
145 |     
146 |     return filter_mask 
147 | 
148 | 
149 | def feature_filter(arr, feature_id, threshold, knn):
150 | 
151 |     """
152 |     Filters a point cloud based on a given feature threshold. Only points
153 |     with selected feature values higher than threshold are kept as valid.
154 | 
155 |     Parameters
156 |     ----------
157 |     arr : array
158 |         Three-dimensional (m x n) array of a point cloud, where the
159 |         coordinates are represented in the columns (n) and the points are
160 |         represented in the rows (m).
161 |     feature_id : int
162 |         Column index of feature selected as criteria to filter. Column
163 |         indices follow Python notation [0 - (n_columns - 1)].
164 |     threshold : float
165 |         Minimum feature value for valid points.
166 |     knn : int
167 |         Number of neighbors to select around each point. Used to describe
168 |         local point arrangement.
169 | 
170 |     Returns
171 |     -------
172 |     mask_feature : numpy.ndarray
173 |         Boolean mask with valid points entries set as True.
174 | 
175 |     """
176 | 
177 |     # Running NearestNeighborhood search and calculating geometric features
178 |     # for each point's neighborhood.
179 |     nbrs_idx = set_nbrs_knn(arr, arr, knn, False)
180 |     features = knn_features(arr, nbrs_idx)
181 |     # Masking valid points.
182 |     return features[:, feature_id] >= threshold
183 | 
184 | 
185 | def plane_filter(arr, rad, threshold):
186 | 
187 |     """
188 |     Filters a point cloud based on its points planicity. Removes points that
189 |     are part of a neighbohood with planar spatial arrangement (low curvature).
190 | 
191 |     Parameters
192 |     ----------
193 |     arr : array
194 |         Three-dimensional (m x n) array of a point cloud, where the
195 |         coordinates are represented in the columns (n) and the points are
196 |         represented in the rows (m).
197 |     rad : float
198 |         Search radius distance around each point. Used to describe
199 |         local point arrangement.
200 |     threshold : float
201 |         Minimum curvature value for valid points.
202 | 
203 |     Returns
204 |     -------
205 |     mask_plane : numpy.ndarray
206 |         Boolean mask with valid points entries set as True.
207 | 
208 |     """
209 | 
210 |     # Running NearestNeighborhood search around each point in arr.
211 |     nbrs_idx = set_nbrs_rad(arr, arr, rad, False)
212 |     # Calculating curvature for each point's neighborhood.
213 |     c = curvature(arr, nbrs_idx)
214 | 
215 |     return c >= threshold
216 | 
217 | 
218 | def cluster_filter(arr, max_dist, eval_threshold):
219 | 
220 |     """
221 |     Applies a cluster filter to a point cloud 'arr'. This filter aims to
222 |     remove small, isolated, clusters of points.
223 | 
224 |     Parameters
225 |     ----------
226 |     arr : array
227 |         Point cloud of shape n points x m dimensions to be filtered.
228 |     max_dist : float
229 |         Maximum distance between points to considered part of the same
230 |         cluster.
231 |     eval_threshold : float
232 |         Minimum value for largest eigenvalue for a valid cluster. This value
233 |         is an indication of cluster shape, in which the higher the eigenvalue,
234 |         more elongated is the cluster. Points from clusters that have
235 |         eigenvalue smaller then eval_threshold are filtered out.
236 | 
237 |     Returns
238 |     -------
239 |     mask : array
240 |         Boolean mask of filtered points. Entries are set as True if belonging
241 |         to a valid cluster and False otherwise.
242 | 
243 |     """
244 | 
245 |     # Initializing and fitting HDBSCAN clustering to input array 'arr'.
246 |     clusterer = DBSCAN(max_dist).fit(arr)
247 |     labels = clusterer.labels_
248 | 
249 |     # Initializing arrat of final eigenvalues for each cluster.
250 |     final_evals = np.zeros([labels.shape[0], 3])
251 |     # Looping over each unique cluster label.
252 |     for L in np.unique(labels):
253 |         # Obtaining indices for all entries in 'arr' that are part of current
254 |         # cluster.
255 |         ids = np.where(labels == L)[0]
256 |         # Checking if current cluster is not an empty cluster (label == -1)
257 |         # and if current cluster has more than 3 points.
258 |         if (L != -1) & len(ids) >= 3:
259 |             # Calculated eigenvalues for current cluster.
260 |             e = svd_evals(arr[ids])
261 |             # Assigning current eigenvalues to indices of all points of
262 |             # current cluster in final_evals.
263 |             final_evals[ids] = e
264 | 
265 |     # Calculate eigenvalues ratio. This standardizes all rows (eigenvalues
266 |     # of each point) to an interval between 0 and 1. The sum of each row
267 |     # is 1.
268 |     ratio = np.asarray([i / np.sum(i) for i in final_evals])
269 | 
270 |     # Mask points by largest eigenvalue (column -0).
271 |     return ratio[:, 0] >= eval_threshold
272 | 
273 | 
274 | def radius_filter(arr, radius, min_points):
275 | 
276 |     """
277 |     Applies a radius search filter, which remove isolated points/clusters of
278 |     points.
279 | 
280 |     Parameters
281 |     ----------
282 |     arr : array
283 |         Point cloud of shape n points x m dimensions to be filtered.
284 |     radius : float
285 |         Search radius around each point to form a neighborhood.
286 |     min_point : int
287 |         Minimum number of points in a neighborhood for it to be considered
288 |         valid, i.e not filtered out.
289 | 
290 |     Returns
291 |     -------
292 |     mask : array
293 |         Array of bools masking valid points as True and "noise" points as
294 |         False.
295 | 
296 |     """
297 | 
298 |     # Setting up neighborhood indices.
299 |     indices = set_nbrs_rad(arr, arr, radius, return_dist=False)
300 | 
301 |     # Allocating array of neighborhood's sizes (one entry for each point in
302 |     # arr).
303 |     n_points = np.zeros(arr.shape[0], dtype=int)
304 | 
305 |     # Iterating over each entry in indices and calculating total number of
306 |     # points.
307 |     for i, id_ in enumerate(indices):
308 |         n_points[i] = id_.shape[0]
309 | 
310 |     return n_points >= min_points
311 | 
312 | 
313 | def continuity_filter(wood, leaf, rad=0.05):
314 | 
315 |     """
316 |     Function to apply a continuity filter to a point cloud that contains gaps
317 |     defined as points from a second point cloud.
318 |     This function works assuming that the continuous variable is the
319 |     wood portion of a tree point cloud and the gaps in it are empty space
320 |     or missclassified leaf data. In this sense, this function tries to correct
321 |     gaps where leaf points are present.
322 | 
323 |     Parameters
324 |     ----------
325 |     wood : array
326 |         Wood point cloud to be filtered.
327 |     leaf : array
328 |         Leaf point cloud, with points that may be causing discontinuities in
329 |         the wood point cloud.
330 |     rad : float
331 |         Radius to search for neighboring points in the iterative process.
332 | 
333 |     Returns
334 |     -------
335 |     wood : array
336 |         Filtered wood point cloud.
337 |     not_wood : array
338 |         Remaining point clouds after the filtering.
339 | 
340 |     """
341 | 
342 |     # Stacking wood and leaf arrays.
343 |     arr = np.vstack((wood, leaf))
344 | 
345 |     # Getting root index (base_id) from point cloud 'arr'.
346 |     base_id = np.argmin(arr[:, 2])
347 | 
348 |     # Calculating shortest path graph over sampled array.
349 |     G = array_to_graph(arr, base_id, 3, 100, 0.05, 0.02, 0.5)
350 |     node_ids, dist = extract_path_info(G, base_id, return_path=False)
351 |     node_ids = np.array(node_ids)
352 | 
353 |     # Obtaining wood point cloud indices.
354 |     wood_id = node_ids[node_ids <= wood.shape[0]]
355 | 
356 |     # Generating nearest neighbors search for the entire point cloud (arr).
357 |     nbrs = NearestNeighbors(algorithm='kd_tree', leaf_size=10,
358 |                             n_jobs=-1).fit(arr[node_ids])
359 | 
360 |     # Converting dist variable to array, as it is originaly a list.
361 |     dist = np.asarray(dist)
362 | 
363 |     # Selecting points and accummulated distance for all wood points in arr.
364 |     gp = arr[wood_id]
365 |     d = dist[wood_id]
366 | 
367 |     # Preparing control variables to iterate over. idbase will be all initial
368 |     # wood ids and pts all initial wood points. These variables are the ones
369 |     # to use in search of possible missclassified neighbors.
370 |     idbase = wood_id
371 |     pts = gp
372 | 
373 |     # Setting treshold variables to iterative process.
374 |     e = 9999999
375 |     e_threshold = 3
376 | 
377 |     # Iterating until threshold is met.
378 |     while e > e_threshold:
379 | 
380 |         # Obtaining the neighbor indices of current set of points (pts).
381 |         idx2 = nbrs.radius_neighbors(pts, radius=rad,
382 |                                      return_distance=False)
383 | 
384 |         # Initializing temporary variable id1.
385 |         id1 = []
386 |         # Looping over nn search indices and comparing their respective
387 |         # distances to center point distance. If nearest neighbor distance (to
388 |         # point cloud base) is smaller than center point distance, then ith
389 |         # point is also wood.
390 |         for i in range(idx2.shape[0]):
391 |             for i_ in idx2[i]:
392 |                 if dist[i_] <= (d[i]):
393 |                     id1.append(i_)
394 | 
395 |         # Uniquifying id1.
396 |         id1 = np.unique(id1)
397 | 
398 |         # Comparing original idbase to new wood ids (id1).
399 |         comp = np.in1d(id1, idbase)
400 | 
401 |         # Maintaining only new ids for next iteration.
402 |         diff = id1[np.where(~comp)[0]]
403 |         idbase = np.unique(np.hstack((idbase, id1)))
404 | 
405 |         # Passing new wood points to pts and recalculating e value.
406 |         pts = arr[diff]
407 |         e = pts.shape[0]
408 | 
409 |         # Passing accummulated distances from new points to d.
410 |         d = dist[diff]
411 | 
412 |         # Stacking new points to initial wood points and removing duplicates.
413 |         gp = np.vstack((gp, pts))
414 |         gp = remove_duplicates(gp)
415 | 
416 |     # Removing duplicates from final wood points and obtaining not_wood points
417 |     # from the difference between final wood points and full point cloud.
418 |     wood = remove_duplicates(gp)
419 |     not_wood = get_diff(wood, arr)
420 | 
421 |     return wood, not_wood
422 | 
423 | 
424 | def array_majority(arr_1, arr_2, **kwargs):
425 | 
426 |     """
427 |     Applies majority filter on two arrays.
428 | 
429 |     Parameters
430 |     ----------
431 |     arr_1 : array
432 |         n-dimensional array of points to filter.
433 |     arr_2 : array
434 |         n-dimensional array of points to filter.
435 |     **knn : int or float
436 |         Number neighbors to select around each point in arr in order to apply
437 |         the majority criteria.
438 |     **rad : int or float
439 |         Search radius arount each point in arr to select neighbors in order
440 |         to apply the majority criteria.
441 | 
442 |     Returns
443 |     -------
444 |     c_maj_1 : array
445 |         Boolean mask of filtered entries of same class as input 'arr_1'.
446 |     c_maj_2 : array
447 |         Boolean mask of filtered entries of same class as input 'arr_2'.
448 | 
449 |     Raises
450 |     ------
451 |     AssertionError:
452 |         Raised if neither 'knn' or 'rad' arguments are passed with valid
453 |         values (int or float).
454 | 
455 |     """
456 | 
457 |     # Asserting input arguments are valid.
458 |     assert ('knn' in kwargs.keys()) or ('rad' in kwargs.keys()), 'Please\
459 |  input a value for either "knn" or "rad".'
460 | 
461 |     if 'knn' in kwargs.keys():
462 |         assert (type(kwargs['knn']) == int) or (type(kwargs['knn']) ==
463 |                                                 float), \
464 |             '"knn" variable must be of type int or float.'
465 |     elif 'rad' in kwargs.keys():
466 |         assert (type(kwargs['rad']) == int) or (type(kwargs['rad']) ==
467 |                                                 float), \
468 |             '"rad" variable must be of type int or float.'
469 | 
470 |     # Stacking the arrays from both classes to generate a combined array.
471 |     arr = np.vstack((arr_1, arr_2))
472 | 
473 |     # Generating the indices for the local subsets of points around all points
474 |     # in the combined array. Function used is based upon the argument passed.
475 |     if 'knn' in kwargs.keys():
476 |         indices = set_nbrs_knn(arr, arr, kwargs['knn'], return_dist=False)
477 |     elif 'rad' in kwargs.keys():
478 |         indices = set_nbrs_rad(arr, arr, kwargs['rad'], return_dist=False)
479 | 
480 |     # Making sure indices has type int.
481 |     indices = indices.astype(int)
482 | 
483 |     # Generating the class arrays from both classified arrays and combining
484 |     # them into a single classes array (classes).
485 |     class_1 = np.full(arr_1.shape[0], 1, dtype=np.int)
486 |     class_2 = np.full(arr_2.shape[0], 2, dtype=np.int)
487 |     classes = np.hstack((class_1, class_2)).T
488 | 
489 |     # Allocating output variable.
490 |     c_maj = np.zeros(classes.shape)
491 | 
492 |     # Selecting subset of classes based on the neighborhood expressed by
493 |     # indices.
494 |     class_ = classes[indices]
495 | 
496 |     # Looping over all points in indices.
497 |     for i in range(len(indices)):
498 | 
499 |         # Counting the number of occurrences of each value in the ith instance
500 |         # of class_.
501 |         unique, count = np.unique(class_[i, :], return_counts=True)
502 |         # Appending the majority class into the output variable.
503 |         c_maj[i] = unique[np.argmax(count)]
504 | 
505 |     return c_maj == 1, c_maj == 2
506 | 
507 | 
508 | def class_filter(arr_1, arr_2, target, **kwargs):
509 | 
510 |     """
511 |     Function to apply class filter on an array based on the combination of
512 |     classed from both arrays (arr_1 and arr_2). Which array gets filtered
513 |     is defined by ''target''.
514 | 
515 |     Parameters
516 |     ----------
517 |     arr_1 : array
518 |         n-dimensional array of points to filter.
519 |     arr_2 : array
520 |         n-dimensional array of points to filter.
521 |     target : int or float
522 |         Number of the input array to filter. Valid values are 0 or 1.
523 |     **knn : int or float
524 |         Number neighbors to select around each point in arr in order to apply
525 |         the majority criteria.
526 |     **rad : int or float
527 |         Search radius arount each point in arr to select neighbors in order
528 |         to apply the majority criteria.
529 | 
530 |     Returns
531 |     -------
532 |     c_maj_1 : array
533 |         Boolean mask of filtered entries of same class as input 'arr_1'.
534 |     c_maj_2 : array
535 |         Boolean mask of filtered entries of same class as input 'arr_2'.
536 | 
537 |     Raises
538 |     ------
539 |     AssertionError:
540 |         Raised if neither 'knn' or 'rad' arguments are passed with valid
541 |         values (int or float).
542 |     AssertionError:
543 |         Raised if 'target' variable is not an int or float with value 0 or 1.
544 | 
545 |     """
546 | 
547 |     # Asserting input arguments are valid.
548 |     assert ('knn' in kwargs.keys()) or ('rad' in kwargs.keys()), 'Please\
549 |  input a value for either "knn" or "rad".'
550 | 
551 |     if 'knn' in kwargs.keys():
552 |         assert (type(kwargs['knn']) == int) or (type(kwargs['knn']) ==
553 |                                                 float), \
554 |             '"knn" variable must be of type int or float.'
555 |     elif 'rad' in kwargs.keys():
556 |         assert (type(kwargs['rad']) == int) or (type(kwargs['rad']) ==
557 |                                                 float), \
558 |             '"rad" variable must be of type int or float.'
559 | 
560 |     assert (type(target) == int) or (type(target) == float), '"target"\
561 |  variable must be of type int or float.'
562 |     assert (target == 0) or (target == 1), '"target" variable must be either\
563 |  0 or 1.'
564 | 
565 |     # Stacking the arrays from both classes to generate a combined array.
566 |     arr = np.vstack((arr_1, arr_2))
567 | 
568 |     # Generating the class arrays from both classified arrays and combining
569 |     # them into a single classes array (classes).
570 |     class_1 = np.full(arr_1.shape[0], 0, dtype=np.int)
571 |     class_2 = np.full(arr_2.shape[0], 1, dtype=np.int)
572 |     classes = np.hstack((class_1, class_2)).T
573 | 
574 |     # Generating the indices for the local subsets of points around all points
575 |     # in the combined array. Function used is based upon the argument passed.
576 |     if 'knn' in kwargs.keys():
577 |         indices = set_nbrs_knn(arr, arr, kwargs['knn'], return_dist=False)
578 |     elif 'rad' in kwargs.keys():
579 |         indices = set_nbrs_rad(arr, arr, kwargs['rad'], return_dist=False)
580 | 
581 |     # Making sure indices has type int.
582 |     indices = indices.astype(int)
583 | 
584 |     # Allocating output variable.
585 |     c_maj = classes.copy()
586 | 
587 |     # Selecting subset of classes based on the neighborhood expressed by
588 |     # indices.
589 |     class_ = classes[indices]
590 | 
591 |     # Checking for the target class.
592 |     target_idx = np.where(classes == target)[0]
593 | 
594 |     # Looping over the target points to filter.
595 |     for i in target_idx:
596 | 
597 |         # Counting the number of occurrences of each value in the ith instance
598 |         # of class_.
599 |         count = np.bincount(class_[i, :])
600 |         # Appending the majority class into the output variable.
601 |         c_maj[i] = count.argmax()
602 | 
603 |     return c_maj == 0, c_maj == 1
604 | 
605 | 
606 | def dist_majority(arr_1, arr_2, **kwargs):
607 | 
608 |     """
609 |     Applies majority filter on two arrays.
610 | 
611 |     Parameters
612 |     ----------
613 |     arr_1 : array
614 |         n-dimensional array of points to filter.
615 |     arr_2 : array
616 |         n-dimensional array of points to filter.
617 |     **knn : int or float
618 |         Number neighbors to select around each point in arr in order to apply
619 |         the majority criteria.
620 |     **rad : int or float
621 |         Search radius arount each point in arr to select neighbors in order to
622 |         apply the majority criteria.
623 | 
624 |     Returns
625 |     -------
626 |     c_maj_1 : array
627 |         Boolean mask of filtered entries of same class as input 'arr_1'.
628 |     c_maj_2 : array
629 |         Boolean mask of filtered entries of same class as input 'arr_2'.
630 | 
631 |     Raises:
632 |     AssertionError:
633 |         Raised if neither 'knn' or 'rad' arguments are passed with valid
634 |         values (int or float).
635 | 
636 |     """
637 | 
638 |     # Asserting input arguments are valid.
639 |     assert ('knn' in kwargs.keys()) or ('rad' in kwargs.keys()), 'Please\
640 |  input a value for either "knn" or "rad".'
641 | 
642 |     if 'knn' in kwargs.keys():
643 |         assert (type(kwargs['knn']) == int) or (type(kwargs['knn']) ==
644 |                                                 float), \
645 |             '"knn" variable must be of type int or float.'
646 |     elif 'rad' in kwargs.keys():
647 |         assert (type(kwargs['rad']) == int) or (type(kwargs['rad']) ==
648 |                                                 float), \
649 |             '"rad" variable must be of type int or float.'
650 | 
651 |     # Stacking the arrays from both classes to generate a combined array.
652 |     arr = np.vstack((arr_1, arr_2))
653 | 
654 |     # Generating the indices for the local subsets of points around all points
655 |     # in the combined array. Function used is based upon the argument passed.
656 |     if 'knn' in kwargs.keys():
657 |         dist, indices = set_nbrs_knn(arr, arr, kwargs['knn'])
658 |     elif 'rad' in kwargs.keys():
659 |         dist, indices = set_nbrs_rad(arr, arr, kwargs['rad'])
660 | 
661 |     # Making sure indices has type int.
662 |     indices = indices.astype(int)
663 | 
664 |     # Generating the class arrays from both classified arrays and combining
665 |     # them into a single classes array (classes).
666 |     class_1 = np.full(arr_1.shape[0], 1, dtype=np.int)
667 |     class_2 = np.full(arr_2.shape[0], 2, dtype=np.int)
668 |     classes = np.hstack((class_1, class_2)).T
669 | 
670 |     # Allocating output variable.
671 |     c_maj = np.zeros(classes.shape)
672 | 
673 |     # Selecting subset of classes based on the neighborhood expressed by
674 |     # indices.
675 |     class_ = classes[indices]
676 | 
677 |     # Looping over all points in indices.
678 |     for i in range(len(indices)):
679 | 
680 |         # Obtaining classe from indices i.
681 |         c = class_[i, :]
682 |         # Caculating accummulated distance for each class.
683 |         d1 = np.sum(dist[i][c == 1])
684 |         d2 = np.sum(dist[i][c == 2])
685 |         # Checking which class has the highest distance and assigning it
686 |         # to current index in c_maj.
687 |         if d1 >= d2:
688 |             c_maj[i] = 1
689 |         elif d1 < d2:
690 |             c_maj[i] = 2
691 | 
692 |     return c_maj == 1, c_maj == 2
693 | 


--------------------------------------------------------------------------------
/tlseparation/utility/knnsearch.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import numpy as np
 29 | from sklearn.neighbors import NearestNeighbors
 30 | 
 31 | 
 32 | def set_nbrs_knn(arr, pts, knn, return_dist=True, block_size=100000):
 33 | 
 34 |     """
 35 |     Function to create a set of nearest neighbors indices and their respective
 36 |     distances for a set of points. This function uses a knn search and sets a
 37 |     limit size for a block of points to query. This makes it less efficient in
 38 |     terms of processing time, but avoids running out of memory in cases of
 39 |     very dense/large arrays/queries.
 40 | 
 41 |     Parameters
 42 |     ----------
 43 |     arr : array
 44 |         N-dimensional array to perform the knn search on.
 45 |     pts : array
 46 |         N-dimensional array to search for on the knn search.
 47 |     knn : int
 48 |         Number of nearest neighbors to search for.
 49 |     return_dist : boolean
 50 |         Option to return or not the distances of each neighbor.
 51 |     block_size : int
 52 |         Limit of points to query. The variable 'pts' will be subdivided in n
 53 |         blocks of size block_size to perform query.
 54 | 
 55 |     Returns
 56 |     -------
 57 |     indices : array
 58 |         Set of neighbors indices from 'arr' for each entry in 'pts'.
 59 |     distance : array
 60 |         Distances from each neighbor to each central point in 'pts'.
 61 | 
 62 |     """
 63 | 
 64 |     # Making sure knn is of type int.
 65 |     knn = int(knn)
 66 | 
 67 |     # Initiating the nearest neighbors search and fitting it to the input
 68 |     # array.
 69 |     nbrs = NearestNeighbors(n_neighbors=knn, metric='euclidean',
 70 |                             algorithm='kd_tree', leaf_size=15,
 71 |                             n_jobs=-1).fit(arr)
 72 | 
 73 |     # Making sure block_size is limited by at most the number of points in
 74 |     # arr.
 75 |     if block_size > pts.shape[0]:
 76 |         block_size = pts.shape[0]
 77 | 
 78 |     # Creating block of ids.
 79 |     ids = np.arange(pts.shape[0])
 80 |     ids = np.array_split(ids, int(pts.shape[0] / block_size))
 81 | 
 82 |     # Initializing variables to store distance and indices.
 83 |     if return_dist is True:
 84 |         distance = np.zeros([pts.shape[0], knn])
 85 |     indices = np.zeros([pts.shape[0], knn])
 86 | 
 87 |     # Checking if the function should return the distance as well or only the
 88 |     # neighborhood indices.
 89 |     if return_dist is True:
 90 |         # Obtaining the neighborhood indices and their respective distances
 91 |         # from the center point by looping over blocks of ids.
 92 |         for i in ids:
 93 |             nbrs_dist, nbrs_ids = nbrs.kneighbors(pts[i])
 94 |             distance[i] = nbrs_dist
 95 |             indices[i] = nbrs_ids
 96 |         return distance, indices
 97 | 
 98 |     elif return_dist is False:
 99 |         # Obtaining the neighborhood indices only  by looping over blocks of
100 |         # ids.
101 |         for i in ids:
102 |             nbrs_ids = nbrs.kneighbors(pts[i], return_distance=False)
103 |             indices[i] = nbrs_ids
104 |         return indices
105 | 
106 | 
107 | def set_nbrs_rad(arr, pts, rad, return_dist=True, block_size=100000):
108 | 
109 |     """
110 |     Function to create a set of nearest neighbors indices and their respective
111 |     distances for a set of points. This function uses a radius search and sets
112 |     a limit size for a block of points to query. This makes it less efficient
113 |     in terms of processing time, but avoids running out of memory in cases of
114 |     very dense/large arrays/queries.
115 | 
116 |     Parameters
117 |     ----------
118 |     arr : array
119 |         N-dimensional array to perform the radius search on.
120 |     pts : array
121 |         N-dimensional array to search for on the knn search.
122 |     rad : float
123 |         Radius of the NearestNeighbors search.
124 |     return_dist : boolean
125 |         Option to return or not the distances of each neighbor.
126 |     block_size : int
127 |         Limit of points to query. The variable 'pts' will be subdivided in n
128 |         blocks of size block_size to perform query.
129 | 
130 |     Returns
131 |     -------
132 |     indices : array
133 |         Set of neighbors indices from 'arr' for each entry in 'pts'.
134 |     distance : array
135 |         Distances from each neighbor to each central point in 'pts'.
136 | 
137 |     """
138 | 
139 |     # Making sure block_size is limited by at most the number of points in
140 |     # arr.
141 |     if block_size > pts.shape[0]:
142 |         block_size = pts.shape[0]
143 | 
144 |     # Initiating the nearest neighbors search and fitting it to the input
145 |     # array.
146 |     nbrs = NearestNeighbors(radius=rad, metric='euclidean',
147 |                             algorithm='kd_tree', leaf_size=15,
148 |                             n_jobs=-1).fit(arr)
149 | 
150 |     # Creating block of ids.
151 |     ids = np.arange(pts.shape[0])
152 |     ids = np.array_split(ids, int(pts.shape[0] / block_size))
153 | 
154 |     # Initializing variables to store distance and indices.
155 |     if return_dist is True:
156 |         distance = []
157 |     indices = []
158 | 
159 |     # Checking if the function should return the distance as well or only the
160 |     # neighborhood indices.
161 |     if return_dist is True:
162 |         # Obtaining the neighborhood indices and their respective distances
163 |         # from the center point by looping over blocks of ids.
164 |         for i in ids:
165 |             nbrs_dist, nbrs_ids = nbrs.radius_neighbors(pts[i])
166 |             for j, k in enumerate(i):
167 |                 distance.append(nbrs_dist[j])
168 |                 indices.append(nbrs_ids[j])
169 |         return distance, indices
170 | 
171 |     elif return_dist is False:
172 |         # Obtaining the neighborhood indices only  by looping over blocks of
173 |         # ids.
174 |         for i in ids:
175 |             nbrs_ids = nbrs.radius_neighbors(pts[i], return_distance=False)
176 |             for j, k in enumerate(i):
177 |                 indices.append(nbrs_ids[j])
178 |         return indices
179 | 
180 | 
181 | def subset_nbrs(distance, indices, new_knn, block_size=100000):
182 | 
183 |     """
184 |     Performs a subseting of points from the results of a nearest neighbors
185 |     search.
186 |     This function assumes that the first index/distance in each row represents
187 |     the center point of the neighborhood represented by said rows.
188 | 
189 |     Parameters
190 |     ----------
191 |     distance : array
192 |         Distances from each neighbor to each central point in 'pts'.
193 |     indices : array
194 |         Set of neighbors indices from 'arr' for each entry in 'pts'.
195 |     new_knn : array
196 |         Number of neighbors to select from the initial number of neighbors.
197 |     block_size : int
198 |         Limit of points to query. The variables 'distance' and 'indices' will
199 |         be subdivided in n blocks of size block_size to perform query.
200 | 
201 |     Returns
202 |     -------
203 |     distance : array
204 |         Subset of distances from each neighbor 'indices'.
205 |     indices : array
206 |         Subset of neighbors indices from 'indices'.
207 | 
208 |     """
209 | 
210 |     # Making sure block_size is limited by at most the number of points in
211 |     # arr.
212 |     if block_size > distance.shape[0]:
213 |         block_size = distance.shape[0]
214 | 
215 |     # Creating block of ids.
216 |     ids = np.arange(distance.shape[0])
217 |     ids = np.array_split(ids, int(distance.shape[0] / block_size))
218 | 
219 |     # Initializing new_distance and new_indices variables.
220 |     new_distance = []
221 |     new_indices = []
222 | 
223 |     # Processing all blocks of indices in ids.
224 |     for id_ in ids:
225 | 
226 |         # Looping over each sample in distance and indices.
227 |         for d, i in zip(distance[id_], indices[id_]):
228 |             # Checks if new knn values are smaller than current distance and
229 |             # indices rows. This avoids errors of trying to select a number of
230 |             # columns larger than the available columns.
231 |             if distance.shape[1] >= new_knn:
232 |                 new_distance.append(d[:new_knn+1])
233 |                 new_indices.append(i[:new_knn+1].astype(int))
234 |             else:
235 |                 new_distance.append(d)
236 |                 new_indices.append(int(i))
237 | 
238 |     # Returning new_distance and new_indices as arrays.
239 |     return np.asarray(new_distance), np.asarray(new_indices)
240 | 


--------------------------------------------------------------------------------
/tlseparation/utility/peakdetect.py:
--------------------------------------------------------------------------------
 1 | """
 2 | % Eli Billauer, 3.4.05 (Explicitly not copyrighted).
 3 | % This function is released to the public domain; Any use is allowed.
 4 | 
 5 | Modifications in docstrings were performed by TLSepartion project
 6 | to improve autodocumentation using Sphinx. All credits are still to
 7 | Eli Billauer.
 8 | 
 9 | """
10 | 
11 | import sys
12 | import numpy as np
13 | 
14 | 
15 | def peakdet(v, delta, x=None):
16 | 
17 |     """
18 |     Converted from MATLAB script at http://billauer.co.il/peakdet.html
19 | 
20 | 
21 |     Parameters
22 |     ----------
23 |     v: array
24 |         Input vector (1D array) of values.
25 |     delta: float
26 |         Value change that characterizes a peak. A point is considered a
27 |         maximum peak if it has the maximal value, and was preceded
28 |         (to the left) by a value lower by delta.
29 |     x: array
30 |         Set of x values to replace indices in maxtab/mintab.
31 | 
32 |     Returns
33 |     -------
34 |     maxtab: array
35 |         2D array containing maxima peaks indices and values.
36 |     mintab: array
37 |         2D array containing minima peaks indices and values.
38 | 
39 |     Notes
40 |     ----------
41 |     Eli Billauer, 3.4.05 (Explicitly not copyrighted).
42 |     This function is released to the public domain; Any use is allowed.
43 | 
44 |     """
45 |     maxtab = []
46 |     mintab = []
47 | 
48 |     if x is None:
49 |         x = np.arange(len(v))
50 | 
51 |     v = np.asarray(v)
52 | 
53 |     if len(v) != len(x):
54 |         sys.exit('Input vectors v and x must have same length')
55 | 
56 |     if not np.isscalar(delta):
57 |         sys.exit('Input argument delta must be a scalar')
58 | 
59 |     if delta <= 0:
60 |         sys.exit('Input argument delta must be positive')
61 | 
62 |     mn, mx = np.Inf, -np.Inf
63 |     mnpos, mxpos = np.NaN, np.NaN
64 | 
65 |     lookformax = True
66 | 
67 |     for i in np.arange(len(v)):
68 |         this = v[i]
69 |         if this > mx:
70 |             mx = this
71 |             mxpos = x[i]
72 |         if this < mn:
73 |             mn = this
74 |             mnpos = x[i]
75 | 
76 |         if lookformax:
77 |             if this < mx-delta:
78 |                 maxtab.append((mxpos, mx))
79 |                 mn = this
80 |                 mnpos = x[i]
81 |                 lookformax = False
82 |         else:
83 |             if this > mn+delta:
84 |                 mintab.append((mnpos, mn))
85 |                 mx = this
86 |                 mxpos = x[i]
87 |                 lookformax = True
88 | 
89 |     return np.array(maxtab), np.array(mintab)
90 | 


--------------------------------------------------------------------------------
/tlseparation/utility/shortpath.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
  2 | # All rights reserved.
  3 | #
  4 | #
  5 | #    This program is free software: you can redistribute it and/or modify
  6 | #    it under the terms of the GNU General Public License as published by
  7 | #    the Free Software Foundation, either version 3 of the License, or
  8 | #    (at your option) any later version.
  9 | #
 10 | #    This program is distributed in the hope that it will be useful,
 11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13 | #    GNU General Public License for more details.
 14 | #
 15 | #    You should have received a copy of the GNU General Public License
 16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
 17 | 
 18 | 
 19 | __author__ = "Matheus Boni Vicari"
 20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
 21 | __credits__ = ["Matheus Boni Vicari"]
 22 | __license__ = "GPL3"
 23 | __version__ = "1.3.2"
 24 | __maintainer__ = "Matheus Boni Vicari"
 25 | __email__ = "matheus.boni.vicari@gmail.com"
 26 | __status__ = "Development"
 27 | 
 28 | import networkx as nx
 29 | import numpy as np
 30 | from sklearn.neighbors import NearestNeighbors
 31 | 
 32 | 
 33 | def array_to_graph(arr, base_id, kpairs, knn, nbrs_threshold,
 34 |                    nbrs_threshold_step, graph_threshold=np.inf):
 35 | 
 36 |     """
 37 |     Converts a numpy.array of points coordinates into a Weighted BiDirectional
 38 |     NetworkX Graph.
 39 |     This funcions uses a NearestNeighbor search to determine points adajency.
 40 |     The NNsearch results are used to select pairs of points (or nodes) that
 41 |     have a common edge.
 42 | 
 43 | 
 44 |     Parameters
 45 |     ----------
 46 |     arr : array
 47 |         n-dimensional array of points.
 48 |     base_id : int
 49 |         Index of base id (root) in the graph.
 50 |     kpairs : int
 51 |         Number of points around each point in arr to select in order to
 52 |         build edges.
 53 |     knn : int
 54 |         Number of neighbors to search around each point in the neighborhood
 55 |         phase. The higher the better (careful, it's  memory intensive).
 56 |     nbrs_threshold : float
 57 |         Maximum valid distance between neighbors points.
 58 |     nbrs_threshold_step : float
 59 |         Distance increment used in the final phase of edges generation. It's
 60 |         used to make sure that in the end, every point in arr will be
 61 |         translated to nodes in the graph.
 62 |     graph_threshold : float
 63 |         Maximum distance between pairs of nodes (edge distance) accepted in
 64 |         the graph generation.
 65 | 
 66 |     Returns
 67 |     -------
 68 |     G : networkx graph
 69 |         Graph containing all points in 'arr' as nodes.
 70 | 
 71 |     """
 72 | 
 73 |     # only pass if all points are able to connect to graph
 74 |     not_connected = True
 75 |     iterations = 0
 76 |     while not_connected:
 77 |         not_connected = False
 78 |         iterations += 1
 79 |         knn *= iterations
 80 | #         print('iterations',iterations)
 81 | #         print('knn',knn)
 82 | 
 83 |         # Initializing graph.
 84 |         G = nx.Graph()
 85 | 
 86 |         # Generating array of all indices from 'arr' and all indices to process
 87 |         # 'idx'.
 88 |         idx_base = np.arange(arr.shape[0], dtype=int)
 89 |         idx = np.arange(arr.shape[0], dtype=int)
 90 | 
 91 |         # Initializing NearestNeighbors search and searching for all 'knn'
 92 |         # neighboring points arround each point in 'arr'.
 93 |         nbrs = NearestNeighbors(n_neighbors=knn, metric='euclidean',
 94 |                                 leaf_size=15, n_jobs=-1).fit(arr)
 95 |         distances, indices = nbrs.kneighbors(arr)
 96 |         indices = indices.astype(int)
 97 | 
 98 |         # Initializing variables for current ids being processed (current_idx)
 99 |         # and all ids already processed (processed_idx).
100 |         current_idx = [base_id]
101 |         processed_idx = [base_id]
102 | 
103 |         # Looping while there are still indices (idx) left to process.
104 |         while idx.shape[0] > 0:
105 | 
106 |             # If current_idx is a list containing several indices.
107 |             if len(current_idx) > 0:
108 | 
109 |                 # Selecting NearestNeighbors indices and distances for current
110 |                 # indices being processed.
111 |                 nn = indices[current_idx]
112 |                 dd = distances[current_idx]
113 | 
114 |                 # Masking out indices already contained in processed_idx.
115 |                 mask1 = np.in1d(nn, processed_idx, invert=True).reshape(nn.shape)
116 | 
117 |                 # Initializing temporary list of nearest neighbors. This list
118 |                 # is latter used to accumulate points that will be added to
119 |                 # processed points list.
120 |                 nntemp = []
121 | 
122 |                 # Looping over current indices's set of nn points and selecting
123 |                 # knn points that hasn't been added/processed yet (mask1).
124 |                 for i, (n, d, g) in enumerate(zip(nn, dd, current_idx)):
125 |                     nn_idx = n[mask1[i]][0:kpairs+1]
126 |                     dd_idx = d[mask1[i]][0:kpairs+1]
127 |                     nntemp.append(nn_idx)
128 | 
129 |                     # Adding current knn selected points as nodes to graph G.
130 |                     add_nodes(G, g, nn_idx, dd_idx, graph_threshold)
131 | 
132 |                 # Obtaining an unique array of points currently being processed.
133 |                 current_idx = np.unique([t2 for t1 in nntemp for t2 in t1])
134 | 
135 |             # If current_idx is an empty list.
136 |             elif len(current_idx) == 0:
137 | 
138 |                 # Getting NearestNeighbors indices and distance for all indices
139 |                 # that remain to be processed.
140 |                 idx2 = indices[idx]
141 |                 dist2 = distances[idx]
142 | 
143 |                 # Masking indices in idx2 that have already been processed. The
144 |                 # idea is to connect remaining points to existing graph nodes.
145 |                 mask1 = np.in1d(idx2, processed_idx).reshape(idx2.shape)
146 | 
147 |                 # check to see if mask1 produces empty set. If so, must redo
148 |                 # nearest neighbor search
149 |                 mask1_check = np.unique(np.where(mask1)[0])
150 |                 if mask1_check.shape[0] == 0:
151 |                     not_connected = True
152 |                     break
153 | 
154 |                 # Masking neighboring points that are withing threshold distance.
155 |                 mask2 = dist2 < nbrs_threshold
156 |                 # mask1 AND mask2. This will mask only indices that are part of
157 |                 # the graph and within threshold distance.
158 |                 mask = np.logical_and(mask1, mask2)
159 | 
160 |                 # Getting unique array of indices that match the criteria from
161 |                 # mask1 and mask2.
162 |                 temp_idx = np.unique(np.where(mask)[0])
163 |                 # Assigns remaining indices (idx) matched in temp_idx to
164 |                 # current_idx.
165 |                 current_idx = idx[temp_idx]
166 | 
167 |                 # Selecting NearestNeighbors indices and distances for current
168 |                 # indices being processed.
169 |                 nn = indices[current_idx]
170 |                 dd = distances[current_idx]
171 | 
172 |                 # Masking points in nn that have already been processed.
173 |                 # This is the oposite approach as above, where points that are
174 |                 # still not in the graph are desired. Now, to make sure the
175 |                 # continuity of the graph is kept, join current remaining indices
176 |                 # to indices already in G.
177 |                 mask = np.in1d(nn, processed_idx, invert=True).reshape(nn.shape)
178 | 
179 |                 # Initializing temporary list of nearest neighbors. This list
180 |                 # is latter used to accumulate points that will be added to
181 |                 # processed points list.
182 |                 nntemp = []
183 | 
184 |                 # Looping over current indices's set of nn points and selecting
185 |                 # knn points that have alreay been added/processed (mask).
186 |                 # Also, to ensure continuity over next iteration, select another
187 |                 # kpairs points from indices that haven't been processed (~mask).
188 |                 for i, (n, d, g) in enumerate(zip(nn, dd, current_idx)):
189 |                     nn_idx = n[mask[i]][0:kpairs+1]
190 |                     dd_idx = d[mask[i]][0:kpairs+1]
191 | 
192 |                     # Adding current knn selected points as nodes to graph G.
193 |                     add_nodes(G, g, nn_idx, dd_idx, graph_threshold)
194 | 
195 |                     nn_idx = n[~mask[i]][0:kpairs+1]
196 |                     dd_idx = d[~mask[i]][0:kpairs+1]
197 | 
198 |                     # Adding current knn selected points as nodes to graph G.
199 |                     add_nodes(G, g, nn_idx, dd_idx, graph_threshold)
200 | 
201 |                 # Check if current_idx is still empty. If so, increase the
202 |                 # nbrs_threshold to try to include more points in the next
203 |                 # iteration.
204 |                 if len(current_idx) == 0:
205 |                     nbrs_threshold += nbrs_threshold_step
206 | 
207 |             # Appending current_idx to processed_idx.
208 |             processed_idx = np.append(processed_idx, current_idx)
209 |             processed_idx = np.unique(processed_idx).astype(int)
210 | 
211 |             # Generating list of remaining proints to process.
212 |             idx = idx_base[np.in1d(idx_base, processed_idx, invert=True)]
213 | 
214 |     return G
215 | 
216 | 
217 | def extract_path_info(G, base_id, return_path=True):
218 | 
219 |     """
220 |     Extracts shortest path information from a NetworkX graph.
221 | 
222 |     Parameters
223 |     ----------
224 |     G : networkx graph
225 |         NetworkX graph object from which to extract the information.
226 |     base_id : int
227 |         Base (root) node id to calculate the shortest path for all other
228 |         nodes.
229 |     return_path : boolean
230 |         Option to select if function should output path list for every node
231 |         in G to base_id.
232 | 
233 |     Returns
234 |     -------
235 |     nodes_ids : list
236 |         Indices of all nodes in graph G.
237 |     distance : list
238 |         Shortest path distance (accumulated) from all nodes in G to base_id
239 |         node.
240 |     path_list : dict
241 |         Dictionary of nodes that comprises the path of every node in G to
242 |         base_id node.
243 | 
244 |     """
245 | 
246 |     # Calculating the shortest path
247 |     shortpath = nx.single_source_dijkstra_path_length(G, base_id)
248 | 
249 |     # Obtaining the node coordinates and their respective distance from
250 |     # the base point.
251 |     nodes_ids = shortpath.keys()
252 |     distance = shortpath.values()
253 | 
254 |     # Checking if the function should also return the paths of each node and
255 |     # if so, generating the path list and returning it.
256 |     if return_path is True:
257 |         path_list = nx.single_source_dijkstra_path(G, base_id)
258 |         return nodes_ids, distance, path_list
259 | 
260 |     elif return_path is False:
261 |         return nodes_ids, distance
262 | 
263 | 
264 | def add_nodes(G, base_node, indices, distance, threshold):
265 | 
266 |     """
267 |     Adds a set of nodes and weighted edges based on pairs of indices
268 |     between base_node and all entries in indices. Each node pair shares an
269 |     edge with weight equal to the distance between both nodes.
270 | 
271 |     Parameters
272 |     ----------
273 |     G : networkx graph
274 |         NetworkX graph object to which all nodes/edges will be added.
275 |     base_node : int
276 |         Base node's id to be added. All other nodes will be paired with
277 |         base_node to form different edges.
278 |     indices : list or array
279 |         Set of nodes indices to be paired with base_node.
280 |     distance : list or array
281 |         Set of distances between all nodes in 'indices' and base_node.
282 |     threshold : float
283 |         Edge distance threshold. All edges with distance larger than
284 |         'threshold' will not be added to G.
285 | 
286 |     """
287 | 
288 |     for c in np.arange(len(indices)):
289 |         if distance[c] <= threshold:
290 |             # If the distance between vertices is less than a given
291 |             # threshold, add edge (i[0], i[c]) to Graph.
292 |             G.add_weighted_edges_from([(base_node, indices[c],
293 |                                         distance[c])])
294 | 


--------------------------------------------------------------------------------
/tlseparation/utility/voxels.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2017-2019, Matheus Boni Vicari, TLSeparation Project
 2 | # All rights reserved.
 3 | #
 4 | #
 5 | #    This program is free software: you can redistribute it and/or modify
 6 | #    it under the terms of the GNU General Public License as published by
 7 | #    the Free Software Foundation, either version 3 of the License, or
 8 | #    (at your option) any later version.
 9 | #
10 | #    This program is distributed in the hope that it will be useful,
11 | #    but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | #    GNU General Public License for more details.
14 | #
15 | #    You should have received a copy of the GNU General Public License
16 | #    along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | 
18 | 
19 | __author__ = "Matheus Boni Vicari"
20 | __copyright__ = "Copyright 2017-2019, TLSeparation Project"
21 | __credits__ = ["Matheus Boni Vicari"]
22 | __license__ = "GPL3"
23 | __version__ = "1.3.2"
24 | __maintainer__ = "Matheus Boni Vicari"
25 | __email__ = "matheus.boni.vicari@gmail.com"
26 | __status__ = "Development"
27 | 
28 | from collections import defaultdict
29 | 
30 | 
31 | def voxelize_cloud(arr, voxel_size):
32 | 
33 |     """
34 |     Generates a dictionary of voxels containing their central coordinates
35 |     and indices of points belonging to each voxel.
36 | 
37 |     Parameters
38 |     ----------
39 |     arr: array
40 |         Array of points/entries to voxelize.
41 |     voxel_size: float
42 |         Length of all voxels sides/edges.
43 | 
44 |     Returns
45 |     -------
46 |     vox: defaultdict
47 |         Dictionary containing voxels. Keys are voxels' central coordinates and
48 |         values are indices of points in arr inside each voxel.
49 | 
50 |     """
51 | 
52 |     voxels_ids = (arr / voxel_size).astype(int) * voxel_size
53 |     vox = defaultdict(list)
54 | 
55 |     for i, v in enumerate(voxels_ids):
56 |         vox[tuple(v)].append(i)
57 | 
58 |     return vox
59 | 


--------------------------------------------------------------------------------