20 | "License" shall mean the terms and conditions for use, reproduction,
21 | and distribution as defined by Sections 1 through 9 of this document.
22 |
23 |
24 | "Licensor" shall mean the copyright owner or entity authorized by
25 | the copyright owner that is granting the License.
26 |
27 |
28 | "Legal Entity" shall mean the union of the acting entity and all
29 | other entities that control, are controlled by, or are under common
30 | control with that entity. For the purposes of this definition,
31 | "control" means (i) the power, direct or indirect, to cause the
32 | direction or management of such entity, whether by contract or
33 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
34 | outstanding shares, or (iii) beneficial ownership of such entity.
35 |
36 |
37 | "You" (or "Your") shall mean an individual or Legal Entity
38 | exercising permissions granted by this License.
39 |
40 |
41 | "Source" form shall mean the preferred form for making modifications,
42 | including but not limited to software source code, documentation
43 | source, and configuration files.
44 |
45 |
46 | "Object" form shall mean any form resulting from mechanical
47 | transformation or translation of a Source form, including but
48 | not limited to compiled object code, generated documentation,
49 | and conversions to other media types.
50 |
51 |
52 | "Work" shall mean the work of authorship, whether in Source or
53 | Object form, made available under the License, as indicated by a
54 | copyright notice that is included in or attached to the work
55 | (an example is provided in the Appendix below).
56 |
57 |
58 | "Derivative Works" shall mean any work, whether in Source or Object
59 | form, that is based on (or derived from) the Work and for which the
60 | editorial revisions, annotations, elaborations, or other modifications
61 | represent, as a whole, an original work of authorship. For the purposes
62 | of this License, Derivative Works shall not include works that remain
63 | separable from, or merely link (or bind by name) to the interfaces of,
64 | the Work and Derivative Works thereof.
65 |
66 |
67 | "Contribution" shall mean any work of authorship, including
68 | the original version of the Work and any modifications or additions
69 | to that Work or Derivative Works thereof, that is intentionally
70 | submitted to Licensor for inclusion in the Work by the copyright owner
71 | or by an individual or Legal Entity authorized to submit on behalf of
72 | the copyright owner. For the purposes of this definition, "submitted"
73 | means any form of electronic, verbal, or written communication sent
74 | to the Licensor or its representatives, including but not limited to
75 | communication on electronic mailing lists, source code control systems,
76 | and issue tracking systems that are managed by, or on behalf of, the
77 | Licensor for the purpose of discussing and improving the Work, but
78 | excluding communication that is conspicuously marked or otherwise
79 | designated in writing by the copyright owner as "Not a Contribution."
80 |
81 |
82 | "Contributor" shall mean Licensor and any individual or Legal Entity
83 | on behalf of whom a Contribution has been received by Licensor and
84 | subsequently incorporated within the Work.
85 |
86 |
2. Grant of Copyright License.
87 | Subject to the terms and conditions of
88 | this License, each Contributor hereby grants to You a perpetual,
89 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
90 | copyright license to reproduce, prepare Derivative Works of,
91 | publicly display, publicly perform, sublicense, and distribute the
92 | Work and such Derivative Works in Source or Object form.
93 |
94 |
3. Grant of Patent License.
95 | Subject to the terms and conditions of
96 | this License, each Contributor hereby grants to You a perpetual,
97 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
98 | (except as stated in this section) patent license to make, have made,
99 | use, offer to sell, sell, import, and otherwise transfer the Work,
100 | where such license applies only to those patent claims licensable
101 | by such Contributor that are necessarily infringed by their
102 | Contribution(s) alone or by combination of their Contribution(s)
103 | with the Work to which such Contribution(s) was submitted. If You
104 | institute patent litigation against any entity (including a
105 | cross-claim or counterclaim in a lawsuit) alleging that the Work
106 | or a Contribution incorporated within the Work constitutes direct
107 | or contributory patent infringement, then any patent licenses
108 | granted to You under this License for that Work shall terminate
109 | as of the date such litigation is filed.
110 |
111 |
4. Redistribution.
112 | You may reproduce and distribute copies of the
113 | Work or Derivative Works thereof in any medium, with or without
114 | modifications, and in Source or Object form, provided that You
115 | meet the following conditions:
116 |
117 |
You must give any other recipients of the Work or
118 | Derivative Works a copy of this License; and
119 |
120 |
121 |
You must cause any modified files to carry prominent notices
122 | stating that You changed the files; and
123 |
124 |
125 |
You must retain, in the Source form of any Derivative Works
126 | that You distribute, all copyright, patent, trademark, and
127 | attribution notices from the Source form of the Work,
128 | excluding those notices that do not pertain to any part of
129 | the Derivative Works; and
130 |
131 |
132 |
If the Work includes a "NOTICE" text file as part of its
133 | distribution, then any Derivative Works that You distribute must
134 | include a readable copy of the attribution notices contained
135 | within such NOTICE file, excluding those notices that do not
136 | pertain to any part of the Derivative Works, in at least one
137 | of the following places: within a NOTICE text file distributed
138 | as part of the Derivative Works; within the Source form or
139 | documentation, if provided along with the Derivative Works; or,
140 | within a display generated by the Derivative Works, if and
141 | wherever such third-party notices normally appear. The contents
142 | of the NOTICE file are for informational purposes only and
143 | do not modify the License. You may add Your own attribution
144 | notices within Derivative Works that You distribute, alongside
145 | or as an addendum to the NOTICE text from the Work, provided
146 | that such additional attribution notices cannot be construed
147 | as modifying the License.
148 |
149 | You may add Your own copyright statement to Your modifications and
150 | may provide additional or different license terms and conditions
151 | for use, reproduction, or distribution of Your modifications, or
152 | for any such Derivative Works as a whole, provided Your use,
153 | reproduction, and distribution of the Work otherwise complies with
154 | the conditions stated in this License.
155 |
156 |
5. Submission of Contributions.
157 | Unless You explicitly state otherwise,
158 | any Contribution intentionally submitted for inclusion in the Work
159 | by You to the Licensor shall be under the terms and conditions of
160 | this License, without any additional terms or conditions.
161 | Notwithstanding the above, nothing herein shall supersede or modify
162 | the terms of any separate license agreement you may have executed
163 | with Licensor regarding such Contributions.
164 |
165 |
6. Trademarks.
166 | This License does not grant permission to use the trade
167 | names, trademarks, service marks, or product names of the Licensor,
168 | except as required for reasonable and customary use in describing the
169 | origin of the Work and reproducing the content of the NOTICE file.
170 |
171 |
7. Disclaimer of Warranty.
172 | Unless required by applicable law or
173 | agreed to in writing, Licensor provides the Work (and each
174 | Contributor provides its Contributions) on an "AS IS" BASIS,
175 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
176 | implied, including, without limitation, any warranties or conditions
177 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
178 | PARTICULAR PURPOSE. You are solely responsible for determining the
179 | appropriateness of using or redistributing the Work and assume any
180 | risks associated with Your exercise of permissions under this License.
181 |
182 |
8. Limitation of Liability.
183 | In no event and under no legal theory,
184 | whether in tort (including negligence), contract, or otherwise,
185 | unless required by applicable law (such as deliberate and grossly
186 | negligent acts) or agreed to in writing, shall any Contributor be
187 | liable to You for damages, including any direct, indirect, special,
188 | incidental, or consequential damages of any character arising as a
189 | result of this License or out of the use or inability to use the
190 | Work (including but not limited to damages for loss of goodwill,
191 | work stoppage, computer failure or malfunction, or any and all
192 | other commercial damages or losses), even if such Contributor
193 | has been advised of the possibility of such damages.
194 |
195 |
9. Accepting Warranty or Additional Liability.
196 | While redistributing
197 | the Work or Derivative Works thereof, You may choose to offer,
198 | and charge a fee for, acceptance of support, warranty, indemnity,
199 | or other liability obligations and/or rights consistent with this
200 | License. However, in accepting such obligations, You may act only
201 | on Your own behalf and on Your sole responsibility, not on behalf
202 | of any other Contributor, and only if You agree to indemnify,
203 | defend, and hold each Contributor harmless for any liability
204 | incurred by, or claims asserted against, such Contributor by reason
205 | of your accepting any such warranty or additional liability.
206 |
207 |
208 | END OF TERMS AND CONDITIONS
209 |
210 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | hierarchical-clustering-java
2 | ============================
3 |
4 | Implementation of an agglomerative hierarchical clustering algorithm in Java. Different linkage approaches are supported:
5 | * Single Linkage
6 | * Complete Linkage
7 |
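As a rough illustration of how linkage approaches differ, the following standalone sketch reduces the pairwise distances between two clusters to a single linkage distance. It does not use this library's `LinkageStrategy` or `Distance` types; the class and method names (`LinkageSketch`, `single`, `complete`, `average`) are hypothetical and exist only for this example, and the input is assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the library: shows how single, complete
// and average linkage each reduce a set of pairwise distances to one value.
class LinkageSketch {

    // Single linkage: distance of the closest pair of members.
    static double single(List<Double> distances) {
        double min = Double.POSITIVE_INFINITY;
        for (double d : distances) {
            min = Math.min(min, d);
        }
        return min;
    }

    // Complete linkage: distance of the farthest pair of members.
    static double complete(List<Double> distances) {
        double max = Double.NEGATIVE_INFINITY;
        for (double d : distances) {
            max = Math.max(max, d);
        }
        return max;
    }

    // Average linkage: unweighted mean of all pairwise distances.
    static double average(List<Double> distances) {
        double sum = 0;
        for (double d : distances) {
            sum += d;
        }
        return sum / distances.size();
    }

    public static void main(String[] args) {
        // Pairwise distances between the clusters {O1, O2} and {O3, O5},
        // read from the example matrix in the next section:
        // O1-O3 = 9, O1-O5 = 11, O2-O3 = 4, O2-O5 = 8
        List<Double> pairwise = Arrays.asList(9.0, 11.0, 4.0, 8.0);
        System.out.println(single(pairwise));   // 4.0
        System.out.println(complete(pairwise)); // 11.0
        System.out.println(average(pairwise));  // 8.0
    }
}
```

Single linkage tends to produce elongated chains of clusters, while complete linkage favors compact ones, so the choice of strategy can change the resulting hierarchy noticeably.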
8 | What you put in
9 | ---------------
10 |
11 | Pass a distance matrix and a cluster name array along with a linkage strategy to the clustering algorithm:
12 |
13 |     String[] names = new String[] { "O1", "O2", "O3", "O4", "O5", "O6" };
14 |     double[][] distances = new double[][] {
15 |         { 0, 1, 9, 7, 11, 14 },
16 |         { 1, 0, 4, 3, 8, 10 },
17 |         { 9, 4, 0, 9, 2, 8 },
18 |         { 7, 3, 9, 0, 6, 13 },
19 |         { 11, 8, 2, 6, 0, 10 },
20 |         { 14, 10, 8, 13, 10, 0 }};
21 |
22 |     ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
23 |     Cluster cluster = alg.performClustering(distances, names,
24 |         new AverageLinkageStrategy());
25 |
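To make concrete what an agglomerative algorithm does with this kind of input, here is a minimal, naive sketch using single linkage. It is independent of `DefaultClusteringAlgorithm` and is not meant to describe how the library implements clustering; `NaiveAgglomeration`, `mergeHeights` and `singleLinkage` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained sketch of agglomerative clustering.
class NaiveAgglomeration {

    // Returns the linkage distance of each merge, in the order the merges happen.
    static List<Double> mergeHeights(double[][] dist) {
        // Start with one singleton cluster per observation (stored as index lists).
        List<List<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < dist.length; i++) {
            List<Integer> c = new ArrayList<>();
            c.add(i);
            clusters.add(c);
        }
        List<Double> heights = new ArrayList<>();
        while (clusters.size() > 1) {
            // Find the pair of clusters with the smallest linkage distance.
            int bestA = 0;
            int bestB = 1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++) {
                for (int b = a + 1; b < clusters.size(); b++) {
                    double d = singleLinkage(clusters.get(a), clusters.get(b), dist);
                    if (d < best) {
                        best = d;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            // Merge cluster b into cluster a; b > a, so index a stays valid.
            clusters.get(bestA).addAll(clusters.get(bestB));
            clusters.remove(bestB);
            heights.add(best);
        }
        return heights;
    }

    // Single linkage: smallest distance between any two members.
    static double singleLinkage(List<Integer> a, List<Integer> b, double[][] dist) {
        double min = Double.POSITIVE_INFINITY;
        for (int i : a) {
            for (int j : b) {
                min = Math.min(min, dist[i][j]);
            }
        }
        return min;
    }

    public static void main(String[] args) {
        double[][] distances = {
            { 0, 1, 9, 7, 11, 14 },
            { 1, 0, 4, 3, 8, 10 },
            { 9, 4, 0, 9, 2, 8 },
            { 7, 3, 9, 0, 6, 13 },
            { 11, 8, 2, 6, 0, 10 },
            { 14, 10, 8, 13, 10, 0 }};
        // Merges: O1+O2, O3+O5, {O1,O2}+O4, then the two groups, then O6.
        System.out.println(mergeHeights(distances)); // [1.0, 2.0, 3.0, 4.0, 8.0]
    }
}
```

The sketch recomputes every linkage distance on each pass, which keeps it short but makes it cubic or worse in the number of observations; it is only intended to show the merge order for the matrix above.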
26 | Alternatively, you can pass a [pdist](http://www.mathworks.com/help/stats/pdist.html)-like matrix containing one row:
27 |
28 |     String[] names = new String[] { "O1", "O2", "O3", "O4", "O5", "O6" };
29 |     double[][] pdist = new double[][] {
30 |         { 1, 9, 7, 11, 14, 4, 3, 8, 10, 9, 2, 8, 6, 13, 10 }
31 |     };
32 |     ClusteringAlgorithm alg = new PDistClusteringAlgorithm();
33 |     Cluster cluster = alg.performClustering(pdist, names, new MinimumLinkageStrategy());
34 |
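The single row in this example matches the upper triangle of the square matrix from the first example, read row by row. Assuming that layout, a condensed row can be derived from a symmetric square matrix as sketched below; `CondensedMatrix` and `toCondensed` are hypothetical names and not part of this library.

```java
import java.util.Arrays;

// Hypothetical helper: flattens the upper triangle (excluding the diagonal)
// of a symmetric distance matrix into one pdist-like row.
class CondensedMatrix {

    static double[] toCondensed(double[][] square) {
        int n = square.length;
        double[] condensed = new double[n * (n - 1) / 2];
        int k = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                condensed[k++] = square[i][j];
            }
        }
        return condensed;
    }

    public static void main(String[] args) {
        double[][] distances = {
            { 0, 1, 9, 7, 11, 14 },
            { 1, 0, 4, 3, 8, 10 },
            { 9, 4, 0, 9, 2, 8 },
            { 7, 3, 9, 0, 6, 13 },
            { 11, 8, 2, 6, 0, 10 },
            { 14, 10, 8, 13, 10, 0 }};
        // Prints the 15 values used in the pdist example above.
        System.out.println(Arrays.toString(toCondensed(distances)));
    }
}
```

For six observations the row has 6 * 5 / 2 = 15 entries. To pass the result in the shape shown above, wrap it as `new double[][] { condensed }`.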
35 | What you get out
36 | ----------------
37 |
38 | The algorithm creates a *Cluster* instance representing a hierarchy of cluster nodes based on their distances.
39 | You may want to visualize the result using the Swing component *DendrogramPanel*:
40 |
41 |     DendrogramPanel dp = new DendrogramPanel();
42 |     dp.setModel(cluster);
43 |
44 | When embedded into a JFrame, the dendrogram panel should display this:
45 |
46 | 
47 |
48 | License
49 | -------
50 | Licensed under the Apache License, Version 2.0 (the "License");
51 | you may not use this file except in compliance with the License.
52 | You may obtain a copy of the License at (http://www.apache.org/licenses/LICENSE-2.0).
53 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS,
54 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
55 | See the License for the specific language governing permissions and limitations under the License.
56 |
--------------------------------------------------------------------------------
/azure-pipelines.yml:
--------------------------------------------------------------------------------
1 | # Maven
2 | # Build your Java project and run tests with Apache Maven.
3 | # Add steps that analyze code, save build artifacts, deploy, and more:
4 | # https://docs.microsoft.com/azure/devops/pipelines/languages/java
5 |
6 | trigger:
7 | - master
8 |
9 | pool:
10 |   vmImage: 'ubuntu-latest'
11 |
12 | steps:
13 | - task: Maven@3
14 |   inputs:
15 |     mavenPomFile: 'pom.xml'
16 |     mavenOptions: '-Xmx3072m'
17 |     javaHomeOption: 'JDKVersion'
18 |     jdkVersionOption: '1.8'
19 |     jdkArchitectureOption: 'x64'
20 |     publishJUnitResults: true
21 |     testResultsFiles: '**/surefire-reports/TEST-*.xml'
22 |     goals: 'package'
23 |
--------------------------------------------------------------------------------
/etc/screenshot1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lbehnke/hierarchical-clustering-java/266bdb63aa926d49e14f9083cf4f78987f47175d/etc/screenshot1.png
--------------------------------------------------------------------------------
/etc/screenshot2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lbehnke/hierarchical-clustering-java/266bdb63aa926d49e14f9083cf4f78987f47175d/etc/screenshot2.png
--------------------------------------------------------------------------------
/pom.xml:
--------------------------------------------------------------------------------
1 |
2 | 4.0.0
3 | com.apporiented
4 | hierarchical-clustering
5 | 1.2.1-SNAPSHOT
6 | Agglomerative hierarchical clustering
7 | Agglomerative hierarchical clustering analysis and visualization implemented in Java
8 | 2011
9 | https://github.com/lbehnke/hierarchical-clustering-java
10 |
11 |
12 | lbehnke
13 | Lars Behnke
14 | https://github.com/lbehnke
15 |
16 | developer
17 | administrator
18 |
19 |
20 |
21 |
22 |
23 |
24 | Michel Daviot
25 | https://github.com/tyrcho
26 |
27 | developer
28 |
29 |
30 |
31 | Robert Macaulay
32 |
33 | developer
34 |
35 |
36 |
37 | Alexandre Masselot
38 |
39 | developer
40 |
41 |
42 |
43 |
44 |
45 | github
46 | UTF-8
47 |
48 |
49 |
50 |
51 | Apache License 2.0
52 | http://www.apache.org/licenses/LICENSE-2.0.html
53 | A business-friendly OSS license
54 |
55 |
56 |
57 |
58 | https://github.com/lbehnke/hierarchical-clustering-java
59 | scm:git:git://github.com/lbehnke/hierarchical-clustering-java.git
60 | scm:git:ssh://git@github.com/lbehnke/hierarchical-clustering-java.git
61 | HEAD
62 |
63 |
64 |
65 | Lars Behnke
66 | https://github.com/lbehnke
67 |
68 |
69 |
70 |
71 | install
72 |
73 |
74 |
75 | maven-compiler-plugin
76 | 3.1
77 |
78 | 1.6
79 | 1.6
80 |
81 |
82 |
83 |
84 | maven-jar-plugin
85 | 2.5
86 |
87 |
88 |
89 | true
90 | true
91 |
92 | com.apporiented.algorithm.clustering.visualization.DendrogramPanel
93 |
94 |
95 |
96 |
97 |
98 |
99 |
100 | maven-source-plugin
101 | 2.3
102 |
103 |
104 |
105 | jar
106 |
107 |
108 |
109 |
110 |
111 |
112 | maven-surefire-plugin
113 | 2.17
114 |
115 |
116 | **/*Test.java
117 |
118 |
119 |
120 |
121 |
122 | maven-release-plugin
123 | 2.5.1
124 |
125 | v@{project.version}
126 |
127 |
128 |
129 |
130 |
131 |
132 |
133 | junit
134 | junit
135 | 4.13.1
136 | test
137 |
138 |
139 | commons-io
140 | commons-io
141 | 2.5
142 | test
143 |
144 |
145 |
146 |
147 |
148 |
149 | sonatype-nexus-snapshots
150 | Sonatype Snapshot Repository
151 | https://oss.sonatype.org/content/repositories/snapshots
152 |
153 |
154 | sonatype-nexus-staging
155 | Sonatype Staging Repository
156 | https://oss.sonatype.org/service/local/staging/deploy/maven2
157 |
158 |
159 |
160 |
161 |
162 | pgp
163 |
164 |
165 |
166 | org.apache.maven.plugins
167 | maven-gpg-plugin
168 | 1.5
169 |
170 |
171 | sign-artifacts
172 | verify
173 |
174 | sign
175 |
176 |
177 |
178 |
179 |
180 |
181 |
182 |
183 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/AverageLinkageStrategy.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.Collection;
20 |
21 | // TODO Not working correctly, fix
22 | public class AverageLinkageStrategy implements LinkageStrategy {
23 |
24 | @Override
25 | public Distance calculateDistance(Collection<Distance> distances) {
26 | double sum = 0;
27 | double result;
28 |
29 | for (Distance dist : distances) {
30 | sum += dist.getDistance();
31 | }
32 | if (distances.size() > 0) {
33 | result = sum / distances.size();
34 | } else {
35 | result = 0.0;
36 | }
37 | return new Distance(result);
38 | }
39 | }
40 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/Cluster.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.ArrayList;
20 | import java.util.List;
21 |
22 | public class Cluster
23 | {
24 |
25 | private String name;
26 |
27 | private Cluster parent;
28 |
29 | private List<Cluster> children;
30 |
31 | private List<String> leafNames;
32 |
33 | private Distance distance = new Distance();
34 |
35 |
36 | public Cluster(String name)
37 | {
38 | this.name = name;
39 | leafNames = new ArrayList<String>();
40 | }
41 |
42 | public Distance getDistance()
43 | {
44 | return distance;
45 | }
46 |
47 | public Double getWeightValue()
48 | {
49 | return distance.getWeight();
50 | }
51 |
52 | public Double getDistanceValue()
53 | {
54 | return distance.getDistance();
55 | }
56 |
57 | public void setDistance(Distance distance)
58 | {
59 | this.distance = distance;
60 | }
61 |
62 | public List<Cluster> getChildren()
63 | {
64 | if (children == null)
65 | {
66 | children = new ArrayList<Cluster>();
67 | }
68 |
69 | return children;
70 | }
71 |
72 | public void addLeafName(String lname)
73 | {
74 | leafNames.add(lname);
75 | }
76 |
77 | public void appendLeafNames(List<String> lnames)
78 | {
79 | leafNames.addAll(lnames);
80 | }
81 |
82 | public List<String> getLeafNames()
83 | {
84 | return leafNames;
85 | }
86 |
87 | public void setChildren(List<Cluster> children)
88 | {
89 | this.children = children;
90 | }
91 |
92 | public Cluster getParent()
93 | {
94 | return parent;
95 | }
96 |
97 | public void setParent(Cluster parent)
98 | {
99 | this.parent = parent;
100 | }
101 |
102 |
103 | public String getName()
104 | {
105 | return name;
106 | }
107 |
108 | public void setName(String name)
109 | {
110 | this.name = name;
111 | }
112 |
113 | public void addChild(Cluster cluster)
114 | {
115 | getChildren().add(cluster);
116 |
117 | }
118 |
119 | public boolean contains(Cluster cluster)
120 | {
121 | return getChildren().contains(cluster);
122 | }
123 |
124 | @Override
125 | public String toString()
126 | {
127 | return "Cluster " + name;
128 | }
129 |
130 | @Override
131 | public boolean equals(Object obj)
132 | {
133 | if (this == obj)
134 | {
135 | return true;
136 | }
137 | if (obj == null)
138 | {
139 | return false;
140 | }
141 | if (getClass() != obj.getClass())
142 | {
143 | return false;
144 | }
145 | Cluster other = (Cluster) obj;
146 | if (name == null)
147 | {
148 | if (other.name != null)
149 | {
150 | return false;
151 | }
152 | } else if (!name.equals(other.name))
153 | {
154 | return false;
155 | }
156 | return true;
157 | }
158 |
159 | @Override
160 | public int hashCode()
161 | {
162 | return (name == null) ? 0 : name.hashCode();
163 | }
164 |
165 | public boolean isLeaf()
166 | {
167 | return getChildren().size() == 0;
168 | }
169 |
170 | public int countLeafs()
171 | {
172 | return countLeafs(this, 0);
173 | }
174 |
175 | public int countLeafs(Cluster node, int count)
176 | {
177 | if (node.isLeaf()) count++;
178 | for (Cluster child : node.getChildren())
179 | {
180 | count += child.countLeafs();
181 | }
182 | return count;
183 | }
184 |
185 | public void toConsole(int indent)
186 | {
187 | for (int i = 0; i < indent; i++)
188 | {
189 | System.out.print(" ");
190 |
191 | }
192 | String name = getName() + (isLeaf() ? " (leaf)" : "") + (distance != null ? " distance: " + distance : "");
193 | System.out.println(name);
194 | for (Cluster child : getChildren())
195 | {
196 | child.toConsole(indent + 1);
197 | }
198 | }
199 |
200 | public String toNewickString(int indent)
201 | {
202 | String cdtString = "";
203 | if(!isLeaf()) cdtString+="(";
204 |
205 | for (int i = 0; i < indent; i++) cdtString+=" ";
206 |
207 |
208 | if(isLeaf()) {
209 | cdtString+=getName();
210 | }
211 |
212 | List<Cluster> children = getChildren();
213 |
214 | boolean firstChild = true;
215 | for (Cluster child : children)
216 | {
217 | cdtString+=child.toNewickString(indent);
218 | String distanceString = distance.getDistance().toString().replace(",", ".");
219 | String weightString = distance.getWeight().toString().replace(",", ".");
220 | if(firstChild) cdtString+=":"+distanceString+",";
221 | else cdtString+=":"+weightString;
222 |
223 | firstChild=false;
224 | }
225 |
226 | for (int i = 0; i < indent; i++)
227 | cdtString += " ";
228 |
229 | if(!isLeaf()) cdtString+=")";
230 |
231 | return cdtString;
232 | }
233 |
234 | public double getTotalDistance()
235 | {
236 | Double dist = getDistance() == null ? 0 : getDistance().getDistance();
237 | if (getChildren().size() > 0)
238 | {
239 | dist += children.get(0).getTotalDistance();
240 | }
241 | return dist;
242 |
243 | }
244 |
245 | }
246 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/ClusterPair.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | public class ClusterPair implements Comparable<ClusterPair> {
20 |
21 | private Cluster lCluster;
22 | private Cluster rCluster;
23 | private Double linkageDistance;
24 |
25 | public ClusterPair(){
26 | }
27 |
28 | public ClusterPair(Cluster left, Cluster right, Double distance) {
29 | lCluster = left;
30 | rCluster = right;
31 | linkageDistance = distance;
32 | }
33 |
34 | public Cluster getOtherCluster(Cluster c) {
35 | return lCluster == c ? rCluster : lCluster;
36 | }
37 |
38 | public Cluster getlCluster() {
39 | return lCluster;
40 | }
41 |
42 | public void setlCluster(Cluster lCluster) {
43 | this.lCluster = lCluster;
44 | }
45 |
46 | public Cluster getrCluster() {
47 | return rCluster;
48 | }
49 |
50 | public void setrCluster(Cluster rCluster) {
51 | this.rCluster = rCluster;
52 | }
53 |
54 | public Double getLinkageDistance() {
55 | return linkageDistance;
56 | }
57 |
58 | public void setLinkageDistance(Double distance) {
59 | this.linkageDistance = distance;
60 | }
61 |
62 | /**
63 | * @return a new ClusterPair with the two left/right inverted
64 | */
65 | public ClusterPair reverse() {
66 | return new ClusterPair(getrCluster(), getlCluster(), getLinkageDistance());
67 | }
68 |
69 |
70 |
71 | @Override
72 | public int compareTo(ClusterPair o) {
73 | int result;
74 | if (o == null || o.getLinkageDistance() == null) {
75 | result = -1;
76 | } else if (getLinkageDistance() == null) {
77 | result = 1;
78 | } else {
79 | result = getLinkageDistance().compareTo(o.getLinkageDistance());
80 | }
81 |
82 | return result;
83 | }
84 |
85 | public Cluster agglomerate(int clusterIdx) {
86 | return agglomerate("clstr#" + clusterIdx);
87 | }
88 |
89 | public Cluster agglomerate(String name) {
90 | Cluster cluster = new Cluster(name);
91 | cluster.setDistance(new Distance(getLinkageDistance()));
92 | //New clusters will track their children's leaf names; i.e. each cluster knows what part of the original data it contains
93 | cluster.appendLeafNames(lCluster.getLeafNames());
94 | cluster.appendLeafNames(rCluster.getLeafNames());
95 | cluster.addChild(lCluster);
96 | cluster.addChild(rCluster);
97 | lCluster.setParent(cluster);
98 | rCluster.setParent(cluster);
99 |
100 | Double lWeight = lCluster.getWeightValue();
101 | Double rWeight = rCluster.getWeightValue();
102 | double weight = lWeight + rWeight;
103 | cluster.getDistance().setWeight(weight);
104 |
105 | return cluster;
106 | }
107 |
108 | @Override
109 | public String toString() {
110 | StringBuilder sb = new StringBuilder();
111 | if (lCluster != null) {
112 | sb.append(lCluster.getName());
113 | }
114 | if (rCluster != null) {
115 | if (sb.length() > 0) {
116 | sb.append(" + ");
117 | }
118 | sb.append(rCluster.getName());
119 | }
120 | sb.append(" : ").append(linkageDistance);
121 | return sb.toString();
122 | }
123 |
124 | }
125 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/ClusteringAlgorithm.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.List;
20 |
21 | public interface ClusteringAlgorithm
22 | {
23 |
24 | public Cluster performClustering(double[][] distances, String[] clusterNames,
25 | LinkageStrategy linkageStrategy);
26 |
27 | public Cluster performWeightedClustering(double[][] distances, String[] clusterNames,
28 | double[] weights, LinkageStrategy linkageStrategy);
29 |
30 | public List<Cluster> performFlatClustering(double[][] distances,
31 | String[] clusterNames, LinkageStrategy linkageStrategy, Double threshold);
32 | }
33 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/CompleteLinkageStrategy.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.Collection;
20 |
21 | public class CompleteLinkageStrategy implements LinkageStrategy {
22 |
23 | @Override
24 | public Distance calculateDistance(Collection<Distance> distances) {
25 | double max = Double.NaN;
26 |
27 | for (Distance dist : distances) {
28 | if (Double.isNaN(max) || dist.getDistance() > max)
29 | max = dist.getDistance();
30 | }
31 | return new Distance(max);
32 | }
33 | }
34 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/DefaultClusteringAlgorithm.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.*;
20 |
21 | public class DefaultClusteringAlgorithm implements ClusteringAlgorithm
22 | {
23 |
24 | @Override
25 | public Cluster performClustering(double[][] distances,
26 | String[] clusterNames, LinkageStrategy linkageStrategy)
27 | {
28 |
29 | checkArguments(distances, clusterNames, linkageStrategy);
30 | /* Setup model */
31 | List<Cluster> clusters = createClusters(clusterNames);
32 | DistanceMap linkages = createLinkages(distances, clusters);
33 |
34 | /* Process */
35 | HierarchyBuilder builder = new HierarchyBuilder(clusters, linkages);
36 | while (!builder.isTreeComplete())
37 | {
38 | builder.agglomerate(linkageStrategy);
39 | }
40 |
41 | return builder.getRootCluster();
42 | }
43 |
44 | @Override
45 | public List<Cluster> performFlatClustering(double[][] distances,
46 | String[] clusterNames, LinkageStrategy linkageStrategy, Double threshold)
47 | {
48 |
49 | checkArguments(distances, clusterNames, linkageStrategy);
50 | /* Setup model */
51 | List<Cluster> clusters = createClusters(clusterNames);
52 | DistanceMap linkages = createLinkages(distances, clusters);
53 |
54 | /* Process */
55 | HierarchyBuilder builder = new HierarchyBuilder(clusters, linkages);
56 | return builder.flatAgg(linkageStrategy, threshold);
57 | }
58 |
59 | private void checkArguments(double[][] distances, String[] clusterNames,
60 | LinkageStrategy linkageStrategy)
61 | {
62 | if (distances == null || distances.length == 0
63 | || distances[0].length != distances.length)
64 | {
65 | throw new IllegalArgumentException("Invalid distance matrix");
66 | }
67 | if (distances.length != clusterNames.length)
68 | {
69 | throw new IllegalArgumentException("Invalid cluster name array");
70 | }
71 | if (linkageStrategy == null)
72 | {
73 | throw new IllegalArgumentException("Undefined linkage strategy");
74 | }
75 | int uniqueCount = new HashSet<String>(Arrays.asList(clusterNames)).size();
76 | if (uniqueCount != clusterNames.length)
77 | {
78 | throw new IllegalArgumentException("Duplicate names");
79 | }
80 | }
81 |
82 | @Override
83 | public Cluster performWeightedClustering(double[][] distances, String[] clusterNames,
84 | double[] weights, LinkageStrategy linkageStrategy)
85 | {
86 |
87 | checkArguments(distances, clusterNames, linkageStrategy);
88 |
89 | if (weights.length != clusterNames.length)
90 | {
91 | throw new IllegalArgumentException("Invalid weights array");
92 | }
93 |
94 | /* Setup model */
95 | List<Cluster> clusters = createClusters(clusterNames, weights);
96 | DistanceMap linkages = createLinkages(distances, clusters);
97 |
98 | /* Process */
99 | HierarchyBuilder builder = new HierarchyBuilder(clusters, linkages);
100 | while (!builder.isTreeComplete())
101 | {
102 | builder.agglomerate(linkageStrategy);
103 | }
104 |
105 | return builder.getRootCluster();
106 | }
107 |
108 | private DistanceMap createLinkages(double[][] distances,
109 | List<Cluster> clusters)
110 | {
111 | DistanceMap linkages = new DistanceMap();
112 | for (int col = 0; col < clusters.size(); col++)
113 | {
114 | for (int row = col + 1; row < clusters.size(); row++)
115 | {
116 | ClusterPair link = new ClusterPair();
117 | Cluster lCluster = clusters.get(col);
118 | Cluster rCluster = clusters.get(row);
119 | link.setLinkageDistance(distances[col][row]);
120 | link.setlCluster(lCluster);
121 | link.setrCluster(rCluster);
122 | linkages.add(link);
123 | }
124 | }
125 | return linkages;
126 | }
127 |
128 | private List<Cluster> createClusters(String[] clusterNames)
129 | {
130 | List<Cluster> clusters = new ArrayList<Cluster>();
131 | for (String clusterName : clusterNames)
132 | {
133 | Cluster cluster = new Cluster(clusterName);
134 | cluster.addLeafName(clusterName);
135 | clusters.add(cluster);
136 | }
137 | return clusters;
138 | }
139 |
140 | private List<Cluster> createClusters(String[] clusterNames, double[] weights)
141 | {
142 | List<Cluster> clusters = new ArrayList<Cluster>();
143 | for (int i = 0; i < weights.length; i++)
144 | {
145 | Cluster cluster = new Cluster(clusterNames[i]);
146 | cluster.setDistance(new Distance(0.0, weights[i]));
147 | clusters.add(cluster);
148 | }
149 | return clusters;
150 | }
151 |
152 | }
153 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/Distance.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2015 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | public class Distance implements Comparable<Distance>, Cloneable {
20 |
21 | private Double distance;
22 | private Double weight;
23 |
24 | public Distance() {
25 | this(0.0);
26 | }
27 |
28 | public Distance(Double distance) {
29 | this(distance, 1.0);
30 | }
31 |
32 | public Distance(Double distance, Double weight) {
33 | this.distance = distance;
34 | this.weight = weight;
35 | }
36 |
37 | public Double getDistance() {
38 | return distance;
39 | }
40 |
41 | public void setDistance(Double distance) {
42 | this.distance = distance;
43 | }
44 |
45 | public Double getWeight() {
46 | return weight;
47 | }
48 |
49 | public void setWeight(Double weight) {
50 | this.weight = weight;
51 | }
52 |
53 | public boolean isNaN() {
54 | return distance == null || distance.isNaN();
55 | }
56 |
57 | @Override
58 | public int compareTo(Distance distance) {
59 | return distance == null ? 1 : getDistance().compareTo(distance.getDistance());
60 | }
61 |
62 | @Override
63 | public String toString() {
64 | return String.format("distance : %.2f, weight : %.2f", distance, weight);
65 | }
66 | }
67 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/DistanceMap.java:
--------------------------------------------------------------------------------
1 | package com.apporiented.algorithm.clustering;
2 |
3 | import java.util.*;
4 |
5 | /**
6 | * Container for linkages
7 | * with the minimal methods needed in the package
8 | * Created by Alexandre Masselot on 7/18/14.
9 | */
10 | public class DistanceMap {
11 |
12 | private Map<String, Item> pairHash;
13 | private PriorityQueue<Item> data;
14 |
15 | private class Item implements Comparable<Item> {
16 | final ClusterPair pair;
17 | final String hash;
18 | boolean removed = false;
19 |
20 | Item(ClusterPair p) {
21 | pair = p;
22 | hash = hashCodePair(p);
23 | }
24 |
25 | @Override
26 | public int compareTo(Item o) {
27 | return pair.compareTo(o.pair);
28 | }
29 |
30 | @Override
31 | public String toString() {
32 | return hash;
33 | }
34 | }
35 |
36 | public DistanceMap() {
37 | data = new PriorityQueue<Item>();
38 | pairHash = new HashMap<String, Item>();
39 | }
40 |
41 | public List<ClusterPair> list() {
42 | List<ClusterPair> l = new ArrayList<ClusterPair>(data.size());
43 | for (Item clusterPair : data) {
44 | l.add(clusterPair.pair);
45 | }
46 | return l;
47 | }
48 |
49 | public ClusterPair findByCodePair(Cluster c1, Cluster c2) {
50 | Item item = pairHash.get(hashCodePair(c1, c2));
51 | return item == null ? null : item.pair; // callers check for null when no link exists
52 | }
53 |
54 | public ClusterPair removeFirst() {
55 | Item poll = data.poll();
56 | while (poll != null && poll.removed) {
57 | poll = data.poll();
58 | }
59 | if (poll == null) {
60 | return null;
61 | }
62 | ClusterPair link = poll.pair;
63 | pairHash.remove(poll.hash);
64 | return link;
65 | }
66 |
67 | public boolean remove(ClusterPair link) {
68 | Item remove = pairHash.remove(hashCodePair(link));
69 | if (remove == null) {
70 | return false;
71 | }
72 | remove.removed = true;
73 | // data.remove(remove); // bottleneck
74 | return true;
75 | }
76 |
77 |
78 | public boolean add(ClusterPair link) {
79 | Item e = new Item(link);
80 | Item existingItem = pairHash.get(e.hash);
81 | if (existingItem != null) {
82 | System.err.println("hashCode = " + existingItem.hash +
83 | " adding redundant link:" + link + " (exist:" + existingItem + ")");
84 | return false;
85 | } else {
86 | pairHash.put(e.hash, e);
87 | data.add(e);
88 | return true;
89 | }
90 | }
91 |
92 | /**
93 | * Peek at the minimum distance.
94 | * @return the smallest linkage distance, or null if no link is left
95 | */
96 | public Double minDist()
97 | {
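98 | // Discard lazily removed entries so that the head is a live pair.
99 | while (data.peek() != null && data.peek().removed)
100 | data.poll();
101 | Item peek = data.peek();
102 | return peek == null ? null : peek.pair.getLinkageDistance();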
103 | }
104 |
105 | /**
106 | * Compute some kind of unique ID for a given cluster pair.
107 | * @return The ID
108 | */
109 | private String hashCodePair(ClusterPair link) {
110 | return hashCodePair(link.getlCluster(), link.getrCluster());
111 | }
112 |
113 | private String hashCodePair(Cluster lCluster, Cluster rCluster) {
114 | return hashCodePairNames(lCluster.getName(), rCluster.getName());
115 | }
116 |
117 | private String hashCodePairNames(String lName, String rName) {
118 | if (lName.compareTo(rName) < 0) {
119 | return lName + "~~~" + rName;//getlCluster().hashCode() + 31 * (getrCluster().hashCode());
120 | } else {
121 | return rName + "~~~" + lName;//return getrCluster().hashCode() + 31 * (getlCluster().hashCode());
122 | }
123 | }
124 |
125 | @Override
126 | public String toString() {
127 | return data.toString();
128 | }
129 | }
130 |
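The lazy-deletion idea used by DistanceMap above (flag an entry as removed and skip it when polling, instead of paying for PriorityQueue.remove) can be sketched in isolation. This is a minimal illustration, not the library's code: the class LazyDeletionQueueSketch, its Entry type, and the String keys are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-alone sketch of a min-queue with lazy deletion.
final class LazyDeletionQueueSketch {

    private static final class Entry implements Comparable<Entry> {
        final String key;
        final double priority;
        boolean removed; // set instead of removing the entry from the heap

        Entry(String key, double priority) {
            this.key = key;
            this.priority = priority;
        }

        @Override
        public int compareTo(Entry other) {
            return Double.compare(priority, other.priority);
        }
    }

    private final PriorityQueue<Entry> heap = new PriorityQueue<>();
    private final Map<String, Entry> byKey = new HashMap<>();

    // Adds a key; returns false if a live entry with that key already exists.
    boolean add(String key, double priority) {
        if (byKey.containsKey(key)) {
            return false;
        }
        Entry entry = new Entry(key, priority);
        byKey.put(key, entry);
        heap.add(entry);
        return true;
    }

    // Constant-time removal: the entry stays in the heap but is flagged.
    boolean remove(String key) {
        Entry entry = byKey.remove(key);
        if (entry == null) {
            return false;
        }
        entry.removed = true;
        return true;
    }

    // Polls the smallest live key, silently dropping flagged entries.
    String pollMin() {
        Entry entry = heap.poll();
        while (entry != null && entry.removed) {
            entry = heap.poll();
        }
        if (entry == null) {
            return null;
        }
        byKey.remove(entry.key);
        return entry.key;
    }

    public static void main(String[] args) {
        LazyDeletionQueueSketch queue = new LazyDeletionQueueSketch();
        queue.add("O1~~~O2", 3.0);
        queue.add("O1~~~O3", 1.0);
        queue.add("O2~~~O3", 2.0);
        queue.remove("O1~~~O3");
        System.out.println(queue.pollMin());
    }
}
```

The trade-off is that flagged entries keep occupying heap space until they reach the head, so any method that inspects the head has to skip them as well.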
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/HierarchyBuilder.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.ArrayList;
20 | import java.util.Collection;
21 | import java.util.List;
22 |
23 | public class HierarchyBuilder {
24 |
25 | private DistanceMap distances;
26 | private List<Cluster> clusters;
27 | private int globalClusterIndex = 0;
28 |
29 | public DistanceMap getDistances() {
30 | return distances;
31 | }
32 |
33 | public List<Cluster> getClusters() {
34 | return clusters;
35 | }
36 |
37 | public HierarchyBuilder(List<Cluster> clusters, DistanceMap distances) {
38 | this.clusters = clusters;
39 | this.distances = distances;
40 | }
41 |
42 | /**
43 | * Returns flattened clusters, i.e. clusters that are at least a given threshold apart.
44 | * @param linkageStrategy strategy used to recompute distances after each merge
45 | * @param threshold largest linkage distance at which two clusters are still merged
46 | * @return flat list of clusters
47 | */
48 | public List<Cluster> flatAgg(LinkageStrategy linkageStrategy, Double threshold)
49 | {
50 | while((!isTreeComplete()) && (distances.minDist() != null) && (distances.minDist() <= threshold))
51 | {
52 | //System.out.println("Cluster Distances: " + distances.toString());
53 | //System.out.println("Cluster Size: " + clusters.size());
54 | agglomerate(linkageStrategy);
55 | }
56 |
57 | //System.out.println("Final MinDistance: " + distances.minDist());
58 | //System.out.println("Tree complete: " + isTreeComplete());
59 | return clusters;
60 | }
61 |
62 | public void agglomerate(LinkageStrategy linkageStrategy) {
63 | ClusterPair minDistLink = distances.removeFirst();
64 | if (minDistLink != null) {
65 | clusters.remove(minDistLink.getrCluster());
66 | clusters.remove(minDistLink.getlCluster());
67 |
68 | Cluster oldClusterL = minDistLink.getlCluster();
69 | Cluster oldClusterR = minDistLink.getrCluster();
70 | Cluster newCluster = minDistLink.agglomerate(++globalClusterIndex);
71 |
72 | for (Cluster iClust : clusters) {
73 | ClusterPair link1 = findByClusters(iClust, oldClusterL);
74 | ClusterPair link2 = findByClusters(iClust, oldClusterR);
75 | ClusterPair newLinkage = new ClusterPair();
76 | newLinkage.setlCluster(iClust);
77 | newLinkage.setrCluster(newCluster);
78 | Collection<Distance> distanceValues = new ArrayList<Distance>();
79 |
80 | if (link1 != null) {
81 | Double distVal = link1.getLinkageDistance();
82 | Double weightVal = link1.getOtherCluster(iClust).getWeightValue();
83 | distanceValues.add(new Distance(distVal, weightVal));
84 | distances.remove(link1);
85 | }
86 | if (link2 != null) {
87 | Double distVal = link2.getLinkageDistance();
88 | Double weightVal = link2.getOtherCluster(iClust).getWeightValue();
89 | distanceValues.add(new Distance(distVal, weightVal));
90 | distances.remove(link2);
91 | }
92 |
93 | Distance newDistance = linkageStrategy.calculateDistance(distanceValues);
94 |
95 | newLinkage.setLinkageDistance(newDistance.getDistance());
96 | distances.add(newLinkage);
97 | }
98 | clusters.add(newCluster);
99 | }
100 | }
101 |
102 | private ClusterPair findByClusters(Cluster c1, Cluster c2) {
103 | return distances.findByCodePair(c1, c2);
104 | }
105 |
106 | public boolean isTreeComplete() {
107 | return clusters.size() == 1;
108 | }
109 |
110 | public Cluster getRootCluster() {
111 | if (!isTreeComplete()) {
112 | throw new RuntimeException("No root available");
113 | }
114 | return clusters.get(0);
115 | }
116 |
117 | }
118 |
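The threshold-bounded merge loop in flatAgg above can be illustrated without the library's classes. The following is a deliberately naive sketch under stated assumptions: it works on a full square distance matrix, hard-codes complete linkage, and rescans all cluster pairs on every step instead of using a priority queue. The class FlatAgglomerationSketch and its methods are hypothetical names, not part of the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-alone sketch of threshold-bounded agglomeration.
final class FlatAgglomerationSketch {

    // Complete linkage: the largest pairwise distance between two groups.
    static double completeLinkage(double[][] d, TreeSet<Integer> a, TreeSet<Integer> b) {
        double max = Double.NEGATIVE_INFINITY;
        for (int i : a) {
            for (int j : b) {
                max = Math.max(max, d[i][j]);
            }
        }
        return max;
    }

    // Merges the closest pair repeatedly while its distance is <= threshold.
    static List<TreeSet<Integer>> flatClusters(double[][] d, double threshold) {
        List<TreeSet<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < d.length; i++) {
            TreeSet<Integer> single = new TreeSet<>();
            single.add(i);
            clusters.add(single);
        }
        while (clusters.size() > 1) {
            int bestA = -1;
            int bestB = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++) {
                for (int b = a + 1; b < clusters.size(); b++) {
                    double dist = completeLinkage(d, clusters.get(a), clusters.get(b));
                    if (dist < best) {
                        best = dist;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            if (best > threshold) {
                break; // the closest pair is already too far apart
            }
            // Remove the higher index first so the lower index stays valid.
            TreeSet<Integer> merged = clusters.remove(bestB);
            merged.addAll(clusters.remove(bestA));
            clusters.add(merged);
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[][] d = {
            {0, 1, 9, 7},
            {1, 0, 4, 3},
            {9, 4, 0, 2},
            {7, 3, 2, 0}
        };
        System.out.println(flatClusters(d, 2.5));
    }
}
```

With the sample matrix and a threshold of 2.5, items 0 and 1 merge at distance 1 and items 2 and 3 merge at distance 2; the remaining two clusters are 9 apart under complete linkage, so the loop stops.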
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/LinkageStrategy.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.Collection;
20 |
21 | public interface LinkageStrategy {
22 |
23 | public Distance calculateDistance(Collection<Distance> distances);
24 | }
25 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/PDistClusteringAlgorithm.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.ArrayList;
20 | import java.util.List;
21 |
22 | public class PDistClusteringAlgorithm implements ClusteringAlgorithm {
23 |
24 | @Override
25 | public Cluster performClustering(double[][] distances,
26 | String[] clusterNames, LinkageStrategy linkageStrategy) {
27 |
28 | /* Argument checks */
29 | if (distances == null || distances.length == 0) {
30 | throw new IllegalArgumentException("Invalid distance matrix");
31 | }
32 | if (distances[0].length != clusterNames.length
33 | * (clusterNames.length - 1) / 2) {
34 | throw new IllegalArgumentException("Invalid cluster name array");
35 | }
36 | if (linkageStrategy == null) {
37 | throw new IllegalArgumentException("Undefined linkage strategy");
38 | }
39 |
40 | /* Setup model */
41 | List<Cluster> clusters = createClusters(clusterNames);
42 | DistanceMap linkages = createLinkages(distances, clusters);
43 |
44 | /* Process */
45 | HierarchyBuilder builder = new HierarchyBuilder(clusters, linkages);
46 | while (!builder.isTreeComplete()) {
47 | builder.agglomerate(linkageStrategy);
48 | }
49 |
50 | return builder.getRootCluster();
51 | }
52 |
53 | @Override
54 | public List<Cluster> performFlatClustering(double[][] distances,
55 | String[] clusterNames, LinkageStrategy linkageStrategy, Double threshold) {
56 |
57 | /* Argument checks */
58 | if (distances == null || distances.length == 0) {
59 | throw new IllegalArgumentException("Invalid distance matrix");
60 | }
61 | if (distances[0].length != clusterNames.length
62 | * (clusterNames.length - 1) / 2) {
63 | throw new IllegalArgumentException("Invalid cluster name array");
64 | }
65 | if (linkageStrategy == null) {
66 | throw new IllegalArgumentException("Undefined linkage strategy");
67 | }
68 |
69 | /* Setup model */
70 | List<Cluster> clusters = createClusters(clusterNames);
71 | DistanceMap linkages = createLinkages(distances, clusters);
72 |
73 | /* Process */
74 | HierarchyBuilder builder = new HierarchyBuilder(clusters, linkages);
75 | return builder.flatAgg(linkageStrategy, threshold);
76 | }
77 |
78 | @Override
79 | public Cluster performWeightedClustering(double[][] distances, String[] clusterNames,
80 | double[] weights, LinkageStrategy linkageStrategy) {
81 | return performClustering(distances, clusterNames, linkageStrategy);
82 | }
83 |
84 | private DistanceMap createLinkages(double[][] distances,
85 | List<Cluster> clusters) {
86 | DistanceMap linkages = new DistanceMap();
87 | for (int col = 0; col < clusters.size(); col++) {
88 | Cluster cluster_col = clusters.get(col);
89 | for (int row = col + 1; row < clusters.size(); row++) {
90 | ClusterPair link = new ClusterPair();
91 | Double d = distances[0][accessFunction(row, col,
92 | clusters.size())];
93 | link.setLinkageDistance(d);
94 | link.setlCluster(cluster_col);
95 | link.setrCluster(clusters.get(row));
96 | linkages.add(link);
97 | }
98 | }
99 | return linkages;
100 | }
101 |
102 | private List<Cluster> createClusters(String[] clusterNames) {
103 | List<Cluster> clusters = new ArrayList<Cluster>();
104 | for (String clusterName : clusterNames) {
105 | Cluster cluster = new Cluster(clusterName);
106 | cluster.addLeafName(clusterName);
107 | clusters.add(cluster);
108 | }
109 | return clusters;
110 | }
111 |
112 | // Credit to this function goes to
113 | // http://stackoverflow.com/questions/13079563/how-does-condensed-distance-matrix-work-pdist
114 | private static int accessFunction(int i, int j, int n) {
115 | return n * j - j * (j + 1) / 2 + i - 1 - j;
116 | }
117 |
118 | }
119 |
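The index arithmetic in accessFunction above maps a pair (row i, column j) with j < i to a position in the condensed (upper-triangle, row-by-row) distance vector. The sketch below reuses the same formula to expand a condensed vector into a full symmetric matrix, which makes the layout easy to inspect. The class CondensedIndexSketch and the helper toSquare are hypothetical names introduced for illustration only.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of condensed distance matrix indexing.
final class CondensedIndexSketch {

    // Same arithmetic as accessFunction; assumes 0 <= j < i < n.
    static int condensedIndex(int i, int j, int n) {
        return n * j - j * (j + 1) / 2 + i - 1 - j;
    }

    // Expands a condensed vector of length n*(n-1)/2 into an n x n matrix.
    static double[][] toSquare(double[] condensed, int n) {
        if (condensed.length != n * (n - 1) / 2) {
            throw new IllegalArgumentException("Expected " + (n * (n - 1) / 2) + " values");
        }
        double[][] square = new double[n][n];
        for (int j = 0; j < n; j++) {
            for (int i = j + 1; i < n; i++) {
                double d = condensed[condensedIndex(i, j, n)];
                square[i][j] = d;
                square[j][i] = d;
            }
        }
        return square;
    }

    public static void main(String[] args) {
        // Pairs in order: (0,1) (0,2) (0,3) (1,2) (1,3) (2,3)
        double[] condensed = {1, 9, 7, 4, 3, 2};
        for (double[] row : toSquare(condensed, 4)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For n = 4 the six pairs map to indices 0 through 5 in the order listed in the comment, so the value for the pair (1, 2) is found at index 3.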
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/SingleLinkageStrategy.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.Collection;
20 |
21 | public class SingleLinkageStrategy implements LinkageStrategy {
22 |
23 | @Override
24 | public Distance calculateDistance(Collection<Distance> distances) {
25 | double min = Double.NaN;
26 |
27 | for (Distance dist : distances) {
28 | if (Double.isNaN(min) || dist.getDistance() < min)
29 | min = dist.getDistance();
30 | }
31 | return new Distance(min);
32 | }
33 | }
34 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/WeightedLinkageStrategy.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import java.util.Collection;
20 |
21 | public class WeightedLinkageStrategy implements LinkageStrategy {
22 |
23 | @Override
24 | public Distance calculateDistance(Collection<Distance> distances) {
25 | double sum = 0;
26 | double weightTotal = 0;
27 | for (Distance distance : distances) {
28 | weightTotal += distance.getWeight();
29 | sum += distance.getDistance() * distance.getWeight();
30 | }
31 |
32 | return new Distance(sum / weightTotal, weightTotal);
33 | }
34 | }
35 |
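The three strategies above differ only in how they reduce the old link distances to one new distance: minimum (single), maximum (complete), or weight-averaged (weighted). The following sketch reproduces that arithmetic on plain arrays so the results can be compared side by side. The class LinkageSketch and its method names are hypothetical and do not use the project's Distance type.

```java
// Hypothetical stand-alone sketch comparing the three reductions.
final class LinkageSketch {

    // Single linkage: smallest distance, NaN for an empty input.
    static double single(double[] distances) {
        double min = Double.NaN;
        for (double d : distances) {
            if (Double.isNaN(min) || d < min) {
                min = d;
            }
        }
        return min;
    }

    // Complete linkage: largest distance, NaN for an empty input.
    static double complete(double[] distances) {
        double max = Double.NaN;
        for (double d : distances) {
            if (Double.isNaN(max) || d > max) {
                max = d;
            }
        }
        return max;
    }

    // Weighted linkage: sum(d * w) / sum(w); NaN if the weights sum to zero.
    static double weighted(double[] distances, double[] weights) {
        double sum = 0;
        double weightTotal = 0;
        for (int i = 0; i < distances.length; i++) {
            weightTotal += weights[i];
            sum += distances[i] * weights[i];
        }
        return sum / weightTotal;
    }

    public static void main(String[] args) {
        double[] distances = {4.0, 10.0};
        double[] weights = {1.0, 3.0};
        System.out.println("single   = " + single(distances));
        System.out.println("complete = " + complete(distances));
        System.out.println("weighted = " + weighted(distances, weights));
    }
}
```

For distances 4 and 10 with weights 1 and 3, the weighted result is (4 * 1 + 10 * 3) / 4 = 8.5, which lies between the single-linkage value 4 and the complete-linkage value 10.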
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/visualization/ClusterComponent.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering.visualization;
18 |
19 | import java.awt.FontMetrics;
20 | import java.awt.Graphics2D;
21 | import java.awt.geom.Rectangle2D;
22 | import java.util.ArrayList;
23 | import java.util.List;
24 |
25 | import com.apporiented.algorithm.clustering.Cluster;
26 |
27 |
28 | public class ClusterComponent implements Paintable {
29 |
30 | private Cluster cluster;
31 | private VCoord linkPoint;
32 | private VCoord initPoint;
33 | private boolean printName;
34 | private int dotRadius = 2;
35 | private int namePadding = 6;
36 |
37 | private List<ClusterComponent> children;
38 |
39 | public List<ClusterComponent> getChildren() {
40 | if (children == null) {
41 | children = new ArrayList<ClusterComponent>();
42 | }
43 | return children;
44 | }
45 |
46 | public int getNamePadding() {
47 | return namePadding;
48 | }
49 |
50 | public void setNamePadding(int namePadding) {
51 | this.namePadding = namePadding;
52 | }
53 |
54 | public int getDotRadius() {
55 | return dotRadius;
56 | }
57 |
58 | public void setDotRadius(int dotRadius) {
59 | this.dotRadius = dotRadius;
60 | }
61 |
62 | public void setChildren(List<ClusterComponent> children) {
63 | this.children = children;
64 | }
65 |
66 | public VCoord getLinkPoint() {
67 | return linkPoint;
68 | }
69 |
70 | public void setLinkPoint(VCoord linkPoint) {
71 | this.linkPoint = linkPoint;
72 | }
73 |
74 | public VCoord getInitPoint() {
75 | return initPoint;
76 | }
77 |
78 | public void setInitPoint(VCoord initPoint) {
79 | this.initPoint = initPoint;
80 | }
81 |
82 | public Cluster getCluster() {
83 | return cluster;
84 | }
85 |
86 | public void setCluster(Cluster cluster) {
87 | this.cluster = cluster;
88 | }
89 |
90 | public boolean isPrintName() {
91 | return printName;
92 | }
93 |
94 | public void setPrintName(boolean printName) {
95 | this.printName = printName;
96 | }
97 |
98 | public ClusterComponent (Cluster cluster, boolean printName, VCoord initPoint) {
99 | this.printName = printName;
100 | this.cluster = cluster;
101 | this.initPoint = initPoint;
102 | this.linkPoint = initPoint;
103 | }
104 |
105 | @Override
106 | public void paint(Graphics2D g, int xDisplayOffset, int yDisplayOffset, double xDisplayFactor, double yDisplayFactor, boolean decorated) {
107 | int x1, y1, x2, y2;
108 | FontMetrics fontMetrics = g.getFontMetrics();
109 | x1 = (int)(initPoint.getX() * xDisplayFactor + xDisplayOffset);
110 | y1 = (int)(initPoint.getY() * yDisplayFactor + yDisplayOffset);
111 | x2 = (int)(linkPoint.getX() * xDisplayFactor + xDisplayOffset);
112 | y2 = y1;
113 | g.fillOval(x1-dotRadius, y1-dotRadius, dotRadius * 2, dotRadius * 2);
114 | g.drawLine(x1, y1, x2, y2);
115 |
116 | if (cluster.isLeaf()) {
117 | g.drawString(cluster.getName(), x1 + namePadding, y1 + (fontMetrics.getHeight()/2) - 2);
118 | }
119 | if (decorated && cluster.getDistance() != null && !cluster.getDistance().isNaN() && cluster.getDistance().getDistance() > 0) {
120 | String s = String.format("%.2f", cluster.getDistance().getDistance());
121 | Rectangle2D rect = fontMetrics.getStringBounds(s, g);
122 | g.drawString(s, x1 - (int)rect.getWidth() , y1 - 2);
123 | }
124 |
125 | x1 = x2;
126 | y1 = y2;
127 | y2 = (int)(linkPoint.getY() * yDisplayFactor + yDisplayOffset);
128 | g.drawLine(x1, y1, x2, y2);
129 |
130 |
131 | for (ClusterComponent child : getChildren()) {
132 | child.paint(g, xDisplayOffset, yDisplayOffset, xDisplayFactor, yDisplayFactor, decorated);
133 | }
134 | }
135 |
136 | public double getRectMinX() {
137 |
138 | // TODO Better use closure / callback here
139 | assert initPoint != null && linkPoint != null;
140 | double val = Math.min(initPoint.getX(), linkPoint.getX());
141 | for (ClusterComponent child : getChildren()) {
142 | val = Math.min(val, child.getRectMinX());
143 | }
144 | return val;
145 | }
146 |
147 | public double getRectMinY() {
148 |
149 | // TODO Better use closure here
150 | assert initPoint != null && linkPoint != null;
151 | double val = Math.min(initPoint.getY(), linkPoint.getY());
152 | for (ClusterComponent child : getChildren()) {
153 | val = Math.min(val, child.getRectMinY());
154 | }
155 | return val;
156 | }
157 |
158 | public double getRectMaxX() {
159 |
160 | // TODO Better use closure here
161 | assert initPoint != null && linkPoint != null;
162 | double val = Math.max(initPoint.getX(), linkPoint.getX());
163 | for (ClusterComponent child : getChildren()) {
164 | val = Math.max(val, child.getRectMaxX());
165 | }
166 | return val;
167 | }
168 |
169 | public double getRectMaxY() {
170 |
171 | // TODO Better use closure here
172 | assert initPoint != null && linkPoint != null;
173 | double val = Math.max(initPoint.getY(), linkPoint.getY());
174 | for (ClusterComponent child : getChildren()) {
175 | val = Math.max(val, child.getRectMaxY());
176 | }
177 | return val;
178 | }
179 |
180 | public int getNameWidth(Graphics2D g, boolean includeNonLeafs) {
181 | int width = 0;
182 | if (includeNonLeafs || cluster.isLeaf()) {
183 | Rectangle2D rect = g.getFontMetrics().getStringBounds(cluster.getName(), g);
184 | width = (int)rect.getWidth();
185 | }
186 | return width;
187 | }
188 |
189 | public int getMaxNameWidth(Graphics2D g, boolean includeNonLeafs) {
190 | int width = getNameWidth(g, includeNonLeafs);
191 | for (ClusterComponent comp : getChildren()) {
192 | int childWidth = comp.getMaxNameWidth(g, includeNonLeafs);
193 | if (childWidth > width) {
194 | width = childWidth;
195 | }
196 | }
197 | return width;
198 | }
199 | }
200 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/visualization/DendrogramFrame.java:
--------------------------------------------------------------------------------
1 | package com.apporiented.algorithm.clustering.visualization;
2 |
3 | import com.apporiented.algorithm.clustering.*;
4 |
5 | import javax.swing.*;
6 | import java.awt.*;
7 |
8 |
9 | public class DendrogramFrame extends JFrame {
10 |
11 | public DendrogramFrame(Cluster cluster) {
12 | setSize(500, 400);
13 | setLocation(100, 200);
14 | setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
15 |
16 | JPanel content = new JPanel();
17 | DendrogramPanel dp = new DendrogramPanel();
18 |
19 | setContentPane(content);
20 | content.setBackground(Color.red);
21 | content.setLayout(new BorderLayout());
22 | content.add(dp, BorderLayout.CENTER);
23 | dp.setBackground(Color.WHITE);
24 | dp.setLineColor(Color.BLACK);
25 | dp.setScaleValueDecimals(0);
26 | dp.setScaleValueInterval(1);
27 | dp.setShowDistances(false);
28 |
29 | dp.setModel(cluster);
30 | setVisible(true);
31 | }
32 |
33 | public static void main(String[] args) {
34 | LinkageStrategy strategy = new CompleteLinkageStrategy();
35 | Frame f1 = new DendrogramFrame(createSampleCluster(strategy));
36 | f1.setSize(500, 400);
37 | f1.setLocation(100, 200);
38 | Frame f2 = new DendrogramFrame(createSampleCluster2(strategy));
39 | f2.setSize(500, 400);
40 | f2.setLocation(600, 200);
41 | }
42 |
43 | private static Cluster createSampleCluster(LinkageStrategy strategy) {
44 | double[][] distances = new double[][] {
45 | { 1, 9, 7, 11, 14, 4, 3, 8, 10, 9, 2, 8, 6, 13, 10 }
46 | };
47 | String[] names = new String[] { "O1", "O2", "O3", "O4", "O5", "O6" };
48 | ClusteringAlgorithm alg = new PDistClusteringAlgorithm();
49 | Cluster cluster = alg.performClustering(distances, names, strategy);
50 | cluster.toConsole(0);
51 | return cluster;
52 | }
53 |
54 | private static Cluster createSampleCluster2(LinkageStrategy strategy) {
55 | double[][] distances = new double[][] {
56 | { 1, 9, 7, 11, 14, 12, 4, 3, 8, 10, 12, 9, 2, 8, 9, 6, 13, 11, 10, 7, 2}
57 | };
58 | String[] names = new String[] { "O1", "O2", "O3", "O4", "O5", "O6", "O7" };
59 | ClusteringAlgorithm alg = new PDistClusteringAlgorithm();
60 | Cluster cluster = alg.performClustering(distances, names, strategy);
61 | cluster.toConsole(0);
62 | return cluster;
63 | }
64 | }
65 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/visualization/DendrogramPanel.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering.visualization;
18 |
19 | import java.awt.BasicStroke;
20 | import java.awt.Color;
21 | import java.awt.Graphics;
22 | import java.awt.Graphics2D;
23 | import java.awt.RenderingHints;
24 | import java.awt.geom.Rectangle2D;
25 |
26 | import javax.swing.*;
27 |
28 | import com.apporiented.algorithm.clustering.Cluster;
29 |
30 | public class DendrogramPanel extends JPanel {
31 |
32 | private static final long serialVersionUID = 1L;
33 |
34 | private final static BasicStroke SOLID_STROKE =
35 | new BasicStroke(1.0f, BasicStroke.CAP_BUTT, BasicStroke.JOIN_ROUND);
36 |
37 | private Cluster model;
38 | private ClusterComponent component;
39 | private Color lineColor = Color.BLACK;
40 | private boolean showDistanceValues = false;
41 | private boolean showScale = true;
42 | private int borderTop = 20;
43 | private int borderLeft = 20;
44 | private int borderRight = 20;
45 | private int borderBottom = 20;
46 | private int scalePadding = 10;
47 | private int scaleTickLength = 4;
48 | private int scaleTickLabelPadding = 4;
49 | private double scaleValueInterval = 0;
50 | private int scaleValueDecimals = 0;
51 |
52 | private double xModelOrigin = 0.0;
53 | private double yModelOrigin = 0.0;
54 | private double wModel = 0.0;
55 | private double hModel = 0.0;
56 |
57 | public boolean isShowDistanceValues() {
58 | return showDistanceValues;
59 | }
60 |
61 | public void setShowDistances(boolean showDistanceValues) {
62 | this.showDistanceValues = showDistanceValues;
63 | }
64 |
65 | public boolean isShowScale() {
66 | return showScale;
67 | }
68 |
69 | public void setShowScale(boolean showScale) {
70 | this.showScale = showScale;
71 | }
72 |
73 | public int getScalePadding() {
74 | return scalePadding;
75 | }
76 |
77 | public void setScalePadding(int scalePadding) {
78 | this.scalePadding = scalePadding;
79 | }
80 |
81 | public int getScaleTickLength() {
82 | return scaleTickLength;
83 | }
84 |
85 | public void setScaleTickLength(int scaleTickLength) {
86 | this.scaleTickLength = scaleTickLength;
87 | }
88 |
89 | public double getScaleValueInterval() {
90 | return scaleValueInterval;
91 | }
92 |
93 | public void setScaleValueInterval(double scaleTickInterval) {
94 | this.scaleValueInterval = scaleTickInterval;
95 | }
96 |
97 | public int getScaleValueDecimals() {
98 | return scaleValueDecimals;
99 | }
100 |
101 | public void setScaleValueDecimals(int scaleValueDecimals) {
102 | this.scaleValueDecimals = scaleValueDecimals;
103 | }
104 |
105 | public int getBorderTop() {
106 | return borderTop;
107 | }
108 |
109 | public void setBorderTop(int borderTop) {
110 | this.borderTop = borderTop;
111 | }
112 |
113 | public int getBorderLeft() {
114 | return borderLeft;
115 | }
116 |
117 | public void setBorderLeft(int borderLeft) {
118 | this.borderLeft = borderLeft;
119 | }
120 |
121 | public int getBorderRight() {
122 | return borderRight;
123 | }
124 |
125 | public void setBorderRight(int borderRight) {
126 | this.borderRight = borderRight;
127 | }
128 |
129 | public int getBorderBottom() {
130 | return borderBottom;
131 | }
132 |
133 | public void setBorderBottom(int borderBottom) {
134 | this.borderBottom = borderBottom;
135 | }
136 |
137 | public Color getLineColor() {
138 | return lineColor;
139 | }
140 |
141 | public void setLineColor(Color lineColor) {
142 | this.lineColor = lineColor;
143 | }
144 |
145 | public Cluster getModel() {
146 | return model;
147 | }
148 |
149 | public void setModel(Cluster model) {
150 | this.model = model;
151 | component = createComponent(model);
152 | updateModelMetrics();
153 | }
154 |
155 | private void updateModelMetrics() {
156 | double minX = component.getRectMinX();
157 | double maxX = component.getRectMaxX();
158 | double minY = component.getRectMinY();
159 | double maxY = component.getRectMaxY();
160 |
161 | xModelOrigin = minX;
162 | yModelOrigin = minY;
163 | wModel = maxX - minX;
164 | hModel = maxY - minY;
165 | }
166 |
167 | private ClusterComponent createComponent(Cluster cluster, VCoord initCoord, double clusterHeight) {
168 |
169 | ClusterComponent comp = null;
170 | if (cluster != null) {
171 | comp = new ClusterComponent(cluster, cluster.isLeaf(), initCoord);
172 | double leafHeight = clusterHeight / cluster.countLeafs();
173 | double yChild = initCoord.getY() - (clusterHeight / 2);
174 | double distance = cluster.getDistanceValue() == null ? 0 : cluster.getDistanceValue();
175 | for (Cluster child : cluster.getChildren()) {
176 | int childLeafCount = child.countLeafs();
177 | double childHeight = childLeafCount * leafHeight;
178 | double childDistance = child.getDistanceValue() == null ? 0 : child.getDistanceValue();
179 | VCoord childInitCoord = new VCoord(
180 | initCoord.getX() + (distance - childDistance),
181 | yChild + childHeight / 2.0);
182 | yChild += childHeight;
183 |
184 | /* Traverse cluster node tree */
185 | ClusterComponent childComp = createComponent(child, childInitCoord, childHeight);
186 |
187 | childComp.setLinkPoint(initCoord);
188 | comp.getChildren().add(childComp);
189 | }
190 | }
191 | return comp;
192 |
193 | }
194 |
195 | private ClusterComponent createComponent(Cluster model) {
196 |
197 | double virtualModelHeight = 1;
198 | VCoord initCoord = new VCoord(0, virtualModelHeight / 2);
199 |
200 | ClusterComponent comp = createComponent(model, initCoord, virtualModelHeight);
201 | comp.setLinkPoint(initCoord);
202 | return comp;
203 | }
204 |
205 | @Override
206 | public void paint(Graphics g) {
207 | super.paint(g);
208 | Graphics2D g2 = (Graphics2D) g;
209 | g2.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
210 | g2.setColor(lineColor);
211 | g2.setStroke(SOLID_STROKE);
212 |
213 | int wDisplay = getWidth() - borderLeft - borderRight;
214 | int hDisplay = getHeight() - borderTop - borderBottom;
215 | int xDisplayOrigin = borderLeft;
216 | int yDisplayOrigin = borderTop;
217 |
218 | if (component != null) {
219 |
220 | int nameGutterWidth = component.getMaxNameWidth(g2, false) + component.getNamePadding();
221 | wDisplay -= nameGutterWidth;
222 |
223 | if (showScale) {
224 | Rectangle2D rect = g2.getFontMetrics().getStringBounds("0", g2);
225 | int scaleHeight = (int) rect.getHeight() + scalePadding + scaleTickLength + scaleTickLabelPadding;
226 | hDisplay -= scaleHeight;
227 | yDisplayOrigin += scaleHeight;
228 | }
229 |
230 | /* Calculate conversion factor and offset for display */
231 | double xFactor = wDisplay / wModel;
232 | double yFactor = hDisplay / hModel;
233 | int xOffset = (int) (xDisplayOrigin - xModelOrigin * xFactor);
234 | int yOffset = (int) (yDisplayOrigin - yModelOrigin * yFactor);
235 | component.paint(g2, xOffset, yOffset, xFactor, yFactor, showDistanceValues);
236 |
237 | if (showScale) {
238 | int x1 = xDisplayOrigin;
239 | int y1 = yDisplayOrigin - scalePadding;
240 | int x2 = x1 + wDisplay;
241 | int y2 = y1;
242 | g2.drawLine(x1, y1, x2, y2);
243 |
244 | double totalDistance = component.getCluster().getTotalDistance();
245 | double xModelInterval;
246 | if (scaleValueInterval <= 0) {
247 | xModelInterval = totalDistance / 10.0;
248 | } else {
249 | xModelInterval = scaleValueInterval;
250 | }
251 |
252 | int xTick = xDisplayOrigin + wDisplay;
253 | y1 = yDisplayOrigin - scalePadding;
254 | y2 = yDisplayOrigin - scalePadding - scaleTickLength;
255 | double distanceValue = 0;
256 | double xDisplayInterval = xModelInterval * xFactor;
257 | while (xTick >= xDisplayOrigin) {
258 | g2.drawLine(xTick, y1, xTick, y2);
259 |
260 | String distanceValueStr = String.format("%." + scaleValueDecimals + "f", distanceValue);
261 | Rectangle2D rect = g2.getFontMetrics().getStringBounds(distanceValueStr, g2);
262 | g2.drawString(distanceValueStr, (int) (xTick - (rect.getWidth() / 2)), y2 - scaleTickLabelPadding);
263 | xTick -= xDisplayInterval;
264 | distanceValue += xModelInterval;
265 | }
266 |
267 | }
268 | } else {
269 |
270 | /* No data available */
271 | String str = "No data";
272 | Rectangle2D rect = g2.getFontMetrics().getStringBounds(str, g2);
273 | int xt = (int) (wDisplay / 2.0 - rect.getWidth() / 2.0);
274 | int yt = (int) (hDisplay / 2.0 - rect.getHeight() / 2.0);
275 | g2.drawString(str, xt, yt);
276 | }
277 | }
278 | }
279 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/visualization/Paintable.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering.visualization;
18 |
19 | import java.awt.Graphics2D;
20 |
21 | /**
22 | * Implemented by visual components of the dendrogram.
23 | * @author lars
24 | *
25 | */
26 | public interface Paintable {
27 |
28 | void paint(Graphics2D g, int xDisplayOffset, int yDisplayOffset, double xDisplayFactor, double yDisplayFactor, boolean decorated);
29 |
30 | }
31 |
--------------------------------------------------------------------------------
/src/main/java/com/apporiented/algorithm/clustering/visualization/VCoord.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering.visualization;
18 |
19 | /**
20 | * Immutable Virtual coordinate.
21 | */
22 | public class VCoord {
23 |
24 | private final double x;
25 | private final double y;
26 |
27 | public VCoord(double x, double y) {
28 | this.x = x;
29 | this.y = y;
30 | }
31 |
32 | public double getX() {
33 | return x;
34 | }
35 |
36 | public double getY() {
37 | return y;
38 | }
39 |
40 | @Override
41 | public boolean equals(Object obj) {
42 | if (obj instanceof VCoord) {
43 | VCoord other = (VCoord)obj;
44 | return x == other.getX() && y == other.getY();
45 | }
46 | else {
47 | return false;
48 | }
49 | }
50 |
51 | @Override
52 | public String toString() {
53 | return String.format("Coord(%.3f,%.3f)", x, y);
54 | }
55 |
56 | }
57 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/ClusterPerfTest.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import org.junit.Test;
20 |
21 | import java.util.Random;
22 |
23 | import static org.junit.Assert.assertArrayEquals;
24 | import static org.junit.Assert.assertEquals;
25 |
26 | public class ClusterPerfTest {
27 |
28 | Cluster randomCluster(int n) {
29 | ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
30 |
31 | return alg.performClustering(randomDataDist(n), randomDataNames(n),
32 | new AverageLinkageStrategy());
33 | }
34 |
35 | private double[][] randomDataDist(int n) {
36 | Random rnd = new Random();
37 | double[][] mat = new double[n][n];
38 | for (int i = 0; i < n; i++) {
39 | mat[i][i] = 0;
40 | for (int j = i + 1; j < n; j++) {
41 | double r = Math.floor(rnd.nextDouble() * 100) * 0.1;
42 | mat[i][j] = r;
43 | mat[j][i] = r;
44 | }
45 | }
46 |
47 | return mat;
48 | }
49 |
50 | private String[] randomDataNames(int n) {
51 | String[] ret = new String[n];
52 | for (int i = 0; i < n; i++) {
53 | ret[i] = "" + i;
54 | }
55 | return ret;
56 | }
57 |
58 | @Test
59 | public void testRandomDataDist() throws Exception {
60 | double[][] dist = randomDataDist(4);
61 | assertEquals(4, dist.length);
62 | }
63 |
64 | @Test
65 | public void testRandomDataNames() throws Exception {
66 | String[] names = randomDataNames(4);
67 | assertEquals(4, names.length);
68 | String[] exp = {"0", "1", "2", "3"};
69 | assertArrayEquals(exp, names);
70 | }
71 |
72 | private long timeN(int n) {
73 | long t0 = System.currentTimeMillis();
74 | randomCluster(n); // result is discarded; only the elapsed time matters
75 | return System.currentTimeMillis() - t0;
76 | }
77 |
78 | @Test
79 | public void testn() throws Exception {
80 | for (int n = 1; n <= 1024; n = n * 2) {
81 | Long t = timeN(n);
82 | System.out.println(String.format("%3d nodes -> %5d ms", n, t));
83 | }
84 | }
85 | }
86 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/ClusterTest.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import org.junit.Before;
20 | import org.junit.Test;
21 |
22 | import static org.junit.Assert.assertEquals;
23 |
24 | public class ClusterTest {
25 |
26 | private Cluster cluster;
27 |
28 | @Before
29 | public void setup() {
30 | ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
31 | cluster = alg.performClustering(SampleClusterData.DISTANCES, SampleClusterData.NAMES,
32 | new AverageLinkageStrategy());
33 | }
34 |
35 | @Test
36 | public void testCountLeafs() throws Exception {
37 | int leafs = cluster.countLeafs();
38 | assertEquals(6, leafs);
39 | assertEquals("clstr#5", cluster.getName());
40 | }
41 |
42 | @Test
43 | public void testGetTotalDistance() throws Exception {
44 | int dist = (int) cluster.getTotalDistance();
45 | assertEquals(10, dist);
46 | assertEquals("clstr#5", cluster.getName());
47 | }
48 | }
49 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/CoordTest.java:
--------------------------------------------------------------------------------
1 | package com.apporiented.algorithm.clustering;
2 |
3 | import com.apporiented.algorithm.clustering.visualization.DendrogramPanel;
4 |
5 | import javax.swing.*;
6 | import java.awt.*;
7 | import java.io.BufferedReader;
8 | import java.io.IOException;
9 | import java.io.InputStreamReader;
10 | import java.util.ArrayList;
11 | import java.util.List;
12 |
13 | /**
14 | * @author Lars Behnke
15 | */
16 | public class CoordTest {
17 |
18 | private static Cluster importCluster() throws IOException {
19 | List<Coord> coords = readCoordinates();
20 |
21 | double[][] distances = new double[coords.size()][coords.size()];
22 | String[] names = new String[coords.size()];
23 | for (int row = 0; row < coords.size(); row++) {
24 | Coord coord1 = coords.get(row);
25 | for (int col = row+1; col < coords.size(); col++) {
26 | Coord coord2 = coords.get(col);
27 | double d = Math.sqrt(Math.pow(coord2.getX()-coord1.getX(), 2)+ Math.pow(coord2.getY()-coord1.getY(), 2));
28 | distances[row][col] = d;
29 | distances[col][row] = d;
30 | }
31 | names[row] = ""+row;
32 | }
33 | ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
34 | Cluster cluster = alg.performClustering(distances, names,
35 | new AverageLinkageStrategy());
36 | return cluster;
37 | }
38 |
39 |
40 | public static void main(String[] args) throws Exception {
41 | JFrame frame = new JFrame();
42 | frame.setSize(1024, 768);
43 | frame.setLocation(400, 300);
44 | frame.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
45 |
46 | JPanel content = new JPanel();
47 | DendrogramPanel dp = new DendrogramPanel();
48 |
49 | frame.setContentPane(content);
50 | content.setBackground(Color.red);
51 | content.setLayout(new BorderLayout());
52 | content.add(dp, BorderLayout.CENTER);
53 | dp.setBackground(Color.WHITE);
54 | dp.setLineColor(Color.BLACK);
55 | dp.setScaleValueDecimals(0);
56 | dp.setScaleValueInterval(1);
57 | dp.setShowDistances(false);
58 |
59 | Cluster cluster = importCluster();
60 | dp.setModel(cluster);
61 | frame.setVisible(true);
62 | }
63 |
64 | private static List<Coord> readCoordinates() throws IOException {
65 | List<Coord> coordList = new ArrayList<Coord>();
66 | BufferedReader br = new BufferedReader(new InputStreamReader(CoordTest.class.getResourceAsStream("/testData1.txt")));
67 | String line;
68 |
69 | while ((line = br.readLine()) != null) {
70 | String[] elems = line.split(" ");
71 | if (elems.length != 2) {
72 | continue;
73 | }
74 | int x;
75 | int y;
76 |
77 | try {
78 | x = Integer.parseInt(elems[0]);
79 | y = Integer.parseInt(elems[1]);
80 | } catch (Exception e) {
81 | continue;
82 | }
83 | coordList.add(new Coord(x, y));
84 | }
85 | return coordList;
86 | }
87 |
88 | public static class Coord {
89 | private double x;
90 | private double y;
91 | public Coord(double x, double y) {
92 | this.x = x;
93 | this.y = y;
94 | }
95 |
96 | public double getX() {
97 | return x;
98 | }
99 |
100 | public double getY() {
101 | return y;
102 | }
103 | }
104 | }
105 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/DefaultClusteringAlgorithmTest.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import org.junit.Assert;
20 | import org.junit.Before;
21 | import org.junit.Test;
22 |
23 | public class DefaultClusteringAlgorithmTest {
24 |
25 | private double[][] distances;
26 | private String[] names;
27 |
28 | @Before
29 | public void setup() {
30 | distances = SampleClusterData.DISTANCES;
31 | names = SampleClusterData.NAMES;
32 | }
33 |
34 | @Test
35 | public void testClusteringAvgLink() {
36 | ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
37 | Cluster c = alg.performClustering(distances, names, new AverageLinkageStrategy());
38 | Assert.assertNotNull(c);
39 | }
40 |
41 | @Test
42 | public void testDuplicateNames() {
43 | ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
44 | try {
45 | Cluster c = alg.performClustering(distances, SampleClusterData.NAMES_WITH_DUPLICATE,
46 | new AverageLinkageStrategy());
47 | Assert.fail("Exception expected");
48 | } catch (IllegalArgumentException e) {
49 | Assert.assertEquals("Duplicate names", e.getMessage());
50 | }
51 | }
52 |
53 | }
54 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/DistanceMapTest.java:
--------------------------------------------------------------------------------
1 | package com.apporiented.algorithm.clustering;
2 |
3 | import org.junit.Test;
4 |
5 | import static org.junit.Assert.*;
6 |
7 | public class DistanceMapTest {
8 | DistanceMap map = new DistanceMap();
9 | ClusterPair ab = new ClusterPair(new Cluster("a"), new Cluster("b"), 1.0);
10 | ClusterPair bc = new ClusterPair(new Cluster("b"), new Cluster("c"), 2.0);
11 | ClusterPair ca = new ClusterPair(new Cluster("c"), new Cluster("a"), 3.0);
12 |
13 | @Test
14 | public void testMapWorksWithSameDistance() throws Exception {
15 | this.map.add(ab);
16 | this.map.add(ab); // add the same pair twice; this is an error case and the duplicate is not stored
17 | assertEquals(1,this.map.list().size());
18 | ClusterPair remove = this.map.removeFirst();
19 | assertNotNull(remove);
20 | assertEquals(0,this.map.list().size()); // the list is empty now, so the next removeFirst returns null
21 | ClusterPair remove2 = this.map.removeFirst();
22 | assertNull(remove2);
23 | }
24 | @Test
25 | public void testMapRemovalFront() throws Exception {
26 | this.map.add(ca);
27 | this.map.add(bc);
28 | this.map.add(ab);
29 |
30 | ClusterPair removeFirst = this.map.removeFirst();
31 | assertEquals(ab, removeFirst);
32 | }
33 | @Test
34 | public void testMapRemovalByObjectPollLoop() throws Exception {
35 | this.map.add(ca);
36 | this.map.add(bc);
37 | this.map.add(ab);
38 |
39 | assertTrue(this.map.remove(ab)); //Doesn't actually remove from prioQueue
40 | ClusterPair removeFirst = this.map.removeFirst();
41 | assertEquals(bc, removeFirst); //removeFirst should now skip the ab
42 | }
43 | @Test
44 | public void testMapRemovalByObjectPollLoopHandlesAllEmpty() throws Exception {
45 | this.map.add(ca);
46 | this.map.add(bc);
47 | this.map.add(ab);
48 |
49 | assertTrue(this.map.remove(ab)); //Doesn't actually remove from prioQueue
50 | assertTrue(this.map.remove(bc)); //Doesn't actually remove from prioQueue
51 | assertTrue(this.map.remove(ca)); //Doesn't actually remove from prioQueue
52 | assertFalse(this.map.remove(ab)); // ab was already removed, so a second remove reports false
53 | ClusterPair removeFirst = this.map.removeFirst();
54 | assertNull(removeFirst); // every pair was removed, so removeFirst finds nothing
55 | }
56 | }
57 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/SampleClusterData.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | public final class SampleClusterData {
20 |
21 | public static final double[][] DISTANCES = new double[][] { { 0, 1, 9, 7, 11, 14 },
22 | { 1, 0, 4, 3, 8, 10 }, { 9, 4, 0, 9, 2, 8 }, { 7, 3, 9, 0, 6, 13 }, { 11, 8, 2, 6, 0, 10 },
23 | { 14, 10, 8, 13, 10, 0 } };
24 | public static final String[] NAMES = new String[] { "O1", "O2", "O3", "O4", "O5", "O6" };
25 | public static final double[] WEIGHTS = new double[] { 1, 2, 3, 4, 5, 6 };
26 |
27 | public static final String[] NAMES_WITH_DUPLICATE = new String[] { "O2", "O2", "O3", "O4", "O5",
28 | "O6" };
29 |
30 | }
31 |
--------------------------------------------------------------------------------
/src/test/java/com/apporiented/algorithm/clustering/WeightedClusteringAlgorithmTest.java:
--------------------------------------------------------------------------------
1 | /*******************************************************************************
2 | * Copyright 2013 Lars Behnke
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | ******************************************************************************/
16 |
17 | package com.apporiented.algorithm.clustering;
18 |
19 | import org.junit.Assert;
20 | import org.junit.Test;
21 |
22 | public class WeightedClusteringAlgorithmTest {
23 | public static final double[][] DISTANCES = new double[][] { // A--B-C--D
24 | { 0, 2, 3, 5 }, { 2, 0, 1, 3 }, { 3, 1, 0, 2 }, { 5, 3, 2, 0 } };
25 | public static final String[] NAMES = new String[] { "A", "B", "C", "D" };
26 | public static final double[] WEIGHTSA = new double[] { 1, 100, 1, 1 }; // weight on B, D should be alone
27 | public static final double[] WEIGHTSD = new double[] { 1, 1, 100, 1 };// weight on C, A should be alone
28 |
29 | @Test
30 | public void testClusteringAvgLink() {
31 | ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
32 | Cluster ca = alg.performWeightedClustering(DISTANCES, NAMES, WEIGHTSA, new WeightedLinkageStrategy());
33 | Assert.assertEquals("D", ca.getChildren().get(0).getName());
34 | Cluster cd = alg.performWeightedClustering(DISTANCES, NAMES, WEIGHTSD, new WeightedLinkageStrategy());
35 | Assert.assertEquals("A", cd.getChildren().get(0).getName());
36 | }
37 | }
38 |
--------------------------------------------------------------------------------
/src/test/resources/testData1.txt:
--------------------------------------------------------------------------------
1 | # spaeth_01.txt
2 | #
3 | # Discussion:
4 | #
5 | # Data matrix with 37 rows and 2 columns
6 | #
7 | # Modified:
8 | #
9 | # 02 April 2002
10 | #
11 | # Reference:
12 | #
13 | # Helmut Spaeth,
14 | # Cluster Dissection and Analysis,
15 | # Theory, FORTRAN Programs, Examples,
16 | # Ellis Horwood, 1985, page 144.
17 | #
18 | 41 45
19 | 39 44
20 | 42 43
21 | 44 43
22 | 10 42
23 | 38 42
24 | 8 41
25 | 41 41
26 | 13 40
27 | 45 40
28 | 7 39
29 | 38 39
30 | 42 39
31 | 9 38
32 | 12 38
33 | 19 38
34 | 25 38
35 | 6 37
36 | 13 35
37 | 9 34
38 | 12 34
39 | 32 27
40 | 26 25
41 | 39 24
42 | 34 23
43 | 37 23
44 | 22 22
45 | 38 21
46 | 35 20
47 | 31 18
48 | 26 16
49 | 38 13
50 | 29 11
51 | 34 11
52 | 37 10
53 | 40 9
54 | 42 9
--------------------------------------------------------------------------------