├── .gitignore ├── .travis.yml ├── LICENSE ├── README.md ├── doc └── intro.md ├── java └── dataframe │ ├── Lists.java │ └── TableBuilder.java ├── profiles.clj ├── project.clj ├── src └── dataframe │ ├── core.clj │ ├── frame.clj │ ├── series.clj │ └── util.clj └── test └── dataframe ├── core_test.clj ├── frame_test.clj ├── pipeline_test.clj ├── series_test.clj └── util_test.clj /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | /target 3 | /classes 4 | /checkouts 5 | pom.xml 6 | pom.xml.asc 7 | *.jar 8 | *.class 9 | /.lein-* 10 | /.nrepl-port 11 | .hgignore 12 | .hg/ 13 | .idea 14 | *.iml 15 | 16 | pom.xml 17 | pom.xml.asc 18 | *jar 19 | /lib/ 20 | /classes/ 21 | /target/ 22 | /checkouts/ 23 | .lein-deps-sum 24 | .lein-repl-history 25 | .lein-plugins/ 26 | .lein-failures 27 | .nrepl-port 28 | 29 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: clojure 2 | 3 | script: lein expectations 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2016 George Herbert Lewis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # dataframe 2 | 3 | [![Build Status](https://travis-ci.org/ghl3/dataframe.svg?branch=master)](https://travis-ci.org/ghl3/dataframe) 4 | 5 | DataFrames for Clojure (inspired by Python's Pandas) 6 | 7 | 8 | The dataframe package contains two core data structures: 9 | 10 | - A Series is a map of index keys to values. It is ordered and supports O(1) lookup of values by index as well as O(1) lookup of values by positional offset (based on the order of the index). 11 | - A Frame is a map of column names to column values, which are represented as Series, each with an identical index. A Frame may also be thought of as a map of index keys to maps, where each map is a row of a Frame that maps column names to the value in that row. 12 | 13 | 14 | 15 | Series 16 | ====== 17 | 18 | A Series can be thought of as a 1-D vector of data with an index (a vector of keys, one per value). The keys are typically either integers or Clojure keywords, but can be any value. Any of the values may be nil, but the non-nil values must all be of the same type. 19 | 20 | When iterated over, a Series yields `[index value]` pairs. 
21 | 22 | | index | val | 23 | |-------|-----| 24 | | :a | 10 | 25 | | :b | 20 | 26 | | :c | 30 | 27 | | :d | 40 | 28 | 29 | 30 | To create a Series, pass a sequence of values and an index sequence to the constructor function: 31 | 32 | ```clojure 33 | 34 | (require '[dataframe.core :as df]) 35 | 36 | (def srs (df/series [1 2 3] [:a :b :c])) 37 | srs 38 | ``` 39 | 40 |
 41 | => class dataframe.series.Series
 42 | :a 1
 43 | :b 2
 44 | :c 3
 45 | 
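The lookup guarantees described for a Series can be pictured with plain Clojure data. The following is a minimal sketch, not the library's actual implementation: `make-series`, `by-key`, and `by-pos` are hypothetical names, and the representation (a key vector, a value vector, and a key-to-position map) is an assumption chosen so that both lookups are effectively constant time.

```clojure
;; Hypothetical stand-in for a Series; not dataframe's real data structure.
(defn make-series
  "Pair each value with an index key and precompute key -> position."
  [values index]
  (assert (= (count values) (count index)))
  {:index    (vec index)
   :values   (vec values)
   :key->pos (zipmap index (range))})

(defn by-key
  "Look up a value by index key (nil if the key is absent)."
  [srs k]
  (when-let [pos (get (:key->pos srs) k)]
    (nth (:values srs) pos)))

(defn by-pos
  "Look up an [index value] pair by positional offset."
  [srs n]
  [(nth (:index srs) n) (nth (:values srs) n)])

(def demo (make-series [1 2 3] [:a :b :c]))
(by-key demo :b) ; => 2
(by-pos demo 0)  ; => [:a 1]
```

Keeping the keys in a vector preserves the index order, while the separate map avoids a linear scan when looking up by key.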
46 | 47 | DataFrame core has a number of functions for operating on or manipulating Series objects. 48 | 49 | ```clojure 50 | (df/ix srs :b) 51 | ; 2 52 | 53 | (df/values srs) 54 | ; [1 2 3] 55 | ``` 56 | 57 | Arithmetic operations on a Series return a new Series. These operations follow broadcasting rules: combining a scalar with a Series applies the operation to every element and returns a new Series with the same index as the input, while combining two Series applies the operation row by row (provided their indices align exactly): 58 | 59 | ```clojure 60 | (df/add 1 srs) 61 | ``` 62 | 63 | 
 64 | => class dataframe.series.Series
 65 | :a 2
 66 | :b 3
 67 | :c 4
 68 | 
69 | 70 | ```clojure 71 | (df/eq 2 srs) 72 | ``` 73 | 74 |
 75 | => class dataframe.series.Series
 76 | :a false
 77 | :b true
 78 | :c false
 79 | 
80 | 81 | ```clojure 82 | (df/add (df/series [1 2 3]) (df/series [10 20 30])) 83 | ``` 84 | 85 | 
 86 | => class dataframe.series.Series
 87 | 0 11
 88 | 1 22
 89 | 2 33
 90 | 
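The broadcast rule illustrated by the three examples above can be sketched in plain Clojure. This is an illustration under stated assumptions rather than the code behind `df/add` or `df/eq`: a series is modelled as a vector of `[index value]` pairs, and `broadcast` is a hypothetical helper name.

```clojure
;; Hypothetical sketch of the broadcast rule over [index value] pairs.
(defn broadcast
  "Apply f element-wise. Each argument is either a scalar or a vector of
  [index value] pairs; two series must share exactly the same index."
  [f x y]
  (let [series? vector?]
    (cond
      (and (series? x) (series? y))
      (do (assert (= (map first x) (map first y)) "indices must align")
          (mapv (fn [[k a] [_ b]] [k (f a b)]) x y))

      (series? x) (mapv (fn [[k a]] [k (f a y)]) x)
      (series? y) (mapv (fn [[k b]] [k (f x b)]) y)
      :else       (f x y))))

(def pairs [[:a 1] [:b 2] [:c 3]])
(broadcast + 1 pairs)     ; => [[:a 2] [:b 3] [:c 4]]
(broadcast = 2 pairs)     ; => [[:a false] [:b true] [:c false]]
(broadcast + pairs pairs) ; => [[:a 2] [:b 4] [:c 6]]
```

The scalar case reuses the index of the series argument, which is why the result keeps the same keys in the same order.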
91 | 92 | 93 | Frames 94 | ====== 95 | 96 | A Frame is a map of column names to Series that all share the same index. 97 | 98 | When iterated over, a Frame yields pairs of an index key and the row at that key, represented as a map: `[index {col->val}]`. 99 | 100 | 101 | | columns: | :a | :b | :c | 102 | |----------|----|----|-----| 103 | | index | | | | 104 | | :x | 10 | 2 | 100 | 105 | | :y | 20 | 4 | 300 | 106 | | :z | 30 | 6 | 600 | 107 | 108 | 109 | 110 | There are several equivalent ways to create a Frame, all of which use the `dataframe.core/frame` constructor function: 111 | 112 | - Pass a map of column names to column values as well as an optional index (if no index is passed, a default index of integers starting at 0 is used). The column values can be either sequences or Series objects, but must all have the same length. 113 | 114 | 115 | ```clojure 116 | 117 | (require '[dataframe.core :as df]) 118 | 119 | (def frame (df/frame {:a [1 2 3] :b [10 20 30]} [:x :y :z])) 120 | frame 121 | ``` 122 | 
123 | => class dataframe.frame.Frame
124 | 	:a	:b
125 | :x	1	10
126 | :y	2	20
127 | :z	3	30
128 | 
129 | 130 | Here, `:a` and `:b` are the names of the columns and the index over rows is `[:x :y :z]`. 131 | 132 | - Pass a list of pairs of index keys and rows-as-maps. 133 | 134 | ```clojure 135 | (def frame (df/frame [[:x {:a 1 :b 10}] 136 | [:y {:a 2 :b 20}] 137 | [:z {:a 3 :b 30}]])) 138 | frame 139 | ``` 140 |
141 | => class dataframe.frame.Frame
142 | 	:a	:b
143 | :x	1	10
144 | :y	2	20
145 | :z	3	30
146 | 
147 | 148 | - Pass a list of maps and an optional index sequence: 149 | 150 | ```clojure 151 | (def frame (df/frame [{:a 1 :b 10} 152 | {:a 2 :b 20} 153 | {:a 3 :b 30}] 154 | [:x :y :z])) 155 | frame 156 | ``` 157 |
158 | => class dataframe.frame.Frame
159 | 	:a	:b
160 | :x	1	10
161 | :y	2	20
162 | :z	3	30
163 | 
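The constructor forms above describe the same data from two angles: column-wise and row-wise. A minimal sketch of converting between the two views in plain Clojure follows. The helper names `rows->columns` and `columns->rows` are hypothetical and the functions operate on ordinary maps and vectors, not on Frame objects.

```clojure
;; Hypothetical helpers showing the row view and the column view of a table.
(defn rows->columns
  "Turn a sequence of row maps into a map of column name -> vector of
  values. Entries missing from a row become nil."
  [row-maps]
  (let [cols (distinct (mapcat keys row-maps))]
    (into {} (for [c cols]
               [c (mapv #(get % c) row-maps)]))))

(defn columns->rows
  "Turn a map of equal-length columns into a vector of row maps."
  [col-map]
  (if (empty? col-map)
    []
    (apply mapv
           (fn [& vs] (zipmap (keys col-map) vs))
           (vals col-map))))

(rows->columns [{:a 1 :b 10} {:a 2 :b 20} {:a 3 :b 30}])
;; => {:a [1 2 3], :b [10 20 30]}

(columns->rows {:a [1 2 3] :b [10 20 30]})
;; => [{:a 1, :b 10} {:a 2, :b 20} {:a 3, :b 30}]
```

Filling missing entries with nil mirrors the README's statement that any value in a Series may be nil.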
164 | 165 | 166 | Selecting 167 | ========= 168 | 169 | DataFrame core contains a number of functions for selecting specific subsets and items from Series and Frames. 170 | 171 | We've already seen the `ix` function, which selects either a single value from a Series or a single row from a Frame (returned as a Series indexed by column name). 172 | 173 | ```clojure 174 | (df/ix (df/series [1 2 3] [:x :y :z]) :x) 175 | ;1 176 | ``` 177 | 178 | ```clojure 179 | (df/ix (df/frame [{:a 1 :b 10} 180 | {:a 2 :b 20} 181 | {:a 3 :b 30}] 182 | [:x :y :z]) 183 | :x) 184 | ; a Series with :a 1 and :b 10 185 | ``` 186 | 187 | The `loc` function selects the subset of the input Series or Frame that corresponds to a list of index values. 188 | 189 | 190 | ```clojure 191 | (df/loc (df/series [1 2 3] [:x :y :z]) [:x :y]) 192 | ``` 193 | 
194 | => class dataframe.series.Series
195 | :x 1
196 | :y 2
197 | 
198 | 199 | 200 | ```clojure 201 | (df/loc (df/frame [{:a 1 :b 10} 202 | {:a 2 :b 20} 203 | {:a 3 :b 30}] 204 | [:x :y :z]) 205 | [:x :y]) 206 | ``` 207 | 
208 | => class dataframe.frame.Frame
209 | 	:a	:b
210 | :x	1	10
211 | :y	2	20
212 | 
213 | 214 | 215 | In addition to index-based selection, one can select values or rows using a Series of boolean values, whose index must align with the index of the Series or Frame being filtered: 216 | 217 | 218 | 219 | ```clojure 220 | (df/select (df/series [1 2 3] [:x :y :z]) 221 | (df/series [true false true] [:x :y :z])) 222 | ``` 223 | 
224 | => class dataframe.series.Series
225 | :x 1
226 | :z 3
227 | 
228 | 229 | 230 | ```clojure 231 | (df/select (df/frame [{:a 1 :b 10} 232 | {:a 2 :b 20} 233 | {:a 3 :b 30}] 234 | [:x :y :z]) 235 | (df/series [true false true] [:x :y :z])) 236 | ``` 237 | 
238 | => class dataframe.frame.Frame
239 | 	:a	:b
240 | :x	1	10
241 | :z	3	30
242 | 
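Boolean-mask selection, as used by `select` above, can be sketched with plain vectors of `[index value]` pairs. `select-by-mask` is a hypothetical name, and requiring the two indices to match in order is a simplifying assumption of this sketch, not a statement about how the library checks alignment.

```clojure
;; Hypothetical sketch of boolean-mask selection over [index value] pairs.
(defn select-by-mask
  "Keep the [index value] pairs of srs whose aligned mask entry is truthy."
  [srs mask]
  (assert (= (map first srs) (map first mask)) "indices must align")
  (vec (for [[[k v] [_ keep?]] (map vector srs mask)
             :when keep?]
         [k v])))

(select-by-mask [[:x 1] [:y 2] [:z 3]]
                [[:x true] [:y false] [:z true]])
;; => [[:x 1] [:z 3]]
```

Because the mask is itself a series, it can be produced by a comparison such as `df/eq` or `df/lte` and passed straight to the selection step.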
243 | 244 | 245 | Grouping 246 | ======== 247 | 248 | The `group-by` function takes a Frame and a Series whose index 249 | is aligned with the Frame's index and returns a map of 250 | values to Frames. Each Frame holds the rows whose value in the 251 | input Series equals the corresponding map key. 252 | 253 | ```clojure 254 | 255 | (def data (df/frame [{:a 1 :b 10} 256 | {:a 2 :b 20} 257 | {:a 3 :b 30}] 258 | [:x :y :z])) 259 | 260 | (df/group-by data (df/series [:foo :foo :bar] [:x :y :z])) 261 | ``` 262 | 263 | One can also group by a function of each row using the `group-by-fn` function. This function should take the row as a map of column names to values and return a single value that represents the group value for that row: 264 | 265 | ```clojure 266 | 267 | (def data (df/frame [{:a 1 :b 10} 268 | {:a 2 :b 20} 269 | {:a 3 :b 30}] 270 | [:x :y :z])) 271 | 272 | (df/group-by-fn data (fn [row] (+ (:a row) (:b row)))) 273 | ``` 274 | 275 | Joining 276 | ======= 277 | 278 | Two Frames may be joined together. Dataframe supports inner, left, right, and outer joins, all of which are performed on the indices of the two Frames. 279 | 280 | 281 | 282 | ```clojure 283 | 284 | (def left (df/frame [{:a 1 :b 10} 285 | {:a 2 :b 20} 286 | {:a 3 :b 30}] 287 | [:x :y :z])) 288 | 289 | (def right (df/frame [{:c 100 :d "Foo"} 290 | {:c 200 :d "Bar"} 291 | {:c 300 :d "Baz"}] 292 | [:w :x :y])) 293 | 294 | (df/join left right :how :outer) 295 | ``` 296 | 297 | 
298 | => class dataframe.frame.Frame
299 |     :b  :a  :c  :d 
300 | :x  10   1 200 Bar 
301 | :y  20   2 300 Baz 
302 | :z  30   3 nil nil 
303 | :w nil nil 100 Foo 
304 | 
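An outer join on the index, like the one shown above, can be sketched in plain Clojure. This is a simplified illustration, not the implementation of `df/join`: tables are modelled as vectors of `[index row-map]` pairs, `outer-join` is a hypothetical name, and the sketch assumes the two sides have no column names in common.

```clojure
;; Hypothetical sketch of an index-based outer join.
(defn outer-join
  "Join two vectors of [index row-map] pairs on their index keys.
  Columns that are missing on one side are filled with nil."
  [left right]
  (let [l      (into {} left)
        r      (into {} right)
        l-cols (distinct (mapcat (comp keys second) left))
        r-cols (distinct (mapcat (comp keys second) right))
        blank  (fn [cols] (zipmap cols (repeat nil)))
        idx    (distinct (concat (map first left) (map first right)))]
    (vec (for [i idx]
           [i (merge (blank l-cols) (blank r-cols) (get l i) (get r i))]))))

(outer-join [[:x {:a 1 :b 10}] [:y {:a 2 :b 20}] [:z {:a 3 :b 30}]]
            [[:w {:c 100 :d "Foo"}] [:x {:c 200 :d "Bar"}] [:y {:c 300 :d "Baz"}]])
;; => [[:x {:a 1, :b 10, :c 200, :d "Bar"}]
;;     [:y {:a 2, :b 20, :c 300, :d "Baz"}]
;;     [:z {:a 3, :b 30, :c nil, :d nil}]
;;     [:w {:a nil, :b nil, :c 100, :d "Foo"}]]
```

The left index comes first and unmatched right keys are appended, which matches the row order of the output shown above.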
305 | 306 | 307 | Transforming 308 | ============ 309 | 310 | 311 | DataFrame core has a number of functions for operating on or manipulating Frames. 312 | 313 | ```clojure 314 | (def frame (df/frame [[:x {:a 1 :b 10}] 315 | [:y {:a 2 :b 20}] 316 | [:z {:a 3 :b 30}]])) 317 | 318 | (df/ix frame :x) 319 | ;=> class dataframe.series.Series 320 | ;:b 10 321 | ;:a 1 322 | 323 | (df/col frame :a) 324 | ;=> class dataframe.series.Series 325 | ;:x 1 326 | ;:y 2 327 | ;:z 3 328 | 329 | 330 | (df/assoc-col frame :c (df/add (df/col frame :a) (df/col frame :b))) 331 | ;=> class dataframe.frame.Frame 332 | ; :b :a :c 333 | ;:x 10 1 11 334 | ;:y 20 2 22 335 | ;:z 30 3 33 336 | 337 | ``` 338 | 339 | To make manipulating Frames easier, dataframe introduces the `with->` macro, which combines Clojure's threading macro with notation for easily accessing the column of a Frame. This macro takes a Frame and threads it through a series of operations. In doing so, when it encounters a symbol of the form `$col`, it knows to replace it with a reference to a column in the dataframe whose name is the keyword `:col` (for this reason, it is preferred to use keywords as column names). 340 | 341 | 342 | ```clojure 343 | 344 | (require '[dataframe.core :refer :all]) 345 | 346 | (def my-df (frame {:a [1 2 3] :b [10 20 30]})) 347 | 348 | (with-> my-df 349 | (assoc-col :c (add $a 5)) 350 | (assoc-col :d (add $b $c))) 351 | ``` 352 |
353 | => class dataframe.frame.Frame
354 | 	:a	:b	:c	:d
355 | 0	1	10	6	16
356 | 1	2	20	7	27
357 | 2	3	30	8	38
358 | 
359 | 360 | 361 | Notice how the uses of `$a`, `$b`, and `$c` are replaced by the corresponding columns, as Series objects, in the dataframe pipeline above. This allows us to leverage functions that act on Series objects to transform these columns and to use them to update the Frame object. 362 | 363 | These pipelines can be arbitrarily complicated: 364 | 365 | ```clojure 366 | 367 | (def my-df (frame [[:w {:a 0 :b 8}] 368 | [:x {:a 1 :b 2}] 369 | [:y {:a 2 :b 4}] 370 | [:z {:a 3 :b 8}]])) 371 | 372 | (with-> my-df 373 | (select (and (lte $a 2) (gte $b 4))) 374 | (assoc-col :c (add $a $b)) 375 | (maprows->df (fn [row] {:foo (+ (:a row) (:c row)) 376 | :bar (- (:b row) (:c row))})) 377 | (sort-rows :foo :bar) 378 | head) 379 | ``` 380 | 381 | 
382 | => class dataframe.frame.Frame
383 | 	:bar	:foo
384 | :y	-2		8
385 | :w	0		8	
386 | :z	-3		14
387 | 
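The `$col` substitution performed by `with->` can be approximated with a small macro. The sketch below is one possible way to get this behaviour, not the macro shipped in `dataframe.frame`: `with-cols->`, `column-symbol?`, and `replace-columns` are hypothetical names, and the "frame" here is assumed to be an ordinary map of column keyword to vector.

```clojure
(require '[clojure.walk :as walk])

(defn column-symbol?
  "True for unqualified symbols such as $a (a dollar sign plus a name)."
  [x]
  (and (symbol? x)
       (nil? (namespace x))
       (> (count (name x)) 1)
       (= \$ (first (name x)))))

(defn replace-columns
  "Rewrite every $col symbol in form into (get df-sym :col)."
  [df-sym form]
  (walk/postwalk
   (fn [x]
     (if (column-symbol? x)
       (list `get df-sym (keyword (subs (name x) 1)))
       x))
   form))

(defmacro with-cols->
  "Thread df through forms as the first argument, resolving each $col
  against the value produced by the previous step."
  [df & forms]
  (let [df-sym (gensym "df")]
    `(as-> ~df ~df-sym
       ~@(for [form forms]
           (if (seq? form)
             (let [[op & args] (replace-columns df-sym form)]
               `(~op ~df-sym ~@args))
             (list form df-sym))))))

(with-cols-> {:a [1 2 3] :b [10 20 30]}
  (assoc :c (mapv + $a $b))
  (assoc :d (mapv + $b $c)))
;; => {:a [1 2 3], :b [10 20 30], :c [11 22 33], :d [21 42 63]}
```

Because each step rebinds the threaded value, `$c` in the second step refers to the column added by the first, which is the same behaviour the README pipeline relies on.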
388 | 389 | 390 | 391 | DataFrame is distributed under the MIT license 392 | 393 | Copyright © 2016 George Herbert Lewis 394 | 395 | -------------------------------------------------------------------------------- /doc/intro.md: -------------------------------------------------------------------------------- 1 | # Introduction to dataframe 2 | 3 | TODO: write [great documentation](http://jacobian.org/writing/what-to-write/) 4 | -------------------------------------------------------------------------------- /java/dataframe/Lists.java: -------------------------------------------------------------------------------- 1 | package dataframe; 2 | 3 | import java.util.ArrayList; 4 | import java.util.Collection; 5 | import java.util.Collections; 6 | import java.util.Iterator; 7 | 8 | public class Lists { 9 | 10 | public static <E> ArrayList<E> newArrayList() { 11 | return new ArrayList<E>(); 12 | } 13 | 14 | public static <E> ArrayList<E> newArrayList(E... elements) { 15 | int capacity = computeArrayListCapacity(elements.length); 16 | ArrayList<E> list = new ArrayList<E>(capacity); 17 | Collections.addAll(list, elements); 18 | return list; 19 | } 20 | 21 | public static <E> ArrayList<E> newArrayList(Iterable<? extends E> elements) { 22 | return (elements instanceof Collection) 23 | ? 
new ArrayList<E>(cast(elements)) 24 | : newArrayList(elements.iterator()); 25 | } 26 | 27 | public static <E> ArrayList<E> newArrayList(Iterator<? extends E> elements) { 28 | ArrayList<E> list = newArrayList(); 29 | addAll(list, elements); 30 | return list; 31 | } 32 | 33 | public static <T> boolean addAll(Collection<T> addTo, Iterator<? extends T> iterator) { 34 | boolean wasModified = false; 35 | while (iterator.hasNext()) { 36 | wasModified |= addTo.add(iterator.next()); 37 | } 38 | return wasModified; 39 | } 40 | 41 | static <T> Collection<T> cast(Iterable<T> iterable) { 42 | return (Collection<T>) iterable; 43 | } 44 | 45 | static int computeArrayListCapacity(int arraySize) { 46 | return saturatedCast(5L + arraySize + (arraySize / 10)); 47 | } 48 | 49 | public static int saturatedCast(long value) { 50 | if (value > Integer.MAX_VALUE) { 51 | return Integer.MAX_VALUE; 52 | } 53 | if (value < Integer.MIN_VALUE) { 54 | return Integer.MIN_VALUE; 55 | } 56 | return (int) value; 57 | } 58 | } 59 | -------------------------------------------------------------------------------- /java/dataframe/TableBuilder.java: -------------------------------------------------------------------------------- 1 | package dataframe; 2 | 3 | import org.apache.commons.lang3.StringUtils; 4 | 5 | import java.util.ArrayList; 6 | import java.util.LinkedList; 7 | import java.util.List; 8 | 9 | public class TableBuilder { 10 | 11 | private final String[] header; 12 | 13 | private List<String[]> rows; 14 | 15 | static final String COLUMN_SEPARATOR = " "; 16 | 17 | 18 | public TableBuilder(String indexName, Iterable columns) { 19 | 20 | List<String> names = Lists.newArrayList(); 21 | 22 | names.add(formatObject(indexName)); 23 | 24 | for (Object col : columns) { 25 | names.add(formatObject(col)); 26 | } 27 | 28 | String[] header = new String[names.size()]; 29 | header = names.toArray(header); 30 | this.header = header; 31 | 32 | rows = new LinkedList<String[]>(); 33 | } 34 | 35 | public TableBuilder(Iterable columns) { 36 | this("idx", columns); 37 | } 38 | 39 | public TableBuilder 
addRow(Object idx, List row) { 40 | 41 | assert row.size() == this.header.length-1; 42 | 43 | List<String> formattedRow = new ArrayList<String>(); 44 | 45 | formattedRow.add(formatObject(idx)); 46 | 47 | for (Object item: row) { 48 | formattedRow.add(formatObject(item)); 49 | } 50 | 51 | String[] cols = new String[formattedRow.size()]; 52 | cols = formattedRow.toArray(cols); 53 | 54 | rows.add(cols); 55 | 56 | return this; 57 | } 58 | 59 | public static String formatObject(Object o) { 60 | if (o == null) { 61 | return "nil"; 62 | } else { 63 | return o.toString(); 64 | } 65 | } 66 | 67 | private int totalWidth() { 68 | int total = 0; 69 | for (int w: colWidths()) { 70 | total += w + COLUMN_SEPARATOR.length(); 71 | } 72 | return total; 73 | } 74 | 75 | private int[] colWidths() { 76 | 77 | int numCols = header.length; 78 | 79 | int[] widths = new int[numCols]; 80 | 81 | for(int colNum = 0; colNum < header.length; colNum++) { 82 | widths[colNum] = header[colNum].length(); 83 | } 84 | 85 | for(String[] row : rows) { 86 | for(int colNum = 0; colNum < row.length; colNum++) { 87 | widths[colNum] = Math.max(widths[colNum], StringUtils.length(row[colNum])); 88 | } 89 | } 90 | 91 | return widths; 92 | } 93 | 94 | static void addLine(StringBuilder buf, 95 | String[] line, 96 | int[] colWidths) { 97 | for(int colNum = 0; colNum < line.length; colNum++) { 98 | buf.append( 99 | StringUtils.leftPad( 100 | StringUtils.defaultString( 101 | line[colNum]), colWidths[colNum])); 102 | buf.append(COLUMN_SEPARATOR); 103 | } 104 | 105 | buf.append('\n'); 106 | } 107 | 108 | @Override 109 | public String toString() { 110 | 111 | StringBuilder buf = new StringBuilder(); 112 | 113 | int[] colWidths = colWidths(); 114 | 115 | addLine(buf, this.header, colWidths); 116 | 117 | for(String[] row: this.rows) { 118 | addLine(buf, row, colWidths); 119 | } 120 | 121 | return buf.toString(); 122 | } 123 | } 124 | 125 | -------------------------------------------------------------------------------- /profiles.clj: 
-------------------------------------------------------------------------------- 1 | {:dev {:dependencies [[expectations "2.1.9"]] 2 | 3 | :plugins [[jonase/eastwood "0.2.3"] 4 | [lein-cljfmt "0.5.6" :exclusions [org.clojure/clojure]] 5 | [lein-expectations "0.0.8" :exclusions [org.clojure/clojure]]] 6 | 7 | ; Generate docs 8 | :codox {:output-path "resources/codox" 9 | :metadata {:doc/format :markdown} 10 | :source-uri "http://github.com/lendup/citadel/blob/master/{filepath}#L{line}"} 11 | 12 | :eastwood {:exclude-namespaces [:test-paths]} 13 | 14 | ; Format code 15 | :cljfmt {:indents 16 | {require [[:block 0]] 17 | ns [[:block 0]] 18 | #"^(?!:require|:import).*" [[:inner 0]]}}} 19 | } 20 | -------------------------------------------------------------------------------- /project.clj: -------------------------------------------------------------------------------- 1 | (defproject dataframe "0.1.0-SNAPSHOT" 2 | 3 | :description "DataFrames in clojure" 4 | 5 | :url "http://example.com/FIXME" 6 | 7 | :license {:name "MIT License" 8 | :url "http://www.opensource.org/licenses/mit-license.php"} 9 | 10 | :dependencies [ 11 | [org.clojure/clojure "1.8.0"] 12 | [net.mikera/core.matrix "0.54.0"] 13 | [net.mikera/vectorz-clj "0.45.0"] 14 | 15 | [org.apache.commons/commons-lang3 "3.0"] 16 | 17 | [expectations "2.1.8"] 18 | 19 | ] 20 | 21 | :source-paths ["src"] 22 | 23 | :java-source-paths ["java"] 24 | 25 | :plugins [[lein-expectations "0.0.7"]] 26 | 27 | ) 28 | -------------------------------------------------------------------------------- /src/dataframe/core.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.core 2 | (:refer-clojure :exclude [group-by]) 3 | (:require [dataframe.series] 4 | [dataframe.frame]) 5 | (:import (dataframe.series Series) 6 | (dataframe.frame Frame))) 7 | 8 | 9 | ; Multi Methods 10 | 11 | (defn delegate 12 | "Deligate the implementation of a multimethod to an existing function" 13 | [multifn 
dispatch-val f] 14 | (.. multifn (addMethod dispatch-val f))) 15 | 16 | (defn first-type 17 | [& args] 18 | (type (first args))) 19 | 20 | (defmulti ix first-type) 21 | (delegate ix Series dataframe.series/ix) 22 | (delegate ix Frame dataframe.frame/ix) 23 | 24 | (defmulti index first-type) 25 | (delegate index Series dataframe.series/index) 26 | (delegate index Frame dataframe.frame/index) 27 | 28 | (defmulti set-index first-type) 29 | (delegate set-index Series dataframe.series/set-index) 30 | (delegate set-index Frame dataframe.frame/set-index) 31 | 32 | (defmulti select first-type) 33 | (delegate select Series dataframe.series/select) 34 | (delegate select Frame dataframe.frame/select) 35 | 36 | (defmulti loc first-type) 37 | (delegate loc Series dataframe.series/loc) 38 | (delegate loc Frame dataframe.frame/loc) 39 | 40 | (defmulti subset first-type) 41 | (delegate subset Series dataframe.series/subset) 42 | (delegate subset Frame dataframe.frame/subset) 43 | 44 | (defmulti head first-type) 45 | (delegate head Series dataframe.series/head) 46 | (delegate head Frame dataframe.frame/head) 47 | 48 | (defmulti tail first-type) 49 | (delegate tail Series dataframe.series/tail) 50 | (delegate tail Frame dataframe.frame/tail) 51 | 52 | 53 | ; Imported series methods 54 | 55 | (def series dataframe.series/series) 56 | (def series? dataframe.series/series?) 
57 | (def values dataframe.series/values) 58 | (def update-key dataframe.series/update-key) 59 | (def mapvals dataframe.series/mapvals) 60 | 61 | (def lt dataframe.series/lt) 62 | (def lte dataframe.series/lte) 63 | (def gt dataframe.series/gt) 64 | (def gte dataframe.series/gte) 65 | (def add dataframe.series/add) 66 | (def sub dataframe.series/sub) 67 | (def mul dataframe.series/mul) 68 | (def div dataframe.series/div) 69 | (def eq dataframe.series/eq) 70 | (def neq dataframe.series/neq) 71 | 72 | 73 | ; Imported frame methods 74 | 75 | (def frame dataframe.frame/frame) 76 | (def col dataframe.frame/col) 77 | (def column-map dataframe.frame/column-map) 78 | (def columns dataframe.frame/columns) 79 | (def assoc-ix dataframe.frame/assoc-ix) 80 | (def assoc-col dataframe.frame/assoc-col) 81 | (def iterrows dataframe.frame/iterrows) 82 | (def maprows->srs dataframe.frame/maprows->srs) 83 | (def maprows->df dataframe.frame/maprows->df) 84 | (def sort-rows dataframe.frame/sort-rows) 85 | (def group-by dataframe.frame/group-by) 86 | (def group-by-fn dataframe.frame/group-by-fn) 87 | (def join dataframe.frame/join) 88 | 89 | (defmacro with-> [& args] `(dataframe.frame/with-> ~@args)) 90 | -------------------------------------------------------------------------------- /src/dataframe/frame.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.frame 2 | (:refer-clojure :exclude [group-by]) 3 | (:require [dataframe.series :as series] 4 | [clojure.string :as str] 5 | [dataframe.series :as series] 6 | [dataframe.util :refer :all] 7 | [clojure.set :as set] 8 | [clojure.core :as core]) 9 | (:import (java.util Map) 10 | (dataframe TableBuilder))) 11 | 12 | (declare frame 13 | assoc-ix 14 | assoc-col 15 | iterrows 16 | columns 17 | print-row 18 | rows->vectors 19 | set-index 20 | -seq->frame 21 | -list-of-row-maps->frame 22 | -list-of-index-row-pairs->frame 23 | -map->frame 24 | -map-of-series->frame 25 | -map-of-sequence->frame) 
26 | 27 | ; A Frame can be interpreted as: 28 | ; - A Map of index keys to maps of values 29 | ; - A Map of column names to Series as columns 30 | ; 31 | ; A Frame supports 32 | ; - Order 1 lookup of row maps by index key 33 | ; - Order 1 lookup of [index row] pairs by position (nth) 34 | ; - Order 1 lookup of columns by name 35 | ; 36 | ; A Frame does not guarantee column order 37 | ; 38 | ; When viewed as a Clojure IPersistentCollection, it is a 39 | ; collection of [index row] pairs, where a row is a map 40 | ; of [column val] pairs (for the purpose of seq and cons). 41 | ; When viewed as an association, it is a map from index 42 | ; keys to row maps. 43 | (deftype Frame [index column-map] 44 | 45 | java.lang.Object 46 | (equals [this other] 47 | (cond (nil? other) false 48 | (not (= Frame (class other))) false 49 | :else (and 50 | (= (. this index) (. other index)) 51 | (= (. this column-map) (. other column-map))))) 52 | (hashCode [this] 53 | (hash [(hash (. this index)) (hash (. this column-map))])) 54 | 55 | java.lang.Iterable 56 | (iterator [this] 57 | (.iterator (iterrows this))) 58 | 59 | clojure.lang.Counted 60 | (count [this] (count index)) 61 | 62 | clojure.lang.IPersistentCollection 63 | (seq [this] (if (empty? index) 64 | nil 65 | (iterrows this))) 66 | ;Takes a vector pair of [idx row], 67 | ;where row is a map, and returns a 68 | ;Frame extended by one row. 69 | (cons [this other] 70 | (assert (vector? other)) 71 | (assert (= 2 (count other))) 72 | (assert (map? (last other))) 73 | (let [[idx m] other] 74 | (assoc-ix this idx m))) 75 | (empty [this] (Frame. [] {})) 76 | (equiv [this other] (.. 
this (equals other)))) 77 | 78 | ; It has an index for row-wise lookups 79 | (defn frame 80 | "Create a Frame from one of the following inputs: 81 | 82 | - A map of column keys to sequences representing column values 83 | - A map of column keys to Series representing column values 84 | - A sequence of row maps, or of [index row-map] pairs 85 | " 86 | ([data index] (set-index (frame data) index)) 87 | ([data] 88 | (cond 89 | (map? data) (-map->frame data) 90 | (seq? data) (-seq->frame data) 91 | (vector? data) (-seq->frame data) 92 | :else (throw (new Exception "Encountered unexpected type for frame constructor"))))) 93 | 94 | 95 | (def empty-frame (Frame. [] {})) 96 | 97 | (defmethod print-method Frame 98 | [df writer] 99 | (.write writer (str (class df) "\n")) 100 | (if (empty? df) 101 | (.write writer "[]") 102 | (.write writer 103 | (let [table (new TableBuilder "" (columns df))] 104 | (doall (for [[idx row] (rows->vectors df)] 105 | (. table (addRow idx row)))) 106 | (. table toString))))) 107 | 108 | 109 | (defn ^{:protected true} -map->frame 110 | [^Map data-map] 111 | 112 | ; Ensure all values have the same length 113 | (if (not (empty? data-map)) 114 | (assert (apply = (map count (vals data-map))))) 115 | 116 | (let [k->srs (into {} 117 | (for [[k xs] data-map] 118 | (if (series/series? xs) 119 | [k xs] 120 | [k (series/series xs)])))] 121 | 122 | (-map-of-series->frame k->srs))) 123 | 124 | (defn ^{:protected true} -map-of-series->frame 125 | "Takes a map of column keys to Series objects 126 | representing column values. 127 | Return a Frame." 128 | [map-of-srs] 129 | 130 | ; Assert all the indices are aligned 131 | (if (not (empty? map-of-srs)) 132 | (assert (apply = (map series/index (vals map-of-srs))))) 133 | 134 | (if (empty? map-of-srs) 135 | (Frame. [] {}) 136 | 137 | (let [any-index (series/index (nth (vals map-of-srs) 0))] 138 | (Frame. 
any-index map-of-srs)))) 139 | 140 | (defn ^{:protected true} -seq->frame 141 | "Take a list of either maps 142 | (each representing a row) 143 | or pairs of index->maps. 144 | Return a Frame." 145 | [s] 146 | (if (map? (first s)) 147 | (-list-of-row-maps->frame s) 148 | (-list-of-index-row-pairs->frame s))) 149 | 150 | (defn ^{:protected true} -list-of-row-maps->frame 151 | "Take a list of maps (each representing a row 152 | with keys as columns and vals as row values) 153 | and return a Frame" 154 | [row-maps] 155 | (let [index (range (count row-maps)) 156 | columns (into #{} (flatten (map keys row-maps))) 157 | col->vec (into {} (for [col columns] 158 | [col (vec (map #(get % col nil) row-maps))])) 159 | col->srs (into {} (for [[col vals] col->vec] 160 | [col (series/series vals index)]))] 161 | 162 | (-map-of-series->frame col->srs))) 163 | 164 | (defn ^{:protected true} -list-of-index-row-pairs->frame 165 | "Take a list of pairs 166 | of index values to row-maps 167 | and return a Frame." 168 | [seq-of-idx->maps] 169 | 170 | (let [index (into [] (map first seq-of-idx->maps)) 171 | row-maps (map last seq-of-idx->maps) 172 | columns (into #{} (filter (comp not nil?) (flatten (map keys row-maps)))) 173 | col->vec (into {} (for [col columns] 174 | [col (vec (map #(get % col nil) row-maps))])) 175 | col->srs (into {} (for [[col vals] col->vec] 176 | [col (series/series vals index)]))] 177 | 178 | (-map-of-series->frame col->srs))) 179 | 180 | (defn index 181 | [^Frame frame] 182 | (. frame index)) 183 | 184 | (defn column-map 185 | [^Frame frame] 186 | (. frame column-map)) 187 | 188 | (defn columns 189 | [^Frame frame] 190 | (keys (column-map frame))) 191 | 192 | (defn set-index 193 | [^Frame frame index] 194 | (Frame. 
index (into {} (for [[col srs] (column-map frame)] 195 | [col (series/set-index srs index)])))) 196 | 197 | (defn assoc-ix 198 | "Takes a key of the index type and a map 199 | of column names to values and returns a 200 | frame with a new row added corresponding 201 | to the input index and column map." 202 | [^Frame df i row-map] 203 | 204 | (assert (map? row-map)) 205 | 206 | (let [new-columns (into {} 207 | (for [[k srs] (column-map df)] 208 | [k (conj srs [i (get row-map k nil)])])) 209 | new-index (conj (index df) i)] 210 | (frame new-columns new-index))) 211 | 212 | (defn assoc-col 213 | "Takes a column name and a column 214 | (a Series or a sequence of values) and 215 | returns a frame with that column added, 216 | replacing any existing column of that name." 217 | [^Frame df col-name col] 218 | 219 | (let [col (if (series/series? col) 220 | col 221 | (series/series col (index df)))] 222 | (frame (assoc (column-map df) col-name col) 223 | (index df)))) 224 | 225 | 226 | (defn ^{:protected true} print-row 227 | [row] 228 | (str/join 229 | \tab 230 | (into [] 231 | (map #(if (nil? %) "nil" %) row)))) 232 | 233 | 234 | (defn ix 235 | "Get the 'row' of the input dataframe 236 | corresponding to the input index. 237 | 238 | The 'row' is a Series corresponding to the 239 | input index applied to every column 240 | in this dataframe, where the index of 241 | the new series is the column names. 
242 | 243 | If no row matching the index exists, 244 | return nil 245 | " 246 | [df i] 247 | (if (some #(= i %) (index df)) 248 | (series/series (map #(series/ix % i) (-> df column-map vals)) (-> df column-map keys)) 249 | nil)) 250 | 251 | (defn col 252 | "Return the column from the dataframe 253 | by the given name as a Series" 254 | [df col-name] 255 | (get (column-map df) col-name)) 256 | 257 | (defn rows->vectors 258 | "Return an iterator key-val pairs 259 | of index values to row values (as a vector)" 260 | [df] 261 | (zip 262 | (index df) 263 | (apply zip (map series/values (vals (column-map df)))))) 264 | 265 | (defn iterrows 266 | "Return an iterator over vectors 267 | of key-val pairs of the row's 268 | index value and the value of that 269 | row as a map" 270 | [df] 271 | (for [idx (index df)] 272 | [idx (into {} (for [[col srs] (column-map df)] 273 | [col (series/ix srs idx)]))])) 274 | 275 | (defn maprows->srs 276 | "Apply the function to each row in the DataFrame 277 | (where the representation of each row is a map of 278 | column names to values). 279 | Return a Series whose index is the index of the 280 | original DataFrame and whose value is the value 281 | of the applied function." 282 | [^Frame df f] 283 | (let [rows (for [[_ row] (iterrows df)] 284 | (f row))] 285 | (series/series rows (index df)))) 286 | 287 | (defn maprows->df 288 | "Apply the function to each row in the DataFrame 289 | (where the representation of each row is a map of 290 | column names to values). The function should return 291 | a Map. 292 | Return a DataFrame whose index is the same as the 293 | original dataframe and whose columns are the values 294 | of the maps returned by the function." 295 | [^Frame df f] 296 | (let [rows (for [[idx row] (iterrows df)] 297 | [idx (f row)])] 298 | (-list-of-index-row-pairs->frame rows))) 299 | 300 | 301 | 302 | (defn indices-alignable? 
303 | [idx-left idx-right] 304 | (= (sort idx-left) (sort idx-right))) 305 | 306 | (defn loc 307 | "Take a Frame and a list of indices. 308 | Return a DataFrame consisting only of 309 | the input index rows (in the order of 310 | the given index). 311 | If an entry in indices is not in the 312 | input Frame, then every column is nil for that row." 313 | [^Frame df indices] 314 | 315 | (if (empty? indices) 316 | empty-frame 317 | (-list-of-index-row-pairs->frame 318 | (into [] (for [i indices] 319 | [i (if-let [row (ix df i)] (series/->map row) {})]))))) 320 | 321 | (defn select 322 | [^Frame df sel] 323 | 324 | (let [sel (if (series/series? sel) sel (series/series sel))] 325 | 326 | (assert (indices-alignable? (index df) (series/index sel))) 327 | 328 | (let [to-keep (for [[[idx keep?] [idx row-map]] (zip sel (loc df (series/index sel))) 329 | :when keep?] 330 | [idx row-map]) 331 | idx (map first to-keep) 332 | vals (map last to-keep)] 333 | (frame vals idx)))) 334 | 335 | 336 | (defn subset 337 | "Return a subset of the input Frame defined by 338 | the start and end indices (which are 339 | integer like) using the index order. 340 | 341 | The subset is inclusive on the start 342 | but exclusive on the end, meaning that 343 | (subset df 0 (count df)) returns the 344 | same frame" 345 | [^Frame df start end] 346 | 347 | (assert (<= start end)) 348 | 349 | (let [last (count df) 350 | srs-begin (min (max 0 start) last) 351 | srs-end (min (max 0 end) last) 352 | subset-index (subvec (index df) srs-begin srs-end) 353 | subset-columns (into {} (for [[name col] (column-map df)] 354 | [name (series/subset col start end)]))] 355 | (frame subset-columns subset-index))) 356 | 357 | (defn head 358 | "Return a frame consisting of the 359 | first n rows of the input frame 360 | using the index order. 361 | 362 | If n > (count df), return the 363 | whole frame."
364 | ([^Frame df] (head df 5)) 365 | ([^Frame df n] (subset df 0 n))) 366 | 367 | (defn tail 368 | "Return a frame consisting of the 369 | last n rows of the input frame 370 | using the index order. 371 | 372 | If n > (count df), return the 373 | whole frame." 374 | ([^Frame df] (tail df 5)) 375 | ([^Frame df n] 376 | (let [start (- (count df) n) 377 | end (count df)] 378 | (subset df start end)))) 379 | 380 | (defn sort-rows 381 | "Sort DataFrame rows using the 382 | given column names in the order 383 | that they appear" 384 | [^Frame df & col-names] 385 | (let [get-sort-key (fn [[idx row-map]] 386 | (into [] (for [col col-names] (get row-map col)))) 387 | sorted-idx-row-pairs (sort-by get-sort-key df)] 388 | (-list-of-index-row-pairs->frame sorted-idx-row-pairs))) 389 | 390 | 391 | (defn add-suffix 392 | "Add a suffix to a name. 393 | A name or suffix can be either 394 | a string or a keyword" 395 | [col-name suffix] 396 | (cond 397 | (string? col-name) (str col-name (name suffix)) 398 | (keyword? col-name) (keyword (str (name col-name) (name suffix))) 399 | :else (throw (new Exception)))) 400 | 401 | 402 | (defn assoc-common-column 403 | "Conditionally associate the input 404 | column and value pair into the input 405 | map. The name used and whether the 406 | association happens at all depend 407 | on the resolution options (suffixes, prefer-column) 408 | and on whether the column comes from the left frame (left?)." 409 | [into-map col-name val left? common-columns 410 | {:keys [suffixes prefer-column] 411 | :or {suffixes ["-x" "-y"] 412 | prefer-column nil}}] 413 | 414 | (assert (contains? #{:left :right nil} prefer-column)) 415 | 416 | (if (not (contains? common-columns col-name)) 417 | (assoc into-map col-name val) 418 | 419 | (if prefer-column 420 | 421 | (case [prefer-column left?]
422 | [:left true] (assoc into-map col-name val) 423 | [:left false] into-map 424 | [:right false] (assoc into-map col-name val) 425 | [:right true] into-map 426 | (throw (new Exception))) 427 | 428 | (let [[left-suffix right-suffix] suffixes 429 | suffix (if left? left-suffix right-suffix)] 430 | 431 | (assoc into-map (add-suffix col-name suffix) val))))) 432 | 433 | 434 | (defn join-index 435 | [left-index right-index join-type] 436 | (case join-type 437 | :left left-index 438 | :right right-index 439 | :outer (concat left-index (filter #(not (contains? (set left-index) %)) right-index)) 440 | :inner (filter #(contains? (set right-index) %) left-index) 441 | (throw (new Exception)))) 442 | 443 | 444 | (defn join 445 | [^Frame left ^Frame right 446 | & {:keys [how suffixes prefer-column] 447 | :or {how :inner 448 | suffixes ["-x" "-y"] 449 | prefer-column nil} 450 | :as kwargs}] 451 | 452 | (let [shared-cols (into #{} (set/intersection (set (columns left)) (set (columns right)))) 453 | idx (join-index (index left) (index right) how) 454 | left-cols (for [[col srs] (column-map left)] [col (series/series (map #(series/ix srs %) idx) idx) true]) 455 | right-cols (for [[col srs] (column-map right)] [col (series/series (map #(series/ix srs %) idx) idx) false]) 456 | all-cols (concat left-cols right-cols) 457 | col-map (reduce (fn [coll [col-name val left?]] 458 | (assoc-common-column coll col-name val left? shared-cols kwargs)) 459 | {} 460 | all-cols)] 461 | (frame col-map idx))) 462 | 463 | 464 | (defn group-by 465 | [^Frame df vals] 466 | (let [srs (if (series/series?
vals) vals (series/series vals)) 467 | grouped-vals (core/group-by (fn [[ix val]] val) srs) 468 | val-index-list (into {} (for [[val ix-val-list] grouped-vals] 469 | [val (into [] (map first ix-val-list))]))] 470 | (into {} (for [[val idx-list] val-index-list] 471 | [val (loc df idx-list)])))) 472 | 473 | (defn group-by-fn 474 | "Group a Frame by the given function, 475 | which must be a function of a row-map, 476 | and return a map of fn vals to Frames" 477 | [^Frame df f] 478 | (let [grouped-idx-rows (core/group-by (fn [[idx row]] (f row)) (iterrows df))] 479 | (into {} (for [[k idx-row-list] grouped-idx-rows] 480 | [k (frame idx-row-list)])))) 481 | 482 | (defn replace-$-with-keys 483 | "Takes a context (typically a map or a Frame), 484 | an expression (containing '$' values) 485 | and a getter function (typically core/get 486 | or frame/col). 487 | Return an expression where each instance of 488 | a $var is replaced with the getter-function 489 | getting :var from the input context. 490 | 491 | In other words, if the expression contains: 492 | 493 | $foo 494 | 495 | it is replaced by: 496 | 497 | (get ctx :foo) 498 | " 499 | [ctx expr get-fn] 500 | (clojure.walk/postwalk 501 | (fn [x] 502 | (if (and 503 | (symbol? x) 504 | (clojure.string/starts-with? (name x) "$")) 505 | `(~get-fn ~ctx ~(keyword (subs (name x) 1))) 506 | x)) 507 | expr)) 508 | 509 | (defmacro with-> 510 | "A threading macro intended to thread 511 | expressions on data frames. 512 | Automatically replaces symbols starting 513 | with '$' with columns from the last 514 | DataFrame that was encountered 515 | in the threading." 516 | [df & exprs] 517 | (if (empty?
exprs) 518 | df 519 | (let [sym (gensym) 520 | head (replace-$-with-keys df (first exprs) 'dataframe.frame/col) 521 | tail (rest exprs)] 522 | `(let [~sym (-> ~df ~head)] (with-> ~sym ~@tail))))) 523 | -------------------------------------------------------------------------------- /src/dataframe/series.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.series 2 | (:refer-clojure) 3 | (:require [dataframe.util :refer :all] 4 | [clojure.string :as str]) 5 | (:import (clojure.lang IPersistentVector IPersistentMap MapEntry))) 6 | 7 | (declare series 8 | update-key) 9 | 10 | ; 11 | ; A Series is a data structure that maps an index 12 | ; to values. It supports: 13 | ; - O(1) access to values by index 14 | ; - O(1) access to [index value] pairs by position (nth) 15 | ; - Maintaining the order of [index value] pairs for iteration 16 | ; 17 | ; Viewed as a Clojure Persistent collection, it is a collection 18 | ; of [index value] pairs. 19 | ; It is also Associative between the index keys and its values 20 | (deftype Series [^IPersistentVector values 21 | ^IPersistentVector index 22 | ^IPersistentMap lookup] 23 | 24 | java.lang.Object 25 | (equals [this other] 26 | (cond (nil? other) false 27 | (not (= Series (class other))) false 28 | :else (every? true? 29 | [(= (. this values) (. other values)) 30 | (= (. this index) (. other index))]))) 31 | (hashCode [this] 32 | (hash [(hash (. this index)) (hash (. this values))])) 33 | 34 | java.lang.Iterable 35 | (iterator [this] 36 | (.iterator (zip index values))) 37 | 38 | clojure.lang.Counted 39 | (count [this] (count index)) 40 | 41 | clojure.lang.IPersistentCollection 42 | (seq [this] (if (empty? index) 43 | nil 44 | (zip (. this index) (. this values)))) 45 | ; Return a sequence of key-val pairs 46 | (cons [this other] 47 | (assert (vector?
other)) 48 | (assert (= 2 (count other))) 49 | (assoc this (first other) (last other))) 50 | ;(cons (.iterator this) other)) 51 | (empty [this] (empty? index)) 52 | (equiv [this other] (.. this (equals other))) 53 | 54 | clojure.lang.ILookup 55 | (valAt [this i] (.. this (valAt i nil))) 56 | (valAt [this i or-else] (if-let [n (get (. this lookup) i)] 57 | (nth (. this values) n) 58 | or-else)) 59 | 60 | clojure.lang.Associative 61 | (containsKey [this key] 62 | (contains? lookup key)) 63 | (entryAt [this key] 64 | (MapEntry/create key (.. this (valAt key)))) 65 | ;Takes a key of the index type and map 66 | ; of column names to values and return a 67 | ; frame with a new row added corresponding 68 | ; to the input index and column map." 69 | (assoc [this idx val] 70 | (if (contains? this idx) 71 | (series (assoc values (get lookup idx) val) 72 | index) 73 | (series (conj values val) (conj index idx))))) 74 | 75 | ; Constructor 76 | (defn series 77 | 78 | ([data] (series data (range (count data)))) 79 | 80 | ([data index] 81 | 82 | (let [data (->vector data) 83 | index (->vector index) 84 | lookup (into {} (enumerate index false))] 85 | 86 | (assert (apply distinct? index)) 87 | (assert (= (count data) (count index))) 88 | (if (not (every? nil? data)) 89 | (assert (apply = (map type (filter (comp not nil?) data))))) 90 | 91 | (Series. data index lookup)))) 92 | 93 | (defmethod print-method Series [^Series srs writer] 94 | (.write writer (str (class srs) 95 | "\n" 96 | (str/join "\n" 97 | (map 98 | (fn [[i d]] 99 | (str i " " (if (nil? d) "nil" d))) 100 | (zip (. srs index) (. srs values))))))) 101 | 102 | (defn series? 103 | [x] 104 | (instance? Series x)) 105 | 106 | (defn index 107 | [^Series srs] 108 | (. srs index)) 109 | 110 | (defn values 111 | [^Series srs] 112 | (. 
srs values)) 113 | 114 | (defn ix 115 | "Takes a series and an index and returns 116 | the item in the series corresponding 117 | to the input index" 118 | ([^Series srs i] (get srs i nil)) 119 | ([^Series srs i or-else] (get srs i or-else))) 120 | 121 | (defn loc 122 | "Take a Series and a list of indices. 123 | Return a Series consisting only of 124 | the input index rows (in the order of 125 | the given index). 126 | If an entry in indices is not in the 127 | input Series, then its value will be nil" 128 | [^Series srs indices] 129 | (if (empty? indices) 130 | (series []) 131 | (series 132 | (for [i indices] (ix srs i)) 133 | indices))) 134 | 135 | 136 | (defn set-index 137 | "Return a series with the same values 138 | but with the updated index." 139 | [^Series srs index] 140 | (series (values srs) 141 | (->vector index))) 142 | 143 | (defn mapvals 144 | "Apply the function to all vals in the Series, 145 | returning a new Series consisting of these 146 | transformed vals with their indices." 147 | [^Series srs f] 148 | (series (map f (values srs)) (index srs))) 149 | 150 | (defn select 151 | "Takes a series and a list of possibly-true values 152 | and returns a series containing only vals that 153 | line up with truthy values" 154 | [^Series srs selection] 155 | 156 | (assert (= (count srs) (count selection))) 157 | 158 | (let [selection (if (series? selection) (values selection) selection) 159 | to-keep (for [[keep? [idx val]] (zip selection srs) 160 | :when keep?] 161 | [idx val]) 162 | idx (map #(nth % 0) to-keep) 163 | vals (map #(nth % 1) to-keep)] 164 | 165 | (series vals idx))) 166 | 167 | (defn subset 168 | "Return a subseries defined by 169 | the start and end indices (which are 170 | integer like) using the index order.
171 | 172 | The subset is inclusive on the start 173 | but exclusive on the end, meaning that 174 | (subset srs 0 (count srs)) returns the 175 | same series" 176 | [^Series srs start end] 177 | 178 | (assert (<= start end)) 179 | 180 | (let [last (count srs) 181 | srs-begin (min (max 0 start) last) 182 | srs-end (min (max 0 end) last)] 183 | (series 184 | (subvec (values srs) srs-begin srs-end) 185 | (subvec (index srs) srs-begin srs-end)))) 186 | 187 | (defn head 188 | "Return a subseries consisting of the 189 | first n elements of the input series 190 | using the index order. 191 | 192 | If n > (count srs), return the 193 | whole series." 194 | ([^Series srs] (head srs 5)) 195 | ([^Series srs n] (subset srs 0 n))) 196 | 197 | (defn tail 198 | "Return a subseries consisting of the 199 | last n elements of the input series 200 | using the index order. 201 | 202 | If n > (count srs), return the 203 | whole series." 204 | ([^Series srs] (tail srs 5)) 205 | ([^Series srs n] 206 | (let [start (- (count srs) n) 207 | end (count srs)] 208 | (subset srs start end)))) 209 | 210 | 211 | (defn ->map 212 | [^Series srs] 213 | (into {} srs)) 214 | 215 | (defn index-aligned-pairs 216 | "Take two series and return 217 | a joined index and a sequence 218 | over pairs of the left and right 219 | series" 220 | [^Series left ^Series right] 221 | 222 | (if (= (index left) (index right)) 223 | 224 | [(index left) (zip (values left) (values right))] 225 | 226 | (let [left-idx (index left) 227 | right-only-idx (->> right index (filter #(not (contains? left %)))) 228 | idx (concat left-idx right-only-idx) 229 | vals (for [i idx] [(ix left i) (ix right i)])] 230 | [idx vals]))) 231 | 232 | (defn join-map 233 | "Takes a function of two arguments and 234 | applies it to the pairs in the outer join of the 235 | two input series, returning a new Series." 
236 | [f ^Series x ^Series y] 237 | (let [[idx pairs] (index-aligned-pairs x y) 238 | vals (for [[l r] pairs] (f l r))] 239 | (series vals idx))) 240 | 241 | (defn ^{:protected true} broadcast 242 | "Take a binary function and turn it into 243 | a broadcasted function so that it can 244 | operate on Series in any of its arguments" 245 | [f] 246 | (fn [x y] 247 | (cond 248 | (and (instance? Series x) (instance? Series y)) (join-map f x y) 249 | (instance? Series x) (series (for [l (values x)] 250 | (f l y)) 251 | (index x)) 252 | (instance? Series y) (series (for [r (values y)] 253 | (f x r)) 254 | (index y)) 255 | :else (f x y)))) 256 | 257 | (defn ^{:protected true} multi-broadcast 258 | "Take a function of any arity and turn it into 259 | a broadcasted function so that it can 260 | operate on Series in any of its arguments" 261 | [f] 262 | (fn [x & args] 263 | (loop [x x 264 | args args] 265 | (if (empty? args) 266 | x 267 | (recur ((broadcast f) x (first args)) 268 | (rest args)))))) 269 | 270 | (def lt (broadcast (nillify <))) 271 | (def lte (broadcast (nillify <=))) 272 | (def gt (broadcast (nillify >))) 273 | (def gte (broadcast (nillify >=))) 274 | 275 | (def add (multi-broadcast (nillify +))) 276 | (def sub (multi-broadcast (nillify -))) 277 | (def mul (multi-broadcast (nillify *))) 278 | (def div (multi-broadcast (nillify /))) 279 | 280 | (def eq (multi-broadcast (nillify =))) 281 | (def neq (multi-broadcast (comp (nillify not) (nillify =)))) 282 | 283 | -------------------------------------------------------------------------------- /src/dataframe/util.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.util) 2 | 3 | (defn zip 4 | "Take a number of iterables and 5 | return a single iterable over vectors, 6 | each containing the ith element of each 7 | input iterable (ordered by the order of 8 | the input iterables).
9 | If the input iterables are not of the same 10 | length, the returned iterable is as long as 11 | the shortest input iterable (data from 12 | longer input iterables will not be 13 | returned). 14 | " 15 | [& args] 16 | (apply map vector args)) 17 | 18 | (defn enumerate 19 | ([xs] (enumerate xs true)) 20 | ([xs index-first?] 21 | (if index-first? 22 | (zip (range) xs) 23 | (zip xs (range))))) 24 | 25 | (defn ->vector 26 | [x] 27 | (if (vector? x) 28 | x 29 | (vec x))) 30 | 31 | (defn nillify 32 | "Takes a function and returns a function 33 | that returns nil if any argument is nil." 34 | [f] 35 | (fn [& args] 36 | (if (some nil? args) 37 | nil 38 | (apply f args)))) 39 | -------------------------------------------------------------------------------- /test/dataframe/core_test.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.core-test 2 | (:refer-clojure :exclude [group-by]) 3 | (:require [clojure.test :refer :all] 4 | [dataframe.core :refer :all] 5 | [dataframe.frame :as frame])) 6 | 7 | ; 8 | ; 9 | ;(expect (more-of df 10 | ; 11 | ; ) 12 | ; 13 | ; (let [df (frame/frame {:a '(1 2 3) :b '(2 4 6)}) 14 | ; a-min (with-df-> f (frame/filter :a (< 10) -------------------------------------------------------------------------------- /test/dataframe/frame_test.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.frame-test 2 | (:require [dataframe.frame :as frame :refer [index]] 3 | [expectations :refer [expect expect-focused more-of]] 4 | [dataframe.series :as series] 5 | [clojure.core :as core])) 6 | 7 | ; Constructors 8 | 9 | (expect (frame/frame {:a '(1 2 3) :b '(2 4 6)} [:x :y :z]) 10 | (frame/frame [[:x {:a 1 :b 2}] 11 | [:y {:a 2 :b 4}] 12 | [:z {:a 3 :b 6}]])) 13 | 14 | (expect '(0 1 2) 15 | (let [df (frame/frame {:a '(1 2 3) :b '(2 4 6)})] 16 | (index df))) 17 | 18 | (expect (series/series '(1 2 3) '(0 1 2)) 19 | (-> (frame/frame {:a '(1 2 3) :b '(2 4 6)}) 20
| (frame/col :a))) 21 | 22 | (expect nil 23 | (-> (frame/frame {:a '(1 2 3) :b '(2 4 6)}) 24 | (frame/col :c))) 25 | 26 | (expect (series/series [1 2] [:a :b]) 27 | (-> (frame/frame {:a '(1 2 3) :b '(2 4 6)} [:x :y :z]) 28 | (frame/ix :x))) 29 | 30 | (expect '([:x {:a 1 :b 2}] 31 | [:y {:a 2 :b 4}] 32 | [:z {:a 3 :b 6}]) 33 | (-> (frame/frame {:a '(1 2 3) :b '(2 4 6)} [:x :y :z]) 34 | frame/iterrows)) 35 | 36 | ; Assert the iterator over a dataframe 37 | ; iterates over [index row-map] pairs 38 | (expect '([:x {:a 1 :b 2}] 39 | [:y {:a 2 :b 4}] 40 | [:z {:a 3 :b 6}]) 41 | (for [x (frame/frame {:a '(1 2 3) :b '(2 4 6)} [:x :y :z])] 42 | x)) 43 | 44 | (expect (frame/frame [[:x {:a 1 :b 2}] 45 | [:y {:a 2 :b 4}] 46 | [:z {:a 3 :b 6}]]) 47 | (conj 48 | (frame/frame {:a '(1 2) :b '(2 4)} [:x :y]) 49 | [:z {:a 3 :b 6}])) 50 | 51 | (expect false 52 | (empty? (frame/frame {:a '(1 2) :b '(2 4)} [:x :y]))) 53 | 54 | (expect true 55 | (empty? (frame/frame {} []))) 56 | 57 | (expect (series/series [3 6 9] [:x :y :z]) 58 | (frame/maprows->srs 59 | (frame/frame {:a '(1 2 3) :b '(2 4 6)} [:x :y :z]) 60 | (fn [row] (+ (:a row) (:b row))))) 61 | 62 | (expect (frame/frame {:bar [-1 -2 -3] :foo [3 6 9]} [:x :y :z]) 63 | (frame/maprows->df 64 | (frame/frame {:a '(1 2 3) :b '(2 4 6)} [:x :y :z]) 65 | (fn [row] {:foo (+ (:a row) (:b row)) :bar (- (:a row) (:b row))}))) 66 | 67 | (expect (frame/frame {:a [1 2 3] :b [4 5 6]} [:x :y :z]) 68 | (frame/frame 69 | [{:a 1 :b 4} {:a 2 :b 5} {:a 3 :b 6}] 70 | [:x :y :z])) 71 | 72 | (expect (frame/frame [{:a 2 :b 6} {:a 4 :b 8}] [:x :z]) 73 | (frame/select 74 | (frame/frame 75 | {:a [1 2 3 4] :b [5 6 7 8]} 76 | [:w :x :y :z]) 77 | (series/series [false true nil true] [:w :x :y :z]))) 78 | 79 | 80 | (expect (frame/frame {:a [1 3] :b [20 30]} [1 2]) 81 | (frame/loc 82 | (frame/frame {:a [1 1 3] :b [10 20 30]}) [1 2])) 83 | 84 | (expect (frame/frame {:a [1 3 nil nil] :b [20 30 nil nil]} [1 2 10 20]) 85 | (frame/loc 86 | (frame/frame {:a [1 1 3] :b [10
20 30]}) [1 2 10 20])) 87 | 88 | ;(expect (series/series [15]) 89 | ; (frame/with-context 90 | ; (frame/frame [{:b 10}]) 91 | ; (series/add 5 $b))) 92 | 93 | 94 | (expect '(+ 5 (core/get {:b 10} :b)) 95 | (frame/replace-$-with-keys {:b 10} '(+ 5 $b) 'core/get)) 96 | 97 | (expect 15 98 | (eval (frame/replace-$-with-keys {:b 10} '(+ 5 $b) get))) 99 | 100 | (expect 15 101 | (frame/with-> 12 (+ 5) (- 2))) 102 | 103 | (expect (frame/frame {:a [1 2] :z [1 2]}) 104 | (frame/with-> (frame/frame {:a [1 2]}) (frame/assoc-col :z $a))) 105 | 106 | (expect 20 107 | (frame/with-> {:x {:y 20}} :x :y)) 108 | 109 | (expect (frame/frame [{:a 3 :b 300}] [2]) 110 | 111 | (let [df (frame/frame {:a [1 2 3] :b [100 200 300]})] 112 | (frame/with-> df (frame/select (series/gt $a 2))))) 113 | 114 | (expect (frame/frame {:a [1 2 3] :b [100 200 300] :c [10 20 30]}) 115 | (let [df (frame/frame {:a [1 2] :b [100 200] :c [10 20]})] 116 | (frame/assoc-ix df 2 {:a 3 :b 300 :c 30}))) 117 | 118 | (expect (frame/frame {:a [1 2] :b [100 200] :c [10 20] :d [5 10]}) 119 | (let [df (frame/frame {:a [1 2] :b [100 200] :c [10 20]})] 120 | (frame/assoc-col df :d [5 10]))) 121 | 122 | (expect (frame/frame {:a [1] :b [2]} [:x]) 123 | (frame/head 124 | (frame/frame [[:x {:a 1 :b 2}] 125 | [:y {:a 2 :b 4}] 126 | [:z {:a 3 :b 6}]]) 127 | 1)) 128 | 129 | (expect 3 130 | (count (frame/frame {:a '(1 2 3) :b '(2 4 6)}))) 131 | 132 | (expect true 133 | (= (frame/frame {:a '(1 2 3) :b '(2 4 6)}) 134 | (frame/frame {:a '(1 2 3) :b '(2 4 6)}))) 135 | 136 | (expect true 137 | (= (frame/frame {:b '(2 4 6) :a '(1 2 3)}) 138 | (frame/frame {:a '(1 2 3) :b '(2 4 6)}))) 139 | 140 | (expect false 141 | (= (frame/frame {:a '(1 2 5) :b '(2 4 6)}) 142 | (frame/frame {:a '(1 2 3) :b '(2 4 6)}))) 143 | 144 | (expect (frame/frame {:a [2 4 7] :b [4 2 8]} [:y :x :z]) 145 | (frame/sort-rows (frame/frame [[:x {:a 4, :b 2}] [:y {:a 2, :b 4}] [:z {:a 7, :b 8}]]) 146 | :a)) 147 | 148 | (expect (frame/frame {:a [1 2 3] 149 | :b [10 20 30] 
150 | :c [1 2 3] 151 | :d [10 20 30]}) 152 | (frame/join 153 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 154 | (frame/frame {:c [1 2 3] :d [10 20 30]}) 155 | :how :outer)) 156 | 157 | (expect (frame/frame {:a [1 2 3 nil nil] 158 | :b [10 20 30 nil nil] 159 | :c [nil nil 1 2 3] 160 | :d [nil nil 10 20 30]}) 161 | (frame/join 162 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 163 | (frame/frame {:c [1 2 3] :d [10 20 30]} [2 3 4]) 164 | :how :outer)) 165 | 166 | 167 | (expect (frame/frame {:a-y [4 5 6] 168 | :a-x [1 2 3] 169 | :b [10 20 30] 170 | :c [100 200 300]}) 171 | (frame/join 172 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 173 | (frame/frame {:a [4 5 6] :c [100 200 300]}) 174 | :how :outer)) 175 | 176 | 177 | (expect (frame/frame {:a-x [3] 178 | :b [30] 179 | :a-y [1] 180 | :d [10]} 181 | [2]) 182 | (frame/join 183 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 184 | (frame/frame {:a [1 2 3] :d [10 20 30]} [2 3 4]) 185 | :how :inner)) 186 | 187 | (expect (frame/frame {:a-x [1 2 3] 188 | :b [10 20 30] 189 | :a-y [nil nil 1] 190 | :d [nil nil 10]} 191 | [0 1 2]) 192 | (frame/join 193 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 194 | (frame/frame {:a [1 2 3] :d [10 20 30]} [2 3 4]) 195 | :how :left)) 196 | 197 | 198 | ; Test handling of common columns 199 | 200 | (expect "foobar" (frame/add-suffix "foo" "bar")) 201 | 202 | (expect "foobar" (frame/add-suffix "foo" "bar")) 203 | 204 | (expect "foobar" (frame/add-suffix "foo" :bar)) 205 | 206 | (expect :foobar (frame/add-suffix :foo :bar)) 207 | 208 | (expect :foobar (frame/add-suffix :foo "bar")) 209 | 210 | (expect {:foo 10} (frame/assoc-common-column {} :foo 10 true #{:foo :bar} {:prefer-column :left})) 211 | 212 | (expect {} (frame/assoc-common-column {} :foo 10 true #{:foo :bar} {:prefer-column :right})) 213 | 214 | (expect {:baz 10} (frame/assoc-common-column {} :baz 10 true #{:foo :bar} {:prefer-column :right})) 215 | 216 | (expect {:baz 10} (frame/assoc-common-column {} :baz 10 true #{:foo :bar} {:suffixes ["-left" 
"-right"]})) 217 | 218 | (expect {:foo-left 10} (frame/assoc-common-column {} :foo 10 true #{:foo :bar} {:suffixes ["-left" "-right"]})) 219 | 220 | (expect {:foo-right 10} (frame/assoc-common-column {} :foo 10 false #{:foo :bar} {:suffixes ["-left" "-right"]})) 221 | 222 | 223 | (expect (more-of grouped 224 | (frame/frame {:a [1 2] :b [10 20]} [0 1]) (:foo grouped) 225 | (frame/frame {:a [3] :b [30]} [2]) (:bar grouped)) 226 | (frame/group-by 227 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 228 | [:foo :foo :bar])) 229 | 230 | (expect (more-of grouped 231 | (frame/frame {:a [3 2] :b [30 20]} [2 1]) (:foo grouped) 232 | (frame/frame {:a [1] :b [10]} [0]) (:bar grouped)) 233 | (frame/group-by 234 | (frame/frame {:a [1 2 3] :b [10 20 30]}) 235 | (series/series [:foo :foo :bar] [2 1 0]))) 236 | 237 | 238 | (expect (more-of grouped 239 | (frame/frame {:a [1 1] :b [10 20]} [0 1]) (get grouped 1) 240 | (frame/frame {:a [3] :b [30]} [2]) (get grouped 3)) 241 | (frame/group-by-fn 242 | (frame/frame {:a [1 1 3] :b [10 20 30]}) :a)) 243 | 244 | 245 | (expect (more-of grouped 246 | (frame/frame {:a [1 10] :b [10 1]} [0 2]) (get grouped 11) 247 | (frame/frame {:a [1] :b [5]} [1]) (get grouped 6) 248 | (frame/frame {:a [10] :b [17]} [3]) (get grouped 27)) 249 | (frame/with-> (frame/frame {:a [1 1 10 10] :b [10 5 1 17]}) 250 | (frame/group-by (series/add $a $b)))) 251 | -------------------------------------------------------------------------------- /test/dataframe/pipeline_test.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.pipeline-test 2 | (:refer-clojure :exclude [group-by]) 3 | (:require [dataframe.core :refer :all] 4 | [expectations :refer [expect]])) 5 | 6 | (expect (frame {:a [1 2 3] 7 | :b [10 20 30] 8 | :c [6 7 8] 9 | :d [16 27 38]}) 10 | (with-> (frame {:a [1 2 3] :b [10 20 30]}) 11 | (assoc-col :c (add $a 5)) 12 | (assoc-col :d (add $b $c)))) 13 | 14 | (expect (frame {:c [3 6] :b [2 4] :a [1 2]} [:x :y]) 15 | (let [df (frame
[[:x {:a 1 :b 2}] 16 | [:y {:a 2 :b 4}] 17 | [:z {:a 3 :b 8}]])] 18 | (with-> df 19 | (select (lte $a 2)) 20 | (assoc-col :c (add $a $b)) 21 | (sort-rows :c :b)))) 22 | 23 | (expect (frame {:foo [8 8 14] :bar [0 -2 -3]} [:w :y :z]) 24 | (let [df (frame [[:w {:a 0 :b 8}] 25 | [:x {:a 1 :b 2}] 26 | [:y {:a 2 :b 4}] 27 | [:z {:a 3 :b 8}]])] 28 | (with-> df 29 | (select (and (lte $a 2) (gte $b 4))) 30 | (assoc-col :c (add $a $b)) 31 | (maprows->df (fn [row] {:foo (+ (:a row) (:c row)) 32 | :bar (- (:b row) (:c row))})) 33 | head))) -------------------------------------------------------------------------------- /test/dataframe/series_test.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.series-test 2 | (:require [dataframe.series :as srs :refer [series index]] 3 | [expectations :refer [expect more-of]] 4 | [dataframe.series :as series] 5 | [dataframe.util :as util])) 6 | 7 | (expect '(0 1 2) 8 | (let [my-srs (srs/series '(:x :y :z))] 9 | (index my-srs))) 10 | 11 | (expect 1 12 | (let [my-srs (srs/series '(1 2 3) '("A" "B" "C"))] 13 | (srs/ix my-srs "A"))) 14 | 15 | (expect nil 16 | (let [my-srs (srs/series '(1 2 3) '("A" "B" "C"))] 17 | (srs/ix my-srs "D"))) 18 | 19 | (expect "Bar" 20 | (let [my-srs (srs/series '(1 2 3) '("A" "B" "C"))] 21 | (srs/ix my-srs "D" "Bar"))) 22 | 23 | (expect AssertionError 24 | (let [my-srs (srs/series '(1 2 3) '("A" "B" "A"))] 25 | (srs/ix my-srs "D"))) 26 | 27 | ; Test iteration 28 | (expect '([:a 1] [:b 2] [:c 3]) 29 | (map identity 30 | (srs/series [1 2 3] [:a :b :c]))) 31 | 32 | (expect (srs/series [2 4 6] [:a :b :c]) 33 | (srs/mapvals 34 | (srs/series [1 2 3] [:a :b :c]) 35 | #(* 2 %))) 36 | 37 | (expect (srs/series [1 2 3] [:c :d :e]) 38 | (srs/set-index 39 | (srs/series [1 2 3] [:a :b :c]) 40 | [:c :d :e])) 41 | 42 | (expect (srs/series [2 4] [:b :d]) 43 | (srs/select 44 | (srs/series [1 2 3 4] [:a :b :c :d]) 45 | [false true nil "true"])) 46 | 47 | (expect (srs/series [false 
false true]) 48 | (srs/gt 49 | (srs/series [1 5 10]) 50 | 5)) 51 | 52 | (expect (srs/series [116 120 125]) 53 | (srs/add 54 | (srs/series [1 5 10]) 55 | 5 56 | 10 57 | (srs/series [100 100 100]))) 58 | 59 | (expect (srs/series [6 nil 15]) 60 | (srs/add 61 | (srs/series [1 nil 10]) 62 | 5)) 63 | 64 | (expect (srs/series [false true false]) 65 | (series/eq (series/series [1 5 10]) 5)) 66 | 67 | (expect (srs/series [true false true]) 68 | (series/neq (series/series [1 5 10]) 5)) 69 | 70 | (expect (more-of srs 71 | (series/series [1] [0]) (series/subset srs 0 1) 72 | (series/series [3 4] [2 3]) (series/subset srs 2 4) 73 | (series/series [1 2] [0 1]) (series/head srs 2) 74 | (series/series [6 7] [5 6]) (series/tail srs 2)) 75 | (series/series [1 2 3 4 5 6 7])) 76 | 77 | (expect 2 78 | (.valAt (series [1 2 3] [:a :b :c]) :b)) 79 | 80 | (expect true 81 | (contains? (series [1 2 3] [:a :b :c]) :b)) 82 | 83 | (expect false 84 | (contains? (series [1 2 3] [:a :b :c]) :d)) 85 | 86 | (expect [:b 2] 87 | (.entryAt (series [1 2 3] [:a :b :c]) :b)) 88 | 89 | (expect (series [1 2 3 4 5 6] [:a :b :c :d :e :f]) 90 | (assoc 91 | (series [1 2 3 4 5] [:a :b :c :d :e]) 92 | :f 6)) 93 | 94 | (expect (series [1 10 3 4 5] [:a :b :c :d :e]) 95 | (assoc 96 | (series [1 2 3 4 5] [:a :b :c :d :e]) 97 | :b 10)) 98 | 99 | (expect 2 100 | (get (series [1 2 3] [:a :b :c]) :b)) 101 | 102 | (expect '([:a 1] [:b 2] [:c 3]) 103 | (seq (series/series [1 2 3] [:a :b :c]))) 104 | 105 | (expect '([:d 4] [:a 1] [:b 2] [:c 3]) 106 | (cons [:d 4] (series/series [1 2 3] [:a :b :c]))) 107 | 108 | (expect '([:d 4] [:a 1] [:b 2] [:c 3]) 109 | (cons [:d 4] (series/series [1 2 3] [:a :b :c]))) 110 | 111 | (expect true 112 | (= (series/series [1 2 3] [:a :b :c]) 113 | (series/series [1 2 3] [:a :b :c]))) 114 | 115 | (expect false 116 | (= (series/series [1 2 3] [:a :b :d]) 117 | (series/series [1 2 3] [:a :b :c]))) 118 | 119 | ; Equality checks order 120 | (expect false 121 | (= (series/series [3 2 1] [:c :b :a]) 
122 | (series/series [1 2 3] [:a :b :c]))) 123 | 124 | ; Check that we iterate as pairs of index->val 125 | (expect '([:a 1] [:b 2] [:c 3]) 126 | (for [x (series/series [1 2 3] [:a :b :c])] 127 | x)) 128 | 129 | (expect ['(:a :b :c :d) '([1 10] [2 nil] [3 20] [nil 30])] 130 | (series/index-aligned-pairs 131 | (series/series [1 2 3] [:a :b :c]) 132 | (series/series [10 20 30] [:a :c :d]))) 133 | 134 | (expect (series/series [11 nil 23 nil] [:a :b :c :d]) 135 | (series/join-map 136 | (util/nillify +) 137 | (series/series [1 2 3] [:a :b :c]) 138 | (series/series [10 20 30] [:a :c :d]))) 139 | 140 | (expect 141 | (series/series [1 2] [0 1]) 142 | (series/loc (series/series [1 2 3]) [0 1])) 143 | 144 | (expect 145 | (series/series [1 2 nil nil] [0 1 1000 2000]) 146 | (series/loc (series/series [1 2 3]) [0 1 1000 2000])) -------------------------------------------------------------------------------- /test/dataframe/util_test.clj: -------------------------------------------------------------------------------- 1 | (ns dataframe.util-test 2 | (:require [expectations :refer [expect]] 3 | [dataframe.util :as util])) 4 | --------------------------------------------------------------------------------