├── .gitignore
├── 01_constructors.ipynb
├── 02_basicinfo.ipynb
├── 03_missingvalues.ipynb
├── 04_loadsave.ipynb
├── 05_columns.ipynb
├── 06_rows.ipynb
├── 07_factors.ipynb
├── 08_joins.ipynb
├── 09_reshaping.ipynb
├── 10_transforms.ipynb
├── 11_performance.ipynb
├── 12_pitfalls.ipynb
├── 13_extras.ipynb
├── LICENSE
├── Manifest.toml
├── Project.toml
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
1 | .ipynb_checkpoints/
2 |
--------------------------------------------------------------------------------
/03_missingvalues.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to DataFrames\n", 8 | "**[Bogumił Kamiński](http://bogumilkaminski.pl/about/), February 13, 2023**" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 1, 14 | "metadata": {}, 15 | "outputs": [], 16 | "source": [ 17 | "using DataFrames" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": {}, 23 | "source": [ 24 | "## Handling missing values\n", 25 | "\n", 26 | "The singleton type `Missing` allows us to deal with missing values." 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 2, 32 | "metadata": {}, 33 | "outputs": [ 34 | { 35 | "data": { 36 | "text/plain": [ 37 | "(missing, Missing)" 38 | ] 39 | }, 40 | "execution_count": 2, 41 | "metadata": {}, 42 | "output_type": "execute_result" 43 | } 44 | ], 45 | "source": [ 46 | "missing, typeof(missing)" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "Arrays automatically create an appropriate union type."
54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": 3, 59 | "metadata": {}, 60 | "outputs": [ 61 | { 62 | "data": { 63 | "text/plain": [ 64 | "4-element Vector{Union{Missing, Int64}}:\n", 65 | " 1\n", 66 | " 2\n", 67 | " missing\n", 68 | " 3" 69 | ] 70 | }, 71 | "execution_count": 3, 72 | "metadata": {}, 73 | "output_type": "execute_result" 74 | } 75 | ], 76 | "source": [ 77 | "x = [1, 2, missing, 3]" 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": [ 84 | "`ismissing` checks if passed value is missing." 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": 4, 90 | "metadata": {}, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "text/plain": [ 95 | "(false, true, false, Bool[0, 0, 1, 0])" 96 | ] 97 | }, 98 | "execution_count": 4, 99 | "metadata": {}, 100 | "output_type": "execute_result" 101 | } 102 | ], 103 | "source": [ 104 | "ismissing(1), ismissing(missing), ismissing(x), ismissing.(x)" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "We can extract the type combined with Missing from a `Union` via `nonmissingtype`\n", 112 | "\n", 113 | "(This is useful for arrays!)" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 5, 119 | "metadata": {}, 120 | "outputs": [ 121 | { 122 | "data": { 123 | "text/plain": [ 124 | "(Union{Missing, Int64}, Int64)" 125 | ] 126 | }, 127 | "execution_count": 5, 128 | "metadata": {}, 129 | "output_type": "execute_result" 130 | } 131 | ], 132 | "source": [ 133 | "eltype(x), nonmissingtype(eltype(x))" 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "`missing` comparisons produce `missing`." 
141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": 6, 146 | "metadata": {}, 147 | "outputs": [ 148 | { 149 | "data": { 150 | "text/plain": [ 151 | "(missing, missing, missing)" 152 | ] 153 | }, 154 | "execution_count": 6, 155 | "metadata": {}, 156 | "output_type": "execute_result" 157 | } 158 | ], 159 | "source": [ 160 | "missing == missing, missing != missing, missing < missing" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": {}, 166 | "source": [ 167 | "This is also true when `missing`s are compared with values of other types." 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 7, 173 | "metadata": {}, 174 | "outputs": [ 175 | { 176 | "data": { 177 | "text/plain": [ 178 | "(missing, missing, missing)" 179 | ] 180 | }, 181 | "execution_count": 7, 182 | "metadata": {}, 183 | "output_type": "execute_result" 184 | } 185 | ], 186 | "source": [ 187 | "1 == missing, 1 != missing, 1 < missing" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "`isequal`, `isless`, and `===` produce results of type `Bool`. Notice that `missing` is considered greater than any numeric value." 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "execution_count": 8, 200 | "metadata": {}, 201 | "outputs": [ 202 | { 203 | "data": { 204 | "text/plain": [ 205 | "(true, true, false, true)" 206 | ] 207 | }, 208 | "execution_count": 8, 209 | "metadata": {}, 210 | "output_type": "execute_result" 211 | } 212 | ], 213 | "source": [ 214 | "isequal(missing, missing), missing === missing, isequal(1, missing), isless(1, missing)" 215 | ] 216 | }, 217 | { 218 | "cell_type": "markdown", 219 | "metadata": {}, 220 | "source": [ 221 | "In the next few examples, we see that many (not all) functions handle `missing`." 
222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": 9, 227 | "metadata": {}, 228 | "outputs": [ 229 | { 230 | "data": { 231 | "text/plain": [ 232 | "4-element Vector{Missing}:\n", 233 | " missing\n", 234 | " missing\n", 235 | " missing\n", 236 | " missing" 237 | ] 238 | }, 239 | "execution_count": 9, 240 | "metadata": {}, 241 | "output_type": "execute_result" 242 | } 243 | ], 244 | "source": [ 245 | "map(x -> x(missing), [sin, cos, zero, sqrt]) # part 1" 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 10, 251 | "metadata": {}, 252 | "outputs": [ 253 | { 254 | "data": { 255 | "text/plain": [ 256 | "5-element Vector{Missing}:\n", 257 | " missing\n", 258 | " missing\n", 259 | " missing\n", 260 | " missing\n", 261 | " missing" 262 | ] 263 | }, 264 | "execution_count": 10, 265 | "metadata": {}, 266 | "output_type": "execute_result" 267 | } 268 | ], 269 | "source": [ 270 | "map(x -> x(missing, 1), [+, -, *, /, div]) # part 2" 271 | ] 272 | }, 273 | { 274 | "cell_type": "code", 275 | "execution_count": 11, 276 | "metadata": {}, 277 | "outputs": [ 278 | { 279 | "data": { 280 | "text/plain": [ 281 | "5-element Vector{Any}:\n", 282 | " missing\n", 283 | " missing\n", 284 | " (missing, missing)\n", 285 | " missing\n", 286 | " Union{Missing, Float64}[1.0, 2.0, missing]" 287 | ] 288 | }, 289 | "execution_count": 11, 290 | "metadata": {}, 291 | "output_type": "execute_result" 292 | } 293 | ], 294 | "source": [ 295 | "using Statistics # needed for mean\n", 296 | "map(x -> x([1,2,missing]), [minimum, maximum, extrema, mean, float]) # part 3" 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": {}, 302 | "source": [ 303 | "`skipmissing` returns an iterator that skips missing values. We can use `collect` together with `skipmissing` to create an array that excludes them."
304 | ] 305 | }, 306 | { 307 | "cell_type": "code", 308 | "execution_count": 12, 309 | "metadata": {}, 310 | "outputs": [ 311 | { 312 | "data": { 313 | "text/plain": [ 314 | "2-element Vector{Int64}:\n", 315 | " 1\n", 316 | " 2" 317 | ] 318 | }, 319 | "execution_count": 12, 320 | "metadata": {}, 321 | "output_type": "execute_result" 322 | } 323 | ], 324 | "source": [ 325 | "collect(skipmissing([1, missing, 2, missing]))" 326 | ] 327 | }, 328 | { 329 | "cell_type": "markdown", 330 | "metadata": {}, 331 | "source": [ 332 | "Here we use `replace` to create a new array that replaces all missing values with some value (`NaN` in this case)." 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": 13, 338 | "metadata": {}, 339 | "outputs": [ 340 | { 341 | "data": { 342 | "text/plain": [ 343 | "4-element Vector{Float64}:\n", 344 | " 1.0\n", 345 | " NaN\n", 346 | " 2.0\n", 347 | " NaN" 348 | ] 349 | }, 350 | "execution_count": 13, 351 | "metadata": {}, 352 | "output_type": "execute_result" 353 | } 354 | ], 355 | "source": [ 356 | "replace([1.0, missing, 2.0, missing], missing=>NaN)" 357 | ] 358 | }, 359 | { 360 | "cell_type": "markdown", 361 | "metadata": {}, 362 | "source": [ 363 | "Another way to do this:" 364 | ] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "execution_count": 14, 369 | "metadata": {}, 370 | "outputs": [ 371 | { 372 | "data": { 373 | "text/plain": [ 374 | "4-element Vector{Float64}:\n", 375 | " 1.0\n", 376 | " NaN\n", 377 | " 2.0\n", 378 | " NaN" 379 | ] 380 | }, 381 | "execution_count": 14, 382 | "metadata": {}, 383 | "output_type": "execute_result" 384 | } 385 | ], 386 | "source": [ 387 | "coalesce.([1.0, missing, 2.0, missing], NaN)" 388 | ] 389 | }, 390 | { 391 | "cell_type": "markdown", 392 | "metadata": {}, 393 | "source": [ 394 | "You can also use `recode` from CategoricalArrays.jl if you have a default output value." 
395 | ] 396 | }, 397 | { 398 | "cell_type": "code", 399 | "execution_count": 15, 400 | "metadata": {}, 401 | "outputs": [], 402 | "source": [ 403 | "using CategoricalArrays" 404 | ] 405 | }, 406 | { 407 | "cell_type": "code", 408 | "execution_count": 16, 409 | "metadata": {}, 410 | "outputs": [ 411 | { 412 | "data": { 413 | "text/plain": [ 414 | "4-element Vector{Bool}:\n", 415 | " 0\n", 416 | " 1\n", 417 | " 0\n", 418 | " 1" 419 | ] 420 | }, 421 | "execution_count": 16, 422 | "metadata": {}, 423 | "output_type": "execute_result" 424 | } 425 | ], 426 | "source": [ 427 | "recode([1.0, missing, 2.0, missing], false, missing=>true)" 428 | ] 429 | }, 430 | { 431 | "cell_type": "markdown", 432 | "metadata": {}, 433 | "source": [ 434 | "There are also `replace!` and `recode!` functions that work in place." 435 | ] 436 | }, 437 | { 438 | "cell_type": "markdown", 439 | "metadata": {}, 440 | "source": [ 441 | "Here is an example of how you can do missing-value imputation in a data frame." 442 | ] 443 | }, 444 | { 445 | "cell_type": "code", 446 | "execution_count": 17, 447 | "metadata": {}, 448 | "outputs": [ 449 | { 450 | "data": { 451 | "text/html": [ 452 | "
3×2 DataFrame
Row  a        b
     Int64?   String?
1    1        a
2    2        b
3    missing  missing
" 453 | ], 454 | "text/latex": [ 455 | "\\begin{tabular}{r|cc}\n", 456 | "\t& a & b\\\\\n", 457 | "\t\\hline\n", 458 | "\t& Int64? & String?\\\\\n", 459 | "\t\\hline\n", 460 | "\t1 & 1 & a \\\\\n", 461 | "\t2 & 2 & b \\\\\n", 462 | "\t3 & \\emph{missing} & \\emph{missing} \\\\\n", 463 | "\\end{tabular}\n" 464 | ], 465 | "text/plain": [ 466 | "\u001b[1m3×2 DataFrame\u001b[0m\n", 467 | "\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\n", 468 | " │\u001b[90m Int64? \u001b[0m\u001b[90m String? \u001b[0m\n", 469 | "─────┼──────────────────\n", 470 | " 1 │ 1 a\n", 471 | " 2 │ 2 b\n", 472 | " 3 │\u001b[90m missing \u001b[0m\u001b[90m missing \u001b[0m" 473 | ] 474 | }, 475 | "execution_count": 17, 476 | "metadata": {}, 477 | "output_type": "execute_result" 478 | } 479 | ], 480 | "source": [ 481 | "df = DataFrame(a=[1,2,missing], b=[\"a\", \"b\", missing])" 482 | ] 483 | }, 484 | { 485 | "cell_type": "markdown", 486 | "metadata": {}, 487 | "source": [ 488 | "We change the `df.a` vector in place." 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": 18, 494 | "metadata": {}, 495 | "outputs": [ 496 | { 497 | "data": { 498 | "text/plain": [ 499 | "3-element Vector{Union{Missing, Int64}}:\n", 500 | " 1\n", 501 | " 2\n", 502 | " 100" 503 | ] 504 | }, 505 | "execution_count": 18, 506 | "metadata": {}, 507 | "output_type": "execute_result" 508 | } 509 | ], 510 | "source": [ 511 | "replace!(df.a, missing=>100)" 512 | ] 513 | }, 514 | { 515 | "cell_type": "markdown", 516 | "metadata": {}, 517 | "source": [ 518 | "Now we overwrite `df.b` with a new vector, because the type of the replacement value is not accepted by `eltype(df.b)`."
519 | ] 520 | }, 521 | { 522 | "cell_type": "code", 523 | "execution_count": 19, 524 | "metadata": {}, 525 | "outputs": [ 526 | { 527 | "data": { 528 | "text/plain": [ 529 | "3-element Vector{Any}:\n", 530 | " \"a\"\n", 531 | " \"b\"\n", 532 | " 100" 533 | ] 534 | }, 535 | "execution_count": 19, 536 | "metadata": {}, 537 | "output_type": "execute_result" 538 | } 539 | ], 540 | "source": [ 541 | "df.b = coalesce.(df.b, 100)" 542 | ] 543 | }, 544 | { 545 | "cell_type": "code", 546 | "execution_count": 20, 547 | "metadata": {}, 548 | "outputs": [ 549 | { 550 | "data": { 551 | "text/html": [ 552 | "
3×2 DataFrame
Row  a       b
     Int64?  Any
1    1       a
2    2       b
3    100     100
" 553 | ], 554 | "text/latex": [ 555 | "\\begin{tabular}{r|cc}\n", 556 | "\t& a & b\\\\\n", 557 | "\t\\hline\n", 558 | "\t& Int64? & Any\\\\\n", 559 | "\t\\hline\n", 560 | "\t1 & 1 & a \\\\\n", 561 | "\t2 & 2 & b \\\\\n", 562 | "\t3 & 100 & 100 \\\\\n", 563 | "\\end{tabular}\n" 564 | ], 565 | "text/plain": [ 566 | "\u001b[1m3×2 DataFrame\u001b[0m\n", 567 | "\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\n", 568 | " │\u001b[90m Int64? \u001b[0m\u001b[90m Any \u001b[0m\n", 569 | "─────┼─────────────\n", 570 | " 1 │ 1 a\n", 571 | " 2 │ 2 b\n", 572 | " 3 │ 100 100" 573 | ] 574 | }, 575 | "execution_count": 20, 576 | "metadata": {}, 577 | "output_type": "execute_result" 578 | } 579 | ], 580 | "source": [ 581 | "df" 582 | ] 583 | }, 584 | { 585 | "cell_type": "markdown", 586 | "metadata": {}, 587 | "source": [ 588 | "You can use `unique` or `levels` to get unique values with or without missings, respectively." 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": 21, 594 | "metadata": {}, 595 | "outputs": [ 596 | { 597 | "data": { 598 | "text/plain": [ 599 | "(Union{Missing, Int64}[1, missing, 2], [1, 2])" 600 | ] 601 | }, 602 | "execution_count": 21, 603 | "metadata": {}, 604 | "output_type": "execute_result" 605 | } 606 | ], 607 | "source": [ 608 | "unique([1, missing, 2, missing]), levels([1, missing, 2, missing])" 609 | ] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "metadata": {}, 614 | "source": [ 615 | "In this next example, we convert `x` to `y` with `allowmissing`, where `y` has a type that accepts missings." 
616 | ] 617 | }, 618 | { 619 | "cell_type": "code", 620 | "execution_count": 22, 621 | "metadata": {}, 622 | "outputs": [ 623 | { 624 | "data": { 625 | "text/plain": [ 626 | "3-element Vector{Union{Missing, Int64}}:\n", 627 | " 1\n", 628 | " 2\n", 629 | " 3" 630 | ] 631 | }, 632 | "execution_count": 22, 633 | "metadata": {}, 634 | "output_type": "execute_result" 635 | } 636 | ], 637 | "source": [ 638 | "x = [1,2,3]\n", 639 | "y = allowmissing(x)" 640 | ] 641 | }, 642 | { 643 | "cell_type": "markdown", 644 | "metadata": {}, 645 | "source": [ 646 | "Then, we convert back with `disallowmissing`. This would fail if `y` contained missing values!" 647 | ] 648 | }, 649 | { 650 | "cell_type": "code", 651 | "execution_count": 23, 652 | "metadata": {}, 653 | "outputs": [ 654 | { 655 | "data": { 656 | "text/plain": [ 657 | "([1, 2, 3], Union{Missing, Int64}[1, 2, 3], [1, 2, 3])" 658 | ] 659 | }, 660 | "execution_count": 23, 661 | "metadata": {}, 662 | "output_type": "execute_result" 663 | } 664 | ], 665 | "source": [ 666 | "z = disallowmissing(y)\n", 667 | "x,y,z" 668 | ] 669 | }, 670 | { 671 | "cell_type": "markdown", 672 | "metadata": {}, 673 | "source": [ 674 | "`disallowmissing` has an `error` keyword argument that determines how it behaves when it encounters a column that actually contains a `missing` value." 675 | ] 676 | }, 677 | { 678 | "cell_type": "code", 679 | "execution_count": 24, 680 | "metadata": {}, 681 | "outputs": [ 682 | { 683 | "data": { 684 | "text/html": [ 685 | "
2×3 DataFrame
Row  x1        x2        x3
     Float64?  Float64?  Float64?
1    1.0       1.0       1.0
2    1.0       1.0       1.0
" 686 | ], 687 | "text/latex": [ 688 | "\\begin{tabular}{r|ccc}\n", 689 | "\t& x1 & x2 & x3\\\\\n", 690 | "\t\\hline\n", 691 | "\t& Float64? & Float64? & Float64?\\\\\n", 692 | "\t\\hline\n", 693 | "\t1 & 1.0 & 1.0 & 1.0 \\\\\n", 694 | "\t2 & 1.0 & 1.0 & 1.0 \\\\\n", 695 | "\\end{tabular}\n" 696 | ], 697 | "text/plain": [ 698 | "\u001b[1m2×3 DataFrame\u001b[0m\n", 699 | "\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\n", 700 | " │\u001b[90m Float64? \u001b[0m\u001b[90m Float64? \u001b[0m\u001b[90m Float64? \u001b[0m\n", 701 | "─────┼──────────────────────────────\n", 702 | " 1 │ 1.0 1.0 1.0\n", 703 | " 2 │ 1.0 1.0 1.0" 704 | ] 705 | }, 706 | "execution_count": 24, 707 | "metadata": {}, 708 | "output_type": "execute_result" 709 | } 710 | ], 711 | "source": [ 712 | "df = allowmissing(DataFrame(ones(2,3), :auto))" 713 | ] 714 | }, 715 | { 716 | "cell_type": "code", 717 | "execution_count": 25, 718 | "metadata": {}, 719 | "outputs": [ 720 | { 721 | "data": { 722 | "text/plain": [ 723 | "missing" 724 | ] 725 | }, 726 | "execution_count": 25, 727 | "metadata": {}, 728 | "output_type": "execute_result" 729 | } 730 | ], 731 | "source": [ 732 | "df[1,1] = missing" 733 | ] 734 | }, 735 | { 736 | "cell_type": "code", 737 | "execution_count": 26, 738 | "metadata": {}, 739 | "outputs": [ 740 | { 741 | "data": { 742 | "text/html": [ 743 | "
2×3 DataFrame
Row  x1        x2        x3
     Float64?  Float64?  Float64?
1    missing   1.0       1.0
2    1.0       1.0       1.0
" 744 | ], 745 | "text/latex": [ 746 | "\\begin{tabular}{r|ccc}\n", 747 | "\t& x1 & x2 & x3\\\\\n", 748 | "\t\\hline\n", 749 | "\t& Float64? & Float64? & Float64?\\\\\n", 750 | "\t\\hline\n", 751 | "\t1 & \\emph{missing} & 1.0 & 1.0 \\\\\n", 752 | "\t2 & 1.0 & 1.0 & 1.0 \\\\\n", 753 | "\\end{tabular}\n" 754 | ], 755 | "text/plain": [ 756 | "\u001b[1m2×3 DataFrame\u001b[0m\n", 757 | "\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\n", 758 | " │\u001b[90m Float64? \u001b[0m\u001b[90m Float64? \u001b[0m\u001b[90m Float64? \u001b[0m\n", 759 | "─────┼───────────────────────────────\n", 760 | " 1 │\u001b[90m missing \u001b[0m 1.0 1.0\n", 761 | " 2 │ 1.0 1.0 1.0" 762 | ] 763 | }, 764 | "execution_count": 26, 765 | "metadata": {}, 766 | "output_type": "execute_result" 767 | } 768 | ], 769 | "source": [ 770 | "df" 771 | ] 772 | }, 773 | { 774 | "cell_type": "code", 775 | "execution_count": 27, 776 | "metadata": {}, 777 | "outputs": [ 778 | { 779 | "ename": "LoadError", 780 | "evalue": "ArgumentError: Missing value found in column :x1 in row 1", 781 | "output_type": "error", 782 | "traceback": [ 783 | "ArgumentError: Missing value found in column :x1 in row 1", 784 | "", 785 | "Stacktrace:", 786 | " [1] disallowmissing(df::DataFrame, cols::Colon; error::Bool)", 787 | " @ DataFrames C:\\Users\\bogum\\.julia\\packages\\DataFrames\\LteEl\\src\\abstractdataframe\\abstractdataframe.jl:2197", 788 | " [2] disallowmissing", 789 | " @ C:\\Users\\bogum\\.julia\\packages\\DataFrames\\LteEl\\src\\abstractdataframe\\abstractdataframe.jl:2180 [inlined]", 790 | " [3] disallowmissing(df::DataFrame)", 791 | " @ DataFrames C:\\Users\\bogum\\.julia\\packages\\DataFrames\\LteEl\\src\\abstractdataframe\\abstractdataframe.jl:2180", 792 | " [4] top-level scope", 793 | " @ In[27]:1" 794 | ] 795 | } 796 | ], 797 | "source": [ 798 | "disallowmissing(df) # an error is thrown" 799 | ] 800 | }, 801 | { 802 | "cell_type": "code", 803 | "execution_count": 28, 804 
| "metadata": {}, 805 | "outputs": [ 806 | { 807 | "data": { 808 | "text/html": [ 809 | "
2×3 DataFrame
Row  x1        x2       x3
     Float64?  Float64  Float64
1    missing   1.0      1.0
2    1.0       1.0      1.0
" 810 | ], 811 | "text/latex": [ 812 | "\\begin{tabular}{r|ccc}\n", 813 | "\t& x1 & x2 & x3\\\\\n", 814 | "\t\\hline\n", 815 | "\t& Float64? & Float64 & Float64\\\\\n", 816 | "\t\\hline\n", 817 | "\t1 & \\emph{missing} & 1.0 & 1.0 \\\\\n", 818 | "\t2 & 1.0 & 1.0 & 1.0 \\\\\n", 819 | "\\end{tabular}\n" 820 | ], 821 | "text/plain": [ 822 | "\u001b[1m2×3 DataFrame\u001b[0m\n", 823 | "\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\n", 824 | " │\u001b[90m Float64? \u001b[0m\u001b[90m Float64 \u001b[0m\u001b[90m Float64 \u001b[0m\n", 825 | "─────┼─────────────────────────────\n", 826 | " 1 │\u001b[90m missing \u001b[0m 1.0 1.0\n", 827 | " 2 │ 1.0 1.0 1.0" 828 | ] 829 | }, 830 | "execution_count": 28, 831 | "metadata": {}, 832 | "output_type": "execute_result" 833 | } 834 | ], 835 | "source": [ 836 | "disallowmissing(df, error=false) # column :x1 is left untouched as it contains missing" 837 | ] 838 | }, 839 | { 840 | "cell_type": "markdown", 841 | "metadata": {}, 842 | "source": [ 843 | "In this next example, we show that the type of each column in `x` is initially `Int64`. After using `allowmissing!` to accept missing values in columns 1 and 3, the types of those columns become `Union{Int64,Missing}`." 
844 | ] 845 | }, 846 | { 847 | "cell_type": "code", 848 | "execution_count": 29, 849 | "metadata": {}, 850 | "outputs": [ 851 | { 852 | "name": "stdout", 853 | "output_type": "stream", 854 | "text": [ 855 | "Before: DataType[Int64, Int64, Int64]\n", 856 | "After: Type[Union{Missing, Int64}, Int64, Union{Missing, Int64}]\n" 857 | ] 858 | } 859 | ], 860 | "source": [ 861 | "x = DataFrame(rand(Int, 2,3), :auto)\n", 862 | "println(\"Before: \", eltype.(eachcol(x)))\n", 863 | "allowmissing!(x, 1) # make first column accept missings\n", 864 | "allowmissing!(x, :x3) # make :x3 column accept missings\n", 865 | "println(\"After: \", eltype.(eachcol(x)))" 866 | ] 867 | }, 868 | { 869 | "cell_type": "markdown", 870 | "metadata": {}, 871 | "source": [ 872 | "In this next example, we'll use `completecases` to find all the rows of a `DataFrame` that have complete data." 873 | ] 874 | }, 875 | { 876 | "cell_type": "code", 877 | "execution_count": 30, 878 | "metadata": {}, 879 | "outputs": [ 880 | { 881 | "data": { 882 | "text/html": [ 883 | "
4×2 DataFrame
Row  A        B
     Int64?   String?
1    1        A
2    missing  B
3    3        missing
4    4        C
" 884 | ], 885 | "text/latex": [ 886 | "\\begin{tabular}{r|cc}\n", 887 | "\t& A & B\\\\\n", 888 | "\t\\hline\n", 889 | "\t& Int64? & String?\\\\\n", 890 | "\t\\hline\n", 891 | "\t1 & 1 & A \\\\\n", 892 | "\t2 & \\emph{missing} & B \\\\\n", 893 | "\t3 & 3 & \\emph{missing} \\\\\n", 894 | "\t4 & 4 & C \\\\\n", 895 | "\\end{tabular}\n" 896 | ], 897 | "text/plain": [ 898 | "\u001b[1m4×2 DataFrame\u001b[0m\n", 899 | "\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n", 900 | " │\u001b[90m Int64? \u001b[0m\u001b[90m String? \u001b[0m\n", 901 | "─────┼──────────────────\n", 902 | " 1 │ 1 A\n", 903 | " 2 │\u001b[90m missing \u001b[0m B\n", 904 | " 3 │ 3 \u001b[90m missing \u001b[0m\n", 905 | " 4 │ 4 C" 906 | ] 907 | }, 908 | "execution_count": 30, 909 | "metadata": {}, 910 | "output_type": "execute_result" 911 | } 912 | ], 913 | "source": [ 914 | "x = DataFrame(A=[1, missing, 3, 4], B=[\"A\", \"B\", missing, \"C\"])" 915 | ] 916 | }, 917 | { 918 | "cell_type": "code", 919 | "execution_count": 31, 920 | "metadata": {}, 921 | "outputs": [ 922 | { 923 | "name": "stdout", 924 | "output_type": "stream", 925 | "text": [ 926 | "Complete cases:\n", 927 | "Bool[1, 0, 0, 1]\n" 928 | ] 929 | } 930 | ], 931 | "source": [ 932 | "println(\"Complete cases:\\n\", completecases(x))" 933 | ] 934 | }, 935 | { 936 | "cell_type": "markdown", 937 | "metadata": {}, 938 | "source": [ 939 | "We can use `dropmissing` or `dropmissing!` to remove the rows with incomplete data from a `DataFrame` and either create a new `DataFrame` or mutate the original in-place." 940 | ] 941 | }, 942 | { 943 | "cell_type": "code", 944 | "execution_count": 32, 945 | "metadata": {}, 946 | "outputs": [ 947 | { 948 | "data": { 949 | "text/html": [ 950 | "
2×2 DataFrame
Row  A      B
     Int64  String
1    1      A
2    4      C
" 951 | ], 952 | "text/latex": [ 953 | "\\begin{tabular}{r|cc}\n", 954 | "\t& A & B\\\\\n", 955 | "\t\\hline\n", 956 | "\t& Int64 & String\\\\\n", 957 | "\t\\hline\n", 958 | "\t1 & 1 & A \\\\\n", 959 | "\t2 & 4 & C \\\\\n", 960 | "\\end{tabular}\n" 961 | ], 962 | "text/plain": [ 963 | "\u001b[1m2×2 DataFrame\u001b[0m\n", 964 | "\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n", 965 | " │\u001b[90m Int64 \u001b[0m\u001b[90m String \u001b[0m\n", 966 | "─────┼───────────────\n", 967 | " 1 │ 1 A\n", 968 | " 2 │ 4 C" 969 | ] 970 | }, 971 | "execution_count": 32, 972 | "metadata": {}, 973 | "output_type": "execute_result" 974 | } 975 | ], 976 | "source": [ 977 | "y = dropmissing(x)\n", 978 | "dropmissing!(x)" 979 | ] 980 | }, 981 | { 982 | "cell_type": "code", 983 | "execution_count": 33, 984 | "metadata": {}, 985 | "outputs": [ 986 | { 987 | "data": { 988 | "text/html": [ 989 | "
2×2 DataFrame
Row  A      B
     Int64  String
1    1      A
2    4      C
" 990 | ], 991 | "text/latex": [ 992 | "\\begin{tabular}{r|cc}\n", 993 | "\t& A & B\\\\\n", 994 | "\t\\hline\n", 995 | "\t& Int64 & String\\\\\n", 996 | "\t\\hline\n", 997 | "\t1 & 1 & A \\\\\n", 998 | "\t2 & 4 & C \\\\\n", 999 | "\\end{tabular}\n" 1000 | ], 1001 | "text/plain": [ 1002 | "\u001b[1m2×2 DataFrame\u001b[0m\n", 1003 | "\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n", 1004 | " │\u001b[90m Int64 \u001b[0m\u001b[90m String \u001b[0m\n", 1005 | "─────┼───────────────\n", 1006 | " 1 │ 1 A\n", 1007 | " 2 │ 4 C" 1008 | ] 1009 | }, 1010 | "execution_count": 33, 1011 | "metadata": {}, 1012 | "output_type": "execute_result" 1013 | } 1014 | ], 1015 | "source": [ 1016 | "x" 1017 | ] 1018 | }, 1019 | { 1020 | "cell_type": "code", 1021 | "execution_count": 34, 1022 | "metadata": {}, 1023 | "outputs": [ 1024 | { 1025 | "data": { 1026 | "text/html": [ 1027 | "
2×2 DataFrame
Row  A      B
     Int64  String
1    1      A
2    4      C
" 1028 | ], 1029 | "text/latex": [ 1030 | "\\begin{tabular}{r|cc}\n", 1031 | "\t& A & B\\\\\n", 1032 | "\t\\hline\n", 1033 | "\t& Int64 & String\\\\\n", 1034 | "\t\\hline\n", 1035 | "\t1 & 1 & A \\\\\n", 1036 | "\t2 & 4 & C \\\\\n", 1037 | "\\end{tabular}\n" 1038 | ], 1039 | "text/plain": [ 1040 | "\u001b[1m2×2 DataFrame\u001b[0m\n", 1041 | "\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n", 1042 | " │\u001b[90m Int64 \u001b[0m\u001b[90m String \u001b[0m\n", 1043 | "─────┼───────────────\n", 1044 | " 1 │ 1 A\n", 1045 | " 2 │ 4 C" 1046 | ] 1047 | }, 1048 | "execution_count": 34, 1049 | "metadata": {}, 1050 | "output_type": "execute_result" 1051 | } 1052 | ], 1053 | "source": [ 1054 | "y" 1055 | ] 1056 | }, 1057 | { 1058 | "cell_type": "markdown", 1059 | "metadata": {}, 1060 | "source": [ 1061 | "When we call `describe` on a `DataFrame` with dropped missing values, the columns do not allow missing values any more by default." 1062 | ] 1063 | }, 1064 | { 1065 | "cell_type": "code", 1066 | "execution_count": 35, 1067 | "metadata": {}, 1068 | "outputs": [ 1069 | { 1070 | "data": { 1071 | "text/html": [ 1072 | "
2×7 DataFrame
Row  variable  mean    min  median  max  nmissing  eltype
     Symbol    Union…  Any  Union…  Any  Int64     DataType
1    A         2.5     1    2.5     4    0         Int64
2    B                 A            C    0         String
" 1073 | ], 1074 | "text/latex": [ 1075 | "\\begin{tabular}{r|ccccccc}\n", 1076 | "\t& variable & mean & min & median & max & nmissing & eltype\\\\\n", 1077 | "\t\\hline\n", 1078 | "\t& Symbol & Union… & Any & Union… & Any & Int64 & DataType\\\\\n", 1079 | "\t\\hline\n", 1080 | "\t1 & A & 2.5 & 1 & 2.5 & 4 & 0 & Int64 \\\\\n", 1081 | "\t2 & B & & A & & C & 0 & String \\\\\n", 1082 | "\\end{tabular}\n" 1083 | ], 1084 | "text/plain": [ 1085 | "\u001b[1m2×7 DataFrame\u001b[0m\n", 1086 | "\u001b[1m Row \u001b[0m│\u001b[1m variable \u001b[0m\u001b[1m mean \u001b[0m\u001b[1m min \u001b[0m\u001b[1m median \u001b[0m\u001b[1m max \u001b[0m\u001b[1m nmissing \u001b[0m\u001b[1m eltype \u001b[0m\n", 1087 | " │\u001b[90m Symbol \u001b[0m\u001b[90m Union… \u001b[0m\u001b[90m Any \u001b[0m\u001b[90m Union… \u001b[0m\u001b[90m Any \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m DataType \u001b[0m\n", 1088 | "─────┼────────────────────────────────────────────────────────\n", 1089 | " 1 │ A 2.5 1 2.5 4 0 Int64\n", 1090 | " 2 │ B \u001b[90m \u001b[0m A \u001b[90m \u001b[0m C 0 String" 1091 | ] 1092 | }, 1093 | "execution_count": 35, 1094 | "metadata": {}, 1095 | "output_type": "execute_result" 1096 | } 1097 | ], 1098 | "source": [ 1099 | "describe(x)" 1100 | ] 1101 | }, 1102 | { 1103 | "cell_type": "markdown", 1104 | "metadata": {}, 1105 | "source": [ 1106 | "Alternatively you can pass `disallowmissing` keyword argument to `dropmissing` and `dropmissing!`" 1107 | ] 1108 | }, 1109 | { 1110 | "cell_type": "code", 1111 | "execution_count": 36, 1112 | "metadata": {}, 1113 | "outputs": [ 1114 | { 1115 | "data": { 1116 | "text/html": [ 1117 | "
4×2 DataFrame
Row  A        B
     Int64?   String?
1    1        A
2    missing  B
3    3        missing
4    4        C
" 1118 | ], 1119 | "text/latex": [ 1120 | "\\begin{tabular}{r|cc}\n", 1121 | "\t& A & B\\\\\n", 1122 | "\t\\hline\n", 1123 | "\t& Int64? & String?\\\\\n", 1124 | "\t\\hline\n", 1125 | "\t1 & 1 & A \\\\\n", 1126 | "\t2 & \\emph{missing} & B \\\\\n", 1127 | "\t3 & 3 & \\emph{missing} \\\\\n", 1128 | "\t4 & 4 & C \\\\\n", 1129 | "\\end{tabular}\n" 1130 | ], 1131 | "text/plain": [ 1132 | "\u001b[1m4×2 DataFrame\u001b[0m\n", 1133 | "\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n", 1134 | " │\u001b[90m Int64? \u001b[0m\u001b[90m String? \u001b[0m\n", 1135 | "─────┼──────────────────\n", 1136 | " 1 │ 1 A\n", 1137 | " 2 │\u001b[90m missing \u001b[0m B\n", 1138 | " 3 │ 3 \u001b[90m missing \u001b[0m\n", 1139 | " 4 │ 4 C" 1140 | ] 1141 | }, 1142 | "execution_count": 36, 1143 | "metadata": {}, 1144 | "output_type": "execute_result" 1145 | } 1146 | ], 1147 | "source": [ 1148 | "x = DataFrame(A=[1, missing, 3, 4], B=[\"A\", \"B\", missing, \"C\"])" 1149 | ] 1150 | }, 1151 | { 1152 | "cell_type": "code", 1153 | "execution_count": 37, 1154 | "metadata": {}, 1155 | "outputs": [ 1156 | { 1157 | "data": { 1158 | "text/html": [ 1159 | "
2×2 DataFrame
Row  A       B
     Int64?  String?
1    1       A
2    4       C
" 1160 | ], 1161 | "text/latex": [ 1162 | "\\begin{tabular}{r|cc}\n", 1163 | "\t& A & B\\\\\n", 1164 | "\t\\hline\n", 1165 | "\t& Int64? & String?\\\\\n", 1166 | "\t\\hline\n", 1167 | "\t1 & 1 & A \\\\\n", 1168 | "\t2 & 4 & C \\\\\n", 1169 | "\\end{tabular}\n" 1170 | ], 1171 | "text/plain": [ 1172 | "\u001b[1m2×2 DataFrame\u001b[0m\n", 1173 | "\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n", 1174 | " │\u001b[90m Int64? \u001b[0m\u001b[90m String? \u001b[0m\n", 1175 | "─────┼─────────────────\n", 1176 | " 1 │ 1 A\n", 1177 | " 2 │ 4 C" 1178 | ] 1179 | }, 1180 | "execution_count": 37, 1181 | "metadata": {}, 1182 | "output_type": "execute_result" 1183 | } 1184 | ], 1185 | "source": [ 1186 | "dropmissing!(x, disallowmissing=false)" 1187 | ] 1188 | }, 1189 | { 1190 | "cell_type": "markdown", 1191 | "metadata": {}, 1192 | "source": [ 1193 | "### Making functions `missing`-aware\n", 1194 | "\n", 1195 | "If we have a function that does not handle `missing` values we can wrap it using `passmissing` function so that if any of its positional arguments is missing we will get a `missing` value in return. 
In the example below we change how `string` function behaves:" 1196 | ] 1197 | }, 1198 | { 1199 | "cell_type": "code", 1200 | "execution_count": 38, 1201 | "metadata": {}, 1202 | "outputs": [ 1203 | { 1204 | "data": { 1205 | "text/plain": [ 1206 | "\"missing\"" 1207 | ] 1208 | }, 1209 | "execution_count": 38, 1210 | "metadata": {}, 1211 | "output_type": "execute_result" 1212 | } 1213 | ], 1214 | "source": [ 1215 | "string(missing)" 1216 | ] 1217 | }, 1218 | { 1219 | "cell_type": "code", 1220 | "execution_count": 39, 1221 | "metadata": {}, 1222 | "outputs": [ 1223 | { 1224 | "data": { 1225 | "text/plain": [ 1226 | "\"missing missing\"" 1227 | ] 1228 | }, 1229 | "execution_count": 39, 1230 | "metadata": {}, 1231 | "output_type": "execute_result" 1232 | } 1233 | ], 1234 | "source": [ 1235 | "string(missing, \" \", missing)" 1236 | ] 1237 | }, 1238 | { 1239 | "cell_type": "code", 1240 | "execution_count": 40, 1241 | "metadata": {}, 1242 | "outputs": [ 1243 | { 1244 | "data": { 1245 | "text/plain": [ 1246 | "\"123\"" 1247 | ] 1248 | }, 1249 | "execution_count": 40, 1250 | "metadata": {}, 1251 | "output_type": "execute_result" 1252 | } 1253 | ], 1254 | "source": [ 1255 | "string(1,2,3)" 1256 | ] 1257 | }, 1258 | { 1259 | "cell_type": "code", 1260 | "execution_count": 41, 1261 | "metadata": {}, 1262 | "outputs": [ 1263 | { 1264 | "data": { 1265 | "text/plain": [ 1266 | "(::Missings.PassMissing{typeof(string)}) (generic function with 2 methods)" 1267 | ] 1268 | }, 1269 | "execution_count": 41, 1270 | "metadata": {}, 1271 | "output_type": "execute_result" 1272 | } 1273 | ], 1274 | "source": [ 1275 | "lift_string = passmissing(string)" 1276 | ] 1277 | }, 1278 | { 1279 | "cell_type": "code", 1280 | "execution_count": 42, 1281 | "metadata": {}, 1282 | "outputs": [ 1283 | { 1284 | "data": { 1285 | "text/plain": [ 1286 | "missing" 1287 | ] 1288 | }, 1289 | "execution_count": 42, 1290 | "metadata": {}, 1291 | "output_type": "execute_result" 1292 | } 1293 | ], 1294 | "source": [ 
1295 | "lift_string(missing)" 1296 | ] 1297 | }, 1298 | { 1299 | "cell_type": "code", 1300 | "execution_count": 43, 1301 | "metadata": {}, 1302 | "outputs": [ 1303 | { 1304 | "data": { 1305 | "text/plain": [ 1306 | "missing" 1307 | ] 1308 | }, 1309 | "execution_count": 43, 1310 | "metadata": {}, 1311 | "output_type": "execute_result" 1312 | } 1313 | ], 1314 | "source": [ 1315 | "lift_string(missing, \" \", missing)" 1316 | ] 1317 | }, 1318 | { 1319 | "cell_type": "code", 1320 | "execution_count": 44, 1321 | "metadata": {}, 1322 | "outputs": [ 1323 | { 1324 | "data": { 1325 | "text/plain": [ 1326 | "\"123\"" 1327 | ] 1328 | }, 1329 | "execution_count": 44, 1330 | "metadata": {}, 1331 | "output_type": "execute_result" 1332 | } 1333 | ], 1334 | "source": [ 1335 | "lift_string(1, 2, 3)" 1336 | ] 1337 | }, 1338 | { 1339 | "cell_type": "markdown", 1340 | "metadata": {}, 1341 | "source": [ 1342 | "### Aggregating rows containing missing values" 1343 | ] 1344 | }, 1345 | { 1346 | "cell_type": "markdown", 1347 | "metadata": {}, 1348 | "source": [ 1349 | "Create an example data frame containing missing values:" 1350 | ] 1351 | }, 1352 | { 1353 | "cell_type": "code", 1354 | "execution_count": 45, 1355 | "metadata": {}, 1356 | "outputs": [ 1357 | { 1358 | "data": { 1359 | "text/html": [ 1360 | "
3×2 DataFrame
Row  a        b
     Int64?   Int64?
1    1        1
2    missing  2
3    missing  missing
" 1361 | ], 1362 | "text/latex": [ 1363 | "\\begin{tabular}{r|cc}\n", 1364 | "\t& a & b\\\\\n", 1365 | "\t\\hline\n", 1366 | "\t& Int64? & Int64?\\\\\n", 1367 | "\t\\hline\n", 1368 | "\t1 & 1 & 1 \\\\\n", 1369 | "\t2 & \\emph{missing} & 2 \\\\\n", 1370 | "\t3 & \\emph{missing} & \\emph{missing} \\\\\n", 1371 | "\\end{tabular}\n" 1372 | ], 1373 | "text/plain": [ 1374 | "\u001b[1m3×2 DataFrame\u001b[0m\n", 1375 | "\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\n", 1376 | " │\u001b[90m Int64? \u001b[0m\u001b[90m Int64? \u001b[0m\n", 1377 | "─────┼──────────────────\n", 1378 | " 1 │ 1 1\n", 1379 | " 2 │\u001b[90m missing \u001b[0m 2\n", 1380 | " 3 │\u001b[90m missing \u001b[0m\u001b[90m missing \u001b[0m" 1381 | ] 1382 | }, 1383 | "execution_count": 45, 1384 | "metadata": {}, 1385 | "output_type": "execute_result" 1386 | } 1387 | ], 1388 | "source": [ 1389 | "df = DataFrame(a = [1, missing, missing], b=[1, 2, missing])" 1390 | ] 1391 | }, 1392 | { 1393 | "cell_type": "markdown", 1394 | "metadata": {}, 1395 | "source": [ 1396 | "If we just run `sum` on the rows we get two missing entries:" 1397 | ] 1398 | }, 1399 | { 1400 | "cell_type": "code", 1401 | "execution_count": 46, 1402 | "metadata": {}, 1403 | "outputs": [ 1404 | { 1405 | "data": { 1406 | "text/plain": [ 1407 | "3-element Vector{Union{Missing, Int64}}:\n", 1408 | " 2\n", 1409 | " missing\n", 1410 | " missing" 1411 | ] 1412 | }, 1413 | "execution_count": 46, 1414 | "metadata": {}, 1415 | "output_type": "execute_result" 1416 | } 1417 | ], 1418 | "source": [ 1419 | "sum.(eachrow(df))" 1420 | ] 1421 | }, 1422 | { 1423 | "cell_type": "markdown", 1424 | "metadata": {}, 1425 | "source": [ 1426 | "One can apply `skipmissing` on the rows to avoid this problem:" 1427 | ] 1428 | }, 1429 | { 1430 | "cell_type": "code", 1431 | "execution_count": 47, 1432 | "metadata": {}, 1433 | "outputs": [ 1434 | { 1435 | "ename": "LoadError", 1436 | "evalue": "ArgumentError: reducing over an empty collection of 
unknown element type is not allowed.\nYou may be able to prevent this error by supplying an `init` value to the reducer.", 1437 | "output_type": "error", 1438 | "traceback": [ 1439 | "ArgumentError: reducing over an empty collection of unknown element type is not allowed.\nYou may be able to prevent this error by supplying an `init` value to the reducer.", 1440 | "", 1441 | "Stacktrace:", 1442 | " [1] reduce_empty_iter(op::Base.BottomRF{typeof(Base.add_sum)}, itr::Base.SkipMissing{DataFrameRow{DataFrame, DataFrames.Index}}, #unused#::Base.EltypeUnknown)", 1443 | " @ Base .\\reduce.jl:380", 1444 | " [2] reduce_empty_iter", 1445 | " @ .\\reduce.jl:378 [inlined]", 1446 | " [3] foldl_impl", 1447 | " @ .\\reduce.jl:49 [inlined]", 1448 | " [4] mapfoldl_impl", 1449 | " @ .\\reduce.jl:44 [inlined]", 1450 | " [5] #mapfoldl#288", 1451 | " @ .\\reduce.jl:170 [inlined]", 1452 | " [6] mapfoldl", 1453 | " @ .\\reduce.jl:170 [inlined]", 1454 | " [7] #mapreduce#292", 1455 | " @ .\\reduce.jl:302 [inlined]", 1456 | " [8] mapreduce", 1457 | " @ .\\reduce.jl:302 [inlined]", 1458 | " [9] #sum#295", 1459 | " @ .\\reduce.jl:530 [inlined]", 1460 | " [10] sum", 1461 | " @ .\\reduce.jl:530 [inlined]", 1462 | " [11] #sum#296", 1463 | " @ .\\reduce.jl:559 [inlined]", 1464 | " [12] sum", 1465 | " @ .\\reduce.jl:559 [inlined]", 1466 | " [13] _broadcast_getindex_evalf", 1467 | " @ .\\broadcast.jl:683 [inlined]", 1468 | " [14] _broadcast_getindex", 1469 | " @ .\\broadcast.jl:656 [inlined]", 1470 | " [15] getindex", 1471 | " @ .\\broadcast.jl:610 [inlined]", 1472 | " [16] copyto_nonleaf!(dest::Vector{Int64}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(sum), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(skipmissing), Tuple{Base.Broadcast.Extruded{DataFrames.DataFrameRows{DataFrame}, Tuple{Bool}, Tuple{Int64}}}}}}, iter::Base.OneTo{Int64}, state::Int64, count::Int64)", 1473 | " @ Base.Broadcast 
.\\broadcast.jl:1068", 1474 | " [17] copy", 1475 | " @ .\\broadcast.jl:920 [inlined]", 1476 | " [18] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(sum), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(skipmissing), Tuple{DataFrames.DataFrameRows{DataFrame}}}}})", 1477 | " @ Base.Broadcast .\\broadcast.jl:873", 1478 | " [19] top-level scope", 1479 | " @ In[47]:1" 1480 | ] 1481 | } 1482 | ], 1483 | "source": [ 1484 | "sum.(skipmissing.(eachrow(df)))" 1485 | ] 1486 | }, 1487 | { 1488 | "cell_type": "markdown", 1489 | "metadata": {}, 1490 | "source": [ 1491 | "However, we get an error. The problem is that the last row of `df` contains only missing values, and since `eachrow` is type unstable, the `eltype` of the result of `skipmissing` is unknown (so it is marked `Any`)." 1492 | ] 1493 | }, 1494 | { 1495 | "cell_type": "code", 1496 | "execution_count": 48, 1497 | "metadata": {}, 1498 | "outputs": [ 1499 | { 1500 | "data": { 1501 | "text/plain": [ 1502 | "Any[]" 1503 | ] 1504 | }, 1505 | "execution_count": 48, 1506 | "metadata": {}, 1507 | "output_type": "execute_result" 1508 | } 1509 | ], 1510 | "source": [ 1511 | "collect(skipmissing(eachrow(df)[end]))" 1512 | ] 1513 | }, 1514 | { 1515 | "cell_type": "markdown", 1516 | "metadata": {}, 1517 | "source": [ 1518 | "In such cases it is useful to switch to `Tables.namedtupleiterator`, which is type stable, as discussed in the 01_constructors.ipynb notebook."
1519 | ] 1520 | }, 1521 | { 1522 | "cell_type": "code", 1523 | "execution_count": 49, 1524 | "metadata": {}, 1525 | "outputs": [ 1526 | { 1527 | "data": { 1528 | "text/plain": [ 1529 | "3-element Vector{Int64}:\n", 1530 | " 2\n", 1531 | " 2\n", 1532 | " 0" 1533 | ] 1534 | }, 1535 | "execution_count": 49, 1536 | "metadata": {}, 1537 | "output_type": "execute_result" 1538 | } 1539 | ], 1540 | "source": [ 1541 | "sum.(skipmissing.(Tables.namedtupleiterator(df)))" 1542 | ] 1543 | }, 1544 | { 1545 | "cell_type": "markdown", 1546 | "metadata": {}, 1547 | "source": [ 1548 | "Later in the tutorial you will learn that you can efficiently calculate such sums using the `select` function:" 1549 | ] 1550 | }, 1551 | { 1552 | "cell_type": "code", 1553 | "execution_count": 50, 1554 | "metadata": {}, 1555 | "outputs": [ 1556 | { 1557 | "data": { 1558 | "text/html": [ 1559 | "
3×1 DataFrame
Row  a_b_sum_skipmissing
     Int64
1    2
2    2
3    0
" 1560 | ], 1561 | "text/latex": [ 1562 | "\\begin{tabular}{r|c}\n", 1563 | "\t& a\\_b\\_sum\\_skipmissing\\\\\n", 1564 | "\t\\hline\n", 1565 | "\t& Int64\\\\\n", 1566 | "\t\\hline\n", 1567 | "\t1 & 2 \\\\\n", 1568 | "\t2 & 2 \\\\\n", 1569 | "\t3 & 0 \\\\\n", 1570 | "\\end{tabular}\n" 1571 | ], 1572 | "text/plain": [ 1573 | "\u001b[1m3×1 DataFrame\u001b[0m\n", 1574 | "\u001b[1m Row \u001b[0m│\u001b[1m a_b_sum_skipmissing \u001b[0m\n", 1575 | " │\u001b[90m Int64 \u001b[0m\n", 1576 | "─────┼─────────────────────\n", 1577 | " 1 │ 2\n", 1578 | " 2 │ 2\n", 1579 | " 3 │ 0" 1580 | ] 1581 | }, 1582 | "execution_count": 50, 1583 | "metadata": {}, 1584 | "output_type": "execute_result" 1585 | } 1586 | ], 1587 | "source": [ 1588 | "select(df, AsTable(:) => ByRow(sum∘skipmissing))" 1589 | ] 1590 | }, 1591 | { 1592 | "cell_type": "markdown", 1593 | "metadata": {}, 1594 | "source": [ 1595 | "Note that it correctly handles the rows with all missing values." 1596 | ] 1597 | } 1598 | ], 1599 | "metadata": { 1600 | "@webio": { 1601 | "lastCommId": null, 1602 | "lastKernelId": null 1603 | }, 1604 | "kernelspec": { 1605 | "display_name": "Julia 1.9.0", 1606 | "language": "julia", 1607 | "name": "julia-1.9" 1608 | }, 1609 | "language_info": { 1610 | "file_extension": ".jl", 1611 | "mimetype": "application/julia", 1612 | "name": "julia", 1613 | "version": "1.9.0" 1614 | } 1615 | }, 1616 | "nbformat": 4, 1617 | "nbformat_minor": 1 1618 | } 1619 | -------------------------------------------------------------------------------- /07_factors.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to DataFrames\n", 8 | "**[Bogumił Kamiński](http://bogumilkaminski.pl/about/), February 13, 2023**" 9 | ] 10 | }, 11 | { 12 | "cell_type": "code", 13 | "execution_count": 1, 14 | "metadata": {}, 15 | "outputs": [], 16 | "source": [ 17 | "using DataFrames # 
load package" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": 2, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "using CategoricalArrays # CategoricalArrays.jl is independent from DataFrames.jl but it is often used in combination" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "## Working with CategoricalArrays" 34 | ] 35 | }, 36 | { 37 | "cell_type": "markdown", 38 | "metadata": {}, 39 | "source": [ 40 | "### Constructor" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 3, 46 | "metadata": {}, 47 | "outputs": [ 48 | { 49 | "data": { 50 | "text/plain": [ 51 | "4-element CategoricalArray{String,1,UInt32}:\n", 52 | " \"A\"\n", 53 | " \"B\"\n", 54 | " \"B\"\n", 55 | " \"C\"" 56 | ] 57 | }, 58 | "execution_count": 3, 59 | "metadata": {}, 60 | "output_type": "execute_result" 61 | } 62 | ], 63 | "source": [ 64 | "x = categorical([\"A\", \"B\", \"B\", \"C\"]) # unordered" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 4, 70 | "metadata": {}, 71 | "outputs": [ 72 | { 73 | "data": { 74 | "text/plain": [ 75 | "4-element CategoricalArray{String,1,UInt32}:\n", 76 | " \"A\"\n", 77 | " \"B\"\n", 78 | " \"B\"\n", 79 | " \"C\"" 80 | ] 81 | }, 82 | "execution_count": 4, 83 | "metadata": {}, 84 | "output_type": "execute_result" 85 | } 86 | ], 87 | "source": [ 88 | "y = categorical([\"A\", \"B\", \"B\", \"C\"], ordered=true) # ordered, by default order is sorting order" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 5, 94 | "metadata": {}, 95 | "outputs": [ 96 | { 97 | "data": { 98 | "text/plain": [ 99 | "5-element CategoricalArray{Union{Missing, String},1,UInt32}:\n", 100 | " \"A\"\n", 101 | " \"B\"\n", 102 | " \"B\"\n", 103 | " \"C\"\n", 104 | " missing" 105 | ] 106 | }, 107 | "execution_count": 5, 108 | "metadata": {}, 109 | "output_type": "execute_result" 110 | } 111 | ], 112 | "source": [ 113 | "z = 
categorical([\"A\",\"B\",\"B\",\"C\", missing]) # unordered with missings" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 6, 119 | "metadata": {}, 120 | "outputs": [ 121 | { 122 | "data": { 123 | "text/plain": [ 124 | "10-element CategoricalArray{String,1,UInt32}:\n", 125 | " \"Q1: [1.0, 2.8)\"\n", 126 | " \"Q1: [1.0, 2.8)\"\n", 127 | " \"Q2: [2.8, 4.6)\"\n", 128 | " \"Q2: [2.8, 4.6)\"\n", 129 | " \"Q3: [4.6, 6.4)\"\n", 130 | " \"Q3: [4.6, 6.4)\"\n", 131 | " \"Q4: [6.4, 8.2)\"\n", 132 | " \"Q4: [6.4, 8.2)\"\n", 133 | " \"Q5: [8.2, 10.0]\"\n", 134 | " \"Q5: [8.2, 10.0]\"" 135 | ] 136 | }, 137 | "execution_count": 6, 138 | "metadata": {}, 139 | "output_type": "execute_result" 140 | } 141 | ], 142 | "source": [ 143 | "c = cut(1:10, 5) # ordered, split into groups of equal counts; labels can be renamed and custom breaks given" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "(We will cover grouping later, but let us use it here to analyze the results; we use Chain.jl for chaining.)" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 7, 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [ 159 | "using Chain" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 8, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "data": { 169 | "text/html": [ 170 | "
10×2 DataFrame
Row  x                                                   nrow
     Cat…                                                Int64
1    Q1: [-4.37425549047856, -1.2786607637653769)        10000
2    Q2: [-1.2786607637653769, -0.8430443474683171)      10000
3    Q3: [-0.8430443474683171, -0.5258259568979863)      10000
4    Q4: [-0.5258259568979863, -0.2593167185316153)      10000
5    Q5: [-0.2593167185316153, -0.0018062175416992735)   10000
6    Q6: [-0.0018062175416992735, 0.25137476303234196)   10000
7    Q7: [0.25137476303234196, 0.5222753668380758)       10000
8    Q8: [0.5222753668380758, 0.838323113350727)         10000
9    Q9: [0.838323113350727, 1.2837083260172553)         10000
10   Q10: [1.2837083260172553, 4.269831619555335]        10000
" 171 | ], 172 | "text/latex": [ 173 | "\\begin{tabular}{r|cc}\n", 174 | "\t& x & nrow\\\\\n", 175 | "\t\\hline\n", 176 | "\t& Cat… & Int64\\\\\n", 177 | "\t\\hline\n", 178 | "\t1 & Q1: [-4.37425549047856, -1.2786607637653769) & 10000 \\\\\n", 179 | "\t2 & Q2: [-1.2786607637653769, -0.8430443474683171) & 10000 \\\\\n", 180 | "\t3 & Q3: [-0.8430443474683171, -0.5258259568979863) & 10000 \\\\\n", 181 | "\t4 & Q4: [-0.5258259568979863, -0.2593167185316153) & 10000 \\\\\n", 182 | "\t5 & Q5: [-0.2593167185316153, -0.0018062175416992735) & 10000 \\\\\n", 183 | "\t6 & Q6: [-0.0018062175416992735, 0.25137476303234196) & 10000 \\\\\n", 184 | "\t7 & Q7: [0.25137476303234196, 0.5222753668380758) & 10000 \\\\\n", 185 | "\t8 & Q8: [0.5222753668380758, 0.838323113350727) & 10000 \\\\\n", 186 | "\t9 & Q9: [0.838323113350727, 1.2837083260172553) & 10000 \\\\\n", 187 | "\t10 & Q10: [1.2837083260172553, 4.269831619555335] & 10000 \\\\\n", 188 | "\\end{tabular}\n" 189 | ], 190 | "text/plain": [ 191 | "\u001b[1m10×2 DataFrame\u001b[0m\n", 192 | "\u001b[1m Row \u001b[0m│\u001b[1m x \u001b[0m\u001b[1m nrow \u001b[0m\n", 193 | " │\u001b[90m Cat… \u001b[0m\u001b[90m Int64 \u001b[0m\n", 194 | "─────┼──────────────────────────────────────────\n", 195 | " 1 │ Q1: [-4.37425549047856, -1.27866… 10000\n", 196 | " 2 │ Q2: [-1.2786607637653769, -0.843… 10000\n", 197 | " 3 │ Q3: [-0.8430443474683171, -0.525… 10000\n", 198 | " 4 │ Q4: [-0.5258259568979863, -0.259… 10000\n", 199 | " 5 │ Q5: [-0.2593167185316153, -0.001… 10000\n", 200 | " 6 │ Q6: [-0.0018062175416992735, 0.2… 10000\n", 201 | " 7 │ Q7: [0.25137476303234196, 0.5222… 10000\n", 202 | " 8 │ Q8: [0.5222753668380758, 0.83832… 10000\n", 203 | " 9 │ Q9: [0.838323113350727, 1.283708… 10000\n", 204 | " 10 │ Q10: [1.2837083260172553, 4.2698… 10000" 205 | ] 206 | }, 207 | "execution_count": 8, 208 | "metadata": {}, 209 | "output_type": "execute_result" 210 | } 211 | ], 212 | "source": [ 213 | "@chain DataFrame(x=cut(randn(100000), 10)) begin\n", 
214 | " groupby(:x)\n", 215 | " combine(nrow) # just to make sure cut works right\n", 216 | "end" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": 9, 222 | "metadata": {}, 223 | "outputs": [ 224 | { 225 | "data": { 226 | "text/plain": [ 227 | "5-element CategoricalArray{Int64,1,UInt32}:\n", 228 | " 1\n", 229 | " 2\n", 230 | " 2\n", 231 | " 3\n", 232 | " 3" 233 | ] 234 | }, 235 | "execution_count": 9, 236 | "metadata": {}, 237 | "output_type": "execute_result" 238 | } 239 | ], 240 | "source": [ 241 | "v = categorical([1,2,2,3,3]) # contains integers not strings" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": 10, 247 | "metadata": {}, 248 | "outputs": [ 249 | { 250 | "data": { 251 | "text/plain": [ 252 | "5-element Vector{Union{Missing, String}}:\n", 253 | " \"A\"\n", 254 | " \"B\"\n", 255 | " \"B\"\n", 256 | " \"C\"\n", 257 | " missing" 258 | ] 259 | }, 260 | "execution_count": 10, 261 | "metadata": {}, 262 | "output_type": "execute_result" 263 | } 264 | ], 265 | "source": [ 266 | "Vector{Union{String, Missing}}(z) # sometimes you need to convert back to a standard vector" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": {}, 272 | "source": [ 273 | "### Managing levels" 274 | ] 275 | }, 276 | { 277 | "cell_type": "code", 278 | "execution_count": 11, 279 | "metadata": {}, 280 | "outputs": [ 281 | { 282 | "data": { 283 | "text/plain": [ 284 | "5-element Vector{CategoricalVector{T, UInt32} where T}:\n", 285 | " CategoricalValue{String, UInt32}[\"A\", \"B\", \"B\", \"C\"]\n", 286 | " CategoricalValue{String, UInt32}[\"A\", \"B\", \"B\", \"C\"]\n", 287 | " Union{Missing, CategoricalValue{String, UInt32}}[\"A\", \"B\", \"B\", \"C\", missing]\n", 288 | " CategoricalValue{String, UInt32}[\"Q1: [1.0, 2.8)\", \"Q1: [1.0, 2.8)\", \"Q2: [2.8, 4.6)\", \"Q2: [2.8, 4.6)\", \"Q3: [4.6, 6.4)\", \"Q3: [4.6, 6.4)\", \"Q4: [6.4, 8.2)\", \"Q4: [6.4, 8.2)\", \"Q5: [8.2, 10.0]\", \"Q5: [8.2, 10.0]\"]\n", 
289 | " CategoricalValue{Int64, UInt32}[1, 2, 2, 3, 3]" 290 | ] 291 | }, 292 | "execution_count": 11, 293 | "metadata": {}, 294 | "output_type": "execute_result" 295 | } 296 | ], 297 | "source": [ 298 | "arr = [x,y,z,c,v]" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": 12, 304 | "metadata": {}, 305 | "outputs": [ 306 | { 307 | "data": { 308 | "text/plain": [ 309 | "5-element BitVector:\n", 310 | " 0\n", 311 | " 1\n", 312 | " 0\n", 313 | " 1\n", 314 | " 0" 315 | ] 316 | }, 317 | "execution_count": 12, 318 | "metadata": {}, 319 | "output_type": "execute_result" 320 | } 321 | ], 322 | "source": [ 323 | "isordered.(arr) # check if categorical array is ordered" 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": 13, 329 | "metadata": {}, 330 | "outputs": [ 331 | { 332 | "data": { 333 | "text/plain": [ 334 | "(CategoricalValue{String, UInt32}[\"A\", \"B\", \"B\", \"C\"], true)" 335 | ] 336 | }, 337 | "execution_count": 13, 338 | "metadata": {}, 339 | "output_type": "execute_result" 340 | } 341 | ], 342 | "source": [ 343 | "ordered!(x, true), isordered(x) # make x ordered" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": 14, 349 | "metadata": {}, 350 | "outputs": [ 351 | { 352 | "data": { 353 | "text/plain": [ 354 | "(CategoricalValue{String, UInt32}[\"A\", \"B\", \"B\", \"C\"], false)" 355 | ] 356 | }, 357 | "execution_count": 14, 358 | "metadata": {}, 359 | "output_type": "execute_result" 360 | } 361 | ], 362 | "source": [ 363 | "ordered!(x, false), isordered(x) # and unordered again" 364 | ] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "execution_count": 15, 369 | "metadata": {}, 370 | "outputs": [ 371 | { 372 | "data": { 373 | "text/plain": [ 374 | "5-element Vector{Vector}:\n", 375 | " [\"A\", \"B\", \"C\"]\n", 376 | " [\"A\", \"B\", \"C\"]\n", 377 | " [\"A\", \"B\", \"C\"]\n", 378 | " [\"Q1: [1.0, 2.8)\", \"Q2: [2.8, 4.6)\", \"Q3: [4.6, 6.4)\", \"Q4: [6.4, 8.2)\", \"Q5: [8.2,
10.0]\"]\n", 379 | " [1, 2, 3]" 380 | ] 381 | }, 382 | "execution_count": 15, 383 | "metadata": {}, 384 | "output_type": "execute_result" 385 | } 386 | ], 387 | "source": [ 388 | "levels.(arr) # list levels" 389 | ] 390 | }, 391 | { 392 | "cell_type": "code", 393 | "execution_count": 16, 394 | "metadata": {}, 395 | "outputs": [ 396 | { 397 | "data": { 398 | "text/plain": [ 399 | "5-element Vector{Vector}:\n", 400 | " [\"A\", \"B\", \"C\"]\n", 401 | " [\"A\", \"B\", \"C\"]\n", 402 | " Union{Missing, String}[\"A\", \"B\", \"C\", missing]\n", 403 | " [\"Q1: [1.0, 2.8)\", \"Q2: [2.8, 4.6)\", \"Q3: [4.6, 6.4)\", \"Q4: [6.4, 8.2)\", \"Q5: [8.2, 10.0]\"]\n", 404 | " [1, 2, 3]" 405 | ] 406 | }, 407 | "execution_count": 16, 408 | "metadata": {}, 409 | "output_type": "execute_result" 410 | } 411 | ], 412 | "source": [ 413 | "unique.(arr) # missing will be included" 414 | ] 415 | }, 416 | { 417 | "cell_type": "code", 418 | "execution_count": 17, 419 | "metadata": {}, 420 | "outputs": [ 421 | { 422 | "data": { 423 | "text/plain": [ 424 | "true" 425 | ] 426 | }, 427 | "execution_count": 17, 428 | "metadata": {}, 429 | "output_type": "execute_result" 430 | } 431 | ], 432 | "source": [ 433 | "y[1] < y[2] # can compare as y is ordered" 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "execution_count": 18, 439 | "metadata": {}, 440 | "outputs": [ 441 | { 442 | "ename": "LoadError", 443 | "evalue": "ArgumentError: Unordered CategoricalValue objects cannot be tested for order using <. Use isless instead, or call the ordered! function on the parent array to change this", 444 | "output_type": "error", 445 | "traceback": [ 446 | "ArgumentError: Unordered CategoricalValue objects cannot be tested for order using <. Use isless instead, or call the ordered! 
function on the parent array to change this", 447 | "", 448 | "Stacktrace:", 449 | " [1] <(x::CategoricalValue{Int64, UInt32}, y::CategoricalValue{Int64, UInt32})", 450 | " @ CategoricalArrays C:\\Users\\bogum\\.julia\\packages\\CategoricalArrays\\tJ8hD\\src\\value.jl:166", 451 | " [2] top-level scope", 452 | " @ In[18]:1" 453 | ] 454 | } 455 | ], 456 | "source": [ 457 | "v[1] < v[2] # not comparable, v is unordered although it contains integers" 458 | ] 459 | }, 460 | { 461 | "cell_type": "code", 462 | "execution_count": 19, 463 | "metadata": {}, 464 | "outputs": [ 465 | { 466 | "ename": "LoadError", 467 | "evalue": "ArgumentError: cannot compare a `CategoricalValue` to value `v` of type `String`: wrap `v` using `CategoricalValue(v, catvalue)` or `CategoricalValue(v, catarray)` first", 468 | "output_type": "error", 469 | "traceback": [ 470 | "ArgumentError: cannot compare a `CategoricalValue` to value `v` of type `String`: wrap `v` using `CategoricalValue(v, catvalue)` or `CategoricalValue(v, catarray)` first", 471 | "", 472 | "Stacktrace:", 473 | " [1] <(x::CategoricalValue{String, UInt32}, y::String)", 474 | " @ CategoricalArrays C:\\Users\\bogum\\.julia\\packages\\CategoricalArrays\\tJ8hD\\src\\value.jl:173", 475 | " [2] top-level scope", 476 | " @ In[19]:1" 477 | ] 478 | } 479 | ], 480 | "source": [ 481 | "y[2] < \"A\" # comparison against type underlying categorical value is not allowed" 482 | ] 483 | }, 484 | { 485 | "cell_type": "code", 486 | "execution_count": 20, 487 | "metadata": {}, 488 | "outputs": [ 489 | { 490 | "data": { 491 | "text/plain": [ 492 | "false" 493 | ] 494 | }, 495 | "execution_count": 20, 496 | "metadata": {}, 497 | "output_type": "execute_result" 498 | } 499 | ], 500 | "source": [ 501 | "y[2] < CategoricalValue(\"A\", y) # you need to explicitly convert a value to a level" 502 | ] 503 | }, 504 | { 505 | "cell_type": "code", 506 | "execution_count": 21, 507 | "metadata": {}, 508 | "outputs": [ 509 | { 510 | "ename": "LoadError", 511 | 
"evalue": "ArgumentError: level Z not found in source pool", 512 | "output_type": "error", 513 | "traceback": [ 514 | "ArgumentError: level Z not found in source pool", 515 | "", 516 | "Stacktrace:", 517 | " [1] CategoricalValue(value::String, source::CategoricalVector{String, UInt32, String, CategoricalValue{String, UInt32}, Union{}})", 518 | " @ CategoricalArrays C:\\Users\\bogum\\.julia\\packages\\CategoricalArrays\\tJ8hD\\src\\value.jl:13", 519 | " [2] top-level scope", 520 | " @ In[21]:1" 521 | ] 522 | } 523 | ], 524 | "source": [ 525 | "y[2] < CategoricalValue(\"Z\", y) # but it is treated as a level, and thus only valid levels are allowed" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": 22, 531 | "metadata": {}, 532 | "outputs": [ 533 | { 534 | "data": { 535 | "text/plain": [ 536 | "4-element CategoricalArray{String,1,UInt32}:\n", 537 | " \"A\"\n", 538 | " \"B\"\n", 539 | " \"B\"\n", 540 | " \"C\"" 541 | ] 542 | }, 543 | "execution_count": 22, 544 | "metadata": {}, 545 | "output_type": "execute_result" 546 | } 547 | ], 548 | "source": [ 549 | "levels!(y, [\"C\", \"B\", \"A\"]) # you can reorder levels, mostly useful for ordered CategoricalArrays" 550 | ] 551 | }, 552 | { 553 | "cell_type": "code", 554 | "execution_count": 23, 555 | "metadata": {}, 556 | "outputs": [ 557 | { 558 | "data": { 559 | "text/plain": [ 560 | "false" 561 | ] 562 | }, 563 | "execution_count": 23, 564 | "metadata": {}, 565 | "output_type": "execute_result" 566 | } 567 | ], 568 | "source": [ 569 | "y[1] < y[2] # observe that the order is changed" 570 | ] 571 | }, 572 | { 573 | "cell_type": "code", 574 | "execution_count": 24, 575 | "metadata": {}, 576 | "outputs": [ 577 | { 578 | "ename": "LoadError", 579 | "evalue": "ArgumentError: cannot remove level \"C\" as it is used at position 4 and allowmissing=false.", 580 | "output_type": "error", 581 | "traceback": [ 582 | "ArgumentError: cannot remove level \"C\" as it is used at position 4 and 
allowmissing=false.", 583 | "", 584 | "Stacktrace:", 585 | " [1] levels!(A::CategoricalVector{Union{Missing, String}, UInt32, String, CategoricalValue{String, UInt32}, Missing}, newlevels::Vector{String}; allowmissing::Bool, allow_missing::Nothing)", 586 | " @ CategoricalArrays C:\\Users\\bogum\\.julia\\packages\\CategoricalArrays\\tJ8hD\\src\\array.jl:859", 587 | " [2] levels!(A::CategoricalVector{Union{Missing, String}, UInt32, String, CategoricalValue{String, UInt32}, Missing}, newlevels::Vector{String})", 588 | " @ CategoricalArrays C:\\Users\\bogum\\.julia\\packages\\CategoricalArrays\\tJ8hD\\src\\array.jl:793", 589 | " [3] top-level scope", 590 | " @ In[24]:1" 591 | ] 592 | } 593 | ], 594 | "source": [ 595 | "levels!(z, [\"A\", \"B\"]) # you have to specify all levels that are present" 596 | ] 597 | }, 598 | { 599 | "cell_type": "code", 600 | "execution_count": 25, 601 | "metadata": {}, 602 | "outputs": [ 603 | { 604 | "data": { 605 | "text/plain": [ 606 | "5-element CategoricalArray{Union{Missing, String},1,UInt32}:\n", 607 | " \"A\"\n", 608 | " \"B\"\n", 609 | " \"B\"\n", 610 | " missing\n", 611 | " missing" 612 | ] 613 | }, 614 | "execution_count": 25, 615 | "metadata": {}, 616 | "output_type": "execute_result" 617 | } 618 | ], 619 | "source": [ 620 | "levels!(z, [\"A\", \"B\"], allowmissing=true) # unless the underlying array allows missings and you force removal of levels" 621 | ] 622 | }, 623 | { 624 | "cell_type": "code", 625 | "execution_count": 26, 626 | "metadata": {}, 627 | "outputs": [ 628 | { 629 | "data": { 630 | "text/plain": [ 631 | "5-element CategoricalArray{Union{Missing, String},1,UInt32}:\n", 632 | " \"B\"\n", 633 | " \"B\"\n", 634 | " \"B\"\n", 635 | " missing\n", 636 | " missing" 637 | ] 638 | }, 639 | "execution_count": 26, 640 | "metadata": {}, 641 | "output_type": "execute_result" 642 | } 643 | ], 644 | "source": [ 645 | "z[1] = \"B\"\n", 646 | "z # now the only non-missing entries in z are \"B\"" 647 | ] 648 | }, 649 | { 650 | "cell_type": "code", 651
| "execution_count": 27, 652 | "metadata": {}, 653 | "outputs": [ 654 | { 655 | "data": { 656 | "text/plain": [ 657 | "2-element Vector{String}:\n", 658 | " \"A\"\n", 659 | " \"B\"" 660 | ] 661 | }, 662 | "execution_count": 27, 663 | "metadata": {}, 664 | "output_type": "execute_result" 665 | } 666 | ], 667 | "source": [ 668 | "levels(z) # but it remembers the levels it had (the reason is mostly performance)" 669 | ] 670 | }, 671 | { 672 | "cell_type": "code", 673 | "execution_count": 28, 674 | "metadata": {}, 675 | "outputs": [ 676 | { 677 | "data": { 678 | "text/plain": [ 679 | "1-element Vector{String}:\n", 680 | " \"B\"" 681 | ] 682 | }, 683 | "execution_count": 28, 684 | "metadata": {}, 685 | "output_type": "execute_result" 686 | } 687 | ], 688 | "source": [ 689 | "droplevels!(z) # this way we can clean it up\n", 690 | "levels(z)" 691 | ] 692 | }, 693 | { 694 | "cell_type": "markdown", 695 | "metadata": {}, 696 | "source": [ 697 | "### Data manipulation" 698 | ] 699 | }, 700 | { 701 | "cell_type": "code", 702 | "execution_count": 29, 703 | "metadata": {}, 704 | "outputs": [ 705 | { 706 | "data": { 707 | "text/plain": [ 708 | "(CategoricalValue{String, UInt32}[\"A\", \"B\", \"B\", \"C\"], [\"A\", \"B\", \"C\"])" 709 | ] 710 | }, 711 | "execution_count": 29, 712 | "metadata": {}, 713 | "output_type": "execute_result" 714 | } 715 | ], 716 | "source": [ 717 | "x, levels(x)" 718 | ] 719 | }, 720 | { 721 | "cell_type": "code", 722 | "execution_count": 30, 723 | "metadata": {}, 724 | "outputs": [ 725 | { 726 | "data": { 727 | "text/plain": [ 728 | "(CategoricalValue{String, UInt32}[\"A\", \"0\", \"B\", \"C\"], [\"A\", \"B\", \"C\", \"0\"])" 729 | ] 730 | }, 731 | "execution_count": 30, 732 | "metadata": {}, 733 | "output_type": "execute_result" 734 | } 735 | ], 736 | "source": [ 737 | "x[2] = \"0\"\n", 738 | "x, levels(x) # new level added at the end (works only for unordered)" 739 | ] 740 | }, 741 | { 742 | "cell_type": "code", 743 | "execution_count": 31, 744 | 
"metadata": {}, 745 | "outputs": [ 746 | { 747 | "data": { 748 | "text/plain": [ 749 | "(CategoricalValue{Int64, UInt32}[1, 2, 2, 3, 3], [1, 2, 3])" 750 | ] 751 | }, 752 | "execution_count": 31, 753 | "metadata": {}, 754 | "output_type": "execute_result" 755 | } 756 | ], 757 | "source": [ 758 | "v, levels(v)" 759 | ] 760 | }, 761 | { 762 | "cell_type": "code", 763 | "execution_count": 32, 764 | "metadata": {}, 765 | "outputs": [ 766 | { 767 | "ename": "LoadError", 768 | "evalue": "MethodError: no method matching +(::CategoricalValue{Int64, UInt32}, ::CategoricalValue{Int64, UInt32})\n\n\u001b[0mClosest candidates are:\n\u001b[0m +(::Any, ::Any, \u001b[91m::Any\u001b[39m, \u001b[91m::Any...\u001b[39m)\n\u001b[0m\u001b[90m @\u001b[39m \u001b[90mBase\u001b[39m \u001b[90m\u001b[4moperators.jl:578\u001b[24m\u001b[39m\n", 769 | "output_type": "error", 770 | "traceback": [ 771 | "MethodError: no method matching +(::CategoricalValue{Int64, UInt32}, ::CategoricalValue{Int64, UInt32})\n\n\u001b[0mClosest candidates are:\n\u001b[0m +(::Any, ::Any, \u001b[91m::Any\u001b[39m, \u001b[91m::Any...\u001b[39m)\n\u001b[0m\u001b[90m @\u001b[39m \u001b[90mBase\u001b[39m \u001b[90m\u001b[4moperators.jl:578\u001b[24m\u001b[39m\n", 772 | "", 773 | "Stacktrace:", 774 | " [1] top-level scope", 775 | " @ In[32]:1" 776 | ] 777 | } 778 | ], 779 | "source": [ 780 | "v[1] + v[2] # even though the underlying data is Int, we cannot operate on it" 781 | ] 782 | }, 783 | { 784 | "cell_type": "code", 785 | "execution_count": 33, 786 | "metadata": {}, 787 | "outputs": [ 788 | { 789 | "data": { 790 | "text/plain": [ 791 | "5-element Vector{Int64}:\n", 792 | " 1\n", 793 | " 2\n", 794 | " 2\n", 795 | " 3\n", 796 | " 3" 797 | ] 798 | }, 799 | "execution_count": 33, 800 | "metadata": {}, 801 | "output_type": "execute_result" 802 | } 803 | ], 804 | "source": [ 805 | "Vector{Int}(v) # you have either to retrieve the data by conversion (may be expensive)" 806 | ] 807 | }, 808 | { 809 | "cell_type": "code", 
810 | "execution_count": 34, 811 | "metadata": {}, 812 | "outputs": [ 813 | { 814 | "data": { 815 | "text/plain": [ 816 | "3" 817 | ] 818 | }, 819 | "execution_count": 34, 820 | "metadata": {}, 821 | "output_type": "execute_result" 822 | } 823 | ], 824 | "source": [ 825 | "unwrap(v[1]) + unwrap(v[2]) # or get a single value" 826 | ] 827 | }, 828 | { 829 | "cell_type": "code", 830 | "execution_count": 35, 831 | "metadata": {}, 832 | "outputs": [ 833 | { 834 | "data": { 835 | "text/plain": [ 836 | "5-element Vector{Int64}:\n", 837 | " 1\n", 838 | " 2\n", 839 | " 2\n", 840 | " 3\n", 841 | " 3" 842 | ] 843 | }, 844 | "execution_count": 35, 845 | "metadata": {}, 846 | "output_type": "execute_result" 847 | } 848 | ], 849 | "source": [ 850 | "unwrap.(v) # this works for arrays without missings" 851 | ] 852 | }, 853 | { 854 | "cell_type": "code", 855 | "execution_count": 36, 856 | "metadata": {}, 857 | "outputs": [ 858 | { 859 | "data": { 860 | "text/plain": [ 861 | "5-element Vector{Union{Missing, String}}:\n", 862 | " \"B\"\n", 863 | " \"B\"\n", 864 | " \"B\"\n", 865 | " missing\n", 866 | " missing" 867 | ] 868 | }, 869 | "execution_count": 36, 870 | "metadata": {}, 871 | "output_type": "execute_result" 872 | } 873 | ], 874 | "source": [ 875 | "unwrap.(z) # it also works when missing values are present" 876 | ] 877 | }, 878 | { 879 | "cell_type": "code", 880 | "execution_count": 37, 881 | "metadata": {}, 882 | "outputs": [ 883 | { 884 | "data": { 885 | "text/plain": [ 886 | "5-element Vector{Union{Missing, String}}:\n", 887 | " \"B\"\n", 888 | " \"B\"\n", 889 | " \"B\"\n", 890 | " missing\n", 891 | " missing" 892 | ] 893 | }, 894 | "execution_count": 37, 895 | "metadata": {}, 896 | "output_type": "execute_result" 897 | } 898 | ], 899 | "source": [ 900 | "Vector{Union{String, Missing}}(z) # or do the conversion" 901 | ] 902 | }, 903 | { 904 | "cell_type": "code", 905 | "execution_count": 38, 906 | "metadata": {}, 907 | "outputs": [ 908 | { 909 | "data": { 910 | "text/plain": [ 911 |
"6-element Vector{Union{Missing, Int64}}:\n", 912 | " 10\n", 913 | " 2\n", 914 | " 3\n", 915 | " 4\n", 916 | " 5\n", 917 | " missing" 918 | ] 919 | }, 920 | "execution_count": 38, 921 | "metadata": {}, 922 | "output_type": "execute_result" 923 | } 924 | ], 925 | "source": [ 926 | "recode([1,2,3,4,5,missing], 1=>10) # recode some values in an array; has also in place recode! equivalent" 927 | ] 928 | }, 929 | { 930 | "cell_type": "code", 931 | "execution_count": 39, 932 | "metadata": {}, 933 | "outputs": [ 934 | { 935 | "data": { 936 | "text/plain": [ 937 | "6-element Vector{Union{Missing, Int64, String}}:\n", 938 | " 10\n", 939 | " 20\n", 940 | " \"a\"\n", 941 | " \"a\"\n", 942 | " \"a\"\n", 943 | " missing" 944 | ] 945 | }, 946 | "execution_count": 39, 947 | "metadata": {}, 948 | "output_type": "execute_result" 949 | } 950 | ], 951 | "source": [ 952 | "recode([1,2,3,4,5,missing], \"a\", 1=>10, 2=>20) # here we provided a default value for not mapped recodings" 953 | ] 954 | }, 955 | { 956 | "cell_type": "code", 957 | "execution_count": 40, 958 | "metadata": {}, 959 | "outputs": [ 960 | { 961 | "data": { 962 | "text/plain": [ 963 | "6-element Vector{Union{Int64, String}}:\n", 964 | " 10\n", 965 | " 2\n", 966 | " 3\n", 967 | " 4\n", 968 | " 5\n", 969 | " \"missing\"" 970 | ] 971 | }, 972 | "execution_count": 40, 973 | "metadata": {}, 974 | "output_type": "execute_result" 975 | } 976 | ], 977 | "source": [ 978 | "recode([1,2,3,4,5,missing], 1=>10, missing=>\"missing\") # to recode Missing you have to do it explicitly" 979 | ] 980 | }, 981 | { 982 | "cell_type": "code", 983 | "execution_count": 41, 984 | "metadata": {}, 985 | "outputs": [ 986 | { 987 | "data": { 988 | "text/plain": [ 989 | "(Union{Missing, CategoricalValue{Int64, UInt32}}[1, 2, 3, 4, 5, missing], [1, 2, 3, 4, 5])" 990 | ] 991 | }, 992 | "execution_count": 41, 993 | "metadata": {}, 994 | "output_type": "execute_result" 995 | } 996 | ], 997 | "source": [ 998 | "t = categorical([1:5; missing])\n", 999 | 
"t, levels(t)" 1000 | ] 1001 | }, 1002 | { 1003 | "cell_type": "code", 1004 | "execution_count": 42, 1005 | "metadata": {}, 1006 | "outputs": [ 1007 | { 1008 | "data": { 1009 | "text/plain": [ 1010 | "(Union{Missing, CategoricalValue{Int64, UInt32}}[2, 2, 2, 4, 5, missing], [2, 4, 5])" 1011 | ] 1012 | }, 1013 | "execution_count": 42, 1014 | "metadata": {}, 1015 | "output_type": "execute_result" 1016 | } 1017 | ], 1018 | "source": [ 1019 | "recode!(t, [1,3]=>2)\n", 1020 | "t, levels(t) # note that the levels are dropped after recode" 1021 | ] 1022 | }, 1023 | { 1024 | "cell_type": "code", 1025 | "execution_count": 43, 1026 | "metadata": {}, 1027 | "outputs": [ 1028 | { 1029 | "data": { 1030 | "text/plain": [ 1031 | "3-element Vector{Int64}:\n", 1032 | " 3\n", 1033 | " 0\n", 1034 | " -1" 1035 | ] 1036 | }, 1037 | "execution_count": 43, 1038 | "metadata": {}, 1039 | "output_type": "execute_result" 1040 | } 1041 | ], 1042 | "source": [ 1043 | "t = categorical([1,2,3], ordered=true)\n", 1044 | "levels(recode(t, 2=>0, 1=>-1)) # and if you introduce a new levels they are added at the end in the order of appearance" 1045 | ] 1046 | }, 1047 | { 1048 | "cell_type": "code", 1049 | "execution_count": 44, 1050 | "metadata": {}, 1051 | "outputs": [ 1052 | { 1053 | "data": { 1054 | "text/plain": [ 1055 | "3-element Vector{Int64}:\n", 1056 | " 100\n", 1057 | " 200\n", 1058 | " 300" 1059 | ] 1060 | }, 1061 | "execution_count": 44, 1062 | "metadata": {}, 1063 | "output_type": "execute_result" 1064 | } 1065 | ], 1066 | "source": [ 1067 | "t = categorical([1,2,3,4,5], ordered=true) # when using default it becomes the last level\n", 1068 | "levels(recode(t, 300, [1,2]=>100, 3=>200))" 1069 | ] 1070 | }, 1071 | { 1072 | "cell_type": "markdown", 1073 | "metadata": {}, 1074 | "source": [ 1075 | "### Comparisons" 1076 | ] 1077 | }, 1078 | { 1079 | "cell_type": "code", 1080 | "execution_count": 45, 1081 | "metadata": {}, 1082 | "outputs": [ 1083 | { 1084 | "data": { 1085 | "text/plain": [ 
1086 | "4×4 Matrix{Bool}:\n", 1087 | " 1 1 1 1\n", 1088 | " 1 1 1 1\n", 1089 | " 1 1 1 1\n", 1090 | " 1 1 1 1" 1091 | ] 1092 | }, 1093 | "execution_count": 45, 1094 | "metadata": {}, 1095 | "output_type": "execute_result" 1096 | } 1097 | ], 1098 | "source": [ 1099 | "x = categorical([1,2,3])\n", 1100 | "xs = [x, categorical(x), categorical(x, ordered=true), categorical(x, ordered=true)]\n", 1101 | "levels!(xs[2], [3,2,1])\n", 1102 | "levels!(xs[4], [2,3,1])\n", 1103 | "[a == b for a in xs, b in xs] # all are equal - comparison only by contents" 1104 | ] 1105 | }, 1106 | { 1107 | "cell_type": "code", 1108 | "execution_count": 46, 1109 | "metadata": {}, 1110 | "outputs": [ 1111 | { 1112 | "data": { 1113 | "text/plain": [ 1114 | "4×4 Matrix{Bool}:\n", 1115 | " 1 0 0 0\n", 1116 | " 0 1 0 0\n", 1117 | " 0 0 1 0\n", 1118 | " 0 0 0 1" 1119 | ] 1120 | }, 1121 | "execution_count": 46, 1122 | "metadata": {}, 1123 | "output_type": "execute_result" 1124 | } 1125 | ], 1126 | "source": [ 1127 | "signature(x::CategoricalArray) = (x, levels(x), isordered(x)) # this is actually the full signature of CategoricalArray\n", 1128 | "# all are different; notice that xs[1] and xs[2] are both unordered but have a different order of levels\n", 1129 | "[signature(a) == signature(b) for a in xs, b in xs]" 1130 | ] 1131 | }, 1132 | { 1133 | "cell_type": "code", 1134 | "execution_count": 47, 1135 | "metadata": {}, 1136 | "outputs": [ 1137 | { 1138 | "ename": "LoadError", 1139 | "evalue": "ArgumentError: Unordered CategoricalValue objects cannot be tested for order using <. Use isless instead, or call the ordered! function on the parent array to change this", 1140 | "output_type": "error", 1141 | "traceback": [ 1142 | "ArgumentError: Unordered CategoricalValue objects cannot be tested for order using <. Use isless instead, or call the ordered! 
function on the parent array to change this", 1143 | "", 1144 | "Stacktrace:", 1145 | " [1] <(x::CategoricalValue{Int64, UInt32}, y::CategoricalValue{Int64, UInt32})", 1146 | " @ CategoricalArrays C:\\Users\\bogum\\.julia\\packages\\CategoricalArrays\\tJ8hD\\src\\value.jl:166", 1147 | " [2] top-level scope", 1148 | " @ In[47]:1" 1149 | ] 1150 | } 1151 | ], 1152 | "source": [ 1153 | "x[1] < x[2] # you cannot use < to compare elements of an unordered CategoricalArray" 1154 | ] 1155 | }, 1156 | { 1157 | "cell_type": "code", 1158 | "execution_count": 48, 1159 | "metadata": {}, 1160 | "outputs": [ 1161 | { 1162 | "data": { 1163 | "text/plain": [ 1164 | "true" 1165 | ] 1166 | }, 1167 | "execution_count": 48, 1168 | "metadata": {}, 1169 | "output_type": "execute_result" 1170 | } 1171 | ], 1172 | "source": [ 1173 | "t[1] < t[2] # but you can do it for an ordered one" 1174 | ] 1175 | }, 1176 | { 1177 | "cell_type": "code", 1178 | "execution_count": 49, 1179 | "metadata": {}, 1180 | "outputs": [ 1181 | { 1182 | "data": { 1183 | "text/plain": [ 1184 | "true" 1185 | ] 1186 | }, 1187 | "execution_count": 49, 1188 | "metadata": {}, 1189 | "output_type": "execute_result" 1190 | } 1191 | ], 1192 | "source": [ 1193 | "isless(x[1], x[2]) # isless works within the same CategoricalArray even if it is not ordered" 1194 | ] 1195 | }, 1196 | { 1197 | "cell_type": "code", 1198 | "execution_count": 50, 1199 | "metadata": {}, 1200 | "outputs": [ 1201 | { 1202 | "data": { 1203 | "text/plain": [ 1204 | "true" 1205 | ] 1206 | }, 1207 | "execution_count": 50, 1208 | "metadata": {}, 1209 | "output_type": "execute_result" 1210 | } 1211 | ], 1212 | "source": [ 1213 | "y = deepcopy(x) # here it also works across categorical arrays (y is a copy of x, so the levels match)\n", 1214 | "isless(x[1], y[2])" 1215 | ] 1216 | }, 1217 | { 1218 | "cell_type": "code", 1219 | "execution_count": 51, 1220 | "metadata": {}, 1221 | "outputs": [ 1222 | { 1223 | "data": { 1224 | "text/plain": [ 1225 | "true" 1226 | ] 1227 | }, 1228 | "execution_count": 51, 1229 | "metadata": {}, 
1230 | "output_type": "execute_result" 1231 | } 1232 | ], 1233 | "source": [ 1234 | "isless(unwrap(x[1]), unwrap(y[2])) # you can use unwrap to compare the contents of CategoricalArrays" 1235 | ] 1236 | }, 1237 | { 1238 | "cell_type": "code", 1239 | "execution_count": 52, 1240 | "metadata": {}, 1241 | "outputs": [ 1242 | { 1243 | "data": { 1244 | "text/plain": [ 1245 | "false" 1246 | ] 1247 | }, 1248 | "execution_count": 52, 1249 | "metadata": {}, 1250 | "output_type": "execute_result" 1251 | } 1252 | ], 1253 | "source": [ 1254 | "x[1] == y[2] # equality tests work across CategoricalArrays" 1255 | ] 1256 | }, 1257 | { 1258 | "cell_type": "markdown", 1259 | "metadata": {}, 1260 | "source": [ 1261 | "### Categorical columns in a DataFrame" 1262 | ] 1263 | }, 1264 | { 1265 | "cell_type": "code", 1266 | "execution_count": 53, 1267 | "metadata": {}, 1268 | "outputs": [ 1269 | { 1270 | "data": { 1271 | "text/html": [ 1272 | "
3×3 DataFrame
Row | x | y | z
 | Int64 | Char | String
1 | 1 | a | a
2 | 2 | b | b
3 | 3 | c | c
" 1273 | ], 1274 | "text/latex": [ 1275 | "\\begin{tabular}{r|ccc}\n", 1276 | "\t& x & y & z\\\\\n", 1277 | "\t\\hline\n", 1278 | "\t& Int64 & Char & String\\\\\n", 1279 | "\t\\hline\n", 1280 | "\t1 & 1 & a & a \\\\\n", 1281 | "\t2 & 2 & b & b \\\\\n", 1282 | "\t3 & 3 & c & c \\\\\n", 1283 | "\\end{tabular}\n" 1284 | ], 1285 | "text/plain": [ 1286 | "\u001b[1m3×3 DataFrame\u001b[0m\n", 1287 | "\u001b[1m Row \u001b[0m│\u001b[1m x \u001b[0m\u001b[1m y \u001b[0m\u001b[1m z \u001b[0m\n", 1288 | " │\u001b[90m Int64 \u001b[0m\u001b[90m Char \u001b[0m\u001b[90m String \u001b[0m\n", 1289 | "─────┼─────────────────────\n", 1290 | " 1 │ 1 a a\n", 1291 | " 2 │ 2 b b\n", 1292 | " 3 │ 3 c c" 1293 | ] 1294 | }, 1295 | "execution_count": 53, 1296 | "metadata": {}, 1297 | "output_type": "execute_result" 1298 | } 1299 | ], 1300 | "source": [ 1301 | "df = DataFrame(x = 1:3, y = 'a':'c', z = [\"a\",\"b\",\"c\"])" 1302 | ] 1303 | }, 1304 | { 1305 | "cell_type": "markdown", 1306 | "metadata": {}, 1307 | "source": [ 1308 | "Convert all `String` columns to categorical in-place" 1309 | ] 1310 | }, 1311 | { 1312 | "cell_type": "code", 1313 | "execution_count": 54, 1314 | "metadata": {}, 1315 | "outputs": [ 1316 | { 1317 | "data": { 1318 | "text/html": [ 1319 | "
3×3 DataFrame
Row | x | y | z
 | Int64 | Char | Cat…
1 | 1 | a | a
2 | 2 | b | b
3 | 3 | c | c
" 1320 | ], 1321 | "text/latex": [ 1322 | "\\begin{tabular}{r|ccc}\n", 1323 | "\t& x & y & z\\\\\n", 1324 | "\t\\hline\n", 1325 | "\t& Int64 & Char & Cat…\\\\\n", 1326 | "\t\\hline\n", 1327 | "\t1 & 1 & a & a \\\\\n", 1328 | "\t2 & 2 & b & b \\\\\n", 1329 | "\t3 & 3 & c & c \\\\\n", 1330 | "\\end{tabular}\n" 1331 | ], 1332 | "text/plain": [ 1333 | "\u001b[1m3×3 DataFrame\u001b[0m\n", 1334 | "\u001b[1m Row \u001b[0m│\u001b[1m x \u001b[0m\u001b[1m y \u001b[0m\u001b[1m z \u001b[0m\n", 1335 | " │\u001b[90m Int64 \u001b[0m\u001b[90m Char \u001b[0m\u001b[90m Cat… \u001b[0m\n", 1336 | "─────┼───────────────────\n", 1337 | " 1 │ 1 a a\n", 1338 | " 2 │ 2 b b\n", 1339 | " 3 │ 3 c c" 1340 | ] 1341 | }, 1342 | "execution_count": 54, 1343 | "metadata": {}, 1344 | "output_type": "execute_result" 1345 | } 1346 | ], 1347 | "source": [ 1348 | "transform!(df, names(df, String) => categorical, renamecols=false)" 1349 | ] 1350 | }, 1351 | { 1352 | "cell_type": "code", 1353 | "execution_count": 55, 1354 | "metadata": {}, 1355 | "outputs": [ 1356 | { 1357 | "data": { 1358 | "text/html": [ 1359 | "
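3×7 DataFrame
Row | variable | mean | min | median | max | nmissing | eltype
 | Symbol | Union… | Any | Union… | Any | Int64 | DataType
1 | x | 2.0 | 1 | 2.0 | 3 | 0 | Int64
2 | y | | a | | c | 0 | Char
3 | z | | a | | c | 0 | CategoricalValue{String, UInt32}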
" 1360 | ], 1361 | "text/latex": [ 1362 | "\\begin{tabular}{r|ccccccc}\n", 1363 | "\t& variable & mean & min & median & max & nmissing & eltype\\\\\n", 1364 | "\t\\hline\n", 1365 | "\t& Symbol & Union… & Any & Union… & Any & Int64 & DataType\\\\\n", 1366 | "\t\\hline\n", 1367 | "\t1 & x & 2.0 & 1 & 2.0 & 3 & 0 & Int64 \\\\\n", 1368 | "\t2 & y & & a & & c & 0 & Char \\\\\n", 1369 | "\t3 & z & & a & & c & 0 & CategoricalValue\\{String, UInt32\\} \\\\\n", 1370 | "\\end{tabular}\n" 1371 | ], 1372 | "text/plain": [ 1373 | "\u001b[1m3×7 DataFrame\u001b[0m\n", 1374 | "\u001b[1m Row \u001b[0m│\u001b[1m variable \u001b[0m\u001b[1m mean \u001b[0m\u001b[1m min \u001b[0m\u001b[1m median \u001b[0m\u001b[1m max \u001b[0m\u001b[1m nmissing \u001b[0m\u001b[1m eltype \u001b[0m ⋯\n", 1375 | " │\u001b[90m Symbol \u001b[0m\u001b[90m Union… \u001b[0m\u001b[90m Any \u001b[0m\u001b[90m Union… \u001b[0m\u001b[90m Any \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m DataType \u001b[0m ⋯\n", 1376 | "─────┼──────────────────────────────────────────────────────────────────────────\n", 1377 | " 1 │ x 2.0 1 2.0 3 0 Int64 ⋯\n", 1378 | " 2 │ y \u001b[90m \u001b[0m a \u001b[90m \u001b[0m c 0 Char\n", 1379 | " 3 │ z \u001b[90m \u001b[0m a \u001b[90m \u001b[0m c 0 CategoricalValue{String,\n", 1380 | "\u001b[36m 1 column omitted\u001b[0m" 1381 | ] 1382 | }, 1383 | "execution_count": 55, 1384 | "metadata": {}, 1385 | "output_type": "execute_result" 1386 | } 1387 | ], 1388 | "source": [ 1389 | "describe(df)" 1390 | ] 1391 | } 1392 | ], 1393 | "metadata": { 1394 | "@webio": { 1395 | "lastCommId": null, 1396 | "lastKernelId": null 1397 | }, 1398 | "kernelspec": { 1399 | "display_name": "Julia 1.9.0", 1400 | "language": "julia", 1401 | "name": "julia-1.9" 1402 | }, 1403 | "language_info": { 1404 | "file_extension": ".jl", 1405 | "mimetype": "application/julia", 1406 | "name": "julia", 1407 | "version": "1.9.0" 1408 | } 1409 | }, 1410 | "nbformat": 4, 1411 | "nbformat_minor": 1 1412 | } 1413 | 
-------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2017-18 Bogumił Kamiński 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /Manifest.toml: -------------------------------------------------------------------------------- 1 | # This file is machine-generated - editing it directly is not advised 2 | 3 | julia_version = "1.9.0" 4 | manifest_format = "2.0" 5 | project_hash = "5a686dfaa1c76c393e25dc0e8ea240febe155185" 6 | 7 | [[deps.AbstractFFTs]] 8 | deps = ["ChainRulesCore", "LinearAlgebra"] 9 | git-tree-sha1 = "69f7020bd72f069c219b5e8c236c1fa90d2cb409" 10 | uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c" 11 | version = "1.2.1" 12 | 13 | [[deps.Adapt]] 14 | deps = ["LinearAlgebra"] 15 | git-tree-sha1 = "0310e08cb19f5da31d08341c6120c047598f5b9c" 16 | uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" 17 | version = "3.5.0" 18 | 19 | [[deps.ArgTools]] 20 | uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f" 21 | version = "1.1.1" 22 | 23 | [[deps.Arpack]] 24 | deps = ["Arpack_jll", "Libdl", "LinearAlgebra", "Logging"] 25 | git-tree-sha1 = "9b9b347613394885fd1c8c7729bfc60528faa436" 26 | uuid = "7d9fca2a-8960-54d3-9f78-7d1dccf2cb97" 27 | version = "0.5.4" 28 | 29 | [[deps.Arpack_jll]] 30 | deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS_jll", "Pkg"] 31 | git-tree-sha1 = "5ba6c757e8feccf03a1554dfaf3e26b3cfc7fd5e" 32 | uuid = "68821587-b530-5797-8361-c406ea357684" 33 | version = "3.5.1+1" 34 | 35 | [[deps.Arrow]] 36 | deps = ["ArrowTypes", "BitIntegers", "CodecLz4", "CodecZstd", "DataAPI", "Dates", "LoggingExtras", "Mmap", "PooledArrays", "SentinelArrays", "Tables", "TimeZones", "UUIDs", "WorkerUtilities"] 37 | git-tree-sha1 = "4e40f4868281b7fd702c605c764ab82a52ac3f4b" 38 | uuid = "69666777-d1a9-59fb-9406-91d4454c9d45" 39 | version = "2.4.3" 40 | 41 | [[deps.ArrowTypes]] 42 | deps = ["UUIDs"] 43 | git-tree-sha1 = "563d60f89fcb730668bd568ba3e752ee71dde023" 44 | uuid = "31f734f8-188a-4ce0-8406-c8a06bd891cd" 45 | version = "2.0.2" 46 | 47 | [[deps.Artifacts]] 48 | uuid = 
"56f22d72-fd6d-98f1-02f0-08ddc0907c33" 49 | 50 | [[deps.AxisAlgorithms]] 51 | deps = ["LinearAlgebra", "Random", "SparseArrays", "WoodburyMatrices"] 52 | git-tree-sha1 = "66771c8d21c8ff5e3a93379480a2307ac36863f7" 53 | uuid = "13072b0f-2c55-5437-9ae7-d433b7a33950" 54 | version = "1.0.1" 55 | 56 | [[deps.BSON]] 57 | git-tree-sha1 = "86e9781ac28f4e80e9b98f7f96eae21891332ac2" 58 | uuid = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0" 59 | version = "0.3.6" 60 | 61 | [[deps.Base64]] 62 | uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" 63 | 64 | [[deps.BenchmarkTools]] 65 | deps = ["JSON", "Logging", "Printf", "Profile", "Statistics", "UUIDs"] 66 | git-tree-sha1 = "d9a9701b899b30332bbcb3e1679c41cce81fb0e8" 67 | uuid = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf" 68 | version = "1.3.2" 69 | 70 | [[deps.BitFlags]] 71 | git-tree-sha1 = "43b1a4a8f797c1cddadf60499a8a077d4af2cd2d" 72 | uuid = "d1d4a3ce-64b1-5f1a-9ba4-7e7e69966f35" 73 | version = "0.1.7" 74 | 75 | [[deps.BitIntegers]] 76 | deps = ["Random"] 77 | git-tree-sha1 = "fc54d5837033a170f3bad307f993e156eefc345f" 78 | uuid = "c3b6d118-76ef-56ca-8cc7-ebb389d030a1" 79 | version = "0.2.7" 80 | 81 | [[deps.Blosc]] 82 | deps = ["Blosc_jll"] 83 | git-tree-sha1 = "310b77648d38c223d947ff3f50f511d08690b8d5" 84 | uuid = "a74b3585-a348-5f62-a45c-50e91977d574" 85 | version = "0.7.3" 86 | 87 | [[deps.Blosc_jll]] 88 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Lz4_jll", "Pkg", "Zlib_jll", "Zstd_jll"] 89 | git-tree-sha1 = "e94024822c0a5b14989abbdba57820ad5b177b95" 90 | uuid = "0b7ba130-8d10-5ba8-a3d6-c5182647fed9" 91 | version = "1.21.2+0" 92 | 93 | [[deps.BufferedStreams]] 94 | git-tree-sha1 = "bb065b14d7f941b8617bc323063dbe79f55d16ea" 95 | uuid = "e1450e63-4bb3-523b-b2a4-4ffa8c0fd77d" 96 | version = "1.1.0" 97 | 98 | [[deps.Bzip2_jll]] 99 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 100 | git-tree-sha1 = "19a35467a82e236ff51bc17a3a44b69ef35185a2" 101 | uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0" 102 | version = "1.0.8+0" 103 | 104 | 
[[deps.CEnum]] 105 | git-tree-sha1 = "eb4cb44a499229b3b8426dcfb5dd85333951ff90" 106 | uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82" 107 | version = "0.4.2" 108 | 109 | [[deps.CSV]] 110 | deps = ["CodecZlib", "Dates", "FilePathsBase", "InlineStrings", "Mmap", "Parsers", "PooledArrays", "SentinelArrays", "SnoopPrecompile", "Tables", "Unicode", "WeakRefStrings", "WorkerUtilities"] 111 | git-tree-sha1 = "c700cce799b51c9045473de751e9319bdd1c6e94" 112 | uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 113 | version = "0.10.9" 114 | 115 | [[deps.Cairo_jll]] 116 | deps = ["Artifacts", "Bzip2_jll", "CompilerSupportLibraries_jll", "Fontconfig_jll", "FreeType2_jll", "Glib_jll", "JLLWrappers", "LZO_jll", "Libdl", "Pixman_jll", "Pkg", "Xorg_libXext_jll", "Xorg_libXrender_jll", "Zlib_jll", "libpng_jll"] 117 | git-tree-sha1 = "4b859a208b2397a7a623a03449e4636bdb17bcf2" 118 | uuid = "83423d85-b0ee-5818-9007-b63ccbeb887a" 119 | version = "1.16.1+1" 120 | 121 | [[deps.Calculus]] 122 | deps = ["LinearAlgebra"] 123 | git-tree-sha1 = "f641eb0a4f00c343bbc32346e1217b86f3ce9dad" 124 | uuid = "49dc2e85-a5d0-5ad3-a950-438e2897f1b9" 125 | version = "0.5.1" 126 | 127 | [[deps.CategoricalArrays]] 128 | deps = ["DataAPI", "Future", "Missings", "Printf", "Requires", "Statistics", "Unicode"] 129 | git-tree-sha1 = "5084cc1a28976dd1642c9f337b28a3cb03e0f7d2" 130 | uuid = "324d7699-5711-5eae-9e2f-1d82baa6b597" 131 | version = "0.10.7" 132 | 133 | [[deps.Chain]] 134 | git-tree-sha1 = "8c4920235f6c561e401dfe569beb8b924adad003" 135 | uuid = "8be319e6-bccf-4806-a6f7-6fae938471bc" 136 | version = "0.5.0" 137 | 138 | [[deps.ChainRulesCore]] 139 | deps = ["Compat", "LinearAlgebra", "SparseArrays"] 140 | git-tree-sha1 = "c6d890a52d2c4d55d326439580c3b8d0875a77d9" 141 | uuid = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" 142 | version = "1.15.7" 143 | 144 | [[deps.Clustering]] 145 | deps = ["Distances", "LinearAlgebra", "NearestNeighbors", "Printf", "Random", "SparseArrays", "Statistics", "StatsBase"] 146 | 
git-tree-sha1 = "64df3da1d2a26f4de23871cd1b6482bb68092bd5" 147 | uuid = "aaaa29a8-35af-508c-8bc3-b662a17a0fe5" 148 | version = "0.14.3" 149 | 150 | [[deps.CodecLz4]] 151 | deps = ["Lz4_jll", "TranscodingStreams"] 152 | git-tree-sha1 = "59fe0cb37784288d6b9f1baebddbf75457395d40" 153 | uuid = "5ba52731-8f18-5e0d-9241-30f10d1ec561" 154 | version = "0.4.0" 155 | 156 | [[deps.CodecZlib]] 157 | deps = ["TranscodingStreams", "Zlib_jll"] 158 | git-tree-sha1 = "9c209fb7536406834aa938fb149964b985de6c83" 159 | uuid = "944b1d66-785c-5afd-91f1-9de20f533193" 160 | version = "0.7.1" 161 | 162 | [[deps.CodecZstd]] 163 | deps = ["CEnum", "TranscodingStreams", "Zstd_jll"] 164 | git-tree-sha1 = "849470b337d0fa8449c21061de922386f32949d9" 165 | uuid = "6b39b394-51ab-5f42-8807-6242bab2b4c2" 166 | version = "0.7.2" 167 | 168 | [[deps.ColorSchemes]] 169 | deps = ["ColorTypes", "ColorVectorSpace", "Colors", "FixedPointNumbers", "Random", "SnoopPrecompile"] 170 | git-tree-sha1 = "aa3edc8f8dea6cbfa176ee12f7c2fc82f0608ed3" 171 | uuid = "35d6a980-a343-548e-a6ea-1d62b119f2f4" 172 | version = "3.20.0" 173 | 174 | [[deps.ColorTypes]] 175 | deps = ["FixedPointNumbers", "Random"] 176 | git-tree-sha1 = "eb7f0f8307f71fac7c606984ea5fb2817275d6e4" 177 | uuid = "3da002f7-5984-5a60-b8a6-cbb66c0b333f" 178 | version = "0.11.4" 179 | 180 | [[deps.ColorVectorSpace]] 181 | deps = ["ColorTypes", "FixedPointNumbers", "LinearAlgebra", "SpecialFunctions", "Statistics", "TensorCore"] 182 | git-tree-sha1 = "600cc5508d66b78aae350f7accdb58763ac18589" 183 | uuid = "c3611d14-8923-5661-9e6a-0046d554d3a4" 184 | version = "0.9.10" 185 | 186 | [[deps.Colors]] 187 | deps = ["ColorTypes", "FixedPointNumbers", "Reexport"] 188 | git-tree-sha1 = "fc08e5930ee9a4e03f84bfb5211cb54e7769758a" 189 | uuid = "5ae59095-9a9b-59fe-a467-6f913c188581" 190 | version = "0.12.10" 191 | 192 | [[deps.Combinatorics]] 193 | git-tree-sha1 = "08c8b6831dc00bfea825826be0bc8336fc369860" 194 | uuid = "861a8166-3701-5b0c-9a16-15d98fcdc6aa" 195 | version = 
"1.0.2" 196 | 197 | [[deps.Compat]] 198 | deps = ["Dates", "LinearAlgebra", "UUIDs"] 199 | git-tree-sha1 = "61fdd77467a5c3ad071ef8277ac6bd6af7dd4c04" 200 | uuid = "34da2185-b29b-5c13-b0c7-acf172513d20" 201 | version = "4.6.0" 202 | 203 | [[deps.CompilerSupportLibraries_jll]] 204 | deps = ["Artifacts", "Libdl"] 205 | uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" 206 | version = "1.0.2+0" 207 | 208 | [[deps.Conda]] 209 | deps = ["Downloads", "JSON", "VersionParsing"] 210 | git-tree-sha1 = "e32a90da027ca45d84678b826fffd3110bb3fc90" 211 | uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d" 212 | version = "1.8.0" 213 | 214 | [[deps.Contour]] 215 | git-tree-sha1 = "d05d9e7b7aedff4e5b51a029dced05cfb6125781" 216 | uuid = "d38c429a-6771-53c6-b99e-75d170b6e991" 217 | version = "0.6.2" 218 | 219 | [[deps.Crayons]] 220 | git-tree-sha1 = "249fe38abf76d48563e2f4556bebd215aa317e15" 221 | uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f" 222 | version = "4.1.1" 223 | 224 | [[deps.DataAPI]] 225 | git-tree-sha1 = "e8119c1a33d267e16108be441a287a6981ba1630" 226 | uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a" 227 | version = "1.14.0" 228 | 229 | [[deps.DataFrames]] 230 | deps = ["Compat", "DataAPI", "Future", "InlineStrings", "InvertedIndices", "IteratorInterfaceExtensions", "LinearAlgebra", "Markdown", "Missings", "PooledArrays", "PrettyTables", "Printf", "REPL", "Random", "Reexport", "SentinelArrays", "SnoopPrecompile", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"] 231 | git-tree-sha1 = "aa51303df86f8626a962fccb878430cdb0a97eee" 232 | uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 233 | version = "1.5.0" 234 | 235 | [[deps.DataFramesMeta]] 236 | deps = ["Chain", "DataFrames", "MacroTools", "OrderedCollections", "Reexport"] 237 | git-tree-sha1 = "f9db5b04be51162fbeacf711005cb36d8434c55b" 238 | uuid = "1313f7d8-7da2-5740-9ea0-a2ca25f37964" 239 | version = "0.13.0" 240 | 241 | [[deps.DataStructures]] 242 | deps = ["Compat", "InteractiveUtils", "OrderedCollections"] 243 | 
git-tree-sha1 = "d1fff3a548102f48987a52a2e0d114fa97d730f0" 244 | uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8" 245 | version = "0.18.13" 246 | 247 | [[deps.DataValueInterfaces]] 248 | git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6" 249 | uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464" 250 | version = "1.0.0" 251 | 252 | [[deps.DataValues]] 253 | deps = ["DataValueInterfaces", "Dates"] 254 | git-tree-sha1 = "d88a19299eba280a6d062e135a43f00323ae70bf" 255 | uuid = "e7dc6d0d-1eca-5fa6-8ad6-5aecde8b7ea5" 256 | version = "0.4.13" 257 | 258 | [[deps.Dates]] 259 | deps = ["Printf"] 260 | uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" 261 | 262 | [[deps.DelimitedFiles]] 263 | deps = ["Mmap"] 264 | git-tree-sha1 = "9e2f36d3c96a820c678f2f1f1782582fcf685bae" 265 | uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab" 266 | version = "1.9.1" 267 | 268 | [[deps.DensityInterface]] 269 | deps = ["InverseFunctions", "Test"] 270 | git-tree-sha1 = "80c3e8639e3353e5d2912fb3a1916b8455e2494b" 271 | uuid = "b429d917-457f-4dbc-8f4c-0cc954292b1d" 272 | version = "0.4.0" 273 | 274 | [[deps.Distances]] 275 | deps = ["LinearAlgebra", "SparseArrays", "Statistics", "StatsAPI"] 276 | git-tree-sha1 = "3258d0659f812acde79e8a74b11f17ac06d0ca04" 277 | uuid = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7" 278 | version = "0.10.7" 279 | 280 | [[deps.Distributed]] 281 | deps = ["Random", "Serialization", "Sockets"] 282 | uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b" 283 | 284 | [[deps.Distributions]] 285 | deps = ["ChainRulesCore", "DensityInterface", "FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns", "Test"] 286 | git-tree-sha1 = "74911ad88921455c6afcad1eefa12bd7b1724631" 287 | uuid = "31c24e10-a181-5473-b8eb-7969acd0382f" 288 | version = "0.25.80" 289 | 290 | [[deps.DocStringExtensions]] 291 | deps = ["LibGit2"] 292 | git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d" 293 | uuid = 
"ffbed154-4ef7-542d-bbb7-c09d3a79fcae" 294 | version = "0.9.3" 295 | 296 | [[deps.Downloads]] 297 | deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"] 298 | uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6" 299 | version = "1.6.0" 300 | 301 | [[deps.DualNumbers]] 302 | deps = ["Calculus", "NaNMath", "SpecialFunctions"] 303 | git-tree-sha1 = "5837a837389fccf076445fce071c8ddaea35a566" 304 | uuid = "fa6b7ba4-c1ee-5f82-b5fc-ecf0adba8f74" 305 | version = "0.6.8" 306 | 307 | [[deps.Expat_jll]] 308 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 309 | git-tree-sha1 = "bad72f730e9e91c08d9427d5e8db95478a3c323d" 310 | uuid = "2e619515-83b5-522b-bb60-26c02a35a201" 311 | version = "2.4.8+0" 312 | 313 | [[deps.ExprTools]] 314 | git-tree-sha1 = "56559bbef6ca5ea0c0818fa5c90320398a6fbf8d" 315 | uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" 316 | version = "0.1.8" 317 | 318 | [[deps.FFMPEG]] 319 | deps = ["FFMPEG_jll"] 320 | git-tree-sha1 = "b57e3acbe22f8484b4b5ff66a7499717fe1a9cc8" 321 | uuid = "c87230d0-a227-11e9-1b43-d7ebe4e7570a" 322 | version = "0.4.1" 323 | 324 | [[deps.FFMPEG_jll]] 325 | deps = ["Artifacts", "Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "JLLWrappers", "LAME_jll", "Libdl", "Ogg_jll", "OpenSSL_jll", "Opus_jll", "PCRE2_jll", "Pkg", "Zlib_jll", "libaom_jll", "libass_jll", "libfdk_aac_jll", "libvorbis_jll", "x264_jll", "x265_jll"] 326 | git-tree-sha1 = "74faea50c1d007c85837327f6775bea60b5492dd" 327 | uuid = "b22a6f82-2f65-5046-a5b2-351ab43fb4e5" 328 | version = "4.4.2+2" 329 | 330 | [[deps.FFTW]] 331 | deps = ["AbstractFFTs", "FFTW_jll", "LinearAlgebra", "MKL_jll", "Preferences", "Reexport"] 332 | git-tree-sha1 = "90630efff0894f8142308e334473eba54c433549" 333 | uuid = "7a1cc6ca-52ef-59f5-83cd-3a7055c09341" 334 | version = "1.5.0" 335 | 336 | [[deps.FFTW_jll]] 337 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 338 | git-tree-sha1 = "c6033cc3892d0ef5bb9cd29b7f2f0331ea5184ea" 339 | uuid = "f5851436-0d7a-5f13-b9de-f02708fd171a" 340 | version = 
"3.3.10+0" 341 | 342 | [[deps.FileIO]] 343 | deps = ["Pkg", "Requires", "UUIDs"] 344 | git-tree-sha1 = "7be5f99f7d15578798f338f5433b6c432ea8037b" 345 | uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549" 346 | version = "1.16.0" 347 | 348 | [[deps.FilePathsBase]] 349 | deps = ["Compat", "Dates", "Mmap", "Printf", "Test", "UUIDs"] 350 | git-tree-sha1 = "e27c4ebe80e8699540f2d6c805cc12203b614f12" 351 | uuid = "48062228-2e41-5def-b9a4-89aafe57970f" 352 | version = "0.9.20" 353 | 354 | [[deps.FileWatching]] 355 | uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee" 356 | 357 | [[deps.FillArrays]] 358 | deps = ["LinearAlgebra", "Random", "SparseArrays", "Statistics"] 359 | git-tree-sha1 = "d3ba08ab64bdfd27234d3f61956c966266757fe6" 360 | uuid = "1a297f60-69ca-5386-bcde-b61e274b549b" 361 | version = "0.13.7" 362 | 363 | [[deps.FixedPointNumbers]] 364 | deps = ["Statistics"] 365 | git-tree-sha1 = "335bfdceacc84c5cdf16aadc768aa5ddfc5383cc" 366 | uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93" 367 | version = "0.8.4" 368 | 369 | [[deps.Fontconfig_jll]] 370 | deps = ["Artifacts", "Bzip2_jll", "Expat_jll", "FreeType2_jll", "JLLWrappers", "Libdl", "Libuuid_jll", "Pkg", "Zlib_jll"] 371 | git-tree-sha1 = "21efd19106a55620a188615da6d3d06cd7f6ee03" 372 | uuid = "a3f928ae-7b40-5064-980b-68af3947d34b" 373 | version = "2.13.93+0" 374 | 375 | [[deps.Formatting]] 376 | deps = ["Printf"] 377 | git-tree-sha1 = "8339d61043228fdd3eb658d86c926cb282ae72a8" 378 | uuid = "59287772-0a20-5a39-b81b-1366585eb4c0" 379 | version = "0.4.2" 380 | 381 | [[deps.FreeType2_jll]] 382 | deps = ["Artifacts", "Bzip2_jll", "JLLWrappers", "Libdl", "Pkg", "Zlib_jll"] 383 | git-tree-sha1 = "87eb71354d8ec1a96d4a7636bd57a7347dde3ef9" 384 | uuid = "d7e528f0-a631-5988-bf34-fe36492bcfd7" 385 | version = "2.10.4+0" 386 | 387 | [[deps.FreqTables]] 388 | deps = ["CategoricalArrays", "Missings", "NamedArrays", "Tables"] 389 | git-tree-sha1 = "488ad2dab30fd2727ee65451f790c81ed454666d" 390 | uuid = "da1fdf0e-e0ff-5433-a45f-9bb5ff651cb1" 391 
| version = "0.4.5" 392 | 393 | [[deps.FriBidi_jll]] 394 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 395 | git-tree-sha1 = "aa31987c2ba8704e23c6c8ba8a4f769d5d7e4f91" 396 | uuid = "559328eb-81f9-559d-9380-de523a88c83c" 397 | version = "1.0.10+0" 398 | 399 | [[deps.Future]] 400 | deps = ["Random"] 401 | uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820" 402 | 403 | [[deps.GLFW_jll]] 404 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Libglvnd_jll", "Pkg", "Xorg_libXcursor_jll", "Xorg_libXi_jll", "Xorg_libXinerama_jll", "Xorg_libXrandr_jll"] 405 | git-tree-sha1 = "d972031d28c8c8d9d7b41a536ad7bb0c2579caca" 406 | uuid = "0656b61e-2033-5cc2-a64a-77c0f6c09b89" 407 | version = "3.3.8+0" 408 | 409 | [[deps.GR]] 410 | deps = ["Artifacts", "Base64", "DelimitedFiles", "Downloads", "GR_jll", "HTTP", "JSON", "Libdl", "LinearAlgebra", "Pkg", "Preferences", "Printf", "Random", "Serialization", "Sockets", "TOML", "Tar", "Test", "UUIDs", "p7zip_jll"] 411 | git-tree-sha1 = "350c974a2fc6c73792cc337be3ea6a37e5fe5f44" 412 | uuid = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71" 413 | version = "0.71.6" 414 | 415 | [[deps.GR_jll]] 416 | deps = ["Artifacts", "Bzip2_jll", "Cairo_jll", "FFMPEG_jll", "Fontconfig_jll", "GLFW_jll", "JLLWrappers", "JpegTurbo_jll", "Libdl", "Libtiff_jll", "Pixman_jll", "Pkg", "Qt5Base_jll", "Zlib_jll", "libpng_jll"] 417 | git-tree-sha1 = "b23a8733e5b294a49351b419cb54ff4e5279c330" 418 | uuid = "d2c73de3-f751-5644-a686-071e5b155ba9" 419 | version = "0.71.6+0" 420 | 421 | [[deps.Gettext_jll]] 422 | deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Libiconv_jll", "Pkg", "XML2_jll"] 423 | git-tree-sha1 = "9b02998aba7bf074d14de89f9d37ca24a1a0b046" 424 | uuid = "78b55507-aeef-58d4-861c-77aaff3498b1" 425 | version = "0.21.0+0" 426 | 427 | [[deps.Glib_jll]] 428 | deps = ["Artifacts", "Gettext_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Libiconv_jll", "Libmount_jll", "PCRE2_jll", "Pkg", "Zlib_jll"] 429 | git-tree-sha1 = 
"d3b3624125c1474292d0d8ed0f65554ac37ddb23" 430 | uuid = "7746bdde-850d-59dc-9ae8-88ece973131d" 431 | version = "2.74.0+2" 432 | 433 | [[deps.Graphite2_jll]] 434 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 435 | git-tree-sha1 = "344bf40dcab1073aca04aa0df4fb092f920e4011" 436 | uuid = "3b182d85-2403-5c21-9c21-1e1f0cc25472" 437 | version = "1.3.14+0" 438 | 439 | [[deps.Grisu]] 440 | git-tree-sha1 = "53bb909d1151e57e2484c3d1b53e19552b887fb2" 441 | uuid = "42e2da0e-8278-4e71-bc24-59509adca0fe" 442 | version = "1.0.2" 443 | 444 | [[deps.HTTP]] 445 | deps = ["Base64", "CodecZlib", "Dates", "IniFile", "Logging", "LoggingExtras", "MbedTLS", "NetworkOptions", "OpenSSL", "Random", "SimpleBufferStream", "Sockets", "URIs", "UUIDs"] 446 | git-tree-sha1 = "37e4657cd56b11abe3d10cd4a1ec5fbdb4180263" 447 | uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3" 448 | version = "1.7.4" 449 | 450 | [[deps.HarfBuzz_jll]] 451 | deps = ["Artifacts", "Cairo_jll", "Fontconfig_jll", "FreeType2_jll", "Glib_jll", "Graphite2_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Pkg"] 452 | git-tree-sha1 = "129acf094d168394e80ee1dc4bc06ec835e510a3" 453 | uuid = "2e76f6c2-a576-52d4-95c1-20adfe4de566" 454 | version = "2.8.1+1" 455 | 456 | [[deps.HypergeometricFunctions]] 457 | deps = ["DualNumbers", "LinearAlgebra", "OpenLibm_jll", "SpecialFunctions", "Test"] 458 | git-tree-sha1 = "709d864e3ed6e3545230601f94e11ebc65994641" 459 | uuid = "34004b35-14d8-5ef3-9330-4cdb6864b03a" 460 | version = "0.3.11" 461 | 462 | [[deps.IJulia]] 463 | deps = ["Base64", "Conda", "Dates", "InteractiveUtils", "JSON", "Libdl", "Logging", "Markdown", "MbedTLS", "Pkg", "Printf", "REPL", "Random", "SoftGlobalScope", "Test", "UUIDs", "ZMQ"] 464 | git-tree-sha1 = "59e19713542dd9dd02f31d59edbada69530d6a14" 465 | uuid = "7073ff75-c697-5162-941a-fcdaad2a7d2a" 466 | version = "1.24.0" 467 | 468 | [[deps.IniFile]] 469 | git-tree-sha1 = "f550e6e32074c939295eb5ea6de31849ac2c9625" 470 | uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f" 471 | 
version = "0.5.1" 472 | 473 | [[deps.InlineStrings]] 474 | deps = ["Parsers"] 475 | git-tree-sha1 = "9cc2baf75c6d09f9da536ddf58eb2f29dedaf461" 476 | uuid = "842dd82b-1e85-43dc-bf29-5d0ee9dffc48" 477 | version = "1.4.0" 478 | 479 | [[deps.IntelOpenMP_jll]] 480 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 481 | git-tree-sha1 = "d979e54b71da82f3a65b62553da4fc3d18c9004c" 482 | uuid = "1d5cc7b8-4909-519e-a0f8-d0f5ad9712d0" 483 | version = "2018.0.3+2" 484 | 485 | [[deps.InteractiveUtils]] 486 | deps = ["Markdown"] 487 | uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240" 488 | 489 | [[deps.Interpolations]] 490 | deps = ["Adapt", "AxisAlgorithms", "ChainRulesCore", "LinearAlgebra", "OffsetArrays", "Random", "Ratios", "Requires", "SharedArrays", "SparseArrays", "StaticArrays", "WoodburyMatrices"] 491 | git-tree-sha1 = "721ec2cf720536ad005cb38f50dbba7b02419a15" 492 | uuid = "a98d9a8b-a2ab-59e6-89dd-64a1c18fca59" 493 | version = "0.14.7" 494 | 495 | [[deps.InverseFunctions]] 496 | deps = ["Test"] 497 | git-tree-sha1 = "49510dfcb407e572524ba94aeae2fced1f3feb0f" 498 | uuid = "3587e190-3f89-42d0-90ee-14403ec27112" 499 | version = "0.1.8" 500 | 501 | [[deps.InvertedIndices]] 502 | git-tree-sha1 = "82aec7a3dd64f4d9584659dc0b62ef7db2ef3e19" 503 | uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f" 504 | version = "1.2.0" 505 | 506 | [[deps.IrrationalConstants]] 507 | git-tree-sha1 = "7fd44fd4ff43fc60815f8e764c0f352b83c49151" 508 | uuid = "92d709cd-6900-40b7-9082-c6be49f344b6" 509 | version = "0.1.1" 510 | 511 | [[deps.IteratorInterfaceExtensions]] 512 | git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856" 513 | uuid = "82899510-4779-5014-852e-03e436cf321d" 514 | version = "1.0.0" 515 | 516 | [[deps.JDF]] 517 | deps = ["Blosc", "BufferedStreams", "CategoricalArrays", "DataAPI", "Dates", "InlineStrings", "Missings", "PooledArrays", "Serialization", "StatsBase", "Tables", "TimeZones", "WeakRefStrings"] 518 | git-tree-sha1 = "f98a3c5210bd73adf9aa236546c4b4b3e6328025" 519 | uuid = 
"babc3d20-cd49-4f60-a736-a8f9c08892d3" 520 | version = "0.5.1" 521 | 522 | [[deps.JLFzf]] 523 | deps = ["Pipe", "REPL", "Random", "fzf_jll"] 524 | git-tree-sha1 = "f377670cda23b6b7c1c0b3893e37451c5c1a2185" 525 | uuid = "1019f520-868f-41f5-a6de-eb00f4b6a39c" 526 | version = "0.1.5" 527 | 528 | [[deps.JLLWrappers]] 529 | deps = ["Preferences"] 530 | git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1" 531 | uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210" 532 | version = "1.4.1" 533 | 534 | [[deps.JLSO]] 535 | deps = ["BSON", "CodecZlib", "FilePathsBase", "Memento", "Pkg", "Serialization"] 536 | git-tree-sha1 = "7e3821e362ede76f83a39635d177c63595296776" 537 | uuid = "9da8a3cd-07a3-59c0-a743-3fdc52c30d11" 538 | version = "2.7.0" 539 | 540 | [[deps.JSON]] 541 | deps = ["Dates", "Mmap", "Parsers", "Unicode"] 542 | git-tree-sha1 = "3c837543ddb02250ef42f4738347454f95079d4e" 543 | uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6" 544 | version = "0.21.3" 545 | 546 | [[deps.JSON3]] 547 | deps = ["Dates", "Mmap", "Parsers", "SnoopPrecompile", "StructTypes", "UUIDs"] 548 | git-tree-sha1 = "84b10656a41ef564c39d2d477d7236966d2b5683" 549 | uuid = "0f8b85d8-7281-11e9-16c2-39a750bddbf1" 550 | version = "1.12.0" 551 | 552 | [[deps.JSONTables]] 553 | deps = ["JSON3", "StructTypes", "Tables"] 554 | git-tree-sha1 = "13f7485bb0b4438bb5e83e62fcadc65c5de1d1bb" 555 | uuid = "b9914132-a727-11e9-1322-f18e41205b0b" 556 | version = "1.0.3" 557 | 558 | [[deps.JpegTurbo_jll]] 559 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 560 | git-tree-sha1 = "b53380851c6e6664204efb2e62cd24fa5c47e4ba" 561 | uuid = "aacddb02-875f-59d6-b918-886e6ef4fbf8" 562 | version = "2.1.2+0" 563 | 564 | [[deps.KernelDensity]] 565 | deps = ["Distributions", "DocStringExtensions", "FFTW", "Interpolations", "StatsBase"] 566 | git-tree-sha1 = "9816b296736292a80b9a3200eb7fbb57aaa3917a" 567 | uuid = "5ab0869b-81aa-558d-bb23-cbf5423bbe9b" 568 | version = "0.6.5" 569 | 570 | [[deps.LAME_jll]] 571 | deps = ["Artifacts", 
"JLLWrappers", "Libdl", "Pkg"] 572 | git-tree-sha1 = "f6250b16881adf048549549fba48b1161acdac8c" 573 | uuid = "c1c5ebd0-6772-5130-a774-d5fcae4a789d" 574 | version = "3.100.1+0" 575 | 576 | [[deps.LERC_jll]] 577 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 578 | git-tree-sha1 = "bf36f528eec6634efc60d7ec062008f171071434" 579 | uuid = "88015f11-f218-50d7-93a8-a6af411a945d" 580 | version = "3.0.0+1" 581 | 582 | [[deps.LZO_jll]] 583 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 584 | git-tree-sha1 = "e5b909bcf985c5e2605737d2ce278ed791b89be6" 585 | uuid = "dd4b983a-f0e5-5f8d-a1b7-129d4a5fb1ac" 586 | version = "2.10.1+0" 587 | 588 | [[deps.LaTeXStrings]] 589 | git-tree-sha1 = "f2355693d6778a178ade15952b7ac47a4ff97996" 590 | uuid = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f" 591 | version = "1.3.0" 592 | 593 | [[deps.Latexify]] 594 | deps = ["Formatting", "InteractiveUtils", "LaTeXStrings", "MacroTools", "Markdown", "OrderedCollections", "Printf", "Requires"] 595 | git-tree-sha1 = "2422f47b34d4b127720a18f86fa7b1aa2e141f29" 596 | uuid = "23fbe1c1-3f47-55db-b15f-69d7ec21a316" 597 | version = "0.15.18" 598 | 599 | [[deps.LazyArtifacts]] 600 | deps = ["Artifacts", "Pkg"] 601 | uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3" 602 | 603 | [[deps.LibCURL]] 604 | deps = ["LibCURL_jll", "MozillaCACerts_jll"] 605 | uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21" 606 | version = "0.6.3" 607 | 608 | [[deps.LibCURL_jll]] 609 | deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"] 610 | uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0" 611 | version = "7.84.0+0" 612 | 613 | [[deps.LibGit2]] 614 | deps = ["Base64", "NetworkOptions", "Printf", "SHA"] 615 | uuid = "76f85450-5226-5b5a-8eaa-529ad045b433" 616 | 617 | [[deps.LibSSH2_jll]] 618 | deps = ["Artifacts", "Libdl", "MbedTLS_jll"] 619 | uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8" 620 | version = "1.10.2+0" 621 | 622 | [[deps.Libdl]] 623 | uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" 624 | 625 | 
[[deps.Libffi_jll]] 626 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 627 | git-tree-sha1 = "0b4a5d71f3e5200a7dff793393e09dfc2d874290" 628 | uuid = "e9f186c6-92d2-5b65-8a66-fee21dc1b490" 629 | version = "3.2.2+1" 630 | 631 | [[deps.Libgcrypt_jll]] 632 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Libgpg_error_jll", "Pkg"] 633 | git-tree-sha1 = "64613c82a59c120435c067c2b809fc61cf5166ae" 634 | uuid = "d4300ac3-e22c-5743-9152-c294e39db1e4" 635 | version = "1.8.7+0" 636 | 637 | [[deps.Libglvnd_jll]] 638 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll", "Xorg_libXext_jll"] 639 | git-tree-sha1 = "6f73d1dd803986947b2c750138528a999a6c7733" 640 | uuid = "7e76a0d4-f3c7-5321-8279-8d96eeed0f29" 641 | version = "1.6.0+0" 642 | 643 | [[deps.Libgpg_error_jll]] 644 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 645 | git-tree-sha1 = "c333716e46366857753e273ce6a69ee0945a6db9" 646 | uuid = "7add5ba3-2f88-524e-9cd5-f83b8a55f7b8" 647 | version = "1.42.0+0" 648 | 649 | [[deps.Libiconv_jll]] 650 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 651 | git-tree-sha1 = "c7cb1f5d892775ba13767a87c7ada0b980ea0a71" 652 | uuid = "94ce4f54-9a6c-5748-9c1c-f9c7231a4531" 653 | version = "1.16.1+2" 654 | 655 | [[deps.Libmount_jll]] 656 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 657 | git-tree-sha1 = "9c30530bf0effd46e15e0fdcf2b8636e78cbbd73" 658 | uuid = "4b2f31a3-9ecc-558c-b454-b3730dcb73e9" 659 | version = "2.35.0+0" 660 | 661 | [[deps.Libtiff_jll]] 662 | deps = ["Artifacts", "JLLWrappers", "JpegTurbo_jll", "LERC_jll", "Libdl", "Pkg", "Zlib_jll", "Zstd_jll"] 663 | git-tree-sha1 = "3eb79b0ca5764d4799c06699573fd8f533259713" 664 | uuid = "89763e89-9b03-5906-acba-b20f662cd828" 665 | version = "4.4.0+0" 666 | 667 | [[deps.Libuuid_jll]] 668 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 669 | git-tree-sha1 = "7f3efec06033682db852f8b3bc3c1d2b0a0ab066" 670 | uuid = "38a345b3-de98-5d2b-a5d3-14cd9215e700" 671 | version = "2.36.0+0" 672 | 673 | 
[[deps.LinearAlgebra]] 674 | deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"] 675 | uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" 676 | 677 | [[deps.LogExpFunctions]] 678 | deps = ["DocStringExtensions", "IrrationalConstants", "LinearAlgebra"] 679 | git-tree-sha1 = "680e733c3a0a9cea9e935c8c2184aea6a63fa0b5" 680 | uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688" 681 | version = "0.3.21" 682 | 683 | [deps.LogExpFunctions.extensions] 684 | ChainRulesCoreExt = "ChainRulesCore" 685 | ChangesOfVariablesExt = "ChangesOfVariables" 686 | InverseFunctionsExt = "InverseFunctions" 687 | 688 | [deps.LogExpFunctions.weakdeps] 689 | ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" 690 | ChangesOfVariables = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0" 691 | InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112" 692 | 693 | [[deps.Logging]] 694 | uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" 695 | 696 | [[deps.LoggingExtras]] 697 | deps = ["Dates", "Logging"] 698 | git-tree-sha1 = "cedb76b37bc5a6c702ade66be44f831fa23c681e" 699 | uuid = "e6f89c97-d47a-5376-807f-9c37f3926c36" 700 | version = "1.0.0" 701 | 702 | [[deps.Lz4_jll]] 703 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 704 | git-tree-sha1 = "5d494bc6e85c4c9b626ee0cab05daa4085486ab1" 705 | uuid = "5ced341a-0733-55b8-9ab6-a4889d929147" 706 | version = "1.9.3+0" 707 | 708 | [[deps.MKL_jll]] 709 | deps = ["Artifacts", "IntelOpenMP_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg"] 710 | git-tree-sha1 = "2ce8695e1e699b68702c03402672a69f54b8aca9" 711 | uuid = "856f044c-d86e-5d09-b602-aeab76dc8ba7" 712 | version = "2022.2.0+0" 713 | 714 | [[deps.MacroTools]] 715 | deps = ["Markdown", "Random"] 716 | git-tree-sha1 = "42324d08725e200c23d4dfb549e0d5d89dede2d2" 717 | uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09" 718 | version = "0.5.10" 719 | 720 | [[deps.Markdown]] 721 | deps = ["Base64"] 722 | uuid = "d6f4376e-aef5-505a-96c1-9c027394607a" 723 | 724 | [[deps.MbedTLS]] 725 | deps = ["Dates", "MbedTLS_jll", 
"MozillaCACerts_jll", "Random", "Sockets"] 726 | git-tree-sha1 = "03a9b9718f5682ecb107ac9f7308991db4ce395b" 727 | uuid = "739be429-bea8-5141-9913-cc70e7f3736d" 728 | version = "1.1.7" 729 | 730 | [[deps.MbedTLS_jll]] 731 | deps = ["Artifacts", "Libdl"] 732 | uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" 733 | version = "2.28.0+0" 734 | 735 | [[deps.Measures]] 736 | git-tree-sha1 = "c13304c81eec1ed3af7fc20e75fb6b26092a1102" 737 | uuid = "442fdcdd-2543-5da2-b0f3-8c86c306513e" 738 | version = "0.3.2" 739 | 740 | [[deps.Memento]] 741 | deps = ["Dates", "Distributed", "Requires", "Serialization", "Sockets", "Test", "UUIDs"] 742 | git-tree-sha1 = "bb2e8f4d9f400f6e90d57b34860f6abdc51398e5" 743 | uuid = "f28f55f0-a522-5efc-85c2-fe41dfb9b2d9" 744 | version = "1.4.1" 745 | 746 | [[deps.Missings]] 747 | deps = ["DataAPI"] 748 | git-tree-sha1 = "f66bdc5de519e8f8ae43bdc598782d35a25b1272" 749 | uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28" 750 | version = "1.1.0" 751 | 752 | [[deps.Mmap]] 753 | uuid = "a63ad114-7e13-5084-954f-fe012c677804" 754 | 755 | [[deps.Mocking]] 756 | deps = ["Compat", "ExprTools"] 757 | git-tree-sha1 = "c272302b22479a24d1cf48c114ad702933414f80" 758 | uuid = "78c3b35d-d492-501b-9361-3d52fe80e533" 759 | version = "0.7.5" 760 | 761 | [[deps.MozillaCACerts_jll]] 762 | uuid = "14a3606d-f60d-562e-9121-12d972cd8159" 763 | version = "2022.10.11" 764 | 765 | [[deps.MultivariateStats]] 766 | deps = ["Arpack", "LinearAlgebra", "SparseArrays", "Statistics", "StatsAPI", "StatsBase"] 767 | git-tree-sha1 = "91a48569383df24f0fd2baf789df2aade3d0ad80" 768 | uuid = "6f286f6a-111f-5878-ab1e-185364afe411" 769 | version = "0.10.1" 770 | 771 | [[deps.NaNMath]] 772 | deps = ["OpenLibm_jll"] 773 | git-tree-sha1 = "a7c3d1da1189a1c2fe843a3bfa04d18d20eb3211" 774 | uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3" 775 | version = "1.0.1" 776 | 777 | [[deps.NamedArrays]] 778 | deps = ["Combinatorics", "DataStructures", "DelimitedFiles", "InvertedIndices", "LinearAlgebra", "Random", 
"Requires", "SparseArrays", "Statistics"] 779 | git-tree-sha1 = "2fd5787125d1a93fbe30961bd841707b8a80d75b" 780 | uuid = "86f7a689-2022-50b4-a561-43c23ac3c673" 781 | version = "0.9.6" 782 | 783 | [[deps.NearestNeighbors]] 784 | deps = ["Distances", "StaticArrays"] 785 | git-tree-sha1 = "2c3726ceb3388917602169bed973dbc97f1b51a8" 786 | uuid = "b8a86587-4115-5ab1-83bc-aa920d37bbce" 787 | version = "0.4.13" 788 | 789 | [[deps.NetworkOptions]] 790 | uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908" 791 | version = "1.2.0" 792 | 793 | [[deps.Observables]] 794 | git-tree-sha1 = "6862738f9796b3edc1c09d0890afce4eca9e7e93" 795 | uuid = "510215fc-4207-5dde-b226-833fc4488ee2" 796 | version = "0.5.4" 797 | 798 | [[deps.OffsetArrays]] 799 | deps = ["Adapt"] 800 | git-tree-sha1 = "82d7c9e310fe55aa54996e6f7f94674e2a38fcb4" 801 | uuid = "6fe1bfb0-de20-5000-8ca7-80f57d26f881" 802 | version = "1.12.9" 803 | 804 | [[deps.Ogg_jll]] 805 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 806 | git-tree-sha1 = "887579a3eb005446d514ab7aeac5d1d027658b8f" 807 | uuid = "e7412a2a-1a6e-54c0-be00-318e2571c051" 808 | version = "1.3.5+1" 809 | 810 | [[deps.OpenBLAS_jll]] 811 | deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"] 812 | uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" 813 | version = "0.3.21+0" 814 | 815 | [[deps.OpenLibm_jll]] 816 | deps = ["Artifacts", "Libdl"] 817 | uuid = "05823500-19ac-5b8b-9628-191a04bc5112" 818 | version = "0.8.1+0" 819 | 820 | [[deps.OpenSSL]] 821 | deps = ["BitFlags", "Dates", "MozillaCACerts_jll", "OpenSSL_jll", "Sockets"] 822 | git-tree-sha1 = "6503b77492fd7fcb9379bf73cd31035670e3c509" 823 | uuid = "4d8831e6-92b7-49fb-bdf8-b643e874388c" 824 | version = "1.3.3" 825 | 826 | [[deps.OpenSSL_jll]] 827 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 828 | git-tree-sha1 = "9ff31d101d987eb9d66bd8b176ac7c277beccd09" 829 | uuid = "458c3c95-2e84-50aa-8efc-19380b2a3a95" 830 | version = "1.1.20+0" 831 | 832 | [[deps.OpenSpecFun_jll]] 833 | deps = 
["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"] 834 | git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1" 835 | uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e" 836 | version = "0.5.5+0" 837 | 838 | [[deps.Opus_jll]] 839 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 840 | git-tree-sha1 = "51a08fb14ec28da2ec7a927c4337e4332c2a4720" 841 | uuid = "91d4177d-7536-5919-b921-800302f37372" 842 | version = "1.3.2+0" 843 | 844 | [[deps.OrderedCollections]] 845 | git-tree-sha1 = "85f8e6578bf1f9ee0d11e7bb1b1456435479d47c" 846 | uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d" 847 | version = "1.4.1" 848 | 849 | [[deps.PCRE2_jll]] 850 | deps = ["Artifacts", "Libdl"] 851 | uuid = "efcefdf7-47ab-520b-bdef-62a2eaa19f15" 852 | version = "10.42.0+0" 853 | 854 | [[deps.PDMats]] 855 | deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"] 856 | git-tree-sha1 = "cf494dca75a69712a72b80bc48f59dcf3dea63ec" 857 | uuid = "90014a1f-27ba-587c-ab20-58faa44d9150" 858 | version = "0.11.16" 859 | 860 | [[deps.Parsers]] 861 | deps = ["Dates", "SnoopPrecompile"] 862 | git-tree-sha1 = "946b56b2135c6c10bbb93efad8a78b699b6383ab" 863 | uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0" 864 | version = "2.5.6" 865 | 866 | [[deps.Pipe]] 867 | git-tree-sha1 = "6842804e7867b115ca9de748a0cf6b364523c16d" 868 | uuid = "b98c9c47-44ae-5843-9183-064241ee97a0" 869 | version = "1.3.0" 870 | 871 | [[deps.Pixman_jll]] 872 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 873 | git-tree-sha1 = "b4f5d02549a10e20780a24fce72bea96b6329e29" 874 | uuid = "30392449-352a-5448-841d-b1acce4e97dc" 875 | version = "0.40.1+0" 876 | 877 | [[deps.Pkg]] 878 | deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"] 879 | uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" 880 | version = "1.9.0" 881 | 882 | [[deps.PlotThemes]] 883 | deps = ["PlotUtils", "Statistics"] 884 
| git-tree-sha1 = "1f03a2d339f42dca4a4da149c7e15e9b896ad899" 885 | uuid = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a" 886 | version = "3.1.0" 887 | 888 | [[deps.PlotUtils]] 889 | deps = ["ColorSchemes", "Colors", "Dates", "Printf", "Random", "Reexport", "SnoopPrecompile", "Statistics"] 890 | git-tree-sha1 = "c95373e73290cf50a8a22c3375e4625ded5c5280" 891 | uuid = "995b91a9-d308-5afd-9ec6-746e21dbc043" 892 | version = "1.3.4" 893 | 894 | [[deps.Plots]] 895 | deps = ["Base64", "Contour", "Dates", "Downloads", "FFMPEG", "FixedPointNumbers", "GR", "JLFzf", "JSON", "LaTeXStrings", "Latexify", "LinearAlgebra", "Measures", "NaNMath", "Pkg", "PlotThemes", "PlotUtils", "Preferences", "Printf", "REPL", "Random", "RecipesBase", "RecipesPipeline", "Reexport", "RelocatableFolders", "Requires", "Scratch", "Showoff", "SnoopPrecompile", "SparseArrays", "Statistics", "StatsBase", "UUIDs", "UnicodeFun", "Unzip"] 896 | git-tree-sha1 = "8ac949bd0ebc46a44afb1fdca1094554a84b086e" 897 | uuid = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" 898 | version = "1.38.5" 899 | 900 | [[deps.PooledArrays]] 901 | deps = ["DataAPI", "Future"] 902 | git-tree-sha1 = "a6062fe4063cdafe78f4a0a81cfffb89721b30e7" 903 | uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720" 904 | version = "1.4.2" 905 | 906 | [[deps.Preferences]] 907 | deps = ["TOML"] 908 | git-tree-sha1 = "47e5f437cc0e7ef2ce8406ce1e7e24d44915f88d" 909 | uuid = "21216c6a-2e73-6563-6e65-726566657250" 910 | version = "1.3.0" 911 | 912 | [[deps.PrettyTables]] 913 | deps = ["Crayons", "Formatting", "LaTeXStrings", "Markdown", "Reexport", "StringManipulation", "Tables"] 914 | git-tree-sha1 = "96f6db03ab535bdb901300f88335257b0018689d" 915 | uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d" 916 | version = "2.2.2" 917 | 918 | [[deps.Printf]] 919 | deps = ["Unicode"] 920 | uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7" 921 | 922 | [[deps.Profile]] 923 | deps = ["Printf"] 924 | uuid = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79" 925 | 926 | [[deps.Qt5Base_jll]] 927 | deps = 
["Artifacts", "CompilerSupportLibraries_jll", "Fontconfig_jll", "Glib_jll", "JLLWrappers", "Libdl", "Libglvnd_jll", "OpenSSL_jll", "Pkg", "Xorg_libXext_jll", "Xorg_libxcb_jll", "Xorg_xcb_util_image_jll", "Xorg_xcb_util_keysyms_jll", "Xorg_xcb_util_renderutil_jll", "Xorg_xcb_util_wm_jll", "Zlib_jll", "xkbcommon_jll"] 928 | git-tree-sha1 = "0c03844e2231e12fda4d0086fd7cbe4098ee8dc5" 929 | uuid = "ea2cea3b-5b76-57ae-a6ef-0a8af62496e1" 930 | version = "5.15.3+2" 931 | 932 | [[deps.QuadGK]] 933 | deps = ["DataStructures", "LinearAlgebra"] 934 | git-tree-sha1 = "786efa36b7eff813723c4849c90456609cf06661" 935 | uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" 936 | version = "2.8.1" 937 | 938 | [[deps.REPL]] 939 | deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"] 940 | uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb" 941 | 942 | [[deps.Random]] 943 | deps = ["SHA", "Serialization"] 944 | uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" 945 | 946 | [[deps.Ratios]] 947 | deps = ["Requires"] 948 | git-tree-sha1 = "dc84268fe0e3335a62e315a3a7cf2afa7178a734" 949 | uuid = "c84ed2f1-dad5-54f0-aa8e-dbefe2724439" 950 | version = "0.4.3" 951 | 952 | [[deps.RecipesBase]] 953 | deps = ["SnoopPrecompile"] 954 | git-tree-sha1 = "261dddd3b862bd2c940cf6ca4d1c8fe593e457c8" 955 | uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01" 956 | version = "1.3.3" 957 | 958 | [[deps.RecipesPipeline]] 959 | deps = ["Dates", "NaNMath", "PlotUtils", "RecipesBase", "SnoopPrecompile"] 960 | git-tree-sha1 = "e974477be88cb5e3040009f3767611bc6357846f" 961 | uuid = "01d81517-befc-4cb6-b9ec-a95719d0359c" 962 | version = "0.6.11" 963 | 964 | [[deps.Reexport]] 965 | git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b" 966 | uuid = "189a3867-3050-52da-a836-e630ba90ab69" 967 | version = "1.2.2" 968 | 969 | [[deps.RelocatableFolders]] 970 | deps = ["SHA", "Scratch"] 971 | git-tree-sha1 = "90bc7a7c96410424509e4263e277e43250c05691" 972 | uuid = "05181044-ff0b-4ac5-8273-598c1e38db00" 973 | version = "1.0.0" 974 | 975 
| [[deps.Requires]] 976 | deps = ["UUIDs"] 977 | git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7" 978 | uuid = "ae029012-a4dd-5104-9daa-d747884805df" 979 | version = "1.3.0" 980 | 981 | [[deps.Rmath]] 982 | deps = ["Random", "Rmath_jll"] 983 | git-tree-sha1 = "f65dcb5fa46aee0cf9ed6274ccbd597adc49aa7b" 984 | uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa" 985 | version = "0.7.1" 986 | 987 | [[deps.Rmath_jll]] 988 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 989 | git-tree-sha1 = "6ed52fdd3382cf21947b15e8870ac0ddbff736da" 990 | uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f" 991 | version = "0.4.0+0" 992 | 993 | [[deps.SHA]] 994 | uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce" 995 | version = "0.7.0" 996 | 997 | [[deps.Scratch]] 998 | deps = ["Dates"] 999 | git-tree-sha1 = "f94f779c94e58bf9ea243e77a37e16d9de9126bd" 1000 | uuid = "6c6a2e73-6563-6170-7368-637461726353" 1001 | version = "1.1.1" 1002 | 1003 | [[deps.SentinelArrays]] 1004 | deps = ["Dates", "Random"] 1005 | git-tree-sha1 = "c02bd3c9c3fc8463d3591a62a378f90d2d8ab0f3" 1006 | uuid = "91c51154-3ec4-41a3-a24f-3f23e20d615c" 1007 | version = "1.3.17" 1008 | 1009 | [[deps.Serialization]] 1010 | uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b" 1011 | 1012 | [[deps.SharedArrays]] 1013 | deps = ["Distributed", "Mmap", "Random", "Serialization"] 1014 | uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383" 1015 | 1016 | [[deps.Showoff]] 1017 | deps = ["Dates", "Grisu"] 1018 | git-tree-sha1 = "91eddf657aca81df9ae6ceb20b959ae5653ad1de" 1019 | uuid = "992d4aef-0814-514b-bc4d-f2e9a6c4116f" 1020 | version = "1.0.3" 1021 | 1022 | [[deps.SimpleBufferStream]] 1023 | git-tree-sha1 = "874e8867b33a00e784c8a7e4b60afe9e037b74e1" 1024 | uuid = "777ac1f9-54b0-4bf8-805c-2214025038e7" 1025 | version = "1.1.0" 1026 | 1027 | [[deps.SnoopPrecompile]] 1028 | deps = ["Preferences"] 1029 | git-tree-sha1 = "e760a70afdcd461cf01a575947738d359234665c" 1030 | uuid = "66db9d55-30c0-4569-8b51-7e840670fc0c" 1031 | version = "1.0.3" 1032 | 1033 | 
[[deps.Sockets]] 1034 | uuid = "6462fe0b-24de-5631-8697-dd941f90decc" 1035 | 1036 | [[deps.SoftGlobalScope]] 1037 | deps = ["REPL"] 1038 | git-tree-sha1 = "986ec2b6162ccb95de5892ed17832f95badf770c" 1039 | uuid = "b85f4697-e234-5449-a836-ec8e2f98b302" 1040 | version = "1.1.0" 1041 | 1042 | [[deps.SortingAlgorithms]] 1043 | deps = ["DataStructures"] 1044 | git-tree-sha1 = "a4ada03f999bd01b3a25dcaa30b2d929fe537e00" 1045 | uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c" 1046 | version = "1.1.0" 1047 | 1048 | [[deps.SparseArrays]] 1049 | deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"] 1050 | uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" 1051 | 1052 | [[deps.SpecialFunctions]] 1053 | deps = ["ChainRulesCore", "IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"] 1054 | git-tree-sha1 = "d75bda01f8c31ebb72df80a46c88b25d1c79c56d" 1055 | uuid = "276daf66-3868-5448-9aa4-cd146d93841b" 1056 | version = "2.1.7" 1057 | 1058 | [[deps.StaticArrays]] 1059 | deps = ["LinearAlgebra", "Random", "StaticArraysCore", "Statistics"] 1060 | git-tree-sha1 = "cee507162ecbb677450f20058ca83bd559b6b752" 1061 | uuid = "90137ffa-7385-5640-81b9-e52037218182" 1062 | version = "1.5.14" 1063 | 1064 | [[deps.StaticArraysCore]] 1065 | git-tree-sha1 = "6b7ba252635a5eff6a0b0664a41ee140a1c9e72a" 1066 | uuid = "1e83bf80-4336-4d27-bf5d-d5a4f845583c" 1067 | version = "1.4.0" 1068 | 1069 | [[deps.Statistics]] 1070 | deps = ["LinearAlgebra", "SparseArrays"] 1071 | uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" 1072 | version = "1.9.0" 1073 | 1074 | [[deps.StatsAPI]] 1075 | deps = ["LinearAlgebra"] 1076 | git-tree-sha1 = "f9af7f195fb13589dd2e2d57fdb401717d2eb1f6" 1077 | uuid = "82ae8749-77ed-4fe6-ae5f-f523153014b0" 1078 | version = "1.5.0" 1079 | 1080 | [[deps.StatsBase]] 1081 | deps = ["DataAPI", "DataStructures", "LinearAlgebra", "LogExpFunctions", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics", "StatsAPI"] 1082 | 
git-tree-sha1 = "d1bf48bfcc554a3761a133fe3a9bb01488e06916" 1083 | uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" 1084 | version = "0.33.21" 1085 | 1086 | [[deps.StatsFuns]] 1087 | deps = ["ChainRulesCore", "HypergeometricFunctions", "InverseFunctions", "IrrationalConstants", "LogExpFunctions", "Reexport", "Rmath", "SpecialFunctions"] 1088 | git-tree-sha1 = "ab6083f09b3e617e34a956b43e9d51b824206932" 1089 | uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c" 1090 | version = "1.1.1" 1091 | 1092 | [[deps.StatsPlots]] 1093 | deps = ["AbstractFFTs", "Clustering", "DataStructures", "DataValues", "Distributions", "Interpolations", "KernelDensity", "LinearAlgebra", "MultivariateStats", "NaNMath", "Observables", "Plots", "RecipesBase", "RecipesPipeline", "Reexport", "StatsBase", "TableOperations", "Tables", "Widgets"] 1094 | git-tree-sha1 = "e0d5bc26226ab1b7648278169858adcfbd861780" 1095 | uuid = "f3b207a7-027a-5e70-b257-86293d7955fd" 1096 | version = "0.15.4" 1097 | 1098 | [[deps.StringManipulation]] 1099 | git-tree-sha1 = "46da2434b41f41ac3594ee9816ce5541c6096123" 1100 | uuid = "892a3eda-7b42-436c-8928-eab12a02cf0e" 1101 | version = "0.3.0" 1102 | 1103 | [[deps.StructTypes]] 1104 | deps = ["Dates", "UUIDs"] 1105 | git-tree-sha1 = "ca4bccb03acf9faaf4137a9abc1881ed1841aa70" 1106 | uuid = "856f2bd8-1eba-4b0a-8007-ebc267875bd4" 1107 | version = "1.10.0" 1108 | 1109 | [[deps.SuiteSparse]] 1110 | deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"] 1111 | uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9" 1112 | 1113 | [[deps.SuiteSparse_jll]] 1114 | deps = ["Artifacts", "Libdl", "Pkg", "libblastrampoline_jll"] 1115 | uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c" 1116 | version = "5.10.1+6" 1117 | 1118 | [[deps.TOML]] 1119 | deps = ["Dates"] 1120 | uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76" 1121 | version = "1.0.3" 1122 | 1123 | [[deps.TableOperations]] 1124 | deps = ["SentinelArrays", "Tables", "Test"] 1125 | git-tree-sha1 = "e383c87cf2a1dc41fa30c093b2a19877c83e1bc1" 1126 
| uuid = "ab02a1b2-a7df-11e8-156e-fb1833f50b87" 1127 | version = "1.2.0" 1128 | 1129 | [[deps.TableTraits]] 1130 | deps = ["IteratorInterfaceExtensions"] 1131 | git-tree-sha1 = "c06b2f539df1c6efa794486abfb6ed2022561a39" 1132 | uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c" 1133 | version = "1.0.1" 1134 | 1135 | [[deps.Tables]] 1136 | deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "OrderedCollections", "TableTraits", "Test"] 1137 | git-tree-sha1 = "c79322d36826aa2f4fd8ecfa96ddb47b174ac78d" 1138 | uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" 1139 | version = "1.10.0" 1140 | 1141 | [[deps.Tar]] 1142 | deps = ["ArgTools", "SHA"] 1143 | uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e" 1144 | version = "1.10.0" 1145 | 1146 | [[deps.TensorCore]] 1147 | deps = ["LinearAlgebra"] 1148 | git-tree-sha1 = "1feb45f88d133a655e001435632f019a9a1bcdb6" 1149 | uuid = "62fd8b95-f654-4bbd-a8a5-9c27f68ccd50" 1150 | version = "0.1.1" 1151 | 1152 | [[deps.Test]] 1153 | deps = ["InteractiveUtils", "Logging", "Random", "Serialization"] 1154 | uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" 1155 | 1156 | [[deps.TimeZones]] 1157 | deps = ["Dates", "Downloads", "InlineStrings", "LazyArtifacts", "Mocking", "Printf", "RecipesBase", "Scratch", "Unicode"] 1158 | git-tree-sha1 = "a92ec4466fc6e3dd704e2668b5e7f24add36d242" 1159 | uuid = "f269a46b-ccf7-5d73-abea-4c690281aa53" 1160 | version = "1.9.1" 1161 | 1162 | [[deps.TranscodingStreams]] 1163 | deps = ["Random", "Test"] 1164 | git-tree-sha1 = "94f38103c984f89cf77c402f2a68dbd870f8165f" 1165 | uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa" 1166 | version = "0.9.11" 1167 | 1168 | [[deps.URIs]] 1169 | git-tree-sha1 = "ac00576f90d8a259f2c9d823e91d1de3fd44d348" 1170 | uuid = "5c2747f8-b7ea-4ff2-ba2e-563bfd36b1d4" 1171 | version = "1.4.1" 1172 | 1173 | [[deps.UUIDs]] 1174 | deps = ["Random", "SHA"] 1175 | uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" 1176 | 1177 | [[deps.Unicode]] 1178 | uuid = 
"4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" 1179 | 1180 | [[deps.UnicodeFun]] 1181 | deps = ["REPL"] 1182 | git-tree-sha1 = "53915e50200959667e78a92a418594b428dffddf" 1183 | uuid = "1cfade01-22cf-5700-b092-accc4b62d6e1" 1184 | version = "0.4.1" 1185 | 1186 | [[deps.Unzip]] 1187 | git-tree-sha1 = "ca0969166a028236229f63514992fc073799bb78" 1188 | uuid = "41fe7b60-77ed-43a1-b4f0-825fd5a5650d" 1189 | version = "0.2.0" 1190 | 1191 | [[deps.VersionParsing]] 1192 | git-tree-sha1 = "58d6e80b4ee071f5efd07fda82cb9fbe17200868" 1193 | uuid = "81def892-9a0e-5fdd-b105-ffc91e053289" 1194 | version = "1.3.0" 1195 | 1196 | [[deps.Wayland_jll]] 1197 | deps = ["Artifacts", "Expat_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Pkg", "XML2_jll"] 1198 | git-tree-sha1 = "ed8d92d9774b077c53e1da50fd81a36af3744c1c" 1199 | uuid = "a2964d1f-97da-50d4-b82a-358c7fce9d89" 1200 | version = "1.21.0+0" 1201 | 1202 | [[deps.Wayland_protocols_jll]] 1203 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1204 | git-tree-sha1 = "4528479aa01ee1b3b4cd0e6faef0e04cf16466da" 1205 | uuid = "2381bf8a-dfd0-557d-9999-79630e7b1b91" 1206 | version = "1.25.0+0" 1207 | 1208 | [[deps.WeakRefStrings]] 1209 | deps = ["DataAPI", "InlineStrings", "Parsers"] 1210 | git-tree-sha1 = "b1be2855ed9ed8eac54e5caff2afcdb442d52c23" 1211 | uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5" 1212 | version = "1.4.2" 1213 | 1214 | [[deps.Widgets]] 1215 | deps = ["Colors", "Dates", "Observables", "OrderedCollections"] 1216 | git-tree-sha1 = "fcdae142c1cfc7d89de2d11e08721d0f2f86c98a" 1217 | uuid = "cc8bc4a8-27d6-5769-a93b-9d913e69aa62" 1218 | version = "0.6.6" 1219 | 1220 | [[deps.WoodburyMatrices]] 1221 | deps = ["LinearAlgebra", "SparseArrays"] 1222 | git-tree-sha1 = "de67fa59e33ad156a590055375a30b23c40299d3" 1223 | uuid = "efce3f68-66dc-5838-9240-27a6d6f5f9b6" 1224 | version = "0.5.5" 1225 | 1226 | [[deps.WorkerUtilities]] 1227 | git-tree-sha1 = "cd1659ba0d57b71a464a29e64dbc67cfe83d54e7" 1228 | uuid = "76eceee3-57b5-4d4a-8e66-0e911cebbf60" 
1229 | version = "1.6.1" 1230 | 1231 | [[deps.XML2_jll]] 1232 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Libiconv_jll", "Pkg", "Zlib_jll"] 1233 | git-tree-sha1 = "93c41695bc1c08c46c5899f4fe06d6ead504bb73" 1234 | uuid = "02c8fc9c-b97f-50b9-bbe4-9be30ff0a78a" 1235 | version = "2.10.3+0" 1236 | 1237 | [[deps.XSLT_jll]] 1238 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Libgcrypt_jll", "Libgpg_error_jll", "Libiconv_jll", "Pkg", "XML2_jll", "Zlib_jll"] 1239 | git-tree-sha1 = "91844873c4085240b95e795f692c4cec4d805f8a" 1240 | uuid = "aed1982a-8fda-507f-9586-7b0439959a61" 1241 | version = "1.1.34+0" 1242 | 1243 | [[deps.Xorg_libX11_jll]] 1244 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libxcb_jll", "Xorg_xtrans_jll"] 1245 | git-tree-sha1 = "5be649d550f3f4b95308bf0183b82e2582876527" 1246 | uuid = "4f6342f7-b3d2-589e-9d20-edeb45f2b2bc" 1247 | version = "1.6.9+4" 1248 | 1249 | [[deps.Xorg_libXau_jll]] 1250 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1251 | git-tree-sha1 = "4e490d5c960c314f33885790ed410ff3a94ce67e" 1252 | uuid = "0c0b7dd1-d40b-584c-a123-a41640f87eec" 1253 | version = "1.0.9+4" 1254 | 1255 | [[deps.Xorg_libXcursor_jll]] 1256 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXfixes_jll", "Xorg_libXrender_jll"] 1257 | git-tree-sha1 = "12e0eb3bc634fa2080c1c37fccf56f7c22989afd" 1258 | uuid = "935fb764-8cf2-53bf-bb30-45bb1f8bf724" 1259 | version = "1.2.0+4" 1260 | 1261 | [[deps.Xorg_libXdmcp_jll]] 1262 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1263 | git-tree-sha1 = "4fe47bd2247248125c428978740e18a681372dd4" 1264 | uuid = "a3789734-cfe1-5b06-b2d0-1dd0d9d62d05" 1265 | version = "1.1.3+4" 1266 | 1267 | [[deps.Xorg_libXext_jll]] 1268 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] 1269 | git-tree-sha1 = "b7c0aa8c376b31e4852b360222848637f481f8c3" 1270 | uuid = "1082639a-0dae-5f34-9b06-72781eeb8cb3" 1271 | version = "1.3.4+4" 1272 | 1273 | [[deps.Xorg_libXfixes_jll]] 1274 | deps = 
["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] 1275 | git-tree-sha1 = "0e0dc7431e7a0587559f9294aeec269471c991a4" 1276 | uuid = "d091e8ba-531a-589c-9de9-94069b037ed8" 1277 | version = "5.0.3+4" 1278 | 1279 | [[deps.Xorg_libXi_jll]] 1280 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXext_jll", "Xorg_libXfixes_jll"] 1281 | git-tree-sha1 = "89b52bc2160aadc84d707093930ef0bffa641246" 1282 | uuid = "a51aa0fd-4e3c-5386-b890-e753decda492" 1283 | version = "1.7.10+4" 1284 | 1285 | [[deps.Xorg_libXinerama_jll]] 1286 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXext_jll"] 1287 | git-tree-sha1 = "26be8b1c342929259317d8b9f7b53bf2bb73b123" 1288 | uuid = "d1454406-59df-5ea1-beac-c340f2130bc3" 1289 | version = "1.1.4+4" 1290 | 1291 | [[deps.Xorg_libXrandr_jll]] 1292 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXext_jll", "Xorg_libXrender_jll"] 1293 | git-tree-sha1 = "34cea83cb726fb58f325887bf0612c6b3fb17631" 1294 | uuid = "ec84b674-ba8e-5d96-8ba1-2a689ba10484" 1295 | version = "1.5.2+4" 1296 | 1297 | [[deps.Xorg_libXrender_jll]] 1298 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] 1299 | git-tree-sha1 = "19560f30fd49f4d4efbe7002a1037f8c43d43b96" 1300 | uuid = "ea2f1a96-1ddc-540d-b46f-429655e07cfa" 1301 | version = "0.9.10+4" 1302 | 1303 | [[deps.Xorg_libpthread_stubs_jll]] 1304 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1305 | git-tree-sha1 = "6783737e45d3c59a4a4c4091f5f88cdcf0908cbb" 1306 | uuid = "14d82f49-176c-5ed1-bb49-ad3f5cbd8c74" 1307 | version = "0.1.0+3" 1308 | 1309 | [[deps.Xorg_libxcb_jll]] 1310 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "XSLT_jll", "Xorg_libXau_jll", "Xorg_libXdmcp_jll", "Xorg_libpthread_stubs_jll"] 1311 | git-tree-sha1 = "daf17f441228e7a3833846cd048892861cff16d6" 1312 | uuid = "c7cfdc94-dc32-55de-ac96-5a1b8d977c5b" 1313 | version = "1.13.0+3" 1314 | 1315 | [[deps.Xorg_libxkbfile_jll]] 1316 | deps = ["Artifacts", "JLLWrappers", "Libdl", 
"Pkg", "Xorg_libX11_jll"] 1317 | git-tree-sha1 = "926af861744212db0eb001d9e40b5d16292080b2" 1318 | uuid = "cc61e674-0454-545c-8b26-ed2c68acab7a" 1319 | version = "1.1.0+4" 1320 | 1321 | [[deps.Xorg_xcb_util_image_jll]] 1322 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] 1323 | git-tree-sha1 = "0fab0a40349ba1cba2c1da699243396ff8e94b97" 1324 | uuid = "12413925-8142-5f55-bb0e-6d7ca50bb09b" 1325 | version = "0.4.0+1" 1326 | 1327 | [[deps.Xorg_xcb_util_jll]] 1328 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libxcb_jll"] 1329 | git-tree-sha1 = "e7fd7b2881fa2eaa72717420894d3938177862d1" 1330 | uuid = "2def613f-5ad1-5310-b15b-b15d46f528f5" 1331 | version = "0.4.0+1" 1332 | 1333 | [[deps.Xorg_xcb_util_keysyms_jll]] 1334 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] 1335 | git-tree-sha1 = "d1151e2c45a544f32441a567d1690e701ec89b00" 1336 | uuid = "975044d2-76e6-5fbe-bf08-97ce7c6574c7" 1337 | version = "0.4.0+1" 1338 | 1339 | [[deps.Xorg_xcb_util_renderutil_jll]] 1340 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] 1341 | git-tree-sha1 = "dfd7a8f38d4613b6a575253b3174dd991ca6183e" 1342 | uuid = "0d47668e-0667-5a69-a72c-f761630bfb7e" 1343 | version = "0.3.9+1" 1344 | 1345 | [[deps.Xorg_xcb_util_wm_jll]] 1346 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] 1347 | git-tree-sha1 = "e78d10aab01a4a154142c5006ed44fd9e8e31b67" 1348 | uuid = "c22f9ab0-d5fe-5066-847c-f4bb1cd4e361" 1349 | version = "0.4.1+1" 1350 | 1351 | [[deps.Xorg_xkbcomp_jll]] 1352 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libxkbfile_jll"] 1353 | git-tree-sha1 = "4bcbf660f6c2e714f87e960a171b119d06ee163b" 1354 | uuid = "35661453-b289-5fab-8a00-3d9160c6a3a4" 1355 | version = "1.4.2+4" 1356 | 1357 | [[deps.Xorg_xkeyboard_config_jll]] 1358 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xkbcomp_jll"] 1359 | git-tree-sha1 = "5c8424f8a67c3f2209646d4425f3d415fee5931d" 1360 
| uuid = "33bec58e-1273-512f-9401-5d533626f822" 1361 | version = "2.27.0+4" 1362 | 1363 | [[deps.Xorg_xtrans_jll]] 1364 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1365 | git-tree-sha1 = "79c31e7844f6ecf779705fbc12146eb190b7d845" 1366 | uuid = "c5fb5394-a638-5e4d-96e5-b29de1b5cf10" 1367 | version = "1.4.0+3" 1368 | 1369 | [[deps.ZMQ]] 1370 | deps = ["FileWatching", "Sockets", "ZeroMQ_jll"] 1371 | git-tree-sha1 = "356d2bdcc0bce90aabee1d1c0f6d6f301eda8f77" 1372 | uuid = "c2297ded-f4af-51ae-bb23-16f91089e4e1" 1373 | version = "1.2.2" 1374 | 1375 | [[deps.ZeroMQ_jll]] 1376 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "libsodium_jll"] 1377 | git-tree-sha1 = "fe5c65a526f066fb3000da137d5785d9649a8a47" 1378 | uuid = "8f1865be-045e-5c20-9c9f-bfbfb0764568" 1379 | version = "4.3.4+0" 1380 | 1381 | [[deps.ZipFile]] 1382 | deps = ["Libdl", "Printf", "Zlib_jll"] 1383 | git-tree-sha1 = "f492b7fe1698e623024e873244f10d89c95c340a" 1384 | uuid = "a5390f91-8eb1-5f08-bee0-b1d1ffed6cea" 1385 | version = "0.10.1" 1386 | 1387 | [[deps.Zlib_jll]] 1388 | deps = ["Libdl"] 1389 | uuid = "83775a58-1f1d-513f-b197-d71354ab007a" 1390 | version = "1.2.13+0" 1391 | 1392 | [[deps.Zstd_jll]] 1393 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1394 | git-tree-sha1 = "e45044cd873ded54b6a5bac0eb5c971392cf1927" 1395 | uuid = "3161d3a3-bdf6-5164-811a-617609db77b4" 1396 | version = "1.5.2+0" 1397 | 1398 | [[deps.fzf_jll]] 1399 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1400 | git-tree-sha1 = "868e669ccb12ba16eaf50cb2957ee2ff61261c56" 1401 | uuid = "214eeab7-80f7-51ab-84ad-2988db7cef09" 1402 | version = "0.29.0+0" 1403 | 1404 | [[deps.libaom_jll]] 1405 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1406 | git-tree-sha1 = "3a2ea60308f0996d26f1e5354e10c24e9ef905d4" 1407 | uuid = "a4ae2306-e953-59d6-aa16-d00cac43593b" 1408 | version = "3.4.0+0" 1409 | 1410 | [[deps.libass_jll]] 1411 | deps = ["Artifacts", "Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "HarfBuzz_jll", 
"JLLWrappers", "Libdl", "Pkg", "Zlib_jll"] 1412 | git-tree-sha1 = "5982a94fcba20f02f42ace44b9894ee2b140fe47" 1413 | uuid = "0ac62f75-1d6f-5e53-bd7c-93b484bb37c0" 1414 | version = "0.15.1+0" 1415 | 1416 | [[deps.libblastrampoline_jll]] 1417 | deps = ["Artifacts", "Libdl"] 1418 | uuid = "8e850b90-86db-534c-a0d3-1478176c7d93" 1419 | version = "5.4.0+0" 1420 | 1421 | [[deps.libfdk_aac_jll]] 1422 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1423 | git-tree-sha1 = "daacc84a041563f965be61859a36e17c4e4fcd55" 1424 | uuid = "f638f0a6-7fb0-5443-88ba-1cc74229b280" 1425 | version = "2.0.2+0" 1426 | 1427 | [[deps.libpng_jll]] 1428 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Zlib_jll"] 1429 | git-tree-sha1 = "94d180a6d2b5e55e447e2d27a29ed04fe79eb30c" 1430 | uuid = "b53b4c65-9356-5827-b1ea-8c7a1a84506f" 1431 | version = "1.6.38+0" 1432 | 1433 | [[deps.libsodium_jll]] 1434 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1435 | git-tree-sha1 = "848ab3d00fe39d6fbc2a8641048f8f272af1c51e" 1436 | uuid = "a9144af2-ca23-56d9-984f-0d03f7b5ccf8" 1437 | version = "1.0.20+0" 1438 | 1439 | [[deps.libvorbis_jll]] 1440 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Ogg_jll", "Pkg"] 1441 | git-tree-sha1 = "b910cb81ef3fe6e78bf6acee440bda86fd6ae00c" 1442 | uuid = "f27f6e37-5d2b-51aa-960f-b287f2bc3b7a" 1443 | version = "1.3.7+1" 1444 | 1445 | [[deps.nghttp2_jll]] 1446 | deps = ["Artifacts", "Libdl"] 1447 | uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d" 1448 | version = "1.48.0+0" 1449 | 1450 | [[deps.p7zip_jll]] 1451 | deps = ["Artifacts", "Libdl"] 1452 | uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0" 1453 | version = "17.4.0+0" 1454 | 1455 | [[deps.x264_jll]] 1456 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1457 | git-tree-sha1 = "4fea590b89e6ec504593146bf8b988b2c00922b2" 1458 | uuid = "1270edf5-f2f9-52d2-97e9-ab00b5d0237a" 1459 | version = "2021.5.5+0" 1460 | 1461 | [[deps.x265_jll]] 1462 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 1463 | git-tree-sha1 
= "ee567a171cce03570d77ad3a43e90218e38937a9" 1464 | uuid = "dfaa095f-4041-5dcd-9319-2fabd8486b76" 1465 | version = "3.5.0+0" 1466 | 1467 | [[deps.xkbcommon_jll]] 1468 | deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Wayland_jll", "Wayland_protocols_jll", "Xorg_libxcb_jll", "Xorg_xkeyboard_config_jll"] 1469 | git-tree-sha1 = "9ebfc140cc56e8c2156a15ceac2f0302e327ac0a" 1470 | uuid = "d8fb68d0-12a3-5cfd-a85a-d49703b185fd" 1471 | version = "1.4.1+0" 1472 | -------------------------------------------------------------------------------- /Project.toml: -------------------------------------------------------------------------------- 1 | [deps] 2 | Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45" 3 | BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf" 4 | CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" 5 | CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" 6 | Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc" 7 | CodecZlib = "944b1d66-785c-5afd-91f1-9de20f533193" 8 | DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" 9 | DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964" 10 | FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549" 11 | FreqTables = "da1fdf0e-e0ff-5433-a45f-9bb5ff651cb1" 12 | IJulia = "7073ff75-c697-5162-941a-fcdaad2a7d2a" 13 | JDF = "babc3d20-cd49-4f60-a736-a8f9c08892d3" 14 | JLSO = "9da8a3cd-07a3-59c0-a743-3fdc52c30d11" 15 | JSONTables = "b9914132-a727-11e9-1322-f18e41205b0b" 16 | NamedArrays = "86f7a689-2022-50b4-a561-43c23ac3c673" 17 | PooledArrays = "2dfb63ee-cc39-5dd5-95bd-886bf059d720" 18 | Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" 19 | Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" 20 | StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd" 21 | Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" 22 | ZipFile = "a5390f91-8eb1-5f08-bee0-b1d1ffed6cea" 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # An 
Introduction to DataFrames.jl 2 | 3 | [Bogumił Kamiński](http://bogumilkaminski.pl/about/), February 13, 2023 4 | 5 | **The tutorial is for DataFrames.jl 1.5.0** 6 | 7 | A brief introduction to basic usage of [DataFrames](https://github.com/JuliaData/DataFrames.jl). 8 | 9 | The tutorial contains a specification of the project environment version under 10 | which it should be run. In order to prepare this environment, before using the 11 | tutorial notebooks, while in the project folder run the following command in the 12 | command line: 13 | 14 | ``` 15 | julia -e 'using Pkg; Pkg.activate("."); Pkg.instantiate()' 16 | ``` 17 | 18 | Tested under Julia 1.9.0. The project dependencies are the following: 19 | 20 | ``` 21 | [69666777] Arrow v2.4.3 22 | [6e4b80f9] BenchmarkTools v1.3.2 23 | [336ed68f] CSV v0.10.9 24 | [324d7699] CategoricalArrays v0.10.7 25 | [8be319e6] Chain v0.5.0 26 | [944b1d66] CodecZlib v0.7.1 27 | [a93c6f00] DataFrames v1.5.0 28 | [1313f7d8] DataFramesMeta v0.13.0 29 | [5789e2e9] FileIO v1.16.0 30 | [da1fdf0e] FreqTables v0.4.5 31 | [7073ff75] IJulia v1.24.0 32 | [babc3d20] JDF v0.5.1 33 | [9da8a3cd] JLSO v2.7.0 34 | [b9914132] JSONTables v1.0.3 35 | [86f7a689] NamedArrays v0.9.6 36 | [2dfb63ee] PooledArrays v1.4.2 37 | [f3b207a7] StatsPlots v0.15.4 38 | [bd369af6] Tables v1.10.0 39 | [a5390f91] ZipFile v0.10.1 40 | [9a3f8284] Random 41 | [10745b16] Statistics v1.9.0 42 | ``` 43 | 44 | I will try to keep the material up to date as the packages evolve. 45 | 46 | This tutorial covers 47 | [DataFrames.jl](https://github.com/JuliaData/DataFrames.jl) 48 | and [CategoricalArrays.jl](https://github.com/JuliaData/CategoricalArrays.jl), 49 | as they constitute the core of [DataFrames.jl](https://github.com/JuliaData/DataFrames.jl) 50 | along with selected file reading and writing packages. 
51 | 52 | The last [extras](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/13_extras.ipynb) 53 | part mentions *selected* functionalities of *selected* packages that I find useful for data manipulation; currently those are: 54 | [FreqTables.jl](https://github.com/nalimilan/FreqTables.jl), 55 | [DataFramesMeta.jl](https://github.com/JuliaStats/DataFramesMeta.jl), and 56 | [StatsPlots.jl](https://github.com/JuliaPlots/StatsPlots.jl). 57 | 58 | # TOC 59 | 60 | | File | Topic | 61 | |-------------------------------------------------------------------------------------------------------------------|-----------------------------------| 62 | | [01_constructors.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/01_constructors.ipynb) | Creating DataFrame and conversion | 63 | | [02_basicinfo.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/02_basicinfo.ipynb) | Getting summary information | 64 | | [03_missingvalues.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/03_missingvalues.ipynb) | Handling missing values | 65 | | [04_loadsave.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/04_loadsave.ipynb) | Loading and saving DataFrames | 66 | | [05_columns.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/05_columns.ipynb) | Working with columns of DataFrame | 67 | | [06_rows.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/06_rows.ipynb) | Working with rows of DataFrame | 68 | | [07_factors.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/07_factors.ipynb) | Working with categorical data | 69 | | [08_joins.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/08_joins.ipynb) | Joining DataFrames | 70 | | [09_reshaping.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/09_reshaping.ipynb) | Reshaping DataFrames | 71 | |
[10_transforms.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/10_transforms.ipynb) | Transforming DataFrames | 72 | | [11_performance.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/11_performance.ipynb) | Performance tips | 73 | | [12_pitfalls.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/12_pitfalls.ipynb) | Possible pitfalls | 74 | | [13_extras.ipynb](https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/13_extras.ipynb) | Additional interesting packages | 75 | 76 | Changelog: 77 | 78 | | Date | Changes | 79 | | ---------- | ------------------------------------------------------------ | 80 | | 2017-12-05 | Initial release | 81 | | 2017-12-06 | Added description of `insert!`, `merge!`, `empty!`, `categorical!`, `delete!`, `DataFrames.index` | 82 | | 2017-12-09 | Added performance tips | 83 | | 2017-12-10 | Added pitfalls | 84 | | 2017-12-18 | Added additional worthwhile packages: *FreqTables* and *DataFramesMeta* | 85 | | 2017-12-29 | Added description of `filter` and `filter!` | 86 | | 2017-12-31 | Added description of conversion to `Matrix` | 87 | | 2018-04-06 | Added example of extracting a row from a `DataFrame` | 88 | | 2018-04-21 | Major update of whole tutorial | 89 | | 2018-05-01 | Added `byrow!` example | 90 | | 2018-05-13 | Added `StatPlots` package to extras | 91 | | 2018-05-23 | Improved comments in sections 1 to 5 by [Jane Herriman](https://github.com/xorJane) | 92 | | 2018-07-25 | Update to 0.11.7 release | 93 | | 2018-08-25 | Update to Julia 1.0 release: sections 1 to 10 | 94 | | 2018-08-29 | Update to Julia 1.0 release: sections 11, 12 and 13 | 95 | | 2018-09-05 | Update to Julia 1.0 release: FreqTables section | 96 | | 2018-09-10 | Added CSVFiles section to chapter on load/save | 97 | | 2018-09-26 | Updated to DataFrames 0.14.0 | 98 | | 2018-10-04 | Updated to DataFrames 0.14.1, added `haskey` and `repeat` | 99 | | 2018-12-08 | Updated to DataFrames 0.15.2 | 100
| | 2019-01-03 | Updated to DataFrames 0.16.0, added serialization instructions | 101 | | 2019-01-18 | Updated to DataFrames 0.17.0, added `passmissing` | 102 | | 2019-01-27 | Added Feather.jl file read/write | 103 | | 2019-01-30 | Renamed StatPlots.jl to StatsPlots.jl and added Tables.jl| 104 | | 2019-02-08 | Added `groupvars` and `groupindices` functions| 105 | | 2019-04-27 | Updated to DataFrames 0.18.0, dropped JLD2.jl | 106 | | 2019-04-30 | Updated handling of missing values description | 107 | | 2019-07-16 | Updated to DataFrames 0.19.0 | 108 | | 2019-08-14 | Added JSONTables.jl and `Tables.columnindex` | 109 | | 2019-08-16 | Added Project.toml and Manifest.toml | 110 | | 2019-08-26 | Update to Julia 1.2 and DataFrames 0.19.3 | 111 | | 2019-08-29 | Add example how to compress/decompress CSV file using CodecZlib | 112 | | 2019-08-30 | Add examples of JLSO.jl and ZipFile.jl by [xiaodaigh](https://github.com/xiaodaigh) | 113 | | 2019-11-03 | Add examples of JDF.jl by [xiaodaigh](https://github.com/xiaodaigh) | 114 | | 2019-12-08 | Updated to DataFrames 0.20.0 | 115 | | 2020-05-06 | Updated to DataFrames 0.21.0 (except load/save and extras) | 116 | | 2020-11-20 | Updated to DataFrames 0.22.0 (except DataFramesMeta.jl which does not work yet) | 117 | | 2020-11-26 | Updated to DataFramesMeta.jl 0.6; update by @pdeffebach | 118 | | 2021-05-15 | Updated to DataFrames.jl 1.1.1 | 119 | | 2021-05-15 | Updated to DataFrames.jl 1.2 and DataFramesMeta.jl 0.8, added Chain.jl instead of Pipe.jl | 120 | | 2021-12-12 | Updated to DataFrames.jl 1.3 | 121 | | 2022-10-05 | Updated to DataFrames.jl 1.4 | 122 | | 2023-02-13 | Updated to DataFrames.jl 1.5 | 123 | 124 | # Core functions summary 125 | 126 | 1. Constructors: `DataFrame`, `DataFrame!`, `Tables.rowtable`, `Tables.columntable`, `Matrix`, `eachcol`, `eachrow`, `Tables.namedtupleiterator`, `empty`, `empty!` 127 | 2. 
Getting summary: `size`, `nrow`, `ncol`, `describe`, `names`, `eltypes`, `first`, `last`, `getindex`, `setindex!`, `@view`, `isapprox`, `metadata`, `metadata!`, `colmetadata`, `colmetadata!` 128 | 3. Handling missing: `missing` (singleton instance of `Missing`), `ismissing`, `nonmissingtype`, `skipmissing`, `replace`, `replace!`, `coalesce`, `allowmissing`, `allowmissing!`, `completecases`, `dropmissing`, `dropmissing!`, `disallowmissing`, `disallowmissing!`, `passmissing` 129 | 4. Loading and saving: `CSV` (package), `CSVFiles` (package), `Serialization` (module), `CSV.read`, `CSV.write`, `save`, `load`, `serialize`, `deserialize`, `Arrow.write`, `Arrow.Table` (from Arrow.jl package), `JSONTables` (package), `arraytable`, `objecttable`, `jsontable`, `CodecZlib` (module), `GzipCompressorStream`, `GzipDecompressorStream`, `JDF.jl` (package), `JDF.save`, `JDF.load`, `JLSO.jl` (package), `JLSO.save`, `JLSO.load`, `ZipFile.jl` (package), `ZipFile.reader`, `ZipFile.writer`, `ZipFile.addfile` 130 | 5. Working with columns: `rename`, `rename!`, `hcat`, `insertcols!`, `categorical!`, `columnindex`, `hasproperty`, `select`, `select!`, `transform`, `transform!`, `combine`, `Not`, `All`, `Between`, `ByRow`, `AsTable` 131 | 6. Working with rows: `sort!`, `sort`, `issorted`, `append!`, `vcat`, `push!`, `view`, `filter`, `filter!`, `deleteat!`, `unique`, `nonunique`, `unique!`, `allunique`, `repeat`, `parent`, `parentindices`, `flatten`, `@chain` (from `Chain.jl` package), `only`, `subset`, `subset!`, `shuffle`, `prepend!`, `pushfirst!`, `insert!`, `keepat!` 132 | 7. Working with categorical: `categorical`, `cut`, `isordered`, `ordered!`, `levels`, `unique`, `levels!`, `droplevels!`, `unwrap`, `recode`, `recode!` 133 | 8. Joining: `innerjoin`, `leftjoin`, `leftjoin!`, `rightjoin`, `outerjoin`, `semijoin`, `antijoin`, `crossjoin` 134 | 9. Reshaping: `stack`, `unstack` 135 | 10.
Transforming: `groupby`, `mapcols`, `parent`, `groupcols`, `valuecols`, `groupindices`, `keys` (for `GroupedDataFrame`), `combine`, `select`, `select!`, `transform`, `transform!`, `@chain` (from `Chain.jl` package) 136 | 11. Extras: 137 | * [FreqTables](https://github.com/nalimilan/FreqTables.jl): `freqtable`, `prop`, `Name` 138 | * [DataFramesMeta](https://github.com/JuliaStats/DataFramesMeta.jl): `@with`, `@subset`, `@select`, `@transform`, `@orderby`, `@by`, `@combine`, `@eachrow`, `@newcol`, `^`, `$` 139 | * [StatsPlots](https://github.com/JuliaPlots/StatsPlots.jl): `@df`, `plot`, `density`, `histogram`,`boxplot`, `violin` 140 | --------------------------------------------------------------------------------