├── gz_2010_us_050_00_5m.json ├── provincial-and-district-hospitals.csv ├── Nassau_police_union_contribs_dirty.xlsx ├── Dockerfile ├── .gitignore ├── LICENSE ├── README.md ├── HMDA.ipynb └── Messy_Data_Breakout_Pandas.ipynb /gz_2010_us_050_00_5m.json: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ipython/mozfest2014/master/gz_2010_us_050_00_5m.json -------------------------------------------------------------------------------- /provincial-and-district-hospitals.csv: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ipython/mozfest2014/master/provincial-and-district-hospitals.csv -------------------------------------------------------------------------------- /Nassau_police_union_contribs_dirty.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ipython/mozfest2014/master/Nassau_police_union_contribs_dirty.xlsx -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM jupyter/demo 2 | 3 | MAINTAINER IPython Project 4 | 5 | USER root 6 | 7 | RUN pip2 install ipython[all] --force-reinstall --upgrade 8 | RUN pip3 install ipython[all] --force-reinstall --upgrade 9 | 10 | RUN pip2 install https://github.com/ellisonbg/leafletwidget/archive/b03ddea2842b6939810b82ee93aa13e28d458456.zip 11 | RUN pip3 install https://github.com/ellisonbg/leafletwidget/archive/b03ddea2842b6939810b82ee93aa13e28d458456.zip 12 | 13 | ADD HMDA.ipynb /home/jupyter/HMDA.ipynb 14 | ADD gz_2010_us_050_00_5m.json /home/jupyter/gz_2010_us_050_00_5m.json 15 | RUN chown jupyter:jupyter . 
-R 16 | 17 | USER jupyter 18 | 19 | CMD ipython2 notebook --no-browser --port 8888 --ip=0.0.0.0 --NotebookApp.base_url=/$RAND_BASE --NotebookApp.tornado_settings="{'template_path':['/srv/ga/', '/srv/ipython/IPython/html', '/srv/ipython/IPython/html/templates']}" 20 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | 5 | # C extensions 6 | *.so 7 | 8 | # Distribution / packaging 9 | .Python 10 | env/ 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | lib/ 17 | lib64/ 18 | parts/ 19 | sdist/ 20 | var/ 21 | *.egg-info/ 22 | .installed.cfg 23 | *.egg 24 | 25 | # PyInstaller 26 | # Usually these files are written by a python script from a template 27 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 28 | *.manifest 29 | *.spec 30 | 31 | # Installer logs 32 | pip-log.txt 33 | pip-delete-this-directory.txt 34 | 35 | # Unit test / coverage reports 36 | htmlcov/ 37 | .tox/ 38 | .coverage 39 | .cache 40 | nosetests.xml 41 | coverage.xml 42 | 43 | # Translations 44 | *.mo 45 | *.pot 46 | 47 | # Django stuff: 48 | *.log 49 | 50 | # Sphinx documentation 51 | docs/_build/ 52 | 53 | # PyBuilder 54 | target/ 55 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2014, IPython: interactive computing in Python 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 
9 | 10 | * Redistributions in binary form must reproduce the above copyright notice, 11 | this list of conditions and the following disclaimer in the documentation 12 | and/or other materials provided with the distribution. 13 | 14 | * Neither the name of mozfest2014 nor the names of its 15 | contributors may be used to endorse or promote products derived from 16 | this software without specific prior written permission. 17 | 18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 19 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 20 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 21 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 22 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 24 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 25 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 27 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 28 | 29 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | mozfest2014 2 | =========== 3 | 4 | Introduction to the IPython Notebook at MozFest 2014! 5 | 6 | Please install the IPython notebook ahead of time. If you have trouble, come find us and we'll help you out! 
7 | 8 | Facilitators: 9 | 10 | * Kyle Kelley ([@rgbkrk](https://twitter.com/rgbkrk)) 11 | * Matthias Bussonnier ([@Mbussonn](https://twitter.com/Mbussonn)) 12 | * Jeramia Ory ([@DrLabRatOry](https://twitter.com/DrLabRatOry)) 13 | * Aron Ahmadia ([@ahmadia](https://twitter.com/ahmadia)) 14 | 15 | ## Installation 16 | 17 | If you're just getting started with Python, we recommend downloading and installing Continuum's [Anaconda](http://continuum.io/downloads.html) or the free edition of Enthought's [Canopy](https://www.enthought.com/downloads/). 18 | 19 | Alternatively, there is a Docker image called `ipython/mozfest2014` with kernels for Julia, Python, and R. It is ~3.5 GB though, so you'll need to be mindful of where you run it (boot2docker starts with a small disk). If time and wifi allow, we'll have some temporary servers up in London preloaded with this image. 20 | 21 | ### Upgrading 22 | 23 | If you already have the IPython notebook installed, great! We'll need IPython 2.2+ for this tutorial session, so [upgrade in your preferred way](http://ipython.org/install.html). 
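Not sure which version you have? The snippet below is a minimal sketch of a check against the 2.2 minimum mentioned above. The helper names `version_tuple` and `meets_minimum` are made up for this example and are not part of IPython; the only IPython attribute used is `IPython.__version__`.

```python
def version_tuple(version):
    """Parse the leading numeric parts of a version string, e.g. '2.3.0' -> (2, 3, 0)."""
    parts = []
    for piece in version.split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum=(2, 2)):
    """Return True when `version` is at least `minimum` (2.2 for this tutorial)."""
    return version_tuple(version) >= minimum


try:
    import IPython
except ImportError:
    print('IPython is not installed')
else:
    status = 'OK' if meets_minimum(IPython.__version__) else 'please upgrade'
    print('IPython %s: %s' % (IPython.__version__, status))
```

Comparing tuples of integers rather than raw strings avoids the trap where `'2.10'` sorts before `'2.2'` as text.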
24 | 25 | Anaconda: 26 | 27 | ``` 28 | conda update conda 29 | conda update ipython 30 | ``` 31 | 32 | Enthought Canopy: 33 | 34 | ``` 35 | enpkg ipython 36 | ``` 37 | 38 | ## Content 39 | 40 | * [Introductory notebook](http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Notebook%20Basics.ipynb) 41 | 42 | ### Notebooks of interest: 43 | 44 | * [Data Driven Journalism](http://nbviewer.ipython.org/github/BuzzFeedNews/presidential-language-notebooks/blob/master/2014-10-presidential-address-pronouns.ipynb) 45 | * [Introduction to Python with numpy](http://nbviewer.ipython.org/github/swcarpentry/bc/blob/gh-pages/novice/python/01-numpy.ipynb) 46 | * [Introduction to IPython Widgets](http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Interactive%20Widgets/Index.ipynb) 47 | 48 | 49 | 50 | 51 | -------------------------------------------------------------------------------- /HMDA.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "", 4 | "signature": "sha256:fdbb88cdce73f99036541f5f1cd3db147fbacfa1451a0381fc8714e0941b88cd" 5 | }, 6 | "nbformat": 3, 7 | "nbformat_minor": 0, 8 | "worksheets": [ 9 | { 10 | "cells": [ 11 | { 12 | "cell_type": "heading", 13 | "level": 2, 14 | "metadata": {}, 15 | "source": [ 16 | "Initialization" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "collapsed": false, 22 | "input": [ 23 | "%matplotlib inline" 24 | ], 25 | "language": "python", 26 | "metadata": {}, 27 | "outputs": [], 28 | "prompt_number": 1 29 | }, 30 | { 31 | "cell_type": "code", 32 | "collapsed": false, 33 | "input": [ 34 | "import sys\n", 35 | "import csv\n", 36 | "import json\n", 37 | "import numpy as np\n", 38 | "from collections import Counter, defaultdict\n", 39 | "import leafletwidget as lw\n", 40 | "import matplotlib as mpl\n", 41 | "import matplotlib.cm\n", 42 | "import matplotlib.colors\n", 43 | "import matplotlib.pyplot as plt" 44 | ], 45 | 
"language": "python", 46 | "metadata": {}, 47 | "outputs": [], 48 | "prompt_number": 2 49 | }, 50 | { 51 | "cell_type": "code", 52 | "collapsed": false, 53 | "input": [ 54 | "lw.initialize_notebook()" 55 | ], 56 | "language": "python", 57 | "metadata": {}, 58 | "outputs": [ 59 | { 60 | "html": [ 61 | "" 62 | ], 63 | "metadata": {}, 64 | "output_type": "display_data", 65 | "text": [ 66 | "" 67 | ] 68 | }, 69 | { 70 | "html": [ 71 | "" 72 | ], 73 | "metadata": {}, 74 | "output_type": "display_data", 75 | "text": [ 76 | "" 77 | ] 78 | }, 79 | { 80 | "javascript": [ 81 | "\n", 82 | "\n", 83 | "require.config({\n", 84 | " paths: {\n", 85 | " leaflet: \"//cdn.leafletjs.com/leaflet-0.7.2/leaflet\",\n", 86 | " leaflet_draw: \"//cdnjs.cloudflare.com/ajax/libs/leaflet.draw/0.2.3/leaflet.draw\"\n", 87 | " },\n", 88 | " shim: {leaflet_draw: \"leaflet\"}\n", 89 | "});\n", 90 | "\n", 91 | "require([\"widgets/js/widget\", \"leaflet\", \"leaflet_draw\"], function(WidgetManager, L) {\n", 92 | "\n", 93 | " function camel_case(input) {\n", 94 | " // Convert from foo_bar to fooBar \n", 95 | " return input.toLowerCase().replace(/_(.)/g, function(match, group1) {\n", 96 | " return group1.toUpperCase();\n", 97 | " });\n", 98 | " }\n", 99 | " \n", 100 | " var LeafletLayerView = IPython.WidgetView.extend({\n", 101 | " \n", 102 | " initialize: function (parameters) {\n", 103 | " LeafletLayerView.__super__.initialize.apply(this, arguments);\n", 104 | " // Remove this line after testing...\n", 105 | " this.model.on('displayed', this.test_display, this);\n", 106 | " this.map_view = this.options.map_view;\n", 107 | " },\n", 108 | " \n", 109 | " // Remove this method after testing...\n", 110 | " test_display: function () {\n", 111 | " },\n", 112 | " \n", 113 | " render: function () {\n", 114 | " this.create_obj();\n", 115 | " this.leaflet_events();\n", 116 | " this.model_events();\n", 117 | " },\n", 118 | "\n", 119 | " leaflet_events: function () {\n", 120 | " },\n", 121 | "\n", 122 | " 
model_events: function () {\n", 123 | " },\n", 124 | "\n", 125 | " get_options: function () {\n", 126 | " var o = this.model.get('options');\n", 127 | " var options = {};\n", 128 | " var key;\n", 129 | " for (var i=0; i" 569 | ] 570 | } 571 | ], 572 | "prompt_number": 3 573 | }, 574 | { 575 | "cell_type": "markdown", 576 | "metadata": {}, 577 | "source": [ 578 | "This is a big dataset!" 579 | ] 580 | }, 581 | { 582 | "cell_type": "code", 583 | "collapsed": false, 584 | "input": [ 585 | "# Number of records (+ header)\n", 586 | "#!wc -l hmda_lar-2012.csv" 587 | ], 588 | "language": "python", 589 | "metadata": {}, 590 | "outputs": [], 591 | "prompt_number": 4 592 | }, 593 | { 594 | "cell_type": "markdown", 595 | "metadata": {}, 596 | "source": [ 597 | "We would like to present this data in an informative manner. We're going to take advantage of the fact that the data is coded by state and county to aggregate based on this information when it's available in the data." 598 | ] 599 | }, 600 | { 601 | "cell_type": "heading", 602 | "level": 2, 603 | "metadata": {}, 604 | "source": [ 605 | "Loading and aggregating the data" 606 | ] 607 | }, 608 | { 609 | "cell_type": "code", 610 | "collapsed": false, 611 | "input": [ 612 | "state_actions = defaultdict(Counter)\n", 613 | "county_actions = defaultdict(Counter)\n", 614 | "bad_records = []\n", 615 | "with open('hmda_lar-2012.csv') as csv_file:\n", 616 | " dialect = csv.Sniffer().sniff(csv_file.read(4096))\n", 617 | " csv_file.seek(0)\n", 618 | " reader = csv.reader(csv_file, dialect)\n", 619 | " header_list = reader.next()\n", 620 | " action_idx = header_list.index('action_taken_name')\n", 621 | " county_code_idx = header_list.index('county_code')\n", 622 | " state_code_idx = header_list.index('state_code')\n", 623 | " state_name_idx = header_list.index('state_name')\n", 624 | " \n", 625 | " def parse_row_list(row_list):\n", 626 | " action = row_list[action_idx]\n", 627 | " county = int(row_list[county_code_idx])\n", 628 | " 
state = int(row_list[state_code_idx]) \n", 629 | " state_name = row_list[state_name_idx]\n", 630 | " county_fips = state*1000 + county\n", 631 | " return action, county_fips, state, state_name\n", 632 | " \n", 633 | " for i, row_list in enumerate(reader):\n", 634 | " try:\n", 635 | " action, county_fips, state, state_name = parse_row_list(row_list)\n", 636 | " county_actions[county_fips][action] += 1\n", 637 | " state_actions[state_name][action] += 1\n", 638 | " except:\n", 639 | " bad_records.append(row_list)\n", 640 | " if (i+1) % 100000 == 0:\n", 641 | " sys.stdout.write('.')\n", 642 | " sys.stdout.flush()\n", 643 | " # comment out the break below to process every record (as written, only the first 100K are read)\n", 644 | " break\n", 645 | "print ''\n", 646 | "print 'Processed %d records' % i\n", 647 | "print 'Found %d records with missing county data' % len(bad_records)" 648 | ], 649 | "language": "python", 650 | "metadata": {}, 651 | "outputs": [ 652 | { 653 | "output_type": "stream", 654 | "stream": "stdout", 655 | "text": [ 656 | "." 657 | ] 658 | }, 659 | { 660 | "output_type": "stream", 661 | "stream": "stdout", 662 | "text": [ 663 | "\n", 664 | "Processed 99999 records\n", 665 | "Found 368 records with missing county data\n" 666 | ] 667 | } 668 | ], 669 | "prompt_number": 5 670 | }, 671 | { 672 | "cell_type": "markdown", 673 | "metadata": {}, 674 | "source": [ 675 | "We can now take a look at some of our accumulated data for a given state." 
676 | ] 677 | }, 678 | { 679 | "cell_type": "code", 680 | "collapsed": false, 681 | "input": [ 682 | "state_actions['California']" 683 | ], 684 | "language": "python", 685 | "metadata": {}, 686 | "outputs": [ 687 | { 688 | "metadata": {}, 689 | "output_type": "pyout", 690 | "prompt_number": 6, 691 | "text": [ 692 | "Counter({'Loan originated': 8317, 'Application withdrawn by applicant': 1527, 'Application denied by financial institution': 1245, 'Loan purchased by the institution': 943, 'Application approved but not accepted': 608, 'File closed for incompleteness': 577})" 693 | ] 694 | } 695 | ], 696 | "prompt_number": 6 697 | }, 698 | { 699 | "cell_type": "heading", 700 | "level": 2, 701 | "metadata": {}, 702 | "source": [ 703 | "Querying the Data Set" 704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | "collapsed": false, 709 | "input": [ 710 | "def query(county_id):\n", 711 | " return county_actions[county_id]['Application denied by financial institution']" 712 | ], 713 | "language": "python", 714 | "metadata": {}, 715 | "outputs": [], 716 | "prompt_number": 7 717 | }, 718 | { 719 | "cell_type": "heading", 720 | "level": 2, 721 | "metadata": {}, 722 | "source": [ 723 | "Computing the Query" 724 | ] 725 | }, 726 | { 727 | "cell_type": "code", 728 | "collapsed": false, 729 | "input": [ 730 | "# http://eric.clst.org/wupl/Stuff/gz_2010_us_050_00_5m.json\n", 731 | "with open('gz_2010_us_050_00_5m.json', 'rb') as f:\n", 732 | " county_json = json.load(f, encoding='latin-1')\n", 733 | "\n", 734 | "geo_id_stream = (feature['properties']['GEO_ID'] for feature in county_json['features'])" 735 | ], 736 | "language": "python", 737 | "metadata": {}, 738 | "outputs": [], 739 | "prompt_number": 8 740 | }, 741 | { 742 | "cell_type": "code", 743 | "collapsed": false, 744 | "input": [ 745 | "def to_county_id(geo_id):\n", 746 | " return int(geo_id[-5:])" 747 | ], 748 | "language": "python", 749 | "metadata": {}, 750 | "outputs": [], 751 | "prompt_number": 9 752 | }, 753 | { 
754 | "cell_type": "code", 755 | "collapsed": false, 756 | "input": [ 757 | "counties = [to_county_id(geo_id) for geo_id in geo_id_stream]\n", 758 | "denials = [query(key) for key in counties]" 759 | ], 760 | "language": "python", 761 | "metadata": {}, 762 | "outputs": [], 763 | "prompt_number": 10 764 | }, 765 | { 766 | "cell_type": "heading", 767 | "level": 2, 768 | "metadata": {}, 769 | "source": [ 770 | "Visualizing the Query" 771 | ] 772 | }, 773 | { 774 | "cell_type": "code", 775 | "collapsed": false, 776 | "input": [ 777 | "normalized_denials = mpl.colors.Normalize()(denials)" 778 | ], 779 | "language": "python", 780 | "metadata": {}, 781 | "outputs": [], 782 | "prompt_number": 11 783 | }, 784 | { 785 | "cell_type": "code", 786 | "collapsed": false, 787 | "input": [ 788 | "colormap=mpl.cm.Blues\n", 789 | "denial_colors = [mpl.colors.rgb2hex(d[0:3]) for d in colormap(normalized_denials)]\n", 790 | "\n", 791 | "for feature, color in zip(county_json['features'],\n", 792 | " denial_colors):\n", 793 | " feature['properties']['style'] = {'color': color, 'weight': 1, 'fillColor': color, 'fillOpacity': 0.5}" 794 | ], 795 | "language": "python", 796 | "metadata": {}, 797 | "outputs": [], 798 | "prompt_number": 12 799 | }, 800 | { 801 | "cell_type": "code", 802 | "collapsed": false, 803 | "input": [ 804 | "#m = lw.Map(zoom=4, center=[37.996162679728116, -97.294921875])\n", 805 | "m = lw.Map(zoom=4, center=[37.996162679728116, -97.294921875], default_tiles=None)" 806 | ], 807 | "language": "python", 808 | "metadata": {}, 809 | "outputs": [], 810 | "prompt_number": 15 811 | }, 812 | { 813 | "cell_type": "code", 814 | "collapsed": false, 815 | "input": [ 816 | "m" 817 | ], 818 | "language": "python", 819 | "metadata": {}, 820 | "outputs": [], 821 | "prompt_number": 16 822 | }, 823 | { 824 | "cell_type": "code", 825 | "collapsed": false, 826 | "input": [ 827 | "m.bounds" 828 | ], 829 | "language": "python", 830 | "metadata": {}, 831 | "outputs": [ 832 | { 833 | 
"metadata": {}, 834 | "output_type": "pyout", 835 | "prompt_number": 17, 836 | "text": [ 837 | "[(22.917922936146045, -123.662109375), (50.51342652633956, -70.927734375)]" 838 | ] 839 | } 840 | ], 841 | "prompt_number": 17 842 | }, 843 | { 844 | "cell_type": "code", 845 | "collapsed": false, 846 | "input": [ 847 | "g = lw.GeoJSON(data=county_json)" 848 | ], 849 | "language": "python", 850 | "metadata": {}, 851 | "outputs": [], 852 | "prompt_number": 18 853 | }, 854 | { 855 | "cell_type": "code", 856 | "collapsed": false, 857 | "input": [ 858 | "m.add_layer(g)" 859 | ], 860 | "language": "python", 861 | "metadata": {}, 862 | "outputs": [], 863 | "prompt_number": 19 864 | } 865 | ], 866 | "metadata": {} 867 | } 868 | ] 869 | } -------------------------------------------------------------------------------- /Messy_Data_Breakout_Pandas.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "", 4 | "signature": "sha256:a54fcb4b11db5aa420212ce3902532b294e671b3aa5226ec0631fcf5190e7507" 5 | }, 6 | "nbformat": 3, 7 | "nbformat_minor": 0, 8 | "worksheets": [ 9 | { 10 | "cells": [ 11 | { 12 | "cell_type": "markdown", 13 | "metadata": {}, 14 | "source": [ 15 | "# Intro\n", 16 | "One of the breakout sessions in the \"Dealing with Messy Data\" session run by [Milena](http://twitter.com/milena_iul) and [Yuandra](http://twitter.com/iniandra) dealt with getting data into pandas and fixing as much of it as feasible there. 
Intro by [Jeramia](http://twitter.com/DrLabRatOry), panda-fu by [Kyle](http://twitter.com/rgbkrk).\n", 17 | "\n", 18 | "First we need to import `pandas`:" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "collapsed": false, 24 | "input": [ 25 | "import pandas as pd" 26 | ], 27 | "language": "python", 28 | "metadata": {}, 29 | "outputs": [], 30 | "prompt_number": 49 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "Using the IPython notebook's shell escape (`!`), we can see which Excel files exist in the directory:" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "collapsed": false, 42 | "input": [ 43 | "!ls *.xlsx" 44 | ], 45 | "language": "python", 46 | "metadata": {}, 47 | "outputs": [ 48 | { 49 | "output_type": "stream", 50 | "stream": "stdout", 51 | "text": [ 52 | "\u001b[31mNassau_police_union_contribs_dirty.xlsx\u001b[m\u001b[m\r\n", 53 | "~$Nassau_police_union_contribs_dirty.xlsx\r\n" 54 | ] 55 | } 56 | ], 57 | "prompt_number": 50 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": {}, 62 | "source": [ 63 | "We can read the Excel file (warts and all) using the `read_excel` function:" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "collapsed": false, 69 | "input": [ 70 | "df = pd.read_excel('Nassau_police_union_contribs_dirty.xlsx')" 71 | ], 72 | "language": "python", 73 | "metadata": {}, 74 | "outputs": [], 75 | "prompt_number": 51 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": {}, 80 | "source": [ 81 | "Display the (abbreviated) contents of the DataFrame using the IPython notebook's rich display:" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "collapsed": false, 87 | "input": [ 88 | "df" 89 | ], 90 | "language": "python", 91 | "metadata": {}, 92 | "outputs": [ 93 | { 94 | "html": [ 95 | "
| | DATE | NAME OF INITIATIVE/ADDRESS | City | State | Zip Code | CHECK NO. | AMOUNT ($ ) | RECORD DATE | Party | Source |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 02/21/02 | BRONX CONSERVATIVE COMM;415 MINNIEFORD AVE | BRONX | N.Y. | 10458 | 1562 | $ 140 | JUL-03-02 10:16 AM | C | SOA |
| 1 | 02/23/09 | BRONX CONSERVATIVE COMM.;188D EDGEWATER PARK | BRONX | NY | 10465 | 2628 | 100 | JUL-13-09 04:13 PM | C | SOA |
| 2 | 03/01/00 | BRONX CONSERVATIVE COMMITTEE;475 MINNEFORD AVE. | BRONX | NY | 10464 | 1346 | 405 | JUL-10-00 02:50 PM | C | SOA |
| 3 | 03/04/03 | BRONX CONSERVATIVE COMMITTEE;188 EDGEWATER PARK | BRONX | NY | 10465 | 1727 | 100 | JUL-14-03 02:41 PM | C | SOA |
| 4 | 03/18/04 | BRONX CONSERVATIVE COMMITTEE;188D EDGEMERE PARK | BRONX | NY | 10465 | 1869 | 100 | JUL-14-04 01:09 PM | C | SOA |
| 5 | 03/01/05 | BRONX CONSERVATIVE COMMITTEE;1880 EDGEWATER PARK | BRONX | NY | 10465 | 1975 | 100 | JUL-13-05 03:48 PM | C | SOA |
| 6 | 02/26/07 | BRONX CONSERVATIVE COMMITTEE;188D EDGEMERE PARK | BRONX | NY | 10465 | 2371 | 100 | JUL-16-07 05:32 PM | C | SOA |
| 7 | 03/20/08 | BRONX CONSERVATIVE COMMITTEE;188D EDGEMERE PARK | BRONX | NY | 10465 | 2504 | 100 | JUL-15-08 02:53 PM | C | SOA |
| 8 | 05/25/04 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBO... | LEVITTOWN | NY | 11756 | 1902 | 200 | JUL-14-04 02:08 PM | C | DETECTIVES |
| 9 | 01/20/09 | NASSAU CO. CONSERVATIVE COMMITTEE;105 BOBOLINK LA | LEVITTOWN | NY | 11756 | 2603 | 200 | JUL-13-09 03:51 PM | C | SOA |
| 10 | 04/30/09 | NASSAU CO. CONSERVATIVE COMMITTEE;105 BOBLINK LA | LEVITTOWN | NY | 11756 | 2646 | 250 | JUL-13-09 04:42 PM | C | SOA |
| 11 | 02/15/05 | NASSAU CO. CONSERVATIVE PARTY;105 BOBOLINK LANE | LEVITTOWN | NY | 11756 | 1968 | 120 | JUL-13-05 03:34 PM | C | SOA |
| 12 | 04/14/11 | NASSAU CO. CONSERVATIVE PARTY;1 SYDNEY ST | PLAINVIEW | NY | 11803 | 2854 | $ 200 | JUL-11-11 03:19 PM | C | SOA |
| 13 | 05/28/02 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... | LEVITTOWN | N.Y. | 11756 | 1858 | $ 300 | JUL-15-02 10:36 AM | C | PBA |
| 14 | 05/23/03 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... | LEVITTOWN | NY | 11756 | 1953 | 2000 | JUL-11-03 02:42 PM | C | PBA |
| 15 | 05/23/03 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... | LEVITTOWN | NY | 11756 | 1954 | 300 | JUL-11-03 02:43 PM | C | PBA |
| 16 | 06/01/04 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBLI... | LEVITTOWN | NY | 11756 | 2032 | $ 3500 | JUL-15-04 12:08 PM | C | PBA |
| 17 | 05/05/06 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... | LEVITTOWN | NY | 11756 | 2238 | 400 | JUN-16-06 09:09 AM | C | PBA |
| 18 | 02/13/09 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... | LEVITTOWN | NY | 11756 | 1236 | 400 | JUL-08-09 01:26 PM | C | PBA |
| 19 | 04/10/09 | NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... | LEVITTOWN | NY | 11756 | 1263 | 1500 | JUL-08-09 02:27 PM | C | PBA |
20 05/05/11 NASSAU COUNTY CONSERVATIVE COMMITTEE;1 SYDNEY ... PLAINVIEW NY 11803 1548 100 JUL-08-11 08:53 AM C PBA
21 06/10/11 NASSAU COUNTY CONSERVATIVE COMMITTEE;1 SYDNEY ST PLAINVIEW NY 11803 1567 200 JUL-08-11 09:57 AM C PBA
22 02/16/99 NASSAU COUNTY CONSERVATIVE COMMITTEE;36 SUNRIS... PLAINVIEW NY 11803 NaN 200 JUN-10-99 06:06 PM C DETECTIVES
23 02/12/03 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOB O... LEVITTOWN NY 11756 1802 $ 200 JUN-19-03 03:38 PM C DETECTIVES
24 04/26/05 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOB O... LEVITTOWN NY 11756 1976 200 JUL-11-05 02:19 PM C DETECTIVES
25 02/09/06 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... LEVITTOWN NY 11756 2034 200 JUN-27-06 12:47 PM C DETECTIVES
26 05/10/07 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... LEVITTOWN NY 11756 1602 200 JUL-05-07 05:26 PM C DETECTIVES
27 02/09/99 NASSAU COUNTY CONSERVATIVE COMMITTEE;36 SUNRIS... PLAINVIEW NY 11803 1253 250 JUL-07-99 10:35 AM C SOA
28 04/27/00 NASSAU COUNTY CONSERVATIVE COMMITTEE;36 SUNRIS... PLAINVIEW NY 11803 1364 300 JUL-10-00 03:22 PM C SOA
29 01/24/02 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... LEVITTOWN NY 11756 1551 400 JUL-03-02 10:00 AM C SOA
.................................
3251 09/06/06 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... WESTBURY NY 11590 2333 300 SEP-21-06 04:36 PM R SOA
3252 03/19/07 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... WESTBURY NY 11590 2389 250 JUL-16-07 05:32 PM R SOA
3253 03/20/08 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... WESTBURY NY 11590 2503 250 JUL-15-08 02:52 PM R SOA
3254 07/08/09 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... WESTBURY NY 11590 2676 150 JUL-13-09 05:15 PM R SOA
3255 07/06/11 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... WESTBURY NY 11590 2883 300 JUL-11-11 03:52 PM R SOA
3256 04/06/10 TOWN OF N. HEMPSTEAD REPUBLICAN COMM;164 POST ... WESTBURY NY 11590 1424 625 JUL-15-10 10:51 AM R PBA
3257 10/19/01 TOWN OF N. HEMPSTEAD REPUBLICAN COMMITTEE;164 ... WESTBURY NY 11590 1812 400 NOV-28-01 02:20 PM R PBA
3258 06/01/00 TOWN OF N. HEMPSTEAD REPUBLICAN COMMITTEE;164 ... WESTBURY NY 11590 1375 1550 JUL-10-00 03:44 PM R SOA
3259 07/16/10 TOWN OF NORTH HEMPSTEAD REPUBLICAN COMMITTEE;1... WESTBURY NY 11590 1205 550 AUG-11-10 02:13 PM R DETECTIVES
3260 10/01/10 TOWN OF OYSTER BAY REPUBLICAN COMMITTEE;164 PO... WESTBURY NY 11590 1486 $ 1200 OCT-22-10 11:33 AM R PBA
3261 04/05/02 TRUNZO CAMPAIGN FUND;105 WASHINGTON AVE. BRENTWOOD NY 11717 1847 350 JUL-15-02 10:28 AM R PBA
3262 09/26/05 VALLEY STREAM REPUBLICAN CLUB;362 PICCADILLY POND LYNBROOK NY 11563 2184 250 OCT-07-05 03:13 PM R PBA
3263 09/14/09 VALLEY STREAM REPUBLICAN COMM;362 PICCADILLY D... LYNBROOK NY 11563 1320 400 SEP-30-09 12:54 PM R PBA
3264 10/15/09 VALLEY STREAM REPUBLICAN COMM;362 PICCADILLY D... LYNBROOK NY 11563 1338 2000 OCT-21-09 09:46 AM R PBA
3265 07/19/11 VALLEY STREAM REPUBLICAN COMM;362 PICCADILLY D... LYNBROOK NY 11563 1582 1000 OCT-03-11 02:26 PM R PBA
3266 09/06/07 VALLEY STREAM REPUBLICAN COMMITTEE;362 PICCADI... LYNBROOK NY 11563 1056 225 OCT-05-07 12:55 PM R PBA
3267 09/01/10 VALLEY STREAM REPUBLICAN COMMITTEE;362 PICCADI... LYNBROOK NY 11563 1472 400 SEP-30-10 11:01 AM R PBA
3268 03/02/01 VOLKER CAMPAIGN COMMITTEE;PO BOX 494 LANCASTER NY 14086 1747 $ 300 JUL-13-01 03:09 PM R PBA
3269 08/27/99 WANTAGH GOP;775 WANTAGH AVENUE WANTAGH NY 11793 1309 $ 100 AUG-31-99 05:04 PM R SOA
3270 07/14/01 WANTAGH GOP;75 WANTAGH AVE WANTAGH NY 11793 1497 700 AUG-09-01 10:01 PM R SOA
3271 08/03/99 WANTAGH REPUBLICAN COMMITTEE;2196 BROOKSIDE AV... WANTAGH NY 11793 1305 550 AUG-11-99 07:35 PM R SOA
3272 08/25/10 WOODMERE REPUBLICAN CLUB;36 CENTER STREET WOODMERE NY 11598 1210 200 AUG-31-10 06:09 PM R DETECTIVES
3273 03/14/11 ZELDIN FOR SENATE;PO BOX 628 SHIRLEY NY 11967 1526 500 JUL-07-11 04:12 PM R PBA
3274 06/06/11 ZELDIN FOR SENATE;PO BOX 628 SHIRLEY NY 11967 1560 250 JUL-08-11 09:01 AM R PBA
3275 01/09/12 ZELDIN FOR SENATE;PO BOX 628 SHIRLEY NY 11967 1618 500 JAN-16-12 12:00 AM R PBA
3276 01/11/12 ZELDIN FOR SENATE;PO BOX 628 SHIRLEY NY 11967 1621 500 JAN-16-12 12:00 AM R PBA
3277 01/09/12 ZELDIN FOR SENATE;47 FLINTLOCK DRIVE SHIRLEY NY 11967 1326 500 JAN-16-12 12:00 AM R DETECTIVES
3278 06/06/11 ZELDIN FOR SENATE;PO BOX 628 SHIRLEY NY 11967 2872 250 JUL-11-11 03:37 PM R SOA
3279 01/09/12 ZELDIN FOR SENATE;PO BOX 628 SHIRLEY NY 11967 2916 500 JAN-12-12 12:00 AM R SOA
3280 06/30/09 ;235 CARMEN AVE EAST ROCKAWAY NY 11518 1290 $ 600 JUL-08-09 02:59 PM R PBA
\n", 908 | "

3281 rows \u00d7 10 columns

\n", 909 | "
" 910 | ], 911 | "metadata": {}, 912 | "output_type": "pyout", 913 | "prompt_number": 52, 914 | "text": [ 915 | " DATE NAME OF INITIATIVE/ADDRESS \\\n", 916 | "0 02/21/02 BRONX CONSERVATIVE COMM;415 MINNIEFORD AVE \n", 917 | "1 02/23/09 BRONX CONSERVATIVE COMM.;188D EDGEWATER PARK \n", 918 | "2 03/01/00 BRONX CONSERVATIVE COMMITTEE;475 MINNEFORD AVE. \n", 919 | "3 03/04/03 BRONX CONSERVATIVE COMMITTEE;188 EDGEWATER PARK \n", 920 | "4 03/18/04 BRONX CONSERVATIVE COMMITTEE;188D EDGEMERE PARK \n", 921 | "5 03/01/05 BRONX CONSERVATIVE COMMITTEE;1880 EDGEWATER PARK \n", 922 | "6 02/26/07 BRONX CONSERVATIVE COMMITTEE;188D EDGEMERE PARK \n", 923 | "7 03/20/08 BRONX CONSERVATIVE COMMITTEE;188D EDGEMERE PARK \n", 924 | "8 05/25/04 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBO... \n", 925 | "9 01/20/09 NASSAU CO. CONSERVATIVE COMMITTEE;105 BOBOLINK LA \n", 926 | "10 04/30/09 NASSAU CO. CONSERVATIVE COMMITTEE;105 BOBLINK LA \n", 927 | "11 02/15/05 NASSAU CO. CONSERVATIVE PARTY;105 BOBOLINK LANE \n", 928 | "12 04/14/11 NASSAU CO. CONSERVATIVE PARTY;1 SYDNEY ST \n", 929 | "13 05/28/02 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 930 | "14 05/23/03 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 931 | "15 05/23/03 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 932 | "16 06/01/04 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBLI... \n", 933 | "17 05/05/06 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 934 | "18 02/13/09 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 935 | "19 04/10/09 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 936 | "20 05/05/11 NASSAU COUNTY CONSERVATIVE COMMITTEE;1 SYDNEY ... \n", 937 | "21 06/10/11 NASSAU COUNTY CONSERVATIVE COMMITTEE;1 SYDNEY ST \n", 938 | "22 02/16/99 NASSAU COUNTY CONSERVATIVE COMMITTEE;36 SUNRIS... \n", 939 | "23 02/12/03 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOB O... \n", 940 | "24 04/26/05 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOB O... 
\n", 941 | "25 02/09/06 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 942 | "26 05/10/07 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 943 | "27 02/09/99 NASSAU COUNTY CONSERVATIVE COMMITTEE;36 SUNRIS... \n", 944 | "28 04/27/00 NASSAU COUNTY CONSERVATIVE COMMITTEE;36 SUNRIS... \n", 945 | "29 01/24/02 NASSAU COUNTY CONSERVATIVE COMMITTEE;105 BOBOL... \n", 946 | "... ... ... \n", 947 | "3251 09/06/06 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... \n", 948 | "3252 03/19/07 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... \n", 949 | "3253 03/20/08 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... \n", 950 | "3254 07/08/09 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... \n", 951 | "3255 07/06/11 TOWN OF HEMPSTEAD REPUBLICAN COMMITTEE;164 POS... \n", 952 | "3256 04/06/10 TOWN OF N. HEMPSTEAD REPUBLICAN COMM;164 POST ... \n", 953 | "3257 10/19/01 TOWN OF N. HEMPSTEAD REPUBLICAN COMMITTEE;164 ... \n", 954 | "3258 06/01/00 TOWN OF N. HEMPSTEAD REPUBLICAN COMMITTEE;164 ... \n", 955 | "3259 07/16/10 TOWN OF NORTH HEMPSTEAD REPUBLICAN COMMITTEE;1... \n", 956 | "3260 10/01/10 TOWN OF OYSTER BAY REPUBLICAN COMMITTEE;164 PO... \n", 957 | "3261 04/05/02 TRUNZO CAMPAIGN FUND;105 WASHINGTON AVE. \n", 958 | "3262 09/26/05 VALLEY STREAM REPUBLICAN CLUB;362 PICCADILLY POND \n", 959 | "3263 09/14/09 VALLEY STREAM REPUBLICAN COMM;362 PICCADILLY D... \n", 960 | "3264 10/15/09 VALLEY STREAM REPUBLICAN COMM;362 PICCADILLY D... \n", 961 | "3265 07/19/11 VALLEY STREAM REPUBLICAN COMM;362 PICCADILLY D... \n", 962 | "3266 09/06/07 VALLEY STREAM REPUBLICAN COMMITTEE;362 PICCADI... \n", 963 | "3267 09/01/10 VALLEY STREAM REPUBLICAN COMMITTEE;362 PICCADI... \n", 964 | "3268 03/02/01 VOLKER CAMPAIGN COMMITTEE;PO BOX 494 \n", 965 | "3269 08/27/99 WANTAGH GOP;775 WANTAGH AVENUE \n", 966 | "3270 07/14/01 WANTAGH GOP;75 WANTAGH AVE \n", 967 | "3271 08/03/99 WANTAGH REPUBLICAN COMMITTEE;2196 BROOKSIDE AV... 
\n", 968 | "3272 08/25/10 WOODMERE REPUBLICAN CLUB;36 CENTER STREET \n", 969 | "3273 03/14/11 ZELDIN FOR SENATE;PO BOX 628 \n", 970 | "3274 06/06/11 ZELDIN FOR SENATE;PO BOX 628 \n", 971 | "3275 01/09/12 ZELDIN FOR SENATE;PO BOX 628 \n", 972 | "3276 01/11/12 ZELDIN FOR SENATE;PO BOX 628 \n", 973 | "3277 01/09/12 ZELDIN FOR SENATE;47 FLINTLOCK DRIVE \n", 974 | "3278 06/06/11 ZELDIN FOR SENATE;PO BOX 628 \n", 975 | "3279 01/09/12 ZELDIN FOR SENATE;PO BOX 628 \n", 976 | "3280 06/30/09 ;235 CARMEN AVE \n", 977 | "\n", 978 | " City State Zip Code CHECK NO. AMOUNT ($ ) \\\n", 979 | "0 BRONX N.Y. 10458 1562 $ 140 \n", 980 | "1 BRONX NY 10465 2628 100 \n", 981 | "2 BRONX NY 10464 1346 405 \n", 982 | "3 BRONX NY 10465 1727 100 \n", 983 | "4 BRONX NY 10465 1869 100 \n", 984 | "5 BRONX NY 10465 1975 100 \n", 985 | "6 BRONX NY 10465 2371 100 \n", 986 | "7 BRONX NY 10465 2504 100 \n", 987 | "8 LEVITTOWN NY 11756 1902 200 \n", 988 | "9 LEVITTOWN NY 11756 2603 200 \n", 989 | "10 LEVITTOWN NY 11756 2646 250 \n", 990 | "11 LEVITTOWN NY 11756 1968 120 \n", 991 | "12 PLAINVIEW NY 11803 2854 $ 200 \n", 992 | "13 LEVITTOWN N.Y. 11756 1858 $ 300 \n", 993 | "14 LEVITTOWN NY 11756 1953 2000 \n", 994 | "15 LEVITTOWN NY 11756 1954 300 \n", 995 | "16 LEVITTOWN NY 11756 2032 $ 3500 \n", 996 | "17 LEVITTOWN NY 11756 2238 400 \n", 997 | "18 LEVITTOWN NY 11756 1236 400 \n", 998 | "19 LEVITTOWN NY 11756 1263 1500 \n", 999 | "20 PLAINVIEW NY 11803 1548 100 \n", 1000 | "21 PLAINVIEW NY 11803 1567 200 \n", 1001 | "22 PLAINVIEW NY 11803 NaN 200 \n", 1002 | "23 LEVITTOWN NY 11756 1802 $ 200 \n", 1003 | "24 LEVITTOWN NY 11756 1976 200 \n", 1004 | "25 LEVITTOWN NY 11756 2034 200 \n", 1005 | "26 LEVITTOWN NY 11756 1602 200 \n", 1006 | "27 PLAINVIEW NY 11803 1253 250 \n", 1007 | "28 PLAINVIEW NY 11803 1364 300 \n", 1008 | "29 LEVITTOWN NY 11756 1551 400 \n", 1009 | "... ... ... ... ... ... 
\n", 1010 | "3251 WESTBURY NY 11590 2333 300 \n", 1011 | "3252 WESTBURY NY 11590 2389 250 \n", 1012 | "3253 WESTBURY NY 11590 2503 250 \n", 1013 | "3254 WESTBURY NY 11590 2676 150 \n", 1014 | "3255 WESTBURY NY 11590 2883 300 \n", 1015 | "3256 WESTBURY NY 11590 1424 625 \n", 1016 | "3257 WESTBURY NY 11590 1812 400 \n", 1017 | "3258 WESTBURY NY 11590 1375 1550 \n", 1018 | "3259 WESTBURY NY 11590 1205 550 \n", 1019 | "3260 WESTBURY NY 11590 1486 $ 1200 \n", 1020 | "3261 BRENTWOOD NY 11717 1847 350 \n", 1021 | "3262 LYNBROOK NY 11563 2184 250 \n", 1022 | "3263 LYNBROOK NY 11563 1320 400 \n", 1023 | "3264 LYNBROOK NY 11563 1338 2000 \n", 1024 | "3265 LYNBROOK NY 11563 1582 1000 \n", 1025 | "3266 LYNBROOK NY 11563 1056 225 \n", 1026 | "3267 LYNBROOK NY 11563 1472 400 \n", 1027 | "3268 LANCASTER NY 14086 1747 $ 300 \n", 1028 | "3269 WANTAGH NY 11793 1309 $ 100 \n", 1029 | "3270 WANTAGH NY 11793 1497 700 \n", 1030 | "3271 WANTAGH NY 11793 1305 550 \n", 1031 | "3272 WOODMERE NY 11598 1210 200 \n", 1032 | "3273 SHIRLEY NY 11967 1526 500 \n", 1033 | "3274 SHIRLEY NY 11967 1560 250 \n", 1034 | "3275 SHIRLEY NY 11967 1618 500 \n", 1035 | "3276 SHIRLEY NY 11967 1621 500 \n", 1036 | "3277 SHIRLEY NY 11967 1326 500 \n", 1037 | "3278 SHIRLEY NY 11967 2872 250 \n", 1038 | "3279 SHIRLEY NY 11967 2916 500 \n", 1039 | "3280 EAST ROCKAWAY NY 11518 1290 $ 600 \n", 1040 | "\n", 1041 | " RECORD DATE Party Source \n", 1042 | "0 JUL-03-02 10:16 AM C SOA \n", 1043 | "1 JUL-13-09 04:13 PM C SOA \n", 1044 | "2 JUL-10-00 02:50 PM C SOA \n", 1045 | "3 JUL-14-03 02:41 PM C SOA \n", 1046 | "4 JUL-14-04 01:09 PM C SOA \n", 1047 | "5 JUL-13-05 03:48 PM C SOA \n", 1048 | "6 JUL-16-07 05:32 PM C SOA \n", 1049 | "7 JUL-15-08 02:53 PM C SOA \n", 1050 | "8 JUL-14-04 02:08 PM C DETECTIVES \n", 1051 | "9 JUL-13-09 03:51 PM C SOA \n", 1052 | "10 JUL-13-09 04:42 PM C SOA \n", 1053 | "11 JUL-13-05 03:34 PM C SOA \n", 1054 | "12 JUL-11-11 03:19 PM C SOA \n", 1055 | "13 JUL-15-02 10:36 AM C PBA \n", 1056 | "14 
JUL-11-03 02:42 PM C PBA \n", 1057 | "15 JUL-11-03 02:43 PM C PBA \n", 1058 | "16 JUL-15-04 12:08 PM C PBA \n", 1059 | "17 JUN-16-06 09:09 AM C PBA \n", 1060 | "18 JUL-08-09 01:26 PM C PBA \n", 1061 | "19 JUL-08-09 02:27 PM C PBA \n", 1062 | "20 JUL-08-11 08:53 AM C PBA \n", 1063 | "21 JUL-08-11 09:57 AM C PBA \n", 1064 | "22 JUN-10-99 06:06 PM C DETECTIVES \n", 1065 | "23 JUN-19-03 03:38 PM C DETECTIVES \n", 1066 | "24 JUL-11-05 02:19 PM C DETECTIVES \n", 1067 | "25 JUN-27-06 12:47 PM C DETECTIVES \n", 1068 | "26 JUL-05-07 05:26 PM C DETECTIVES \n", 1069 | "27 JUL-07-99 10:35 AM C SOA \n", 1070 | "28 JUL-10-00 03:22 PM C SOA \n", 1071 | "29 JUL-03-02 10:00 AM C SOA \n", 1072 | "... ... ... ... \n", 1073 | "3251 SEP-21-06 04:36 PM R SOA \n", 1074 | "3252 JUL-16-07 05:32 PM R SOA \n", 1075 | "3253 JUL-15-08 02:52 PM R SOA \n", 1076 | "3254 JUL-13-09 05:15 PM R SOA \n", 1077 | "3255 JUL-11-11 03:52 PM R SOA \n", 1078 | "3256 JUL-15-10 10:51 AM R PBA \n", 1079 | "3257 NOV-28-01 02:20 PM R PBA \n", 1080 | "3258 JUL-10-00 03:44 PM R SOA \n", 1081 | "3259 AUG-11-10 02:13 PM R DETECTIVES \n", 1082 | "3260 OCT-22-10 11:33 AM R PBA \n", 1083 | "3261 JUL-15-02 10:28 AM R PBA \n", 1084 | "3262 OCT-07-05 03:13 PM R PBA \n", 1085 | "3263 SEP-30-09 12:54 PM R PBA \n", 1086 | "3264 OCT-21-09 09:46 AM R PBA \n", 1087 | "3265 OCT-03-11 02:26 PM R PBA \n", 1088 | "3266 OCT-05-07 12:55 PM R PBA \n", 1089 | "3267 SEP-30-10 11:01 AM R PBA \n", 1090 | "3268 JUL-13-01 03:09 PM R PBA \n", 1091 | "3269 AUG-31-99 05:04 PM R SOA \n", 1092 | "3270 AUG-09-01 10:01 PM R SOA \n", 1093 | "3271 AUG-11-99 07:35 PM R SOA \n", 1094 | "3272 AUG-31-10 06:09 PM R DETECTIVES \n", 1095 | "3273 JUL-07-11 04:12 PM R PBA \n", 1096 | "3274 JUL-08-11 09:01 AM R PBA \n", 1097 | "3275 JAN-16-12 12:00 AM R PBA \n", 1098 | "3276 JAN-16-12 12:00 AM R PBA \n", 1099 | "3277 JAN-16-12 12:00 AM R DETECTIVES \n", 1100 | "3278 JUL-11-11 03:37 PM R SOA \n", 1101 | "3279 JAN-12-12 12:00 AM R SOA \n", 1102 | "3280 JUL-08-09 
02:59 PM R PBA \n", 1103 | "\n", 1104 | "[3281 rows x 10 columns]" 1105 | ] 1106 | } 1107 | ], 1108 | "prompt_number": 52 1109 | }, 1110 | { 1111 | "cell_type": "markdown", 1112 | "metadata": {}, 1113 | "source": [ 1114 | "Looking at the data, we can tell it has all kinds of problems. \n", 1115 | "\n", 1116 | "For the remainder of this breakout, we focused on trying to get the `City` column into better shape.\n", 1117 | "\n", 1118 | "To look at only the `City` column, we can select it from the DataFrame by name:" 1119 | ] 1120 | }, 1121 | { 1122 | "cell_type": "code", 1123 | "collapsed": false, 1124 | "input": [ 1125 | "df['City']" 1126 | ], 1127 | "language": "python", 1128 | "metadata": {}, 1129 | "outputs": [ 1130 | { 1131 | "metadata": {}, 1132 | "output_type": "pyout", 1133 | "prompt_number": 54, 1134 | "text": [ 1135 | "0 BRONX\n", 1136 | "1 BRONX\n", 1137 | "2 BRONX\n", 1138 | "3 BRONX\n", 1139 | "4 BRONX\n", 1140 | "5 BRONX\n", 1141 | "6 BRONX\n", 1142 | "7 BRONX\n", 1143 | "8 LEVITTOWN\n", 1144 | "9 LEVITTOWN\n", 1145 | "10 LEVITTOWN\n", 1146 | "11 LEVITTOWN\n", 1147 | "12 PLAINVIEW\n", 1148 | "13 LEVITTOWN\n", 1149 | "14 LEVITTOWN\n", 1150 | "...\n", 1151 | "3266 LYNBROOK\n", 1152 | "3267 LYNBROOK \n", 1153 | "3268 LANCASTER \n", 1154 | "3269 WANTAGH\n", 1155 | "3270 WANTAGH\n", 1156 | "3271 WANTAGH\n", 1157 | "3272 WOODMERE\n", 1158 | "3273 SHIRLEY\n", 1159 | "3274 SHIRLEY\n", 1160 | "3275 SHIRLEY\n", 1161 | "3276 SHIRLEY \n", 1162 | "3277 SHIRLEY\n", 1163 | "3278 SHIRLEY\n", 1164 | "3279 SHIRLEY\n", 1165 | "3280 EAST ROCKAWAY\n", 1166 | "Name: City, Length: 3281, dtype: object" 1167 | ] 1168 | } 1169 | ], 1170 | "prompt_number": 54 1171 | }, 1172 | { 1173 | "cell_type": "markdown", 1174 | "metadata": {}, 1175 | "source": [ 1176 | "This still doesn't give us a great idea of how big a mess we have, so let's pull out the unique city names and sort them alphabetically:" 1177 | ] 1178 | }, 1179 | { 1180 | "cell_type": "code", 1181 | "collapsed": false, 1182 | "input": [ 1183 | 
"cities = df['City'].unique()\n", 1184 | "cities.sort()\n", 1185 | "cities" 1186 | ], 1187 | "language": "python", 1188 | "metadata": {}, 1189 | "outputs": [ 1190 | { 1191 | "metadata": {}, 1192 | "output_type": "pyout", 1193 | "prompt_number": 58, 1194 | "text": [ 1195 | "array([nan, u' MASSAPEQUA', u' WANTAGH', u' ALBANY', u' BALDWIN',\n", 1196 | " u' BRONX', u' BROOKLYN', u' COHOES', u' GREAT NECK',\n", 1197 | " u' HEMPSTEAD', u' HUNTINGTON', u' JEFFERSON VALLEY',\n", 1198 | " u' LEVITTOWN', u' LEVITTOWN ', u' MERRICK', u' MIDDLE VILLAGE',\n", 1199 | " u' STATEN ISLAND', u' WANTAGH', u' WESTBURY',\n", 1200 | " u' WILLISTON PARK', u' WOODMERE', u' ALBANY', u' BALDWIN',\n", 1201 | " u' BROOKLYN', u' CARMEL', u' EAST NASSAU', u' GARDEN CITY',\n", 1202 | " u' GARDEN CITY', u' GREAT NECK', u' GREAT NECK', u' HUNTINGTON',\n", 1203 | " u' JERICHO', u' LIDO BEACH', u' LYNBROOK', u' MASSAPEQUA',\n", 1204 | " u' MASSAPEQUA PARK', u' MERRICK', u' MINEOLA', u' PLAINVIEW',\n", 1205 | " u' ROSLYN', u' SHIRLEY', u' SOUTHOLD', u' STATEN ISLAND',\n", 1206 | " u' STATEN ISLAND ', u' WEST ISLIP', u' WESTBURY', u' WOODBURY',\n", 1207 | " u'ALBANI', u'ALBANNY', u'ALBANY', u'ALBANY ', u'ALBANY ',\n", 1208 | " u'ALBANY ', u'ALBERTSON', u'ALBION', u'AMITYVILLE', u'AUBURN',\n", 1209 | " u'BABYLON', u'BALDWIN', u'BALWIN', u'BARDONIA', u'BAYPORT',\n", 1210 | " u'BAYVILLE', u'BELLEMORE', u'BELLEROSE', u'BELLMORE', u'BETHPAGE',\n", 1211 | " u'BOHEMIA', u'BRENTWOOD', u'BRENTWOOD ', u'BREWSTER',\n", 1212 | " u'BRIDGEPORT', u'BROCKPORT', u'BRONX', u'BROOKLYN', u'BROOKLYN ',\n", 1213 | " u'BROOKLYN ', u'BROOKYN', u'CANADAIGUA', u'CANANDAIGUA',\n", 1214 | " u'CANANDALGUA', u'CAPITAL STATION', u'CARL PLACE', u'CARLE PLACE',\n", 1215 | " u'CARMEL', u'CIRCLEVILLE', u'COHOES', u'COHOES ', u'COMMACK',\n", 1216 | " u'COOPERSTOWN', u'CORONA', u'CROTON-ON-HUDO', u'E ISLIP',\n", 1217 | " u'E. GREENBUSH', u'E. ISLIP', u'E. MEADOW', u'E. NORTHPORT',\n", 1218 | " u'E. NORWICH', u'E. ROCHESTER', u'E. 
ROCKAWAY', u'E. SETAUKET',\n", 1219 | " u'E. WILLISTON', u'EAST', u'EAST GREENBUSH', u'EAST ISLIP',\n", 1220 | " u'EAST MEADOW', u'EAST NORTHPOND', u'EAST NORTHPORT',\n", 1221 | " u'EAST NORWICH', u'EAST ROCKAWAY', u'EAST SETAUKET', u'ELMIRA',\n", 1222 | " u'ELMONT', u'FARMINGDALE', u'FARMINGVILLE', u'FRANKLIN',\n", 1223 | " u'FRANKLIN SQUARE', u'FREEPORT', u'FREEPORT ', u'FRESH MEADOW',\n", 1224 | " u'FT HAMILTON STATE', u'GARDEN CITY', u'GARDEN CITY ',\n", 1225 | " u'GARDEN CITY PARK', u'GLEN COVE', u'GOSHEN', u'GREAT NECK',\n", 1226 | " u'GREAT NECK', u'GREAT NECK', u'GREAT NECK ', u'HAMPTON',\n", 1227 | " u'HAPPAGUE', u'HAPPAUGE', u'HAUPPAUGE', u'HAUPPAUSE', u'HEMPSTEAD',\n", 1228 | " u'HEWLETT', u'HICKSVILLE', u'HOLBROOK', u'HUNTINGTON',\n", 1229 | " u'HUNTINGTON ', u'HUNTINGTON STATION', u'HUNTINGTON STAT',\n", 1230 | " u'HUNTINGTON STATION', u'HUNTINGTONST', u'IRVINGTON-ON-HU',\n", 1231 | " u'ISLAND PARK', u'ISLIP', u'JAMAICA ESTATES', u'JEFFERSON',\n", 1232 | " u'JEFFERSON VALLEY', u'JEFFERSON VALLEY', u'JEFFERSON VALLE',\n", 1233 | " u'JEFFERSON VALLEY', u'JERICHO', u'KINDERHOOK', u'LANCASTER',\n", 1234 | " u'LANCASTER ', u'LAWRENCE', u'LEVITOWN', u'LEVITTOWN',\n", 1235 | " u'LEVITTOWN ', u'LEVITTSTOWN', u'LINDENHURST', u'LINDENHURST ',\n", 1236 | " u'LOCUST VALLEY', u'LOCUST VALLEY', u'LONG BEACH', u'LONG BEACH',\n", 1237 | " u'LONGBEACH', u'LYNBROOK', u'LYNBROOK ', u'LYNBROOK ', u'MALVERNE',\n", 1238 | " u'MANORVILLE', u'MASSAPEAQUA', u'MASSAPEQUA', u'MASSAPEQUA ',\n", 1239 | " u'MASSAPEQUA ', u'MASSAPEQUA PARK', u'MASTIC', u'MEDFORD',\n", 1240 | " u'MELVILLE', u'MERCERVILLE', u'MERICK', u'MERRICK',\n", 1241 | " u'MIDDLE VILLAGE', u'MIDDLE VILLAGE', u'MILL NECK', u'MINEOLA',\n", 1242 | " u'MINEOLA ', u'MT. SINAI', u'N MERRICK', u'N VALLEY',\n", 1243 | " u'N VALLEY STREAM', u'N. BABYLON', u'N. BALDWIN', u'N. MASSAPEQUA',\n", 1244 | " u'N. MERRICK', u'N. VALLEY', u'N. 
VALLEY STREAM', u'N.MASSAPEQUA',\n", 1245 | " u'N.MASSAPEQUA ', u'N.VALLEY STREAM', u'NANUET', u'NESCONSET',\n", 1246 | " u'NEW YORK', u'NEW YORK', u'NEW BALTIMORE', u'NEW HAMPTON',\n", 1247 | " u'NEW HYDE PARK', u'NEW HYDE PK', u'NEW ROCHELLE', u'NEW YORK',\n", 1248 | " u'NEW YORK ', u'NIAGARA FALLS', u'NO. BALDWIN', u'NO. MASSAPEQUA',\n", 1249 | " u'NO. MERRICK', u'NO. VALLEY STREAM', u'NO.VALLEY STREAM',\n", 1250 | " u'NORHTPORT', u'NORTH MASSAPEQUA', u'NORTH MERRICK',\n", 1251 | " u'NORTH VALLEY', u'NORTH VALLEY STREAM', u'NORTHPORT', u'NY',\n", 1252 | " u'NYC', u'OCEANSIDE', u'OLD BETHPAGE', u'OLD BETHPARE',\n", 1253 | " u'OLD WESTBURY', u'ORISKANY', u'OYSTER BAY', u'PATTERSONVILLE',\n", 1254 | " u'PENFIELD', u'PLAINVIEW', u'PORT', u'PORT WASHINGTON',\n", 1255 | " u'PORT JEFFERSON', u'PORT WASHINGTON', u'POUGHKEEPSIE',\n", 1256 | " u'PT. LOOKOUT', u'PT. WASHINGTON', u'PT.WASHINGTON', u'RENSSALAER',\n", 1257 | " u'RENSSELAER', u'RENSSELAU', u'RENSSELEAR', u'RICHMOND HILL',\n", 1258 | " u'RICHMOND HILLS', u'RIVERHEAD', u'ROCHESTER', u'ROCKEVILLE CENTRE',\n", 1259 | " u'ROCKILLE CTR', u'ROCKVILLE', u'ROCKVILLE CENTER',\n", 1260 | " u'ROCKVILLE CENTR', u'ROCKVILLE CENTRE', u'ROCKVILLE CTR',\n", 1261 | " u'RONKONKOMA', u'ROOSEVELT', u'ROSLYN', u'S.MERRICK',\n", 1262 | " u'SANDS POINT', u'SARATOGA SPRING', u'SARATOGA SPRING ',\n", 1263 | " u'SCHENECTADY', u'SEA CLIFF', u'SEACLIFF', u'SEAFORD', u'SEATAUKET',\n", 1264 | " u'SETAUKET', u'SHIRLEY', u'SHIRLEY ', u'SHOREHAM', u'SHOREHARN',\n", 1265 | " u'SMITHTOWN', u'ST JAMES', u'ST. JAMES', u'STATEN',\n", 1266 | " u'STATEN ISLAND', u'STATEN ISLAND', u'STONY POINT', u'SUFFERN',\n", 1267 | " u'SYOSSET', u'SYRACUSE', u'UNIONDALE', u'UPPER NYACK',\n", 1268 | " u'VALLEY STREAM', u'W BABYLON', u'W ISLIP', u'W. BABYLON',\n", 1269 | " u'W. HEMPSTEAD', u'W. 
ISLIP', u'W.BABYLON', u'WADING RIVER',\n", 1270 | " u'WANTAGH', u'WASHINGTONVILLE', u'WEST BABLYON', u'WEST BABYLON',\n", 1271 | " u'WEST BURY', u'WEST ISLIP', u'WEST MERRICK', u'WESTBURY',\n", 1272 | " u'WESTBURY ', u'WESTBUTY', u'WESYBURY', u'WHITESTONE',\n", 1273 | " u'WILLISTIN PK', u'WILLISTON', u'WILLISTON PARK', u'WILLISTON PK',\n", 1274 | " u'WOODBURY', u'WOODMERE', u'YAPHANK', u'YONKERS'], dtype=object)" 1275 | ] 1276 | } 1277 | ], 1278 | "prompt_number": 58 1279 | }, 1280 | { 1281 | "cell_type": "markdown", 1282 | "metadata": {}, 1283 | "source": [ 1284 | "Wow, that's... a mess. The most obvious problem is leading and trailing whitespace. Let's get rid of it with `.str.strip()`:" 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "code", 1289 | "collapsed": false, 1290 | "input": [ 1291 | "cities = df['City'].str.strip().unique()\n", 1292 | "cities.sort()\n", 1293 | "cities" 1294 | ], 1295 | "language": "python", 1296 | "metadata": {}, 1297 | "outputs": [ 1298 | { 1299 | "metadata": {}, 1300 | "output_type": "pyout", 1301 | "prompt_number": 70, 1302 | "text": [ 1303 | "array([nan, u'ALBANI', u'ALBANNY', u'ALBANY', u'ALBERTSON', u'ALBION',\n", 1304 | " u'AMITYVILLE', u'AUBURN', u'BABYLON', u'BALDWIN', u'BALWIN',\n", 1305 | " u'BARDONIA', u'BAYPORT', u'BAYVILLE', u'BELLEMORE', u'BELLEROSE',\n", 1306 | " u'BELLMORE', u'BETHPAGE', u'BOHEMIA', u'BRENTWOOD', u'BREWSTER',\n", 1307 | " u'BRIDGEPORT', u'BROCKPORT', u'BRONX', u'BROOKLYN', u'BROOKYN',\n", 1308 | " u'CANADAIGUA', u'CANANDAIGUA', u'CANANDALGUA', u'CAPITAL STATION',\n", 1309 | " u'CARL PLACE', u'CARLE PLACE', u'CARMEL', u'CIRCLEVILLE', u'COHOES',\n", 1310 | " u'COMMACK', u'COOPERSTOWN', u'CORONA', u'CROTON-ON-HUDO',\n", 1311 | " u'E ISLIP', u'E. GREENBUSH', u'E. ISLIP', u'E. MEADOW',\n", 1312 | " u'E. NORTHPORT', u'E. NORWICH', u'E. ROCHESTER', u'E. ROCKAWAY',\n", 1313 | " u'E. SETAUKET', u'E. 
WILLISTON', u'EAST', u'EAST GREENBUSH',\n", 1314 | " u'EAST ISLIP', u'EAST MEADOW', u'EAST NASSAU', u'EAST NORTHPOND',\n", 1315 | " u'EAST NORTHPORT', u'EAST NORWICH', u'EAST ROCKAWAY',\n", 1316 | " u'EAST SETAUKET', u'ELMIRA', u'ELMONT', u'FARMINGDALE',\n", 1317 | " u'FARMINGVILLE', u'FRANKLIN', u'FRANKLIN SQUARE', u'FREEPORT',\n", 1318 | " u'FRESH MEADOW', u'FT HAMILTON STATE', u'GARDEN CITY',\n", 1319 | " u'GARDEN CITY', u'GARDEN CITY PARK', u'GLEN COVE', u'GOSHEN',\n", 1320 | " u'GREAT NECK', u'GREAT NECK', u'GREAT NECK', u'HAMPTON',\n", 1321 | " u'HAPPAGUE', u'HAPPAUGE', u'HAUPPAUGE', u'HAUPPAUSE', u'HEMPSTEAD',\n", 1322 | " u'HEWLETT', u'HICKSVILLE', u'HOLBROOK', u'HUNTINGTON',\n", 1323 | " u'HUNTINGTON STATION', u'HUNTINGTON STAT', u'HUNTINGTON STATION',\n", 1324 | " u'HUNTINGTONST', u'IRVINGTON-ON-HU', u'ISLAND PARK', u'ISLIP',\n", 1325 | " u'JAMAICA ESTATES', u'JEFFERSON', u'JEFFERSON VALLEY',\n", 1326 | " u'JEFFERSON VALLEY', u'JEFFERSON VALLE', u'JEFFERSON VALLEY',\n", 1327 | " u'JERICHO', u'KINDERHOOK', u'LANCASTER', u'LAWRENCE', u'LEVITOWN',\n", 1328 | " u'LEVITTOWN', u'LEVITTSTOWN', u'LIDO BEACH', u'LINDENHURST',\n", 1329 | " u'LOCUST VALLEY', u'LOCUST VALLEY', u'LONG BEACH', u'LONG BEACH',\n", 1330 | " u'LONGBEACH', u'LYNBROOK', u'MALVERNE', u'MANORVILLE',\n", 1331 | " u'MASSAPEAQUA', u'MASSAPEQUA', u'MASSAPEQUA PARK',\n", 1332 | " u'MASSAPEQUA PARK', u'MASTIC', u'MEDFORD', u'MELVILLE',\n", 1333 | " u'MERCERVILLE', u'MERICK', u'MERRICK', u'MIDDLE VILLAGE',\n", 1334 | " u'MIDDLE VILLAGE', u'MILL NECK', u'MINEOLA', u'MT. SINAI',\n", 1335 | " u'N MERRICK', u'N VALLEY', u'N VALLEY STREAM', u'N. BABYLON',\n", 1336 | " u'N. BALDWIN', u'N. MASSAPEQUA', u'N. MERRICK', u'N. VALLEY',\n", 1337 | " u'N. 
VALLEY STREAM', u'N.MASSAPEQUA', u'N.VALLEY STREAM', u'NANUET',\n", 1338 | " u'NESCONSET', u'NEW YORK', u'NEW YORK', u'NEW BALTIMORE',\n", 1339 | " u'NEW HAMPTON', u'NEW HYDE PARK', u'NEW HYDE PK', u'NEW ROCHELLE',\n", 1340 | " u'NEW YORK', u'NIAGARA FALLS', u'NO. BALDWIN', u'NO. MASSAPEQUA',\n", 1341 | " u'NO. MERRICK', u'NO. VALLEY STREAM', u'NO.VALLEY STREAM',\n", 1342 | " u'NORHTPORT', u'NORTH MASSAPEQUA', u'NORTH MERRICK',\n", 1343 | " u'NORTH VALLEY', u'NORTH VALLEY STREAM', u'NORTHPORT', u'NY',\n", 1344 | " u'NYC', u'OCEANSIDE', u'OLD BETHPAGE', u'OLD BETHPARE',\n", 1345 | " u'OLD WESTBURY', u'ORISKANY', u'OYSTER BAY', u'PATTERSONVILLE',\n", 1346 | " u'PENFIELD', u'PLAINVIEW', u'PORT', u'PORT WASHINGTON',\n", 1347 | " u'PORT JEFFERSON', u'PORT WASHINGTON', u'POUGHKEEPSIE',\n", 1348 | " u'PT. LOOKOUT', u'PT. WASHINGTON', u'PT.WASHINGTON', u'RENSSALAER',\n", 1349 | " u'RENSSELAER', u'RENSSELAU', u'RENSSELEAR', u'RICHMOND HILL',\n", 1350 | " u'RICHMOND HILLS', u'RIVERHEAD', u'ROCHESTER', u'ROCKEVILLE CENTRE',\n", 1351 | " u'ROCKILLE CTR', u'ROCKVILLE', u'ROCKVILLE CENTER',\n", 1352 | " u'ROCKVILLE CENTR', u'ROCKVILLE CENTRE', u'ROCKVILLE CTR',\n", 1353 | " u'RONKONKOMA', u'ROOSEVELT', u'ROSLYN', u'S.MERRICK',\n", 1354 | " u'SANDS POINT', u'SARATOGA SPRING', u'SCHENECTADY', u'SEA CLIFF',\n", 1355 | " u'SEACLIFF', u'SEAFORD', u'SEATAUKET', u'SETAUKET', u'SHIRLEY',\n", 1356 | " u'SHOREHAM', u'SHOREHARN', u'SMITHTOWN', u'SOUTHOLD', u'ST JAMES',\n", 1357 | " u'ST. JAMES', u'STATEN', u'STATEN ISLAND', u'STATEN ISLAND',\n", 1358 | " u'STONY POINT', u'SUFFERN', u'SYOSSET', u'SYRACUSE', u'UNIONDALE',\n", 1359 | " u'UPPER NYACK', u'VALLEY STREAM', u'W BABYLON', u'W ISLIP',\n", 1360 | " u'W. BABYLON', u'W. HEMPSTEAD', u'W. 
ISLIP', u'W.BABYLON',\n", 1361 | " u'WADING RIVER', u'WANTAGH', u'WASHINGTONVILLE', u'WEST BABLYON',\n", 1362 | " u'WEST BABYLON', u'WEST BURY', u'WEST ISLIP', u'WEST MERRICK',\n", 1363 | " u'WESTBURY', u'WESTBUTY', u'WESYBURY', u'WHITESTONE',\n", 1364 | " u'WILLISTIN PK', u'WILLISTON', u'WILLISTON PARK', u'WILLISTON PK',\n", 1365 | " u'WOODBURY', u'WOODMERE', u'YAPHANK', u'YONKERS'], dtype=object)" 1366 | ] 1367 | } 1368 | ], 1369 | "prompt_number": 70 1370 | }, 1371 | { 1372 | "cell_type": "markdown", 1373 | "metadata": {}, 1374 | "source": [ 1375 | "We can work with this, let's reassign all the `strip`ped values in place:" 1376 | ] 1377 | }, 1378 | { 1379 | "cell_type": "code", 1380 | "collapsed": false, 1381 | "input": [ 1382 | "df['City'] = df['City'].str.strip()" 1383 | ], 1384 | "language": "python", 1385 | "metadata": {}, 1386 | "outputs": [], 1387 | "prompt_number": 71 1388 | }, 1389 | { 1390 | "cell_type": "markdown", 1391 | "metadata": {}, 1392 | "source": [ 1393 | "First thing the group decided to tackle was standardizing the name \"EAST.\" It currently is \"E \", \"E.\" and \"EAST\" \n", 1394 | "\n", 1395 | "To make the list more manageable, we can find all cells that start with \"E\":" 1396 | ] 1397 | }, 1398 | { 1399 | "cell_type": "code", 1400 | "collapsed": false, 1401 | "input": [ 1402 | "df[df['City'].str.startswith(\"E\")]" 1403 | ], 1404 | "language": "python", 1405 | "metadata": {}, 1406 | "outputs": [ 1407 | { 1408 | "ename": "ValueError", 1409 | "evalue": "cannot index with vector containing NA / NaN values", 1410 | "output_type": "pyerr", 1411 | "traceback": [ 1412 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", 1413 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m 
\u001b[0mdf\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mdf\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'City'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstr\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstartswith\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"E\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 1414 | "\u001b[0;32m/Users/jeramiaory/miniconda/lib/python2.7/site-packages/pandas/core/frame.pyc\u001b[0m in \u001b[0;36m__getitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 1735\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mSeries\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mndarray\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mIndex\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlist\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1736\u001b[0m \u001b[0;31m# either boolean or fancy integer index\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1737\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_getitem_array\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1738\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mDataFrame\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1739\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_getitem_frame\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 1415 | "\u001b[0;32m/Users/jeramiaory/miniconda/lib/python2.7/site-packages/pandas/core/frame.pyc\u001b[0m in 
\u001b[0;36m_getitem_array\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 1762\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_getitem_array\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1763\u001b[0m \u001b[0;31m# also raises Exception if object array with NA values\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1764\u001b[0;31m \u001b[0;32mif\u001b[0m \u001b[0mcom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_is_bool_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1765\u001b[0m \u001b[0;31m# warning here just in case -- previously __setitem__ was\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1766\u001b[0m \u001b[0;31m# reindexing but __getitem__ was not; it seems more reasonable to\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 1416 | "\u001b[0;32m/Users/jeramiaory/miniconda/lib/python2.7/site-packages/pandas/core/common.pyc\u001b[0m in \u001b[0;36m_is_bool_indexer\u001b[0;34m(key)\u001b[0m\n\u001b[1;32m 2039\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mlib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_bool_array\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2040\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misnull\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0many\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2041\u001b[0;31m raise ValueError('cannot index with vector containing '\n\u001b[0m\u001b[1;32m 2042\u001b[0m 'NA / NaN values')\n\u001b[1;32m 2043\u001b[0m \u001b[0;32mreturn\u001b[0m 
\u001b[0mFalse\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 1417 | "\u001b[0;31mValueError\u001b[0m: cannot index with vector containing NA / NaN values" 1418 | ] 1419 | } 1420 | ], 1421 | "prompt_number": 73 1422 | }, 1423 | { 1424 | "cell_type": "markdown", 1425 | "metadata": {}, 1426 | "source": [ 1427 | "Or... we could, if all of the data were strings. It turns out some of it is missing (of course!), so `str.startswith` returns `NaN` for those rows and the result cannot be used as a boolean index.\n", 1428 | "\n", 1429 | "The group discussed what to do with the missing values and decided to convert them to the string \"UNKNOWN\".\n", 1430 | "\n", 1431 | "Using the `fillna` method, we can do this in place(ish) by assigning the result back to the column:" 1432 | ] 1433 | }, 1434 | { 1435 | "cell_type": "code", 1436 | "collapsed": false, 1437 | "input": [ 1438 | "df['City'] = df['City'].fillna(\"UNKNOWN\")" 1439 | ], 1440 | "language": "python", 1441 | "metadata": {}, 1442 | "outputs": [], 1443 | "prompt_number": 80 1444 | }, 1445 | { 1446 | "cell_type": "code", 1447 | "collapsed": false, 1448 | "input": [ 1449 | "cities = df['City'].unique()\n", 1450 | "cities.sort()" 1451 | ], 1452 | "language": "python", 1453 | "metadata": {}, 1454 | "outputs": [], 1455 | "prompt_number": 81 1456 | }, 1457 | { 1458 | "cell_type": "code", 1459 | "collapsed": false, 1460 | "input": [ 1461 | "cities" 1462 | ], 1463 | "language": "python", 1464 | "metadata": {}, 1465 | "outputs": [ 1466 | { 1467 | "metadata": {}, 1468 | "output_type": "pyout", 1469 | "prompt_number": 82, 1470 | "text": [ 1471 | "array([u'ALBANI', u'ALBANNY', u'ALBANY', u'ALBERTSON', u'ALBION',\n", 1472 | " u'AMITYVILLE', u'AUBURN', u'BABYLON', u'BALDWIN', u'BALWIN',\n", 1473 | " u'BARDONIA', u'BAYPORT', u'BAYVILLE', u'BELLEMORE', u'BELLEROSE',\n", 1474 | " u'BELLMORE', u'BETHPAGE', u'BOHEMIA', u'BRENTWOOD', u'BREWSTER',\n", 1475 | " u'BRIDGEPORT', u'BROCKPORT', u'BRONX', u'BROOKLYN', u'BROOKYN',\n", 1476 | " u'CANADAIGUA', u'CANANDAIGUA', u'CANANDALGUA', u'CAPITAL STATION',\n", 1477 | " u'CARL PLACE', u'CARLE PLACE', u'CARMEL', u'CIRCLEVILLE', 
u'COHOES',\n", 1478 | " u'COMMACK', u'COOPERSTOWN', u'CORONA', u'CROTON-ON-HUDO',\n", 1479 | " u'E ISLIP', u'E. GREENBUSH', u'E. ISLIP', u'E. MEADOW',\n", 1480 | " u'E. NORTHPORT', u'E. NORWICH', u'E. ROCHESTER', u'E. ROCKAWAY',\n", 1481 | " u'E. SETAUKET', u'E. WILLISTON', u'EAST', u'EAST GREENBUSH',\n", 1482 | " u'EAST ISLIP', u'EAST MEADOW', u'EAST NASSAU', u'EAST NORTHPOND',\n", 1483 | " u'EAST NORTHPORT', u'EAST NORWICH', u'EAST ROCKAWAY',\n", 1484 | " u'EAST SETAUKET', u'ELMIRA', u'ELMONT', u'FARMINGDALE',\n", 1485 | " u'FARMINGVILLE', u'FRANKLIN', u'FRANKLIN SQUARE', u'FREEPORT',\n", 1486 | " u'FRESH MEADOW', u'FT HAMILTON STATE', u'GARDEN CITY',\n", 1487 | " u'GARDEN CITY', u'GARDEN CITY PARK', u'GLEN COVE', u'GOSHEN',\n", 1488 | " u'GREAT NECK', u'GREAT NECK', u'GREAT NECK', u'HAMPTON',\n", 1489 | " u'HAPPAGUE', u'HAPPAUGE', u'HAUPPAUGE', u'HAUPPAUSE', u'HEMPSTEAD',\n", 1490 | " u'HEWLETT', u'HICKSVILLE', u'HOLBROOK', u'HUNTINGTON',\n", 1491 | " u'HUNTINGTON STATION', u'HUNTINGTON STAT', u'HUNTINGTON STATION',\n", 1492 | " u'HUNTINGTONST', u'IRVINGTON-ON-HU', u'ISLAND PARK', u'ISLIP',\n", 1493 | " u'JAMAICA ESTATES', u'JEFFERSON', u'JEFFERSON VALLEY',\n", 1494 | " u'JEFFERSON VALLEY', u'JEFFERSON VALLE', u'JEFFERSON VALLEY',\n", 1495 | " u'JERICHO', u'KINDERHOOK', u'LANCASTER', u'LAWRENCE', u'LEVITOWN',\n", 1496 | " u'LEVITTOWN', u'LEVITTSTOWN', u'LIDO BEACH', u'LINDENHURST',\n", 1497 | " u'LOCUST VALLEY', u'LOCUST VALLEY', u'LONG BEACH', u'LONG BEACH',\n", 1498 | " u'LONGBEACH', u'LYNBROOK', u'MALVERNE', u'MANORVILLE',\n", 1499 | " u'MASSAPEAQUA', u'MASSAPEQUA', u'MASSAPEQUA PARK',\n", 1500 | " u'MASSAPEQUA PARK', u'MASTIC', u'MEDFORD', u'MELVILLE',\n", 1501 | " u'MERCERVILLE', u'MERICK', u'MERRICK', u'MIDDLE VILLAGE',\n", 1502 | " u'MIDDLE VILLAGE', u'MILL NECK', u'MINEOLA', u'MT. SINAI',\n", 1503 | " u'N MERRICK', u'N VALLEY', u'N VALLEY STREAM', u'N. BABYLON',\n", 1504 | " u'N. BALDWIN', u'N. MASSAPEQUA', u'N. MERRICK', u'N. VALLEY',\n", 1505 | " u'N. 
VALLEY STREAM', u'N.MASSAPEQUA', u'N.VALLEY STREAM', u'NANUET',\n", 1506 | " u'NESCONSET', u'NEW YORK', u'NEW YORK', u'NEW BALTIMORE',\n", 1507 | " u'NEW HAMPTON', u'NEW HYDE PARK', u'NEW HYDE PK', u'NEW ROCHELLE',\n", 1508 | " u'NEW YORK', u'NIAGARA FALLS', u'NO. BALDWIN', u'NO. MASSAPEQUA',\n", 1509 | " u'NO. MERRICK', u'NO. VALLEY STREAM', u'NO.VALLEY STREAM',\n", 1510 | " u'NORHTPORT', u'NORTH MASSAPEQUA', u'NORTH MERRICK',\n", 1511 | " u'NORTH VALLEY', u'NORTH VALLEY STREAM', u'NORTHPORT', u'NY',\n", 1512 | " u'NYC', u'OCEANSIDE', u'OLD BETHPAGE', u'OLD BETHPARE',\n", 1513 | " u'OLD WESTBURY', u'ORISKANY', u'OYSTER BAY', u'PATTERSONVILLE',\n", 1514 | " u'PENFIELD', u'PLAINVIEW', u'PORT', u'PORT WASHINGTON',\n", 1515 | " u'PORT JEFFERSON', u'PORT WASHINGTON', u'POUGHKEEPSIE',\n", 1516 | " u'PT. LOOKOUT', u'PT. WASHINGTON', u'PT.WASHINGTON', u'RENSSALAER',\n", 1517 | " u'RENSSELAER', u'RENSSELAU', u'RENSSELEAR', u'RICHMOND HILL',\n", 1518 | " u'RICHMOND HILLS', u'RIVERHEAD', u'ROCHESTER', u'ROCKEVILLE CENTRE',\n", 1519 | " u'ROCKILLE CTR', u'ROCKVILLE', u'ROCKVILLE CENTER',\n", 1520 | " u'ROCKVILLE CENTR', u'ROCKVILLE CENTRE', u'ROCKVILLE CTR',\n", 1521 | " u'RONKONKOMA', u'ROOSEVELT', u'ROSLYN', u'S.MERRICK',\n", 1522 | " u'SANDS POINT', u'SARATOGA SPRING', u'SCHENECTADY', u'SEA CLIFF',\n", 1523 | " u'SEACLIFF', u'SEAFORD', u'SEATAUKET', u'SETAUKET', u'SHIRLEY',\n", 1524 | " u'SHOREHAM', u'SHOREHARN', u'SMITHTOWN', u'SOUTHOLD', u'ST JAMES',\n", 1525 | " u'ST. JAMES', u'STATEN', u'STATEN ISLAND', u'STATEN ISLAND',\n", 1526 | " u'STONY POINT', u'SUFFERN', u'SYOSSET', u'SYRACUSE', u'UNIONDALE',\n", 1527 | " 'UNKNOWN', u'UPPER NYACK', u'VALLEY STREAM', u'W BABYLON',\n", 1528 | " u'W ISLIP', u'W. BABYLON', u'W. HEMPSTEAD', u'W. 
ISLIP',\n", 1529 | " u'W.BABYLON', u'WADING RIVER', u'WANTAGH', u'WASHINGTONVILLE',\n", 1530 | " u'WEST BABLYON', u'WEST BABYLON', u'WEST BURY', u'WEST ISLIP',\n", 1531 | " u'WEST MERRICK', u'WESTBURY', u'WESTBUTY', u'WESYBURY',\n", 1532 | " u'WHITESTONE', u'WILLISTIN PK', u'WILLISTON', u'WILLISTON PARK',\n", 1533 | " u'WILLISTON PK', u'WOODBURY', u'WOODMERE', u'YAPHANK', u'YONKERS'], dtype=object)" 1534 | ] 1535 | } 1536 | ], 1537 | "prompt_number": 82 1538 | }, 1539 | { 1540 | "cell_type": "markdown", 1541 | "metadata": {}, 1542 | "source": [ 1543 | "Ok, no more `nan`, yay!\n", 1544 | "\n", 1545 | "Let's try that again:" 1546 | ] 1547 | }, 1548 | { 1549 | "cell_type": "code", 1550 | "collapsed": false, 1551 | "input": [ 1552 | "df['City'].str.startswith('E')" 1553 | ], 1554 | "language": "python", 1555 | "metadata": {}, 1556 | "outputs": [ 1557 | { 1558 | "metadata": {}, 1559 | "output_type": "pyout", 1560 | "prompt_number": 85, 1561 | "text": [ 1562 | "0 False\n", 1563 | "1 False\n", 1564 | "2 False\n", 1565 | "3 False\n", 1566 | "4 False\n", 1567 | "5 False\n", 1568 | "6 False\n", 1569 | "7 False\n", 1570 | "8 False\n", 1571 | "9 False\n", 1572 | "10 False\n", 1573 | "11 False\n", 1574 | "12 False\n", 1575 | "13 False\n", 1576 | "14 False\n", 1577 | "...\n", 1578 | "3266 False\n", 1579 | "3267 False\n", 1580 | "3268 False\n", 1581 | "3269 False\n", 1582 | "3270 False\n", 1583 | "3271 False\n", 1584 | "3272 False\n", 1585 | "3273 False\n", 1586 | "3274 False\n", 1587 | "3275 False\n", 1588 | "3276 False\n", 1589 | "3277 False\n", 1590 | "3278 False\n", 1591 | "3279 False\n", 1592 | "3280 True\n", 1593 | "Name: City, Length: 3281, dtype: bool" 1594 | ] 1595 | } 1596 | ], 1597 | "prompt_number": 85 1598 | }, 1599 | { 1600 | "cell_type": "markdown", 1601 | "metadata": {}, 1602 | "source": [ 1603 | "We can now select the matching rows by feeding that `True`/`False` series back into the dataframe as a boolean mask:" 1604 | ] 1605 | }, 1606 | { 1607 | "cell_type": "code", 1608 | 
"collapsed": false, 1609 | "input": [ 1610 | "df[df['City'].str.startswith(\"E\")]" 1611 | ], 1612 | "language": "python", 1613 | "metadata": {}, 1614 | "outputs": [ 1615 | { 1616 | "html": [ 1617 | "
DATE NAME OF INITIATIVE/ADDRESS City State Zip Code CHECK NO. AMOUNT ($ ) RECORD DATE Party Source
123 04/14/11 BROOKHAVEN DEMOCRATIC COMM;PO BOX 561 EAST NY 11733 1538 500 JUL-07-11 04:21 PM D PBA
362 10/03/01 ENGLEBRIGHT FOR STATE ASSEMBLY;PO BOX 2703 EAST SETAUKET NY 11733 1523 250 OCT-23-01 12:08 PM D SOA
363 02/09/99 ENGLEBRITE FOR STATE ASSEMBLY;P.O. BOX 2703 EAST SETAUKET NY 11733 1250 200 JUL-07-99 09:05 AM D SOA
390 11/07/11 FRIENDS OF CARRIE SOLAGES;1630 DUTCH BROADWAY ELMONT NY 11003 2907 2000 NOV-29-11 12:00 AM D SOA
484 02/12/99 FRIENDS OF ENGELBRIGHT;POB 703 E. SETAUKET NY 11733 NaN 100 JUN-10-99 06:04 PM D DETECTIVES
727 04/04/00 FRIENDS OF STEVE ENGLEBRITE;P.O. BOX 2703 EAST SETAUKET NY 11733 1356 250 JUL-10-00 03:03 PM D SOA
871 11/03/11 NASSAU DEMOCRATIC COMMITTEE;6 NEIGHBOUR WAY EAST NASSAU NY 12062 1000 31659.77 DEC-05-11 12:00 AM D SAFE NASSAU
1128 03/08/10 CITIZENS FOMM TO RE-ELECT KEN LA VALLE;9 BERKS... E. GREENBUSH NY 12061 1405 500 JUL-15-10 10:31 AM R PBA
1150 08/03/99 CITIZENS FOR CIOTTI;P.O. BOX 03068 ELMONT NY 11003 1307 450 AUG-11-99 07:39 PM R SOA
1166 09/21/05 CITIZENS FOR GREG PETERSON;PO BOX 455 EAST MEADOW NY 11554 2171 10000 OCT-07-05 03:02 PM R PBA
1167 10/05/05 CITIZENS FOR GREG PETERSON;PO BOX 455 EAST MEADOW NY 11554 2186 10000 OCT-25-05 11:29 AM R PBA
1250 09/13/10 CITIZENS FOR MONTESANO;25 MEADOWLARK DRIVE E. NORTHPORT NY 11731 1475 250 SEP-30-10 11:02 AM R PBA
1288 06/10/05 CITIZENS FOR PETERSON;PO BOX 455 EAST MEADOW NY 11554 1988 350 JUL-11-05 03:06 PM R DETECTIVES
1289 09/21/05 CITIZENS FOR PETERSON;PO BOX 455 EAST MEADOW NY 11554 2012 1250 OCT-07-05 05:17 PM R DETECTIVES
1290 09/07/05 CITIZENS FOR PETERSON;PO BOX 455 EAST MEADOW NY 11554 2078 1500 SEP-27-05 11:53 AM R SOA
1332 09/15/05 CITIZENS FOR SANTINO;PO BOX 22 EAST ROCKAWAY NY 11515 2160 500 OCT-07-05 02:53 PM R PBA
1333 10/24/01 CITIZENS FOR SANTINO;POB 22 E. ROCKAWAY NY 11518 NaN 100 NOV-26-01 02:28 PM R DETECTIVES
1334 06/01/07 CITIZENS FOR SANTINO;POB 22 E. ROCKAWAY NY 11518 1605 100 JUL-05-07 05:27 PM R DETECTIVES
1335 06/12/09 CITIZENS FOR SANTINO;P.O. BOX 22 E. ROCKAWAY NY 11518 1089 100 JUL-13-09 02:19 PM R DETECTIVES
1336 05/07/00 CITIZENS FOR SANTINO;P.O. BOX 22 EAST ROCKAWAY NY 11518 1371 500 JUL-10-00 03:40 PM R SOA
1338 08/04/05 CITIZENS FOR SANTINO;PO BOX 22 EAST ROCKAWAY NY 11518 2053 200 AUG-08-05 12:34 PM R SOA
1358 09/05/02 COMM. TO ELECT JIM ALESSI;PO BOX 200 E. ROCHESTER NY 14445 1881 200 OCT-03-02 10:22 AM R PBA
1406 05/06/99 COMMITTEE TO RE-ELECT R. GAFFNEY;90 MERRICK AV... EAST MEADOW NY 11554 1613 1000 JUL-22-99 11:26 AM R PBA
1409 09/27/07 COMMITTEE TO REELECT ROBERT SCHMIDT;POB 320 E. NORWICH NY 11732 1635 500 OCT-05-07 05:23 PM R DETECTIVES
1410 09/04/07 COMMITTEE TO RE-ELECT ROBERT SCHMIDT SUPREME C... EAST MEADOW NY 11554 2434 500 OCT-01-07 03:31 PM R SOA
1424 01/14/08 COMMITTEE TO RE-ELECT SENATOR BRUNO;4 SPRUCE RUN EAST GREENBUSH NY 12061 1096 1000 OCT-23-09 01:29 PM R PBA
1438 10/20/00 COUNCILMAN JOE KEARNEY ELECTION COMMITTEE;1199... ELMONT NY 11580 1418 250 OCT-24-00 11:40 AM R SOA
1443 07/20/99 E. ROCKAWAY REPUBLICAN COMM;94 FRANKLIN AVE E. ROCKAWAY NY 11518 1622 100 OCT-08-99 10:14 AM R PBA
1472 09/11/00 FRIENDS FOR ANGIE M. CULLEN;90 MERRICK AVENUE EAST MEADOW NY 11554 1402 125 SEP-19-00 09:18 AM R SOA
1561 04/04/07 FRIENDS FOR NORMA GONSALVES;1901 MERION ST EAST MEADOW NY 11554 2391 500 JUL-16-07 05:32 PM R SOA
.................................
2265 09/23/03 FRIENDS OF NORMA GONSALVES;1901 MERCUM ST. EAST MEADOW NY 11554 1980 1500 OCT-03-03 12:50 PM R PBA
2266 09/02/05 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 2155 500 OCT-07-05 03:09 PM R PBA
2267 09/26/05 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 2177 1988 OCT-07-05 03:09 PM R PBA
2268 04/23/09 FRIENDS OF NORMA GONSALVES;1901 MERION ST EAST MEADOW NY 11554 1265 700 JUL-08-09 02:28 PM R PBA
2269 09/13/10 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 1477 1000 SEP-30-10 11:03 AM R PBA
2270 03/14/05 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 1957 250 JUL-11-05 01:56 PM R DETECTIVES
2271 06/21/05 FRIENDS OF NORMA GONSALVES;1901 MARION STREET EAST MEADOW NY 11554 1994 100 JUL-11-05 03:11 PM R DETECTIVES
2272 04/09/09 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 1065 350 JUL-13-09 02:08 PM R DETECTIVES
2273 08/19/09 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 1109 495 SEP-04-09 12:51 PM R DETECTIVES
2274 09/21/10 FRIENDS OF NORMA GONSALVES;1901 MERION STREET EAST MEADOW NY 11554 1216 250 SEP-22-10 03:15 PM R DETECTIVES
2275 08/11/03 FRIENDS OF NORMA GONSALVES;1901 MERION ST. EAST MEADOW NY 11554 1780 250 SEP-03-03 01:46 PM R SOA
2276 10/09/03 FRIENDS OF NORMA GONSALVES;1901 MERION ST EAST MEADOW NY 11554 1808 1000 OCT-24-03 02:03 PM R SOA
2277 03/21/05 FRIENDS OF NORMA GONSALVES;1901 MERION ST. EAST MEADOW NY 11554 1988 250 JUL-13-05 04:00 PM R SOA
2278 10/13/05 FRIENDS OF NORMA GONSALVES;1901 MERION ST EAST MEADOW NY 11554 2090 2000 OCT-27-05 03:40 PM R SOA
2279 04/30/09 FRIENDS OF NORMA GONSALVES;1901 MERION ST EAST MEADOW NY 11554 2650 500 JUL-13-09 04:45 PM R SOA
2280 09/21/09 FRIENDS OF NORMA GONSALVES;1901 MERION ST EAST MEADOW NY 11554 2700 1000 SEP-24-09 03:30 PM R SOA
2281 08/03/99 FRIENDS OF NORMA GONZALEZ;1901 MERION STREET EAST MEADOW NY 11554 1308 300 AUG-11-99 07:40 PM R SOA
2322 11/02/11 FRIENDS OF PAT MAHER;335 SPRING DRIVE EAST MEADOW NY 11554 1320 2000 DEC-01-11 12:00 AM R DETECTIVES
2323 10/27/11 FRIENDS OF PAT MAHER;335 SPRING DR EAST MEADOW NY 11554 2905 1000 NOV-29-11 12:00 AM R SOA
2371 06/19/07 FRIENDS OF PHIL BOYLE;136 E MAIN STREET E. ISLIP NY 11730 1614 200 AUG-16-07 04:12 PM R DETECTIVES
2372 06/12/00 FRIENDS OF PHIL BOYLE;15 STEWART STREET EAST ISLIP NY 11730 1377 125 JUL-10-00 03:47 PM R SOA
2373 07/13/01 FRIENDS OF PHIL BOYLE;15 STEWART ST E ISLIP NY 11730 1499 400 AUG-09-01 10:07 PM R SOA
2374 03/01/02 FRIENDS OF PHIL BOYLE;15 STEWART ST E ISLIP NY 11730 1564 200 JUL-03-02 10:20 AM R SOA
2375 06/19/07 FRIENDS OF PHIL BOYLE;136 E. MAIN ST. EAST ISLIP NY 11730 2417 200 JUL-16-07 05:35 PM R SOA
2431 09/27/07 FRIENDS OF ROBERT SCHMIDT;PO BOX 320 EAST NORWICH NY 11732 1635 500 SEP-28-07 04:24 PM R DETECTIVES
2464 01/09/12 FRIENDS OF SENATOR JACK MARTIN;192 CLEARMEADOW... EAST MEADOW NY 11554 1325 500 JAN-16-12 12:00 AM R DETECTIVES
2505 02/23/06 FRIENDS OF TOM MCKEVITT;P.O. BOX 455 E. MEADOW NY 11554 2039 500 JUN-27-06 12:52 PM R DETECTIVES
2740 10/03/01 N. HEPMSTEAD REPUBLICAN COMMITTEE;C/O AXELROD ... EAST MEADOW NY 11554 1522 400 OCT-23-01 01:04 PM R SOA
3195 10/20/10 TAXPAYERS FOR LENAHAN;17 EVERDELL ROAD EAST ROCKAWAY NY 11518 1221 100 NOV-29-10 04:58 PM R DETECTIVES
3280 06/30/09 ;235 CARMEN AVE EAST ROCKAWAY NY 11518 1290 $ 600 JUL-08-09 02:59 PM R PBA
\n", 2430 | "

86 rows \u00d7 10 columns

\n", 2431 | "
" 2432 | ], 2433 | "metadata": {}, 2434 | "output_type": "pyout", 2435 | "prompt_number": 86, 2436 | "text": [ 2437 | " DATE NAME OF INITIATIVE/ADDRESS \\\n", 2438 | "123 04/14/11 BROOKHAVEN DEMOCRATIC COMM;PO BOX 561 \n", 2439 | "362 10/03/01 ENGLEBRIGHT FOR STATE ASSEMBLY;PO BOX 2703 \n", 2440 | "363 02/09/99 ENGLEBRITE FOR STATE ASSEMBLY;P.O. BOX 2703 \n", 2441 | "390 11/07/11 FRIENDS OF CARRIE SOLAGES;1630 DUTCH BROADWAY \n", 2442 | "484 02/12/99 FRIENDS OF ENGELBRIGHT;POB 703 \n", 2443 | "727 04/04/00 FRIENDS OF STEVE ENGLEBRITE;P.O. BOX 2703 \n", 2444 | "871 11/03/11 NASSAU DEMOCRATIC COMMITTEE;6 NEIGHBOUR WAY \n", 2445 | "1128 03/08/10 CITIZENS FOMM TO RE-ELECT KEN LA VALLE;9 BERKS... \n", 2446 | "1150 08/03/99 CITIZENS FOR CIOTTI;P.O. BOX 03068 \n", 2447 | "1166 09/21/05 CITIZENS FOR GREG PETERSON;PO BOX 455 \n", 2448 | "1167 10/05/05 CITIZENS FOR GREG PETERSON;PO BOX 455 \n", 2449 | "1250 09/13/10 CITIZENS FOR MONTESANO;25 MEADOWLARK DRIVE \n", 2450 | "1288 06/10/05 CITIZENS FOR PETERSON;PO BOX 455 \n", 2451 | "1289 09/21/05 CITIZENS FOR PETERSON;PO BOX 455 \n", 2452 | "1290 09/07/05 CITIZENS FOR PETERSON;PO BOX 455 \n", 2453 | "1332 09/15/05 CITIZENS FOR SANTINO;PO BOX 22 \n", 2454 | "1333 10/24/01 CITIZENS FOR SANTINO;POB 22 \n", 2455 | "1334 06/01/07 CITIZENS FOR SANTINO;POB 22 \n", 2456 | "1335 06/12/09 CITIZENS FOR SANTINO;P.O. BOX 22 \n", 2457 | "1336 05/07/00 CITIZENS FOR SANTINO;P.O. BOX 22 \n", 2458 | "1338 08/04/05 CITIZENS FOR SANTINO;PO BOX 22 \n", 2459 | "1358 09/05/02 COMM. TO ELECT JIM ALESSI;PO BOX 200 \n", 2460 | "1406 05/06/99 COMMITTEE TO RE-ELECT R. GAFFNEY;90 MERRICK AV... \n", 2461 | "1409 09/27/07 COMMITTEE TO REELECT ROBERT SCHMIDT;POB 320 \n", 2462 | "1410 09/04/07 COMMITTEE TO RE-ELECT ROBERT SCHMIDT SUPREME C... \n", 2463 | "1424 01/14/08 COMMITTEE TO RE-ELECT SENATOR BRUNO;4 SPRUCE RUN \n", 2464 | "1438 10/20/00 COUNCILMAN JOE KEARNEY ELECTION COMMITTEE;1199... \n", 2465 | "1443 07/20/99 E. 
ROCKAWAY REPUBLICAN COMM;94 FRANKLIN AVE \n", 2466 | "1472 09/11/00 FRIENDS FOR ANGIE M. CULLEN;90 MERRICK AVENUE \n", 2467 | "1561 04/04/07 FRIENDS FOR NORMA GONSALVES;1901 MERION ST \n", 2468 | "... ... ... \n", 2469 | "2265 09/23/03 FRIENDS OF NORMA GONSALVES;1901 MERCUM ST. \n", 2470 | "2266 09/02/05 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2471 | "2267 09/26/05 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2472 | "2268 04/23/09 FRIENDS OF NORMA GONSALVES;1901 MERION ST \n", 2473 | "2269 09/13/10 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2474 | "2270 03/14/05 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2475 | "2271 06/21/05 FRIENDS OF NORMA GONSALVES;1901 MARION STREET \n", 2476 | "2272 04/09/09 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2477 | "2273 08/19/09 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2478 | "2274 09/21/10 FRIENDS OF NORMA GONSALVES;1901 MERION STREET \n", 2479 | "2275 08/11/03 FRIENDS OF NORMA GONSALVES;1901 MERION ST. \n", 2480 | "2276 10/09/03 FRIENDS OF NORMA GONSALVES;1901 MERION ST \n", 2481 | "2277 03/21/05 FRIENDS OF NORMA GONSALVES;1901 MERION ST. \n", 2482 | "2278 10/13/05 FRIENDS OF NORMA GONSALVES;1901 MERION ST \n", 2483 | "2279 04/30/09 FRIENDS OF NORMA GONSALVES;1901 MERION ST \n", 2484 | "2280 09/21/09 FRIENDS OF NORMA GONSALVES;1901 MERION ST \n", 2485 | "2281 08/03/99 FRIENDS OF NORMA GONZALEZ;1901 MERION STREET \n", 2486 | "2322 11/02/11 FRIENDS OF PAT MAHER;335 SPRING DRIVE \n", 2487 | "2323 10/27/11 FRIENDS OF PAT MAHER;335 SPRING DR \n", 2488 | "2371 06/19/07 FRIENDS OF PHIL BOYLE;136 E MAIN STREET \n", 2489 | "2372 06/12/00 FRIENDS OF PHIL BOYLE;15 STEWART STREET \n", 2490 | "2373 07/13/01 FRIENDS OF PHIL BOYLE;15 STEWART ST \n", 2491 | "2374 03/01/02 FRIENDS OF PHIL BOYLE;15 STEWART ST \n", 2492 | "2375 06/19/07 FRIENDS OF PHIL BOYLE;136 E. MAIN ST. 
\n", 2493 | "2431 09/27/07 FRIENDS OF ROBERT SCHMIDT;PO BOX 320 \n", 2494 | "2464 01/09/12 FRIENDS OF SENATOR JACK MARTIN;192 CLEARMEADOW... \n", 2495 | "2505 02/23/06 FRIENDS OF TOM MCKEVITT;P.O. BOX 455 \n", 2496 | "2740 10/03/01 N. HEPMSTEAD REPUBLICAN COMMITTEE;C/O AXELROD ... \n", 2497 | "3195 10/20/10 TAXPAYERS FOR LENAHAN;17 EVERDELL ROAD \n", 2498 | "3280 06/30/09 ;235 CARMEN AVE \n", 2499 | "\n", 2500 | " City State Zip Code CHECK NO. AMOUNT ($ ) \\\n", 2501 | "123 EAST NY 11733 1538 500 \n", 2502 | "362 EAST SETAUKET NY 11733 1523 250 \n", 2503 | "363 EAST SETAUKET NY 11733 1250 200 \n", 2504 | "390 ELMONT NY 11003 2907 2000 \n", 2505 | "484 E. SETAUKET NY 11733 NaN 100 \n", 2506 | "727 EAST SETAUKET NY 11733 1356 250 \n", 2507 | "871 EAST NASSAU NY 12062 1000 31659.77 \n", 2508 | "1128 E. GREENBUSH NY 12061 1405 500 \n", 2509 | "1150 ELMONT NY 11003 1307 450 \n", 2510 | "1166 EAST MEADOW NY 11554 2171 10000 \n", 2511 | "1167 EAST MEADOW NY 11554 2186 10000 \n", 2512 | "1250 E. NORTHPORT NY 11731 1475 250 \n", 2513 | "1288 EAST MEADOW NY 11554 1988 350 \n", 2514 | "1289 EAST MEADOW NY 11554 2012 1250 \n", 2515 | "1290 EAST MEADOW NY 11554 2078 1500 \n", 2516 | "1332 EAST ROCKAWAY NY 11515 2160 500 \n", 2517 | "1333 E. ROCKAWAY NY 11518 NaN 100 \n", 2518 | "1334 E. ROCKAWAY NY 11518 1605 100 \n", 2519 | "1335 E. ROCKAWAY NY 11518 1089 100 \n", 2520 | "1336 EAST ROCKAWAY NY 11518 1371 500 \n", 2521 | "1338 EAST ROCKAWAY NY 11518 2053 200 \n", 2522 | "1358 E. ROCHESTER NY 14445 1881 200 \n", 2523 | "1406 EAST MEADOW NY 11554 1613 1000 \n", 2524 | "1409 E. NORWICH NY 11732 1635 500 \n", 2525 | "1410 EAST MEADOW NY 11554 2434 500 \n", 2526 | "1424 EAST GREENBUSH NY 12061 1096 1000 \n", 2527 | "1438 ELMONT NY 11580 1418 250 \n", 2528 | "1443 E. ROCKAWAY NY 11518 1622 100 \n", 2529 | "1472 EAST MEADOW NY 11554 1402 125 \n", 2530 | "1561 EAST MEADOW NY 11554 2391 500 \n", 2531 | "... ... ... ... ... ... 
\n", 2532 | "2265 EAST MEADOW NY 11554 1980 1500 \n", 2533 | "2266 EAST MEADOW NY 11554 2155 500 \n", 2534 | "2267 EAST MEADOW NY 11554 2177 1988 \n", 2535 | "2268 EAST MEADOW NY 11554 1265 700 \n", 2536 | "2269 EAST MEADOW NY 11554 1477 1000 \n", 2537 | "2270 EAST MEADOW NY 11554 1957 250 \n", 2538 | "2271 EAST MEADOW NY 11554 1994 100 \n", 2539 | "2272 EAST MEADOW NY 11554 1065 350 \n", 2540 | "2273 EAST MEADOW NY 11554 1109 495 \n", 2541 | "2274 EAST MEADOW NY 11554 1216 250 \n", 2542 | "2275 EAST MEADOW NY 11554 1780 250 \n", 2543 | "2276 EAST MEADOW NY 11554 1808 1000 \n", 2544 | "2277 EAST MEADOW NY 11554 1988 250 \n", 2545 | "2278 EAST MEADOW NY 11554 2090 2000 \n", 2546 | "2279 EAST MEADOW NY 11554 2650 500 \n", 2547 | "2280 EAST MEADOW NY 11554 2700 1000 \n", 2548 | "2281 EAST MEADOW NY 11554 1308 300 \n", 2549 | "2322 EAST MEADOW NY 11554 1320 2000 \n", 2550 | "2323 EAST MEADOW NY 11554 2905 1000 \n", 2551 | "2371 E. ISLIP NY 11730 1614 200 \n", 2552 | "2372 EAST ISLIP NY 11730 1377 125 \n", 2553 | "2373 E ISLIP NY 11730 1499 400 \n", 2554 | "2374 E ISLIP NY 11730 1564 200 \n", 2555 | "2375 EAST ISLIP NY 11730 2417 200 \n", 2556 | "2431 EAST NORWICH NY 11732 1635 500 \n", 2557 | "2464 EAST MEADOW NY 11554 1325 500 \n", 2558 | "2505 E. 
MEADOW NY 11554 2039 500 \n", 2559 | "2740 EAST MEADOW NY 11554 1522 400 \n", 2560 | "3195 EAST ROCKAWAY NY 11518 1221 100 \n", 2561 | "3280 EAST ROCKAWAY NY 11518 1290 $ 600 \n", 2562 | "\n", 2563 | " RECORD DATE Party Source \n", 2564 | "123 JUL-07-11 04:21 PM D PBA \n", 2565 | "362 OCT-23-01 12:08 PM D SOA \n", 2566 | "363 JUL-07-99 09:05 AM D SOA \n", 2567 | "390 NOV-29-11 12:00 AM D SOA \n", 2568 | "484 JUN-10-99 06:04 PM D DETECTIVES \n", 2569 | "727 JUL-10-00 03:03 PM D SOA \n", 2570 | "871 DEC-05-11 12:00 AM D SAFE NASSAU \n", 2571 | "1128 JUL-15-10 10:31 AM R PBA \n", 2572 | "1150 AUG-11-99 07:39 PM R SOA \n", 2573 | "1166 OCT-07-05 03:02 PM R PBA \n", 2574 | "1167 OCT-25-05 11:29 AM R PBA \n", 2575 | "1250 SEP-30-10 11:02 AM R PBA \n", 2576 | "1288 JUL-11-05 03:06 PM R DETECTIVES \n", 2577 | "1289 OCT-07-05 05:17 PM R DETECTIVES \n", 2578 | "1290 SEP-27-05 11:53 AM R SOA \n", 2579 | "1332 OCT-07-05 02:53 PM R PBA \n", 2580 | "1333 NOV-26-01 02:28 PM R DETECTIVES \n", 2581 | "1334 JUL-05-07 05:27 PM R DETECTIVES \n", 2582 | "1335 JUL-13-09 02:19 PM R DETECTIVES \n", 2583 | "1336 JUL-10-00 03:40 PM R SOA \n", 2584 | "1338 AUG-08-05 12:34 PM R SOA \n", 2585 | "1358 OCT-03-02 10:22 AM R PBA \n", 2586 | "1406 JUL-22-99 11:26 AM R PBA \n", 2587 | "1409 OCT-05-07 05:23 PM R DETECTIVES \n", 2588 | "1410 OCT-01-07 03:31 PM R SOA \n", 2589 | "1424 OCT-23-09 01:29 PM R PBA \n", 2590 | "1438 OCT-24-00 11:40 AM R SOA \n", 2591 | "1443 OCT-08-99 10:14 AM R PBA \n", 2592 | "1472 SEP-19-00 09:18 AM R SOA \n", 2593 | "1561 JUL-16-07 05:32 PM R SOA \n", 2594 | "... ... ... ... 
\n", 2595 | "2265 OCT-03-03 12:50 PM R PBA \n", 2596 | "2266 OCT-07-05 03:09 PM R PBA \n", 2597 | "2267 OCT-07-05 03:09 PM R PBA \n", 2598 | "2268 JUL-08-09 02:28 PM R PBA \n", 2599 | "2269 SEP-30-10 11:03 AM R PBA \n", 2600 | "2270 JUL-11-05 01:56 PM R DETECTIVES \n", 2601 | "2271 JUL-11-05 03:11 PM R DETECTIVES \n", 2602 | "2272 JUL-13-09 02:08 PM R DETECTIVES \n", 2603 | "2273 SEP-04-09 12:51 PM R DETECTIVES \n", 2604 | "2274 SEP-22-10 03:15 PM R DETECTIVES \n", 2605 | "2275 SEP-03-03 01:46 PM R SOA \n", 2606 | "2276 OCT-24-03 02:03 PM R SOA \n", 2607 | "2277 JUL-13-05 04:00 PM R SOA \n", 2608 | "2278 OCT-27-05 03:40 PM R SOA \n", 2609 | "2279 JUL-13-09 04:45 PM R SOA \n", 2610 | "2280 SEP-24-09 03:30 PM R SOA \n", 2611 | "2281 AUG-11-99 07:40 PM R SOA \n", 2612 | "2322 DEC-01-11 12:00 AM R DETECTIVES \n", 2613 | "2323 NOV-29-11 12:00 AM R SOA \n", 2614 | "2371 AUG-16-07 04:12 PM R DETECTIVES \n", 2615 | "2372 JUL-10-00 03:47 PM R SOA \n", 2616 | "2373 AUG-09-01 10:07 PM R SOA \n", 2617 | "2374 JUL-03-02 10:20 AM R SOA \n", 2618 | "2375 JUL-16-07 05:35 PM R SOA \n", 2619 | "2431 SEP-28-07 04:24 PM R DETECTIVES \n", 2620 | "2464 JAN-16-12 12:00 AM R DETECTIVES \n", 2621 | "2505 JUN-27-06 12:52 PM R DETECTIVES \n", 2622 | "2740 OCT-23-01 01:04 PM R SOA \n", 2623 | "3195 NOV-29-10 04:58 PM R DETECTIVES \n", 2624 | "3280 JUL-08-09 02:59 PM R PBA \n", 2625 | "\n", 2626 | "[86 rows x 10 columns]" 2627 | ] 2628 | } 2629 | ], 2630 | "prompt_number": 86 2631 | }, 2632 | { 2633 | "cell_type": "markdown", 2634 | "metadata": {}, 2635 | "source": [ 2636 | "Let's create a new dataframe that only contains the cities that start with \"E\" that we can play with:" 2637 | ] 2638 | }, 2639 | { 2640 | "cell_type": "code", 2641 | "collapsed": false, 2642 | "input": [ 2643 | "ecities = df[df['City'].str.startswith(\"E\")]" 2644 | ], 2645 | "language": "python", 2646 | "metadata": {}, 2647 | "outputs": [], 2648 | "prompt_number": 87 2649 | }, 2650 | { 2651 | "cell_type": "code", 2652 | 
"collapsed": false, 2653 | "input": [ 2654 | "ecities['City'].unique()" 2655 | ], 2656 | "language": "python", 2657 | "metadata": {}, 2658 | "outputs": [ 2659 | { 2660 | "metadata": {}, 2661 | "output_type": "pyout", 2662 | "prompt_number": 89, 2663 | "text": [ 2664 | "array([u'EAST', u'EAST SETAUKET', u'ELMONT', u'E. SETAUKET',\n", 2665 | " u'EAST NASSAU', u'E. GREENBUSH', u'EAST MEADOW', u'E. NORTHPORT',\n", 2666 | " u'EAST ROCKAWAY', u'E. ROCKAWAY', u'E. ROCHESTER', u'E. NORWICH',\n", 2667 | " u'EAST GREENBUSH', u'E. MEADOW', u'ELMIRA', u'EAST NORTHPOND',\n", 2668 | " u'EAST NORTHPORT', u'E. WILLISTON', u'E. ISLIP', u'EAST ISLIP',\n", 2669 | " u'E ISLIP', u'EAST NORWICH'], dtype=object)" 2670 | ] 2671 | } 2672 | ], 2673 | "prompt_number": 89 2674 | }, 2675 | { 2676 | "cell_type": "markdown", 2677 | "metadata": {}, 2678 | "source": [ 2679 | "It's stil petty much a mess. However, using regular expressions, we can replace every \"E \" or \"E. \" with the string \"EAST \"\n", 2680 | "\n", 2681 | "breaking `replace(r\"\\s*E\\.?\\s\", \"EAST \")` down into chunks:\n", 2682 | " \n", 2683 | "```\n", 2684 | "r = use regular expression language\n", 2685 | "\\s = space\n", 2686 | "* = wildcard (anything)\n", 2687 | "\\.? 
= one or no \".\" characters\n", 2688 | "```" 2689 | ] 2690 | }, 2691 | { 2692 | "cell_type": "code", 2693 | "collapsed": false, 2694 | "input": [ 2695 | "ecities['City'].str.replace(r\"\\s*E\\.?\\s\", \"EAST \").unique()" 2696 | ], 2697 | "language": "python", 2698 | "metadata": {}, 2699 | "outputs": [ 2700 | { 2701 | "metadata": {}, 2702 | "output_type": "pyout", 2703 | "prompt_number": 91, 2704 | "text": [ 2705 | "array([u'EAST', u'EAST SETAUKET', u'ELMONT', u'EAST NASSAU',\n", 2706 | " u'EAST GREENBUSH', u'EAST MEADOW', u'EAST NORTHPORT',\n", 2707 | " u'EAST ROCKAWAY', u'EAST ROCHESTER', u'EAST NORWICH', u'ELMIRA',\n", 2708 | " u'EAST NORTHPOND', u'EAST WILLISTON', u'EAST ISLIP'], dtype=object)" 2709 | ] 2710 | } 2711 | ], 2712 | "prompt_number": 91 2713 | }, 2714 | { 2715 | "cell_type": "code", 2716 | "collapsed": false, 2717 | "input": [ 2718 | "df['City'] = df['City'].str.replace(r\"\\s*E\\.?\\s\", \"EAST \")" 2719 | ], 2720 | "language": "python", 2721 | "metadata": {}, 2722 | "outputs": [], 2723 | "prompt_number": 92 2724 | }, 2725 | { 2726 | "cell_type": "code", 2727 | "collapsed": false, 2728 | "input": [ 2729 | "df[df['City'].str.startswith(\"E\")]['City'].unique()" 2730 | ], 2731 | "language": "python", 2732 | "metadata": {}, 2733 | "outputs": [ 2734 | { 2735 | "metadata": {}, 2736 | "output_type": "pyout", 2737 | "prompt_number": 96, 2738 | "text": [ 2739 | "array([u'EAST', u'EAST SETAUKET', u'ELMONT', u'EAST NASSAU',\n", 2740 | " u'EAST GREENBUSH', u'EAST MEADOW', u'EAST NORTHPORT',\n", 2741 | " u'EAST ROCKAWAY', u'EAST ROCHESTER', u'EAST NORWICH', u'ELMIRA',\n", 2742 | " u'EAST NORTHPOND', u'EAST WILLISTON', u'EAST ISLIP'], dtype=object)" 2743 | ] 2744 | } 2745 | ], 2746 | "prompt_number": 96 2747 | }, 2748 | { 2749 | "cell_type": "code", 2750 | "collapsed": false, 2751 | "input": [], 2752 | "language": "python", 2753 | "metadata": {}, 2754 | "outputs": [] 2755 | } 2756 | ], 2757 | "metadata": {} 2758 | } 2759 | ] 2760 | } 
--------------------------------------------------------------------------------