├── .gitignore ├── CONDUCT.md ├── README.md ├── lessons_2022 ├── Before we start.md ├── Exercise 1.md ├── Exercise 2.md ├── README.md ├── species.csv ├── speciesSubset.csv └── surveys.csv ├── practice ├── GitHub.md ├── Week2_lesson1_nameList.md ├── Week2_lesson2_web.md ├── Week2_lesson3_arcpy.md ├── Week3_lesson1_webCensus.md ├── Week3_lesson2_fileList.md ├── Week3_lesson3_arcpy.md ├── Week4_lesson1_renameFromList.md ├── Week4_lesson2_arcpyFieldsDesc.md ├── Week4_lesson3_webAPbbox.md ├── arcpy_excercises │ ├── selection on field values │ │ └── subset_vals.py │ └── toolbox │ │ ├── clip_excercise.docx │ │ └── clip_excercise.py ├── data │ ├── 99_files.zip │ ├── Resources.md │ ├── sinuosity.pyt │ ├── tl_2019_12113_addrfeat.zip │ └── utils.py ├── pyCOP_6_8_2020.md ├── week4_review.md ├── week4_web_demo.md ├── week5_arcpy_cursor_object.md ├── week5_pyt_toolbox.md └── week5_review.md └── slides ├── GED_Python_Workgroup_2018_01_09.pptx ├── GED_Python_Workgroup_2018_01_16.pptx ├── GED_Python_Workgroup_2018_01_23.pptx ├── GED_Python_Workgroup_2018_01_30.pptx ├── GED_Python_Workgroup_2018_02_05.pptx ├── GED_Python_Workgroup_2020_02_20.pptx ├── GED_Python_Workgroup_2020_02_27.pptx ├── GED_Python_Workgroup_2020_03_04.pptx └── GED_Python_Workgroup_2020_03_12.pptx /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | *.Rproj 3 | .Rhistory 4 | .RData 5 | .Ruserdata 6 | *.pyc -------------------------------------------------------------------------------- /CONDUCT.md: -------------------------------------------------------------------------------- 1 | # CONTRIBUTOR CODE OF CONDUCT 2 | 3 | As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities. 
4 | 5 | We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion. 6 | 7 | Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct. 8 | 9 | Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team. 10 | 11 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers. 12 | 13 | This Code of Conduct is adapted from the Contributor Covenant, version 1.0.0, available at https://www.contributor-covenant.org/version/1/0/0/code-of-conduct.html 14 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # py_workgroup 2 | This repository contains materials for the workgroup to use while learning python. 3 | 4 | The purpose of this group is to work together to learn new skills, languages, data science practices, etc. as they relate to our research in environmental science. The primary focus will be the use of python, with exercises in jupyter focused on accessing data and manipulating it in pandas dataframes. However, topics are not limited to these and you're encouraged to bring ideas and tools from other topics or languages. 
5 | 6 | # GitHub and this repository 7 | This repository is public and thus anyone can access the files; however, only collaborators on the repository can add or edit files. I encourage everyone to sign up for a GitHub account; accounts are free, you will be able to take part in discussions on our issues, and you can be added as a collaborator on the repository. 8 | 9 | Lessons and format change from year to year; new lessons are added but old ones 10 | are not deleted. 11 | 12 | | Folder/File | Description | 13 | | ------------- |-------------------------------------------: | 14 | | lessons_2022 | Markdown and notebooks for each week. | 15 | | practice | Contains code, data etc. to apply skills. | 16 | | slides | Contains slides covering each week's lesson. | 17 | | CONDUCT.md | Code of conduct - from Data Carpentry | 18 | | README.md | This document - provides basic information. | 19 | 20 | # Repository Structure 21 | To keep things neat and tidy we will impose some structures and best practices. These should resemble typical practices where possible (e.g. README) so that if you move to repositories maintained by others, at least something should look familiar. 22 | 23 | 24 | # File naming conventions 25 | So that it is easier for us to find materials and keep things looking orderly, we will follow these rules for naming files in the notes and meetings sections: 26 | 27 | 1. Only use lower case (with the exception of the files listed above). 28 | 2. No spaces are allowed in file names. 29 | 3. Start the name with something descriptive, followed by the date as year, month, day separated by underscores. 30 | 4. Use conventions for file extensions (e.g. .md for markdown, .R for R script, .py for python etc.). 31 | 32 | For example, copy_feature_example_2018_01_11.py would be a good file name. 33 | 34 | # Questions 35 | For any questions about the group or this repository, feel free to contact @jbousquin directly, or submit an issue on the repo! 
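The naming rules above can be sketched as a quick automated check. The regular expression below is one interpretation of the convention (the uppercase-extension exception for files like `.R` is handled loosely), not an official part of the repository:

```python
import re

# descriptive_name_YYYY_MM_DD.ext - lowercase, no spaces, date at the end
pattern = re.compile(r"^[a-z0-9_]+_\d{4}_\d{2}_\d{2}\.[a-zA-Z]+$")

print(bool(pattern.match("copy_feature_example_2018_01_11.py")))  # True
print(bool(pattern.match("Copy Feature Example.py")))             # False
```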
36 | -------------------------------------------------------------------------------- /lessons_2022/Before we start.md: -------------------------------------------------------------------------------- 1 | # Before we start 2 | 3 | ## What is Programming and Coding? 4 | 5 | Programming is the process of writing _"programs"_ that a computer can execute and produce some 6 | (useful) output. 7 | Programming is a multi-step process that involves the following steps: 8 | 9 | 1. Identifying the aspects of the real-world problem that can be solved computationally 10 | 2. Identifying (the best) computational solution 11 | 3. Implementing the solution in a specific computer language 12 | 4. Testing, validating, and adjusting implemented solution. 13 | 14 | While _"Programming"_ refers to all of the above steps, 15 | _"Coding"_ refers to step 3 only: _"Implementing the solution in a specific computer language"_. It's 16 | important to note that _"the best"_ computational solution must consider factors beyond the computer. 17 | Who is using the program, what resources/funds does your team have for this project, and the available 18 | timeline all shape and mold what "best" may be. 19 | 20 | ## Why py? 21 | * Efficient (loop friendly) 22 | * Versatile (C, but can integrate C#, .net, JS, ObjectiveC etc.) 23 | * Libraries/packages by domain experts (e.g. scipy) 24 | * Readable 25 | * Fun Monty python references 26 | 27 | ## R vs python 28 | * coursera [Python or R for Data Analysis: Which Should I Learn?](https://www.coursera.org/articles/python-or-r-for-data-analysis) 29 | * datacamp [Choosing Python or R for Data Analysis? An Infographic](https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis) 30 | * rstudio [R vs. 
Python: What’s the best language for Data Science?](https://blog.rstudio.com/2019/12/17/r-vs-python-what-s-the-best-for-language-for-data-science/) 31 | * Running python in [Rstudio](https://www.rstudio.com/solutions/r-and-python/) (R notebooks, Quarto, reticulate) 32 | * Run either in jupyter with the proper kernel 33 | 34 | ## [Zen of Python](https://peps.python.org/pep-0020/#the-zen-of-python) 35 | 36 | ## Tools 37 | Text editor – where your script is written 38 | * IDLE (in python install) 39 | * Notepad 40 | * Notepad++ 41 | 42 | Console - where code is run 43 | 44 | * IDLE Shell 45 | * Command line 46 | 47 | Integrated Development Environment (IDE) – combines these plus more 48 | 49 | * Visual Studio (multi-language) 50 | * Rstudio (R, some Python) 51 | * [Spyder](https://www.spyder-ide.org) (similar to RStudio/MATLAB) 52 | * PyCharm (good for web dev, Git, support for JS, HTML/CSS etc.) 53 | * Jupyter notebooks (cloud-based, great for instruction and web) 54 | 55 | For further info, see recent lightning talks on IDEs (EPA Internal) 56 | 57 | ## Environments 58 | ![xkcd](https://imgs.xkcd.com/comics/python_environment.png) 59 | 60 | Great in concept - they allow you to switch between versions and packages 61 | * [Anaconda distribution](https://www.anaconda.com/products/distribution) (comes with many scientific packages) 62 | * Conda environment management (part of anaconda & ArcGIS Pro) 63 | * Virtual environments (usually set up through pip install from [PyPI](https://pypi.org/)) 64 | 65 | ## How to learn more after the workshop? 66 | 67 | The material we cover during this workshop will give you an initial taste of how you can use Python 68 | to analyze data for your own research. However, you will need to learn more to do advanced 69 | operations such as cleaning your dataset, using statistical methods, or creating beautiful graphics. 
The best way to become proficient and efficient at python, as with any other tool, is to use it to 71 | address your actual research questions. As a beginner, it can feel daunting to have to write a 72 | script from scratch, and given that many people make their code available online, modifying existing 73 | code to suit your purpose might make it easier for you to get started. 74 | 75 | ## Seeking help 76 | 77 | * check under the _Help_ menu 78 | * type `help()` 79 | * type `help(object)` (or `?object` in IPython/Jupyter) to get information about an object 80 | * [Python documentation][python-docs] 81 | * [Pandas documentation][pandas-docs] 82 | 83 | Finally, a generic Google or internet search "Python task" will often either send you to the 84 | appropriate module documentation or a helpful forum where someone else has already asked your 85 | question. 86 | 87 | _I am stuck... I get an error message that I don't understand._ 88 | Start by googling the error message. However, this doesn't always work very well, because often, 89 | package developers rely on the error catching provided by Python. You end up with general error 90 | messages that might not be very helpful to diagnose a problem (e.g. "subscript out of bounds"). If 91 | the message is very generic, you might also include the name of the function or package you're using 92 | in your query. 93 | 94 | You should also check Stack Overflow, searching with the `[python]` tag. Most questions have already 95 | been answered, but the challenge is to use the right words in the search to find the answers. 96 | 97 | 98 | ### Asking for help 99 | 100 | The key to receiving help from someone is for them to rapidly grasp your problem. You should make it 101 | as easy as possible to pinpoint where the issue might be. 102 | 103 | Try to use the correct words to describe your problem. For instance, a package is not the same thing 104 | as a library. 
Most people will understand what you meant, but others have really strong feelings 105 | about the difference in meaning. The key point is that it can make things confusing for people 106 | trying to help you. Be as precise as possible when describing your problem. 107 | 108 | If possible, try to reduce what doesn’t work to a simple reproducible example. If you can reproduce 109 | the problem using a very small data frame instead of your 50,000 rows and 10,000 columns one, 110 | provide the small one with the description of your problem. When appropriate, try to generalize what 111 | you are doing so even people who are not in your field can understand the question. For instance, 112 | instead of using a subset of your real dataset, create a small (3 columns, 5 rows) generic one. 113 | 114 | ### Where to ask for help? 115 | 116 | * The person sitting next to you during the workshop. Don’t hesitate to talk to your neighbor during 117 | the workshop, compare your answers, and ask for help. You might also be interested in organizing 118 | regular meetings following the workshop to keep learning from each other. 119 | * Your friendly colleagues: if you know someone with more experience than you, they might be able and 120 | willing to help you. 121 | * [Stack Overflow][so-python]: if your question hasn’t been answered before and is well crafted, 122 | chances are you will get an answer in less than 5 min. Remember to follow their guidelines on how to 123 | ask a good question. 
124 | * [Python mailing lists][python-mailing-lists] 125 | 126 | ## More resources 127 | 128 | - [PyPI - the Python Package Index][pypi] 129 | 130 | - [The Hitchhiker's Guide to Python][python-guide] 131 | 132 | - [Dive into Python 3][dive-into-python3] 133 | 134 | 135 | [anaconda]: https://www.anaconda.com 136 | [anaconda-community]: https://www.anaconda.com/community 137 | [dive-into-python3]: https://finderiko.com/python-book 138 | [pandas-docs]: https://pandas.pydata.org/pandas-docs/stable/ 139 | [pypi]: https://pypi.org/ 140 | [python-docs]: https://www.python.org/doc 141 | [python-guide]: https://docs.python-guide.org 142 | [python-mailing-lists]: https://www.python.org/community/lists 143 | [stack-overflow]: https://stackoverflow.com 144 | [so-python]: https://stackoverflow.com/questions/tagged/python?tab=Votes 145 | 146 | -------------------------------------------------------------------------------- /lessons_2022/Exercise 1.md: -------------------------------------------------------------------------------- 1 | ## Goals 2 | 3 | In this exercise we will make a file accessible to our jupyter notebook, read that file into a dataframe, and then manipulate the data in the dataframe. 4 | 5 | Step 1: Create a new notebook and save it as 'Exercise_1' 6 | Step 2: Get a local copy of surveys.csv and add it to the notebook files 7 | Step 3: Make your first block markdown and add a title 8 | Step 4: In the next code block import the pandas library 9 | Step 5: In the next code block add a comment describing where the file came from and set a new variable to the csv file location 10 | Step 6: Read the csv file to a dataframe 11 | 12 | ### Questions (Create a markdown header cell for each and add code to solve) 13 | 14 | 1) Choose a method to view a sample of the dataframe (e.g., first 5 rows, etc.) 15 | 2) Create a list for the column names 16 | 3) What data type does the 'plot_id' col contain? 17 | 4) What is the value of the 'plot_id' col in the first row? 
The value of 'species_id' in the same row? 18 | 5) What are the unique values for the 'plot_id' col? 19 | 6) Create a new column with the same data as 'plot_id' and 'species_id' combined and separated by '_' 20 | 7) What is the value for the new column at index 6? 21 | 8) Create a new dataframe without the 'record_id' col 22 | 9) Create a new dataframe without any values of 'F' in the 'sex' col (using conditionals) 23 | 10) Aggregate rows based on the 'year' col 24 | 25 | Assignment to work on: 26 | * Create a dictionary where the keys are the unique values of the 'plot_id' col and the values are the count of instances of each value 27 | * Add a new key to the dictionary (make it sequential) and give it a value of 0 28 | -------------------------------------------------------------------------------- /lessons_2022/Exercise 2.md: -------------------------------------------------------------------------------- 1 | # Concatenating DataFrames 2 | 3 | Step 1: Create a subset of rows 4-9 of the surveys table 4 | Step 2: Grab the second-to-last rows 5 | Step 3: Reset the index values so the second dataframe appends properly 6 | Note: the drop=True arg avoids adding a new index column with the old index values 7 | Step 4: Stack the DataFrames on top of each other 8 | 9 | Step 1: Create a subset for rows and columns 10 | Step 2: Create a subset for the same rows and different columns 11 | Step 3: Place the DataFrames side by side 12 | 13 | # Save dataframe as csv and download to local machine 14 | 15 | 16 | # Joining DataFrames 17 | 18 | Step 1: Read in the first 10 lines of the surveys table 19 | Step 2: Import a small subset of the species data (speciesSubset.csv) designed for this part of the lesson. 20 | Note: In this example, species_sub is the lookup table containing genus, species, and taxa names that we want to join with the data in survey_sub to produce a new DataFrame that contains all of the columns from both species_sub and survey_sub. 
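The join described in the note can be sketched with tiny stand-in frames. The column names follow the lesson data, but the rows below are made up for illustration:

```python
import pandas as pd

# Tiny stand-in frames (made-up rows; real data comes from the csv files)
survey_sub = pd.DataFrame({"record_id": [1, 2, 3],
                           "species_id": ["DM", "NL", "PE"]})
species_sub = pd.DataFrame({"species_id": ["DM", "PE"],
                            "genus": ["Dipodomys", "Peromyscus"],
                            "taxa": ["Rodent", "Rodent"]})

# Left join: keep every row of survey_sub; unmatched rows get NaN
merged = survey_sub.merge(species_sub, on="species_id", how="left")

# Inner join: keep only rows whose species_id appears in both frames
inner = survey_sub.merge(species_sub, on="species_id", how="inner")
```

Here `merged` keeps all three survey rows (with NaN genus/taxa for "NL"), while `inner` drops the unmatched row.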
21 | Step 3: Identify join keys 22 | 23 | Step 4: Attempt a left join 24 | Step 5: Pick another join type 25 | 26 | # Inner Join .merge() 27 | -------------------------------------------------------------------------------- /lessons_2022/README.md: -------------------------------------------------------------------------------- 1 | ## Overview 2 | 3 | For 2022 I decided to update materials for exercises based on [datacarpentry's Ecology Curriculum](https://github.com/datacarpentry/python-ecology-lesson/). As a result, these are a combination of markdown files and ipython notebooks (shared on [geoplatform](https://epa.maps.arcgis.com/home/index.html)). This felt like an improvement over pptx downloads; feedback is welcome as we go. 4 | 5 | I will try to post recordings to the network drive (L:\Public\jbousqui\Resources\Python) 6 | 7 | ## Agendas 8 | 9 | -Lesson 0- 10 | 11 | - Before we start (why, setup etc.) 12 | - Demo: IDEs 13 | - Exercise 0: familiarize yourself with jupyter notebooks 14 | 15 | -Lesson 1- 16 | 17 | - Built-in data types 18 | - Intro to conditionals and control flow (True/False; if/else) 19 | - Lists 20 | - Tuples 21 | 22 | -Lesson 2- 23 | 24 | - For loops 25 | - Dictionaries 26 | - Looping over dictionaries 27 | - Functions 28 | - Working with local files (IDLE demo, notebook demo) 29 | - Packages/libraries 30 | - os 31 | - pandas 32 | - Reading in data 33 | - Parts of a dataframe (getting a col, datatypes) 34 | - Dataframe summary stats 35 | - Dataframe groupby 36 | 37 | -Practice Exercise 1- 38 | 39 | -Lesson 3- 40 | 41 | - NaN 42 | - Getting specific rows/columns/data points 43 | - Writing data 44 | - Append/Merge/Join 45 | 46 | -Practice Exercise 1- 47 | 48 | -Lesson 4- 49 | - Revisiting loops and functions 50 | - Visualization 51 | - Putting it all together 52 | 53 | -Practice Exercise 2- 54 | 55 | -Special Topics- 56 | 57 | - GIS vector 58 | - GIS Raster 59 | - GIS arcpy 60 | - GIS .pyt 61 | - APIs & Web scraping 62 | - pytest 63 | - 
conda/PyPI/env 64 | - Hydrology 65 | -------------------------------------------------------------------------------- /lessons_2022/species.csv: -------------------------------------------------------------------------------- 1 | species_id,genus,species,taxa 2 | "AB","Amphispiza","bilineata","Bird" 3 | "AH","Ammospermophilus","harrisi","Rodent-not censused" 4 | "AS","Ammodramus","savannarum","Bird" 5 | "BA","Baiomys","taylori","Rodent" 6 | "CB","Campylorhynchus","brunneicapillus","Bird" 7 | "CM","Calamospiza","melanocorys","Bird" 8 | "CQ","Callipepla","squamata","Bird" 9 | "CS","Crotalus","scutalatus","Reptile" 10 | "CT","Cnemidophorus","tigris","Reptile" 11 | "CU","Cnemidophorus","uniparens","Reptile" 12 | "CV","Crotalus","viridis","Reptile" 13 | "DM","Dipodomys","merriami","Rodent" 14 | "DO","Dipodomys","ordii","Rodent" 15 | "DS","Dipodomys","spectabilis","Rodent" 16 | "DX","Dipodomys","sp.","Rodent" 17 | "EO","Eumeces","obsoletus","Reptile" 18 | "GS","Gambelia","silus","Reptile" 19 | "NA","Neotoma","albigula","Rodent" 20 | "NX","Neotoma","sp.","Rodent" 21 | "OL","Onychomys","leucogaster","Rodent" 22 | "OT","Onychomys","torridus","Rodent" 23 | "OX","Onychomys","sp.","Rodent" 24 | "PB","Chaetodipus","baileyi","Rodent" 25 | "PC","Pipilo","chlorurus","Bird" 26 | "PE","Peromyscus","eremicus","Rodent" 27 | "PF","Perognathus","flavus","Rodent" 28 | "PG","Pooecetes","gramineus","Bird" 29 | "PH","Perognathus","hispidus","Rodent" 30 | "PI","Chaetodipus","intermedius","Rodent" 31 | "PL","Peromyscus","leucopus","Rodent" 32 | "PM","Peromyscus","maniculatus","Rodent" 33 | "PP","Chaetodipus","penicillatus","Rodent" 34 | "PU","Pipilo","fuscus","Bird" 35 | "PX","Chaetodipus","sp.","Rodent" 36 | "RF","Reithrodontomys","fulvescens","Rodent" 37 | "RM","Reithrodontomys","megalotis","Rodent" 38 | "RO","Reithrodontomys","montanus","Rodent" 39 | "RX","Reithrodontomys","sp.","Rodent" 40 | "SA","Sylvilagus","audubonii","Rabbit" 41 | "SB","Spizella","breweri","Bird" 42 | 
"SC","Sceloporus","clarki","Reptile" 43 | "SF","Sigmodon","fulviventer","Rodent" 44 | "SH","Sigmodon","hispidus","Rodent" 45 | "SO","Sigmodon","ochrognathus","Rodent" 46 | "SS","Spermophilus","spilosoma","Rodent-not censused" 47 | "ST","Spermophilus","tereticaudus","Rodent-not censused" 48 | "SU","Sceloporus","undulatus","Reptile" 49 | "SX","Sigmodon","sp.","Rodent" 50 | "UL","Lizard","sp.","Reptile" 51 | "UP","Pipilo","sp.","Bird" 52 | "UR","Rodent","sp.","Rodent" 53 | "US","Sparrow","sp.","Bird" 54 | "XX",,,"Zero Trapping Success" 55 | "ZL","Zonotrichia","leucophrys","Bird" 56 | "ZM","Zenaida","macroura","Bird" 57 | -------------------------------------------------------------------------------- /lessons_2022/speciesSubset.csv: -------------------------------------------------------------------------------- 1 | "species_id","genus","species","taxa" 2 | "DM","Dipodomys","merriami","Rodent" 3 | "NL","Neotoma","albigula","Rodent" 4 | "PE","Peromyscus","eremicus","Rodent" 5 | -------------------------------------------------------------------------------- /practice/GitHub.md: -------------------------------------------------------------------------------- 1 | # Why do version control? 2 | https://phdcomics.com/comics/archive.php?comicid=1531 3 | 4 | Think of version control like tracked changes in a word document in a shared folder... just multiplied across serveral revisions of that document and, well, better. 5 | Every time you commit changes it creates a log of what it was before, after and your commit comments describing what you did or why. If tomorrow you realize your code doesn't work anymore you can easily restore all or part of a previous version until it works again. 6 | 7 | Here are my top reasons for using version control: 8 | 1. Store versions properly (saves every commit) 9 | 2. Transparent (everyone can see every change you ever made) 10 | 3. 
Collaboration (I update what is on my home machine based on commits I made at work today, and unless I commit changes a new version isn't created. Anyone else can suggest code changes or identify issues.) 11 | 12 | # GitHub 13 | There is an official [USEPA GitHub account](https://github.com/USEPA). Before posting there, review their [guidelines](https://www.epa.gov/webguide/github-guidance) 14 | 15 | My short version would be: 16 | 1. Don't put anything up in a public repository that isn't already reviewed and public. 17 | 2. Even private repositories shouldn't contain protected or proprietary code. 18 | 3. Remember it saves every version, so deleting something later isn't an option (e.g. passwords, keys, etc.). 19 | 4. Use it for code, not documents. 20 | 5. Not everything is appropriate for the EPA account; a personal repository or a local repository (i.e. only accessed locally) may be a better fit for your project. 21 | 22 | # Integration with IDE: 23 | Setting up GitHub repositories in [RStudio](https://support.rstudio.com/hc/en-us/articles/200532077?version=1.1.383&mode=desktop) 24 | 25 | Setting up GitHub Projects in [spyder](https://pythonhosted.org/spyder/projects.html) 26 | -------------------------------------------------------------------------------- /practice/Week2_lesson1_nameList.md: -------------------------------------------------------------------------------- 1 | Often data will require some 'cleaning' before it can be used in analysis. This lesson will implement some of the python basics learned so far to prepare a mock dataset for further analysis. 2 | 3 | # Script layout 4 | 5 | All python scripts follow the same general structure when read from top to bottom (with space in between each section): 6 | 1. Documentation (commented text) describing what the script does and who authored it 7 | 2. Import any required modules 8 | 3. Define any functions that will be used in the script 9 | 4. 
Define input variables (this is not strict, but it helps to see everything at once up front) 10 | 11 | Create a python script and add some descriptive text about what the file does using comments: 12 | 13 | # This file takes a string of station names and creates a normalized list of those names for further analysis 14 | # Author 15 | 16 | Next, import any modules you plan to use. We don't actually need any for this exercise, but for practice try: 17 | 18 | import os 19 | 20 | We won't use any functions yet, so the next step is to define input variables. Define the string variable you'll be working with: 21 | 22 | names = "Fifth Ave (boat launch)\n Little Sandy Pond\n skinequit pond\n Bearses Pond\n Picture Lake (Flax Pond)\n Sand Pond,\n WHITE POND\n Dennis Pond\n Buck's Pond\n Hinkley's Pond\n Flax Pond (Yarmouth/Dennis)\n Long Pond-Long Pond Drive\n Long Pond-Cahoon St.\n Long Pond\n Upper Mill Pon\n Queen Sewell Pond\n Gull Pond - Gull Pond Landing\n Gull Pond (2) - Steele Rd.\n Tides Hotel\n Long Pond - Indian\n PARKERS RIVER SPORTFISHING PIER\n BASS RIVER(UNC. FREEMAN'S LANDING)\n BAKER'S POND\n WEQUASSETT INN\n" 23 | 24 | Now we can experiment with some of the string methods available for use on the variable names because it is a string datatype. If you aren't sure what the dot notation methods are, you can either google the documentation or, depending on your IDE, it may make suggestions just after typing: 25 | 26 | names. 27 | 28 | First let's address the '\n' between every beach name. Often this will come up when a text file is read in, as it denotes a new line for each item. In this case it suggests each name had its own line in the original file. 
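As an aside (not part of the lesson itself), splitting the string on '\n' is one quick way to see the individual entries; note the leading spaces and the trailing empty entry:

```python
# Shortened sample of the names string from above
names = "Fifth Ave (boat launch)\n Little Sandy Pond\n skinequit pond\n"

entries = names.split('\n')
print(entries)
# ['Fifth Ave (boat launch)', ' Little Sandy Pond', ' skinequit pond', '']
```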
We could use the .strip() method to remove the '\n' at the end of the string (note that .strip() only removes characters from the beginning and end of a string, not the middle): 29 | 30 | names.strip('\n') 31 | 32 | It doesn't actually change our variable in-place though; if we print names it still has the '\n': 33 | 34 | print(names) 35 | 36 | To change the variable using this string method it must be set to the result (this is just for demonstration, don't do it): 37 | 38 | names = names.strip('\n') 39 | 40 | As an alternative to .strip() we could also use the .replace() method to remove occurrences of '\n' and replace them with an empty string: 41 | 42 | names.replace('\n', '') 43 | 44 | Unlike .strip(), .replace() removes every occurrence, not just those at the ends. But now without '\n' we can't tell which name is which. Instead of separating entries so that each has its own line, a common way to separate entries is using a comma. You may be familiar with Comma Separated Values (CSV) as a file type. To do this: 45 | 46 | names.replace('\n', ',') 47 | 48 | Play around with some of the other string methods available using the dot notation. For example, see if you can find one to make all the names lower case. 49 | 50 | You may notice there is now a ',' at the end of the string. One easy way to remove characters from the beginning or end of a string is to use the index, i.e. the place in the string where the characters we want are. We could count the number of characters in the string, or with python determine the length in number of characters using len(): 51 | 52 | len(names) 53 | 54 | Now we know we just want the first 480 characters, to not include that last ','. To do this we get every character up to 480 using the index: 55 | 56 | names[:480] 57 | 58 | Again you'll notice we waited to set the variable to the result so that we can experiment with the index some more. For example, what happens if we do: 59 | 60 | names[480] 61 | names[-1] 62 | 63 | Can you think of a way to get the same result as names[:480] without knowing the length of the string? 
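Putting the pieces together, one possible cleanup is sketched below using a shortened sample of the names string (the full string from the lesson works the same way). The last line also answers the closing question: a negative index drops the final character without knowing the string's length:

```python
# Shortened sample of the names string
names = "Fifth Ave (boat launch)\n Little Sandy Pond\n skinequit pond\n"

names = names.replace('\n', ',')  # newlines become comma separators
names = names.lower()             # make all the names lower case
names = names[:-1]                # drop trailing ',' without knowing len(names)
print(names)
# fifth ave (boat launch), little sandy pond, skinequit pond
```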
64 | -------------------------------------------------------------------------------- /practice/Week2_lesson2_web.md: -------------------------------------------------------------------------------- 1 | If you're able to download something using the web, you should be able to do it with python. 2 | 3 | # Create url to retrieve 4 | 5 | Let's say we wanted to get a shapefile for Santa Rosa county. 6 | 1. Google "santa rosa county census shapefile" and follow the first [hit](https://catalog.data.gov/dataset/tiger-line-shapefile-2019-county-santa-rosa-county-fl-address-range-feature-county-based) 7 | 2. Scroll down on the site to find the first download button and copy its link address. It should be: 8 | https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_12113_addrfeat.zip 9 | 10 | Now open your python shell. First, add some descriptive text about what the file does using comments: 11 | 12 | # This file downloads a shapefile for Santa Rosa County 13 | # Author 14 | 15 | Next, we need to declare a variable for the link: 16 | 17 | url = "https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_12113_addrfeat.zip" 18 | 19 | Typing a variable (e.g. url) into the shell will print the evaluation of that variable. 20 | 21 | url 22 | > 'https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_12113_addrfeat.zip' 23 | 24 | The quotation marks, `''` or `""`, on either side denote that it is a string type variable. 25 | The name of the variable, `url`, doesn't matter, as long as it is referenced consistently. 26 | 27 | # Create variable to save download as 28 | Next we need a variable to tell python where to put our download, both the directory and the filename. 29 | 30 | filePath = "C:\Users\\Desktop" 31 | 32 | Typing this variable into the shell prints the `\` as `\\` because backslashes are escape characters. 
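A quick illustration of why escape characters matter (a sketch; note that in Python 3 some sequences, like the `\U` in `"C:\Users"`, are outright syntax errors in a non-raw string, which is another reason to prefer raw strings for Windows paths):

```python
s = "C:\new_folder"    # the \n becomes a newline character
r = r"C:\new_folder"   # raw string: the backslash is kept literally

print("\n" in s)        # True
print("\n" in r)        # False
print(len(r) - len(s))  # 1 - the raw string keeps both characters
```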
How python interprets backslashes can get complex and depends on the character after the backslash and even your operating system ([details](https://pythonconquerstheuniverse.wordpress.com/2008/06/04/gotcha-%E2%80%94-backslashes-are-escape-characters/)). 33 | One way to avoid confusion is to denote the string as raw: 34 | 35 | filePath = r"C:\Users\\Desktop" 36 | 37 | Note that setting a variable a second time replaces the original value of that variable. 38 | 39 | Next, create your desired filename as a new variable: 40 | 41 | fileName = "SantaRosaCounty.zip" 42 | 43 | String variables can be added on to the end of other string variables using `+`. Set the full file name including the directory: 44 | 45 | fullFileName = filePath + fileName 46 | 47 | When you print the value of the new variable we can see we left out a backslash separator. We know from above that Windows uses `\\`: 48 | 49 | fullFileName = filePath + "\\" + fileName 50 | 51 | # Download file 52 | Python uses modules to add functionality that other people have written code for. 53 | Each module is basically a script with functions inside that take specified variables to do something. We will revisit functions and modules in more detail later. 54 | A module can be imported using `import`: 55 | 56 | ```python 57 | import this 58 | import os 59 | import arcpy 60 | import urllib 61 | import urllib.request 62 | ``` 63 | 64 | The os module gives us operating-system-based functions. When we were declaring our fullFileName, if we didn't know the expected separator was `"\\"` we could have used the `sep` attribute in the os module: 65 | 66 | fullFileName = filePath + os.sep + fileName 67 | 68 | The arcpy module allows us to use ArcGIS functionality outside of ArcGIS Desktop. In the python window of arcMap this module is already imported. 69 | 70 | The urllib module is one of the packages for using URLs in Python 2. There are several that can be used depending on your specific needs and install. 
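(An aside on building paths: rather than concatenating with os.sep by hand, `os.path.join` inserts the correct separator for the current operating system. A minimal sketch; `"some_folder"` here is a hypothetical directory name, not a path from the lesson:)

```python
import os

filePath = "some_folder"          # hypothetical directory
fileName = "SantaRosaCounty.zip"

fullFileName = os.path.join(filePath, fileName)
print(fullFileName == filePath + os.sep + fileName)  # True
```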
71 | Once urllib is imported the functions inside can be accessed using the module they are in and dot notation: 72 | 73 | ```python 74 | import urllib 75 | urllib.urlretrieve() 76 | ``` 77 | However, in Python 3.x, the urllib module has been split into separate modules. The equivalent to urlretrieve() in Python 3 is: 78 | 79 | ```python 80 | import urllib.request 81 | urllib.request.urlretrieve() 82 | ``` 83 | 84 | The function urlretrieve() performs some function on the variables we put in (). When a function isn't given the variables (aka arguments) that it expects you should get an error. The urlretrieve() function expects at least 1 argument and so you get an error: 85 | 86 | >```python 87 | >Traceback (most recent call last): 88 | > File "", line 1, in 89 | > urllib.urlretrieve() 90 | >TypeError: urlretrieve() takes at least 1 argument (0 given) 91 | >``` 92 | 93 | How do we know what argument the function wants? Google the module.function to find the documentation that will tell what the variables should be (The first [hit](https://docs.python.org/2/library/urllib.html) should the documentation for the python standard library). 94 | The python documentation can be jargony, scroll down to urllib.**urlretrieve**(url[, filename[, reporthook[,data]]]) and we see the first argument is the url. We know from the error we got that the function only requires 1 argument, and now we see it is the url. In the documentation we see it will also take additional arguments such as filename: 95 | 96 | > The second argument, if present, specifies the file location to copy to (if absent, the location will be a tempfile with a generated name). 97 | 98 | It looks like it wants (url, filename) as the variables: 99 | 100 | urllib.urlretrieve(url, fullFileName) 101 | 102 | If using Python 3, the function looks like this: 103 | 104 | urllib.request.urlretrieve(url, fullFileName) 105 | 106 | Go see if it worked! 
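End to end, the Python 3 version of the download might look like the sketch below. To keep it runnable without network access, it retrieves a small local file through a `file://` URL; swapping `url` for the census link above is the real-world version:

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# Stand-in for the remote file: write a small local file and expose it
# through a file:// URL so the sketch works offline
src = os.path.join(tempfile.gettempdir(), "source.zip")
with open(src, "wb") as f:
    f.write(b"placeholder bytes")
url = Path(src).as_uri()

# Destination path, built with os.path.join as discussed above
fullFileName = os.path.join(tempfile.gettempdir(), "SantaRosaCounty.zip")
urllib.request.urlretrieve(url, fullFileName)

print(os.path.exists(fullFileName))  # True
```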
107 | 108 | # Additional things to try 109 | In the example above, the name of the function within the urllib module and the arguments it expected had to be known. Depending on the IDE you are using there may be helpful resources for this. 110 | Try importing urllib from the python window in arcMap: 111 | 112 | ```python 113 | import urllib 114 | ``` 115 | 116 | As you type, the interface will try to autocomplete for you. If you type urllib. it will start suggesting functions within that library. 117 | 118 | urllib.u 119 | 120 | > urllib.urlretrieve 121 | 122 | Likewise, once you type out the function the IDE may be able to show you the expected syntax: 123 | 124 | > urllib.urlretrieve(url, filename=None, reporthook=None, data=None) 125 | 126 | It takes time for python to import an entire library, so if we are only using a couple of known functions we can choose to import only those: 127 | ```python 128 | from urllib import urlretrieve 129 | ``` 130 | 131 | Now the function is accessible directly, without the module prefix: 132 | 133 | urlretrieve(url, fullFileName) 134 | 135 | 136 | 137 | -------------------------------------------------------------------------------- /practice/Week2_lesson3_arcpy.md: -------------------------------------------------------------------------------- 1 | ArcGIS Desktop tasks and processing can be automated through python. Python scripts can also handle certain data more directly (and efficiently) and provide added functionality from outside python libraries. 2 | 3 | In this exercise we will take a simple task in arcMap and perform it using python instead of through the Graphical User Interface (GUI). 4 | 5 | ## Data to use for this exercise 6 | In the Week2_lesson2_web exercise we downloaded a shapefile of Santa Rosa County to the desktop. Start by unzipping that file to a folder on the desktop and add it as a layer in arcMap.
If that data is not accessible the zip file `tl_2019_12113_addrfeat.zip` can be downloaded from [GitHub](https://github.com/jbousquin/py_workgroup/tree/master/practice/data/tl_2019_12113_addrfeat.zip) 7 | 8 | 1. Right click the zip file, go to WinZip > Extract to folder `C:\Users\\Desktop\tl_2019_12113_addrfeat` 9 | 10 | ## Using the python window 11 | 1. Open ArcMap 12 | 2. Open the python window within arcMap. 13 | 14 | This window has two panes. The left pane works like a shell: what you type gets executed. Type: 15 | 16 | ```python 17 | print("Hello World") 18 | ``` 19 | 20 | On the right is documentation that should help you write your script. If you just type the function it will show expected arguments etc.: 21 | 22 | ```python 23 | print 24 | ``` 25 | 26 | You can set variables just like before. The shapefile can be declared as a variable: 27 | 28 | shapefile = r"C:\Users\\Desktop\tl_2019_12113_addrfeat\tl_2019_12113_addrfeat.shp" 29 | 30 | ## First create a copy of a shapefile 31 | 1. In ArcMap go to Add Data (The plus sign on the yellow box in the toolbar). 32 | 2. Navigate to the new folder and Add the new shapefile `tl_2019_12113_addrfeat.shp` as a layer on your map. 33 | 3. Copy the shapefile using the `Copy Features` tool under Data Management Tools > Features in ArcToolbox. 34 | For Input Features just drag and drop the layer from your table of contents. For the Output Features we're going to save it to the desktop as `newShapefile.shp` 35 | 36 | ## Layers vs shapefiles 37 | In arcToolbox there are differences between when you drag and drop a layer and when you navigate to or type in the shapefile. The same differences carry over to the python window. We can set variables based on layers already in our map without knowing the shapefile location. 38 | 39 | lyr = 'tl_2019_12113_addrfeat' 40 | 41 | The value for lyr can either be typed out or dragged and dropped.
Just remember these two are not equivalent; one is a layer on the map and the other is a specific shapefile. 42 | 43 | At this point, python only evaluates the shapefile path as a string and doesn't treat it in any special way. It doesn't matter if the value of the `shapefile` variable is actually a shapefile, or even if it doesn't exist. Create a variable for the file we want to create: 44 | 45 | outShapefile = r"C:\Users\\Desktop\newShapefile2.shp" 46 | 47 | ## Functions in arcpy 48 | The arcpy module is already imported into arcMap. Just like when we used functions from urllib, we can use arcpy functions using arcpy.function() 49 | 50 | arcpy functions are named to resemble the tools in arcToolbox. Try: 51 | 52 | arcpy.CopyFeatures 53 | 54 | Note that it should autocomplete as `arcpy.CopyFeatures_management` since this function is in the Data Management tools. In the right panel look at the documentation for this function. Try the function using our variables: 55 | 56 | arcpy.CopyFeatures_management(shapefile, outShapefile) 57 | 58 | Notice that just like when we used the tool to copy the file manually, python added the result as a layer on the map by default. 59 | 60 | arcpy is well documented and the html lookups include code examples. [Google arcpy Copy Features](http://pro.arcgis.com/en/pro-app/tool-reference/data-management/copy-features.htm) and scroll down to look at the code samples there. 61 | 62 | In the code samples you may notice there is both an example for the python window and for stand-alone scripts. The stand-alone script is how you will want to store your scripts, and python can run them without opening arcMap. However, to run arcpy functions you must have the arcpy library installed, and it requires arcGIS. 63 | 64 | ## Moving a manual process to python 65 | The copy we made using python was fairly straightforward to figure out and the documentation gives good examples, but here is another way to translate manual processes into python scripts.
66 | 67 | Each time a task is performed in arcMap a record of it appears in results. Open the results window to examine the record from when we manually copied the shapefile. 68 | 69 | 1. Open results window 70 | 2. Click Current Session 71 | 3. Expand the first instance of Copy Features (the second is the one we just did using the python window) 72 | 4. Right click and select Copy as Python Snippet 73 | 5. Open a new python script (IDLE or even notepad) 74 | 6. Paste the contents of your clipboard to this script 75 | 76 | The first line is a comment about replacing the layer `"tl_2019_12113_addrfeat"`; the second line is the function we previously typed out, with some additions: 77 | 78 | arcpy.CopyFeatures_management(in_features="tl_2019_12113_addrfeat", out_feature_class="C:/Users//Desktop/newShapefile.shp", config_keyword="", spatial_grid_1="0", spatial_grid_2="0", spatial_grid_3="0") 79 | 80 | The layer must be replaced because outside of the python window in arcMap it doesn't mean anything to python. The other additions are defaults that we didn't set. Copying the result as a python snippet is not foolproof, but it can get you started. 81 | 82 | Although this task seemed simple, once you start combining it with other python functions you'll see where it can be powerful. For example, the example in the CopyFeatures_management documentation shows how to create a list of all the shapefiles in a folder and copy each of them to a new location. We'll learn about loops in week 3. 83 | 84 | Try experimenting with other processes and see if you can perform those with python. 85 | -------------------------------------------------------------------------------- /practice/Week3_lesson1_webCensus.md: -------------------------------------------------------------------------------- 1 | Last week we experimented with downloads using a url, but what if we want to download a bunch of files from a site, or the file we want to download depends on some user provided variable or area of interest?
2 | 3 | One way to get multiple downloads is to loop over a list. 4 | One way to test if a user provided variable meets certain criteria is using conditional if statements. 5 | 6 | ## Set up script 7 | We'll start off setting up our script just like last time: 8 | 9 | ```python 10 | # Author name 11 | # other comments with relevant metadata 12 | from urllib import urlretrieve 13 | import os 14 | 15 | filepath = r"C:\Users\\Desktop" 16 | fullFileName = filepath + os.sep + "SantaRosaCounty.zip" 17 | ``` 18 | 19 | If using Python 3.x, swap the import: 20 | ```python 21 | from urllib.request import urlretrieve 22 | ``` 23 | ## Downloading multiple county shapefiles 24 | 25 | Last week we downloaded a shapefile for Santa Rosa County using: 26 | ```python 27 | url = 'https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_12113_addrfeat.zip' 28 | urlretrieve(url, fullFileName) 29 | ``` 30 | This time, instead, navigate to the page where the zip file is located: https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT 31 | 32 | All the files hosted here are for different counties in the United States as of 2019. You might notice they all start with 'tl_2019_' and end with '_addrfeat.zip'. The 5 digits that change are called FIPS or GEOIDs and each equates to a different county (see [nrcs link](https://www.nrcs.usda.gov/wps/portal/nrcs/detail/fl/about/?cid=nrcs143_013697)). The first 2 digits identify the state, e.g. 12 = Florida, then the last 3 digits are for the county, e.g. 113. 33 | 34 | Let's start by getting the files for both Santa Rosa and Escambia County (FL).
We know the FIPS codes for those so let's start by creating a list of the codes: 35 | ```python 36 | fips_list = [12113, 12033] 37 | ``` 38 | We know we'll need to construct 2 things for each: 39 | (1) the url to download the file from 40 | (2) a unique file name, since fullFileName = filepath + os.sep + "SantaRosaCounty.zip" won't work for Escambia County 41 | 42 | For the link we'll use the host page as a starting point for constructing our url: 43 | 44 | base_url = 'https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/' 45 | 46 | We could construct each link using the list and index: 47 | 48 | ```python 49 | url_SantaRosa = base_url + 'tl_2019_' + str(fips_list[0]) + '_addrfeat.zip' 50 | url_Escambia = base_url + 'tl_2019_' + str(fips_list[1]) + '_addrfeat.zip' 51 | ``` 52 | 53 | You'll notice we had to coerce both fips to string using str() because they were both int datatype. There are several ways to format strings to combine several pieces of information from variables; one of these is the string .format method: 54 | 55 | ```python 56 | url_SantaRosa = '{}tl_2019_{}_addrfeat.zip'.format(base_url, fips_list[0]) 57 | url_Escambia = '{}tl_2019_{}_addrfeat.zip'.format(base_url, fips_list[1]) 58 | ``` 59 | 60 | The string is everything between ' and '; anywhere there is a {} a piece of information is inserted into the string. The two versions are about the same length, but the second is easier to read and the variables are automatically coerced to string. 61 | 62 | Next we need to set unique fullFileName variables.
Again we could do that by indexing our list: 63 | 64 | ```python 65 | fileName_SantaRosa = os.path.join(filepath, 'county_{}.zip'.format(fips_list[0])) 66 | fileName_Escambia = os.path.join(filepath, 'county_{}.zip'.format(fips_list[1])) 67 | ``` 68 | 69 | Those variables will work and will allow you to download the files using urlretrieve again: 70 | 71 | ```python 72 | urlretrieve(url_SantaRosa, fileName_SantaRosa) 73 | urlretrieve(url_Escambia, fileName_Escambia) 74 | ``` 75 | ## Downloading using for loop over list 76 | But that's like 8 lines of code, is that really any better than just clicking the two links and changing the file names? Now let's instead do the same thing within a for loop, where we loop over our list, downloading files as we go: 77 | 78 | ```python 79 | for fip in fips_list: 80 | url = '{}tl_2019_{}_addrfeat.zip'.format(base_url, fip) 81 | fileName = os.path.join(filepath, 'county_{}.zip'.format(fip)) 82 | urlretrieve(url, fileName) 83 | ``` 84 | ## Downloading using a function 85 | Taking it a step further, we could create a function to download the file to the destination given any FIP: 86 | 87 | ```python 88 | def downloadCounty(fip, filepath): 89 | base_url = 'https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/' 90 | url = '{}tl_2019_{}_addrfeat.zip'.format(base_url, fip) 91 | fileName = os.path.join(filepath, 'county_{}.zip'.format(fip)) 92 | urlretrieve(url, fileName) 93 | ``` 94 | 95 | Then we run it with: 96 | 97 | ```python 98 | for fip in fips_list: 99 | downloadCounty(fip, filepath) 100 | ``` 101 | 102 | You may notice we had to define base_url within our function. Variables are typically declared 'locally'; this means if base_url is defined in the script it will not be available within a function unless declared globally or passed as a parameter to the function (i.e., like fip). Once in the function, the parameter name is used rather than the original variable name from the script; e.g.
this works the same as the above: 103 | 104 | ```python 105 | def downloadCounty(crazyFIPname, crazy_filepath): 106 | base_url = 'https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/' 107 | base_url = '{}tl_2019_{}_addrfeat.zip'.format(base_url, crazyFIPname) 108 | fileName = os.path.join(crazy_filepath, 'county_{}.zip'.format(crazyFIPname)) 109 | urlretrieve(base_url, fileName) 110 | ``` 111 | 112 | This also means any changes made within the function (e.g. to base_url) don't alter the variables in the main script. Of course any changes made to disk, like downloading the file, will persist and be available for interaction in the main script. To get an altered variable back out of a function you use return: 113 | 114 | ```python 115 | def downloadCounty(crazyFIPname, crazy_filepath): 116 | base_url = 'https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/' 117 | base_url = '{}tl_2019_{}_addrfeat.zip'.format(base_url, crazyFIPname) 118 | fileName = os.path.join(crazy_filepath, 'county_{}.zip'.format(crazyFIPname)) 119 | urlretrieve(base_url, fileName) 120 | return base_url 121 | ``` 122 | 123 | And set either the original variable or a new variable to the result: 124 | 125 | ```python 126 | for fip in fips_list: 127 | base_url2 = downloadCounty(fip, filepath) 128 | ``` 129 | ## Downloading a subset for a longer list using conditional 130 | You are probably not impressed with the for loop or function, as each only reduced your code by two lines, but now it doesn't matter how long your list is: you could download all the counties in FL if you wanted.
For instance, let's say you have a complete list of all the FIPS scraped from a website (to keep it short this is just FL and AL): 131 | 132 | ```python 133 | fips_list = ['01067', '01073', '01117', '01095', '01123', '01107', '01039', '01015', '01043', '01115', '01083', '01053', '01055', '01081', '01003', '01097', '01007', '01071', '01109', '01021', '01131', '01127', '01019', '01121', '01005', '01045', '01103', '01091', '01069', '01031', '01035', '01057', '01077', '01049', '01061', '01065', '01013', '01093', '01133', '01029', '01089', '01025', '01017', '01027', '01119', '01041', '01105', '01001', '01051', '01099', '01101', '01079', '01033', '01125', '01009', '01113', '01059', '01111', '01047', '01075', '01087', '01011', '01023', '01037', '01063', '01085', '01129', '12001', '12117', '12081', '12037', '12095', '12027', '12031', '12099', '12105', '12086', '12055', '12103', '12083', '12013', '12059', '12071', '12049', '12077', '12053', '12035', '12119', '12005', '12009', '12075', '12039', '12133', '12069', '12051', '12011', '12107', '12091', '12017', '12101', '12127', '12131', '12021', '12041', '12061', '12089', '12111', '12063', '12019', '12113', '12007', '12047', '12087', '12097', '12125', '12023', '12121', '12003', '12079', '12065', '12043', '12115', '12093', '12033', '12123', '12057', '12045', '12015', '12129', '12109', '12085', '12073', '12029', '12067'] 134 | ``` 135 | 136 | We would use a conditional within our for loop to determine whether each one should be downloaded or not: 137 | 138 | ```python 139 | for fip in fips_list: 140 | if fip.startswith('12'): 141 | downloadCounty(fip, filepath) 142 | ``` 143 | 144 | We can get more sophisticated, catching the Alabama codes explicitly and flagging anything else as unexpected: 145 | 146 | ```python 147 | for fip in fips_list: 148 | if fip.startswith('12'): 149 | downloadCounty(fip, filepath) 150 | elif fip.startswith('01'): 151 | print('AL, skipped') 152 | else: 153 | print('Weird, {} is neither in AL nor FL?'.format(fip)) 154 | ``` 155 | 156 | These are the basics, but this
same structure can be used for queries to some APIs and is very powerful. 157 | -------------------------------------------------------------------------------- /practice/Week3_lesson2_fileList.md: -------------------------------------------------------------------------------- 1 | Ever have a folder full of files, like jpegs from your camera, that you want to rename consistently? In this exercise we'll write a function to do just that. 2 | 3 | ## List files in folder 4 | 5 | For this exercise you will need the contents of the folder 99_files in ~py_workgroup\practice\data\99_files.zip 6 | where ~ is your local folder containing the github repo. Alternatively you can use my copy on the L drive or any folder containing multiple pdf files and other types of files (just make a copy of it because the contents will be altered irreparably). 7 | 8 | Set up your script with the variable path set to the 99_files folder 9 | 10 | ```python 11 | # Comments 12 | # Author 13 | 14 | import os 15 | 16 | 17 | path = r'~py_workgroup\practice\data\99_files' 18 | 19 | ``` 20 | 21 | We can use the function listdir() from the os module to list all the contents of a given directory: 22 | 23 | ```python 24 | file_list = os.listdir(path) 25 | ``` 26 | 27 | You can test the length of the list to see that everything is there and use the index to check the name of the first file in the list: 28 | 29 | ```python 30 | len(file_list) 31 | file_list[0] 32 | ``` 33 | 34 | ## Use conditionals to limit the list to just pdfs 35 | There are multiple types of files in the folder, what if you want a list of just the pdfs?
36 | Start by figuring out how to test if a file is a pdf: 37 | 38 | ```python 39 | first_file = file_list[0] 40 | if first_file.endswith('.pdf'): 41 | print("It's a PDF!") 42 | else: 43 | print('It is not a pdf') 44 | ``` 45 | Now do this while looping over all the files in the list: 46 | ```python 47 | for single_file in file_list: 48 | if single_file.endswith('.pdf'): 49 | print("It's a PDF!") 50 | else: 51 | print('It is not a pdf') 52 | ``` 53 | 54 | That's great, but it prints a lot of output; instead, let's add each pdf item by item to a new list just for pdfs: 55 | 56 | ```python 57 | pdf_list = [] 58 | for single_file in file_list: 59 | if single_file.endswith('.pdf'): 60 | pdf_list.append(single_file) 61 | ``` 62 | 63 | All we have to do to change the file type is change the '.pdf' condition 64 | 65 | For this exercise we're going to rename the files, so it is handy to have a list in python, but in the future we'll learn how to write this list to a text file that can easily be imported into excel or some other type of software to track all your files. You could also use other conditionals (e.g. parts of the file name) to identify specific files within a list by file type to do something to. For example I use a script to periodically clean my desktop and file away any notes I saved as textfiles. 66 | 67 | ## Rename files in a list 68 | WARNING: anytime you delete or otherwise alter files using python you need to be extremely careful; it is easy to get disoriented about what directory you're working in, and files deleted this way can't be retrieved from the recycle bin.
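One habit that helps with that warning is a "dry run": build and print the planned old/new pairs first, and only add the actual rename call once the printout looks right. A self-contained sketch (a throwaway temp folder with fake files stands in for 99_files, so nothing real is touched):

```python
import os
import tempfile

# A throwaway folder with fake files, standing in for 99_files.
path = tempfile.mkdtemp()
for name in ("a.pdf", "b.pdf", "notes.txt"):
    open(os.path.join(path, name), "w").close()

# Build the planned renames, but don't perform them yet.
planned = []
for name in sorted(os.listdir(path)):
    if name.endswith(".pdf"):
        old = os.path.join(path, name)
        new = os.path.join(path, "new_name" + name)
        planned.append((old, new))
        print(old, "->", new)

# Once the printout looks right, loop over planned calling os.rename(old, new).
```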
69 | 70 | For this we'll use the os.rename() function: 71 | ```python 72 | os.rename(old, new) 73 | ``` 74 | where 'old' is the current file path/name and 'new' is the updated path/name 75 | 76 | For first_file: 77 | ```python 78 | first_file = file_list[0] 79 | 80 | old = os.path.join(path, first_file) 81 | new = os.path.join(path, 'new_name' + first_file) 82 | 83 | os.rename(old, new) 84 | ``` 85 | 86 | Now that you have tested it out, try it within a for loop 87 | 88 | ## Other file name tricks 89 | Keep in mind this approach can also be used to put the same file in a different folder/directory, change the filetype, or do various other things. It also becomes more handy as you have more information about the file being renamed. 90 | 91 | To split the filename from the extension you can use the os.path.splitext() function, which returns a tuple with the filename and the extension: 92 | ```python 93 | first_file = file_list[0] 94 | newName = os.path.splitext(first_file)[0] + 'prj' + os.path.splitext(first_file)[1] 95 | new = os.path.join(path, newName) 96 | ``` 97 | This has an advantage over .split('.') because it splits off only the final extension rather than splitting at every '.' in the path/file name. 98 | -------------------------------------------------------------------------------- /practice/Week3_lesson3_arcpy.md: -------------------------------------------------------------------------------- 1 | This week we talked about lists. Lists are one of the main ways groups of data can be managed. 2 | Feature attribute tables have data in fields; although these entire tables can be handled as multi-dimensional arrays in numpy, we're going to play with them as lists.
3 | 4 | Pulling feature attributes to a list is a bit beyond what we've done so far, but I've provided a field_to_list() function in the utils.py module, or use the simplified: 5 | 6 | ```python 7 | def field_to_list(table, field): 8 | return [row[0] for row in arcpy.da.SearchCursor(table, [field])] 9 | ``` 10 | 11 | ## QC data in field 12 | I want to compare data over time for a given point. Normally I might create a third field and then use field calculator to fill it in. 13 | But let's do it using lists in python: 14 | 15 | ```python 16 | # Comments 17 | # Author 18 | 19 | from utils import field_to_list 20 | 21 | field1 = "B01003_1E" 22 | field2 = "HD01_VD01" 23 | # Note you'll need to specify your copy of the repo instead of ~ 24 | # If in arc you can drag and drop the layer (careful of selections) 25 | shp = r"~\py_workgroup\practice\data\QC_data.shp" 26 | 27 | list_1 = field_to_list(shp, field1) 28 | list_2 = field_to_list(shp, field2) 29 | ``` 30 | 31 | Now I could loop through and do math to find only points that are different: 32 | 33 | ```python 34 | # make list to hold differences 35 | diff_list = [] 36 | # make iterator 37 | i = 0 38 | for val in list_1: 39 | if val != list_2[i]: 40 | diff_list.append(float(val) - float(list_2[i])) 41 | i = i + 1 42 | ``` 43 | 44 | These fields are text fields, so we get an error the first time one of them is non-numeric: 45 | 46 | ```python 47 | Runtime error 48 | Traceback (most recent call last): 49 | File "", line 3, in 50 | ValueError: could not convert string to float: 51 | ``` 52 | 53 | We can dig in to find that this string value is a null, u' '.
There are a few ways we could handle this, but let's try just catching those nulls and saving them to the list as such: 54 | 55 | ```python 56 | diff_list = [] 57 | i = 0 58 | for val in list_1: 59 | if val != list_2[i]: 60 | if val == " ": 61 | diff_list.append("") 62 | else: 63 | diff_list.append(float(val) - float(list_2[i])) 64 | i = i + 1 65 | ``` 66 | Well now we have a list of 314 instances where the columns don't match. How many of those are because of the null? 67 | 68 | ```python 69 | not_null_list = [] 70 | for x in diff_list: 71 | if x != "": 72 | not_null_list.append(x) 73 | ``` 74 | 75 | Alright, so the differences all have null for the first column. If we want to explore them more or fix it we really need to know the feature ID: 76 | 77 | ```python 78 | field3 = "FID" 79 | list_3 = field_to_list(shp, field3) 80 | 81 | diff_list = [] 82 | ID_list = [] # List to hold ID 83 | i = 0 84 | for val in list_1: 85 | if val != list_2[i]: 86 | ID_list.append(list_3[i]) 87 | if val == " ": 88 | diff_list.append("") 89 | else: 90 | diff_list.append(float(val) - float(list_2[i])) 91 | i = i + 1 92 | ``` 93 | 94 | Now we have a list we can use to further explore those errors and try to find out why they might have happened. 95 | 96 | ## Make layer selection using list 97 | We have an ID_list of all the features that don't match. We could go through one by one, but let's say we want to explore it with a selection in arcMap.
98 | 99 | First we have to make sure we have a layer, as selections are temporary and can only be made on a layer, not a shapefile: 100 | 101 | ```python 102 | arcpy.MakeFeatureLayer_management(shp, "temp_layer") 103 | ``` 104 | 105 | Next we'll construct a where_clause concatenating each ID in our list, just like the one you use in select by attribute: 106 | 107 | ```python 108 | where_clause = '' # empty string 109 | for ID in ID_list: 110 | query = field3 + ' = ' + str(ID) + " OR " 111 | where_clause = where_clause + query 112 | ``` 113 | 114 | Sometimes query strings can be tricky, so I'd encourage you to test it in select by attribute for the first ID: 115 | 116 | ```python 117 | ID = ID_list[0] 118 | query = field3 + ' = ' + str(ID) + " OR " 119 | print(query) 120 | ``` 121 | 122 | Using that we see we'll need to remove the trailing " OR ", but otherwise it seems to work. We'll make that fix and then use it to make the selection: 123 | 124 | ```python 125 | where_clause = where_clause[:-4] 126 | arcpy.SelectLayerByAttribute_management("temp_layer", "NEW_SELECTION", where_clause) 127 | ``` 128 | -------------------------------------------------------------------------------- /practice/Week4_lesson1_renameFromList.md: -------------------------------------------------------------------------------- 1 | Last week we saw how to create a list of all the files in a folder, limit that list to certain file types using conditionals, and rename subsets of that list. 2 | This week we'll get more advanced with how we rename them, getting the new name from a list of new names with the same order as our old names.
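As a quick preview before the sections below build an iterator by hand: Python's built-in zip() pairs items from two same-order lists directly, which is a compact sketch of the same idea:

```python
# zip() pairs items positionally from two same-length lists --
# another way (besides a manual counter) to line old names up with new ones.
old_names = ["a.pdf", "b.pdf", "c.pdf"]
new_names = ["a1.pdf", "b1.pdf", "c1.pdf"]
for old, new in zip(old_names, new_names):
    print(old + " will become " + new)
```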
3 | 4 | ### List iterators 5 | An iterator can be used to keep track of how many items we've gone through (iterated) over a for loop: 6 | ```python 7 | lst1 = ["a", "b", "c", "d"] 8 | i = 0 9 | for item in lst1: 10 | print(item + " is in the {} place".format(i)) 11 | i += 1 12 | ``` 13 | 14 | Following that example, i could then be used to index a second list with our new name at the same location. 15 | ```python 16 | lst1 = ["a", "b", "c", "d"] 17 | lst2 = ["a1", "b1", "c1", "d1"] 18 | i = 0 19 | for item in lst1: 20 | print(item + " will become " + lst2[i]) 21 | i += 1 22 | ``` 23 | 24 | If you remember index(), that might seem like another way to get an item at the same index in another list: 25 | ```python 26 | lst1 = ["a", "b", "c", "d"] 27 | lst2 = ["a1", "b1", "c1", "d1"] 28 | 29 | for item in lst1: 30 | item2 = lst1.index(item) 31 | print('{}, position {}'.format(lst2[item2], item2)) 32 | ``` 33 | 34 | However, if the list contains multiple instances of the same value this will 35 | be a problem, as it returns the index of the first occurrence: 36 | ```python 37 | lst1 = ["a", "a", "c", "d"] 38 | lst2 = ["a1", "b1", "c1", "d1"] 39 | 40 | for item in lst1: 41 | item2 = lst1.index(item) 42 | print(item2) 43 | ``` 44 | 45 | Another way to generate an iterator is using the built-in function enumerate: 46 | ```python 47 | lst1 = ["a", "a", "c", "d"] 48 | lst2 = ["a1", "b1", "c1", "d1"] 49 | 50 | for i, item in enumerate(lst1): 51 | print(item + " will become " + lst2[i]) 52 | ``` 53 | 54 | ## Applying it to 99_files 55 | 56 | ```python 57 | # Comments 58 | # Author 59 | 60 | import os 61 | 62 | 63 | path = r'~py_workgroup\practice\data\99_files' 64 | file_list = os.listdir(path) 65 | 66 | pdf_list = [] 67 | 68 | for item in file_list: 69 | if item.endswith('.pdf'): 70 | pdf_list.append(os.path.join(path, item)) 71 | 72 | #for item in pdf_list: 73 | # new_name = os.path.join(path, #name?
# os.rename(item, new_name) 75 | ``` 76 | 77 | Now let's say we have an Excel sheet with all our old names and we went through it and lined each one up to create our new names, then read those new names into a python list: 78 | ```python 79 | new_names = ['Andreasen-etal-2001.pdf', 'Arrow-etal-1993.pdf', 'Aubry-etal-2006.pdf', 'Banzhaf-etal-2005.pdf', 'Banzhaf-etal-2012.pdf', 'Beaumont-etal-2007.pdf', 'Beys-da-Silva-etal-2014.pdf', 'Bhuta-etal-2014.pdf', 'Bishoi-etal-2009.pdf', 'Blomqvist-etal-2013.pdf', 'Borja-etal-2008.pdf', 'Borja-etal-2005.pdf', 'Brown-etal-2015.pdf', 'Brudvig-etal-2014.pdf', 'Bruno-etal-2002.pdf', 'Butchart-etal-2010.pdf', 'Clifton-etal-2011.pdf', 'Collen-etal-2008.pdf', 'Crossman-etal-2013.pdf', 'Cutter-etal-2003.pdf', 'Cutter-etal-2006.pdf', 'Cutter-etal-2008.pdf', 'Davies-etal-2006.pdf', 'DeFries-etal-2005.pdf', 'Deliege-etal-2015.pdf', 'Diaz-etal-2006.pdf', 'Djalante-etal-2012.pdf', 'Dobbie-etal-2013.pdf', 'Eakin-etal-2009.pdf', 'Eigenbrod-etal-2010.pdf', 'Elliot-2002.pdf', 'Evans-etal-2014.pdf', 'Faber-etal-2012.pdf', 'Fasola-etal-2010.pdf', 'Fromm-2000.pdf', 'Greaver-etal-2012.pdf', 'Haines-Young-etal-2009.pdf', 'Hak-etal-2015.pdf', 'Hallett-2014.pdf', 'Helfenstein-etal-2014.pdf', 'Hermoso-etal-2013.pdf', 'Herrick-2000.pdf', 'Herrick-etal-2010.pdf', 'Holling-1973.pdf', 'Jorgenson-etal-2015.pdf', 'Karnauskas-etal-2014.pdf', 'Katsanevakis-etal-2014.pdf', 'Kerr-etal-2003.pdf', 'Kronenberg-2014.pdf', 'Kumar-etal-2008.pdf', 'Lackey-1998.pdf', 'Laugen-etal-2014.pdf', 'Le Pape-etal-2014.pdf', 'Loh-etal-2005.pdf', 'Loomis-etal-2014.pdf', 'Lowe-etal-2009.pdf', 'Mace-etal-2012.pdf', 'Marine Trophic Index-ND.pdf', 'Martin-2014.pdf', 'Mawdsley-etal-2009.pdf', 'McCauley-2006.pdf', 'McCrea-etal-2015.pdf', 'Muller-2005.pdf', 'Muller-etal-2012.pdf', 'Muller-etal-2016.pdf', 'Muller-etal-2000.pdf', 'Murphy-etal-2013.pdf', 'Muscolo-etal-2014.pdf', 'Naeem-1998.pdf', 'Noss-2001.pdf', 'Orfanidis-etal-2003.pdf', 'Ostendorf-2011.pdf',
'Paulraj-etal-2015.pdf', 'Peck-etal-2009.pdf', 'Perrings-etal-2011.pdf', 'Pettorelli-etal-2005.pdf', 'Poikane-etal-2014.pdf', 'Pulselli-etal-2011.pdf', 'Rapport-etal-2013.pdf', 'Seppelt-etal-2011.pdf', 'Smeets-etal-1999.pdf', 'Smith-etal-2013.pdf', 'Smith-etal-B.pdf', 'Spooner-etal-2006.pdf', 'Stapanian-etal-2013.pdf', 'Sterk-etal-2013.pdf', 'Stinchcombe-etal-2007.pdf', 'Timko-etal-2009.pdf', 'Tompkins-etal-2004.pdf', 'Tsai-etal-2009.pdf', 'Turner-etal-1998.pdf', 'Verissimo-etal-2013.pdf', 'Villafan-etal-2001.pdf', 'Walker-etal-2002.pdf', 'Wolter-etal-2003.pdf', 'Wong-etal-2015.pdf'] 80 | ``` 81 | First we'll test that the two have the same number of names: 82 | ```python 83 | len(pdf_list) == len(new_names) 84 | ``` 85 | Then we can update our loop over the pdf_list to change all the names: 86 | ```python 87 | for i, item in enumerate(pdf_list): 88 | new_name = os.path.join(path, new_names[i]) 89 | os.rename(item, new_name) 90 | ``` 91 | 92 | Now of course in the real application you'd probably want to read in the old names from that Excel file too, rather than trusting os.listdir() to go in the expected order. 93 | -------------------------------------------------------------------------------- /practice/Week4_lesson2_arcpyFieldsDesc.md: -------------------------------------------------------------------------------- 1 | We've seen some of the ways to read things on your system to a list (e.g. reading file names within a folder), but it is often useful to do the same type of thing in arc. 2 | 3 | ## Background 4 | We're working with arcpy, so if you're working outside of arcPro/ArcMap your script would start like: 5 | 6 | ```python 7 | # Comments 8 | # Author 9 | 10 | import os 11 | import arcpy 12 | 13 | path = r'~py_workgroup\practice\data\QC_data' 14 | shp = os.path.join(path, 'QC_data.shp') 15 | ``` 16 | 17 | We've talked a little bit about 'objects' as being special types of variables you can create, get properties from, and otherwise interact with.
Many times you'll run an arcpy function and it will return one of these objects with information about the inputs. We'll start by looking at the arcpy.Describe() function: 18 | 19 | ```python 20 | desc = arcpy.Describe(shp) 21 | print(desc) 22 | ``` 23 | We've created a variable called desc which is a 'geoprocessing describe data object.' We can use this to find out different information about this shapefile and otherwise interact with it: 24 | 25 | ```python 26 | desc.dataType 27 | ``` 28 | For a broader list of the dataset properties see the [documentation](https://pro.arcgis.com/en/pro-app/arcpy/functions/dataset-properties.htm). 29 | 30 | ## Get extent from shapefile 31 | Some of the properties from the describe object return another object. Two that are particularly useful are the [spatialReference](https://pro.arcgis.com/en/pro-app/arcpy/classes/spatialreference.htm) and the [extent](https://pro.arcgis.com/en/pro-app/arcpy/classes/extent.htm): 32 | 33 | ```python 34 | desc.spatialReference 35 | desc.spatialReference.factoryCode # EPSG 36 | desc.spatialReference.name 37 | desc.spatialReference.exportToString() 38 | 39 | desc.extent 40 | desc.extent.XMax 41 | desc.extent.JSON 42 | ``` 43 | 44 | ## Reading fields from a shp (Delete multiple fields) 45 | Ever have a huge table and you only care about a couple fields, but all the extras make it hard to work with? Clicking each one and deleting is a pain. 46 | Well we can make a list of the fields, subset it with the names we want to keep, and then loop over the ones we want to delete. 47 | 48 | First we need our list of fields to delete. arcpy.ListFields() returns a list of field "objects"; we'll revisit what this means next time, but for now just know that object.name will give us the name of the field.
49 | 50 | ```python 51 | # Long version 52 | fields_list = [] 53 | for x in arcpy.ListFields(shp): 54 | fields_list.append(x.name) 55 | 56 | # Create a list of field names using list comprehension 57 | fields_list = [x.name for x in arcpy.ListFields(shp)] 58 | ``` 59 | 60 | Now we want to make a list of the fields to keep, and for any field in fields_list not 61 | in our keep_list we'll add it to a delete_list. 62 | 63 | ```python 64 | keep_list = ['FID', 'Shape', 'GEOID10', 'B01003_1E', 'B01003_1M', 'HD01_VD01', 'HD02_VD01'] 65 | 66 | # Long version 67 | delete_list = [] 68 | for field in fields_list: 69 | if field not in keep_list: 70 | delete_list.append(field) 71 | 72 | # List comprehension version 73 | delete_list = [field for field in fields_list if field not in keep_list] 74 | ``` 75 | 76 | And now we loop over our delete_list, deleting each field: 77 | 78 | ```python 79 | for field in delete_list: 80 | arcpy.DeleteField_management(shp, field) 81 | ``` 82 | 83 | Our final code would look something like this: 84 | 85 | ```python 86 | fields_list = [x.name for x in arcpy.ListFields(shp)] 87 | keep_list = ['FID', 'Shape', 'GEOID10', 'B01003_1E', 'HD01_VD01'] 88 | delete_list = [field for field in fields_list if field not in keep_list] 89 | 90 | for field in delete_list: 91 | arcpy.DeleteField_management(shp, field) 92 | ``` 93 | 94 | ## List of shapefiles or feature classes 95 | You can also read files to a list using arcpy. Start off the same way, but instead of setting a variable (e.g.
shp) set it to your folder/geodatabase and set the current workspace to it using env: 95 | ```python 96 | import os 97 | import arcpy 98 | 99 | path = r'' # folder containing shapefiles 100 | gdb = r'.gdb' # Alternatively you can set the workspace to a geodatabase 101 | 102 | # Set the workspace to path 103 | arcpy.env.workspace = path 104 | 105 | # Now list all the shapefiles in that folder 106 | featureClass_list = arcpy.ListFeatureClasses() 107 | ``` 108 | 109 | This might be useful when combining it with functions from earlier exercises where we copied a shapefile, or when re-projecting several files and copying them into a geodatabase: 110 | 111 | ```python 112 | # Copy shapefiles in list to a file geodatabase 113 | for fc in featureClass_list: 114 | arcpy.CopyFeatures_management(fc, os.path.join(gdb, os.path.splitext(fc)[0])) 115 | ``` 116 | -------------------------------------------------------------------------------- /practice/Week4_lesson3_webAPbbox.md: -------------------------------------------------------------------------------- 1 | We used the urllib module in an earlier lesson to download zip files from the web. In this exercise we'll build on that, showing how more complex url strings can be used to query an API, and then we'll transition to making these requests through the requests module instead. 2 | 3 | ## Constructing a url to query an API 4 | For this exercise we'll use the census data API available through census.gov. They have a lot of different datasets, so it's always a good idea to start at their [developer website](https://www.census.gov/developers/). 5 | 6 | Many APIs will require you to use some type of key or other authentication. This allows them to track who is using the API and block anyone with malicious intent (e.g. repeatedly querying the API just to overload it). When writing your code it is usually a good idea to read this type of information from a separate file that doesn't get shared with the code.
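A minimal sketch of that pattern (the filename `census_key.txt` is made up for illustration; in practice the file would already exist and be listed in `.gitignore`):

```python
# The key file would normally already exist and be kept out of version
# control; it is written here only so the example runs end to end.
with open('census_key.txt', 'w') as f:
    f.write('MY-FAKE-KEY\n')

# Read the key from the file that is NOT shared with the code
with open('census_key.txt') as f:
    api_key = f.read().strip()

# Append it to a request url as '&key=...'
url = ('https://api.census.gov/data/2010/dec/sf1'
       '?get=H001001,NAME&for=state:*&key=' + api_key)
```

Anyone you share the script with then supplies their own key file rather than reusing yours.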
This is particularly true for API services that charge you based on use! Many APIs will encourage but not require a key; Census is one of those: you can use the API without including a key, but that may limit how big the response can be or the response rate. To sign up for a key: 7 | https://api.census.gov/data/key_signup.html 8 | 9 | We will start with the [2010 Decennial](https://www.census.gov/data/developers/data-sets/decennial-census.html) census. The census API is great in that each dataset has: 10 | 11 | * API Call: https://api.census.gov/data/2010/dec/sf1? 12 | 13 | Which is like what we were using as our base_url in past examples 14 | 15 | * Example calls: https://api.census.gov/data/2010/dec/sf1?get=H001001,NAME&for=state:* 16 | 17 | Which are working examples of how to request specific information 18 | 19 | * API Variables: [html](https://api.census.gov/data/2010/dec/sf1/variables.html) [xml](https://api.census.gov/data/2010/dec/sf1/variables.xml) [json](https://api.census.gov/data/2010/dec/sf1/variables.json) 20 | 21 | Which are very long lists of all the variables/parameters you can restrict your request with, and they're available in a variety of formats. 22 | 23 | Start by looking at the example url. First, if you navigate to that url you'll notice you get the results right in your browser. Looking more closely at the url, the first part - everything before the '?' - is what we've been calling our base_url. After the '?', 'get=' lists the fields we want: 'H001001' and 'NAME'. Ampersands ('&') are used to add more parameters to the string, in this case 'for', which specifies where we want that information. Here that is by state, and the '*' is a wild card meaning any value of state (i.e. all states). 24 | 25 | When we look at the result we have a list of lists, where each nested list represents a row of results for one of the geographies (states) and the columns are 'H001001', 'NAME' and 'state.'
Looking at the results you can tell right away 'NAME' is the name of each state. Based on what we've done with census data in the past you may recognize that results in the 'state' column are the census FIPS code for that state. We'll come back to the unknown field 'H001001'. First, what happens if we change our wild card to one of the FIPS codes, e.g. Florida: 26 | 27 | https://api.census.gov/data/2010/dec/sf1?get=H001001,NAME&for=state:12 28 | 29 | Now you just got results for Florida! Next let's figure out what these H001001 results mean; go to the [html](https://api.census.gov/data/2010/dec/sf1/variables.html) variable list because that is the most human readable (also the slowest to load). Right at the top you may see the 'for' clause we were using in the example, and it says it's the FIPS 'for' clause. There are also other sub-divisions of state geographies for reference (e.g. COUNTY (FIPS)). Before we get too distracted, ctrl-f to find 'H001001'. You'll see this is Total housing units. If we wanted a subset of those housing units considered 'rural' we could use 'H002005'. Let's do that for Florida and Alabama: 30 | 31 | https://api.census.gov/data/2010/dec/sf1?get=H002005,NAME&for=state:12,01 32 | 33 | If we wanted to do that API request for any state in python: 34 | ```python 35 | from urllib import urlopen # py2.x 36 | from urllib.request import urlopen # py3.x 37 | 38 | base_url = 'https://api.census.gov/data/2010/dec/sf1' 39 | 40 | # '?get=' starts the query string portion of the url 41 | base_query = '?get=H002005,NAME&for=state:' 42 | state = '12' 43 | 44 | url = base_url + base_query + state 45 | 46 | response = urlopen(url) 47 | print(response.read()) 48 | ``` 49 | ## webservice queries 50 | Continuing with census, we'll take a look at the [TIGERweb geoservices Rest API](https://www.census.gov/data/developers/data-sets/TIGERweb-map-service.html).
Within that folder of services let's start with [tigerWMS_Current](https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer). Within that map service there are a bunch of layers; let's look at [Counties](https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer/86) (ID 86). 51 | 52 | 53 | 54 | requests module 55 | 56 | query to census API 57 | 58 | query to website using spatial info from lesson 2 (e.g. census FIPS) 59 | 60 | query to OCS 61 | query to arcGIS hosted feature services 62 | -------------------------------------------------------------------------------- /practice/arcpy_excercises/selection on field values/subset_vals.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Created on Tue Apr 20 11:05:48 2021 4 | 5 | @author: jbousqui 6 | """ 7 | 8 | import arcpy 9 | 10 | 11 | # Set your file names 12 | shp = r'L:\Public\jbousqui\Code\GitHub\H2O_BEST\greenspace_access\package\tests\parcels.shp' 13 | out_shp = r'L:\Public\jbousqui\Code\GitHub\H2O_BEST\greenspace_access\package\tests\parcels2.shp' 14 | 15 | field = 'parcels1_5' # The field you want to select on 16 | 17 | # Retrieve all values from field in attribute table to list 18 | field_values = [row[0] for row in arcpy.da.SearchCursor(shp, [field])] 19 | # Long form this looks like 20 | #field_values = [] 21 | #with arcpy.da.SearchCursor(shp, [field]) as cursor: 22 | # for row in cursor: 23 | # field_values.append(row[0]) 24 | 25 | # Coerce that list to a set (all unique) then back to a list (order changes) 26 | unique_values = list(set(field_values)) 27 | 28 | # Generate selection layer from attribute query to copy 29 | # First make the shp a layer 30 | lyr = "parcels_lyr" 31 | arcpy.MakeFeatureLayer_management(shp, lyr) 32 | # Now make the sql query string, there are many ways to do this 33 | drop_values = ['COUNTIES (OTHER THAN PUBLIC SCHOOLS, COLLEGES, HOSPITALS)
INCLUDING NON-MUNICIPAL GOVERNMENT', 34 | 'FEDERAL, OTHER THAN MILITARY, FORESTS, PARKS, RECREATIONAL AREAS, HOSPITALS, COLLEGES', 35 | 'MILITARY', 36 | 'PARCELS WITH NO VALUES', 37 | 'PUBLIC COUNTY SCHOOLS - INCLUDING ALL PROPERTY OF BOARD OF PUBLIC INSTRUCTION', 38 | 'RIVERS AND LAKES, SUBMERGED LANDS', 39 | 'STATE, OTHER THAN MILITARY, FORESTS, PARKS, RECREATIONAL AREAS, COLLEGES, HOSPITALS', 40 | 'UTILITY, GAS AND ELECTRICITY, TELEPHONE AND TELEGRAPH, LOCALLY ASSESSED RAILROADS, WATER AND SEWER SERVICE, PIPELINES, CANALS, RADIO/TELEVISION COMMUNICATION', 41 | 'RIGHT-OF-WAY, STREETS, ROADS, IRRIGATION CHANNEL, DITCH, ETC.', 42 | 'MUNICIPAL, OTHER THAN PARKS, RECREATIONAL AREAS, COLLEGES, HOSPITALS', 43 | 'AIRPORTS (PRIVATE OR COMMERCIAL), BUS TERMINALS, MARINE TERMINALS, PIERS, MARINAS', 44 | 'SEWAGE DISPOSAL, SOLID WASTE, BORROW PITS, DRAINAGE RESERVOIRS, WASTE LAND, MARSH, SAND DUNES, SWAMPS',] 45 | #where_qry = "".join(["{} = '{}' OR ".format(field, val) for val in drop_values])[:-4] 46 | where_qry = "{} = '".format(field) + "' OR {} = '".format(field).join(drop_values) + "'" 47 | # Long form 48 | #where_qry = '' 49 | #for val in drop_values: 50 | # where_qry += "{} = '{}' OR ".format(field, val) 51 | #where_qry = where_qry[:-4] # Drops last 4 characters from string 52 | # New Selection of parcels that do not (INVERT) have the string criteria 53 | arcpy.management.SelectLayerByAttribute(lyr, 'NEW_SELECTION', where_qry, 'INVERT') 54 | # Save the selection as new shp (layers are cleared at end of session unless saved in mxd) 55 | arcpy.CopyFeatures_management(lyr, out_shp) 56 | -------------------------------------------------------------------------------- /practice/arcpy_excercises/toolbox/clip_excercise.docx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/practice/arcpy_excercises/toolbox/clip_excercise.docx
-------------------------------------------------------------------------------- /practice/arcpy_excercises/toolbox/clip_excercise.py: -------------------------------------------------------------------------------- 1 | #pyt to imitate clip 2 | import arcpy 3 | 4 | def clip(param_list): 5 | FC1 = param_list[0].valueAsText 6 | 7 | class Toolbox(object): 8 | def __init__(self): 9 | self.label = "Bay_Hexation" 10 | self.alias = "Bay Hex" 11 | # List of tool classes associated with this toolbox 12 | self.tools = [Bay_Hexing] 13 | 14 | class Bay_Hexing(object): 15 | def __init__(self): 16 | self.label = "Create Hexagons for Bay Polygon" 17 | self.description = "Create hexagon polygons for bay polygons that " + \ 18 | "are similar in size, clipped to and networked to" + \ 19 | " coastal NHDPlus catchments." 20 | 21 | def getParameterInfo(self): 22 | inFC = arcpy.Parameter(displayName = "Wrong Name 1", 23 | name = "inPoly", 24 | datatype = "GPFeatureLayer", 25 | parameterType = "Required", 26 | direction = "Input") 27 | clipFC = arcpy.Parameter(displayName = "Wrong Name 2", 28 | name = "inPoly2", 29 | datatype = "GPFeatureLayer", 30 | parameterType = "Required", 31 | direction = "Input") 32 | output = arcpy.Parameter(displayName = "Output", 33 | name = "outPoly", 34 | datatype = "DEFeatureClass", 35 | parameterType = "Required", 36 | direction = "Output") 37 | params = [inFC, output] 38 | return params 39 | 40 | def isLicensed(self): 41 | return True 42 | 43 | def updateParameters(self, params): 44 | return 45 | 46 | def updateMessages(self, params): 47 | return 48 | 49 | def execute(self, params, messages): 50 | #try: 51 | clip(params) 52 | #except: 53 | -------------------------------------------------------------------------------- /practice/data/99_files.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/practice/data/99_files.zip 
-------------------------------------------------------------------------------- /practice/data/Resources.md: -------------------------------------------------------------------------------- 1 | ### Federal GIS Servers (rather incomplete): 2 | https://mappingsupport.com/p/surf_gis/list-federal-GIS-servers.pdf 3 | ### State GIS resources 4 | https://mappingsupport.com/p/surf_gis/list-state-GIS-servers.pdf 5 | 6 | 7 | ### How to find IDLE on your machine: 8 | IDLE comes with the standard install of python. Depending on where python is installed on your machine, the shortcut to it may be in different places. Mine was installed with ArcGIS 10.3 so it was: 9 | C:\Python27\ArcGIS10.3\Lib\idlelib\idle.pyw 10 | 11 | If I want to run python from the command line instead of IDLE: 12 | C:\Python27\ArcGIS10.3\python.exe 13 | 14 | If you don't know where it is, you can create a python file by changing the extension on a text file (.txt) to .py, then right-click and "Edit with IDLE" 15 | 16 | 17 | ### Forward slash vs backslash: 18 | I briefly addressed file paths yesterday and since had a great question about using forward slashes "/" instead of the backslash "\" (I'd never tried). 19 | 20 | https://imgs.xkcd.com/comics/backslashes.png 21 | 22 | Unix uses forward slashes, and although Windows uses mainly backslashes it will usually accept either (Stack explanation of why). The catch is that not all software written for Windows will always accept / (arc & QGIS seem to). Using os.sep from the os module is the safest solution. If you're using "/" but concerned, os.path.normpath() can be used to normalize the pathname by collapsing redundant separators or, on Windows, converting / to \. Using a raw string is my go-to since that lets me copy it directly (apparently common). As to the "There should be one-- and preferably only one --obvious way to do it." - don't hardcode paths, make your user supply it or derive it using the os module.
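A quick sketch of those os module options (the folder and file names are illustrative):

```python
import os

# os.sep is the separator for the platform the script runs on
# ('\\' on Windows, '/' elsewhere)
sep = os.sep

# os.path.join builds a path without hardcoding either slash
p = os.path.join('folder', 'subfolder', 'file.txt')

# os.path.normpath collapses redundant separators and, on Windows,
# converts '/' to '\\'
n = os.path.normpath('folder//subfolder/./file.txt')
```

Either way the resulting path uses the right separator for wherever the script actually runs.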
23 | 24 | 25 | 26 | ### Resources for learning python 27 | Collection of Python books and documents contributed by EPA Python users: https://usepa.sharepoint.com/sites/oei_Work/edapservicecenter/Shared%20Documents/PythonBooks?csf=1 28 | 29 | Python courses on EPA's Skillport site: https://epa.skillport.com/skillportfe/main.action#browse/c739e936-2e2d-47c2-b5c8-c4a3e6e635fb 30 | 31 | "Learn Python The Hard Way" https://learnpythonthehardway.org/ 32 | 33 | Learn Python: https://www.learnpython.org/ 34 | 35 | Hackerrank Python Tutorials: https://www.hackerrank.com/python-tutorial 36 | 37 | Code Academy Intro to Python Course: https://www.codecademy.com/learn/learn-python 38 | 39 | ### GeoSpatial Instructional Videos 40 | Introduction to ArcPy - https://vimeo.com/133590224 41 | Creating Variables and Assigning Data - https://vimeo.com/104028282 42 | Basic Python Statements - https://vimeo.com/105271585 43 | Creating and Using Functions - https://vimeo.com/107270986 44 | Lists, Tuples, Dictionaries - https://vimeo.com/109254737 45 | Classes and Objects - https://vimeo.com/110920298 46 | Reading and Writing Text Files - https://vimeo.com/111126962 47 | 48 | ### Learning Python for Data Science 49 | 50 | Python for Data Science on EPA's Skillport site: https://epa.skillport.com/skillportfe/main.action#browse/ee90310a-0f07-4274-95e5-5c1f5f30e4a2 51 | 52 | DataCamp Python for Data Science course: https://www.datacamp.com/courses/intro-to-python-for-data-science 53 | 54 | ### Local Arcpy resources 55 | L:\Public\jbousqui\GIS_Resources 56 | 57 | ### Pep8 style guide: 58 | https://www.python.org/dev/peps/pep-0008/ 59 | 60 | ### Why is something un-pythonic? 61 | https://docs.quantifiedcode.com/python-anti-patterns/index.html 62 | 63 | 64 | ### R or python 65 | My opinion is each has its use cases where it is better suited.
I've been accused of always answering with "it depends," so a little more - R was written by statisticians as a user-friendly way to do data analysis, stats and graphical models; python was written by programmers as an efficient and readable programming language and has had stats capabilities added to it. So python is more efficient and generally plays more nicely with other code/programs, but R has more statistics and visualization capacity and many of those tools are more user friendly. Both have extensive user communities, so if "there is a library/package for that" isn't true of either yet, just give it time (year-old example [here](https://elitedatascience.com/r-vs-python-for-data-science)). Here is what other people say: 66 | 67 | https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis 68 | 69 | https://www.kdnuggets.com/2015/05/r-vs-python-data-science.html 70 | 71 | https://www.datascience.com/blog/r-vs-python-for-data-models-data-science 72 | -------------------------------------------------------------------------------- /practice/data/sinuosity.pyt: -------------------------------------------------------------------------------- 1 | import arcpy 2 | 3 | class Toolbox(object): 4 | def __init__(self): 5 | self.label = "Sinuosity toolbox" 6 | self.alias = "sinuosity" 7 | 8 | # List of tool classes associated with this toolbox 9 | self.tools = [CalculateSinuosity] 10 | 11 | class CalculateSinuosity(object): 12 | def __init__(self): 13 | self.label = "Calculate Sinuosity" 14 | self.description = "Sinuosity measures the amount that a river " + \ 15 | "meanders within its valley, calculated by " + \ 16 | "dividing total stream length by valley length."
17 | 18 | def getParameterInfo(self): 19 | #Define parameter definitions 20 | 21 | # Input Features parameter 22 | in_features = arcpy.Parameter( 23 | displayName="Input Features", 24 | name="in_features", 25 | datatype="GPFeatureLayer", 26 | parameterType="Required", 27 | direction="Input") 28 | 29 | in_features.filter.list = ["Polyline"] 30 | 31 | # Sinuosity Field parameter 32 | sinuosity_field = arcpy.Parameter( 33 | displayName="Sinuosity Field", 34 | name="sinuosity_field", 35 | datatype="Field", 36 | parameterType="Optional", 37 | direction="Input") 38 | 39 | sinuosity_field.value = "sinuosity" 40 | 41 | # Derived Output Features parameter 42 | out_features = arcpy.Parameter( 43 | displayName="Output Features", 44 | name="out_features", 45 | datatype="GPFeatureLayer", 46 | parameterType="Derived", 47 | direction="Output") 48 | 49 | out_features.parameterDependencies = [in_features.name] 50 | out_features.schema.clone = True 51 | 52 | parameters = [in_features, sinuosity_field, out_features] 53 | 54 | return parameters 55 | 56 | def isLicensed(self): #optional 57 | return True 58 | 59 | def updateParameters(self, parameters): #optional 60 | if parameters[0].altered: 61 | parameters[1].value = arcpy.ValidateFieldName(parameters[1].value, 62 | parameters[0].value) 63 | return 64 | 65 | def updateMessages(self, parameters): #optional 66 | return 67 | 68 | def execute(self, parameters, messages): 69 | inFeatures = parameters[0].valueAsText 70 | fieldName = parameters[1].valueAsText 71 | 72 | if fieldName in ["#", "", None]: 73 | fieldName = "sinuosity" 74 | 75 | arcpy.AddField_management(inFeatures, fieldName, 'DOUBLE') 76 | 77 | expression = ''' 78 | import math 79 | def getSinuosity(shape): 80 | length = shape.length 81 | d = math.sqrt((shape.firstPoint.X - shape.lastPoint.X) ** 2 + 82 | (shape.firstPoint.Y - shape.lastPoint.Y) ** 2) 83 | return d/length 84 | ''' 85 | 86 | arcpy.CalculateField_management(inFeatures, 87 | fieldName, 88 | 'getSinuosity(!shape!)', 
89 | 'PYTHON_9.3', 90 | expression) 91 | -------------------------------------------------------------------------------- /practice/data/tl_2019_12113_addrfeat.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/practice/data/tl_2019_12113_addrfeat.zip -------------------------------------------------------------------------------- /practice/data/utils.py: -------------------------------------------------------------------------------- 1 | # Utility functions 2 | # Author: Justin Bousquin 3 | # Email: bousquin.justin@epa.gov 4 | 5 | import arcpy 6 | 7 | def message(s): 8 | print(s) 9 | 10 | 11 | def field_exists(table, field): 12 | """Check if field exists in table 13 | Notes: returns True/False 14 | """ 15 | fieldList = [f.name for f in arcpy.ListFields(table)] 16 | return field in fieldList 17 | 18 | 19 | def field_to_list(table, field): 20 | """Read Field in Table to List 21 | Notes: field as string, 1 field at a time 22 | Example: lst = field_to_list("table.shp", "fieldName") 23 | """ 24 | lst = [] 25 | # Check that field exists in table 26 | if field_exists(table, field): 27 | # Use cursor to iterate through table 28 | with arcpy.da.SearchCursor(table, [field]) as cursor: 29 | for row in cursor: 30 | lst.append(row[0]) 31 | # may be a faster implementation w/ lst += [row[0]] 32 | else: 33 | message("{} could not be found in {}".format(field, table)) 34 | message("Empty values will be returned.") 35 | return lst 36 | -------------------------------------------------------------------------------- /practice/pyCOP_6_8_2020.md: -------------------------------------------------------------------------------- 1 | # Intro to downloading data, with urllib/requests, basic APIs and querying map services.
2 | - Introduction 3 | - Imitating a manual workflow 4 | - Scaling that workflow 5 | - Data from an API 6 | - Data from a map service 7 | 8 | # Imitating a manual workflow 9 | Let's say we wanted to get a shapefile for Santa Rosa County, where the Gulf Breeze lab is. Doing this manually: 10 | 1. Google "santa rosa county census shapefile" and follow the first [hit](https://catalog.data.gov/dataset/tiger-line-shapefile-2019-county-santa-rosa-county-fl-address-range-feature-county-based) 11 | 2. Scroll down on the site to find the first download button and copy its link address. It should be: 12 | https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_12113_addrfeat.zip 13 | 14 | Doing this in python using [urllib](https://docs.python.org/3/library/urllib.request.html): 15 | 16 | import urllib.request # python 2 only needs: import urllib 17 | 18 | 19 | # Set a variable to the link 20 | url = "https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_12113_addrfeat.zip" 21 | 22 | filename = r"C:\Users\\Desktop\SantaRosaCounty.zip" # Save As filename 23 | 24 | # python 2 25 | urllib.urlretrieve(url, filename) 26 | # python 3: urlretrieve has been nested in urllib.request 27 | urllib.request.urlretrieve(url, filename) 28 | 29 | # Scaling the workflow 30 | All the county file names are based on FIPS (Federal Information Processing Standards) codes.
For our example: 12 = Florida, 113 = Santa Rosa ([list](https://www.census.gov/prod/techdoc/cbp/cbp95/st-cnty.pdf)) 31 | We can create a quick function to download each county from a list: 32 | 33 | import urllib.request 34 | 35 | 36 | def downloadCounty(fip): 37 | url = "https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_{}_addrfeat.zip".format(fip) 38 | filename = r"C:\Users\\Desktop\{}.zip".format(fip) 39 | urllib.request.urlretrieve(url, filename) 40 | 41 | 42 | fip_list = ['12113', '12033', '01003',] 43 | for fip in fip_list: 44 | downloadCounty(fip) 45 | 46 | Or instead, using [requests](https://requests.readthedocs.io/en/master/): 47 | 48 | import requests 49 | 50 | response = requests.get(url) 51 | zip_bytes = response.content # The download can then be altered in memory or written to filename 52 | 53 | # Data from an API 54 | Now that we have our 3 shapefiles the next step might be to get data for those counties. Many datasets are available through an Application Programming Interface (API). 55 | Census makes many of their [datasets available](https://www.census.gov/data/developers/data-sets.html) through a REST API. 56 | 57 | We'll work through an example where we Get information via an API, but many are set up to allow you to Post new data or alter existing data. 58 | 59 | 2018 5-year ACS 60 | https://www.census.gov/data/developers/data-sets/acs-5year.html 61 | 62 | Each dataset has a brief description as well as: 63 | - API Call - to use as the base url 64 | - Supported variables and geographies 65 | - Examples 66 | 67 | Working from the example: 68 | https://api.census.gov/data/2018/acs/acs5?get=NAME,group(B01001)&for=us:1 69 | NAME and group(B01001) are the variables (sex by age) and us:1 is the geography 70 | 71 | We can look at other variables to choose from [json](https://api.census.gov/data/2018/acs/acs5/variables.json). Then substitute those into the url.
For example for 'B01001_001' Total: 72 | https://api.census.gov/data/2018/acs/acs5?get=NAME,B01001_001&for=us:1 73 | 74 | We can change the geography in a similar way, replacing the entire US with FL: 75 | https://api.census.gov/data/2018/acs/acs5?get=NAME,B01001_001E&for=state:12 76 | 77 | For a county within a state: 78 | https://api.census.gov/data/2018/acs/acs5?get=NAME,B01001_001E&for=county:033&in=state:12 79 | 80 | For all counties in a state: 81 | https://api.census.gov/data/2018/acs/acs5?get=NAME,B01001_001E&for=county:*&in=state:12 82 | 83 | Now that we understand the structure of the url, let's explore 2020 Response Rate data: 84 | https://www.census.gov/data/developers/data-sets/decennial-response-rates.html 85 | 86 | Specifically internet response rates (CRRINT) for Santa Rosa County: 87 | https://api.census.gov/data/2020/dec/responserate?get=CRRINT&for=county:033&in=state:12 88 | For all census tracts in Santa Rosa County: 89 | https://api.census.gov/data/2020/dec/responserate?get=CRRINT&for=tract:*&in=county:033&in=state:12 90 | or 91 | https://api.census.gov/data/2020/dec/responserate?get=CRRINT&for=tract:*&in=state:12%20county:033 92 | 93 | Taking this to python, for urllib we can construct the url and use urlopen() to get the response: 94 | 95 | base_url = 'https://api.census.gov/data/2020/dec/responserate?get=' 96 | variable = 'CRRINT' 97 | geo = 'for=tract:*&in=state:12%20county:033' # the space must be percent-encoded 98 | url = '{}{}&{}'.format(base_url, variable, geo) 99 | 100 | response = urllib.urlopen(url) # python 3: urllib.request.urlopen(url) 101 | lines = response.readlines() 102 | 103 | Here we start to see an advantage to requests: 104 | 105 | base_url = 'https://api.census.gov/data/2020/dec/responserate' 106 | data = {'get': 'CRRINT', 107 | 'in': ['state:12', 'county:033'], 108 | 'for': 'tract:*', 109 | } 110 | res = requests.get(base_url, data) 111 | res.ok # Check that it didn't error 112 | result = res.content 113 | 114 | # You can also generally see the url string used 115 | res.url 116 | 117 | The
response is a json string, so you can use the json library to manipulate it: 118 | 119 | import json 120 | 121 | response_list = json.loads(result) 122 | 123 | Or a pandas dataframe: 124 | 125 | import pandas 126 | 127 | df = pandas.read_json(result) 128 | 129 | 130 | # Data from a map service 131 | The original shapefile we downloaded as a .zip is also available via an API. More specifically, because it is spatial information it is available as an ArcGIS REST Service: 132 | https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb 133 | 134 | Note: we can also open this in ArcGIS Online to see what fields we might want to query. 135 | 136 | Each of these listings is a map service, each with its own layers, e.g. Census Tracts: 137 | https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer/8 138 | 139 | Depending on the type, many of these services can be queried from the [web](https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer/8/query).
140 | 141 | To return everything - Where: 1=1 142 | 143 | Start by just returning everything in FL: 144 | Where: State=12 145 | 146 | Full link: 147 | https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer/8/query?where=State%3D12&text=&objectIds=&time=&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&relationParam=&outFields=&returnGeometry=false&returnTrueCurves=false&maxAllowableOffset=&geometryPrecision=&outSR=&returnIdsOnly=true&returnCountOnly=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&returnZ=false&returnM=false&gdbVersion=&returnDistinctValues=false&resultOffset=&resultRecordCount=&queryByDistance=&returnExtentsOnly=false&datumTransformation=&parameterValues=&rangeValues=&f=html 148 | 149 | In Python (dropping a lot of unneeded fields): 150 | https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer/8/query?where=State=12&returnGeometry=false&returnIdsOnly=true&f=html 151 | 152 | base_url = 'https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Current/MapServer/8/query' 153 | data = {'where': 'STATE=12', 154 | "returnGeometry": "false", 155 | "returnIdsOnly": "true", 156 | 'f': 'json'} 157 | 158 | response = requests.get(base_url, data) 159 | response.ok 160 | res = json.loads(response.content) 161 | res.keys() 162 | 163 | Restrict it to just Santa Rosa County and return more than just the OID: 164 | 165 | data = {"where": "STATE=12 AND COUNTY=033", 166 | "returnGeometry": "false", 167 | "outFields": "STATE, COUNTY, TRACT", 168 | "f": "json"} 169 | 170 | Return geometry: 171 | 172 | data = {"where": "STATE=12 AND COUNTY=033", 173 | "returnGeometry": "true", 174 | "returnTrueCurves": "true", 175 | "outFields": "STATE, COUNTY, TRACT", 176 | "f": "json"} 177 | 178 | # Libraries/SDK 179 | https://github.com/Esri/arcgis-python-api 180 | https://github.com/Bolton-and-Menk-GIS/restapi 181 | 182 | OGC: 183 |
https://github.com/opengeospatial/geoapi 184 | 185 | Census: 186 | https://github.com/datamade/census 187 | -------------------------------------------------------------------------------- /practice/week4_review.md: -------------------------------------------------------------------------------- 1 | ### Dictionary iterators 2 | When using dictionaries remember elements are stored by key not by position, 3 | meaning their order is ignored and they can't be indexed like lists. 4 | When looping over a dictionary it will give keys: 5 | 6 | ```python 7 | dict1 = {"word1": 1, "word2": ["Short", "Sentence"]} 8 | 9 | for item in dict1: 10 | print item 11 | ``` 12 | 13 | Keys must be unique and can be used like iterators to return the 14 | values of each key: 15 | 16 | ```python 17 | for item in dict1: 18 | print dict1[item] 19 | ``` 20 | 21 | Dictionaries also have a built-in .items() method to return both the key 22 | and its value: 23 | 24 | ```python 25 | for item in dict1.items(): 26 | print item 27 | ``` 28 | 29 | Alternatively, we can get a list of keys and iterate over that if we wanted: 30 | 31 | ```python 32 | for key in dict1.keys(): 33 | print dict1[key] 34 | print "Position : " + str(dict1.keys().index(key)) 35 | ``` 36 | 37 | And, as you might have suspected, you can get a list of the values using 38 | .values(): 39 | 40 | ```python 41 | for item in dict1.values(): 42 | print item 43 | ``` 44 | 45 | Note - when I say list I mean it very literally (these examples are python 2; in python 3 .keys() and .values() return view objects instead): 46 | 47 | ```python 48 | dict1.values() 49 | type(dict1.values()) 50 | ``` 51 | 52 | ### New data types set() and tuple() 53 | If you look at each item in a dict.items() list, you may notice an unfamiliar data type 54 | (a, b). These are [tuples](https://www.tutorialspoint.com/python/python_tuples.htm).
Tuples are like lists except they are immutable,
55 | meaning you can't update the value of elements in place like with a list:
56 | 
57 | ```python
58 | var = "a", "b"
59 | var[0]
60 | var[1]
61 | var[1] = "new value"  # raises TypeError - tuples can't be changed in place
62 | ```
63 | 
64 | A set is an unordered collection of unique elements.
65 | 
66 | ```python
67 | lst1 = ['a', 'a', 'c', 'd']
68 | print set(lst1)
69 | ```
70 | 
71 | I wouldn't consider either of these data types my "go to," but they're both useful.
72 | 
73 | ### Matrix math with lists, tuples, and zip()
74 | 
75 | The zip() function will take items in two lists and combine them (by element position)
76 | into a list of tuples:
77 | 
78 | ```python
79 | lst1 = range(1, 9)
80 | lst2 = range(11, 19)
81 | lst3 = zip(lst1, lst2)
82 | print lst3
83 | ```
84 | 
85 | Now we could loop over that list and do a math function:
86 | 
87 | ```python
88 | lst4 = []
89 | for i in lst3:
90 |     lst4.append(sum(i))
91 | ```
92 | 
93 | If it were a more complex, custom math function, we could unpack each tuple:
94 | 
95 | ```python
96 | lst4 = []
97 | for a, b in lst3:
98 |     lst4.append(a + b)
99 | ```
100 | 
101 | This may not seem all that much better, but next we will add list comprehension...
102 | 
103 | ### List Comprehension
104 | Sometimes you have a simple conditional that you want to use to remove a subset of values:
105 | 
106 | ```python
107 | lst1 = ["NA", 1, 2, 4, 8]
108 | lst2 = []
109 | for item in lst1:
110 |     # isinstance() is preferred over comparing type(item) == type(1)
111 |     if isinstance(item, int):
112 |         lst2.append(item)
113 | ```
114 | 
115 | But that is a lot of lines of code just to remove NA.
116 | List comprehension allows you to do this pythonically, as one line:
117 | 
118 | ```python
119 | lst3 = [item for item in lst1 if isinstance(item, int)]
120 | ```
121 | 
122 | This also lets you do things to item, like math, before adding it to the new list:
123 | 
124 | ```python
125 | lst4 = [item + 1 for item in lst1 if isinstance(item, int)]
126 | ```
127 | 
128 | Now let's take it back to tuples; we had those two lists we wanted to add together.
129 | zip let us make one list with each tuple in it. Now we can use list comprehension
130 | to do math using each tuple:
131 | 
132 | ```python
133 | lst1 = range(1, 9)
134 | lst2 = range(11, 19)
135 | 
136 | lst3 = [a + b for a, b in zip(lst1, lst2)]
137 | ```
138 | 
139 | All in one line, and we could even add in removal of NA again:
140 | 
141 | ```python
142 | lst1 = range(1, 9)
143 | lst2 = ["NA"] + range(12, 19)
144 | 
145 | lst3 = [a + b for a, b in zip(lst1, lst2) if a != "NA" and b != "NA"]
146 | ```
147 | 
--------------------------------------------------------------------------------
/practice/week4_web_demo.md:
--------------------------------------------------------------------------------
1 | Now we'll do a similar exercise as with arcpy, but instead of deleting a list of fields
2 | from an attribute table we will delete a list of files from a system directory.
3 | 
4 | # Warning: be careful with this, it is way too easy to delete something unintentionally.
5 | 
6 | First we'll list everything in a given folder using [os.listdir()](https://www.tutorialspoint.com/python/os_listdir.htm)
7 | 
8 | ```python
9 | import os
10 | my_path = r"C:\Users\\Desktop\test_folder"
11 | contents_list = os.listdir(my_path)
12 | ```
13 | On inspection we see this lists everything in the folder, files and sub-folders.
14 | 
15 | Next we will filter out just the files using [os.path.isfile](https://docs.python.org/2/library/os.path.html) and [os.path.join](https://docs.python.org/2/library/os.path.html).
16 | os.path.join is being used to get the full filename and path to pass to os.path.isfile to test:
17 | 
18 | ```python
19 | # Long way
20 | file_list = []
21 | for f in contents_list:
22 |     if os.path.isfile(os.path.join(my_path, f)):
23 |         file_list.append(f)
24 | 
25 | # List comprehension
26 | file_list = [f for f in contents_list if os.path.isfile(os.path.join(my_path, f))]
27 | ```
28 | 
29 | Last we delete the files:
30 | 
31 | ```python
32 | for f in file_list:
33 |     os.remove(os.path.join(my_path, f))  # join the path back on; f is just the filename
34 | ```
35 | 
36 | Putting it all together (and putting os.listdir() in the list comprehension too):
37 | 
38 | ```python
39 | import os
40 | 
41 | my_path = r"C:\Users\\Desktop\test_folder"
42 | file_list = [f for f in os.listdir(my_path) if os.path.isfile(os.path.join(my_path, f))]
43 | 
44 | for f in file_list:
45 |     os.remove(os.path.join(my_path, f))
46 | ```
47 | 
48 | Additional conditional statements make this type of thing more useful. For example,
49 | when going through a folder of downloaded zip files you may want to delete ones that
50 | downloaded incorrectly or you've already unzipped the needed files from. In a similar
51 | way a list of files could be used to rename all the photos in a folder to follow some
52 | desired convention.
--------------------------------------------------------------------------------
/practice/week5_arcpy_cursor_object.md:
--------------------------------------------------------------------------------
1 | The utils.field_to_list() function you've been using relies on a [data access cursor object](http://desktop.arcgis.com/en/arcmap/10.3/analyze/arcpy-classes/cursor.htm).
2 | 
3 | ```python
4 | def field_to_list(table, field):
5 |     """Read Field in Table to List
6 |     Notes: field as string, 1 field at a time
7 |     Example: lst = field_to_list("table.shp", "fieldName")
8 |     """
9 |     lst = []
10 |     # Check that field exists in table
11 |     if field_exists(table, field):
12 |         # Use cursor to iterate through table
13 |         with arcpy.da.SearchCursor(table, [field]) as cursor:
14 |             for row in cursor:
15 |                 lst.append(row[0])
16 |                 # may be a faster implementation w/ lst += [row[0]]
17 |         return lst
18 |     else:
19 |         message("{} could not be found in {}".format(field, table))
20 |         message("Empty values will be returned.")
21 | ```
22 | 
23 | Here we just use a search cursor to read values from a field.
24 | We could instead use an update cursor to add values to a field from a list:
25 | 
26 | ```python
27 | def list_to_field(table, field, lst):
28 |     i = 0
29 |     with arcpy.da.UpdateCursor(table, [field]) as cursor:
30 |         for row in cursor:
31 |             row[0] = lst[i]
32 |             i += 1
33 |             cursor.updateRow(row)
34 | ```
35 | 
36 | Notice the update cursor uses a .updateRow() method.
37 | 
38 | We could also use a cursor to read other attributes, like the [geometry object](http://pro.arcgis.com/en/pro-app/arcpy/get-started/reading-geometries.htm) of the cursor (SHAPE@).
39 | Or, if we are using it in a web service query, we can get the geometry directly as a JSON string (SHAPE@JSON).
40 | You can also change the cursor [spatial reference](http://pro.arcgis.com/en/pro-app/arcpy/get-started/setting-a-cursor-s-spatial-reference.htm) without re-projecting the feature.
--------------------------------------------------------------------------------
/practice/week5_pyt_toolbox.md:
--------------------------------------------------------------------------------
1 | You've learned how to script in the python window and how to save those scripts to run later.
2 | But what if you want to write a process and then allow someone else, unfamiliar with python, to use it with different inputs?
3 | There are two ways to integrate python scripts into ArcGIS tools: using a wizard to create a .tbx, or using python for the front-end interface in a python toolbox (.pyt) file.
4 | Each of those two approaches has advantages and disadvantages.
5 | 
6 | [Creating a .pyt](http://desktop.arcgis.com/en/arcmap/10.3/analyze/creating-tools/a-quick-tour-of-python-toolboxes.htm)
7 | 
8 | ```python
9 | import arcpy
10 | 
11 | class Toolbox(object):
12 |     def __init__(self):
13 |         self.label = "Sinuosity toolbox"
14 |         self.alias = "sinuosity"
15 | 
16 |         # List of tool classes associated with this toolbox
17 |         self.tools = [CalculateSinuosity]
18 | 
19 | class CalculateSinuosity(object):
20 |     def __init__(self):
21 |         self.label = "Calculate Sinuosity"
22 |         self.description = "Sinuosity measures the amount that a river " + \
23 |                            "meanders within its valley, calculated by " + \
24 |                            "dividing total stream length by valley length."
25 | 
26 |     def getParameterInfo(self):
27 |         # Define parameter definitions
28 | 
29 |         # Input Features parameter
30 |         in_features = arcpy.Parameter(
31 |             displayName="Input Features",
32 |             name="in_features",
33 |             datatype="GPFeatureLayer",
34 |             parameterType="Required",
35 |             direction="Input")
36 | 
37 |         in_features.filter.list = ["Polyline"]
38 | 
39 |         # Sinuosity Field parameter
40 |         sinuosity_field = arcpy.Parameter(
41 |             displayName="Sinuosity Field",
42 |             name="sinuosity_field",
43 |             datatype="Field",
44 |             parameterType="Optional",
45 |             direction="Input")
46 | 
47 |         sinuosity_field.value = "sinuosity"
48 | 
49 |         # Derived Output Features parameter
50 |         out_features = arcpy.Parameter(
51 |             displayName="Output Features",
52 |             name="out_features",
53 |             datatype="GPFeatureLayer",
54 |             parameterType="Derived",
55 |             direction="Output")
56 | 
57 |         out_features.parameterDependencies = [in_features.name]
58 |         out_features.schema.clone = True
59 | 
60 |         parameters = [in_features, sinuosity_field, out_features]
61 | 
62 |         return parameters
63 | 
64 |     def isLicensed(self):  # optional
65 |         return True
66 | 
67 |     def updateParameters(self, parameters):  # optional
68 |         if parameters[0].altered:
69 |             parameters[1].value = arcpy.ValidateFieldName(parameters[1].value,
70 |                                                           parameters[0].value)
71 |         return
72 | 
73 |     def updateMessages(self, parameters):  # optional
74 |         return
75 | 
76 |     def execute(self, parameters, messages):
77 |         inFeatures = parameters[0].valueAsText
78 |         fieldName = parameters[1].valueAsText
79 | 
80 |         if fieldName in ["#", "", None]:
81 |             fieldName = "sinuosity"
82 | 
83 |         arcpy.AddField_management(inFeatures, fieldName, 'DOUBLE')
84 | 
85 |         expression = '''
86 | import math
87 | def getSinuosity(shape):
88 |     length = shape.length
89 |     d = math.sqrt((shape.firstPoint.X - shape.lastPoint.X) ** 2 +
90 |                   (shape.firstPoint.Y - shape.lastPoint.Y) ** 2)
91 |     return length / d  # total stream length divided by straight-line (valley) length
92 | '''
93 | 
94 |         arcpy.CalculateField_management(inFeatures,
95 |                                         fieldName,
96 |                                         'getSinuosity(!shape!)',
97 |                                         'PYTHON_9.3',
98 |                                         expression)
99 | ```
100 | 
101 | 
--------------------------------------------------------------------------------
/practice/week5_review.md:
--------------------------------------------------------------------------------
1 | Object Oriented Programming can get complex, but the basic idea is that you should reuse the same structures and modules.
2 | You interact with objects (instances) of classes with set structures, meaning an object will have set attributes and its own functions, or methods, that can be used on it.
3 | 
4 | ## Creating a new object
5 | 
6 | ```python
7 | class NewClass(object):
8 |     def __init__(self, var1):
9 |         self.var1 = var1
10 | ```
11 | 
12 | Test it out:
13 | 
14 | ```python
15 | something = NewClass("a")
16 | something
17 | something.var1
18 | ```
19 | 
20 | Add a method to the new class:
21 | 
22 | ```python
23 | class NewClass(object):
24 |     def __init__(self, var1):
25 |         self.var1 = var1
26 |     def special_method(self, char):
27 |         self.special = char + self.var1 + char
28 | ```
29 | 
30 | Test it out:
31 | 
32 | ```python
33 | something = NewClass("a")
34 | something.special  # AttributeError - not set until special_method() runs
35 | something.special_method("_")
36 | something.special
37 | ```
38 | 
39 | ## File Objects and input/output
40 | Commonly you'll have to read an input from, or write an output to, a .txt or .csv file.
41 | These files are treated as objects.
42 | 
43 | First set the file and file path to a variable:
44 | 
45 | ```python
46 | import os
47 | 
48 | filepath = r"L:\Public\jbousqui\Code\GitHub\py_workgroup\practice\data"
49 | completeFile = os.path.join(filepath, "TestTextFile.txt")
50 | ```
51 | 
52 | Next open that file as a file object:
53 | 
54 | ```python
55 | file_object = open(completeFile, "r")
56 | file_object.close()
57 | ```
58 | 
59 | The second argument in open(), "r" in this case, determines what will be done with the file: "r" for read, "a" for append, "w" for write and "r+" for both reading and writing.
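To see those modes in action, here's a quick sketch (it writes to a throwaway file in a temp directory, so nothing important gets touched; the file name is made up):

```python
import os
import tempfile

# A scratch file in a temp directory - safe to experiment on
demo_file = os.path.join(tempfile.mkdtemp(), "demo.txt")

# "w" creates the file, or truncates it if it already exists
file_object = open(demo_file, "w")
file_object.write("first line\n")
file_object.close()

# "a" appends to the end, keeping what is already there
file_object = open(demo_file, "a")
file_object.write("second line\n")
file_object.close()

# "r" reads the contents back
file_object = open(demo_file, "r")
print(file_object.read())
file_object.close()
```

Trying to .write() while the file is open in "r" mode raises an error (IOError in Python 2, io.UnsupportedOperation in Python 3).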
Also note that we close() the file when we are done with it. To read the file again, or open it in a different mode, it must be closed and re-opened. To test whether a file has been closed you can use .closed
60 | 
61 | Better yet, you can use the with ... as syntax to automatically close the file:
62 | 
63 | ```python
64 | with open(completeFile, "r") as file_obj:
65 |     print file_obj.read()
66 | 
67 | file_obj.closed
68 | ```
69 | 
70 | Once a file is open, the .read() and .write() methods can be used. To read a multi-line file one line at a time use .readline(), or .readlines() to get all the lines as a list. When writing to a file, "\n" is used to denote the start of a new line.
71 | 
72 | ```python
73 | with open(completeFile, "r") as file_obj:
74 |     print file_obj.readlines()
75 | ```
76 | 
77 | ```python
78 | with open(completeFile, "r+") as file_obj:
79 |     file_obj.write("new line")
80 | ```
81 | Notice where the new line of text was added ("r+" writes at the start of the file, overwriting what is there). Let's fix line one and this time append to the end:
82 | 
83 | ```python
84 | with open(completeFile, "a") as file_obj:
85 |     file_obj.write("\nnew line")
86 | ```
87 | Another way around this is to read the file to a variable and then re-write the file:
88 | 
89 | ```python
90 | with open(completeFile, "r+") as file_obj:
91 |     lines = file_obj.read() + "\nNewest line"
92 |     file_obj.seek(0)
93 |     file_obj.write(lines)
94 | ```
95 | 
96 | Think of .seek() as moving to a character position within the text, starting at 0.
97 | If looking to change a certain setting in a file, it is often useful to .readlines(), update the line that meets a condition, and re-write the file with the new setting.
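Putting that last idea together, here is a sketch of updating a "setting" by re-writing the file (the settings file and its contents are made up for illustration):

```python
import os
import tempfile

# Make up a small settings file to practice on
settings_file = os.path.join(tempfile.mkdtemp(), "settings.txt")
with open(settings_file, "w") as file_obj:
    file_obj.write("name=test\nverbose=false\nunits=meters\n")

# Read every line, then swap out the one that meets our condition
with open(settings_file, "r") as file_obj:
    lines = file_obj.readlines()

new_lines = []
for line in lines:
    if line.startswith("verbose="):
        new_lines.append("verbose=true\n")  # the updated setting
    else:
        new_lines.append(line)

# Re-write the whole file with the updated lines
with open(settings_file, "w") as file_obj:
    file_obj.write("".join(new_lines))
```

This read-everything, re-write-everything pattern avoids the .seek() bookkeeping and works no matter how the line lengths change.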
98 | -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2018_01_09.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2018_01_09.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2018_01_16.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2018_01_16.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2018_01_23.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2018_01_23.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2018_01_30.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2018_01_30.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2018_02_05.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2018_02_05.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2020_02_20.pptx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2020_02_20.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2020_02_27.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2020_02_27.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2020_03_04.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2020_03_04.pptx -------------------------------------------------------------------------------- /slides/GED_Python_Workgroup_2020_03_12.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jbousquin/py_workgroup/4a758203e2bf87e664d469c06f028b509d953fa7/slides/GED_Python_Workgroup_2020_03_12.pptx --------------------------------------------------------------------------------