├── .github
│   └── workflows
│       └── update.yml
├── README.md
├── Street_Tree_List.csv
└── requirements.txt

/.github/workflows/update.yml:
--------------------------------------------------------------------------------
name: Scrape latest data

on:
  push:
  workflow_dispatch:
  schedule:
    - cron: '21 11 * * *'

jobs:
  scheduled:
    runs-on: ubuntu-latest
    steps:
    - name: Check out this repo
      uses: actions/checkout@v4
      with:
        fetch-depth: 0
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: 3.13
    - uses: actions/cache@v4
      name: Configure pip caching
      with:
        path: ~/.cache/pip
        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
        restore-keys: |
          ${{ runner.os }}-pip-
    - name: Install Python dependencies
      run: |-
        pip install -r requirements.txt
    - name: Fetch latest data
      run: |-
        cp Street_Tree_List.csv Street_Tree_List-old.csv
        curl -o Street_Tree_List-unsorted.csv "https://data.sfgov.org/api/views/tkzw-k3nq/rows.csv?accessType=DOWNLOAD"
        # Remove heading line and use it to start a new file
        head -n 1 Street_Tree_List-unsorted.csv > Street_Tree_List.csv
        # Sort all but the first line and append to that file
        tail -n +2 "Street_Tree_List-unsorted.csv" | sort >> Street_Tree_List.csv
        # Generate commit message using csv-diff
        csv-diff Street_Tree_List-old.csv Street_Tree_List.csv --key=TreeID --singular=tree --plural=trees > message.txt
    - name: Commit and push if it changed
      run: |-
        git config user.name "Automated"
        git config user.email "actions@users.noreply.github.com"
        git add Street_Tree_List.csv
        timestamp=$(date -u)
        git commit -F message.txt || exit 0
        git push

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# sf-tree-history

Tracking the history of trees in San Francisco.

Background: [Generating a commit log for San Francisco’s official list of trees](https://simonwillison.net/2019/Mar/13/tree-history/). See also [Git Scraping](https://simonwillison.net/2020/Oct/9/git-scraping/) for a description of the general technique.

This repository [uses GitHub Actions](https://github.com/simonw/sf-tree-history/actions) to retrieve the [official CSV file of trees in San Francisco](https://data.sfgov.org/City-Infrastructure/Street-Tree-List/tkzw-k3nq) once a day and track any changes to it over time using the git commit history.

It uses [csv-diff](https://github.com/simonw/csv-diff) to generate human-readable commit messages.

You can see recent changes to the CSV file here: https://github.com/simonw/sf-tree-history/commits/main

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
csv-diff

--------------------------------------------------------------------------------
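
The "Fetch latest data" step in `update.yml` keeps the CSV heading line in place and sorts every other line, so that a reordered download does not show up as a spurious diff in the commit history. A minimal Python sketch of that idea follows. It is not part of this repository: the helper name `sort_csv_body` and the sample rows are made up for illustration (only the `TreeID` column name comes from the workflow), and it orders lines by Unicode code point, which may differ from the locale-dependent ordering of the shell `sort` command.

```python
def sort_csv_body(text: str) -> str:
    """Keep the first line as the header and sort all remaining lines.

    Hypothetical helper mirroring `head -n 1` followed by
    `tail -n +2 | sort`. Lines are compared by Unicode code point,
    not by the current locale's collation rules.
    """
    lines = text.splitlines()
    if not lines:
        return ""
    header, body = lines[0], lines[1:]
    # Re-join with a trailing newline so the output ends cleanly
    return "\n".join([header, *sorted(body)]) + "\n"


# Made-up sample data; the real file has many more columns.
sample = "TreeID,Species\n3,Oak\n1,Elm\n2,Fig\n"
print(sort_csv_body(sample), end="")
```

Like the shell pipeline, this treats the file as plain lines rather than parsed CSV, so a quoted field containing a newline would be split across sorted lines.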