├── .github └── ISSUE_TEMPLATE │ ├── bug_report.md │ └── feature_request.md ├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── README_about_datasets.md ├── conda_requirements.txt ├── docs └── vortex.remodeled.froya.utc0.100m.pdf ├── examples ├── example_1_read_netcdf.py ├── example_2_read_txt.py ├── example_2_read_txt_functions.py ├── example_3_merge.py ├── example_3_merge_functions.py ├── example_4_basic_plots.py ├── example_4_basic_plots_functions.py ├── example_5_MeasureCorrelatePredict.py └── example_5_MeasureCorrelatePredict_functions.py ├── images ├── Froya-map.png └── logo_VORTEX.png └── notebooks ├── example_1_read_netcdf.ipynb ├── example_2_read_txt.ipynb ├── example_3_merge.ipynb ├── example_4_basic_plots.ipynb └── example_5_MeasureCorrelatePredict.ipynb /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Smartphone (please complete the following information):** 32 | - Device: [e.g. iPhone6] 33 | - OS: [e.g. iOS8.1] 34 | - Browser [e.g. stock browser, safari] 35 | - Version [e.g. 22] 36 | 37 | **Additional context** 38 | Add any other context about the problem here. 39 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 
21 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | .idea/* 6 | .idea/ 7 | # C extensions 8 | *.so 9 | 10 | # Distribution / packaging 11 | .Python 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | wheels/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | cover/ 54 | 55 | # Translations 56 | *.mo 57 | *.pot 58 | 59 | # Django stuff: 60 | *.log 61 | local_settings.py 62 | db.sqlite3 63 | db.sqlite3-journal 64 | 65 | # Flask stuff: 66 | instance/ 67 | .webassets-cache 68 | 69 | # Scrapy stuff: 70 | .scrapy 71 | 72 | # Sphinx documentation 73 | docs/_build/ 74 | 75 | # PyBuilder 76 | .pybuilder/ 77 | target/ 78 | 79 | # Jupyter Notebook 80 | .ipynb_checkpoints 81 | 82 | # IPython 83 | profile_default/ 84 | ipython_config.py 85 | 86 | # pyenv 87 | # For a library or package, you might want to ignore these files since the code is 88 | # intended to run in multiple environments; otherwise, check them in: 89 | # .python-version 90 | 91 | # pipenv 92 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 93 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 94 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 95 | # install all needed dependencies. 96 | #Pipfile.lock 97 | 98 | # poetry 99 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 100 | # This is especially recommended for binary packages to ensure reproducibility, and is more 101 | # commonly ignored for libraries. 102 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 103 | #poetry.lock 104 | 105 | # pdm 106 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 107 | #pdm.lock 108 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it 109 | # in version control. 110 | # https://pdm.fming.dev/#use-with-ide 111 | .pdm.toml 112 | 113 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 114 | __pypackages__/ 115 | 116 | # Celery stuff 117 | celerybeat-schedule 118 | celerybeat.pid 119 | 120 | # SageMath parsed files 121 | *.sage.py 122 | 123 | # Environments 124 | .env 125 | .venv 126 | env/ 127 | venv/ 128 | ENV/ 129 | env.bak/ 130 | venv.bak/ 131 | 132 | # Spyder project settings 133 | .spyderproject 134 | .spyproject 135 | 136 | # Rope project settings 137 | .ropeproject 138 | 139 | # mkdocs documentation 140 | /site 141 | 142 | # mypy 143 | .mypy_cache/ 144 | .dmypy.json 145 | dmypy.json 146 | 147 | # Pyre type checker 148 | .pyre/ 149 | 150 | # pytype static type analyzer 151 | .pytype/ 152 | 153 | # Cython debug symbols 154 | cython_debug/ 155 | 156 | # PyCharm 157 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 158 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 159 | # and can be added to the global gitignore or merged into this file. For a more nuclear 160 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 161 | #.idea/ 162 | .vscode/extensions.json 163 | 164 | 165 | ## data packs and outputs 166 | 167 | data/ 168 | data/* 169 | data.zip 170 | examples/output/ 171 | notebooks/output/ 172 | output/ -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, religion, or sexual identity 10 | and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email 35 | address, without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 
45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | . 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series 86 | of actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or 93 | permanent ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within 113 | the community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.0, available at 119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 120 | 121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 122 | enforcement ladder](https://github.com/mozilla/diversity). 
123 | 124 | [homepage]: https://www.contributor-covenant.org 125 | 126 | For answers to common questions about this code of conduct, see the FAQ at 127 | https://www.contributor-covenant.org/faq. Translations are available at 128 | https://www.contributor-covenant.org/translations. 129 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contribution Guidelines 2 | ______________________________________________________________ 3 | ## How to Contribute 4 | 5 | Thank you for considering contributing to this project! Here’s how you can help: 6 | 7 | __Fork the Repository:__ Click on the fork button at the top right corner of the repository. 8 | 9 | __Clone Your Fork:__ Clone the forked repository (not the main one) to your local machine, replacing `<your-username>` with your GitHub username. 10 | 11 | git clone https://github.com/<your-username>/pywind.git 12 | 13 | __Create a New Branch:__ Create a branch for your changes. 14 | 15 | git checkout -b feature-branch 16 | 17 | __Make Changes:__ Implement your changes and commit them. 18 | 19 | git commit -m "Add new feature" 20 | 21 | __Push Changes:__ Push your changes to your forked repository. 22 | 23 | git push origin feature-branch 24 | 25 | __Create a Pull Request:__ Submit a pull request (PR) to the main repository with a clear explanation of your changes. 26 | ______________________________________________________________ 27 | ## Code Style 28 | 29 | Follow PEP8 if contributing Python code. 30 | 31 | Ensure your code is well-documented and readable. 32 | ______________________________________________________________ 33 | ## Reporting Issues 34 | 35 | If you find a bug, please report it by opening an issue. Include: 36 | 37 | - A clear title and description 38 | 39 | - Steps to reproduce the issue 40 | 41 | - Expected and actual behavior 42 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Vortex Factoria de Calculs 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Pywind: Wind Data Analysis for Wind Energy Applications [](#top) 2 | 3 | Welcome to **Pywind**, a comprehensive repository about using Python for wind data analysis in wind energy 4 | applications. This repository provides tools and examples for reading, plotting, and analyzing wind data. Whether you 5 | are a student, developer, engineer, or a Python enthusiast interested in wind data, we hope you will find valuable and reusable 6 | resources here. Find below the structure of this project: 7 | 8 | - [Pywind: Wind Data Analysis for Wind Energy Applications ](#pywind-wind-data-analysis-for-wind-energy-applications-) 9 | - [1. Introduction and Main Goals ](#1-introduction-and-main-goals-) 10 | - [2. Authors ](#2-authors-) 11 | - [3. About ](#3-about-) 12 | - [4. Installation ](#4-installation-) 13 | - [5. Usage ](#5-usage-) 14 | - [6. Chapters ](#6-chapters-) 15 | - [7. License ](#7-license-) 16 | - [8. Collaboration ](#8-collaboration-) 17 | 18 | _Developed and Created by Vortex FdC._ 19 | 20 | ## 1. Introduction and Main Goals [](#1.introduction-and-main-goals) 21 | 22 | The use of Python in wind energy is a rapidly growing field, and the effective analysis and manipulation of wind data is crucial for its success. 23 | This repository provides wind data analysis solutions, from basic tasks to more technical methodologies. 24 | 25 | **Pywind** uses both public measurements and Vortex simulations, like the [SERIES](https://vortexfdc.com/windsite/wind-speed-time-series/) used in the first chapters. 26 | 27 | Feel free to browse, comment, share, and reuse the code and ideas. 28 | 29 | The structure of this repository is based on three main folders: 30 | 31 | - **Data**: Sample wind data from public sources is provided for user testing. See the corresponding [data sample documentation](README_about_datasets.md) for details on the datasets used in this repository. 32 | - **Examples**: Python scripts that can be executed from a terminal (bash). This repository has been tested under Linux. 33 | - **Notebooks**: Jupyter Notebooks with extended comments and outputs, mirroring the examples folder. 34 | 35 | 36 | ## 2. Authors [](#2-authors) 37 | 38 | This repository is created and maintained by: 39 | - __Oriol Lacave__, a member of the operational technical team at 40 | [Vortex FDC](http://vortexfdc.com). With over 15 years of experience in the wind industry, Oriol specializes in data 41 | manipulation, analysis, and improvements. He is a scientific programmer dedicated to delivering added value from 42 | reanalysis, mesoscale models, and measurements to engineers. 43 | - __Arnau Toledano__ and the Vortex technical team, who also contribute to the development of the Pywind repository. 44 | - The Vortex team in general. Don't hesitate to contact us! 45 | 46 | ## 3. About [](#3-about) 47 | 48 | [Vortex](http://vortexfdc.com) is a private company that started its technology development in 2005. Composed of former Wind & Site 49 | engineers, atmospheric physicists, and computer experts, Vortex has developed its own methodology independently.
50 | 51 | Its work is based on the **Weather Research and Forecasting model** [(WRF)](https://www.mmm.ucar.edu/models/wrf), a state-of-the-art mesoscale model developed collaboratively by atmospheric research centers and a thriving community. 52 | 53 | Some active groups that have inspired us are: 54 | - The [WRAG](https://groups.io/g/wrag), the Wind Resource Assessment Group. 55 | - [Pywram](https://www.pywram.com/), the Python for Wind Resource Assessment & Metocean Analysis Forum. It originated among Python users 56 | within the WRAG group. 57 | 58 | ## 4. Installation [](#4-installation) 59 | 60 | Clone this repository into your local git environment: 61 | 62 | `git clone https://github.com/VortexFDC/pywind.git` 63 | 64 | 65 | ## 5. Usage [](#5-usage) 66 | 67 | Each section has its own example and notebook files, located in the [examples](examples) and [notebooks](notebooks) folders, respectively. 68 | 69 | For each section a functions file is provided, which is imported from the main notebook/example file of the section. 70 | 71 | To execute the examples, you can run the following command from the terminal or your preferred IDE (a minimal end-to-end sketch is also included at the end of this README): 72 | 73 | `python example_1_read_netcdf.py` 74 | 75 | ## 6. Chapters [](#6-chapters) 76 | 77 | - [Chapter 1](notebooks/example_1_read_netcdf.ipynb) 78 | Read NetCDF files with the xarray library. You will open the data and perform basic operations. A quick overview of the data is done using the pandas library. 79 | 80 | - [Chapter 2](notebooks/example_2_read_txt.ipynb) 81 | Read txt files with the pandas library and create custom utility functions, such as parsing txt header metadata and incorporating it into the xarray data object. 82 | 83 | - [Chapter 3](notebooks/example_3_merge.ipynb) 84 | Merge two datasets, using the data from the previous chapters: the synthetic data from the model and the measurements. 85 | 86 | - [Chapter 4](notebooks/example_4_basic_plots.ipynb) 87 | With the previous datasets loaded, we make some plots to gather extra information: XY plots for correlation, histograms for wind speed and direction distributions, and yearly and daily cycles together with interannual variability. Some additional data cleaning is performed on the extended measurements and Vortex data, and an outlier identification approach is also proposed. 88 | 89 | - [Chapter 5](notebooks/example_5_MeasureCorrelatePredict.ipynb) 90 | Loading the previous datasets, we compute an MCP and a second MCP categorized by wind direction. We also load a Vortex remodeling. We then compare statistics of the three methods and visualize the histogram comparison. 91 | The remodeling file is included in the froya.zip archive. 92 | 93 | 94 | 95 | ## 7. License [](#7-license) 96 | 97 | MIT License. Consult [LICENSE](/LICENSE). 98 | 99 | Please use, modify, and share it if you find it useful. 100 | 101 | ## 8. Collaboration [](#8-collaboration) 102 | 103 | We encourage collaboration, proposals, and new ideas. 104 | 105 | Please follow the [collaboration guidelines](CONTRIBUTING.md) for work proposals. 106 | 107 | You can also use the Discussions in this repository for new ideas, Q&A, questions, etc. 108 | 109 | [Contact us](https://vortexfdc.com/contact/) 110 | 111 | [Back to top.](#top) 112 | 113 |
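As a minimal end-to-end sketch of the workflow covered in Chapters 1-3 (a hedged illustration, not one of the chapter scripts: it assumes the froya data pack described in [README_about_datasets.md](README_about_datasets.md) has been extracted under `data/`, and the paths may need adjusting to your setup):

```python
import xarray as xr

# Open the measurements and the Vortex SERIES NetCDF files (Chapter 1)
ds_obs = xr.open_dataset("data/froya/measurements/obs.nc")
ds_vtx = xr.open_dataset("data/froya/vortex/SERIE/vortex.serie.era5.utc0.nc")

# Interpolate the model data to the highest measurement level (Chapter 1)
top = ds_obs.squeeze().coords["lev"].max().values
ds_obs = ds_obs.sel(lev=top).squeeze().reset_coords(drop=True)
ds_vtx = ds_vtx.interp(lev=top).squeeze().reset_coords(drop=True)

# Measurements are sub-hourly; the SERIES product is already hourly (Chapter 3)
ds_obs = ds_obs.resample(time="1H").mean()

# Merge wind speed into one DataFrame over the concurrent period and compare
df = (ds_obs["M"].to_dataframe(name="M_obs")
      .join(ds_vtx["M"].to_dataframe(name="M_vtx"), how="inner")
      .dropna())
print(df.describe())
```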
114 | 115 | -------------------------------------------------------------------------------- /README_about_datasets.md: -------------------------------------------------------------------------------- 1 | # PYWIND Sample Data 2 | This page presents the sample data used throughout this repository. 3 | 4 | ## Froya site 5 | 6 | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3403362.svg)](https://doi.org/10.5281/zenodo.3403362) 7 | 8 | 9 | We use data for the Froya site, which should be extracted into the `/data` folder. 10 | The data can be downloaded separately from this repository: 11 | [froya data pack](http://download.vortexfdc.com/froya.zip) http://download.vortexfdc.com/froya.zip 12 | 13 | To check the data integrity and authenticity, the MD5 checksum is (a Python alternative is sketched at the end of this page): 14 | md5sum froya.zip : 3ad4368eef6c8bb6ce8137448cdaaa1c 15 | 16 | * updated on 2025-05-16 17 | 18 | After unpacking, the following folder structure should be created under the data folder: 19 | ``` 20 | ├── froya 21 | │   ├── measurements 22 | │   │   ├── obs.nc 23 | │   │   └── obs.txt 24 | │   └── vortex 25 | │       └── SERIE 26 | │           ├── vortex.serie.era5.utc0.nc 27 | │           ├── vortex.serie.era5.utc0.100m.txt 28 | │           └── vortex.remodeling.utc0.100m.txt __added on 2025-05-16__ 29 | ``` 30 | 31 | 32 | ### A. Observed data 33 | 34 | The original Froya data can be found [here](https://zenodo.org/records/3403362#.Y1eS5XZByUk). 35 | The site represents an exposed coastal wind climate with open sea, land and mixed fetch from various directions. Coordinates of the met mast: 8.34251° E, 63.66638° N. Post-processing has been applied in order to obtain single-boom data meeting quality-control standards for the wind industry. 36 | 37 | ![View of the area for measurement site in Froya 38 | ](images/Froya-map.png "Froya met mast") 39 | 40 | ### B. Modeled data 41 | We also use [Vortex f.d.c.](http://www.vortexfdc.com) simulations.
42 | 43 | SERIES: a 20-year-long time series computed using WRF at 3 km final spatial resolution, with heights from 30 m to 300 m.
44 | 45 | - NetCDF format with multiple heights (data/froya/vortex/SERIE/vortex.serie.era5.utc0.nc).
46 | 47 | - Txt format at 100 m height (data/froya/vortex/SERIE/vortex.serie.era5.utc0.100m.txt).
48 | 49 | - Remodeled series in txt format, generated using the measurements and the NetCDF file in this data pack. See the full remodeling report in this PDF: [froya remodeling pdf](docs/vortex.remodeled.froya.utc0.100m.pdf) 50 |
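A minimal sketch for reading these two formats (an illustration only, assuming the paths above relative to the repository root; the examples folder contains the full, reusable functions):

```python
import pandas as pd
import xarray as xr

# NetCDF with multiple heights: open with xarray and interpolate to 100 m
ds = xr.open_dataset("data/froya/vortex/SERIE/vortex.serie.era5.utc0.nc")
print(ds["M"].interp(lev=100))  # wind speed interpolated to 100 m

# Txt series: whitespace-separated columns after a 3-line header, with the
# timestamp split across the first two columns (YYYYMMDD and HHMM)
df = pd.read_csv(
    "data/froya/vortex/SERIE/vortex.serie.era5.utc0.100m.txt",
    sep=r"\s+", skiprows=3, header=0,
    parse_dates={"time": [0, 1]}, index_col="time",
    date_format="%Y%m%d %H%M",
)
print(df.head())
```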

51 | 52 | 53 |
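For convenience, a small Python alternative to the `md5sum` check above (a sketch; it assumes `froya.zip` sits in the current working directory):

```python
import hashlib

EXPECTED = "3ad4368eef6c8bb6ce8137448cdaaa1c"  # checksum published above

md5 = hashlib.md5()
with open("froya.zip", "rb") as f:
    # read in 1 MiB chunks so the archive does not need to fit in memory
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)

print("OK" if md5.hexdigest() == EXPECTED else "Checksum mismatch!")
```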
-------------------------------------------------------------------------------- /conda_requirements.txt: -------------------------------------------------------------------------------- 1 | # This file may be used to create an environment using: 2 | # $ conda create --name --file 3 | # platform: linux-64 4 | # created-by: conda 24.11.1 5 | _libgcc_mutex=0.1=conda_forge 6 | _openmp_mutex=4.5=2_kmp_llvm 7 | asttokens=3.0.0=pyhd8ed1ab_1 8 | blas=1.0=mkl 9 | blosc=1.21.6=hef167b5_0 10 | bottleneck=1.4.2=py313hf0014fa_0 11 | bzip2=1.0.8=h5eee18b_6 12 | c-ares=1.34.4=hb9d3cd8_0 13 | ca-certificates=2025.1.31=hbcca054_0 14 | cached-property=1.5.2=hd8ed1ab_1 15 | cached_property=1.5.2=pyha770c72_1 16 | certifi=2025.1.31=py313h06a4308_0 17 | cftime=1.6.4=py313hf0014fa_0 18 | comm=0.2.2=pyhd8ed1ab_1 19 | contourpy=1.3.1=pypi_0 20 | cycler=0.12.1=pypi_0 21 | debugpy=1.8.12=py313h46c70d0_0 22 | decorator=5.1.1=pyhd8ed1ab_1 23 | exceptiongroup=1.2.2=pyhd8ed1ab_1 24 | executing=2.1.0=pyhd8ed1ab_1 25 | expat=2.6.4=h6a678d5_0 26 | fonttools=4.56.0=pypi_0 27 | h5netcdf=1.5.0=pyhd8ed1ab_0 28 | h5py=3.12.1=nompi_py313h8a9c7c3_103 29 | hdf4=4.2.15=h9772cbc_5 30 | hdf5=1.14.4=nompi_h2d575fe_105 31 | importlib-metadata=8.6.1=pyha770c72_0 32 | intel-openmp=2022.0.1=h06a4308_3633 33 | ipykernel=6.29.5=pyh3099207_0 34 | ipython=8.32.0=pyh907856f_0 35 | jedi=0.19.2=pyhd8ed1ab_1 36 | jpeg=9e=h5eee18b_3 37 | jupyter_client=8.6.3=pyhd8ed1ab_1 38 | jupyter_core=5.7.2=pyh31011fe_1 39 | kiwisolver=1.4.8=pypi_0 40 | krb5=1.21.3=h143b758_0 41 | ld_impl_linux-64=2.40=h12ee557_0 42 | libaec=1.1.3=h59595ed_0 43 | libcurl=8.11.1=h332b0f4_0 44 | libedit=3.1.20230828=h5eee18b_0 45 | libev=4.33=h7f8727e_1 46 | libexpat=2.6.4=h5888daf_0 47 | libffi=3.4.4=h6a678d5_1 48 | libgcc=14.2.0=h77fa898_1 49 | libgcc-ng=14.2.0=h69a702a_1 50 | libgfortran=14.2.0=h69a702a_1 51 | libgfortran-ng=14.2.0=h69a702a_1 52 | libgfortran5=14.2.0=hd5240d6_1 53 | libgomp=14.2.0=h77fa898_1 54 | libiconv=1.17=hd590300_2 55 | liblzma=5.6.4=hb9d3cd8_0 56 | libmpdec=4.0.0=h5eee18b_0 57 | libnetcdf=4.9.2=nompi_h5ddbaa4_116 58 | libnghttp2=1.64.0=h161d5f1_0 59 | libsodium=1.0.20=h4ab18f5_0 60 | libsqlite=3.48.0=hee588c1_1 61 | libssh2=1.11.1=hf672d98_0 62 | libstdcxx=14.2.0=hc0a3c3a_1 63 | libstdcxx-ng=14.2.0=h4852527_1 64 | libuuid=2.38.1=h0b41bf4_0 65 | libxml2=2.13.5=h0d44e9d_1 66 | libzip=1.11.2=h6991a6a_0 67 | libzlib=1.3.1=hb9d3cd8_2 68 | llvm-openmp=19.1.7=h024ca30_0 69 | lz4-c=1.9.4=h6a678d5_1 70 | matplotlib=3.10.0=pypi_0 71 | matplotlib-inline=0.1.7=pyhd8ed1ab_1 72 | mkl=2023.2.0=h84fe81f_50496 73 | mkl-service=2.4.0=py313h5eee18b_2 74 | mkl_fft=1.3.11=py313h5eee18b_0 75 | mkl_random=1.2.8=py313h06d7b56_0 76 | ncurses=6.5=h2d0b736_3 77 | nest-asyncio=1.6.0=pyhd8ed1ab_1 78 | netcdf4=1.7.2=nompi_py313h1dd084c_101 79 | numexpr=2.10.1=py313h3c60e43_0 80 | numpy=2.2.2=py313hf4aebb8_0 81 | numpy-base=2.2.2=py313h3fc9231_0 82 | openssl=3.4.0=h7b32b05_1 83 | packaging=24.2=py313h06a4308_0 84 | pandas=2.2.3=py313h6a678d5_0 85 | parso=0.8.4=pyhd8ed1ab_1 86 | pexpect=4.9.0=pyhd8ed1ab_1 87 | pickleshare=0.7.5=pyhd8ed1ab_1004 88 | pillow=11.1.0=pypi_0 89 | pip=24.2=py313h06a4308_0 90 | platformdirs=4.3.6=pyhd8ed1ab_1 91 | prompt-toolkit=3.0.50=pyha770c72_0 92 | psutil=6.1.1=py313h536fd9c_0 93 | ptyprocess=0.7.0=pyhd8ed1ab_1 94 | pure_eval=0.2.3=pyhd8ed1ab_1 95 | pygments=2.19.1=pyhd8ed1ab_0 96 | pyparsing=3.2.1=pypi_0 97 | python=3.13.1=ha99a958_105_cp313 98 | python-dateutil=2.9.0post0=py313h06a4308_2 99 | python-tzdata=2023.3=pyhd3eb1b0_0 100 | 
python_abi=3.13=0_cp313 101 | pytz=2024.1=py313h06a4308_0 102 | pyzmq=26.2.1=py313h8e95178_0 103 | readline=8.2=h5eee18b_0 104 | scipy=1.15.1=py313hf4aebb8_0 105 | seaborn=0.13.2=pypi_0 106 | setuptools=72.1.0=py313h06a4308_0 107 | six=1.16.0=pyhd3eb1b0_1 108 | snappy=1.2.1=h8bd8927_1 109 | sqlite=3.48.0=h9eae976_1 110 | stack_data=0.6.3=pyhd8ed1ab_1 111 | tbb=2021.8.0=hdb19cb5_0 112 | tk=8.6.13=noxft_h4845f30_101 113 | tornado=6.4.2=py313h536fd9c_0 114 | traitlets=5.14.3=pyhd8ed1ab_1 115 | typing_extensions=4.12.2=pyha770c72_1 116 | tzdata=2025a=h04d1e81_0 117 | wcwidth=0.2.13=pyhd8ed1ab_1 118 | wheel=0.44.0=py313h06a4308_0 119 | xarray=2024.11.0=py313h06a4308_0 120 | xz=5.4.6=h5eee18b_1 121 | zeromq=4.3.5=h3b0a872_7 122 | zipp=3.21.0=pyhd8ed1ab_1 123 | zlib=1.3.1=hb9d3cd8_2 124 | zstd=1.5.6=ha6fb4c9_0 125 | -------------------------------------------------------------------------------- /docs/vortex.remodeled.froya.utc0.100m.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/VortexFDC/pywind/f4dfe9ef705ed706355ba84ffd36ccec8ed06097/docs/vortex.remodeled.froya.utc0.100m.pdf -------------------------------------------------------------------------------- /examples/example_1_read_netcdf.py: -------------------------------------------------------------------------------- 1 | # ============================================================================= 2 | # Authors: Oriol L & Arnau T 3 | # Company: Vortex F.d.C. 4 | # Year: 2024 5 | # ============================================================================= 6 | 7 | """ 8 | Overview: 9 | --------- 10 | This script demonstrates the process of reading various types of meteorological data files. 11 | The script uses functions to load and manipulate data from two distinct file formats: 12 | 13 | 1. Measurements (NetCDF) - Contains multiple heights and variables. 14 | 2. Vortex NetCDF - NetCDF file format with multiple heights and variables. 15 | 16 | Data Storage: 17 | ------------ 18 | The acquired data is stored in two data structures for comparison and analysis: 19 | - Xarray Dataset 20 | - Pandas DataFrame 21 | 22 | Objective: 23 | ---------- 24 | - To understand the variance in data storage when using Xarray and Pandas. 25 | - Utilize the 'describe' method from Pandas for a quick statistical overview of the dataset. 26 | """ 27 | 28 | 29 | # ============================================================================= 30 | # 1. Import Libraries 31 | # ============================================================================= 32 | 33 | import xarray as xr 34 | import os 35 | 36 | # ============================================================================= 37 | # 2. Define Paths and Site 38 | # ============================================================================= 39 | 40 | SITE = 'froya' 41 | pwd = os.getcwd() 42 | # you may have to adjust this path relative to the script depending on your Python configuration 43 | base_path = os.path.join(pwd, '../data') 44 | 45 | print() 46 | measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc') 47 | vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc') 48 | 49 | # Print filenames 50 | print('Measurements NetCDF: ', measurements_netcdf) 51 | print('Vortex NetCDF: ', vortex_netcdf) 52 | 53 | print() 54 | print('#'*26, 'Vortex F.d.C. 2024', '#'*26) 55 | print() 56 | 57 | # ============================================================================= 58 | # 3.
Read NetCDF functions 59 | # ============================================================================= 60 | 61 | # Read measurements NetCDF 62 | ds_measurements = xr.open_dataset(measurements_netcdf) 63 | #print(ds_measurements) 64 | 65 | # Read Vortex NetCDF 66 | ds_vortex = xr.open_dataset(vortex_netcdf) 67 | #print(ds_vortex) 68 | 69 | # ============================================================================= 70 | # 4. Convert to Pandas DataFrame and inspect 71 | # ============================================================================= 72 | 73 | df_measurements = ds_measurements.to_dataframe() 74 | df_vortex = ds_vortex.to_dataframe() 75 | 76 | #print(df_measurements.head()) 77 | #print(df_vortex.head()) 78 | 79 | # ============================================================================= 80 | # 5. Interpolate to the same height 81 | # ============================================================================= 82 | 83 | max_height = ds_measurements.squeeze().coords['lev'].max().values 84 | print("Max height in measurements: ", max_height) 85 | print() 86 | 87 | ds_vortex = ds_vortex.interp(lev=max_height) 88 | #print(ds_vortex) 89 | 90 | # ============================================================================= 91 | # 6. Now we can compare statistics for M and Dir 92 | # ============================================================================= 93 | 94 | # Drop coordinates lat and lon and dimensions 95 | ds_vortex = ds_vortex.squeeze().reset_coords(drop=True) 96 | ds_measurements = ds_measurements.squeeze().reset_coords(drop=True) 97 | 98 | print('Vortex:') 99 | print(ds_vortex[['M', 'Dir']].to_dataframe().describe().apply(lambda x: x.apply('{:,.6f}'.format))) 100 | print() 101 | 102 | print('Measurements:') 103 | print(ds_measurements[['M', 'Dir']].to_dataframe().describe().apply(lambda x: x.apply('{:,.6f}'.format))) 104 | print() -------------------------------------------------------------------------------- /examples/example_2_read_txt.py: -------------------------------------------------------------------------------- 1 | # ============================================================================= 2 | # Authors: Oriol L & Arnau T 3 | # Company: Vortex F.d.C. 4 | # Year: 2024 5 | # ============================================================================= 6 | 7 | """ 8 | Overview: 9 | --------- 10 | This script demonstrates the process of reading various types of meteorological data files. 11 | The script uses functions to load and manipulate data from two distinct file formats: 12 | 13 | 1. Vortex Text Series - Text file with multiple columns and a header. 14 | 2. Vortex remodeling - txt: A long-term (LT) extrapolation combining measurements and vortex time series. 15 | 16 | Data Storage: 17 | ------------ 18 | The acquired data is stored in one data structure for comparison and analysis: 19 | - Pandas DataFrame 20 | 21 | Objective: 22 | ---------- 23 | - To understand the variance in data storage when using Pandas. 24 | - Utilize the 'describe', 'head' and other methods from Pandas for a quick overview of the dataset. 25 | """ 26 | 27 | # ============================================================================= 28 | # 1. Import Libraries 29 | # ============================================================================= 30 | 31 | from typing import Dict 32 | from example_2_read_txt_functions import * 33 | 34 | 35 | # ============================================================================= 36 | # 2.
Define Paths and Site 37 | # ============================================================================= 38 | 39 | SITE = 'froya' 40 | pwd = os.getcwd() 41 | base_path = str(os.path.join(pwd, '../data')) 42 | 43 | print() 44 | measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc') 45 | vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc') 46 | 47 | vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt') 48 | measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt') 49 | 50 | # Print filenames 51 | print('Measurements txt: ', measurements_txt) 52 | print('Vortex txt: ', vortex_txt) 53 | 54 | print() 55 | print('#'*26, 'Vortex f.d.c. 2024', '#'*26) 56 | print() 57 | 58 | # ============================================================================= 59 | # 3. Read Vortex Text Series Functions 60 | # ============================================================================= 61 | 62 | # Read Text Series 63 | 64 | # Call read_txt_to_pandas with particular options for file vortex_txt 65 | # `vortex_txt` format is like this: 66 | 67 | # Lat=52.16632 Lon=14.12259 Hub-Height=100 Timezone=00.0 ASL-Height(avg. 3km-grid)=68 (file requested on 2023-09-28 10:30:31) 68 | # VORTEX (www.vortexfdc.com) - Computed at 3km resolution based on ERA5 data (designed for correlation purposes) 69 | # 70 | # YYYYMMDD HHMM M(m/s) D(deg) T(C) De(k/m3) PRE(hPa) RiNumber RH(%) RMOL(1/m) 71 | # 20030101 0000 7.5 133 -9.2 1.32 1000.2 0.26 80.3 0.0081 72 | # 20030101 0100 7.4 136 -10.0 1.32 999.8 0.25 82.1 0.0059 73 | 74 | def read_vortex_serie(filename: str = "vortex.txt", 75 | vars_new_names: Dict = None) -> xr.Dataset: 76 | """ 77 | Read typical vortex time series from SERIES product and return 78 | an xarray.Dataset 79 | 80 | Parameters 81 | ---------- 82 | vars_new_names: Dict 83 | the dictionary with the old names to new names 84 | 85 | filename: str 86 | just the filename is enough 87 | 88 | Returns 89 | ------- 90 | ds: xarray.Dataset 91 | Dataset 92 | 93 | Examples 94 | -------- 95 | Lat=52.90466 Lon=14.76794 Hub-Height=130 Timezone=00.0 ASL-Height(avg. 3km-grid)=73 (file requested on 2022-10-17 11:34:05) 96 | VORTEX (www.vortex.es) - Computed at 3km resolution based on ERA5 data (designed for correlation purposes) 97 | YYYYMMDD HHMM M(m/s) D(deg) T(C) De(k/m3) PRE(hPa) RiNumber RH(%) RMOL(1/m) 98 | 19910101 0000 8.5 175 2.1 1.25 988.1 0.56 91.1 0. 
99 | 100 | """ 101 | patterns = {'Lat=': 'lat', 102 | 'Lon=': 'lon', 103 | 'Timezone=': 'utc', 104 | 'Hub-Height=': 'lev'} 105 | metadata = _get_coordinates_vortex_header(filename, patterns, line=0) 106 | data = read_txt_to_pandas(filename, utc=metadata['utc'], 107 | skiprows=3, header=0, names=None) 108 | __ds = convert_to_xarray(data, coords=metadata).squeeze() 109 | 110 | if vars_new_names is None: 111 | vars_new_names = {'M(m/s)': 'M', 112 | 'D(deg)': 'Dir', 113 | 'T(C)': 'T', 114 | 'De(k/m3)': 'D', 115 | 'PRE(hPa)': 'P', 116 | 'RiNumber': 'RI', 117 | 'RH(%)': 'RH', 118 | 'RMOL(1/m)': 'RMOL'} 119 | __ds = rename_vars(__ds, vars_new_names) 120 | 121 | __ds = add_attrs_vars(__ds) 122 | return __ds 123 | 124 | ds_vortex = read_vortex_serie(vortex_txt) 125 | print(ds_vortex) 126 | print() 127 | 128 | df_vortex = ds_vortex.to_dataframe() # convert to dataframe 129 | 130 | # Quickly inspect with head() and describe() methods 131 | 132 | print('Vortex SERIES:\n' ,df_vortex[['M', 'Dir']].head()) 133 | print() 134 | 135 | # ============================================================================= 136 | # 4. Read Measurements Txt 137 | # ============================================================================= 138 | 139 | def read_vortex_obs_to_dataframe(infile: str, 140 | with_sd: bool = False, 141 | out_dir_name: str = 'Dir', 142 | **kwargs) -> pd.DataFrame: 143 | """ 144 | Read a txt file with flexible options as a pandas DataFrame. 145 | 146 | Parameters 147 | ---------- 148 | infile: str 149 | txt file. by default, no header, columns YYYYMMDD HHMM M D 150 | 151 | with_sd: bool 152 | If True, an 'SD' column is appended 153 | out_dir_name: str 154 | Wind direction labeled which will appear in the return dataframe 155 | 156 | Returns 157 | ------- 158 | df: pd.DataFrame 159 | Dataframe 160 | 161 | Examples 162 | -------- 163 | >>> print("The default files read by this function are YYYYMMDD HHMM M D:") 164 | 20050619 0000 6.2 331 1.1 165 | 20050619 0010 6.8 347 0.9 166 | 20050619 0020 7.3 343 1.2 167 | 168 | """ 169 | 170 | columns = ['YYYYMMDD', 'HHMM', 'M', out_dir_name] 171 | 172 | if with_sd: 173 | columns.append('SD') 174 | 175 | readcsv_kwargs = { 176 | 'skiprows': 0, 177 | 'header': None, 178 | 'names': columns, 179 | } 180 | readcsv_kwargs.update(kwargs) 181 | 182 | df: pd.DataFrame = read_txt_to_pandas(infile, **readcsv_kwargs) 183 | return df 184 | 185 | df_obs = read_vortex_obs_to_dataframe(measurements_txt) 186 | ds_obs = convert_to_xarray(df_obs) 187 | 188 | print('Measurements:\n', df_obs.head()) 189 | print() 190 | 191 | # ============================================================================= 192 | # 5. 
Now we can compare statistics 193 | # ============================================================================= 194 | 195 | print('Vortex SERIES Statistics:\n', df_vortex[['M', 'Dir']].describe().round(2)) 196 | print() 197 | print('Measurements Statistics:\n', df_obs.describe().round(2)) 198 | print() -------------------------------------------------------------------------------- /examples/example_2_read_txt_functions.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import xarray as xr 3 | import numpy as np 4 | import os 5 | from typing import Union, Dict, List 6 | 7 | 8 | def read_vortex_serie(filename: str = "vortex.txt", 9 | vars_new_names: Dict = None) -> xr.Dataset: 10 | """ 11 | Read typical vortex time series from SERIES product and return 12 | an xarray.Dataset 13 | 14 | Parameters 15 | ---------- 16 | vars_new_names: Dict 17 | the dictionary with the old names to new names 18 | 19 | filename: str 20 | just the filename is enough 21 | 22 | Returns 23 | ------- 24 | ds: xarray.Dataset 25 | Dataset 26 | 27 | Examples 28 | -------- 29 | Lat=52.90466 Lon=14.76794 Hub-Height=130 Timezone=00.0 ASL-Height(avg. 3km-grid)=73 (file requested on 2022-10-17 11:34:05) 30 | VORTEX f.d.c. (www.vortexfdc.com) - Computed at 3km resolution based on ERA5 data (designed for correlation purposes) 31 | YYYYMMDD HHMM M(m/s) D(deg) T(C) De(k/m3) PRE(hPa) RiNumber RH(%) RMOL(1/m) 32 | 19910101 0000 8.5 175 2.1 1.25 988.1 0.56 91.1 0. 33 | 34 | """ 35 | patterns = {'Lat=': 'lat', 36 | 'Lon=': 'lon', 37 | 'Timezone=': 'utc', 38 | 'Hub-Height=': 'lev'} 39 | metadata = _get_coordinates_vortex_header(filename, patterns, line=0) 40 | data = read_txt_to_pandas(filename, utc=metadata['utc'], 41 | skiprows=3, header=0, names=None) 42 | __ds = convert_to_xarray(data, coords=metadata).squeeze() 43 | 44 | if vars_new_names is None: 45 | vars_new_names = {'M(m/s)': 'M', 46 | 'D(deg)': 'Dir', 47 | 'T(C)': 'T', 48 | 'De(k/m3)': 'D', 49 | 'PRE(hPa)': 'P', 50 | 'RiNumber': 'RI', 51 | 'RH(%)': 'RH', 52 | 'RMOL(1/m)': 'RMOL'} 53 | __ds = rename_vars(__ds, vars_new_names) 54 | 55 | __ds = add_attrs_vars(__ds) 56 | return __ds 57 | 58 | def read_remodeling_serie(filename: str = "vortex.txt", 59 | vars_new_names: Dict = None) -> xr.Dataset: 60 | """ 61 | Read typical vortex time series from SERIES product and return 62 | an xarray.Dataset 63 | 64 | Parameters 65 | ---------- 66 | vars_new_names: Dict 67 | the dictionary with the old names to new names 68 | 69 | filename: str 70 | just the filename is enough 71 | 72 | Returns 73 | ------- 74 | ds: xarray.Dataset 75 | Dataset 76 | 77 | Examples 78 | -------- 79 | Lat=52.90466 Lon=14.76794 Hub-Height=130 Timezone=00.0 ASL-Height(avg. 3km-grid)=73 (file requested on 2022-10-17 11:34:05) 80 | VORTEX f.d.c. (www.vortexfdc.com) - Computed at 3km resolution based on ERA5 data (designed for correlation purposes) 81 | YYYYMMDD HHMM M(m/s) D(deg) T(C) De(k/m3) PRE(hPa) RiNumber RH(%) RMOL(1/m) 82 | 19910101 0000 8.5 175 2.1 1.25 988.1 0.56 91.1 0. 
83 | 84 | """ 85 | patterns = {'Lat=': 'lat', 86 | 'Lon=': 'lon', 87 | 'Timezone=': 'utc', 88 | 'Hub-Height=': 'lev'} 89 | metadata = _get_coordinates_vortex_header(filename, patterns, line=0) 90 | data = read_txt_to_pandas(filename, utc=metadata['utc'], 91 | skiprows=3, header=0, names=None) 92 | __ds = convert_to_xarray(data, coords=metadata).squeeze() 93 | 94 | if vars_new_names is None: 95 | vars_new_names = {'M(m/s)': 'M', 96 | 'D(deg)': 'Dir', 97 | 'T(C)': 'T', 98 | 'PRE(hPa)': 'P' 99 | } 100 | __ds = rename_vars(__ds, vars_new_names) 101 | 102 | __ds = add_attrs_vars(__ds) 103 | return __ds 104 | 105 | 106 | 107 | 108 | def read_vortex_obs_to_dataframe(infile: str, 109 | with_sd: bool = False, 110 | out_dir_name: str = 'Dir', 111 | **kwargs) -> pd.DataFrame: 112 | """ 113 | Read a txt file with flexible options as a pandas DataFrame. 114 | 115 | Parameters 116 | ---------- 117 | infile: str 118 | txt file. by default, no header, columns YYYYMMDD HHMM M D 119 | 120 | with_sd: bool 121 | If True, an 'SD' column is appended 122 | out_dir_name: str 123 | Wind direction labeled which will appear in the return dataframe 124 | 125 | Returns 126 | ------- 127 | df: pd.DataFrame 128 | Dataframe 129 | 130 | Examples 131 | -------- 132 | >>> print("The default files read by this function are YYYYMMDD HHMM M D:") 133 | 20050619 0000 6.2 331 1.1 134 | 20050619 0010 6.8 347 0.9 135 | 20050619 0020 7.3 343 1.2 136 | 137 | """ 138 | 139 | columns = ['YYYYMMDD', 'HHMM', 'M', out_dir_name] 140 | 141 | if with_sd: 142 | columns.append('SD') 143 | 144 | readcsv_kwargs = { 145 | 'skiprows': 0, 146 | 'header': None, 147 | 'names': columns, 148 | } 149 | readcsv_kwargs.update(kwargs) 150 | 151 | df: pd.DataFrame = read_txt_to_pandas(infile, **readcsv_kwargs) 152 | return df 153 | 154 | 155 | def read_txt_to_pandas(infile: str, 156 | utc: float = 0., 157 | silent: bool = True, 158 | **kwargs) -> pd.DataFrame: 159 | """ 160 | Read a txt file with flexible options as a pandas DataFrame. 161 | Converts to UTC 0 if not in that UTC hour. 162 | 163 | Parameters 164 | ---------- 165 | infile: str 166 | txt file. by default, columns separated by spaces 167 | 168 | utc: float, optional 169 | If utc (float number) is passed, the txt is assumed to be in that 170 | offset and converted to UTC0. 171 | 172 | silent: bool 173 | if silent, suppress all print statements. 174 | 175 | kwargs 176 | to override defaults: maybe there is a header, or we want other 177 | names for the columns, or the separator is not spaces, or we want 178 | to skip the first rows because they are not part of the dataframe. 179 | Also specify the date time columns (parse_dates argument) 180 | 181 | Returns 182 | ------- 183 | pd.DataFrame 184 | 185 | """ 186 | if not silent: 187 | # print a bit of the file to see the structure 188 | print(f'Reading txt file {infile}. 
First 5 lines:') 189 | print(''.join(read_head_txt(infile, lines=5))) 190 | 191 | readcsv_kwargs = { 192 | 'sep': r"\s+", # sep = one or more spaces 193 | 'parse_dates': {'time': [0, 1]}, # make sure col is time 194 | 'index_col': 'time', # do not change 195 | 'date_format': '%Y%m%d %H%M', 196 | } 197 | readcsv_kwargs.update(kwargs) 198 | df: pd.DataFrame = pd.read_csv(infile, **readcsv_kwargs) 199 | 200 | if not silent: 201 | print(f'Read csv using kwargs: {readcsv_kwargs}') 202 | print(df.head()) 203 | 204 | df.dropna(inplace=True) 205 | 206 | # Change UTC 207 | df.index = _convert_from_local_to_utc(df.index, 208 | utc_local=utc, 209 | silent=silent) 210 | 211 | if not silent: 212 | print('Formatted DataFrame') 213 | print(df.head()) 214 | 215 | return df 216 | 217 | 218 | # This function uses two auxiliary functions to read the head and to deal with time zones 219 | def read_head_txt(infile: str, lines: int = 8) -> List[str]: 220 | """ 221 | Get a list of the first lines of a txt file. Useful to print logs, 222 | or use in other functions that can deduce metadata of a file from 223 | the first lines of text of the file. 224 | 225 | Parameters 226 | ---------- 227 | infile: str 228 | Path to the .txt file 229 | lines: int 230 | Maximum number of lines to read and return 231 | 232 | Returns 233 | ------- 234 | head: List[str] 235 | Concatenated lines read from the file 236 | 237 | Examples 238 | -------- 239 | >>> print(''.join(read_head_txt('/path/to/file.txt', lines=4))) 240 | Will print the first 4 lines (respecting the line skips). 241 | 242 | """ 243 | if not os.path.isfile(infile): 244 | raise IOError('File ' + infile + ' not found.') 245 | 246 | head = [line for i, line in enumerate(open(infile, 'r')) if i < lines] 247 | 248 | return head 249 | 250 | 251 | def _convert_from_local_to_utc(time_values: Union[pd.Series, pd.DatetimeIndex, 252 | pd.Index], utc_local=0., silent=True) -> \ 253 | Union[pd.Series, pd.DatetimeIndex, pd.Index]: 254 | """ 255 | Convert time values from local time to UTC 256 | 257 | Parameters 258 | ---------- 259 | time_values: Union[pd.Series, pd.DatetimeIndex, pd.Index] 260 | Datetime values to be UTC0 converted 261 | utc_local: float 262 | Timezone difference 263 | silent: bool 264 | Print some info if True 265 | 266 | Returns 267 | ------- 268 | Union[pd.Series, pd.DatetimeIndex] 269 | 270 | """ 271 | 272 | if not silent: 273 | print(f'Changing utc: {pd.Timedelta(utc_local, "h")}') 274 | 275 | return time_values - pd.Timedelta(utc_local, 'h') 276 | 277 | 278 | # We also set up some other function to rename variables 279 | def rename_vars(dataset: xr.Dataset, 280 | vars_new_names: Dict[str, str]) \ 281 | -> xr.Dataset: 282 | """ 283 | Rename the variables in the given dataset with the dictionary provided. 284 | 285 | Parameters 286 | ---------- 287 | dataset: xr.Dataset 288 | the dataset we want to change the variables/columns names. 289 | 290 | vars_new_names: Dict[str, str] 291 | original and new name for the variable we want to rename in 292 | the dataset. The dataset is overwritten. 293 | 294 | Returns 295 | ------- 296 | dataset: xr.Dataset 297 | Dataset with the new variables names overwritten. 
298 | 299 |     """ 300 |     for old_name, new_name in vars_new_names.items(): 301 |         if old_name not in dataset.variables: 302 |             raise UserWarning("This variable is not in the dataset: " + str( 303 |                 old_name)) 304 |         if new_name not in vtx_attributes_vars.keys(): 305 |             raise UserWarning("This new variable name is not implemented " 306 |                               "in the vortexpy") 307 |         dataset = dataset.rename({old_name: new_name}) 308 |     return dataset 309 | 310 | 311 | def _get_coordinates_vortex_header(filename: str, 312 |                                    patterns: Dict[str, str] = None, 313 |                                    line: int = 0)\ 314 |         -> Dict[str, float]: 315 |     """ 316 |     Read a txt file header 317 | 318 |     Parameters 319 |     ---------- 320 |     filename: str 321 | 322 |     patterns: Dictionary 323 |         What to search for, just before an '=' 324 | 325 |     line: int 326 |         which line to read 327 | 328 |     Returns 329 |     ------- 330 |     metadata: Dict[str, float] 331 |         Dictionary containing the extracted coordinate values 332 | 333 |     """ 334 |     if patterns is None: 335 |         patterns = {'Lat=': 'lat', 'Lon=': 'lon', 336 |                     'Timezone=': 'utc', 337 |                     'Hub-Height=': 'lev'} 338 | 339 |     headerfile = read_head_txt(filename, lines=15) 340 |     metadata = {} 341 |     for info in headerfile[line].split(' '): 342 |         for pattern, keyword in patterns.items(): 343 |             if pattern in info: 344 |                 metadata[keyword] = float(info.replace(pattern, '')) 345 |     return metadata 346 | 347 | 348 | def convert_to_xarray(df: pd.DataFrame, 349 |                       coords: Dict[str, Union[float, np.ndarray]] = None 350 |                       ) -> xr.Dataset: 351 |     """ 352 |     Convert a dataframe to a xarray object. 353 | 354 |     Parameters 355 |     ---------- 356 |     df: pd.DataFrame 357 |     coords: Dict[str, Union[float, np.ndarray]] 358 |         Info about lat, lon, lev so that the new dimensions can be added 359 | 360 |     Returns 361 |     ------- 362 |     xr.Dataset 363 |         With un-squeezed dimensions and added attributes 364 |     """ 365 |     ds: xr.Dataset = df.to_xarray() 366 |     if coords is not None: 367 |         coords_dict = {name: [float(val)] for name, val in coords.items() 368 |                        if name not in ds.dims} 369 |         ds = ds.expand_dims(coords_dict) 370 |     ds = add_attrs_vars(ds) 371 |     ds = add_attrs_coords(ds) 372 |     return ds 373 | 374 | 375 | def add_attrs_vars(ds: xr.Dataset, 376 |                    attributes_vars: Dict[str, Dict[str, str]] = None, 377 |                    remove_existing_attrs: bool = False) -> xr.Dataset: 378 |     """ 379 |     Add attributes information to variables from a dataset. 380 | 381 |     If no `attributes_vars` dictionary is passed, the default 382 |     attributes from the vars module are used. 383 | 384 |     In xarray, a variable can have attributes: 385 | 386 |     .. code-block:: python 387 | 388 |         data['U'].attrs = {'description': 'Zonal Wind Speed', 389 |                            'long_name' : 'U wind speed', 390 |                            'units' : 'm/s'} 391 | 392 |     Parameters 393 |     ---------- 394 |     ds : xarray.Dataset 395 | 396 |     attributes_vars : dict, optional 397 |         An attributes_vars is a dictionary whose keys are strings that 398 |         represent variables (this could produce clashing of models) and 399 |         each has some attributes like description, long_name, units. 400 | 401 |     remove_existing_attrs : bool, False 402 |         True will put only the attributes of `attributes_vars` and 403 |         remove existing attributes, **including ENCODING details**.
404 | 405 | Returns 406 | ------- 407 | xarray.Dataset 408 | Data with the new attributes 409 | """ 410 | if attributes_vars is None: 411 | attributes_vars = vtx_attributes_vars 412 | 413 | for var in ds.data_vars: 414 | if remove_existing_attrs: 415 | attributes = {} 416 | else: 417 | attributes = ds[var].attrs 418 | 419 | if var in attributes_vars: 420 | # noinspection PyTypeChecker 421 | for key, info in attributes_vars[var].items(): 422 | attributes[key] = info 423 | 424 | ds[var].attrs = attributes 425 | 426 | return ds 427 | 428 | 429 | def add_attrs_coords(ds: xr.Dataset) -> xr.Dataset: 430 | """ 431 | Add attributes information to coordinates from a dataset. 432 | 433 | Used for lat, lon and lev. 434 | 435 | Parameters 436 | ---------- 437 | ds : xarray.Dataset 438 | 439 | Returns 440 | ------- 441 | xarray.Dataset 442 | Data with the new attributes for the coordinates 443 | """ 444 | if 'lat' in ds: 445 | ds['lat'].attrs = {'units': 'degrees', 'long_name': 'Latitude'} 446 | if 'lon' in ds: 447 | ds['lon'].attrs = {'units': 'degrees', 'long_name': 'Longitude'} 448 | if 'lev' in ds: 449 | ds['lev'].attrs = {'units': 'metres', 'long_name': 'Level'} 450 | 451 | return ds 452 | 453 | 454 | vtx_attributes_vars = { 455 | 'U': {'description': 'Zonal Wind Speed', 456 | 'long_name': 'U wind speed', 457 | 'units': 'm/s'}, 458 | 'V': {'description': 'Meridional Wind Speed Component', 459 | 'long_name': 'V wind speed', 460 | 'units': 'm/s'}, 461 | 'W': {'description': 'Vertical Wind Speed Component', 462 | 'long_name': 'W wind speed', 463 | 'units': 'm/s'}, 464 | 'M': {'description': 'Wind Speed (module velocity)', 465 | 'long_name': 'Wind speed', 466 | 'units': 'm/s'}, 467 | 'TI': {'long_name': 'Turbulence Intensity', 468 | 'description': 'Turbulence Intensity', 469 | 'units': '%'}, 470 | 'Dir': {'description': 'Wind Direction', 471 | 'long_name': 'Wind direction', 472 | 'units': 'degrees'}, 473 | 'SD': {'description': 'Wind Speed Standard Deviation', 474 | 'long_name': 'Wind Speed Standard Deviation', 475 | 'units': 'm/s'}, 476 | 'DSD': {'description': 'Wind Direction Standard Deviation', 477 | 'long_name': 'Wind Direction Standard Deviation', 478 | 'units': 'degrees'}, 479 | 'variance': {'description': 'Wind Speed Variance', 480 | 'long_name': 'Wind Speed Variance', 481 | 'units': 'm^2/s^2'}, 482 | 'T': {'description': 'Air Temperature', 483 | 'long_name': 'Air Temperature', 484 | 'units': 'Deg.Celsius'}, 485 | 'P': {'description': 'Pressure', 486 | 'long_name': 'Pressure', 487 | 'units': 'hPa'}, 488 | 'D': {'long_name': 'Density', 489 | 'description': 'Air Density', 490 | 'units': 'kg/m^(-3)'}, 491 | 'RMOL': {'description': 'Inverse Monin Obukhov Length', 492 | 'long_name': 'Inverse Monin Obukhov Length', 493 | 'units': 'm^-1'}, 494 | 'L': {'description': 'Monin Obukhov Length', 495 | 'long_name': 'Monin Obukhov Length', 496 | 'units': 'm'}, 497 | 'stability': {'description': 'Atmospheric Stability Index (RMOL)', 498 | 'long_name': 'Atmospheric Stability (idx)', 499 | 'units': ''}, 500 | 'stabilityClass': {'description': 'Atmospheric Stability Class (RMOL)', 501 | 'long_name': 'Atmospheric Stability (class)', 502 | 'units': ''}, 503 | 'HGT': {'description': 'Terrain Height (above sea level)', 504 | 'long_name': 'Terrain Height', 505 | 'units': 'm'}, 506 | 'inflow': {'long_name': 'Inflow angle', 507 | 'description': 'Inflow angle', 508 | 'units': 'degrees'}, 509 | 'RI': {'long_name': 'Richardson Number', 510 | 'description': 'Richardson Number', 511 | 'units': ''}, 512 | 'shear': 
{'long_name': 'Wind Shear Exponent', 513 | 'description': 'Wind Shear Exponent', 514 | 'units': ''}, 515 | 'shear_sd': {'long_name': 'Wind SD Shear', 516 | 'description': 'Wind SD Shear', 517 | 'units': ''}, 518 | 'veer': {'long_name': 'Wind Directional Bulk Veer', 519 | 'description': 'Wind Directional Bulk Veer', 520 | 'units': 'degrees m^-1'}, 521 | 'total_veer': {'long_name': 'Wind Directional TotalVeer', 522 | 'description': 'Wind Directional Total Veer', 523 | 'units': 'degrees m^-1'}, 524 | 'sector': {'long_name': 'Wind Direction Sector', 525 | 'description': 'Wind Direction Sector', 526 | 'units': ''}, 527 | 'Mbin': {'long_name': 'Wind Speed Bin', 528 | 'description': 'Wind Speed Bin (round to nearest int)', 529 | 'units': ''}, 530 | 'daynight': {'long_name': 'Day or Night', 531 | 'description': 'Day or Night', 532 | 'units': ''}, 533 | 'solar_elev': {'long_name': 'Solar Elevation', 534 | 'description': 'Solar Elevation Angle', 535 | 'units': 'degrees'}, 536 | 'power': {'long_name': 'Power', 537 | 'description': 'Approximation to the power expected at ' 538 | 'this instant (energy/time)', 539 | 'units': 'kW'}, 540 | 'energy': {'long_name': 'Energy Production', 541 | 'description': 'Approximation to the energy expected from ' 542 | 'the power and time frequency of the series', 543 | 'units': 'kWh'}, 544 | 'SST': {'long_name': 'Sea Surface Temperature', 545 | 'description': 'Sea Surface Temperature', 546 | 'units': 'K'}, 547 | 'HFX': {'long_name': 'Heat Flux Surface', 548 | 'description': 'Upward heat flux at the surface', 549 | 'units': 'W m-2'}, 550 | 'PBLH': {'long_name': 'Boundary Layer Height', 551 | 'description': 'Boundary Layer Height', 552 | 'units': 'm'}, 553 | 'RH': {'long_name': 'Relative Humidity', 554 | 'description': 'Relative Humidity', 555 | 'units': '%'}, 556 | 'TP': {'long_name': 'Potential Temperature', 557 | 'description': 'Potential Temperature', 558 | 'units': 'K'}, 559 | 'T2': {'long_name': 'Air Temperature at 2m', 560 | 'description': 'Air Temperature at 2m', 561 | 'units': 'K'}, 562 | 'TKE_PBL': {'long_name': 'Turbulent Kinetic Energy', 563 | 'description': 'Turbulent Kinetic Energy', 564 | 'units': 'm^2/s^2'}, 565 | 'Gust3s': {'long_name': '3-second Wind Gust', 566 | 'description': '3-second Wind Gust', 567 | 'units': 'm/s'}, 568 | } -------------------------------------------------------------------------------- /examples/example_3_merge.py: -------------------------------------------------------------------------------- 1 | # ============================================================================= 2 | # Authors: Oriol L & Arnau T 3 | # Company: Vortex F.d.C. 4 | # Year: 2024 5 | # ============================================================================= 6 | 7 | """ 8 | Overview: 9 | --------- 10 | This script demonstrates the process of reading and processing various types of meteorological data files. The goal is to compare measurements from different sources and formats by resampling, interpolating, and merging the data for further analysis. 11 | 12 | The script uses functions to load and manipulate data from four distinct file formats: 13 | 14 | 1. Measurements (NetCDF) - Contains multiple heights and variables. 15 | 2. Vortex NetCDF - NetCDF file format with multiple heights and variables. 16 | 3. Vortex Text Series - Text file containing time series data of meteorological measurements. 17 | 4. Measurements Text Series - Text file containing time series data of observations. 
18 | 19 | Data Storage: 20 | ------------- 21 | The acquired data is stored and processed in two data structures for comparison and analysis: 22 | - **Xarray Dataset**: For handling multi-dimensional arrays of the meteorological data, useful for complex operations and transformations. 23 | - **Pandas DataFrame**: For flexible and powerful data manipulation and analysis, allowing easy integration and comparison of different datasets. 24 | 25 | Objective: 26 | ---------- 27 | - **Read and Interpolate Data**: Load data from NetCDF and text files, and interpolate Vortex data to match the measurement levels. 28 | - **Resample Data**: Convert the time series data to an hourly frequency to ensure uniformity in the analysis. 29 | - **Data Comparison**: Merge the datasets to facilitate a detailed comparison of measurements from different sources. 30 | - **Statistical Overview**: Utilize the 'describe' method from Pandas for a quick statistical summary of the datasets, providing insights into the distribution and characteristics of the data. 31 | - **Concurrent Period Analysis**: Clean the data by removing non-concurrent periods (no data) to focus on the overlapping timeframes for accurate comparison. 32 | 33 | By following these steps, the script aims to provide a comprehensive approach to handling and analyzing meteorological data from various sources, ensuring a clear understanding of the data's behavior and relationships. 34 | """ 35 | 36 | # ============================================================================= 37 | # 1. Import Libraries 38 | # ============================================================================= 39 | from example_2_read_txt_functions import * 40 | from example_3_merge_functions import * 41 | import pandas as pd 42 | 43 | # ============================================================================= 44 | # 2. Define Paths and Site 45 | # ============================================================================= 46 | 47 | SITE = 'froya' 48 | pwd = os.getcwd() 49 | base_path = str(os.path.join(pwd, '../data')) 50 | 51 | print() 52 | measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc') 53 | vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc') 54 | 55 | vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt') 56 | measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt') 57 | 58 | # Print filenames 59 | print('Measurements NetCDF: ', measurements_netcdf) 60 | print('Vortex NetCDF: ', vortex_netcdf) 61 | 62 | print() 63 | print('#'*26, 'Vortex F.d.C. 2024', '#'*26) 64 | print() 65 | 66 | # ============================================================================= 67 | # 3. Read Vortex Series in NetCDF and Text 68 | # ============================================================================= 69 | 70 | # Read NetCDF 71 | ds_obs_nc = xr.open_dataset(measurements_netcdf) 72 | ds_vortex_nc = xr.open_dataset(vortex_netcdf) 73 | #ds_vortex_nc = ds_vortex_nc.rename_vars({'D': 'Dir'}) 74 | 75 | # Read Text Series 76 | ds_vortex_txt = read_vortex_serie(vortex_txt) 77 | df_obs_txt = read_vortex_obs_to_dataframe(measurements_txt)[['M', 'Dir']] 78 | ds_obs_txt = convert_to_xarray(df_obs_txt)[['M', 'Dir']] 79 | 80 | # ============================================================================= 81 | # 4. Interpolate Vortex Series to the same Measurements level. Select M and Dir. 
82 | # =============================================================================
83 | 
84 | print()
85 | max_height = ds_obs_nc.squeeze().coords['lev'].max().values
86 | print("Max height in measurements: ", max_height)
87 | ds_obs_nc = ds_obs_nc.sel(lev=max_height).squeeze().reset_coords(drop=True)[['M', 'Dir']]
88 | 
89 | ds_vortex_nc = ds_vortex_nc.interp(lev=max_height).squeeze().reset_coords(drop=True)[['M', 'Dir']]
90 | ds_vortex_txt = ds_vortex_txt[['M', 'Dir']].squeeze().reset_coords(drop=True)
91 | 
92 | # =============================================================================
93 | # 5. Measurements Time Resampling to Hourly
94 | # =============================================================================
95 | 
96 | # No need to resample the Vortex data: the SERIE product is already hourly
97 | 
98 | # convert ds_obs_nc to hourly
99 | ds_obs_nc = ds_obs_nc.resample(time='1h').mean()
100 | # convert ds_obs_txt to hourly
101 | ds_obs_txt = ds_obs_txt.resample(time='1h').mean()
102 | 
103 | # =============================================================================
104 | # 6. Convert all to DataFrame, Rename and Merge
105 | # =============================================================================
106 | 
107 | df_obs_nc = ds_obs_nc.to_dataframe()
108 | df_vortex_nc = ds_vortex_nc.to_dataframe()
109 | df_obs_txt = ds_obs_txt.to_dataframe()
110 | df_vortex_txt = ds_vortex_txt.to_dataframe()
111 | 
112 | # rename columns so they do not have the same name when merging
113 | df_obs_nc.columns = ['M_obs_nc', 'Dir_obs_nc']
114 | df_vortex_nc.columns = ['M_vortex_nc', 'Dir_vortex_nc']
115 | df_obs_txt.columns = ['M_obs_txt', 'Dir_obs_txt']
116 | df_vortex_txt.columns = ['M_vortex_txt', 'Dir_vortex_txt']
117 | 
118 | # merge all dataframes on the index (time)
119 | df_nc = df_obs_nc.merge(df_vortex_nc, left_index=True, right_index=True)
120 | df_txt = df_obs_txt.merge(df_vortex_txt, left_index=True, right_index=True)
121 | df = df_nc.merge(df_txt, left_index=True, right_index=True)
122 | print()
123 | 
124 | # force pandas to show all columns in head/describe
125 | with pd.option_context('display.max_rows', None, 'display.max_columns', None):
126 |     print(df.head())
127 |     print()
128 |     print(df.describe())
129 |     print()
130 | 
131 | print("After Cleaning Nodatas: Concurrent period")
132 | print()
133 | # To keep only the concurrent period, remove nodatas
134 | df = df.dropna(how='any', axis=0)
135 | # force pandas to show all columns in head/describe
136 | with pd.option_context('display.max_rows', None, 'display.max_columns', None):
137 |     print(df.head())
138 |     print()
139 |     print(df.describe())
140 |     print()
--------------------------------------------------------------------------------
/examples/example_3_merge_functions.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import xarray as xr
3 | import numpy as np
4 | from typing import Union, Dict, List
5 | import math
6 | 
7 | # VORTEX TYPES
8 | vSet = Union[xr.Dataset, pd.DataFrame]
9 | vArray = Union[xr.DataArray, pd.Series]
10 | vData = Union[vSet, vArray]
11 | 
12 | def find_wind_speed(vs: vSet) -> vArray:
13 |     """
14 |     Calculate the wind speed.
15 | 
16 |     Given a vSet we return the vArray
17 |     of wind speed, which may already be present in the ``vSet``
18 |     or may need to be computed from the wind components.
19 |     It is computed lazily if the inputs are Dask arrays.
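
    For reference, the computation performed below reduces to this sketch:

    .. code-block:: python

        m = np.sqrt(vs['U'] ** 2 + vs['V'] ** 2).rename('M')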
20 | 
21 |     Parameters
22 |     ----------
23 |     vs: vSet
24 |         vSet with wind speed (M) or wind components (U & V)
25 | 
26 |     Returns
27 |     -------
28 |     vArray
29 |         Wind speed data (named M)
30 |     """
31 |     if 'M' in vs:
32 |         m = vs['M']
33 |     else:
34 |         try:
35 |             u = vs['U']
36 |             v = vs['V']
37 |         except KeyError:
38 |             raise ValueError('Cannot obtain M (no U or V)')
39 | 
40 |         m = np.sqrt(u ** 2 + v ** 2).rename('M')
41 | 
42 |     m.attrs = vtx_attributes_vars['M']
43 |     return m
44 | 
45 | 
46 | def find_direction(vs: vSet) -> vArray:
47 |     """
48 |     Calculate the wind direction.
49 | 
50 |     Given a vSet we return the vArray
51 |     of wind direction, which may already be present in the ``vSet``
52 |     or may need to be computed from the wind components.
53 |     It is computed lazily if the inputs are Dask arrays.
54 | 
55 |     Parameters
56 |     ----------
57 |     vs: vSet
58 |         vSet with wind direction (Dir) or wind components (U & V)
59 | 
60 |     Returns
61 |     -------
62 |     vArray:
63 |         Wind direction data (named Dir)
64 | 
65 |     """
66 |     if 'Dir' in vs:
67 |         d = vs['Dir']
68 |     else:
69 |         try:
70 |             u = vs['U']
71 |             v = vs['V']
72 |         except KeyError:
73 |             raise ValueError('Cannot obtain Dir (no U or V)')
74 | 
75 |         radians = np.arctan2(u, v)
76 |         d = (radians * 180 / math.pi + 180).rename('Dir')
77 | 
78 |     d.attrs = vtx_attributes_vars['Dir']
79 |     return d
80 | 
81 | def find_var(var: str, ds: vSet, **kwargs) -> vArray:
82 |     """
83 |     Return the requested variable from the vSet if possible. Given a vSet
84 |     we return the vArray of the variable `var` using the functions
85 |     defined in this module or simply selecting it from the vSet.
86 |     It is computed lazily if the inputs are Dask arrays.
87 | 
88 |     Parameters
89 |     ----------
90 |     var: str
91 |         Name of the variable. Either existing in the vSet or
92 |         one that is standard Vortex and can be computed:
93 | 
94 |         .. code-block:: python
95 | 
96 |             find_new_vars = {
97 |                 # Some variables need to be computed
98 |                 'U': find_zonal_wind,
99 |                 'V': find_meridional_wind,
100 |                 'D': find_density,
101 |                 'M': find_wind_speed,
102 |                 'T': find_temperature_celsius,
103 |                 'P': find_pressure_hpa,
104 |                 'SD': find_standard_deviation,
105 |                 'TI': find_turbulence_intensity,
106 |                 'Dir': find_direction,
107 |                 'energy': find_energy,
108 |                 'power': find_power,
109 |                 'RI': find_richardson,
110 |                 'stability': find_stability,
111 |                 'stabilityClass': find_stability_class,
112 |                 'shear': find_shear,
113 |                 'shear_sd': find_shear_sd,
114 |                 'veer': find_veer,
115 |                 'total_veer': find_total_veer,
116 |                 'inflow': find_inflow,
117 |                 'sector': find_sectors,
118 |                 'Mbin': find_wind_bins,
119 |                 'daynight': find_daynight,
120 |                 'solar_elev': find_solar_elevation,
121 |                 'variance': find_wind_variance,
122 |             }
123 | 
124 |     ds: vSet
125 | 
126 |     Returns
127 |     -------
128 |     v: vArray
129 |         Array called `var` and, in case it is an xarray object, with attributes.
130 | 
131 |     """
132 | 
133 |     if var in find_new_vars:
134 |         v = find_new_vars[var](ds, **kwargs)
135 |     elif var in ds:
136 |         v = ds[var]
137 |         if var in vtx_attributes_vars:
138 |             v.attrs = vtx_attributes_vars[var]
139 |     else:
140 |         raise ValueError('Cannot obtain variable ' + var + ' from vSet.')
141 | 
142 |     return v
143 | 
144 | 
145 | def get_dataset(vd: vData,
146 |                 vars_list: List[str] = None,
147 |                 strict: bool = True,
148 |                 no_zarr: bool = True) -> Union[xr.Dataset, None]:
149 |     """
150 |     Given a vData return the data in xr.Dataset format
151 | 
152 |     Sometimes it is useful to know exactly what kind of object we are dealing
153 |     with instead of having the flexibility of vDatas.
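
    A usage sketch (the DataFrame and its column values are hypothetical; U and
    V are wind components from which M and Dir can be computed):

    .. code-block:: python

        df = pd.DataFrame({'time': times, 'lev': 100.0, 'U': u, 'V': v})
        ds = get_dataset(df, vars_list=['M', 'Dir'])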
154 | 
155 |     This function tries to smartly convert your vData to an xarray
156 |     Dataset, and compute the requested variables.
157 | 
158 |     If the input is:
159 |     - an xr.DataArray: simply convert it to a dataset
160 |     - a pd.Series: convert to a dataframe and then apply convert_to_xarray
161 |     - a pd.DataFrame: add to the index lat, lon, lev, time if they were in the columns, and apply convert_to_xarray
162 | 
163 |     Try to find the variables of vars_list, and raise an error if
164 |     none is found. If strict, fail if ANY variable is not found.
165 | 
166 |     If the vData passed was a vArray without a name, and we request a
167 |     single var, the code will assume the only var we passed is the one
168 |     we want, and rename it to what we passed in vars_list.
169 | 
170 |     Parameters
171 |     ----------
172 |     vd: vData
173 |     vars_list: list of variables
174 |         Must be understood by find_var
175 |     strict: bool
176 |         If strict=True the function will fail if any variable
177 |         is missing. If strict=False only fails if all variables fail.
178 |     no_zarr: bool
179 |         Compute the dask arrays if any, so that the result is not a
180 |         dask object.
181 | 
182 |     Returns
183 |     -------
184 |     xr.Dataset
185 |         The vData in xarray.Dataset format.
186 |     """
187 |     # Make sure we won't return a dask array (zarr)
188 |     if no_zarr:
189 |         if hasattr(vd, 'compute'):
190 |             vd = vd.compute()
191 | 
192 |     # If we have a vArray, we just convert it to xr.Dataset
193 |     if isinstance(vd, xr.DataArray):
194 |         if not vd.name and len(vars_list) == 1:  # unnamed arrays have name None
195 |             vd = vd.rename(vars_list[0])
196 |         vd = vd.to_dataset()
197 |     elif isinstance(vd, pd.Series):
198 |         if not vd.name and len(vars_list) == 1:  # unnamed series have name None
199 |             vd = vd.rename(vars_list[0])
200 |         vd = convert_to_xarray(vd.to_frame())  # pd.Series -> DataFrame via to_frame()
201 |     elif isinstance(vd, pd.DataFrame):
202 |         newdims = [c for c in vd.columns
203 |                    if c in ['lat', 'lon', 'lev', 'time']]
204 |         coords = {c: np.unique([vd[c].values]) for c in vd.columns
205 |                   if c in ['lat', 'lon', 'lev', 'time']}
206 |         if 0 < len(newdims) < 4:
207 |             vd = vd.set_index(newdims, append=True)
208 |         elif len(newdims) == 4:
209 |             vd = vd.set_index(newdims)
210 | 
211 |         vd = convert_to_xarray(vd, coords=coords)
212 | 
213 |     # If we get here, vd should be an xr.Dataset
214 |     variables = []
215 |     for v in vars_list:
216 |         try:
217 |             thisv = find_var(v, vd)
218 |         except ValueError as e:
219 |             if strict:
220 |                 print('One of the variables cannot be obtained: ' + v)
221 |                 raise e
222 |         else:
223 |             variables.append(thisv)
224 | 
225 |     if len(variables) == 0:
226 |         return None
227 | 
228 |     full = xr.merge(variables, combine_attrs="drop")
229 |     full = add_attrs_vars(full)
230 |     full = add_attrs_coords(full)
231 |     return full
232 | 
233 | 
234 | find_new_vars = {
235 |     # Some variables need to be computed
236 |     'M': find_wind_speed,
237 |     'Dir': find_direction,
238 | 
239 | }
240 | 
--------------------------------------------------------------------------------
/examples/example_4_basic_plots.py:
--------------------------------------------------------------------------------
1 | # =============================================================================
2 | # Authors: Oriol L
3 | # Company: Vortex F.d.C.
4 | # Year: 2025
5 | # =============================================================================
6 | 
7 | """
8 | Overview:
9 | ---------
10 | This script demonstrates the process of plotting basic information once a dataset from both measurements and synthetic data has been merged.
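
The plotting helpers used below are defined in example_4_basic_plots_functions.
A sketch of the calls this script walks through (column names follow the merge
step below):

.. code-block:: python

    plot_xy_comparison(df, x_col='M_obs_nc', y_col='M_vortex_nc', site=SITE)
    plot_histogram_comparison(df, cols=['M_obs_nc', 'M_vortex_nc'])
    plot_annual_means(df, cols=['M_obs_nc', 'M_vortex_nc'])
    plot_daily_cycle(df, cols=['M_obs_nc', 'M_vortex_nc'])
    plot_yearly_means(df, cols=['M_obs_nc', 'M_vortex_nc'])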
11 | 
12 | """
13 | 
14 | # =============================================================================
15 | # 1. Import Libraries
16 | # =============================================================================
17 | 
18 | from example_3_merge_functions import *
19 | from example_4_basic_plots_functions import *
20 | import os
21 | 
22 | # =============================================================================
23 | # 2. Define Paths and Site
24 | # Repeat the process in chapter 3 to read netcdf and merge datasets
25 | # =============================================================================
26 | 
27 | SITE = 'froya'
28 | pwd = os.getcwd()
29 | base_path = str(os.path.join(pwd, '../data'))
30 | 
31 | print()
32 | measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc')
33 | vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc')
34 | 
35 | vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt')
36 | measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt')
37 | 
38 | # Print filenames
39 | print('Measurements NetCDF: ', measurements_netcdf)
40 | print('Vortex NetCDF: ', vortex_netcdf)
41 | 
42 | print()
43 | print('#'*26, 'Vortex F.d.C. 2025', '#'*26)
44 | print()
45 | 
46 | 
47 | # Read NetCDF
48 | ds_obs_nc = xr.open_dataset(measurements_netcdf)
49 | ds_vortex_nc = xr.open_dataset(vortex_netcdf)
50 | #ds_vortex_nc = ds_vortex_nc.rename_vars({'D': 'Dir'})
51 | 
52 | # =============================================================================
53 | # 3. Interpolate Vortex Series to the same Measurements level. Select M and Dir.
54 | # =============================================================================
55 | 
56 | print()
57 | max_height = ds_obs_nc.squeeze().coords['lev'].max().values
58 | print("Max height in measurements: ", max_height)
59 | ds_obs_nc = ds_obs_nc.sel(lev=max_height).squeeze().reset_coords(drop=True)[['M', 'Dir']]
60 | 
61 | ds_vortex_nc = ds_vortex_nc.interp(lev=max_height).squeeze().reset_coords(drop=True)[['M', 'Dir']]
62 | 
63 | 
64 | # convert ds_obs_nc to hourly
65 | ds_obs_nc = ds_obs_nc.resample(time='1h').mean()
66 | 
67 | # =============================================================================
68 | # 6. Convert all to DataFrame, Rename and Merge
69 | # =============================================================================
70 | 
71 | df_obs_nc = ds_obs_nc.to_dataframe()
72 | df_vortex_nc = ds_vortex_nc.to_dataframe()
73 | 
74 | 
75 | # rename columns so they do not have the same name when merging
76 | df_obs_nc.columns = ['M_obs_nc', 'Dir_obs_nc']
77 | print("df_obs_nc columns: ", df_obs_nc.columns)
78 | df_vortex_nc = df_vortex_nc[['M','Dir']]
79 | print("df_vortex_nc columns: ", df_vortex_nc.columns)
80 | df_vortex_nc.columns = ['M_vortex_nc', 'Dir_vortex_nc']
81 | 
82 | df = df_vortex_nc.merge(df_obs_nc, left_index=True, right_index=True)
83 | # drop rows with missing values to keep only the concurrent period
84 | df = df.dropna(how='any', axis=0)
85 | 
86 | # add a Vortex series restricted to the period concurrent with obs (ST = short term)
87 | df_with_na = pd.merge(df_vortex_nc, df_obs_nc, left_index=True, right_index=True, how='outer')
88 | df_ST = df[['M_vortex_nc','Dir_vortex_nc']]
89 | df_ST.columns = ['M_vortex_nc_ST', 'Dir_vortex_nc_ST']
90 | df_with_na = pd.merge(df_with_na, df_ST, left_index=True, right_index=True, how='outer')
91 | 
92 | # check whether dropping NAs on the merged frame changes anything
93 | ## checked, they give the same result: df_with_na = df_with_na.dropna(how='any', axis=0)
94 | 
95 | # =============================================================================
96 | # 8. Use the functions for plotting
97 | # =============================================================================
98 | output_dir = "output"
99 | 
100 | 
101 | 
102 | # Use the functions to create the plots (outlier threshold of 4 m/s)
103 | xy_stats = plot_xy_comparison(
104 |     df=df,
105 |     x_col='M_obs_nc',
106 |     y_col='M_vortex_nc',
107 |     x_label='Measurement Wind Speed (m/s)',
108 |     y_label='Vortex Wind Speed (m/s)',
109 |     site=SITE,
110 |     output_dir=output_dir,
111 |     outlyer_threshold=4
112 | )
113 | # Repeat the same comparison with a looser outlier threshold (6 m/s)
114 | # (only outlyer_threshold changes with respect to the call above)
115 | xy_stats = plot_xy_comparison(
116 |     df=df,
117 |     x_col='M_obs_nc',
118 |     y_col='M_vortex_nc',
119 |     x_label='Measurement Wind Speed (m/s)',
120 |     y_label='Vortex Wind Speed (m/s)',
121 |     site=SITE,
122 |     output_dir=output_dir,
123 |     outlyer_threshold=6
124 | )
125 | 
126 | # Print regression statistics
127 | print(f"\nRegression Statistics:")
128 | print(f"Slope: {xy_stats['slope']:.4f}")
129 | print(f"Intercept: {xy_stats['intercept']:.4f}")
130 | print(f"R-squared: {xy_stats['r_squared']:.4f}")
131 | print(f"p-value: {xy_stats['p_value']:.4e}")
132 | print(f"Standard Error: {xy_stats['std_err']:.4f}")
133 | 
134 | # Create histogram
135 | hist_stats = plot_histogram_comparison(
136 |     df=df,
137 |     cols=['M_obs_nc', 'M_vortex_nc'],
138 |     labels=['Measurements', 'Vortex'],
139 |     colors=['blue', 'red'],
140 |     site=SITE,
141 |     output_dir=output_dir,
142 |     bins=25,
143 |     alpha=0.6
144 | )
145 | 
146 | # Create histogram
147 | hist_stats = plot_histogram_comparison(
148 |     df=df,
149 |     cols=['Dir_obs_nc', 'Dir_vortex_nc'],
150 |     labels=['Measurements', 'Vortex'],
151 |     colors=['blue', 'red'],
152 |     site=SITE+" Dir",
153 |     output_dir=output_dir,
154 |     bins=12,
155 |     alpha=0.6
156 | )
157 | 
158 | 
159 | 
160 | # =============================================================================
161 | # 9. Plot Annual and Daily Cycles
162 | # =============================================================================
163 | 
164 | # Plot annual cycle for wind speed
165 | annual_stats_M = plot_annual_means(
166 |     df=df,
167 |     cols=['M_obs_nc', 'M_vortex_nc'],
168 |     labels=['Measurements', 'Vortex'],
169 |     colors=['blue', 'red'],
170 |     site=SITE,
171 |     output_dir=output_dir
172 | )
173 | 
174 | # Plot daily cycle for wind speed
175 | daily_stats_M = plot_daily_cycle(
176 |     df=df,
177 |     cols=['M_obs_nc', 'M_vortex_nc'],
178 |     labels=['Measurements', 'Vortex'],
179 |     colors=['blue', 'red'],
180 |     site=SITE,
181 |     output_dir=output_dir
182 | )
183 | 
184 | # =============================================================================
185 | # 10. Plot Yearly Means
186 | # =============================================================================
187 | 
188 | # Plot yearly means for wind speed
189 | yearly_stats_M = plot_yearly_means(
190 |     df=df,
191 |     cols=['M_obs_nc', 'M_vortex_nc'],
192 |     labels=['Measurements', 'Vortex'],
193 |     colors=['blue', 'red'],
194 |     site=SITE,
195 |     output_dir=output_dir
196 | )
197 | 
198 | 
199 | 
200 | # Compare long-term (LT) yearly means from the full Vortex series with the concurrent-period (ST) series
201 | yearly_stats_LT = plot_yearly_means(
202 |     df=df_with_na,
203 |     cols=['M_vortex_nc','M_obs_nc','M_vortex_nc_ST'],
204 |     labels=['Vortex LT','OBS','Vortex ST'],
205 |     colors=['green','blue','red'],
206 |     site=SITE,
207 |     output_dir=output_dir
208 | )
209 | 
210 | # describe to check the number of NaNs in the years 2010 to 2014
211 | 
212 | print(df_with_na.loc['2010-01-01':'2014-12-31'].describe())
213 | print(df_with_na.loc['2010-01-01':'2014-12-31'].isna().sum())
214 | 
215 | 
216 | 
217 | 
218 | 
219 | 
220 | 
221 | 
222 | 
223 | 
224 | 
225 | 
226 | 
--------------------------------------------------------------------------------
/examples/example_4_basic_plots_functions.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import matplotlib.pyplot as plt
3 | import numpy as np
4 | from scipy import stats
5 | import os
6 | # =============================================================================
7 | # 7. Define Plotting Functions
8 | # =============================================================================
9 | 
10 | def plot_xy_comparison(df, x_col, y_col, x_label=None, y_label=None, site=None,
11 |                        output_dir=None, save_fig=True, outlyer_threshold=999, show=True):
12 |     """
13 |     Create a scatter plot with linear regression comparing two variables.
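
    A typical call, as used in example_4_basic_plots.py (sketch):

    .. code-block:: python

        stats = plot_xy_comparison(df, x_col='M_obs_nc', y_col='M_vortex_nc',
                                   site='froya', output_dir='output',
                                   outlyer_threshold=4)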
14 | 
15 |     Parameters:
16 |     -----------
17 |     df : pandas DataFrame
18 |         DataFrame containing the data to plot
19 |     x_col : str
20 |         Column name for x-axis data
21 |     y_col : str
22 |         Column name for y-axis data
23 |     x_label : str, optional
24 |         Custom x-axis label (defaults to x_col)
25 |     y_label : str, optional
26 |         Custom y-axis label (defaults to y_col)
27 |     site : str, optional
28 |         Site name for title and filename
29 |     output_dir : str, optional
30 |         Directory to save plot
31 |     save_fig : bool, optional
32 |         Whether to save the figure
33 |     outlyer_threshold : float, optional
34 |         Deviation from the fit above which points are flagged as outliers; show (bool, optional) controls whether the plot is displayed
35 | 
36 |     Returns:
37 |     --------
38 |     dict
39 |         Dictionary containing regression statistics
40 |     """
41 |     # Create figure and axis
42 |     plt.figure(figsize=(8, 8))
43 | 
44 |     # Use default labels if not provided
45 |     x_label = x_label or x_col
46 |     y_label = y_label or y_col
47 |     site_name = site.capitalize() if site else ''
48 | 
49 |     # Scatter plot
50 |     plt.scatter(df[x_col], df[y_col], alpha=0.5, color='blue')
51 | 
52 | 
53 |     # Calculate linear regression
54 |     slope, intercept, r_value, p_value, std_err = stats.linregress(df[x_col], df[y_col])
55 |     r_squared = r_value**2
56 | 
57 |     # Create regression line for plotting
58 |     x_line = np.linspace(df[x_col].min(), df[x_col].max(), 100)
59 |     y_line = slope * x_line + intercept
60 | 
61 |     # Calculate the expected y values based on regression for each actual x point
62 |     y_expected = slope * df[x_col] + intercept
63 |     y_real = df[y_col]
64 |     df['diff'] = np.abs(y_real - y_expected)  # note: adds a 'diff' column to the caller's df
65 | 
66 |     # Flag points deviating more than outlyer_threshold from the fit
67 | 
68 |     outlyers = df['diff'] > outlyer_threshold
69 | 
70 |     # Plot outliers
71 |     plt.scatter(df[x_col][outlyers], df[y_col][outlyers], color='red', alpha=0.5, label='Outliers')
72 | 
73 |     # Plot regression line
74 |     plt.plot(x_line, y_line, 'r-', label=f'y = {slope:.3f}x + {intercept:.3f}')
75 | 
76 |     # Add identity line (perfect agreement)
77 |     plt.plot([0, max(df[x_col].max(), df[y_col].max())],
78 |              [0, max(df[x_col].max(), df[y_col].max())],
79 |              'k--', alpha=0.3, label='1:1')
80 | 
81 |     # Add annotations with regression statistics
82 |     plt.annotate(f'$R^2$ = {r_squared:.3f}',
83 |                  xy=(0.05, 0.95), xycoords='axes fraction',
84 |                  bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))
85 | 
86 |     # Add labels and title
87 |     plt.xlabel(x_label, fontsize=12)
88 |     plt.ylabel(y_label, fontsize=12)
89 |     plt.title(f'Comparison of {x_label} vs {y_label} at {site_name}', fontsize=14)
90 | 
91 |     # Add grid and legend
92 |     plt.grid(True, alpha=0.3)
93 |     plt.legend()
94 | 
95 |     # Equal aspect ratio
96 |     plt.axis('equal')
97 |     plt.tight_layout()
98 | 
99 |     # Save the figure if requested
100 |     if save_fig and output_dir and site:
101 |         os.makedirs(output_dir, exist_ok=True)
102 |         plt.savefig(os.path.join(output_dir, f'{site}_comparison_{x_col}_vs_{y_col}.png'), dpi=300)
103 | 
104 |     # Show the plot
105 |     if show:
106 |         plt.show()
107 |     else:
108 |         plt.clf()
109 |         plt.close()
110 | 
111 |     # Return regression statistics
112 |     stats_dict = {
113 |         "slope": slope,
114 |         "intercept": intercept,
115 |         "r_squared": r_squared,
116 |         "p_value": p_value,
117 |         "std_err": std_err
118 |     }
119 | 
120 |     return stats_dict
121 | 
122 | 
123 | def plot_histogram_comparison(df, cols, labels=None, colors=None, site=None,
124 |                               output_dir=None, save_fig=True, bins=25, alpha=0.6):
125 |     """
126 |     Create histograms comparing multiple columns.
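
    A typical call from the examples (sketch):

    .. code-block:: python

        stats = plot_histogram_comparison(df, cols=['M_obs_nc', 'M_vortex_nc'],
                                          labels=['Measurements', 'Vortex'],
                                          colors=['blue', 'red'],
                                          site='froya', bins=25)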
127 | 128 | Parameters: 129 | ----------- 130 | df : pandas DataFrame 131 | DataFrame containing the data to plot 132 | cols : list of str 133 | List of column names to plot 134 | labels : list of str, optional 135 | Custom labels for each column (defaults to column names) 136 | colors : list of str, optional 137 | Colors for each histogram (defaults to default color cycle) 138 | site : str, optional 139 | Site name for title and filename 140 | output_dir : str, optional 141 | Directory to save plot 142 | save_fig : bool, optional 143 | Whether to save the figure 144 | bins : int or array-like, optional 145 | Number of bins or bin edges 146 | alpha : float, optional 147 | Transparency for histograms (0-1) 148 | 149 | Returns: 150 | -------- 151 | dict 152 | Dictionary containing basic statistics for each column 153 | """ 154 | # Create figure 155 | 156 | plt.figure(figsize=(10, 6)) 157 | 158 | # Use default labels if not provided 159 | labels = labels or cols 160 | colors = colors or ['blue', 'red', 'green', 'orange', 'purple'] 161 | site_name = site.capitalize() if site else '' 162 | 163 | # Calculate overall range for binning if bins is an integer 164 | if isinstance(bins, int): 165 | max_val = max([df[col].max() for col in cols]) + 1 166 | bin_edges = np.linspace(0, max_val, bins) 167 | else: 168 | bin_edges = bins 169 | 170 | # Plot histograms with transparency 171 | stats_dict = {} 172 | for i, col in enumerate(cols): 173 | plt.hist(df[col], bins=bin_edges, alpha=alpha, 174 | label=labels[i], color=colors[i % len(colors)], 175 | edgecolor='black') 176 | 177 | 178 | # Calculate statistics 179 | mean_val = df[col].mean() 180 | stats_dict[col] = { 181 | "mean": mean_val, 182 | "median": df[col].median(), 183 | "std": df[col].std(), 184 | "min": df[col].min(), 185 | "max": df[col].max() 186 | } 187 | 188 | # Add vertical line for mean 189 | plt.axvline(mean_val, color=colors[i % len(colors)], 190 | linestyle='dashed', linewidth=1.5) 191 | 192 | # Add annotation for mean 193 | plt.annotate(f'{labels[i]} Mean: {mean_val:.2f}', 194 | xy=(mean_val, plt.ylim()[1] * (0.9 - i*0.1)), 195 | xytext=(mean_val + 0.5, plt.ylim()[1] * (0.9 - i*0.1)), 196 | arrowprops=dict(arrowstyle='->', color=colors[i % len(colors)]), 197 | color=colors[i % len(colors)]) 198 | 199 | # Add labels and title 200 | plt.xlabel('Values', fontsize=12) 201 | plt.ylabel('Frequency', fontsize=12) 202 | plt.title(f'Distribution Comparison at {site_name}', fontsize=14) 203 | 204 | # Add grid and legend 205 | plt.grid(True, alpha=0.3) 206 | plt.legend() 207 | plt.tight_layout() 208 | 209 | # Save the figure if requested 210 | if save_fig and output_dir and site: 211 | os.makedirs(output_dir, exist_ok=True) 212 | col_names = '_'.join([col.replace('_', '') for col in cols]) 213 | plt.savefig(os.path.join(output_dir, f'{site}_histogram_{col_names}.png'), dpi=300) 214 | 215 | # Show the plot 216 | plt.show() 217 | 218 | return stats_dict 219 | 220 | 221 | def plot_annual_means(df, cols, labels=None, colors=None, site=None, 222 | output_dir=None, save_fig=True): 223 | """ 224 | Create a plot of annual means for given columns. 
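
    The monthly aggregation underneath is a single pandas groupby (sketch):

    .. code-block:: python

        monthly_means = df['M_obs_nc'].groupby(df.index.month).mean()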
225 | 226 | Parameters: 227 | ----------- 228 | df : pandas DataFrame 229 | DataFrame with datetime index and columns to plot 230 | cols : list of str 231 | List of column names to plot 232 | labels : list of str, optional 233 | Custom labels for each column (defaults to column names) 234 | colors : list of str, optional 235 | Colors for each line (defaults to default color cycle) 236 | site : str, optional 237 | Site name for title and filename 238 | output_dir : str, optional 239 | Directory to save plot 240 | save_fig : bool, optional 241 | Whether to save the figure 242 | 243 | Returns: 244 | -------- 245 | dict 246 | Dictionary containing annual statistics for each column 247 | """ 248 | # Ensure DataFrame has datetime index 249 | if not isinstance(df.index, pd.DatetimeIndex): 250 | raise ValueError("DataFrame index must be DatetimeIndex") 251 | 252 | # Create figure 253 | plt.figure(figsize=(12, 6)) 254 | 255 | # Use default labels if not provided 256 | labels = labels or cols 257 | colors = colors or ['blue', 'red', 'green', 'orange', 'purple'] 258 | site_name = site.capitalize() if site else '' 259 | 260 | # Group by month for annual cycle 261 | monthly_means = {} 262 | stats_dict = {} 263 | 264 | for i, col in enumerate(cols): 265 | # Group by month and calculate mean 266 | monthly_data = df[col].groupby(df.index.month).mean() 267 | monthly_means[col] = monthly_data 268 | 269 | # Store stats 270 | stats_dict[col] = { 271 | "annual_mean": df[col].mean(), 272 | "monthly_means": monthly_data.to_dict(), 273 | "max_month": monthly_data.idxmax(), 274 | "min_month": monthly_data.idxmin(), 275 | "annual_std": df[col].std() 276 | } 277 | 278 | # Plot the monthly means 279 | plt.plot(monthly_data.index, monthly_data.values, 280 | marker='o', linestyle='-', linewidth=2, 281 | color=colors[i % len(colors)], label=labels[i]) 282 | 283 | # Set x-ticks to month names 284 | month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 285 | 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'] 286 | plt.xticks(range(1, 13), month_names) 287 | 288 | # Add labels and title 289 | plt.xlabel('Month', fontsize=12) 290 | plt.ylabel('Mean Value', fontsize=12) 291 | plt.title(f'Annual Cycle at {site_name}', fontsize=14) 292 | 293 | # Add grid and legend 294 | plt.grid(True, alpha=0.3) 295 | plt.legend() 296 | plt.tight_layout() 297 | 298 | # Save the figure if requested 299 | if save_fig and output_dir and site: 300 | os.makedirs(output_dir, exist_ok=True) 301 | col_names = '_'.join([col.replace('_', '') for col in cols]) 302 | plt.savefig(os.path.join(output_dir, f'{site}_annual_cycle_{col_names}.png'), dpi=300) 303 | 304 | # Show the plot 305 | plt.show() 306 | 307 | return stats_dict 308 | 309 | 310 | def plot_daily_cycle(df, cols, labels=None, colors=None, site=None, 311 | output_dir=None, save_fig=True): 312 | """ 313 | Create a plot of daily cycle (hour of day) for given columns. 
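
    The hourly aggregation underneath follows the same groupby pattern (sketch):

    .. code-block:: python

        hourly_means = df['M_obs_nc'].groupby(df.index.hour).mean()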
314 | 315 | Parameters: 316 | ----------- 317 | df : pandas DataFrame 318 | DataFrame with datetime index and columns to plot 319 | cols : list of str 320 | List of column names to plot 321 | labels : list of str, optional 322 | Custom labels for each column (defaults to column names) 323 | colors : list of str, optional 324 | Colors for each line (defaults to default color cycle) 325 | site : str, optional 326 | Site name for title and filename 327 | output_dir : str, optional 328 | Directory to save plot 329 | save_fig : bool, optional 330 | Whether to save the figure 331 | 332 | Returns: 333 | -------- 334 | dict 335 | Dictionary containing daily cycle statistics for each column 336 | """ 337 | # Ensure DataFrame has datetime index 338 | if not isinstance(df.index, pd.DatetimeIndex): 339 | raise ValueError("DataFrame index must be DatetimeIndex") 340 | 341 | # Create figure 342 | plt.figure(figsize=(12, 6)) 343 | 344 | # Use default labels if not provided 345 | labels = labels or cols 346 | colors = colors or ['blue', 'red', 'green', 'orange', 'purple'] 347 | site_name = site.capitalize() if site else '' 348 | 349 | # Group by hour for daily cycle 350 | hourly_means = {} 351 | stats_dict = {} 352 | 353 | for i, col in enumerate(cols): 354 | # Group by hour and calculate mean 355 | hourly_data = df[col].groupby(df.index.hour).mean() 356 | hourly_means[col] = hourly_data 357 | 358 | # Store stats 359 | stats_dict[col] = { 360 | "daily_mean": df[col].mean(), 361 | "hourly_means": hourly_data.to_dict(), 362 | "max_hour": hourly_data.idxmax(), 363 | "min_hour": hourly_data.idxmin(), 364 | "daily_std": hourly_data.std() 365 | } 366 | 367 | # Plot the hourly means 368 | plt.plot(hourly_data.index, hourly_data.values, 369 | marker='o', linestyle='-', linewidth=2, 370 | color=colors[i % len(colors)], label=labels[i]) 371 | 372 | # Set x-ticks to hour format 373 | plt.xticks(range(0, 24, 2), [f'{h:02d}:00' for h in range(0, 24, 2)]) 374 | 375 | # Add labels and title 376 | plt.xlabel('Hour of Day', fontsize=12) 377 | plt.ylabel('Mean Value', fontsize=12) 378 | plt.title(f'Daily Cycle at {site_name}', fontsize=14) 379 | 380 | # Add grid and legend 381 | plt.grid(True, alpha=0.3) 382 | plt.legend() 383 | plt.tight_layout() 384 | 385 | # Save the figure if requested 386 | if save_fig and output_dir and site: 387 | os.makedirs(output_dir, exist_ok=True) 388 | col_names = '_'.join([col.replace('_', '') for col in cols]) 389 | plt.savefig(os.path.join(output_dir, f'{site}_daily_cycle_{col_names}.png'), dpi=300) 390 | 391 | # Show the plot 392 | plt.show() 393 | 394 | return stats_dict 395 | 396 | 397 | def plot_yearly_means(df, cols, labels=None, colors=None, site=None, 398 | output_dir=None, save_fig=True): 399 | """ 400 | Create a plot of annual means for each year for given columns. 
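
    The yearly aggregation underneath (sketch):

    .. code-block:: python

        yearly_means = df['M_obs_nc'].groupby(df.index.year).mean()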
401 | 
402 |     Parameters:
403 |     -----------
404 |     df : pandas DataFrame
405 |         DataFrame with datetime index and columns to plot
406 |     cols : list of str
407 |         List of column names to plot
408 |     labels : list of str, optional
409 |         Custom labels for each column (defaults to column names)
410 |     colors : list of str, optional
411 |         Colors for each line (defaults to default color cycle)
412 |     site : str, optional
413 |         Site name for title and filename
414 |     output_dir : str, optional
415 |         Directory to save plot
416 |     save_fig : bool, optional
417 |         Whether to save the figure
418 | 
419 |     Returns:
420 |     --------
421 |     dict
422 |         Dictionary containing yearly statistics for each column
423 |     """
424 |     # Ensure DataFrame has datetime index
425 |     if not isinstance(df.index, pd.DatetimeIndex):
426 |         raise ValueError("DataFrame index must be DatetimeIndex")
427 | 
428 |     # Create figure
429 |     plt.figure(figsize=(12, 6))
430 | 
431 |     # Use default labels if not provided
432 |     labels = labels or cols
433 |     colors = colors or ['blue', 'red', 'green', 'orange', 'purple']
434 |     site_name = site.capitalize() if site else ''
435 | 
436 |     # Get years in data
437 |     years = df.index.year.unique()
438 |     years = years.sort_values()  # sort_values() returns a new Index, so assign it back
439 | 
440 |     # Group by year for yearly cycle
441 |     yearly_means = {}
442 |     stats_dict = {}
443 | 
444 |     for i, col in enumerate(cols):
445 |         # Group by year and calculate mean
446 |         yearly_data = df[col].groupby(df.index.year).mean()
447 |         yearly_means[col] = yearly_data
448 | 
449 |         # Store stats
450 |         stats_dict[col] = {
451 |             "overall_mean": df[col].mean(),
452 |             "yearly_means": yearly_data.to_dict(),
453 |             "max_year": yearly_data.idxmax(),
454 |             "min_year": yearly_data.idxmin(),
455 |             "yearly_std": yearly_data.std()
456 |         }
457 | 
458 |         # Plot the yearly means
459 |         plt.plot(yearly_data.index, yearly_data.values,
460 |                  marker='o', linestyle='-', linewidth=2,
461 |                  color=colors[i % len(colors)], label=labels[i])
462 | 
463 |     # Add horizontal lines for overall means
464 |     for i, col in enumerate(cols):
465 |         plt.axhline(y=stats_dict[col]["overall_mean"],
466 |                     color=colors[i % len(colors)],
467 |                     linestyle='--',
468 |                     alpha=0.5,
469 |                     label=f'{labels[i]} Overall Mean')
470 | 
471 |     # Set x-ticks to years
472 |     plt.xticks(years, [str(year) for year in years])
473 | 
474 |     # Add labels and title
475 |     plt.xlabel('Year', fontsize=12)
476 |     plt.ylabel('Mean Value', fontsize=12)
477 |     plt.title(f'Yearly Means at {site_name}', fontsize=14)
478 | 
479 |     # Add grid and legend
480 |     plt.grid(True, alpha=0.3)
481 |     plt.legend()
482 |     plt.tight_layout()
483 | 
484 |     # Save the figure if requested
485 |     if save_fig and output_dir and site:
486 |         os.makedirs(output_dir, exist_ok=True)
487 |         col_names = '_'.join([col.replace('_', '') for col in cols])
488 |         plt.savefig(os.path.join(output_dir, f'{site}_yearly_means_{col_names}.png'), dpi=300)
489 | 
490 |     # Show the plot
491 |     plt.show()
492 | 
493 |     return stats_dict
--------------------------------------------------------------------------------
/examples/example_5_MeasureCorrelatePredict.py:
--------------------------------------------------------------------------------
1 | # =============================================================================
2 | # Authors: Oriol L
3 | # Company: Vortex F.d.C.
4 | # Year: 2025
5 | # =============================================================================
6 | 
7 | """
8 | Overview:
9 | ---------
10 | This script demonstrates a Measure-Correlate-Predict (MCP) workflow: a regression is fitted between concurrent measurements and Vortex synthetic data, and then applied to the full synthetic series to reconstruct the long-term wind speed.
11 | 
12 | """
13 | # =============================================================================
14 | # 1. Import Libraries
15 | # =============================================================================
16 | from example_2_read_txt_functions import *
17 | from example_3_merge_functions import *
18 | from example_4_basic_plots_functions import *
19 | from scipy.stats import wasserstein_distance
20 | import os
21 | from example_5_MeasureCorrelatePredict_functions import plot_histogram_comparison_lines
22 | 
23 | 
24 | # =============================================================================
25 | # 2. Define Paths and Site
26 | # Repeat the process in chapter 3 to read the text series and merge datasets
27 | # =============================================================================
28 | 
29 | SITE = 'froya'
30 | pwd = os.getcwd()
31 | base_path = str(os.path.join(pwd, '../data'))
32 | 
33 | vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt')
34 | measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt')
35 | 
36 | print()
37 | print('#'*26, 'Vortex F.d.C. 2025', '#'*26)
38 | print()
39 | 
40 | # Read Text Series
41 | ds_vortex_txt = read_vortex_serie(vortex_txt)
42 | df_obs_txt = read_vortex_obs_to_dataframe(measurements_txt)[['M', 'Dir']]
43 | ds_obs_txt = convert_to_xarray(df_obs_txt)[['M', 'Dir']]
44 | #ds_vortex_nc = ds_vortex_nc.rename_vars({'D': 'Dir'})
45 | ds_vortex_txt = ds_vortex_txt[['M', 'Dir']].squeeze().reset_coords(drop=True)
46 | # convert ds_obs_txt to hourly
47 | ds_obs_txt = ds_obs_txt.resample(time='1h').mean()
48 | # =============================================================================
49 | # 3. Convert all to DataFrame, Rename and Merge
50 | # =============================================================================
51 | df_obs_txt = ds_obs_txt.to_dataframe()
52 | df_vortex_txt = ds_vortex_txt.to_dataframe()
53 | 
54 | df_obs_txt.columns = ['M_obs_txt', 'Dir_obs_txt']
55 | df_vortex_txt.columns = ['M_vortex_txt', 'Dir_vortex_txt']
56 | 
57 | df_concurrent = df_obs_txt.merge(df_vortex_txt, left_index=True, right_index=True).dropna()
58 | df_all = df_obs_txt.merge(df_vortex_txt, left_index=True, right_index=True, how='outer')
59 | 
60 | # SAVED DATASETS, CONCURRENT AND ALL PERIODS
61 | #print(df_concurrent.describe())
62 | #print(df_all.describe())
63 | 
64 | 
65 | # =============================================================================
66 | # 4. Use the functions for regression
67 | # =============================================================================
68 | output_dir = "output"
69 | # Fit the regression on the concurrent period (metrics only; the plot is not shown)
70 | xy_stats = plot_xy_comparison(
71 |     df=df_concurrent,
72 |     x_col='M_vortex_txt',
73 |     y_col='M_obs_txt',
74 |     x_label='Vortex Wind Speed (m/s)',
75 |     y_label='Measurement Wind Speed (m/s)',
76 |     site=SITE,
77 |     output_dir=output_dir,
78 |     outlyer_threshold=4,
79 |     show=False
80 | )
81 | 
82 | # Print regression statistics
83 | print(f"\nRegression Statistics:")
84 | print(f"Slope: {xy_stats['slope']:.4f}")
85 | print(f"Intercept: {xy_stats['intercept']:.4f}")
86 | print(f"R-squared: {xy_stats['r_squared']:.4f}")
87 | print(f"p-value: {xy_stats['p_value']:.4e}")
88 | print(f"Standard Error: {xy_stats['std_err']:.4f}")
89 | 
90 | # =============================================================================
91 | # 5. Compute the MCP
92 | # =============================================================================
93 | 
94 | ## create a new column with the MCP prediction:
95 | ## Ymcp = xy_stats['intercept'] + xy_stats['slope'] * df_all['M_vortex_txt']
96 | df_all['Ymcp'] = xy_stats['intercept'] + xy_stats['slope']*df_all['M_vortex_txt']
97 | 
98 | # concurrent stats
99 | 
100 | #print(df_all.dropna().describe())
101 | 
102 | hist_stats = plot_histogram_comparison(
103 |     df=df_all.dropna(),
104 |     cols=['M_obs_txt', 'Ymcp'],
105 |     labels=['Measurements', 'MCP'],
106 |     colors=['blue', 'orange'],
107 |     site=SITE,
108 |     output_dir=output_dir,
109 |     bins=25,
110 |     alpha=0.3
111 | )
112 | 
113 | # =============================================================================
114 | # 6. Compute the Sectorial MCP
115 | # =============================================================================
116 | 
117 | # Define 8 directional sectors (0-45, 45-90, etc.)
118 | sector_bounds = list(range(0, 361, 45))
119 | sector_labels = [f"{sector_bounds[i]}-{sector_bounds[i+1]}" for i in range(len(sector_bounds)-1)]
120 | 
121 | # Add a column for the wind direction sector
122 | df_concurrent['dir_sector'] = pd.cut(df_concurrent['Dir_vortex_txt'],
123 |                                      bins=sector_bounds,
124 |                                      labels=sector_labels,
125 |                                      include_lowest=True,
126 |                                      right=False)
127 | df_all['dir_sector'] = pd.cut(df_all['Dir_vortex_txt'],
128 |                               bins=sector_bounds,
129 |                               labels=sector_labels,
130 |                               include_lowest=True,
131 |                               right=False)
132 | 
133 | # Initialize results dictionary to store regression parameters
134 | sector_regressions = {}
135 | 
136 | # Perform regression for each sector
137 | for sector in sector_labels:
138 |     sector_data = df_concurrent[df_concurrent['dir_sector'] == sector]
139 | 
140 |     # Skip sectors with too few data points
141 |     if len(sector_data) < 5:
142 |         print(f"Warning: Sector {sector} has insufficient data points. Using global regression.")
143 |         sector_regressions[sector] = {'slope': xy_stats['slope'], 'intercept': xy_stats['intercept']}
144 |         continue
145 | 
146 |     # Perform linear regression for this sector
147 |     x = sector_data['M_vortex_txt']
148 |     y = sector_data['M_obs_txt']
149 | 
150 |     slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
151 | 
152 |     sector_regressions[sector] = {
153 |         'slope': slope,
154 |         'intercept': intercept,
155 |         'r_squared': r_value**2,
156 |         'p_value': p_value,
157 |         'std_err': std_err
158 |     }
159 | 
160 |     print(f"\nSector {sector} Regression Statistics:")
161 |     print(f"Slope: {slope:.4f}, Intercept: {intercept:.4f}, R-squared: {r_value**2:.4f}")
162 | 
163 | # Apply sector-specific regression to create the new MCP column
164 | df_all['Ymcp_sectorial'] = np.nan  # filled sector by sector below
165 | 
166 | 
167 | for sector in sector_labels:
168 |     mask = df_all['dir_sector'] == sector
169 |     if mask.any():  # Only proceed if there's data in this sector
170 |         slope = sector_regressions[sector]['slope']
171 |         intercept = sector_regressions[sector]['intercept']
172 |         df_all.loc[mask, 'Ymcp_sectorial'] = intercept + slope * df_all.loc[mask, 'M_vortex_txt']
173 |     else:
174 |         print(f"Warning: Sector {sector} has no data points. Skipping.")
175 | 
176 | # Convert Ymcp_sectorial column to float64
177 | df_all['Ymcp_sectorial'] = df_all['Ymcp_sectorial'].astype('float64')
178 | df = df_all.copy().dropna()[['M_obs_txt', 'M_vortex_txt','Ymcp', 'Ymcp_sectorial']]
179 | 
180 | # Create histogram
181 | hist_stats = plot_histogram_comparison(
182 |     df=df,
183 |     cols=['M_obs_txt', 'Ymcp_sectorial'],
184 |     labels=['Measurements', 'Sectorial MCP'],
185 |     colors=['blue', 'red'],
186 |     site=SITE,
187 |     output_dir=output_dir,
188 |     bins=25,
189 |     alpha=0.3
190 | )
191 | # =============================================================================
192 | # 7. Read remodeling
193 | # =============================================================================
194 | 
195 | # We now introduce a different method: Vortex Remodeling
196 | 
197 | file_remodeling_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.remodeling.utc0.100m.txt')
198 | ds_remodeling_txt = read_remodeling_serie(file_remodeling_txt)
199 | df_remodeling_txt = ds_remodeling_txt.to_dataframe().rename(columns={'M': 'M_remodeling_txt'})
200 | df = df.merge(df_remodeling_txt[['M_remodeling_txt']], left_index=True, right_index=True, how='outer').dropna()
201 | 
202 | hist_stats = plot_histogram_comparison(
203 |     df=df,
204 |     cols=['M_obs_txt', 'M_remodeling_txt'],
205 |     labels=['Measurements', 'Remodeling'],
206 |     colors=['blue', 'orange'],
207 |     site=SITE,
208 |     output_dir=output_dir,
209 |     bins=25,
210 |     alpha=0.3
211 | )
212 | 
213 | # =============================================================================
214 | # 8. Compare statistics
215 | # =============================================================================
216 | # wasserstein_distance (Earth Mover's Distance) was imported in section 1
217 | 
218 | # Calculate Earth Mover's Distance (Wasserstein distance) for each method
219 | 
220 | emd_results = {}
221 | for col in ['Ymcp', 'Ymcp_sectorial', 'M_remodeling_txt']:
222 |     # Calculate EMD between the method and observations
223 |     emd = wasserstein_distance(df['M_obs_txt'], df[col])
224 |     emd_results[col] = emd
225 | 
226 | 
227 | 
228 | 
229 | # Calculate mean absolute error and root mean squared error for each prediction method compared to observations
230 | print("\nError Metrics (compared to M_obs_txt):")
231 | print("=" * 80)
232 | for col in ['Ymcp', 'Ymcp_sectorial', 'M_remodeling_txt']:
233 |     mae = (df[col] - df['M_obs_txt']).abs().mean()
234 |     rmse = ((df[col] - df['M_obs_txt']) ** 2).mean() ** 0.5
235 |     bias = (df[col] - df['M_obs_txt']).mean()
236 |     print(f"{col}:")
237 |     print(f"  Mean Absolute Error (MAE): {mae:.4f} m/s")
238 |     print(f"  Root Mean Squared Error (RMSE): {rmse:.4f} m/s")
239 |     print(f"  Bias: {bias:.4f} m/s")
240 |     print(f"  Histogram error (EMD): {emd_results[col]:.4f}")
241 | 
242 | # =============================================================================
243 | # 9. Line histograms for distribution comparison
244 | # =============================================================================
245 | 
246 | 
247 | # Example usage for the new function
248 | hist_line_stats = plot_histogram_comparison_lines(
249 |     df=df,
250 |     cols=['M_obs_txt', 'Ymcp', 'Ymcp_sectorial', 'M_remodeling_txt'],
251 |     labels=['Measurements', 'MCP', 'Sectorial MCP', 'Remodeling'],
252 |     colors=['blue', 'orange', 'red', 'green'],
253 |     site=SITE,
254 |     output_dir=output_dir,
255 |     bins=50
256 | )
257 | exit()
258 | 
259 | 
260 | 
261 | 
262 | 
263 | 
264 | 
265 | 
266 | 
267 | 
268 | 
269 | 
270 | 
271 | 
272 | 
--------------------------------------------------------------------------------
/examples/example_5_MeasureCorrelatePredict_functions.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import pandas as pd
4 | import matplotlib.pyplot as plt
5 | 
6 | 
7 | def plot_histogram_comparison_lines(df, cols, labels, colors, site='Site', output_dir='output', bins=20, alpha=0.7, save_fig=True):
8 |     """
9 |     Create a histogram comparison plot with lines instead of bars.
10 | 
11 |     Parameters:
12 |     -----------
13 |     df : pandas.DataFrame
14 |         DataFrame containing the columns to be plotted.
15 |     cols : list
16 |         List of column names to plot.
17 |     labels : list
18 |         List of labels for the legend.
19 |     colors : list
20 |         List of colors for each line.
21 |     site : str
22 |         Site name for the plot title.
23 |     output_dir : str
24 |         Directory to save the output plot.
25 |     bins : int
26 |         Number of bins for the histogram.
27 |     alpha : float
28 |         Transparency of the lines.
29 |     save_fig : bool
30 |         Whether to save the figure or not.
31 | 
32 |     Returns:
33 |     --------
34 |     dict
35 |         Statistics of each distribution.
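
    A typical call, as used in example_5_MeasureCorrelatePredict.py (sketch):

    .. code-block:: python

        stats = plot_histogram_comparison_lines(
            df, cols=['M_obs_txt', 'Ymcp'],
            labels=['Measurements', 'MCP'],
            colors=['blue', 'orange'], site='froya', bins=50)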
36 |     """
37 |     plt.figure(figsize=(10, 6))
38 | 
39 |     stats = {}
40 | 
41 |     for col, label, color in zip(cols, labels, colors):
42 |         # Calculate histogram values
43 |         hist_values, bin_edges = np.histogram(df[col].dropna(), bins=bins, density=True)
44 |         bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
45 | 
46 |         # Plot as a line
47 |         plt.plot(bin_centers, hist_values, label=label, color=color, linewidth=2, alpha=alpha)
48 | 
49 |         # Store statistics
50 |         stats[label] = {
51 |             'mean': df[col].mean(),
52 |             'median': df[col].median(),
53 |             'std': df[col].std(),
54 |             'min': df[col].min(),
55 |             'max': df[col].max(),
56 |         }
57 | 
58 |     plt.title(f'Wind Speed Distribution Comparison - {site}')
59 |     plt.xlabel('Wind Speed (m/s)')
60 |     plt.ylabel('Probability Density')
61 |     plt.legend()
62 |     plt.grid(True, linestyle='--', alpha=0.7)
63 | 
64 |     # Save the figure if requested; os.makedirs(..., exist_ok=True) already
65 |     # covers a missing output directory, so no separate existence check is
66 |     # needed beforehand
67 |     if save_fig and output_dir and site:
68 |         os.makedirs(output_dir, exist_ok=True)
69 |         plt.savefig(os.path.join(output_dir, f'histogram_comparison_{site}.png'), dpi=300, bbox_inches='tight')
70 |         print(f"Histogram line plot saved to {output_dir}/histogram_comparison_{site}.png")
71 |     plt.show()
72 |     plt.close()
73 | 
74 |     return stats
75 | 
--------------------------------------------------------------------------------
/images/Froya-map.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/VortexFDC/pywind/f4dfe9ef705ed706355ba84ffd36ccec8ed06097/images/Froya-map.png


--------------------------------------------------------------------------------
/images/logo_VORTEX.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/VortexFDC/pywind/f4dfe9ef705ed706355ba84ffd36ccec8ed06097/images/logo_VORTEX.png


--------------------------------------------------------------------------------
/notebooks/example_2_read_txt.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "cells": [
3 |   {
4 |    "cell_type": "markdown",
5 |    "metadata": {},
6 |    "source": [
7 |     "\"Company\n",
8 |     "\n",
9 |     "| Project| Authors | Company | Year | Chapter |\n",
10 |     "|--------|-------------------|-----------------------------------------|------|---------|\n",
11 |     "| Pywind | Oriol L & Arnau T | [Vortex FdC](https://www.vortexfdc.com) | 2024 | 2 |\n",
12 |     "\n",
13 |     "# Chapter 2: Loading Txt\n",
14 |     "\n",
15 |     "_Overview:_\n",
16 |     "\n",
17 |     "This script reads meteorological **Text data (.txt)** and uses functions to load and show the file structure and a quick overview.\n",
18 |     "\n",
19 |     "- Measurements (txt) - Contains single height with limited variables.\n",
20 |     "- Vortex (txt) - Contains single heights and variables.\n",
21 |     "\n",
22 |     "_Data Storage:_\n",
23 |     "\n",
24 |     "The acquired data is stored in two data structures for comparison and analysis:\n",
25 |     "- Xarray Dataset\n",
26 |     "- Pandas DataFrame\n",
27 |     "\n",
28 |     "_Objective:_\n",
29 |     "\n",
30 |     "- To understand the variance in data storage when using Xarray and Pandas.\n",
31 |     "- Utilize the basic commands to make a quick overview of the loaded data; e.g. `describe()` and `head()`.\n",
32 |     "- Define functions in external files."
33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "metadata": {}, 38 | "source": [ 39 | "### Import Libraries" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 25, 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "import sys\n", 49 | "import os\n", 50 | "\n", 51 | "sys.path.append(os.path.join(os.getcwd(), '../examples'))\n", 52 | "\n", 53 | "from typing import Dict\n", 54 | "from example_2_read_txt_functions import *\n", 55 | "#from examples import example_2_read_txt_functions import _get_coordinates_vortex_header" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "\n", 63 | "### Define Paths and Site" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 26, 69 | "metadata": {}, 70 | "outputs": [ 71 | { 72 | "name": "stdout", 73 | "output_type": "stream", 74 | "text": [ 75 | "\n", 76 | "Measurements txt: /home/oriol/vortex/git/pywind_private/notebooks/../data/froya/measurements/obs.txt\n", 77 | "Vortex txt: /home/oriol/vortex/git/pywind_private/notebooks/../data/froya/vortex/SERIE/vortex.serie.era5.utc0.100m.txt\n" 78 | ] 79 | } 80 | ], 81 | "source": [ 82 | "SITE = 'froya'\n", 83 | "pwd = os.getcwd()\n", 84 | "base_path = str(os.path.join(pwd, '../data'))\n", 85 | "\n", 86 | "print()\n", 87 | "measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc')\n", 88 | "vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc')\n", 89 | "\n", 90 | "vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt')\n", 91 | "measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt')\n", 92 | "\n", 93 | "# Print filenames\n", 94 | "print('Measurements txt: ', measurements_txt)\n", 95 | "print('Vortex txt: ', vortex_txt)" 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "metadata": {}, 101 | "source": [ 102 | "### Read Vortex Text Series Functions" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 27, 108 | "metadata": {}, 109 | "outputs": [ 110 | { 111 | "name": "stderr", 112 | "output_type": "stream", 113 | "text": [ 114 | "/home/oriol/vortex/git/pywind_private/notebooks/../examples/example_2_read_txt_functions.py:149: FutureWarning: Support for nested sequences for 'parse_dates' in pd.read_csv is deprecated. Combine the desired columns with pd.to_datetime after parsing instead.\n", 115 | " df: pd.DataFrame = pd.read_csv(infile, **readcsv_kwargs)\n" 116 | ] 117 | }, 118 | { 119 | "data": { 120 | "text/html": [ 121 | "
(xarray Dataset HTML repr omitted: the notebook's rich HTML output was mangled in extraction; the text/plain repr below carries the same dataset summary)
" 524 | ], 525 | "text/plain": [ 526 | " Size: 13MB\n", 527 | "Dimensions: (time: 176424)\n", 528 | "Coordinates:\n", 529 | " lat float64 8B 63.67\n", 530 | " lon float64 8B 8.342\n", 531 | " lev float64 8B 100.0\n", 532 | " utc float64 8B 0.0\n", 533 | " * time (time) datetime64[ns] 1MB 2004-01-01 ... 2024-02-15T23:00:00\n", 534 | "Data variables:\n", 535 | " M (time) float64 1MB 4.6 4.3 4.9 4.7 4.7 5.2 ... 7.9 7.5 6.8 6.0 4.5\n", 536 | " Dir (time) int64 1MB 154 162 166 156 145 141 ... 93 100 104 108 113 122\n", 537 | " T (time) float64 1MB -1.8 -1.7 -1.7 -1.8 -1.9 ... 1.6 1.0 0.5 0.3 0.3\n", 538 | " D (time) float64 1MB 1.29 1.29 1.29 1.29 1.29 ... 1.27 1.27 1.27 1.27\n", 539 | " P (time) float64 1MB 1.006e+03 1.005e+03 1.005e+03 ... 994.4 993.9\n", 540 | " RI (time) float64 1MB -0.75 -1.48 -3.55 -2.01 ... -1.89 -1.7 0.18\n", 541 | " RH (time) float64 1MB 67.0 64.9 64.6 63.6 62.2 ... 77.6 78.5 79.7 78.9\n", 542 | " RMOL (time) float64 1MB -0.1985 -0.2098 -0.1699 ... -0.0406 -0.0795" 543 | ] 544 | }, 545 | "execution_count": 27, 546 | "metadata": {}, 547 | "output_type": "execute_result" 548 | } 549 | ], 550 | "source": [ 551 | "ds_vortex = read_vortex_serie(vortex_txt)\n", 552 | "ds_vortex" 553 | ] 554 | }, 555 | { 556 | "cell_type": "markdown", 557 | "metadata": {}, 558 | "source": [ 559 | "Now, we convert *df_vortex* to Pandas DataFrame and we use `head()` Pandas function to display the first 5 rows." 560 | ] 561 | }, 562 | { 563 | "cell_type": "code", 564 | "execution_count": 28, 565 | "metadata": {}, 566 | "outputs": [ 567 | { 568 | "data": { 569 | "text/html": [ 570 | "
\n", 571 | "\n", 584 | "\n", 585 | " \n", 586 | " \n", 587 | " \n", 588 | " \n", 589 | " \n", 590 | " \n", 591 | " \n", 592 | " \n", 593 | " \n", 594 | " \n", 595 | " \n", 596 | " \n", 597 | " \n", 598 | " \n", 599 | " \n", 600 | " \n", 601 | " \n", 602 | " \n", 603 | " \n", 604 | " \n", 605 | " \n", 606 | " \n", 607 | " \n", 608 | " \n", 609 | " \n", 610 | " \n", 611 | " \n", 612 | " \n", 613 | " \n", 614 | " \n", 615 | " \n", 616 | " \n", 617 | " \n", 618 | " \n", 619 | " \n", 620 | " \n", 621 | " \n", 622 | " \n", 623 | " \n", 624 | " \n", 625 | " \n", 626 | " \n", 627 | " \n", 628 | " \n", 629 | " \n", 630 | " \n", 631 | " \n", 632 | " \n", 633 | " \n", 634 | " \n", 635 | " \n", 636 | " \n", 637 | " \n", 638 | " \n", 639 | " \n", 640 | " \n", 641 | " \n", 642 | " \n", 643 | " \n", 644 | " \n", 645 | " \n", 646 | " \n", 647 | " \n", 648 | " \n", 649 | " \n", 650 | " \n", 651 | " \n", 652 | " \n", 653 | " \n", 654 | " \n", 655 | " \n", 656 | " \n", 657 | " \n", 658 | " \n", 659 | " \n", 660 | " \n", 661 | " \n", 662 | " \n", 663 | " \n", 664 | " \n", 665 | " \n", 666 | " \n", 667 | " \n", 668 | " \n", 669 | " \n", 670 | " \n", 671 | " \n", 672 | " \n", 673 | " \n", 674 | " \n", 675 | " \n", 676 | " \n", 677 | " \n", 678 | " \n", 679 | " \n", 680 | " \n", 681 | " \n", 682 | " \n", 683 | " \n", 684 | " \n", 685 | " \n", 686 | " \n", 687 | " \n", 688 | " \n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | "
latlonlevutcMDirTDPRIRHRMOL
time
2004-01-01 00:00:0063.666398.342499100.00.04.6154-1.81.291005.5-0.7567.0-0.1985
2004-01-01 01:00:0063.666398.342499100.00.04.3162-1.71.291005.2-1.4864.9-0.2098
2004-01-01 02:00:0063.666398.342499100.00.04.9166-1.71.291005.2-3.5564.6-0.1699
2004-01-01 03:00:0063.666398.342499100.00.04.7156-1.81.291004.7-2.0163.6-0.1740
2004-01-01 04:00:0063.666398.342499100.00.04.7145-1.91.291004.4-1.0362.2-0.1747
\n", 695 | "
" 696 | ], 697 | "text/plain": [ 698 | " lat lon lev utc M Dir T D \\\n", 699 | "time \n", 700 | "2004-01-01 00:00:00 63.66639 8.342499 100.0 0.0 4.6 154 -1.8 1.29 \n", 701 | "2004-01-01 01:00:00 63.66639 8.342499 100.0 0.0 4.3 162 -1.7 1.29 \n", 702 | "2004-01-01 02:00:00 63.66639 8.342499 100.0 0.0 4.9 166 -1.7 1.29 \n", 703 | "2004-01-01 03:00:00 63.66639 8.342499 100.0 0.0 4.7 156 -1.8 1.29 \n", 704 | "2004-01-01 04:00:00 63.66639 8.342499 100.0 0.0 4.7 145 -1.9 1.29 \n", 705 | "\n", 706 | " P RI RH RMOL \n", 707 | "time \n", 708 | "2004-01-01 00:00:00 1005.5 -0.75 67.0 -0.1985 \n", 709 | "2004-01-01 01:00:00 1005.2 -1.48 64.9 -0.2098 \n", 710 | "2004-01-01 02:00:00 1005.2 -3.55 64.6 -0.1699 \n", 711 | "2004-01-01 03:00:00 1004.7 -2.01 63.6 -0.1740 \n", 712 | "2004-01-01 04:00:00 1004.4 -1.03 62.2 -0.1747 " 713 | ] 714 | }, 715 | "execution_count": 28, 716 | "metadata": {}, 717 | "output_type": "execute_result" 718 | } 719 | ], 720 | "source": [ 721 | "df_vortex = ds_vortex.to_dataframe() \n", 722 | "df_vortex.head()" 723 | ] 724 | }, 725 | { 726 | "cell_type": "markdown", 727 | "metadata": {}, 728 | "source": [ 729 | "And if we only want to print 'M' and 'Dir' columns:" 730 | ] 731 | }, 732 | { 733 | "cell_type": "code", 734 | "execution_count": 29, 735 | "metadata": {}, 736 | "outputs": [ 737 | { 738 | "data": { 739 | "text/html": [ 740 | "
\n", 741 | "\n", 754 | "\n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | " \n", 768 | " \n", 769 | " \n", 770 | " \n", 771 | " \n", 772 | " \n", 773 | " \n", 774 | " \n", 775 | " \n", 776 | " \n", 777 | " \n", 778 | " \n", 779 | " \n", 780 | " \n", 781 | " \n", 782 | " \n", 783 | " \n", 784 | " \n", 785 | " \n", 786 | " \n", 787 | " \n", 788 | " \n", 789 | " \n", 790 | " \n", 791 | " \n", 792 | " \n", 793 | " \n", 794 | "
MDir
time
2004-01-01 00:00:004.6154
2004-01-01 01:00:004.3162
2004-01-01 02:00:004.9166
2004-01-01 03:00:004.7156
2004-01-01 04:00:004.7145
\n", 795 | "
" 796 | ], 797 | "text/plain": [ 798 | " M Dir\n", 799 | "time \n", 800 | "2004-01-01 00:00:00 4.6 154\n", 801 | "2004-01-01 01:00:00 4.3 162\n", 802 | "2004-01-01 02:00:00 4.9 166\n", 803 | "2004-01-01 03:00:00 4.7 156\n", 804 | "2004-01-01 04:00:00 4.7 145" 805 | ] 806 | }, 807 | "execution_count": 29, 808 | "metadata": {}, 809 | "output_type": "execute_result" 810 | } 811 | ], 812 | "source": [ 813 | "df_vortex[['M', 'Dir']].head()" 814 | ] 815 | }, 816 | { 817 | "cell_type": "markdown", 818 | "metadata": {}, 819 | "source": [ 820 | "### Read Measurements Txt" 821 | ] 822 | }, 823 | { 824 | "cell_type": "code", 825 | "execution_count": 30, 826 | "metadata": {}, 827 | "outputs": [ 828 | { 829 | "name": "stderr", 830 | "output_type": "stream", 831 | "text": [ 832 | "/home/oriol/vortex/git/pywind_private/notebooks/../examples/example_2_read_txt_functions.py:149: FutureWarning: Support for nested sequences for 'parse_dates' in pd.read_csv is deprecated. Combine the desired columns with pd.to_datetime after parsing instead.\n", 833 | " df: pd.DataFrame = pd.read_csv(infile, **readcsv_kwargs)\n" 834 | ] 835 | }, 836 | { 837 | "data": { 838 | "text/html": [ 839 | "
\n", 840 | "\n", 853 | "\n", 854 | " \n", 855 | " \n", 856 | " \n", 857 | " \n", 858 | " \n", 859 | " \n", 860 | " \n", 861 | " \n", 862 | " \n", 863 | " \n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | " \n", 879 | " \n", 880 | " \n", 881 | " \n", 882 | " \n", 883 | " \n", 884 | " \n", 885 | " \n", 886 | " \n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | "
MDir
time
2009-11-18 13:50:003.62159.0
2009-11-18 14:00:003.46153.0
2009-11-18 14:10:002.99153.0
2009-11-18 14:20:002.41151.5
2009-11-18 14:30:001.91150.5
\n", 894 | "
" 895 | ], 896 | "text/plain": [ 897 | " M Dir\n", 898 | "time \n", 899 | "2009-11-18 13:50:00 3.62 159.0\n", 900 | "2009-11-18 14:00:00 3.46 153.0\n", 901 | "2009-11-18 14:10:00 2.99 153.0\n", 902 | "2009-11-18 14:20:00 2.41 151.5\n", 903 | "2009-11-18 14:30:00 1.91 150.5" 904 | ] 905 | }, 906 | "execution_count": 30, 907 | "metadata": {}, 908 | "output_type": "execute_result" 909 | } 910 | ], 911 | "source": [ 912 | "df_obs = read_vortex_obs_to_dataframe(measurements_txt)\n", 913 | "ds_obs = convert_to_xarray(df_obs)\n", 914 | "df_obs.head()" 915 | ] 916 | }, 917 | { 918 | "cell_type": "markdown", 919 | "metadata": {}, 920 | "source": [ 921 | "### Now we can compare statistics" 922 | ] 923 | }, 924 | { 925 | "cell_type": "code", 926 | "execution_count": 31, 927 | "metadata": {}, 928 | "outputs": [ 929 | { 930 | "data": { 931 | "text/html": [ 932 | "
\n", 933 | "\n", 946 | "\n", 947 | " \n", 948 | " \n", 949 | " \n", 950 | " \n", 951 | " \n", 952 | " \n", 953 | " \n", 954 | " \n", 955 | " \n", 956 | " \n", 957 | " \n", 958 | " \n", 959 | " \n", 960 | " \n", 961 | " \n", 962 | " \n", 963 | " \n", 964 | " \n", 965 | " \n", 966 | " \n", 967 | " \n", 968 | " \n", 969 | " \n", 970 | " \n", 971 | " \n", 972 | " \n", 973 | " \n", 974 | " \n", 975 | " \n", 976 | " \n", 977 | " \n", 978 | " \n", 979 | " \n", 980 | " \n", 981 | " \n", 982 | " \n", 983 | " \n", 984 | " \n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | "
MDir
count176424.00176424.00
mean8.43183.10
std4.7789.74
min0.100.00
25%4.90111.00
50%7.60194.00
75%11.00248.00
max34.70360.00
\n", 997 | "
" 998 | ], 999 | "text/plain": [ 1000 | " M Dir\n", 1001 | "count 176424.00 176424.00\n", 1002 | "mean 8.43 183.10\n", 1003 | "std 4.77 89.74\n", 1004 | "min 0.10 0.00\n", 1005 | "25% 4.90 111.00\n", 1006 | "50% 7.60 194.00\n", 1007 | "75% 11.00 248.00\n", 1008 | "max 34.70 360.00" 1009 | ] 1010 | }, 1011 | "metadata": {}, 1012 | "output_type": "display_data" 1013 | }, 1014 | { 1015 | "data": { 1016 | "text/html": [ 1017 | "
\n", 1018 | "\n", 1031 | "\n", 1032 | " \n", 1033 | " \n", 1034 | " \n", 1035 | " \n", 1036 | " \n", 1037 | " \n", 1038 | " \n", 1039 | " \n", 1040 | " \n", 1041 | " \n", 1042 | " \n", 1043 | " \n", 1044 | " \n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | " \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | " \n", 1065 | " \n", 1066 | " \n", 1067 | " \n", 1068 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1072 | " \n", 1073 | " \n", 1074 | " \n", 1075 | " \n", 1076 | " \n", 1077 | " \n", 1078 | " \n", 1079 | " \n", 1080 | " \n", 1081 | "
MDir
count171895.00171895.00
mean8.06178.48
std4.7895.98
min0.090.00
25%4.5793.00
50%7.04197.00
75%10.60251.00
max36.10360.00
\n", 1082 | "
" 1083 | ], 1084 | "text/plain": [ 1085 | " M Dir\n", 1086 | "count 171895.00 171895.00\n", 1087 | "mean 8.06 178.48\n", 1088 | "std 4.78 95.98\n", 1089 | "min 0.09 0.00\n", 1090 | "25% 4.57 93.00\n", 1091 | "50% 7.04 197.00\n", 1092 | "75% 10.60 251.00\n", 1093 | "max 36.10 360.00" 1094 | ] 1095 | }, 1096 | "metadata": {}, 1097 | "output_type": "display_data" 1098 | } 1099 | ], 1100 | "source": [ 1101 | "from IPython.display import display\n", 1102 | "\n", 1103 | "display(df_vortex[['M', 'Dir']].describe().round(2))\n", 1104 | "display(df_obs.describe().round(2))" 1105 | ] 1106 | }, 1107 | { 1108 | "cell_type": "markdown", 1109 | "metadata": {}, 1110 | "source": [ 1111 | "### Thank you for completing this Notebook! \n", 1112 | "### *Other references available upon request.*\n", 1113 | "\n", 1114 | "You now can:\n", 1115 | "\n", 1116 | "- Read Vortex SERIES txt files.\n", 1117 | "- Convert from txt to NetCD.\n", 1118 | "- Convert to **Pandas** DataFrames.\n", 1119 | "- Have a quick overview of the data using `head()` and `describe()` Pandas functions.\n", 1120 | "\n", 1121 | "**Don't hesitate to [contact us](https://vortexfdc.com/contact/) for any questions and information.**\n", 1122 | "\n", 1123 | "## Change Log\n", 1124 | "\n", 1125 | "\n", 1126 | "| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n", 1127 | "|-------------------|---------|------------|--------------------------------------------|\n", 1128 | "| 2024-06-25 | 0.0 | Arnau | Notebook creation |\n", 1129 | "\n", 1130 | "
\n", 1131 | "\n", 1132 | "##

© Vortex F.d.C. 2024. All rights reserved.

" 1133 | ] 1134 | } 1135 | ], 1136 | "metadata": { 1137 | "kernelspec": { 1138 | "display_name": "python-def", 1139 | "language": "python", 1140 | "name": "python3" 1141 | }, 1142 | "language_info": { 1143 | "codemirror_mode": { 1144 | "name": "ipython", 1145 | "version": 3 1146 | }, 1147 | "file_extension": ".py", 1148 | "mimetype": "text/x-python", 1149 | "name": "python", 1150 | "nbconvert_exporter": "python", 1151 | "pygments_lexer": "ipython3", 1152 | "version": "3.13.1" 1153 | } 1154 | }, 1155 | "nbformat": 4, 1156 | "nbformat_minor": 2 1157 | } 1158 | -------------------------------------------------------------------------------- /notebooks/example_3_merge.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "\"Company\n", 8 | "\n", 9 | "| Project| Authors | Company | Year | Chapter |\n", 10 | "|--------|-------------------|-----------------------------------------|------|---------|\n", 11 | "| Pywind | Oriol L & Arnau T | [Vortex FdC](https://www.vortexfdc.com) | 2024 | 3 |\n", 12 | "\n", 13 | "# Chapter 3: Merge\n", 14 | "\n", 15 | "_Overview_\n", 16 | "---------\n", 17 | "This script demonstrates the process of reading and processing various types of meteorological data files. The goal is to compare measurements from different sources and formats by resampling, interpolating, and merging the data for further analysis.\n", 18 | "\n", 19 | "The script uses functions to load and manipulate data from four distinct file formats:\n", 20 | "\n", 21 | "1. **Measurements (NetCDF)** - Contains multiple heights and variables.\n", 22 | "2. **Vortex NetCDF** - NetCDF file format with multiple heights and variables.\n", 23 | "3. **Vortex Text Series** - Text file containing time series data of meteorological measurements.\n", 24 | "4. **Measurements Text Series** - Text file containing time series data of observations.\n", 25 | "\n", 26 | "_Data Storage_\n", 27 | "-------------\n", 28 | "The acquired data is stored and processed in two data structures for comparison and analysis:\n", 29 | "- **Xarray Dataset**: For handling multi-dimensional arrays of the meteorological data, useful for complex operations and transformations.\n", 30 | "- **Pandas DataFrame**: For flexible and powerful data manipulation and analysis, allowing easy integration and comparison of different datasets.\n", 31 | "\n", 32 | "_Objective_\n", 33 | "----------\n", 34 | "- **Read and Interpolate Data**: Load data from NetCDF and text files, and interpolate Vortex data to match the measurement levels.\n", 35 | "- **Resample Data**: Convert the time series data to an hourly frequency to ensure uniformity in the analysis.\n", 36 | "- **Data Comparison**: Merge the datasets to facilitate a detailed comparison of measurements from different sources.\n", 37 | "- **Statistical Overview**: Utilize the `describe()` method from Pandas for a quick statistical summary of the datasets, providing insights into the distribution and characteristics of the data.\n", 38 | "- **Concurrent Period Analysis**: Clean the data by removing non-concurrent periods (no data) to focus on the overlapping timeframes for accurate comparison.\n", 39 | "\n", 40 | "By following these steps, the script aims to provide a comprehensive approach to handling and analyzing meteorological data from various sources, ensuring a clear understanding of the data's behavior and relationships." 
41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "metadata": {}, 46 | "source": [ 47 | "### Import Libraries" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 1, 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "import sys\n", 57 | "import os\n", 58 | "\n", 59 | "sys.path.append(os.path.join(os.getcwd(), '../examples'))\n", 60 | "from example_3_merge_functions import *" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "### Define Paths and Site" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 2, 73 | "metadata": {}, 74 | "outputs": [ 75 | { 76 | "name": "stdout", 77 | "output_type": "stream", 78 | "text": [ 79 | "\n", 80 | "Measurements txt: /home/oriol/vortex/git/pywind/notebooks/../data/froya/measurements/obs.txt\n", 81 | "Vortex txt: /home/oriol/vortex/git/pywind/notebooks/../data/froya/vortex/SERIE/vortex.serie.era5.utc0.100m.txt\n" 82 | ] 83 | } 84 | ], 85 | "source": [ 86 | "SITE = 'froya'\n", 87 | "pwd = os.getcwd()\n", 88 | "base_path = str(os.path.join(pwd, '../data'))\n", 89 | "\n", 90 | "print()\n", 91 | "measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc')\n", 92 | "vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc')\n", 93 | "\n", 94 | "vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt')\n", 95 | "measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt')\n", 96 | "\n", 97 | "# Print filenames\n", 98 | "print('Measurements txt: ', measurements_txt)\n", 99 | "print('Vortex txt: ', vortex_txt)" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "### Read Vortex Series in NetCDF and Text" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 3, 112 | "metadata": {}, 113 | "outputs": [ 114 | { 115 | "name": "stderr", 116 | "output_type": "stream", 117 | "text": [ 118 | "/home/oriol/vortex/git/pywind/notebooks/../examples/example_3_merge_functions.py:51: FutureWarning: Support for nested sequences for 'parse_dates' in pd.read_csv is deprecated. Combine the desired columns with pd.to_datetime after parsing instead.\n", 119 | " df: pd.DataFrame = pd.read_csv(infile, **readcsv_kwargs)\n", 120 | "/home/oriol/vortex/git/pywind/notebooks/../examples/example_3_merge_functions.py:51: FutureWarning: Support for nested sequences for 'parse_dates' in pd.read_csv is deprecated. Combine the desired columns with pd.to_datetime after parsing instead.\n", 121 | " df: pd.DataFrame = pd.read_csv(infile, **readcsv_kwargs)\n" 122 | ] 123 | } 124 | ], 125 | "source": [ 126 | "# Read NetCDF\n", 127 | "ds_obs_nc = xr.open_dataset(measurements_netcdf)\n", 128 | "ds_vortex_nc = xr.open_dataset(vortex_netcdf)\n", 129 | "#ds_vortex_nc = ds_vortex_nc.rename_vars({'D': 'Dir'})\n", 130 | "\n", 131 | "# Read Text Series\n", 132 | "ds_vortex_txt = read_vortex_serie(vortex_txt)\n", 133 | "df_obs_txt = read_vortex_obs_to_dataframe(measurements_txt)[['M', 'Dir']]\n", 134 | "ds_obs_txt = convert_to_xarray(df_obs_txt)[['M', 'Dir']]" 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": {}, 140 | "source": [ 141 | "#### Interpolate Vortex Series to the same Measurements level. 
Select M and Dir" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": 4, 147 | "metadata": {}, 148 | "outputs": [ 149 | { 150 | "name": "stdout", 151 | "output_type": "stream", 152 | "text": [ 153 | "Max height in measurements: 100.0\n" 154 | ] 155 | } 156 | ], 157 | "source": [ 158 | "max_height = ds_obs_nc.squeeze().coords['lev'].max().values\n", 159 | "print(\"Max height in measurements: \", max_height)\n", 160 | "ds_obs_nc = ds_obs_nc.sel(lev=max_height).squeeze().reset_coords(drop=True)[['M', 'Dir']]\n", 161 | "\n", 162 | "ds_vortex_nc = ds_vortex_nc.interp(lev=max_height).squeeze().reset_coords(drop=True)[['M', 'Dir']]\n", 163 | "ds_vortex_txt = ds_vortex_txt[['M', 'Dir']].squeeze().reset_coords(drop=True)" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": {}, 169 | "source": [ 170 | "#### Measurements Time Resampling to Hourly\n", 171 | "\n", 172 | "No need to perform any resampling to Vortex data, as SERIES products is already hourly." 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": 5, 178 | "metadata": {}, 179 | "outputs": [ 180 | { 181 | "name": "stderr", 182 | "output_type": "stream", 183 | "text": [ 184 | "/home/oriol/miniconda3/envs/python-def/lib/python3.13/site-packages/xarray/groupers.py:487: FutureWarning: 'H' is deprecated and will be removed in a future version, please use 'h' instead.\n", 185 | " self.index_grouper = pd.Grouper(\n", 186 | "/home/oriol/miniconda3/envs/python-def/lib/python3.13/site-packages/xarray/groupers.py:487: FutureWarning: 'H' is deprecated and will be removed in a future version, please use 'h' instead.\n", 187 | " self.index_grouper = pd.Grouper(\n" 188 | ] 189 | } 190 | ], 191 | "source": [ 192 | "# convert ds_obs_nc to hourly\n", 193 | "ds_obs_nc = ds_obs_nc.resample(time='1H').mean()\n", 194 | "# convert ds_obs_txt to hourly\n", 195 | "ds_obs_txt = ds_obs_txt.resample(time='1H').mean()" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": {}, 201 | "source": [ 202 | "#### Convert all to DataFrame, Rename and Merge" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 6, 208 | "metadata": {}, 209 | "outputs": [], 210 | "source": [ 211 | "# convert to Pandas DataFrames\n", 212 | "df_obs_nc = ds_obs_nc.to_dataframe()\n", 213 | "df_vortex_nc = ds_vortex_nc.to_dataframe()\n", 214 | "df_obs_txt = ds_obs_txt.to_dataframe()\n", 215 | "df_vortex_txt = ds_vortex_txt.to_dataframe()\n", 216 | "\n", 217 | "# rename columns so they do now have the same name when merging\n", 218 | "df_obs_nc.columns = ['M_obs_nc', 'Dir_obs_nc']\n", 219 | "df_vortex_nc.columns = ['M_vortex_nc', 'Dir_vortex_nc']\n", 220 | "df_obs_txt.columns = ['M_obs_txt', 'Dir_obs_txt']\n", 221 | "df_vortex_txt.columns = ['M_vortex_txt', 'Dir_vortex_txt']\n", 222 | "\n", 223 | "# merge using index (time) all dataframes\n", 224 | "df_nc = df_obs_nc.merge(df_vortex_nc, left_index=True, right_index=True)\n", 225 | "df_txt = df_obs_txt.merge(df_vortex_txt, left_index=True, right_index=True)\n", 226 | "df = df_nc.merge(df_txt, left_index=True, right_index=True)" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": {}, 232 | "source": [ 233 | "#### Results" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 7, 239 | "metadata": {}, 240 | "outputs": [ 241 | { 242 | "data": { 243 | "text/html": [ 244 | "
\n", 245 | "\n", 258 | "\n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | " \n", 279 | " \n", 280 | " \n", 281 | " \n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 289 | " \n", 290 | " \n", 291 | " \n", 292 | " \n", 293 | " \n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | " \n", 298 | " \n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | " \n", 309 | " \n", 310 | " \n", 311 | " \n", 312 | " \n", 313 | " \n", 314 | " \n", 315 | " \n", 316 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | "
M_obs_ncDir_obs_ncM_vortex_ncDir_vortex_ncM_obs_txtDir_obs_txtM_vortex_txtDir_vortex_txt
time
2009-11-18 13:00:003.62159.005.81114.233.62159.005.8114
2009-11-18 14:00:002.35147.585.91114.552.35147.585.9115
2009-11-18 15:00:001.25162.585.83117.741.26162.585.8118
2009-11-18 16:00:001.2077.425.24128.541.2077.425.2129
2009-11-18 17:00:001.84124.084.22151.341.84124.084.2151
\n", 341 | "
" 342 | ], 343 | "text/plain": [ 344 | " M_obs_nc Dir_obs_nc M_vortex_nc Dir_vortex_nc \\\n", 345 | "time \n", 346 | "2009-11-18 13:00:00 3.62 159.00 5.81 114.23 \n", 347 | "2009-11-18 14:00:00 2.35 147.58 5.91 114.55 \n", 348 | "2009-11-18 15:00:00 1.25 162.58 5.83 117.74 \n", 349 | "2009-11-18 16:00:00 1.20 77.42 5.24 128.54 \n", 350 | "2009-11-18 17:00:00 1.84 124.08 4.22 151.34 \n", 351 | "\n", 352 | " M_obs_txt Dir_obs_txt M_vortex_txt Dir_vortex_txt \n", 353 | "time \n", 354 | "2009-11-18 13:00:00 3.62 159.00 5.8 114 \n", 355 | "2009-11-18 14:00:00 2.35 147.58 5.9 115 \n", 356 | "2009-11-18 15:00:00 1.26 162.58 5.8 118 \n", 357 | "2009-11-18 16:00:00 1.20 77.42 5.2 129 \n", 358 | "2009-11-18 17:00:00 1.84 124.08 4.2 151 " 359 | ] 360 | }, 361 | "metadata": {}, 362 | "output_type": "display_data" 363 | }, 364 | { 365 | "name": "stdout", 366 | "output_type": "stream", 367 | "text": [ 368 | "\n" 369 | ] 370 | }, 371 | { 372 | "data": { 373 | "text/html": [ 374 | "
\n", 375 | "\n", 388 | "\n", 389 | " \n", 390 | " \n", 391 | " \n", 392 | " \n", 393 | " \n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | " \n", 398 | " \n", 399 | " \n", 400 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | "
M_obs_ncDir_obs_ncM_vortex_ncDir_vortex_ncM_obs_txtDir_obs_txtM_vortex_txtDir_vortex_txt
count28809.0028809.0044275.0044275.0028809.0028809.0044275.0044275.00
mean8.06178.518.22180.638.06178.518.21180.72
std4.7391.884.7188.934.7391.884.7189.36
min0.292.170.100.040.292.170.100.00
25%4.6196.004.73110.634.6196.004.70110.00
50%7.04196.507.37190.027.04196.507.40190.00
75%10.57249.3310.71244.6810.57249.3310.70245.00
max34.08358.5833.09359.9534.08358.5833.10360.00
\n", 493 | "
" 494 | ], 495 | "text/plain": [ 496 | " M_obs_nc Dir_obs_nc M_vortex_nc Dir_vortex_nc M_obs_txt \\\n", 497 | "count 28809.00 28809.00 44275.00 44275.00 28809.00 \n", 498 | "mean 8.06 178.51 8.22 180.63 8.06 \n", 499 | "std 4.73 91.88 4.71 88.93 4.73 \n", 500 | "min 0.29 2.17 0.10 0.04 0.29 \n", 501 | "25% 4.61 96.00 4.73 110.63 4.61 \n", 502 | "50% 7.04 196.50 7.37 190.02 7.04 \n", 503 | "75% 10.57 249.33 10.71 244.68 10.57 \n", 504 | "max 34.08 358.58 33.09 359.95 34.08 \n", 505 | "\n", 506 | " Dir_obs_txt M_vortex_txt Dir_vortex_txt \n", 507 | "count 28809.00 44275.00 44275.00 \n", 508 | "mean 178.51 8.21 180.72 \n", 509 | "std 91.88 4.71 89.36 \n", 510 | "min 2.17 0.10 0.00 \n", 511 | "25% 96.00 4.70 110.00 \n", 512 | "50% 196.50 7.40 190.00 \n", 513 | "75% 249.33 10.70 245.00 \n", 514 | "max 358.58 33.10 360.00 " 515 | ] 516 | }, 517 | "metadata": {}, 518 | "output_type": "display_data" 519 | } 520 | ], 521 | "source": [ 522 | "from IPython.display import display\n", 523 | "\n", 524 | "display(df.head().round(2))\n", 525 | "print()\n", 526 | "display(df.describe().round(2)) " 527 | ] 528 | }, 529 | { 530 | "cell_type": "markdown", 531 | "metadata": {}, 532 | "source": [ 533 | "After Cleaning Nodatas: Concurrent period" 534 | ] 535 | }, 536 | { 537 | "cell_type": "code", 538 | "execution_count": 8, 539 | "metadata": {}, 540 | "outputs": [ 541 | { 542 | "data": { 543 | "text/html": [ 544 | "
\n", 545 | "\n", 558 | "\n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 575 | " \n", 576 | " \n", 577 | " \n", 578 | " \n", 579 | " \n", 580 | " \n", 581 | " \n", 582 | " \n", 583 | " \n", 584 | " \n", 585 | " \n", 586 | " \n", 587 | " \n", 588 | " \n", 589 | " \n", 590 | " \n", 591 | " \n", 592 | " \n", 593 | " \n", 594 | " \n", 595 | " \n", 596 | " \n", 597 | " \n", 598 | " \n", 599 | " \n", 600 | " \n", 601 | " \n", 602 | " \n", 603 | " \n", 604 | " \n", 605 | " \n", 606 | " \n", 607 | " \n", 608 | " \n", 609 | " \n", 610 | " \n", 611 | " \n", 612 | " \n", 613 | " \n", 614 | " \n", 615 | " \n", 616 | " \n", 617 | " \n", 618 | " \n", 619 | " \n", 620 | " \n", 621 | " \n", 622 | " \n", 623 | " \n", 624 | " \n", 625 | " \n", 626 | " \n", 627 | " \n", 628 | " \n", 629 | " \n", 630 | " \n", 631 | " \n", 632 | " \n", 633 | " \n", 634 | " \n", 635 | " \n", 636 | " \n", 637 | " \n", 638 | " \n", 639 | " \n", 640 | "
M_obs_ncDir_obs_ncM_vortex_ncDir_vortex_ncM_obs_txtDir_obs_txtM_vortex_txtDir_vortex_txt
time
2009-11-18 13:00:003.62159.005.81114.233.62159.005.8114
2009-11-18 14:00:002.35147.585.91114.552.35147.585.9115
2009-11-18 15:00:001.25162.585.83117.741.26162.585.8118
2009-11-18 16:00:001.2077.425.24128.541.2077.425.2129
2009-11-18 17:00:001.84124.084.22151.341.84124.084.2151
\n", 641 | "
" 642 | ], 643 | "text/plain": [ 644 | " M_obs_nc Dir_obs_nc M_vortex_nc Dir_vortex_nc \\\n", 645 | "time \n", 646 | "2009-11-18 13:00:00 3.62 159.00 5.81 114.23 \n", 647 | "2009-11-18 14:00:00 2.35 147.58 5.91 114.55 \n", 648 | "2009-11-18 15:00:00 1.25 162.58 5.83 117.74 \n", 649 | "2009-11-18 16:00:00 1.20 77.42 5.24 128.54 \n", 650 | "2009-11-18 17:00:00 1.84 124.08 4.22 151.34 \n", 651 | "\n", 652 | " M_obs_txt Dir_obs_txt M_vortex_txt Dir_vortex_txt \n", 653 | "time \n", 654 | "2009-11-18 13:00:00 3.62 159.00 5.8 114 \n", 655 | "2009-11-18 14:00:00 2.35 147.58 5.9 115 \n", 656 | "2009-11-18 15:00:00 1.26 162.58 5.8 118 \n", 657 | "2009-11-18 16:00:00 1.20 77.42 5.2 129 \n", 658 | "2009-11-18 17:00:00 1.84 124.08 4.2 151 " 659 | ] 660 | }, 661 | "metadata": {}, 662 | "output_type": "display_data" 663 | }, 664 | { 665 | "name": "stdout", 666 | "output_type": "stream", 667 | "text": [ 668 | "\n" 669 | ] 670 | }, 671 | { 672 | "data": { 673 | "text/html": [ 674 | "
\n", 675 | "\n", 688 | "\n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | " \n", 698 | " \n", 699 | " \n", 700 | " \n", 701 | " \n", 702 | " \n", 703 | " \n", 704 | " \n", 705 | " \n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 720 | " \n", 721 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | " \n", 742 | " \n", 743 | " \n", 744 | " \n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | " \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | " \n", 768 | " \n", 769 | " \n", 770 | " \n", 771 | " \n", 772 | " \n", 773 | " \n", 774 | " \n", 775 | " \n", 776 | " \n", 777 | " \n", 778 | " \n", 779 | " \n", 780 | " \n", 781 | " \n", 782 | " \n", 783 | " \n", 784 | " \n", 785 | " \n", 786 | " \n", 787 | " \n", 788 | " \n", 789 | " \n", 790 | " \n", 791 | " \n", 792 | "
M_obs_ncDir_obs_ncM_vortex_ncDir_vortex_ncM_obs_txtDir_obs_txtM_vortex_txtDir_vortex_txt
count28809.0028809.0028809.0028809.0028809.0028809.0028809.0028809.00
mean8.06178.518.00178.248.06178.517.99178.32
std4.7391.884.5193.594.7391.884.5194.10
min0.292.170.100.060.292.170.100.00
25%4.6196.004.6798.124.6196.004.7098.00
50%7.04196.507.24191.007.04196.507.20191.00
75%10.57249.3310.47245.9610.57249.3310.50246.00
max34.08358.5830.91359.9534.08358.5830.90360.00
\n", 793 | "
" 794 | ], 795 | "text/plain": [ 796 | " M_obs_nc Dir_obs_nc M_vortex_nc Dir_vortex_nc M_obs_txt \\\n", 797 | "count 28809.00 28809.00 28809.00 28809.00 28809.00 \n", 798 | "mean 8.06 178.51 8.00 178.24 8.06 \n", 799 | "std 4.73 91.88 4.51 93.59 4.73 \n", 800 | "min 0.29 2.17 0.10 0.06 0.29 \n", 801 | "25% 4.61 96.00 4.67 98.12 4.61 \n", 802 | "50% 7.04 196.50 7.24 191.00 7.04 \n", 803 | "75% 10.57 249.33 10.47 245.96 10.57 \n", 804 | "max 34.08 358.58 30.91 359.95 34.08 \n", 805 | "\n", 806 | " Dir_obs_txt M_vortex_txt Dir_vortex_txt \n", 807 | "count 28809.00 28809.00 28809.00 \n", 808 | "mean 178.51 7.99 178.32 \n", 809 | "std 91.88 4.51 94.10 \n", 810 | "min 2.17 0.10 0.00 \n", 811 | "25% 96.00 4.70 98.00 \n", 812 | "50% 196.50 7.20 191.00 \n", 813 | "75% 249.33 10.50 246.00 \n", 814 | "max 358.58 30.90 360.00 " 815 | ] 816 | }, 817 | "metadata": {}, 818 | "output_type": "display_data" 819 | } 820 | ], 821 | "source": [ 822 | "# If you want to have only concurrent period, remove nodatas\n", 823 | "df = df.dropna(how='any', axis=0)\n", 824 | "\n", 825 | "display(df.head().round(2))\n", 826 | "print()\n", 827 | "display(df.describe().round(2)) " 828 | ] 829 | }, 830 | { 831 | "cell_type": "markdown", 832 | "metadata": {}, 833 | "source": [ 834 | "### Thank you for completing this Notebook! \n", 835 | "### *Other references available upon request.*\n", 836 | "\n", 837 | "You now can:\n", 838 | "\n", 839 | "- Read Vortex SERIES txt files.\n", 840 | "- Convert from txt to NetCDF.\n", 841 | "- Convert to **Pandas** DataFrames.\n", 842 | "- Have a quick overview of the data using `head()` and `describe()` Pandas functions.\n", 843 | "- Perform interpolation.\n", 844 | "- Perform resampling.\n", 845 | "- Merge datasets.\n", 846 | "\n", 847 | "**Don't hesitate to [contact us](https://vortexfdc.com/contact/) for any questions and information.**\n", 848 | "\n", 849 | "## Change Log\n", 850 | "\n", 851 | "\n", 852 | "| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n", 853 | "|-------------------|---------|------------|--------------------------------------------|\n", 854 | "| 2024-07-23 | 0.0 | Arnau | Notebook creation |\n", 855 | "| 2025-02-10 | 0.1 | Oriol | Notebook review |\n", 856 | "\n", 857 | "
\n", 858 | "\n", 859 | "##

© Vortex F.d.C. 2024. All rights reserved.

" 860 | ] 861 | } 862 | ], 863 | "metadata": { 864 | "kernelspec": { 865 | "display_name": "python-def", 866 | "language": "python", 867 | "name": "python3" 868 | }, 869 | "language_info": { 870 | "codemirror_mode": { 871 | "name": "ipython", 872 | "version": 3 873 | }, 874 | "file_extension": ".py", 875 | "mimetype": "text/x-python", 876 | "name": "python", 877 | "nbconvert_exporter": "python", 878 | "pygments_lexer": "ipython3", 879 | "version": "3.13.1" 880 | } 881 | }, 882 | "nbformat": 4, 883 | "nbformat_minor": 2 884 | } 885 | --------------------------------------------------------------------------------