├── LICENSE.md
├── README.md
├── apple_health_xml_convert.py
└── img
    ├── example_output.jpg
    ├── export_data_button.jpg
    └── health_home.jpg


/LICENSE.md:
--------------------------------------------------------------------------------
Original Code: https://github.com/jameno/Simple-Apple-Health-XML-to-CSV
Copyright (c) 2025, Jason Meno
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Simple Apple Health XML to CSV

A simple script to convert Apple Health's export.xml file to an easy-to-use CSV.

## How to Run

### 1. Verify you have Python 3 & Pandas installed on your machine or environment

`python --version` should return _Python 3.x.x_ where x is any number.

If you have Python 2.x.x, please upgrade to Python 3 here: https://www.python.org/downloads/ (or specify your environment's Python version)

`python3 -c "import pandas"` should print nothing and return to the command line

If you get a _**ModuleNotFoundError: No module named 'pandas'**_ error, install pandas and try again:

`pip3 install pandas`


### 2. Export your Apple Health Data

| Health Home | ➡️ | Export Data |
|--|--|--|
|![Health Home](img/health_home.jpg)||![Export Data](img/export_data_button.jpg)|

Your data will be prepared, and then you can transfer the export.zip file to your machine.

### 3. Unzip the file, which should contain:

* apple_health_export
    * export.xml (This is the file with your data that you want to convert)
    * export_cda.xml

### 4. Place the "apple_health_xml_convert.py" file from this repo into the folder alongside the files and run the script

`python3 apple_health_xml_convert.py`

The output file will be named in the format:

* **apple_health_export_YYYY-MM-DD.csv**

In Excel, the output should look something like this:

![Example output](img/example_output.jpg)

Note: This script removes the Apple Health data prefixes `HKQuantityTypeIdentifier`, `HKCategoryTypeIdentifier`, and `HKCharacteristicTypeIdentifier` for increased legibility. Feel free to comment out those lines in the code with a `#` if you want to keep them in the CSV output.
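
As a rough illustration of what this prefix removal does, here is a minimal sketch in plain Python. The helper name `strip_health_prefix` and the sample type names are assumptions made for illustration only; the script itself performs the replacement with pandas string methods on the `type` column and on the column names, and it replaces the text wherever it occurs rather than only at the start.

```python
# Prefixes named in the note above.
HEALTH_PREFIXES = (
    "HKQuantityTypeIdentifier",
    "HKCategoryTypeIdentifier",
    "HKCharacteristicTypeIdentifier",
)


def strip_health_prefix(name):
    """Return `name` without a leading Apple Health prefix, if it has one.

    Hypothetical helper for illustration; it is not part of the script.
    """
    for prefix in HEALTH_PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    # Names without a known prefix are returned unchanged.
    return name


if __name__ == "__main__":
    for raw in ("HKQuantityTypeIdentifierHeartRate",
                "HKCategoryTypeIdentifierSleepAnalysis",
                "StepCount"):
        print(raw, "->", strip_health_prefix(raw))
```

If you comment out the prefix-removal lines in the script, the CSV keeps the long names shown on the left-hand side of each printed line.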

--------------------------------------------------------------------------------
/apple_health_xml_convert.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Simple Apple Health XML to CSV
==============================
:File: apple_health_xml_convert.py
:Description: Convert Apple Health "export.xml" file into a csv
:Version: 0.0.2
:Created: 2019-10-04
:Updated: 2023-10-29
:Authors: Jason Meno (jam)
:Dependencies: An export.xml file from Apple Health
:License: BSD-2-Clause
"""

# %% Imports
import os
import pandas as pd
import xml.etree.ElementTree as ET
import datetime as dt
import sys


# %% Function Definitions

def preprocess_to_temp_file(file_path):
    r"""
    The export.xml file is where all your data is, but Apple Health Export has
    two main problems that make it difficult to parse:
        1. The DTD markup syntax is exported incorrectly by Apple Health for some data types.
        2. The invisible character \x0b (sometimes rendered as U+000b) likes to destroy trees. Think of the trees!

    Knowing this, we can save the trees and pre-process the XML data to avoid destruction and ParseErrors.
    """

    print("Pre-processing and writing to temporary file...", end="")
    sys.stdout.flush()

    temp_file_path = "temp_preprocessed_export.xml"
    with open(file_path, 'r', encoding='UTF-8') as infile, open(temp_file_path, 'w', encoding='UTF-8') as outfile:
        skip_dtd = False
        for line in infile:
            # Skip the inline DTD block, from its opening line through the closing ']>'
            if '<!DOCTYPE' in line:
                skip_dtd = True
            if not skip_dtd:
                line = strip_invisible_character(line)
                outfile.write(line)
            if ']>' in line:
                skip_dtd = False

    print("done!")
    return temp_file_path

def strip_invisible_character(line):
    return line.replace("\x0b", "")


def xml_to_csv(file_path):
    """Loops through the element tree, retrieving all objects, and then
    combining them together into a dataframe
    """

    print("Converting XML File to CSV...", end="")
    sys.stdout.flush()

    attribute_list = []

    for event, elem in ET.iterparse(file_path, events=('end',)):
        if event == 'end':
            # Copy the attributes so that elem.clear() below cannot empty them
            child_attrib = dict(elem.attrib)
            for metadata_entry in list(elem):
                metadata_values = list(metadata_entry.attrib.values())
                if len(metadata_values) == 2:
                    metadata_dict = {metadata_values[0]: metadata_values[1]}
                    child_attrib.update(metadata_dict)
            attribute_list.append(child_attrib)

            # Clear the element from memory to avoid excessive memory consumption
            elem.clear()

    health_df = pd.DataFrame(attribute_list)

    # Every health data type and some columns have a long identifier
    # Removing these for readability
    health_df.type = health_df.type.str.replace('HKQuantityTypeIdentifier', "", regex=False)
    health_df.type = health_df.type.str.replace('HKCategoryTypeIdentifier', "", regex=False)
    health_df.columns = \
        health_df.columns.str.replace("HKCharacteristicTypeIdentifier", "", regex=False)

    # Reorder some of the columns for easier visual data review
    original_cols = list(health_df)
    shifted_cols = ['type',
                    'sourceName',
                    'value',
                    'unit',
                    'startDate',
                    'endDate',
                    'creationDate']

    # Add loop specific column ordering
    # if metadata entries exist
    if 'com.loopkit.InsulinKit.MetadataKeyProgrammedTempBasalRate' in original_cols:
        shifted_cols.append(
            'com.loopkit.InsulinKit.MetadataKeyProgrammedTempBasalRate')

    if 'com.loopkit.InsulinKit.MetadataKeyScheduledBasalRate' in original_cols:
        shifted_cols.append(
            'com.loopkit.InsulinKit.MetadataKeyScheduledBasalRate')

    if 'com.loudnate.CarbKit.HKMetadataKey.AbsorptionTimeMinutes' in original_cols:
        shifted_cols.append(
            'com.loudnate.CarbKit.HKMetadataKey.AbsorptionTimeMinutes')

    remaining_cols = list(set(original_cols) - set(shifted_cols))
    reordered_cols = shifted_cols + remaining_cols
    health_df = health_df.reindex(labels=reordered_cols, axis='columns')

    # Sort by newest data first
    health_df.sort_values(by='startDate', ascending=False, inplace=True)

    print("done!")

    return health_df


def save_to_csv(health_df):
    print("Saving CSV file...", end="")
    sys.stdout.flush()

    today = dt.datetime.now().strftime('%Y-%m-%d')
    health_df.to_csv("apple_health_export_" + today + ".csv", index=False)
    print("done!")

    return

def remove_temp_file(temp_file_path):
    print("Removing temporary file...", end="")
    os.remove(temp_file_path)
    print("done!")

    return

def main():
    file_path = "export.xml"
    temp_file_path = preprocess_to_temp_file(file_path)
    health_df = xml_to_csv(temp_file_path)
    save_to_csv(health_df)
    remove_temp_file(temp_file_path)

    return


# %%
if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/img/example_output.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jameno/Simple-Apple-Health-XML-to-CSV/b1aca79143173dc035098dbc59faf77d12e6f36e/img/example_output.jpg

--------------------------------------------------------------------------------
/img/export_data_button.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jameno/Simple-Apple-Health-XML-to-CSV/b1aca79143173dc035098dbc59faf77d12e6f36e/img/export_data_button.jpg

--------------------------------------------------------------------------------
/img/health_home.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jameno/Simple-Apple-Health-XML-to-CSV/b1aca79143173dc035098dbc59faf77d12e6f36e/img/health_home.jpg

--------------------------------------------------------------------------------