├── LICENSE.md
├── README.md
├── apple_health_xml_convert.py
└── img
    ├── example_output.jpg
    ├── export_data_button.jpg
    └── health_home.jpg
/LICENSE.md:
--------------------------------------------------------------------------------
1 | Original Code: https://github.com/jameno/Simple-Apple-Health-XML-to-CSV
2 | Copyright (c) 2025, Jason Meno
3 | All rights reserved.
4 |
5 | Redistribution and use in source and binary forms, with or without
6 | modification, are permitted provided that the following conditions are met:
7 |
8 | * Redistributions of source code must retain the above copyright notice, this
9 | list of conditions and the following disclaimer.
10 |
11 | * Redistributions in binary form must reproduce the above copyright notice,
12 | this list of conditions and the following disclaimer in the documentation
13 | and/or other materials provided with the distribution.
14 |
15 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
16 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
18 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
19 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
20 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
21 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
22 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
23 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
24 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
25 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Simple Apple Health XML to CSV
2 |
3 | A simple script to convert Apple Health's export.xml file to an easy-to-use CSV.
4 |
5 |
6 |
7 | ## How to Run
8 |
9 | ### 1. Verify you have Python 3 & Pandas installed on your machine or environment
10 |
11 | `python --version` should return _Python 3.x.x_ where x is any number.
12 |
13 | If you have Python 2.x.x, please upgrade to Python 3 here: https://www.python.org/downloads/ (or specify your environment's Python version)
14 |
15 | `python3 -c "import pandas"` should print nothing and exit without an error.
16 |
17 | If you get a _**ModuleNotFoundError: No module named 'pandas'**_ error, install pandas and try again:
18 |
19 | `pip3 install pandas`
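
Both checks above can also be run from one Python snippet. This is a minimal sketch and not part of this repo: the helper name `check_environment` is hypothetical, and it only reports whether the interpreter is Python 3 and whether `pandas` can be located, without importing it.

```python
import importlib.util
import sys


def check_environment(required_module="pandas"):
    """Return (is_python3, module_found) for the running interpreter."""
    is_python3 = sys.version_info.major >= 3
    # find_spec returns None when a top-level module cannot be located
    module_found = importlib.util.find_spec(required_module) is not None
    return is_python3, module_found


if __name__ == "__main__":
    is_python3, has_pandas = check_environment()
    print("Python 3:", "yes" if is_python3 else "no")
    print("pandas installed:", "yes" if has_pandas else "no (try: pip3 install pandas)")
```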
20 |
21 |
22 | ### 2. Export your Apple Health Data
23 |
24 | | Health Home | ➡️ | Export Data |
25 | |--|--|--|
26 | |![](img/health_home.jpg)||![](img/export_data_button.jpg)|
27 |
28 | Your data will be prepared, and then you can transfer the export.zip file to your machine.
29 |
30 | ### 3. Unzip the file, which should contain:
31 |
32 | * apple_health_export
33 |   * export.xml (This is the file with your data that you want to convert)
34 |
35 |   * export_cda.xml
36 |
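
If you would rather unzip from Python, the standard library's `zipfile` module can do it. The sketch below is not part of this repo: the function name `extract_export` and its default paths are hypothetical, and it assumes the archive has the layout listed above.

```python
import zipfile
from pathlib import Path


def extract_export(zip_path="export.zip", dest_dir="."):
    """Extract the Apple Health archive and return the path to export.xml.

    Assumes the archive contains apple_health_export/export.xml,
    matching the layout described in the README.
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    xml_path = dest / "apple_health_export" / "export.xml"
    if not xml_path.is_file():
        raise FileNotFoundError(f"export.xml not found under {dest}")
    return xml_path
```

Calling `extract_export("export.zip")` from the folder that holds the archive should leave an `apple_health_export` directory next to it.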
37 |
38 |
39 | ### 4. Place the "apple_health_xml_convert.py" file from this repo into the apple_health_export folder, alongside export.xml, and run the script
40 |
41 | `python3 apple_health_xml_convert.py`
42 |
43 |
44 |
45 | The CSV will be written to the current folder with a filename of the form:
46 |
47 | * **apple_health_export_YYYY-MM-DD.csv**
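
For reference, a date-stamped name of this form can be built with `strftime`. This is a small sketch: `export_filename` is a hypothetical helper introduced only for illustration, and the fixed date is an arbitrary example.

```python
import datetime as dt


def export_filename(day=None):
    """Build the output name apple_health_export_YYYY-MM-DD.csv."""
    if day is None:
        day = dt.date.today()  # default to today's date
    return "apple_health_export_" + day.strftime("%Y-%m-%d") + ".csv"


print(export_filename(dt.date(2023, 10, 29)))
# apple_health_export_2023-10-29.csv
```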
48 |
49 |
50 |
51 | In Excel, the output should look something like this:
52 |
53 | ![](img/example_output.jpg)
54 |
55 | Note: This script removes the Apple Health data prefixes: `HKQuantityTypeIdentifier`, `HKCategoryTypeIdentifier`, and `HKCharacteristicTypeIdentifier` for increased legibility. Feel free to comment out those lines in the code with a `#` if you want to keep them in the CSV output.
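
To make the effect of that clean-up concrete, here is a plain-Python sketch of the prefix removal. It is an illustration rather than the script's own code: the script uses pandas string replacement, whereas the hypothetical helper `strip_hk_prefix` below only removes a prefix at the start of a name, and the sample identifiers are chosen for the example.

```python
HK_PREFIXES = (
    "HKQuantityTypeIdentifier",
    "HKCategoryTypeIdentifier",
    "HKCharacteristicTypeIdentifier",
)


def strip_hk_prefix(name):
    """Remove a leading Apple Health type prefix, if one is present."""
    for prefix in HK_PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name  # names without a known prefix are returned unchanged


for raw in ("HKQuantityTypeIdentifierHeartRate",
            "HKCategoryTypeIdentifierSleepAnalysis",
            "StepCount"):
    print(raw, "->", strip_hk_prefix(raw))
```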
56 |
--------------------------------------------------------------------------------
/apple_health_xml_convert.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Simple Apple Health XML to CSV
5 | ==============================
6 | :File: apple_health_xml_convert.py
7 | :Description: Convert Apple Health "export.xml" file into a csv
8 | :Version: 0.0.2
9 | :Created: 2019-10-04
10 | :Updated: 2023-10-29
11 | :Authors: Jason Meno (jam)
12 | :Dependencies: An export.xml file from Apple Health
13 | :License: BSD-2-Clause
14 | """
15 |
16 | # %% Imports
17 | import os
18 | import pandas as pd
19 | import xml.etree.ElementTree as ET
20 | import datetime as dt
21 | import sys
22 |
23 |
24 | # %% Function Definitions
25 |
26 | def preprocess_to_temp_file(file_path):
27 |     """
28 |     The export.xml file is where all your data is, but Apple Health Export has
29 |     two main problems that make it difficult to parse:
30 |         1. The DTD markup syntax is exported incorrectly by Apple Health for some data types.
31 |         2. The invisible character \x0b (sometimes rendered as U+000b) likes to destroy trees. Think of the trees!
32 |
33 |     Knowing this, we can save the trees and pre-process the XML data to avoid destruction and ParseErrors.
34 |     """
35 |
36 |     print("Pre-processing and writing to temporary file...", end="")
37 |     sys.stdout.flush()
38 |
39 |     temp_file_path = "temp_preprocessed_export.xml"
40 |     with open(file_path, 'r', encoding='UTF-8') as infile, open(temp_file_path, 'w', encoding='UTF-8') as outfile:
41 |         skip_dtd = False
42 |         for line in infile:
43 |             if '<!DOCTYPE' in line:
44 |                 skip_dtd = True
45 |             if not skip_dtd:
46 |                 line = strip_invisible_character(line)
47 |                 outfile.write(line)
48 |             if ']>' in line:
49 |                 skip_dtd = False
50 |
51 |     print("done!")
52 |     return temp_file_path
53 |
54 | def strip_invisible_character(line):
55 |     return line.replace("\x0b", "")
56 |
57 |
58 | def xml_to_csv(file_path):
59 |     """Loops through the element tree, retrieving all objects, and then
60 |     combining them together into a dataframe
61 |     """
62 |
63 |     print("Converting XML File to CSV...", end="")
64 |     sys.stdout.flush()
65 |
66 |     attribute_list = []
67 |
68 |     for event, elem in ET.iterparse(file_path, events=('end',)):
69 |         if event == 'end':
70 |             child_attrib = elem.attrib
71 |             for metadata_entry in list(elem):
72 |                 metadata_values = list(metadata_entry.attrib.values())
73 |                 if len(metadata_values) == 2:
74 |                     metadata_dict = {metadata_values[0]: metadata_values[1]}
75 |                     child_attrib.update(metadata_dict)
76 |             attribute_list.append(child_attrib)
77 |
78 |             # Clear the element from memory to avoid excessive memory consumption
79 |             elem.clear()
80 |
81 |     health_df = pd.DataFrame(attribute_list)
82 |
83 |     # Every health data type and some columns have a long identifier
84 |     # Removing these for readability
85 |     health_df.type = health_df.type.str.replace('HKQuantityTypeIdentifier', "")
86 |     health_df.type = health_df.type.str.replace('HKCategoryTypeIdentifier', "")
87 |     health_df.columns = \
88 |         health_df.columns.str.replace("HKCharacteristicTypeIdentifier", "")
89 |
90 |     # Reorder some of the columns for easier visual data review
91 |     original_cols = list(health_df)
92 |     shifted_cols = ['type',
93 |                     'sourceName',
94 |                     'value',
95 |                     'unit',
96 |                     'startDate',
97 |                     'endDate',
98 |                     'creationDate']
99 |
100 |     # Add loop specific column ordering if metadata entries exist
101 |     if 'com.loopkit.InsulinKit.MetadataKeyProgrammedTempBasalRate' in original_cols:
102 |         shifted_cols.append(
103 |             'com.loopkit.InsulinKit.MetadataKeyProgrammedTempBasalRate')
104 |
105 |     if 'com.loopkit.InsulinKit.MetadataKeyScheduledBasalRate' in original_cols:
106 |         shifted_cols.append(
107 |             'com.loopkit.InsulinKit.MetadataKeyScheduledBasalRate')
108 |
109 |     if 'com.loudnate.CarbKit.HKMetadataKey.AbsorptionTimeMinutes' in original_cols:
110 |         shifted_cols.append(
111 |             'com.loudnate.CarbKit.HKMetadataKey.AbsorptionTimeMinutes')
112 |
113 |     remaining_cols = list(set(original_cols) - set(shifted_cols))
114 |     reordered_cols = shifted_cols + remaining_cols
115 |     health_df = health_df.reindex(labels=reordered_cols, axis='columns')
116 |
117 |     # Sort by newest data first
118 |     health_df.sort_values(by='startDate', ascending=False, inplace=True)
119 |
120 |     print("done!")
121 |
122 |     return health_df
123 |
124 |
125 | def save_to_csv(health_df):
126 |     print("Saving CSV file...", end="")
127 |     sys.stdout.flush()
128 |
129 |     today = dt.datetime.now().strftime('%Y-%m-%d')
130 |     health_df.to_csv("apple_health_export_" + today + ".csv", index=False)
131 |     print("done!")
132 |
133 |     return
134 |
135 | def remove_temp_file(temp_file_path):
136 |     print("Removing temporary file...", end="")
137 |     os.remove(temp_file_path)
138 |     print("done!")
139 |
140 |     return
141 |
142 | def main():
143 |     file_path = "export.xml"
144 |     temp_file_path = preprocess_to_temp_file(file_path)
145 |     health_df = xml_to_csv(temp_file_path)
146 |     save_to_csv(health_df)
147 |     remove_temp_file(temp_file_path)
148 |
149 |     return
150 |
151 |
152 | # %%
153 | if __name__ == '__main__':
154 |     main()
155 |
--------------------------------------------------------------------------------
/img/example_output.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jameno/Simple-Apple-Health-XML-to-CSV/b1aca79143173dc035098dbc59faf77d12e6f36e/img/example_output.jpg
--------------------------------------------------------------------------------
/img/export_data_button.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jameno/Simple-Apple-Health-XML-to-CSV/b1aca79143173dc035098dbc59faf77d12e6f36e/img/export_data_button.jpg
--------------------------------------------------------------------------------
/img/health_home.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jameno/Simple-Apple-Health-XML-to-CSV/b1aca79143173dc035098dbc59faf77d12e6f36e/img/health_home.jpg
--------------------------------------------------------------------------------