├── .gitattributes
├── README.md
├── valSECfilings.py
├── loadSECfilings.py
├── extractRatios.py
└── LICENSE

/.gitattributes:
--------------------------------------------------------------------------------
 1 | # Auto detect text files and perform LF normalization
 2 | * text=auto
 3 | * -whitespace
 4 |
 5 | # Custom for Visual Studio
 6 | *.cs diff=csharp
 7 | *.sln merge=union
 8 | *.csproj merge=union
 9 | *.vbproj merge=union
10 | *.fsproj merge=union
11 | *.dbproj merge=union
12 |
13 | # Standard to msysgit
14 | *.doc diff=astextplain
15 | *.DOC diff=astextplain
16 | *.docx diff=astextplain
17 | *.DOCX diff=astextplain
18 | *.dot diff=astextplain
19 | *.DOT diff=astextplain
20 | *.pdf diff=astextplain
21 | *.PDF diff=astextplain
22 | *.rtf diff=astextplain
23 | *.RTF diff=astextplain
24 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | sec-xbrl
 2 | ========
 3 |
 4 | Copyright 2014 Altova GmbH
 5 |
 6 | Licensed under the Apache License, Version 2.0 (the "License");
 7 | you may not use this file except in compliance with the License.
 8 | You may obtain a copy of the License at
 9 |
10 | http://www.apache.org/licenses/LICENSE-2.0
11 |
12 | Unless required by applicable law or agreed to in writing, software
13 | distributed under the License is distributed on an "AS IS" BASIS,
14 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15 | See the License for the specific language governing permissions and
16 | limitations under the License.
17 |
18 | -------------------------------------------------------------------------
19 |
20 |

XBRL.US Webinar: How to download and process SEC XBRL Data Directly from EDGAR

21 |
22 | These are the supporting Python files for the XBRL.US Webinar that is available
23 | on YouTube: https://www.youtube.com/watch?v=2Oe9ZqXVGME as well as the slides
24 | available here on SlideShare: http://www.slideshare.net/afalk42/xbrl-us-altova-webinar
25 |
26 | See also my recent blog post: http://www.xmlaficionado.com/2014/09/how-to-download-and-process-sec-xbrl.html
27 |
28 | Please watch the YouTube video and review the slides to see how these Python
29 | scripts are intended to be used. Also note that these scripts were written with
30 | Python 3.3.3, so they may require modifications if you use them with a different
31 | version of Python.
32 |
33 | To use this approach you will need to download and install RaptorXML+XBRL Server from
34 | the Altova website: http://www.altova.com/download-trial-server.html and then
35 | request a 30-day free evaluation license key.
36 |
37 | These sample Python scripts available here on GitHub were tested with the MacOS
38 | version of RaptorXML+XBRL Server, but should function with the Windows or Linux version
39 | as well. You may need to change the file paths pointing to the RaptorXML+XBRL Server
40 | executable in the Python script, though, or add its directory to the global PATH
41 | environment variable on your system.
42 |
43 | In addition to the standard Python libraries, you also need to install the Python
44 | feedparser module/library available here: https://pypi.python.org/pypi/feedparser
45 |
46 | For more information on RaptorXML, please see here: http://www.altova.com/raptorxml.html
47 |
48 | -------------------------------------------------------------------------
49 |
50 |
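As a rough illustration of the path note above, here is a minimal sketch of how a script could locate the server executable: it first tries an explicitly configured file path and then falls back to a PATH lookup. The function name `resolve_executable` and both of its parameters are hypothetical and are not taken from the scripts in this repository; the actual executable name depends on your installation.

```python
import os
import shutil


def resolve_executable(name, configured_path=None):
    """Return a usable path to an executable.

    `name` is the command name to search for on PATH and
    `configured_path` is an optional hard-coded location; both are
    placeholders that you would fill in for your own installation.
    """
    # Prefer the explicitly configured location if it is an executable file.
    if (configured_path
            and os.path.isfile(configured_path)
            and os.access(configured_path, os.X_OK)):
        return configured_path

    # Otherwise fall back to the directories listed in PATH.
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(
            "Could not find {!r}; set the path in the script or add "
            "its directory to PATH.".format(name))
    return found
```

Failing early with a clear message like this is usually easier to diagnose than a failed subprocess call later in a long download or validation run.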

Usage Information

51 |
52 | These scripts now require RaptorXML+XBRL v2015r3 or newer.
53 |
54 |

loadSECfilings

55 |
56 | loadSECfilings.py -y <year> -m <month> | -f <from_year> -t <to_year>
57 |
58 | This creates a subdirectory sec/ with year-based directories and month directories
59 | underneath, and downloads all SEC XBRL filings from the EDGAR system to your local hard
60 | disk for further processing. Please use only during off-peak hours in order to not
61 | overload the SEC servers. This downloads the ZIPped XBRL filings, so you'll have one
62 | ZIP file per filing submitted to the SEC on your drive. If you call this script
63 | again for the current or any previous month at a later date, it will only download
64 | files that are new and have not been downloaded before.
65 |
66 |
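The incremental behavior described above can be sketched roughly as follows. This is not the code from loadSECfilings.py: the helper names and the exact directory layout (`sec/<year>/<two-digit month>/`) are assumptions made for illustration, and the list of ZIP file names would in practice come from the EDGAR feed for that month.

```python
import os


def month_directory(root, year, month):
    """Build the local folder for one month, e.g. <root>/sec/2014/09.

    The zero-padded month is an assumption; the real script may name
    its directories differently.
    """
    return os.path.join(root, "sec", str(year), "{:02d}".format(month))


def filings_to_fetch(root, year, month, zip_names):
    """Create the month folder if needed and return the ZIP names
    that are not on disk yet, so a repeated run only fetches new files."""
    directory = month_directory(root, year, month)
    os.makedirs(directory, exist_ok=True)
    return [name for name in zip_names
            if not os.path.isfile(os.path.join(directory, name))]
```

Checking for an existing file before each download is what makes a later run for the same month cheap: only filings submitted since the previous run are requested from the SEC servers.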
Examples
67 |
68 | python3 loadSECfilings.py -y 2014 -m 9
69 |
70 | This will load all SEC filings for September 2014.
71 |
72 | python3 loadSECfilings.py -f 2005 -t 2014
73 |
74 | This will load all SEC filings from the start of the XBRL pilot program in 2005 until 2014.
75 | WARNING: If you download all years available (2005-2014) this will be about 127,000 files
76 | and take up about 18GB on your hard disk, so please use with caution, especially
77 | when you are on a slow Internet connection.
78 |
79 |
80 |
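To make the two invocation styles above concrete, here is a small sketch of how the options could be turned into a list of (year, month) periods to download. It uses argparse, which may differ from how loadSECfilings.py actually parses its arguments, and it assumes that a year range covers all twelve months of every year in the range. The function name `periods_to_load` is hypothetical.

```python
import argparse


def periods_to_load(argv):
    """Translate command-line options into a list of (year, month) pairs."""
    parser = argparse.ArgumentParser(prog="loadSECfilings.py")
    parser.add_argument("-y", dest="year", type=int, help="single year")
    parser.add_argument("-m", dest="month", type=int, help="single month (1-12)")
    parser.add_argument("-f", dest="from_year", type=int, help="first year of a range")
    parser.add_argument("-t", dest="to_year", type=int, help="last year of a range")
    args = parser.parse_args(argv)

    # Style 1: one specific month of one year.
    if args.year is not None and args.month is not None:
        if not 1 <= args.month <= 12:
            parser.error("month must be between 1 and 12")
        return [(args.year, args.month)]

    # Style 2: every month of every year in an inclusive range (assumption).
    if args.from_year is not None and args.to_year is not None:
        return [(year, month)
                for year in range(args.from_year, args.to_year + 1)
                for month in range(1, 13)]

    # parser.error() prints the message and exits with status 2.
    parser.error("use either -y YEAR -m MONTH or -f FROM_YEAR -t TO_YEAR")
```

For example, `periods_to_load(["-y", "2014", "-m", "9"])` yields the single period for September 2014, while `-f 2005 -t 2014` expands to 120 monthly periods under the assumption stated above.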

valSECfilings

81 |
82 | valSECfilings ( -y | -f -t ) -m
83 | -c -k -s