├── .gitattributes
├── README.md
├── valSECfilings.py
├── loadSECfilings.py
├── extractRatios.py
└── LICENSE
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 | * -whitespace
4 |
5 | # Custom for Visual Studio
6 | *.cs diff=csharp
7 | *.sln merge=union
8 | *.csproj merge=union
9 | *.vbproj merge=union
10 | *.fsproj merge=union
11 | *.dbproj merge=union
12 |
13 | # Standard to msysgit
14 | *.doc diff=astextplain
15 | *.DOC diff=astextplain
16 | *.docx diff=astextplain
17 | *.DOCX diff=astextplain
18 | *.dot diff=astextplain
19 | *.DOT diff=astextplain
20 | *.pdf diff=astextplain
21 | *.PDF diff=astextplain
22 | *.rtf diff=astextplain
23 | *.RTF diff=astextplain
24 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | sec-xbrl
2 | ========
3 |
4 | Copyright 2014 Altova GmbH
5 |
6 | Licensed under the Apache License, Version 2.0 (the "License");
7 | you may not use this file except in compliance with the License.
8 | You may obtain a copy of the License at
9 |
10 | http://www.apache.org/licenses/LICENSE-2.0
11 |
12 | Unless required by applicable law or agreed to in writing, software
13 | distributed under the License is distributed on an "AS IS" BASIS,
14 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15 | See the License for the specific language governing permissions and
16 | limitations under the License.
17 |
18 | -------------------------------------------------------------------------
19 |
20 | XBRL.US Webinar: How to download and process SEC XBRL Data Directly from EDGAR
21 |
22 | These are the supporting Python files for the XBRL.US Webinar that is available
23 | on YouTube: https://www.youtube.com/watch?v=2Oe9ZqXVGME as well as the slides
24 | available here on SlideShare: http://www.slideshare.net/afalk42/xbrl-us-altova-webinar
25 |
26 | See also my recent blog post: http://www.xmlaficionado.com/2014/09/how-to-download-and-process-sec-xbrl.html
27 |
28 | Please watch the YouTube video and review the slides to see how these Python
29 | scripts are intended to be used. Also note that these scripts were written with
30 | Python 3.3.3 so they may require modifications if you use them with a different
31 | version of Python.
32 |
33 | To use this approach you will need to download and install RaptorXML+XBRL Server from
34 | the Altova website: http://www.altova.com/download-trial-server.html and then
35 | request a 30-day free evaluation license key.
36 |
37 | These sample Python scripts available here on GitHub were tested with the MacOS
38 | version of RaptorXML+XBRL Server, but should function with the Windows or Linux version
39 | as well. You may need to change the file-paths pointing to the RaptorXML+XBRL Server
40 | executable in the Python script, though, or add them to the global PATH environment
41 | variable on your system.
42 |
43 | In addition to the standard Python libraries, you also need to install the Python
44 | feedparser module/library available here: https://pypi.python.org/pypi/feedparser
45 |
46 | For more information on RaptorXML, please see here: http://www.altova.com/raptorxml.html
47 |
48 | -------------------------------------------------------------------------
49 |
50 | Usage Information
51 |
52 | These scripts now require RaptorXML+XBRL v2015r3 or newer.
53 |
54 | loadSECfilings
55 |
56 | loadSECfilings.py -y -m | -f -t
57 |
58 | This creates a subdirectory sec/, with year-based directories and month directories
59 | underneath, and downloads all SEC XBRL filings from the EDGAR system to your local hard
60 | disk for further processing. Please use only during off-peak hours in order to not
61 | overload the SEC servers. This downloads the ZIPped XBRL filings, so you'll have one
62 | ZIP file per filing submitted to the SEC on your drive. If you call this script
63 | again for the current or any previous month at a later date, it will only download
64 | files that are new and have not been downloaded before.
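
The incremental behaviour described above can be pictured with a minimal sketch. This is not the code from loadSECfilings.py; the function names `target_dir` and `files_to_download` are hypothetical, and the zero-padded month directory name is an assumption made only for illustration.

```python
import os

def target_dir(base, year, month):
    """Return the sec/<year>/<month> directory under base, creating it if needed.

    The two-digit month name is an assumption, not taken from the original script.
    """
    path = os.path.join(base, "sec", str(year), "%02d" % month)
    os.makedirs(path, exist_ok=True)
    return path

def files_to_download(base, year, month, filenames):
    """Return only the filenames that are not yet present in the local directory."""
    directory = target_dir(base, year, month)
    return [name for name in filenames
            if not os.path.exists(os.path.join(directory, name))]
```

Calling `files_to_download` a second time for the same month would then return only the ZIP files that appeared since the previous run, which is the effect the paragraph above describes.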
65 |
66 | Examples
67 |
68 | python3 loadSECfilings.py -y 2014 -m 9
69 |
70 | This will load all SEC filings for September 2014.
71 |
72 | python3 loadSECfilings.py -f 2005 -t 2014
73 |
74 | This will load all SEC filings from the start of the XBRL pilot program in 2005 until 2014.
75 | WARNING: If you download all years available (2005-2014) this will be about 127,000 files
76 | and take up about 18GB of space on your hard disk, so please use with caution, especially
77 | when you are on a slow Internet connection.
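
To get a feel for how much work a -f/-t range implies before starting a download, the range can be expanded into individual months. The sketch below is only an illustration: `months_in_range` is a hypothetical helper, not part of loadSECfilings.py, and it assumes that both the from-year and the to-year are inclusive.

```python
def months_in_range(from_year, to_year):
    """Yield (year, month) pairs for every month from from_year through to_year.

    Assumes both end years are inclusive; the original script may differ.
    """
    for year in range(from_year, to_year + 1):
        for month in range(1, 13):
            yield (year, month)

# Count the monthly batches implied by "-f 2005 -t 2014" under that assumption.
batches = list(months_in_range(2005, 2014))
print(len(batches))
```

Under that assumption the full 2005-2014 range covers ten years of monthly batches, so running it one year at a time with -y and -m may be easier on a slow connection.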
78 |
79 |
80 | valSECfilings
81 |
82 | valSECfilings ( -y | -f -t ) -m
83 | -c -k -s