├── .gitignore
├── README.rst
├── Vagrantfile
├── config.yaml
├── exitwp.py
├── html2text.py
├── pip_requirements.txt
└── wordpress-xml
    └── .gitignore

/.gitignore:
--------------------------------------------------------------------------------
build/
*.pyc
.ropeproject/
.Python
lib/
bin/
build/
include/
.vagrant/
.DS_Store

--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
######
Exitwp
######

Exitwp is a tool for making migration from one or more wordpress blogs to the `jekyll blog engine `_ as easy as possible.

By default it will try to convert as much information as possible from wordpress but can also be told to filter the amount of data it converts.

The latest version of these docs should always be available at https://github.com/thomasf/exitwp

Getting started
===============
* `Download `_ or clone using ``git clone https://github.com/thomasf/exitwp.git``
* Export one or more wordpress blogs using the wordpress exporter under tools/export in wordpress admin.
* Put all wordpress xml files in the ``wordpress-xml`` directory
* Special note for Wordpress 3.1: you need to add a missing namespace in the rss tag: ``xmlns:atom="http://www.w3.org/2005/Atom"``
* Run xmllint on your export file and fix any errors it reports.
* Run the converter by typing ``python exitwp.py`` in the console from the directory of the unzipped archive
* You should now have all the blogs converted into separate directories under the ``build`` directory

Runtime dependencies
====================
* `Python `_ 2.6, 2.7, ???
* `html2text `_ : converts HTML to markdown (python)
* `PyYAML `_ : Reading configuration files and writing YAML headers (python)
* `Beautiful soup `_ : Parsing and downloading of post images/attachments (python)


Installing dependencies in ubuntu/debian
----------------------------------------

``sudo apt-get install python-yaml python-bs4 python-html2text``

Installing Python dependencies using python package installer (pip)
-------------------------------------------------------------------

From the checked out root for this project, type:

``sudo pip install --upgrade -r pip_requirements.txt``

Note that PyYAML requires other packages to compile correctly under ubuntu/debian. These are installed by typing:

``sudo apt-get install libyaml-dev python-dev build-essential``

Using Vagrant for dependency management
---------------------------------------

If your local system is incompatible with the dependencies listed (or you'd rather not install them), you can use the included Vagrantfile to start a VM with all necessary dependencies installed.

1. Lint and place all wordpress xml files in the ``wordpress-xml`` directory as mentioned above
2. In the directory of the unzipped archive, run ``vagrant up``.
3. SSH to your Vagrant VM using ``vagrant ssh``
4. Run ``cd /vagrant`` to open the VM's shared folder
5. Run the converter from the VM by typing ``python exitwp.py``
6. After the converter completes, exit the SSH session using ``exit``
7. You should now have all the blogs converted into separate directories under the ``build`` directory
8. **Important:** Once satisfied with the results, run ``vagrant destroy -f`` to shut down the VM and remove the virtual drive from your local machine

Configuration/Customization
===========================

See the `configuration file `_ for all configurable options.

Some things, like custom handling of non-standard post types, are not fully configurable through the config file. You might have to modify the `source code `_ to add custom parsing behaviour.

Known issues
============
* Target file names are sometimes less than optimal.
* Rewriting of image/attachment links if they are downloaded would be a good feature
* There will probably be issues when migrating non-UTF-8 encoded wordpress dump files (if they exist).

Other Tools
===========
* A Gist to convert WP-Footnotes style footnotes to PHP Markdown Extra style footnotes: https://gist.github.com/1246047

--------------------------------------------------------------------------------
/Vagrantfile:
--------------------------------------------------------------------------------
# -*- mode: ruby -*-
# vi: set ft=ruby :

$script = <