├── .gitignore ├── LICENSE ├── README.md ├── docbuilder.py ├── docbuilder_rsa.pub ├── requirements.txt └── srv └── salt ├── docbuilder_rsa ├── docbuilder_rsa.pub ├── gitconfig ├── sklearn-docbuilder.sls ├── ssh_config ├── top.sls └── update_doc.sh /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | var 3 | salt/roster 4 | salt/master 5 | docbuilder_rsa 6 | etc 7 | certs 8 | machines 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2013 scikit-learn 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | sklearn-docbuilder 2 | ================== 3 | 4 | Script to configure a cloud server to build the documentation and example 5 | gallery and update the [sklearn dev website](http://scikit-learn.org/dev). 6 | 7 | This script is only meant to be used by scikit-learn project maintainers with 8 | github and sourceforge write access. 9 | 10 | 11 | Usage 12 | ----- 13 | 14 | First install the script dependencies (preferably in a virtualenv): 15 | 16 | pip install -r requirements.txt 17 | 18 | The following tools will be installed locally (NumPy, SciPy and scikit-learn 19 | are **not** required locally, they will only be installed on the cloud server): 20 | 21 | - Apache Libcloud is used to check the status and start a Rackspace cloud 22 | server dedicated to monitoring the master branch of the project and building 23 | the example gallery and the sphinx documentation. 24 | 25 | - SaltStack (via the salt-ssh client) is used to automate the configuration of 26 | the server. 27 | 28 | - Paramiko is a Python implementation of the SSH protocol used by SaltStack. 29 | 30 | - Yaml is used to parse and generate SaltStack configuration files. 31 | 32 | Ask one of the scikit-learn maintainers for the `docbuilder_rsa` private 33 | key next to the `docbuilder_rsa.pub` public key in this folder. 
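Once the private key has been obtained, it has to be readable by its owner only before SSH clients will accept it; `docbuilder.py` applies `os.chmod('docbuilder_rsa', 0o600)` for that reason. The following is a minimal sketch of that step in isolation. The helper name `restrict_private_key` is hypothetical and not part of this repository, and the demo runs on a throwaway file rather than a real key.

```python
import os
import stat
import tempfile


def restrict_private_key(path):
    """Set owner-only read/write on the key file and return the new mode bits.

    Hypothetical helper for illustration; docbuilder.py simply calls
    os.chmod on 'docbuilder_rsa' directly.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError("private key not found: %s" % path)
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Demonstrate on a temporary file, not on a real private key.
    with tempfile.TemporaryDirectory() as tmp:
        demo_key = os.path.join(tmp, "docbuilder_rsa")
        with open(demo_key, "w") as fp:
            fp.write("not a real key\n")
        print(oct(restrict_private_key(demo_key)))
```

Raising early when the file is missing gives a clearer message than the later SSH failure would.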
34 | 35 | Also ask for the Rackspace cloud credentials and set them as environment 36 | variables (possibly in your `~/.bashrc` file): 37 | 38 | export SKLEARN_RACKSPACE_NAME="sklearn" 39 | export SKLEARN_RACKSPACE_KEY="XXXXXXXXX" 40 | 41 | Once you have the credentials, run the following script to check that the 42 | Rackspace cloud server is running, start it if this is not the case and fetch the 43 | connection parameters (the IP address stored in `./etc/salt/roster`): 44 | 45 | python docbuilder.py 46 | 47 | You can check that the server can be contacted by salt commands such 48 | as: 49 | 50 | salt-ssh -c ./etc/salt docbuilder test.ping 51 | 52 | To install all the scikit-learn build dependencies and install the cron job 53 | use: 54 | 55 | salt-ssh -c ./etc/salt docbuilder state.highstate 56 | 57 | If the server was freshly provisioned, the first execution of `state.highstate` 58 | can take several minutes. Subsequent calls will be much faster. To display 59 | debug info use: 60 | 61 | salt-ssh -c ./etc/salt docbuilder state.highstate -l debug 62 | 63 | 64 | Changing the server configuration 65 | --------------------------------- 66 | 67 | The Salt Stack configuration for the server and the script run by the cron job 68 | to build the documentation once the server is ready can be found in the 69 | [srv/salt]( 70 | https://github.com/scikit-learn/sklearn-docbuilder/tree/master/srv/salt) 71 | subfolder.
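The `srv/salt` subfolder is found by salt-ssh because `docbuilder.py` writes an `etc/salt/master` file whose `file_roots` entry points at it. A minimal sketch of that rendering step is shown below. The template text mirrors `MASTER_TEMPLATE` from `docbuilder.py`, but the YAML indentation is an assumption (it is not recoverable from this listing), and `render_salt_master` is a hypothetical helper name.

```python
# Template modelled on MASTER_TEMPLATE in docbuilder.py.
# The indentation below is assumed, not copied from the repository.
MASTER_TEMPLATE = """\
root_dir: {root_dir}

fileserver_backend:
  - roots

file_roots:
  base:
    - {root_dir}/srv/salt
"""


def render_salt_master(root_dir):
    """Return the salt master configuration text for a given checkout directory."""
    return MASTER_TEMPLATE.format(root_dir=root_dir)


if __name__ == "__main__":
    # '/tmp/checkout' is a placeholder path used only for this demo.
    print(render_salt_master("/tmp/checkout"))
```

Because the checkout directory is substituted at run time, the generated file is specific to the machine it was created on, which is presumably why `etc` is listed in `.gitignore`.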
72 | 73 | To re-apply the server configuration after a change, re-run: 74 | 75 | python docbuilder.py # to fetch the IP of the docbuilder server 76 | salt-ssh -c ./etc/salt docbuilder state.highstate 77 | 78 | 79 | Fixing execution errors in docbuilder.py 80 | ---------------------------------------- 81 | 82 | API changes for Apache Libcloud are documented here: 83 | 84 | https://ci.apache.org/projects/libcloud/docs/upgrade_notes.html 85 | 86 | 87 | Connecting to the server via SSH 88 | -------------------------------- 89 | 90 | Run: 91 | 92 | python docbuilder.py 93 | 94 | to fetch the IP address of the running server and store it in `etc/salt/roster`. The output 95 | of the command should display the ssh command, such as: 96 | 97 | ssh -i docbuilder_rsa root@ 98 | 99 | 100 | Changing the ssh keys 101 | --------------------- 102 | 103 | If you suspect that the private keys have been compromised you can generate 104 | a new keypair with: 105 | 106 | ssh-keygen -f docbuilder_rsa -N '' 107 | 108 | Then commit the new public key and push it to GitHub, send the private key by 109 | email to the other scikit-learn maintainers, and ask one of them to update the 110 | authorized public key of the `sklearndocbuild` user profile on sourceforge.net. 111 | 112 | 113 | Accessing the Rackspace Cloud management console 114 | ------------------------------------------------ 115 | 116 | Use the Rackspace credentials to list the running servers and terminate them if 117 | needed at: 118 | 119 | https://mycloud.rackspace.com/a/sklearn/ 120 | 121 | 122 | Thanks 123 | ------ 124 | 125 | We would like to thank Rackspace for supporting the scikit-learn project by 126 | providing a free Rackspace Cloud account.
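The SSH command shown in the "Connecting to the server via SSH" section is built from the first IPv4 address among the node's public addresses, because the script notes that it cannot connect over the IPv6 address. A minimal sketch of that selection follows. It uses the standard `ipaddress` module rather than the `":"` substring test found in `docbuilder.py`, and `pick_ipv4` and `ssh_command` are hypothetical helper names. The addresses in the demo come from the reserved documentation ranges.

```python
import ipaddress


def pick_ipv4(public_ips):
    """Return the first IPv4 address in the list, skipping IPv6 entries."""
    for ip in public_ips:
        if ipaddress.ip_address(ip).version == 4:
            return ip
    raise ValueError("no IPv4 address among: %r" % (public_ips,))


def ssh_command(public_ips, key_path="docbuilder_rsa", user="root"):
    """Format the SSH command line for the first IPv4 address of the node."""
    return "ssh -i %s %s@%s" % (key_path, user, pick_ipv4(public_ips))


if __name__ == "__main__":
    # Placeholder addresses, not the address of any real server.
    print(ssh_command(["2001:db8::1", "203.0.113.7"]))
```

Raising `ValueError` when no IPv4 address is present avoids the bare `IndexError` that indexing an empty list would produce.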
127 | -------------------------------------------------------------------------------- /docbuilder.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | import sys 4 | import getopt 5 | import yaml 6 | 7 | from libcloud.compute.providers import get_driver 8 | from libcloud.compute.deployment import SSHKeyDeployment 9 | 10 | # Environment variables to look up rackspace credentials 11 | SKLEARN_RACKSPACE_NAME = "SKLEARN_RACKSPACE_NAME" 12 | SKLEARN_RACKSPACE_KEY = "SKLEARN_RACKSPACE_KEY" 13 | RACKSPACE_DRIVER = "rackspace" 14 | REGION = "ord" 15 | 16 | IMAGE_NAME = 'Ubuntu 12.04 LTS (Precise Pangolin) (PVHVM)' 17 | NODE_NAME = 'docbuilder' 18 | DEFAULT_NODE_SIZE = 2048 19 | PUBLIC_KEY_PATH = 'docbuilder_rsa.pub' 20 | PRIVATE_KEY_PATH = 'docbuilder_rsa' 21 | TIMEOUT = 5000 22 | 23 | MASTER_TEMPLATE = """\ 24 | root_dir: {root_dir} 25 | 26 | fileserver_backend: 27 | - roots 28 | 29 | file_roots: 30 | base: 31 | - {root_dir}/srv/salt 32 | """ 33 | 34 | 35 | def print_usage(machine_sizes): 36 | print('USAGE: python docbuilder.py [machine_size]') 37 | print('Regarding `machine_size` (optional):') 38 | print('Please select one of the following:') 39 | print(machine_sizes) 40 | 41 | 42 | def gen_salt_roster(host_ips=None): 43 | # XXX: cannot connect to host with the IPv6 public address 44 | ipv4 = [ip for ip in host_ips if ":" not in ip][0] 45 | salt_roster = """\ 46 | %s: 47 | host: %s 48 | user: root 49 | priv: %s 50 | """ % (NODE_NAME, ipv4, PRIVATE_KEY_PATH) 51 | output_stream = open("etc/salt/roster", "w") 52 | yaml.dump(yaml.safe_load(salt_roster), output_stream, default_flow_style=False) 53 | output_stream.close() 54 | return ipv4 55 | 56 | 57 | def wait_for_active_status(server_status, connect): 58 | # Wait for active server status 59 | wait_id = 0 60 | wait_li = ['. ', '.. 
', '...'] 61 | while server_status != 0: 62 | existing_nodes = connect.list_nodes() 63 | s_node = [n for n in existing_nodes if n.name == NODE_NAME][0] 64 | server_status = s_node.state 65 | wait_id = (wait_id + 1) % len(wait_li) 66 | if server_status == 0: 67 | state_str = "READY" 68 | else: 69 | state_str = "BUSY - (waiting for active status)" 70 | sys.stdout.write("\r%s%s" % 71 | (state_str, wait_li[wait_id])) 72 | sys.stdout.flush() 73 | 74 | sys.stdout.write("\nServer is now active\n") 75 | return server_status 76 | 77 | 78 | def main(argv): 79 | # Make a connection through the rackspace driver to the sklearn space 80 | 81 | name = os.environ.get(SKLEARN_RACKSPACE_NAME) 82 | key = os.environ.get(SKLEARN_RACKSPACE_KEY) 83 | if name is None or key is None: 84 | raise RuntimeError( 85 | "Please set credentials as environment variables " 86 | "{} and {}".format(SKLEARN_RACKSPACE_NAME, SKLEARN_RACKSPACE_KEY)) 87 | conn_sklearn = get_driver(RACKSPACE_DRIVER)(name, key, region=REGION) 88 | 89 | # Obtain list of nodes 90 | existing_nodes = conn_sklearn.list_nodes() 91 | node_list = "\n".join(" - " + n.name for n in existing_nodes) 92 | print("Found %d existing node(s) with names:\n%s" % ( 93 | len(existing_nodes), node_list)) 94 | 95 | # Obtain list of machine sizes 96 | machine_sizes = [n.ram for n in conn_sklearn.list_sizes()] 97 | selected_ram = None 98 | server_status = 3 # assume busy 99 | 100 | try: 101 | opts, args = getopt.getopt(argv, "h") 102 | for opt, arg in opts: 103 | if opt == '-h': 104 | print_usage(machine_sizes) 105 | sys.exit() 106 | if args: 107 | if int(args[0]) not in machine_sizes: 108 | print_usage(machine_sizes) 109 | sys.exit() 110 | else: 111 | selected_ram = int(args[0]) 112 | except getopt.GetoptError: 113 | print_usage(machine_sizes) 114 | sys.exit(2) 115 | 116 | # Check if our desired node already exists 117 | if not any(n.name == NODE_NAME for n in existing_nodes): 118 | print('The docbuilder node does not exist yet - creating node...') 
119 | print(' - Configuring node size') 120 | if selected_ram is None: 121 | print(' -- No node size provided: using default size of 2GB') 122 | size = [i for i in conn_sklearn.list_sizes() 123 | if i.ram == DEFAULT_NODE_SIZE][0] 124 | else: 125 | print(' -- Node size set to: ', selected_ram) 126 | size = [i for i in conn_sklearn.list_sizes() 127 | if i.ram >= selected_ram][0] 128 | 129 | print(' - Configuring the builder image to', IMAGE_NAME) 130 | images = conn_sklearn.list_images() 131 | matching_images = [i for i in images if i.name == IMAGE_NAME] 132 | if len(matching_images) == 0: 133 | image_names = "\n".join(sorted(i.name for i in images)) 134 | raise RuntimeError("Could not find image with name %s," 135 | " available images:\n%s" 136 | % (IMAGE_NAME, image_names)) 137 | s_node_image = matching_images[0] 138 | 139 | # Create a new node if none exists 140 | with open(PUBLIC_KEY_PATH) as fp: 141 | pub_key_content = fp.read() 142 | step = SSHKeyDeployment(pub_key_content) 143 | print("Starting node deployment - This may take a few minutes") 144 | print("WARNING: Please do not interrupt the process") 145 | node = conn_sklearn.deploy_node(name=NODE_NAME, image=s_node_image, 146 | size=size, deploy=step, 147 | timeout=TIMEOUT, ssh_timeout=TIMEOUT) 148 | print('Node successfully provisioned: ', NODE_NAME) 149 | else: 150 | node = [n for n in existing_nodes if n.name == NODE_NAME][0] 151 | print("Node '%s' found" % NODE_NAME) 152 | print('Gathering connection information') 153 | 154 | if not os.path.exists('etc/salt'): 155 | os.makedirs('etc/salt') 156 | 157 | print("Storing connection information to etc/salt/roster") 158 | ip = gen_salt_roster(host_ips=node.public_ips) 159 | 160 | print("Configuring etc/salt/master") 161 | salt_master = open("etc/salt/master", "w") 162 | here = os.getcwd() 163 | salt_master.write(MASTER_TEMPLATE.format(root_dir=here)) 164 | salt_master.close() 165 | print('Checking if the server is active:') 166 | server_status = wait_for_active_status(server_status, 
conn_sklearn) 167 | 168 | # Making sure the private key has the right permissions to be useable by 169 | # paramiko 170 | os.chmod('docbuilder_rsa', 0o600) 171 | 172 | print("SSH connection command:") 173 | print(" ssh -i %s root@%s" % (PRIVATE_KEY_PATH, ip)) 174 | 175 | # TODO: find a way to launch the state.highstate command via salt-ssh 176 | print("You can now configure the server with (can take several minutes):") 177 | print(" salt-ssh -i -c ./etc/salt docbuilder state.highstate") 178 | 179 | 180 | if __name__ == "__main__": 181 | main(sys.argv[1:]) 182 | -------------------------------------------------------------------------------- /docbuilder_rsa.pub: -------------------------------------------------------------------------------- 1 | ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDmymvvKKAZSI1HaE7+u53QuVJbmuQISDuGcoeVVpxeahrJ3/aNCthQ7DY+njeV/j/LieteSo1f524pjxRGWKZfLM86iNLNAbwJ0KSgmm/FRc//l6nJQpPJ9l12wbfrNK1BCmzRMps6Y9NiZEtF1GXF4mNGw9nceRZCKGVk1js9FSA56T1pzg6ewoYekSAbzaD/kn3hRRVHtSt6bUyfyOtAgi+nyqXvByDXDBYMtpDtHaYXF/SnI6x50m2gHGlBX7a9inVE2q7kIk31zF9NLAsJxfqSpF59egHvLUdfqA2h5jrcpwZFWHmnLQ57GRa/YqcRR045Q97Bsj3HnqXQ1Oj3 jaques@is148602 2 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | salt==2015.5.5 # last known good version of salt 2 | apache-libcloud 3 | pyaml 4 | paramiko 5 | -------------------------------------------------------------------------------- /srv/salt/docbuilder_rsa: -------------------------------------------------------------------------------- 1 | ../../docbuilder_rsa -------------------------------------------------------------------------------- /srv/salt/docbuilder_rsa.pub: -------------------------------------------------------------------------------- 1 | ../../docbuilder_rsa.pub -------------------------------------------------------------------------------- /srv/salt/gitconfig: 
-------------------------------------------------------------------------------- 1 | [user] 2 | email = olivier.grisel+sklearn-ci@gmail.com 3 | name = sklearn-ci 4 | 5 | -------------------------------------------------------------------------------- /srv/salt/sklearn-docbuilder.sls: -------------------------------------------------------------------------------- 1 | scipy-stack-packages: 2 | pkg: 3 | - installed 4 | - names: 5 | # Salt optional stuff 6 | - git 7 | - vim 8 | - python-git 9 | - python-numpy 10 | - python-scipy 11 | - python-pip 12 | - python-coverage 13 | - python-nose 14 | - ipython 15 | - make 16 | - optipng 17 | 18 | # Required for building a more recent matplotlib from source 19 | - libfreetype6-dev 20 | - libpng12-dev 21 | 22 | # Latex packages for math expressions in sphinx 23 | - latex209-base 24 | - texlive-latex-extra 25 | - dvipng 26 | 27 | # Linear Algebra routines 28 | - libatlas-dev 29 | - libatlas3gf-base 30 | 31 | sklearn: 32 | user.present: 33 | - shell: /bin/bash 34 | - home: /home/sklearn 35 | 36 | /home/sklearn/public_html: 37 | file.directory: 38 | - user: sklearn 39 | - group: sklearn 40 | - mode: 755 41 | - makedirs: True 42 | - require: 43 | - user: sklearn 44 | 45 | /home/sklearn/.ssh: 46 | file.directory: 47 | - user: sklearn 48 | - group: sklearn 49 | - mode: 755 50 | - makedirs: True 51 | - require: 52 | - user: sklearn 53 | 54 | /home/sklearn/.ssh/id_rsa: 55 | file.managed: 56 | - user: sklearn 57 | - group: sklearn 58 | - mode: 600 59 | - source: salt://docbuilder_rsa 60 | - require: 61 | - file: /home/sklearn/.ssh 62 | 63 | /home/sklearn/.ssh/id_rsa.pub: 64 | file.managed: 65 | - user: sklearn 66 | - group: sklearn 67 | - source: salt://docbuilder_rsa.pub 68 | - require: 69 | - file: /home/sklearn/.ssh 70 | 71 | /home/sklearn/.ssh/config: 72 | file.managed: 73 | - user: sklearn 74 | - group: sklearn 75 | - source: salt://ssh_config 76 | - require: 77 | - user: sklearn 78 | - file: /home/sklearn/.ssh 79 | 80 | 81 | # 
Install a recent version of virtualenv with pip 82 | # before creating the virtual environment itself. 83 | # This is required as the version of virtualenv shipped 84 | # by the python-virtualenv package is too old and 85 | # has a bug that prevents it to upgrade setuptools to 86 | # a version recent-enough for matplotlib to install 87 | # correctly 88 | # Note that we use a `cmd.run` state instead of a 89 | # `pip.installed` state to work around a bug in salt: 90 | # https://github.com/saltstack/salt/issues/21845 91 | install-virtualenv: 92 | cmd.run: 93 | - name: pip install -q virtualenv 94 | - unless: test -f /home/sklearn/venv 95 | 96 | 97 | /home/sklearn/venv: 98 | virtualenv.managed: 99 | - python: /usr/bin/python 100 | - system_site_packages: True 101 | - user: sklearn 102 | - require: 103 | - user: sklearn 104 | - cmd: install-virtualenv 105 | pip.installed: 106 | - names: 107 | - sphinx == 1.2.3 108 | - coverage 109 | - nose 110 | - ipython 111 | - matplotlib 112 | - bin_env: /home/sklearn/venv 113 | - user: sklearn 114 | 115 | 116 | sklearn-git-repo: 117 | git.latest: 118 | - name: https://github.com/scikit-learn/scikit-learn.git 119 | - rev: master 120 | - target: /home/sklearn/scikit-learn/ 121 | - user: sklearn 122 | - require: 123 | - user: sklearn 124 | 125 | 126 | # Upload a bash script that builds the doc and upload the doc on 127 | # http://scikit-learn.org/dev 128 | /home/sklearn/update_doc.sh: 129 | file.managed: 130 | - user: sklearn 131 | - group: sklearn 132 | - source: salt://update_doc.sh 133 | - require: 134 | - user: sklearn 135 | 136 | 137 | # Upload git configuration to be able to commit to the 138 | # scikit-learn.github.io repo 139 | /home/sklearn/.gitconfig: 140 | file.managed: 141 | - user: sklearn 142 | - group: sklearn 143 | - source: salt://gitconfig 144 | - require: 145 | - user: sklearn 146 | 147 | 148 | # Register the execution of the script in a cron job 149 | update-doc-cron-job: 150 | cron.present: 151 | - name: bash 
/home/sklearn/update_doc.sh 152 | > /home/sklearn/public_html/update_doc.log 2>&1 153 | - user: sklearn 154 | - minute: 2 155 | - hour: '*/1' 156 | - require: 157 | - git: sklearn-git-repo 158 | - file: /home/sklearn/update_doc.sh 159 | - file: /home/sklearn/public_html 160 | - file: /home/sklearn/.gitconfig 161 | 162 | 163 | # Once in a while build the doc from a clean folder 164 | update-doc-clean-cron-job: 165 | cron.present: 166 | - name: bash /home/sklearn/update_doc.sh clean 167 | > /home/sklearn/public_html/update_doc_clean.log 2>&1 168 | - user: sklearn 169 | - minute: 32 170 | - hour: 2 171 | - require: 172 | - git: sklearn-git-repo 173 | - file: /home/sklearn/update_doc.sh 174 | - file: /home/sklearn/public_html 175 | - file: /home/sklearn/.gitconfig 176 | - file: /home/sklearn/.ssh/config 177 | - file: /home/sklearn/.ssh/id_rsa 178 | -------------------------------------------------------------------------------- /srv/salt/ssh_config: -------------------------------------------------------------------------------- 1 | Host * 2 | StrictHostKeyChecking no 3 | -------------------------------------------------------------------------------- /srv/salt/top.sls: -------------------------------------------------------------------------------- 1 | base: 2 | '*': 3 | - sklearn-docbuilder 4 | -------------------------------------------------------------------------------- /srv/salt/update_doc.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | # Incremental update and upload of the dev documentation. 
3 | # This script is meant to be run regularly in a cron job 4 | 5 | # Early return on error 6 | set -e 7 | 8 | # Timestamp the log 9 | echo `date` 10 | 11 | # Activate the virtualenv for Python libraries 12 | echo "Activating virtualenv" 13 | source $HOME/venv/bin/activate 14 | cd $HOME/scikit-learn 15 | 16 | # Clean update of the source folder 17 | echo "Fetching source from github" 18 | git fetch origin 19 | git reset --hard origin/master 20 | rev=$(git rev-parse --short HEAD) 21 | 22 | # Compile source code 23 | echo "Building scikit-learn" 24 | if [[ "$1" = "clean" ]]; 25 | then 26 | make clean 27 | fi 28 | 29 | python setup.py develop 30 | 31 | # Compile doc and run example to populate the gallery 32 | echo "Building examples and documentation" 33 | cd doc 34 | if [[ "$1" = "clean" ]]; 35 | then 36 | make clean 37 | fi 38 | sphinx-build -b html -d _build/doctrees . _build/html/stable 39 | if [[ "$1" = "clean" ]]; 40 | then 41 | make optipng 42 | fi 43 | 44 | test -f _build/html/stable/index.html 45 | 46 | echo "Copying documentation to scikit-learn.github.io/dev/" 47 | if [ -d $HOME/scikit-learn.github.io ] 48 | then 49 | cd $HOME/scikit-learn.github.io 50 | else 51 | cd $HOME 52 | git clone git@github.com:scikit-learn/scikit-learn.github.io.git 53 | cd scikit-learn.github.io 54 | fi 55 | git checkout master 56 | git reset --hard origin/master 57 | git rm -rf dev/ && rm -rf dev/ 58 | cp -R $HOME/scikit-learn/doc/_build/html/stable dev 59 | git add -f dev/ 60 | git commit -m "Rebuild dev docs at master=$rev" dev 61 | git push 62 | --------------------------------------------------------------------------------
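The order of the build steps in `update_doc.sh`, and the effect of the optional `clean` argument, can be summarised with a small dry-run sketch. It only returns the command strings in sequence and does not execute anything. The function names `doc_build_commands` and `commit_message` are hypothetical and are not part of this repository, and the publishing steps that copy the result into `scikit-learn.github.io` are left out.

```python
def doc_build_commands(clean=False):
    """Return the build commands in the order update_doc.sh runs them.

    Hypothetical dry-run helper: nothing is executed here.
    """
    commands = ["git fetch origin", "git reset --hard origin/master"]
    if clean:
        commands.append("make clean")  # clean the source tree
    commands.append("python setup.py develop")
    commands.append("cd doc")
    if clean:
        commands.append("make clean")  # clean the doc build folder
    commands.append(
        "sphinx-build -b html -d _build/doctrees . _build/html/stable")
    if clean:
        commands.append("make optipng")  # image optimisation, clean runs only
    # Sanity check that the build produced an index page.
    commands.append("test -f _build/html/stable/index.html")
    return commands


def commit_message(rev):
    """Format the commit message used when publishing the rebuilt docs."""
    return "Rebuild dev docs at master=%s" % rev


if __name__ == "__main__":
    for command in doc_build_commands(clean=True):
        print(command)
    # 'abc1234' is a placeholder revision, not a real commit.
    print(commit_message("abc1234"))
```

The hourly cron job corresponds to `clean=False`, while the nightly job passes `clean` and therefore adds the two `make clean` steps and the `make optipng` pass.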