├── .DS_Store
├── .gitignore
├── Makefile
├── README.md
├── TODO.rst
├── rst_sample
│   ├── Makefile
│   ├── test.jpg
│   └── test.rst
└── source
    ├── _template
    │   └── page.html
    ├── conf.py
    ├── images
    │   ├── 01-01.jpg
    │   ├── 01-02.png
    │   ├── 02-01.png
    │   ├── 02-02.png
    │   ├── 02-03.png
    │   ├── 02-04.png
    │   ├── 02-05.png
    │   ├── 02-06.png
    │   ├── 02-07.png
    │   ├── 02-08.png
    │   ├── 02-09.png
    │   ├── 02-10.png
    │   ├── 02-11.png
    │   ├── 02-12.png
    │   ├── 02-13.png
    │   ├── 02-14.png
    │   ├── 02-15.png
    │   ├── 02-16.png
    │   ├── 02-17.png
    │   ├── 02-18.png
    │   ├── 03-01.png
    │   ├── 03-02.png
    │   ├── 03-03.png
    │   ├── 03-04.png
    │   ├── 03-05.png
    │   ├── 03-06.png
    │   ├── 03-07.png
    │   ├── 03-08.png
    │   ├── 03-09.png
    │   ├── 03-10.jpg
    │   ├── 03-11.png
    │   ├── 03-12.png
    │   ├── 03-13.png
    │   ├── 04-01.png
    │   ├── 04-02.png
    │   ├── 04-03.png
    │   ├── 04-04.png
    │   ├── apx01-01.png
    │   ├── apx01-02.png
    │   ├── apx01-03.png
    │   ├── apx01-04.png
    │   ├── apx01-05.png
    │   ├── apx01-06.jpeg
    │   ├── apx01-07.png
    │   ├── apx01-08.png
    │   ├── apx01-09.jpg
    │   ├── apx01-10.jpg
    │   ├── apx01-11.jpg
    │   ├── apx01-12.jpg
    │   └── exp-01.png
    ├── index.rst
    └── posts
        ├── about.rst
        ├── appendix01.rst
        ├── appendix02.rst
        ├── appendix03.rst
        ├── appendix04.rst
        ├── appendix05.rst
        ├── appendix06.rst
        ├── appendix07.rst
        ├── ch01.rst
        ├── ch02.rst
        ├── ch03.rst
        ├── ch04.rst
        ├── ch05.rst
        └── exp.rst

/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/.DS_Store
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | build/
  2 | rst_sample/*.pdf
  3 | rst_sample/*.html
  4 | .DS_Store
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
  1 | # Makefile for Sphinx documentation
  2 | #
  3 | 
  4 | # You can set these variables from the command line.
  5 | SPHINXOPTS    =
  6 | SPHINXBUILD   = sphinx-build
  7 | PAPER         =
  8 | BUILDDIR      = build
  9 | 
 10 | # User-friendly check for sphinx-build
 11 | ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
 12 | $(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
 13 | endif
 14 | 
 15 | # Internal variables.
 16 | PAPEROPT_a4     = -D latex_paper_size=a4
 17 | PAPEROPT_letter = -D latex_paper_size=letter
 18 | ALLSPHINXOPTS   = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
 19 | # the i18n builder cannot share the environment and doctrees with the others
 20 | I18NSPHINXOPTS  = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
 21 | 
 22 | .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext
 23 | 
 24 | help:
 25 | 	@echo "Please use \`make <target>' where <target> is one of"
 26 | 	@echo "  html       to make standalone HTML files"
 27 | 	@echo "  dirhtml    to make HTML files named index.html in directories"
 28 | 	@echo "  singlehtml to make a single large HTML file"
 29 | 	@echo "  pickle     to make pickle files"
 30 | 	@echo "  json       to make JSON files"
 31 | 	@echo "  htmlhelp   to make HTML files and a HTML help project"
 32 | 	@echo "  qthelp     to make HTML files and a qthelp project"
 33 | 	@echo "  devhelp    to make HTML files and a Devhelp project"
 34 | 	@echo "  epub       to make an epub"
 35 | 	@echo "  latex      to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
 36 | 	@echo "  latexpdf   to make LaTeX files and run them through pdflatex"
 37 | 	@echo "  latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
 38 | 	@echo "  text       to make text files"
 39 | 	@echo "  man        to make manual pages"
 40 | 	@echo "  texinfo    to make Texinfo files"
 41 | 	@echo "  info       to make Texinfo files and run them through makeinfo"
 42 | 	@echo "  gettext    to make PO message catalogs"
 43 | 	@echo "  changes    to make an overview of all changed/added/deprecated items"
 44 | 	@echo "  xml        to make Docutils-native XML files"
 45 | 	@echo "  pseudoxml  to make pseudoxml-XML files for display purposes"
 46 | 	@echo "  linkcheck  to check all external links for integrity"
 47 | 	@echo "  doctest    to run all doctests embedded in the documentation (if enabled)"
 48 | 
 49 | clean:
 50 | 	rm -rf $(BUILDDIR)/*
 51 | 
 52 | html:
 53 | 	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
 54 | 	@echo
 55 | 	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
 56 | 
 57 | dirhtml:
 58 | 	$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
 59 | 	@echo
 60 | 	@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
 61 | 
 62 | singlehtml:
 63 | 	$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
 64 | 	@echo
 65 | 	@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
 66 | 
 67 | pickle:
 68 | 	$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
 69 | 	@echo
 70 | 	@echo "Build finished; now you can process the pickle files."
 71 | 
 72 | json:
 73 | 	$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
 74 | 	@echo
 75 | 	@echo "Build finished; now you can process the JSON files."
 76 | 
 77 | htmlhelp:
 78 | 	$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
 79 | 	@echo
 80 | 	@echo "Build finished; now you can run HTML Help Workshop with the" \
 81 | 	      ".hhp project file in $(BUILDDIR)/htmlhelp."
 82 | 
 83 | qthelp:
 84 | 	$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
 85 | 	@echo
 86 | 	@echo "Build finished; now you can run "qcollectiongenerator" with the" \
 87 | 	      ".qhcp project file in $(BUILDDIR)/qthelp, like this:"
 88 | 	@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/IntheCloud.qhcp"
 89 | 	@echo "To view the help file:"
 90 | 	@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/IntheCloud.qhc"
 91 | 
 92 | devhelp:
 93 | 	$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
 94 | 	@echo
 95 | 	@echo "Build finished."
 96 | 	@echo "To view the help file:"
 97 | 	@echo "# mkdir -p $$HOME/.local/share/devhelp/IntheCloud"
 98 | 	@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/IntheCloud"
 99 | 	@echo "# devhelp"
100 | 
101 | epub:
102 | 	$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
103 | 	@echo
104 | 	@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
105 | 
106 | latex:
107 | 	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
108 | 	@echo
109 | 	@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
110 | 	@echo "Run \`make' in that directory to run these through (pdf)latex" \
111 | 	      "(use \`make latexpdf' here to do that automatically)."
112 | 
113 | latexpdf:
114 | 	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
115 | 	@echo "Running LaTeX files through pdflatex..."
116 | 	$(MAKE) -C $(BUILDDIR)/latex all-pdf
117 | 	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
118 | 
119 | latexpdfja:
120 | 	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
121 | 	@echo "Running LaTeX files through platex and dvipdfmx..."
122 | 	$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
123 | 	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
124 | 
125 | text:
126 | 	$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
127 | 	@echo
128 | 	@echo "Build finished. The text files are in $(BUILDDIR)/text."
129 | 
130 | man:
131 | 	$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
132 | 	@echo
133 | 	@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
134 | 
135 | texinfo:
136 | 	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
137 | 	@echo
138 | 	@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
139 | 	@echo "Run \`make' in that directory to run these through makeinfo" \
140 | 	      "(use \`make info' here to do that automatically)."
141 | 
142 | info:
143 | 	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
144 | 	@echo "Running Texinfo files through makeinfo..."
145 | 	make -C $(BUILDDIR)/texinfo info
146 | 	@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
147 | 
148 | gettext:
149 | 	$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
150 | 	@echo
151 | 	@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
152 | 
153 | changes:
154 | 	$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
155 | 	@echo
156 | 	@echo "The overview file is in $(BUILDDIR)/changes."
157 | 
158 | linkcheck:
159 | 	$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
160 | 	@echo
161 | 	@echo "Link check complete; look for any errors in the above output " \
162 | 	      "or in $(BUILDDIR)/linkcheck/output.txt."
163 | 
164 | doctest:
165 | 	$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
166 | 	@echo "Testing of doctests in the sources finished, look at the " \
167 | 	      "results in $(BUILDDIR)/doctest/output.txt."
168 | 
169 | xml:
170 | 	$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
171 | 	@echo
172 | 	@echo "Build finished. The XML files are in $(BUILDDIR)/xml."
173 | 
174 | pseudoxml:
175 | 	$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
176 | 	@echo
177 | 	@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."
178 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | In the Cloud
 2 | ====
 3 | 
 4 | This book has been substantially revised and released as a printed edition; the updated book, 《KVM私有云架构设计与实践》 (KVM Private Cloud Architecture Design and Practice), is available on JD.
 5 | 
 6 | This is an introductory handbook for cloud computing, meant as a guide to get newcomers up to speed quickly.
 7 | It mainly covers *oVirt*, *Glusterfs*, *Hadoop*, *OpenStack*, the *home cloud*, and assorted *small things to tinker with*. If you have comments or suggestions, open an [issue](https://inthecloud.readthedocs.org/), leave a message on [v2ex](http://www.v2ex.com/t/123647) or at [Lofyer's Archive](http://blog.lofyer.org/workshop/), or simply [send me an email](mailto:lofyer@gmail.com).
 8 | 
 9 | Read online
10 | ----
11 | 
12 | ReadTheDocs: https://inthecloud.readthedocs.org
13 | 
14 | Plain project homepage
15 | ----
16 | 
17 | InTheCloud: http://blog.lofyer.org/InTheCloud
18 | 
19 | Build
20 | ----
21 | 
22 | ```
23 | $ git clone https://github.com/lofyer/InTheCloud.git
24 | $ cd InTheCloud
25 | $ make html
26 | ```
27 | 
--------------------------------------------------------------------------------
/TODO.rst:
--------------------------------------------------------------------------------
1 | Say more about the underlying ideas.
2 | 
3 | Teaching someone to fish beats giving them a fish: explain the why along with the how.
4 | 
5 | Add usage examples for the tools in chapters 4 and 5 (e.g. visualization, HadoopBox).
--------------------------------------------------------------------------------
/rst_sample/Makefile:
--------------------------------------------------------------------------------
1 | all:
2 | 	rst2html.py test.rst > test.html
3 | 	# open test.html
4 | pdf:
5 | 	rst2pdf test.rst -o test.pdf
--------------------------------------------------------------------------------
/rst_sample/test.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/rst_sample/test.jpg
--------------------------------------------------------------------------------
/rst_sample/test.rst:
--------------------------------------------------------------------------------
  1 | ========================================
  2 | Main Title
  3 | ========================================
  4 | 
  5 | ----------------------------------------
  6 | Subtitle
  7 | ----------------------------------------
  8 | 
  9 | Chapter 1
 10 | ========================================
 11 | 
 12 | 1.1 Section
 13 | ----------------------------------------
 14 | 
 15 | 1.1.1 Subsection
 16 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 17 | 
 18 | Some text goes here.
 19 | 
 20 | How about a tab?
 21 | 
 22 | 	Sure.
 23 | 
 24 | * A table of contents; this line starts with *
 25 | 
 26 |   - 1. Chapter 1
 27 | 
 28 |     + 1.1 I am a sub-entry
 29 | 
 30 |     + 1.2 I am a sub-entry too, even though there is a blank line above me..
 31 | 
 32 |   - A. Appendix A; this line starts with -
 33 | 
 34 | *italic*
 35 | 
 36 | **bold**
 37 | 
 38 | Reference_
 39 | 
 40 | +------------------------+---------+--------+
 41 | |Who am I                |Is it so?|Really? |
 42 | +========================+=========+========+
 43 | |Me                      |Yes      |True    |
 44 | +------------------------+---------+--------+
 45 | |Me                      |No       |False   |
 46 | +------------------------+---------+--------+
 47 | 
 48 | tab test
 49 | 
 50 | 	one
 51 | 	two; there is no newline between me and the line above
 52 | 
 53 | Here is a link `lofyer `_
 54 | 
 55 | .. _Reference: http://localhost/
 56 | 
 57 | Add some code; skip a line and indent with a tab::
 58 | 
 59 | 	printf("hello world");
 60 | 
 61 | OK, next up, the Apple logo.
 62 | 
 63 | .. image:: test.jpg
 64 |    :height: 300
 65 |    :width: 550
 66 | 
 67 | Resize it, add a link, and center it.
 68 | 
 69 | .. image:: test.jpg
 70 |    :height: 300
 71 |    :width: 550
 72 |    :scale: 50
 73 |    :align: center
 74 |    :target: http://localhost
 75 | 
 76 | .. topic:: A topic
 77 | 
 78 |    Topic body
 79 | 
 80 | .. epigraph::
 81 | 
 82 |    A quotation from some magnum opus.
 83 | 
 84 | .. DANGER::
 85 |    Beware killer rabbits!
 86 | 
 87 | .. line-block::
 88 |    Lend us
 89 |    a couple
 90 |    of tea.
 91 |    Eva.
 92 | 
 93 | .. sidebar:: Sidebar Title
 94 |    :subtitle: Sidebar subtitle
 95 | 
 96 |    Sidebar body
 97 | 
 98 | .. math::
 99 | 
100 |    (a + b)^2 &= (a + b)(a + b) \\
101 |              &= a^2 + 2ab + b^2
--------------------------------------------------------------------------------
/source/_template/page.html:
--------------------------------------------------------------------------------
 1 | {% extends "!page.html" %}
 2 | 
 3 | {% block body %}
 4 | {{ super() }}
 5 | {% if READTHEDOCS %}
 6 | 

 7 | Discussion
 8 | 

10 | 11 |
12 | {% endif %} 13 | {% endblock %} 14 | 15 | {% block footer %} 16 | {{ super() }} 17 | {% if READTHEDOCS %} 18 | 30 | 31 | {% endif %} 32 | {% endblock %} 33 | -------------------------------------------------------------------------------- /source/conf.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # In the Cloud documentation build configuration file, created by 4 | # sphinx-quickstart on Sun Jun 8 23:43:15 2014. 5 | # 6 | # This file is execfile()d with the current directory set to its 7 | # containing dir. 8 | # 9 | # Note that not all possible configuration values are present in this 10 | # autogenerated file. 11 | # 12 | # All configuration values have a default; values that are commented out 13 | # serve to show the default. 14 | 15 | import sys 16 | import os 17 | 18 | on_rtd = os.environ.get('READTHEDOCS', None) == 'True' 19 | 20 | # If extensions (or modules to document with autodoc) are in another directory, 21 | # add these directories to sys.path here. If the directory is relative to the 22 | # documentation root, use os.path.abspath to make it absolute, like shown here. 23 | #sys.path.insert(0, os.path.abspath('.')) 24 | 25 | # -- General configuration ------------------------------------------------ 26 | 27 | # If your documentation needs a minimal Sphinx version, state it here. 28 | #needs_sphinx = '1.0' 29 | 30 | # Add any Sphinx extension module names here, as strings. They can be 31 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom 32 | # ones. 33 | extensions = [ 34 | 'sphinx.ext.autodoc', 35 | 'sphinx.ext.todo', 36 | 'sphinx.ext.pngmath', 37 | 'sphinx.ext.viewcode', 38 | ] 39 | 40 | # Add any paths that contain templates here, relative to this directory. 41 | templates_path = ['_templates'] 42 | 43 | # The suffix of source filenames. 44 | source_suffix = '.rst' 45 | 46 | # The encoding of source files. 
47 | #source_encoding = 'utf-8-sig' 48 | 49 | # The master toctree document. 50 | master_doc = 'index' 51 | 52 | # General information about the project. 53 | project = u'In the Cloud' 54 | copyright = u'2014, lofyer' 55 | 56 | # The version info for the project you're documenting, acts as replacement for 57 | # |version| and |release|, also used in various other places throughout the 58 | # built documents. 59 | # 60 | # The short X.Y version. 61 | version = '0.8' 62 | # The full version, including alpha/beta/rc tags. 63 | release = '' 64 | 65 | # The language for content autogenerated by Sphinx. Refer to documentation 66 | # for a list of supported languages. 67 | language = "zh_CN" 68 | 69 | # There are two options for replacing |today|: either, you set today to some 70 | # non-false value, then it is used: 71 | #today = '' 72 | # Else, today_fmt is used as the format for a strftime call. 73 | #today_fmt = '%B %d, %Y' 74 | 75 | # List of patterns, relative to source directory, that match files and 76 | # directories to ignore when looking for source files. 77 | exclude_patterns = [] 78 | 79 | # The reST default role (used for this markup: `text`) to use for all 80 | # documents. 81 | #default_role = None 82 | 83 | # If true, '()' will be appended to :func: etc. cross-reference text. 84 | #add_function_parentheses = True 85 | 86 | # If true, the current module name will be prepended to all description 87 | # unit titles (such as .. function::). 88 | #add_module_names = True 89 | 90 | # If true, sectionauthor and moduleauthor directives will be shown in the 91 | # output. They are ignored by default. 92 | #show_authors = False 93 | 94 | # The name of the Pygments (syntax highlighting) style to use. 95 | pygments_style = 'sphinx' 96 | 97 | # A list of ignored prefixes for module index sorting. 98 | #modindex_common_prefix = [] 99 | 100 | # If true, keep warnings as "system message" paragraphs in the built documents. 
101 | #keep_warnings = False 102 | 103 | 104 | # -- Options for HTML output ---------------------------------------------- 105 | 106 | # The theme to use for HTML and HTML Help pages. See the documentation for 107 | # a list of builtin themes. 108 | # html_theme = 'haiku' 109 | html_theme = 'default' 110 | if not on_rtd: # only import and set the theme if we're building docs locally 111 | import sphinx_rtd_theme 112 | html_theme = 'sphinx_rtd_theme' 113 | html_theme_path = [sphinx_rtd_theme.get_html_theme_path()] 114 | 115 | # Theme options are theme-specific and customize the look and feel of a theme 116 | # further. For a list of options available for each theme, see the 117 | # documentation. 118 | #html_theme_options = {} 119 | 120 | # Add any paths that contain custom themes here, relative to this directory. 121 | #html_theme_path = [] 122 | 123 | # The name for this set of Sphinx documents. If None, it defaults to 124 | # " v documentation". 125 | #html_title = None 126 | 127 | # A shorter title for the navigation bar. Default is the same as html_title. 128 | #html_short_title = None 129 | 130 | # The name of an image file (relative to this directory) to place at the top 131 | # of the sidebar. 132 | #html_logo = None 133 | 134 | # The name of an image file (within the static path) to use as favicon of the 135 | # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 136 | # pixels large. 137 | #html_favicon = None 138 | 139 | # Add any paths that contain custom static files (such as style sheets) here, 140 | # relative to this directory. They are copied after the builtin static files, 141 | # so a file named "default.css" will overwrite the builtin "default.css". 142 | html_static_path = ['_static'] 143 | 144 | # Add any extra paths that contain custom files (such as robots.txt or 145 | # .htaccess) here, relative to this directory. These files are copied 146 | # directly to the root of the documentation. 
147 | #html_extra_path = [] 148 | 149 | # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, 150 | # using the given strftime format. 151 | #html_last_updated_fmt = '%b %d, %Y' 152 | 153 | # If true, SmartyPants will be used to convert quotes and dashes to 154 | # typographically correct entities. 155 | html_use_smartypants = True 156 | 157 | # Custom sidebar templates, maps document names to template names. 158 | #html_sidebars = {} 159 | 160 | # Additional templates that should be rendered to pages, maps page names to 161 | # template names. 162 | #html_additional_pages = {} 163 | 164 | # If false, no module index is generated. 165 | #html_domain_indices = True 166 | 167 | # If false, no index is generated. 168 | #html_use_index = True 169 | 170 | # If true, the index is split into individual pages for each letter. 171 | #html_split_index = False 172 | 173 | # If true, links to the reST sources are added to the pages. 174 | #html_show_sourcelink = True 175 | 176 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. 177 | #html_show_sphinx = True 178 | 179 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 180 | #html_show_copyright = True 181 | 182 | # If true, an OpenSearch description file will be output, and all pages will 183 | # contain a tag referring to it. The value of this option must be the 184 | # base URL from which the finished HTML is served. 185 | #html_use_opensearch = '' 186 | 187 | # This is the file name suffix for HTML files (e.g. ".xhtml"). 188 | #html_file_suffix = None 189 | 190 | # Output file base name for HTML help builder. 191 | htmlhelp_basename = 'IntheCloud_cndoc' 192 | 193 | 194 | # -- Options for LaTeX output --------------------------------------------- 195 | 196 | latex_elements = { 197 | # The paper size ('letterpaper' or 'a4paper'). 198 | #'papersize': 'letterpaper', 199 | 200 | # The font size ('10pt', '11pt' or '12pt'). 
201 | #'pointsize': '10pt', 202 | 203 | # Additional stuff for the LaTeX preamble. 204 | #'preamble': '', 205 | } 206 | 207 | # Grouping the document tree into LaTeX files. List of tuples 208 | # (source start file, target name, title, 209 | # author, documentclass [howto, manual, or own class]). 210 | latex_documents = [ 211 | ('index', 'IntheCloud.tex', u'In the Cloud Documentation', 212 | u'lofyer', 'manual'), 213 | ] 214 | 215 | # The name of an image file (relative to this directory) to place at the top of 216 | # the title page. 217 | #latex_logo = None 218 | 219 | # For "manual" documents, if this is true, then toplevel headings are parts, 220 | # not chapters. 221 | #latex_use_parts = False 222 | 223 | # If true, show page references after internal links. 224 | #latex_show_pagerefs = False 225 | 226 | # If true, show URL addresses after external links. 227 | #latex_show_urls = False 228 | 229 | # Documents to append as an appendix to all manuals. 230 | #latex_appendices = [] 231 | 232 | # If false, no module index is generated. 233 | #latex_domain_indices = True 234 | 235 | 236 | # -- Options for manual page output --------------------------------------- 237 | 238 | # One entry per manual page. List of tuples 239 | # (source start file, name, description, authors, manual section). 240 | man_pages = [ 241 | ('index', 'inthecloud', u'In the Cloud Documentation', 242 | [u'lofyer'], 1) 243 | ] 244 | 245 | # If true, show URL addresses after external links. 246 | #man_show_urls = False 247 | 248 | 249 | # -- Options for Texinfo output ------------------------------------------- 250 | 251 | # Grouping the document tree into Texinfo files. 
List of tuples 252 | # (source start file, target name, title, author, 253 | # dir menu entry, description, category) 254 | texinfo_documents = [ 255 | ('index', 'IntheCloud', u'In the Cloud Documentation', 256 | u'lofyer', 'IntheCloud', 'One line description of project.', 257 | 'Miscellaneous'), 258 | ] 259 | 260 | # Documents to append as an appendix to all manuals. 261 | #texinfo_appendices = [] 262 | 263 | # If false, no module index is generated. 264 | #texinfo_domain_indices = True 265 | 266 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 267 | #texinfo_show_urls = 'footnote' 268 | 269 | # If true, do not generate a @detailmenu in the "Top" node's menu. 270 | #texinfo_no_detailmenu = False 271 | 272 | 273 | # -- Options for Epub output ---------------------------------------------- 274 | 275 | # Bibliographic Dublin Core info. 276 | epub_title = u'In the Cloud' 277 | epub_author = u'lofyer' 278 | epub_publisher = u'lofyer' 279 | epub_copyright = u'2014, lofyer' 280 | 281 | # The basename for the epub file. It defaults to the project name. 282 | #epub_basename = u'In the Cloud' 283 | 284 | # The HTML theme for the epub output. Since the default themes are not optimized 285 | # for small screen space, using the same theme for HTML and epub output is 286 | # usually not wise. This defaults to 'epub', a theme designed to save visual 287 | # space. 288 | #epub_theme = 'epub' 289 | 290 | # The language of the text. It defaults to the language option 291 | # or en if the language is not set. 292 | #epub_language = '' 293 | 294 | # The scheme of the identifier. Typical schemes are ISBN or URL. 295 | #epub_scheme = '' 296 | 297 | # The unique identifier of the text. This can be a ISBN number 298 | # or the project homepage. 299 | #epub_identifier = '' 300 | 301 | # A unique identification for the text. 302 | #epub_uid = '' 303 | 304 | # A tuple containing the cover image and cover page html template filenames. 
305 | #epub_cover = () 306 | 307 | # A sequence of (type, uri, title) tuples for the guide element of content.opf. 308 | #epub_guide = () 309 | 310 | # HTML files that should be inserted before the pages created by sphinx. 311 | # The format is a list of tuples containing the path and title. 312 | #epub_pre_files = [] 313 | 314 | # HTML files shat should be inserted after the pages created by sphinx. 315 | # The format is a list of tuples containing the path and title. 316 | #epub_post_files = [] 317 | 318 | # A list of files that should not be packed into the epub file. 319 | epub_exclude_files = ['search.html'] 320 | 321 | # The depth of the table of contents in toc.ncx. 322 | #epub_tocdepth = 3 323 | 324 | # Allow duplicate toc entries. 325 | #epub_tocdup = True 326 | 327 | # Choose between 'default' and 'includehidden'. 328 | #epub_tocscope = 'default' 329 | 330 | # Fix unsupported image types using the PIL. 331 | #epub_fix_images = False 332 | 333 | # Scale large images. 334 | #epub_max_image_width = 0 335 | 336 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 337 | #epub_show_urls = 'inline' 338 | 339 | # If false, no index is generated. 
340 | #epub_use_index = True 341 | -------------------------------------------------------------------------------- /source/images/01-01.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/01-01.jpg -------------------------------------------------------------------------------- /source/images/01-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/01-02.png -------------------------------------------------------------------------------- /source/images/02-01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-01.png -------------------------------------------------------------------------------- /source/images/02-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-02.png -------------------------------------------------------------------------------- /source/images/02-03.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-03.png -------------------------------------------------------------------------------- /source/images/02-04.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-04.png -------------------------------------------------------------------------------- /source/images/02-05.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-05.png -------------------------------------------------------------------------------- /source/images/02-06.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-06.png -------------------------------------------------------------------------------- /source/images/02-07.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-07.png -------------------------------------------------------------------------------- /source/images/02-08.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-08.png -------------------------------------------------------------------------------- /source/images/02-09.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-09.png -------------------------------------------------------------------------------- /source/images/02-10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-10.png -------------------------------------------------------------------------------- /source/images/02-11.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-11.png -------------------------------------------------------------------------------- /source/images/02-12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-12.png -------------------------------------------------------------------------------- /source/images/02-13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-13.png -------------------------------------------------------------------------------- /source/images/02-14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-14.png -------------------------------------------------------------------------------- /source/images/02-15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-15.png -------------------------------------------------------------------------------- /source/images/02-16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-16.png -------------------------------------------------------------------------------- /source/images/02-17.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-17.png 
-------------------------------------------------------------------------------- /source/images/02-18.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/02-18.png -------------------------------------------------------------------------------- /source/images/03-01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-01.png -------------------------------------------------------------------------------- /source/images/03-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-02.png -------------------------------------------------------------------------------- /source/images/03-03.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-03.png -------------------------------------------------------------------------------- /source/images/03-04.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-04.png -------------------------------------------------------------------------------- /source/images/03-05.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-05.png -------------------------------------------------------------------------------- /source/images/03-06.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-06.png -------------------------------------------------------------------------------- /source/images/03-07.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-07.png -------------------------------------------------------------------------------- /source/images/03-08.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-08.png -------------------------------------------------------------------------------- /source/images/03-09.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-09.png -------------------------------------------------------------------------------- /source/images/03-10.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-10.jpg -------------------------------------------------------------------------------- /source/images/03-11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-11.png -------------------------------------------------------------------------------- /source/images/03-12.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-12.png -------------------------------------------------------------------------------- /source/images/03-13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/03-13.png -------------------------------------------------------------------------------- /source/images/04-01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/04-01.png -------------------------------------------------------------------------------- /source/images/04-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/04-02.png -------------------------------------------------------------------------------- /source/images/04-03.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/04-03.png -------------------------------------------------------------------------------- /source/images/04-04.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/04-04.png -------------------------------------------------------------------------------- /source/images/apx01-01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-01.png 
-------------------------------------------------------------------------------- /source/images/apx01-02.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-02.png -------------------------------------------------------------------------------- /source/images/apx01-03.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-03.png -------------------------------------------------------------------------------- /source/images/apx01-04.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-04.png -------------------------------------------------------------------------------- /source/images/apx01-05.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-05.png -------------------------------------------------------------------------------- /source/images/apx01-06.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-06.jpeg -------------------------------------------------------------------------------- /source/images/apx01-07.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-07.png -------------------------------------------------------------------------------- /source/images/apx01-08.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-08.png -------------------------------------------------------------------------------- /source/images/apx01-09.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-09.jpg -------------------------------------------------------------------------------- /source/images/apx01-10.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-10.jpg -------------------------------------------------------------------------------- /source/images/apx01-11.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-11.jpg -------------------------------------------------------------------------------- /source/images/apx01-12.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/apx01-12.jpg -------------------------------------------------------------------------------- /source/images/exp-01.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lofyer/InTheCloud/4dc6ec4b16c7684f572c4dc40708f6fa52de3ab9/source/images/exp-01.png -------------------------------------------------------------------------------- /source/index.rst: -------------------------------------------------------------------------------- 1 | .. 
a documentation master file, created by 2 | sphinx-quickstart on Tue Jun 10 21:11:02 2014. 3 | You can adapt this file completely to your liking, but it should at least 4 | contain the root `toctree` directive. 5 | 6 | In the Cloud 7 | ============================= 8 | 9 | 本文档已经深度提炼,作为《KVM私有云架构设计与实践》一书的参考内容之一(新版目录可至 `lofyer.org `_ 查看),已由上海交通大学出版社出版,欢迎购买,或者联系本人团购,微信号为lofyer_org。 10 | 11 | .. toctree:: 12 | :maxdepth: 3 13 | 14 | posts/ch01.rst 15 | posts/ch02.rst 16 | posts/ch03.rst 17 | posts/ch04.rst 18 | posts/ch05.rst 19 | posts/appendix01.rst 20 | posts/appendix02.rst 21 | posts/appendix03.rst 22 | posts/appendix04.rst 23 | posts/appendix05.rst 24 | posts/appendix06.rst 25 | posts/appendix07.rst 26 | posts/exp.rst 27 | 28 | 链接 29 | ================== 30 | 31 | .. toctree:: 32 | :maxdepth: 1 33 | 34 | posts/about.rst 35 | -------------------------------------------------------------------------------- /source/posts/about.rst: -------------------------------------------------------------------------------- 1 | ============== 2 | 关于作者与文档 3 | ============== 4 | 5 | 文档基本已暂停更新,计划出书中,详情见作者博客。 6 | 7 | Author: 十六(lofyer) 8 | 9 | Mail: lofyer@gmail.com 10 | 11 | 新浪微博: lofyer 12 | 13 | QQ: 578645806 14 | 15 | 微信: 578645806 16 | 17 | 个人博客: http://blog.lofyer.org 18 | 19 | 在线阅读 20 | 21 | ReadTheDocs: https://inthecloud.readthedocs.org 22 | 23 | .. raw:: html 24 | 25 | 26 |

这是一本云计算入门手册,内容比较基础但又庞杂,可当作学习参考的地图(map)。

27 |

内容主要涵盖:oVirt、GlusterFS、Hadoop、OpenStack、家居云,以及各种可以折腾的小东西。当然,如果你有什么好的意见或者建议,可以新建issue,或者在v2ex上留言,再或者去Lofyer's Archive留言,或者干脆发邮件给我。

28 | Creative Commons License
InTheCloud is licensed under a Creative Commons Attribution 4.0 International License. 29 | 30 | -------------------------------------------------------------------------------- /source/posts/appendix02.rst: -------------------------------------------------------------------------------- 1 | ================= 2 | 附录二 公有云参考 3 | ================= 4 | 5 | ----------------------- 6 | 虚拟机占用主机资源隔离 7 | ----------------------- 8 | 9 | 网络 10 | ----- 11 | 12 | 1. 使用 **VLAN** 、 **openvswitch** 进行隔离或限制。 13 | 14 | 2. 使用tc(Traffic Control)命令进行速率的限制,许多虚拟化平台使用的都是它。 15 | 16 | 17 | 18 | CPU 19 | ----- 20 | 21 | 1. 使用CPU Pin可以将特定虚拟核固定在指定物理核上。把超线程当作核来使用的话,每一个虚拟机使用一个核,之间互不干涉。 22 | 23 | 2. 使用Linux的 `Control Group `_ 来限制虚拟机对宿主机CPU的用度。 24 | 25 | 磁盘IO 26 | ------- 27 | 28 | 1. 更改内核磁盘IO调度方式:在noop、cfq、deadline中直接选择,参考 `Linux Doc. `_ 。 29 | 30 | 2. 使用Linux的 `Control Group `_ 来限制虚拟机对宿主机设备的访问。 31 | 32 | 使能cgroup,假如没有启动的话。 33 | 34 | .. code:: 35 | 36 | # mount tmpfs cgroup_root /sys/fs/cgroup 37 | # mkdir /sys/fs/cgroup/blkio 38 | # mount -t cgroup -o blkio none /sys/fs/cgroup/blkio 39 | 40 | 创建一个1MB/s(1048576字节/秒)的写入IO限制组,X:Y 为MAJOR:MINOR。 41 | 42 | .. code:: 43 | 44 | # lsblk 45 | # mkdir -p /sys/fs/cgroup/blkio/limit1M/ 46 | # echo "8:0 1048576" > /sys/fs/cgroup/blkio/limit1M/blkio.throttle.write_bps_device 47 | 48 | 将虚拟机进程附加到限制组。 49 | 50 | .. code:: 51 | 52 | # echo $VM_PID > /sys/fs/cgroup/blkio/limit1M/tasks 53 | 54 | 目前没有删除task功能,只能将它移到根组,或者是删除此组。 55 | 56 | .. code:: 57 | 58 | # echo $VM_PID > /sys/fs/cgroup/blkio/tasks 59 | 60 | 3. 
更改qemu drive cache。 61 | 62 | 设备 63 | ----- 64 | 65 | 使用Linux的 `Control Group `_ 来限制虚拟机对宿主机设备的访问。 66 | 67 | --------------- 68 | 资源用度、计费 69 | --------------- 70 | 71 | 系统报告主要是在scaling(测量)、charging(计费)时使用。 72 | 73 | 在测量时可以使用平台提供的API、测量组件,综合利用nagios/Icinga,使用Django快速开发。 74 | 75 | oVirt可以参考ovirt-reports,OpenStack参考其Ceilometer。 76 | 77 | 计费模块OpenStack可参考新浪云的 `dough项目 `_ 。 78 | 79 | -------------------------- 80 | DeltaCloud/Libcloud混合云 81 | -------------------------- 82 | 83 | **DeltaCloud支持:** 84 | 85 | arubacloud 86 | 87 | azure 88 | 89 | ec2 90 | 91 | rackspace 92 | 93 | terremark 94 | 95 | openstack 96 | 97 | fgcp 98 | 99 | eucalyptus 100 | 101 | digitalocean 102 | 103 | sbc 104 | 105 | mock 106 | 107 | condor 108 | 109 | rhevm 110 | 111 | google 112 | 113 | opennebula 114 | 115 | vsphere 116 | 117 | gogrid 118 | 119 | rimuhosting 120 | 121 | **Libcloud支持:** 122 | 123 | biquo 124 | 125 | PCextreme 126 | 127 | Azure Virtual machines 128 | 129 | Bluebox Blocks 130 | 131 | Brightbox 132 | 133 | CloudFrames 134 | 135 | CloudSigma (API v2.0) 136 | 137 | CloudStack 138 | 139 | DigitalOcean 140 | 141 | Dreamhost 142 | 143 | Amazon EC2 144 | 145 | Enomaly Elastic Computing Platform 146 | 147 | ElasticHosts 148 | 149 | Eucalyptus 150 | 151 | Exoscale 152 | 153 | Gandi 154 | 155 | Google Compute Engine 156 | 157 | GoGrid 158 | 159 | HostVirtual 160 | 161 | HP Public Cloud (Helion) 162 | 163 | IBM SmartCloud Enterprise 164 | 165 | Ikoula 166 | 167 | Joyent 168 | 169 | Kili Public Cloud 170 | 171 | KTUCloud 172 | 173 | Libvirt 174 | 175 | Linode 176 | 177 | NephoScale 178 | 179 | Nimbus 180 | 181 | Ninefold 182 | 183 | OpenNebula (v3.8) 184 | 185 | OpenStack 186 | 187 | Opsource 188 | 189 | Outscale INC 190 | 191 | Outscale SAS 192 | 193 | ProfitBricks 194 | 195 | Rackspace Cloud 196 | 197 | RimuHosting 198 | 199 | ServerLove 200 | 201 | skalicloud 202 | 203 | SoftLayer 204 | 205 | vCloud 206 | 207 | VCL 208 | 209 | vCloud 210 | 211 | Voxel VoxCLOUD 212 | 213 | vps.net 214 | 
215 | VMware vSphere 216 | 217 | Vultr 218 | 219 | DeltaCloud示例 220 | -------------- 221 | 222 | Libcloud示例 223 | -------------- 224 | 225 | ---------------- 226 | SDN学习/mininet 227 | ---------------- 228 | 229 | SDN广泛用在内容加速、虚拟网络、监控等领域。 230 | 231 | 关于SDN有许多学习工具: `mininet `_ 、 `POX `_ 、 `Network Heresy Blog `_ 。 232 | 233 | 学习视频: `Coursera SDN `_ 。 234 | -------------------------------------------------------------------------------- /source/posts/appendix03.rst: -------------------------------------------------------------------------------- 1 | ====================== 2 | 附录三 PaaS/OpenShift 3 | ====================== 4 | 5 | --------- 6 | 这是什么 7 | --------- 8 | 9 | 设想,你有一个公网主机,上面配置了Apache/Nginx,同时装有Ruby、JBoss、Python等环境,平时你用它发布自己的应用。某一天,你的朋友说他也有一个Django应用要发布,问你要一个环境,你就在你的主机上配置了VirtualHost来解析xiaoli.myhost.com到/var/www/django/xiaoli这个目录下,然后他就请你去吃个烤羊腿了。后来,又一个朋友问你要这样的环境,但是这次是php,你就把/var/www/html/php/zhangsan这个目录给他了,这次请你吃麻辣烫。再后来,问你要环境的朋友越来越多,你就又搞了一个主机,同时配置了一个代理服务来解析不同的域名到某个主机的目录下。某天你在公交车上的时候就想了,我为什么不写一个应用让他们自己注册选择语言环境和域名呢?于是,你就开始了,花了两天时间终于搞定。用的人越来越多,你吃得也越来越胖。 10 | 11 | 这样一个应用,就是PaaS的原型。 12 | 13 | ---------- 14 | 当前的形势 15 | ---------- 16 | 随着国内社交APP微信的火爆,对Web服务器的需求日益增长;同样,开发者需要的环境也有所差异。面对这种差异,一个更加灵活的平台就出现了,国内比如SinaAPP,国外比如Google App Engine、Red Hat OpenShift、Amazon AWS。 17 | 18 | OK,不多说了,下面开始试验OpenShift的服务器搭建及上线。 19 | -------------------------------------------------------------------------------- /source/posts/appendix04.rst: -------------------------------------------------------------------------------- 1 | ================================ 2 | 附录四 Docker 使用及自建repo 3 | ================================ 4 | 5 | Docker已经越来越流行了(IaaS平台开始支持它,PaaS平台也开始支持它),不介绍它总感觉过不去。 6 | 7 | 它是基于LXC的容器类型虚拟化技术,从实现上说更类似于chroot,用户空间的信息被很好隔离的同时,又实现了网络相关的分离。它取代LXC的原因,我想是因为其REPO非常丰富,操作上类似git。 8 | 9 | 另外,它提供Windows/Mac OS X的客户端 boot2docker。 10 | 11 | 中文入门手册请参考 `Docker中文指南 `_ ,另外它有一个WebUI `shipyard `_ 。 12 | 13 | 官方repo `https://registry.hub.docker.com/ `_ 。 14 | 15 | --------- 16 | 镜像操作 17 | 
--------- 18 | 19 | 运行简单命令 20 | 21 | .. code:: 22 | 23 | docker run ubuntu /bin/echo "Hello world!" 24 | 25 | 运行交互shell 26 | 27 | .. code:: 28 | 29 | docker run -t -i ubuntu /bin/bash 30 | 31 | 运行Django程序 32 | 33 | .. code:: 34 | 35 | docker run -d -P training/webapp python app.py 36 | 37 | 获取container信息 38 | 39 | .. code:: 40 | 41 | docker ps 42 | 43 | 获取container内部信息 44 | 45 | .. code:: 46 | 47 | docker inspect -f '{{ .NetworkSettings.IPAddress }}' my_container 48 | 49 | 获取container日志 50 | 51 | .. code:: 52 | 53 | docker logs my_container 54 | 55 | commit/save/load 56 | 57 | .. note:: 保存 58 | 59 | 只有commit,对docker做的修改才会保存,形如docker run centos yum install -y nmap的修改不会保存。 60 | 61 | .. code:: 62 | 63 | docker images 64 | docker commit $image_id$ myimage 65 | docker save myimage > myimage.tar 66 | docker load < myimage.tar 67 | 68 | ------------- 69 | Registry操作 70 | ------------- 71 | 72 | 登录,默认为DockerHub 73 | 74 | .. code:: 75 | 76 | docker login 77 | 78 | 创建Registry 79 | 80 | 参考 https://www.digitalocean.com/community/tutorials/how-to-set-up-a-private-docker-registry-on-ubuntu-14-04 以及 http://blog.docker.com/2013/07/how-to-use-your-own-registry/ 。 81 | 82 | .. code:: 83 | 84 | # 获取docker-registry,从github或者直接 pip install docker-registry 85 | # git clone https://github.com/dotcloud/docker-registry.git 86 | # cd docker-registry 87 | # cp config_sample.yml config.yml 88 | # pip install -r requirements.txt 89 | # gunicorn --access-logfile - --log-level debug --debug \ 90 | -b 0.0.0.0:5000 -w 1 wsgi:application 91 | 92 | push/pull 93 | 94 | .. 
code:: 95 | 96 | # docker pull ubuntu 97 | # docker tag ubuntu localhost:5000/ubuntu 98 | # docker push localhost:5000/ubuntu 99 | -------------------------------------------------------------------------------- /source/posts/appendix05.rst: -------------------------------------------------------------------------------- 1 | ======================== 2 | 附录五 常用功能运维工具 3 | ======================== 4 | 5 | ----------------- 6 | Foreman 部署指导 7 | ----------------- 8 | 9 | ----------------- 10 | Katello 部署指导 11 | ----------------- 12 | 13 | ------------- 14 | 数据恢复工具 15 | ------------- 16 | 17 | extundelete 18 | 19 | ----------------------- 20 | 常用性能测量及优化工具 21 | ----------------------- 22 | 23 | - 优化 24 | 25 | .. image:: ../images/apx01-09.jpg 26 | 27 | - 监视 28 | 29 | .. image:: ../images/apx01-10.jpg 30 | 31 | - 测试 32 | 33 | .. image:: ../images/apx01-11.jpg 34 | 35 | all in one - pip install glances 36 | 37 | 另外针对qemu/libvirt相关的测试工具,可以参考 `virt-test `_ ,当然,仅作参考。 38 | 39 | ---------------- 40 | HAProxy 41 | ---------------- 42 | 43 | 没错,我就是要把这个东西单列出来讲,因为你可以用这个东西来做几乎全部应用的HA或者LoadBalancer, `这里是配置说明 ` 。 44 | 45 | 代理http: 46 | 47 | .. code:: 48 | 49 | ... 50 | 51 | backend webbackend 52 | balance roundrobin 53 | server web1 192.168.0.130:80 check 54 | 55 | frontend http 56 | bind *:80 57 | mode http 58 | default_backend webbackend 59 | 60 | listen stats :8080 61 | balance 62 | mode http 63 | stats enable 64 | stats auth me:password 65 | 66 | 代理tcp: 67 | 68 | .. 
code:: 69 | 70 | listen *:3306 71 | mode tcp 72 | option tcplog 73 | balance roundrobin 74 | server smtp 192.168.0.1:3306 check 75 | server smtp1 192.168.0.2:3306 check 76 | 77 | 78 | ------------ 79 | 常用运维工具 80 | ------------ 81 | 82 | Monit 83 | ----- 84 | 85 | 小型监控工具,不推荐使用。 86 | 87 | Munin 88 | ----- 89 | 90 | 轻量级的监控工具。 91 | 92 | Cacti 93 | ----- 94 | 95 | 与Zabbix在某些方面很像。 96 | 97 | Ganglia 98 | -------- 99 | 100 | 比较专业的监控工具,并有一款专门针对虚拟机的应用。 101 | http://blog.sflow.com/2012/01/using-ganglia-to-monitor-virtual.html 102 | 103 | zabbix 104 | ------- 105 | 106 | 类似Nagios,不过图形绘制很强,在一键脚本中提供安装。 107 | 108 | `移动客户端下载 `_ 。 109 | 110 | 关于zabbix的更多介绍可以参考 `itnihao的相关著作 `_ 。 111 | 112 | nagios 113 | ------- 114 | 115 | 使用UI Plugin可以将在oVirt管理界面中查看Nagios监控状态,可参考 `oVirt_Monitoring_UI_Plugin `_ 以及 `Nagios_Intergration `_ 。 116 | 117 | foreman 118 | -------- 119 | 120 | 使用Foreman的主要目的是更方便地部署宿主机以及创建虚拟机。 121 | 122 | 参考 `ForemanIntegration `_ 、 `foreman_ovirt `_ 以及UIPlugin相关内容。 123 | 124 | chef 125 | ---- 126 | 127 | 简单理解为一些列安装脚本(cookbook)。 128 | 129 | 访问 `http://gettingstartedwithchef.com/ `_ 开始快速上手学习。 130 | 131 | `获取更多cookbook `_ 。 132 | 133 | puppet 134 | ------ 135 | 136 | 功能上与chef类似,但是影响力更大。 137 | 138 | `下载虚拟机 `_ 并按照里面的教程来快速上手。 139 | -------------------------------------------------------------------------------- /source/posts/appendix06.rst: -------------------------------------------------------------------------------- 1 | ================================ 2 | 附录六 文档参考资源以及建议书单 3 | ================================ 4 | 5 | ------------ 6 | Server World 7 | ------------ 8 | 9 | 一个非常好的网站,含有很多服务的很具体的搭建过程,在工程实施中参考意义比较大。 10 | 11 | ---- 12 | OVF 13 | ---- 14 | 15 | 文中的全部资源,可以访问通过百度网盘进行下载。 16 | 17 | 链接: http://pan.baidu.com/s/1he23k 密码: bmfn 18 | 19 | --------- 20 | Qemu Doc 21 | --------- 22 | 23 | qemu是KVM虚拟机的基础套件,建议通读其 `手册 `_ 中所有内容以了结其特性。 24 | 25 | ------------ 26 | 快速安装脚本 27 | ------------ 28 | 29 | https://github.com/lofyer/onekey-deploy.git 30 | 31 | 目前包含: 32 | 33 | - Gitlab 34 | 
35 | - Zabbix 36 | 37 | - oVirt 38 | 39 | - Jarvis 40 | 41 | --------- 42 | 常用爬虫 43 | --------- 44 | 45 | https://github.com/lofyer/myspiders.git 46 | 47 | ---------------------- 48 | Django based WebAdmin 49 | ---------------------- 50 | 51 | https://github.com/lofyer/webadmin.git 52 | 53 | ---- 54 | 书单 55 | ---- 56 | 57 | 虽然现在的移动设备很适合阅读,但我还是推荐多看些实体书,尤其是一些大部头。 58 | 59 | 当然,下面的书目我会尽量提供适合移动设备阅读的版本(PDF、MOBI、EPUB、TXT)。 60 | 61 | TCP/IP Vol. 1/2/3 62 | 63 | Machine Learning in Action 64 | 65 | Elements of Information Theory 66 | 67 | The Design of UNIX Operating System 68 | 69 | Understanding the Linux Kernel 70 | 71 | The Art of Computer Programming Vol. 1/2/3/4 72 | 73 | Linux内核完全注释 74 | 75 | 浪潮之巅 76 | 77 | 数学之美 78 | 79 | UNIX环境高级编程 80 | 81 | 存储技术原理分析 82 | 83 | Hadoop权威指南 84 | 85 | Weka应用技术与实践 86 | 87 | Python机器学习实践 88 | 89 | Model Thinking 90 | 91 | Practice Lisp 92 | -------------------------------------------------------------------------------- /source/posts/appendix07.rst: -------------------------------------------------------------------------------- 1 | ======================== 2 | 待整理扩展内容 3 | ======================== 4 | 5 | 可信计算组(TCG) 6 | 7 | NUMAZ 8 | 9 | Qemu - QMP 通信,交换机二层直连 10 | -------------------------------------------------------------------------------- /source/posts/ch01.rst: -------------------------------------------------------------------------------- 1 | =================== 2 | 第一章 随便说些什么 3 | =================== 4 | 5 | 本人书写过程中也在接受新的思考方式方面的训练,所以从开始到结尾难免有些易察觉到的写作方式变化。 6 | 7 | 我写这个的尽量多解释概念、可重现的操作以及学习指导,以及提供发现解决问题的思路。其中大部分操作细节我尽量使用现有公司的产品(比如Mirantis、Cloudera),而不是从源码部署,因为版本升级带来的配置细节问题交给他们处理会更通用些。同时我现在更倾向于使用现实的模型来解释或者构建一个原理或者是架构,从而方便记忆和扩展。 8 | 9 | 1.1 我所看到的 10 | ------------------- 11 | 12 | 从07年那会儿,甚至更早,拥有千万用户(包括盗版受害者在内)的行业先锋VMware,又有Google数据中心以及Amazon各种在线服务,这些实打实的东西遵循计算能力的摩尔定律,再顺应日益增长的商业需求,就有了“云”和“大数据”这两个让许多企业再次躁动的概念。 13 | 14 | 不管大家都怎么看,先引用一句关于“大数据”的经典,虽然与云计算不太相干: 15 | 16 | *“Big Data is like teenage sex: everyone talks about it, nobody really 
knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”* 17 | 18 | 我始终认为认知很大一部分都来自于实践,即先有实践,才来理论,同时理论可以被提高被改变,然后再用于实践。 19 | 20 | 那就这样,**但行好事,莫问前程**。 21 | 22 | 作为公司的一名小技术人员,从客户交流到现场部署、从SDN交换机到Neutron、从QEMU到IAAS,我都有过了解或接触,并将其记录成文,是总结,亦是探索。期间难免会有这样或者那样的想法,而将这些想法实践出来的过程是会让人觉得生活依旧是美好的。 23 | 24 | 接下来是以我目前水平能看到的: 25 | 26 | - 和多数行业一样,不管哪个公司,他们技术和关系,在不同客户那里都有不同的权重,这点的南北差异尤为明显; 27 | 28 | - 以我所遇到的客户来看,部分客户不会明确的说使用“云”,他们的需求在多数情况下可以有这样或者那样的传统方案替代;明确要求使用“云”的客户,总会要求周边指标(用户接入、终端协议、USB重定向、视频流占用带宽、UPS、存储复用等)以及各种各样的定制,这是国内许多公有私有混合云厂商的一个痛处; 29 | 30 | - **云计算的本意不是虚拟机,也不是管理方便的集群,国内好多公司搞的云计算都是在卖传统计算机集群所能提供的服务,要有公司来建分水岭。** 31 | 32 | - 尽量不要从技术角度出发去给客户提供技术,而是去关心客户现在面临的问题,我们再提供技术。(这句说的不好,意思还差点) 33 | 34 | 所以,在需求上的灵活变通,也是有些许重要性的,**不要把技术人员的固执带到客户那**。 35 | 36 | 1.2 今天天气怎样 37 | ---------------- 38 | 39 | 7:50 手机闹铃响了,随之而来的是你订阅的RSS、Flipboard推送的新闻。随手翻阅下,完后Siri说今天天气挺暖和的,可以穿你最喜欢的高领毛衣,起床。 40 | 41 | 8:30 高速公路上,地图告诉你前方拥堵,正好这时领导打来电话,你告诉Siri说“接听”。挂了电话以后,红灯亮了,汽车记录下你这次堵车的时间,并默默再次计算预计到时。 42 | 43 | 9:20 到达公司门口,没有保安,取而代之的是连接公司LDAP服务器的瞳孔识别。 44 | 45 | 10:00 收到信息,来自楼下快件派发柜,你订购的球鞋到了。 46 | 47 | 10:30 客户预约的拜访,你告诉Siri说把公司地址导航路线发送给客户张三。 48 | 49 | 11:10 通过公司IM告诉前台临时允许车牌号为K9031的车辆进来并发送车位信息给客户。 50 | 51 | 11:20 和客户在临时会客厅谈话,期间你把演说内容通过手势(食指一划)推送到客户邮箱,他们感到很意外(加分点)。 52 | 53 | 54 | 12:30 午餐时间,Siri还记得你在决心减肥期间,提醒你不要吃最喜欢的红烧肉,末了来一句“两个人被老虎追,谁最危险?Bingo,胖的那个”。 55 | 56 | 14:30 有客户如约送来了10T的财务数据,你通知技术部小李对这些数据进行方案II处理,百分之30的处理任务交给武汉机房的机器,因为这些从机柜到主板都是你依照节能环保的原则主持设计的。 57 | 58 | 16:00 Siri告诉你,抱歉,六点钟有局部降雨。 59 | 60 | 17:00 你把今天的备忘录存进公司派发给你的虚拟桌面,下班。 61 | 62 | 19:00 老婆大人的晚饭做好了,你也再次连上虚拟桌面把备忘录整理归档。 63 | 64 | 20:30 跑步结束,它告诉你这几周换季期间,要增加运动量,你说“可以”。 65 | 66 | 23:00 休息,Siri通过文字悄悄告诉你,明天结婚纪念日,记住。 67 | 68 | **所有已知未知的暗流,让人有了继续折腾下去的欲望。** 69 | 70 | 我是一个着重实践的人,是的,比如磁盘不拆开的话我可能就对扇区、磁道、柱面没什么深刻概念(其实拆了也没多深刻)。所以,这一些列文章将尽量从操作或者现象中总结规律。一些原理性的东西会尽量用我觉得容易理解的形式表现出来。 71 | 72 | .. 
image:: ../images/01-01.jpg 73 | :height: 373 74 | :width: 300 75 | :align: center 76 | 77 | 整本“书”的结构将会是这个样子: 78 | 79 | 介绍下主流“云”的现状,会引用许多现成(尽量)客观的信息片段;畅想一下;从中选择部分组成一个完整系统,进行搭建(或模拟)并本地调优(避免过早优化);避免烂尾,结尾送两首短诗吧。 80 | 81 | 1.3 根据需求来架构 82 | -------------------------- 83 | 84 | 所有我们所需要的东西,至此有了一个整体的印象。接下来,我列举几个关键词,从中挑出一部分来组建我们“云”。 85 | 86 | - 存储:将要构建的基础设施的基础,所以一定要对其可用性及速度有所保证。 87 | 88 | - 虚拟化:通过软件模拟芯片以及处理器的运行结构,作为八九十年代生人(忽然觉得时间好快),多数人最熟悉的应该是红白机模拟器了吧。 89 | 90 | - 集群:这个概念里有并行还有分布,两者比较明显的区别是是否在某一时间段内有共同的计算目的。 91 | 92 | - 授权服务:现代数据中心的多数应用,都需要统一的用户数据,如何安全地使用统一的用户数据,也是我们需要考虑的。 93 | 94 | - 安全与可信:安全的问题,**很重要**。不要等到哪天信用卡账户被盗刷才发现安全的重要性;可信,即是服务提供者与接受者具有可信的第三方提供支持。 95 | 96 | - 可扩展:这里有两个层面,一个基础设施自身计算与存储能力扩增,另一个是与其他云计算平台的模块级别兼容。 97 | 98 | - 外围设备:对于特殊的应用(环境监视、认证、GPU依赖应用等),我们需要一些外围设备来辅助完成(usb设备、显卡等)。对于重要设备或者非常依赖总线带宽的设备(令牌、显卡等)的安装,推荐将其与实施应用的物理机直接绑定,对于其他设备(U盘、摄像头等)推荐使用某一既定物理机进行基于TCP/IP的透传。 99 | 100 | - 电源管理:由于集群中存在中央管理,所以有必要使用栅栏(fencing)去关闭与管理失去联系的机器,防止其自建中央管理。 101 | 102 | - orchestration:即预配置,不管是虚拟机还是应用程序,我们的目的之一就是达到服务的快速响应。 103 | 104 | 如此划分的意图是什么呢?**基础设施,在保证安全可靠的前提下,对本地资源实现最大化利用,也是我们追求的指标之一。还要注意,确保虚拟机状态监控无有遗漏,宿主机部署保证安全,否则,后期很痛苦,因为你永远无法控制用户使用你的环境做什么。** 105 | 106 | 接下来,看看即将部署的各个层之间的关系: 107 | 108 | .. 
image:: ../images/01-02.png 109 | :align: center 110 | 111 | 如你所见,存储与计算是在同一节点上,所有管理服务以虚拟机形态运行,统统高可用。但是,这个架构一定存在一个弊端吧?没错,从整体服务的角度来看,虚拟机作主体的架构中存在一定程度的管理上的不便。传统集群只要关心物理设施及与其绑定的应用即可,它们在某个区的几号柜的那一层;而虚拟机们则可能有些“任性”,没有绑定的情况下会在集群中的某台宿主机中进行迁移。所以,我们需要一个完备的集群管理及报告系统,也会需要一个DataWare House来统计用户行为。 112 | 113 | -------------------------------------------------------------------------------- /source/posts/ch02.rst: -------------------------------------------------------------------------------- 1 | ========================== 2 | 第二章 一个可靠的存储后端 3 | ========================== 4 | 5 | ------------------- 6 | 2.1 谈谈分布式存储 7 | ------------------- 8 | 9 | 计算机领域中有诸多有意思的东西可以把玩,在这儿且看看分布式存储。 10 | 11 | **集群文件系统** 12 | 13 | 在某些场景下又可以称作网络文件系统、并行文件系统,在70年代由IBM提出并实现原型。 14 | 15 | 有几种方法可以实现集群形式,但多数仅仅是节点直连存储而不是将存储之上的文件系统进行合理“分布”。分布式文件系统同时挂载于多个服务器上,并在它们之间共享,可以提供类似于位置无关的数据定位或冗余等特点。并行文件系统是一种集群式的文件系统,它将数据分布于多个节点,其主要目的是提供冗余和提高读写性能。 16 | 17 | **共享磁盘(Shared-disk)/Storage-area network(SAN)** 18 | 19 | 从应用程序使用的文件级别,到SAN之间的块级别的操作,诸如权限控制和传输,都是发生在客户端节点上。共享磁盘(Shared-disk)文件系统,在并行控制上做了很多工作,以至于其拥有比较一致连贯的文件系统视图,从而避免了多个客户端试图同时访问同一设备时数据断片或丢失的情况发生。其中有种技术叫做围栏(Fencing),就是在某个或某些节点发生断片时,集群自动将这些节点隔离(关机、断网、自恢复),保证其他节点数据访问的正确性。元数据(Metadata)类似目录,可以让所有的机器都能查找使用所有信息,在不同的架构中有不同的保存方式,有的均匀分布于集群,有的存储在中央节点。 20 | 21 | 实现的方式有iSCSI,AoE,FC,Infiniband等,比较著名的产品有Redhat GFS、Sun QFS、Vmware VMFS等。 22 | 23 | **分布式文件系统** 24 | 25 | 分布式文件系统则不是块级别的共享的形式了,所有加进来的存储(文件系统)都是整个文件系统的一部分,所有数据的传输也是依靠网络来的。 26 | 27 | 它的设计有这么几个原则: 28 | 29 | - *访问透明* 客户端在其上的文件操作与本地文件系统无异 30 | 31 | - *位置透明* 其上的文件不代表其存储位置,只要给了全名就能访问 32 | 33 | - *并发透明* 所有客户端持有的文件系统的状态在任何时候都是一致的,不会出现A修改了F文件,但是B愣了半天才发现。 34 | 35 | - *失败透明* 理解为阻塞操作,不成功不回头。 36 | 37 | - *异构性* 文件系统可以在多种硬件以及操作系统下部署使用。 38 | 39 | - *扩展性* 随时添加进新的节点,无视其资格新旧。 40 | 41 | - *冗余透明* 客户端不需要了解文件存在于多个节点上这一事实。 42 | 43 | - *迁移透明* 客户端不需要了解文件根据负载均衡策略迁移的状况。 44 | 45 | 实现的方式有NFS、CIFS、SMB、NCP等,比较著名的产品有Google GFS、Hadoop HDFS、GlusterFS、Lustre等。 46 | 47 | .. 
epigraph:: 48 | 49 | FUSE,filesystem in user space。 50 | 51 | FUSE全名Filesystem in Userspace,是在类UNIX系统下的一个机制,可以让普通用户创建修改访问文件系统。功能就是连接内核接口与用户控件程序的一座“桥”,目前普遍存在于多个操作系统中,比如Linux、BSD、Solaris、OSX、Android等。 52 | 53 | FUSE来源于AVFS,不同于传统文件系统从磁盘读写数据,FUSE在文件系统或磁盘中有“转换”的角色,本身并不会存储数据。 54 | 55 | 在Linux系统中的实现有很多,比如各种要挂载ntfs文件系统使用到的ntfs-3g,以及即将要用到的glusterfs-fuse。 56 | 57 | .. image:: ../images/02-01.png 58 | :align: center 59 | 60 | --------------------- 61 | 2.2 Glusterfs简述 62 | --------------------- 63 | 64 | 接下来,说一下我所看到的glusterfs。 65 | 66 | 首先它可以基于以太网或者Infiniband构建大规模分布式文件系统,其设计原则符合奥卡姆剃刀原则,即“ *若无必要,勿增实体* ”;它的源码部分遵循GPLv3,另一部分遵循GPLv2/LGPLv3;统一对象视图,与UNIX设计哲学类似,所有皆对象;跨平台兼容性高,可作为hadoop、openstack、ovirt、Amazon EC的后端。 67 | 68 | .. image:: ../images/02-02.png 69 | :align: center 70 | 71 | .. note:: 72 | 73 | **砖块(brick)**:即服务器节点上导出的一个目录,作为glusterfs的最基本单元。 74 | 75 | **卷(volume)**:用户最终使用的、由砖块组成的逻辑卷。 76 | 77 | **GFID**:glusterfs中的每一个文件或者目录都有一个独立的128位GFID,与普通文件系统中的inode类似。 78 | 79 | **节点(peer)**:即集群中含有砖块并参与构建卷的计算机。 80 | 81 | 功能介绍 82 | --------- 83 | 84 | 具体功能特性请参考 `Glusterfs features `_ 。 85 | 86 | 组合方式 87 | --------- 88 | 89 | **gluster支持四种存储逻辑卷组合:普通分布式(Distributed)、条带(Striped)、冗余(Replicated)、条带冗余(Striped-Replicated)** 90 | 91 | +-----------+-------------------------------+ 92 | |普通分布式 |.. image:: ../images/02-04.png | 93 | | | :align: center | 94 | +-----------+-------------------------------+ 95 | |条带 |.. image:: ../images/02-05.png | 96 | | | :align: center | 97 | +-----------+-------------------------------+ 98 | | 冗余 |.. image:: ../images/02-06.png | 99 | | | :align: center | 100 | +-----------+-------------------------------+ 101 | |条带冗余 |.. 
image:: ../images/02-07.png | 102 | | | :align: center | 103 | +-----------+-------------------------------+ 104 | 105 | Translator 106 | ---------- 107 | 108 | Translator是glusterfs设计时的核心之一,它具有以下功能: 109 | 110 | - 将用户发来的请求转化为对存储的请求,可以是一对一、一对多或者一对零(cache)。 111 | 112 | - 可以修改请求类型、路径、标志,甚至是数据(加密)。 113 | 114 | - 拦截请求(访问控制)。 115 | 116 | - 生成新请求(预取)。 117 | 118 | **类型** 119 | 120 | 按照功能,可以将translator分为如下类型: 121 | 122 | +-----------------+-----------------------------------------+ 123 | |Translator 类型 |功能 | 124 | +=================+=========================================+ 125 | |Storage |访问本地文件系统。 | 126 | +-----------------+-----------------------------------------+ 127 | |Debug |提供调试信息。 | 128 | +-----------------+-----------------------------------------+ 129 | |Cluster |处理集群环境下读写请求。 | 130 | +-----------------+-----------------------------------------+ 131 | |Encryption |加密/解密传送中的数据。 | 132 | +-----------------+-----------------------------------------+ 133 | |Protocol |客户端/服务端网络通信。 | 134 | +-----------------+-----------------------------------------+ 135 | |Performance |IO参数调节。 | 136 | +-----------------+-----------------------------------------+ 137 | |Bindings |增加可扩展性,比如python接口。 | 138 | +-----------------+-----------------------------------------+ 139 | |System |负责系统访问,比如文件系统控制接口访问。 | 140 | +-----------------+-----------------------------------------+ 141 | |Scheduler |调度集群环境下文件访问请求。 | 142 | +-----------------+-----------------------------------------+ 143 | |Features |提供额外文件特性,比如quota,锁机制等。 | 144 | +-----------------+-----------------------------------------+ 145 | 146 | AFR 147 | --- 148 | 149 | AFR(Automatic File Replication)是translator的一种,它使用额外机制去控制跟踪文件操作,用于跨砖块复制数据。 150 | 151 | 支持跨网备份 152 | 153 | +-----------+-------------------------------+ 154 | |局域网备份 |.. image:: ../images/02-08.png | 155 | | | :align: center | 156 | +-----------+-------------------------------+ 157 | |内网备份 |.. 
image:: ../images/02-09.png | 158 | | | :align: center | 159 | +-----------+-------------------------------+ 160 | |广域网备份 |.. image:: ../images/02-10.png | 161 | | | :align: center | 162 | +-----------+-------------------------------+ 163 | 164 | 其中,它有以下特点: 165 | 166 | - 保持数据一致性 167 | 168 | - 发生脑裂时自动恢复,应保证至少一个节点有正确数据 169 | 170 | - 为读系列操作提供最新数据结构 171 | 172 | DHT 173 | --- 174 | 175 | DHT(Distributed Hash Table)是glusterfs的真正核心。它决定将每个文件放置至砖块的位置。不同于多副本或者条带模式,它的功能是路由,而不是分割或者拷贝。 176 | 177 | **工作方式** 178 | 179 | 分布式哈希表的核心是一致性哈希算法,又名环形哈希。它具有的一个性质是当一个存储空间被加入或者删除时,现有得映射关系的改变尽可能小。 180 | 181 | 假设我们的哈希算出一个32位的哈希值,即一个[0,2^32-1]的空间,现将它首尾相接,即构成一个环形。 182 | 183 | 假如我们有四个存储砖块,每一个砖块B都有一个哈希值H,假设四个文件及其哈希值表示为(k,v),那么他们在哈希环上即如此表示: 184 | 185 | .. image:: ../images/02-14.png 186 | :align: center 187 | 188 | 每一个文件哈希k顺时针移动遇到一个H后,就将文件k保存至B。 189 | 190 | .. image:: ../images/02-15.png 191 | :align: center 192 | 193 | 上图表示的是理想环境下文件与砖块的存储映射,当有砖块失效时,存储位置的映射也就发生了改变。比如砖块B3失效,那么文件v3会被继续顺时针改变至B4上。 194 | 195 | .. image:: ../images/02-16.png 196 | :align: center 197 | 198 | 当砖块数目发生改变时,为了服务器能平摊负载,我们需要一次rebalance来稍许改变映射关系。rebalance的技巧即是创建一个虚拟的存储位置B',使所有砖块及其虚拟砖块尽量都存储有文件。 199 | 200 | .. image:: ../images/02-17.png 201 | :align: center 202 | 203 | .. image:: ../images/02-18.png 204 | :align: center 205 | 206 | ------------------------------ 207 | 2.3 搭建Glusterfs作为基础存储 208 | ------------------------------ 209 | 210 | 既然要搭建一个稳健的基础存储,那么glusterfs推荐使用distributed striped replicated方式,这里使用4台预装CentOS 6(SELINUX设置为permissive)的机器进行演示。 211 | 212 | 添加DNS或者修改hosts文件 213 | ------------------------------ 214 | 215 | 鉴于笔者所在环境中暂时没有配置独立的DNS,此处先修改hosts文件以完成配置,注意每台机器都要添加: 216 | 217 | */etc/hosts* 218 | 219 | .. 
--------------------------------------------
2.3 Building GlusterFS as the Base Storage
--------------------------------------------

For a robust base storage, the distributed striped replicated volume type is recommended. The demonstration below uses four machines preinstalled with CentOS 6 (with SELinux set to permissive).

Add DNS Records or Edit the hosts File
----------------------------------------

Since the author's environment has no dedicated DNS server for now, we edit the hosts file instead; note that this must be done on every machine:

*/etc/hosts*

.. code::

    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6

    192.168.10.101 gs1.example.com
    192.168.10.102 gs2.example.com
    192.168.10.103 gs3.example.com
    192.168.10.104 gs4.example.com

Likewise, add the repository on all machines:

*/etc/yum.repos.d/gluster_epel.repo*

.. code::

    [epel]
    name=Extra Packages for Enterprise Linux 6 - $basearch
    #baseurl=http://download.fedoraproject.org/pub/epel/6/$basearch
    mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch
    failovermethod=priority
    enabled=1
    gpgcheck=0
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6

    [glusterfs-epel]
    name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
    baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/
    enabled=1
    skip_if_unavailable=1
    gpgcheck=0
    gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

    [glusterfs-noarch-epel]
    name=GlusterFS is a clustered file-system capable of scaling to several petabytes.
    baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/noarch
    enabled=1
    skip_if_unavailable=1
    gpgcheck=0
    gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

    [glusterfs-source-epel]
    name=GlusterFS is a clustered file-system capable of scaling to several petabytes. - Source
    baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/SRPMS
    enabled=0
    skip_if_unavailable=1
    gpgcheck=1
    gpgkey=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/pub.key

Prepare Disks as Bricks
------------------------

Install the GlusterFS packages and the XFS userspace tools on every node, and enable the services:
.. code::

    # yum install -y glusterfs glusterfs-fuse glusterfs-server xfsprogs
    # /etc/init.d/glusterd start
    # /etc/init.d/glusterfsd start
    # chkconfig glusterfsd on
    # chkconfig glusterd on

Assume each machine has two 1 TB SATA disks in addition to the system disk. We partition them, then format and mount them (the two blank lines in the here-document accept the default first and last sectors):

.. code::

    # fdisk /dev/sdX << EOF
    n
    p
    1


    w
    EOF

Format and mount:

.. code::

    # mkfs.xfs -i size=512 /dev/sdb1
    # mkfs.xfs -i size=512 /dev/sdc1
    # mkdir /gluster_brick_root1
    # mkdir /gluster_brick_root2
    # echo -e "/dev/sdb1\t/gluster_brick_root1\txfs\tdefaults\t0 0\n/dev/sdc1\t/gluster_brick_root2\txfs\tdefaults\t0 0" >> /etc/fstab
    # mount -a
    # mkdir /gluster_brick_root1/data
    # mkdir /gluster_brick_root2/data

.. note:: Why XFS?

   XFS has metadata journaling, allowing fast recovery after a crash; it also supports online growing and defragmentation. Other filesystems such as ext3 and ext4 have not been as thoroughly tested with GlusterFS.

Add the Volume
---------------

On any one of the machines, for example gs2.example.com, run:

.. code::

    # gluster peer probe gs1.example.com
    # gluster peer probe gs3.example.com
    # gluster peer probe gs4.example.com

Then build the volume from the bricks:
.. code::

    # gluster
    > volume create gluster-vol1 stripe 2 replica 2 \
      gs1.example.com:/gluster_brick_root1/data gs2.example.com:/gluster_brick_root1/data \
      gs1.example.com:/gluster_brick_root2/data gs2.example.com:/gluster_brick_root2/data \
      gs3.example.com:/gluster_brick_root1/data gs4.example.com:/gluster_brick_root1/data \
      gs3.example.com:/gluster_brick_root2/data gs4.example.com:/gluster_brick_root2/data force
    > volume start gluster-vol1   # start the volume
    > volume status gluster-vol1  # check the volume status
    Status of volume: gluster-vol1
    Gluster process                                     Port    Online  Pid
    ------------------------------------------------------------------------------
    Brick gs1.example.com:/gluster_brick_root1/data     49152   Y       1984
    Brick gs2.example.com:/gluster_brick_root1/data     49152   Y       1972
    Brick gs1.example.com:/gluster_brick_root2/data     49153   Y       1995
    Brick gs2.example.com:/gluster_brick_root2/data     49153   Y       1983
    Brick gs3.example.com:/gluster_brick_root1/data     49152   Y       1961
    Brick gs4.example.com:/gluster_brick_root1/data     49152   Y       1975
    Brick gs3.example.com:/gluster_brick_root2/data     49153   Y       1972
    Brick gs4.example.com:/gluster_brick_root2/data     49153   Y       1986
    NFS Server on localhost                             2049    Y       1999
    Self-heal Daemon on localhost                       N/A     Y       2006
    NFS Server on gs2.example.com                       2049    Y       2007
    Self-heal Daemon on gs2.example.com                 N/A     Y       2014
    NFS Server on gs4.example.com                       2049    Y       1995
    Self-heal Daemon on gs4.example.com                 N/A     Y       2002
    NFS Server on gs3.example.com                       2049    Y       1986
    Self-heal Daemon on gs3.example.com                 N/A     Y       1993

    Task Status of Volume gluster-vol1
    ------------------------------------------------------------------------------
    There are no active volume tasks
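The order in which bricks are listed matters: consecutive bricks form replica sets, consecutive replica sets form stripes, and the resulting stripes are distributed. A small Python sketch of this grouping rule (an illustration only, not gluster's own code; the short host names are stand-ins for the full brick paths above):

```python
def group_bricks(bricks, stripe, replica):
    """Group an ordered brick list the way `volume create stripe S replica R`
    does: every `replica` consecutive bricks form a replica set, every
    `stripe` consecutive replica sets form a stripe, and the stripes are
    then distributed."""
    replica_sets = [bricks[i:i + replica] for i in range(0, len(bricks), replica)]
    return [replica_sets[i:i + stripe] for i in range(0, len(replica_sets), stripe)]

bricks = [
    "gs1:/gluster_brick_root1/data", "gs2:/gluster_brick_root1/data",
    "gs1:/gluster_brick_root2/data", "gs2:/gluster_brick_root2/data",
    "gs3:/gluster_brick_root1/data", "gs4:/gluster_brick_root1/data",
    "gs3:/gluster_brick_root2/data", "gs4:/gluster_brick_root2/data",
]
# 2 distributed stripes x 2 stripe members x 2 replicas = 8 bricks
stripes = group_bricks(bricks, stripe=2, replica=2)
```

With this ordering, gs1 and gs2 mirror each other (and likewise gs3 and gs4), which is why the brick listings later in this section show the same chunks on both machines of a pair.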
.. code::

    > volume info all             # show info for all volumes

    Volume Name: gluster-vol1
    Type: Distributed-Striped-Replicate
    Volume ID: bc8e102c-2b35-4748-ab71-7cf96ce083f3
    Status: Started
    Number of Bricks: 2 x 2 x 2 = 8
    Transport-type: tcp
    Bricks:
    Brick1: gs1.example.com:/gluster_brick_root1/data
    Brick2: gs2.example.com:/gluster_brick_root1/data
    Brick3: gs1.example.com:/gluster_brick_root2/data
    Brick4: gs2.example.com:/gluster_brick_root2/data
    Brick5: gs3.example.com:/gluster_brick_root1/data
    Brick6: gs4.example.com:/gluster_brick_root1/data
    Brick7: gs3.example.com:/gluster_brick_root2/data
    Brick8: gs4.example.com:/gluster_brick_root2/data

Mount the Volume
-----------------

When mounting with the glusterfs client, the client's hosts file needs an entry resolving any one of the nodes:

*/etc/hosts on the client mounting glusterfs*

.. code::

    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6

    192.168.1.81 gs1.example.com

Install glusterfs-fuse, mount the gluster volume as glusterfs, then write 1 MB files and observe how they are distributed across the bricks:

.. code::

    # yum install glusterfs glusterfs-fuse
    # mount.glusterfs 192.168.1.81:/gluster-vol1 /mnt
    # cd /mnt
    # dd if=/dev/zero of=a.img bs=1k count=1k
    # cp a.img b.img; cp a.img c.img; cp a.img d.img

Check on each of the four servers:

.. code::

    [root@gs1 ~]# ls -lh /gluster_brick_root*
    /gluster_brick_root1/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
    /gluster_brick_root2/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 d.img

.. code::

    [root@gs2 ~]# ls -lh /gluster_brick_root*
    /gluster_brick_root1/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
    /gluster_brick_root2/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 a.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 d.img
.. code::

    [root@gs3 ~]# ls -lh /gluster_brick_root*
    /gluster_brick_root1/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 c.img
    /gluster_brick_root2/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 c.img

.. code::

    [root@gs4 ~]# ls -lh /gluster_brick_root*
    /gluster_brick_root1/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 c.img
    /gluster_brick_root2/data/:
    total 1.0M
    -rw-r--r--. 2 root root 512K Apr 22 17:13 b.img
    -rw-r--r--. 2 root root 512K Apr 22 17:13 c.img

At this point, the whole setup is complete.

----------------------------------------
2.4 GlusterFS Usage Examples and Tips
----------------------------------------

Tuning Parameters
------------------
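These options are set per volume from the gluster CLI with ``gluster volume set``. A quick sketch of the workflow, reusing the ``gluster-vol1`` volume from the previous section (the option values here are only illustrative):

```
# gluster volume set gluster-vol1 auth.allow 192.168.10.*
# gluster volume set gluster-vol1 network.ping-timeout 42
# gluster volume info gluster-vol1   # changed options appear under "Options Reconfigured"
# gluster volume reset gluster-vol1 network.ping-timeout
```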
+======================================+===============================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================+============================+=====================================================================================+ 461 | | auth.allow | IP addresses of the clients which should be allowed to access the volume. | * (allow all) | Valid IP address which includes wild card patterns including *, such as 192.168.1.* | 462 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 463 | | auth.reject | IP addresses of the clients which should be denied to access the volume. 
| NONE (reject none) | Valid IP address which includes wild card patterns including *, such as 192.168.2.* | 464 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 465 | | client.grace-timeout | Specifies the duration for the lock state to be maintained on the client after a network disconnection. 
| 10 | 10 - 1800 secs | 466 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 467 | | cluster.self-heal-window-size | Specifies the maximum number of blocks per file on which self-heal would happen simultaneously. | 16 | 0 - 1025 blocks | 468 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 469 | | cluster.data-self-heal-algorithm | Specifies the type of self-heal. If you set the option as "full", the entire file is copied from source to destinations. 
If the option is set to "diff" the file blocks that are not in sync are copied to destinations. Reset uses a heuristic model. If the file does not exist on one of the subvolumes, or a zero-byte file exists (created by entry self-heal) the entire content has to be copied anyway, so there is no benefit from using the "diff" algorithm. If the file size is about the same as page size, the entire file can be read and written with a few operations, which will be faster than "diff" which has to read checksums and then read and write. | reset | full/diff/reset | 470 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 471 | | cluster.min-free-disk | Specifies the percentage of disk space that must be kept free. 
Might be useful for non-uniform bricks | 10% | Percentage of required minimum free disk space | 472 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 473 | | cluster.stripe-block-size | Specifies the size of the stripe unit that will be read from or written to. 
| 128 KB (for all files) | size in bytes | 474 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 475 | | cluster.self-heal-daemon | Allows you to turn-off proactive self-heal on replicated | On | On/Off | 476 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 477 | | cluster.ensure-durability | This option makes sure the data/metadata is durable across abrupt shutdown of the brick. 
| On | On/Off | 478 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 479 | | diagnostics.brick-log-level | Changes the log-level of the bricks. | INFO | DEBUG/WARNING/ERROR/CRITICAL/NONE/TRACE | 480 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 481 | | diagnostics.client-log-level | Changes the log-level of the clients. 
| INFO | DEBUG/WARNING/ERROR/CRITICAL/NONE/TRACE | 482 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 483 | | diagnostics.latency-measurement | Statistics related to the latency of each operation would be tracked. | Off | On/Off | 484 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 485 | | diagnostics.dump-fd-stats | Statistics related to file-operations would be tracked. 
| Off | On | 486 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 487 | | features.read-only | Enables you to mount the entire volume as read-only for all the clients (including NFS clients) accessing it. | Off | On/Off | 488 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 489 | | features.lock-heal | Enables self-healing of locks when the network disconnects. 
| On | On/Off | 490 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 491 | | features.quota-timeout | For performance reasons, quota caches the directory sizes on client. You can set timeout indicating the maximum duration of directory sizes in cache, from the time they are populated, during which they are considered valid | 0 | 0 - 3600 secs | 492 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 493 | | geo-replication.indexing | Use this option to automatically 
sync the changes in the filesystem from Master to Slave. | Off | On/Off | 494 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 495 | | network.frame-timeout | The time frame after which the operation has to be declared as dead, if the server does not respond for a particular operation. 
| 1800 (30 mins) | 1800 secs | 496 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 497 | | network.ping-timeout | The time duration for which the client waits to check if the server is responsive. When a ping timeout happens, there is a network disconnect between the client and server. All resources held by server on behalf of the client get cleaned up. When a reconnection happens, all resources will need to be re-acquired before the client can resume its operations on the server. Additionally, the locks will be acquired and the lock tables updated. This reconnect is a very expensive operation and should be avoided. 
| 42 Secs | 42 Secs | 498 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 499 | | nfs.enable-ino32 | For 32-bit nfs clients or applications that do not support 64-bit inode numbers or large files, use this option from the CLI to make Gluster NFS return 32-bit inode numbers instead of 64-bit inode numbers. | Off | On/Off | 500 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 501 | | nfs.volume-access | Set the access type for the specified sub-volume. 
| read-write | read-write/read-only | 502 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 503 | | nfs.trusted-write | If there is an UNSTABLE write from the client, STABLE flag will be returned to force the client to not send a COMMIT request. In some environments, combined with a replicated GlusterFS setup, this option can improve write performance. This flag allows users to trust Gluster replication logic to sync data to the disks and recover when required. COMMIT requests if received will be handled in a default manner by fsyncing. STABLE writes are still handled in a sync manner. 
| Off | On/Off | 504 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 505 | | nfs.trusted-sync | All writes and COMMIT requests are treated as async. This implies that no write requests are guaranteed to be on server disks when the write reply is received at the NFS client. Trusted sync includes trusted-write behavior. 
| Off | On/Off | 506 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 507 | | nfs.export-dir | This option can be used to export specified comma-separated subdirectories in the volume. The path must be an absolute path. Along with the path, an allowed list of IPs/hostnames can be associated with each subdirectory. If provided, connections will be allowed only from these IPs. Format: [(hostspec[hostspec...])][,...], where hostspec can be an IP address, a hostname or an IP range in CIDR notation. Note: care must be taken while configuring this option, as invalid entries and/or unreachable DNS servers can introduce unwanted delays in all mount calls. | No sub directory exported. 
| Absolute path with allowed list of IP/hostname | 508 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 509 | | nfs.export-volumes | Enable/Disable exporting entire volumes, instead if used in conjunction with nfs3.export-dir, can allow setting up only subdirectories as exports. | On | On/Off | 510 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 511 | | nfs.rpc-auth-unix | Enable/Disable the AUTH_UNIX authentication type. 
This option is enabled by default for better interoperability. However, you can disable it if required. | On | On/Off | 512 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 513 | | nfs.rpc-auth-null | Enable/Disable the AUTH_NULL authentication type. It is not recommended to change the default value for this option. 
| On | On/Off | 514 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 515 | | nfs.rpc-auth-allow | Allow a comma separated list of addresses and/or hostnames to connect to the server. By default, all clients are disallowed. This allows you to define a general rule for all exported volumes. 
| Reject All | IP address or Host name | 516 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 517 | | nfs.rpc-auth-reject | Reject a comma separated list of addresses and/or hostnames from connecting to the server. By default, all connections are disallowed. This allows you to define a general rule for all exported volumes. 
| Reject All | IP address or Host name | 518 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 519 | | nfs.ports-insecure | Allow client connections from unprivileged ports. By default only privileged ports are allowed. This is a global setting in case insecure ports are to be enabled for all exports using a single option. 
| Off | On/Off | 520 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 521 | | nfs.addr-namelookup | Turn-off name lookup for incoming client connections using this option. In some setups, the name server can take too long to reply to DNS queries resulting in timeouts of mount requests. Use this option to turn off name lookups during address authentication. Note, turning this off will prevent you from using hostnames in rpc-auth.addr.* filters. 
| On | On/Off | 522 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 523 | | nfs.register-with-portmap | For systems that need to run multiple NFS servers, you need to prevent more than one from registering with portmap service. Use this option to turn off portmap registration for Gluster NFS. 
| On | On/Off | 524 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 525 | | nfs.port | Use this option on systems that need Gluster NFS to be associated with a non-default port number. | NA | 38465- 38467 | 526 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 527 | | nfs.disable | Turn-off volume being exported by NFS | Off | On/Off | 528 | 
+--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 529 | | performance.write-behind-window-size | Size of the per-file write-behind buffer. | 1MB | Write-behind cache size | 530 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 531 | | performance.io-thread-count | The number of threads in IO threads translator. 
| 16 | 0-65 | 532 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 533 | | performance.flush-behind | If this option is set ON, instructs write-behind translator to perform flush in background, by returning success (or any errors, if any of previous writes were failed) to application even before flush is sent to backend filesystem. 
| On | On/Off | 534 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 535 | | performance.cache-max-file-size | Sets the maximum file size cached by the io-cache translator. Can use the normal size descriptors of KB, MB, GB,TB or PB (for example, 6GB). Maximum size uint64. | 2 ^ 64 -1 bytes | size in bytes | 536 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 537 | | performance.cache-min-file-size | Sets the minimum file size cached by the io-cache translator. 
Values same as "max" above | 0B | size in bytes | 538 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 539 | | performance.cache-refresh-timeout | The cached data for a file will be retained till 'cache-refresh-timeout' seconds, after which data re-validation is performed. | 1s | 0-61 | 540 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 541 | | performance.cache-size | Size of the read cache. 
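Options such as performance.cache-max-file-size accept human-readable size descriptors (for example, 6GB). The helper below is a hypothetical sketch, not a Gluster tool, assuming binary (1024-based) units, showing how such a descriptor maps to a byte count:

```shell
# Hypothetical helper: expand a KB/MB/GB/TB size descriptor to bytes,
# assuming binary (1024-based) units.
to_bytes() {
    local n=${1%[KMGT]B}        # numeric part, e.g. 6
    local unit=${1#"$n"}        # unit part, e.g. GB
    case $unit in
        KB) echo $((n * 1024)) ;;
        MB) echo $((n * 1024 * 1024)) ;;
        GB) echo $((n * 1024 * 1024 * 1024)) ;;
        TB) echo $((n * 1024 * 1024 * 1024 * 1024)) ;;
        *)  echo "$n" ;;        # plain byte count, no unit suffix
    esac
}

to_bytes 6GB    # 6 * 1024^3
```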
| 32 MB | size in bytes | 542 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 543 | | server.allow-insecure | Allow client connections from unprivileged ports. By default only privileged ports are allowed. This is a global setting in case insecure ports are to be enabled for all exports using a single option. 
| On | On/Off | 544 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 545 | | server.grace-timeout | Specifies the duration for the lock state to be maintained on the server after a network disconnection. | 10 | 10 - 1800 secs | 546 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 547 | | server.statedump-path | Location of the state dump file. 
| tmp directory of the brick | New directory path | 548 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 549 | | storage.health-check-interval | Number of seconds between health-checks done on the filesystem that is used for the brick(s). Defaults to 30 seconds, set to 0 to disable. 
| 30 | seconds | 550 | +--------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-------------------------------------------------------------------------------------+ 551 | 552 | 具体参数参考 `gluster_doc `_ 。 553 | 554 | 文件权限 555 | --------- 556 | 557 | glusterfs在创建卷时会更改砖块所有者为root.root,对于某些应用请注意更改砖块目录所有者(比如在/etc/rc.local中添加chown,不要更改砖块下隐藏目录.glusterfs)。 558 | 559 | 砖块组合 560 | --------- 561 | 562 | 网上现有的部分文档中所述的砖块划分方式,是将整个磁盘划分为砖块,此种划分方式在某些场景下不是很好(比如存储复用),可以在/brickX下创建目录,比如data1,同时在创建glusterfs卷的时候使用HOST:/brickX/data1作为砖块,以合理利用存储空间。 563 | 564 | normal、replica、striped卷组合 565 | ------------------------------- 566 | 567 | 砖块的划分排序:striped(normal)优先,replica在striped(normal)基础上做冗余;计算大小时,同一replica组中的brick合并为一个砖块,一个striped组可看做一个有效块。 568 | 569 | 假设我们有4个主机,8个砖块,每个砖块都是5GB,如下图: 570 | 571 | .. image:: ../images/02-11.png 572 | :align: center 573 | 574 | 575 | 创建卷时使用如下命令: 576 | 577 | .. code:: 578 | 579 | # gluster volume create gluster-vol1 stripe 2 replica 2 \ 580 | host1:/brick1 host1:/brick2 host2:/brick1 host2:/brick2 \ 581 | host3:/brick1 host3:/brick2 host4:/brick1 host4:/brick2 force 582 | 583 | 砖块将会按照如下进行组合: 584 | 585 | .. image:: ../images/02-12.png 586 | :align: center 587 | 588 | 然而,创建卷时使用如下命令: 589 | 590 | .. 
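The pairing rule described above — with replica 2, consecutive bricks in the command's argument order form one replica set — can be rehearsed without a live cluster. This is an illustrative sketch of the grouping logic only, not a Gluster command:

```shell
# Sketch of how "replica 2" pairs bricks: consecutive bricks in the
# argument list form one replica set. Illustration only.
show_replica2_sets() {
    local -a bricks=("$@")
    local i
    for ((i = 0; i < ${#bricks[@]}; i += 2)); do
        echo "replica set $((i / 2 + 1)): ${bricks[i]} <-> ${bricks[i+1]}"
    done
}

# Ordering from the first command: both copies of each pair
# land on the same host, so a host failure loses the data.
show_replica2_sets host1:/brick1 host1:/brick2 host2:/brick1 host2:/brick2
```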
code:: 591 | 592 | # gluster volume create gluster-vol1 stripe 2 replica 2 \ 593 | host1:/brick1 host2:/brick1 host3:/brick1 host4:/brick1 \ 594 | host1:/brick2 host2:/brick2 host3:/brick2 host4:/brick2 force 595 | 596 | 砖块将会按照如下进行组合: 597 | 598 | .. image:: ../images/02-13.png 599 | :align: center 600 | 601 | 作为nfs挂载 602 | ------------ 603 | 604 | 由于glusterfs占用了2049端口,所以其与nfs server一般不能共存于同一台服务器,除非更改nfs服务端口。 605 | 606 | .. code:: 607 | 608 | # mount -t nfs -o vers=3 server1:/volume1 /mnt 609 | 610 | 作为cifs挂载 611 | ------------ 612 | 613 | 先在某一服务器或者客户端将其挂载,再以cifs方式导出: 614 | 615 | /etc/samba/smb.conf 616 | 617 | .. code:: 618 | 619 | [glustertest] 620 | comment = For testing a Gluster volume exported through CIFS 621 | path = /mnt/glusterfs 622 | read only = no 623 | guest ok = yes 624 | 625 | 修复裂脑(split-brain) 626 | ----------------------- 627 | 628 | 裂脑发生以后,各节点信息可能会出现不一致。可以通过以下步骤查看并修复。 629 | 630 | 1. 定位裂脑文件 631 | 632 | 通过命令 633 | 634 | .. code:: 635 | 636 | # gluster volume heal <VOLNAME> info split-brain 637 | 638 | 或者查看在客户端仍然是Input/Output错误的文件。 639 | 640 | 2. 关闭已经打开的文件或者虚机 641 | 642 | 3. 确定正确副本 643 | 644 | 4. 恢复扩展属性 645 | 646 | 登录到后台,查看脑裂文件的MD5sum和时间,判断哪个副本是需要保留的。 647 | 然后删除不再需要的副本即可。(glusterfs采用硬链接方式,所以需要同时删除.glusterfs下面的硬链接文件) 648 | 649 | 首先检查文件的md5值,并且和其他的节点比较,确认是否需要删除此副本。 650 | 651 | .. code:: 652 | 653 | [root@hostd data0]# md5sum 1443f429-7076-4792-9cb7-06b1ee38d828/images/5c881816-6cdc-4d8a-a8c8-4b068a917c2f/80f33212-7adb-4e24-9f01-336898ae1a2c 654 | 6c6b704ce1c0f6d22204449c085882e2 1443f429-7076-4792-9cb7-06b1ee38d828/images/5c881816-6cdc-4d8a-a8c8-4b068a917c2f/80f33212-7adb-4e24-9f01-336898ae1a2c 655 | 656 | 通过ls -i 和find -inum 找到此文件及其硬链接文件。 657 | 658 | .. 
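The md5sum comparison step above can be rehearsed safely on scratch files. In this sketch, two temporary directories (hypothetical stand-ins, not real bricks) play the role of the two copies of a replica pair:

```shell
# Sketch: compare one file's checksum across two replica copies.
# Two scratch directories stand in for the bricks of a replica pair.
brick_a=$(mktemp -d)
brick_b=$(mktemp -d)
echo "good data"  > "$brick_a/disk.img"
echo "stale data" > "$brick_b/disk.img"

sum_a=$(md5sum "$brick_a/disk.img" | awk '{print $1}')
sum_b=$(md5sum "$brick_b/disk.img" | awk '{print $1}')

# Differing checksums flag a candidate split-brain file; compare
# timestamps before deciding which copy to delete.
[ "$sum_a" = "$sum_b" ] && echo "replicas match" || echo "replicas differ"
```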
code:: 659 | 660 | [root@hostd data0]# ls -i 1443f429-7076-4792-9cb7-06b1ee38d828/images/5c881816-6cdc-4d8a-a8c8-4b068a917c2f/80f33212-7adb-4e24-9f01-336898ae1a2c 661 | 12976365 1443f429-7076-4792-9cb7-06b1ee38d828/images/5c881816-6cdc-4d8a-a8c8-4b068a917c2f/80f33212-7adb-4e24-9f01-336898ae1a2c 662 | [root@hostd data0]# find -inum 12976365 663 | ./1443f429-7076-4792-9cb7-06b1ee38d828/images/5c881816-6cdc-4d8a-a8c8-4b068a917c2f/80f33212-7adb-4e24-9f01-336898ae1a2c 664 | ./.glusterfs/01/8d/018db725-c8b8-47ed-a6bb-f6ad4195134f 665 | 666 | 删除两个文件 667 | 668 | .. code:: 669 | 670 | [root@hostd data0]# find -inum 12976365 |xargs rm -rf 671 | 672 | 或者 673 | 674 | .. code:: 675 | 676 | -bash-4.1# getfattr -d -e hex -m . ids 677 | # file: ids 678 | trusted.afr.data_double10t-client-0=0x000000010000000000000000 679 | trusted.afr.data_double10t-client-1=0x000000000000000000000000 680 | trusted.afr.dirty=0x000000010000000000000000 681 | trusted.bit-rot.version=0x040000000000000056e52946000deaaa 682 | trusted.gfid=0x112060ecdd5c42f9baa3dd2d67fcf729 683 | 684 | -bash-4.1# setfattr -x trusted.afr.data_double10t-client-0 ids 685 | -bash-4.1# setfattr -x trusted.afr.data_double10t-client-1 ids 686 | -bash-4.1# setfattr -x trusted.afr.dirty ids 687 | -bash-4.1# getfattr -d -e hex -m . ids 688 | # file: ids 689 | trusted.bit-rot.version=0x040000000000000056e52946000deaaa 690 | trusted.gfid=0x112060ecdd5c42f9baa3dd2d67fcf729 691 | 692 | 脑裂文件恢复完成,此文件可以在挂载点上读写。 693 | 694 | 砖块复用 695 | --------- 696 | 697 | 当卷正在被使用,其中一个砖块被删除,而用户试图再次将其用于卷时,可能会出现“/bricks/app or a prefix of it is already part of a volume”。 698 | 699 | 解决方法: 700 | 701 | .. 
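Per the GlusterFS documentation, each trusted.afr.* changelog value packs three big-endian 32-bit counters: pending data, metadata and entry operations. The helper below is a decoding sketch (not a Gluster tool) that makes hex values like the ones shown above readable:

```shell
# Decode a trusted.afr.* changelog value: the three 4-byte fields are
# pending counts for data, metadata and entry operations respectively.
decode_afr() {
    local hex=${1#0x}
    echo "data=$((16#${hex:0:8})) metadata=$((16#${hex:8:8})) entry=$((16#${hex:16:8}))"
}

# Value observed on the "ids" file in the getfattr output above:
decode_afr 0x000000010000000000000000
```

A non-zero counter on one brick's xattr for the other replica is what marks the file as needing heal; clearing the attributes with setfattr, as shown, resets these counters.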
code:: 702 | 703 | # setfattr -x trusted.glusterfs.volume-id $brick_path 704 | # setfattr -x trusted.gfid $brick_path 705 | # rm -rf $brick_path/.glusterfs 706 | 707 | 高可用业务IP 708 | ------------- 709 | 710 | 由于挂载存储时需要指定集群中的任意IP,所以我们可以使用Heartbeat/CTDB/Pacemaker等集群软件来保证业务IP的高可用。 711 | 712 | 可参考 713 | 714 | http://clusterlabs.org/wiki/Debian_Lenny_HowTo#Configure_an_IP_resource 715 | 716 | http://geekpeek.net/linux-cluster-corosync-pacemaker/ 717 | -------------------------------------------------------------------------------- /source/posts/ch03.rst: -------------------------------------------------------------------------------- 1 | ========================= 2 | 第三章 合适的虚拟化平台 3 | ========================= 4 | 5 | ------------------ 6 | 3.1 虚拟化平台简介 7 | ------------------ 8 | 9 | Welcome to the core! 10 | 11 | 嗯,笔者撸了个OpenStack和k8s的部署 `脚本 `_ ,PXE哦。 12 | 13 | 云计算目前主流实现有SaaS(Software-as-a-service)、PaaS(Platform-as-a-service)和IaaS(Infrastructure-as-a-service)。IaaS和PaaS都算作基础件,SaaS可以与基础件自由组合或者单独使用。 14 | 15 | 虚拟化技术已经很受重视而且被推到了一个浪尖。如今诸多开源虚拟化平台,比如XenServer、CloudStack、OpenStack、Eucalyptus、oVirt、OpenVZ、Docker、LXC等,我们都看花了眼,些许慌乱不知哪个适合自己了。 16 | 17 | 各平台实现方式:全虚拟化,半虚拟化,应用虚拟化。 18 | 19 | IaaS云计算平台,综合来说具有以下特性: 20 | 21 | - 虚拟化:虚拟化作为云计算平台的核心,是资源利用的主要形式之一。网络、存储、CPU乃至GPU等主要通过虚拟主机进行实体化。 22 | 23 | - 分布式:分布式可利用共享的存储,通过网络将资源进行整合,是实现资源化的必备条件。 24 | 25 | - 高可用:对于规模庞大的云平台,提供存储、管理节点、重要服务的高度可用性是十分必要的。笔者在写这篇文章时,oVirt 3.4已经可以做到管理节点的高度可用。 26 | 27 | - 兼容性:云计算平台众多,各家有各家的特点,同一数据中心部署不同的平台的可能性极大,因此,主要服务(比如虚拟网络、存储、虚机等)要有一定兼容性,比如oVirt可以利用OpenStack的Neutron提供的虚拟网络、Foreman可以方便地在oVirt上部署新机器等。另外,也有DeltaCloud、libvirt等API,用户可以利用它们自由地实现自己的云综合管理工具。 28 | 29 | - 资源池化:网络、存储、CPU或者GPU可以综合或者单独划分资源池,通过配额进行分配,从而保证其合理利用。 30 | 31 | - 安全性:现代企业对于安全性的要求已经十分苛刻,除去传统数据加密、访问控制,甚至对于社会工程也要有一定防护能力;用户数据对非企业管理员具有防护性能,即使将虚拟机磁盘文件拷贝出来也不能直接获取其内容。 32 | 33 | - 需求导向性:在计算水平上,优质资源最先提供给重要服务;服务水平上,平台具有可定制能力。 34 | 35 | oVirt 36 | ----- 37 | 38 | oVirt目前有两种部署方式: 39 | 40 | +-------------------+------------------------------+ 41 | |管理独占一台物理机 |.. 
image:: ../images/03-01.png| 42 | | | :align: center | 43 | +-------------------+------------------------------+ 44 | |高可用管理引擎 |.. image:: ../images/03-02.png| 45 | | | :align: center | 46 | +-------------------+------------------------------+ 47 | 48 | .. note:: **常见名词** 49 | 50 | **管理引擎(engine)**:提供平台web管理、api,各种扩展服务,vdsm以及libvirt服务的重要节点。 51 | 52 | **宿主机(node/host)**:为平台的功能提供支持,主要是虚拟化能力。 53 | 54 | **数据中心(data center)**:以数据共享方式(Shared/Local)划分的容器,可以包含多个集群。 55 | 56 | **存储域(storage domain)**:平台所依赖的各种存储空间。 57 | 58 | **逻辑网络(logic network)**:物理网络或者虚拟网络的抽象代表。 59 | 60 | **池(pool)**:用于批量创建虚拟机。 61 | 62 | **集群策略(cluster policy)**:宿主机/虚拟机运行或者迁移时所遵循的原则。 63 | 64 | **DWH/Reports**:可以查看当前状态报告(需要ovirt-engine-reports)。 65 | 66 | **可信服务**: 需要OpenAttestation 。 67 | 68 | **电源管理**: 如果没有物理电源控制器,可以直接指定某一台主机为代理机,以构建更稳健的集群。 69 | 70 | OpenStack 71 | --------- 72 | 73 | OpenStack云计算中引入的概念已经先入为主,让国内许多较晚接触云计算的人认为这就是标准概念。 74 | 75 | 更多内容请参考附录一。 76 | 77 | CloudStack 78 | ---------- 79 | 80 | 可能也不错,我没用过。 81 | 82 | ---------------------- 83 | 3.2 搭建oVirt管理引擎 84 | ---------------------- 85 | 86 | 搭建oVirt平台的步骤: 87 | 88 | 1. 安装Redhat类操作系统(Redhat、CentOS、Fedora) 89 | 90 | 2. 从yum安装oVirt,并执行engine-setup,或者直接从oVirt提供的 `iso `_ 进行安装 91 | 92 | 3. 添加宿主机 93 | 94 | 4. 添加存储域 95 | 96 | 5. 创建虚拟机 97 | 98 | 系统准备 99 | --------- 100 | 101 | 所有机器的SELINUX都设置为permissive。 102 | 103 | */etc/selinux/config* 104 | 105 | .. code:: 106 | 107 | SELINUX=permissive 108 | 109 | .. code:: 110 | 111 | # setenforce permissive 112 | 113 | 如有需要,清除iptables规则。 114 | 115 | .. code:: 116 | 117 | # iptables -F 118 | # service iptables save 119 | 120 | 每台机器上都要添加作为虚拟机运行的engine的FQDN,此处为ha.example.com。 121 | 122 | .. code:: 123 | 124 | # echo -e '192.168.10.100\tha.example.com' >> /etc/hosts 125 | 126 | 127 | 存储可以使用之前的glusterfs,方式为NFS_V3,注意将brick的权限设置为vdsm.kvm或者36:36。 128 | 129 | .. note:: **普通NFS服务器设置** 130 | 131 | 因为考虑到NFS4的权限配置较为复杂,推荐NFS使用V3,修改nfsmount.conf中的Version为3。 132 | 133 | .. 
code:: 134 | 135 | # gluster volume create gluster-vol1 replica 2 \ 136 | gs1.example.com:/gluster_brick0 gs2.example.com:/gluster_brick0 \ 137 | gs3.example.com:/gluster_brick0 gs4.example.com:/gluster_brick0 \ 138 | gs1.example.com:/gluster_brick1 gs2.example.com:/gluster_brick1 \ 139 | gs3.example.com:/gluster_brick1 gs4.example.com:/gluster_brick1 force 140 | 141 | 由于管理端以及节点的网络服务依赖于 **network** 而非 **NetworkManager** ,我们需要启用前者禁用后者,在每一台服务器上都进行如下类似配置修改网络。 142 | 143 | */etc/sysconfig/network-scripts/ifcfg-eth0* 144 | 145 | .. code:: 146 | 147 | NAME=eth0 148 | DEVICE=eth0 149 | ONBOOT=yes 150 | BOOTPROTO=static 151 | # 注意修改此处的IP 152 | IPADDR=192.168.10.101 153 | NETMASK=255.255.255.0 154 | GATEWAY=192.168.10.1 155 | DNS1=192.168.10.1 156 | 157 | .. code:: 158 | 159 | # chkconfig NetworkManager off 160 | # chkconfig network on 161 | # service NetworkManager stop; service network restart 162 | 163 | 添加repo 164 | --------- 165 | 166 | .. note:: **oVirt3.4.2 特别说明** 167 | 168 | 2014年六七月的初次安装oVirt的用户可能会遇到添加宿主机失败的问题,暂时解决办法为卸载python-pthreading-0.1.3-1及以后的版本,安装老版本,比如 ftp://ftp.icm.edu.pl/vol/rzm2/linux-fedora/linux/epel/6/i386/python-pthreading-0.1.3-0.el6.noarch.rpm ,再尝试安装vdsm并添加宿主机。 169 | 170 | 使用rpm: 171 | 172 | .. code:: 173 | 174 | # yum localinstall http://plain.resources.ovirt.org/releases/ovirt-release/ovirt-release34.rpm 175 | # yum install ovirt-hosted-engine-setup 176 | 177 | 或者手动添加: 178 | 179 | .. 
code:: 180 | 181 | [ovirt-stable] 182 | name=Latest oVirt Releases 183 | baseurl=http://resources.ovirt.org/releases/stable/rpm/EL/$releasever/ 184 | enabled=1 185 | skip_if_unavailable=1 186 | gpgcheck=0 187 | 188 | [ovirt-3.4-stable] 189 | name=Latest oVirt 3.4.z Releases 190 | baseurl=http://resources.ovirt.org/releases/3.4/rpm/EL/$releasever/ 191 | enabled=1 192 | skip_if_unavailable=1 193 | gpgcheck=0 194 | 195 | [epel] 196 | name=Extra Packages for Enterprise Linux 6 - $basearch 197 | #baseurl=http://download.fedoraproject.org/pub/epel/6/$basearch 198 | mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch 199 | failovermethod=priority 200 | enabled=1 201 | gpgcheck=0 202 | 203 | [ovirt-glusterfs-epel] 204 | name=GlusterFS is a clustered file-system capable of scaling to several petabytes. 205 | baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/$basearch/ 206 | enabled=1 207 | skip_if_unavailable=1 208 | gpgcheck=0 209 | 210 | [ovirt-glusterfs-noarch-epel] 211 | name=GlusterFS is a clustered file-system capable of scaling to several petabytes. 212 | baseurl=http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/epel-$releasever/noarch 213 | enabled=1 214 | skip_if_unavailable=1 215 | gpgcheck=0 216 | 217 | **从下面两种方式中选择之一进行搭建** 218 | 219 | :ref:`label2` 220 | 221 | :ref:`label3` 222 | 223 | .. _label2: 224 | 225 | 搭建普通oVirt虚拟化平台 226 | ------------------------ 227 | 228 | 笔者写此文时oVirt已经更新到3.4。 229 | 230 | 在此,我们会用到之前创建的distributed-replicated存储,这样可用保证系统服务的高度可用性有所提高。 231 | 232 | 对于初次使用oVirt的用户,建议使用此种搭建方式,**太折腾的话就吓走好多目标读者了** 。 233 | 234 | 使用之前的四台机器,分别为gs1.example.com,gs2.example.com,gs3.example.com和gs4.example.com,其中,将gs1作为管理机安装ovirt-engine,其余三台作为节点(node),存储使用已经创建好的glusterfs。 235 | 236 | .. image:: ../images/03-03.png 237 | :align: center 238 | 239 | 在gs1上运行如下命令。 240 | 241 | .. 
code:: 242 | 243 | # yum install ovirt-engine 244 | # engine-setup --offline 245 | [ INFO ] Stage: Initializing 246 | [ INFO ] Stage: Environment setup 247 | Configuration files: ['/etc/ovirt-engine-setup.conf.d/10-packaging.conf'] 248 | Log file: /var/log/ovirt-engine/setup/ovirt-engine-setup-20140508054649.log 249 | Version: otopi-1.2.0 (otopi-1.2.0-1.el6) 250 | [ INFO ] Stage: Environment packages setup 251 | [ INFO ] Stage: Programs detection 252 | [ INFO ] Stage: Environment setup 253 | [ INFO ] Stage: Environment customization 254 | 255 | --== PRODUCT OPTIONS ==-- 256 | 257 | 258 | --== PACKAGES ==-- 259 | 260 | 261 | --== NETWORK CONFIGURATION ==-- 262 | 263 | Host fully qualified DNS name of this server [gs1.example.com]: 264 | Setup can automatically configure the firewall on this system. 265 | Note: automatic configuration of the firewall may overwrite current settings. 266 | Do you want Setup to configure the firewall? (Yes, No) [Yes]: 267 | The following firewall managers were detected on this system: iptables 268 | Firewall manager to configure (iptables): iptables 269 | [ INFO ] iptables will be configured as firewall manager. 270 | 271 | --== DATABASE CONFIGURATION ==-- 272 | 273 | Where is the Engine database located? (Local, Remote) [Local]: 274 | Setup can configure the local postgresql server automatically for the engine to run. This may conflict with existing applications. 275 | Would you like Setup to automatically configure postgresql and create Engine database, or prefer to perform that manually? 
(Automatic, Manual) [Automatic]: 276 | 277 | --== OVIRT ENGINE CONFIGURATION ==-- 278 | 279 | Application mode (Both, Virt, Gluster) [Both]: 280 | Default storage type: (NFS, FC, ISCSI, POSIXFS) [NFS]: 281 | Engine admin password: 282 | Confirm engine admin password: 283 | 284 | --== PKI CONFIGURATION ==-- 285 | 286 | Organization name for certificate [example.com]: 287 | 288 | --== APACHE CONFIGURATION ==-- 289 | 290 | Setup can configure apache to use SSL using a certificate issued from the internal CA. 291 | Do you wish Setup to configure that, or prefer to perform that manually? (Automatic, Manual) [Automatic]: 292 | Setup can configure the default page of the web server to present the application home page. This may conflict with existing applications. 293 | Do you wish to set the application as the default page of the web server? (Yes, No) [Yes]: 294 | 295 | --== SYSTEM CONFIGURATION ==-- 296 | 297 | Configure WebSocket Proxy on this machine? (Yes, No) [Yes]: 298 | Configure an NFS share on this server to be used as an ISO Domain? 
(Yes, No) [Yes]: no 299 | 300 | --== MISC CONFIGURATION ==-- 301 | 302 | 303 | --== END OF CONFIGURATION ==-- 304 | 305 | [ INFO ] Stage: Setup validation 306 | 307 | --== CONFIGURATION PREVIEW ==-- 308 | 309 | Engine database name : engine 310 | Engine database secured connection : False 311 | Engine database host : localhost 312 | Engine database user name : engine 313 | Engine database host name validation : False 314 | Engine database port : 5432 315 | PKI organization : example.com 316 | Application mode : both 317 | Firewall manager : iptables 318 | Update Firewall : True 319 | Configure WebSocket Proxy : True 320 | Host FQDN : gs1.example.com 321 | Datacenter storage type : nfs 322 | Configure local Engine database : True 323 | Set application as default page : True 324 | Configure Apache SSL : True 325 | 326 | Please confirm installation settings (OK, Cancel) [OK]: ok 327 | [ INFO ] Stage: Transaction setup 328 | [ INFO ] Stopping engine service 329 | [ INFO ] Stopping websocket-proxy service 330 | [ INFO ] Stage: Misc configuration 331 | [ INFO ] Stage: Package installation 332 | [ INFO ] Stage: Misc configuration 333 | [ INFO ] Initializing PostgreSQL 334 | [ INFO ] Creating PostgreSQL 'engine' database 335 | [ INFO ] Configuring PostgreSQL 336 | [ INFO ] Creating Engine database schema 337 | [ INFO ] Creating CA 338 | [ INFO ] Configuring WebSocket Proxy 339 | [ INFO ] Generating post install configuration file '/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf' 340 | [ INFO ] Stage: Transaction commit 341 | [ INFO ] Stage: Closing up 342 | 343 | --== SUMMARY ==-- 344 | 345 | SSH fingerprint: 1B:FD:08:A2:FD:83:20:8A:65:F5:0D:F6:CB:BF:46:C7 346 | Internal CA 28:7E:D6:6B:F7:F2:6C:B5:60:27:44:C3:7F:3C:22:63:E5:68:DD:F4 347 | Web access is enabled at: 348 | http://gs1.example.com:80/ovirt-engine 349 | https://gs1.example.com:443/ovirt-engine 350 | Please use the user "admin" and password specified in order to login into oVirt Engine 351 | 352 | --== 
END OF SUMMARY ==-- 353 | 354 | [ INFO ] Starting engine service 355 | [ INFO ] Restarting httpd 356 | [ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20140508054842-setup.conf' 357 | [ INFO ] Stage: Clean up 358 | Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20140508054649.log 359 | [ INFO ] Stage: Pre-termination 360 | [ INFO ] Stage: Termination 361 | [ INFO ] Execution of setup completed successfully 362 | 363 | 至此,管理节点安装结束,参考 :ref:`label1` 加入节点以及存储域。 364 | 365 | .. _label3: 366 | 367 | 搭建管理端高可用oVirt(hosted engine) 368 | -------------------------------------- 369 | 370 | 高可用,我们可以这么划分: 371 | 372 | - 存储的高可用:传统存储使用DRBD/Heartbeat或者独立的存储设备保证高可用,在灵活性、可扩展性、成本上都有一定局限。在与主机同台使用Ceph或者Glusterfs可以较好地保证资源充分利用地同时,又满足了高度可用的要求。 373 | 374 | - 管理高可用:因为比如oVirt、OpenStack这种拥有大型数据库的设施不像存储设施那样高效的同步,需要独立的管理运行在集群中的某一台机器上来同步集群消息,所以,管理端的高可用也是十分必要的。 375 | 376 | - 虚拟机/服务高可用:虚拟机在宕机时可自动重启,在主机资源紧张时可用迁移到其他负载较低的主机上,从而保证服务的质量以及连续性。 377 | 378 | .. image:: ../images/03-03.png 379 | :align: center 380 | 381 | .. epigraph:: 382 | 383 | 1. 宿主机的CPU架构建议选择Westmere(Westmere E56xx/L56xx/X56xx)、Nehalem(Intel Core i7 9xx)、Penryn(Intel Core 2 Duo P9xxx)或者Conroe(Intel Celeron_4x0)中的之一。 384 | 385 | CPU Family table 参阅 386 | `Intel Architecture and Processor Identification With CPUID Model and Family Numbers `_ 387 | 388 | 2. 建议参考第11节提前安装含有oVirt管理的虚拟机,硬盘格式为RAW,从而在安装管理机时作为OVF导入或者覆盖虚拟磁盘,减少失败风险时间。 389 | 390 | 安装ovirt-hosted-engine-setup,并回答一些问题,注意高亮部分: 391 | 392 | .. code-block:: bash 393 | :emphasize-lines: 21,36,123,138-144,150,166,173 394 | 395 | # hosted-engine --deploy 396 | [ INFO ] Stage: Initializing 397 | Continuing will configure this host for serving as hypervisor and create a VM where you have to install oVirt Engine afterwards. 398 | Are you sure you want to continue? (Yes, No)[Yes]: yes 399 | [ INFO ] Generating a temporary VNC password. 
400 | [ INFO ] Stage: Environment setup 401 | Configuration files: [] 402 | Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20140508182241.log 403 | Version: otopi-1.2.0 (otopi-1.2.0-1.el6) 404 | [ INFO ] Hardware supports virtualization 405 | [ INFO ] Bridge ovirtmgmt already created 406 | [ INFO ] Stage: Environment packages setup 407 | [ INFO ] Stage: Programs detection 408 | [ INFO ] Stage: Environment setup 409 | [ INFO ] Stage: Environment customization 410 | 411 | --== STORAGE CONFIGURATION ==-- 412 | 413 | During customization use CTRL-D to abort. 414 | Please specify the storage you would like to use (nfs3, nfs4)[nfs3]: 415 | # 此处的存储域只存储hosted-engine的相关文件,不作为主数据域 416 | # 建议挂载gluster的nfs时使用localhost:data形式 417 | Please specify the full shared storage connection path to use (example: host:/path): 192.168.10.101:/gluster-vol1/ovirt_data/hosted_engine 418 | [ INFO ] Installing on first host 419 | Please provide storage domain name. [hosted_storage]: 420 | Local storage datacenter name is an internal name and currently will not be shown in engine's admin UI.Please enter local datacenter name [hosted_datacenter]: 421 | 422 | --== SYSTEM CONFIGURATION ==-- 423 | 424 | 425 | --== NETWORK CONFIGURATION ==-- 426 | 427 | iptables was detected on your computer, do you wish setup to configure it? 
(Yes, No)[Yes]: no 428 | Please indicate a pingable gateway IP address [192.168.10.1]: 429 | 430 | --== VM CONFIGURATION ==-- 431 | # 虚拟engine的安装方式 432 | Please specify the device to boot the VM from (cdrom, disk, pxe) [cdrom]: 433 | The following CPU types are supported by this host: 434 | - model_Conroe: Intel Conroe Family 435 | Please specify the CPU type to be used by the VM [model_Conroe]: 436 | Please specify path to installation media you would like to use [None]: /tmp/centos.iso 437 | Please specify the number of virtual CPUs for the VM [Defaults to minimum requirement: 2]: 438 | Please specify the disk size of the VM in GB [Defaults to minimum requirement: 25]: 439 | You may specify a MAC address for the VM or accept a randomly generated default [00:16:3e:59:9b:e2]: 440 | Please specify the memory size of the VM in MB [Defaults to minimum requirement: 4096]: 4096 441 | Please specify the console type you would like to use to connect to the VM (vnc, spice) [vnc]: 442 | 443 | --== HOSTED ENGINE CONFIGURATION ==-- 444 | 445 | Enter the name which will be used to identify this host inside the Administrator Portal [hosted_engine_1]: 446 | Enter 'admin@internal' user password that will be used for accessing the Administrator Portal: 447 | Confirm 'admin@internal' user password: 448 | Please provide the FQDN for the engine you would like to use. 449 | This needs to match the FQDN that you will use for the engine installation within the VM. 450 | Note: This will be the FQDN of the VM you are now going to create, 451 | it should not point to the base host or to any other existing machine. 
452 | Engine FQDN: ha.example.com 453 | [WARNING] Failed to resolve ha.example.com using DNS, it can be resolved only locally 454 | Please provide the name of the SMTP server through which we will send notifications [localhost]: 455 | Please provide the TCP port number of the SMTP server [25]: 456 | Please provide the email address from which notifications will be sent [root@localhost]: 457 | Please provide a comma-separated list of email addresses which will get notifications [root@localhost]: 458 | [ INFO ] Stage: Setup validation 459 | 460 | --== CONFIGURATION PREVIEW ==-- 461 | 462 | Engine FQDN : ha.example.com 463 | Bridge name : ovirtmgmt 464 | SSH daemon port : 22 465 | Gateway address : 192.168.10.1 466 | Host name for web application : hosted_engine_1 467 | Host ID : 1 468 | Image size GB : 25 469 | Storage connection : 192.168.10.101:/gluster-vol1/ovirt_data/hosted_data/ 470 | Console type : vnc 471 | Memory size MB : 4096 472 | MAC address : 00:16:3e:59:9b:e2 473 | Boot type : cdrom 474 | Number of CPUs : 2 475 | ISO image (for cdrom boot) : /tmp/centos.iso 476 | CPU Type : model_Conroe 477 | 478 | Please confirm installation settings (Yes, No)[No]: yes 479 | [ INFO ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf' 480 | [ INFO ] Stage: Transaction setup 481 | [ INFO ] Stage: Misc configuration 482 | [ INFO ] Stage: Package installation 483 | [ INFO ] Stage: Misc configuration 484 | [ INFO ] Configuring libvirt 485 | [ INFO ] Configuring VDSM 486 | [ INFO ] Starting vdsmd 487 | [ INFO ] Waiting for VDSM hardware info 488 | [ INFO ] Waiting for VDSM hardware info 489 | [ INFO ] Waiting for VDSM hardware info 490 | [ INFO ] Waiting for VDSM hardware info 491 | [ INFO ] Creating Storage Domain 492 | [ INFO ] Creating Storage Pool 493 | [ INFO ] Connecting Storage Pool 494 | [ INFO ] Verifying sanlock lockspace initialization 495 | [ INFO ] Initializing sanlock lockspace 496 | [ INFO ] Initializing sanlock metadata 497 | [ INFO ] Creating VM 
Image 498 | [ INFO ] Disconnecting Storage Pool 499 | [ INFO ] Start monitoring domain 500 | [ INFO ] Configuring VM 501 | [ INFO ] Updating hosted-engine configuration 502 | [ INFO ] Stage: Transaction commit 503 | [ INFO ] Stage: Closing up 504 | The following network ports should be opened: 505 | tcp:5900 506 | tcp:5901 507 | udp:5900 508 | udp:5901 509 | An example of the required configuration for iptables can be found at: 510 | /etc/ovirt-hosted-engine/iptables.example 511 | In order to configure firewalld, copy the files from 512 | /etc/ovirt-hosted-engine/firewalld to /etc/firewalld/services 513 | and execute the following commands: 514 | firewall-cmd -service hosted-console 515 | [ INFO ] Creating VM 516 | You can now connect to the VM with the following command: 517 | /usr/bin/remote-viewer vnc://localhost:5900 518 | Use temporary password "2067OGHU" to connect to vnc console. 519 | Please note that in order to use remote-viewer you need to be able to run graphical applications. 520 | This means that if you are using ssh you have to supply the -Y flag (enables trusted X11 forwarding). 521 | Otherwise you can run the command from a terminal in your preferred desktop environment. 522 | If you cannot run graphical applications you can connect to the graphic console from another host or connect to the console using the following command: 523 | virsh -c qemu+tls://192.168.1.150/system console HostedEngine 524 | If you need to reboot the VM you will need to start it manually using the command: 525 | hosted-engine --vm-start 526 | You can then set a temporary password using the command: 527 | hosted-engine --add-console-password 528 | The VM has been started. Install the OS and shut down or reboot it. 
To continue please make a selection: 529 | 530 | (1) Continue setup - VM installation is complete 531 | (2) Reboot the VM and restart installation 532 | (3) Abort setup 533 | # 需要在另外一个有图形能力的terminal中运行 534 | # "remote-viewer vnc://192.168.10.101:5900"连接虚拟机。 535 | # 完成engine-setup后关闭虚拟机;可以在虚拟机运行状态下执行 536 | # "hosted-engine --add-console-password"更换控制台密码。 537 | # 如果之前选择cdrom进行安装的话,此处可以在gs1上用已经安装好engine的 538 | # 虚拟磁盘进行覆盖,类似 539 | # "mount -t nfs 192.168.10.101:192.168.10.101:/gluster-vol1/ovirt_data/hosted_data/ /mnt; mv engine-disk.raw /mnt/ovirt_data/hosted_data/.../vm_UUID" 540 | (1, 2, 3)[1]: 1 541 | Waiting for VM to shut down... 542 | [ INFO ] Creating VM 543 | You can now connect to the VM with the following command: 544 | /usr/bin/remote-viewer vnc://localhost:5900 545 | Use temporary password "2067OGHU" to connect to vnc console. 546 | Please note that in order to use remote-viewer you need to be able to run graphical applications. 547 | This means that if you are using ssh you have to supply the -Y flag (enables trusted X11 forwarding). 548 | Otherwise you can run the command from a terminal in your preferred desktop environment. 549 | If you cannot run graphical applications you can connect to the graphic console from another host or connect to the console using the following command: 550 | virsh -c qemu+tls://192.168.1.150/system console HostedEngine 551 | If you need to reboot the VM you will need to start it manually using the command: 552 | hosted-engine --vm-start 553 | You can then set a temporary password using the command: 554 | hosted-engine --add-console-password 555 | Please install and setup the engine in the VM. 556 | You may also be interested in installing ovirt-guest-agent-common package in the VM. 
557 | To continue make a selection from the options below: 558 | (1) Continue setup - engine installation is complete 559 | (2) Power off and restart the VM 560 | (3) Abort setup 561 | # 此处参考第一次操作,连接虚拟机控制台后进行"engine-setup --offline"以安装engine 562 | (1, 2, 3)[1]: 1 563 | [ INFO ] Engine replied: DB Up!Welcome to Health Status! 564 | [ INFO ] Waiting for the host to become operational in the engine. This may take several minutes... 565 | [ INFO ] Still waiting for VDSM host to become operational... 566 | [ INFO ] The VDSM Host is now operational 567 | Please shutdown the VM allowing the system to launch it as a monitored service. 568 | # 到此,需要连接虚拟机控制台关闭虚拟机 569 | The system will wait until the VM is down. 570 | [ INFO ] Enabling and starting HA services 571 | Hosted Engine successfully set up 572 | [ INFO ] Stage: Clean up 573 | [ INFO ] Stage: Pre-termination 574 | [ INFO ] Stage: Termination 575 | 576 | 此时,运行”hosted-engine –vm-start”以启动虚拟管理机。 577 | 578 | .. note:: 579 | 580 | 1. 若要重新部署 581 | 582 | # vdsClient -s 0 list 583 | 584 | # vdsClient -s 0 destroy 585 | 586 | 2. 若要添加第二台机器 587 | 588 | # yum install ovirt-hosted-engine-setup 589 | 590 | # hosted-engine --deploy 591 | 592 | 然后指定存储路径即可自动判断此为第2+台机器。 593 | 594 | .. _label1: 595 | 596 | ---------------------- 597 | 3.3 添加节点以及存储域 598 | ---------------------- 599 | 600 | 你看到这的话应该已经有了一个数据中心、几个宿主机,也可能有一个虚拟机(engine),还差一个存储虚拟机镜像的地方就可以拥有基本的oVirt平台了。 601 | 602 | 添加节点(宿主机) 603 | ------------------ 604 | 605 | 对于第11节的普通oVirt、第12节的ha平台,你可能需要添加更多节点以支持更好的SLA(service level agreement)。 606 | 添加节点目前有三种方式: 607 | 608 | - 通过oVirt的节点ISO安装系统后加入。 609 | 610 | - 直接将现有CentOS或者Fedora转化为节点(可以为当前管理机)。 611 | 612 | - 指定使用外部提供者(Foreman)。 613 | 614 | 在此我们使用第二种方法。 615 | 616 | .. 
image:: ../images/03-05.png 617 | :align: center 618 | 619 | 添加存储域 620 | ----------- 621 | 622 | 存储域有3种,Data(数据域)、ISO(ISO域)、Export(导出域)。 623 | 624 | 其中,数据域是为必需,在创建任何虚拟机之前需要有一个可用的数据域用于存储虚拟磁盘以及快照文件;ISO域中可以存放ISO和VFD格式的系统镜像或者驱动文件,可在多个数据中心间共享;导出域用于导出或导入OVF格式的虚机。 625 | 626 | 而根据数据域的存储类型,我们有5种(NFS、POSIX兼容、Glusterfs、iSCSI、光纤)可选,在此,选择glusterfs导出的NFS。 627 | 628 | .. image:: ../images/03-06.png 629 | :align: center 630 | 631 | .. note:: 632 | 确保存储域目录被vdsm.kvm可读,即所有者为36:36,或者vdsm.kvm。 633 | 导出域在已加入数据中心后不可共享,如果它意外损坏,请参考 http://blog.lofyer.org/blog/2014/05/11/cloud-6-5-advanced-ovirt/ 手动修复,或者删除dom_md/metadata中的POOL_UUID的值以及_SHA_CKSUM行。 634 | 若要使用oVirt的gluster支持,请安装vdsm-gluster包。 635 | 636 | ----------------- 637 | 3.4 连接虚拟机 638 | ----------------- 639 | 640 | 虚拟机运行后,通过web界面,你可用使用以下几种方式连接虚拟机(可通过控制台选项进行修改): 641 | 642 | .. image:: ../images/03-08.png 643 | :align: center 644 | 645 | Spice-Html5 646 | ------------ 647 | 648 | 首先在服务器端打开spice代理: 649 | 650 | .. code:: 651 | 652 | # engine-setup --otopi-environment="OVESETUP_CONFIG/websocketProxyConfig=bool:True" # 如果未setup或者要在其他机器setup,可做此步 653 | # yum install -y numpy # 安装numpy以加速转换。 654 | # engine-config -s WebSocketProxy="192.168.10.100:6100" 655 | # service ovirt-websocket-proxy restart 656 | # service ovirt-engine restart 657 | 658 | 连接之前,要信任以下两处https证书: 659 | 660 | https://192.168.10.100 661 | 662 | https://192.168.10.100:6100 663 | 664 | 然后点击控制台按钮即可在浏览器的新标签中打开spice-html5桌面。 665 | 666 | 浏览器插件 667 | ----------- 668 | 669 | 对于Redhat系列系统,可安装spice-xpi插件;Windows系统可以安装SpiceX的控件。 670 | 671 | 本地客户端 672 | ----------- 673 | 674 | 访问 `virt-manager官网 `_ 下载virt-viewer客户端,使用它打开下载到本地的console.vv文件。 675 | 676 | spice proxy/gateway - squid代理 677 | -------------------------------- 678 | 679 | 设置squid代理,将所有spice端口代理至3128端口。 680 | 681 | .. code:: 682 | 683 | # yum install squid 684 | 685 | 修改/etc/squid/squid.conf。 686 | 687 | .. 
code:: 688 | 689 | # 第5行 690 | acl localhost src 192.168.0.150 691 | # 第41行修改为 692 | http_access allow CONNECT Safe_ports 693 | #acl spice_servers dst 192.168.10.0/24 694 | #http_access allow spice_servers 695 | 696 | 启用squid服务。 697 | 698 | .. code:: 699 | 700 | # chkconfig squid on 701 | # service squid restart 702 | 703 | 设置engine的SpiceProxy 704 | 705 | .. code:: 706 | 707 | # engine-config -s SpiceProxyDefault="http://FQDN_or_外网IP:3128" 708 | # service ovirt-engine restart 709 | 710 | 可通过集群设置中设置所有宿主机的Spice代理,或者在虚拟机设置中单一设置某台虚拟机通过代理访问。 711 | 712 | RDP插件(仅适用于Windows虚机及IE浏览器) 713 | ---------------------------------------- 714 | 715 | 如果虚拟机的操作系统选择为Windows,并且内部启动了远程桌面服务,使用IE浏览器访问用户或者管理员入口时,可以启用RDP控件。 716 | 717 | ----------------- 718 | 3.5 oVirt使用进阶 719 | ----------------- 720 | 721 | 数据库修改非同步数据 722 | ---------------------- 723 | 724 | 如果出现网络错误,很有可能导致数据不同步,从而导致界面上虚拟机状态一直处于异常状态,这点OpenStack也有一样的缺点。连接引擎数据库,修改其中的vm_dynamic, image, vm_static等数据表即可。 725 | 726 | engine-config参数配置 727 | ---------------------- 728 | 729 | 平台安装完以后,可用通过engine-config命令进行详细参数配置。 730 | 731 | .. code:: 732 | 733 | # 查看设置说明 734 | # engine-config -l 735 | # 查看当前设置 736 | # engine-config -a 737 | 738 | **示例:重设管理员密码** 739 | 740 | .. code:: 741 | 742 | # engine-config -s AdminPassword=interactive 743 | Please enter a password: # 密码 744 | Please reenter password: # 密码 745 | 746 | ovirt-shell与API 747 | ----------------- 748 | 749 | Restful API(Application User Interface)是oVirt的一大特点,用户可以通过它将其与第三方的界面或者应用进行集成。访问 http://192.168.10.100/api?rsdl 以获取其用法。 750 | 751 | .. note:: 访问API使用GET、POST、PUT、DELETE方法 752 | 753 | 获取内容时使用GET; 754 | 添加新内容或者执行动作使用POST; 755 | 更新内容使用PUT; 756 | 删除内容使用DELETE; 757 | 758 | 详细用法参考 http://www.ovirt.org/Api,SDK示例参考 http://www.ovirt.org/Testing/PythonApi 759 | 760 | ovirt-shell则是全部使用Restful API写成的shell,通过它可以完成图形界面所不能提供的功能。 761 | 762 | .. 
code:: 763 | 764 | # ovirt-shell -I -u admin@internal -l https://192.168.10.100/api 765 | ============================================================================ 766 | >>> connected to oVirt manager 3.4.0.0 <<< 767 | ============================================================================ 768 | 769 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 770 | Welcome to oVirt shell 771 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 772 | [oVirt shell (connected)]# 773 | 774 | **示例:使用ovirt-shell或者API来连接虚拟机。** 775 | 776 | 1. 获取虚拟机列表及其所在宿主机 777 | 778 | + ovirtshell 779 | 780 | .. code:: 781 | 782 | # ovirt-shell -I -u admin@internal -l https://192.168.10.100/api -E "list vms" 783 | id : 124e8020-c9d7-4e86-81e1-0d4e28ff1cd4 784 | name : aaa 785 | # ovirt-shell -I -u admin@internal -l https://192.168.10.100/api -E "show vm aaa" 786 | id : 124e8020-c9d7-4e86-81e1-0d4e28ff1cd4 787 | name : aaa 788 | ... 789 | display-address : 192.168.10.100 790 | ... 791 | display-port : 5912 792 | display-secure_port : 5913 793 | ... 794 | 795 | + restapi 796 | 797 | .. code:: 798 | 799 | # curl -u admin@internal:admin https://192.168.10.100/api/vms | less 800 | aaa 801 | 802 | ... 803 | 804 | spice 805 |
<address>192.168.10.100</address>
<port>5912</port>
<secure_port>5913</secure_port>
<monitors>1</monitors>
<single_qxl_pci>false</single_qxl_pci>
<allow_override>false</allow_override>
<smartcard_enabled>false</smartcard_enabled>
813 | ... 814 | 815 | 2. 获取/设置控制台密码 816 | 817 | + ovirtshell 818 | 819 | .. code:: 820 | 821 | # ovirt-shell -I -u admin@internal -l https://192.168.10.100/api -E "action vm aaa ticket" 822 | # [oVirt shell (connected)]# action vm aaa ticket 823 | 824 | status-state : complete 825 | ticket-expiry: 7200 826 | ticket-value : MfY9P5kpmNpw 827 | vm-id : 124e8020-c9d7-4e86-81e1-0d4e28ff1cd4 828 | 829 | + restapi 830 | 831 | .. code:: 832 | 833 | # curl -k -u admin@internal:admin https://192.168.10.100/api/vms/124e8020-c9d7-4e86-81e1-0d4e28ff1cd4/ticket -X POST -H "Content-type: application/xml" -d '120' 834 | 835 | 836 | 837 | jRUqhrks6JiT 838 | 120 839 | 840 | ... 841 | 842 | complete 843 | 844 | 845 | 846 | 3. 连接虚拟机 847 | 848 | 除了以上获取的显示端口、宿主机IP,我们需要额外获取一个根证书。 849 | 850 | .. code:: 851 | 852 | # wget http://192.168.10.100/ca.crt 853 | 854 | + ovirt-shell 855 | 856 | .. code:: 857 | 858 | # ovirt-shell -I -u admin@internal -l https://192.168.10.100/api \ 859 | -E "console aaa" 860 | 861 | + virt-viewer 862 | 863 | .. code:: 864 | 865 | # remote-viewer --spice-ca-file=ca.crt spice://192.168.10.100?port=5912&tls-port=5913&password=jRUqhrks6JiT 866 | 867 | 主机hooks 868 | ---------- 869 | 870 | 主机hooks位于各个宿主机上,用于扩展oVirt的平台功能,比如网络、USB设备、SRIOV等。原理是在某一事件触发时(比如虚拟机启动之前),修改libvirt启动XML文件、环境变量或者主机配置,从而改变qemu的启动参数。 871 | 872 | 更多hooks内容请参考 `vdsm-hooks `_ 。 873 | 874 | **示例:使用libvirt内部网络** 875 | 876 | 1. 准备所需文件。 877 | 878 | 拷贝 `extnet_vnic.py `_ 至/usr/libexec/vdsm/hooks/before_vm_start/,不要忘记添加可执行权限。 879 | 880 | 2. 查看,添加libvirt网络。 881 | 882 | .. code:: 883 | 884 | # virsh net-list 885 | 886 | 在 *extnet_vnic.py* 文件中 **newnet = os.environ.get('extnet')** 前一行添加如下代码,替换其中的 **default** 为要使用的libvirt网络, **其他的hooks脚本多数可以这样修改,也可以在engine-config的CustomProperty中指定** : 887 | 888 | .. code:: 889 | 890 | # 注意此处使用双引号 891 | params = "default" 892 | # 同上部分粗体字,因为大部分hooks检查engine-config中的环境变量,简便起见,我直接在hooks脚本中设置了环境变量 893 | os.environ.__setitem__("extnet",params) 894 | 895 | 3. 
虚拟机一定要添加网络配置,否则会启动失败。在虚拟机启动时,第一个网络配置文件会被替换为 **default** 网络。

.. note:: 如果忘记了libvirt密码,可以用以下命令重置。

   .. code::

      # saslpasswd2 -a libvirt root

第三方界面
-----------

oVirt现在自带的GWT界面对浏览器和客户端的要求较高,有一些人由于这个原因弃用它而使用第三方的web界面,参考 `oVirt Dash `_ 、 `ovirt sample-portals `_ 。

主机策略
---------

参考这个 `PDF `_ 。(没记错的话,里面那个显卡透传的问题,是当初我问的。)

使用virt-install安装系统
--------------------------

**示例:**

.. code::

   # virt-install \
       --name centos7 \
       --ram 2048 \
       --disk path=/var/kvm/images/centos7.img,format=qcow2 \
       --vcpus 2 \
       --os-type linux \
       --os-variant rhel7 \
       --graphics none \
       --console pty,target_type=serial \
       --location 'http://mirrors.aliyun.com/centos/7/os/x86_64/' \
       --extra-args 'console=ttyS0,115200n8 serial'

虚拟机文件系统扩容
-------------------

oVirt 3.4 的磁盘可以在线扩容,但是磁盘内的文件系统需要另行处理,在此列出常用的Linux以及Windows扩容方法。

**Linux文件系统扩容(镜像扩容后操作)**

在镜像扩容后进行如下操作。

1. 重写分区表

.. code::

   # fdisk /dev/sda
   > d
   > 3
   > n
   > 3
   > w
   然后
   # partprobe
   或者
   # reboot

2. 在线扩容文件系统

.. code::

   # resize2fs /dev/sda3

**Linux文件系统扩容(lvm)**

在创建好一个新的分区或者一个新的磁盘后,将新空间(比如10G)添加到PV、VG,扩容LV,然后扩容文件系统。

.. code::

   # vgextend vg_livecd /dev/sdb
   # lvextend /dev/vg_livecd/lv_root -L +10G
   # resize2fs /dev/vg_livecd/lv_root

**Linux文件系统扩容(libguestfs)**

具体内容请参考 `libguestfs site `_ 。

1. 检视磁盘

.. code::

   # virt-filesystems --all --long -h -a hda.img

2. 创建待扩容磁盘副本,同时扩容10G(假设原磁盘大小为10G)

对于RAW格式:

..
code:: 989 | 990 | # truncate -r hda.img hda-new.img 991 | # truncate -s +10G hda-new.img 992 | 993 | 对于QCOW2等有压缩格式: 994 | 995 | .. code:: 996 | 997 | # qemu-img create -f qcow2 -o preallocation=metadata hda-new.img 20G 998 | 999 | 3. 扩展分区尺寸 1000 | 1001 | 普通分区扩展,/boot分区扩容200M,其余全部给/分区: 1002 | 1003 | .. code:: 1004 | 1005 | # virt-resize --resize /dev/sda1=+200M --expand /dev/sda2 hda.img hda-new.img 1006 | 1007 | LVM分区扩展,扩容lv_root逻辑卷: 1008 | 1009 | .. code:: 1010 | 1011 | # virt-resize --expand /dev/sda2 --LV-expand /dev/vg_livecd/lv_root hda.qcow2 hda-new.qcow2 1012 | 1013 | **FAT/NTFS扩容** 1014 | 1015 | XP使用Paragon Partion Manager;Windows 7 在磁盘管理中即可扩容。 1016 | 1017 | P2V/V2P 1018 | -------- 1019 | 1020 | **V2V** 1021 | 1022 | 在此以ESXi迁移至oVirt为例。 1023 | 1024 | 1. 在oVirt上创建一个NFS导出域 1025 | 1026 | 2. 安装libguestfs,并创建.netrc文件,文件内容为ESXi的登陆信息: 1027 | 1028 | *~/.netrc* 1029 | 1030 | .. code:: 1031 | 1032 | machine 192.168.1.135 login root password 1234567 1033 | 1034 | 3. 开始迁移,确保ESXi的虚拟机已正常关闭: 1035 | 1036 | .. code:: 1037 | 1038 | # virt-v2v -ic esx://192.168.1.135/?no_verify=1 -o rhev -os 192.168.1.111:/export --network mgmtnet myvm 1039 | myvm_myvm: 100% [====================================================]D 0h04m48s 1040 | virt-v2v: myvm configured with virtio drivers. 1041 | 1042 | 4. 从导出域导入虚拟机并运行: 1043 | 1044 | 导入虚拟机: 1045 | 1046 | .. image:: ../images/03-11.png 1047 | :align: center 1048 | 1049 | 运行虚拟机: 1050 | 1051 | .. image:: ../images/03-12.png 1052 | :align: center 1053 | 1054 | **P2V** 1055 | 1056 | oVirt的P2V方式我所知的有三种,一是使用VMWare的P2V工具转化为VM以后,再通过virt-v2v转化为oVirt的VM,二是使用clonezilla或者ghost制作系统后,再将其安装到oVirt中,三则是使用virt-p2v工具。笔者在此使用virt-p2v工具示例,成文时只在CentOS 6.5以上版本测试,CentOS 7未测试,RHEL 7.1以上有此工具。 1057 | 1058 | 1. 在服务器端安装所需包: 1059 | 1060 | .. code:: 1061 | 1062 | yum install -y virt-p2v virt-v2v 1063 | 1064 | 2. 将/usr/share/virt-v2v/中的ISO文件dd到U盘或者烧录到光盘,然后在要转化的机器上启动。 1065 | 1066 | 3. 修改服务器端/etc/virt-v2v.conf,修改rhevm的配置,形如: 1067 | 1068 | .. 
code::

   <!-- 示例profile,name等属性可按实际环境修改 -->
   <profile name="myrhev">
     <method>rhev</method>
     <storage format="raw" allocation="sparse">
       nfs.example.com:/ovirt-export
     </storage>
     <network type="default">
       <network type="network" name="rhevm"/>
     </network>
   </profile>

4. 开始转换(我这里使用libvirt的profile示例)

.. image:: ../images/03-13.png
   :align: center

5. 转换完成后关闭计算机,并修改虚拟机的配置以完全适应OS。

UI 插件
--------

详细内容请参考 `oVirt官网关于UI插件的介绍 `_ 以及附录部分内容,在此我仅使用ShellInABox举例,你可以考虑将上面的libguestfs扩容加进来,更多UI Plugin请 **git clone git://gerrit.ovirt.org/samples-uiplugins.git** 。

**ShellInABox oVirt UI plugin**

1. 在宿主机上安装ShellInABox。

.. code::

   # yum install shellinabox
   # chkconfig shellinaboxd on

修改shellinabox配置 **OPTS** :

*/etc/sysconfig/shellinaboxd*

.. code::

   OPTS="--disable-ssl --service /:SSH"

2. 拷贝uiplugin文件并启动服务。

.. code::

   # git clone git://gerrit.ovirt.org/samples-uiplugins.git
   # cp -r samples-uiplugins/shellinabox-plugin/* /usr/share/ovirt-engine/ui-plugins/
   # service shellinaboxd start
   # service ovirt-engine restart

.. image:: ../images/03-09.png
   :align: center

.. note:: shellinabox插件链接问题

   由于3.3到3.4之后链接的变化,shellinabox.json中的 **/webadmin/webadmin/plugin/ShellBoxPlugin/start.html** 需要替换为 **plugin/ShellBoxPlugin/start.html** 。

.. note:: shellinabox的root登录问题

   .. code::

      # echo -e "pts0\npts1\npts2" >> /etc/securetty

使用SysPrep/Cloud-Init重置虚拟机信息
-------------------------------------

参考 https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.4/html/Administration_Guide/sect-Sealing_Templates_in_Preparation_for_Deployment.html 。

**初始化Redhat系列Linux**

..
code:: 1139 | 1140 | # touch /.unconfigured 1141 | # rm -i /etc/ssh/ssh_host_* 1142 | # echo "HOSTNAME=localhost.localdomain" >> /etc/sysconfig/network 1143 | # rm -i /etc/udev/rules.d/70-persistent* 1144 | 1145 | 删除 */etc/sysconfig/network-scripts/ifcfg-** 文件中的 **HWADDR=** 字段,删除 */var/log/** 、 */root/*.log* ,然后关机即可。 1146 | 1147 | 修改密码 1148 | 1149 | .. code:: 1150 | 1151 | # virt-sysprep --root-password password:123456 -a os.img 1152 | 1153 | **初始化Windows 7** 1154 | 1155 | 1. 使用http://www.drbl-winroll.org工具批量修改。 1156 | 1157 | 2. 使用位于 *C:\\Windows\\system32\\sysprep* 目录下的工具。 1158 | 1159 | .. image:: ../images/03-10.jpg 1160 | :align: center 1161 | 1162 | 与其他平台集成、与认证服务器集成 1163 | --------------------------------- 1164 | 1165 | oVirt平台目前可以使用Foreman,OpenStack Network,OpenStack Image的部分功能,具体实施请参阅附录一内容。 1166 | 1167 | 与认证服务器集成时很有可能遇到各种问题,比如与AD集成时不能使用Administrator用户进行engine-manage-domains,与IPA集成时需要修改minss之类的参数,与LDAP集成时需要Kerberos或者使用3.5版本中的aaa认证插件。 1168 | 1169 | 后端(libvirt/qemu/kernel)优化 1170 | -------------------------------- 1171 | 1172 | 如果你觉得现有平台的性能达不到预期,或者有其他的需求,可以从以下几方面进行调节或优化。 1173 | 1174 | - qemu : 我写了 `一系列qemu的脚本 `_ ,你可以调节里面的参数直接启动虚拟机进行调试或者优化。 1175 | 1176 | - libvirt : `libvirt `_ 的目的是统一各种虚拟化后端的调用方式(kvm/xen/lxc/OpenVZ//VirtualBox/VMWare/Hyper-V等等),主要一个特性是用统一的描述文件来定义虚拟机配置(xml文件),在Linux下你可以使用 `Virt Manager `_ 进行调试,相关文档参考 `libvirt ref `_ 。 1177 | 1178 | - kernel : 基本上大部分的内核相关配置都可以通过 **/sys** 或者 **/proc** 进行调节,而针对内核所谓的“优化”,在非大规模部署的情况下,其优势很难体现出来,还有一方面,目前的KVM效率,CPU、内存管理、网络等方面都比较优秀,在I/O方面还有部分不足,可以在 `VIRTIO `_ 上进行相关的优化,还有开启 `HugePage <://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_ 等操作。 1179 | -------------------------------------------------------------------------------- /source/posts/ch04.rst: -------------------------------------------------------------------------------- 1 | ============================= 2 | 第四章 数据抓取与机器学习算法 3 | ============================= 4 | 5 | 在开始这一章之前,你可能需要补习一下数学知识;还有熟悉下常见工具(语言),不必多年开发经验,会处理常见数据结构、能格式化文件即可。 6 | 7 | 建议先通读一下 `Scrapy 中文文档 
`_ ,这样你会省去好多Google的时间;在 `知乎 `_ 上有许多关于 *大数据* 、 *数据挖掘* 的讨论,你可以去看看,了解一些业内的动态。 8 | 9 | 另外,可以使用 `Nutch `_ 来爬取,并用 `Solr `_ 来构建一个简单的搜索引擎,它们可以跟下一章节的Hadoop集成。 10 | 11 | 还有一个比较重要的点-- `Model Thinking `_ ,你需要的不只是建模的知识,还要有建模的思想。数据和算法并不是最重要的,重要的是你如何利用这些数据,通过你设计的模型输出对你有用的结果。 12 | 13 | **不要以编程开始你的机器学习之旅,这样容易使思维受限于语言,应从模型和结果出发思考去达到你的目的,编程只是手段之一。** 14 | 15 | ------------- 16 | 4.1 数据收集 17 | ------------- 18 | 19 | 数据收集是学习数据分析的开始,我们每一天都在产生、接触大量数据。 20 | 21 | 为了省去一些学习的麻烦,我找了一些 `“大”数据 `_ 。其中有些上百TB的数据对非行业内的人来说可能毫无意义,但对学习者来说还是比较实用的,先从这些数据开始吧。 22 | 23 | 简单抓取 24 | ========= 25 | 26 | 动手写一个最简单的爬虫 27 | ----------------------- 28 | 29 | 实际使用时遇到的问题 30 | ---------------------- 31 | 32 | 分布式抓取 33 | =========== 34 | 35 | scrapyd 36 | -------- 37 | 38 | scrapy-redis 39 | ------------- 40 | 41 | 使用Nutch + Solr 42 | ----------------- 43 | 44 | ------------- 45 | 4.2 爬虫示例 46 | ------------- 47 | 48 | 58同城 49 | ======= 50 | 51 | 我简单写了一个 `收集58同城中上海出租房信息的爬虫 `_ ,包括的条目有: *描述* 、 *位置* 、 *价格* 、 *房间数* 、 *URL* 。 52 | 53 | 由于这些信息都可以在地图上表示出来,所以除了统计图以外,我还会把它们画在地图上。 54 | 55 | 知乎 56 | ==== 57 | 58 | http://blog.javachen.com/2014/06/08/using-scrapy-to-cralw-zhihu/ 59 | 60 | http://segmentfault.com/blog/javachen/1190000000583419 61 | 62 | https://github.com/KeithYue/Zhihu_Spider.git 63 | 64 | 新浪微博 65 | ========= 66 | 67 | https://github.com/followyourheart/sina-weibo-crawler 68 | 69 | --------------- 70 | 4.3 numpy 快查 71 | --------------- 72 | 73 | ..
code:: 74 | 75 | import numpy as np 76 | a = np.arange(1,5) 77 | data_type = [('name','S10'), ('height', 'float'), ('age', int)] 78 | values = [('Arthur', 1.8, 41), ('Lancelot', 1.9, 38), 79 | ('Galahad', 1.7, 38)] 80 | b = np.array(values, dtype=data_type) 81 | c = np.arange(6,10) 82 | 83 | # 符号 84 | np.sign(a) 85 | 86 | # 数组最大值 87 | a.max() 88 | 89 | # 数组最小值 90 | a.min() 91 | 92 | # 区间峰峰值 93 | a.ptp() 94 | 95 | # 乘积 96 | a.prod() 97 | 98 | # 累乘(累积乘积) 99 | a.cumprod() 100 | 101 | # 平均值 102 | a.mean() 103 | 104 | # 中值 105 | np.median(a) 106 | 107 | # 差分 108 | np.diff(a) 109 | 110 | # 方差 111 | np.var(a) 112 | 113 | # 元素条件查找,返回index的array 114 | np.where(a>2) 115 | 116 | # 返回第2,3,4个元素的array 117 | np.take(a, np.array([1,2,3])) 118 | 119 | # 排序 120 | np.msort(a) 121 | np.sort(b, kind='mergesort', order='height') 122 | 123 | # 均分,元素个数为奇数的array不可均分为两份 124 | np.split(c,2) 125 | 126 | # 创建单位矩阵 127 | np.eye(3) 128 | 129 | # 最小二乘,参数为[x,y,degree],degree为多项式的最高次幂,返回值为所有次幂的系数 130 | np.polyfit(a,c,1) 131 | 132 | --------------------------------- 133 | 4.4 监督学习常用算法及Python实现 134 | --------------------------------- 135 | 136 | 信息分类基础 137 | ============= 138 | 139 | 信息的不稳定性称为熵(entropy),而信息增益指有无某一样本特征对分类结果影响的大小。比如,抛硬币正反两面各有50%概率,此时不稳定性最大,熵为1;太阳明天照常升起,则是必然,此事不稳定性最小,熵为0。 140 | 141 | 假设事件X发生的概率为x,其信息期望值定义为: 142 | 143 | .. math:: 144 | 145 | l(X) = -\log_2 x 146 | 147 | 对有n个可能取值的信息源,整个信息的熵为各取值信息量的期望: 148 | 149 | .. math:: 150 | 151 | H = -\sum^n_{i=1} x_i \log_2 x_i 152 | 153 | 如何找到最好的分类特征: 154 | 155 | ..
code:: 156 | 157 | def chooseBestFeatureToSplit(dataSet): 158 | numFeatures = len(dataSet[0]) - 1 #the last column is used for the labels 159 | baseEntropy = calcShannonEnt(dataSet) 160 | bestInfoGain = 0.0; bestFeature = -1 161 | for i in range(numFeatures): #iterate over all the features 162 | featList = [example[i] for example in dataSet] #create a list of all the examples of this feature 163 | uniqueVals = set(featList) #get a set of unique values 164 | newEntropy = 0.0 165 | for value in uniqueVals: 166 | subDataSet = splitDataSet(dataSet, i, value) 167 | prob = len(subDataSet)/float(len(dataSet)) 168 | newEntropy += prob * calcShannonEnt(subDataSet) 169 | infoGain = baseEntropy - newEntropy #calculate the info gain; ie reduction in entropy 170 | if (infoGain > bestInfoGain): #compare this to the best gain so far 171 | bestInfoGain = infoGain #if better than current best, set to best 172 | bestFeature = i 173 | return bestFeature #returns an integer 174 | 175 | 其中,dataSet为所有特征向量,calcShannonEnt()计算数据集的熵,splitDataSet()取出第i个特征取值为value的子集并去掉该列;infoGain即为信息增益,chooseBestFeatureToSplit返回最好的特征的索引值。 176 | 177 | K近邻算法 178 | ========== 179 | 180 | kNN的算法模型如下: 181 | 182 | 对未知类别属性的数据集中的每个点依次执行以下操作: 183 | 184 | - 计算已知类别数据集中的点与当前点之间的距离 185 | 186 | - 按照距离递增次序排序 187 | 188 | - 选取与当前点距离最小的k个点 189 | 190 | - 确定前k个点所在类别的出现频率 191 | 192 | - 返回前k个点出现频率最高的类别作为当前点的预测分类 193 | 194 | 代码参考如下: 195 | 196 | ..
code:: 197 | 198 | def classify0(inX, dataSet, labels, k): 199 | dataSetSize = dataSet.shape[0] 200 | diffMat = tile(inX, (dataSetSize,1)) - dataSet 201 | sqDiffMat = diffMat**2 202 | sqDistances = sqDiffMat.sum(axis=1) 203 | distances = sqDistances**0.5 204 | sortedDistIndicies = distances.argsort() 205 | classCount={} 206 | for i in range(k): 207 | voteIlabel = labels[sortedDistIndicies[i]] 208 | classCount[voteIlabel] = classCount.get(voteIlabel,0) + 1 209 | sortedClassCount = sorted(classCount.iteritems(), key=operator.itemgetter(1), reverse=True) 210 | return sortedClassCount[0][0] 211 | 212 | 其中,inX为输入向量,dataSet为数据集,labels为数据集的分类标签,k为选取的近邻个数,可调。距离计算公式为d0 = ((x-x0)**2 + (y-y0)**2)**0.5。 213 | 214 | 此种算法的优点为精度高、对异常值不敏感,但缺点也比较明显,即数据量大时开销相对较大,适用于数值型和标称型数据。 215 | 216 | 决策树 217 | ====== 218 | 219 | 决策树即列出一系列选择,根据训练集中的大量形如(A、B、C)以及结果D的向量来预测新输入(A'、B'、C')的结果D'。 220 | 221 | 首先创建一个决策树: 222 | 223 | .. code:: 224 | 225 | def createTree(dataSet,labels): 226 | classList = [example[-1] for example in dataSet] 227 | if classList.count(classList[0]) == len(classList): 228 | return classList[0] #stop splitting when all of the classes are equal 229 | if len(dataSet[0]) == 1: #stop splitting when there are no more features in dataSet 230 | return majorityCnt(classList) 231 | bestFeat = chooseBestFeatureToSplit(dataSet) 232 | bestFeatLabel = labels[bestFeat] 233 | myTree = {bestFeatLabel:{}} 234 | del(labels[bestFeat]) 235 | featValues = [example[bestFeat] for example in dataSet] 236 | uniqueVals = set(featValues) 237 | for value in uniqueVals: 238 | subLabels = labels[:] #copy all of labels, so trees don't mess up existing labels 239 | myTree[bestFeatLabel][value] = createTree(splitDataSet(dataSet, bestFeat, value),subLabels) 240 | return myTree 241 | 242 | 找到影响最大的特征bestFeat后,按此特征的各个取值划分出子数据集,将bestFeat列分离后对子集继续迭代,直至所有特征都转换成决策节点。 243 | 244 | 原始数据比如: 245 | 246 | no-surfacing flippers fish 247 | 1 yes yes yes 248 | 2 yes yes yes 249 | 3 yes no no 250 | 4 no yes no 251 | 5 no yes no 252 | 253 |
会生成如下决策树: 254 | 255 | no-surfacing? 256 | / \ 257 | no/ \yes 258 | fish(no) flippers? 259 | / \ 260 | no/ \yes 261 | fish(no) fish(yes) 262 | 263 | 表示成JSON格式,即Python字典: 264 | 265 | {'no-surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}} 266 | 267 | 构建决策树的方法比较多,也可使用C4.5和CART算法。 268 | 269 | 接下来使用决策树进行分类: 270 | 271 | .. code:: 272 | 273 | def classify(inputTree,featLabels,testVec): 274 | firstStr = inputTree.keys()[0] 275 | secondDict = inputTree[firstStr] 276 | featIndex = featLabels.index(firstStr) 277 | key = testVec[featIndex] 278 | valueOfFeat = secondDict[key] 279 | if isinstance(valueOfFeat, dict): 280 | classLabel = classify(valueOfFeat, featLabels, testVec) 281 | else: classLabel = valueOfFeat 282 | return classLabel 283 | 284 | 其中,featLabels为测试的判断节点,即特征,testVec为其值,比如 classify(myTree, ['no-surfacing','flippers'], [1,1]),如此结果便为'yes'。 285 | 286 | 使用pickle对决策树进行序列化存储: 287 | 288 | .. code:: 289 | 290 | def storeTree(inputTree,filename): 291 | import pickle 292 | fw = open(filename,'w') 293 | pickle.dump(inputTree,fw) 294 | fw.close() 295 | 296 | 其中,dump可选协议为0(ASCII),1(BINARY),默认为0;读取时使用pickle.load;同样可使用dumps,loads直接对字符串变量进行操作。 297 | 298 | 此种算法计算复杂度不高,对中间值缺失不敏感,但可能会产生过拟合的问题。 299 | 300 | 301 | 朴素贝叶斯 302 | =========== 303 | 304 | 贝叶斯模型是基于独立概率统计的,思想大概可以这么说: 305 | 306 | .. code:: 307 | 308 | 总共7个石子在A、B两个桶中,A桶中有2黑2白,B桶中有2黑1白。已知条件为石子来自B桶,那么它是白色石子的概率可表示为: 309 | 310 | P(white|B)=P(B|white)P(white)/P(B) 311 | 312 | 接下来,定义两个事件A、B,P(A|B)与P(B|A)相互转化的过程即为: 313 | 314 | P(B|A)=P(A|B)P(B)/P(A) 315 | 316 | 而朴素贝叶斯可以这样描述: 317 | 318 | 设x={a1,a2,...,am}为待分类项,a为x的特征属性,类别集合为C={y1,y2,...,yn},如果P(yk|x)=max(P(y1|x),P(y2|x),...,P(yn|x)),则x属于yk。 319 | 320 | 整个算法核心即是等式P(yi|x)=P(x|yi)P(yi)/P(x)。 321 | 322 | 首先构建一个分类训练函数(二元分类): 323 | 324 | ..
code:: 325 | 326 | def trainNB0(trainMatrix,trainCategory): 327 | numTrainDocs = len(trainMatrix) 328 | numWords = len(trainMatrix[0]) 329 | pBad = sum(trainCategory)/float(numTrainDocs) 330 | p0Num = ones(numWords); p1Num = ones(numWords) #change to ones() 331 | p0Denom = 2.0; p1Denom = 2.0 #change to 2.0 332 | for i in range(numTrainDocs): 333 | if trainCategory[i] == 1: 334 | p1Num += trainMatrix[i] 335 | p1Denom += sum(trainMatrix[i]) 336 | else: 337 | p0Num += trainMatrix[i] 338 | p0Denom += sum(trainMatrix[i]) 339 | p1Vect = log(p1Num/p1Denom) #change to log() 340 | p0Vect = log(p0Num/p0Denom) #change to log() 341 | return p0Vect,p1Vect,pBad 342 | 343 | 其中,trainMatrix为所有训练集中的布尔向量,比如两本书A、B,其中A有两个单词x、y,B有两个单词x、z,并且A是好书(值计为0),B是烂书(值计为1),把所有单词进行排序后得向量['x','y','z'],则A的Matrix可表示为[1,1,0],B的为[1,0,1],所以此函数中的trainMatrix即[[1,1,0],[1,0,1]],trainCategory为[0,1]。 344 | 函数返回的为两个类别下各单词的对数概率向量,以及烂书的先验概率pBad。 345 | 346 | 分类函数: 347 | 348 | .. code:: 349 | 350 | def classifyNB(vec2Classify, p0Vec, p1Vec, pClass1): 351 | p1 = sum(vec2Classify * p1Vec) + log(pClass1) #element-wise mult 352 | p0 = sum(vec2Classify * p0Vec) + log(1.0 - pClass1) 353 | if p1 > p0: 354 | return 1 355 | else: 356 | return 0 357 | 358 | vec2Classify即为要分类的向量,形如trainMatrix,随后的三个参数为trainNB0所返回。p1、p0可以理解为两个类别的对数后验概率(省略了相同的分母P(x)),比较两者大小即可分类。 359 | 360 | 测试用例: 361 | 362 | ..
code:: 363 | 364 | def testingNB(): 365 | listOPosts,listClasses = loadDataSet() 366 | myVocabList = createVocabList(listOPosts) 367 | trainMat=[] 368 | for postinDoc in listOPosts: 369 | trainMat.append(setOfWords2Vec(myVocabList, postinDoc)) 370 | p0V,p1V,pAb = trainNB0(array(trainMat),array(listClasses)) 371 | testEntry = ['love', 'my', 'dalmation'] 372 | thisDoc = array(setOfWords2Vec(myVocabList, testEntry)) 373 | print testEntry,'classified as: ',classifyNB(thisDoc,p0V,p1V,pAb) 374 | testEntry = ['stupid', 'garbage'] 375 | thisDoc = array(setOfWords2Vec(myVocabList, testEntry)) 376 | print testEntry,'classified as: ',classifyNB(thisDoc,p0V,p1V,pAb) 377 | 378 | 整体来说,朴素贝叶斯分类方法在数据较少的情况下仍然有效,但是对数据输入比较敏感。 379 | 380 | Logistic回归 381 | ============= 382 | 383 | 在统计学中,线性回归是利用称为线性回归方程的最小二乘函数对一个或多个自变量和因变量之间关系进行建模的一种回归分析。这种函数是一个或多个称为回归系数的模型参数的线性组合。只有一个自变量的情况称为简单回归,大于一个自变量情况的叫做多元回归。( `维基百科 `_ ) 384 | 385 | 先介绍两个重要的数学概念。 386 | 387 | **最小二乘法则** 388 | 389 | 最小二乘法(又称最小平方法)是一种数学优化技术。它通过最小化误差的平方和寻找数据的最佳函数匹配。 390 | 391 | 利用最小二乘法可以简便地求得未知的数据,并使得这些求得的数据与实际数据之间误差的平方和为最小。 392 | 393 | *示例1* 394 | 395 | 有四个数据点(1,6)、(2,5)、(3,7)、(4,10),我们希望找到一条直线y=a+bx与这四个点最匹配。 396 | 397 | .. math:: 398 | 399 | a+1b=6 400 | 401 | a+2b=5 402 | 403 | a+3b=7 404 | 405 | a+4b=10 406 | 407 | 采用最小二乘法使等号两边的方差尽可能小,也就是找出这个函数的最小值: 408 | 409 | .. math:: 410 | 411 | S(a,b) = [6-(a+1b)]^2+[5-(a+2b)]^2+[7-(a+3b)]^2+[10-(a+4b)]^2 412 | 413 | 然后对S(a,b)求a,b的偏导数,使其为0得到: 414 | 415 | .. math:: 416 | 417 | \cfrac{{\partial}S}{{\partial}a} = 0 = 8a+20b-56 418 | 419 | \cfrac{{\partial}S}{{\partial}b} = 0 = 20a+60b-154 420 | 421 | 这样就解出: 422 | 423 | .. math:: 424 | 425 | a=3.5,b=1.4 426 | 427 | 所以直线y=3.5+1.4x是最佳的。 428 | 429 | *函数表示* 430 | 431 | .. math:: 432 | 433 | \min_{\vec{b}}{\sum^n_{i=1}}(y_m-y_i)^2 434 | 435 | *欧几里德表示* 436 | 437 | .. math:: 438 | 439 | \min_{ \vec{b} } \| \vec{y}_{m} ( \vec{b} ) - \vec{y} \|_{2} 440 | 441 | *线性函数模型* 442 | 443 | 典型的一类函数模型是线性函数模型。最简单的线性式是 444 | 445 | .. 
math:: 446 | 447 | y = b_0 + b_1 t 448 | 449 | 写成矩阵式,为 450 | 451 | .. math:: 452 | 453 | \min_{b_0,b_1}\left\|\begin{pmatrix}1 & t_1 \\ \vdots & \vdots \\ 1 & t_n \end{pmatrix}\begin{pmatrix} b_0\\ b_1\end{pmatrix} - \begin{pmatrix} y_1 \\ \vdots \\ y_{n}\end{pmatrix}\right\|_{2} = \min_b\|Ab-Y\|_2 454 | 455 | 直接给出该式的参数解: 456 | 457 | .. math:: 458 | 459 | b_1 = \frac{\sum_{i=1}^n t_iy_i - n \cdot \bar t \bar y}{\sum_{i=1}^n t_i^2- n \cdot (\bar t)^2} 460 | 461 | b_0 = \bar y - b_1 \bar t 462 | 463 | 其中 464 | 465 | .. math:: 466 | 467 | \bar t = \frac{1}{n} \sum_{i=1}^n t_i 468 | 469 | 为t值的算术平均值。也可解得如下形式: 470 | 471 | .. math:: 472 | 473 | b_1 = \frac{\sum_{i=1}^n (t_i - \bar t)(y_i - \bar y)}{\sum_{i=1}^n (t_i - \bar t)^2} 474 | 475 | *示例2* 476 | 477 | 随机选定10艘战舰,并分析它们的长度与宽度,寻找它们长度与宽度之间的关系。由下面的散点图可以直观地看出,一艘战舰的长度(t)与宽度(y)基本呈线性关系。散点图如下: 478 | 479 | .. image:: ../images/04-02.png 480 | :align: center 481 | 482 | 以下图表列出了各战舰的数据,随后步骤是采用最小二乘法确定两变量间的线性关系。 483 | 484 | .. image:: ../images/04-03.png 485 | :align: center 486 | 487 | 仿照上面给出的例子,先计算 488 | 489 | .. math:: 490 | 491 | \bar t = \frac {\sum_{i=1}^n t_i}{n} = \frac {1678}{10} = 167{.}8 492 | 493 | 并得到相应的 494 | 495 | .. math:: 496 | 497 | \bar y = 18{.}41 498 | 499 | 然后确定b1 500 | 501 | .. math:: 502 | 503 | b_1 = \frac{\sum_{i=1}^n (t_i- \bar {t})(y_i - \bar y)}{\sum_{i=1}^n (t_i- \bar t)^2} 504 | 505 | = \frac{3287{.}820} {20391{.}60} = 0{.}1612 \; 506 | 507 | 可以看出,战舰的长度每变化1m,相对应的宽度便要变化16cm。并由下式得到常数项b0: 508 | 509 | .. math:: 510 | 511 | b_0 = \bar y - b_1 \bar t = 18{.}41 - 0{.}1612 \cdot 167{.}8 = -8{.}6394 512 | 513 | 可以看出点的拟合非常好,长度和宽度的相关性大约为96.03%。利用Matlab得到拟合直线: 514 | 515 | .. image:: ../images/04-04.png 516 | :align: center 517 | 518 | **Sigmoid函数** 519 | 520 | Sigmoid函数具有近似单位阶跃函数的性质,公式表示为: 521 | 522 | .. math:: 523 | 524 | \sigma (z)=\cfrac{1}{1+e^{-z}} 525 | 526 | .. image:: ../images/04-01.png 527 | :align: center 528 | 529 | 我们将输入记为z,它由下面的公式得出: 530 | 531 | ..
math:: 532 | 533 | z=w_0 x_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n 534 | 535 | 使用向量写法: 536 | 537 | .. math:: 538 | 539 | z=w^T x 540 | 541 | 其中向量x是分类器的输入数据,向量w就是我们要找到的最佳系数。 542 | 543 | *基于优化方法确定回归系数* 544 | 545 | **梯度上升/下降法** 546 | 547 | 梯度上升法/下降法的思想是:要找到函数的最大值,最好的方法是沿着该函数的梯度方向探寻,函数f(x,y)的梯度如下表示: 548 | 549 | .. math:: 550 | 551 | {\nabla}f(x,y)=\begin{pmatrix} \cfrac{{\partial}f(x,y)}{{\partial}x} \\ \cfrac{{\partial}f(x,y)}{{\partial}y}\end{pmatrix} 552 | 553 | 可以这样理解此算法: 554 | 555 | 从前有一座山,一个懒人要爬山,他从山脚下的任意位置向山顶出发,并且知道等高线图的每个环上都有一个宿营点,他希望在这些宿营点之间修建一条笔直的路,并且路到两旁的宿营点的垂直距离差的平方和尽可能小。每到一个等高线圈,他都会根据他在上一个等高线的距离的变化量来调节他在等高线上的位置,从而使公路满足要求。 556 | 557 | 返回回归系数: 558 | 559 | .. code:: 560 | 561 | def gradAscent(dataMatIn, classLabels): 562 | dataMatrix = mat(dataMatIn) #convert to NumPy matrix 563 | labelMat = mat(classLabels).transpose() #convert to NumPy matrix 564 | m,n = shape(dataMatrix) 565 | alpha = 0.001 566 | maxCycles = 500 567 | weights = ones((n,1)) 568 | for k in range(maxCycles): #heavy on matrix operations 569 | h = sigmoid(dataMatrix*weights) #matrix mult 570 | error = (labelMat - h) #vector subtraction 571 | weights = weights + alpha * dataMatrix.transpose()* error #matrix mult 572 | return weights 573 | 574 | 其中,误差向量乘以数据矩阵的转置即为梯度方向。 575 | 576 | 待修改。 577 | 578 | SVM 579 | === 580 | 581 | SVM(Support Vector Machines)即支持向量机,完全理解其理论知识对数学要求较高。 582 | 583 | AdaBoost 584 | ======== 585 | 586 | 587 | --------------- 588 | 4.5 无监督学习 589 | --------------- 590 | 591 | --------------- 592 | 4.6 数据可视化 593 | --------------- 594 | 595 | 数据统计 596 | ========= 597 | 598 | Gephi 599 | 600 | GraphViz 601 | 602 | python-matplotlib 603 | 604 | Microsoft Excel 2013 PowerView 605 | 606 | 地理位置表示 607 | ============= 608 | 609 | `百度地图API `_ 610 | 611 | `MaxMind GeoIP `_ 612 | 613 | Microsoft Excel 2013 PowerView使用示例 614 | 615 | ----------------- 616 | 4.7 机器学习工具 617 | ----------------- 618 | 619 | `Weka `_ 620 | 621 | `Netlogo `_ 622 | 623 | `SciKit `_ 624 | 625 | `Pandas `_ 626 |
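本章各算法的代码片段均摘自更大的工程,依赖书中未列出的辅助函数,不能直接运行。作为收尾,下面给出一个可独立运行的最小示例(纯 Python 3 草稿,不依赖 NumPy,函数名为自拟的示意命名),用前文决策树一节的鱼类判定表(1 记为 yes,0 记为 no)完整演示 4.4 节中的香农熵计算与最优划分特征的选择:

```python
# 最小示例(假设场景):复现 calcShannonEnt / splitDataSet /
# chooseBestFeatureToSplit 的思路,数据为前文的 no-surfacing / flippers / fish 表。
from math import log2

def calc_shannon_ent(data_set):
    """计算数据集(最后一列为类别标签)的香农熵 H = -sum(p_i * log2 p_i)。"""
    counts = {}
    for row in data_set:
        counts[row[-1]] = counts.get(row[-1], 0) + 1
    ent = 0.0
    for n in counts.values():
        p = n / len(data_set)
        ent -= p * log2(p)
    return ent

def split_data_set(data_set, axis, value):
    """取出第 axis 个特征取值为 value 的样本,并去掉该列。"""
    return [row[:axis] + row[axis + 1:] for row in data_set if row[axis] == value]

def choose_best_feature(data_set):
    """遍历所有特征,返回信息增益最大的特征索引。"""
    base_ent = calc_shannon_ent(data_set)
    best_gain, best_feat = 0.0, -1
    for i in range(len(data_set[0]) - 1):
        new_ent = 0.0
        for value in {row[i] for row in data_set}:
            subset = split_data_set(data_set, i, value)
            new_ent += len(subset) / len(data_set) * calc_shannon_ent(subset)
        gain = base_ent - new_ent  # 信息增益,即熵的减少量
        if gain > best_gain:
            best_gain, best_feat = gain, i
    return best_feat

# 前文的原始数据:no-surfacing、flippers 两个特征,fish 为标签
fish_data = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'],
             [0, 1, 'no'], [0, 1, 'no']]
print(calc_shannon_ent(fish_data))   # 约 0.971
print(choose_best_feature(fish_data))
```

按前文的推导,最优划分特征应为索引 0(no-surfacing),与生成的决策树以它作根节点一致。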
-------------------------------------------------------------------------------- /source/posts/ch05.rst: -------------------------------------------------------------------------------- 1 | ======================= 2 | 第五章 数据处理平台 3 | ======================= 4 | 5 | 5.1 Hadoop简介 6 | --------------- 7 | 8 | 对初学者来说,Hadoop比现在更流行的Storm和Spark更有价值。因为Hadoop不止有MapReduce,还有负责集群资源调度的YARN,以及专为MapReduce设计的分布式文件系统HDFS,所以我认为在基础学习阶段它更具代表性。而Storm和Spark,它们的优劣我现在并不清楚,只知道前者适用于处理持续输入的流式数据,后者适用于包含复杂MapReduce过程的模型。 9 | 10 | 5.2 模块部署(单机/集群) 11 | ------------------------- 12 | 13 | 现在部署Hadoop的方式比过去更加容易,就我所知,你可以使用 `Cloudera Manager `_ 或者 Puppet 去完成企业级的部署;如果你需要概念验证(PoC)类的工作,可以直接使用 `Hortonworks 的虚拟机镜像 `_ 或者 `Cloudera的虚拟机镜像 `_ ,或者 `MapR `_ 。在接下来的章节中我会使用rpm包进行安装,而不是按照 `官方文档 `_ 去部署。 14 | 15 | Hue:`Hadoop User Experience `_ ,即Web UI。 16 | 17 | 单节点部署 18 | ~~~~~~~~~~~ 19 | 20 | 集群部署 21 | ~~~~~~~~~ 22 | 23 | 5.3 本地数据处理 24 | ----------------- 25 | 26 | 5.4 实时数据处理 27 | ----------------- 28 | 29 | 5.5 实例 30 | --------- 31 | 32 | 基于Solr和Nutch的搜索引擎 33 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ 34 | 35 | 5.6 与Storm/Spark配合使用 36 | ------------------------- 37 |
image:: ../images/exp-01.png 39 | :align: center 40 | 41 | 构建元素 42 | -------- 43 | 44 | 硬件: **HP N54L** 、 **Raspberry Pi** 、 **Mac mini** 、 **电话语音卡** 、 **WRT54G(可选)** 45 | 46 | 服务:网络认证、XMPP即时通信(服务群成员)、云存储、家庭知识库、家庭影像库、NAS(Apple Time Machine兼容)、数据源(微博等)、DNS(解析内部服务器)、语音电话、语音识别控制、股票分析、clamav(防病毒)、zabbix监控 47 | 48 | 软件:OpenLDAP、jabber、 、 `Seafile `_ 、 `owncloud `_ 、 `XBMC(更名Kodi) `_ 、Wiki、Asterisk、 `jasper `_ 、 Hadoop 、 `clamav `_ 、 AirPlay(Linux/OSX Server) 、 `Jarvis ` 49 | 50 | .. note:: 不需要的东西 51 | 52 | 建立一个搜索引擎就三步:下载网页、建立索引、质量排序。对的,我们不需要自己建立,主要原因就是索引量太小。有兴趣的话可以查看 http://en.wikipedia.org/wiki/List_of_search_engines ,或者使用Nutch、Lucene或者Sphinx来搭建自己的搜索引擎。 53 | 54 | OS X Server 55 | ------------ 56 | 57 | 鉴于OS X Server安装服务非常方便,这里就针对它的常用服务进行讲解。 58 | 59 | - Time Machine: 给Mac机器提供时光机器服务,可以很方便地对Mac进行备份与恢复,一定要保证磁盘划分合理。 60 | 61 | - VPN:可以创建基于L2TP或者PPTP的VPN服务器。 62 | 63 | - 信息:提供基于XMPP的Jabber即时消息服务。 64 | 65 | - Wiki:可以创建博客以及Wiki服务器。 66 | 67 | - 网站:可提供PHP或者Python的Web服务。OS X有一个 `webpromotion `_ 命令,用于更改桌面配置,以优化web服务体验。 68 | 69 | - 文件共享:可以通过Samba、AFP、WebDAV方式共享文件或目录。 70 | 71 | - FTP:提供FTP服务。 72 | 73 | - 通讯录:可提供CardDAV格式或者LDAP内的通讯录,适用于大多数移动设备。 74 | 75 | - NetInstall:提供网络安装OS X的服务,一般用于重装或者恢复系统。 76 | 77 | - Open Directory:提供LDAP服务,包含Kerberos认证。 78 | 79 | - DNS:用于内部DNS服务。 80 | 81 | .. note:: 家庭局域网DNS服务器 82 | 83 | 家庭局域网中的DNS服务器有时还是很有必要的,可以使用RPi、MacMini(或者其他合适的路由设备)作为DNS转发,配合 `dnsmasq `_ 快速部署DNS。 84 | --------------------------------------------------------------------------------
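上面提到用 dnsmasq 快速部署家庭局域网 DNS,下面给出一份最小配置示意(假设场景:网段 192.168.1.0/24、内部域名 home.lan、上游 DNS 与主机名均为举例,请按实际环境替换):

```ini
# /etc/dnsmasq.conf —— 家庭局域网 DNS 转发最小示例(地址与域名均为假设值)
domain-needed                        # 不转发没有域名部分的查询
bogus-priv                           # 不向上游转发私有网段的反向解析
server=114.114.114.114               # 上游 DNS(示例)
local=/home.lan/                     # home.lan 域只在本地解析,不查询上游
domain=home.lan
expand-hosts                         # /etc/hosts 中的短主机名自动补全 home.lan 后缀
address=/nas.home.lan/192.168.1.10   # 静态解析内部服务器(示例)
```

配好后把路由器 DHCP 下发的 DNS 指向运行 dnsmasq 的 RPi 或 Mac mini 即可。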