├── .gitignore ├── LICENSE ├── Makefile ├── MooseFS-2-0-Users-Manual.tex ├── MooseFS-3-0-Users-Manual.tex ├── MooseFS-Step-by-Step-Tutorial.tex ├── MooseFS-Upgrade-Guide.tex ├── README.md └── images ├── cgi_labelsAB.png ├── cgi_labelsAB_data.png ├── cgi_nolabels.png ├── cgi_resources1.png ├── cgi_resources2.png ├── diagram_A_B_ssd_hdd_v2.png ├── diagram_A_B_v2.png ├── diagram_ssd_hdd_v2.png ├── diagram_with_labels.png ├── diagram_without_labels.png ├── mfsscr1_blur.png ├── mfsscr2_blur.png ├── mfsscr3_blur.png ├── moosefs.png ├── read_mfs.png └── write_mfs.png /.gitignore: -------------------------------------------------------------------------------- 1 | *.aux 2 | *.bak 3 | *.bbl 4 | *.blg 5 | *.dvi 6 | *.log 7 | *.out 8 | *.pdf 9 | *.synctex.gz 10 | *.tex.bak 11 | *.toc 12 | .DS_Store 13 | .directory 14 | 15 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU General Public License, version 2 (GPL-2.0) 2 | 3 | The GNU General Public License (GPL-2.0) 4 | 5 | Version 2, June 1991 6 | 7 | Copyright (C) 1989, 1991 Free Software Foundation, Inc. 8 | 9 | 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. 10 | 11 | Everyone is permitted to copy and distribute verbatim copies 12 | 13 | of this license document, but changing it is not allowed. 14 | 15 | Preamble 16 | 17 | The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too. 
18 | 19 | When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. 20 | 21 | To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. 22 | 23 | For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. 24 | 25 | We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. 26 | 27 | Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. 28 | 29 | Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. 30 | 31 | The precise terms and conditions for copying, distribution and modification follow. 
32 | 33 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 34 | 35 | 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". 36 | 37 | Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 38 | 39 | 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. 40 | 41 | You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 42 | 43 | 2. 
You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: 44 | 45 | a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 46 | 47 | b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. 48 | 49 | c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) 50 | 51 | These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. 
52 | 53 | Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. 54 | 55 | In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 56 | 57 | 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: 58 | 59 | a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, 60 | 61 | b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, 62 | 63 | c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) 64 | 65 | The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. 
However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 66 | 67 | If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 68 | 69 | 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 70 | 71 | 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 72 | 73 | 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. 
You are not responsible for enforcing compliance by third parties to this License. 74 | 75 | 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. 76 | 77 | If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. 78 | 79 | It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. 80 | 81 | This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 82 | 83 | 8. 
If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 84 | 85 | 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. 86 | 87 | Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 88 | 89 | 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. 90 | 91 | NO WARRANTY 92 | 93 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 94 | 95 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 96 | 97 | END OF TERMS AND CONDITIONS 98 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2017 Piotr Robert Konopelko, Core Technology Sp. z o.o. 2 | # 3 | # This file is part of MooseFS. 4 | # 5 | # MooseFS is free software; you can redistribute it and/or modify 6 | # it under the terms of the GNU General Public License as published by 7 | # the Free Software Foundation, version 2 (only). 8 | # 9 | # MooseFS is distributed in the hope that it will be useful, 10 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | # GNU General Public License for more details. 
13 | # 14 | # You should have received a copy of the GNU General Public License 15 | # along with MooseFS; if not, write to the Free Software 16 | # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA 17 | # or visit http://www.gnu.org/licenses/gpl-2.0.html 18 | 19 | TEX=pdflatex 20 | RM=rm -f 21 | # The cleanup recipes use brace expansion, which plain sh (e.g. dash) does not support. 22 | SHELL=/bin/bash 23 | 24 | .PHONY: all clean 25 | 26 | all: MooseFS-2-0-Users-Manual.pdf MooseFS-3-0-Users-Manual.pdf MooseFS-Step-by-Step-Tutorial.pdf MooseFS-Upgrade-Guide.pdf 27 | 28 | MooseFS-2-0-Users-Manual.pdf: MooseFS-2-0-Users-Manual.tex 29 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-2-0-Users-Manual.tex 30 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-2-0-Users-Manual.tex 31 | $(RM) -v MooseFS-2-0-Users-Manual.{aux,log,out,synctex.gz,toc} 32 | 33 | MooseFS-3-0-Users-Manual.pdf: MooseFS-3-0-Users-Manual.tex 34 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-3-0-Users-Manual.tex 35 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-3-0-Users-Manual.tex 36 | $(RM) -v MooseFS-3-0-Users-Manual.{aux,log,out,synctex.gz,toc} 37 | 38 | MooseFS-Step-by-Step-Tutorial.pdf: MooseFS-Step-by-Step-Tutorial.tex 39 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-Step-by-Step-Tutorial.tex 40 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-Step-by-Step-Tutorial.tex 41 | $(RM) -v MooseFS-Step-by-Step-Tutorial.{aux,log,out,synctex.gz,toc} 42 | 43 | MooseFS-Upgrade-Guide.pdf: MooseFS-Upgrade-Guide.tex 44 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-Upgrade-Guide.tex 45 | $(TEX) -synctex=1 -interaction=batchmode MooseFS-Upgrade-Guide.tex 46 | $(RM) -v MooseFS-Upgrade-Guide.{aux,log,out,synctex.gz,toc} 47 | 48 | clean: 49 | $(RM) -v *.pdf 50 | 51 | -------------------------------------------------------------------------------- /MooseFS-2-0-Users-Manual.tex: -------------------------------------------------------------------------------- 1 | % Copyright (c) 2014-2017 Piotr Robert Konopelko, Core Technology Sp. z o.o. 2 | % 3 | % This file is part of MooseFS. 
4 | % 5 | % MooseFS is free software; you can redistribute it and/or modify 6 | % it under the terms of the GNU General Public License as published by 7 | % the Free Software Foundation, version 2 (only). 8 | % 9 | % MooseFS is distributed in the hope that it will be useful, 10 | % but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | % GNU General Public License for more details. 13 | % 14 | % You should have received a copy of the GNU General Public License 15 | % along with MooseFS; if not, write to the Free Software 16 | % Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA 17 | % or visit http://www.gnu.org/licenses/gpl-2.0.html 18 | 19 | \documentclass[a4paper,11pt,english]{report} 20 | \usepackage{url} 21 | \usepackage{hyperref} 22 | \usepackage{fullpage} 23 | \usepackage{parskip} 24 | \usepackage{graphicx} 25 | \usepackage{xcolor} 26 | \usepackage{listings} 27 | 28 | \lstset{ 29 | language=bash, 30 | basicstyle=\ttfamily\scriptsize, 31 | showstringspaces=false, 32 | commentstyle=\color{black}, 33 | keywordstyle=\color{black}, 34 | breakatwhitespace=false, 35 | breaklines=true, 36 | showspaces=false, 37 | tabsize=4 38 | } 39 | 40 | \def\code#1{\texttt{#1}} 41 | 42 | \newenvironment{copyrightnotice} 43 | {\begingroup 44 | \footnotesize 45 | \setlength{\parindent}{0pt} 46 | \setlength{\parskip}{\baselineskip}} 47 | {\endgroup} 48 | 49 | % ------------------------------------------------------------------------ 50 | 51 | \begin{document} 52 | 53 | \renewcommand{\labelitemi}{$\bullet$} 54 | \renewcommand{\labelitemii}{$\circ$} 55 | \renewcommand{\labelitemiii}{$\bullet$} 56 | \renewcommand{\labelitemiv}{$\circ$} 57 | 58 | \begin{titlepage} 59 | \begin{center} 60 | \includegraphics[width=0.2\textwidth]{images/moosefs.png}\\[1cm] 61 | 62 | % Title 63 | { \huge \bfseries MooseFS 2.0 64 | User's Manual \\[0.4cm] } 65 | 66 | 67 | \textsc{Core Technology} 
Development \& Support Team 68 | 69 | \vfill 70 | 71 | % Bottom of the page 72 | {\large \today} 73 | \end{center} 74 | \end{titlepage} 75 | 76 | 77 | % Copyright page 78 | \begin{copyrightnotice} 79 | \begin{flushleft} 80 | Copyright \textcopyright{} 2014-\the\year 81 | \hfill 82 | \textsc{v. 1.4.3}\\ % DOCUMENTVERSION 83 | 84 | Piotr Robert Konopelko, \textsc{Core Technology} Development \& Support Team. 85 | 86 | \emph{Proofread by} 87 | Agata Kruszona-Zawadzka \\ 88 | \emph{Coordination \& layout by} Piotr Robert Konopelko. 89 | 90 | Please send corrections to \href{mailto:peter@mfs.io}{Piotr Robert Konopelko} -- peter@mfs.io. 91 | 92 | \bigskip 93 | 94 | This file is part of MooseFS. 95 | 96 | MooseFS is free software; you can redistribute it and/or modify 97 | it under the terms of the GNU General Public License as published by 98 | the Free Software Foundation, version 2 (only). 99 | 100 | MooseFS is distributed in the hope that it will be useful, 101 | but WITHOUT ANY WARRANTY; without even the implied warranty of 102 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 103 | GNU General Public License for more details. 104 | 105 | You should have received a copy of the GNU General Public License 106 | along with MooseFS; if not, write to the Free Software 107 | Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA 108 | or visit \url{http://www.gnu.org/licenses/gpl-2.0.html} 109 | \end{flushleft} 110 | \end{copyrightnotice} 111 | 112 | \vfill 113 | 114 | \tableofcontents 115 | 116 | \chapter{About MooseFS} 117 | MooseFS is a fault-tolerant distributed file system. It spreads data over several physical locations (servers), which are visible to the user as one resource. 
For standard file operations MooseFS acts like any other Unix-like filesystem: 118 | 119 | \begin{itemize} 120 | \item Hierarchical structure (directory tree) 121 | \item Stores POSIX file attributes (permissions, last access and modification times) 122 | \item Supports special files (block and character devices, pipes and sockets) 123 | \item Symbolic links (file names pointing to target files, not necessarily on MooseFS) and hard links (different names of files that refer to the same data on MooseFS) 124 | \item Access to the file system can be limited based on IP address and/or password 125 | \end{itemize} 126 | 127 | \bigskip 128 | Distinctive features of MooseFS are: 129 | \begin{itemize} 130 | \item High reliability (several copies of the data can be stored on separate physical machines) 131 | \item Capacity is dynamically expandable by adding new computers/disks 132 | \item Deleted files are retained for a configurable period of time (a file system level "trash bin") 133 | \item Coherent snapshots of files, even while the file is being written/accessed 134 | \end{itemize} 135 | 136 | \section{Architecture} 137 | MooseFS consists of four components: 138 | \begin{enumerate} 139 | \item Managing servers (\code{master server}s) -- one machine in the Community Edition, or any number of machines in the Pro version, managing the whole filesystem and storing metadata for every file (information on size, attributes and file location(s), including all information about non-regular files, i.e. directories, sockets, pipes and devices). 140 | \item Data servers (\code{chunk servers}) -- any number of commodity servers storing files' data and synchronizing it among themselves (if a certain file is supposed to exist in more than one copy). 141 | \item Metadata backup server(s) (\code{metalogger server}) -- any number of servers, all of which store metadata changelogs and periodically download the main metadata file. 
142 | 143 | In the CE version, a machine running the metalogger can easily be set up as a master if the main master fails. 144 | 145 | In the Pro version, a metalogger can be set up to provide an additional level of security. 146 | \item Client computers that access (\code{mount}) the files in MooseFS -- any number of machines using the \code{mfsmount} process to communicate with the managing server (to receive and modify file metadata) and with chunkservers (to exchange actual file data). 147 | \end{enumerate} 148 | 149 | \bigskip 150 | 151 | \code{mfsmount} is based on the FUSE\footnote{You can read more about FUSE at \url{http://fuse.sourceforge.net}} mechanism (Filesystem in USErspace), so MooseFS is available on every Operating System with a working FUSE implementation (Linux, FreeBSD, MacOS X, etc.). 152 | 153 | \bigskip 154 | 155 | \begin{center} 156 | \includegraphics[width=0.8\textwidth]{images/read_mfs.png}\\[1cm] 157 | \end{center} 158 | 159 | \vfill 160 | 161 | \begin{center} 162 | \includegraphics[width=0.8\textwidth]{images/write_mfs.png}\\[1cm] 163 | \end{center} 164 | 165 | Metadata is stored in the memory of the managing server and simultaneously saved to disk (as a periodically updated binary file and immediately updated incremental logs). The main binary file as well as the logs are synchronized to the metaloggers (if present) and, in the Pro version, to spare master servers. 166 | 167 | File data is divided into fragments (chunks) with a maximum size of 64MiB each. Each chunk is itself a file on selected disks on data servers (chunkservers). 168 | 169 | High reliability is achieved by configuring as many different data servers as appropriate to assure the "\code{goal}" value (number of copies to keep) set for the given file. 170 | 171 | 172 | \section{How does the system work} 173 | All file operations on a client computer that has mounted MooseFS are exactly the same as they would be with other file systems. 
The operating system's kernel transfers all file operations to the FUSE module, which communicates with the \code{mfsmount} process. The \code{mfsmount} process communicates through the network subsequently with the managing server and data servers (chunk servers). This entire process is fully transparent to the user.\\\\ 174 | 175 | \code{mfsmount} communicates with the managing server every time an operation on file metadata is required: 176 | \begin{itemize} 177 | \item creating files 178 | \item deleting files 179 | \item reading directories 180 | \item reading and changing attributes 181 | \item changing file sizes 182 | \item at the start of reading or writing data 183 | \item on any access to special files on \code{MFSMETA} 184 | \end{itemize} 185 | 186 | \bigskip 187 | \code{mfsmount} uses a direct connection to the data server (chunk server) that stores the relevant chunk of a file. When writing a file, after finishing the write process the managing server receives information from \code{mfsmount} to update a file's length and the last modification time. 188 | 189 | Furthermore, data servers (chunk servers) communicate with each other to replicate data in order to achieve the appropriate number of copies of a file on different machines. 190 | 191 | \section{Fault tolerance} 192 | % Add more information 193 | Administrative commands allow the system administrator to specify the "\code{goal}", or number of copies that should be maintained, on a per-directory or per-file level. Setting the goal to more than one and having more than one data server will provide fault tolerance. When the file data is stored in many copies (on more than one data server), the system is resistant to failures or temporary network outages of a single data server. 
194 | 195 | This of course does not refer to files with the "\code{goal}" set to 1, in which case the file will only exist on a single data server irrespective of how many data servers are deployed in the system.\\ 196 | 197 | Exceptionally important files may have their \code{goal} set to a number higher than two, which will allow these files to be resistant to a breakdown of more than one server at the same time.\\ 198 | 199 | In general the setting for the number of copies available should be one more than the anticipated number of inaccessible or out-of-order servers.\\ 200 | 201 | In the case where a single data server experiences a failure or disconnection from the network, the files stored within it that had at least two copies, will remain accessible from another data server. The data that is now 'under its goal' will be replicated on another accessible data server to again provide the required number of copies.\\ 202 | 203 | It should be noted that if the number of available servers is lower than the "\code{goal}" set for a given file, the required number of copies cannot be preserved. Similarly if there are the same number of servers as the currently set goal and if a data server has reached 100\% of its capacity, it will be unable to hold a copy of a file that is now below its goal due to another data server going offline. In these cases a new data server should be connected to the system as soon as possible in order to maintain the desired number of copies of the file.\\ 204 | 205 | A new data server can be connected to the system at any time. The new capacity will immediately become available for use to store new files or to hold replicated copies of files from other data servers.\\ 206 | 207 | Administrative utilities exist to query the status of the files within the file system to determine if any of the files are currently below their goal (set number of copies). 
These utilities can also be used to alter the goal setting as required.\\ 208 | 209 | The data fragments stored in the chunks are versioned, so re-connecting a data server with an older copy of the data (i.e. if it had been offline for a period of time) will not cause the files to become incoherent. The data server will synchronize itself to hold the current versions of the chunks; the obsolete chunks will be removed and the free space will be reallocated to hold the new chunks.\\ 210 | 211 | Failures of a client machine (that runs the \code{mfsmount} process) will have no influence on the coherence of the file system or on the other clients' operations. In the worst case scenario the data that has not yet been sent from the failed client computer may be lost. 212 | 213 | \section{Platforms} 214 | MooseFS is available on every Operating System with a working FUSE implementation: 215 | \begin{itemize} 216 | \item Linux (Linux 2.6.14 and up have FUSE support included in the official kernel) 217 | \item FreeBSD 218 | \item OpenSolaris 219 | \item MacOS X 220 | \end{itemize} 221 | 222 | \bigskip 223 | The master server, metalogger server and chunkservers can also be run on Solaris or Windows with Cygwin. Unfortunately, without FUSE it won't be possible to mount the filesystem within these operating systems. 224 | 225 | 226 | \chapter{Moose File System Requirements} 227 | 228 | \section{Network requirements} 229 | 230 | MooseFS requires a TCP/IP network. The faster the network, the better the performance. It is recommended to connect all servers to the same switch, or at least to minimize network latencies, because they can have a significant impact on performance. 
231 | 232 | MooseFS requires the following ports to be open (they can be changed in the appropriate configuration files): 233 | 234 | \begin{itemize} 235 | \item \code{9419..9421} -- Master Server(s) 236 | \item \code{9422} -- Chunkservers 237 | \item \code{9425} -- CGI Server 238 | \end{itemize} 239 | 240 | \section{Requirements for Master Servers} 241 | As the managing server (master) is a crucial element of MooseFS, it should be installed on a machine which guarantees high stability and access requirements which are adequate for the whole system. It is advisable to use a server with a redundant power supply, ECC memory, and a RAID1/RAID5/RAID10 disk array. The managing server OS has to be POSIX compliant (systems verified so far: Linux, FreeBSD, MacOS X and OpenSolaris). 242 | 243 | \subsection{CPU} 244 | 245 | Because the Master Server is a single-threaded process, it is recommended to use a modern CPU with a high clock speed (e.g. 3.7 GHz) and a small number of cores (e.g. 4) -- especially in MooseFS instances which handle a lot of small files. 246 | 247 | Additionally, disabling CPU power management in the BIOS (or enabling a mode like "maximum performance") may have a positive impact on efficiency. 248 | 249 | You can compare CPUs on the following website -- please pay attention to "single-thread points": \code{\url{https://www.cpubenchmark.net/singleThread.html}}. 250 | 251 | 252 | \subsection{RAM size} 253 | 254 | The most important factor in sizing requirements for the Master Server machine is RAM, as the full file system structure is cached in RAM for speed. The Master Server should have approximately 300-350 MiB of RAM allocated to handle 1 million objects (files, directories, pipes, sockets, ...). 
255 | 256 | \underline{Example}: 257 | \begin{itemize} 258 | \item Leader Master RAM usage: \code{20 GiB} (\code{21 017 505 792} Bytes exactly) 259 | \item "All FS objects" (from MFS CGI): \code{67 552 270} 260 | \item \code{21 017 505 792 / 67 552 270 = $\sim$311.13} Bytes per one object 261 | \end{itemize} 262 | 263 | \subsection{HDD free space} 264 | 265 | The necessary size of HDD depends both on the number of files and chunks used (main metadata file) and on the number of operations made on the files (metadata changelog); for example the space of 20 GiB is enough for storing information for 25 million files and for changelogs to be kept for up to 50 hours. 266 | 267 | You can calculate the minimum amount of space we recommend using the following formula: 268 | \begin{itemize} 269 | \item \code{RAM} -- amount of RAM 270 | \item \code{BACK\_LOGS} -- number of metadata change log files, default is 50 (from \code{/etc/mfs/mfsmaster.cfg}) 271 | \item \code{BACK\_META\_KEEP\_PREVIOUS} -- number of previous metadata files to be kept (default is 1) (also from \code{/etc/mfs/mfsmaster.cfg}) 272 | \end{itemize} 273 | 274 | \bigskip 275 | \underline{The formula}: \\ 276 | \code{SPACE = RAM * (BACK\_META\_KEEP\_PREVIOUS + 2) + 1 * (BACK\_LOGS + 1) [GiB]} 277 | \bigskip 278 | 279 | (If default values from \code{/etc/mfs/mfsmaster.cfg} are used, it is \code{RAM * 3 + 51 [GiB]}) 280 | 281 | The value \code{1} (before multiplying by \code{BACK\_LOGS + 1}) is an estimation of size used by one \code{changelog.[number].mfs} file. On highly loaded MooseFS instance it uses a bit less than 1 GB. 282 | 283 | \underline{Example}: \\ 284 | If you have 128 GiB of RAM, using the formula above, you should reserve for \code{/var/lib/mfs}: 285 | 286 | 128*3 + 51 = 384 + 51 = \textbf{435 GiB minimum}. 
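The formula above can be sketched as a small helper. This is an illustration only: the function name \code{master\_disk\_space\_gib} and the parameter \code{changelog\_gib} (the estimated size of one changelog file, 1 GiB in the formula) are hypothetical names, while the defaults mirror the values quoted from \code{/etc/mfs/mfsmaster.cfg}.

```python
def master_disk_space_gib(ram_gib, back_logs=50, back_meta_keep_previous=1,
                          changelog_gib=1):
    """Minimum recommended space for /var/lib/mfs on a Master Server, in GiB.

    Implements SPACE = RAM * (BACK_META_KEEP_PREVIOUS + 2)
                       + changelog_gib * (BACK_LOGS + 1).
    """
    metadata = ram_gib * (back_meta_keep_previous + 2)
    changelogs = changelog_gib * (back_logs + 1)
    return metadata + changelogs


# With the defaults this reduces to RAM * 3 + 51.
print(master_disk_space_gib(128))
```

With 128 GiB of RAM and the default settings, the helper reproduces the 435 GiB figure from the example above.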
287 | 288 | \section{Requirements for Metalogger(s)} 289 | MooseFS metalogger simply gathers metadata backups from the MooseFS Master Server -- so the hardware requirements are not higher than for the Master Server itself; it needs about the same disk space. As with the Master Server, the OS has to be POSIX compliant (Linux, FreeBSD, Mac OS X, OpenSolaris, etc.).\\ 290 | 291 | MooseFS Metalogger should have at least the same amount of HDD space (\textbf{especially the free space in \code{/var/lib/mfs}!}) as the main Master Server. 292 | 293 | If you would like to use the Metalogger as a Master Server in case of the main Master's failure, the Metalogger machine should have at least the same amount of RAM as the main Master Server. 294 | 295 | \section{Requirements for Chunkservers} 296 | Chunkservers, like other MooseFS machines, must run a POSIX compliant OS. 297 | 298 | \subsection{CPU} 299 | MooseFS Chunkserver is a multi-threaded process, so the best choice is a CPU with multiple cores. 300 | 301 | \subsection{RAM size} 302 | MooseFS Chunkserver uses approximately 250 MiB of RAM to handle 1 million chunks. 303 | 304 | \underline{Example}: 305 | \begin{itemize} 306 | \item Chunkserver RAM usage: \code{661 MiB} 307 | \item Chunks stored on this Chunkserver (from MFS CGI): \code{3 275 062} 308 | \item \code{(661 * $2^{20}$) / 3 275 062 = $\sim$211.63} Bytes per one chunk 309 | \end{itemize} 310 | 311 | \subsection{HDD space} 312 | Chunkserver machines should have appropriate disk space (dedicated exclusively for MooseFS). Typical and recommended usage is to create one partition on each HDD, mount them and enter paths to mounted partitions in \code{/etc/mfs/mfshdd.cfg}. 313 | 314 | A minimal configuration should start from several gigabytes of storage space (only disks with more than 256 MB and Chunkservers reporting more than 1 GB of total free space are accessible for new data).
315 | 316 | \section{Requirements for Clients / Mounts} 317 | \code{mfsmount} requires FUSE to work; FUSE is available on several operating systems: Linux, FreeBSD, OpenSolaris and MacOS X, with the following notes: 318 | 319 | \begin{itemize} 320 | \item In case of Linux a kernel module with API 7.8 or later is required (it can be checked with the \code{dmesg} command -- after loading the kernel module there should be a line \code{fuse init (API version 7.8)}). It is available in fuse package 2.6.0 (or later) or in Linux kernel 2.6.20 (or later). Due to some minor bugs, the newer module is recommended (fuse 2.7.2 or Linux 2.6.24, although fuse 2.7.x standalone doesn't contain the getattr/write race condition fix). 321 | \item In case of FreeBSD we recommend using fuse-freebsd\footnote{\url{https://github.com/glk/fuse-freebsd}}, which is a successor to fuse4bsd. 322 | \item For MacOS X we recommend using OSXFUSE\footnote{\url{http://osxfuse.github.com}}, which is a successor to MacFUSE and has been tested on MacOS X 10.6, 10.7, 10.8, 10.9 and 10.11. 323 | \end{itemize} 324 | 325 | 326 | 327 | 328 | \chapter{Installing MooseFS 2.0} 329 | This is a Very Quick Start Guide describing a basic MooseFS 2.0 installation in a configuration of two Master Servers and three Chunkservers. 330 | 331 | Please note that the complete installation process is described in "MooseFS 2.0 Step by Step Tutorial". 332 | 333 | For the sake of this document, it's assumed that your machines have the following IP addresses: 334 | \begin{itemize} 335 | \item Master servers: 192.168.1.1, 192.168.1.2 336 | \item Chunkservers: 192.168.1.101, 192.168.1.102 and 192.168.1.103 337 | \item Users' computers (clients): 192.168.2.x 338 | \end{itemize} 339 | 340 | \bigskip 341 | \bigskip 342 | \bigskip 343 | In this tutorial it is assumed that you have MooseFS 2.0 Pro version. If you use Community Edition, please remove '\code{-pro}' from package names.
344 | 345 | \bigskip 346 | In this tutorial it is also assumed that you have Ubuntu/Debian installed on your machines. If you have another distribution, please use appropriate package manager instead of \code{apt}. 347 | 348 | \section{Configuring DNS Server} 349 | Before you start installing MooseFS, you need to have working DNS. It's needed for MooseFS to work properly with several master servers, because DNS can resolve one host name as more than one IP address. \\ 350 | All IPs of machines which will be master servers must be included in DNS configuration file and resolved as "\code{mfsmaster}" (or any other selected name), e.g.: 351 | 352 | \begin{lstlisting}[caption={DNS entries}] 353 | mfsmaster IN A 192.168.1.1 ; address of first master server 354 | mfsmaster IN A 192.168.1.2 ; address of second master server 355 | \end{lstlisting} 356 | 357 | More information about configuring DNS server is included in supplement to "MooseFS Step by Step Tutorial". 358 | 359 | \section{Adding repositories} 360 | Before installing MooseFS you need to add MooseFS Official Supported Repositories to your system. 
361 | \subsection{Ubuntu / Debian} 362 | First, add the key: 363 | \begin{lstlisting}[caption={Adding the repo key}] 364 | # wget -O - http://ppa.moosefs.com/moosefs.key | apt-key add - 365 | \end{lstlisting} 366 | 367 | \bigskip 368 | Then add the appropriate entry in \code{/etc/apt/sources.list}: 369 | \begin{itemize} 370 | \item For Ubuntu 14.04 Trusty: \\ 371 | \code{deb http://ppa.moosefs.com/moosefs-2/apt/ubuntu/trusty trusty main} 372 | \item For Ubuntu 12.04 Precise: \\ 373 | \code{deb http://ppa.moosefs.com/moosefs-2/apt/ubuntu/precise precise main} 374 | \item For Ubuntu 10.10 Maverick: \\ 375 | \code{deb http://ppa.moosefs.com/moosefs-2/apt/ubuntu/maverick maverick main} 376 | \item For Debian 7.0 Wheezy: \\ 377 | \code{deb http://ppa.moosefs.com/moosefs-2/apt/debian/wheezy wheezy main} 378 | \item For Debian 6.0 Squeeze: \\ 379 | \code{deb http://ppa.moosefs.com/moosefs-2/apt/debian/squeeze squeeze main} 380 | \item For Debian 5.0 Lenny: \\ 381 | \code{deb http://ppa.moosefs.com/moosefs-2/apt/debian/lenny lenny main} 382 | \end{itemize} 383 | 384 | \bigskip 385 | After that, run: \\ 386 | \code{\# apt-get update} 387 | 388 | \subsection{RedHat / CentOS (EL7)} 389 | Red Hat 7 family operating systems use the \code{systemd} Linux system and service manager to start processes.
To start MooseFS processes with the \code{systemctl} command, follow these steps to add the \code{systemd} repository.\\ 390 | 391 | Add the appropriate key to the package manager: 392 | \begin{lstlisting}[caption={Adding the repo key}] 393 | # curl "http://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS 394 | \end{lstlisting} 395 | 396 | Next you need to add the repository entry to the yum repositories: 397 | \begin{lstlisting}[caption={Adding MooseFS repo}] 398 | # curl "http://ppa.moosefs.com/MooseFS-2-el7.repo" > /etc/yum.repos.d/MooseFS.repo 399 | # yum update 400 | \end{lstlisting} 401 | 402 | \subsection{RedHat / CentOS (EL6)} 403 | Red Hat 6 family operating systems use the \code{SysV init} runlevel system to start processes. To start MooseFS processes with the \code{service} command, follow these steps to add the SysV repository.\\ 404 | Add the appropriate key to the package manager: 405 | \begin{lstlisting}[caption={Adding the repo key}] 406 | # curl "http://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS 407 | \end{lstlisting} 408 | 409 | Next you need to add the repository entry to the yum repositories: 410 | \begin{lstlisting}[caption={Adding the MooseFS repo}] 411 | # curl "http://ppa.moosefs.com/MooseFS-2-el6.repo" > /etc/yum.repos.d/MooseFS.repo 412 | # yum update 413 | \end{lstlisting} 414 | 415 | 416 | \subsection{Apple MacOS X} 417 | It's possible to run all components of the system on Mac OS X systems, but the most common scenario would be to run the client (\code{mfsmount}) that enables Mac OS X users to access resources available in MooseFS infrastructure. 418 | 419 | In case of MacOS X -- since there's no default package manager -- we release \code{.pkg} files containing only binaries, without the startup scripts that are normally available in Linux packages.
420 | 421 | To install MooseFS on Mac please follow these steps: 422 | 423 | \begin{itemize} 424 | \item download and install FUSE for Mac OS X package from \\ 425 | \code{http://osxfuse.github.io} 426 | \item download and install MooseFS packages from \\ 427 | \code{http://ppa.moosefs.com/moosefs-2/osx/} 428 | \end{itemize} 429 | 430 | You should be able to mount MooseFS filesystem in \code{/mnt/mfs} issuing the following command: 431 | 432 | \code{\$ sudo mfsmount /mnt/mfs} 433 | 434 | If you've exported filesystem with additional options like password protection, you should include those options in \code{mfsmount} invocation as in documentation. 435 | 436 | 437 | \section{Differences in package names between Pro and CE version} 438 | The packages in MooseFS 2.0 Pro version are named according to following pattern: 439 | \begin{itemize} 440 | \item \code{moosefs-pro-master} 441 | \item \code{moosefs-pro-metalogger} 442 | \item \code{moosefs-pro-chunkserver} 443 | \item \code{moosefs-pro-client} 444 | \item \code{moosefs-pro-cli} 445 | \item \code{moosefs-pro-cgi} 446 | \item \code{moosefs-pro-cgiserv} 447 | \item \code{moosefs-pro-netdump} 448 | \item \code{moosefs-pro-supervisor} 449 | \end{itemize} 450 | 451 | \bigskip 452 | In MooseFS 2.0 Community Edition (CE) the packages are named according to the following pattern: 453 | \begin{itemize} 454 | \item \code{moosefs-master} 455 | \item \code{moosefs-metalogger} 456 | \item \code{moosefs-chunkserver} 457 | \item \code{moosefs-client} 458 | \item \code{moosefs-cli} 459 | \item \code{moosefs-cgi} 460 | \item \code{moosefs-cgiserv} 461 | \item \code{moosefs-netdump} 462 | \end{itemize} 463 | 464 | \section{MooseFS Master Server(s) installation} 465 | Install package \code{moosefs-pro-master} by running the following command: 466 | 467 | \underline{For Debian OS family}: 468 | \begin{lstlisting} 469 | # apt-get install moosefs-pro-master 470 | \end{lstlisting} 471 | 472 | 473 | \underline{For RedHat OS family}: 474 
| \begin{lstlisting} 475 | # yum install moosefs-pro-master 476 | \end{lstlisting} 477 | 478 | Now, set MooseFS Master Server basic configuration: 479 | \begin{lstlisting} 480 | # cd /etc/mfs 481 | # cp mfsmaster.cfg.dist mfsmaster.cfg 482 | # cp mfsexports.cfg.dist mfsexports.cfg 483 | \end{lstlisting} 484 | 485 | File \code{mfsexports.cfg} specifies which users' computers can mount the file system and with what privileges. For example, to specify that only machines addressed as \code{192.168.2.x} can use the whole structure of MooseFS resources (\code{/}) in read/write mode, in the first line which is not commented out change the asterisk (\code{*}) to \code{192.168.2.0/24}, so that you'll have: 486 | 487 | \begin{lstlisting} 488 | 192.168.2.0/24 / rw,alldirs,maproot=0 489 | \end{lstlisting} 490 | 491 | Now, if you use MooseFS Pro, place proper \code{mfslicence.bin} file into \code{/etc/mfs} directory. This file \textbf{must} be available on \textbf{all} Master Servers. 492 | 493 | At this point it is possible to run the MooseFS Master Server: 494 | \begin{lstlisting} 495 | # mfsmaster start 496 | \end{lstlisting} 497 | 498 | If you use \code{SysV} init script manager, which is by default available in Debian, Ubuntu and RedHat 6 family operating systems, you can also start Master by issuing the following command: 499 | 500 | \begin{lstlisting} 501 | # service moosefs-pro-master start 502 | \end{lstlisting} 503 | 504 | To start MooseFS Master Server with latest \code{systemd} Linux system and service manager, which is available in RedHat 7 family operating systems, use this command: 505 | 506 | \begin{lstlisting} 507 | # systemctl start moosefs-pro-master.service 508 | \end{lstlisting} 509 | 510 | 511 | You need to repeat these steps on each machine intended for running MooseFS Master Server (in this example -- on \code{192.168.1.1} and \code{192.168.1.2}). 
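Returning to the \code{mfsexports.cfg} entry used in this section: the \code{192.168.2.0/24} notation covers every address from \code{192.168.2.0} to \code{192.168.2.255}. The sketch below only illustrates this CIDR membership check with Python's standard \code{ipaddress} module; the helper name \code{client\_allowed} is hypothetical, and the code does not reproduce how the Master Server itself evaluates \code{mfsexports.cfg}.

```python
import ipaddress


def client_allowed(client_ip, exported_network="192.168.2.0/24"):
    """Return True if client_ip falls inside the exported network range.

    Illustration only: this mirrors the address part of one exports line,
    not the password, path or option handling done by the Master Server.
    """
    network = ipaddress.ip_network(exported_network)
    return ipaddress.ip_address(client_ip) in network


print(client_allowed("192.168.2.17"))   # a client from the example network
print(client_allowed("192.168.1.101"))  # a Chunkserver address, outside it
```

A check like this can help confirm that a planned client address is covered before editing the exports file.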
512 | 513 | You can also find a more detailed description of how to add Master Followers in \textbf{MooseFS Upgrade Guide - Chapter 6: Adding master follower(s) server(s) procedure} (Pro only). 514 | 515 | \section{MooseFS CGI Monitor, CGI Server and Command Line Interface installation} 516 | MooseFS CGI Monitor and MooseFS CGISERV can be installed on any machine, but it is good practice to install them on every Master Server. 517 | 518 | MooseFS Command Line Interface (CLI) tool allows you to see various information about MooseFS status. The \code{mfscli} tool with the \code{-SIN} option displays basic info similar to the "Info" tab in CGI. To install CGI, CGISERV and CLI, use the following commands. 519 | 520 | \underline{For Debian OS family}: 521 | 522 | \begin{lstlisting} 523 | # apt-get install moosefs-pro-cgi 524 | # apt-get install moosefs-pro-cgiserv 525 | # apt-get install moosefs-pro-cli 526 | \end{lstlisting} 527 | 528 | Set the \code{MFSCGISERV\_ENABLE} variable to \code{true} in the file \code{/etc/default/mfs-cgiserv} to configure \code{mfscgiserv} autostart. 529 | 530 | \underline{For RedHat OS family}: 531 | 532 | \begin{lstlisting} 533 | # yum install moosefs-pro-cgi 534 | # yum install moosefs-pro-cgiserv 535 | # yum install moosefs-pro-cli 536 | \end{lstlisting} 537 | 538 | 539 | Run MooseFS CGI Monitor with \code{SysV}: 540 | \begin{lstlisting} 541 | # service moosefs-pro-cgiserv start 542 | \end{lstlisting} 543 | 544 | Run MooseFS CGI Monitor with \code{systemd}: 545 | \begin{lstlisting} 546 | # systemctl start moosefs-pro-cgiserv.service 547 | \end{lstlisting} 548 | The MooseFS CGI Monitor website should now be available at \url{http://192.168.1.1:9425} (for the moment there will be no data about chunk servers).
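If the CGI Monitor page does not load, it can help to first check whether the TCP port is reachable at all. The following is a minimal sketch using only Python's standard \code{socket} module; the helper name \code{port\_is\_open} is hypothetical and is not a MooseFS tool.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError on refusal, timeout or
        # unreachable host; the context manager closes the socket again.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the example setup in this guide, \code{port\_is\_open("192.168.1.1", 9425)} should return \code{True} once \code{mfscgiserv} is running. A \code{False} result points to a stopped service or a firewall rule rather than a problem with the CGI scripts themselves.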
549 | 550 | 551 | \section{Chunk servers installation} 552 | 553 | \underline{For Debian OS family}: 554 | \begin{lstlisting} 555 | # apt-get install moosefs-pro-chunkserver 556 | \end{lstlisting} 557 | 558 | 559 | \underline{For RedHat OS family}: 560 | \begin{lstlisting} 561 | # yum install moosefs-pro-chunkserver 562 | \end{lstlisting} 563 | 564 | 565 | Now you need to prepare basic configuration files for the \code{mfschunkserver}: 566 | \begin{lstlisting} 567 | # cd /etc/mfs 568 | # cp mfschunkserver.cfg.dist mfschunkserver.cfg 569 | # cp mfshdd.cfg.dist mfshdd.cfg 570 | \end{lstlisting} 571 | 572 | 573 | In the \code{mfshdd.cfg} file you'll give locations in which you have mounted hard drives/partitions designed for the chunks of the system. It is recommended that they are used exclusively for the MooseFS -- this is necessary to manage the free space properly. For example if you'll use \code{/mnt/mfschunks1} and \code{/mnt/mfschunks2} locations, add these two lines to \code{mfshdd.cfg} file: 574 | 575 | \begin{lstlisting} 576 | /mnt/mfschunks1 577 | /mnt/mfschunks2 578 | \end{lstlisting} 579 | 580 | 581 | Before you start chunkserver, make sure that the user \code{mfs} has rights to write in the mounted partitions (which is necessary to create a .lock file): 582 | 583 | \begin{lstlisting} 584 | # chown -R mfs:mfs /mnt/mfschunks1 585 | # chown -R mfs:mfs /mnt/mfschunks2 586 | \end{lstlisting} 587 | 588 | 589 | At this moment you are ready to start the chunk server: 590 | 591 | For \code{SysV} init script system 592 | \begin{lstlisting} 593 | # service moosefs-pro-chunkserver start 594 | \end{lstlisting} 595 | 596 | For \code{systemd} Linux system and service manager 597 | \begin{lstlisting} 598 | # systemctl start moosefs-pro-chunkserver.service 599 | \end{lstlisting} 600 | 601 | 602 | \bigskip 603 | \bigskip 604 | You need to repeat these steps on each machine intended for running MooseFS Chunkserver (in this example -- on \code{192.168.1.101}, 
\code{192.168.1.102} and \code{192.168.1.103}). 605 | 606 | Now at \url{http://192.168.1.1:9425} full information about the system is available, including the master server and chunk servers. 607 | 608 | 609 | \section{MooseFS Clients installation} 610 | MooseFS client uses the \code{FUSE} library. During the installation process, your operating system also downloads and installs the \code{FUSE} library if it is not installed. 611 | 612 | \underline{Debian OS family}: 613 | \begin{lstlisting} 614 | # apt-get install moosefs-pro-client 615 | \end{lstlisting} 616 | 617 | \underline{RedHat OS family}: 618 | \begin{lstlisting} 619 | # yum install moosefs-pro-client 620 | \end{lstlisting} 621 | 622 | Let's assume that you want to mount the MooseFS share in a \code{/mnt/mfs} folder on a client's machine. Issue the following commands: 623 | \begin{lstlisting} 624 | # mkdir -p /mnt/mfs 625 | # mfsmount /mnt/mfs -H mfsmaster 626 | \end{lstlisting} 627 | 628 | Now after running the \code{df -h | grep mfs} command you should get information similar to this: 629 | \begin{lstlisting} 630 | /storage/mfschunks/mfschunks1 631 | 2.0G 69M 1.9G 4% /mnt/mfschunks1 632 | /storage/mfschunks/mfschunks2 633 | 2.0G 69M 1.9G 4% /mnt/mfschunks2 634 | mfs#mfsmaster:9421 635 | 3.2G 0 3.2G 0% /mnt/mfs 636 | \end{lstlisting} 637 | 638 | You need to repeat these steps on each machine intended to be a MooseFS 2.0 Client (in this example -- on \code{192.168.2.x}). 639 | 640 | To enable MooseFS Client automount during boot, first of all check if the \code{fuse} and \code{fuse-libs} packages are installed.
If the \code{fuse} and \code{fuse-libs} packages are installed, add an entry similar to the following one to \code{/etc/fstab}: 641 | 642 | \begin{lstlisting} 643 | mfsmount /mnt/mfs fuse defaults,mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 644 | \end{lstlisting} 645 | 646 | If MooseFS Client has to be mounted on the same machine that MooseFS Master Server runs on, please use the following \code{fstab} entry instead of the one listed above: 647 | 648 | \begin{lstlisting} 649 | mfsmount /mnt/mfs fuse defaults,mfsdelayedinit,mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 650 | \end{lstlisting} 651 | 652 | \newpage 653 | 654 | \section{Enabling MooseFS services during OS boot} 655 | Each operating system has its own method of managing service startup during boot. Below you can find a few examples of enabling MooseFS autostart in supported operating systems. 656 | 657 | 658 | \subsection{RedHat / CentOS (EL6)} 659 | 660 | \underline{MooseFS Chunkserver}: 661 | 662 | To enable MooseFS Chunkserver autostart during OS boot, use the \code{chkconfig} command as in the example below: 663 | \begin{lstlisting} 664 | chkconfig moosefs-chunkserver on 665 | \end{lstlisting} 666 | 667 | \underline{MooseFS Master Server}: 668 | 669 | To enable MooseFS Master Server autostart during OS boot, use the \code{chkconfig} command as in the example below: 670 | \begin{lstlisting} 671 | chkconfig moosefs-master on 672 | \end{lstlisting} 673 | 674 | \underline{MooseFS Client}: 675 | 676 | To enable MooseFS Client automount during boot, first of all check if the \code{fuse} and \code{fuse-libs} packages are installed: 677 | \begin{lstlisting} 678 | # rpm -qa | grep fuse 679 | fuse-2.8.3-4.el6.x86_64 680 | fuse-libs-2.8.3-4.el6.x86_64 681 | \end{lstlisting} 682 | 683 | If the \code{fuse} and \code{fuse-libs} packages are installed, add an entry similar to the following one to \code{/etc/fstab}: 684 | 685 | \begin{lstlisting} 686 | mfsmount /mnt/mfs fuse defaults,mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 687 |
\end{lstlisting} 688 | 689 | If MooseFS Client has to be mounted on the same machine that MooseFS Master Server runs, please put the following \code{fstab} entry instead of the one listed above: 690 | 691 | \begin{lstlisting} 692 | mfsmount /mnt/mfs fuse defaults,mfsdelayedinit,mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 693 | \end{lstlisting} 694 | 695 | 696 | 697 | \subsection{RedHat / Centos (EL7)} 698 | In operating systems with \code{systemd}, use \code{systemctl} command to manage init processes at boot: 699 | 700 | \underline{MooseFS Chunkserver}: 701 | 702 | To enable MooseFS Chunkserver autostart during OS boot: 703 | \begin{lstlisting} 704 | systemctl enable moosefs-chunkserver.service 705 | \end{lstlisting} 706 | 707 | \underline{MooseFS Master Server}: 708 | 709 | To enable MooseFS Master Server autostart during OS boot: 710 | \begin{lstlisting} 711 | systemctl enable moosefs-master.service 712 | \end{lstlisting} 713 | 714 | 715 | \underline{MooseFS Client}: 716 | 717 | To enable MooseFS Client automount during boot, first of all check if the \code{fuse} and \code{fuse-libs} packages are installed: 718 | \begin{lstlisting} 719 | # rpm -qa | grep fuse 720 | fuse-2.9.2-6.el7.x86_64 721 | fuse-libs-2.9.2-6.el7.x86_64 722 | \end{lstlisting} 723 | 724 | If \code{fuse} and \code{fuse-libs} packages are installed, add similar entry to the following one in \code{/etc/fstab}: 725 | 726 | \begin{lstlisting} 727 | mfsmount /mnt/mfs fuse mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 728 | \end{lstlisting} 729 | 730 | If MooseFS Client has to be mounted on the same machine that MooseFS Master Server runs, please put the following \code{fstab} entry instead of the one listed above: 731 | 732 | \begin{lstlisting} 733 | mfsmount /mnt/mfs fuse defaults,mfsdelayedinit,mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 734 | \end{lstlisting} 735 | 736 | 737 | \subsection{Debian / Ubuntu} 738 | This method works in Debian 6, Debian 7, Ubuntu 12, Ubuntu 14. 
739 | 740 | \underline{MooseFS Chunkserver}: 741 | 742 | To enable MooseFS Chunkserver autostart during OS boot, edit the \code{/etc/default/moosefs-chunkserver} file and change the \code{MFSCHUNKSERVER\_ENABLE} variable to \code{true}: 743 | \begin{lstlisting} 744 | MFSCHUNKSERVER_ENABLE=true 745 | \end{lstlisting} 746 | 747 | \underline{MooseFS Master}: 748 | 749 | To enable MooseFS Master Server autostart during OS boot, edit the \code{/etc/default/moosefs-master} file and change the \code{MFSMASTER\_ENABLE} variable to \code{true}: 750 | \begin{lstlisting} 751 | MFSMASTER_ENABLE=true 752 | \end{lstlisting} 753 | 754 | \underline{MooseFS Client}: 755 | 756 | To enable MooseFS Client automount during boot, first of all check if the \code{fuse} and \code{fuse-libs} packages are installed. If they are, add an entry similar to the following one to \code{/etc/fstab}: 757 | 758 | \begin{lstlisting} 759 | mfsmount /mnt/mfs fuse mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 760 | \end{lstlisting} 761 | 762 | If MooseFS Client has to be mounted on the same machine that MooseFS Master Server runs on, please use the following \code{fstab} entry instead of the one listed above: 763 | 764 | \begin{lstlisting} 765 | mfsmount /mnt/mfs fuse defaults,mfsdelayedinit,mfsmaster=mfsmaster.example.lan,mfsport=9421 0 0 766 | \end{lstlisting} 767 | 768 | 769 | 770 | 771 | \subsection{FreeBSD} 772 | 773 | \underline{MooseFS Chunkserver}: 774 | 775 | To enable MooseFS Chunkserver autostart during OS boot, add an entry to \code{/etc/rc.conf}: 776 | \begin{lstlisting} 777 | mfschunkserver_enable="YES" 778 | \end{lstlisting} 779 | 780 | \underline{MooseFS Master}: 781 | 782 | To enable MooseFS Master Server autostart during OS boot, add an entry to \code{/etc/rc.conf}: 783 | \begin{lstlisting} 784 | mfsmaster_enable="YES" 785 | \end{lstlisting} 786 | 787 | \underline{MooseFS Client}: 788 | 789 | To enable MooseFS Client automount during boot, add the following entry in
\code{/boot/loader.conf} to let FreeBSD load the \code{fuse} module during boot: 790 | \begin{lstlisting} 791 | fuse_load="YES" 792 | \end{lstlisting} 793 | 794 | And add the entry in \code{/etc/fstab}: 795 | \begin{lstlisting} 796 | mfsmount_magic /mnt/mfs moosefs rw,mfsmaster=mfsmaster,mountprog=/usr/local/bin/mfsmount,late 0 0 797 | \end{lstlisting} 798 | 799 | 800 | \newpage 801 | 802 | \section{Basic MooseFS use} 803 | Create \code{folder1} in \code{/mnt/mfs}, in which you store files in one copy (setting \code{goal=1}): 804 | \begin{lstlisting} 805 | mkdir -p /mnt/mfs/folder1 806 | \end{lstlisting} 807 | 808 | and \code{folder2}, in which you store files in two copies (setting \code{goal=2}): 809 | \begin{lstlisting} 810 | mkdir -p /mnt/mfs/folder2 811 | \end{lstlisting} 812 | 813 | 814 | The number of copies for the folder is set with the \code{mfssetgoal -r} command: 815 | \begin{lstlisting} 816 | # mfssetgoal -r 1 /mnt/mfs/folder1 817 | /mnt/mfs/folder1: 818 | inodes with goal changed: 0 819 | inodes with goal not changed: 1 820 | inodes with permission denied: 0 821 | 822 | # mfssetgoal -r 2 /mnt/mfs/folder2 823 | /mnt/mfs/folder2: 824 | inodes with goal changed: 0 825 | inodes with goal not changed: 1 826 | inodes with permission denied: 0 827 | \end{lstlisting} 828 | 829 | 830 | \section{Stopping MooseFS} 831 | In order to safely stop the MooseFS cluster you have to perform the following steps: 832 | \begin{itemize} 833 | \item Unmount the file system on all machines using the \code{umount} command (in our example it would be: \code{umount /mnt/mfs}) 834 | \item Stop the Chunk Servers processes:\\ 835 | For \code{SysV}: \code{service moosefs-pro-chunkserver stop}\\ 836 | For \code{systemd}: \code{systemctl stop moosefs-pro-chunkserver.service} 837 | 838 | \item Stop the Master Server processes (start with the FOLLOWERs; the LEADER Master Server should be stopped last):\\ 839 | For \code{SysV}: \code{service moosefs-pro-master stop}\\ 840 | For
\code{systemd}: \code{systemctl stop moosefs-pro-master.service} 841 | 842 | \item Stop the Metalogger process:\\ 843 | For \code{SysV}: \code{service moosefs-pro-metalogger stop}\\ 844 | For \code{systemd}: \code{systemctl stop moosefs-pro-metalogger.service} 845 | \end{itemize} 846 | 847 | 848 | 849 | \chapter{Troubleshooting} 850 | 851 | \section{Metadata save} 852 | Sometimes the MFS master server freezes during the metadata save. To overcome this problem you should change one setting in your system. On your master machines, you should enable the overcommit memory setting by issuing the following command as root: 853 | 854 | \begin{lstlisting} 855 | # echo 1 > /proc/sys/vm/overcommit_memory 856 | \end{lstlisting} 857 | 858 | To do it permanently, you can add the following line to your \code{/etc/sysctl.conf} file (it works only on Linux): 859 | 860 | \begin{lstlisting} 861 | vm.overcommit_memory=1 862 | \end{lstlisting} 863 | 864 | More detail about the reasons for this behavior:\\ 865 | The Master server performs a fork operation, effectively spawning another process to save metadata to disk. Theoretically, when you fork a process, the process memory is copied. In real life it is done the lazy way -- the memory is marked, so that if any changes are to occur, a block with changes is copied as needed, but only then. Now, if you fork a process that has 180 GB of memory in use, the system can either "just do it", or first check whether it has 180 GB of free memory, reserve it for the forked "child", and only then proceed. In the latter case, when there is not enough memory, the fork operation fails -- this is the case in Linux, so the metadata is actually saved in the main process, because the fork operation failed. 866 | 867 | This behavior differs between systems and even between distributions of one system. 868 | 869 | It is safe to enable overcommit memory (the "just do it" way) with mfsmaster, because the forked process is short lived.
It terminates as soon as it manages to save metadata, and during the time that it works, there are usually not that many changes to the main process' memory, so the amount of additional RAM needed is relatively small. 870 | 871 | Alternatively, you can add a large amount of swap space (at least equal to the amount of physical RAM, or even more) on your master servers -- then the fork should succeed, because it should always find the needed memory space in your swap. 872 | 873 | \section{Master metadata restore from metaloggers} 874 | MooseFS Community Edition has only one master, but can have several metaloggers deployed for backup. If for some reason you lose all metadata files and changelogs from the master server, you can use data from a metalogger to restore your data. To start the recovery, first transfer all data stored on the metalogger in \code{/var/lib/mfs} to the master metadata folder. Files on the metalogger will have the \code{\_ml\_} prefix prepended to the filenames. After all files are copied, you need to create the \code{metadata.mfs} file from the changelogs and \code{metadata.back} files. To do this, use the command \code{mfsmaster -a}. \code{mfsmaster} builds a new metadata file and then starts the \code{mfsmaster} process. 875 | 876 | \section{Maintenance mode} 877 | Maintenance mode in general is helpful when there is a need for maintenance on Chunkserver(s), like a Chunkserver package upgrade to a newer version, adding new HDDs / replacing broken ones or a system upgrade (and e.g. reboot). 878 | 879 | Maintenance mode has been introduced, because in MooseFS 1.6, when there was a need for maintenance on Chunkserver(s) and it was necessary to turn server(s) off, a lot of replications were being performed, because MooseFS had started to replicate all undergoal chunks from another available copy to fulfill the goal (it's one of MooseFS principles).
Then, when it was back again -- a lot of deletions were running, because of the presence of overgoal chunks created during replications. The result was a lot of unnecessary I/O operations. 880 | 881 | By enabling maintenance mode before stopping Chunkserver(s) process(es) / turning machine(s) off or \textit{post factum}, you can prevent MooseFS from replicating chunks from such turned off Chunkserver(s). \textbf{Note: Server(s) in maintenance mode must match currently off (disconnected) servers. If they don't match, all chunks are replicated.} 882 | 883 | Additionally, MooseFS treats Chunkservers in maintenance mode as overloaded (no chunk creations, replications etc.). It means that new chunks are not created on Chunkservers in maintenance mode. The reason for this behavior is that when you turn a Chunkserver off / stop the Chunkserver process, some I/O operations may still be going to this Chunkserver at the moment of stopping, and when you just stop it, some write operations must be re-tried (because they haven't been finished on this stopped Chunkserver). When you turn maintenance mode on for a specific Chunkserver a few seconds before stopping it, MooseFS will finish write operations and won't start new ones on this Chunkserver.
884 | 885 | \textbf{Maintenance mode is designed to be a temporary state and it is not recommended to put Chunkservers in this mode for a long time.} 886 | 887 | You can enable or disable maintenance mode in the CGI monitor by clicking "switch on / switch off" 888 | in the "maintenance" column, or by sending a command using: 889 | \begin{itemize} 890 | \item \code{mfscli -CM1/ip/port} -- to switch maintenance mode on 891 | \item \code{mfscli -CM0/ip/port} -- to switch maintenance mode off 892 | \end{itemize} 893 | 894 | \bigskip 895 | 896 | \textbf{Note: If the number of Chunkservers in maintenance mode is equal to or greater than 20\% of all Chunkservers, MooseFS treats all Chunkservers as if maintenance mode were not enabled at all.} 897 | 898 | \section{Chunk replication priorities} 899 | In MooseFS 2.0 a few chunk replication classes and priorities have been introduced: 900 | \begin{itemize} 901 | \item Replication limit class 0 and class 1 -- replication for data safety 902 | \item Replication limit class 2 and class 3 -- equalization of used disk space 903 | \end{itemize} 904 | 905 | \bigskip 906 | \bigskip 907 | These classes and priorities are described below: 908 | \bigskip 909 | 910 | \begin{itemize} 911 | \item Replication limit class 0 (for endangered chunks): 912 | \begin{itemize} 913 | \item priority 0: 0 (chunk) copies on regular disks and 1 copy on a disk marked for removal 914 | \item priority 1: 1 copy on regular disks and 0 copies on disks marked for removal 915 | \end{itemize} 916 | 917 | \item Replication limit class 1 (for undergoal chunks): 918 | \begin{itemize} 919 | \item priority 2: 1 copy on a regular disk and some copies on disks marked for removal 920 | \item priority 3: $>$1 copy on regular disks and at least 1 copy on disks marked for removal 921 | \item priority 4: just undergoal chunks ("goal" $>$ "valid copies", no copies on disks marked for removal) 922 | \end{itemize} 923 | 924 | \item Replication limit class 2: Rebalancing between chunkservers with
disk space usage around arithmetic mean 925 | \item Replication limit class 3: Rebalancing between chunkservers with disk space usage strongly above or strongly below arithmetic mean (very low or very high disk space usage, e.g. when a new chunkserver is added) 926 | \end{itemize} 927 | 928 | \chapter{MooseFS Tools} 929 | 930 | \section{For MooseFS Master Server(s)} 931 | 932 | \subsection{\code{mfsmaster}} 933 | \code{mfsmaster} -- start, restart or stop Moose File System master process 934 | \bigskip 935 | 936 | \textbf{SYNOPSIS} 937 | \begin{itemize} 938 | \item \code{mfsmaster [-c CFGFILE] [-u] [-i] [-a] [-e] [-x[x]] [-t LOCKTIMEOUT] [ACTION]} 939 | \item \code{mfsmaster -v} 940 | \item \code{mfsmaster -h} 941 | \end{itemize} 942 | 943 | \bigskip 944 | \textbf{DESCRIPTION} 945 | 946 | \code{mfsmaster} is the master program of Moose File System. 947 | 948 | \bigskip 949 | 950 | \textbf{OPTIONS} 951 | \begin{itemize} 952 | \item \code{-v} -- print version information and exit 953 | \item \code{-h} -- print usage information and exit 954 | \item \code{-c CFGFILE} 955 | -- specify alternative path of configuration file (default is \code{mfsmaster.cfg} in system configuration directory) 956 | \item \code{-u} -- log undefined configuration values (when default is assumed) 957 | \item \code{-f} -- run in foreground, don't daemonize 958 | \item \code{-t LOCKTIMEOUT} 959 | -- how long to wait for lockfile (in seconds; default is 1800 seconds) 960 | \item \code{-i} -- ignore some metadata structure errors 961 | \item \code{-a} -- automatically restore metadata from change logs 962 | \item \code{-e} -- start without metadata (usable only in pro version -- used to start additional masters) 963 | \item \code{-x} -- produce more verbose output 964 | \item \code{-xx} -- even more verbose output 965 | 966 | \item \code{ACTION} is one of \code{start}, \code{stop}, \code{restart}, \code{reload}, \code{test} or \code{kill}. 967 | Default action is \code{restart}.
The test action will yield one of two 968 | responses: "\code{mfsmaster pid: PID}" or "\code{mfsmaster is not running}". 969 | The kill action will send a SIGKILL to the currently running 970 | master process. SIGHUP or reload action forces \code{mfsmaster} to 971 | reload all configuration files. 972 | \end{itemize} 973 | 974 | \bigskip 975 | \textbf{FILES} 976 | \begin{itemize} 977 | \item \code{mfsmaster.cfg} 978 | configuration file for MooseFS master process; refer to 979 | mfsmaster.cfg(5) manual for details 980 | 981 | \item \code{mfsexports.cfg} 982 | MooseFS access control file; refer to mfsexports.cfg(5) manual 983 | for details 984 | 985 | \item \code{mfstopology.cfg} 986 | Network topology definitions; refer to mfstopology.cfg(5) manual 987 | for details 988 | 989 | \item \code{.mfsmaster.lock} 990 | lock file of running MooseFS master process (created in data 991 | directory) 992 | 993 | \item \code{metadata.mfs}, \code{metadata.mfs.back} 994 | MooseFS filesystem metadata image (created in data directory) 995 | 996 | \item \code{changelog.*.mfs} 997 | MooseFS filesystem metadata change logs (created in data 998 | directory; merged into metadata.mfs once per hour) 999 | 1000 | \item \code{data.stats} 1001 | MooseFS master charts state (created in data directory) 1002 | \end{itemize} 1003 | 1004 | 1005 | \subsection{\code{mfsmetarestore}} 1006 | \code{mfsmetarestore} -- does not exist in this version of MooseFS 1007 | \bigskip 1008 | 1009 | \textbf{DESCRIPTION} 1010 | 1011 | This tool was removed as of version 1.7. To achieve the same effect, simply start \code{mfsmaster} with the \code{-a} parameter. 1012 | 1013 | \subsection{\code{mfsmetadump}} 1014 | \code{mfsmetadump} -- dump MooseFS metadata info in human readable format. 1015 | \bigskip 1016 | 1017 | \textbf{SYNOPSIS} 1018 | 1019 | \code{mfsmetadump metadata\_file} 1020 | \bigskip 1021 | 1022 | \textbf{DESCRIPTION} 1023 | \code{mfsmetadump} dumps MooseFS metadata info in human readable format.
Output consists of several sections with different types of information. Every section consists of header data -- rows starting with a hash (\#) sign -- and content data (may be empty). 1024 | \newpage 1025 | 1026 | \textbf{FILE HEADER} 1027 | \begin{itemize} 1031 | \item \code{header} -- MooseFS version 1032 | \item \code{version} -- metadata file version 1033 | \item \code{fileid} -- metadata file id 1034 | \end{itemize} 1035 | \bigskip 1036 | 1037 | \textbf{SECTION HEADER} 1038 | \begin{itemize} 1039 | \item \code{section header} -- section header (section type + version) 1040 | \item \code{length} -- length of section 1041 | \item \code{section type} -- name of section 1042 | \item \code{version} -- hexadecimal representation of section version 1043 | \end{itemize} 1044 | \bigskip 1045 | 1046 | \textbf{SESS SECTION} 1047 | \begin{itemize} 1048 | \item \code{nextsessionid} -- first free session id 1049 | \item \code{statscount} -- number of stats remembered in each session 1050 | \item \code{SESSION} -- line describing a single session 1051 | \begin{itemize} 1052 | \item \code{s} -- session id 1053 | \item \code{p} -- IP address 1054 | \item \code{r} -- root inode number 1055 | \item \code{f} -- session flags 1056 | \item \code{g} -- mingoal and maxgoal 1057 | \item \code{t} -- mintrashtime and maxtrashtime 1058 | \item \code{m} -- maproot uid,gid and mapall uid,gid 1059 | \item \code{d} -- disconnection time (optional) 1060 | \item \code{c} -- current hour stats data 1061 | \item \code{l} -- last hour stats data 1062 | \item \code{i} -- session name (usually local mount point) 1063 | \end{itemize} 1064 | \end{itemize} 1065 | \bigskip 1066 | 1067 | \textbf{NODES SECTION} 1068 | \begin{itemize} 1069 | \item \code{maxinode} -- maximum inode number used by system 1070 | \item \code{hashelements} -- number of inodes in hash table
1071 | \item \code{NODE} -- line with node (inode) description 1072 | \begin{itemize} 1073 | \item \code{k} -- node type (-,D,S,F,B,C,L,T,R) 1074 | \begin{itemize} 1075 | \item \code{-} -- file 1076 | \item \code{D} -- directory 1077 | \item \code{S} -- socket 1078 | \item \code{F} -- fifo 1079 | \item \code{B} -- block device 1080 | \item \code{C} -- character device 1081 | \item \code{L} -- symbolic link 1082 | \item \code{T} -- trash file 1083 | \item \code{R} -- sustained file (removed open file) 1084 | \end{itemize} 1085 | \item \code{i} -- inode number 1086 | \item \code{\#} -- labelset number (10+) or goal (1-9) 1087 | \item \code{e} -- flags 1088 | \item \code{m} -- mode 1089 | \item \code{u} -- uid 1090 | \item \code{g} -- gid 1091 | \item \code{a,m,c} -- atime, mtime and ctime timestamps 1092 | \item \code{t} -- trashtime 1093 | \item \code{d} -- rdevhi,rdevlo (only block and character devices) 1094 | \item \code{p} -- path (only symbolic links) 1095 | \item \code{l} -- file length (only files) 1096 | \item \code{c} -- chunk list (only files) 1097 | \item \code{r} -- sessions that have this file open (only files) 1098 | \end{itemize} 1099 | \end{itemize} 1100 | \bigskip 1101 | 1102 | \textbf{EDGES SECTION} 1103 | \begin{itemize} 1104 | \item \code{nextedgeid} -- next available edge id (descending) 1105 | \item \code{EDGE} -- line with edge description 1106 | \begin{itemize} 1107 | \item \code{p} -- parent inode number 1108 | \item \code{c} -- child inode number 1109 | \item \code{i} -- edge id 1110 | \item \code{n} -- edge name 1111 | \end{itemize} 1112 | \end{itemize} 1113 | \bigskip 1114 | 1115 | \textbf{FREE SECTION} 1116 | \begin{itemize} 1117 | \item \code{free nodes} -- number of free (reusable) nodes 1118 | \item \code{FREEID} -- line with free inode description 1119 | \begin{itemize} 1120 | \item \code{i} -- inode number 1121 | \item \code{f} -- deletion timestamp 1122 | \end{itemize} 1123 | \end{itemize} 1124 | \bigskip 1125 | 1126 | 
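As an illustration of the NODES SECTION fields listed above, the following Python sketch maps the single-letter node type codes of the \code{k} field to their meanings and interprets the \code{\#} field as either a goal (1-9) or a labelset number (10+). The mapping is taken from the list above; the function names are hypothetical, and the sketch deliberately does not parse real \code{mfsmetadump} output lines, because their exact layout is not specified here.

```python
# Node type codes of the "k" field, as listed in the NODES SECTION.
NODE_TYPES = {
    "-": "file",
    "D": "directory",
    "S": "socket",
    "F": "fifo",
    "B": "block device",
    "C": "character device",
    "L": "symbolic link",
    "T": "trash file",
    "R": "sustained file (removed open file)",
}


def describe_node_type(code: str) -> str:
    """Translate a node type code into its description."""
    try:
        return NODE_TYPES[code]
    except KeyError:
        raise ValueError(f"unknown node type code: {code!r}") from None


def goal_or_labelset(value: int) -> str:
    """Interpret the '#' field: 1-9 is a goal, 10 and above a labelset number."""
    if 1 <= value <= 9:
        return f"goal {value}"
    if value >= 10:
        return f"labelset {value}"
    raise ValueError(f"value out of range: {value}")


if __name__ == "__main__":
    print(describe_node_type("D"))
    print(goal_or_labelset(3))
    print(goal_or_labelset(12))
```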
\textbf{QUOTA SECTION} 1127 | \begin{itemize} 1128 | \item \code{quota nodes} -- number of nodes with quota 1129 | \item \code{QUOTA} -- line with quota description 1130 | \begin{itemize} 1131 | \item \code{i} -- inode number 1132 | \item \code{g} -- grace period 1133 | \item \code{e} -- exceeded 1134 | \item \code{f} -- flags 1135 | \item \code{s} -- soft quota exceeded timestamp 1136 | \item \code{si} -- soft inode quota 1137 | \item \code{hi} -- hard inode quota 1138 | \item \code{sl} -- soft length quota 1139 | \item \code{hl} -- hard length quota 1140 | \item \code{ss} -- soft size quota 1141 | \item \code{hs} -- hard size quota 1142 | \item \code{sr} -- soft real size quota 1143 | \item \code{hr} -- hard real size quota 1144 | \end{itemize} 1145 | \end{itemize} 1146 | \bigskip 1147 | 1148 | \textbf{XATTR SECTION} 1149 | \begin{itemize} 1150 | \item \code{XATTR} -- line with xattr description 1151 | \begin{itemize} 1152 | \item \code{i} -- inode number 1153 | \item \code{n} -- xattr name 1154 | \item \code{v} -- xattr value 1155 | \end{itemize} 1156 | \end{itemize} 1157 | \bigskip 1158 | 1159 | \textbf{POSIX ACL SECTION} 1160 | \begin{itemize} 1161 | \item \code{POSIXACL} -- line with acl description 1162 | \begin{itemize} 1163 | \item \code{i} -- inode number 1164 | \item \code{t} -- acl type 1165 | \item \code{u} -- user (file owner) permissions 1166 | \item \code{g} -- group permissions 1167 | \item \code{o} -- other permissions 1168 | \item \code{m} -- permission mask 1169 | \item \code{n} -- named permissions -- list of objects: 1170 | \begin{itemize} 1171 | \item \code{u(U):P} -- permissions \code{P} for user with uid \code{U} 1172 | \item \code{g(G):P} -- permissions \code{P} for group with gid \code{G} 1173 | \end{itemize} 1174 | \end{itemize} 1175 | \end{itemize} 1176 | \bigskip 1177 | 1178 | \textbf{CHUNKSERVERS SECTION} 1179 | \begin{itemize} 1180 | \item \code{chunk servers} -- number of chunkservers 1181 | \item \code{CHUNKSERVER} -- line with chunk
server description 1182 | \begin{itemize} 1183 | \item \code{i} -- server ip 1184 | \item \code{p} -- server port 1185 | \item \code{\#} -- server id 1186 | \item \code{m} -- maintenance mode 1187 | \end{itemize} 1188 | \end{itemize} 1189 | \bigskip 1190 | 1191 | \textbf{OPEN SECTION} 1192 | \begin{itemize} 1193 | \item \code{OPENFILE} -- line with open file description 1194 | \begin{itemize} 1195 | \item \code{s} -- session id 1196 | \item \code{i} -- inode number 1197 | \end{itemize} 1198 | \end{itemize} 1199 | \bigskip 1200 | 1201 | \textbf{CHUNKS SECTION} 1202 | \begin{itemize} 1203 | \item \code{nextchunkid} -- first available chunk number 1204 | \item \code{CHUNK} -- line with chunk description 1205 | \begin{itemize} 1206 | \item \code{i} -- chunk number 1207 | \item \code{v} -- chunk version 1208 | \item \code{t} -- "locked to" timestamp 1209 | \item \code{a} -- archive flag 1210 | \end{itemize} 1211 | \end{itemize} 1212 | 1213 | \section{For MooseFS Supervisor} 1214 | \subsection{\code{mfssupervisor}} 1215 | \code{mfssupervisor} -- choose or switch leader master 1216 | 1217 | \bigskip 1218 | \textbf{SYNOPSIS} 1219 | \begin{itemize} 1220 | \item \code{mfssupervisor [-xdfi] [-l new leader ip] [-H master host name]} \\ \code{[-P master supervising port]} 1221 | \item \code{mfssupervisor -v} 1222 | \item \code{mfssupervisor -h} 1223 | \end{itemize} 1224 | 1225 | \bigskip 1226 | \textbf{DESCRIPTION} \\\\ 1227 | \code{mfssupervisor} is the supervisor program of Moose File System. It is 1228 | needed to start a completely new system or a system after a big crash. 1229 | It can also be used to force the selection of a new leader master.
1230 | 1231 | \bigskip 1232 | \textbf{OPTIONS} 1233 | \begin{itemize} 1234 | \item \code{-v} -- print version information and exit 1235 | \item \code{-h} -- print usage information and exit 1236 | \item \code{-x} -- produce more verbose output 1237 | \item \code{-d} -- dry run (print info, but do not change anything) 1238 | \item \code{-f} -- force electing not synchronized follower; use this option to initialize a new system 1239 | \item \code{-i} -- print info only about masters state 1240 | \item \code{-l} -- try to switch current leader to given ip 1241 | \item \code{-H} -- use given host to find your master servers (default: \code{mfsmaster}) 1242 | \item \code{-P} -- use given port to connect to your master servers (default: \code{9419}) 1243 | \end{itemize} 1244 | \section{For MooseFS Command Line Interface} 1245 | \subsection{\code{mfscli}} 1246 | \code{mfscli} -- CGI in TXT mode 1247 | \bigskip 1248 | 1249 | \textbf{SYNOPSIS} 1250 | \begin{itemize} 1251 | \item \code{/usr/bin/mfscli [-pn28] [-H master\_host] [-P master\_port] [-f 0..3] \\ 1252 | -S(IN|IM|LI|IG|MU|IC|IL|CS|MB|HD|EX|MS|MO|QU) [-s separator] [-o order\_id [-r]] [-m mode\_id]} 1253 | 1254 | \item \code{/usr/bin/mfscli [-pn28] [-H master\_host] [-P master\_port] [-f 0..3] \\ 1255 | -C(RC/ip/port|BW/ip/port|M[01]/ip/port|RS/sessionid)} 1256 | 1257 | \item \code{mfscli -h} 1258 | \end{itemize} 1259 | \bigskip 1260 | 1261 | \textbf{DESCRIPTION} 1262 | 1263 | \code{mfscli} is a commandline counterpart to MooseFS's CGI interface. 
All the information available in CGI (except for graphs) can be obtained via CLI using different "monitoring options". 1264 | \bigskip 1265 | 1266 | \textbf{OPTIONS}: 1267 | \begin{itemize} 1268 | \item \code{-h} -- print help message 1269 | \item \code{-p} -- force plain text format on tty devices 1270 | \item \code{-n} -- do not resolve IP addresses (default when output device is not tty) 1271 | \item \code{-s separator} -- field separator to use in plain text format on tty devices (forces \code{-p}) 1272 | \item \code{-2} -- force 256-color terminal color codes 1273 | \item \code{-8} -- force 8-color terminal color codes 1274 | \item \code{-H master\_host} -- master address (default: \code{mfsmaster}) 1275 | \item \code{-P master\_port} -- master client port (default: \code{9421}) 1276 | 1277 | \item \code{-f 0..3} -- set frame charset to be displayed as table frames in tty mode 1278 | \begin{itemize} 1279 | \item \code{0} -- use simple ascii frames \code{+}, \code{-}, \code{|} (default) 1280 | \item \code{1} -- thick unicode frames 1281 | \item \code{2} -- thin unicode frames 1282 | \item \code{3} -- double unicode frames (dos style) 1283 | \end{itemize} 1284 | 1285 | \item \code{-o order\_id} -- sort data by column specified by \code{order\_id} (depends on data set) 1286 | \item \code{-r} -- reverse sort order 1287 | \item \code{-m mode\_id} -- show data specified by \code{mode\_id} (depends on data set) 1288 | \end{itemize} 1289 | \bigskip 1290 | 1291 | \textbf{MONITORING OPTIONS}: 1292 | \begin{itemize} 1293 | \item \code{-SIN} -- show full master info 1294 | \item \code{-SIM} -- show only masters states 1295 | \item \code{-SLI} -- show only licence info 1296 | \item \code{-SIG} -- show only general master (leader) info 1297 | \item \code{-SIC} -- show only chunks info (goal/copies matrices) 1298 | \item \code{-SIL} -- show only loop info (with messages) 1299 | \item \code{-SCS} -- show connected chunk servers 1300 | \item \code{-SMB} -- show connected metadata backup servers 1301
| \item \code{-SHD} -- show hdd data 1302 | \item \code{-SEX} -- show exports 1303 | \item \code{-SMS} -- show active mounts 1304 | \item \code{-SMO} -- show operation counters 1305 | \item \code{-SQU} -- show quota info 1306 | \item \code{-SMC} -- show master charts data 1307 | \item \code{-SCC} -- show chunkserver charts data 1308 | \end{itemize} 1309 | \bigskip 1310 | 1311 | \textbf{COMMANDS}: 1312 | \begin{itemize} 1313 | \item \code{-CRC/ip/port} -- remove given chunkserver from list of active chunkservers 1314 | \item \code{-CBW/ip/port} -- send given chunkserver back to work (from grace state) 1315 | \item \code{-CM1/ip/port} -- switch selected chunkserver to maintenance mode 1316 | \item \code{-CM0/ip/port} -- switch selected chunkserver to standard mode (from maintenance mode) 1317 | \item \code{-CRS/sessionid} -- remove given session 1318 | \end{itemize} 1319 | \bigskip 1320 | 1321 | \textbf{EXAMPLES}: 1322 | \begin{itemize} 1323 | \item \code{mfscli -SIC -2} -- shows table with chunk state matrix (number of chunks for each combination of valid copies and goal set by user) using extended terminal colors (256 colors) 1324 | \item \code{mfscli -SCS -f 1} -- shows table with all chunkservers using unicode thick frames 1325 | \item \code{mfscli -SMS -p -s ','} -- shows current sessions (mounts) using plain text format and comma as a separator 1326 | \end{itemize} 1327 | 1328 | \section{For MooseFS CGI Server} 1329 | \subsection{\code{mfscgiserv}} 1330 | \code{mfscgiserv} -- start HTTP/CGI server for Moose File System monitoring 1331 | \bigskip 1332 | 1333 | \textbf{SYNOPSIS} 1334 | \begin{itemize} 1335 | \item \code{mfscgiserv [-H BIND\_HOST] [-P BIND\_PORT] [-R ROOT\_PATH] [-t LOCKTIMEOUT] [-f [-v]] [ACTION]} 1336 | \item \code{mfscgiserv -h} 1337 | \end{itemize} 1338 | \bigskip 1339 | 1340 | \textbf{DESCRIPTION} 1341 | 1342 | \code{mfscgiserv} is a very simple HTTP server capable of running CGI scripts for Moose File System monitoring.
1343 | 1344 | \bigskip 1345 | \textbf{OPTIONS} 1346 | \begin{itemize} 1347 | \item \code{-h} -- print usage information and exit 1348 | \item \code{-H BIND\_HOST} -- local address to listen on (default: any) 1349 | \item \code{-P BIND\_PORT} -- port to listen on (default: \code{9425}) 1350 | \item \code{-R ROOT\_PATH} -- local path to use as HTTP document root (default is \code{CGIDIR} set up at configure time) 1351 | \item \code{-f} -- run in foreground, don't daemonize 1352 | \item \code{-v} -- log requests on stderr 1353 | \item \code{-t LOCKTIMEOUT} -- how long to wait for lockfile (in seconds; default is 60 seconds) 1354 | \end{itemize} 1355 | 1356 | \bigskip 1357 | \code{ACTION} is one of \code{start}, \code{stop}, \code{restart} or \code{test}. Default action is 1358 | \code{restart}. The \code{test} action will yield one of two responses: 1359 | "\code{mfscgiserv pid: PID}" or "\code{mfscgiserv is not running}". 1360 | 1361 | \section{For MooseFS Metalogger(s)} 1362 | \subsection{\code{mfsmetalogger}} 1363 | \code{mfsmetalogger} -- start, restart or stop Moose File System metalogger process 1364 | 1365 | \bigskip 1366 | \textbf{SYNOPSIS} 1367 | \begin{itemize} 1368 | \item \code{mfsmetalogger [-f] [-c CFGFILE] [-u] [-d] [-t LOCKTIMEOUT] [ACTION]} 1369 | \item \code{mfsmetalogger -s [-c CFGFILE]} 1370 | \item \code{mfsmetalogger -v} 1371 | \item \code{mfsmetalogger -h} 1372 | \end{itemize} 1373 | 1374 | \bigskip 1375 | \textbf{DESCRIPTION} \\\\ 1376 | \code{mfsmetalogger} is the metadata replication server of Moose File System. 1377 | Depending on parameters it can start, restart or stop the MooseFS metalogger 1378 | process. Without any options it starts the MooseFS metalogger, killing 1379 | the previously run process if a lock file exists. 1380 | 1381 | \code{SIGHUP} (or '\code{reload}' \code{ACTION}) forces \code{mfsmetalogger} to reload all configuration files.
1382 | 1383 | \code{mfsmetalogger} exists since version 1.6.5 of MooseFS; before this version 1384 | \code{mfschunkserver} was responsible for logging metadata changes. 1385 | 1386 | \begin{itemize} 1387 | \item \code{-v} -- print version information and exit 1388 | \item \code{-h} -- print usage information and exit 1389 | \item \code{-f} -- (\textbf{deprecated}, use start action instead) forcibly run MooseFS 1390 | metalogger process, without trying to kill previous instance (this 1391 | option allows running MooseFS metalogger if a stale PID file 1392 | exists) 1393 | \item \code{-s} -- (\textbf{deprecated}, use stop action instead) stop MooseFS metalogger process 1394 | 1395 | \item \code{-c CFGFILE} -- specify alternative path of configuration file (default is \code{mfsmetalogger.cfg} in system configuration directory) 1396 | \item \code{-u} -- log undefined configuration values (when default is assumed) 1397 | \item \code{-d} -- run in foreground, don't daemonize 1398 | \item \code{-t LOCKTIMEOUT} -- how long to wait for lockfile (in seconds; default is 60 seconds) 1399 | \end{itemize} 1400 | \bigskip 1401 | 1402 | \code{ACTION} is one of \code{start}, \code{stop}, \code{restart}, \code{reload}, \code{test} or \code{kill}. Default action is \code{restart} unless the \code{-s} (stop) or \code{-f} (start) option is given. Note that the \code{-s} and \code{-f} options are \textbf{deprecated} and likely to disappear, and the \code{ACTION} parameter is likely to become obligatory in MooseFS 1.7+.
1403 | \newpage 1404 | 1405 | \textbf{FILES} 1406 | \begin{itemize} 1407 | \item \code{mfsmetalogger.cfg} -- configuration file for MooseFS metalogger process; refer to mfsmetalogger.cfg(5) manual for details 1408 | \item \code{mfsmetalogger.lock} -- PID file of running MooseFS metalogger process (created in \code{RUN\_PATH} by MooseFS $<$ 1.6.9) 1409 | \item \code{.mfsmetalogger.lock} -- lock file of running MooseFS metalogger process (created in data directory since MooseFS 1.6.9) 1410 | \item \code{changelog\_ml.*.mfs} -- MooseFS filesystem metadata change logs (backup of master change log files) 1411 | \item \code{metadata.ml.mfs.back} -- Latest copy of complete \code{metadata.mfs.back} file from MooseFS master. 1412 | \item \code{sessions.ml.mfs} -- Latest copy of \code{sessions.mfs} file from MooseFS master. 1413 | \end{itemize} 1414 | 1415 | 1416 | \section{For MooseFS Chunkserver(s)} 1417 | \subsection{\code{mfschunkserver}} 1418 | \code{mfschunkserver} -- start, restart or stop Moose File System chunkserver process 1419 | 1420 | \bigskip 1421 | \textbf{SYNOPSIS} 1422 | \begin{itemize} 1423 | \item \code{mfschunkserver [-c CFGFILE] [-u] [-f] [-t LOCKTIMEOUT] [ACTION]} 1424 | \item \code{mfschunkserver -v} 1425 | \item \code{mfschunkserver -h} 1426 | \end{itemize} 1427 | 1428 | \bigskip 1429 | \textbf{DESCRIPTION} \\\\ 1430 | \code{mfschunkserver} is the data server of Moose File System. 
1431 | 1432 | \bigskip 1433 | \textbf{OPTIONS} 1434 | \begin{itemize} 1435 | \item \code{-v} -- print version information and exit 1436 | \item \code{-h} -- print usage information and exit 1437 | \item \code{-c CFGFILE} -- specify alternative path of configuration file (default is 1438 | \code{mfschunkserver.cfg} in system configuration directory) 1439 | \item \code{-u} -- log undefined configuration values (when default is assumed) 1440 | \item \code{-f} -- run in foreground, don't daemonize 1441 | \item \code{-t LOCKTIMEOUT} -- how long to wait for lockfile (in seconds; default is 60 seconds) 1442 | \end{itemize} 1443 | 1444 | \bigskip 1445 | \code{ACTION} is the one of \code{start}, \code{stop}, \code{restart}, \code{reload}, \code{test} or \code{kill}. Default action is \code{restart}. The test action will yield one of two responses: "\code{mfschunkserver pid: PID}" or "\code{mfschunkserver is not running}". The kill action will send a \code{SIGKILL} to the currently running chunkserver process. \code{SIGHUP} or reload action forces \code{mfschunkserver} to reload all configuration files. 
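The two responses of the \code{test} action described above can be told apart in a monitoring script. The Python sketch below assumes the response is exactly one of the two strings quoted in this manual ("\code{mfschunkserver pid: PID}" or "\code{mfschunkserver is not running}"), possibly with surrounding whitespace; the function name and the \code{daemon} parameter are hypothetical, and real output may contain additional text that this sketch does not handle.

```python
import re
from typing import Optional

# Matches "<daemon> pid: <number>", the "running" response quoted in the manual.
_PID_RE = re.compile(r"^(?P<name>\S+) pid: (?P<pid>\d+)$")


def parse_test_response(line: str, daemon: str = "mfschunkserver") -> Optional[int]:
    """Return the PID if the daemon is running, or None if it is not.

    Raises ValueError for any text that is neither of the two documented
    responses for the given daemon name.
    """
    text = line.strip()
    if text == f"{daemon} is not running":
        return None
    match = _PID_RE.match(text)
    if match and match.group("name") == daemon:
        return int(match.group("pid"))
    raise ValueError(f"unexpected response: {text!r}")


if __name__ == "__main__":
    print(parse_test_response("mfschunkserver pid: 1234"))
    print(parse_test_response("mfschunkserver is not running"))
```

The same helper can be pointed at the other daemons that document an identical pair of responses, e.g. \code{parse\_test\_response(line, daemon="mfsmaster")}.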
1446 | 1447 | \bigskip 1448 | \textbf{FILES} 1449 | \begin{itemize} 1450 | \item \code{mfschunkserver.cfg} -- configuration file for MooseFS chunkserver process; refer to mfschunkserver.cfg(5) manual for details 1451 | \item \code{mfshdd.cfg} -- list of directories (mountpoints) used for MooseFS storage; refer to mfshdd.cfg(5) manual for details 1452 | \item \code{.mfschunkserver.lock} -- lock file of running MooseFS chunkserver process (created in data directory) 1453 | \item \code{data.csstats} -- chunkserver charts state (created in data directory) 1454 | \end{itemize} 1455 | 1456 | \section{For MooseFS Client} 1457 | \subsection{\code{mfsmount}} 1458 | \code{mfsmount} -- mount Moose File System 1459 | 1460 | \bigskip 1461 | \textbf{SYNOPSIS} 1462 | \begin{itemize} 1463 | \item \code{mfsmount mountpoint [-d] [-f] [-s] [-m] [-n] [-p] [-H HOST] [-P PORT] [-S PATH] [-o opt[,opt]...]} 1464 | \item \code{mfsmount -h|--help} 1465 | \item \code{mfsmount -V|--version} 1466 | \end{itemize} 1467 | 1468 | \bigskip 1469 | \textbf{DESCRIPTION} 1470 | 1471 | Mount Moose File System. 
1472 | \bigskip 1473 | 1474 | General options: 1475 | \begin{itemize} 1476 | \item \code{-h}, \code{--help} -- display help and exit 1477 | \item \code{-V} -- display version information and exit 1478 | \end{itemize} 1479 | \bigskip 1480 | 1481 | FUSE options: 1482 | \begin{itemize} 1483 | \item \code{-d}, \code{-o debug} -- enable debug mode (implies \code{-f}) 1484 | \item \code{-f} -- foreground operation 1485 | \item \code{-s} -- disable multi-threaded operation 1486 | \end{itemize} 1487 | \newpage 1488 | 1489 | MooseFS options: 1490 | \begin{itemize} 1491 | \item \code{-c CFGFILE, -o mfscfgfile=CFGFILE} 1492 | -- loads file with additional mount options 1493 | 1494 | \item \code{-m}, \code{--meta}, \code{-o mfsmeta} 1495 | -- mount MFSMETA companion filesystem instead of primary MooseFS 1496 | 1497 | \item \code{-n} 1498 | -- omit default mount options (\code{-o allow\_other,default\_permissions}) 1499 | 1500 | \item \code{-p} 1501 | -- prompt for password (interactive version of \code{-o mfspassword=PASS}) 1502 | 1503 | \item \code{-H HOST}, \code{-o mfsmaster=HOST} 1504 | -- connect with MooseFS master on \code{HOST} (default is mfsmaster) 1505 | 1506 | \item \code{-P PORT}, \code{-o mfsport=PORT} 1507 | -- connect with MooseFS master on \code{PORT} (default is 9421) 1508 | 1509 | \item \code{-B HOST}, \code{-o mfsbind=HOST} 1510 | -- local address to use for connecting with master instead of 1511 | default one 1512 | 1513 | \item \code{-S PATH}, \code{-o mfssubfolder=PATH} 1514 | -- mount specified MooseFS directory (default is \code{/}, i.e. 
whole 1515 | filesystem) 1516 | 1517 | \item \code{-o mfspassword=PASSWORD} 1518 | -- authenticate to MooseFS master with PASSWORD 1519 | 1520 | \item \code{-o mfsmd5pass=MD5} 1521 | -- authenticate to MooseFS master using directly given MD5 (only if 1522 | \code{mfspassword} option is not specified) 1523 | 1524 | \item \code{-o mfsdonotrememberpassword} 1525 | -- do not remember password in memory -- more secure, but when session is lost then new session is created without password 1526 | 1527 | \item \code{-o mfsdebug} 1528 | -- print some MooseFS-specific debugging information 1529 | 1530 | \item \code{-o mfsdelayedinit} 1531 | -- connection with master is done in background -- with this option mount can be run without network (good for being run from fstab / init scripts etc.) 1532 | 1533 | \item \code{-o mfsmkdircopysgid=N} 1534 | -- sgid bit should be copied during mkdir operation (default depends on operating system) 1535 | 1536 | \item \code{-o mfssugidclearmode=SMODE} 1537 | -- set sugid clear mode (see SUGID CLEAR MODES; default depends on operating system) 1538 | 1539 | \item \code{-o mfscachemode=CMODE} 1540 | -- set cache mode (see DATA CACHE MODES; default is AUTO) 1541 | 1542 | \item \code{-o mfscachefiles} 1543 | -- (deprecated) preserve file data in cache (equivalent to '-o mfscachemode=YES') 1544 | 1545 | \item \code{-o mfsattrcacheto=SEC} 1546 | -- set attributes cache timeout in seconds (default: 1.0) 1547 | 1548 | \item \code{-o mfsxattrcacheto=SEC} 1549 | -- set extended attributes (xattr) cache timeout in seconds (default: 30.0) 1550 | 1551 | \item \code{-o mfsentrycacheto=SEC} 1552 | -- set file entry cache timeout in seconds (default: 0.0, i.e. 
no cache) 1553 | 1554 | \item \code{-o mfsdirentrycacheto=SEC} 1555 | -- set directory entry cache timeout in seconds (default: 1.0) 1556 | 1557 | \item \code{-o mfsnegentrycacheto=SEC} 1558 | -- set negative entry cache timeout in seconds (default: 1.0) 1559 | 1560 | \item \code{-o mfsgroupscacheto=SEC} 1561 | -- set supplementary groups cache timeout in seconds (default: 300.0) 1562 | 1563 | \item \code{-o mfsrlimitnofile=N} 1564 | -- try to change limit of simultaneously opened file descriptors on startup (default: 100000) 1565 | 1566 | \item \code{-o mfsnice=LEVEL} 1567 | -- try to change nice level to specified value on startup (default: -19) 1568 | 1569 | \item \code{-o mfswritecachesize=N} 1570 | -- specify write cache size in MiB (in range: 16..2048; default: 250) 1571 | 1572 | \item \code{-o mfsioretries=N} 1573 | -- specify number of retries before I/O error is returned (default: 30) 1574 | \end{itemize} 1575 | 1576 | \bigskip 1577 | General mount options (see mount(8) manual): 1578 | \begin{itemize} 1579 | \item \code{-o rw | -o ro} 1580 | -- Mount file-system in read-write (default) or read-only mode respectively. 1581 | 1582 | \item \code{-o suid | -o nosuid} 1583 | -- Enable or disable suid/sgid attributes to work. 1584 | 1585 | \item \code{-o dev | -o nodev} 1586 | -- Enable or disable character or block special device files interpretation. 1587 | 1588 | \item \code{-o exec | -o noexec} 1589 | -- Allow or disallow execution of binaries. 1590 | \end{itemize} 1591 | \bigskip 1592 | 1593 | \textbf{SUGID CLEAR MODES} 1594 | 1595 | When attributes are changed, file systems sometimes clear the suid and/or sgid flags. The behavior differs between file systems. MFS tries to mimic the behavior of the most popular file system on a given operating system.
1596 | 1597 | \begin{itemize} 1598 | \item \code{NEVER} -- MFS will not change suid and sgid bit on chown 1599 | \item \code{ALWAYS} -- clear suid and sgid on every chown - safest operation 1600 | \item \code{OSX} -- standard behavior in OS X and Solaris (chown made by unprivileged user clear suid and sgid) 1601 | \item \code{BSD} -- standard behavior in *BSD systems (like in OSX, but only when something is really changed) 1602 | \item \code{EXT} -- standard behavior in most file systems on Linux (directories not changed, others: suid cleared always, sgid only when group exec bit is set) 1603 | \item \code{XFS} -- standard behavior in XFS on Linux (like EXT but directories are changed by unprivileged users) 1604 | \end{itemize} 1605 | \bigskip 1606 | 1607 | \textbf{DATA CACHE MODES} 1608 | 1609 | There are three cache modes: \code{NO}, \code{YES} and \code{AUTO}. Default option is \code{AUTO} and you shouldn't change it unless you really know what you are doing. In AUTO mode data cache is managed automatically by \code{mfsmaster}. 
1610 | 1611 | \begin{itemize} 1612 | \item \code{NO}, \code{NONE} or \code{NEVER} -- never allow files data to be kept in cache (safest but can reduce efficiency) 1613 | \item \code{YES} or \code{ALWAYS} -- always allow files data to be kept in cache (dangerous) 1614 | \item \code{AUTO} -- file cache is managed by \code{mfsmaster} automatically (should be very safe and efficient) 1615 | \end{itemize} 1616 | 1617 | 1618 | \subsection{\code{mfstools}} 1619 | \code{mfstools} -- perform MooseFS-specific operations 1620 | \bigskip 1621 | 1622 | \textbf{SYNOPSIS} 1623 | \begin{itemize} 1624 | \item \code{mfsgetgoal [-r] [-n|-h|-H|-k|-m|-g] OBJECT...} 1625 | \item \code{mfsrgetgoal [-n|-h|-H|-k|-m|-g] OBJECT...} 1626 | \item \code{mfssetgoal [-r] [-n|-h|-H|-k|-m|-g] [+|-]N OBJECT...} 1627 | \item \code{mfsrsetgoal [-n|-h|-H|-k|-m|-g] [+|-]N OBJECT...} 1628 | \item \code{mfsgettrashtime [-r] [-n|-h|-H|-k|-m|-g] OBJECT...} 1629 | \item \code{mfsrgettrashtime [-n|-h|-H|-k|-m|-g] OBJECT...} 1630 | \item \code{mfssettrashtime [-r] [-n|-h|-H|-k|-m|-g] [+|-]SECONDS OBJECT...} 1631 | \item \code{mfsrsettrashtime [-n|-h|-H|-k|-m|-g] [+|-]SECONDS OBJECT...} 1632 | \item \code{mfsgeteattr [-r] [-n|-h|-H|-k|-m|-g] OBJECT...} 1633 | \item \code{mfsseteattr [-r] [-n|-h|-H|-k|-m|-g] -f ATTRNAME [-f ATTRNAME ...] OBJECT...} 1634 | \item \code{mfsdeleattr [-r] [-n|-h|-H|-k|-m|-g] -f ATTRNAME [-f ATTRNAME ...] OBJECT...} 1635 | \item \code{mfscheckfile FILE...} 1636 | \item \code{mfsfileinfo FILE...} 1637 | \item \code{mfsdirinfo [-n|-h|-H|-k|-m|-g] OBJECT...} 1638 | \item \code{mfsfilerepair [-n|-h|-H|-k|-m|-g] FILE...} 1639 | \item \code{mfsappendchunks SNAPSHOT\_FILE OBJECT...} 1640 | \item \code{mfsmakesnapshot [-o] SOURCE... 
DESTINATION} 1641 | \item \code{mfsgetquota [-n|-h|-H|-k|-m|-g] DIRECTORY...} 1642 | \item \code{mfssetquota [-n|-h|-H|-k|-m|-g] [-i|-I inodes] [-l|-L length] 1643 | [-s|-S size] [-r|-R realsize] DIRECTORY...} 1644 | \item \code{mfsdelquota [-a|-A|-i|-I|-l|-L|-s|-S|-r|-R] [-n|-h|-H|-k|-m|-g] 1645 | -f DIRECTORY...} 1646 | \item \code{mfsfilepaths OBJECT|INODE...} 1647 | \end{itemize} 1648 | \bigskip 1649 | 1650 | \textbf{DESCRIPTION} 1651 | 1652 | \begin{itemize} 1653 | \item \code{mfsgetgoal} and \code{mfssetgoal} operate on object's goal value, i.e. the number 1654 | of copies in which all file data are stored. It means that file 1655 | should survive failure of one less chunkservers than its goal value. 1656 | Goal must be set between \code{1} and \code{9} (note that \code{1} is strongly unadvised). 1657 | \code{mfsgetgoal} prints current goal value of given object(s). \code{-r} option 1658 | enables recursive mode, which works as usual for every given file, but 1659 | for every given directory additionally prints current goal value of all 1660 | contained objects (files and directories). \code{mfssetgoal} changes current 1661 | goal value of given object(s). If new value is specified in \code{+N} form, 1662 | goal value is increased to \code{N} for objects with lower goal value and 1663 | unchanged for the rest. Similarly, if new value is specified as \code{-N}, 1664 | goal value is decreased to \code{N} for objects with higher goal value and 1665 | unchanged for the rest. \code{-r} option enables recursive mode. These tools 1666 | can be used on any file, directory or deleted (trash) file. 1667 | 1668 | \item \code{mfsrgetgoal} and \code{mfsrsetgoal} are deprecated aliases for \code{mfsgetgoal -r} 1669 | and \code{mfssetgoal -r} respectively. 1670 | 1671 | \item \code{mfsgettrashtime} and \code{mfssettrashtime} operate on object's trashtime 1672 | value, i.e. 
the number of seconds the file is preserved in special 1673 | trash directory before it's finally removed from filesystem. Trashtime 1674 | must be a non-negative integer value. \code{mfsgettrashtime} prints current 1675 | trashtime value of given object(s). \code{-r} option enables recursive mode, 1676 | which works as usual for every given file, but for every given directory 1677 | additionally prints current trashtime value of all contained 1678 | objects (files and directories). \code{mfssettrashtime} changes current 1679 | trashtime value of given object(s). If new value is specified in \code{+N} 1680 | form, trashtime value is increased to \code{N} for objects with lower trashtime 1681 | value and unchanged for the rest. Similarly, if new value is specified 1682 | as \code{-N}, trashtime value is decreased to \code{N} for objects with higher 1683 | trashtime value and unchanged for the rest. \code{-r} option enables recursive 1684 | mode. These tools can be used on any file, directory or deleted 1685 | (trash) file. 1686 | 1687 | \item \code{mfsrgettrashtime} and \code{mfsrsettrashtime} are deprecated aliases for \code{mfsgettrashtime -r} and \code{mfssettrashtime -r} respectively. 1688 | 1689 | \item \code{mfsgeteattr}, \code{mfsseteattr} and \code{mfsdeleattr} tools are used to get, set or 1690 | delete some extra attributes. Attributes are described below. 1691 | 1692 | \item \code{mfscheckfile} checks and prints number of chunks and number of chunk 1693 | copies belonging to specified file(s). It can be used on any file, 1694 | including deleted (trash) files. 1695 | 1696 | \item \code{mfsfileinfo} prints location (chunkserver host and port) of each chunk 1697 | copy belonging to specified file(s). It can be used on any file, 1698 | including deleted (trash) files. 1699 | 1700 | \item \code{mfsdirinfo} is an extended, MooseFS-specific equivalent of the \code{du -s} command. 1701 | It prints summary for each specified object (single file or directory 1702 | tree). 
If you only want to see one parameter, add one of the show options (see SHOW OPTIONS). 1703 | 1704 | \item \code{mfsfilerepair} deals with broken files (those which cause I/O errors on 1705 | read operations) to make them partially readable. In case of missing 1706 | chunk it fills missing parts of file with zeroes; in case of chunk version 1707 | mismatch it sets chunk version known to \code{mfsmaster} to highest one 1708 | found on chunkservers. Note: because in the second case content mismatch 1709 | can occur in chunks with the same version, it's advised to make a 1710 | copy (not a snapshot!) and delete original file after "repairing". 1711 | 1712 | \item \code{mfsappendchunks} (equivalent of \code{mfssnapshot} from MooseFS 1.5) appends a 1713 | lazy copy of specified file(s) to specified snapshot file ("lazy" means 1714 | that creation of new chunks is delayed to the moment one copy is modified). 1715 | If multiple files are given, they are merged into one target 1716 | file in the way that each file begins at chunk (64MB) boundary; padding 1717 | space is left empty. 1718 | 1719 | \item \code{mfsmakesnapshot} makes a "real" snapshot (lazy copy, like in case of 1720 | \code{mfsappendchunks}) of some object(s) or subtree (similarly to the \code{cp -r} command). 1721 | It's atomic with respect to each \code{SOURCE} argument separately. If 1722 | \code{DESTINATION} points to already existing file, error will be reported 1723 | unless \code{-o} (overwrite) option is given. Note: if \code{SOURCE} is a directory, 1724 | it's copied as a whole; but if it's followed by trailing slash, only 1725 | directory content is copied. 1726 | 1727 | \item \code{mfsgetquota}, \code{mfssetquota} and \code{mfsdelquota} tools are used to check, 1728 | define and delete quotas. Quota is set on a directory. 
It can be set in 1729 | one of 4 ways: for number of inodes inside the directory (total sum of 1730 | the subtree's inodes) with \code{-i}, \code{-I} options, for sum of (logical) file 1731 | lengths with \code{-l}, \code{-L} options, for sum of chunk sizes (not considering 1732 | goals) with \code{-s}, \code{-S} options and for physical hdd space (more or less 1733 | chunk sizes multiplied by goal of each chunk) with \code{-r}, \code{-R} options. 1734 | Lowercase letters set soft quota, uppercase letters set hard quota. \code{-a} and \code{-A} 1735 | options in \code{mfsdelquota} mean all kinds of quota. Quota behavior is 1736 | described below. 1737 | 1738 | \item \code{mfsfilepaths} tool can be used to find all occurrences (hard links) of given file in filesystem. It can also be used to find a file by its i-node number. When searching by i-node, the tool has to be run inside an MFS-mounted directory. 1739 | \end{itemize} 1740 | \bigskip 1741 | 1742 | \textbf{GENERAL OPTIONS} 1743 | 1744 | Most of the mfstools use \code{-n}, \code{-h}, \code{-H}, \code{-k}, \code{-m} and \code{-g} options to select format 1745 | of printed numbers. \code{-n} prints exact numbers, \code{-h} uses binary 1746 | prefixes (\code{Ki}, \code{Mi}, \code{Gi} as $2^{10}$, $2^{20}$ etc.) while \code{-H} uses \code{SI} prefixes (\code{k}, 1747 | \code{M}, \code{G} as $10^3$, $10^6$ etc.). \code{-k}, \code{-m} and \code{-g} show plain numbers respectively 1748 | in kibis (binary kilo -- 1024), mebis (binary mega -- $1024^2$) and gibis 1749 | (binary giga -- $1024^3$). The same can be achieved by setting \code{MFSHRFORMAT} 1750 | environment variable to: \code{0} (exact numbers), \code{1} or \code{h} (binary prefixes), 1751 | \code{2} or \code{H} (\code{SI} prefixes), \code{3} or \code{h+} (exact numbers and binary prefixes), 1752 | \code{4} or \code{H+} (exact numbers and \code{SI} prefixes). The default is to 1753 | print just exact numbers. 
1754 | \bigskip 1755 | 1756 | \textbf{SHOW OPTIONS} 1757 | \begin{itemize} 1758 | \item \code{-i} -- show number of inodes 1759 | \item \code{-d} -- show number of directories 1760 | \item \code{-f} -- show number of files 1761 | \item \code{-c} -- show number of chunks 1762 | \item \code{-l} -- show length 1763 | \item \code{-s} -- show size 1764 | \item \code{-r} -- show realsize 1765 | \end{itemize} 1766 | \bigskip 1767 | 1768 | \textbf{EXTRA ATTRIBUTES} 1769 | \begin{itemize} 1770 | \item \code{noowner} -- This flag means that the particular object belongs to the current user 1771 | (\code{uid} and \code{gid} are equal to \code{uid} and \code{gid} values of accessing process). Only \code{root} (\code{uid=0}) sees the real \code{uid} and \code{gid}. 1772 | 1773 | \item \code{noattrcache} -- This flag means that standard file attributes such as \code{uid}, \code{gid}, \code{mode}, \code{length} and so on won't be stored in kernel cache. In MooseFS 1.5 this was the only behavior, and \code{mfsmount} always prevented attributes from being stored in kernel cache, but in MooseFS 1.6 attributes can be cached, so on very rare occasions it could be useful to turn caching off. 1774 | 1775 | \item \code{noentrycache} -- This flag is similar to the one above. It prevents directory entries from being cached in kernel. 1776 | \end{itemize} 1777 | \bigskip 1778 | 1779 | \textbf{QUOTAS} 1780 | 1781 | Quota is always set on a directory. Hard quota cannot be exceeded at any 1782 | time. Soft quota can be exceeded for a period of time (7 days). Once a 1783 | quota is exceeded in a directory, user must go below the quota during 1784 | the next 7 days. If not, the soft quota for this particular directory 1785 | starts to behave like a hard quota. The 7 days period is global and 1786 | cannot currently be modified. 
1787 | \bigskip 1788 | 1789 | \textbf{INHERITANCE} 1790 | 1791 | When a new object is created in MooseFS, attributes such as \code{goal}, \code{trashtime} 1792 | and extra attributes are inherited from the parent directory. So if 1793 | you set e.g. the "\code{noowner}" attribute and \code{goal} to \code{3} in a directory then 1794 | every new object created in this directory will have \code{goal} set to \code{3} and 1795 | "\code{noowner}" flag set. A newly created object always inherits the current 1796 | set of its parent's attributes. Changing a directory attribute does not 1797 | affect its already created children. To change an attribute for a 1798 | directory and all of its children use the "\code{-r}" option. 1799 | 1800 | \chapter{MooseFS Configuration Files} 1801 | \section{For MooseFS Master Server(s)} 1802 | \bigskip 1803 | \begin{center} 1804 | \textbf{\fcolorbox{red}{pink}{Warning: Configuration files on all Master Servers must be consistent!}} 1805 | \end{center} 1806 | \bigskip 1807 | 1808 | \subsection{\code{mfsmaster.cfg}} 1809 | \code{mfsmaster.cfg} -- main configuration file for \code{mfsmaster} 1810 | 1811 | \bigskip 1812 | \textbf{DESCRIPTION} \\\\ 1813 | The file \code{mfsmaster.cfg} contains configuration of MooseFS master 1814 | process. 1815 | 1816 | \bigskip 1817 | \textbf{SYNTAX} \\\\ 1818 | Syntax is: 1819 | 1820 | \code{OPTION = VALUE} 1821 | 1822 | Lines starting with \code{\#} character are ignored as comments. 
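
As a minimal sketch of this syntax, the fragment below sets a few of the options described in this section to their documented defaults. The selection is purely illustrative and is not a recommended configuration.

\begin{lstlisting}
# illustrative mfsmaster.cfg fragment (values are the documented defaults)
NICE_LEVEL = -19
BACK_LOGS = 50
CHUNKS_LOOP_MAX_CPS = 100000
CHUNKS_LOOP_MIN_TIME = 300
QUOTA_TIME_LIMIT = 604800
\end{lstlisting}
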
1823 | 1824 | \bigskip 1825 | \textbf{OPTIONS} \\\\ 1826 | Configuration options: 1827 | \begin{itemize} 1828 | \item \code{WORKING\_USER} 1829 | -- user to run daemon as 1830 | 1831 | \item \code{WORKING\_GROUP} 1832 | -- group to run daemon as; optional value - if empty then default 1833 | user group will be used 1834 | 1835 | \item \code{SYSLOG\_IDENT} 1836 | -- name of process to place in syslog messages; default is \code{mfsmaster} 1837 | 1838 | \item \code{LOCK\_MEMORY} 1839 | -- whether to perform mlockall() to avoid swapping out \code{mfsmaster} 1840 | process; default is 0, i.e. no 1841 | 1842 | \item \code{NICE\_LEVEL} 1843 | -- nice level to run daemon with; default is -19; note: process 1844 | must be started as root to increase priority; if setting of 1845 | priority fails, process retains the nice level it started with 1846 | 1847 | \item \code{FILE\_UMASK} 1848 | -- set default umask for group and others (user has always 0); 1849 | default is \code{027} -- block write for group and block all for others 1850 | 1851 | \item \code{DATA\_PATH} 1852 | -- where to store metadata files and lock file 1853 | 1854 | \item \code{EXPORTS\_FILENAME} 1855 | -- alternate location/name of \code{mfsexports.cfg} file 1856 | 1857 | \item \code{TOPOLOGY\_FILENAME} 1858 | -- alternate location/name of \code{mfstopology.cfg} file 1859 | 1860 | \item \code{LICENCE\_FILENAME} 1861 | -- alternate location/name of \code{mfslicence.bin} file (pro version only) 1862 | 1863 | \item \code{BACK\_LOGS} 1864 | -- number of metadata change log files (default is 50) 1865 | 1866 | \item \code{BACK\_META\_KEEP\_PREVIOUS} 1867 | -- number of previous metadata files to be kept (default is 1) 1868 | 1869 | \item \code{CHANGELOG\_PRESERVE\_SECONDS} 1870 | -- how many seconds of change logs have to be preserved in memory 1871 | (default is 1800; this sets the minimum, actual number may be a 1872 | bit bigger due to logs being kept in 5k blocks; zero disables 1873 | extra logs storage) 1874 | 
1875 | \item \code{MISSING\_LOG\_CAPACITY} 1876 | -- how many missing chunks will be stored in master (up to \\ \code{100*MISSING\_LOG\_CAPACITY} bytes of memory will be allocated; default value is 100000) 1877 | 1878 | \item \code{MATOML\_LISTEN\_HOST} 1879 | -- IP address to listen on for \code{metalogger}, \code{master}s and \code{supervisor}s connections (* means any) 1880 | 1881 | \item \code{MATOML\_LISTEN\_PORT} 1882 | -- port to listen on for \code{metalogger}, \code{master}s and \code{supervisor}s connections 1883 | 1884 | \item \code{MASTER\_RECONNECTION\_DELAY} 1885 | -- delay in seconds before next try to reconnect to \code{master-leader} 1886 | if not connected (default is 5) 1887 | 1888 | \item \code{MASTER\_TIMEOUT} 1889 | -- timeout in seconds for \code{master-leader} connections (pro version only; default is 10) 1890 | 1891 | \item \code{BIND\_HOST} 1892 | -- local address to use for connecting with \code{master-leader} (pro version only; default is *, i.e. default local address) 1893 | 1894 | \item \code{MATOCS\_LISTEN\_HOST} 1895 | -- IP address to listen on for \code{chunkserver} connections (\code{*} means any) 1896 | 1897 | \item \code{MATOCS\_LISTEN\_PORT} 1898 | -- port to listen on for \code{chunkserver} connections 1899 | 1900 | \item \code{MATOCS\_TIMEOUT} 1901 | -- timeout in seconds for master-chunkserver connection (default is 10) 1902 | 1903 | \item \code{REPLICATIONS\_DELAY\_INIT} 1904 | -- initial delay in seconds before starting replications (default is 300) 1905 | 1906 | \item \code{CHUNKS\_LOOP\_MAX\_CPS} 1907 | -- Chunks loop shouldn't check more chunks per seconds than given 1908 | number (default is 100000) 1909 | 1910 | \item \code{CHUNKS\_LOOP\_MIN\_TIME} 1911 | -- Chunks loop shouldn't be done in less seconds than given number 1912 | (default is 300) 1913 | 1914 | \item \code{CHUNKS\_SOFT\_DEL\_LIMIT} 1915 | -- Soft maximum number of chunks to delete on one \code{chunkserver} 1916 | (default is 10) 1917 | 1918 | \item 
\code{CHUNKS\_HARD\_DEL\_LIMIT} 1919 | -- Hard maximum number of chunks to delete on one \code{chunkserver} 1920 | (default is 25) 1921 | 1922 | \item \code{CHUNKS\_WRITE\_REP\_LIMIT} 1923 | -- Maximum number of chunks to replicate to one \code{chunkserver} (default is 2,1,1,4 -- see NOTES) 1924 | 1925 | \item \code{CHUNKS\_READ\_REP\_LIMIT} 1926 | -- Maximum number of chunks to replicate from one \code{chunkserver} (default is 10,5,2,5 -- see NOTES) 1927 | 1928 | \item \code{CS\_HEAVY\_LOAD\_THRESHOLD} 1929 | -- Threshold for chunkserver load (default is 100 -- see NOTES) 1930 | 1931 | \item \code{CS\_HEAVY\_LOAD\_RATIO\_THRESHOLD} 1932 | -- Threshold ratio for chunkserver load (default is 5.0 -- see NOTES) 1933 | 1934 | \item \code{CS\_HEAVY\_LOAD\_GRACE\_PERIOD} 1935 | -- Defines how long chunkservers will remain in 'grace' mode (default is 900 -- see NOTES) 1936 | 1937 | \item \code{ACCEPTABLE\_DIFFERENCE} 1938 | -- Maximum difference between space usage of chunkservers (deprecated, use \code{ACCEPTABLE\_PERCENTAGE\_DIFFERENCE} instead) 1939 | 1940 | \item \code{ACCEPTABLE\_PERCENTAGE\_DIFFERENCE} 1941 | -- Maximum percentage difference between space usage of chunkservers (default is 1 = 1\%) 1942 | \item \code{PRIORITY\_QUEUES\_LENGTH} 1943 | -- Length of priority queues (for endangered, undergoal etc. 
chunks -- chunks that should be processed first -- default is 1000000) 1944 | 1945 | \item \code{MATOCL\_LISTEN\_HOST} 1946 | -- IP address to listen on for client (mount) connections (\code{*} means any) 1947 | 1948 | \item \code{MATOCL\_LISTEN\_PORT} 1949 | -- port to listen on for client (mount) connections 1950 | 1951 | \item \code{SESSION\_SUSTAIN\_TIME} 1952 | -- How long to sustain a disconnected client session (in seconds; default is 86400 = 1 day) 1953 | 1954 | \item \code{QUOTA\_TIME\_LIMIT} 1955 | -- Time limit in seconds for soft quota (default is 604800 = 7 days) 1956 | 1957 | \item \code{ATIME\_MODE} 1958 | -- Set atime modification mode (default is 0 = always modify atime -- see NOTES) 1959 | \end{itemize} 1960 | \bigskip 1961 | 1962 | \textbf{NOTES} 1963 | 1964 | Chunks in master are tested in a loop. Speed (or frequency) is 1965 | regulated by two options: \code{CHUNKS\_LOOP\_MIN\_TIME} and \code{CHUNKS\_LOOP\_MAX\_CPS}. 1966 | The first defines the minimal time between iterations of the loop and the second 1967 | defines the maximal number of chunk tests per second. Typically at the 1968 | beginning, when the number of chunks is small, the time is constant, regulated 1969 | by \code{CHUNKS\_LOOP\_MIN\_TIME}, but when the number of chunks becomes bigger, 1970 | the time of the loop can increase according to\\ 1971 | \code{CHUNKS\_LOOP\_MAX\_CPS}. 1972 | 1973 | Example: \code{CHUNKS\_LOOP\_MIN\_TIME} is set to 300, \code{CHUNKS\_LOOP\_MAX\_CPS} is set 1974 | to 100000 and there are 1000000 (one million) chunks in the system. 1975 | 1000000/100000 = 10, which is less than 300, so one loop iteration will 1976 | take 300 seconds. With 1000000000 (one billion) chunks the system 1977 | needs 10000 seconds for one iteration of the loop. 1978 | 1979 | Deletion limits are defined as a 'soft' and a 'hard' limit. When the number of 1980 | chunks to delete increases from loop to loop, the current limit can be temporarily 1981 | increased above the soft limit, but never above the hard limit. 
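
The chunk loop timing from the example above can be condensed into one expression. This is only a sketch of the relationship implied by the two options; the symbols $N$ (number of chunks in the system) and $T$ (approximate duration of one loop iteration in seconds) are introduced here for illustration and do not appear in the configuration:

\[
T \approx \max\left(\texttt{CHUNKS\_LOOP\_MIN\_TIME},\ \frac{N}{\texttt{CHUNKS\_LOOP\_MAX\_CPS}}\right)
\]

With the default values, $N = 10^6$ gives $T \approx \max(300, 10) = 300$ seconds, and $N = 10^9$ gives $T \approx \max(300, 10000) = 10000$ seconds, matching the figures in the example.
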
1982 | 1983 | Replication limits are divided into four cases: 1984 | \begin{itemize} 1985 | \item first limit is for endangered chunks (chunks with only one copy) 1986 | \item second limit is for undergoal chunks (chunks with number of copies lower than specified goal) 1987 | \item third limit is for rebalance between servers with space usage around arithmetic mean 1988 | \item fourth limit is for rebalance between other servers (very low or very high space usage) 1989 | \end{itemize} 1990 | 1991 | Usually the first number should be greater than or equal to the second, the second greater than or equal to the third, and the fourth greater than or equal to the third (1st $\geq$ 2nd $\geq$ 3rd $\leq$ 4th). If one number is given, then all limits are set to this number (for backward compatibility). 1992 | 1993 | Whenever chunkserver load is higher than \code{CS\_HEAVY\_LOAD\_THRESHOLD} and \\ \code{CS\_HEAVY\_LOAD\_RATIO\_THRESHOLD} times higher than average load, the chunkserver is switched into 'grace' mode. The chunkserver stays in grace mode for \code{CS\_HEAVY\_LOAD\_GRACE\_PERIOD} seconds. 1994 | 1995 | There are five values for \code{ATIME\_MODE} (all other values are treated as \code{0}): 1996 | \begin{itemize} 1997 | \item \code{0} = Always modify atime for files, folders and symlinks. 1998 | \item \code{1} = Always modify atime but only in case of files (do not modify atime in case of folders and symlinks). 1999 | \item \code{2} = Modify atime only when it is lower than ctime or mtime and when current time is higher than ctime or mtime respectively; also modify atime when current atime is older than 24h. Do it for all objects during access (like "relatime" 2000 | option in Linux). 2001 | \item \code{3} = Same as above but only in case of files. In case of folders and symlinks do not modify atime. 2002 | \item \code{4} = Never modify atime during access (like "noatime" option). 
2003 | \end{itemize} 2004 | 2005 | \subsection{\code{mfsexports.cfg}} 2006 | \code{mfsexports.cfg} -- MooseFS access control for \code{mfsmount}s 2007 | \bigskip 2008 | 2009 | \textbf{DESCRIPTION} 2010 | 2011 | The file \code{mfsexports.cfg} contains MooseFS access list for \code{mfsmount} clients. 2012 | \bigskip 2013 | 2014 | \textbf{SYNTAX} 2015 | 2016 | Syntax is: 2017 | \code{ADDRESS DIRECTORY [OPTIONS]} 2018 | 2019 | Lines starting with \code{\#} character are ignored as comments. 2020 | 2021 | \code{ADDRESS} can be specified in several forms: 2022 | \begin{itemize} 2023 | \item \code{*} -- all addresses 2024 | \item \code{n.n.n.n} -- single IP address 2025 | \item \code{n.n.n.n/b} -- IP class specified by network address and number of significant bits 2026 | \item \code{n.n.n.n/m.m.m.m} -- IP class specified by network address and mask 2027 | \item \code{f.f.f.f-t.t.t.t} -- IP range specified by from-to addresses (inclusive) 2028 | \end{itemize} 2029 | 2030 | \bigskip 2031 | \code{DIRECTORY} can be \code{/} or path relative to MooseFS root; special value \code{.} means MFSMETA companion filesystem. 
2032 | 2033 | \bigskip 2034 | OPTIONS list: 2035 | \begin{itemize} 2036 | \item \code{ro}, \code{readonly} -- export tree in read-only mode; this is default 2037 | \item \code{rw}, \code{readwrite} -- export tree in read-write mode 2038 | \item \code{alldirs} -- allows to mount any subdirectory of specified directory (similarly to NFS) 2039 | \item \code{dynamicip} -- allows reconnecting of already authenticated client from any IP address (the default is to check IP address on reconnect) 2040 | \item \code{ignoregid} -- disable testing of group access at \code{mfsmaster} level (it's still 2041 | done at \code{mfsmount} level) -- in this case "group" and "other" 2042 | permissions are logically added; needed for supplementary groups to 2043 | work (\code{mfsmaster} receives only user primary group information) 2044 | \item \code{admin} -- administrative privileges -- currently: allow changing of quota values 2045 | \item \code{maproot=USER[:GROUP]} -- maps root (\code{uid=0}) accesses to given user and group (similarly to \code{maproot} option in NFS mounts); \code{USER} and \code{GROUP} can be given either as name or number; if no group is specified, \code{USER}'s primary group is used. Names are resolved on \code{mfsmaster} side (see note below). 2046 | \item \code{mapall=USER[:GROUP]} -- like above but maps all non privileged users (\code{uid!=0}) accesses to given user and group (see notes below). 
2047 | \item \code{password=PASS}, \code{md5pass=MD5} -- requires password authentication in order to access specified resource 2048 | \item \code{minversion=VER} -- rejects access from clients older than specified 2049 | \item \code{mingoal=N}, \code{maxgoal=N} -- specify range in which goal can be set by users 2050 | \item \code{mintrashtime=TDUR}, \code{maxtrashtime=TDUR} -- specify range in which trashtime can be set by users 2051 | \end{itemize} 2052 | \bigskip 2053 | 2054 | Default options are: 2055 | 2056 | \code{ro}, \code{maproot=999:999}, \code{mingoal=1}, \code{maxgoal=9}, \code{mintrashtime=0}, \code{maxtrashtime=4294967295}. 2057 | \bigskip 2058 | 2059 | \textbf{NOTES} 2060 | 2061 | \code{USER} and \code{GROUP} names (if not specified by explicit \code{uid}/\code{gid} number) are 2062 | resolved on \code{mfsmaster} host. 2063 | 2064 | \code{TDUR} can be specified as number without time unit (number of seconds) 2065 | or combination of numbers with time units. Time units are: \code{W}, \code{D}, \code{H}, \code{M}, \code{S}. Order is important -- less significant time units can't be defined before more significant time units. Time units are case insensitive. 2066 | 2067 | Option \code{mapall} works in MooseFS in different way than in NFS, because MooseFS is using FUSE's "\code{default\_permissions}" option. When \code{mapall} option is used, users see all objects with \code{uid} equal to mapped \code{uid} as their own and all other as root's objects. Similarly objects with \code{gid} equal to mapped \code{gid} are seen as objects with current user's primary group and all other objects as objects with group 0 (usually wheel). With \code{mapall} option set attribute cache in kernel is always turned off. 
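
To illustrate the \code{TDUR} format, the two durations used in the examples below convert to seconds as follows. This is a sketch that assumes \code{W}, \code{D}, \code{H}, \code{M} and \code{S} denote weeks, days, hours, minutes and seconds, which the notes above do not state explicitly:

\[
\texttt{2h30m} = 2 \cdot 3600 + 30 \cdot 60 = 9000 \ \mathrm{s}
\]
\[
\texttt{2w} = 2 \cdot 7 \cdot 86400 = 1209600 \ \mathrm{s}
\]

By the ordering rule stated above, a value written as \code{30m2h} would be invalid, because the less significant unit precedes the more significant one.
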
2068 | \bigskip 2069 | 2070 | \textbf{EXAMPLES} 2071 | 2072 | \begin{lstlisting} 2073 | * / ro 2074 | 192.168.1.0/24 / rw 2075 | 192.168.1.0/24 / rw,alldirs,maproot=0,password=passcode 2076 | 10.0.0.0-10.0.0.5 /test rw,maproot=nobody,password=test 2077 | 10.1.0.0/255.255.0.0 /public rw,mapall=1000:1000 2078 | 10.2.0.0/16 / rw,alldirs,maproot=0,mintrashtime=2h30m,maxtrashtime=2w 2079 | \end{lstlisting} 2080 | 2081 | \subsection{\code{mfstopology.cfg}} 2082 | \code{mfstopology.cfg} -- MooseFS network topology definitions 2083 | \bigskip 2084 | 2085 | \textbf{DESCRIPTION} 2086 | 2087 | The file \code{mfstopology.cfg} contains assignments of IP addresses into 2088 | network locations (usually switch numbers). This file is optional. If your 2089 | network has one switch or decreasing traffic between switches is not 2090 | necessary then leave this file empty. 2091 | \bigskip 2092 | 2093 | \textbf{SYNTAX} 2094 | 2095 | Syntax is: 2096 | 2097 | \code{ADDRESS SWITCH-NUMBER} 2098 | 2099 | Lines starting with \code{\#} character are ignored as comments. 2100 | 2101 | \code{ADDRESS} can be specified in several forms: 2102 | 2103 | \begin{itemize} 2104 | \item \code{*} -- all addresses 2105 | \item \code{n.n.n.n} -- single IP address 2106 | \item \code{n.n.n.n/b} -- IP class specified by network address and number of significant bits 2107 | \item \code{n.n.n.n/m.m.m.m} -- IP class specified by network address and mask 2108 | \item \code{f.f.f.f-t.t.t.t} -- IP range specified by from-to addresses (inclusive) 2109 | \end{itemize} 2110 | 2111 | \code{SWITCH-NUMBER} can be specified as any positive 32-bit number. 2112 | 2113 | 2114 | \textbf{NOTES} \\\\ 2115 | If one IP belongs to more than one definition then the last definition is 2116 | used. 2117 | 2118 | For now, the distance between switches is constant. 
So distance between 2119 | machines is calculated as: 0 when IP numbers are the same, 1 when IP 2120 | numbers are different, but switch numbers are the same and 2 when 2121 | switch numbers are different. 2122 | 2123 | Distances are used only to sort chunkservers during read and write 2124 | operations. New chunks are still created randomly. Also rebalance 2125 | routines do not take distances into account. 2126 | 2127 | 2128 | \section{For MooseFS Metalogger(s)} 2129 | \subsection{\code{mfsmetalogger.cfg}} 2130 | \code{mfsmetalogger.cfg} -- configuration file for \code{mfsmetalogger} 2131 | \bigskip 2132 | 2133 | \textbf{DESCRIPTION} 2134 | 2135 | The file \code{mfsmetalogger.cfg} contains configuration of MooseFS \code{metalogger} process. 2136 | \bigskip 2137 | 2138 | \textbf{SYNTAX} 2139 | 2140 | Syntax is: 2141 | 2142 | \code{OPTION = VALUE} 2143 | 2144 | Lines starting with \code{\#} character are ignored as comments. 2145 | \bigskip 2146 | 2147 | \textbf{OPTIONS} 2148 | 2149 | Configuration options: 2150 | 2151 | \begin{itemize} 2152 | \item \code{DATA\_PATH} -- where to store metadata files 2153 | 2154 | \item \code{LOCK\_FILE} -- (deprecated) daemon lock/pid file 2155 | 2156 | \item \code{WORKING\_USER} -- user to run daemon as 2157 | 2158 | \item \code{WORKING\_GROUP} -- group to run daemon as (optional -- if empty then default user group will be used) 2159 | 2160 | \item \code{SYSLOG\_IDENT} -- name of process to place in syslog messages (default is \code{mfsmetalogger}) 2161 | 2162 | \item \code{LOCK\_MEMORY} -- whether to perform mlockall() to avoid swapping out \code{mfsmetalogger} process (default is 0, i.e. 
no) 2163 | 2164 | \item \code{NICE\_LEVEL} -- nice level to run daemon with (default is -19 if possible; note: process must be started as root to increase priority) 2165 | 2166 | \item \code{BACK\_LOGS} -- number of metadata change log files (default is 50) 2167 | 2168 | \item \code{BACK\_META\_KEEP\_PREVIOUS} -- number of previous metadata files to be kept (default is 3) 2169 | 2170 | \item \code{META\_DOWNLOAD\_FREQ} -- metadata download frequency in hours (default is 24, at most \code{BACK\_LOGS}/2) 2171 | 2172 | \item \code{MASTER\_HOST} -- address of MooseFS master host to connect with (default is mfsmaster) 2173 | 2174 | \item \code{MASTER\_PORT} -- number of MooseFS master port to connect with (default is 9420) 2175 | 2176 | \item \code{MASTER\_RECONNECTION\_DELAY} -- delay in seconds before trying to reconnect to master after disconnection (default is 30) 2177 | 2178 | \item \code{MASTER\_TIMEOUT} -- timeout (in seconds) for master connections (default is 60) 2179 | \end{itemize} 2180 | 2181 | \section{For MooseFS Chunkservers} 2182 | \subsection{\code{mfschunkserver.cfg}} 2183 | \code{mfschunkserver.cfg} -- main configuration file for mfschunkserver 2184 | \bigskip 2185 | 2186 | \textbf{DESCRIPTION} 2187 | 2188 | The file \code{mfschunkserver.cfg} contains configuration of MooseFS \code{chunkserver} process. 2189 | \bigskip 2190 | 2191 | \textbf{SYNTAX} 2192 | 2193 | Syntax is: 2194 | 2195 | \code{OPTION = VALUE} 2196 | 2197 | Lines starting with \code{\#} character are ignored as comments. 
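
As a minimal sketch of this syntax, the fragment below sets a few of the options described in this section to their documented defaults. The selection is purely illustrative; a real chunkserver configuration depends on the installation.

\begin{lstlisting}
# illustrative mfschunkserver.cfg fragment (values are the documented defaults)
MASTER_HOST = mfsmaster
MASTER_PORT = 9420
HDD_TEST_FREQ = 10
HDD_LEAVE_SPACE_DEFAULT = 256MiB
CSSERV_LISTEN_PORT = 9422
\end{lstlisting}
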
2198 | \bigskip 2199 | 2200 | \textbf{OPTIONS} 2201 | 2202 | Configuration options: 2203 | \begin{itemize} 2204 | \item \code{WORKING\_USER} -- user to run daemon as 2205 | \item \code{WORKING\_GROUP} -- group to run daemon as; optional value -- if empty then default user group will be used 2206 | \item \code{SYSLOG\_IDENT} -- name of process to place in syslog messages; default is \code{mfschunkserver} 2207 | \item \code{LOCK\_MEMORY} -- whether to perform mlockall() to avoid swapping out \code{mfschunkserver} process; default is \code{0}, i.e. no 2208 | \item \code{NICE\_LEVEL} -- nice level to run daemon with; default is \code{-19}; note: process must be started as root to increase priority, if setting of priority fails, process retains the nice level it started with 2209 | \item \code{FILE\_UMASK} -- set default umask for group and others (user has always 0); default is \code{027} -- block write for group and block all for others 2210 | \item \code{DATA\_PATH} -- where to store daemon lock file 2211 | \item \code{HDD\_CONF\_FILENAME} -- alternate location/name of \code{mfshdd.cfg} file 2212 | \item \code{HDD\_TEST\_FREQ} -- chunk test period in seconds; default is \code{10} 2213 | \item \code{HDD\_LEAVE\_SPACE\_DEFAULT} -- how much space should be left unused on each hard drive; number format: \code{[0-9]*(.[0-9]*)?([kMGTPE]|[KMGTPE]i)?B?}; default is \code{256MiB}; examples: \code{0.5GB}, \code{.5G}, \code{2.56GiB}, \code{1256M} etc. 
2214 | \item \code{HDD\_REBALANCE\_UTILIZATION} -- percent of total work time the chunkserver is allowed to spend on hdd space rebalancing; default is 20 2215 | \item \code{HDD\_ERROR\_TOLERANCE\_COUNT}, \code{HDD\_ERROR\_TOLERANCE\_PERIOD} -- how many i/o errors (\code{COUNT}) to tolerate in given amount of seconds (\code{PERIOD}) on a single hard drive; if the number of errors exceeds this setting, the offending hard drive will be marked as damaged; defaults are 2 and 600 2216 | \item \code{HDD\_FSYNC\_BEFORE\_CLOSE} -- enables/disables fsync before chunk closing; default is 0 (off) 2217 | \item \code{WORKERS\_MAX}, \code{WORKERS\_MAX\_IDLE} -- maximum number of active workers and maximum number of idle workers; defaults are 150 and 40 2218 | \item \code{BIND\_HOST} -- local address to use for master connections; default is \code{*}, i.e. default local address 2219 | \item \code{MASTER\_HOST} -- MooseFS master host, IP is allowed only in single-master installations; default is \code{mfsmaster} 2220 | \item \code{MASTER\_PORT} -- MooseFS master command port; default is \code{9420} 2221 | \item \code{MASTER\_CONTROL\_PORT} -- MooseFS master control port; default is \code{9419} 2222 | \item \code{MASTER\_TIMEOUT} -- timeout in seconds for master connections; default is \code{60} 2223 | \item \code{MASTER\_RECONNECTION\_DELAY} -- delay in seconds before trying to reconnect to master after disconnection (default is 5) 2224 | \item \code{BIND\_HOST} -- local address to use for connecting with master (default is \code{*}, i.e. 
default local address) 2225 | \item \code{CSSERV\_LISTEN\_HOST} -- IP address to listen on for client (mount) connections (\code{*} means any) 2226 | \item \code{CSSERV\_LISTEN\_PORT} -- port to listen on for client (mount) connections (default is \code{9422}) 2227 | \item \code{CSSERV\_TIMEOUT} -- timeout (in seconds) for client (mount) connections (default is \code{5}) 2228 | \end{itemize} 2229 | 2230 | \subsection{\code{mfshdd.cfg}} 2231 | \code{mfshdd.cfg} -- list of MooseFS storage directories for mfschunkserver 2232 | 2233 | \bigskip 2234 | \textbf{DESCRIPTION} 2235 | 2236 | The file \code{mfshdd.cfg} contains list of directories (mountpoints) used for MooseFS storage. 2237 | 2238 | \bigskip 2239 | \textbf{SYNTAX} 2240 | 2241 | Syntax is: 2242 | \code{[*]PATH [SPACE LIMIT]} 2243 | 2244 | Lines starting with \code{\#} character are ignored as comments. 2245 | 2246 | \code{*} means this directory (hard drive) is "marked for removal" and all data will be replicated to other hard drives, usually on other chunkservers 2247 | 2248 | \code{PATH} is path to the mounting point of storage directory, usually 2249 | a single hard drive. 2250 | 2251 | \code{SPACE LIMIT} is optional space limit, that allows to set one of two 2252 | values: how much space should be left unused on this device or how much 2253 | space is to be used on this device. Definition format: 2254 | \code{[0-9]*(.[0-9]*)?([kMGTPE]|[KMGTPE]i)?B?}, positive value means how much 2255 | space to use, negative value means how much space should be left 2256 | unused. 2257 | 2258 | 2259 | % \chapter{MooseFS Maintenance} - add? 2260 | 2261 | \chapter{Frequently Asked Questions} 2262 | \section{What average write/read speeds can we expect?} 2263 | Aside from common (for most filesystems) factors like: block size and type of access (sequential or random), in MooseFS the speeds depend also on hardware performance. Main factors are hard drives performance and network capacity and topology (network latency). 
The better the performance of the hard drives used and the better the throughput of the network, the higher the performance of the whole system. 2264 | 2265 | \section{Does the goal setting influence writing/reading speeds?} 2266 | Generally speaking, it does not. In case of reading a file, a goal higher than one may in some cases help speed up the reading operation, e.g. when two clients access a file with goal two or higher, they may perform the read operation on different copies, thus having all the available throughput for themselves. On average, however, the goal setting does not alter the speed of the reading operation in any way. 2267 | 2268 | Similarly, the writing speed is negligibly influenced by the goal setting. Writing with goal higher than two is done chain-like: the client sends the data to one chunk server and the chunk server simultaneously reads, writes and sends the data to another chunk server (which may in turn send them to the next one, to fulfill the goal). This way the client's throughput is not overburdened by sending more than one copy and all copies are written almost simultaneously. Our tests show that a writing operation can use all available bandwidth on the client's side in a 1Gbps network. 2269 | 2270 | \section{Are concurrent read and write operations supported?} 2271 | All read operations are parallel -- there is no problem with concurrent reading of the same data by several clients at the same moment. 2272 | Write operations are parallel, except operations on the same chunk (fragment of file), which are synchronized by the Master server and therefore need to be sequential. 2273 | 2274 | \section{How much CPU/RAM resources are used?} 2275 | In our environment (ca. 1 PiB total space, 36 million files, 6 million folders distributed on 38 million chunks on 100 machines) the usage of chunkserver CPU (by constant file transfer) is about 15-30\% and chunkserver RAM usage is usually between 100MiB and 1GiB (dependent on the number of chunks on each chunk server). 
The master server consumes about 50\% of a modern 3.3 GHz CPU (ca. 5000 file system operations per second, of which ca. 1500 are modifications) and 12GiB RAM. CPU load depends on the number of operations and RAM on the total number of files and folders, not the total size of the files themselves. The RAM usage is proportional to the number of entries in the file system because the master server process keeps the entire metadata in memory for performance. HDD usage on our master server is ca. 22 GB. 2276 | 2277 | 2278 | \section{Is it possible to add/remove chunkservers and disks on the fly?} 2279 | You can add/remove chunk servers on the fly. But keep in mind that it is not wise to disconnect a chunk server if this server contains the only copy of a chunk in the file system (the CGI monitor will mark these in orange). You can also disconnect (change) an individual hard drive. The scenario for this operation would be: 2280 | \begin{enumerate} 2281 | \item Mark the disk(s) for removal (see How to mark a disk for removal?) 2282 | \item Reload the chunkserver process 2283 | \item Wait for the replication (there should be no "undergoal" or "missing" chunks marked in yellow, orange or red in CGI monitor) 2284 | \item Stop the chunkserver process 2285 | \item Delete entry(ies) of the disconnected disk(s) in \code{mfshdd.cfg} 2286 | \item Stop the chunkserver machine 2287 | \item Remove hard drive(s) 2288 | \item Start the machine 2289 | \item Start the chunkserver process 2290 | \end{enumerate} 2291 | 2292 | \bigskip 2293 | If you have hotswap disk(s) you should follow these: 2294 | \begin{enumerate} 2295 | \item Mark the disk(s) for removal (see How to mark a disk for removal?) 
2296 | \item Reload the chunkserver process 2297 | \item Wait for the replication (there should be no "undergoal" or "missing" chunks marked in yellow, orange or red in CGI monitor) 2298 | \item Delete entry(ies) of the disconnected disk(s) in \code{mfshdd.cfg} 2299 | \item Reload the chunkserver process 2300 | \item Unmount disk(s) 2301 | \item Remove hard drive(s) 2302 | \end{enumerate} 2303 | If you follow the above steps, work of client computers won't be interrupted and the whole operation won't be noticed by MooseFS users. 2304 | 2305 | 2306 | \section{How to mark a disk for removal?} 2307 | When you want to mark a disk for removal from a chunkserver, you need to edit the chunkserver's \code{mfshdd.cfg} configuration file and put an asterisk '\code{*}' at the start of the line with the disk that is to be removed. For example, in this \code{mfshdd.cfg} we have marked "\code{/mnt/hdd}" for removal: 2308 | \begin{lstlisting} 2309 | /mnt/hda 2310 | /mnt/hdb 2311 | /mnt/hdc 2312 | */mnt/hdd 2313 | /mnt/hde 2314 | \end{lstlisting} 2315 | 2316 | After changing the mfshdd.cfg you need to reload chunkserver (on Linux Debian/Ubuntu: \code{service moosefs-pro-chunkserver reload}). 2317 | 2318 | Once the disk has been marked for removal and the chunkserver process has been restarted, the system will make an appropriate number of copies of the chunks stored on this disk, to maintain the required "goal" number of copies. 2319 | 2320 | Finally, before the disk can be disconnected, you need to confirm there are no "undergoal" chunks on the other disks. This can be done using the CGI Monitor. In the "Info" tab select "Regular chunks state matrix" mode. 2321 | 2322 | 2323 | \section{My experience with clustered filesystems is that metadata operations are quite slow. How did you resolve this problem?} 2324 | During our research and development we also observed the problem of slow metadata operations. 
We decided to alleviate some of the speed issues by keeping the file system structure in RAM on the metadata server. This is why the metadata server has increased memory requirements. The metadata is frequently flushed out to files on the master server. 2325 | 2326 | Additionally, in Community Edition, the metadata logger server(s) also frequently receive updates to the metadata structure and write these to their file systems. 2327 | 2328 | In Pro version metaloggers are optional, because master followers are kept synchronised with the leader master. They also save metadata to the hard disk. 2329 | 2330 | \section{What does value of directory size mean on MooseFS? It is different than standard Linux \code{ls -l} output. Why?} 2331 | Folder size has no special meaning in any filesystem, so our development team decided to put extra information there. The number represents the total length of all files inside (like in \code{mfsdirinfo -h -l}) displayed in exponential notation. 2332 | 2333 | You can "translate" the directory size in the following way: 2334 | 2335 | There are 7 digits: \code{xAAAABB}. To translate this notation to number of bytes, use the following expression: 2336 | 2337 | \code{AAAA.BB xBytes} 2338 | 2339 | Where \code{x}: 2340 | \begin{itemize} 2341 | \item \code{0 =} 2342 | \item \code{1 = kibi} 2343 | \item \code{2 = Mebi} 2344 | \item \code{3 = Gibi} 2345 | \item \code{4 = Tebi} 2346 | \end{itemize} 2347 | 2348 | \bigskip 2349 | 2350 | \underline{Example}: 2351 | 2352 | To translate the following entry: 2353 | 2354 | \begin{lstlisting} 2355 | drwxr-xr-x 164 root root 2010616 May 24 11:47 test 2356 | xAAAABB 2357 | \end{lstlisting} 2358 | 2359 | Folder size \code{2010616} should be read as 106.16 MiB. 2360 | 2361 | \bigskip 2362 | 2363 | When \code{x = 0}, the number might be smaller: 2364 | 2365 | \underline{Example}: 2366 | 2367 | Folder size \code{10200} means 102 Bytes. 
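
The translation rule above can be sketched in a few lines of Python. This is only an illustration of the \code{xAAAABB} notation as described in this section; the function name \code{decode\_dir\_size} and the unit abbreviations are our own choices for the example, and prefixes above \code{4} are rejected because they are not listed here.

```python
# Unit names for the leading digit x, as listed in the section above.
UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def decode_dir_size(reported):
    """Translate a directory size in xAAAABB notation into a readable string."""
    # Pad to 7 digits so that values with x = 0 (e.g. 10200) decode correctly.
    digits = "{:07d}".format(reported)
    if len(digits) != 7:
        raise ValueError("expected at most 7 digits, got %r" % reported)
    x = int(digits[0])          # unit selector
    whole = int(digits[1:5])    # AAAA
    frac = int(digits[5:7])     # BB
    if x >= len(UNITS):
        raise ValueError("unit prefix %d is not described in this manual" % x)
    return "{}.{:02d} {}".format(whole, frac, UNITS[x])


print(decode_dir_size(2010616))  # the first example from this section
print(decode_dir_size(10200))    # the x = 0 example
```

The size can be taken from \code{os.stat(path).st\_size} for a directory on a MooseFS mount.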
2368 | 2369 | 2370 | 2371 | \section{When I perform \code{df -h} on a filesystem the results are different from what I would expect taking into account actual sizes of written files.} 2372 | Every chunkserver sends its own disk usage increased by 256MB for each used partition/hdd, and the master sends a sum of these values to the client as total disk usage. If you have 3 chunkservers with 7 hdd each, your disk usage will be increased by 3*7*256MB (about 5GB). 2373 | 2374 | The other reason for differences is, when you use disks exclusively for MooseFS on chunkservers \code{df} will show correct disk usage, but if you have other data on your MooseFS disks \code{df} will count your own files too. 2375 | 2376 | If you want to see the actual space usage of your MooseFS files, use \code{mfsdirinfo} command. 2377 | 2378 | 2379 | \section{Can I keep source code on MooseFS? Why do small files occupy more space than I would have expected?} 2380 | The system was initially designed for keeping large amounts (like several thousands) of very big files (tens of gigabytes) and has a hard-coded chunk size of 64MiB and block size of 64KiB. Using a consistent block size helps improve the networking performance and efficiency, as all nodes in the system are able to work with a single 'bucket' size. That's why even a small file will occupy 64KiB plus additionally 4KiB of checksums and 1KiB for the header. 2381 | 2382 | The issue regarding the occupied space of a small file stored inside a MooseFS chunk is really more significant, but in our opinion it is still negligible. Let's take 25 million files with a goal set to 2. Counting the storage overhead, this could create about 50 million 69 KiB chunks, that may not be completely utilized due to internal fragmentation (wherever the file size was less than the chunk size). So the overall wasted space for the 50 million chunks would be approximately 3.2TiB. By modern standards, this should not be a significant concern. 
A more typical, medium to large project with 100,000 small files would consume at most 13GiB of extra space due to block size of used file system. 2383 | 2384 | So it is quite reasonable to store source code files on a MooseFS system, either for active use during development or for long term reliable storage or archival purposes. 2385 | 2386 | Perhaps the larger factor to consider is the comfort of developing the code taking into account the performance of a network file system. When using MooseFS (or any other network based file system such as NFS, CIFS) for a project under active development, the network filesystem may not be able to perform file IO operations at the same speed as a directly attached regular hard drive would. 2387 | 2388 | Some modern integrated development environments (IDE), such as Eclipse, make frequent IO requests on several small workspace metadata files. Running Eclipse with the workspace folder on a MooseFS file system (and again, with any other networked file system) will yield slightly slower user interface performance than running Eclipse with the workspace on a local hard drive. 2389 | 2390 | You may need to evaluate for yourself if using MooseFS for your working copy of active development within an IDE is right for you. 2391 | 2392 | In a different example, using a typical text editor for source code editing and a version control system, such as Subversion, to check out project files into a MooseFS file system, does not typically result in any performance degradation. The IO overhead of the network file system nature of MooseFS is offset by the larger IO latency of interacting with the remote Subversion repository. And the individual file operations (open, save) do not have any observable latencies when using simple text editors (outside of complicated IDE products). 
2393 | 2394 | A more likely situation would be to have the Subversion repository files hosted within a MooseFS file system, where the svnserver or Apache $+$ mod\_svn would service requests to the Subversion repository and users would check out working sandboxes onto their local hard drives. 2395 | 2396 | 2397 | \section{Do Chunkservers and Metadata Server do their own checksumming?} 2398 | 2399 | Chunk servers do their own checksumming. Overhead is about 4B per 64KiB block, which is 4KiB per 64MiB chunk. 2400 | Metadata servers don't. We thought it would be CPU consuming. We recommend using ECC RAM modules. 2401 | 2402 | \section{What resources are required for the Master Server?} 2403 | 2404 | The most important factor is RAM of the \code{mfsmaster} machine, as the full file system structure is cached in RAM for speed. Besides RAM, the \code{mfsmaster} machine needs some space on HDD for the main metadata file together with incremental logs. 2405 | The size of the metadata file is dependent on the number of files (not on their sizes). The size of incremental logs depends on the number of operations per hour, but length (in hours) of this incremental log is configurable. 2406 | 2407 | \section{When I delete files or directories, the MooseFS free space size doesn't change. Why?} 2408 | MooseFS does not immediately erase files on deletion, to allow you to revert the delete operation. Deleted files are kept in the trash bin for the configured amount of time (default: 24h / 86400 seconds) before they are deleted. 2409 | 2410 | You can configure for how long files are kept in trash and empty the trash manually (to release the space). 2411 | 2412 | You can mount the trash e.g. 
in the following way: 2413 | First of all, create the directory to mount \code{mfsmeta} 2414 | \begin{lstlisting} 2415 | # mkdir /mnt/mfsmeta 2416 | \end{lstlisting} 2417 | 2418 | Then, mount \code{mfsmeta} (as usual, but with the \code{-m} parameter): 2419 | \begin{lstlisting} 2420 | # mfsmount -H master.host.name -m /mnt/mfsmeta 2421 | \end{lstlisting} 2422 | or: 2423 | \begin{lstlisting} 2424 | # mfsmount -H master.host.name -o mfsmeta /mnt/mfsmeta 2425 | \end{lstlisting} 2426 | 2427 | Then, go into trash subdirectory: 2428 | \begin{lstlisting} 2429 | # cd /mnt/mfsmeta/trash 2430 | \end{lstlisting} 2431 | 2432 | If you use MooseFS 2, you can see the list of deleted files in this directory. If you use MooseFS 3.0, you can see 4096 sub-trashes in this directory (named \code{000 .. FFF}). The trash is divided into sub-trashes because big instances can hold a huge number of files in trash. In such a case, commands like \code{ls} or even \code{find} are not able to function properly on a single flat directory. So since MooseFS 3, deleted files are located inside these directories (sub-trashes). The best way to locate a file you are looking for is to use the \code{find} command. 2433 | 2434 | If you use MooseFS 3 and you want to see the old trash structure (because you e.g. 
don't have a lot of files in trash and you like old, simple structure), you should mount the trash with a specific \code{mfsflattrash} parameter, e.g.: 2435 | \begin{lstlisting} 2436 | # mfsmount -H master.host.name -o mfsmeta,mfsflattrash /mnt/mfsmeta 2437 | \end{lstlisting} 2438 | 2439 | You can delete files from trash on MooseFS 2, just issue \code{rm} command, e.g.: 2440 | 2441 | \begin{lstlisting} 2442 | # mkdir /mnt/mfsmeta 2443 | # mfsmount -H master.host.name -o mfsmeta /mnt/mfsmeta 2444 | # cd /mnt/mfsmeta/trash 2445 | # rm * 2446 | \end{lstlisting} 2447 | 2448 | If you want to delete files from trash with old structure on MooseFS 3, just issue \code{rm} command like above, but first mount it with \code{mfsflattrash} parameter, e.g.: 2449 | 2450 | \begin{lstlisting} 2451 | # mkdir /mnt/mfsmeta 2452 | # mfsmount -H master.host.name -o mfsmeta,mfsflattrash /mnt/mfsmeta 2453 | # cd /mnt/mfsmeta/trash 2454 | # rm * 2455 | \end{lstlisting} 2456 | 2457 | In case you want to delete files from trash on MooseFS 3 with new trash structure, you should combine \code{find} and \code{rm} commands together, e.g.: 2458 | 2459 | \begin{lstlisting} 2460 | # mkdir /mnt/mfsmeta 2461 | # mfsmount -H master.host.name -o mfsmeta /mnt/mfsmeta 2462 | # cd /mnt/mfsmeta/trash 2463 | # find . -type f -exec rm {} \; 2464 | \end{lstlisting} 2465 | 2466 | 2467 | The time of storing a deleted file can be verified by the \code{mfsgettrashtime} command and changed with \code{mfssettrashtime}. 2468 | 2469 | 2470 | \section{When I added a third server as an extra chunkserver, it looked like the system started replicating data to the 3rd server even though the file goal was still set to 2.} 2471 | Yes. Disk usage balancer uses chunks independently, so one file could be redistributed across all of your chunkservers. 2472 | 2473 | 2474 | \section{Is MooseFS 64bit compatible?} 2475 | Yes! 2476 | 2477 | 2478 | \section{Can I modify the chunk size?} 2479 | No. 
File data is divided into fragments (chunks) with a maximum of 64MiB each. The value of 64 MiB is hard coded into the system so you cannot modify its size. We based the chunk size on real-world data and determined it was a very good compromise between number of chunks and speed of rebalancing / updating the filesystem. Of course if a file is smaller than 64 MiB it occupies less space. 2480 | 2481 | In the systems we take care of, several file sizes significantly exceed 100GB with no noticeable chunk size penalty. 2482 | 2483 | 2484 | \section{How do I know if a file has been successfully written to MooseFS?} 2485 | Let's briefly discuss the process of writing to the file system and what programming consequences this bears. 2486 | 2487 | In all contemporary filesystems, files are written through a buffer (write cache). As a result, execution of the write command itself only transfers the data to a buffer (cache), with no actual writing taking place. Hence, a confirmed execution of the write command does not mean that the data has been correctly written on a disk. It is only the invocation and completion of the fsync (or close) command that causes all data kept within the buffers (cache) to get physically written out. If an error occurs while such buffer-kept data is being written, it could cause the fsync (or close) command to return an error response. 2488 | 2489 | The problem is that a vast majority of programmers do not test the close command status (which is generally a very common mistake). Consequently, a program writing data to a disk may "assume" that the data has been written correctly from a success response from the write command, while in actuality, it could have failed during the subsequent close command. 2490 | 2491 | In network filesystems (like MooseFS), due to their nature, the amount of data ``left over'' in the buffers (cache) on average will be higher than in regular file systems. 
Therefore the amount of data processed during execution of the close or fsync command is often significant and if an error occurs while the data is being written [from the close or fsync command], this will be returned as an error during the execution of this command. Hence, before executing close, it is recommended (especially when using MooseFS) to perform an fsync operation after writing to a file and then checking the status of the result of the fsync operation. Then, for good measure, also check the return status of close as well. 2492 | 2493 | NOTE! When stdio is used, the fflush function only executes the "write" command, so correct execution of fflush is not sufficient to be sure that all data has been written successfully -- you should also check the status of fclose. 2494 | 2495 | The above problem may occur when redirecting a standard output of a program to a file in shell. Bash (and many other programs) do not check the status of the close execution. So the syntax of "\code{application > outcome.txt}" type may wrap up successfully in shell, while in fact there has been an error in writing out the "\code{outcome.txt}" file. You are strongly advised to avoid using the above shell output redirection syntax when writing to a MooseFS mount point. If necessary, you can create a simple program that reads the standard input and writes everything to a chosen file, where this simple program would correctly employ the appropriate check of the result status from the fsync command. For example, "\code{application | mysaver outcome.txt}", where \code{mysaver} is the name of your writing program instead of \code{application > outcome.txt}. 2496 | 2497 | Please note that the problem discussed above is in no way exceptional and does not stem directly from the characteristics of MooseFS itself. It may affect any system of files -- network type systems are simply more prone to such difficulties. 
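
As an illustration of the \code{mysaver} idea mentioned above, here is a minimal Python sketch. It is not part of MooseFS; the names \code{save\_stream} and \code{main} are hypothetical. It copies its input to the chosen file, calls fsync before close, and lets any error from write, fsync or close surface as an \code{OSError}.

```python
import os
import sys


def save_stream(src, path):
    """Copy a binary stream to path; raise OSError if write, fsync or close fails."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while True:
            block = src.read(64 * 1024)
            if not block:
                break
            view = memoryview(block)
            while view:
                # os.write may write fewer bytes than requested, so loop.
                written = os.write(fd, view)
                view = view[written:]
        # Force buffered data out and check the result before closing.
        os.fsync(fd)
    finally:
        # An error reported by close propagates to the caller as OSError.
        os.close(fd)


def main(argv):
    """Entry point: returns 0 on success, 1 on I/O error, 2 on bad usage."""
    if len(argv) != 2:
        print("usage: mysaver OUTPUT_FILE", file=sys.stderr)
        return 2
    try:
        save_stream(sys.stdin.buffer, argv[1])
    except OSError as err:
        print("mysaver: writing %s failed: %s" % (argv[1], err), file=sys.stderr)
        return 1
    return 0

# To use this as a command, end the file with: sys.exit(main(sys.argv))
```

With that final line added, \code{application | mysaver outcome.txt} exits with a non-zero status whenever the data could not be written, so the calling script can react to it.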
Technically speaking, the above recommendations should be followed at all times (also in cases where classic file systems are used). 2498 | 2499 | 2500 | \section{What are limits in MooseFS (e.g. file size limit, filesystem size limit, max number of files, that can be stored on the filesystem)?} 2501 | 2502 | \begin{itemize} 2503 | \item The maximum file size limit in MooseFS is $2^{57}$ bytes = 128 PiB. 2504 | \item The maximum filesystem size limit is $2^{64}$ bytes = 16 EiB = 16 384 PiB 2505 | \item The maximum number of files, that can be stored on one MooseFS instance is $2^{31}$ -- over 2.1 bln. 2506 | \end{itemize} 2507 | 2508 | \section{Can I set up HTTP basic authentication for the mfscgiserv?} 2509 | % \textbf{Move it somewhere else, add confguration examples with nginx} 2510 | \code{mfscgiserv} is a very simple HTTP server written just to run the MooseFS CGI scripts. It does not support any additional features like HTTP authentication. However, the MooseFS CGI scripts may be served from another full-featured HTTP server with CGI support, such as lighttpd or Apache. When using a full-featured HTTP server such as Apache, you may also take advantage of features offered by other modules, such as HTTPS transport. Just place the CGI and its data files (\code{index.html}, \code{mfs.cgi}, \code{chart.cgi}, \code{mfs.css}, \code{acidtab.js}, \code{logomini.png}, \code{err.gif}) under chosen \code{DocumentRoot}. If you already have an HTTP server instance on a given host, you may optionally create a virtual host to allow access to the MooseFS CGI Monitor through a different hostname or port. 2511 | 2512 | 2513 | \section{Can I run a mail server application on MooseFS? Mail server is a very busy application with a large number of small files -- will I not lose any files?} 2514 | You can run a mail server on MooseFS. You won't lose any files under a large system load. 
When the file system is busy, it will block until its operations are complete, which will just cause the mail server to slow down. 2515 | 2516 | 2517 | \section{Are there any suggestions for the network, MTU or bandwidth?} 2518 | We recommend using jumbo-frames\footnote{\url{https://en.wikipedia.org/wiki/Jumbo_frame}} (MTU=9000). With a greater amount of chunkservers, switches should be connected through optical fiber or use aggregated links\footnote{\url{https://en.wikipedia.org/wiki/Link_aggregation}}. 2519 | 2520 | \section{Does MooseFS support supplementary groups?} 2521 | Yes. 2522 | 2523 | \section{Does MooseFS support file locking?} 2524 | Yes, since MooseFS 3.0. 2525 | 2526 | 2527 | \section{Is it possible to assign IP addresses to chunk servers via DHCP?} 2528 | Yes, but we highly recommend setting "DHCP reservations" based on MAC addresses. 2529 | 2530 | 2531 | \section{Some of my chunkservers utilize 90\% of space while others only 10\%. Why does the rebalancing process take so long?} 2532 | Our experiences from working in a production environment have shown that aggressive replication is not desirable, as it can substantially slow down the whole system. The overall performance of the system is more important than equal utilization of hard drives over all of the chunk servers. By default replication is configured to be a non-aggressive operation. At our environment normally it takes about 1 week for a new chunkserver to get to a standard hdd utilization. Aggressive replication would make the whole system considerably slow for several days. 2533 | 2534 | 2535 | 2536 | Replication speeds can be adjusted on master server startup by setting these two options: 2537 | \begin{itemize} 2538 | 2539 | \item \code{CHUNKS\_WRITE\_REP\_LIMIT} \\ 2540 | Maximum number of chunks to replicate to one chunkserver (default is \code{2,1,1,4}). \\ 2541 | One number is equal to four same numbers separated by colons. 
2542 | \begin{itemize} 2543 | \item First limit is for endangered chunks (chunks with only one copy) 2544 | \item Second limit is for undergoal chunks (chunks with number of copies lower than specified goal) 2545 | \item Third limit is for rebalance between servers with space usage around arithmetic mean 2546 | \item Fourth limit is for rebalance between other servers (very low or very high space usage) 2547 | \end{itemize} 2548 | Usually first number should be greater than or equal to second, second greater than or equal to third, and fourth greater than or equal to third (1st $>=$ 2nd $>=$ 3rd $<=$ 4th) 2549 | 2550 | \item \code{CHUNKS\_READ\_REP\_LIMIT} \\ 2551 | Maximum number of chunks to replicate from one chunkserver (default is \code{10,5,2,5}). \\ 2552 | One number is equal to four same numbers separated by colons. \\ 2553 | Limit groups are the same as in write limit, also relations between numbers should be the same as in write limits (1st $>=$ 2nd $>=$ 3rd $<=$ 4th) 2554 | 2555 | \end{itemize} 2556 | Tuning these in your environment requires some experiments. 2557 | 2558 | \section{I have a Metalogger running -- should I make additional backup of the metadata file on the Master Server?} 2559 | Yes, it is highly recommended to make additional backup of the metadata file. This provides a worst case recovery option if, for some reason, the metalogger data is not usable for restoring the master server (for example the metalogger server is also destroyed). 2560 | 2561 | The master server flushes metadata kept in RAM to the \code{metadata.mfs.back} binary file every hour on the hour (xx:00). So a good time to copy the metadata file is every hour on the half hour (30 minutes after the dump). This would limit the amount of data loss to about 1.5h of data. Backing up the file can be done using any conventional method of copying the metadata file -- cp, scp, rsync, etc. 
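
The half-hour copy described above could be scripted, for example, as in the following Python sketch. It is only one possible approach; the function name \code{backup\_metadata} is hypothetical, and both the source path and the destination directory are assumptions that must be adapted to your installation (the metadata file lives in the master's data directory).

```python
import os
import shutil
import time


def backup_metadata(src, dest_dir, now=None):
    """Copy the metadata file into dest_dir under a timestamped name.

    src      -- path to metadata.mfs.back (location is installation specific)
    dest_dir -- backup directory (hypothetical, choose your own)
    now      -- seconds since the epoch; None means the current time
    Returns the path of the finished copy.
    """
    stamp = time.strftime("%Y%m%d-%H%M", time.localtime(now))
    os.makedirs(dest_dir, exist_ok=True)
    final = os.path.join(dest_dir, "metadata.mfs.back." + stamp)
    tmp = final + ".tmp"
    # Copy under a temporary name first, so an interrupted run never
    # leaves a truncated file that looks like a valid backup.
    shutil.copy2(src, tmp)
    os.replace(tmp, final)
    return final
```

A small wrapper that calls, say, \code{backup\_metadata("/var/lib/mfs/metadata.mfs.back", "/backup/mfs")} (both paths are assumptions) can then be run from cron at minute 30 of every hour. For real protection the destination should be on another machine.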
2562 | 2563 | After restoring the system based on this backed up metadata file the most recently created files will have been lost. Additionally, files that were appended to would have their previous size, which they had at the time of the metadata backup. Files that were deleted would exist again. And files that were renamed or moved would be back to their previous names (and locations). But still you would have all of the data for the files created in the X past years before the crash occurred. 2564 | 2565 | In MooseFS Pro version, master followers flush metadata from RAM to the hard disk once an hour. The leader master downloads saved metadata from followers once a day. 2566 | 2567 | 2568 | \section{I think one of my disks is slower / damaged. How should I find it?} 2569 | In the CGI monitor go to the "Disks" tab and choose "switch to hour" in "I/O stats" column and sort the results by "write" in "max time" column. Now look for disks which have a significantly larger write time. You can also sort by the "fsync" column and look at the results. It is a good idea to find individual disks that are operating slower, as they may be a bottleneck to the system. 2570 | 2571 | It might be helpful to create a test operation that continuously copies some data to create enough load on the system for there to be observable statistics in the CGI monitor. On the "Disks" tab specify units of "minutes" instead of hours for the "I/O stats" column. 2572 | 2573 | Once a "bad" disk has been discovered, replace it by following the usual operation of marking the disk for removal, and waiting until the color changes to indicate that all of the chunks stored on this disk have been replicated to achieve the sufficient goal settings. 2574 | 2575 | 2576 | \section{How can I find the master server PID?} 2577 | Issue the following command: 2578 | 2579 | \code{\# mfsmaster status} 2580 | 2581 | \section{Web interface shows there are some copies of chunks with goal 0. 
What does it mean?} 2582 | This is a way to mark chunks belonging to the non-existing (i.e. deleted) files. Deleting a file is done asynchronously in MooseFS. First, a file is removed from metadata and its chunks are marked as unnecessary (\code{goal=0}). Later, the chunks are removed during an "idle" time. This is much more efficient than erasing everything at the exact moment the file was deleted. 2583 | 2584 | Unnecessary chunks may also appear after a recovery of the master server, if they were created shortly before the failure and were not available in the restored metadata file. 2585 | 2586 | 2587 | \section{Is every error message reported by \code{mfsmount} a serious problem?} 2588 | No. \code{mfsmount} writes every failure encountered during communication with chunkservers to the syslog. Transient communication problems with the network might cause IO errors to be displayed, but this does not mean data loss or that \code{mfsmount} will return an error code to the application. Each operation is retried by the client (\code{mfsmount}) several times and only after the number of failures (reported as \code{try counter}) reaches a certain limit (typically 30), the error is returned to the application that data was not read/saved. 2589 | 2590 | Of course, it is important to monitor these messages. When messages appear more often from one chunkserver than from the others, it may mean there are issues with this chunkserver -- maybe hard drive is broken, maybe network card has some problems -- check its charts, hard disk operation times, etc. in the CGI monitor. 2591 | 2592 | Note: \code{XXXXXXXX} in examples below means IP address of chunkserver. In \code{mfsmount} version $<$ 2.0.42 chunkserver IP is written in hexadecimal format. In \code{mfsmount} version $>=$ 2.0.42 IP is "human-readable". 
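
When reading logs from an older \code{mfsmount}, the hexadecimal address can be converted to the usual dotted form with a short Python helper such as the sketch below. The function name \code{hex\_to\_ipv4} is hypothetical, and the sketch assumes the address is printed as 8 hexadecimal digits with the first octet first. Please verify this against a chunkserver whose address you know.

```python
import ipaddress


def hex_to_ipv4(hex_addr):
    """Convert 8 hex digits (assumed: first octet first) to dotted notation."""
    if len(hex_addr) != 8:
        raise ValueError("expected 8 hexadecimal digits, got %r" % hex_addr)
    # int() raises ValueError for anything that is not valid hexadecimal.
    return str(ipaddress.IPv4Address(int(hex_addr, 16)))


# Example with a made-up address (not taken from a real log):
print(hex_to_ipv4("C0A80001"))
```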
2593 | 2594 | What does \\ 2595 | \code{file: NNN, index: NNN, chunk: NNN, version: NNN - writeworker: 2596 | connection with (XXXXXXXX:PPPP) was timed out (unfinished writes: Y; try counter: Z)} \\ 2597 | message mean? 2598 | 2599 | This means that Zth try to write the chunk was not successful and writing of Y blocks, sent to the chunkserver, was not confirmed. After reconnecting these blocks would be sent again for saving. The limit of trials is set by default to 30. 2600 | 2601 | This message is for informational purposes and doesn't mean data loss. 2602 | 2603 | What does \\ 2604 | \code{file: NNN, index: NNN, chunk: NNN, version: NNN, cs: XXXXXXXX:PPPP - readblock error (try counter: Z)} \\ 2605 | message mean? 2606 | 2607 | This means that Zth try to read the chunk was not successful and system will try to read the block again. If value of Z equals 1 it is a transitory problem and you should not worry about it. The limit of trials is set by default to 30. 2608 | 2609 | 2610 | \section{How do I verify that the MooseFS cluster is online? What happens with \code{mfsmount} when the master server goes down?} 2611 | When the master server goes down while \code{mfsmount} is already running, \code{mfsmount} doesn't disconnect the mounted resource, and files awaiting to be saved would stay quite long in the queue while trying to reconnect to the master server. After a specified number of tries they eventually return EIO -- "input/output error". On the other hand it is not possible to start \code{mfsmount} when the master server is offline. 2612 | 2613 | There are several ways to make sure that the master server is online, we present a few of these below. 2614 | Check if you can connect to the TCP port of the master server (e.g. socket connection test). 2615 | In order to assure that a MooseFS resource is mounted it is enough to check the inode number -- MooseFS root will always have inode equal to 1. 
For example, if MooseFS is mounted at \code{/mnt/mfs}, then the \code{stat /mnt/mfs} command (on Linux) will show: 2616 | 2617 | 2618 | \begin{lstlisting} 2619 | $ stat /mnt/mfs 2620 | File: `/mnt/mfs' 2621 | Size: xxxxxx Blocks: xxx IO Block: 4096 directory 2622 | Device: 13h/19d Inode: 1 Links: xx 2623 | (...) 2624 | \end{lstlisting} 2625 | 2626 | Additionally, \code{mfsmount} creates a virtual hidden file \code{.stats} in the root mounted folder. For example, to get the statistics of \code{mfsmount} when MooseFS is mounted we can \code{cat} this \code{.stats} file, e.g.: 2627 | 2628 | \begin{lstlisting} 2629 | $ cat /mnt/mfs/.stats 2630 | fuse_ops.statfs: 241 2631 | fuse_ops.access: 0 2632 | fuse_ops.lookup-cached: 707553 2633 | fuse_ops.lookup: 603335 2634 | fuse_ops.getattr-cached: 24927 2635 | fuse_ops.getattr: 687750 2636 | fuse_ops.setattr: 24018 2637 | fuse_ops.mknod: 0 2638 | fuse_ops.unlink: 23083 2639 | fuse_ops.mkdir: 4 2640 | fuse_ops.rmdir: 1 2641 | fuse_ops.symlink: 3 2642 | fuse_ops.readlink: 454 2643 | fuse_ops.rename: 269 2644 | (...) 2645 | \end{lstlisting} 2646 | 2647 | To be sure that the master server responds properly, try to read the goal of any object, e.g. of the root folder: 2648 | 2649 | \begin{lstlisting} 2650 | $ mfsgetgoal /mnt/mfs 2651 | /mnt/mfs: 2 2652 | \end{lstlisting} 2653 | 2654 | If you get a proper goal of the root folder, you can be sure that the master server is up and running. 2655 | 2656 | \end{document} 2657 | -------------------------------------------------------------------------------- /MooseFS-Step-by-Step-Tutorial.tex: -------------------------------------------------------------------------------- 1 | % Copyright (c) 2013-2017 Piotr Robert Konopelko, Core Technology Sp. z o.o. 2 | % 3 | % This file is part of MooseFS. 
4 | % 5 | % MooseFS is free software; you can redistribute it and/or modify 6 | % it under the terms of the GNU General Public License as published by 7 | % the Free Software Foundation, version 2 (only). 8 | % 9 | % MooseFS is distributed in the hope that it will be useful, 10 | % but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | % GNU General Public License for more details. 13 | % 14 | % You should have received a copy of the GNU General Public License 15 | % along with MooseFS; if not, write to the Free Software 16 | % Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02111-1301, USA 17 | % or visit http://www.gnu.org/licenses/gpl-2.0.html 18 | 19 | \documentclass[a4paper,11pt,english]{report} 20 | \usepackage{url} 21 | \usepackage{hyperref} 22 | \usepackage{fullpage} 23 | \usepackage{parskip} 24 | \usepackage{graphicx} 25 | \usepackage{xcolor} 26 | \usepackage{listings} 27 | 28 | \lstset{ 29 | language=bash, 30 | basicstyle=\ttfamily\scriptsize, 31 | showstringspaces=false, 32 | commentstyle=\color{black}, 33 | keywordstyle=\color{black}, 34 | breakatwhitespace=false, 35 | breaklines=true, 36 | showspaces=false, 37 | tabsize=4 38 | } 39 | 40 | \def\code#1{\texttt{#1}} 41 | 42 | \newenvironment{copyrightnotice} 43 | {\begingroup 44 | \footnotesize 45 | \setlength{\parindent}{0pt} 46 | \setlength{\parskip}{\baselineskip}} 47 | {\endgroup} 48 | 49 | % ------------------------------------------------------------------------ 50 | 51 | \begin{document} 52 | 53 | \renewcommand{\labelitemi}{$\bullet$} 54 | \renewcommand{\labelitemii}{$\circ$} 55 | \renewcommand{\labelitemiii}{$\bullet$} 56 | \renewcommand{\labelitemiv}{$\circ$} 57 | 58 | \begin{titlepage} 59 | \begin{center} 60 | \includegraphics[width=0.2\textwidth]{images/moosefs.png}\\[1cm] 61 | 62 | % Title 63 | { \huge \bfseries Installing MooseFS \\ 64 | Step by Step Tutorial \\[0.4cm] } 65 | 66 | 67 | \textsc{Core 
Technology} Development \& Support Team 68 | 69 | \vfill 70 | 71 | % Bottom of the page 72 | {\large \today} 73 | \end{center} 74 | \end{titlepage} 75 | 76 | 77 | % Copyright page 78 | \begin{copyrightnotice} 79 | \begin{flushleft} 80 | Copyright \textcopyright{} 2013-\the\year 81 | \hfill 82 | \textsc{v. 1.5.2}\\ % DOCUMENTVERSION 83 | 84 | Piotr Robert Konopelko, \textsc{Core Technology} Development \& Support Team. 85 | 86 | \emph{Proofread by} 87 | Agata Kruszona-Zawadzka \\ 88 | \emph{Coordination \& layout by} Piotr Robert Konopelko. 89 | 90 | Please send corrections to \href{mailto:peter@mfs.io}{Piotr Robert Konopelko} -- peter@mfs.io. 91 | 92 | \bigskip 93 | 94 | This file is part of MooseFS. 95 | 96 | MooseFS is free software; you can redistribute it and/or modify 97 | it under the terms of the GNU General Public License as published by 98 | the Free Software Foundation, version 2 (only). 99 | 100 | MooseFS is distributed in the hope that it will be useful, 101 | but WITHOUT ANY WARRANTY; without even the implied warranty of 102 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 103 | GNU General Public License for more details. 104 | 105 | You should have received a copy of the GNU General Public License 106 | along with MooseFS; if not, write to the Free Software 107 | Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02111-1301, USA 108 | or visit \url{http://www.gnu.org/licenses/gpl-2.0.html} 109 | \end{flushleft} 110 | \end{copyrightnotice} 111 | 112 | \vfill 113 | 114 | \tableofcontents 115 | 116 | \chapter{Introduction} 117 | Notice: there is one dependency to resolve: users' computers need FUSE package to mount MooseFS. It can be downloaded and installed from repositories. 
118 | 119 | \section{Key differences between versions 1.6.2x and 2.0.x} 120 | \begin{enumerate} 121 | \item Master host(s) configuration is done solely via DNS -- it is no longer possible to list master(s) IP address(es) in clients' and Chunkservers' configuration; default name for master domain is \code{mfsmaster}, it can be changed in configuration files; 122 | \item In Pro version Metaloggers become optional, they can be replaced by additional Master Servers; in Community Edition it is still strongly recommended to set up Metaloggers. 123 | \item \code{Mfsmetarestore} tool is no longer present in the system; instead, it is enough to start the master process with \code{-a} switch; 124 | \item Configuration files now sit in \code{mfs} subdirectory inside the \code{/etc} directory (this change was introduced in 1.6.27). 125 | \end{enumerate} 126 | 127 | 128 | \section{Many Master Servers -- how does it work?} 129 | In previous MooseFS versions you had only one master process and any number of Metaloggers. In the event of master failure, system administrator was able to retrieve "metadata" information from the Metalogger and start a new master (on a new machine, if necessary), so the file system was up and running again. But this was always causing the system to be unavailable to clients for a period of time and required manual work to bring it back up. 130 | 131 | New MooseFS Pro version introduces many Master Servers working together in multiple roles. One role is "leader". The Leader Master is acting as it used to for the Chunkservers and clients. There is never more than one leader in any working system. 132 | 133 | The other role is "follower". The follower master is doing what Metaloggers used to do -- it 134 | downloads metadata from the leader master and keeps it. But unlike a Metalogger, if a leader master stops working, a follower master is immediately ready to take on the role of leader. 
135 | If the leader master fails, a new candidate for leader is chosen from the followers. The candidate 136 | assumes a role of "elect", that automatically converts to "leader" as soon as more than half of the Chunkservers connect to elect. There can be more than one follower in the system. 137 | 138 | The whole switching operation is almost invisible to the system users, as it usually takes between a couple to a dozen or so seconds. When/if the former leader master starts working again, it assumes the role of follower. If a follower master fails, it has no effect on the whole system. If such a master starts working again, it again assumes the role of follower. 139 | 140 | 141 | \chapter{Things to do before installation} 142 | For the sake of this document, it's assumed that your machines have following IP addresses: 143 | \begin{itemize} 144 | \item Master Servers: 192.168.1.1, 192.168.1.2 145 | \item Chunkservers: 192.168.1.101, 192.168.1.102 and 192.168.1.103 146 | \item Users' computers (clients): 192.168.2.x 147 | \end{itemize} 148 | 149 | \section{Configuring Domain Name Service} 150 | Before you start installing MooseFS, you need to have working DNS. It's needed for MooseFS to work properly with several Master Servers, because DNS can resolve one host name as more than one IP address. 151 | 152 | All IPs of machines, which will be Master Servers, must be included in DNS configuration file and resolved as "\code{mfsmaster}" (or any other selected name), e.g.: 153 | 154 | \begin{lstlisting}[caption={DNS entries}] 155 | mfsmaster IN A 192.168.1.1 ; address of first Master Server 156 | mfsmaster IN A 192.168.1.2 ; address of second Master Server 157 | \end{lstlisting} 158 | 159 | More information about configuring DNS server is included in a supplement to this manual. 160 | 161 | \section{Adding repository} 162 | To install MooseFS you need to add MooseFS Official Supported Repositories to your system. 
This process, both with detailed instructions for specific operating systems, is described at \url{http://get.moosefs.com} (please select your distribution in menu on the left). 163 | 164 | At this time there are repositories available for Ubuntu/Debian, RHEL/CentOS/Fedora, FreeBSD and MacOS X. 165 | 166 | \subsection{Repository branches} 167 | Our repository contains two branches: \code{moosefs-3} and \code{moosefs-2}. Both branches contain stable and production-ready MooseFS version. 168 | 169 | At the time of writing this guide, \code{moosefs-3} branch contains version 3.0.86-1, and \code{moosefs-2} branch contains version 2.0.91-1. 170 | 171 | \code{moosefs-3} branch is a default and you don't need to make any changes in default URL (\code{http://ppa.moosefs.com/moosefs-3/}). 172 | 173 | If you want to use \code{moosefs-2} branch, you just need to replace \code{moosefs-3} with \code{moosefs-2} after \code{http://ppa.moosefs.com/} and before \code{apt}, \code{yum}, \code{freebsd} or \code{osx}, so URL will look like: 174 | 175 | \code{http://ppa.moosefs.com/moosefs-2/[rest of url]} 176 | 177 | It is also possible to use version number instead of "branch" if you want to upgrade to a specific version of MooseFS (e.g. 
3.0.81-1): 178 | 179 | \code{http://ppa.moosefs.com/3.0.81/[rest of url]} 180 | 181 | \textbf{Notice: If you want to use the last option, please remember you need to manually change version number on each server to the selected one before doing an upgrade.} 182 | 183 | \section{Differences in package names between MooseFS \\ and MooseFS Pro} 184 | MooseFS and MooseFS Pro packages are named according to the following pattern: 185 | 186 | \bigskip 187 | 188 | \begin{tabular}{l | l | l} 189 | \textbf{MooseFS module} & \textbf{MooseFS Pro} & {\textbf{MooseFS}} \\ \hline 190 | Master Server & \code{moosefs-pro-master} & \code{moosefs-master} \\ \hline 191 | Chunkserver & \code{moosefs-pro-chunkserver} & \code{moosefs-chunkserver} \\ \hline 192 | Metalogger & \code{moosefs-pro-metalogger} & \code{moosefs-metalogger} \\ \hline 193 | Client & \code{moosefs-pro-client} & \code{moosefs-client} \\ \hline 194 | CLI Interface & \code{moosefs-pro-cli} & \code{moosefs-cli} \\ \hline 195 | CGI Interface & \code{moosefs-pro-cgi} & \code{moosefs-cgi} \\ \hline 196 | CGI Server & \code{moosefs-pro-cgiserv} & \code{moosefs-cgiserv} \\ \hline 197 | Netdump & \code{moosefs-pro-netdump} & \code{moosefs-netdump} \\ \hline 198 | Supervisor & \code{moosefs-pro-supervisor} & n/a 199 | \end{tabular} 200 | 201 | 202 | \chapter{MooseFS installation process on dedicated machines} 203 | \textbf{Notice: In this tutorial it is assumed, that you have MooseFS Community Edition. If you want to install MooseFS Pro, please use '\code{pro}' in package names, e.g.: \code{moosefs-pro-master} instead of \code{moosefs-master}.} 204 | 205 | \bigskip 206 | 207 | In this tutorial it is also assumed, that you have Ubuntu/Debian installed on your machines. If you have another distribution, please use appropriate package manager instead of \code{apt}. 
208 | 209 | Notice, that most of commands below are preceded by \code{\#} sign, which means, that you have to run such command as \code{root} (\code{\$} sign means normal user). The easiest way to become \code{root} is to run: 210 | 211 | \begin{lstlisting}[caption={Becoming \code{root}}] 212 | $ sudo su - 213 | \end{lstlisting} 214 | 215 | \section{Master Server(s) installation} 216 | \bigskip 217 | 218 | \begin{center} 219 | \textbf{\fcolorbox{red}{pink}{Warning: Configuration files on all Master Servers must be consistent!}} 220 | \end{center} 221 | 222 | \bigskip 223 | \bigskip 224 | 225 | In MooseFS 2.0 Master Server (and also other modules) installation can be accomplished by running the command listed below: 226 | 227 | \begin{lstlisting}[caption={Installing Master Server}] 228 | # apt-get install moosefs-master 229 | \end{lstlisting} 230 | 231 | 232 | Sample configuration files will be created in \code{/etc/mfs} with the extension \code{*.sample} (MooseFS 3.0+) or \code{*.dist} (MooseFS 2.0). Use these files as your target configuration files: 233 | 234 | \begin{lstlisting}[caption={Copying default config files as target configuration files (MooseFS 3.0)}] 235 | # cd /etc/mfs 236 | # cp mfsmaster.cfg.sample mfsmaster.cfg 237 | # cp mfsexports.cfg.sample mfsexports.cfg 238 | \end{lstlisting} 239 | 240 | 241 | 242 | If you would like to change any of the settings you should uncomment the appropriate line and set a different value. For the lines which are commented the system will use built-in default values, i.e. those listed in commented lines. 243 | 244 | File \code{mfsmaster.cfg} contains Master Server settings. You can find out more information about this file in the man pages (\code{man mfsmaster.cfg} or at \url{https://moosefs.com/manpages/mfsmaster-cfg.html}). 245 | 246 | File \code{mfsexports.cfg} specifies which users' computers can mount the file system and with what 247 | privileges. 
For example, to specify that only machines addressed as \code{192.168.2.x} can use the whole structure of MooseFS resources (\code{/}) in read/write mode, in the first line which is not commented out change the asterisk (\code{*}) to \code{192.168.2.0/24}, so that you'll have: 248 | 249 | 250 | \begin{lstlisting}[caption={Changes to mfsexports.cfg}] 251 | 192.168.2.0/24 / rw,alldirs,maproot=0 252 | \end{lstlisting} 253 | 254 | \bigskip 255 | 256 | If you are setting up MooseFS Pro, at this point place the proper \code{mfslicence.bin} file into the \code{/etc/mfs} directory: 257 | 258 | \begin{lstlisting}[caption={Installing mfslicence.bin file}] 259 | # cp /path/to/mfslicence.bin /etc/mfs 260 | \end{lstlisting} 261 | 262 | The \code{mfslicence.bin} file must be installed on all Master Servers.\\ 263 | The \code{mfslicence.bin} file is not needed on a Community Edition Master Server. 264 | 265 | If you want MooseFS Master Server to start automatically during system boot, \\ 266 | edit \code{/etc/default/moosefs-master} and set \code{MFSMASTER\_ENABLE} variable to \code{true}: 267 | 268 | \begin{lstlisting}[caption={Configuring mfsmaster autostart}] 269 | MFSMASTER_ENABLE=true 270 | \end{lstlisting} 271 | 272 | At this point it is possible to run the Master Server (using the standard way to run services): 273 | 274 | \begin{lstlisting}[caption={Starting MooseFS Master Server}] 275 | # service moosefs-master start 276 | \end{lstlisting} 277 | 278 | or: 279 | 280 | \begin{lstlisting}[caption={Starting MooseFS Master Server}] 281 | # mfsmaster start 282 | \end{lstlisting} 283 | 284 | To install a second (third, etc.) Master Server, just repeat the steps listed above on another machine (Pro only). 
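
Before moving on, it can be worth checking that the Master Server process is running and that the master name resolves as expected. The commands below are a minimal sketch, assuming the default name \code{mfsmaster} and a Linux system that provides \code{getent}; the second command should list the address(es) configured in DNS for the Master Server(s).

\begin{lstlisting}[caption={Checking the Master Server (sketch)}]
# mfsmaster status
$ getent ahosts mfsmaster
\end{lstlisting}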
285 | \vfill 286 | 287 | \section{MooseFS CGI and CGI Server installation} 288 | The MooseFS CGI monitor interface lets the user observe and analyze the current MooseFS status (as shown in the screenshots below): \\ 289 | 290 | \begin{center} 291 | \includegraphics[width=1\textwidth]{images/mfsscr1_blur.png}\\[1cm] 292 | \includegraphics[width=1\textwidth]{images/mfsscr3_blur.png}\\[0.5cm] 293 | \includegraphics[width=1\textwidth]{images/mfsscr2_blur.png}\\[0.5cm] 294 | \end{center} 295 | 296 | \bigskip 297 | 298 | We recommend installing MooseFS CGI and CGI Server on all Master Servers. 299 | 300 | \bigskip 301 | 302 | \begin{lstlisting}[caption={MooseFS CGI and CGI Server installation}] 303 | # apt-get install moosefs-cgiserv 304 | # apt-get install moosefs-cgi 305 | \end{lstlisting} 306 | 307 | If you want MooseFS CGI Server to start automatically during system boot, \\ 308 | edit \code{/etc/default/moosefs-cgiserv} and set \code{MFSCGISERV\_ENABLE} variable to \code{true}: 309 | 310 | \begin{lstlisting}[caption={Configuring mfscgiserv autostart}] 311 | MFSCGISERV_ENABLE=true 312 | \end{lstlisting} 313 | 314 | You can now run CGI Monitor Server: 315 | 316 | \begin{lstlisting}[caption={Starting MooseFS CGI Server}] 317 | # service moosefs-cgiserv start 318 | \end{lstlisting} 319 | 320 | or: 321 | 322 | \begin{lstlisting}[caption={Starting MooseFS CGI Server}] 323 | # mfscgiserv start 324 | \end{lstlisting} 325 | 326 | Information should now be available under \url{http://192.168.1.1:9425/} (for the moment there will be no data about Chunkservers). 327 | 328 | 329 | \section{MooseFS CLI installation} 330 | MooseFS Command Line Interface (CLI) tool allows you to see various information about MooseFS status. This tool has many options, which allow you to check all the information you can see in your CGI, but from command line, so it is possible to use MooseFS CLI in scripts. 
You can list all the options by invoking the tool with \code{-h} or \code{--help} switch or check it at \url{https://moosefs.com/manpages/mfscli.html}. E.g. \code{mfscli} with \code{-SIN} option will display basic info similar to the "Info" tab in CGI. 331 | 332 | We recommend installing MooseFS CLI on all Master Servers: 333 | 334 | \begin{lstlisting}[caption={MooseFS CLI installation}] 335 | # apt-get install moosefs-cli 336 | \end{lstlisting} 337 | 338 | 339 | \section{Metadata backup servers (Metaloggers) installation} 340 | In MooseFS Pro, when there are at least two Master Servers present, Metalogger is optional, because when leader master fails, another one takes over its work. 341 | 342 | \textbf{In MooseFS (non-Pro) we strongly recommend to set up at least one Metalogger.} \\ 343 | 344 | It is recommended, that the machine used to install the backup server is as strong as the Master Server (at least in regards to the amount of RAM). In case of the Master Server failure, after importing changelogs to the 345 | metadata file, the Metalogger server can be easily set up to take over functions of the managing server. 346 | 347 | Issue the following commands to install and configure MooseFS Metalogger with default settings: 348 | 349 | \begin{lstlisting}[caption={Installing and configuring Metalogger}] 350 | # apt-get install moosefs-metalogger 351 | 352 | # cd /etc/mfs 353 | # cp mfsmetalogger.cfg.sample mfsmetalogger.cfg 354 | \end{lstlisting} 355 | 356 | For our test installation you'll leave mfsmetalogger.cfg unchanged. You can find out more information about this file in the man pages (\code{man mfsmetalogger.cfg} or at \url{https://moosefs.com/manpages/mfsmetalogger-cfg.html}). 357 | In case you have changed the default name \code{mfsmaster} to a different one, you need to uncomment and change the \code{MASTER\_HOST} variable in \code{mfsmetalogger.cfg} file. 
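
For instance, if the Master Servers are registered in DNS under a name other than the default, the relevant line in \code{mfsmetalogger.cfg} could look like the sketch below. The name \code{mfsmaster.mfsnetwork.lan} is only an illustration borrowed from the DNS supplement of this tutorial; use whatever name your DNS actually resolves to the Master Server addresses.

\begin{lstlisting}[caption={Example setting in mfsmetalogger.cfg (sketch)}]
MASTER_HOST = mfsmaster.mfsnetwork.lan
\end{lstlisting}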
358 | 359 | If you want MooseFS Metalogger to start automatically during system boot, \\ 360 | edit \code{/etc/default/moosefs-metalogger} and set \code{MFSMETALOGGER\_ENABLE} variable to \code{true}: 361 | 362 | \begin{lstlisting}[caption={Configuring mfsmetalogger autostart}] 363 | MFSMETALOGGER_ENABLE=true 364 | \end{lstlisting} 365 | 366 | Now you are ready to start the backup server process: 367 | 368 | \begin{lstlisting}[caption={Starting MooseFS Metalogger}] 369 | # service moosefs-metalogger start 370 | \end{lstlisting} 371 | 372 | or 373 | 374 | \begin{lstlisting}[caption={Starting MooseFS Metalogger}] 375 | # mfsmetalogger start 376 | \end{lstlisting} 377 | 378 | To install a second (third, etc.) Metalogger, just repeat the steps listed above on another machine. 379 | 380 | \section{Chunkservers installation} 381 | Issue the following commands on the machines which are to be Chunkservers: 382 | 383 | \begin{lstlisting}[caption={Installing MooseFS Chunkserver}] 384 | # apt-get install moosefs-chunkserver 385 | \end{lstlisting} 386 | 387 | 388 | Now prepare configuration files of the Chunkservers: 389 | 390 | \begin{lstlisting}[caption={Preparing configuration files}] 391 | # cd /etc/mfs 392 | # cp mfschunkserver.cfg.sample mfschunkserver.cfg 393 | # cp mfshdd.cfg.sample mfshdd.cfg 394 | \end{lstlisting} 395 | 396 | 397 | For our test installation you'll leave mfschunkserver.cfg unchanged. You can find out more information about this file in the man pages (\code{man mfschunkserver.cfg} or at \url{https://moosefs.com/manpages/mfschunkserver-cfg.html}). 398 | In case you have changed the default name \code{mfsmaster} to a different one, you also need to uncomment and change the \code{MASTER\_HOST} variable in \code{mfschunkserver.cfg} file. 399 | 400 | It is recommended that the hard drives or partitions designated for storing chunks are used exclusively for MooseFS -- this is necessary to manage the free space properly. 
401 | 402 | Let's assume, that \code{/dev/sdb} and \code{/dev/sdc} devices are designated to store chunks. First of all, create a partition table and partition on these devices. 403 | \begin{lstlisting}[caption={Creating a partition on \code{/dev/sdb}}] 404 | # parted --align optimal /dev/sdb 405 | (parted) mklabel gpt 406 | (parted) mkpart mfschunks1 0% 100% 407 | (parted) q 408 | \end{lstlisting} 409 | 410 | \begin{lstlisting}[caption={Creating a partition on \code{/dev/sdc}}] 411 | # parted --align optimal /dev/sdc 412 | (parted) mklabel gpt 413 | (parted) mkpart mfschunks2 0% 100% 414 | (parted) q 415 | \end{lstlisting} 416 | 417 | Install XFS Progs: 418 | \begin{lstlisting}[caption={Installing \code{xfsprogs} (on Debian/Ubuntu)}] 419 | # apt-get install xfsprogs 420 | \end{lstlisting} 421 | 422 | Then, format newly created partition with XFS filesystem: 423 | \begin{lstlisting}[caption={Formatting partitions}] 424 | # mkfs.xfs /dev/sdb1 425 | # mkfs.xfs /dev/sdc1 426 | \end{lstlisting} 427 | 428 | If you have drives with 4k physical sector size (most of 2 and 4 TiB modern HDDs have 4k physical sector size), instead of the command above, issue: 429 | \begin{lstlisting}[caption={Formatting partitions with \code{4k} block size}] 430 | # mkfs.xfs -s size=4k /dev/sdb1 431 | # mkfs.xfs -s size=4k /dev/sdc1 432 | \end{lstlisting} 433 | 434 | Then, add appropriate entries into \code{/etc/fstab}: 435 | \begin{lstlisting} 436 | /dev/sdb1 /mnt/mfschunks1 xfs defaults 0 0 437 | /dev/sdc1 /mnt/mfschunks2 xfs defaults 0 0 438 | \end{lstlisting} 439 | 440 | Create directories for mounting newly created partitions: 441 | \begin{lstlisting}[caption={Creating directories}] 442 | # mkdir /mnt/mfschunks1 443 | # mkdir /mnt/mfschunks2 444 | \end{lstlisting} 445 | 446 | Mount newly created partitions: 447 | \begin{lstlisting}[caption={Mounting partitions}] 448 | # mount /mnt/mfschunks1 449 | # mount /mnt/mfschunks2 450 | \end{lstlisting} 451 | 452 | Change ownership and access 
rights to mountpoints to let MooseFS Chunkserver write to them: 453 | 454 | \begin{lstlisting}[caption={Changing ownership}] 455 | # chown mfs:mfs /mnt/mfschunks1 456 | # chown mfs:mfs /mnt/mfschunks2 457 | 458 | # chmod 770 /mnt/mfschunks1 459 | # chmod 770 /mnt/mfschunks2 460 | \end{lstlisting} 461 | 462 | At this point enter mountpoints in \code{mfshdd.cfg} file: 463 | \begin{lstlisting}[caption={Contents of mfshdd.cfg file}] 464 | /mnt/mfschunks1 465 | /mnt/mfschunks2 466 | \end{lstlisting} 467 | 468 | If you want MooseFS Chunkserver to start automatically during system boot, \\ 469 | edit \code{/etc/default/moosefs-chunkserver} and set \code{MFSCHUNKSERVER\_ENABLE} variable to \code{true}: 470 | 471 | \begin{lstlisting}[caption={Configuring autostart of MooseFS Chunkserver}] 472 | MFSCHUNKSERVER_ENABLE=true 473 | \end{lstlisting} 474 | 475 | 476 | Now you are ready to start the Chunkserver: 477 | 478 | \begin{lstlisting}[caption={Starting MooseFS Chunkserver}] 479 | # service moosefs-chunkserver start 480 | \end{lstlisting} 481 | 482 | or: 483 | 484 | \begin{lstlisting}[caption={Starting MooseFS Chunkserver}] 485 | # mfschunkserver start 486 | \end{lstlisting} 487 | 488 | 489 | Repeat the same steps for each Chunkserver you want to use for storing data in MooseFS system. 490 | 491 | At this point, at \url{http://192.168.1.1:9425}, you should be able to see full information about the system including the Master Server and Chunkservers. 492 | 493 | \section{Users' computers installation} 494 | In order to mount a file system based on MooseFS, it is necessary that users' computers have FUSE package (at least in version 2.6, recommended $\geq$ 2.7.2). If it is not present, you have to install it. 
One of the options is to compile it from sources, or you can install it from repositories on Debian-based systems with following command: 495 | 496 | \begin{lstlisting}[caption={Installing FUSE}] 497 | # apt-get install fuse libfuse2 498 | \end{lstlisting} 499 | 500 | 501 | \code{mfsmount} can be installed in the same way as other MooseFS components: 502 | 503 | \begin{lstlisting}[caption={Installing mfsmount}] 504 | # apt-get install moosefs-client 505 | \end{lstlisting} 506 | 507 | Let's assume that you'll mount the system in a \code{/mnt/mfs} folder on a client's machine. Issue the following commands: 508 | 509 | \begin{lstlisting}[caption={Mounting the Moose File System}] 510 | # mkdir -p /mnt/mfs 511 | # mfsmount /mnt/mfs -H mfsmaster.host.name 512 | \end{lstlisting} 513 | 514 | Now after issuing the \code{df -h | grep mfs} command you should get information similar to this: 515 | \begin{lstlisting}[caption={Result of \code{df -h | grep mfs}}] 516 | /dev/sdb 2.0G 69M 1.9G 4% /mnt/mfschunks1 517 | /dev/sdc 2.0G 69M 1.9G 4% /mnt/mfschunks2 518 | mfsmaster:9421 3.2G 0 3.2G 0% /mnt/mfs 519 | \end{lstlisting} 520 | 521 | \chapter{Basic MooseFS use} 522 | To create \code{folder1} in \code{/mnt/mfs}, in which you will store files in one copy (setting \code{goal=1}), issue the following command: 523 | 524 | \begin{lstlisting}[caption={Making directory \#1}] 525 | mkdir -p /mnt/mfs/folder1 526 | \end{lstlisting} 527 | 528 | To create \code{folder2}, in which you will store files in two copies (setting \code{goal=2}), issue the following command: 529 | \begin{lstlisting}[caption={Making directory \#2}] 530 | mkdir -p /mnt/mfs/folder2 531 | \end{lstlisting} 532 | 533 | The number of copies for the folder is set with the \code{mfssetgoal -r} command: 534 | 535 | \begin{lstlisting}[caption={\code{mfssetgoal -r command}}] 536 | # mfssetgoal -r 1 /mnt/mfs/folder1 537 | /mnt/mfs/folder1: 538 | inodes with goal changed: 0 539 | inodes with goal not changed: 1 540 | inodes with 
permission denied: 0 541 | 542 | # mfssetgoal -r 2 /mnt/mfs/folder2 543 | /mnt/mfs/folder2: 544 | inodes with goal changed: 1 545 | inodes with goal not changed: 0 546 | inodes with permission denied: 0 547 | \end{lstlisting} 548 | 549 | Create and copy a file to both folders: 550 | 551 | \begin{lstlisting}[caption={Creating and copying a file to newly created folders}] 552 | echo "test" > testmfs 553 | cp testmfs /mnt/mfs/folder1 554 | cp testmfs /mnt/mfs/folder2 555 | \end{lstlisting} 556 | 557 | To check in how many copies a file is stored, use the \code{mfscheckfile} command. In \code{folder1} you have one copy stored in one chunk: 558 | 559 | \begin{lstlisting}[caption={Checking amount of copies}] 560 | # mfscheckfile /mnt/mfs/folder1/testmfs 561 | /mnt/mfs/folder1/testmfs: 562 | chunks with 1 copy: 1 563 | \end{lstlisting} 564 | 565 | And in the \code{folder2} the file \code{testmfs} is stored in two copies: 566 | 567 | \begin{lstlisting}[caption={Checking amount of copies}] 568 | # mfscheckfile /mnt/mfs/folder2/testmfs 569 | /mnt/mfs/folder2/testmfs: 570 | chunks with 2 copies: 1 571 | \end{lstlisting} 572 | 573 | Note, that if you set a goal for a file higher than the total number of working Chunkservers, this file will be saved in only as many copies as there are Chunkservers. This is because one Chunkserver will store no more than one copy of any chunk/file. 
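
To see this effect on a concrete file, the requested goal can be compared with the number of copies that actually exist. The commands below are a sketch using the \code{folder2} example from above: \code{mfsgetgoal} prints the goal that was set, while \code{mfscheckfile} reports how many copies of each chunk are currently stored.

\begin{lstlisting}[caption={Comparing the goal with the actual number of copies (sketch)}]
$ mfsgetgoal /mnt/mfs/folder2/testmfs
$ mfscheckfile /mnt/mfs/folder2/testmfs
\end{lstlisting}

If the second command reports fewer copies than the goal, it is worth checking in the CGI monitor how many Chunkservers are currently connected.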
574 | 575 | You can find more information about MooseFS usage and commands on this website: 576 | \begin{itemize} 577 | \item \url{https://moosefs.com/documentation.html} 578 | \end{itemize} 579 | 580 | It is also recommended to read Best practices, Frequently Asked Questions and Manpages: 581 | \begin{itemize} 582 | \item \url{https://moosefs.com/documentation/best-practices.html} 583 | \item \url{https://moosefs.com/documentation/faq.html} 584 | \item \url{https://moosefs.com/manpages.html} 585 | \end{itemize} 586 | 587 | 588 | 589 | \chapter{Stopping MooseFS} 590 | In order to safely stop the MooseFS cluster you have to perform the following steps: 591 | \begin{itemize} 592 | \item Stop all the processes which use MooseFS mounted share. \code{lsof -n | grep mfsmount} may be helpful. 593 | \item Unmount the file system on all machines using the \code{umount} command (in our examples it would be: \code{umount /mnt/mfs}) 594 | \item Stop the Chunkserver processes: \code{service moosefs-chunkserver stop} 595 | \item Stop the Master Server process(es): \code{service moosefs-master stop} 596 | \item Stop the Metalogger process(es) (if any): \code{service moosefs-metalogger stop} 597 | \end{itemize} 598 | 599 | 600 | \chapter{Supplement: Setting up DNS server on Debian/Ubuntu} 601 | 602 | In this extra chapter you'll use \code{bind9} as your DNS server. \\ 603 | Notice: You can find out more about DNS server e.g. on these pages: 604 | 605 | \begin{itemize} 606 | \item \url{https://help.ubuntu.com/community/BIND9ServerHowto} 607 | \item \url{http://ubuntuforums.org/showthread.php?t=236093} \\ 608 | \end{itemize} 609 | 610 | 611 | \section{Setting up DNS server} 612 | \begin{enumerate} 613 | \item The very first thing to do is installing bind9 and DNS utils. 
You can do this by running the following command: 614 | 615 | \begin{lstlisting}[caption={Installing bind9}] 616 | # sudo apt-get install bind9 dnsutils 617 | \end{lstlisting} 618 | 619 | The main configuration files are located in the \code{/etc/bind/} directory. 620 | 621 | \item The second step is to open the file named "\code{named.conf.local}" in your favorite editor (e.g. \code{nano} or \code{vim}) and add your new zone to it, e.g.: 622 | 623 | \begin{lstlisting}[caption={New zone in named.conf.local}] 624 | zone "mfsnetwork.lan" { 625 | type master; 626 | file "/etc/bind/mfsnetwork.lan"; 627 | }; 628 | \end{lstlisting} 629 | 630 | In this file you decide whether the server is a master or a slave for the zone and set the path to the zone's configuration file. 631 | 632 | \item After that, create the file you've pointed to in the zone configuration (user \code{bind} must have permission to read it) and paste the following code into it: 633 | 634 | \begin{lstlisting}[caption={mfsnetwork.lan configuration file}] 635 | $TTL 3600 636 | $ORIGIN mfsnetwork.lan. 637 | @ IN SOA dns.mfsnetwork.lan. root.mfsnetwork.lan. ( 638 | 2016032900 ; serial number YYYYMMDDSS 639 | 10800 ; refresh 640 | 3600 ; retry 641 | 604800 ; expire 642 | 10800 ; negative TTL 643 | ) 644 | 645 | @ IN NS dns.mfsnetwork.lan. 646 | @ IN A 192.168.0.1 ; address of bind9 647 | dns IN A 192.168.0.1 ; address of bind9 648 | 649 | mfsmaster IN A 192.168.1.1 ; address of Master01 650 | mfsmaster IN A 192.168.1.2 ; address of Master02 651 | 652 | mfsmaster01 IN A 192.168.1.1 ; address of Master01 653 | mfsmaster02 IN A 192.168.1.2 ; address of Master02 654 | 655 | chunkserver01 IN A 192.168.1.101 ; address of Chunkserver01 656 | chunkserver02 IN A 192.168.1.102 ; address of Chunkserver02 657 | chunkserver03 IN A 192.168.1.103 ; address of Chunkserver03 658 | \end{lstlisting} 659 | 660 | \item Next thing to do is to edit file \code{/etc/bind/named.conf.options}.
Set the forwarders there to your ISP's DNS servers, or use the OpenDNS servers, whose IP addresses are shown below: 661 | 662 | \begin{lstlisting}[caption={\code{named.conf.options} configuration file}] 663 | forwarders { 664 | 208.67.222.222; 665 | 208.67.220.220; 666 | }; 667 | \end{lstlisting} 668 | 669 | \item The last thing to do is to restart the bind9 DNS server (to let it load the new configuration): 670 | 671 | \begin{lstlisting}[caption={Restarting bind9}] 672 | # service bind9 restart 673 | \end{lstlisting} 674 | \end{enumerate} 675 | 676 | \section{Setting up revDNS server} 677 | A reverse DNS server is used by MooseFS, and by network services in general, to translate IP addresses to a human-readable form (e.g. \code{192.168.1.1} to \code{mfsmaster01}). 678 | After installing and properly configuring the DNS server you need to do 3 more things to have revDNS set up: 679 | 680 | \begin{itemize} 681 | \item In the \code{/etc/bind} directory create an empty file named \code{rev.168.192.in-addr.arpa} and paste the following code into it: 682 | 683 | \begin{lstlisting}[caption={Content of \code{rev.168.192.in-addr.arpa} file}] 684 | @ IN SOA dns.mfsnetwork.lan. root.mfsnetwork.lan. ( 685 | 2016032900 ; serial number YYYYMMDDSS 686 | 28800 687 | 604800 688 | 604800 689 | 86400 690 | ) 691 | 692 | 693 | 694 | 168.192.in-addr.arpa. IN NS dns.mfsnetwork.lan. 695 | 696 | 1.1 IN PTR mfsmaster01.mfsnetwork.lan. 697 | 2.1 IN PTR mfsmaster02.mfsnetwork.lan. 698 | 699 | 101.1 IN PTR chunkserver01.mfsnetwork.lan. 700 | 102.1 IN PTR chunkserver02.mfsnetwork.lan. 701 | 103.1 IN PTR chunkserver03.mfsnetwork.lan.
702 | \end{lstlisting} 703 | 704 | \item Add the following code to \code{/etc/bind/named.conf.local} file: 705 | 706 | \begin{lstlisting}[caption={Extra code to add to \code{/etc/bind/named.conf.local} file}] 707 | zone "168.192.in-addr.arpa" { 708 | type master; 709 | file "/etc/bind/rev.168.192.in-addr.arpa"; 710 | }; 711 | \end{lstlisting} 712 | 713 | \item Run \code{service bind9 restart} command: 714 | 715 | \begin{lstlisting}[caption={Running \code{service bind9 restart} command}] 716 | # service bind9 restart 717 | \end{lstlisting} 718 | \end{itemize} 719 | \end{document} 720 | -------------------------------------------------------------------------------- /MooseFS-Upgrade-Guide.tex: -------------------------------------------------------------------------------- 1 | % Copyright (c) 2014-2017 Piotr Robert Konopelko, Core Technology Sp. z o.o. 2 | % 3 | % This file is part of MooseFS. 4 | % 5 | % MooseFS is free software; you can redistribute it and/or modify 6 | % it under the terms of the GNU General Public License as published by 7 | % the Free Software Foundation, version 2 (only). 8 | % 9 | % MooseFS is distributed in the hope that it will be useful, 10 | % but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | % GNU General Public License for more details. 
13 | % 14 | % You should have received a copy of the GNU General Public License 15 | % along with MooseFS; if not, write to the Free Software 16 | % Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02111-1301, USA 17 | % or visit http://www.gnu.org/licenses/gpl-2.0.html 18 | 19 | \documentclass[a4paper,11pt,english]{report} 20 | \usepackage{url} 21 | \usepackage{hyperref} 22 | \usepackage{fullpage} 23 | \usepackage{parskip} 24 | \usepackage{graphicx} 25 | \usepackage{xcolor} 26 | \usepackage{listings} 27 | 28 | \lstset{ 29 | language=bash, 30 | basicstyle=\ttfamily\scriptsize, 31 | showstringspaces=false, 32 | commentstyle=\color{black}, 33 | keywordstyle=\color{black}, 34 | breakatwhitespace=false, 35 | breaklines=true, 36 | showspaces=false, 37 | tabsize=4 38 | } 39 | 40 | \def\code#1{\texttt{#1}} 41 | 42 | \newenvironment{copyrightnotice} 43 | {\begingroup 44 | \footnotesize 45 | \setlength{\parindent}{0pt} 46 | \setlength{\parskip}{\baselineskip}} 47 | {\endgroup} 48 | 49 | % ------------------------------------------------------------------------ 50 | 51 | \begin{document} 52 | 53 | \renewcommand{\labelitemi}{$\bullet$} 54 | \renewcommand{\labelitemii}{$\circ$} 55 | \renewcommand{\labelitemiii}{$\bullet$} 56 | \renewcommand{\labelitemiv}{$\circ$} 57 | 58 | \begin{titlepage} 59 | \begin{center} 60 | \includegraphics[width=0.2\textwidth]{images/moosefs.png}\\[1cm] 61 | 62 | % Title 63 | { \huge \bfseries MooseFS 2.0 Upgrade Guide \\[0.4cm] } 64 | 65 | 66 | \textsc{Core Technology} Development \& Support Team 67 | 68 | \vfill 69 | 70 | % Bottom of the page 71 | {\large \today} 72 | \end{center} 73 | \end{titlepage} 74 | 75 | 76 | % Copyright page 77 | \begin{copyrightnotice} 78 | \begin{flushleft} 79 | Copyright \textcopyright{} 2014-\the\year 80 | \hfill 81 | \textsc{v. 1.0.8}\\ % DOCUMENTVERSION 82 | 83 | Piotr Robert Konopelko, \textsc{Core Technology} Development \& Support Team. 
84 | 85 | \emph{Proofread by} 86 | Agata Kruszona-Zawadzka \\ 87 | \emph{Coordination \& layout by} Piotr Robert Konopelko. 88 | 89 | Please send corrections to \href{mailto:peter@mfs.io}{Piotr Robert Konopelko} -- peter@mfs.io. 90 | 91 | \bigskip 92 | 93 | This file is part of MooseFS. 94 | 95 | MooseFS is free software; you can redistribute it and/or modify 96 | it under the terms of the GNU General Public License as published by 97 | the Free Software Foundation, version 2 (only). 98 | 99 | MooseFS is distributed in the hope that it will be useful, 100 | but WITHOUT ANY WARRANTY; without even the implied warranty of 101 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 102 | GNU General Public License for more details. 103 | 104 | You should have received a copy of the GNU General Public License 105 | along with MooseFS; if not, write to the Free Software 106 | Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02111-1301, USA 107 | or visit \url{http://www.gnu.org/licenses/gpl-2.0.html} 108 | \end{flushleft} 109 | \end{copyrightnotice} 110 | 111 | \vfill 112 | 113 | \tableofcontents 114 | 115 | % ================================================================================================================== 116 | 117 | \chapter{Introduction} 118 | This book guides you through the process of upgrading and/or migrating MooseFS. It describes upgrading MooseFS to a newer version in the 2.0.x branch, upgrading from 1.6.27-5 to 2.0.x, migrating from Community Edition to Pro and migrating from Pro to Community Edition. 119 | 120 | It is strongly advised to read the entire document before attempting to start the upgrade/migration process. 121 | 122 | There are several prerequisites, discussed in the following paragraphs, that should be taken into consideration before upgrading/migrating MooseFS.
123 | 124 | \section{Backing-up data and configuration files} 125 | The most important thing is to \textbf{back-up all configuration, metadata and changelog files that contain critical MooseFS configuration. Losing those files may lead to data loss and a serious problem with recovering your files.} \\ 126 | 127 | If MooseFS was installed from officially supported packages or repository (version 2.0 and higher), the configuration files should be located in default configuration path i.e. \code{/etc/mfs} and metadata files should be located in \code{/var/lib/mfs}\\ 128 | 129 | If you installed MooseFS 1.6.27-5 from sources without any additional parameters to \code{configure} script, the configuration files and metadata should be located in \code{/usr/local/var/lib/mfs} or \code{/usr/local/etc/mfs}. Depending on parameters selected during compilation process, these paths may vary (e.g. it might be \code{/etc}, \code{/etc/mfs}, \code{/var/lib/mfs}). If you compiled MooseFS from sources and set different paths, you should refer to your site specific documentation to find out where MooseFS configuration files have been placed. 130 | 131 | 132 | \section{Repository} 133 | \subsection{Adding repository} 134 | To install MooseFS 2.0 Pro or CE you need to add MooseFS Official Supported Repositories to your system. This process is described at \url{http://get.moosefs.com} (please select your distribution in menu on the left) or in paragraph 2.2 in document named \textit{Installing MooseFS 2.0 Step by Step Tutorial}.\\ 135 | 136 | At this time there are repositories available for Ubuntu / Debian, RHEL / CentOS / Fedora, FreeBSD and MacOS X. 137 | 138 | \subsection{Repository branches} 139 | Our repository contains two branches: \code{stable} and \code{current}. Version from \code{stable} branch has been tested both in the production and in our test environment. Version from \code{current} branch -- only in our test environment. 
MooseFS versions in these branches are upgraded automatically after finishing the tests. 140 | 141 | At the time of writing this guide, \code{stable} branch contains version 2.0.39-1, and \code{current} branch contains version 2.0.40-1. 142 | 143 | \code{Stable} branch is default and you don't need to make any changes in default URL: \\ 144 | \code{http://ppa.moosefs.com/stable/}. 145 | 146 | If you want to use \code{current} branch, you just need to replace \code{stable} with \code{current} after \code{http://ppa.moosefs.com/} and before \code{apt}, \code{yum}, \code{freebsd} or \code{osx}, so URL will look like: \\\\ 147 | \code{http://ppa.moosefs.com/current/[rest of url]} \\ 148 | 149 | It is also possible to use version number instead of "branch" if you want to upgrade to a specific version of MooseFS (e.g. 2.0.40-1): 150 | 151 | \code{http://ppa.moosefs.com/2.0.40/[rest of url]} 152 | 153 | If you want to use this option, please remember you need to manually change version number on each server to the selected one before doing an upgrade. 154 | 155 | \chapter{MooseFS 2.0 (CE and Pro) upgrade to newer version} 156 | This chapter describes general MooseFS Upgrade process (e.g. from version 2.0.34 to 2.0.35). Upgrading to newer version is very simple, but \textbf{there is one important thing}: the order of upgrade. 
We only support the following upgrade order: 157 | \begin{enumerate} 158 | \item Upgrading all MooseFS master follower(s) (one-by-one) (Pro only) 159 | \item Upgrading MooseFS leader master and MooseFS tools (cgi \& cgiserv, cli, supervisor (Pro only)) 160 | \item Upgrading all MooseFS metaloggers (one-by-one) 161 | \item Upgrading all MooseFS chunkservers (one-by-one) 162 | \item Upgrading all MooseFS clients (one-by-one) 163 | \end{enumerate} 164 | 165 | \bigskip 166 | \bigskip 167 | On every server upgrading comes down to running two commands:\\ 168 | \code{\# apt-get update \\ 169 | \# apt-get upgrade} 170 | 171 | \bigskip 172 | \bigskip 173 | The second command listed above will upgrade all packages installed on your system. If you want to upgrade only one package (e.g. \code{moosefs-pro-master}), please run e.g.: \\ 174 | \code{\# apt-get install moosefs-pro-master} 175 | 176 | -- this will upgrade only the specified package to the newest version. 177 | 178 | \chapter{MooseFS 1.6.27-5 to 2.0.x CE upgrade} 179 | \textbf{Notice: It is not possible to roll-back to older (1.6.27-5) version after this upgrade!} 180 | \bigskip 181 | 182 | \bigskip 183 | \section{Making backup of data and configuration files} 184 | Now you \textbf{need to} back-up configuration files in a safe place, as described in \textbf{Paragraph 1.1} above. The backup should be done on \textbf{all servers}, i.e. configuration and metadata from \textbf{each} server should be backed-up. 
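A minimal sketch of such a backup is shown below. It assumes the default package paths named in Paragraph 1.1 (\code{/etc/mfs} and \code{/var/lib/mfs}); the archive location under \code{/root} and the \code{BACKUP} variable are only illustrations. For a 1.6.27-5 installation from sources, substitute the paths used on your system (e.g. under \code{/usr/local}).

\begin{lstlisting}
# BACKUP=/root/mfs-backup-$(hostname)-$(date +%Y%m%d).tar.gz
# tar -czf "$BACKUP" /etc/mfs /var/lib/mfs
# tar -tzf "$BACKUP" | head
\end{lstlisting}

The last command only lists the first entries of the archive as a sanity check. Copy the archive to another machine afterwards, so that the backup does not share a disk with the files it protects.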
185 | 186 | \section{Upgrading MooseFS master server: stopping, removing, installing, configuring and starting up} 187 | \bigskip 188 | \subsection{Stopping} 189 | In order to stop \code{mfsmaster}, run the following command: \\ \code{\# mfsmaster stop} 190 | 191 | \bigskip 192 | \subsection{Removing} 193 | \begin{itemize} 194 | \item If you have installed MooseFS 1.6.27-5 from packages: \\ \code{\# apt-get purge --purge mfs-master} 195 | \item or:\\ \code{\# dpkg -P mfs-master} 196 | \item Otherwise, if MooseFS has been compiled and installed from sources, you have to remove all MooseFS binaries (\code{mfsmaster}, \code{mfscgi}, ...) (most likely) from \code{/usr/sbin}, \code{/usr/local/bin} or \code{/usr/local/sbin}. 197 | \end{itemize} 198 | \bigskip 199 | \subsection{Installing new version of MooseFS master server} 200 | The next step to do is installing MooseFS 2.0 CE master server. For example on Debian/Ubuntu you can do it by running the following command: 201 | 202 | \bigskip 203 | \code{\# apt-get install moosefs-master} 204 | 205 | \bigskip 206 | \subsection{Copying and comparing configuration files} 207 | It is assumed, that old configuration is in \code{/etc} (it may exist in \code{/etc}, \code{/usr/local/etc}, or \code{/usr/local/etc/mfs} depending on the way MooseFS has been installed, so make sure where your configuration files are and make changes in following commands). Metadata should exist in \code{/var/lib/mfs}. If metadata files are in different location, please move them to \code{/var/lib/mfs}. 
208 | \begin{itemize} 209 | \item The next step is to move the old configuration files: \\ 210 | \code{\# mv /etc/mfsmaster.cfg /etc/mfs/mfsmaster.cfg.old} \\ 211 | \code{\# mv /etc/mfsexports.cfg /etc/mfs/mfsexports.cfg.old} \\ 212 | \code{\# mv /etc/mfstopology.cfg /etc/mfs/mfstopology.cfg.old} \\ 213 | 214 | And copy the examples as new configuration files: 215 | 216 | \code{\# cp -i /etc/mfs/mfsmaster.cfg.dist /etc/mfs/mfsmaster.cfg} \\ 217 | \code{\# cp -i /etc/mfs/mfsexports.cfg.dist /etc/mfs/mfsexports.cfg} \\ 218 | \code{\# cp -i /etc/mfs/mfstopology.cfg.dist /etc/mfs/mfstopology.cfg} 219 | 220 | 221 | \item The last step before starting up the new master server is to \textbf{compare new configuration files with old ones} and \textbf{make changes in new configuration}. In this step you need to carry over every parameter you modified in the old file to the new one: \\ 222 | \code{\# vim -d /etc/mfs/mfsmaster.cfg /etc/mfs/mfsmaster.cfg.old} \\ 223 | \textbf{Perform this step for every configuration file.} 224 | \end{itemize} 225 | 226 | \bigskip 227 | \subsection{Starting up master server} 228 | The last step is to start up the \code{mfsmaster} process. 229 | 230 | \bigskip 231 | \bigskip 232 | \section{Upgrading the rest of MooseFS modules} 233 | Now you should upgrade the rest of MooseFS modules.
It should be done analogously to the MooseFS master server upgrade, in the following order: 234 | \begin{itemize} 235 | \item Make sure you have backed up configuration and metadata files 236 | \item Stop the MooseFS module process 237 | \item Remove the MooseFS module binaries 238 | \item Install version 2.0.x of the MooseFS module 239 | \item Copy the old configuration files, \textbf{compare with new} and \textbf{make proper changes to new configuration files}, and make sure metadata and changelog files are in \code{/var/lib/mfs} (on metaloggers) 240 | \item Start the module's process 241 | \end{itemize} 242 | 243 | \bigskip 244 | \bigskip 245 | The proper module upgrade order is: 246 | \begin{itemize} 247 | \item MooseFS metalogger(s) (one-by-one) 248 | \item MooseFS chunkservers (one-by-one) 249 | \item MooseFS clients (mounts) (one-by-one) 250 | \end{itemize} 251 | 252 | 253 | \chapter{MooseFS 2.0 CE to Pro HA migration} 254 | \textbf{Notes:} 255 | \begin{itemize} 256 | \item MooseFS PRO \textbf{depends on DNS}, so make sure that your local DNS server is running and resolving the name \code{mfsmaster} (or other chosen host name) properly. 257 | \item In the Pro version it is not possible to use the \code{/etc/hosts} file to resolve \code{mfsmaster} (or other chosen name). After making sure your DNS is working properly, you must remove entries pointing to the master host from \code{/etc/hosts}. 258 | \item Before you start the migration it is good to know that during the package uninstall process files like \code{/etc/default/moosefs-master} and \code{/etc/init.d/moosefs-master} will be removed. Configuration and metadata files will not be deleted during migration, but it is a good idea to back them up first.
259 | \end{itemize} 260 | 261 | \bigskip 262 | 263 | 264 | \section{MFS Master migration} 265 | \begin{itemize} 266 | \item Stop MooseFS CE master: \code{mfsmaster -c /etc/mfs/mfsmaster.cfg stop} 267 | \item Uninstall \code{moosefs-master}, \code{moosefs-cli}, \code{moosefs-cgiserv}, \code{moosefs-cgi}, \code{moosefs-supervisor} packages 268 | \item Install \code{moosefs-pro-master}, \code{moosefs-pro-supervisor}, \code{moosefs-pro-cgi}, \code{moosefs-pro-cgiserv}, \code{moosefs-pro-cli} packages 269 | \item Restart MooseFS cgi server: \code{mfscgiserv restart} 270 | \item Place \code{mfslicence.bin} file in \code{/etc/mfs} folder 271 | \item Start mfsmaster pro version: \code{mfsmaster -c /etc/mfs/mfsmaster.cfg start} 272 | \end{itemize} 273 | 274 | \section{Upgrading chunkservers} 275 | It is important to perform the chunkserver migration before adding a master FOLLOWER in HA configuration. Only pro chunkservers are capable of switching to a new LEADER master during a failure. 276 | 277 | Chunkservers can be updated one by one on a running system \textbf{only} if the minimum goal of all files in the system is at least \textbf{2}. Otherwise you need to stop the master before performing any chunkserver updates. 278 | 279 | If the value \code{MFSCHUNKSERVER\_ENABLE} in the \code{/etc/default/moosefs-chunkserver} file is set to \code{true}, then during the uninstall process mfschunkserver will be stopped and then started automatically. If not, you need to restart it manually.
280 | 281 | \begin{itemize} 282 | \item Stop chunkserver: \code{mfschunkserver stop} 283 | \item Uninstall \code{moosefs-chunkserver} from system 284 | \item Install \code{moosefs-pro-chunkserver} 285 | \item Start chunkserver: \code{mfschunkserver start} 286 | \end{itemize} 287 | 288 | \section{Upgrading clients (mounts)} 289 | \begin{itemize} 290 | \item Unmount mfs mountpoints: \code{umount /mnt/mfs} 291 | \item Remove \code{moosefs-client} from system 292 | \item Install \code{moosefs-pro-client} 293 | \item Mount MooseFS: \code{mfsmount /mnt/mfs} \\\\ 294 | Alternatively you can add this entry to \code{/etc/fstab} to mount MooseFS automatically when system starts: \\ 295 | 296 | \begin{itemize} 297 | \item For Linux: \code{mfsmount /mnt/mfs fuse defaults 0 0} 298 | \item For FreeBSD: \code{mfsmount\_magic /mnt/mfs moosefs rw,mfsmaster=mfsmaster,\\mountprog=/usr/local/bin/mfsmount,late 0 0} 299 | \end{itemize} 300 | \end{itemize} 301 | 302 | \section{Adding master followers} 303 | This process is described below in \textbf{Chapter 6: Adding master follower(s) server(s) procedure}. 304 | To add master followers, jump to Chapter 6 now and follow the instructions given there. 305 | 306 | \chapter{MooseFS 2.0 Pro HA to CE migration} 307 | MooseFS 2.0 is fully switchable between the CE and PRO versions, so all your data is safe and there is no reason to be afraid to test the PRO version and then switch to the CE version. 308 | Remember that the most important thing in the migration is to preserve the order of these steps: 309 | 310 | \section{Checking LEADER master} 311 | If your LEADER master is not running on the selected hardware machine (i.e. the one that will become the CE master), switch it using the \code{mfssupervisor} command.
For example, your LEADER master is running on the machine with IP 192.168.0.2 and your CE master needs to be on the machine with IP 192.168.0.1.\\\\ 312 | \code{mfssupervisor -l 192.168.0.1}\\\\ 313 | Now wait a few seconds for chunkservers and clients to switch their connections to the new master LEADER. 314 | 315 | \section{Changing DNS entry} 316 | When your master LEADER is on the selected machine, it is time to remove all 'extra masters' -- FOLLOWERS entries from your DNS configuration. Leave only one mfsmaster IP. Now restart the DNS service. Remember that propagation of new DNS entries can take some time and it depends on your TTL DNS configuration. 317 | 318 | \section{Stopping FOLLOWERS} 319 | 320 | The next step in the migration is to stop all 'extra masters' -- FOLLOWERS in your cluster. On all master FOLLOWER machines stop the mfsmaster process:\\\\ 321 | \code{mfsmaster -c /etc/mfs/mfsmaster.cfg stop} 322 | 323 | \section{Migrating master} 324 | This step stops the master for the duration of the installation. The whole MFS cluster will be down!\\ 325 | \begin{itemize} 326 | \item Stop mfsmaster on LEADER machine\\ \code{mfsmaster -c /etc/mfs/mfsmaster.cfg stop} 327 | \item Remove \code{moosefs-pro-*} packages from system 328 | \item Install \code{moosefs-}: \code{master}, \code{cgi}, \code{cgiserv}, \code{cli} packages 329 | \item Restart \code{cgiserv} (\code{mfscgiserv restart}) 330 | \item Start \code{mfsmaster} CE version \code{mfsmaster -c /etc/mfs/mfsmaster.cfg start} 331 | \end{itemize} 332 | Now your master is running in the CE version, and CGI and cli no longer display LEADER -- FOLLOWER information in tables. 333 | 334 | \section{Migrating metaloggers} 335 | Installation of metaloggers in an HA cluster is not necessary, but in a CE configuration it is a strongly recommended option. Without metaloggers, your metadata files are saved on one physical machine only.
Running a metalogger on another machine is the best solution for metadata backup.\\ 336 | If a metalogger was running in your PRO version, perform these steps to migrate: 337 | 338 | \begin{itemize} 339 | \item Stop metalogger \code{mfsmetalogger -c /etc/mfs/mfsmetalogger.cfg stop} 340 | \item Remove \code{moosefs-pro-metalogger} from system 341 | \item Install \code{moosefs-metalogger} 342 | \item Start metalogger \code{mfsmetalogger -c /etc/mfs/mfsmetalogger.cfg start} 343 | \end{itemize} 344 | 345 | If a metalogger was not running in your PRO version, install one now -- refer to chapter \textbf{3.4} of "\textbf{Installing MooseFS 2.0 346 | Step by Step Tutorial}" 347 | 348 | \section{Migrating chunkservers} 349 | Now it's time to perform the chunkserver migration. It can be done on a running system if the minimum goal of all files is at least 2; in that case the chunkservers can be updated one by one without stopping the cluster. If not, we suggest stopping the master server before starting the chunkserver migration. 350 | \begin{itemize} 351 | \item Stop chunkserver \code{mfschunkserver -c /etc/mfs/mfschunkserver.cfg stop} 352 | \item Remove \code{moosefs-pro-chunkserver} from system 353 | \item Install \code{moosefs-chunkserver} 354 | \item Start chunkserver \code{mfschunkserver -c /etc/mfs/mfschunkserver.cfg start} 355 | \end{itemize} 356 | 357 | \textbf{Repeat these steps for every chunkserver.} 358 | 359 | \section{Migrating clients (mounts)} 360 | This step is similar to the chunkserver migration. First you need to unmount mfs and then uninstall the \code{moosefs-pro-client} package from the system.
361 | \begin{itemize} 362 | \item Unmount all mount points for MooseFS 363 | \item Uninstall \code{moosefs-pro-client} 364 | \item Install \code{moosefs-client} 365 | \item Mount MooseFS: \code{mfsmount /mnt/mfs} \\\\ 366 | Alternatively you can add this entry to \code{/etc/fstab} to mount MooseFS automatically when system starts: \\ 367 | \begin{itemize} 368 | \item For Linux: \code{mfsmount /mnt/mfs fuse defaults 0 0} 369 | \item For FreeBSD: \code{mfsmount\_magic /mnt/mfs moosefs rw,mfsmaster=mfsmaster,\\mountprog=/usr/local/bin/mfsmount,late 0 0} 370 | \end{itemize} 371 | \end{itemize} 372 | 373 | \chapter{Adding master follower(s) server(s) procedure (Pro only)} 374 | \section{Installing 'extra masters'} 375 | Please remember that \textbf{all master servers need to have the same configuration and licence files} and identical or very similar hardware configuration. For example, if your current MASTER has 64GB RAM and 256GB HDD, other masters need to have at least the same (or greater) amount of RAM and HDD space. 376 | 377 | Let's start by adding an 'extra master' to your cluster: 378 | \begin{itemize} 379 | \item Install \code{moosefs-pro-master}, \code{moosefs-pro-supervisor}, \\ \code{moosefs-pro-cli} 380 | \item Copy configuration from mfsmaster\\ \code{scp root@mfsmaster:"/etc/mfs/mfsmaster.cfg /etc/mfs/mfsexports.cfg \\ /etc/mfs/mfstopology.cfg /etc/mfs/mfslicence.bin" /etc/mfs} 381 | 382 | \item The first start of a Follower Master Server should be performed with the \code{-e} option to let it download metadata from the Leader Master Server. You can achieve this by issuing the following command: \\ 383 | \code{mfsmaster -e -c /etc/mfs/mfsmaster.cfg start} 384 | \end{itemize} 385 | 386 | Your new master is a FOLLOWER and is downloading metadata from the LEADER master, but it is not yet visible in CGI and CLI tables. Now it's time to add a DNS entry for the new master server.
387 | 388 | \section{Adding DNS entries} 389 | To start working with HA and 'extra masters' -- FOLLOWERS in your cluster, you need to point one DNS name at the IP addresses of all masters. 390 | 391 | For example, if your configuration contains three masters, your entries should look similar to these:\\\\ 392 | \code{ 393 | mfsmaster IN A 192.168.0.1;\\ 394 | mfsmaster IN A 192.168.0.2;\\ 395 | mfsmaster IN A 192.168.0.3;\\ 396 | \\ 397 | mfsmaster1 IN A 192.168.0.1;\\ 398 | mfsmaster2 IN A 192.168.0.2;\\ 399 | mfsmaster3 IN A 192.168.0.3;\\ 400 | } 401 | 402 | Now it is time to restart the DNS service and wait for it to propagate the changes in your network. This step can take some time depending on TTL settings. The whole system will be fully-HA after DNS propagation. 403 | 404 | You can test whether DNS propagation is done in your LAN. Just run the \code{host mfsmaster} command on several machines in the LAN. If you see three addresses listed for the hostname, it means that the DNS cache is updated and MFS is now running in HA configuration. 405 | 406 | Now check CGI or use CLI (\code{mfscli -SIM}) to see if your LEADER and FOLLOWERS are visible.
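The propagation check described above can also be repeated from the command line. The following is only a sketch: it assumes the \code{mfsmaster} name and the three-master example from this section, and that \code{dig} (part of the \code{dnsutils} package on Debian/Ubuntu) is installed on the machine you test from.

\begin{lstlisting}
# dig +short mfsmaster | sort
# dig +short mfsmaster | wc -l
# mfscli -SIM
\end{lstlisting}

Once propagation has finished, the first command should list all three example addresses and the second should count them. Run the check on several machines, because each one may still hold cached answers until the TTL expires.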
407 | 408 | \end{document} 409 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # MooseFS Documentation 2 | 3 | -------------------------------------------------------------------------------- /images/cgi_labelsAB.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/cgi_labelsAB.png -------------------------------------------------------------------------------- /images/cgi_labelsAB_data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/cgi_labelsAB_data.png -------------------------------------------------------------------------------- /images/cgi_nolabels.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/cgi_nolabels.png -------------------------------------------------------------------------------- /images/cgi_resources1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/cgi_resources1.png -------------------------------------------------------------------------------- /images/cgi_resources2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/cgi_resources2.png -------------------------------------------------------------------------------- /images/diagram_A_B_ssd_hdd_v2.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/diagram_A_B_ssd_hdd_v2.png -------------------------------------------------------------------------------- /images/diagram_A_B_v2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/diagram_A_B_v2.png -------------------------------------------------------------------------------- /images/diagram_ssd_hdd_v2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/diagram_ssd_hdd_v2.png -------------------------------------------------------------------------------- /images/diagram_with_labels.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/diagram_with_labels.png -------------------------------------------------------------------------------- /images/diagram_without_labels.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/diagram_without_labels.png -------------------------------------------------------------------------------- /images/mfsscr1_blur.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/mfsscr1_blur.png -------------------------------------------------------------------------------- /images/mfsscr2_blur.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/mfsscr2_blur.png -------------------------------------------------------------------------------- /images/mfsscr3_blur.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/mfsscr3_blur.png -------------------------------------------------------------------------------- /images/moosefs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/moosefs.png -------------------------------------------------------------------------------- /images/read_mfs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/read_mfs.png -------------------------------------------------------------------------------- /images/write_mfs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moosefs/moosefs-documentation/39f4bc2f644a1c8eff4dd7f4f7351ef705fae9bd/images/write_mfs.png --------------------------------------------------------------------------------