├── .gitignore
├── LICENSE
├── README.md
├── checkmk
│   ├── README.md
│   └── thelinuxguy-hd-idle
│       └── thelinuxguy_hd-idle
├── disaster_recovery_snapraid-btrfs.md
├── filemover
│   ├── cache-bulk-mover-fast-unsafe.sh
│   ├── uncache-mover.py
│   └── zfs-uncache-mover.py
├── img
│   ├── hardware-10g-10w-idle.png
│   └── intel-gpu-top.png
├── install_steps.md
├── maintenance_tasks.md
├── mergerfs.md
├── miscelaneous_sysadmin.md
├── nfs-check-loop.sh
├── nfs.md
├── performance_benchmarks.md
├── plex_mediaserver_lxc_hw_transcoding.md
├── proxmox.md
├── shares_configuration.md
├── snapraid-cheatsheet.md
├── snapraid_btrfs_runner.md
├── storage_tiered_cache.md
└── to_be_added.md

/.gitignore:
--------------------------------------------------------------------------------
1 | 
2 | compose-photostructure.yaml
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 |                     GNU GENERAL PUBLIC LICENSE
2 |                        Version 3, 29 June 2007
3 | 
4 |  Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
5 |  Everyone is permitted to copy and distribute verbatim copies
6 |  of this license document, but changing it is not allowed.
7 | 
8 |                             Preamble
9 | 
10 |   The GNU General Public License is a free, copyleft license for
11 | software and other kinds of works.
12 | 
13 |   The licenses for most software and other practical works are designed
14 | to take away your freedom to share and change the works.  By contrast,
15 | the GNU General Public License is intended to guarantee your freedom to
16 | share and change all versions of a program--to make sure it remains free
17 | software for all its users.  We, the Free Software Foundation, use the
18 | GNU General Public License for most of our software; it applies also to
19 | any other work released this way by its authors.  You can apply it to
20 | your programs, too.
21 | 
22 |   When we speak of free software, we are referring to freedom, not
23 | price.  Our General Public Licenses are designed to make sure that you
24 | have the freedom to distribute copies of free software (and charge for
25 | them if you wish), that you receive source code or can get it if you
26 | want it, that you can change the software or use pieces of it in new
27 | free programs, and that you know you can do these things.
28 | 
29 |   To protect your rights, we need to prevent others from denying you
30 | these rights or asking you to surrender the rights.  Therefore, you have
31 | certain responsibilities if you distribute copies of the software, or if
32 | you modify it: responsibilities to respect the freedom of others.
33 | 
34 |   For example, if you distribute copies of such a program, whether
35 | gratis or for a fee, you must pass on to the recipients the same
36 | freedoms that you received.  You must make sure that they, too, receive
37 | or can get the source code.  And you must show them these terms so they
38 | know their rights.
39 | 
40 |   Developers that use the GNU GPL protect your rights with two steps:
41 | (1) assert copyright on the software, and (2) offer you this License
42 | giving you legal permission to copy, distribute and/or modify it.
43 | 
44 |   For the developers' and authors' protection, the GPL clearly explains
45 | that there is no warranty for this free software.  For both users' and
46 | authors' sake, the GPL requires that modified versions be marked as
47 | changed, so that their problems will not be attributed erroneously to
48 | authors of previous versions.
49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. 
"Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 
174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 
234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 
296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. 
If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 
414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. 
The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. 
You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 
583 | 
584 |   Later license versions may give you additional or different
585 | permissions.  However, no additional obligations are imposed on any
586 | author or copyright holder as a result of your choosing to follow a
587 | later version.
588 | 
589 |   15. Disclaimer of Warranty.
590 | 
591 |   THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592 | APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596 | PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597 | IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599 | 
600 |   16. Limitation of Liability.
601 | 
602 |   IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610 | SUCH DAMAGES.
611 | 
612 |   17. Interpretation of Sections 15 and 16.
613 | 
614 |   If the disclaimer of warranty and limitation of liability provided
615 | above cannot be given local legal effect according to their terms,
616 | reviewing courts shall apply local law that most closely approximates
617 | an absolute waiver of all civil liability in connection with the
618 | Program, unless a warranty or assumption of liability accompanies a
619 | copy of the Program in return for a fee.
620 | 
621 |                      END OF TERMS AND CONDITIONS
622 | 
623 |             How to Apply These Terms to Your New Programs
624 | 
625 |   If you develop a new program, and you want it to be of the greatest
626 | possible use to the public, the best way to achieve this is to make it
627 | free software which everyone can redistribute and change under these terms.
628 | 
629 |   To do so, attach the following notices to the program.  It is safest
630 | to attach them to the start of each source file to most effectively
631 | state the exclusion of warranty; and each file should have at least
632 | the "copyright" line and a pointer to where the full notice is found.
633 | 
634 |     <one line to give the program's name and a brief idea of what it does.>
635 |     Copyright (C) <year>  <name of author>
636 | 
637 |     This program is free software: you can redistribute it and/or modify
638 |     it under the terms of the GNU General Public License as published by
639 |     the Free Software Foundation, either version 3 of the License, or
640 |     (at your option) any later version.
641 | 
642 |     This program is distributed in the hope that it will be useful,
643 |     but WITHOUT ANY WARRANTY; without even the implied warranty of
644 |     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
645 |     GNU General Public License for more details.
646 | 
647 |     You should have received a copy of the GNU General Public License
648 |     along with this program.  If not, see <https://www.gnu.org/licenses/>.
649 | 
650 | Also add information on how to contact you by electronic and paper mail.
651 | 
652 |   If the program does terminal interaction, make it output a short
653 | notice like this when it starts in an interactive mode:
654 | 
655 |     <program>  Copyright (C) <year>  <name of author>
656 |     This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657 |     This is free software, and you are welcome to redistribute it
658 |     under certain conditions; type `show c' for details.
659 | 
660 | The hypothetical commands `show w' and `show c' should show the appropriate
661 | parts of the General Public License.  Of course, your program's commands
662 | might be different; for a GUI interface, you would use an "about box".
663 | 
664 |   You should also get your employer (if you work as a programmer) or school,
665 | if any, to sign a "copyright disclaimer" for the program, if necessary.
666 | For more information on this, and how to apply and follow the GNU GPL, see
667 | <https://www.gnu.org/licenses/>.
668 | 
669 |   The GNU General Public License does not permit incorporating your program
670 | into proprietary programs.  If your program is a subroutine library, you
671 | may consider it more useful to permit linking proprietary applications with
672 | the library.  If this is what you want to do, use the GNU Lesser General
673 | Public License instead of this License.  But first, please read
674 | <https://www.gnu.org/philosophy/why-not-lgpl.html>.
675 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # free-unraid
2 | 
3 | This repository is a `work-in-progress` of notes and experiments attempting to replicate UNRAID features using only open-source tools, scripts and utilities.
4 | 
5 | This repository makes no guarantees; use the information presented here at your own risk. Do you wish to contribute? Submit a pull request.
6 | 
7 | #### 12/25/2022 update
8 | 
9 | My originally intended BTRFS+ZFS configuration, while functional, gave me a lot of instability issues with NFS and sent me down several *troubleshooting rabbit holes* - you will find notes about those issues scattered across the individual markdown sections in this repository.
10 | 
11 | This also means that the "ideal setup" below is not up to date with my `real` current configuration. Right now my setup is:
12 | - /cache mdadm RAID1 NVME XFS mirror (bitmap disabled to improve performance)
13 | - XFS filesystem on all physical hard disks. This means I am not using `snapraid-btrfs`, just plain snapraid+snapraid-runner.
14 | 
15 | Once my setup has accumulated several `airmiles` of stability - no crashes while sharing over NFS/SMB - I will do a cleanup of all markdown files.
16 | 
17 | See [install_steps.md](install_steps.md) for a quick and dirty guide on standing up my current environment. Other sections may have more details on the specifics (e.g. mergerfs create policies and other settings).
18 | 
19 | ## The "unraid" ideal setup
20 | 
21 | After several years running ZFS arrays, in 2022 I decided I wanted to experiment and take a different approach to my home media server, which as of this writing holds 30TB of digital media on a ZFS array.
22 | 
23 | ZFS is a robust filesystem and solution, but it requires having all of your hard drives spinning 24x7 when serving data off the array. More often than not I have about 2-3 concurrent Plex streams reading media files that could be served from a single hard drive rather than 5 disks. Therefore, one of my goals for 2022 is to lower the power consumption of my 24/7 home server.
24 | 
25 | ### What's important for me?
26 | 
27 | 1. **Low power consumption for 24/7 operation**.
This means most hard drives must be spun down.
28 | 1. **Must run open-source or free software (no licenses)**. Linux, ideally.
29 | 1. **Some protection against bit-rot and checksumming of my data files**.
30 | 1. **The file system should be able to detect hardware errors** (automatic repair is not necessary, since this isn't an array).
31 | 1. **I should be able to recover my files from a single-disk catastrophic event** (e.g. hardware failure, or bitrot of a specific data block).
32 | 1. **NVME caching (tiered storage)**. I want to write all new files to superfast storage (NVME) and later 'archive' my data onto spinning hard drives.
33 | 
34 | ### TheLinuxGuy's ~10 watt 10Gb Raptor Lake 2023 Build
35 | 
36 | ![TLG 10 watt NAS](./img/hardware-10g-10w-idle.png)
37 | 
38 | Motherboard: Asus Prime Z790M-Plus D4
39 | - HW Version: Rev 1.xx
40 | - BIOS version 02/22/2023 v0810
41 | 
42 | CPU: Intel Raptor Lake i5-13400
43 | 
44 | CPU cooler: stock Intel cooler
45 | 
46 | RAM: 16GB (2x8GB) G.Skill F4-3000C16-16GISB Aegis DDR4-3000 CL16-18-18-38 1.35V
47 | 
48 | PSU: RGEEK 12V 300W Pico ATX
49 | 
50 | SSD: 1TB Crucial P3 NVMe
51 | 
52 | 10Gb NIC: Intel X710-DA2 dual SFP+
53 | 
54 | Idle power: ~10 watts (~5 watts without the 10Gb NIC)
55 | 
56 | ### The software recipe
57 | 
58 | The following open-source projects help reach my goals. It takes some elbow grease to stitch them all together manually and mimic UNRAID; see the `/etc/fstab` sketch after this list for how the pieces fit together.
59 | 
60 | - [SnapRAID](https://www.snapraid.it). Provides data parity, backups, and checksumming of existing backups.
61 |   - [Claims to be better than UNRAID's](https://www.snapraid.it/compare) own parity system, with the ability to 'fix silent errors' and 'verify file integrity', among others.
62 | - [BTRFS Filesystem](https://btrfs.wiki.kernel.org/index.php/Main_Page). Similar to ZFS in that it provides the ability to 'send/receive' data streams (ala `zfs send`), with the added benefit that I can run individual `disk scrubs` to detect hardware issues that require me to restore from snapraid parity. **My observed Btrfs performance is poor compared to the XFS filesystem on Linux.** *Since we use btrfs only for the 'data' disks in the slow mergerfs pool, we are not sensitive to speed.*
63 | - **XFS filesystem for the NVME cache on an mdadm array**. After finding bugs and instability in my ZFS+NFS+mergerfs implementation, my cache disks are now formatted XFS in RAID1. I did not use native btrfs raid1 here because btrfs performance was poor (50% throughput penalty). XFS was able to match ZFS raw speeds (without ARC), ~900MB/s.
64 | - [MergerFS](https://github.com/trapexit/mergerfs). FUSE filesystem that allows me to 'stitch together' multiple hard drives with different mountpoints and takes care of directing I/O operations based on a set of rules/criteria/policies.
65 | - [snapraid-btrfs](https://github.com/automorphism88/snapraid-btrfs). Automation and helper script for BTRFS-based snapraid configurations. Using BTRFS snapshots as the data source for running 'snapraid sync' allows me to continue using my system 24/7, without data corruption risks or downtime, while I build my parity/snapraid backups.
66 | - [snapraid-btrfs-runner](https://github.com/fmoledina/snapraid-btrfs-runner). Helper script that runs `snapraid-btrfs`, sending its output to the console, a log file and email.
67 | - [hd-idle](https://github.com/adelolmo/hd-idle). Helper daemon running under systemd, ensuring that spinning hard drives are spun down (standby) and put into a lower power consumption state.
68 | - [btrfs-list](https://github.com/speed47/btrfs-list). Script providing a nice tree-style view of btrfs subvolumes/snapshots (ala `zfs list`).
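
To make the recipe concrete, here is a hypothetical `/etc/fstab` sketch of how the data disks and the mergerfs pool can be wired together. The device labels, mountpoints and mergerfs policy options shown are illustrative assumptions, not my exact configuration - see [mergerfs.md](mergerfs.md) for the create policies and settings I actually use.

```
# Individual btrfs data disks (hypothetical labels)
LABEL=disk1  /mnt/disk1         btrfs          defaults,noatime  0 0
LABEL=disk2  /mnt/disk2         btrfs          defaults,noatime  0 0
# Pool the data disks into a single mergerfs mount (policies are examples)
/mnt/disk*   /mnt/slow-storage  fuse.mergerfs  allow_other,use_ino,cache.files=off,category.create=mfs,moveonenospc=true,minfreespace=100G,fsname=mergerfs  0 0
```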
69 | 
70 | ## OS install and base packages install
71 | 
72 | ```
73 | apt-get install zfsutils-linux cockpit-pcp btrfs-progs libbtrfsutil1 btrfs-compsize duc smartmontools
74 | ```
75 | 
76 | ## ~~ZFS cache pool setup~~
77 | **WARNING! DEPRECATED** NFS+ZFS is unstable with this setup. Follow XFS+mdadm below.
78 | 
79 | RAID1 of two SSD disks. We write all new data here, then purge it to the slower 'cold-storage' disks via cron.
80 | 
81 | ```
82 | zpool create -o ashift=12 cache mirror /dev/sdb /dev/nvme0n1
83 | ```
84 | 
85 | ## XFS RAID1 mirror mdadm
86 | 
87 | See [mergerfs](mergerfs.md) for details on the ZFS instability. For our cache pool we will use the XFS filesystem. Set up the NVME cache as follows:
88 | 
89 | ```
90 | mdadm --create --verbose /dev/md0 --bitmap=none --level=mirror --raid-devices=2 /dev/nvme0n1 /dev/sdb
91 | mkfs.xfs -f -L cache /dev/md0
92 | mdadm --detail /dev/md0
93 | ```
94 | 
95 | Remember to add an `/etc/fstab` entry so the array is mounted at boot.
96 | 
97 | ## BTRFS (disk setup guide)
98 | 
99 | ### BTRFS Commands TL;DR
100 | 
101 | ```
102 | btrfs device scan
103 | ```
104 | 
105 | We will format the entire disk without a partition scheme.
106 | 
107 | ```
108 | mkfs.btrfs -L disk1 /dev/sdb
109 | mkdir /mnt/disk1
110 | mount /dev/sdb /mnt/disk1
111 | mkfs.btrfs -L disk2 /dev/sdc
112 | mkdir /mnt/disk2
113 | mount /dev/sdc /mnt/disk2
114 | ```
115 | 
116 | Confirm:
117 | 
118 | ```
119 | btrfs filesystem show
120 | btrfs filesystem usage /mnt/disk1
121 | btrfs filesystem usage /mnt/disk2
122 | ```
123 | 
124 | ## Disk scrubbing on BTRFS
125 | 
126 | ```
127 | btrfs scrub start /mnt/disk1
128 | btrfs scrub status /mnt/disk1
129 | ```
130 | 
131 | ## Checking tiered storage mover process state
132 | 
133 | ```
134 | ps -fp $(cat /var/run/mover.pid)
135 | ```
--------------------------------------------------------------------------------
/checkmk/README.md:
--------------------------------------------------------------------------------
1 | # Monitoring / CheckMK
2 | 
3 | ### Plugins
4 | 
5 | https://exchange.checkmk.com/p/btrfs-health
6 | 
7 | ## hd-idle (custom solution)
8 | 
9 | ### Identify log lines
10 | 
11 | ```
12 | grep -o --perl-regexp 'hd-idle\[\d+]: \K.*' /var/log/syslog
13 | ```
--------------------------------------------------------------------------------
/checkmk/thelinuxguy-hd-idle/thelinuxguy_hd-idle:
--------------------------------------------------------------------------------
1 | #!/bin/sh
2 | # github.com/TheLinuxGuy
3 | # Attempt at my first checkmk plugin.
4 | # https://docs.checkmk.com/latest/en/devel_check_plugins.html#_checks_with_more_than_one_service_items_per_host
5 | 
6 | 
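# Below is a minimal sketch (an assumption of one possible approach, not a
# finished plugin) of the agent side: it emits the recent hd-idle syslog
# messages inside a hypothetical <<<thelinuxguy_hd_idle>>> agent section,
# reusing the grep from checkmk/README.md. A matching server-side check
# plugin would still need to parse this section into services.
echo '<<<thelinuxguy_hd_idle>>>'
grep -o --perl-regexp 'hd-idle\[\d+]: \K.*' /var/log/syslog | tail -n 20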
--------------------------------------------------------------------------------
/disaster_recovery_snapraid-btrfs.md:
--------------------------------------------------------------------------------
1 | # Disaster Recovery Scenarios
2 | 
3 | #### Expected behavior
4 | 
5 | 1. Install a new physical hard drive.
6 | 1. Format with btrfs / prep.
7 | 1. Snapraid restores the missing data onto the new physical disk (recovery).
8 | 1. After the restore is complete, we should be back online.
9 | 
10 | ## Helpful sources
11 | 
12 | https://github.com/trapexit/backup-and-recovery-howtos/blob/master/docs/recovery_(mergerfs,snapraid).md
13 | 
14 | ## Pulling out a disk from the array
15 | 
16 | Given the following setup:
17 | ```
18 | /dev/sdc1 17T 2.3T 15T 14% /mnt/parity1
19 | /dev/sdd 7.3T 140M 7.3T 1% /mnt/disk1
20 | /dev/sde 17T 2.4T 15T 15% /mnt/disk2
21 | /dev/sdd 7.3T 140M 7.3T 1% /mnt/snapraid-content/disk1
22 | /dev/sde 17T 2.4T 15T 15% /mnt/snapraid-content/disk2
23 | mergerfs 24T 2.4T 22T 10% /mnt/slow-storage
24 | ```
25 | 
26 | Let's pull `/mnt/disk2` out of the system. As you can see, 2.4TB of data is in use on it.
27 | 
28 | ```
29 | Oct 31 18:26:06 nas kernel: [40691.137391] BTRFS info (device sde): forced readonly
30 | Oct 31 18:26:06 nas kernel: [40691.137392] BTRFS warning (device sde): Skipping commit of aborted transaction.
31 | Oct 31 18:26:06 nas kernel: [40691.137392] BTRFS: error (device sde) in cleanup_transaction:1826: errno=-5 IO failure
32 | Oct 31 18:26:06 nas kernel: [40691.137400] BTRFS info (device sde): delayed_refs has NO entry
33 | Oct 31 18:26:06 nas kernel: [40691.137455] BTRFS error (device sde): commit super ret -5
34 | Oct 31 18:26:06 nas systemd[1]: mnt-disk2.mount: Succeeded.
35 | Oct 31 18:26:06 nas systemd[1]: Unmounted /mnt/disk2.
36 | ```
37 | 
38 | 
39 | #### Procedure
40 | 
41 | 1. Verify the current /etc/snapraid.conf
42 | 
43 | ```
44 | # SnapRAID configuration file
45 | 
46 | # Parity location(s)
47 | 1-parity /mnt/parity1/snapraid.parity
48 | #2-parity /mnt/parity2/snapraid.parity
49 | 
50 | # Content file location(s)
51 | content /var/snapraid.content
52 | content /mnt/snapraid-content/disk1/snapraid.content
53 | content /mnt/snapraid-content/disk2/snapraid.content
54 | 
55 | # Data disks
56 | data d1 /mnt/disk1
57 | data d2 /mnt/disk2
58 | #data d3 /mnt/disk3
59 | #data d4 /mnt/disk4
60 | 
61 | # Excludes hidden files and directories
62 | exclude *.unrecoverable
63 | exclude /tmp/
64 | exclude /lost+found/
65 | exclude downloads/
66 | exclude appdata/
67 | exclude *.!sync
68 | exclude /.snapshots/
69 | ```
70 | 
71 | We're failing `/dev/sde`, aka `/mnt/disk2`, aka `d2` in the config.
72 | 
73 | ### Configure the brand-new physical disk /dev/sdf
74 | 
75 | The `/dev/sde` drive was BTRFS with the label `mergerfsdisk2`; let's format the replacement the same way.
76 | 
77 | ```
78 | mkfs.btrfs -L mergerfsdisk2 /dev/sdf
79 | ```
80 | 
81 | Let's temporarily mount it so we can create subvolumes. The folder `mergerfsdisk2` should already exist; if not, create it.
82 | 
83 | ```
84 | mount /dev/sdf /mnt/btrfs-roots/mergerfsdisk2
85 | ```
86 | 
87 | Now let's create the subvolumes (mountpoints) used for DATA + snapraid.
88 | 
89 | ```
90 | btrfs subvolume create /mnt/btrfs-roots/mergerfsdisk2/data
91 | btrfs subvolume create /mnt/btrfs-roots/mergerfsdisk2/content
92 | umount /mnt/btrfs-roots/mergerfsdisk2
93 | ```
94 | 
95 | We need to recreate the snapper configuration on this new disk to replace the one we pulled out, otherwise `snapraid-btrfs` will fail with errors complaining about /mnt/disk2/.content not being a btrfs_snapshot. See the section `Recreate backup configurations to resume snapraid sync` below.
96 | 
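For reference, the `/etc/fstab` entries behind the mount commands used in the recovery step below might look like this - a sketch assuming the label and subvolume names created above, not a verbatim copy of my fstab:

```
LABEL=mergerfsdisk2  /mnt/disk2                   btrfs  subvol=data,defaults     0 0
LABEL=mergerfsdisk2  /mnt/snapraid-content/disk2  btrfs  subvol=content,defaults  0 0
```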
97 | #### Recovery
98 | 
99 | Let's remount the existing `/etc/fstab` entries that broke when we pulled the drive.
100 | 
101 | ```
102 | mount /mnt/snapraid-content/disk2
103 | mount /mnt/disk2
104 | ```
105 | 
106 | Let's run the recovery from snapraid (running it inside `screen` is recommended). This will read the parity disk and write the recovered data fragments onto the new disk.
107 | 
108 | ```
109 | mkdir -p /root/drecovery/
110 | snapraid -d d2 -l /root/drecovery/snapraid-disk2-fix.log fix
111 | ```
112 | 
113 | Expected output:
114 | 
115 | ```
116 | 100% completed, 0 MB accessed in 2:59
117 | 
118 | 8870882 errors
119 | 8870882 recovered errors
120 | 0 unrecoverable errors
121 | Everything OK
122 | ```
123 | 
124 | Now we need to verify the data (hashing verification stage):
125 | 
126 | `snapraid -d d2 -a check`
127 | 
128 | ##### Permissions on all recovered files will be root/root and must be fixed
129 | 
130 | To fix the Plex library and SMB fileshares, reset the group on all recovered media to the fileshare group. Here /mnt/disk1 was recovered and needs its permissions reset:
131 | 
132 | ```
133 | chgrp -R fileshare /mnt/disk1/media/
134 | ```
135 | 
136 | ## Recreate backup configurations to resume snapraid sync
137 | 
138 | ```
139 | rm /etc/snapper/configs/mergerfsdisk2
140 | ```
141 | 
142 | Delete `mergerfsdisk2` from the `SNAPPER_CONFIGS` variable in `/etc/default/snapper`.
143 | 
144 | Recreate it on the new disk:
145 | 
146 | ```
147 | snapper -c mergerfsdisk2 create-config -t mergerfsdisk /mnt/disk2
148 | # verify
149 | snapper list-configs
150 | ```
151 | 
152 | Now try to run the runner:
153 | 
154 | ```
155 | /usr/bin/python3 /opt/snapraid-btrfs-runner/snapraid-btrfs-runner.py -c /opt/snapraid-btrfs-runner/snapraid-btrfs-runner.conf
156 | ```
157 | 
158 | It should work:
159 | 
160 | ```
161 | 2022-11-01 00:44:52,597 [OUTPUT] Loading state from /var/snapraid.content...
162 | 2022-11-01 00:44:52,597 [OUTERR] WARNING! UUID is unsupported for disks: 'd1', 'd2'. Not using inodes to detect move operations.
163 | 2022-11-01 00:44:52,597 [OUTPUT] Comparing...
164 | 2022-11-01 00:44:52,597 [OUTPUT]
165 | 2022-11-01 00:44:52,597 [OUTPUT] 0 equal
166 | 2022-11-01 00:44:52,597 [OUTPUT] 0 added
167 | 2022-11-01 00:44:52,597 [OUTPUT] 0 removed
168 | 2022-11-01 00:44:52,597 [OUTPUT] 0 updated
169 | 2022-11-01 00:44:52,597 [OUTPUT] 0 moved
170 | 2022-11-01 00:44:52,597 [OUTPUT] 0 copied
171 | 2022-11-01 00:44:52,597 [OUTPUT] 0 restored
172 | 2022-11-01 00:44:52,597 [OUTPUT] No differences
173 | 2022-11-01 00:44:52,904 [INFO  ] ************************************************************
174 | 2022-11-01 00:44:52,904 [INFO  ] Diff results: 0 added, 0 removed, 0 moved, 0 modified
175 | 2022-11-01 00:44:52,904 [INFO  ] No changes detected, no sync required
176 | 2022-11-01 00:44:52,904 [INFO  ] Running cleanup...
177 | 2022-11-01 00:44:53,240 [INFO  ] ************************************************************
178 | 2022-11-01 00:44:53,240 [INFO  ] All done
179 | 2022-11-01 00:44:53,249 [ERROR ] Failed to send email because smtp host is not set
180 | 2022-11-01 00:44:53,249 [INFO  ] Run finished successfully
181 | ```
182 | 
183 | ## Unrecoverable snapraid errors
184 | 
185 | Snapraid marks unrecoverable files by changing their file extension to `.unrecoverable`.
186 | 
187 | To find all files that are possibly lost:
188 | 
189 | ```
190 | find /mnt/disk1 -iname '*.unrecoverable'
191 | ```
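
If only some blocks were flagged during a sync or scrub (rather than a whole disk being lost), snapraid can also retry just the blocks it has marked as bad instead of running a full-disk fix. A sketch of that flow:

```
# fix only the blocks marked as bad in the content file
snapraid -e fix
# re-scrub the previously bad blocks to confirm they now verify
snapraid -p bad scrub
```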
--------------------------------------------------------------------------------
/filemover/cache-bulk-mover-fast-unsafe.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # github.com/TheLinuxGuy Tiered Storage mover (/cache -> /mnt/slow-storage)
3 | 
4 | # This is my customized 'mover' script used for moving files from the cache ZFS pool to the
5 | # main mergerfs pool (/mnt/slow-storage). It is typically invoked via cron and is
6 | # inspired by the unraid mover script.
7 | 
8 | # !!! WARNING !!!
9 | # This script uses rsync --inplace to speed up transactions. Although it does initially
10 | # check for open files with the `fuser` command before attempting to move a file, there's a small
11 | # but possible time window where a file being copied may be accessed by a user. This may lead
12 | # to data corruption of the file being copied, if changes were made before the rsync
13 | # command was able to finish that in-place copy. See: https://explainshell.com/explain?cmd=rsync+--inplace
14 | # !!! WARNING !!!
15 | 
16 | # HOW IT WORKS / WHAT IT DOES
17 | # After checking that it's valid for this script to run, we check each of the top-level
18 | # directories (shares) on the cache disk. Right now this script moves everything out of
19 | # the /cache ZFS pool and into the slower disks.
20 | 
21 | # The script is set up so that hidden directories (i.e., directory names beginning with a '.'
22 | # character) at the topmost level of the cache drive are not moved. This behavior can be
23 | # turned off by uncommenting the following line:
24 | # shopt -s dotglob
25 | 
26 | # Files at the top level of the cache disk are never moved to the array.
27 | 
28 | # The 'find' command generates a list of all files and directories on the cache disk.
29 | # For each file, if the file is not "in use" by any process (as detected by the 'fuser' command),
30 | # then the file is copied to the array, and upon success, deleted from the cache disk.
31 | # For each directory, if the directory is empty, then the directory is created on the array,
32 | # and upon success, deleted from the cache disk.
33 | 
34 | # For each file or directory, we use 'rsync' to copy the file or directory to the array.
35 | # We specify the proper options to rsync so that files and directories get copied to the
36 | # array while preserving ownership, permissions, access times, and extended attributes (this
37 | # is why we use rsync: a simple mv command will not preserve all metadata properly).
38 | 
39 | # If an error occurs in copying (or overwriting) a file from the cache disk to the array, the
40 | # file on the array, if present, is deleted and the operation continues on to the next file.
41 | 
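# EXAMPLE SCHEDULING (illustrative addition, not part of the original script):
# since this is typically invoked via cron, a nightly crontab entry could look
# like the following; the install path and schedule here are hypothetical.
#   30 2 * * * /opt/filemover/cache-bulk-mover-fast-unsafe.sh >> /var/log/mover.log 2>&1
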
-d $MERGERFS_SHARE_PATH ]; then 48 | exit 0 49 | fi 50 | 51 | # If a previous invokation of this script is already running, exit 52 | if [ -f /var/run/mover.pid ]; then 53 | if ps h `cat /var/run/mover.pid` | grep mover ; then 54 | echo "mover already running" 55 | exit 0 56 | fi 57 | fi 58 | echo $$ >/var/run/mover.pid 59 | echo "mover started" 60 | 61 | cd $CACHE_PATH 62 | shopt -s nullglob 63 | for Share in */ ; do 64 | echo "moving \"${Share%/}\"" 65 | find "./$Share" -depth \( \( -type f ! -exec fuser -s {} \; \) -o \( -type d -empty \) \) -print \ 66 | \( -exec rsync -i -dIWRpEAXogt --numeric-ids --inplace {} $MERGERFS_ARCHIVE_PATH \; -delete \) -o \( -type f -exec rm -f /mnt/slow-storage/{} \; \) 67 | done 68 | find $CACHE_PATH -empty -type d -delete 69 | rm /var/run/mover.pid 70 | echo "mover finished" 71 | -------------------------------------------------------------------------------- /filemover/uncache-mover.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | # TheLinuxGuy XFS/mdadm cache pool mergerfs tiered cache mover. 3 | # File age time-based mover depending on goal % cache utilization. 4 | import argparse 5 | import shutil 6 | import subprocess 7 | import syslog 8 | import os 9 | import sys 10 | import time 11 | from pathlib import Path 12 | 13 | CURRENT_PID = str(os.getpid()) 14 | PID_FILE = '/var/run/uncache-mover.pid' 15 | CACHE_PATH = '/cache' 16 | MERGERFS_SLOW = '/mnt/slow-storage/' 17 | 18 | def check_pid(): 19 | """Check that PID file does not exist.""" 20 | try: 21 | with open(PID_FILE) as file: 22 | pid = int(file.readline()) 23 | except OSError: 24 | # PID doesn't exist. 25 | return 26 | print('Fatal error: Mover script already executing. Check PID file.') 27 | sys.exit(1) 28 | 29 | def write_pid(): 30 | """Create a PID File.""" 31 | try: 32 | with open(PID_FILE, "w") as file: 33 | file.write(CURRENT_PID) 34 | except OSError: 35 | print(f"Fatal Error: Unable to write pid file {PID_FILE}") 36 | sys.exit(1) 37 | 38 | 39 | if __name__ == "__main__": 40 | """ 41 | Uncaching utility. This scripts assumes that you have a cache-like 42 | mount point, for which you want to preserve a certain amount of free 43 | space by moving heavy/rarely-accessed files to a slower mount point. 44 | 45 | The script, in its simplest form, can be run as: 46 | 47 | :: 48 | 49 | $ ./mergerfs-uncache.py -s /mnt/cache -d /mnt/slow -t 75 50 | 51 | In this way least accessed files will be moved one after the other 52 | until the percentage of used capacity will be less than the target. 53 | Other options are also available. Please consider this is a work in 54 | progress. 55 | """ 56 | 57 | check_pid() 58 | parser = argparse.ArgumentParser() 59 | parser.add_argument( 60 | "-s", 61 | "--source", 62 | dest="source", 63 | type=Path, 64 | help="Source path (i.e. cache pool root path.", 65 | ) 66 | parser.add_argument( 67 | "-d", 68 | "--destination", 69 | dest="destination", 70 | type=Path, 71 | help="Destination path (i.e. slow pool root path.", 72 | ) 73 | parser.add_argument( 74 | "--num-files", 75 | dest="num_files", 76 | default=-1, 77 | type=int, 78 | help="Maximum number of files moved away from cache.", 79 | ) 80 | parser.add_argument( 81 | "--time-limit", 82 | dest="time_limit", 83 | default=-1, 84 | type=int, 85 | help="Time limit for the whole process (in seconds). 
Once reached program exits.", 86 | ) 87 | parser.add_argument( 88 | "-t", 89 | "--target", 90 | dest="target", 91 | type=float, 92 | help="Desired max cache usage, in percentage (e.g. 70).", 93 | ) 94 | parser.add_argument( 95 | "-v", "--verbose", help="Increase output verbosity.", action="store_true" 96 | ) 97 | args = parser.parse_args() 98 | 99 | # Some general checks 100 | cache_path: Path = args.source 101 | if not cache_path.is_dir(): 102 | raise NotADirectoryError(f"{cache_path} is not a valid directory.") 103 | slow_path: Path = args.destination 104 | if not slow_path.is_dir(): 105 | raise NotADirectoryError(f"{slow_path} is not a valid directory.") 106 | 107 | last_id = args.num_files 108 | time_limit = args.time_limit 109 | 110 | target = float(args.target) 111 | if target <= 1 or target >= 100: 112 | raise ValueError( 113 | f"Target value is in percentage, i.e. in the range of (0, 100). Found {target} instead." 114 | ) 115 | 116 | cache_stats = shutil.disk_usage(cache_path) 117 | 118 | usage_percentage = 100 * cache_stats.used / cache_stats.total 119 | syslog.syslog( 120 | syslog.LOG_INFO, 121 | f"Uncaching from {cache_path} ({usage_percentage:.2f}% used) to {slow_path}.", 122 | ) 123 | if usage_percentage <= target: 124 | syslog.syslog( 125 | syslog.LOG_INFO, 126 | f"Target of {target}% of used capacity already reached. Exiting.", 127 | ) 128 | exit(0) 129 | 130 | # Create PID file. 131 | write_pid() 132 | syslog.syslog(syslog.LOG_INFO, "Computing candidates...") 133 | candidates = sorted( 134 | [(c, c.stat()) for c in cache_path.glob("**/*") if c.is_file()], 135 | key=lambda p: p[1].st_atime, 136 | ) 137 | 138 | t_start = time.monotonic() 139 | syslog.syslog(syslog.LOG_INFO, "Processing candidates...") 140 | cache_used = cache_stats.used 141 | for c_id, (c_path, c_stat) in enumerate(candidates): 142 | syslog.syslog(syslog.LOG_DEBUG, f"{c_path}") 143 | 144 | if not c_path.exists(): 145 | # Since rsync moves also other hard links it might be that 146 | # some files are not existing anymore. However, invoking rsync 147 | # for each file (instead of directories) does not preserve 148 | # hard links. 149 | syslog.syslog(syslog.LOG_WARNING, f"{c_path} does not exist.") 150 | continue 151 | 152 | # Rsync options 153 | # -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) 154 | # -x, --one-file-system don't cross filesystem boundaries 155 | # -q, --quiet suppress non-error messages 156 | # -H, --hard-links preserve hard links 157 | # -A, --acls preserve ACLs (implies --perms) 158 | # -X, --xattrs preserve extended attributes 159 | # -W, --whole-file copy files whole (without delta-xfer algorithm) 160 | # -E, --executability preserve the file's executability 161 | # -S, --sparse turn sequences of nulls into sparse blocks 162 | # -R, --relative use relative path names 163 | # --preallocate allocate dest files before writing them 164 | # --remove-source-files sender removes synchronized files (non-dirs) 165 | subprocess.call( 166 | [ 167 | "rsync", 168 | "-axqHAXWESR", 169 | "--preallocate", 170 | "--remove-source-files", 171 | f"{cache_path}/./{c_path.relative_to(cache_path)}", 172 | f"{slow_path}/", 173 | ] 174 | ) 175 | cache_used -= c_stat.st_size 176 | 177 | # Evaluate early breaking conditions 178 | if last_id >= 0 and c_id >= last_id - 1: 179 | syslog.syslog( 180 | syslog.LOG_INFO, f"Maximum number of moved files reached ({last_id})." 
181 | ) 182 | break 183 | if time_limit >= 0 and time.monotonic() - t_start > time_limit: 184 | syslog.syslog( 185 | syslog.LOG_INFO, f"Time limit reached ({time_limit} seconds)." 186 | ) 187 | break 188 | if (100 * cache_used / cache_stats.total) <= target: 189 | syslog.syslog( 190 | syslog.LOG_INFO, f"Target of maximum used capacity reached ({target})." 191 | ) 192 | break 193 | 194 | cache_stats = shutil.disk_usage(cache_path) 195 | usage_percentage = 100 * cache_stats.used / cache_stats.total 196 | syslog.syslog( 197 | syslog.LOG_INFO, 198 | f"Process completed in {round(time.monotonic() - t_start)} seconds. Current usage percentage is {usage_percentage:.2f}%.", 199 | ) 200 | # Successful exec; cleanup PID file. 201 | os.unlink(PID_FILE) -------------------------------------------------------------------------------- /filemover/zfs-uncache-mover.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | # TheLinuxGuy ZFS cache pool mergerfs tiered cache mover. 3 | # File age time-based mover depending on goal % cache utilization. 4 | # This script works but is abandoned after NFS+ZFS+mergerfs instability. 5 | # !! THIS SCRIPT IS ZFS POOL SPECIFIC !! DO NOT USE ON XFS cache setup. 6 | 7 | # Usage example: 8 | # python3 zfs-uncache-mover.py -s /cache -d /mnt/slow-storage -t 10 9 | import argparse 10 | import subprocess 11 | import syslog 12 | import time 13 | import re 14 | import os 15 | import sys 16 | from pathlib import Path 17 | 18 | ZP = '/usr/sbin/zpool' # proxmox zpool path. 19 | PID_FILE = '/var/run/uncache-mover.pid' 20 | IGNORE_PATH = '/cache/media/downloads/incomplete/' 21 | CURRENT_PID = str(os.getpid()) 22 | 23 | def check_pid(): 24 | """Check that PID file does not exist.""" 25 | try: 26 | with open(PID_FILE) as file: 27 | pid = int(file.readline()) 28 | except OSError: 29 | # PID doesn't exist. 30 | return 31 | print('Fatal error: Mover script already executing. Check PID file.') 32 | sys.exit(1) 33 | 34 | def write_pid(): 35 | """Create a PID File.""" 36 | try: 37 | with open(PID_FILE, "w") as file: 38 | file.write(CURRENT_PID) 39 | except OSError: 40 | print(f"Fatal Error: Unable to write pid file {PID_FILE}") 41 | sys.exit(1) 42 | 43 | def run(cmd, split=r'\t'): 44 | r = subprocess.check_output( 45 | cmd, 46 | encoding='utf8', 47 | stderr=subprocess.DEVNULL 48 | ) 49 | return [re.split(split, x.strip()) for x in r.split('\n') if x.strip()] 50 | 51 | def pool_attributes(pool_name): 52 | r = run([ 53 | ZP, 54 | "list", 55 | pool_name, 56 | "-Hpo", 57 | "name,size,alloc,free,cap" 58 | ]) 59 | 60 | return {x[0]: { 61 | 'name': x[0], 62 | 'total': int(x[1]), 63 | 'used': int(x[2]), 64 | 'available': int(x[3]), 65 | 'usage_percentage': int(x[4]), 66 | } for x in r} 67 | 68 | def sizeof_fmt(num, suffix="B"): 69 | for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]: 70 | if abs(num) < 1024.0: 71 | return f"{num:3.1f}{unit}{suffix}" 72 | num /= 1024.0 73 | return f"{num:.1f}Yi{suffix}" 74 | 75 | 76 | if __name__ == "__main__": 77 | """ 78 | Uncaching utility. This scripts assumes that you have a cache-like 79 | mount point, for which you want to preserve a certain amount of free 80 | space by moving heavy/rarely-accessed files to a slower mount point. 
81 | The script, in its simplest form, can be run as: 82 | :: 83 | $ python3 uncache-mover.py -s /cached -d /mnt/slow-storage -t 10 84 | In this way least accessed files will be moved one after the other 85 | until the percentage of used capacity will be less than the target. 86 | Other options are also available. Please consider this is a work in 87 | progress. 88 | """ 89 | check_pid() 90 | parser = argparse.ArgumentParser() 91 | parser.add_argument( 92 | "-s", 93 | "--source", 94 | dest="source", 95 | help="ZFS Cache Pool name", 96 | ) 97 | parser.add_argument( 98 | "-d", 99 | "--destination", 100 | dest="destination", 101 | help="Destination path (i.e. slow pool root path.", 102 | ) 103 | parser.add_argument( 104 | "--num-files", 105 | dest="num_files", 106 | default=-1, 107 | type=int, 108 | help="Maximum number of files moved away from cache.", 109 | ) 110 | parser.add_argument( 111 | "--time-limit", 112 | dest="time_limit", 113 | default=-1, 114 | type=int, 115 | help="Time limit for the whole process (in seconds). Once reached program exits.", 116 | ) 117 | parser.add_argument( 118 | "-t", 119 | "--target", 120 | dest="target", 121 | type=float, 122 | help="Desired max cache usage, in percentage (e.g. 70).", 123 | ) 124 | parser.add_argument( 125 | "-v", "--verbose", help="Increase output verbosity.", action="store_true" 126 | ) 127 | args = parser.parse_args() 128 | 129 | # Pool name sanitization 130 | zfs_pool_name_from_path = (str(args.source)).lstrip('/') 131 | 132 | # Some general checks 133 | cache_path: Path = Path(args.source) 134 | if not cache_path.is_dir(): 135 | raise NotADirectoryError(f"{cache_path} is not a valid directory.") 136 | slow_path: Path = Path(args.destination) 137 | if not slow_path.is_dir(): 138 | raise NotADirectoryError(f"{slow_path} is not a valid directory.") 139 | 140 | last_id = args.num_files 141 | time_limit = args.time_limit 142 | 143 | target = float(args.target) 144 | if target <= 1 or target >= 100: 145 | raise ValueError( 146 | f"Target value is in percentage, i.e. in the range of (0, 100). Found {target} instead." 147 | ) 148 | 149 | # Initial ZFS filesystem checks 150 | zfs_data = pool_attributes(zfs_pool_name_from_path) 151 | cache_stats = zfs_data[zfs_pool_name_from_path] 152 | 153 | usage_percentage = cache_stats['usage_percentage'] 154 | syslog.syslog( 155 | syslog.LOG_INFO, 156 | f"Uncaching from {cache_path} ({usage_percentage:.2f}% used) to {slow_path}.", 157 | ) 158 | if usage_percentage <= target: 159 | syslog.syslog( 160 | syslog.LOG_INFO, 161 | f"Target of {target}% of used capacity already reached. Exiting.", 162 | ) 163 | exit(0) 164 | 165 | # Create PID file. 166 | write_pid() 167 | syslog.syslog(syslog.LOG_INFO, "Computing candidates...") 168 | candidates = sorted( 169 | [(c, c.stat()) for c in cache_path.glob("**/*") if c.is_file()], 170 | key=lambda p: p[1].st_atime, 171 | ) 172 | 173 | t_start = time.monotonic() 174 | syslog.syslog(syslog.LOG_INFO, "Processing candidates...") 175 | cache_used = cache_stats['used'] 176 | ignored_files = 0 177 | 178 | 179 | for c_id, (c_path, c_stat) in enumerate(candidates): 180 | syslog.syslog(syslog.LOG_DEBUG, f"{c_path}") 181 | 182 | if not c_path.exists(): 183 | # Since rsync moves also other hard links it might be that 184 | # some files are not existing anymore. However, invoking rsync 185 | # for each file (instead of directories) does not preserve 186 | # hard links. 
187 | syslog.syslog(syslog.LOG_WARNING, f"{c_path} does not exist.") 188 | continue 189 | 190 | if c_path.is_relative_to(IGNORE_PATH): 191 | ignored_files += 1 192 | continue 193 | # Rsync options 194 | # -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) 195 | # -x, --one-file-system don't cross filesystem boundaries 196 | # -q, --quiet suppress non-error messages 197 | # -H, --hard-links preserve hard links 198 | # -A, --acls preserve ACLs (implies --perms) 199 | # -X, --xattrs preserve extended attributes 200 | # -W, --whole-file copy files whole (without delta-xfer algorithm) 201 | # -E, --executability preserve the file's executability 202 | # -S, --sparse turn sequences of nulls into sparse blocks 203 | # -R, --relative use relative path names 204 | # --preallocate allocate dest files before writing them 205 | # --remove-source-files sender removes synchronized files (non-dirs) 206 | subprocess.call( 207 | [ 208 | "rsync", 209 | "-axqHAXWESR", 210 | "--preallocate", 211 | "--remove-source-files", 212 | f"{cache_path}/./{c_path.relative_to(cache_path)}", 213 | f"{slow_path}/", 214 | ] 215 | ) 216 | cache_used -= c_stat.st_size 217 | 218 | # Evaluate early breaking conditions 219 | if last_id >= 0 and c_id >= last_id - 1: 220 | syslog.syslog( 221 | syslog.LOG_INFO, f"Maximum number of moved files reached ({last_id})." 222 | ) 223 | break 224 | if time_limit >= 0 and time.monotonic() - t_start > time_limit: 225 | syslog.syslog( 226 | syslog.LOG_INFO, f"Time limit reached ({time_limit} seconds)." 227 | ) 228 | break 229 | if (100 * cache_used / cache_stats['total']) <= target: 230 | syslog.syslog( 231 | syslog.LOG_INFO, f"Target of maximum used capacity reached ({target})." 232 | ) 233 | break 234 | 235 | # Verify work is done. 236 | # Initial ZFS filesystem checks 237 | zfs_data = pool_attributes(zfs_pool_name_from_path) 238 | cache_stats = zfs_data[zfs_pool_name_from_path] 239 | usage_percentage = 100 * cache_stats['used'] / cache_stats['total'] 240 | 241 | syslog.syslog( 242 | syslog.LOG_INFO, 243 | f"There were {ignored_files} file skipped due to being on {IGNORE_PATH} path.", 244 | ) 245 | syslog.syslog( 246 | syslog.LOG_INFO, 247 | f"Process completed in {round(time.monotonic() - t_start)} seconds. Current usage percentage is {usage_percentage:.2f}%.", 248 | ) 249 | # Successful exec; cleanup PID file. 250 | os.unlink(PID_FILE) -------------------------------------------------------------------------------- /img/hardware-10g-10w-idle.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TheLinuxGuy/free-unraid/9b7bf6a02bc4b916f54bd1f12379a44b415463af/img/hardware-10g-10w-idle.png -------------------------------------------------------------------------------- /img/intel-gpu-top.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TheLinuxGuy/free-unraid/9b7bf6a02bc4b916f54bd1f12379a44b415463af/img/intel-gpu-top.png -------------------------------------------------------------------------------- /install_steps.md: -------------------------------------------------------------------------------- 1 | # Installing OS and tools 2 | 3 | Warning: this may be incomplete. I provide no support or guarantees these steps will always work. 4 | 5 | Assumption: (1) blank virtual machine in proxmox will be our "free-unraid NAS" to be setup. (2) You passthru your physical hard disks to VM via passthrough. 
(3) You want to mirror my setup and have cockpit as the Web UI to manage your shares and samba. (4) You know your way around linux / basic system admin stuff.
6 | 
7 | ### High level overview
8 | 
9 | 1. Install Ubuntu Server OS (enable SSH). Download the ubuntu server DVD image; follow the GUI install steps.
10 | 1. Do the first login. Run all updates (apt-get update; apt-get dist-upgrade; 'do-release-upgrade')
11 | 1. Shutdown the VM - in proxmox, configure PCI passthrough for the SATA disk controllers. Map the NVME drive via passthrough.
12 | 1. Install the https://repo.45drives.com repo via `curl -sSL https://repo.45drives.com/setup | sudo bash`
13 | 1. Install cockpit and other tools
14 | ```
15 | apt-get install zfsutils-linux cockpit-pcp btrfs-progs libbtrfsutil1 btrfs-compsize duc smartmontools cockpit-benchmark cockpit-file-sharing cockpit-identities cockpit-navigator lsscsi pv
16 | ```
17 | 1. The cockpit Web UI should now be available at localhost:9090 (or whatever IP the VM has)
18 | 1. Create the /cache NVME RAID1 XFS mirror
19 | ```
20 | mdadm --stop /dev/md*
21 | mdadm --create --verbose /dev/md0 --bitmap=none --level=mirror --raid-devices=2 /dev/nvme0n1 /dev/sdb
22 | mkfs.xfs -f -L cache /dev/md0
23 | mdadm --detail /dev/md0
24 | ```
25 | 1. Set up `/etc/fstab` for /cache, ensuring it mounts at boot.
26 | 1. Install mergerfs (e.g. mergerfs_2.34.1.ubuntu-jammy_amd64.deb)
27 | 1. Refer to mergerfs.md for the `/etc/fstab` configuration of the disks. Ensure the `/mnt/slow-storage` and `/mnt/cache` folders exist.
28 | 1. Install `hd-idle`; ensure it's enabled, its configuration file logs to `/var/log/hd-idle.log`, and idle spin-down begins at 180 seconds.
29 | 1. Build a .deb package of snapraid (using docker).
30 | ```
31 | # these steps assume a valid, working docker installation
32 | apt update && apt install git -y
33 | mkdir ~/tmp && cd ~/tmp
34 | git clone https://github.com/IronicBadger/docker-snapraid
35 | cd docker-snapraid
36 | chmod +x build.sh
37 | ./build.sh
38 | sudo dpkg -i build/snapraid-from-source.deb
39 | ```
40 | 1. Copy the .deb file to the NAS server and install it: `dpkg -i snapraid-from-source.deb`. Verify with `snapraid --version`.
41 | 1. Ensure `/etc/snapraid.conf` contains:
42 | ```
43 | # SnapRAID configuration file
44 | 
45 | # Parity location(s)
46 | 1-parity /mnt/parity/snapraid.parity
47 | #2-parity /mnt/parity2/snapraid.parity
48 | 
49 | # Content file location(s)
50 | content /var/snapraid.content
51 | content /mnt/snapraid-content/disk1/snapraid.content
52 | content /mnt/snapraid-content/disk2/snapraid.content
53 | 
54 | # Data disks
55 | data d1 /mnt/disk1
56 | data d2 /mnt/disk2
57 | #data d3 /mnt/disk3
58 | #data d4 /mnt/disk4
59 | 
60 | # Excludes hidden files and directories
61 | exclude *.unrecoverable
62 | exclude /tmp/
63 | exclude /lost+found/
64 | exclude downloads/
65 | exclude appdata/
66 | exclude *.!sync
67 | exclude /.snapshots/
68 | ```
69 | 1. Install snapraid-runner: `git clone https://github.com/Chronial/snapraid-runner.git /opt/snapraid-runner`
70 | 1. Create the log file for snapraid runs: `touch /var/log/snapraid.log`
71 | 1. `mv snapraid-runner.conf.example snapraid-runner.conf`, then configure as expected, including a cronjob (see the sketch below). [Example file](snapraid_btrfs_runner.md).
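
For reference, a minimal sketch of that cron entry (hypothetical schedule; the runner script name and `-c` flag follow the same pattern as the snapraid-btrfs-runner invocation used elsewhere in these notes, so treat the exact arguments as an assumption):

```
# /etc/cron.d/snapraid-runner - nightly run at 03:30 (example values, adjust to taste)
30 3 * * * root /usr/bin/python3 /opt/snapraid-runner/snapraid-runner.py -c /opt/snapraid-runner/snapraid-runner.conf
```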
72 | 73 | 74 | -------------------------------------------------------------------------------- /maintenance_tasks.md: -------------------------------------------------------------------------------- 1 | # Maintenance Tasks 2 | 3 | ## Random FUSE lockups 4 | 5 | ``` 6 | root@nas:/home/gfm# fusermount -uz /mnt/cached/ 7 | root@nas:/home/gfm# fusermount -uz /mnt/slow-storage/ 8 | # umount -l /mnt/cached/ /mnt/slow-storage/ 9 | root@nas:/home/gfm# mount /mnt/cached/ 10 | root@nas:/home/gfm# mount /mnt/slow-storage/ 11 | root@nas:/home/gfm# systemctl restart nfs-kernel-server 12 | root@nas:/home/gfm# systemctl status nfs-kernel-server 13 | ``` 14 | 15 | ## List all IOMMU Groups mapping 16 | 17 | ``` 18 | for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done; 19 | ``` 20 | 21 | ## Powertop from source 22 | 23 | In debian ensure `apt-get install build-essential` is installed. 24 | 25 | ## CPU / BIOS 26 | 27 | ### Intel Microcode releases 28 | https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases 29 | 30 | Protip: wait until motherboard manufacturer bundles new microcode in BIOS update image file. 31 | 32 | ### BIOS updates for AsRock B660M Pro RS 33 | https://www.asrock.com/mb/Intel/B660m%20Pro%20RS/index.asp#BIOS 34 | 35 | ## GPU 36 | 37 | ### Verifying which kernel driver may be loaded 38 | 39 | ``` 40 | # inxi -G 41 | Graphics: Device-1: Intel Raptor Lake-S UHD Graphics driver: i915 v: kernel 42 | Display: server: No display server data found. Headless machine? tty: 241x22 43 | Message: Advanced graphics data unavailable in console for root. 44 | 45 | ``` 46 | 47 | ## Services 48 | 49 | ### NFS 50 | 51 | Status 52 | ``` 53 | systemctl status nfs-kernel-server 54 | ``` 55 | 56 | Restart 57 | ``` 58 | systemctl restart nfs-kernel-server 59 | ``` 60 | 61 | Check NFS exports on server .54 62 | ``` 63 | showmount -e 192.168.1.54 64 | ``` 65 | 66 | Mounting on client 67 | ``` 68 | mount -t nfs 192.168.1.54:/mnt/cached /mnt/derp/ -vvv 69 | ``` 70 | 71 | Unmount force 72 | ``` 73 | umount -f -l /mnt/derp 74 | ``` 75 | 76 | [Safe] Clean up random *.nfs files leftover from NFS crash on filesystem. 77 | ``` 78 | find /mnt/slow-storage/ -type f -regex '.*\.nfs[0-9].*' -delete 79 | ``` 80 | 81 | [Risky] Clean up random *.nfs files leftover from NFS crash on filesystem. 82 | ``` 83 | find /mnt/cached/ -type f -regex '.*\.nfs[a-zA-Z0-9].*' 84 | ``` 85 | 86 | ## Storage 87 | 88 | Monitor disk activity with the following command. 
89 | ```
90 | dstat -cd --disk-util --disk-tps
91 | ```
92 | 
93 | ### SMART checks
94 | 
95 | ```
96 | for i in {a..d}; do echo DISK sd$i; smartctl -x /dev/sd$i | grep 'Self-test execution status' -A 2; done
97 | ```
98 | 
99 | Run tests:
100 | ```
101 | for i in {a..d}; do echo DISK sd$i; smartctl -t long /dev/sd$i; done
102 | ```
103 | 
104 | ### Hard drive sleep (hd-idle)
105 | 
106 | ```
107 | systemctl status hd-idle
108 | grep 'hd-idle' /var/log/syslog
109 | ```
110 | ### Btrfs scrubs (status, resume)
111 | 
112 | ```
113 | root@nas:/cache/music# btrfs scrub status /mnt/disk2
114 | UUID: 8ff09467-056a-48ff-bb6e-7d72b67ca994
115 | Scrub started: Mon Oct 31 18:17:17 2022
116 | Status: interrupted
117 | Duration: 0:29:47
118 | Total to scrub: 12.19TiB
119 | Rate: 100.96MiB/s
120 | Error summary: no errors found
121 | root@nas:/cache/music# btrfs scrub resume /mnt/disk2
122 | scrub resumed on /mnt/disk2, fsid 8ff09467-056a-48ff-bb6e-7d72b67ca994 (pid=1092977)
123 | ```
124 | 
125 | ### Btrfs snapshots
126 | 
127 | List all snapshots:
128 | ```
129 | btrfs subvolume list -s /mnt/disk1
130 | btrfs-list --snap-only /mnt/disk1
131 | ```
132 | 
133 | Delete:
134 | 
135 | ```
136 | root@nas:/home/gfm# btrfs subvolume list -s /mnt/disk1
137 | ID 271 gen 36 cgen 36 top level 259 otime 2022-10-31 10:58:49 path .snapshots/10/snapshot
138 | ID 272 gen 45 cgen 45 top level 259 otime 2022-11-01 00:44:52 path .snapshots/11/snapshot
139 | root@nas:/home/gfm# btrfs subvolume delete /mnt/disk1/.snapshots/11/snapshot
140 | Delete subvolume (no-commit): '/mnt/disk1/.snapshots/11/snapshot'
141 | root@nas:/home/gfm# btrfs subvolume delete /mnt/disk1/.snapshots/10/snapshot
142 | Delete subvolume (no-commit): '/mnt/disk1/.snapshots/10/snapshot'
143 | ```
144 | 
145 | ### Array disk, disk space full. Upgrade to larger disk.
146 | 
147 | - **Scenario**: Time to upgrade to a larger hard drive with more capacity. We plan to remove the smallest disk in the array and replace it (thanks to Btrfs we can do this online, with little downtime).
148 | 
149 | 1. Install the new hard drive in the system.
150 | 2. Have Btrfs move the data to the new disk (online replacement).
151 | 3. Tell Btrfs to expand the filesystem to the new size.
152 | 
153 | We will be replacing `/mnt/disk1`, an 8TB disk that's 97% full, with a new 18TB disk, not yet installed.
154 | 
155 | ```
156 | Filesystem Size Used Avail Use% Mounted on
157 | /dev/sde 7.3T 7.0T 294G 97% /mnt/disk1
158 | /dev/sdc 17T 6.1T 11T 37% /mnt/disk2
159 | # fdisk -l /dev/sdf
160 | Disk /dev/sdg: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
161 | Disk model: WDC WD180EDGZ-11
162 | Units: sectors of 1 * 512 = 512 bytes
163 | Sector size (logical/physical): 512 bytes / 4096 bytes
164 | I/O size (minimum/optimal): 4096 bytes / 4096 bytes
165 | ```
166 | 
167 | The replacement disk was detected as `/dev/sdf` (note we can't rely on 'sdg' staying the same across reboots; I noted my controller slots not being enumerated well on this system).
168 | 
169 | We will use btrfs labels to mitigate this problem.
170 | 
171 | 
172 | #### Procedure
173 | 
174 | **Note:** My observations are that `btrfs replace` is much slower than simply formatting a new hard disk and then using rsync to migrate the data. While `online` replacement handled automatically by btrfs is nicer and less involved, take this into consideration if you are in a time crunch.
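
For reference, a minimal sketch of that faster offline route (hypothetical device name `/dev/sdX` and temporary mountpoint; this is *not* the procedure performed below):

```
# format the new disk, copy everything over, then swap /etc/fstab entries
mkfs.btrfs -f -L mergerfsdisk1-new /dev/sdX
mkdir -p /mnt/newdisk
mount /dev/sdX /mnt/newdisk
rsync -aHAX --info=progress2 /mnt/disk1/ /mnt/newdisk/
# note: plain rsync will not recreate btrfs subvolumes (e.g. .snapshots);
# recreate the snapper config afterwards, then point the /mnt/disk1 fstab
# entry at the new label and remount
```

The online `btrfs replace` procedure I actually used follows.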
174 | 175 | ``` 176 | # btrfs filesystem show /mnt/disk1 177 | Label: 'mergerfsdisk1' uuid: 007055ff-f9a3-458a-b3d4-56ca3daf6bd5 178 | Total devices 1 FS bytes used 6.98TiB 179 | devid 1 size 7.28TiB used 6.99TiB path /dev/sde 180 | # btrfs replace start 1 /dev/sdf /mnt/disk1 181 | ``` 182 | 183 | Let's verify `/dev/sdf` was added to label `mergerfsdisk1` btrfs. 184 | 185 | ``` 186 | root@nas:/home/gfm# btrfs filesystem show /mnt/disk1 187 | Label: 'mergerfsdisk1' uuid: 007055ff-f9a3-458a-b3d4-56ca3daf6bd5 188 | Total devices 2 FS bytes used 6.98TiB 189 | devid 0 size 7.28TiB used 6.99TiB path /dev/sdf 190 | devid 1 size 7.28TiB used 6.99TiB path /dev/sde 191 | 192 | root@nas:/home/gfm# btrfs replace status -1 /mnt/disk1 193 | 0.1% done, 0 write errs, 0 uncorr. read errs 194 | 195 | ``` 196 | 197 | Once the status command returns complete, we need to ensure that `btrfs replace` command also cloned the subvolume we use for parity (`content`). 198 | 199 | ``` 200 | btrfs subvolume list /mnt/disk1 201 | ``` 202 | 203 | Checking device ID and members. We see ID 0 is our new disk `/dev/sdf` and that we will gain about ~9TB of disk space once we force expand the btrfs filesystem after the `btrfs replace` is complete. 204 | 205 | ``` 206 | root@nas:/home/gfm# btrfs dev usage /mnt/disk1/ 207 | /dev/sdf, ID: 0 208 | Device size: 16.37TiB 209 | Device slack: 9.09TiB 210 | Unallocated: 7.28TiB 211 | 212 | /dev/sde, ID: 1 213 | Device size: 7.28TiB 214 | Device slack: 0.00B 215 | Data,single: 6.97TiB 216 | Metadata,DUP: 18.00GiB 217 | System,DUP: 80.00MiB 218 | Unallocated: 291.95GiB 219 | 220 | ``` 221 | 222 | Verify `btrfs replace` completed. Note `/dev/sdf` (18TB) is now the single disk: 223 | ``` 224 | # btrfs replace status -1 /mnt/disk1 225 | Started on 6.Nov 21:22:39, finished on 7.Nov 18:01:39, 0 write errs, 0 uncorr. read errs 226 | root@nas:/home/gfm# btrfs filesystem show /mnt/disk1 227 | Label: 'mergerfsdisk1' uuid: 007055ff-f9a3-458a-b3d4-56ca3daf6bd5 228 | Total devices 1 FS bytes used 6.98TiB 229 | devid 1 size 7.28TiB used 6.99TiB path /dev/sdf 230 | 231 | root@nas:/home/gfm# lsscsi 232 | [0:0:0:0] disk QEMU QEMU HARDDISK 2.5+ /dev/sda 233 | [0:0:0:1] disk QEMU QEMU HARDDISK 2.5+ /dev/sdb 234 | [7:0:0:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdc 235 | [7:0:1:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdd 236 | [7:0:2:0] disk ATA WDC WD80EFZX-68U 0A83 /dev/sde 237 | [7:0:6:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdf 238 | [N:0:1:1] disk Samsung SSD 950 PRO 256GB__1 /dev/nvme0n1 239 | # btrfs dev usage /mnt/disk1/ 240 | /dev/sdf, ID: 1 241 | Device size: 16.37TiB 242 | Device slack: 9.09TiB 243 | Data,single: 6.97TiB 244 | Metadata,DUP: 18.00GiB 245 | System,DUP: 80.00MiB 246 | Unallocated: 291.95GiB 247 | ``` 248 | 249 | There's 9TB of unclaimed space we can expand so it can be used. 250 | 251 | ``` 252 | # btrfs filesystem resize 1:max /mnt/disk1 253 | Resize device id 1 (/dev/sdf) from 7.28TiB to max 254 | root@nas:/home/gfm# btrfs dev usage /mnt/disk1/ 255 | /dev/sdf, ID: 1 256 | Device size: 16.37TiB 257 | Device slack: 0.00B 258 | Data,single: 6.97TiB 259 | Metadata,DUP: 18.00GiB 260 | System,DUP: 80.00MiB 261 | Unallocated: 9.38TiB 262 | root@nas:/home/gfm# df -h /mnt/disk1 263 | Filesystem Size Used Avail Use% Mounted on 264 | /dev/sdf 17T 7.0T 9.4T 43% /mnt/disk1 265 | ``` 266 | 267 | **That's it. You have completed a live-replace of `/dev/sde` with `/dev/sdf` a larger drive w/o downtime.** 268 | 269 | ### Ubuntu expired apt keys 270 | 271 | One command to rule them all. 
https://stackoverflow.com/questions/34733340/mongodb-gpg-invalid-signatures
272 | 
273 | ```
274 | sudo apt-key list | \
275 | grep "expired: " | \
276 | sed -ne 's|pub .*/\([^ ]*\) .*|\1|gp' | \
277 | xargs -n1 sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys
278 | ```
--------------------------------------------------------------------------------
/mergerfs.md:
--------------------------------------------------------------------------------
1 | # MergerFS
2 | 
3 | **WARNING: Using ZFS + NFS (non-zfs native export) + mergerfs causes [ZFS mount instability and crashes](https://github.com/trapexit/mergerfs/discussions/1098).**
4 | 
5 | MergerFS is used to "merge" all physically distinct disk partitions (/mnt/disk*) into a single logical volume mount.
6 | 
7 | ### Policies
8 | 
9 | https://github.com/trapexit/mergerfs#policy-descriptions
10 | 
11 | Assuming a home media server with top-level folders /movies and /tv - we likely want to keep season folders together on the same disk.
12 | 
13 | At the same time, you may want to fill up one hard drive first before starting to place data onto a secondary disk.
14 | 
15 | For my situation, I feel "most shared path" policies are best suited for these criteria. The real question is: do we want to fill up a single disk to the brim before writing data to another disk?
16 | 
17 | #### Thoughts about CREATE policies
18 | 
19 | If we fill up a single disk completely before writing to disk2, we run the risk that if disk1 fails and our snapraid backup also fails, we lose a lot of information.
20 | 
21 | If you already have 2 physical hard disks connected to the server 24/7, even spun down each disk consumes about 5 watts of energy just by being connected.
22 | 
23 | IMO, unless you want to "slowly grow" your number of hard drives, it's best to spread your media usage across hard disks. This way, even if everything works as expected with snapraid being your backup, the data restore for a whole-disk replacement will take less time, since it only needs to recover a half-filled disk instead of a full one.
24 | 
25 | Therefore `msplus (most shared path, least used space)` is what we would want. You can always change this later.
26 | 
27 | ### /etc/fstab (TL;DR) config
28 | 
29 | The `/etc/fstab` for our slow disks running BTRFS would look as follows:
30 | 
31 | ```
32 | # add fstab slow-storage
33 | /mnt/disk* /mnt/slow-storage fuse.mergerfs defaults,nonempty,allow_other,use_ino,category.create=msplfs,cache.files=off,moveonenospc=true,dropcacheonclose=true,minfreespace=300G,fsname=mergerfs 0 0
34 | 
35 | # add fstab /cache nvme
36 | /cache:/mnt/slow-storage /mnt/cached fuse.mergerfs defaults,nonempty,allow_other,use_ino,noforget,inodecalc=path-hash,security_capability=false,cache.files=partial,category.create=lfs,moveonenospc=true,dropcacheonclose=true,minfreespace=4G,fsname=mergerfs 0 0
37 | ```
38 | 
39 | A caveat of `msplus` is that each single disk needs to have the top-level folders created on it (e.g. "movies" and "tv"). Therefore `/mnt/diskX/movies` would allow the path walk-back logic to dump data onto that disk.
40 | 
41 | ## NVME Tiered Cache Strategy
42 | 
43 | https://github.com/trapexit/mergerfs#tiered-caching
44 | 
45 | To attempt to mirror what unraid provides with their share "cache" we are going to set up yet another mergerfs "pool" or mountpoint with just our nvme disks.
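
A minimal sketch of standing up such a mirrored pool (hypothetical device names; `ashift=12` is my assumption for 4K-sector flash, not something prescribed by these notes):

```
# create a mirrored ZFS pool mounted at /cache (sketch; substitute your own devices)
zpool create -o ashift=12 -m /cache cache mirror /dev/nvme0n1 /dev/nvme1n1
zpool status cache
```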
46 | 
47 | Recall that I chose to use ZFS and a RAID1 mirror for this purpose, to provide assurances that my data would not be lost before it gets moved onto the parity-protected snapraid slow-storage disks.
48 | 
49 | ## NFS instability
50 | 
51 | `/mnt/cached` is my mergerfs pool and ZFS mountpoint on my local system. The `mergerfs` process seems to be crashing at some point due to NFS. I haven't yet found the root cause of this issue and have tried everything from upgrading the kernel, ZFS, nfs-kernel-server, libfuse and the OS (Ubuntu 20.04 to 20.10).
52 | 
53 | The crashes seem to be more pronounced when using NFSv4 protocols. NFSv3 is more stable, but that is a stateless protocol and I would much prefer v4-only NFS shares. I have disabled v4 and forced v3 for the time being to try to make my implementation stable.
54 | 
55 | Observed behavior (on local NAS):
56 | ```
57 | # ls -lah /mnt/cached
58 | ls: cannot access '/mnt/cached': Input/output error
59 | ```
60 | 
61 | Recovery steps: see "Random FUSE lockups" in maintenance_tasks.md (remount the mergerfs mounts, then restart nfs-kernel-server).
62 | 
63 | 
64 | ### Debugging with strace
65 | 
66 | ```
67 | root@nas:/home/gfm# strace -fvTtt -s 256 -p PIDHERE -o /tmp/mergerfs.strace.txt
68 | strace: Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf: Operation not permitted
69 | strace: attach: ptrace(PTRACE_SEIZE, 2081428): Operation not permitted
70 | root@nas:/home/gfm# echo "0"|sudo tee /proc/sys/kernel/yama/ptrace_scope
71 | 0
72 | ```
73 | 
74 | If that doesn't work, change the setting in `/etc/sysctl.d/10-ptrace.conf` to 0 and reboot.
75 | 
76 | Strace isn't helpful according to the mergerfs developer. Here's the proper way to debug mergerfs, using gdb:
77 | 
78 | ### gdb debugging mergerfs
79 | 
80 | ```
81 | If it's crashing then strace is pretty useless. Need a stack trace from gdb.
82 | 
83 | gdb path/to/mergerfs
84 | 
85 | run -f -o options branches mountpoint
86 | 
87 | when it crashes
88 | 
89 | thread apply all bt
90 | ```
91 | 
92 | ### Remove ZFS from the equation by using XFS RAID 1
93 | 
94 | ```
95 | mdadm --create --verbose /dev/md0 --bitmap=none --level=mirror --raid-devices=2 /dev/nvme0n1 /dev/sdb
96 | mkfs.xfs -f -L cache /dev/md0
97 | mdadm --detail /dev/md0
98 | ```
99 | 
100 | 
101 | ### NFS tweaks that were added
102 | 
103 | ```
104 | noforget,inodecalc=path-hash,security_capability=false,cache.files=partial,category.create=lfs
105 | ```
106 | 
107 | ## Installing mergerfs on ubuntu 21.04 steps
108 | 
109 | ```
110 | wget https://github.com/trapexit/mergerfs/releases/download/2.33.5/mergerfs_2.33.5.ubuntu-focal_amd64.deb
111 | dpkg -i mergerfs_2.33.5.ubuntu-focal_amd64.deb
112 | ```
113 | 
114 | ## Other notes
115 | 
116 | 1. Update /etc/snapraid.conf with disk maps.
117 | 1. Ensure `snapper` configs for each disk exist. (`snapper -c mergerfsdisk1 create-config -t mergerfsdisk /mnt/disk1`)
118 | 1. Install snapraid-btrfs and validate with the `snapraid-btrfs ls` command.
119 | 1. Install snapraid-btrfs-runner (`git clone https://github.com/fmoledina/snapraid-btrfs-runner.git /opt/snapraid-btrfs-runner`)
--------------------------------------------------------------------------------
/miscelaneous_sysadmin.md:
--------------------------------------------------------------------------------
1 | # Miscellaneous sysadmin notes
2 | Scratchpad: tips related to system administration tasks and the configuration of certain services running on the free-unraid OS.
3 | 4 | ## OpenWRT 5 | 6 | ### Calculating DHCP scope settings 7 | 8 | ``` 9 | ipcalc.sh network-ip mask-or-prefix start limit 10 | ``` 11 | 12 | https://forum.openwrt.org/t/dhcp-range-configuration/67452/7 13 | 14 | ## Ubuntu 15 | 16 | ### Newer ZFS repo 17 | https://launchpad.net/~jonathonf/+archive/ubuntu/zfs 18 | ``` 19 | sudo add-apt-repository ppa:jonathonf/zfs 20 | sudo apt update 21 | ``` 22 | 23 | ## Plex Media Server 24 | 25 | ### Keeping plex-media-server package up-to-date 26 | 27 | https://github.com/mrworf/plexupdate 28 | 29 | ### Limit ZFS Memory usage to 3GB on NAS VM 30 | 31 | Dynamic during runtime. 32 | ``` 33 | echo "$[3 * 1024*1024*1024]" >/sys/module/zfs/parameters/zfs_arc_max 34 | ``` 35 | 36 | Permanent, add/create `/etc/modprobe.d/zfs.conf` 37 | ``` 38 | options zfs zfs_arc_max=3221225472 39 | ``` 40 | 41 | ## Samba 42 | 43 | ### Newer builds for ubuntu 44 | 45 | https://launchpad.net/~linux-schools/+archive/ubuntu/samba-latest 46 | 47 | ## Docker 48 | ### Update all containers 49 | 50 | ``` 51 | docker run -v /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once 52 | ``` 53 | 54 | 55 | ### NTP 56 | 57 | Ensure timesync. File: `/etc/systemd/timesyncd.conf` 58 | ``` 59 | [Time] 60 | NTP=time.google.com time1.google.com time2.google.com time3.google.com 61 | FallbackNTP=0.pool.ntp.org 1.pool.ntp.org 0.debian.pool.ntp.org 62 | ``` 63 | 64 | Restart `systemd-timesyncd.service` - done. 65 | 66 | ### Folder listings timestamp include year (ls) 67 | 68 | Add to `.bashrc` file 69 | ``` 70 | export TIME_STYLE=long-iso 71 | ``` 72 | 73 | ## Btrfs 74 | 75 | ### Change the partition label 76 | 77 | If system is mounted (live change) 78 | ``` 79 | btrfs filesystem label 80 | ``` 81 | 82 | If unmounted 83 | ``` 84 | btrfs filesystem label 85 | ``` 86 | 87 | Make necessary updates in `/etc/fstab` 88 | 89 | ## NFS 90 | 91 | Check if kernel supports which NFS versions 92 | https://wiki.debian.org/NFSServerSetup 93 | 94 | ``` 95 | grep NFSD /boot/config-`uname -r` 96 | ``` 97 | 98 | check enabled versions 99 | ``` 100 | cat /proc/fs/nfsd/versions 101 | ``` 102 | 103 | Make sure to tweak these configuration files on the server to force NFS v4 always. 104 | 105 | ``` 106 | /etc/nfs.conf 107 | /etc/default/nfs-common 108 | /etc/default/nfs-kernel-server 109 | ``` 110 | 111 | ### The easy way to enable NFS v4.2 112 | 113 | ``` 114 | # nfsconf --set nfsd vers4.2 y 115 | root@nas:/home/gfm# nfsconf --get nfsd vers4.2 116 | y 117 | # cat /etc/nfs.conf 118 | ``` 119 | 120 | Important section 121 | ``` 122 | vers2=n 123 | vers3=n 124 | vers4=n 125 | vers4.0=n 126 | vers4.1=y 127 | vers4.2=y 128 | ``` 129 | 130 | If `rpc-statd.service` complains after disabling V2 V3 NFS. Mask it. 131 | 132 | ``` 133 | systemctl mask rpc-statd.service 134 | ``` 135 | 136 | ### Check dependencies and its service status 137 | 138 | ``` 139 | systemctl list-dependencies nfs-kernel-server 140 | ``` 141 | 142 | ### Mounting NFSv4 on client 143 | 144 | When faced with error `mount.nfs4: mounting 192.168.1.54:/mnt/cached failed, reason given by server: No such file or directory` 145 | 146 | The hint was `fsid=0` from https://askubuntu.com/questions/35077/cannot-mount-nfs4-share-no-such-file-or-directory and the path we use to mount. 147 | 148 | In NFSv3 the old mount command was (when share had fsid=0 in its exports config): 149 | ``` 150 | mount -t nfs 192.168.1.54:/mnt/cached /mnt/derp/ 151 | ``` 152 | 153 | NFSv4 format, fsid has special meaning to the path. 
154 | 155 | ``` 156 | mount.nfs4 -o vers=4.2 192.168.1.54:/ /mnt/derp/ 157 | ``` 158 | 159 | More on this: https://unix.stackexchange.com/questions/427597/implications-of-using-nfsv4-fsid-0-and-exporting-the-nfs-root-to-entire-lan-or 160 | 161 | **You can avoid this altogether by simply using fsid=1 or anything other than zero.** -------------------------------------------------------------------------------- /nfs-check-loop.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Basic script to run on an endless loop on a NFS client 3 | # idea is to catch failure in rpcinfo and log timestamp. 4 | 5 | NFS_HOST_IP=192.168.1.54 6 | 7 | counter=1 8 | while : 9 | do 10 | echo "Run " $counter + $(date) 11 | /usr/sbin/rpcinfo -T tcp $NFS_HOST_IP nfs 4; 12 | if [ $? -eq 1 ] 13 | then 14 | echo "NFS server unresponsive." 15 | break 16 | fi 17 | ((counter++)) 18 | sleep 1 19 | done 20 | echo "NFS monitor script finished. " -------------------------------------------------------------------------------- /nfs.md: -------------------------------------------------------------------------------- 1 | # NFS (Network File System) 2 | 3 | Notes on NFS troubleshooting, monitoring, etc. I export my data via NFS to proxmox hosts on my network. 4 | 5 | https://wiki.archlinux.org/title/NFS/Troubleshooting -------------------------------------------------------------------------------- /performance_benchmarks.md: -------------------------------------------------------------------------------- 1 | # Performance 2 | 3 | **Note: after upgrading from ZFS 0.8.4 to ZFS 2.1.4 (+kernel 5.15.0-52-generic) the noted ZFS performance issues have gone away. There appears to be almost no performance penalty of using ZFS+mergerfs.** 4 | 5 | These are my notes about performance of this setup (and some experiments with `autotier` mergerfs competitor). 6 | 7 | Performance is measured by this `fio` command. It's intended to test **sequential writes** with 1MB block size. Imitates write backup activity or large file copies (HD tv or movies). 
8 | 
9 | ```
10 | fio --name=fiotest --filename=/mnt/samsung/zfscache/file123 --size=16Gb --rw=write --bs=1M --direct=1 --numjobs=8 --ioengine=libaio --iodepth=8 --group_reporting --runtime=60 --startdelay=60
11 | ```
12 | 
13 | #### My hardware
14 | 
15 | ```
16 | root@nas:/home/gfm# lsscsi
17 | [0:0:0:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdc
18 | [0:0:1:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdd
19 | [0:0:2:0] disk ATA WDC WD80EFZX-68U 0A83 /dev/sde
20 | [1:0:0:0] disk QEMU QEMU HARDDISK 2.5+ /dev/sda
21 | [1:0:0:1] disk QEMU QEMU HARDDISK 2.5+ /dev/sdb
22 | [3:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.5+ /dev/sr0
23 | [N:0:1:1] disk Samsung SSD 950 PRO 256GB__1 /dev/nvme0n1
24 | root@nas:/home/gfm# df -h
25 | Filesystem Size Used Avail Use% Mounted on
26 | udev 1.9G 0 1.9G 0% /dev
27 | tmpfs 390M 2.8M 387M 1% /run
28 | /dev/mapper/ubuntu--vg-ubuntu--lv 60G 9.8G 48G 18% /
29 | tmpfs 2.0G 0 2.0G 0% /dev/shm
30 | tmpfs 5.0M 0 5.0M 0% /run/lock
31 | tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
32 | mergerfs 231G 1.0M 231G 1% /mnt/cached
33 | mergerfs 24T 121G 24T 1% /mnt/slow-storage
34 | /dev/sda2 2.0G 107M 1.7G 6% /boot
35 | /dev/sda1 1.1G 5.3M 1.1G 1% /boot/efi
36 | /dev/sdc 17T 51G 17T 1% /mnt/disk2
37 | /dev/sdc 17T 51G 17T 1% /mnt/snapraid-content/disk2
38 | cache 231G 1.0M 231G 1% /cache
39 | /dev/sde 7.3T 71G 7.3T 1% /mnt/snapraid-content/disk1
40 | /dev/sde 7.3T 71G 7.3T 1% /mnt/disk1
41 | /dev/loop1 64M 64M 0 100% /snap/core20/1634
42 | /dev/loop3 47M 47M 0 100% /snap/snapd/16292
43 | /dev/loop2 48M 48M 0 100% /snap/snapd/17336
44 | /dev/loop0 64M 64M 0 100% /snap/core20/1623
45 | /dev/loop4 68M 68M 0 100% /snap/lxd/22753
46 | /dev/sdd1 17T 117G 17T 1% /mnt/parity1
47 | tmpfs 390M 0 390M 0% /run/user/1000
48 | root@nas:/home/gfm# zpool status
49 | pool: cache
50 | state: ONLINE
51 | scan: resilvered 12.0M in 0 days 00:00:00 with 0 errors on Thu Nov 3 23:34:42 2022
52 | config:
53 | 
54 | NAME STATE READ WRITE CKSUM
55 | cache ONLINE 0 0 0
56 | mirror-0 ONLINE 0 0 0
57 | sdb ONLINE 0 0 0
58 | nvme0n1 ONLINE 0 0 0
59 | 
60 | errors: No known data errors
61 | 
62 | ```
63 | 
64 | ### /cache ZFS Raw-disk performance (this is my "fast cache" for mergerfs)
65 | 
66 | **Update: 11/05/22 - After upgrade from ZFS 0.8.4 to ZFS 2.1.4 the below performance issues don't exist.**
67 | 
68 | **NOTE: The ZFS filesystem uses in-memory caching (ARC/L2ARC) and other mechanisms that may 'inflate' results.**
69 | 
70 | I have done tests on this same nvme0n1 disk and max writes are around 900MB/s (if you google Samsung 950 256GB drives like mine you will find the same benchmark results).
71 | 72 | ``` 73 | Jobs: 8 (f=8): [W(8)][100.0%][w=344MiB/s][w=344 IOPS][eta 00m:00s] 74 | fiotest: (groupid=0, jobs=8): err= 0: pid=17709: Thu Nov 3 23:58:41 2022 75 | write: IOPS=1240, BW=1240MiB/s (1300MB/s)(72.7GiB/60017msec); 0 zone resets 76 | slat (usec): min=82, max=103024, avg=6447.15, stdev=11814.99 77 | clat (usec): min=2, max=548443, avg=45148.97, stdev=80640.79 78 | lat (usec): min=1015, max=617530, avg=51596.52, stdev=91759.09 79 | clat percentiles (msec): 80 | | 1.00th=[ 4], 5.00th=[ 7], 10.00th=[ 8], 20.00th=[ 8], 81 | | 30.00th=[ 8], 40.00th=[ 8], 50.00th=[ 9], 60.00th=[ 9], 82 | | 70.00th=[ 16], 80.00th=[ 64], 90.00th=[ 159], 95.00th=[ 234], 83 | | 99.00th=[ 380], 99.50th=[ 418], 99.90th=[ 472], 99.95th=[ 498], 84 | | 99.99th=[ 542] 85 | bw ( MiB/s): min= 111, max= 6468, per=99.93%, avg=1239.24, stdev=193.05, samples=960 86 | iops : min= 107, max= 6468, avg=1238.93, stdev=193.07, samples=960 87 | lat (usec) : 4=0.01%, 10=0.01%, 1000=0.01% 88 | lat (msec) : 2=0.24%, 4=1.84%, 10=63.16%, 20=6.46%, 50=6.25% 89 | lat (msec) : 100=6.65%, 250=11.16%, 500=4.17%, 750=0.05% 90 | cpu : usr=0.70%, sys=2.84%, ctx=671054, majf=0, minf=92 91 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.9%, 16=0.0%, 32=0.0%, >=64=0.0% 92 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 93 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 94 | issued rwts: total=0,74429,0,0 short=0,0,0,0 dropped=0,0,0,0 95 | latency : target=0, window=0, percentile=100.00%, depth=8 96 | 97 | Run status group 0 (all jobs): 98 | WRITE: bw=1240MiB/s (1300MB/s), 1240MiB/s-1240MiB/s (1300MB/s-1300MB/s), io=72.7GiB (78.0GB), run=60017-60017msec 99 | ``` 100 | 101 | ### /mnt/cached ZFS 2.1.4 write to mergerfs-ZFS benchmarks (performance fixed!) 
102 | 103 | ``` 104 | # dpkg -l | grep zfs 105 | ii libzfs4linux 2.1.4-0ubuntu0.1 amd64 OpenZFS filesystem library for Linux - general support 106 | ii zfs-zed 2.1.4-0ubuntu0.1 amd64 OpenZFS Event Daemon 107 | ii zfsutils-linux 2.1.4-0ubuntu0.1 amd64 command-line tools to manage OpenZFS filesystems 108 | # fio --name=fiotest --filename=/mnt/cached/speed --size=16Gb --rw=write --bs=1M --direct=1 --numjobs=8 --ioengine=libaio --iodepth=8 --group_reporting --runtime=60 --startdelay=60 109 | Starting 8 processes 110 | fiotest: Laying out IO file (1 file / 16384MiB) 111 | Jobs: 8 (f=0): [f(8)][100.0%][w=707MiB/s][w=707 IOPS][eta 00m:00s] 112 | fiotest: (groupid=0, jobs=8): err= 0: pid=93346: Sat Nov 5 01:56:46 2022 113 | write: IOPS=944, BW=944MiB/s (990MB/s)(55.3GiB/60039msec); 0 zone resets 114 | slat (usec): min=12, max=81965, avg=8457.02, stdev=8863.16 115 | clat (usec): min=177, max=349728, avg=59322.78, stdev=55613.28 116 | lat (usec): min=198, max=392259, avg=67780.92, stdev=63148.08 117 | clat percentiles (msec): 118 | | 1.00th=[ 3], 5.00th=[ 9], 10.00th=[ 14], 20.00th=[ 18], 119 | | 30.00th=[ 23], 40.00th=[ 31], 50.00th=[ 40], 60.00th=[ 51], 120 | | 70.00th=[ 65], 80.00th=[ 97], 90.00th=[ 142], 95.00th=[ 184], 121 | | 99.00th=[ 245], 99.50th=[ 266], 99.90th=[ 305], 99.95th=[ 326], 122 | | 99.99th=[ 338] 123 | bw ( KiB/s): min=198656, max=5643635, per=99.82%, avg=964905.73, stdev=88839.34, samples=952 124 | iops : min= 194, max= 5511, avg=942.20, stdev=86.75, samples=952 125 | lat (usec) : 250=0.01%, 500=0.15%, 750=0.20%, 1000=0.18% 126 | lat (msec) : 2=0.39%, 4=0.67%, 10=5.83%, 20=18.35%, 50=34.41% 127 | lat (msec) : 100=20.43%, 250=18.56%, 500=0.81% 128 | cpu : usr=0.97%, sys=0.64%, ctx=106614, majf=0, minf=107 129 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.9%, 16=0.0%, 32=0.0%, >=64=0.0% 130 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 131 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 132 | issued rwts: total=0,56678,0,0 short=0,0,0,0 dropped=0,0,0,0 133 | latency : target=0, window=0, percentile=100.00%, depth=8 134 | 135 | Run status group 0 (all jobs): 136 | WRITE: bw=944MiB/s (990MB/s), 944MiB/s-944MiB/s (990MB/s-990MB/s), io=55.3GiB (59.4GB), run=60039-60039msec 137 | 138 | ``` 139 | 140 | Performance 990MB/s on RAID1 ZFS of dual nvme. Goal achieved. 141 | 142 | ### /mnt/slow-storage - mergerfs aggregate of spinning disks 18TB, 8TB 143 | 144 | These HDDs provide about 170MB/s max write speeds. The older 8TB drive may give 140MB/s. 145 | 146 | The mergerfs `/etc/fstab` for this mount looks like: 147 | ``` 148 | /mnt/disk* /mnt/slow-storage fuse.mergerfs defaults,nonempty,allow_other,use_ino,category.create=eplus,cache.files=off,moveonenospc=true,dropcacheonclose=true,minfreespace=300G,fsname=mergerfs 0 0 149 | ``` 150 | 151 | results 152 | 153 | ``` 154 | fiotest: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=8 155 | ... 
156 | fio-3.16 157 | Starting 8 processes 158 | fiotest: Laying out IO file (1 file / 16384MiB) 159 | Jobs: 8 (f=3): [f(8)][100.0%][w=241MiB/s][w=241 IOPS][eta 00m:00s] 160 | fiotest: (groupid=0, jobs=8): err= 0: pid=23616: Fri Nov 4 00:03:59 2022 161 | write: IOPS=184, BW=185MiB/s (194MB/s)(10.8GiB/60076msec); 0 zone resets 162 | slat (usec): min=19, max=728722, avg=43252.77, stdev=36616.24 163 | clat (msec): min=23, max=1335, avg=302.65, stdev=93.08 164 | lat (msec): min=23, max=1376, avg=345.90, stdev=99.29 165 | clat percentiles (msec): 166 | | 1.00th=[ 165], 5.00th=[ 205], 10.00th=[ 224], 20.00th=[ 245], 167 | | 30.00th=[ 262], 40.00th=[ 275], 50.00th=[ 288], 60.00th=[ 305], 168 | | 70.00th=[ 321], 80.00th=[ 347], 90.00th=[ 397], 95.00th=[ 435], 169 | | 99.00th=[ 542], 99.50th=[ 835], 99.90th=[ 1284], 99.95th=[ 1318], 170 | | 99.99th=[ 1334] 171 | bw ( KiB/s): min=51200, max=296960, per=100.00%, avg=189680.43, stdev=4517.78, samples=952 172 | iops : min= 50, max= 290, avg=184.69, stdev= 4.43, samples=952 173 | lat (msec) : 50=0.17%, 100=0.18%, 250=22.55%, 500=75.15%, 750=1.44% 174 | lat (msec) : 1000=0.17%, 2000=0.33% 175 | cpu : usr=0.12%, sys=0.14%, ctx=22235, majf=0, minf=92 176 | IO depths : 1=0.1%, 2=0.1%, 4=0.3%, 8=99.5%, 16=0.0%, 32=0.0%, >=64=0.0% 177 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 178 | complete : 0=0.0%, 4=99.9%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 179 | issued rwts: total=0,11087,0,0 short=0,0,0,0 dropped=0,0,0,0 180 | latency : target=0, window=0, percentile=100.00%, depth=8 181 | 182 | Run status group 0 (all jobs): 183 | WRITE: bw=185MiB/s (194MB/s), 185MiB/s-185MiB/s (194MB/s-194MB/s), io=10.8GiB (11.6GB), run=60076-60076msec 184 | 185 | ``` 186 | 187 | ### /mnt/cached - mergerfs ZFS only zpool mount of /cache (no spinning disks) 188 | 189 | TL;DR **performance penalty on mergerfs** 190 | 191 | ``` 192 | WRITE: bw=374MiB/s (392MB/s), 374MiB/s-374MiB/s (392MB/s-392MB/s), io=21.9GiB (23.6GB), 193 | ``` 194 | 195 | vs. 
without mergerfs (pure zpool) 196 | ``` 197 | WRITE: bw=1240MiB/s (1300MB/s), 1240MiB/s-1240MiB/s (1300MB/s-1300MB/s), io=72.7GiB (78.0GB), run=60017-60017msec 198 | ``` 199 | 200 | mergerfs `/etc/fstab` is 201 | ``` 202 | /cache /mnt/cached fuse.mergerfs nonempty,allow_other,use_ino,cache.files=off,category.create=lfs,moveonenospc=true,dropcacheonclose=true,minfreespace=4G,fsname=mergerfs 0 0 203 | ``` 204 | 205 | results 206 | 207 | ``` 208 | Jobs: 8 (f=3): [f(2),W(2),f(1),W(1),f(2)][25.7%][w=557MiB/s][w=556 IOPS][eta 05m:49s] 209 | fiotest: (groupid=0, jobs=8): err= 0: pid=25212: Fri Nov 4 00:08:15 2022 210 | write: IOPS=373, BW=374MiB/s (392MB/s)(21.9GiB/60084msec); 0 zone resets 211 | slat (usec): min=10, max=1081.4k, avg=21349.93, stdev=56883.73 212 | clat (usec): min=4, max=2673.2k, avg=149732.07, stdev=290258.09 213 | lat (usec): min=287, max=2795.8k, avg=171082.91, stdev=321499.99 214 | clat percentiles (msec): 215 | | 1.00th=[ 3], 5.00th=[ 11], 10.00th=[ 16], 20.00th=[ 24], 216 | | 30.00th=[ 33], 40.00th=[ 43], 50.00th=[ 54], 60.00th=[ 69], 217 | | 70.00th=[ 100], 80.00th=[ 176], 90.00th=[ 347], 95.00th=[ 642], 218 | | 99.00th=[ 1636], 99.50th=[ 2005], 99.90th=[ 2400], 99.95th=[ 2500], 219 | | 99.99th=[ 2668] 220 | bw ( KiB/s): min=16374, max=2783526, per=100.00%, avg=392628.80, stdev=58773.48, samples=931 221 | iops : min= 14, max= 2717, avg=382.85, stdev=57.39, samples=931 222 | lat (usec) : 10=0.01%, 500=0.35%, 750=0.11%, 1000=0.10% 223 | lat (msec) : 2=0.36%, 4=0.80%, 10=2.36%, 20=11.42%, 50=31.95% 224 | lat (msec) : 100=22.79%, 250=15.45%, 500=7.80%, 750=2.37%, 1000=1.41% 225 | lat (msec) : 2000=2.22%, >=2000=0.49% 226 | cpu : usr=0.27%, sys=0.23%, ctx=30547, majf=0, minf=95 227 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.8%, 16=0.0%, 32=0.0%, >=64=0.0% 228 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 229 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 230 | issued rwts: total=0,22467,0,0 short=0,0,0,0 dropped=0,0,0,0 231 | latency : target=0, window=0, percentile=100.00%, depth=8 232 | 233 | Run status group 0 (all jobs): 234 | WRITE: bw=374MiB/s (392MB/s), 374MiB/s-374MiB/s (392MB/s-392MB/s), io=21.9GiB (23.6GB), run=60084-60084msec 235 | ``` 236 | 237 | ### Let's remove ZFS from the equation (mergerfs) 238 | 239 | ``` 240 | zpool destroy cache 241 | mkfs.btrfs -f -L cachebtrfs /dev/nvme0n1 242 | btrfs-progs v5.4.1 243 | See http://btrfs.wiki.kernel.org for more information. 244 | 245 | Detected a SSD, turning off metadata duplication. Mkfs with -m dup if you want to force metadata duplication. 246 | Label: cachebtrfs 247 | UUID: 53afb172-2ac8-43be-98e0-d749217bf129 248 | Node size: 16384 249 | Sector size: 4096 250 | Filesystem size: 238.47GiB 251 | Block group profiles: 252 | Data: single 8.00MiB 253 | Metadata: single 8.00MiB 254 | System: single 4.00MiB 255 | SSD detected: yes 256 | Incompat features: extref, skinny-metadata 257 | Checksum: crc32c 258 | Number of devices: 1 259 | Devices: 260 | ID SIZE PATH 261 | 1 238.47GiB /dev/nvme0n1 262 | 263 | root@nas:/home/gfm# mkdir /cache 264 | root@nas:/home/gfm# mount /dev/nvme0n1 /cache 265 | /dev/nvme0n1 on /cache type btrfs (rw,relatime,ssd,space_cache,subvolid=5,subvol=/) 266 | 267 | ``` 268 | 269 | 270 | #### Does btrfs raid1 work with mergerfs? 271 | 272 | Let's test. 273 | 274 | ``` 275 | root@nas:/home/gfm# mkfs.btrfs -f -L cached-mirror -m raid1 -d raid1 /dev/nvme0n1 /dev/sdb btrfs-progs v5.4.1 276 | See http://btrfs.wiki.kernel.org for more information. 
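# note: '-m raid1 -d raid1' above mirrors both metadata and data across the two devices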
277 | 278 | Label: cached-mirror 279 | UUID: 0c4241e9-e4ea-41b6-9dab-a3cc4b936edb 280 | Node size: 16384 281 | Sector size: 4096 282 | Filesystem size: 476.96GiB 283 | Block group profiles: 284 | Data: RAID1 1.00GiB 285 | Metadata: RAID1 1.00GiB 286 | System: RAID1 8.00MiB 287 | SSD detected: yes 288 | Incompat features: extref, skinny-metadata 289 | Checksum: crc32c 290 | Number of devices: 2 291 | Devices: 292 | ID SIZE PATH 293 | 1 238.47GiB /dev/nvme0n1 294 | 2 238.49GiB /dev/sdb 295 | 296 | root@nas:/home/gfm# mount /dev/nvme0n1 /cache/ 297 | fio-3.16 298 | Starting 8 processes 299 | fiotest: Laying out IO file (1 file / 16384MiB) 300 | Jobs: 7 (f=7): [W(1),_(1),W(6)][75.7%][eta 00m:44s] 301 | fiotest: (groupid=0, jobs=8): err= 0: pid=14619: Fri Nov 4 01:48:09 2022 302 | write: IOPS=438, BW=438MiB/s (459MB/s)(32.5GiB/76061msec); 0 zone resets 303 | slat (usec): min=28, max=44590k, avg=7783.12, stdev=487469.23 304 | clat (usec): min=329, max=44735k, avg=134311.57, stdev=1290330.14 305 | lat (usec): min=532, max=44735k, avg=142095.86, stdev=1378714.43 306 | clat percentiles (usec): 307 | | 1.00th=[ 644], 5.00th=[ 13304], 10.00th=[ 17957], 308 | | 20.00th=[ 29492], 30.00th=[ 43254], 40.00th=[ 53740], 309 | | 50.00th=[ 67634], 60.00th=[ 83362], 70.00th=[ 101188], 310 | | 80.00th=[ 122160], 90.00th=[ 181404], 95.00th=[ 231736], 311 | | 99.00th=[ 383779], 99.50th=[ 463471], 99.90th=[17112761], 312 | | 99.95th=[17112761], 99.99th=[17112761] 313 | bw ( KiB/s): min=163819, max=2094826, per=100.00%, avg=739932.10, stdev=58373.76, samples=727 314 | iops : min= 159, max= 2045, avg=721.74, stdev=56.99, samples=727 315 | lat (usec) : 500=0.05%, 750=1.04%, 1000=0.50% 316 | lat (msec) : 2=0.62%, 4=0.34%, 10=0.54%, 20=9.06%, 50=24.23% 317 | lat (msec) : 100=33.28%, 250=26.57%, 500=3.46%, 750=0.17%, 1000=0.01% 318 | lat (msec) : >=2000=0.15% 319 | cpu : usr=0.43%, sys=0.45%, ctx=46384, majf=0, minf=88 320 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.8%, 16=0.0%, 32=0.0%, >=64=0.0% 321 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 322 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 323 | issued rwts: total=0,33328,0,0 short=0,0,0,0 dropped=0,0,0,0 324 | latency : target=0, window=0, percentile=100.00%, depth=8 325 | 326 | Run status group 0 (all jobs): 327 | WRITE: bw=438MiB/s (459MB/s), 438MiB/s-438MiB/s (459MB/s-459MB/s), io=32.5GiB (34.9GB), run=76061-76061msec 328 | ``` 329 | 330 | BTRFS raid1 performance is really... poor wow. This isn't even with mergerfs enabled. 331 | 332 | Let's see about it. **results are 15% performance penalty** in line with past mergerfs tests on btrfs. Outcome: RAID1 on btrfs is probably not a good idea; lost 50% of raw performance before even mergerfs comes into play. 333 | 334 | ``` 335 | Run status group 0 (all jobs): 336 | WRITE: bw=296MiB/s (311MB/s), 296MiB/s-296MiB/s (311MB/s-311MB/s), io=17.4GiB (18.7GB), run=60172-60172msec 337 | ``` 338 | 339 | #### mdadm ext4 raid1 test 340 | 341 | ``` 342 | root@nas:/home/gfm# sgdisk -Z /dev/nvme0n1 343 | Creating new GPT entries in memory. 344 | GPT data structures destroyed! You may now partition the disk using fdisk or 345 | other utilities. 346 | root@nas:/home/gfm# sgdisk -Z /dev/sdb 347 | Creating new GPT entries in memory. 348 | GPT data structures destroyed! You may now partition the disk using fdisk or 349 | other utilities. 
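# note: 'sgdisk -Z' above zapped the GPT/MBR structures so mdadm starts from clean devices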
350 | root@nas:/home/gfm# mdadm --create /dev/md/cache /dev/nvme0n1 /dev/sdb --level=1 --raid-devices=2
351 | mdadm: Note: this array has metadata at the start and
352 | may not be suitable as a boot device. If you plan to
353 | store '/boot' on this device please ensure that
354 | your boot-loader understands md/v1.x metadata, or use
355 | --metadata=0.90
356 | Continue creating array? y
357 | mdadm: Defaulting to version 1.2 metadata
358 | mdadm: array /dev/md/cache started.
359 | root@nas:/home/gfm# mdadm --detail /dev/md/cache
360 | /dev/md/cache:
361 | Version : 1.2
362 | Creation Time : Fri Nov 4 02:02:11 2022
363 | Raid Level : raid1
364 | Array Size : 249926976 (238.35 GiB 255.93 GB)
365 | Used Dev Size : 249926976 (238.35 GiB 255.93 GB)
366 | Raid Devices : 2
367 | Total Devices : 2
368 | Persistence : Superblock is persistent
369 |
370 | Intent Bitmap : Internal
371 |
372 | Update Time : Fri Nov 4 02:02:38 2022
373 | State : clean, resyncing
374 | Active Devices : 2
375 | Working Devices : 2
376 | Failed Devices : 0
377 | Spare Devices : 0
378 |
379 | Consistency Policy : bitmap
380 |
381 | Resync Status : 2% complete
382 |
383 | Name : nas:cache (local to host nas)
384 | UUID : dda209ab:ace57985:25895a5b:f3d95068
385 | Events : 4
386 |
387 | Number Major Minor RaidDevice State
388 | 0 259 0 0 active sync /dev/nvme0n1
389 | 1 8 16 1 active sync /dev/sdb
390 | root@nas:/home/gfm# mkfs.ext4 /dev/md/cache
391 | mke2fs 1.45.5 (07-Jan-2020)
392 | Discarding device blocks: done
393 | Creating filesystem with 62481744 4k blocks and 15622144 inodes
394 | Filesystem UUID: 9bda5776-f50e-40fa-a826-8b2424de3f07
395 | Superblock backups stored on blocks:
396 | 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
397 | 4096000, 7962624, 11239424, 20480000, 23887872
398 |
399 | Allocating group tables: done
400 | Writing inode tables: done
401 | Creating journal (262144 blocks): done
402 | Writing superblocks and filesystem accounting information: done
403 |
404 | root@nas:/home/gfm# mount /dev/md/cache /cache/
405 | ```
406 |
407 | After waiting for the resync to complete, the IO test. Maybe ZFS is simply better at RAID1, without the performance impact.
408 |
409 | ```
410 | Run status group 0 (all jobs):
411 | WRITE: bw=478MiB/s (502MB/s), 478MiB/s-478MiB/s (502MB/s-502MB/s), io=28.1GiB (30.1GB), run=60069-60069msec
412 |
413 | Disk stats (read/write):
414 | md127: ios=0/228818, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/230087, aggrmerge=0/27, aggrticks=0/6898163, aggrin_queue=6647102, aggrutil=94.28%
415 | nvme0n1: ios=0/230087, merge=0/27, ticks=0/13717574, in_queue=13258980, util=54.05%
416 | sdb: ios=0/230087, merge=0/27, ticks=0/78753, in_queue=35224, util=94.28%
417 |
418 | ```
419 |
420 | https://raid.wiki.kernel.org/index.php/Write-intent_bitmap
421 | https://louwrentius.com/the-impact-of-the-mdadm-bitmap-on-raid-performance.html
422 |
423 | The write-intent bitmap may be hurting write performance. Let's disable it.
424 |
425 | ```
426 | mdadm /dev/md127 --grow --bitmap=none
427 | mdadm --detail /dev/md/cache
428 | # mount
429 | /dev/md127 on /cache type btrfs (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
430 | # fio results
431 | Run status group 0 (all jobs):
432 | WRITE: bw=540MiB/s (567MB/s), 540MiB/s-540MiB/s (567MB/s-567MB/s), io=31.7GiB (34.0GB), run=60032-60032msec
433 |
434 | ```
435 |
436 | Interestingly, performance starts out at peak speed; then CPU utilization jumps to 100% and performance drops.
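While these runs are going, it helps to watch the array state and per-device load from a second shell. These commands are my addition, not from the original session; they are standard procfs and sysstat tooling:

```
watch -n1 cat /proc/mdstat            # resync progress and bitmap state of the md array
iostat -x 1 /dev/nvme0n1 /dev/sdb     # per-device utilization and latency (sysstat package)
```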
437 |
438 | ```
439 | WRITE: bw=568MiB/s (596MB/s), 568MiB/s-568MiB/s (596MB/s-596MB/s), io=33.3GiB (35.8GB), run=60034-60034msec
440 | ```
441 |
442 | Tried something else, but it didn't help performance either.
443 |
444 | ```
445 | mdadm --grow --bitmap=internal --bitmap-chunk=131072 /dev/md127
446 | Run status group 0 (all jobs):
447 | WRITE: bw=329MiB/s (345MB/s), 329MiB/s-329MiB/s (345MB/s-345MB/s), io=19.4GiB (20.8GB), run=60263-60263msec
448 |
449 | ```
450 |
451 | Kill the mdadm array:
452 |
453 | ```
454 | mdadm -S /dev/md127
455 | mdadm --zero-superblock /dev/sdb /dev/nvme0n1
456 | ```
457 |
458 |
459 | #### Btrfs raw-speed disk results
460 |
461 | As expected, ~900 MB/s writes. This matches observations from the unraid trial on the same hardware.
462 |
463 | ```
464 | Starting 8 processes
465 | fiotest: Laying out IO file (1 file / 16384MiB)
466 | Jobs: 4 (f=0): [_(2),f(3),_(1),f(1),_(1)][100.0%][w=894MiB/s][w=893 IOPS][eta 00m:00s]
467 | fiotest: (groupid=0, jobs=8): err= 0: pid=53864: Fri Nov 4 00:44:23 2022
468 | write: IOPS=901, BW=902MiB/s (946MB/s)(52.9GiB/60059msec); 0 zone resets
469 | slat (usec): min=434, max=202119, avg=1705.52, stdev=5436.46
470 | clat (msec): min=3, max=263, avg=69.19, stdev=34.71
471 | lat (msec): min=3, max=277, avg=70.90, stdev=35.30
472 | clat percentiles (msec):
473 | | 1.00th=[ 14], 5.00th=[ 20], 10.00th=[ 24], 20.00th=[ 32],
474 | | 30.00th=[ 50], 40.00th=[ 61], 50.00th=[ 70], 60.00th=[ 78],
475 | | 70.00th=[ 87], 80.00th=[ 102], 90.00th=[ 111], 95.00th=[ 126],
476 | | 99.00th=[ 161], 99.50th=[ 174], 99.90th=[ 207], 99.95th=[ 222],
477 | | 99.99th=[ 243]
478 | bw ( KiB/s): min=442249, max=2527361, per=99.97%, avg=923157.66, stdev=47906.96, samples=960
479 | iops : min= 431, max= 2467, avg=901.15, stdev=46.79, samples=960
480 | lat (msec) : 4=0.01%, 10=0.04%, 20=6.16%, 50=23.95%, 100=49.08%
481 | lat (msec) : 250=20.76%, 500=0.01%
482 | cpu : usr=0.30%, sys=5.05%, ctx=59733, majf=0, minf=88
483 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.9%, 16=0.0%, 32=0.0%, >=64=0.0%
484 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
485 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
486 | issued rwts: total=0,54162,0,0 short=0,0,0,0 dropped=0,0,0,0
487 | latency : target=0, window=0, percentile=100.00%, depth=8
488 |
489 | Run status group 0 (all jobs):
490 | WRITE: bw=902MiB/s (946MB/s), 902MiB/s-902MiB/s (946MB/s-946MB/s), io=52.9GiB (56.8GB), run=60059-60059msec
491 |
492 | ```
493 |
494 | #### /cache BTRFS mergerfs test
495 |
496 | TL;DR: **Surprising results without ZFS. The performance penalty is only ~15%!**
497 |
498 | ```
499 | root@nas:/mnt# mount /mnt/cached/
500 | root@nas:/mnt# df -h /mnt/cached/
501 | Filesystem Size Used Avail Use% Mounted on
502 | mergerfs 239G 17G 222G 7% /mnt/cached
503 | Starting 8 processes
504 | fiotest: Laying out IO file (1 file / 16384MiB)
505 | Jobs: 3 (f=3): [_(3),f(1),_(2),f(2)][100.0%][eta 00m:00s]
506 | fiotest: (groupid=0, jobs=8): err= 0: pid=55377: Fri Nov 4 00:48:28 2022
507 | write: IOPS=770, BW=771MiB/s (808MB/s)(45.2GiB/60022msec); 0 zone resets
508 | slat (usec): min=16, max=80166, avg=10360.79, stdev=5295.21
509 | clat (msec): min=2, max=203, avg=72.59, stdev=13.58
510 | lat (msec): min=2, max=219, avg=82.95, stdev=14.65
511 | clat percentiles (msec):
512 | | 1.00th=[ 40], 5.00th=[ 61], 10.00th=[ 63], 20.00th=[ 69],
513 | | 30.00th=[ 70], 40.00th=[ 70], 50.00th=[ 71], 60.00th=[ 72],
514 | | 70.00th=[ 73], 80.00th=[ 74], 90.00th=[ 83], 95.00th=[ 96],
515 | | 99.00th=[ 132], 99.50th=[ 144], 99.90th=[ 165], 99.95th=[ 171],
516 | | 99.99th=[ 190]
517 | bw ( KiB/s): min=571253, max=913408, per=99.87%, avg=788216.07, stdev=7550.71, samples=960
518 | iops : min= 557, max= 892, avg=769.35, stdev= 7.39, samples=960
519 | lat (msec) : 4=0.01%, 10=0.09%, 20=0.11%, 50=1.72%, 100=93.88%
520 | lat (msec) : 250=4.20%
521 | cpu : usr=0.28%, sys=1.29%, ctx=89094, majf=0, minf=89
522 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.9%, 16=0.0%, 32=0.0%, >=64=0.0%
523 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
524 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
525 | issued rwts: total=0,46262,0,0 short=0,0,0,0 dropped=0,0,0,0
526 | latency : target=0, window=0, percentile=100.00%, depth=8
527 |
528 | Run status group 0 (all jobs):
529 | WRITE: bw=771MiB/s (808MB/s), 771MiB/s-771MiB/s (808MB/s-808MB/s), io=45.2GiB (48.5GB), run=60022-60022msec
530 |
531 | ```
532 |
533 | # Autotier experiment
534 |
535 | TL;DR: **Worst performance of the lot (about 50% of mergerfs throughput). Unmaintained project by 45Drives.**
536 |
537 | Unrelated to mergerfs, and just for fun: https://github.com/45Drives/autotier appears to be an abandoned project (based on the lack of response from the owner on open issues and no updates since 2021). Still, it is another FUSE solution, and one that natively integrates the "move files between storage tiers for me" idea.
538 |
539 | Let's kick the tires on it on my setup. I expect poor performance here: https://github.com/45Drives/autotier/issues/38
540 |
541 | ### autotierfs
542 |
543 | The filesystem is mounted manually with these options:
544 |
545 | ```
546 | autotierfs /mnt/autotier -o allow_other,default_permissions
547 | ```
548 |
549 | Its configuration:
550 |
551 | ```
552 | # cat /etc/autotier.conf
553 | # autotier config
554 | [Global] # global settings
555 | Log Level = 1 # 0 = none, 1 = normal, 2 = debug
556 | Tier Period = 1000 # number of seconds between file move batches
557 | Copy Buffer Size = 1 MiB # size of buffer for moving files between tiers
558 |
559 | [Tier 1] # tier name (can be anything)
560 | Path = /cache # full path to tier storage pool
561 | Quota = 20 % # absolute or % usage to keep tier under
562 | # Quota format: x ( % | [K..T][i]B )
563 | # Example: Quota = 5.3 TiB
564 |
565 | [Tier 2]
566 | Path = /mnt/slow-storage
567 | Quota = 100 %
568 |
569 | ```
570 |
571 | Results: poor, as expected (the results below are with ZFS backing the fast tier).
572 |
573 | ```
574 | Starting 8 processes
575 | Jobs: 8 (f=8): [W(2),f(3),W(2),f(1)][15.2%][w=215MiB/s][w=215 IOPS][eta 11m:16s]
576 | fiotest: (groupid=0, jobs=8): err= 0: pid=43270: Fri Nov 4 00:17:35 2022
577 | write: IOPS=183, BW=184MiB/s (193MB/s)(10.8GiB/60112msec); 0 zone resets
578 | slat (usec): min=101, max=854743, avg=43446.05, stdev=34704.34
579 | clat (msec): min=23, max=1337, avg=304.13, stdev=85.28
580 | lat (msec): min=23, max=1341, avg=347.57, stdev=92.01
581 | clat percentiles (msec):
582 | | 1.00th=[ 171], 5.00th=[ 211], 10.00th=[ 228], 20.00th=[ 249],
583 | | 30.00th=[ 266], 40.00th=[ 279], 50.00th=[ 296], 60.00th=[ 313],
584 | | 70.00th=[ 330], 80.00th=[ 351], 90.00th=[ 384], 95.00th=[ 418],
585 | | 99.00th=[ 493], 99.50th=[ 919], 99.90th=[ 1217], 99.95th=[ 1250],
586 | | 99.99th=[ 1301]
587 | bw ( KiB/s): min=67571, max=274432, per=100.00%, avg=189065.93, stdev=4192.32, samples=952
588 | iops : min= 65, max= 268, avg=184.20, stdev= 4.11, samples=952
589 | lat (msec) : 50=0.01%, 100=0.09%, 250=20.19%, 500=78.80%, 750=0.41%
590 | lat (msec) : 1000=0.06%, 2000=0.44%
591 | cpu : usr=0.08%, sys=0.34%, ctx=34828, majf=0, minf=98
592 | IO depths : 1=0.1%, 2=0.1%, 4=0.3%, 8=99.5%, 16=0.0%, 32=0.0%, >=64=0.0%
593 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
594 | complete : 0=0.0%, 4=99.9%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
595 | issued rwts: total=0,11051,0,0 short=0,0,0,0 dropped=0,0,0,0
596 | latency : target=0, window=0, percentile=100.00%, depth=8
597 |
598 | Run status group 0 (all jobs):
599 | WRITE: bw=184MiB/s (193MB/s), 184MiB/s-184MiB/s (193MB/s-193MB/s), io=10.8GiB (11.6GB), run=60112-60112msec
600 |
601 | ```
602 |
603 | #### Verify whether ZFS is the reason for autotier's poor performance
604 |
605 | Let's use btrfs for this test. Same hardware, but this time I made the btrfs a RAID1. Re-mounted with the same options. Spoiler: **autotier did not work on a btrfs filesystem**.
606 |
607 | ```
608 | mkfs.btrfs -f -L cachebtrfs -m raid1 -d raid1 /dev/sdb /dev/nvme0n1
609 | ```
610 |
611 | Debugging btrfs:
612 |
613 | ```
614 | dmesg | grep BTRFS | egrep 'error|warning|failed'
615 | ```
616 |
617 | ```
618 | root@nas:/mnt# btrfs fi df /cache/
619 | Data, RAID1: total=33.00GiB, used=31.78GiB
620 | System, RAID1: total=8.00MiB, used=16.00KiB
621 | Metadata, RAID1: total=1.00GiB, used=17.23MiB
622 | GlobalReserve, single: total=17.12MiB, used=0.00B
623 | ```
624 |
625 | **Autotier on btrfs did not work; the process kept hanging.** Let's use an `ext4` filesystem instead.
626 |
627 | ```
628 | root@nas:/home/gfm# umount /cache/
629 | umount: /cache/: target is busy.
630 | root@nas:/home/gfm# ps aux | grep autotier
631 | root 9511 6.0 0.2 832460 11180 ? Ssl 01:33 0:13 autotierfs /mnt/autotier -o allow_other,default_permissions
632 | root 10949 0.0 0.0 6432 724 pts/0 S+ 01:37 0:00 grep --color=auto autotier
633 | root@nas:/home/gfm# kill -9 9511
634 | root@nas:/home/gfm# umount /cache/
635 | root@nas:/home/gfm# rm /var/lib/autotier/5685251811202329732/
636 | adhoc.socket conflicts.log db/
637 | root@nas:/home/gfm# rm /var/lib/autotier/5685251811202329732/adhoc.socket
638 | root@nas:/home/gfm# mkfs -t ext4 /dev/nvme0n1
639 | mke2fs 1.45.5 (07-Jan-2020)
640 | /dev/nvme0n1 contains a btrfs file system labelled 'testme'
641 | Proceed anyway? (y,N) y
642 | Discarding device blocks: done
643 | Creating filesystem with 62514774 4k blocks and 15630336 inodes
644 | Filesystem UUID: ce2eed9e-8e10-4e0c-ab06-d11f17eefe2d
645 | Superblock backups stored on blocks:
646 | 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
647 | 4096000, 7962624, 11239424, 20480000, 23887872
648 |
649 | Allocating group tables: done
650 | Writing inode tables: done
651 | Creating journal (262144 blocks):
652 | done
653 | Writing superblocks and filesystem accounting information: done
654 |
655 | ```
656 |
657 | #### EXT4 autotier test results
658 |
659 | ```
660 | fio-3.16
661 | Starting 8 processes
662 | fiotest: Laying out IO file (1 file / 16384MiB)
663 | Jobs: 8 (f=8): [W(8)][100.0%][w=640MiB/s][w=639 IOPS][eta 00m:00s]
664 | fiotest: (groupid=0, jobs=8): err= 0: pid=12306: Fri Nov 4 01:41:48 2022
665 | write: IOPS=657, BW=658MiB/s (689MB/s)(38.5GiB/60030msec); 0 zone resets
666 | slat (usec): min=45, max=276076, avg=12158.02, stdev=19153.35
667 | clat (usec): min=828, max=562034, avg=85134.29, stdev=60544.23
668 | lat (usec): min=1052, max=573771, avg=97292.87, stdev=64876.67
669 | clat percentiles (msec):
670 | | 1.00th=[ 40], 5.00th=[ 46], 10.00th=[ 51], 20.00th=[ 55],
671 | | 30.00th=[ 58], 40.00th=[ 62], 50.00th=[ 65], 60.00th=[ 68],
672 | | 70.00th=[ 73], 80.00th=[ 87], 90.00th=[ 155], 95.00th=[ 213],
673 | | 99.00th=[ 355], 99.50th=[ 447], 99.90th=[ 535], 99.95th=[ 542],
674 | | 99.99th=[ 558]
675 | bw ( KiB/s): min=94154, max=1044480, per=99.88%, avg=672475.66, stdev=22655.00, samples=960
676 | iops : min= 90, max= 1020, avg=656.37, stdev=22.13, samples=960
677 | lat (usec) : 1000=0.01%
678 | lat (msec) : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.06%, 50=9.76%
679 | lat (msec) : 100=70.81%, 250=17.09%, 500=1.93%, 750=0.32%
680 | cpu : usr=0.36%, sys=0.87%, ctx=121217, majf=0, minf=92
681 | IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=99.9%, 16=0.0%, 32=0.0%, >=64=0.0%
682 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
683 | complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
684 | issued rwts: total=0,39471,0,0 short=0,0,0,0 dropped=0,0,0,0
685 | latency : target=0, window=0, percentile=100.00%, depth=8
686 |
687 | Run status group 0 (all jobs):
688 | WRITE: bw=658MiB/s (689MB/s), 658MiB/s-658MiB/s (689MB/s-689MB/s), io=38.5GiB (41.4GB), run=60030-60030msec
689 |
690 | ```
691 |
692 | That's about 2/3 of the drive's raw performance; `mergerfs` is still much better. The only benefit of `autotier` would be its automatic promotion of files between tiers based on age and usage.
693 |
694 | I'm a little uneasy about placing a dependency on `autotier` given that it doesn't seem to be maintained. IMO, `mergerfs + btrfs` is the winning combination.
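For reference: the exact fio invocation was not captured in these notes. A job consistent with the logs above (job name `fiotest`, 8 jobs, 1MiB blocks, queue depth 8, 16GiB file per job, 60-second runs, grouped reporting) would look roughly like the following; treat it as a reconstruction, not the original command, and point `--directory` at whichever mount is under test:

```
fio --name=fiotest --directory=/mnt/cached --ioengine=libaio --direct=1 \
    --rw=write --bs=1M --iodepth=8 --numjobs=8 --size=16G \
    --time_based --runtime=60 --group_reporting
```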
695 |
-------------------------------------------------------------------------------- /plex_mediaserver_lxc_hw_transcoding.md: --------------------------------------------------------------------------------
1 | # Plex Media Server
2 |
3 | Notes about Plex and hardware transcoding. My setup is an LXC container on Proxmox, with an NFS mount to my unraid.
4 |
5 | ## Intel GPU monitoring (hw transcoding)
6 | ![Alt text](img/intel-gpu-top.png?raw=true "intel_gpu_top")
7 |
8 | Use this tool (requires `apt-get install intel-gpu-tools`):
9 | ```
10 | intel_gpu_top
11 | ```
12 |
13 | ## LXC container settings
14 |
15 | ```
16 | lxc.cgroup2.devices.allow: c 226:0 rwm
17 | lxc.cgroup2.devices.allow: c 226:128 rwm
18 | lxc.cgroup2.devices.allow: c 29:0 rwm
19 | lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
20 | lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
21 | lxc.mount.entry: /dev/dri/renderD128 dev/renderD128 none bind,optional,create=file
22 | ```
23 |
24 | Install packages inside the LXC container (requires the non-free apt repo):
25 |
26 | ```
27 | apt install intel-media-va-driver-non-free
28 | ```
29 |
30 | ## Setup Proxmox host for GPU passthru to LXC
31 |
32 | - Proxmox kernel must be 5.19 or later.
33 |
34 | Based on: https://wiki.archlinux.org/title/intel_graphics
35 |
36 | I have Rocket Lake, so the value is `3`:
37 |
38 | ```
39 | echo "options i915 enable_guc=3" >> /etc/modprobe.d/i915.conf
40 | ```
41 |
42 | ```
43 | nano /etc/kernel/cmdline
44 | ```
45 | add `initcall_blacklist=sysfb_init`
46 |
47 | Refresh the boot environment: `proxmox-boot-tool refresh`
-------------------------------------------------------------------------------- /proxmox.md: --------------------------------------------------------------------------------
1 | # Proxmox Virtual Environment
2 |
3 | This is a list of notes related to Proxmox. We use Proxmox as the top layer of my setup (meaning the NAS is a VM with PCIe passthrough for the disk HBA controllers).
4 |
5 | ## CPU Pinning (to be researched)
6 |
7 | https://www.youtube.com/watch?v=-c_451HV6fE (useful info)
8 |
9 | ```
10 | #lscpu -e
11 | # lstopo
12 | ```
13 |
14 | ## Disable subscription nag / initial setup helper script
15 |
16 | Disable the subscription nag using this tool: https://tteck.github.io/Proxmox/
17 |
18 | ```
19 | bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/misc/post-pve-install.sh)"
20 | ```
21 |
22 | ## PCI passthrough SATA onboard controller AHCI
23 |
24 | ```
25 | lspci -knn
26 | ```
27 |
28 | You will need to edit `udev` and a few other things.
29 |
30 | https://gist.github.com/kiler129/4f765e8fdc41e1709f1f34f7f8f41706
31 |
32 | Probably not needed, though; just add the softdep to /etc/modprobe.d/asmedia-sata.conf:
33 |
34 | ```
35 | options vfio-pci ids=8086:43d3
36 | softdep ahci pre: vfio-pci
37 | ```
38 |
39 | ## Building corefreq-cli fails
40 |
41 | Error:
42 | ```
43 | make[1]: *** /lib/modules/6.2.11-2-pve/build: No such file or directory. Stop.
44 | make: *** [Makefile:86: all] Error 2
45 | ```
46 | You need to install the kernel headers.
47 | ```
48 | apt-get install linux-headers-`uname -r`
49 | ```
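Side note (my addition, verify on your own system): on a Proxmox kernel the headers package is not always named `linux-headers-*`; depending on the PVE release it may be `pve-headers-$(uname -r)` or `proxmox-headers-$(uname -r)` instead:

```
apt-get install pve-headers-$(uname -r)
```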
50 |
51 | ## Intel GPU passthru
52 |
53 | Enable the kernel `kvmgt` module in /etc/modules.
54 |
55 | Also add `i915.enable_gvt=1` to /etc/kernel/cmdline.
56 |
57 | Run
58 | `update-initramfs -u -k all` and refresh.
59 |
60 | ## VM not discovering hard disks (plug and unplug)
61 |
62 | Try changing the `/etc/kernel/cmdline` to set `iommu=soft`:
63 |
64 | ```
65 | root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=soft pcie_aspm=force pcie_aspm.policy=powersupersave vfio-pci.ids=144d:a802,8086:7ae2 initcall_blacklist=sysfb_init
66 | ```
67 |
68 | ## Avoid pvestatd mkdir bug
69 |
70 | Proxmox has a daemon that constantly checks NFS mount points configured for "ISO" and "Container Templates" to make sure the NFS share is responsive (<2 seconds). This is problematic if the unraid-mover.py script keeps deleting the 'template/iso/' and 'template/cache/' folders.
71 |
72 | You may see these kinds of errors (it's because the NAS didn't respond quickly enough to a directory listing, due to HDDs sleeping in the slow branch where the folders live):
73 | ```
74 | Nov 24 01:36:50 centrix pvestatd[2102]: storage 'unraid' is not online
75 | Nov 24 01:36:53 centrix pvestatd[2102]: unable to activate storage 'unraid' - directory '/mnt/pve/unraid' does not exist or is unreachable
76 | Nov 24 01:37:00 centrix pvestatd[2102]: mkdir /mnt/pve/unraid/template: No space left on device at /usr/share/perl5/PVE/Storage/Plugin.pm line 1323.
77 | Nov 24 01:37:10 centrix pvestatd[2102]: mkdir /mnt/pve/unraid/template: No space left on device at /usr/share/perl5/PVE/Storage/Plugin.pm line 1323.
78 | Nov 24 01:37:20 centrix pvestatd[2102]: mkdir /mnt/pve/unraid/template: No space left on device at /usr/share/perl5/PVE/Storage/Plugin.pm line 1323.
79 | ```
80 |
81 | Let's set the immutable flag attribute on these folders to prevent their deletion, so they always stay on NVMe and no seek to slow-storage is needed.
82 |
83 | ```
84 | root@nas:/cache# mkdir -p template/iso
85 | root@nas:/cache# mkdir -p template/cache
86 | root@nas:/cache# chattr +i -RV template
87 | root@nas:/cache# chattr +i -RV template/iso
88 | root@nas:/cache# chattr +i -RV template/cache
89 | ```
-------------------------------------------------------------------------------- /shares_configuration.md: --------------------------------------------------------------------------------
1 | # Samba and NFS shares
2 |
3 | Notes on setting up NFS and Samba shares. Since I mostly use cockpit-project for this, this page is mostly notes, tricks, and tips.
4 |
5 | ### Apple Time Machine SMB settings
6 |
7 | Add these additional settings to the share for OSX / Time Machine support:
8 | ```
9 | create mask = 0600
10 | directory mask = 0700
11 | spotlight = yes
12 | fruit:aapl = yes
13 | fruit:time machine = yes
14 | ```
15 |
16 | ## Permissions on new share
17 |
18 | After creating a user and a new ZFS filesystem on the cache drive, we need to grant permissions:
19 |
20 | ```
21 | chown username:shares /cache/test/
22 | ```
23 |
24 | The group `shares` is for multi-user access. SMB should now be writable.
25 |
26 | ## NFS
27 |
28 | Special settings to force user and group mapping to specific Linux UIDs (the NAS user/group of your choice). In our case, the "proxmox" user (1001) and "shares" group (1002) should be mapped onto all NFS requests to /mnt/cached.
29 |
30 | **NOTE**: NFSv4-only and fsid=0 will give you headaches. Always use fsid >0 to avoid NFS namespaces.
31 |
32 | ```
33 | rw,sync,no_subtree_check,fsid=1,async,all_squash,anonuid=1001,anongid=1002
34 | ```
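Put together, a complete `/etc/exports` entry using those options could look like this (the client subnet here is a made-up example; run `exportfs -ra` afterwards to apply):

```
/mnt/cached 192.168.1.0/24(rw,sync,no_subtree_check,fsid=1,async,all_squash,anonuid=1001,anongid=1002)
```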
35 |
36 | ### Temporarily allow and map root, to allow rsync from the old server
37 |
38 | Without the no_root_squash flag, the rsync copy will fail because the chgrp command errors out.
39 |
40 | ```
41 | rw,sync,no_subtree_check,fsid=1,async,no_root_squash
42 | ```
-------------------------------------------------------------------------------- /snapraid-cheatsheet.md: --------------------------------------------------------------------------------
1 | # SnapRaid Cheatsheet & Debugging
2 |
3 | Notes related to snapraid.
4 |
5 | https://www.snapraid.it/manual
6 |
7 | ## List of devices to log
8 |
9 | `snapraid -l devices.log devices`
10 |
11 | ## List all the files at the time of last `sync`
12 |
13 | `snapraid list`
14 |
15 | Output snippet: our snapraid has parity backing 89718 files, about 12 TB of data.
16 |
17 | ```
18 | 89718 files, for 12353 GB
19 | 0 links
20 | ```
21 |
22 | ## Delete snapraid data & backup files to start from scratch
23 |
24 | From your snapraid.conf, find and delete all parity and content files:
25 |
26 | ```
27 | # du -sh /var/snapraid.content /mnt/snapraid-content/disk*/snapraid.content /mnt/parity1/snapraid.parity
28 | 737M /var/snapraid.content
29 | 737M /mnt/snapraid-content/disk1/snapraid.content
30 | 737M /mnt/snapraid-content/disk2/snapraid.content
31 | 13T /mnt/parity1/snapraid.parity
32 | ```
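To actually delete them, something along these lines works; the paths come straight from the `du` output above, but double-check them against your own snapraid.conf first:

```
rm /var/snapraid.content /mnt/snapraid-content/disk*/snapraid.content /mnt/parity1/snapraid.parity
```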
-------------------------------------------------------------------------------- /snapraid_btrfs_runner.md: --------------------------------------------------------------------------------
1 | # Snapraid-btrfs-runner
2 |
3 | We use this program to automate daily parity sync and scrub operations. It is a wrapper around `snapraid-btrfs`, offering the same core operations: `diff`, `sync`, `scrub`.
4 |
5 | ## /opt/snapraid-btrfs-runner/snapraid-btrfs-runner.conf
6 |
7 | The configuration file contents:
8 |
9 | ```
10 | [snapraid-btrfs]
11 | ; path to the snapraid-btrfs executable (e.g. /usr/bin/snapraid-btrfs)
12 | executable = /usr/local/bin/snapraid-btrfs
13 | ; optional: specify snapper-configs and/or snapper-configs-file as specified in snapraid-btrfs
14 | ; only one instance of each can be specified in this config
15 | ;snapper-configs = /etc/snapper/configs
16 | ;snapper-configs = /etc/snapper/configs
17 | ;snapper-configs-file =
18 | ; specify whether snapraid-btrfs should run the pool command after the sync, and optionally specify pool-dir
19 | pool = false
20 | pool-dir =
21 | ; specify whether snapraid-btrfs-runner should automatically clean up all but the last snapraid-btrfs sync snapshot after a successful sync
22 | cleanup = true
23 |
24 | [snapper]
25 | ; path to snapper executable (e.g. /usr/bin/snapper)
26 | executable = /usr/bin/snapper
27 |
28 | [snapraid]
29 | ; path to the snapraid executable (e.g. /usr/bin/snapraid)
30 | ;executable = /usr/bin/snapraid
31 | executable = /usr/local/bin/snapraid
32 | ; path to the snapraid config to be used
33 | config = /etc/snapraid.conf
34 | ;config = /etc/snapraid.conf
35 | ; abort operation if there are more deletes than this, set to -1 to disable
36 | deletethreshold = 300
37 | ; if you want touch to be run each time
38 | touch = false
39 |
40 | [logging]
41 | ; logfile to write to, leave empty to disable
42 | file = /var/log/snapraid.log
43 | ; maximum logfile size in KiB, leave empty for infinite
44 | maxsize = 5000
45 |
46 | [email]
47 | ; when to send an email, comma-separated list of [success, error]
48 | sendon = success,error
49 | ; set to false to get full program output via email
50 | short = true
51 | subject = [SnapRAID] Status Report:
52 | from = changeme@gmail.com
53 | to = changeme@gmail.com
54 | ; maximum email size in KiB
55 | maxsize = 500
56 |
57 | [smtp]
58 | host =
59 | ; leave empty for default port
60 | port =
61 | ; set to "true" to activate
62 | ssl = false
63 | tls = false
64 | user =
65 | password =
66 |
67 | [scrub]
68 | ; set to true to run scrub after sync
69 | enabled = false
70 | ; plan can be 0-100 percent, new, bad, or full
71 | plan = 12
72 | ; only used for percent scrub plan
73 | older-than = 10
74 |
75 | ```
76 |
77 | ## Scheduling via systemd timers
78 |
79 | ### /etc/systemd/system/snapraid-btrfs-runner.service
80 |
81 | Contents:
82 |
83 | ```
84 | [Unit]
85 | Description=Run snapraid-btrfs-runner every night
86 |
87 | [Service]
88 | Type=oneshot
89 | ExecStart=/usr/bin/python3 /opt/snapraid-btrfs-runner/snapraid-btrfs-runner.py -c /opt/snapraid-btrfs-runner/snapraid-btrfs-runner.conf
90 |
91 | ```
92 |
93 | ### /etc/systemd/system/snapraid-btrfs-runner.timer
94 |
95 | Contents:
96 |
97 | ```
98 | [Unit]
99 | Description=Run snapraid-btrfs-runner every night
100 |
101 | [Timer]
102 | OnCalendar=*-*-* 03:00:00
103 | RandomizedDelaySec=30m
104 |
105 | [Install]
106 | WantedBy=timers.target
107 |
108 | ```
109 |
110 | ### Enable systemd timer
111 |
112 | ```
113 | systemctl enable snapraid-btrfs-runner.timer --now
114 | ```
115 |
116 | ### Disable snapraid-btrfs-runner systemd timer
117 |
118 | ```
119 | systemctl disable snapraid-btrfs-runner.timer --now
120 | ```
121 |
122 | ## Errors
123 |
124 | ### snapraid-btrfs: /mnt/disk1/.snapshots is not a valid btrfs subvolume
125 |
126 | Likely root cause: the ".snapshots/" subvolumes were deleted.
127 |
128 | 1. Verify we can see the snapper configs:
129 | ```
130 | snapraid-btrfs ls
131 | ```
132 | 2. Snapper is likely still referencing stale information in its config files; let's just delete them and set this up again:
133 | ```
134 | rm /etc/snapper/configs/mergerfsdisk1 /etc/snapper/configs/mergerfsdisk2
135 | ```
136 |
137 | Modify SNAPPER_CONFIGS in `/etc/default/snapper` and delete the names.
138 |
139 | Recreate:
140 | ```
141 | snapper -c mergerfsdisk1 create-config -t mergerfsdisk /mnt/disk1
142 | snapper -c mergerfsdisk2 create-config -t mergerfsdisk /mnt/disk2
143 | ```
144 | 3. Verify the fix with `snapraid-btrfs ls`.
145 |
-------------------------------------------------------------------------------- /storage_tiered_cache.md: --------------------------------------------------------------------------------
1 | # Tiered Storage Solution
2 |
3 | I want to combine the speed benefits of NVMe (super fast) with the large storage capacity of spinning hard drives for my media. The ideal situation: any new writes land on the NVMe drive (/cache); later, `stale` or `unused` data gets moved down to the /mnt/slow-storage disks.
4 |
5 | Achieving this hybrid setup requires some automated script to do the file housekeeping. The author of mergerfs has some basic scripts: https://github.com/trapexit/mergerfs#time-based-expiring - while those are great, advanced tiered-storage management basically means writing your own scripts with more logic in them; a minimal sketch of the idea follows below.
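A minimal sketch of the time-based approach, using the tiers from this repo (`/cache` fast, `/mnt/slow-storage` slow). The 30-day threshold is arbitrary, and this is untested shorthand rather than one of the polished movers in `filemover/`:

```
#!/bin/bash
# Move files not accessed in 30+ days from the fast tier to the slow tier,
# preserving relative paths so the mergerfs pool keeps showing one merged tree.
CACHE=/cache
SLOW=/mnt/slow-storage
DAYS=30

# find prints paths relative to $CACHE; rsync reads that list from stdin
find "$CACHE" -type f -atime +"$DAYS" -printf '%P\n' | \
  rsync -axqHAXWES --files-from=- --remove-source-files "$CACHE/" "$SLOW/"
```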
6 |
7 | ### Use cases that an ideal script should consider
8 |
9 | 1. NVMe cache almost full (/cache). To avoid `no space left on device` errors, the script would need to purge data and free space when this condition happens, if we want to keep using the NVMe as a fast cache.
10 |
11 | *Note: mergerfs has a minfreespace setting (default 4GB) that makes it write to the next disk in the mount-point settings. In mergerfs.md I used fall-back to `/mnt/slow-storage` when the NVMe gets full; so if you want to keep things simple, you don't really need a script.*
12 |
13 | 2. Special rules per mount (e.g. "movies" vs. "documents").
14 |
15 | We probably want to purge large files from the "movies" cache rather than move "documents" off the NVMe when in a space crunch. Documents probably deserve to be purged from the cache disk only after X days unmodified (I set up ZFS raid1 protection on `/cache`, while the slow disks rely on a scheduled snapraid job, meaning there's a window of time in which data loss is possible on the `/mnt/slow-storage` mount).
16 |
17 | Possible idea/workaround: define **ZFS datasets with quotas** to limit how much can be written into /cache/movies (a % of total available disk). In theory this should force mergerfs to fall back to writing to `/mnt/slow-storage`. (to be tested; see the sketch below)
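A quick sketch of that idea, assuming the `cache` zpool from the benchmarks (the dataset name and quota size are made up):

```
zfs create cache/movies
zfs set quota=100G cache/movies   # writes beyond 100G fail with ENOSPC;
                                  # moveonenospc=true should let mergerfs retry on a slow branch
zfs get quota cache/movies        # verify
```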
18 |
19 | If no ZFS dataset quota is pursued, https://duc.zevv.nl could be used to index disk space utilization on /cache; scripted rules can then check when the /cache/movies folder exceeds X GB and execute a purge.
20 |
21 | #### rsync notes
22 | https://linux.die.net/man/1/rsync
23 |
24 | === mergerfs owner recommends these options ===
25 | `axqHAXWESR`
26 |
27 | ```
28 | a = archive mode
29 | x = don't cross filesystem boundaries
30 | q = quiet
31 | H = preserve hard links
32 | A = preserve ACLs (implies -p)
33 | X = preserve extended attributes
34 | W = copy files whole (w/o delta-xfer algorithm)
35 | E = preserve executability
36 | S = handle sparse files efficiently
37 | R = Use relative paths. This means that the full path names specified on the command line are sent to the server rather than just the last parts of the filenames
38 | ```
39 |
40 | === unraid ===
41 | `dIWRpEAXogt`
42 |
43 | ```
44 | d = transfer directories without recursing
45 | I = don't skip files that match size and time
46 | W = With this option rsync's delta-transfer algorithm is not used and the whole file is sent as-is instead
47 | R = Use relative paths. This means that the full path names specified on the command line are sent to the server rather than just the last parts of the filenames
48 | p = preserve permissions
49 | E = preserve executability
50 | A = preserve ACLs (implies -p)
51 | X = preserve extended attributes
52 | o = preserve owner (super-user only)
53 | g = preserve group
54 | t = preserve modification times
55 | ```
-------------------------------------------------------------------------------- /to_be_added.md: --------------------------------------------------------------------------------
1 | ## TODOs
2 |
3 | Investigate VPN docker: https://github.com/qdm12/gluetun#Features
4 | https://wiki.servarr.com/docker-guide
5 |
6 | ### Misc
7 |
8 | https://3os.org/infrastructure/synology/auto-dsm-config-backup/
9 |
10 | ## Autotier experiment
11 |
12 | ```
13 | libsnappy-dev
14 | ```
15 |
16 | [3:0:0:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdc << parity /mnt/parity1
17 | [3:0:1:0] disk ATA WDC WD80EFZX-68U 0A83 /dev/sdd << /mnt/btrfs-roots/mergerfsdisk1
18 | [3:0:2:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sde << /mnt/btrfs-roots/mergerfsdisk2
19 |
20 |
21 | mkfs.btrfs -f -L mergerfsdisk1 /dev/sdd
22 | mkfs.btrfs -f -L mergerfsdisk2 /dev/sde
23 | # add to /etc/fstab (root btrfs)
24 | mount /mnt/btrfs-roots/mergerfsdisk1
25 | btrfs subvolume create /mnt/btrfs-roots/mergerfsdisk1/data
26 | mount /mnt/btrfs-roots/mergerfsdisk2
27 | btrfs subvolume create /mnt/btrfs-roots/mergerfsdisk2/data
28 | mkdir /mnt/disk{1,2}
29 | # add to /etc/fstab (data subvolumes)
30 | mount /mnt/disk1
31 | mount /mnt/disk2
32 | btrfs subvolume create /mnt/btrfs-roots/mergerfsdisk1/content
33 | btrfs subvolume create /mnt/btrfs-roots/mergerfsdisk2/content
34 | mkdir -p /mnt/snapraid-content/disk{1,2}
35 | mount /mnt/snapraid-content/disk1
36 | mount /mnt/snapraid-content/disk2
37 | # Now we can unmount the parent-root-mounts
38 | umount /mnt/btrfs-roots/mergerfsdisk1
39 | umount /mnt/btrfs-roots/mergerfsdisk2
40 |
41 |
42 | # mergerfs
43 | wget https://github.com/trapexit/mergerfs/releases/download/2.33.5/mergerfs_2.33.5.ubuntu-focal_amd64.deb
44 | dpkg -i mergerfs_2.33.5.ubuntu-focal_amd64.deb
45 |
46 | # add fstab slow-storage
47 | /mnt/disk* /mnt/storage fuse.mergerfs defaults,nonempty,allow_other,use_ino,cache.files=off,moveonenospc=true,dropcacheonclose=true,minfreespace=200G,fsname=mergerfs 0 0
48 |
49 | # Update /etc/snapraid.conf file with the disk maps.
50 |
51 | # Snapper configs for each disk
52 | snapper -c mergerfsdisk1 create-config -t mergerfsdisk /mnt/disk1
53 | snapper -c mergerfsdisk2 create-config -t mergerfsdisk /mnt/disk2
54 |
55 | # verify
56 | snapper list-configs
57 |
58 | # install snapraid-btrfs, set it up, and then verify with
59 | snapraid-btrfs ls
60 |
61 | # install runner
62 | git clone https://github.com/fmoledina/snapraid-btrfs-runner.git /opt/snapraid-btrfs-runner
63 |
--------------------------------------------------------------------------------