├── 001-getting-started ├── 001-getting-started.md └── images │ ├── image1.svg │ └── image2.svg ├── 002-elasticsearch-and-the-jvm └── 002-elasticsearch-and-the-jvm.md ├── 003-about-lucene ├── 003-about-lucene.md └── images │ └── image2.svg ├── 004-cluster-design ├── 004-cluster-design.md └── images │ ├── image1.svg │ ├── image4.png │ └── image5.png ├── 005-design-event-logging ├── 005-design-event-logging.md └── images │ └── image6.svg ├── 006-operating-daily └── 006-operating-daily.md ├── 007-monitoring-es ├── 007-monitoring-es.md └── images │ ├── image10.png │ ├── image7.png │ ├── image8.png │ └── image9.png ├── 100-use-cases-reindexing-36-billion-docs ├── 100-use-cases-reindexing-36-billion-docs.md └── images │ ├── image10.png │ ├── image11.png │ ├── image12.png │ ├── image13.png │ ├── image14.png │ ├── image15.png │ ├── image3.png │ ├── image7.svg │ ├── image8.svg │ └── image9.png ├── 101-use-case-migrating-cluster-over-ocean ├── 101-use-case-migrating-cluster-over-ocean.md └── images │ ├── image2.png │ ├── image2.svg │ └── image3.png ├── 102-use-case-advanced-architecture-high-volume-reindexing ├── 102-use-case-advanced-architecture-high-volume-reindexing.md └── images │ ├── image1.png │ ├── image2.svg │ ├── image3.svg │ ├── image4.svg │ ├── image5.svg │ └── image6.svg ├── 103-use-case-migrating-130tb-cluster-without-downtime ├── 103-use-case-migrating-130tb-cluster-without-downtime.md └── images │ ├── image16.svg │ ├── image17.svg │ ├── image18.svg │ ├── image19.png │ ├── image20.svg │ ├── image21.svg │ └── image22.svg ├── LICENSE ├── README.md ├── ZH-CN ├── 001-入门 │ ├── 001-入门.md │ └── images │ │ ├── image1.svg │ │ └── image2.svg ├── 002-Elasticsearch和JVM │ └── 002-elasticsearch-and-the-jvm.md └── 003-关于Lucene │ ├── 003-关于Lucene.md │ └── images │ └── image2.svg ├── _config.yml └── images └── image1.png /001-getting-started/001-getting-started.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Getting Started with Elasticsearch 6 | 7 | This chapter is for people who have not used Elasticsearch yet. It covers Elasticsearch basic concepts and guides you into deploying and using your first single node cluster. Every concept explained here are detailed further in this book. 8 | 9 | In this introduction chapter you will learn: 10 | 11 | - The basic concepts behind Elasticsearch 12 | - What's an Elasticsearch cluster 13 | - How to deploy your first, single node Elasticsearch cluster on the most common operating systems 14 | - How to use Elasticsearch to index documents and find content 15 | - Elasticsearch configuration basics 16 | - What's an Elasticsearch plugin and how to use them 17 | 18 | --- 19 | 20 | ## Prerequisites 21 | 22 | In order to read this book and perform the operations described along its chapters, you need: 23 | 24 | - A machine or virtual machine running one of the popular Linux or Unix environments: Debian / Ubuntu, RHEL / CentOS or FreeBSD. Running Elasticsearch on Mac OS or Windows is not covered in this book 25 | - A basic knowledge of UNIX command line and the use of a terminal 26 | - Your favorite text editor 27 | 28 | If you have never used Elasticsearch before, I recommend to create a virtual machine so you won't harm your main system in case of mistake. You can either run it locally using a virtualization tool like [Virtualbox](https://www.virtualbox.org/) or on your favorite cloud provider. 
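If you want to check that your machine matches the environments listed above before going further, a couple of standard commands are enough; nothing here is specific to Elasticsearch, and the output will vary with your distribution:

```bash
# Confirm which distribution and kernel you are running
cat /etc/os-release    # Debian / Ubuntu / RHEL / CentOS
uname -a               # kernel version and architecture, also works on FreeBSD

# Make sure the machine has a reasonable amount of memory available
free -h
```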
29 | 30 | --- 31 | 32 | ## Elasticsearch basic concepts 33 | 34 | Elasticsearch is a distributed, scalable, fault tolerant open source search engine written in Java. It provides a powerful REST API both for adding or searching data and updating the configuration. Elasticsearch is led by Elastic, a company created by Shay Banon, who started the project on top of Lucene. 35 | 36 | ### REST APIs 37 | 38 | A REST API is an application program interface (API) that uses HTTP requests to `GET`, `PUT`, `POST` and `DELETE` data. An API for a website is code that allows two software programs to communicate with each another. The API spells out the proper way for a developer to write a program requesting services from an operating system or other application. REST is the Web counterpart of databases CRUD (Create, Read, Update, Delete). 39 | 40 | ### Open Source 41 | 42 | Open source means that Elasticsearch source code, the recipe to build the software, is public, free, and that anyone can contribute to the project by adding missing feature, documentation or fixing bugs. If accepted by the project, their work is then available to the whole community. Because Elasticsearch is open source, the company behind it can go bankrupt or stop maintaining the project without killing it. Someone else will be able to take over it and keep the project alive. 43 | 44 | ### Java 45 | 46 | Java is a programming language created in 1995 by Sun Microsystems. Java applications runs on the top of the Java Virtual Machine (JVM), which means that it is independent of the platform it has been written on. Java is most well known for its Garbage Collector (GC), a powerful way to manage memory. 47 | 48 | Java is not Javascript, which was developed in the mid 90s by Netscape INC. Despite having very similar names, Java and Javascript are two different languages, with a different purpose. 49 | 50 | > Javascript is to Java what hamster is to ham. – Jeremy Keith 51 | 52 | ### Distributed 53 | 54 | Elasticsearch runs on as many hosts as required by the workload or the amount of data. Hosts communicate and synchronize using messages over the network. A networked machine running Elasticsearch is called a node, and the whole group of nodes sharing the same cluster name is called a cluster. 55 | 56 | ### Scalable 57 | 58 | Elasticsearch scales horizontally. Horizontal scaling means that the cluster can grow by adding new nodes. When adding more machines, you don't need to restart the whole cluster. When a new node joins the cluster, it gets a part of the existing data. Horizontal scaling is the opposite of vertical scaling, where the only way to grow is running a software on a bigger machine. 59 | 60 | ### Fault tolerant 61 | 62 | Elasticsearch ensures the data is replicated at least once - unless specified - on 2 separate nodes. When a node leaves the cluster, Elasticsearch rebuilds the replication on the remaining nodes, unless there's no more node to replicate to. 63 | 64 | --- 65 | 66 | ## What's an Elasticsearch cluster? 67 | 68 | A cluster is a host or a group of hosts running Elasticsearch and configured with the same `cluster name`. The default `cluster name` is `elasticsearch` but using it in production is not recommended. 69 | 70 | Each host in an Elasticsearch cluster can fulfill one or multiple roles in the following: 71 | 72 | ### Master node 73 | 74 | The master nodes control the cluster. 
They give joining nodes information about the cluster, decide where to move the data, and reallocate the missing data when a node leaves. When multiple nodes can handle the master role, Elasticsearch elects an acting master. The acting master is called `elected master` When the elected master leaves the cluster, another master node takes over the role of elected master. 75 | 76 | ### Ingest nodes 77 | 78 | An ingest node pre-process's documents before the actual document indexing happens. The ingest node intercepts bulk and index requests, it applies transformations, and it then passes the documents back to the index or bulk APIs. 79 | 80 | All nodes enable ingest by default, so any node can handle ingest tasks. You can also create dedicated ingest nodes. 81 | 82 | ### Data Nodes 83 | 84 | Data nodes store the indexed data. They are responsible for managing stored data, and performing operations on that data when queried. 85 | 86 | ### Tribe Nodes 87 | 88 | Tribe nodes connect to multiple Elasticsearch clusters and performs operations such as search accross every connected clusters. 89 | 90 | ### A Minimal, Fault Tolerant Elasticsearch Cluster 91 | 92 | ![A Minimal Elasticsearch cluster](images/image1.svg) 93 | 94 | A minimal fault tolerant Elasticsearch cluster should be composed of: 95 | 96 | * 3 master nodes 97 | * 2 ingest nodes 98 | * 2 data nodes 99 | 100 | Having 3 master nodes is important to make sure that the cluster won't be in a state of split brain in case of network separation, by making sure that there are at least 2 eligible master nodes present in the cluster. If the number of eligible master nodes falls behind 2, then the cluster will refuse any new indexing until the problem is fixed. 101 | 102 | --- 103 | 104 | ## What's an Elasticsearch index 105 | 106 | An `index` is a group of documents that with similar characteristics. It is identified by a name which is used when performing operations against stored documents or the `index` structure itself. An `index` structure is defined by a `mapping`, a `JSON` file describing both the document characteristics and the `index` options such as the replication factor. In an Elasticsearch cluster, you can define as many `indexes` as you want. 107 | 108 | An Elasticsearch `index` is composed of 1 or multiple `shards`. A `shard` is a Lucene index, and the number of `shards` is defined at the `index` creation time. Elasticsearch allocates an `index` `shards` accross the cluster, either automatically or according to user defined rules. 109 | 110 | Lucene is the name of the search engine that powers Elasticsearch. It is an open source project from the Apache Foundation. You most probably never hear about Lucene when operating an Elasticsearch cluster, but this book covers the basics you need to know. 111 | 112 | A `shard` is made of one or multiple `segments`, which are binary files where Lucene indexes the stored documents. 113 | 114 | ![Inside an Elasticsearch index](images/image2.svg) 115 | 116 | If you're familiar with relational databases such as MySQL, then an `index` is a database, the `mapping` is the database schema, and the shards represent the database data. Due to the distributed nature of Elasticsearch, and the specificities of Lucene, the comparison with a relational database stops here. 
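To make these concepts more tangible, here is a minimal sketch of creating an `index`, indexing a document and searching for it through the REST API. The index name `books`, the type name `book` and the `title` field are invented for the example, and the commands assume a node listening on `localhost:9200` (Elasticsearch 5.x / 6.x syntax):

```bash
# Create a hypothetical "books" index with 1 shard, 1 replica and a tiny mapping
curl -XPUT "localhost:9200/books" -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "book": {
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}
'

# Index a document; refresh=true makes it immediately searchable
# (convenient for a quick test, not for production indexing loops)
curl -XPUT "localhost:9200/books/book/1?refresh=true" -H 'Content-Type: application/json' -d '
{ "title": "Running Elasticsearch for fun and profit" }
'

# Search for it
curl -XGET "localhost:9200/books/_search?q=title:fun"
```

Elasticsearch creates the shards and Lucene segments described above behind the scenes; all you manipulate through the API are indexes and documents.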
117 | 118 | --- 119 | 120 | ## Deploying your first Elasticsearch cluster 121 | 122 | ### Deploying Elasticsearch on Debian 123 | 124 | TODO [issue #9](https://github.com/fdv/running-elasticsearch-fun-profit/issues/9) 125 | 126 | ### Deploying Elasticsearch on RHEL / CentOS 127 | 128 | TODO [issue #9](https://github.com/fdv/running-elasticsearch-fun-profit/issues/9) 129 | 130 | --- 131 | 132 | ## First step using Elasticsearch 133 | 134 | TODO [issue #10](https://github.com/fdv/running-elasticsearch-fun-profit/issues/10) 135 | 136 | --- 137 | 138 | ## Elasticsearch Configuration 139 | 140 | TODO [issue #10](https://github.com/fdv/running-elasticsearch-fun-profit/issues/10) 141 | 142 | ## Elasticsearch Plugins 143 | 144 | TODO [issue #10](https://github.com/fdv/running-elasticsearch-fun-profit/issues/10) 145 | -------------------------------------------------------------------------------- /001-getting-started/images/image1.svg: -------------------------------------------------------------------------------- 1 | 2 |
[diagram text: Master nodes, Elected master, Eligible masters, Ingest nodes, Data nodes]
-------------------------------------------------------------------------------- /001-getting-started/images/image2.svg: -------------------------------------------------------------------------------- 1 | 2 |

[diagram text: Elasticsearch Index → Elasticsearch shards, each shard being a Lucene Index made of Lucene segments]
-------------------------------------------------------------------------------- /002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Elasticsearch and the Java Virtual Machine 6 | 7 | Elasticsearch is a software written in Java. It requires the Java Runtime Environment (JRE) deployed on the same host to run. Currently supported versions of Elasticsearch can run on the following operating systems / distributions and Java. 8 | 9 | ## Supported JVM and operating systems / distributions 10 | 11 | The following matrices present the various operating systems and Java Virtual Machines (JVM) officially supported by Elastic for both 2.4.x and 5.5.x versions. Every operating system or JVM not mentioned here is not supported by Elastic and therefor should not be used. 12 | 13 | ### Operating system matrix 14 | 15 | | | CentOS/RHEL 6.x/7.x | Oracle Enterprise Linux 6/7 with RHEL Kernel only | Ubuntu 14.04 | Ubuntu 16.04 | **Ubuntu 18.04** | SLES 11 SP4\*\*/12 | SLES 12 | openSUSE Leap 42 | 16 | | --- |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| 17 | | **ES 5.0.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 18 | | **ES 5.1.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 19 | | **ES 5.2.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 20 | | **ES 5.3.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 21 | | **ES 5.4.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 22 | | **ES 5.5.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 23 | | **ES 6.0.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 24 | | **ES 6.1.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 25 | | **ES 6.2.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 26 | | **ES 6.3.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 27 | | **ES 6.4.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 28 | | **ES 6.5.x** | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 29 | 30 | 31 | | | Windows Server 2012/R2 | Windows Server 2016 | Debian 7 | Debian 8 | Debian 9 | **Solaris / SmartOS** | Amazon Linux | 32 | | --- |:---:|:---:|:---:|:---:|:---:|:---:|:---:| 33 | | **ES 5.0** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 34 | | **ES 5.1.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 35 | | **ES 5.2.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 36 | | **ES 5.3.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 37 | | **ES 5.4.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 38 | | **ES 5.5.x** | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 39 | | **ES 6.0.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 40 | | **ES 6.1.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 41 | | **ES 6.2.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 42 | | **ES 6.3.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 43 | | **ES 6.4.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 44 | | **ES 6.5.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 45 | 46 | Elasticsearch runs on both OpenSolaris and FreeBSD. FreeBSD 11.1 provides an Elasticsearch 6.4.2 package maintained by [Mark Felder](mailto:feld@freebsd.org), but neither of these operating systems are officially supported by Elastic. 
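Whatever operating system you choose, it is worth checking which JVM Elasticsearch will actually run on before installing or upgrading it, and comparing the result with the matrix below. A quick check, where the `JAVA_HOME` path is only an example for a Debian-style OpenJDK 8 package:

```bash
# Show the JVM that Elasticsearch will pick up from the PATH
java -version 2>&1 | head -n 1

# If several JDKs are installed, point Elasticsearch at the supported one
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
```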
47 | 48 | ### Java Virtual Machine matrix 49 | 50 | | | Oracle/OpenJDK 1.8.0u111+ | Oracle/OpenJDK 9 | OpenJDK 10 | OpenJDK 11 | Azul Zing 16.01.9.0+ | IBM J9 | 51 | | --- |:---:|:---:|:---:|:---:|:---:| --- | 52 | | **ES 5.0.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 53 | | **ES 5.1.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 54 | | **ES 5.2.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 55 | | **ES 5.3.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 56 | | **ES 5.4.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 57 | | **ES 5.5.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 58 | | **ES 5.6**.x | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 59 | | **ES 6.0.x** | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 60 | | **ES 6.1.x** | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 61 | | **ES 6.2.x** | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 62 | | **ES 6.3.x** | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | 63 | | **ES 6.4.x** | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | 64 | | **ES 6.5.x** | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | 65 | 66 | 67 | 68 | ## Memory management 69 | 70 | TODO 71 | 72 | ## Garbage collection 73 | 74 | Java is a garbage collected language. The developer does not have to manage the memory allocation. The Java Virtual Machine periodically runs a specific system thread called GC Thread that takes care of the various garbage collection activities. One of them is reclaiming the memory occupied by objects that are no longer in use by the program. 75 | 76 | Java 1.8 comes with 3 different garbage collector families, which all have their own feature. 77 | 78 | The *Single Collector* uses a single thread to perform the whole garbage collection process. It is efficient on single processor machines, as it suppresses the overhead implied by the communication between threads, but not suitable for most real world use today. It was designed for heaps managing small datasets, of an order of 100MB. 79 | 80 | The *Parallel Collector* runs small garbage collections in parallel. Running parallel collections reduces the garbage collection overhead. It was designed for medium to large datasets running on multi threaded hosts. 81 | 82 | The *Mostly Concurrent Collector* perform most of its work concurrently to keep garbage-collection pauses short. It is designed for large sized datasets, when response time matters, because the technique used to minimise pauses can affect the application performances. Java 1.8 offers two Mostly Concurrent Collectors, the *Concurrent Mark & Sweep Garbage Collector*, and the *Garbage First Garbage Collector*, also known as G1GC. 83 | 84 | ### Concurrent Mark & Sweep Garbage Collector 85 | 86 | TODO 87 | 88 | ### Garbage First Garbage Collector 89 | 90 | TODO 91 | -------------------------------------------------------------------------------- /003-about-lucene/003-about-lucene.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # A few things you need to know about Lucene 6 | 7 | Before you start to think about choosing the right hardware, there are a few things you need to know about [Lucene](http://lucene.apache.org/). 8 | 9 | Lucene is the name of the search engine that powers Elasticsearch. It is an open source project from the Apache Foundation. There's no need to interact with Lucene directly, at least most of the time, when running Elasticsearch. But there's a few important things to know before choosing the cluster storage and file system. 10 | 11 | ## Lucene segments 12 | 13 | Each Elasticsearch index is divided into shards. Shards are both logical and physical divisions of an index. Each Elasticsearch shard is a Lucene index. 
The maximum number of documents you can have in a Lucene index is 2,147,483,519. The Lucene index is divided into smaller files called segments. A segment is a small Lucene index. Lucene searches in all segments sequentially. 14 | 15 | ![Inside an Elasticsearch index](images/image2.svg) 16 | 17 | Lucene creates a segment when a new writer is opened, and when a writer commits or is closed. It means segments are immutable. When you add new documents into your Elasticsearch index, Lucene creates a new segment and writes it. Lucene can also create more segments when the indexing throughput is important. 18 | 19 | From time to time, Lucene merges smaller segments into a larger one. the merge can also be triggered manually from the Elasticsearch API. 20 | 21 | This behavior has a few consequences from an operational point of view. 22 | 23 | The more segments you have, the slower the search. This is because Lucene has to search through all the segments in sequence, not in parallel. Having a little number of segments improves search performances. 24 | 25 | Lucene merges have a cost in terms of CPU and I/Os. It means they might slow your indexing down. When performing a bulk indexing, for example an initial indexing, it is recommended to disable the merges completely. 26 | 27 | If you plan to host lots of shards and segments on the same host, you might choose a filesystem that copes well with lots of small files and does not have an important inode limitation. This is something we'll deal in details in the part about choosing the right file system. 28 | 29 | ## Lucene deletes and updates 30 | 31 | Lucene performs copy on write when updating and deleting a document. It means the document is never deleted from the index. Instead, Lucene marks the document as deleted and creates another one when an update is triggered. 32 | 33 | This copy on write has an operational consequence. As you'll update or delete documents, your indices will grow on the disk unless you delete them completely. One solution to actually remove the marked documents is to force Lucene segments merges. 34 | 35 | During a merge, Lucene takes 2 segments, and moves the content into a third, new one. Then the old segments are deleted from the disk. It means Lucene needs enough free space on the disk to create a segment the size of both segments it needs to merge. 36 | 37 | A problem can arise when force merging a huge shard. If the shard size is \> half of the disk size, you provably won't be able to fully merge it, unless most of the data is made of deleted documents. 38 | -------------------------------------------------------------------------------- /003-about-lucene/images/image2.svg: -------------------------------------------------------------------------------- 1 | 2 |

[diagram text: Elasticsearch Index → Elasticsearch shards, each shard being a Lucene Index made of Lucene segments]
-------------------------------------------------------------------------------- /004-cluster-design/images/image4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/004-cluster-design/images/image4.png -------------------------------------------------------------------------------- /004-cluster-design/images/image5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/004-cluster-design/images/image5.png -------------------------------------------------------------------------------- /005-design-event-logging/005-design-event-logging.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Design for Event Logging 6 | 7 | Elasticsearch has made a blast in the event analysis world thanks --- or because of --- the famous Elasticsearch / Logstash / Kibana (ELK) trinity. In this specific use case, Elasticsearch acts as a hot storage that makes normalized events searchable. 8 | 9 | The usual topology of an event analysis infrastructure is more or less the same whatever the technical stack. 10 | 11 | Heterogeneous events are pushed from various locations into a queue. Queuing has 2 purposes: make sure the data processing won't act as a bottleneck in case of an unexpected spike, and make sure no event is lost if the data processing stack crashes. 12 | 13 | A data processing tool normalizes the events. You have 0 chance to have homogeneous events in an event analysis infrastructure. Events can be logs, metrics, or whatever you can think about, and they need to be normalized to be searchable. 14 | 15 | The data processing tool forwards the events to a hot storage where they can be searched. Here, the hot storage is, indeed, Elasticsearch. 16 | 17 | ## Design of an event logging infrastructure cluster 18 | 19 | Event analysis is the typical use case where you can start small, with a single node cluster, and scale when needed. Most of the time, you won't collect all the events you want to analyse from day 1, so it's OK not to over engineer things. 20 | 21 | The event logging infrastructure is the typical tricky use case that might have you pulling your hair for some time saying Elasticsearch is the worst software ever. It's both extremely heavy on writes, with only a few search query. 22 | 23 | Writes can easily become the bottleneck of the infrastructure, either from a CPU or storage point of view, one more reason to choose the software prior to Elasticsearch wisely to avoid losing events. 24 | 25 | Searches are performed on such an amount of data that one of them might trigger an out of memory error on the Java heap space, or an infinite garbage collection. 26 | 27 | Before you start, there's a few things you need to think about. Since we are focusing on designing an Elasticsearch cluster, we'll start from the moment events are normalized and pushed into Elasticsearch. 28 | 29 | ### Throughput: how many events per second (eps) are you going to collect? 30 | 31 | This is not a question you can answer out of the box unless you already have a central events collection platform. It's an important one though, as it will define most of your hardware requirements. 
Event logging varies a lot according to your platform activity, so I'd recommend tracking them for a week or more before you start building your Elasticsearch cluster. 32 | 33 | One important things to know is: do you need realtime indexing, or can you accept some lag. If the latter is an option, then you can let the lag being indexed after a spike of events happen, so you don't need to build for the maximum amount of events you can get. 34 | 35 | ### Retention: how long do you want to keep your data, hot and cold? 36 | 37 | Hot data is data you can access immediately, while cold data is what can be accessed within a reasonable amount of time. Retention depends both on your needs and national regulation. For example, in France, we're supposed to keep our access logs during a full year, financial transactions need to be kept for 3 to 5 years, etc. 38 | 39 | On Elasticsearch, hot data means opened, accessible indexes. Cold data means closed indexes, or backups of an index snapshot you can easily and quickly transfer and reopen. 40 | 41 | ### Size: what is the average size of a collected event? 42 | 43 | This metric is important as well. Knowing about throughput * retention period * events size will help you define the amount and type of storage you need, hence the cost of your events logging platform. 44 | 45 | Storage = throughput * events size * retention period. 46 | 47 | Hot data is made of opened, searchable indices. They are the ones you'll search into on a regular basis, for debugging or statistics purpose. 48 | 49 | ### Fault tolerance: can you afford losing your indexed data? 50 | 51 | Most of the time, losing your search backend is an option. Lots of people use the ELK stack to store application logs so they are easier to debug, and they are not a critical part of their infrastructure. Logs are also stored somewhere else, for example on a central syslog server so they are still searchable using some shell skills. 52 | 53 | When you can lose your search backend for a few hours, or don't want to invest in a full cluster, then a single Elasticsearch server is enough, provided your throughput allows it. 54 | 55 | The host minimal configuration is then: 56 | 57 | ```yaml 58 | master: true 59 | data: true 60 | index: 61 | number_of_replicas: 0 62 | ``` 63 | 64 | If you start combining events analysis with alerting, or if you need your events to be searchable in realtime without downtime, then things get a bit more expensive. For example, you might want to correlate your whole platform auth.log to look for intrusion attempts or port scanning, so you can deploy new firewall rules accordingly. Then you'll have to start with a 3 nodes cluster. 3 nodes is a minimum since you need 2 active master nodes to avoid a split brain. 65 | 66 | ![](images/image6.svg) 67 | 68 | Here, the minimal hosts configuration for the master / ingest node is: 69 | 70 | ```yaml 71 | master: true 72 | data: false 73 | index: 74 | number_of_replicas: 1 75 | ``` 76 | 77 | And for the data nodes: 78 | 79 | ```yaml 80 | master: true 81 | data: true 82 | index: 83 | number_of_replicas: 1 84 | ``` 85 | 86 | If you decide to go cheap and combine the master and data nodes in a 3 hosts cluster, never use bulk indexing. 87 | 88 | Bulk indexing can put lots of pressure on the server memory, leading the master to exit the cluster. If you plan to run bulk indexing, then add one or 2 dedicated ingest node. 89 | 90 | The same applies to high memory consuming queries. 
If you plan to run such queries, then move your master nodes out of the data nodes. 91 | 92 | ### Queries 93 | 94 | The last thing you need to know about is the type of queries that are going to be ran against your Elasticsearch cluster. If you need to run simple queries, like looking for an error message, then memory pressure won't be a real problem, even against large data. Things get more interested when you need to perform complex filtered queries, or aggregations against a large set of data. Then you'll put lots of pressure on the cluster memory. 95 | 96 | ## Which hardware do I need? 97 | 98 | Once you've gathered all your prerequisites, it's time for hardware selection. 99 | 100 | Unless you're using ZFS as a filesystem to profit from compression and snapshots, you should not need more than 64GB RAM. ZFS is popular both to manage extremely large file systems and for its features, but is greedy on memory. 101 | 102 | Choose the CPU depending on both your throughput and your filesystem. ZFS is more greedy than ext4, for example. Elasticsearch index thread pool is equal to the number of available processors + 1, with a default queue of 200. So if you have a 24 core host, Elasticsearch will be able to manage 25 indexing at once, with a queue of 200. Everything else will be rejected. 103 | 104 | You can choose to use bulk indexing, which will allow you to index more events at the same time. The default thread pool and queue size are the same as the index thread pool. 105 | 106 | The storage part will usually be your bottleneck. 107 | 108 | Indeed, local storage and SSD are preferred, but lots of people will choose spinning disks or an external storage with fiberchannel to have more space. 109 | 110 | Whatever you choose, the more disks, the better. More disks provide you more axis, hence a faster indexing. If you go with some RAID10, then choose smaller disks, as very large disks such as 4TB+ spinning disks will take ages to rebuild. 111 | 112 | On a single node infrastructure, my favorite setup for a huge host is a RAID10 with as many 3.8TB SSD disks possible. Some servers can host up to 36 of them, which makes 18 available axes for more or less 55TB of usable space. 113 | 114 | On a multiple node infrastructure, I prefer to multiply the smaller hosts with a RAID0 and 8TB to 10TB space. This works great with 8 data nodes and more since rebuilding takes lots of time. 115 | 116 | ## How to design my indices? 117 | 118 | As usual, it depends on your needs, but this is the time to play with aliases and timestamped indexes. For example, if you're storing the output of your infrastructure auth.log, your indices can be: 119 | 120 | ```bash 121 | auth-\$(date +%Y-%m-%d) 122 | ``` 123 | 124 | You'll probably want to have 1 index for each type of event you want to index, so you can build various, more adapted mappings. Event collection for syslog does not require the same index topology as an application event tracing, or even some temperature metrics you might want to put in a TSDB. 125 | 126 | While doing it, remember that too many indexes and too many shards might put lots of pressure on a single host. Constant writing creates lots of Lucene segments, so make sure Elasticsearch won't have \"too many open files\" issues. 127 | 128 | ## What about some tuning? 129 | 130 | Here starts the fun part. 131 | 132 | Depending on your throughput, you might need a large [indexing buffer](https://www.elastic.co/guide/en/elasticsearch/reference/current/indexing-buffer.html). 
The indexing buffer is a bunch of memory that stores the data to index. It differs from the index and bulk thread pools which manage the operations. 133 | 134 | Elasticsearch default index buffer is 10% of the memory allocated to the heap. But for heavy indexing operations, you might want to raise it to 30%, if not 40%. 135 | 136 | ```yaml 137 | indices: 138 | memory: 139 | index_buffer_size: "40%" 140 | ``` 141 | 142 | Elasticsearch provides a per node [query cache](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-cache.html). Let's put it this way: you don't need caching on an event logging infrastructure. There's a way to disable it completely and that's what we want. 143 | 144 | ```yaml 145 | indices: 146 | query: 147 | cache.enabled: false 148 | ``` 149 | 150 | You will also have a look at the indexing thread pool. I don't recommend changing the thread pool size, but depending on your throughput, changing the queue size might be a good idea in case of indexing spikes. 151 | 152 | ```yaml 153 | thread_pool: 154 | bulk: 155 | queue_size: 3000 156 | index: 157 | queue_size: 3000 158 | ``` 159 | 160 | Finally, you will want to disable the store throttle if you're running on enough fast disks. 161 | 162 | ```yaml 163 | store: 164 | throttle.type: 'none' 165 | ``` 166 | 167 | One more thing: when you don't need data in realtime, but can afford waiting a bit, you can cut your cluster a little slack by raising the indices refresh interval. 168 | 169 | ```yaml 170 | index: 171 | refresh_interval: "1m" 172 | ``` 173 | -------------------------------------------------------------------------------- /005-design-event-logging/images/image6.svg: -------------------------------------------------------------------------------- 1 | 2 |
[diagram text: master / ingest node, master / data node, master / data node]
-------------------------------------------------------------------------------- /006-operating-daily/006-operating-daily.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Operating Daily 6 | 7 | ## Elasticsearch most common operations 8 | 9 | ### Mass index deletion with pattern 10 | 11 | I often have to delete hundreds of indexes at once. Their name usually follow some patterns, which makes batch deletion easier. 12 | 13 | ```bash 14 | for index in $(curl -XGET esmaster:9200/_cat/indices | awk '/pattern/ {print $3}'); do 15 | curl -XDELETE "localhost:9200/${index}?master_timeout=120s" 16 | done 17 | ``` 18 | 19 | ### Mass optimize, indexes with the most deleted docs first 20 | 21 | Lucene, which powers Elasticsearch has a specific behavior when it comes to delete or update documents. Instead of actually deleting or overwriting the data, if flags it as deleted and write a new one. The only way to get rid of a deleted document is to run an *optimize* on your indexes. 22 | 23 | This snippet sorts your existing indexes by the number of deleted documents before it runs the optimize. 24 | 25 | ```bash 26 | for indice in $(CURL -XGET esmaster:9200/_cat/indices | sort -rk 7 | awk '{print $3}'); do 27 | curl -XPOST "localhost:9200/${indice}/_optimize?max_num_segments=1" 28 | done 29 | ``` 30 | 31 | ### Restart a cluster using rack awareness 32 | 33 | Using rack awareness allows to split your replicated data evenly between hosts or data center. It's convenient to restart half of your cluster at once instead of host by host. 34 | 35 | ```bash 36 | curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d ' 37 | { 38 | "transient" : { 39 | "cluster.routing.allocation.enable": "none" 40 | } 41 | } 42 | ' 43 | 44 | for host in $(curl -XGET esmaster:9200/_cat/nodeattrs?attr | awk '/rack_id/ {print $2}'); do 45 | ssh ${host} service elasticsearch restart 46 | done 47 | 48 | sleep 60 49 | 50 | curl -XPUT -H 'Content-Type: application/json' "localhost:9200/_cluster/settings" -d ' 51 | { 52 | "transient" : { 53 | "cluster.routing.allocation.enable": "all 54 | } 55 | } 56 | ' 57 | ``` 58 | 59 | ### Optimize your cluster restart 60 | 61 | There's a simple way to accelerate your cluster restart. Once you've brought your masters back, run this snippet. 
Most of the options are self explanatory: 62 | 63 | ```bash 64 | curl -XPUT 'localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 65 | { 66 | "transient" : { 67 | "cluster.routing.allocation.cluster_concurrent_rebalance": 20, 68 | "indices.recovery.concurrent_streams": 20, 69 | "cluster.routing.allocation.node_initial_primaries_recoveries": 20, 70 | "cluster.routing.allocation.node_concurrent_recoveries": 20, 71 | "indices.recovery.max_bytes_per_sec": "2048mb", 72 | "cluster.routing.allocation.disk.threshold_enabled" : true, 73 | "cluster.routing.allocation.disk.watermark.low" : "90%", 74 | "cluster.routing.allocation.disk.watermark.high" : "98%", 75 | "cluster.routing.allocation.enable": "primary" 76 | } 77 | } 78 | ' 79 | ``` 80 | 81 | Then, once your cluster is back to yellow, run that one: 82 | 83 | ```bash 84 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 85 | { 86 | "transient" : { 87 | "cluster.routing.allocation.enable": "all" 88 | } 89 | } 90 | ' 91 | ``` 92 | 93 | ### Remove data nodes from a cluster the safe way 94 | 95 | ```bash 96 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 97 | { 98 | "transient" : { 99 | "cluster.routing.allocation.exclude._ip" : ",," 100 | } 101 | } 102 | ' 103 | ``` 104 | 105 | ## Get useful information about your cluster 106 | 107 | ### Nodes information 108 | 109 | This snippet gets the most useful information from your Elasticsearch nodes: 110 | 111 | * hostname 112 | * role (master, data, nothing) 113 | * free disk space 114 | * heap used 115 | * ram used 116 | * file descriptors used 117 | * load 118 | 119 | ```bash 120 | curl -XGET "localhost:9200/_cat/nodes?v&h=host,r,d,hc,rc,fdc,l" 121 | ``` 122 | 123 | Output: 124 | ``` 125 | host r d hc rc fdc l 126 | 127 | 192.168.1.139 d 1tb 9.4gb 58.2gb 20752 0.20 128 | 192.168.1.203 d 988.4gb 16.2gb 59.3gb 21004 0.12 129 | 192.168.1.146 d 1tb 14.1gb 59.2gb 20952 0.18 130 | 192.168.1.169 d 1tb 14.3gb 58.8gb 20796 0.10 131 | 192.168.1.180 d 1tb 16.1gb 60.5gb 21140 0.17 132 | 192.168.1.188 d 1tb 9.5gb 59.4gb 20928 0.19 133 | ``` 134 | 135 | Then, it's easy to sort the output to get interesting information. 136 | 137 | Sort by free disk space 138 | 139 | ```bash 140 | curl -XGET "localhost:9200/_cat/nodes?h=host,r,d,hc,rc,fdc,l" | sort -hrk 3 141 | ``` 142 | 143 | Sort by heap occupancy: 144 | 145 | ```bash 146 | curl -XGET "localhost:9200/_cat/nodes?h=host,r,d,hc,rc,fdc,l" | sort -hrk 4 147 | ``` 148 | 149 | And so on. 150 | 151 | ### Monitor your search queues 152 | 153 | It's sometimes useful to know what happens on your data nodes search queues. Beyond the search thread pool(default thread pool being ((CPU * 3) / 2) + 1 on each data node, queries get stacked into the search queue, a 1000 buffer. 154 | 155 | ```bash 156 | while true; do 157 | curl -XGET "localhost:9200/_cat/thread_pool?v&h=host,search.queue,search.active,search.rejected,search.completed" | sort -unk 2,3 158 | sleep 5 159 | done 160 | ``` 161 | 162 | That code snippet only displays the data node running active search queries so it's easier to read on large cluster. 163 | 164 | ### Indices information 165 | 166 | This snippet gets most information you need about your indices. You can then grep on what you need to know: open, closed, green / yellow / red... 
167 | 168 | ```bash 169 | curl -XGET "localhost:9200/_cat/indices?v" 170 | ``` 171 | 172 | ### Shard allocation information 173 | 174 | Shards movement have lots of impact on your cluster performances. These snippets allows you to get the most critical information about your shards. 175 | 176 | ```bash 177 | curl -XGET "localhost:9200/_cat/shards?v" 178 | ``` 179 | 180 | Output: 181 | 182 | ``` 183 | 17_20140829 4 r STARTED 2894319 4.3gb 192.168.1.208 esdata89 184 | 17_20140829 10 p STARTED 2894440 4.3gb 192.168.1.206 esdata87 185 | 17_20140829 10 r STARTED 2894440 4.3gb 192.168.1.199 esdata44 186 | 17_20140829 3 p STARTED 2784067 4.1gb 192.168.1.203 esdata48 187 | ``` 188 | 189 | ### Recovery information 190 | 191 | Recovery information comes under the form of a JSON output but it's still easy to read to understand what happens on your cluster. 192 | 193 | ```bash 194 | curl -XGET "localhost:9200/_recovery?pretty&active_only" 195 | ``` 196 | 197 | ### Segments information (can be extremely verbose) 198 | 199 | ```bash 200 | curl -XGET "localhost:9200/_cat/nodes?h=host,r,d,hc,rc,fdc,l" | sort -hrk 3 201 | ``` 202 | 203 | ### Cluster stats 204 | 205 | ```bash 206 | curl -XGET "localhost:9200/_cluster/stats?pretty" 207 | ``` 208 | 209 | ### Nodes stats 210 | 211 | ```bash 212 | curl -XGET "localhost:9200/_nodes/stats?pretty" 213 | ``` 214 | 215 | ### Indice stats 216 | 217 | ```bash 218 | curl -XGET "localhost:9200/someindice/_stats?pretty" 219 | ``` 220 | 221 | ### Indice mapping 222 | 223 | ```bash 224 | curl -XGET "localhost:9200/someindice/_mapping" 225 | ``` 226 | 227 | ### Indice settings 228 | 229 | ```bash 230 | curl -XGET "localhost:9200/someindice/_mapping/settings" 231 | ``` 232 | 233 | ### Cluster dynamic settings 234 | 235 | ```bash 236 | curl -XGET "localhost:9200/_cluster/settings" 237 | ``` 238 | 239 | ### All the cluster settings (can be extremely verbose) 240 | 241 | ```bash 242 | curl -XGET "localhost:9200/_settings" 243 | ``` 244 | -------------------------------------------------------------------------------- /007-monitoring-es/007-monitoring-es.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Monitoring Elasticsearch 6 | 7 | Is your cluster healthy for real? 8 | 9 | Monitoring Elasticsearch is the most important and most difficult part of deploying a cluster. The elements to monitor are countless, and not all of them are worth raising an alert. There are some common points though, but the fine monitoring really depends on the workload and use you need. 10 | 11 | This chapter is divided into 3 different parts, covering the 3 most important environments to monitor: 12 | 13 | * monitoring at the cluster level, 14 | * monitoring at the host level, 15 | * monitoring at the index level. 16 | 17 | Each parts extensively covers the critical things to have a look at, and gives you an overview to the little thing that might be worse checking when troubleshooting. 18 | 19 | ## Tools 20 | 21 | Elastic provides an extensive monitoring system through the X-Pack plugin. X-Pack has a free license with some functional limitations. The free license only lets you manage a single cluster, a limited amount of nodes, and has a limited data retention. 
X-Pack documentation is available at [https://www.elastic.co/guide/en/x-pack/index.html](https://www.elastic.co/guide/en/x-pack/index.html) 22 | 23 | ![](images/image7.png) 24 | 25 | I have released 3 Grafana dashboards to monitor Elasticsearch Clusters using the data pushed by the X-Pack monitoring plugin. They provide much more information then the X-Pack monitoring interface, and are meant to be used when you need to gather data from various sources. They are not meant to replace X-Pack since they don't provide security, alerting or machine learning feature. 26 | 27 | Monitoring at the cluster level: [https://grafana.com/dashboards/3592](https://grafana.com/dashboards/3592) 28 | 29 | ![](images/image8.png) 30 | 31 | Monitoring at the node level: [https://grafana.com/dashboards/3595](https://grafana.com/dashboards/3595) 32 | 33 | ![](images/image9.png) 34 | 35 | Monitoring at the index level: [https://grafana.com/dashboards/3598](https://grafana.com/dashboards/3598) 36 | 37 | ![](images/image10.png) 38 | 39 | These dashboards are meant to provide a look at everything Elasticsearch sends to the monitoring node. It doesn't mean you'll actually need this data. 40 | 41 | ## Monitoring at the host level 42 | 43 | TODO 44 | 45 | ## Monitoring at the node level 46 | 47 | TODO 48 | 49 | ## Monitoring at the cluster level 50 | 51 | TODO 52 | 53 | ## Monitoring at the index level 54 | 55 | TODO 56 | -------------------------------------------------------------------------------- /007-monitoring-es/images/image10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/007-monitoring-es/images/image10.png -------------------------------------------------------------------------------- /007-monitoring-es/images/image7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/007-monitoring-es/images/image7.png -------------------------------------------------------------------------------- /007-monitoring-es/images/image8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/007-monitoring-es/images/image8.png -------------------------------------------------------------------------------- /007-monitoring-es/images/image9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/007-monitoring-es/images/image9.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image10.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image11.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image11.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image12.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image13.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image14.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image15.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image3.png -------------------------------------------------------------------------------- /100-use-cases-reindexing-36-billion-docs/images/image9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/100-use-cases-reindexing-36-billion-docs/images/image9.png -------------------------------------------------------------------------------- /101-use-case-migrating-cluster-over-ocean/101-use-case-migrating-cluster-over-ocean.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Use Case: Migrating a Cluster Across the Ocean Without Downtime 6 | 7 | I had to migrate a whole cluster from Canada to France without downtime. 8 | 9 | With only 1.8TB of data, the cluster was quite small. However, crossing the ocean on an unreliable network made the process long and hazardous. 10 | 11 | My main concern was about downtime: it was not an option. Otherwise I would have shutdown the whole cluster, rsync the data and restarted the Elasticsearch processes. 12 | 13 | 14 | To avoid downtime, I decided to connect both clusters and rely on Elasticsearch elasticity. It was made possible because this (rather small) cluster relied on unicast for discovery. 
With unicast discovery, you add a list of node in your Elasticsearch configuration, and you let it discover his pairs. This is something I did once, but not cross continent! 15 | 16 | First step was to connect both clusters using unicast. To do this, I've added the IP address of the Canadian master nodes to one of the French cluster nodes configuration. I updated both machines firewall rules to they were able to communicate on port 9300, then restarted the Elasticsearch process. 17 | 18 | At first, I only launched one French node, the one I planned to communicate with the Canadian one as a gateway. After a few hours of shard relocation, everything was green again, and I just I was able to shutdown the first Canadian data node. 19 | 20 | That's when I launched the 2 other French nodes. They only knew about each other and the *gateway* node. They did not know anything about the Canadian ones, but it worked like a charm. 21 | 22 | If for some reason you can't expose your new Elasticsearch cluster, what you can do is add a http only node, you will use as a bridge. Just ensure it can communicate with both clusters by adding 1 IP of each of their nodes, it works quite well, even with 1 public and 1 private subnet. This gateway provides another advantage: you don't need to update your clusters configuration to make them discover each other. 23 | 24 | Once again, it took a few hours to relocate the shards within the cluster, but it was still working like a charm, getting his load of reads and writes from the application. 25 | 26 | Once the cluster was all green, I could shutdown the second Canadian node, then the third after some relocation madness. 27 | 28 | You may have noticed that at that time, routing nodes were still in Canada, and data in France. 29 | 30 | ![Cluster Topology](images/image2.svg) 31 | 32 | That's right. The latest part of it was playing with DNS. 33 | 34 | ![Changing the DNS on Amazon](images/image3.png) 35 | 36 | The main ES hostname the application accesses is managed using Amazon Route53. Route53 provides some nice round robin thing so the same A record can point to many IPs or CNAME with a weight system. It's pretty cool even though it does not provide failover. If one of your nodes crash, it needs to unregister itself from route53. 37 | 38 | As soon as the data transfer was OK, I was able to update route53, adding 3 new records to route53. Then, I deleted the old records and removed the routing nodes from the cluster. Mission successful. 
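For reference, the unicast discovery part of this migration boils down to a few lines of `elasticsearch.yml` on the bridging nodes. This is only a sketch using the Elasticsearch 5.x `discovery.zen` settings, with invented hostnames; adjust the host list and the quorum to your own master-eligible nodes, and remember that port 9300 must be reachable in both directions:

```yaml
# Master-eligible nodes of both clusters, so a node can discover its peers across the ocean
discovery.zen.ping.unicast.hosts:
  - "ca-master01.example.com"
  - "ca-master02.example.com"
  - "fr-master01.example.com"

# Quorum of master-eligible nodes, to avoid a split brain during the migration
discovery.zen.minimum_master_nodes: 2
```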
39 | -------------------------------------------------------------------------------- /101-use-case-migrating-cluster-over-ocean/images/image2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/101-use-case-migrating-cluster-over-ocean/images/image2.png -------------------------------------------------------------------------------- /101-use-case-migrating-cluster-over-ocean/images/image3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/101-use-case-migrating-cluster-over-ocean/images/image3.png -------------------------------------------------------------------------------- /102-use-case-advanced-architecture-high-volume-reindexing/102-use-case-advanced-architecture-high-volume-reindexing.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Use Case: An Advanced Elasticsearch Architecture for High-volume Reindexing 6 | 7 | I've found a new and funny way to play with [Elasticsearch](http://elastic.co/) to reindex a production cluster without disturbing our clients. If you haven't already, you might enjoy what we did last summer [reindexing 36 billion documents in 5 days within the same cluster](https://thoughts.t37.net/how-we-reindexed-36-billions-documents-in-5-days-within-the-same-elasticsearch-cluster-cd9c054d1db8#.5lw3khgtb). 8 | 9 | Reindexing that cluster was easy because it was not on production yet. Reindexing a whole cluster where regular clients expect to get their data in real time offers new challenges and more problems to solve. 10 | 11 | As you can see on the screenshot below, our main bottleneck the first time we reindexed Blackhole, the well named, was the CPU. Having the whole cluster at 100% and a load of 20 is not an option, so we need to find a workaround. 12 | 13 | ![Cluster load on Marvel](images/image1.png) 14 | 15 | This time, we won't reindex Blackhole but Blink. Blink stores the data we display in our clients dashboards. We need to reindex them every time we change the mapping to enrich that data and add new feature our clients and colleagues love. 16 | 17 | --- 18 | 19 | ## A glimpse at our infrastructure 20 | 21 | Blink is a group of 3 clusters built around 27 physical hosts each, having 64GB RAM and 4 core / 8 thread Xeon D-1520's. They are small, affordable and disposable hosts. The topology is the same for each cluster: 22 | 23 | * 3 master nodes (2 in our main data center and 1 in our backup data center plus a virtual machine ready to launch in case of major outage) 24 | * 4 http query nodes (2 in each data center) 25 | * 20 data nodes (10 in each data center) 26 | 27 | The data nodes have 4*800GB SSD drives in RAID0, about 58TB per cluster. The data and nodes are configured with Elasticsearch zones awareness. With 1 replica for each index, that makes sure we have 100% of the data in each data center so we're crash proof. 28 | 29 | ![Blink Architecture](images/image2.svg) 30 | 31 | We didn't allocate the http query nodes to a specific zone for a reason: we want to use the whole cluster when possible, at the cost of 1.2ms of network latency. 
From [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html): 32 | 33 | When executing search or GET requests, with shard awareness enabled, Elasticsearch will prefer using local shards – shards in the same awareness group – to execute the request. This is usually faster than crossing racks or awareness zones. 34 | 35 | In front of the clusters, we have a layer 7 load balancer made of 2 servers each running Haproxy and holding various virtual IP addresses (VIP). A keepalived ensures the active load balancer hold's the VIP. Each load balancer runs in a different data center for fault tolerance. Haproxy uses the allbackups configuration directive so we access the query nodes in the second data center only when the two first ones are down. 36 | 37 | ```haproxy 38 | frontend blink_01 39 | bind 10.10.10.1:9200 40 | default_backend be_blink01 41 | 42 | backend be_blink01 43 | balance leastconn 44 | option allbackups 45 | option httpchk GET /_cluster/health 46 | server esnode01 10.10.10.2:9200 check port 9200 inter 3s fall 3 47 | server esnode02 10.10.10.3:9200 check port 9200 inter 3s fall 3 48 | server esnode03 10.10.10.4:9200 check port 9200 inter 3s fall 3 backup 49 | server esnode04 10.10.10.5:9200 check port 9200 inter 3s fall 3 backup 50 | ``` 51 | 52 | So our infrastructure diagram becomes: 53 | ![Blink infrastructure with datacenter awareness](images/image3.svg) 54 | 55 | In front of the Haproxy, we have an applicative layer called Baldur. Baldur was developed by my colleague [Nicolas Bazire](https://github.com/nicbaz) to handle multiple versions of a same Elasticsearch index and route queries amongst multiple clusters. 56 | 57 | There's a reason why we had to split the infrastructure in multiple clusters even though they all run the same version of Elasticsearch, the same plugins, and they do exactly the same things. Each cluster supports about 10,000 indices, and 30,000 shards. That's a lot, and Elasticsearch master nodes have a hard time dealing with so many indexes and shards. 58 | 59 | Baldur is both an API and an applicative load balancer built on Nginx with the LUA plugin. It connects to a MySQL database and has a local memcache based cache. Baldur was built for 2 reasons: 60 | 61 | to tell our API the active index for a dashboard 62 | 63 | to tell our indexers which indexes they should write in, since we manage multiple versions of the same index. 64 | 65 | In elasticsearch, each index has a defined naming: `_` 66 | 67 | In baldur, we use have 2 tables: 68 | 69 | The first one is the indexes table with the triplet 70 | 71 | ``` 72 | id / cluster id / mapping id 73 | ``` 74 | 75 | That's how we manage to index into multiple versions of the same index with the ongoing data during the migration process from one mapping to another. 76 | 77 | The second table is the reports table with the triplet 78 | 79 | ``` 80 | client id / report id / active index id 81 | ``` 82 | 83 | So the API knows which index it should use as active. 84 | 85 | Just like the load balancers, Baldur holds a VIP managed by another Keepalived, for fail over. 86 | ![Cluster architecture with Baldur](images/image4.svg) 87 | 88 | --- 89 | 90 | ## Using Elasticsearch for fun and profit 91 | 92 | Since you know everything you need about our infrastructure, let's talk about playing with our Elasticsearch cluster the smart way for fun and, indeed, profit. 
93 | 94 | Elasticsearch and our index naming scheme allow us to be lazy, so we can watch more cute kitten videos on YouTube. To create an index with the right mapping and settings, we use Elasticsearch templates and auto-create index patterns. 95 | 96 | Every node in the cluster has the following configuration: 97 | 98 | ```yaml 99 | action: 100 | auto_create_index: "+_*,+_*,-*" 101 | ``` 102 | 103 | And we create a template in Elasticsearch for every mapping we need. 104 | 105 | ```bash 106 | curl -XPUT "localhost:9200/_template/template_" -H 'Content-Type: application/json' -d ' 107 | { 108 | "template": "_*", 109 | "settings": { 110 | "number_of_shards": 1 111 | }, 112 | "mappings": { 113 | "add some json": "here" 114 | } 115 | } 116 | ' 117 | ``` 118 | 119 | 120 | 121 | 122 | Every time the indexer tries to write into a not yet existing index, Elasticsearch creates it with the right mapping. That's the magic. 123 | 124 | Except this time, we don't want to create empty indexes with a single shard as we're going to copy existing data. 125 | 126 | After playing with Elasticsearch for years, we've noticed that the best size per shard was about 10GB. This allows faster reallocation and recovery, at the cost of more Lucene segments during heavy writing and more frequent optimization. 127 | 128 | On Blink, 1,000,000 documents weigh about 2GB, so we create indexes with 1 shard for every 5 million documents, + 1 when the dashboard already has more than 5 million documents. 129 | 130 | Before reindexing a client, we run a small script to create the new indexes with the right amount of shards. Here's a simplified version without error management, for your eyes only. 131 | 132 | ```bash 133 | curl -XPUT "localhost:9200/_" -H 'Content-Type: application/json' -d ' 134 | { "settings" : { "index.number_of_shards" : '$(( $(curl -XGET "localhost:9200/_/_count" | cut -f 2 -d : | cut -f 1 -d ",") / 5000000 + 1 ))' } 135 | } 136 | ' 137 | ``` 138 | 139 | Now we're able to reindex, except we didn't solve the CPU issue. That's where the fun starts. 140 | 141 | What we're going to do is leverage Elasticsearch zone awareness to dedicate a few data nodes to the writing process. You can also add some new nodes if you can't afford to remove a few from your existing cluster, it works exactly the same way. 142 | 143 | First, let's kick out all the indexes from those nodes. 144 | 145 | ```bash 146 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 147 | { 148 | "transient" : { 149 | "cluster.routing.allocation.exclude._ip" : ",," 150 | } 151 | } 152 | ' 153 | ``` 154 | 155 | Elasticsearch then moves all the data from these nodes to the remaining ones. You can also shut down those nodes and wait for the indexes to recover, but you might lose data. 156 | 157 | Then, for each node, we edit the Elasticsearch configuration to assign these nodes to a new zone called *envrack* (fucked up in French). We put all these machines in the secondary data center to use the spare http query nodes for the indexing process. 158 | 159 | ```yaml 160 | node: 161 | zone: 'envrack' 162 | ``` 163 | 164 | Then restart Elasticsearch so it runs with the new configuration. 165 | 166 | We don't want Elasticsearch to allocate the existing indexes to the new zone when we bring back these nodes online, so we update the index settings accordingly.
167 | 168 | ```bash 169 | curl -XPUT "localhost:9200/_*/_settings" -H 'Content-Type: application/json' -d ' 170 | { 171 | "routing.allocation.exclude.zone" : "envrack" 172 | } 173 | ' 174 | ``` 175 | 176 | In the same way, we don't want the new indexes to be allocated to the production zones, so we update the creation script. 177 | 178 | ```bash 179 | #!/bin/bash 180 | 181 | shards=1 182 | counter=$(curl -XGET "http://esnode01:9200/_/_count" | cut -f 2 -d : | cut -f 1 -d ",") 183 | 184 | if [ $counter -gt 5000000 ]; then 185 | shards=$(( $counter / 5000000 + 1 )) 186 | fi 187 | 188 | curl -XPUT "localhost:9200/_" -H 'Content-Type: application/json' -d ' 189 | { 190 | "settings" : { 191 | "index.number_of_shards" : '$shards', 192 | "index.number_of_replicas" : 0, 193 | "routing.allocation.exclude.zone" : "barack,chirack" 194 | } 195 | } 196 | ' 197 | ``` 198 | 199 | More readable than a one-liner, isn't it? 200 | 201 | We don't add a replica for 2 reasons: 202 | 203 | * The cluster is zone aware and we only have one zone for the reindexing 204 | * Indexing with a replica means indexing twice, so using twice as much CPU. Adding a replica after indexing is just transferring the data from one host to another. 205 | 206 | Indeed, losing a data node means losing data. If you can't afford reindexing an index multiple times in case of a crash, don't do this, and add another zone or allow your new indexes to use the data from the existing zone in the backup data center. 207 | 208 | There's one more thing we want to do before we start indexing. 209 | 210 | Since we've set the new zone in the secondary data center, we update the http query nodes configuration there to make them zone aware, so they read the local shards in priority. We do the same with the active nodes so they read their own zone first. That way, we can read from the passive http query nodes during the reindexing process with little impact on what the clients access. 211 | 212 | In the main data center: 213 | 214 | ```yaml 215 | node: 216 | zone: "barack" 217 | ``` 218 | 219 | And in the secondary: 220 | 221 | ```yaml 222 | node: 223 | zone: "chirack" 224 | ``` 225 | 226 | Here's what our infrastructure looks like now. 227 | 228 | ![The final cluster](images/image5.svg) 229 | 230 | It's now time to reindex. 231 | 232 | We first tried to reindex taking the data from our database clusters, but it brought them to their knees. We have large databases, and our dashboards are made of documents crawled over time, which means large queries on a huge dataset, with random accesses only. In one word: sluggish. 233 | 234 | What we do instead is copy the existing data from the old indexes to the new ones, then add the stuff that makes our data richer. 235 | 236 | To copy the content of an existing index into a new one, [Logstash](https://www.elastic.co/products/logstash) from Elastic is a convenient tool. It takes the data from a source, transforms it if needed, and pushes it into a destination.
237 | 238 | Our Logstash configuration is pretty straightforward: 239 | 240 | ```bash 241 | input { 242 | elasticsearch { 243 | hosts => [ "esnode0{3,4}" ] 244 | index => "_INDEX_ID" 245 | size => 1000 246 | scroll => "5m" 247 | docinfo => true 248 | } 249 | 250 | } 251 | 252 | output { 253 | elasticsearch { 254 | host => "esdataXX" 255 | index => "_INDEX_ID" 256 | protocol => "http" 257 | index_type => "%{[@metadata][_type]}" 258 | document_id => "%{[@metadata][_id]}" 259 | workers => 10 260 | } 261 | 262 | stdout { 263 | codec => dots 264 | } 265 | } 266 | ``` 267 | 268 | We can now run Logstash from a host inside the secondary data center. 269 | 270 | Here, we: 271 | 272 | * read from the passive http query nodes. Since they're zone aware, they query the data in the same zone first 273 | * write to the data nodes inside the indexing zone so we won't load the nodes accessed by our clients 274 | 275 | ![Reindexing with Baldur](images/image6.svg) 276 | 277 | Once we're done reindexing a client, we update Baldur to change the active indexes for that client. Then, we add a replica and move the freshly baked indexes inside the production zones. 278 | 279 | ```bash 280 | curl -XPUT "localhost:9200/_" -H 'Content-Type: application/json' -d ' 281 | { 282 | "settings" : { 283 | "index.number_of_replicas" : 1, 284 | "routing.allocation.exclude.zone" : "envrack", 285 | "routing.allocation.include.zone" : "barack,chirack" 286 | } 287 | } 288 | ' 289 | ``` 290 | 291 | Now, we're ready to delete the old indexes for that client. 292 | 293 | ```bash 294 | curl -XDELETE "localhost:9200/_" 295 | ``` 296 | 297 | --- 298 | 299 | ## Conclusion 300 | 301 | This post doesn't deal with cluster optimization for massive indexing, on purpose. The Web is full of articles on that topic, so I decided it didn't need another one. 302 | 303 | What I wanted to show is how we managed to isolate the data within the same cluster so we didn't disturb our clients. Considering our current infrastructure, building 3 more clusters might have been easier, but it had a double cost we didn't want to pay. 304 | 305 | First, it means doubling the infrastructure, which means buying even more servers you won't use anymore after the reindexing process. And it means buying these servers 1 or 2 months upfront to make sure they're delivered on time. 306 | 307 | I hope you enjoyed reading this post as much as I enjoyed sharing my experience on the topic. If you did, please share it around you, it might be helpful to someone!
308 | -------------------------------------------------------------------------------- /102-use-case-advanced-architecture-high-volume-reindexing/images/image1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/102-use-case-advanced-architecture-high-volume-reindexing/images/image1.png -------------------------------------------------------------------------------- /103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours with 0 Downtime and a Rollback Strategy 6 | 7 | Do you remember [Blackhole, the 36 billion documents Elasticsearch cluster](https://thoughts.t37.net/how-we-reindexed-36-billions-documents-in-5-days-within-the-same-elasticsearch-cluster-cd9c054d1db8) we had to reindex a while ago? Blackhole is now a 130TB grownup with 100 billion documents, and my last task before I left Synthesio was migrating the little boy to Elasticsearch 5.1. This post is a more detailed version of the talk I gave on November the 23rd at the ElasticFR meetup in Paris. 8 | 9 | There were many reasons for upgrading Blackhole: features, performance, better monitoring data being exposed. But for me, the main reason to do it before I left was **for the lulz**. I love running large clusters, whatever the software, I love [performing huge migrations](https://thoughts.t37.net/how-we-upgraded-a-22tb-mysql-cluster-from-5-6-to-5-7-in-9-months-cc41b391895d), and the bigger, the better. 10 | 11 | --- 12 | 13 | ## Elasticsearch @Synthesio, November 2017 14 | 15 | At [Synthesio](https://www.synthesio.com/), we're using Elasticsearch pretty much everywhere as soon as we need hot storage. Cold storage is provided by MySQL and queuing by a bit more than 100TB of Apache Kafka. 16 | 17 | There are 8 clusters running in production, with a bit more than 600 bare metal servers, 1.7PB of storage and 37.5TB of RAM. Clusters are hosted in 3 data centers. One of them is dedicated to running each cluster's third master node to avoid split brains when we lose a whole data center, which happens from time to time. 18 | 19 | The servers are mostly 6 core, 12 thread Xeon E5-1650 v3's with 64GB RAM and 4*800GB SSD or 2*1.2TB NVMe drives, in RAID0. Some clusters have dual 12 core Xeon E5-2687W v4's with 256GB RAM. 20 | 21 | The average cluster stats are 85k writes / second, with 1.5M at peak, and 800 reads / second, some clusters having a continuous 25k searches / second. Doc size varies from 150kB to 200MB. 22 | 23 | --- 24 | 25 | ## The Blackhole Cluster 26 | 27 | Blackhole is a 77 node cluster, with 200TB storage, 4.8TB RAM, 2.4TB being allocated to Java, and 924 CPU cores. It is made of 3 master nodes, 6 ingest nodes, and 68 data nodes. The cluster holds 1137 indices, with 13613 primary shards, and 1 replica, for 201 billion documents. It gets about 7000 new documents / second, with an average of 800 searches / second on the whole dataset. 28 | 29 | Blackhole data nodes are spread between 2 data centers. By using rack awareness, we make sure that each data center holds 100% of the data, for high availability. Ingest nodes are rack aware as well, to leverage Elasticsearch prioritizing nodes within the same rack when running a query.
This allows us to minimize the latency when running a query. A Haproxy handles both the ingest nodes' health checks and the load balancing amongst all of them. 30 | 31 | ![Blackhole rack awareness design](images/image16.svg) 32 | 33 | Blackhole is feeding a small part of a larger processing chain. After multiple enrichments and transformations, the data is pushed into a large Kafka queue. A working unit reads the Kafka queue and pushes the data into Blackhole. 34 | 35 | ![Blackhole processing chain](images/image17.svg) 36 | 37 | This has many pros, the first one being the ability to replay a whole part of the process in case of error. The only con here is having enough disk space for the data retention, but in 2017 disk space is not a problem anymore, even at a scale of tens of TB. 38 | 39 | --- 40 | 41 | ## Migration Strategies: Cluster restart VS Reindex API VS Logstash VS the Fun Way 42 | 43 | There are many ways to migrate an Elasticsearch cluster from one major version to another. 44 | 45 | ### The Cluster Restart Strategy 46 | 47 | Elasticsearch's regular upgrade path from 2.x to 5.x requires closing every index using the [`_close` API endpoint](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-open-close.html), upgrading the software version, starting the nodes, then opening the indexes again using the `_open` API endpoint. 48 | 49 | Relying on the cluster restart strategy means keeping indexes created with Elasticsearch 2. This has no immediate consequence, except being unable to upgrade to Elasticsearch 6 without a full reindex. As this is something we do from time to time anyway, it was not a blocking problem. 50 | 51 | On the cons side, the cluster restart strategy requires shutting down the whole cluster for a while, which was not acceptable. 52 | 53 | Someone once said there's a Chinese proverb for everything, and if it doesn't exist yet, you can make it a Chinese proverb anyway. 54 | 55 | > When migration requires downtime, throwing more hardware solves all your problems. 56 | > --- Traditional Chinese proverb. 57 | 58 | Throwing hardware at our problems meant we could rely on 2 more migration strategies. 59 | 60 | ### The Reindex API Strategy 61 | 62 | The first one is using the [Elasticsearch reindex API](https://www.elastic.co/guide/en/elasticsearch/reference/6.0/docs-reindex.html). We had already used it to migrate some clusters from Elasticsearch 1.7 to 5.1. It has many cons though, so we decided not to use it this time. 63 | 64 | Error handling is suboptimal, and an error on a bulk index means we will lose documents in the process without knowing it. 65 | 66 | It is slow. The Elasticsearch reindex API relies on scrolling, and [sliced scrolls](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html) are not available until version 6.0. 67 | 68 | There's also another problem on live indexes. And a huge one: losing data consistency. 69 | 70 | To ensure data consistency between the source and destination index, either you never update your data and it's OK, or you decide that all your indexes are read only during the whole reindexing, which implies an application downtime. Otherwise, you run the risk of a race condition between your ongoing updates and the reindex process, if a document is updated on the source cluster right after it has been copied to the destination. The risk is small but still exists. 71 | 72 | ### The Logstash Strategy 73 | 74 | We've been using Logstash a lot on Elasticsearch 1.7, as there was no reindex API yet.
Logstash is faster than the reindex API, and you can use it inside a script, which makes failure management easier. 75 | 76 | Logstash has many cons as well, besides the race condition problem. The biggest one is that it is unreliable, and the risk of losing data in the process, without even noticing it, is too high. The Logstash console output makes it difficult to troubleshoot errors as it is either too verbose or not verbose enough. 77 | 78 | ### The Fun Way 79 | 80 | The fun way mixes the Cluster Restart Strategy and throwing hardware at problems, with the added benefit of being able to roll back anytime, even after the migration is over. But I don't want to spoil it yet 😈. 81 | 82 | --- 83 | 84 | ## Migrating Blackhole for Real 85 | 86 | The Blackhole migration took place on a warm, sunny Saturday. The birds were singing, the sun was shining, and the coffee was flowing in my cup. 87 | 88 | ### Migration Prerequisites 89 | 90 | Before starting the migration, we had a few prerequisites to fulfill: 91 | 92 | * Making sure our mapping templates were compatible with Elasticsearch 5. 93 | * Using the [Elasticsearch Migration Helper](https://github.com/elastic/elasticsearch-migration/tree/2.x) plugin on Blackhole, just in case. 94 | * Creating the next 10 daily indexes, just in case we missed something with the mapping template. 95 | * Telling our hosting provider that we would transfer more than 130TB on the network in the coming hours. 96 | 97 | ### Expanding Blackhole 98 | 99 | The first migration step was throwing more hardware at Blackhole. 100 | 101 | We added 90 new servers, split across 2 data centers. Each server has a 6 core Xeon E5-1650 v3 CPU, 64GB RAM, and 2 * 1.2TB NVMe drives, set up as RAID0. These servers were set up to use a dedicated network range, as we planned to use them to replace the old Blackhole cluster and didn't want to mess with the existing IP addresses. 102 | 103 | These servers were deployed with Debian Stretch and Elasticsearch 2.3. We had some issues as the Elasticsearch 2 systemd scripts don't work on Stretch, so we had to run the service manually. We configured Elasticsearch to use 2 new racks, Barack and Chirack. Then, we updated the replication factor to 3. 104 | ![Blackhole, expanded](images/image18.svg) 105 | 106 | ```bash 107 | curl -XPUT "localhost:9200/*/_settings" -H 'Content-Type: application/json' -d '{ 108 | "index" : { 109 | "number_of_replicas" : 3 110 | } 111 | } 112 | ' 113 | ``` 114 | 115 | On the vanity metrics level, Blackhole had: 116 | 117 | * 167 servers, 118 | * 53626 shards, 119 | * 279TB of data for 391TB of storage, 120 | * 10.84TB RAM, 5.42TB being allocated to Java, 121 | * 2004 cores. 122 | 123 | ![Blackhole on steroids](images/image19.png) 124 | 125 | If you're wondering why we didn't save time by only raising the replication factor to 2: try it, lose a data node, enjoy, and read the basics of distributed systems before you run one in production. 126 | 127 | While expanding Blackhole, we had to change a few dynamic settings for allocation and recoveries.
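Before touching anything, it is worth dumping the current dynamic settings somewhere safe so they can be restored once the transfer is over, and keeping an eye on shard movements while the new values kick in. A minimal check along these lines, using nothing but the standard APIs (not something Blackhole specific):

```bash
# Save the current transient / persistent cluster settings before overriding them
curl -XGET "localhost:9200/_cluster/settings?pretty"

# List the shards that are not yet started, to watch relocations and recoveries
curl -XGET "localhost:9200/_cat/shards?v" | grep -v STARTED
```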
128 | 129 | Blackhole initial settings were: 130 | 131 | ```yaml 132 | cluster: 133 | routing: 134 | allocation: 135 | disk: 136 | threshold_enabled: true 137 | watermark: 138 | low: "78%" 139 | high: "79%" 140 | node_initial_primaries_recoveries: 50 141 | node_concurrent_recoveries: 20 142 | allow_rebalance: "always" 143 | cluster_concurrent_rebalance: 50 144 | rebalance.enable: "all" 145 | 146 | indices: 147 | recovery: 148 | max_bytes_per_sec: "2048mb" 149 | concurrent_streams: 30 150 | ``` 151 | 152 | We decided to speed up the cluster recovery a bit, and to disable the rebalancing completely to avoid mixing both of them until the migration was over. To make sure the cluster would use as much disk space as possible without problems, we raised the watermark thresholds to the maximum. 153 | 154 | ```yaml 155 | cluster: 156 | routing: 157 | allocation: 158 | disk: 159 | watermark.low: "98%" 160 | watermark.high: "99%" 161 | rebalance.enable: "none" 162 | 163 | indices: 164 | recovery: 165 | max_bytes_per_sec: "4096mb" 166 | concurrent_streams: 50 167 | ``` 168 | 169 | ### Then Came the Problems 170 | 171 | Transferring 130TB of data at up to 4Gb/s puts lots of pressure on the hardware. 172 | 173 | The load on most machines was up to 40, with 99% of the CPU in use. I/O wait went from 0 to 60% on most of our servers. As a result, the Elasticsearch [bulk thread pool](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html) queue started to fill dangerously despite being configured to 4000, with a risk of rejected data. 174 | 175 | Thankfully, there's a trick for that. 176 | 177 | Elasticsearch provides a concept of zone, which can be combined with rack awareness for a better allocation granularity. For example, you can dedicate lots of hardware to the freshest, most frequently accessed content, less hardware to content accessed less frequently, and even less hardware to content that is never accessed. Zones are configured at the host level. 178 | 179 | ![Zone configuration](images/image20.svg) 180 | 181 | We decided to create a zone that would only hold the data of the day, so the hardware would be less stressed by the migration. 182 | 183 | To do it without shuffling data around, we disabled the shard allocation before we forced the indices allocation. 184 | 185 | ```bash 186 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 187 | { 188 | "transient" : { 189 | "cluster.routing.allocation.enable" : "none" 190 | } 191 | } 192 | ' 193 | 194 | curl -XPUT "localhost:9200/*/_settings" -H 'Content-Type: application/json' -d ' 195 | { 196 | "index.routing.allocation.exclude.zone" : "fresh" 197 | } 198 | ' 199 | 200 | curl -XPUT "localhost:9200/latest/_settings" -H 'Content-Type: application/json' -d ' 201 | { 202 | "index.routing.allocation.exclude.zone" : "", 203 | "index.routing.allocation.include.zone" : "fresh" 204 | } 205 | ' 206 | ``` 207 | 208 | After a few minutes, the cluster was quiet and we were able to resume the migration. 209 | 210 | ```bash 211 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 212 | { 213 | "transient" : { 214 | "cluster.routing.allocation.enable" : "all" 215 | } 216 | }' 217 | ``` 218 | 219 | Another way to do it is by playing with the `_ip` exclusion, but when you have more than 150 data nodes, it becomes a bit complicated. Also, you need to know that include and exclude are mutually exclusive, and can lead to some headaches the first time you use them.
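For the record, the `_ip` based variant of the same move looks like the call below. The addresses here are made up, and the list has to name every node you want to drain, which is exactly why it stops being practical past a handful of hosts:

```bash
# Drain a few data nodes by IP instead of by zone (illustrative addresses)
curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1,10.0.0.2,10.0.0.3"
  }
}
'
```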
220 | 221 | ### Splitting Blackhole in 2 222 | 223 | The next step of the migration was creating a full clone of Blackhole. To clone a cluster, all you need is: 224 | 225 | * love 226 | * a bunch of data nodes with 100% of the data 227 | * a master node from the cluster to clone 228 | 229 | Before doing anything, we disabled the shard allocation globally. 230 | 231 | ```bash 232 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d ' 233 | { 234 | "transient" : { 235 | "cluster.routing.allocation.enable" : "none" 236 | } 237 | } 238 | ' 239 | ``` 240 | 241 | Then, we shut down Elasticsearch on Barack, Chirack and one of the cluster master nodes. 242 | ![Moving from zone to zone](images/image21.svg) 243 | 244 | *Removing nodes to create a new Blackhole* 245 | 246 | Then, we reduced the replica number on Blackhole to 1, and enabled allocation again. 247 | 248 | ```bash 249 | curl -XPUT "localhost:9200/*/_settings" -H 'Content-Type: application/json' -d ' 250 | { 251 | "index" : { 252 | "number_of_replicas" : 1 253 | } 254 | }' 255 | 256 | curl -XPUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d 257 | '{ 258 | "transient" : { 259 | "cluster.routing.allocation.enable" : "all" 260 | } 261 | } 262 | ' 263 | ``` 264 | 265 | **The following steps were performed with Elasticsearch stopped on the removed hosts.** 266 | 267 | We changed the excluded master node IP address to move it to a new Blackhole02 cluster network range, as well as the `discovery.zen.ping.unicast.hosts` setting, so it was unable to talk to the old cluster anymore. We didn't change the `cluster.name` since we wanted to reuse all the existing information. 268 | 269 | We also reconfigured the nodes within the Barack and Chirack racks to talk to that new master, then added 2 other fresh masters to respect the `discovery.zen.minimum_master_nodes: 2` setting. 270 | 271 | Then, we started Elasticsearch first on the master taken from Blackhole, then on the 2 new master nodes. We had a new cluster without data nodes, but with all the index and shard information. This was done on purpose so we could close all the indexes without losing time with the data nodes being there, trying to reallocate or whatever. 272 | 273 | We then closed all the existing indexes: 274 | 275 | ```bash 276 | curl -XPOST "localhost:9200/*/_close" 277 | ``` 278 | 279 | It was time to upgrade Elasticsearch on that new cluster. This was done in a few minutes running our [Ansible](https://ansible.org/) playbook. 280 | 281 | We launched Elasticsearch on the master nodes first, to upgrade the cluster from 2 to 5. It took less than 20 seconds. I was shocked as I expected the process to take a few hours. Had I known, I would have asked for a maintenance window, but we would have lost the ability to roll back. 282 | 283 | Then, we started the data nodes, enabled allocation again, and 30 minutes later, the cluster was green. 284 | 285 | The last thing was to add a working unit to feed that Blackhole02 cluster and catch up with the data. This was made possible by saving the Kafka offset before we shut down the Barack and Chirack data nodes. 286 | 287 | ## Conclusion 288 | 289 | The whole migration took less than 20 hours, including transferring 130TB of data on a dual data center setup. 290 | 291 | ![The migration](images/image22.svg) 292 | The most important point here was that we were able to roll back at any time, including after the migration if something was wrong on the application level.
293 | 294 | Deciding to double the cluster for a while was mostly a financial debate, but it had lots of pros, starting with the security it brought, as well as the opportunity to replace hardware that had been running for 2 years. 295 | -------------------------------------------------------------------------------- /103-use-case-migrating-130tb-cluster-without-downtime/images/image17.svg: -------------------------------------------------------------------------------- 1 | 2 |
(SVG diagram; text labels: Write, Processing chain, Read, Kafka queue, Indexing, Working units, Blackhole)
-------------------------------------------------------------------------------- /103-use-case-migrating-130tb-cluster-without-downtime/images/image19.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/103-use-case-migrating-130tb-cluster-without-downtime/images/image19.png -------------------------------------------------------------------------------- /103-use-case-migrating-130tb-cluster-without-downtime/images/image22.svg: -------------------------------------------------------------------------------- 1 | 2 |
(SVG diagram; text labels: Replication from 1 to 3 (17 hours), Cluster split (20 minutes), Upgrade (1 minute), Recovery (30 minutes))
-------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2017 Fred de Villamil 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Operating Elasticsearch 6 | ## for Fun and Profit 7 | 8 | --- 9 | 10 | ![](images/image1.png) 11 | 12 | 13 | ## [Fred de Villamil](https://thoughts.t37.net) 14 | 15 | --- 16 | 17 | ## [Read online](https://fdv.github.io/running-elasticsearch-fun-profit) 18 | 19 | --- 20 | 21 | ## Code of Conduct 22 | 23 | - Behave like normal, friendly, welcoming human beings or get the hell out. 24 | - Any reference to a non scientific, verifiable element is irrelevant. 
25 | 26 | --- 27 | 28 | ## TOC 29 | 30 | - [Getting Started with Elasticsearch](001-getting-started/001-getting-started.md/#getting-started-with-elasticsearch) 31 | * [Prerequisites](001-getting-started/001-getting-started.md/#prerequisites) 32 | * [Elasticsearch basic concepts](001-getting-started/001-getting-started.md/#elasticsearch-basic-concepts) 33 | + [REST APIs](001-getting-started/001-getting-started.md/#rest-apis) 34 | + [Open Source](001-getting-started/001-getting-started.md/#open-source) 35 | + [Java](001-getting-started/001-getting-started.md/#java) 36 | + [Distributed](001-getting-started/001-getting-started.md/#distributed) 37 | + [Scalable](001-getting-started/001-getting-started.md/#scalable) 38 | + [Fault tolerant](001-getting-started/001-getting-started.md/#fault-tolerant) 39 | * [What's an Elasticsearch cluster?](001-getting-started/001-getting-started.md/#whats-an-elasticsearch-cluster) 40 | + [Master node](001-getting-started/001-getting-started.md/#master-node) 41 | + [Ingest nodes](001-getting-started/001-getting-started.md/#ingest--nodes) 42 | + [Data Nodes](001-getting-started/001-getting-started.md/#data-nodes) 43 | + [Tribe Nodes](001-getting-started/001-getting-started.md/#tribe-nodes) 44 | + [A Minimal, Fault Tolerant Elasticsearch Cluster](001-getting-started/001-getting-started.md/#a-minimal-fault-tolerant-elasticsearch-cluster) 45 | * [What's an Elasticsearch index](001-getting-started/001-getting-started.md/#whats-an-elasticsearch-index) 46 | * [Deploying your first Elasticsearch cluster](001-getting-started/001-getting-started.md/#deploying-your-first-elasticsearch-cluster) 47 | + [Deploying Elasticsearch on Debian](001-getting-started/001-getting-started.md/#deploying-elasticsearch-on-debian) 48 | + [Deploying Elasticsearch on RHEL / CentOS](001-getting-started/001-getting-started.md/#deploying-elasticsearch-on-rhel--centos) 49 | * [First step using Elasticsearch](001-getting-started/001-getting-started.md/#first-step-using-elasticsearch) 50 | * [Elasticsearch Configuration](001-getting-started/001-getting-started.md/#elasticsearch-configuration) 51 | * [Elasticsearch Plugins](001-getting-started/001-getting-started.md/#elasticsearch-plugins) 52 | 53 | - [Elasticsearch and the Java Virtual Machine](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#elasticsearch-and-the-java-virtual-machine) 54 | * [Supported JVM and operating systems / distributions](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#supported-jvm-and-operating-systems--distributions) 55 | + [Operating system matrix](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#operating-system-matrix) 56 | + [Java Virtual Machine matrix](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#java-virtual-machine-matrix) 57 | * [Memory management](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#memory-management) 58 | * [Garbage collection](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#garbage-collection) 59 | + [Concurrent Mark & Sweep Garbage Collector](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#concurrent-mark--sweep-garbage-collector) 60 | + [Garbage First Garbage Collector](002-elasticsearch-and-the-jvm/002-elasticsearch-and-the-jvm.md/#garbage-first-garbage-collector) 61 | 62 | - [A few things you need to know about Lucene](003-about-lucene/003-about-lucene.md#a-few-things-you-need-to-know-aboutlucene) 63 | * [Lucene segments](003-about-lucene/003-about-lucene.md#lucene-segments) 64 | * 
[Lucene deletes and updates](003-about-lucene/003-about-lucene.md#lucene-deletes-andupdates) 65 | 66 | - [Designing the Perfect Elasticsearch Cluster](004-cluster-design/004-cluster-design.md/#designing-the-perfect-elasticsearch-cluster) 67 | * [Elasticsearch is elastic, for real](004-cluster-design/004-cluster-design.md/#elasticsearch-is-elastic-for-real) 68 | * [Design for failure](004-cluster-design/004-cluster-design.md/#design-for-failure) 69 | * [Hardware](004-cluster-design/004-cluster-design.md/#hardware) 70 | + [CPU](004-cluster-design/004-cluster-design.md/#cpu) 71 | + [Memory](004-cluster-design/004-cluster-design.md/#memory) 72 | + [Network](004-cluster-design/004-cluster-design.md/#network) 73 | + [Storage](004-cluster-design/004-cluster-design.md/#storage) 74 | * [Software](004-cluster-design/004-cluster-design.md/#software) 75 | + [The Linux (or FreeBSD) kernel](004-cluster-design/004-cluster-design.md/#the-linux-or-freebsd-kernel) 76 | + [The Java Virtual Machine](004-cluster-design/004-cluster-design.md/#the-java-virtualmachine) 77 | + [The filesystem](004-cluster-design/004-cluster-design.md/#the-filesystem) 78 | * [Designing your indices](004-cluster-design/004-cluster-design.md/#designing-your-indices) 79 | + [Sharding](004-cluster-design/004-cluster-design.md/#sharding) 80 | + [Replication](004-cluster-design/004-cluster-design.md/#replication) 81 | * [Optimising allocation](004-cluster-design/004-cluster-design.md/#optimising-allocation) 82 | * [Troubleshooting and scaling](004-cluster-design/004-cluster-design.md/#troubleshooting-and-scaling) 83 | + [CPU](004-cluster-design/004-cluster-design.md/#cpu-1) 84 | + [Memory](004-cluster-design/004-cluster-design.md/#memory-1) 85 | 86 | - [Design for Event Logging](005-design-event-logging/005-design-event-logging.md/#design-for-event-logging) 87 | * [Design of an event logging infrastructure cluster](005-design-event-logging/005-design-event-logging.md/#design-of-an-event-logging-infrastructure-cluster) 88 | + [Throughput: how many events per second (005-design-event-logging/005-design-event-logging.md//eps) are you going to collect?](005-design-event-logging/005-design-event-logging.md/#throughput--how-many-events-per-second--eps--are-you-going-to-collect-) 89 | + [Retention: how long do you want to keep your data, hot and cold?](005-design-event-logging/005-design-event-logging.md/#retention--how-long-do-you-want-to-keep-your-data--hot-and-cold-) 90 | + [Size: what is the average size of a collected event?](005-design-event-logging/005-design-event-logging.md/#size--what-is-the-average-size-of-a-collected-event-) 91 | + [Fault tolerance: can you afford losing your indexed data?](005-design-event-logging/005-design-event-logging.md/#fault-tolerance--can-you-afford-losing-your-indexed-data-) 92 | + [Queries](005-design-event-logging/005-design-event-logging.md/#queries) 93 | * [Which hardware do I need?](005-design-event-logging/005-design-event-logging.md/#which-hardware-do-i-need-) 94 | * [How to design my indices?](005-design-event-logging/005-design-event-logging.md/#how-to-design-my-indices-) 95 | * [What about some tuning?](005-design-event-logging/005-design-event-logging.md/#what-about-some-tuning-) 96 | 97 | - [Operating Daily](006-operating-daily/006-operating-daily.md/#operating-daily) 98 | * [Elasticsearch most common operations](006-operating-daily/006-operating-daily.md/#elasticsearch-most-common-operations) 99 | + [Mass index deletion with 
pattern](006-operating-daily/006-operating-daily.md/#mass-index-deletion-with-pattern) 100 | + [Mass optimize, indexes with the most deleted docs first](006-operating-daily/006-operating-daily.md/#mass-optimize--indexes-with-the-most-deleted-docs-first) 101 | + [Restart a cluster using rack awareness](006-operating-daily/006-operating-daily.md/#restart-a-cluster-using-rack-awareness) 102 | + [Optimize your cluster restart](006-operating-daily/006-operating-daily.md/#optimize-your-cluster-restart) 103 | + [Remove data nodes from a cluster the safe way](006-operating-daily/006-operating-daily.md/#remove-data-nodes-from-a-cluster-the-safe-way) 104 | * [Get useful information about your cluster](006-operating-daily/006-operating-daily.md/#get-useful-information-about-your-cluster) 105 | + [Nodes information](006-operating-daily/006-operating-daily.md/#nodes-information) 106 | + [Monitor your search queues](006-operating-daily/006-operating-daily.md/#monitor-your-search-queues) 107 | + [Indices information](006-operating-daily/006-operating-daily.md/#indices-information) 108 | + [Shard allocation information](006-operating-daily/006-operating-daily.md/#shard-allocation-information) 109 | + [Recovery information](006-operating-daily/006-operating-daily.md/#recovery-information) 110 | + [Segments information (006-operating-daily/006-operating-daily.md//can be extremely verbose)](006-operating-daily/006-operating-daily.md/#segments-information--can-be-extremely-verbose-) 111 | + [Cluster stats](006-operating-daily/006-operating-daily.md/#cluster-stats) 112 | + [Nodes stats](006-operating-daily/006-operating-daily.md/#nodes-stats) 113 | + [Indice stats](006-operating-daily/006-operating-daily.md/#indice-stats) 114 | + [Indice mapping](006-operating-daily/006-operating-daily.md/#indice-mapping) 115 | + [Indice settings](006-operating-daily/006-operating-daily.md/#indice-settings) 116 | + [Cluster dynamic settings](006-operating-daily/006-operating-daily.md/#cluster-dynamic-settings) 117 | + [All the cluster settings (006-operating-daily/006-operating-daily.md//can be extremely verbose)](006-operating-daily/006-operating-daily.md/#all-the-cluster-settings--can-be-extremely-verbose-) 118 | 119 | 120 | - [Monitoring Elasticsearch](007-monitoring-es/007-monitoring-es.md/#monitoring-elasticsearch) 121 | * [Tools](007-monitoring-es/007-monitoring-es.md/#tools) 122 | * [Monitoring at the host level](007-monitoring-es/007-monitoring-es.md/#monitoring-at-the-host-level) 123 | * [Monitoring at the node level](007-monitoring-es/007-monitoring-es.md/#monitoring-at-the-node-level) 124 | * [Monitoring at the cluster level](007-monitoring-es/007-monitoring-es.md/#monitoring-at-the-cluster-level) 125 | * [Monitoring at the index level](007-monitoring-es/007-monitoring-es.md/#monitoring-at-the-index-level) 126 | 127 | - [How we reindexed 36 billion documents in 5 days within the same Elasticsearch cluster](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#how-we-reindexed-36-billion-documents-in-5-days-within-the-same-elasticsearch-cluster) 128 | * [The "Blackhole" cluster](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#the--blackhole--cluster) 129 | * [Elasticsearch configuration](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#elasticsearch-configuration) 130 | * [Tuning the Java virtual 
machine](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#tuning-the-java-virtual-machine) 131 | + [Blackhole Initial indexing](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#blackhole-initial-indexing) 132 | * [Blackhole initial migration](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#blackhole-initial-migration) 133 | * [Blackhole reindexing](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#blackhole-reindexing) 134 | + [The reindexing process](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#the-reindexing-process) 135 | + [Logstash configuration](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#logstash-configuration) 136 | + [Reindexing Elasticsearch configuration](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#reindexing-elasticsearch-configuration) 137 | + [Introducing Yoko and Moulinette](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#introducing-yoko-and-moulinette) 138 | * [Reindexing in 5 days](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#reindexing-in-5-days) 139 | * [Conclusion](100-use-cases-reindexing-36-billion-docs/100-use-cases-reindexing-36-billion-docs.md/#conclusion) 140 | 141 | - [Use Case: Migrating a Cluster Across the Ocean Without Downtime](101-use-case-migrating-cluster-over-ocean/101-use-case-migrating-cluster-over-ocean.md/#use-case--migrating-a-cluster-across-the-ocean-without-downtime) 142 | 143 | - [Use Case: An Advanced Elasticsearch Architecture for High-volume Reindexing](102-use-case-advanced-architecture-high-volume-reindexing/102-use-case-advanced-architecture-high-volume-reindexing.md/#use-case--an-advanced-elasticsearch-architecture-for-high-volume-reindexing) 144 | * [A glimpse at our infrastructure](102-use-case-advanced-architecture-high-volume-reindexing/102-use-case-advanced-architecture-high-volume-reindexing.md/#a-glimpse-at-our-infrastructure) 145 | * [Using Elasticsearch for fun and profit](102-use-case-advanced-architecture-high-volume-reindexing/102-use-case-advanced-architecture-high-volume-reindexing.md/#using-elasticsearch-for-fun-and-profit) 146 | * [Conclusion](102-use-case-advanced-architecture-high-volume-reindexing/102-use-case-advanced-architecture-high-volume-reindexing.md/#conclusion) 147 | 148 | - [Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours with 0 Downtime and a Rollback Strategy](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#migrating-a-130tb-cluster-from-elasticsearch-2-to-5-in-20-hours-with-0-downtime-and-a-rollback-strategy) 149 | * [Elasticsearch @Synthesio, November 2017](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#elasticsearch--synthesio--november-2017) 150 | * [The Blackhole Cluster](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#the-blackhole-cluster) 151 | * [Migration Strategies: Cluster restart VS Reindex API VS Logstash VS the Fun Way](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#migration-strategies--cluster-restart-vs-reindex-api-vs-logstash-vs-the-fun-way) 152 | + [The Cluster Restart 
Strategy](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#the-cluster-restart-strategy) 153 | + [The Reindex API Strategy](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#the-reindex-api-strategy) 154 | + [The Logstash Strategy](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#the-logstash-strategy) 155 | + [The Fun Way](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#the-fun-way) 156 | * [Migrating Blackhole for Real](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#migrating-blackhole-for-real) 157 | + [Expanding Blackhole](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#expanding-blackhole) 158 | + [Splitting Blackhole in 2](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#splitting-blackhole-in-2) 159 | * [Conclusion](103-use-case-migrating-130tb-cluster-without-downtime/103-use-case-migrating-130tb-cluster-without-downtime.md/#conclusion) 160 | 161 | --- 162 | 163 | ## Styling 164 | 165 | This is the Markdown styling used in this book. If you plan to contribute, please use it. 166 | 167 | ### Chapter title 168 | 169 | ```markdown 170 | # This is a chapter title 171 | 172 | ``` 173 | 174 | ### Chapter part 175 | 176 | ```markdown 177 | --- 178 | 179 | ## A chapter part title is preceded by an horizontal line 180 | ``` 181 | 182 | ### Chapter subpart 183 | 184 | ```markdown 185 | ### A level 1 subpart 186 | #### A level 2 subpart 187 | ``` 188 | 189 | ### Images 190 | 191 | ```markdown 192 | ![An image should have an alt text](use/a/relative.link) 193 | ``` 194 | 195 | ### Code: 196 | 197 | ```markdown 198 | An `inline code block` goes like this 199 | ``` 200 | 201 | API calls go the Curl way 202 | 203 | ```bash 204 | curl -X POST "localhost:9200/_search" -H 'Content-Type: application/json' -d' 205 | { 206 | "query" : { 207 | "match_all" : {} 208 | }, 209 | "stats" : ["group1", "group2"] 210 | } 211 | ' 212 | ``` 213 | 214 | Yaml code is expanded for more readability 215 | ```yaml 216 | --- 217 | some: 218 | value: 219 | goes: "like this" 220 | ``` 221 | 222 | ### Links 223 | 224 | ```markdown 225 | [An internal link](has/a/relative.path) 226 | [An external link](https://has.an.absolute/path) 227 | ``` 228 | 229 | ### Lists 230 | 231 | Urdered lists: 232 | 233 | ```markdown 234 | Only one line break between a paragraph and 235 | 236 | * An 237 | * unordered 238 | * list 239 | * with 240 | * subitems 241 | ``` 242 | 243 | Ordered lists: 244 | 245 | ```markdown 246 | 1. An 247 | 2. Ordered 248 | 3. List 249 | 1. With 250 | 2. 
subitems 251 | ``` 252 | 253 | -------------------------------------------------------------------------------- /ZH-CN/001-入门/001-入门.md: -------------------------------------------------------------------------------- 1 | ``` 2 | 涵盖ELASTICSEARCH 5.5.x,正在升级到ES 6.5.x 3 | ``` 4 | 5 | # Elasticsearch入门 6 | 7 | 这一章是写给还没有使用过Elasticsearch的人,它涵盖了Elasticsearch基本的概念,并指引你部署和使用你的第一个单节点的集群。每个在这里提及的概念在后文都会有详细的解释。 8 | 9 | 在这个章节你将学习到: 10 | 11 | - Elasticsearch背后的基础概念 12 | - 什么是Elasticsearch集群 13 | - 如何在最常用的操作系统中部署你的第一个、单节点的Elasticsearch集群 14 | - 如何使用Elasticsearch索引文档和查找内容 15 | - Elasticsearch基础配置 16 | - 什么是Elasticsearch插件和如何使用他们 17 | 18 | --- 19 | 20 | ## 读前须知 21 | 22 | 为了阅读这本书和进行这个章节描述的操作,你需要: 23 | 24 | - 一台运行着主流Linux或Unix环境机器或者虚拟机,如Debian / Ubuntu,RHEL / CentOS 或 FreeBSD. 在Mac OS和Windows上运行Elasticsearch不在这本书的涵盖范围内 25 | - 一些基础的UNIX命令行知识和终端的使用 26 | - 你最喜欢的文本编辑器 27 | 28 | 如果你之前还没有使用过Elasticsearch,我建议你创建一个虚拟机,防止你意外损坏你的宿主机系统。你也可以使用一个虚拟化工具如[Virtualbox](https://www.virtualbox.org/)来运行它,或着在你最喜欢的云服务提供商机器上。 29 | 30 | --- 31 | 32 | ## Elasticsearch基础概念 33 | 34 | Elasticsearch是一个使用Java编写的分布式的、可扩展的、具备容错能力的开源的搜索引擎。它提供了一个强大的REST API,用于添加、搜索数据和更新配置。Elasticsearch由Elastic公司开发,公司创建者Shay Banon基于Lucene开发了这个项目 35 | 36 | ### REST API 37 | 38 | REST API是使用HTTP请求来`GET`,`PUT`,`POST`和`DELETE`数据的应用程序接口。一个网站提供的API是允许两个软件程序交互的代码,API提供了一种很好的方式,使得开发者可以在一个操作系统或者其他应用程序编写一段程序来请求网站服务。REST是互联网中与数据库的CRUD(Create,Read,Update,Delete)相对应的概念。 39 | 40 | ### 开源 41 | 42 | 开源意味着Elasticsearch的源代码,也就是构建软件的“配方”,是公开的、免费的并且每个人都可以通过添加缺失的功能、文档或者修复漏洞来对代码做出贡献。如果贡献被项目所接受,那么他们的工作就会被整个社区所知。因为Elasticsearch是开源的,所以无论它背后的公司破产倒闭或者不再维护它,它都不会消亡,因为其他人将会接管它并且使它一直存活。 43 | 44 | ### Java 45 | 46 | Java是一种编程语言,在1995由Sun Microsystems创建。Java应用程序运行在Java虚拟机(JVM)上,这意味着它不依赖于它所运行的平台。Java最让人熟知的是他的垃圾回收器(GC),一种管理内存强有力的方式。 47 | 48 | Java不是Javascript, 后者是在90年代中期由Netscape INC开发的编程语言。两者除了相似的名字,完全是两种不同的语言,使用的目的也不同。 49 | 50 | > Javascript is to Java what hamster is to ham. – Jeremy Keith 51 | 52 | ### 分布式 53 | 54 | Elasticsearch运行在许多机器上,机器的数量由工作负载和数据量而定。机器间使用网络消息相互通信和同步。一台联网的、运行着Elasticsearch的机器称为一个节点,整个共享这相同集群名字的节点群成为集群。 55 | 56 | ### 可扩展的 57 | 58 | Elasticsearch是水平扩展的。水平扩展意味着集群可以通过添加新的节点来扩大规模。当添加机器时,你不需要重启整个集群。当一个新的节点加入集群,它将会获得已有数据中的一部分。和水平扩展相对的是垂直扩展,它扩展的唯一方式就是让软件运行在配置更高的机器上。 59 | 60 | ### 容错的 61 | 62 | 除非指定副本数,Elasticsearch确保了数据至少在两个不同的节点被备份了一次。当其中一个节点离开集群时,Elasticsearch会重新在剩余的节点中构建副本,除非没有剩余的可以备份的节点了。 63 | 64 | --- 65 | 66 | ## 什么是Elasticsearch集群? 
67 | 68 | 一个集群可以是运行着Elasticsearch的一台机器或者配置了相同`cluster name`的一群机器。默认的`cluster name`是`elasticsearch`,但是不推荐使用在生产环境中。 69 | 70 | 在Elasticsearch集群中每台机器将承担下面一个或者多个角色: 71 | 72 | ### Master节点 73 | 74 | Master节点控制整个集群。它将集群信息传递给正在加入集群的节点,决定数据如何移动,当一个节点离开集群时重新分配丢失的数据等。当多个节点都能够承担Master节点,Elasticsearch将通过选举产生一个活动的Master。这个活动的Master被称为`elected master`,当这个被选举的Master离开集群时,其他Master节点将接管`elected master`的角色。 75 | 76 | ### Ingest节点 77 | 78 | Ingest节点在文档实际被索引之前对它们进行预处理。Ingest节点会截取bulk和index请求,先对它们进行预处理,再将预处理后的文档发回给index或者bulk API。 79 | 80 | 默认所有节点都开启了Ingest,所以任何节点都能够处理Ingest任务。你也可以创建专用的Ingest节点。 81 | 82 | ### Data节点 83 | 84 | Data节点存储索引好的数据。它们负责管理存储的数据,并且在数据被查询的时候对数据执行操作。 85 | 86 | ### Tribe节点 87 | 88 | Tribe节点连接了多个Elasticsearch集群,它在每个被连接的集群上执行诸如搜索等操作。 89 | 90 | ### 一个最小的、具有容错能力的Elasticsearch集群 91 | 92 | ![一个最新的Elasticsearch集群](images/image1.svg) 93 | 94 | 一个最小的、具有容错能力的Elasticsearch集群应该包括: 95 | 96 | * 3个master节点 97 | * 2个ingest节点 98 | * 2个data节点 99 | 100 | 拥有3个master节点能够确保集群中存在至少2个有选举权的master节点,保证在出现网络分区时不出现脑裂的状态。如果有选举权的master节点小于2个,集群将会拒绝任何新的索引请求直到问题被修复。 101 | 102 | --- 103 | 104 | ## 什么是Elasticsearch索引 105 | 106 | 索引是一系列的拥有相同特征的文档集合。索引由它的名称所确定,名称在对存储的文档或者索引结构本身执行操作时使用。索引结构由映射定义,它是描述了文档特征和索引选项例如“replication factor”的一个`JSON`文件。在Elasticsearch集群中,你可以根据需要定义任意数量的索引。 107 | 108 | Elasticsearch索引由1个或多个分片组成。分片是一个Lucene索引,它的数量在索引被创建的时候就确定了。Elasticsearch在整个集群中分配一个索引的所有分片,可以自动分配或根据用户定义的规则。 109 | 110 | Lucene是Elasticsearch底层的搜索引擎,它是Apache基金会的开源项目。你很可能在操作Elasticsearch集群时不会意识到Lucene,但是这本书将涵盖所有你需要知道的基础知识。 111 | 112 | 一个分片由1个或多个数据段组成,这些数据段是二进制文件,也是Lucene索引存储的文档的地方。 113 | 114 | ![一个Elasticsearch索引内部](images/image2.svg) 115 | 116 | 如果你熟悉关系型数据库如MySQL,那么索引对应于数据库中的库,映射对应于库的schema,分片对应于数据库中的数据。由于Elasticsearch的分布式特性和Lucece的特异性,不再将它们和关系型数据库进行对比。 117 | 118 | --- 119 | 120 | ## 部署你的第一个Elasticsearch集群 121 | 122 | ### 在Debian上部署Elasticsearch 123 | 124 | TODO [issue #9](https://github.com/fdv/running-elasticsearch-fun-profit/issues/9) 125 | 126 | ### 在RHEL / CentOS上部署Elasticsearch 127 | 128 | TODO [issue #9](https://github.com/fdv/running-elasticsearch-fun-profit/issues/9) 129 | 130 | --- 131 | 132 | ## 使用Elasticsearch的第一步 133 | 134 | TODO [issue #10](https://github.com/fdv/running-elasticsearch-fun-profit/issues/10) 135 | 136 | --- 137 | 138 | ## Elasticsearch配置 139 | 140 | TODO [issue #10](https://github.com/fdv/running-elasticsearch-fun-profit/issues/10) 141 | 142 | ## Elasticsearch插件 143 | 144 | TODO [issue #10](https://github.com/fdv/running-elasticsearch-fun-profit/issues/10) -------------------------------------------------------------------------------- /ZH-CN/001-入门/images/image1.svg: -------------------------------------------------------------------------------- 1 | 2 |
(SVG diagram; text labels: Master nodes, Elected master, Ingest nodes, Eligible masters, Data nodes, X)
-------------------------------------------------------------------------------- /ZH-CN/001-入门/images/image2.svg: -------------------------------------------------------------------------------- 1 | 2 |

(SVG diagram; text labels: Elasticsearch Index, Elasticsearch shard, Lucene Index, Lucene segment)
-------------------------------------------------------------------------------- /ZH-CN/002-Elasticsearch和JVM/002-elasticsearch-and-the-jvm.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # Elasticsearch和Java虚拟机 6 | 7 | Elasticsearch是使用Java编写的软件,它需要部署在同一台机器的Java运行时环境(JRE)去运行。目前支持的Elasticsearch版本可以在以下操作系统/发行版和Java上运行。 8 | 9 | ## 支持的JVM和操作系统/发行版 10 | 11 | 下面的表格展示了Elastic为2.4.x和5.5.x版本官方所支持的各种的操作系统和Java虚拟机。下面没有提及的操作系统或者JVM是不被Elastic所支持的,因此不应该使用。 12 | 13 | ### 操作系统 14 | 15 | | | CentOS/RHEL 6.x/7.x | Oracle Enterprise Linux 6/7 with RHEL Kernel only | Ubuntu 14.04 | Ubuntu 16.04 | **Ubuntu 18.04** | SLES 11 SP4\*\*/12 | SLES 12 | openSUSE Leap 42 | 16 | | --- |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| 17 | | **ES 5.0.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 18 | | **ES 5.1.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 19 | | **ES 5.2.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 20 | | **ES 5.3.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 21 | | **ES 5.4.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 22 | | **ES 5.5.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 23 | | **ES 6.0.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 24 | | **ES 6.1.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 25 | | **ES 6.2.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 26 | | **ES 6.3.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 27 | | **ES 6.4.x** | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 28 | | **ES 6.5.x** | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 29 | 30 | 31 | | | Windows Server 2012/R2 | Windows Server 2016 | Debian 7 | Debian 8 | Debian 9 | **Solaris / SmartOS** | Amazon Linux | 32 | | --- |:---:|:---:|:---:|:---:|:---:|:---:|:---:| 33 | | **ES 5.0** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 34 | | **ES 5.1.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 35 | | **ES 5.2.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 36 | | **ES 5.3.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 37 | | **ES 5.4.x** | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 38 | | **ES 5.5.x** | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 39 | | **ES 6.0.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 40 | | **ES 6.1.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 41 | | **ES 6.2.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 42 | | **ES 6.3.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 43 | | **ES 6.4.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 44 | | **ES 6.5.x** | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 45 | 46 | Elasticsearch可以运行在OpenSolaris和FreeBSD上。FreeBSD 11.1提供了一个由[Mark Felder](mailto:feld@freebsd.org)维护的Elasticsearch 6.4.2版本包,但是这些操作系统都不被Elastic官方支持。 47 | 48 | ### Java虚拟机 49 | 50 | | | Oracle/OpenJDK 1.8.0u111+ | Oracle/OpenJDK 9 | OpenJDK 10 | OpenJDK 11 | Azul Zing 16.01.9.0+ | IBM J9 | 51 | | --- |:---:|:---:|:---:|:---:|:---:| --- | 52 | | **ES 5.0.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 53 | | **ES 5.1.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 54 | | **ES 5.2.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 55 | | **ES 5.3.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 56 | | **ES 5.4.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 57 | | **ES 5.5.x** | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 58 | | **ES 5.6**.x | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | 59 | | **ES 6.0.x** | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 60 | | **ES 6.1.x** | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 61 | | **ES 6.2.x** | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 62 | | **ES 6.3.x** | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | 63 | | **ES 6.4.x** | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | 64 | | **ES 6.5.x** | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | 65 | 66 | 67 | 68 | ## 内存管理 69 | 70 | TODO 71 | 72 | ## 垃圾回收 73 | 74 | Java是一个垃圾收集语言。开发者不必管理内存的分配,Java虚拟机周期性地运行一个称为GC线程的特定系统线程来负责不同的垃圾收集活动。这些活动之一是回收不再被程序使用的对象占用的内存。 75 | 76 | Java 1.8 拥有3个不同的垃圾收集器家族,每个都拥有自己的特性。 77 | 78 | *Single Collector* 使用一个单线程来执行整个垃圾回收过程。它在单处理器的机器上非常高效,因为它消除了线程之间通信所隐含的开销,但是它不适合大部分今天现实世界中的用途。它被设计用于管理堆中小的100M数量级的数据集。 
79 | 80 | *Parallel Collector* 并行地运行多个小规模的垃圾回收器。运行并行的收集器减少了垃圾收集的开销。它专为在多线程主机上运行的中型到大型数据集而设计。 81 | 82 | *Mostly Concurrent Collector* 同时执行其大部分工作,以防止垃圾收集暂停。它适用于大型数据集且响应时间很重要的情况,因为用于最小化暂停的技术会影响应用程序性能。Java 1.8提供了两个主要的并发收集器:*Concurrent Mark & Sweep Garbage Collector* 和 *Garbage First Garbage Collector*,也被称为G1GC。 83 | 84 | ### 并发标记和扫描垃圾收集器 85 | 86 | TODO 87 | 88 | ### G1垃圾收集器 89 | 90 | TODO 91 | -------------------------------------------------------------------------------- /ZH-CN/003-关于Lucene/003-关于Lucene.md: -------------------------------------------------------------------------------- 1 | ``` 2 | WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x 3 | ``` 4 | 5 | # 关于Lucene你需要知道的 6 | 7 | 在你开始考虑选择合适的硬件前,有一些关于[Lucene](http://lucene.apache.org/)你需要知道的。 8 | 9 | Lucene是Elasticsearch所使用的搜索引擎的名字,它是一个来自Apache基金会的开源项目。当运行Elasticsearch的时候,在大部分情况下,我们不需要直接和Lucene交互。但是有一些在我们选择集群存储和文件系统前需要知道的重要的事情。 10 | 11 | ## Lucene段 12 | 13 | 每个Elasticsearch索引都被分为分片。分片既是一个索引的逻辑也是物理划分。每个Elasticsearch分片都是一个Lucene索引。在一个Lucene索引中你可以拥有的最大文档数是2,147,483,519。Lucene索引被分为更小的称为段的文件,一个段是一个小的Lucene索引。Lucene按顺序搜索所有段。 14 | 15 | ![一个Elasticsearch索引内部](images/image2.svg) 16 | 17 | 当一个新的writer被打开,以及一个writer被提交或者被关闭时,Lucene会创建一个段。这意味着段是不可变的。当你向Elasticsearch索引中加入新的文档,Lucene创建一个新的段并且写入它。当索引的吞吐量很重要时,Lucene也能创建更多的段。 18 | 19 | Lucene不时地将较小的段合并为较大的段。 也可以使用Elasticsearch API手动触发合并。 20 | 21 | 从操作的角度来看,这种行为会产生一些影响。 22 | 23 | 集群中拥有的段越多,搜索的速度也越慢。这是因为Lucene需要顺序地搜索这些所有的段,而不是并行的。因此拥有少量的段可以加快搜索速度。 24 | 25 | Lucene的合并操作需要CPU和I/O开销,这意味着它们可能减慢你的索引速度。当执行一个bulk索引时,比如初始化索引,推荐完全地停止合并操作。 26 | 27 | 如果你计划在同一台机器上存储许多分片和段,您可能需要选择一个能够很好地处理大量小文件的文件系统,并且没inode限制。关于选择正确的文件系统的部分我们将详细讨论。 28 | 29 | ## Lucene删除和更新 30 | 31 | 在更新和删除文档时,Lucene会执行写时拷贝。这意味着文档永远不会从索引中被删除。相反,Lucene将文档标记为已删除,并在触发更新文档时创建另一个文档。 32 | 33 | 写时拷贝带来的操作后果是,当你更新或删除文档时,除非你完全删除它们,否则磁盘上索引空间将不断增长。一种实际删除被标记的文档的解决方案是强制Lucene进行段合并。 34 | 35 | 在合并时,Lucene将2个段的内容移动到第三个新段,然后从磁盘中删除旧段。这意味着Lucene需要足够的可用空间来创建一个和需要合并的两个段大小相同的段。 36 | 37 | 当强制合并一个巨大的分片时可能会出现问题。如果这个分片大小\>磁盘大小的一半,那么你可能无法完全合并它,除非分片中大多数数据都是由已删除的文档组成的。 38 | -------------------------------------------------------------------------------- /ZH-CN/003-关于Lucene/images/image2.svg: -------------------------------------------------------------------------------- 1 | 2 |

(SVG diagram; text labels: Elasticsearch Index, Elasticsearch shard, Lucene Index, Lucene segment)
-------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-midnight -------------------------------------------------------------------------------- /images/image1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fdv/running-elasticsearch-fun-profit/9e6814b88cdff4263de742a8a810a01f312df15e/images/image1.png --------------------------------------------------------------------------------