At the speed of light

Woot, first post in more than one year, that’s quite a thing!

ElasticSearch is an awesome piece of software, but some management operations can be a real pain in the administrator’s ass. Performing a rolling restart of your cluster without downtime is one of them. On a cluster of about 30 servers running shards of up to 900GB, it would take up to 3 days. Fortunately, we’re now able to do it in less than 30 minutes on 70 nodes holding more than 100TB of data.

If you look up how to perform an ElasticSearch rolling restart, here’s the procedure you’ll find:

  1. Disable the shard allocation on your cluster.
  2. Restart a node.
  3. Enable the shard allocation.
  4. Wait for the cluster to be green again.
  5. Goto 2 until you’re done.
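The loop above can be sketched in a few lines of shell. This is only a sketch: it assumes the cluster answers on localhost:9200, and the actual node restart (step 2) happens out of band. The `health_status` helper just extracts the `status` field from a `_cluster/health` response, and is demonstrated on a canned response so the snippet runs standalone.

```shell
# Sketch of the classic (slow) rolling restart loop, assuming the
# cluster answers on localhost:9200. Restarting the node itself is
# left out; only the "wait for green" part (step 4) is shown.

# Extract the "status" field from a _cluster/health JSON body.
health_status() {
    printf '%s' "$1" | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p'
}

# Poll _cluster/health until the cluster reports green. Note that
# _cluster/health also accepts wait_for_status=green&timeout=30m,
# which makes a single blocking call do the waiting for you.
wait_for_green() {
    until [ "$(health_status "$(curl -s localhost:9200/_cluster/health)")" = "green" ]; do
        sleep 10
    done
}

# Demonstration on a canned health response:
sample='{"cluster_name":"es","status":"green","timed_out":false}'
health_status "$sample"   # prints "green"
```

The whole point of what follows is that, with rack awareness, you only have to sit through that wait once per rack instead of once per node.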

Indeed, step 4 is the longest part, and you don’t have hours, let alone days, to spare. Fortunately, there’s a solution to fix that: rack awareness.

About Rack Awareness

ElasticSearch shard allocation awareness is a rather overlooked feature. It allows you to assign your ElasticSearch nodes to virtual racks so that a primary shard and its replica are never allocated in the same rack. That’s extremely handy to ensure failover when your cluster spans multiple data centers.

Rack awareness requires 2 configuration parameters: one at the node level (restart mandatory) and one at the cluster level (which you can set at runtime).

Let’s say we have a 4-node cluster and want to activate rack awareness. We’ll split the cluster into 2 racks:

  • goldorack
  • alcorack

On the first two nodes, add the following configuration options:

node.rack_id: "goldorack"

And on the 2 other nodes:

node.rack_id: "alcorack"

Restart these nodes, and you’re ready to enable rack awareness.

curl -XPUT localhost:9200/_cluster/settings -d '{
    "persistent" : {
        "cluster.routing.allocation.awareness.attributes" : "rack_id"
    }
}'

Here we are, born to be king. Your ElasticSearch cluster starts reallocating shards. Wait until the rebalance is complete; it can take some time. You’ll soon be able to perform a blazing fast rolling restart.

First, disable shard allocation globally:

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
    "transient" : {
        "cluster.routing.allocation.enable": "none"
    }
}'

Restart the ElasticSearch process on all the nodes in the goldorack rack. Once your cluster is complete again, re-enable shard allocation.

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
    "transient" : {
        "cluster.routing.allocation.enable": "all"
    }
}'

A few minutes later, your cluster is all green. Woot!

What happened there?

Using rack awareness, you ensure that a full copy of your data lives on the alcorack nodes. When you restart all the goldorack nodes at once, the cluster promotes the alcorack replicas to primary shards, and everything keeps running smoothly since every shard still has a live copy and you did not break the quorum. When the goldorack nodes come back, they only have to catch up with the newly indexed data, and you’re green within minutes. Now do exactly the same thing with the other half of the cluster and you’re done.

For the laziest (like me)

Since we’re all lazy and don’t want to SSH into 70 nodes to perform the restart, here’s the Ansible way to do it:

In your inventory:

[escluster]
node01 rack_id=goldorack
node02 rack_id=goldorack
node03 rack_id=alcorack
node04 rack_id=alcorack

And the restart task:

- name: Perform a restart on a machine located in the rack_id passed as a parameter
  service: name=elasticsearch state=restarted
  when: rack is defined and rack == rack_id
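A hypothetical invocation (the playbook and inventory file names are made up for the example): pass the target rack as an extra variable so the `when` condition above only matches half of the cluster.

```shell
# Restart the goldorack half first (restart_es.yml and "inventory"
# are assumed names, not from the original setup):
ansible-playbook -i inventory restart_es.yml -e rack=goldorack
# Wait for the cluster to go green again, then do the other half:
ansible-playbook -i inventory restart_es.yml -e rack=alcorack
```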

That’s all folks, see you soon (I hope)!
