How are searchguard shards assigned?

Elasticsearch 5.6.14. Searchguard 5-5.6.14-19.2

I am trying to understand why searchguard index shards are being allocated to nodes in a way which I think Elasticsearch should not allow.

I work with a cluster that has a hot/cold architecture. We have Ansible use sgadmin.sh to disable replica auto-expansion, count how many hot nodes we have, and explicitly set the number of replicas to that number minus one. I.e. we currently have 6 hot nodes, so Ansible tells sgadmin.sh to set 5 replicas.
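As a rough sketch, the replica arithmetic our Ansible run does boils down to this (the node count here is illustrative):

```shell
# Illustrative sketch of the replica arithmetic: count the hot
# nodes, then set the replica count to that number minus one.
hot_nodes=6                      # we currently have 6 hot nodes
replicas=$(( hot_nodes - 1 ))
echo "replicas=$replicas"        # prints replicas=5
```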

We have 3 hot nodes in a data centre called foo, and 3 in a data centre called bar. We have this in our elasticsearch.yml:

cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

My understanding of those settings is that they mean a replica cannot be in the same data centre as its primary. Except some replicas for the searchguard index are in the same data centre as the primary. We’ve only recently started using Search Guard, and I hadn’t given this any thought until the other day, when I had to stop and then start Elasticsearch on a hot node. After the node re-joined the cluster, the replica shard of the searchguard index that had been on it was not re-assigned back to it. The cluster allocation explanation API said:

[NO(there are too many copies of the shard allocated to nodes with attribute [datacentre], there are [2] total configured shard copies for this shard id and [4] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1])
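The arithmetic behind that message seems to be a ceiling division, using the numbers quoted in the NO decision above:

```shell
# Numbers taken from the NO message in the allocation explain output.
copies=2        # total configured shard copies (primary + 1 replica)
values=4        # total attribute values the cluster knows about
allocated=2     # copies already sitting on nodes with this attribute

# upper bound per attribute = ceil(copies / values), via integer maths
bound=$(( (copies + values - 1) / values ))
echo "bound=$bound allocated=$allocated"
# allocated (2) > bound (1), so the shard stays unassigned
```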

(This is the same message I get if I tell Elasticsearch, via a POST to /_cluster/reroute, to move a primary shard of an index to a node in the same datacentre as its replica.) To get cluster health back to green I PUT "number_of_replicas": 1 into /searchguard/_settings. I then did an Ansible run which used sgadmin.sh to set the replicas back to 5, all of which got assigned.
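For reference, the replica shrink looked roughly like this (same placeholder host and user as in the command output further down):

```shell
# Drop the searchguard index back to one replica so the extra
# copies are removed and cluster health can return to green.
curl -s -u myusername -XPUT 'https://ourcluster:9200/searchguard/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index": {"number_of_replicas": 1}}'
```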

So given the cluster.routing.allocation.awareness settings our cluster has, how does sgadmin.sh create 5 replicas and get them all assigned to a hot node?

To solve this, disable the auto-expand replicas feature of the searchguard index and set the number of replicas manually; see https://docs.search-guard.com/latest/search-guard-index#disable-auto-expand-replicas-of-the-searchguard-index

I’m afraid I don’t see what that solves or how it is an answer to the question I asked. I am already doing both the things you suggest.

Search Guard itself, as well as sgadmin, does nothing special related to shard allocation of the searchguard index besides setting index.auto_expand_replicas: 0-all (auto-expanding replicas). The index is created with 1 shard.

So the first thing, if not already done, is to switch off auto_expand_replicas by executing sgadmin with the -dra flag. Something like “Auto-expand replicas disabled” should be printed out. After that, set the replica count to the desired number with -us <num_of_replicas>. Everything else is up to the Elasticsearch shard allocation algorithm:
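Assuming a standard sgadmin invocation (the host and keystore/truststore paths below are placeholders, not taken from this thread), that sequence looks something like:

```shell
# 1) Switch off auto-expand replicas; sgadmin should print
#    something like "Auto-expand replicas disabled".
./sgadmin.sh -dra -h ourcluster -ts truststore.jks -ks keystore.jks

# 2) Then pin the replica count explicitly.
./sgadmin.sh -us 5 -h ourcluster -ts truststore.jks -ks keystore.jks
```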

Auto-expand the number of replicas based on the number of data nodes in the cluster. Set to a dash delimited lower and upper bound (e.g. 0-5 ) or use all for the upper bound (e.g. 0-all ). Defaults to false (i.e. disabled). Note that the auto-expanded number of replicas does not take any other allocation rules into account, such as shard allocation awareness, filtering or total shards per node, and this can lead to the cluster health becoming YELLOW if the applicable rules prevent all the replicas from being allocated.

https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html

It really feels like you are answering the question “How do I deal with the problem of searchguard index shards not being assigned?”, which is an entirely different question to the one I actually asked, and one which I feel my original post shows I already knew the answer to. You’re telling me to use sgadmin.sh to disable auto expand replicas and to set the number of replicas, both of which I mention doing in my original post.

This is the relevant config on the hot nodes:

$ ansible -i inventories/prod/hosts  logging_elasticsearch_hot_data -m shell -a "grep datacentre /etc/elasticsearch/elasticsearch.yml"

hot_node_3 | CHANGED | rc=0 >>
node.attr.datacentre: foo
cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

hot_node_5 | CHANGED | rc=0 >>
node.attr.datacentre: bar
cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

hot_node_2 | CHANGED | rc=0 >>
node.attr.datacentre: foo
cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

hot_node_1  | CHANGED | rc=0 >>
node.attr.datacentre: foo
cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

hot_node_4 | CHANGED | rc=0 >>
node.attr.datacentre: bar
cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

hot_node_6 | CHANGED | rc=0 >>
node.attr.datacentre: bar
cluster.routing.allocation.awareness.attributes: datacentre
cluster.routing.allocation.awareness.force.datacentre.values: [u'foo', u'bar']

Which shows what I said in my original post: there are 3 hot nodes in each data centre.

This is how shards of the searchguard index are assigned as a result of telling sgadmin.sh to disable auto expand replicas and set the replicas to 5.

$ curl -s  -u myusername -XGET 'https://ourcluster:9200/_cat/shards' |grep searchguard
Enter host password for user 'myusername':
searchguard                                         0 r STARTED         0 58.8kb 10.70.12.25  hot_node_5
searchguard                                         0 p STARTED         0 58.8kb 10.70.12.24  hot_node_4
searchguard                                         0 r STARTED         0 58.8kb 10.70.12.22  hot_node_2
searchguard                                         0 r STARTED         0 58.8kb 10.70.13.244 hot_node_1
searchguard                                         0 r STARTED         0 58.8kb 10.70.12.23  hot_node_3
searchguard                                         0 r STARTED         0 58.8kb 10.70.12.26  hot_node_6
$ curl -s  -u myusername -XGET 'https://ourcluster:9200/_cat/indices/searchguard?h=health'
Enter host password for user 'myusername':
green
$

Which shows what I said in my original post: every shard is successfully assigned to a hot node. And as I said, when I restarted a hot node, Elasticsearch said that the shard that had been assigned to it could not be re-assigned to it because of the routing.allocation.awareness settings.

So to restate my original question: given the routing.allocation.awareness settings our cluster has, how does sgadmin.sh set 5 replicas and have them all get assigned to a hot node?

I find myself wondering if Search Guard is doing something to get shards assigned which isn’t possible via the regular API but it can do because it’s a plugin.

As said above: Search Guard itself as well as sgadmin does nothing special related to shard allocation. So SG is not “doing something to get shards assigned which isn’t possible via the regular API but it can do because it’s a plugin”.

Thanks for the info about Search Guard not doing anything special with regard to shard assignment. With that in mind I’ve done some experimentation and been able to recreate the seemingly impossible shard allocation without using sgadmin.sh. It’s either a bug in Elasticsearch or behaviour I don’t understand.

If I set number_of_replicas for the searchguard index to 1, the single replica shard is allocated to a node in the datacentre the primary shard is not in. If I then set auto_expand_replicas for the searchguard index to "0-all", a replica shard is assigned to every hot node, and there is also a bunch of shards which are unassigned due to the hot/cold architecture. If I then set number_of_replicas to 5, the unassigned shard count falls to 0 and I’m left with a shard of the searchguard index assigned to every hot node despite the cluster.routing.allocation.awareness settings.

I’ve also found that setting number_of_replicas to 1 and then to a number between 2 and 5 results in some, but not all, of the new shards being assigned to hot nodes despite the cluster.routing.allocation.awareness settings.
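For anyone who wants to reproduce this, the sequence of settings changes was roughly the following (same placeholder host and user as in the _cat/shards output above):

```shell
# 1) One replica: it ends up in the other datacentre, as the
#    awareness settings require.
curl -s -u myusername -XPUT 'https://ourcluster:9200/searchguard/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index": {"number_of_replicas": 1}}'

# 2) Auto-expand: a copy lands on every hot node, plus unassigned
#    copies because of the hot/cold architecture.
curl -s -u myusername -XPUT 'https://ourcluster:9200/searchguard/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index": {"auto_expand_replicas": "0-all"}}'

# 3) Pin replicas to 5: the unassigned count drops to 0 and every
#    hot node keeps a copy, despite the awareness settings.
curl -s -u myusername -XPUT 'https://ourcluster:9200/searchguard/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index": {"number_of_replicas": 5}}'
```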

So it seems that when the number of replica shards is increased Elasticsearch 5.6.14 will sometimes assign shards to nodes they shouldn’t be assigned to. And that becomes a problem when a node with a shard that shouldn’t be assigned to it gets restarted.

The cluster will soon be upgraded to Elasticsearch 6, so I’m going to forget about it for now and see if I can recreate the same behaviour with 6.

I’d recommend also asking the question over at https://discuss.elastic.co to see whether this is a known issue to them.
