Unable to start Elasticsearch cluster with the Search Guard plugin enabled when the cluster is unhealthy

We have seen this issue when a disk space problem on the path.data partition or a Java heap OOM leaves the cluster unhealthy. When we then start the cluster, Elasticsearch cannot start up because Search Guard can’t find the searchguard index to load its configuration while the cluster is unavailable or its shards are UNASSIGNED.

Caused by: java.util.concurrent.TimeoutException: Timeout after 5SECONDS while retrieving configuration for roles
at com.floragunn.searchguard.configuration.ConfigurationLoader.load(ConfigurationLoader.java:103) ~[search-guard-6-6.6.1-24.1.jar:6.6.1-24.1]
at com.floragunn.searchguard.configuration.IndexBaseConfigurationRepository.loadConfigurations(IndexBaseConfigurationRepository.java:387) ~[search-guard-6-6.6.1-24.1.jar:6.6.1-24.1]
… 53 more
[2020-01-21T11:34:52,362][ERROR][c.f.s.c.ConfigurationLoader] [prod-hps-elk1.vega.mycompany.com] Failure No shard available for [org.elasticsearch.action.get.MultiGetShardRequest@740c8202] retrieving configuration for [roles] (index=searchguard)
[2020-01-21T11:34:53,843][ERROR][c.f.s.a.BackendRegistry ] [prod-hps-elk1.vega.mycompany.com] Not yet initialized (you may need to run sgadmin)
[2020-01-21T11:34:55,285][ERROR][c.f.s.f.SearchGuardFilter] [prod-hps-elk1.vega.mycompany.com] Unexpected exception ElasticsearchException[java.util.concurrent.TimeoutException: Timeout after 5SECONDS while retrieving configuration for roles]; nested: TimeoutException[Timeout after 5SECONDS while retrieving configuration for roles];
org.elasticsearch.ElasticsearchException: java.util.concurrent.TimeoutException: Timeout after 5SECONDS while retrieving configuration for roles

It appears that the Search Guard plugin tries to read its configuration to bring up the cluster, but it can’t because the searchguard index is not available on the cluster. It’s a chicken-and-egg problem: the index must be accessible to start the Elasticsearch cluster, but the index is not accessible until the cluster has started.

We ended up having to disable the Search Guard plugin with “searchguard.disabled: true” to bring up Elasticsearch without Search Guard until the cluster was “green”, and then re-enable the plugin with “searchguard.disabled: false”.
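For anyone repeating this workaround, here is a rough sketch of how the "wait until green" step could be scripted before setting searchguard.disabled back to false. It polls the standard _cluster/health endpoint. The URL, the helper names (is_green, fetch_health, wait_until_green) and the retry values are assumptions for illustration only; in particular it assumes the node answers plain HTTP on localhost:9200 while the plugin is disabled, which may not match your setup.

```python
import json
import time
import urllib.request

# Assumption: plain HTTP on localhost:9200 while Search Guard is disabled.
HEALTH_URL = "http://localhost:9200/_cluster/health"


def is_green(health: dict) -> bool:
    """True when the health document reports green and no unassigned shards."""
    return health.get("status") == "green" and health.get("unassigned_shards", 0) == 0


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the cluster health JSON document."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def wait_until_green(fetch=fetch_health, attempts: int = 60, delay: float = 10.0) -> bool:
    """Poll cluster health until it is green or the attempts run out."""
    for attempt in range(attempts):
        try:
            if is_green(fetch()):
                return True
        except OSError:
            # Node not reachable yet (connection refused, timeout); retry.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_until_green() returns True once the cluster reports green, at which point the plugin can be re-enabled and the nodes restarted. It only automates the waiting; it does not address the underlying startup problem.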

How do we start up the cluster when it’s unhealthy? Search Guard should not prevent the cluster from starting just because it can’t retrieve the configuration for roles.

Thanks

Please refer to the topic “Failed to create new users and roles” (reply #2 by jkressin).
