SearchGuard is not initialized just after a node joins a cluster

Tomoyuki · April 4, 2019, 7:10am

Issue

Assume there's an ES cluster with SearchGuard already initialized, i.e. the searchguard index exists.
Between the moment a node joins the cluster and the moment it finishes loading the SearchGuard configuration, that node does not accept any indexing requests because it is considered not yet initialized.

A client app receives the following error:

org.elasticsearch.transport.RemoteTransportException: [(host)][(address)][indices:data/write/bulk]
Caused by: org.elasticsearch.ElasticsearchSecurityException: Cannot authenticate null

The client app uses sniffing.

That error happens because BackendRegistry is not initialized.
https://github.com/floragunncom/search-guard/blob/es-6.6.2/src/main/java/com/floragunn/searchguard/transport/SearchGuardRequestHandler.java#L228

The error is transient, so retrying requests should work, but we would like to avoid the error altogether if possible.

Questions

Can we make sure the node is initialized before the node joins the cluster (when searchguard index exists)?
Any idea how to avoid this transient error?

Version

Elasticsearch: 6.6.1
SearchGuard: 24.1

jkressin · April 4, 2019, 6:58pm

@cstaley any ideas regarding this issue?

cstaley · April 5, 2019, 1:29pm

I guess the only option (besides retrying requests) is to use a load balancer that checks each node's health status via our health check endpoint, which is documented in the Installation section of the Search Guard docs. If you don't use a load balancer, the client application could call the health check API on a regular basis instead.
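
The client-side variant of this suggestion could look roughly like the sketch below. It is only an illustration: the `/_searchguard/health` path and the `"status": "UP"` response field are assumptions that should be confirmed against the Search Guard documentation for your version, and `fetch_health`, `is_initialized` and `initialized_nodes` are hypothetical helper names, not part of any Search Guard or Elasticsearch client library.

```python
import json
import urllib.error
import urllib.request

# Assumption: health check path; verify it in the Search Guard docs for your version.
HEALTH_PATH = "/_searchguard/health"


def fetch_health(base_url, timeout=2.0):
    """Return (http_status, parsed_json_or_None) for one node's health endpoint."""
    try:
        with urllib.request.urlopen(base_url + HEALTH_PATH, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Non-2xx answer: the node responded, but may not be ready yet.
        try:
            body = json.loads(err.read().decode("utf-8"))
        except ValueError:
            body = None
        return err.code, body
    except (urllib.error.URLError, OSError, ValueError):
        # Node unreachable or response not parseable: treat as not ready.
        return None, None


def is_initialized(status, body):
    """Assumption: a ready node answers HTTP 200 with {"status": "UP"}."""
    return status == 200 and isinstance(body, dict) and body.get("status") == "UP"


def initialized_nodes(base_urls, fetch=fetch_health):
    """Keep only the nodes whose health check reports them as initialized."""
    return [url for url in base_urls if is_initialized(*fetch(url))]
```

A client could call `initialized_nodes(...)` periodically and send bulk requests only to the returned nodes. For HTTPS clusters with private certificates, `urlopen` would additionally need an appropriate SSL context.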

Tomoyuki · April 8, 2019, 10:06am

I see. The health check endpoint is good to know about.
We might be able to implement a custom sniffer that sends requests only to initialized nodes. It would be nice if the SearchGuard library provided such a sniffer.

If we retry requests, can we check whether the failures can be transient for this case?

cstaley · April 9, 2019, 10:36am

What do you exactly mean with “can we check whether the failures can be transient for this case”?

Tomoyuki · April 10, 2019, 1:52am

cstaley: What do you exactly mean with “can we check whether the failures can be transient for this case”?

I thought it would be nice if a client could determine whether a failure can be resolved by retrying the request.
For example, an Exception class might carry a flag or code that indicates whether retrying might help.
If a failure happens because SG is not initialized, clients might choose to retry requests until it is initialized.
If a failure happens due to insufficient privileges, clients might choose to stop sending requests.
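
Until such a flag exists, a client-side approximation of this idea might look like the sketch below. Everything in it is hypothetical: `RequestFailed` stands in for whatever error type the client library raises, and matching on the message text "Cannot authenticate null" (taken from the error above) is a fragile heuristic, not a documented contract.

```python
import time

# Assumption: message text copied from the error quoted in this thread.
TRANSIENT_MARKERS = ("Cannot authenticate null",)


class RequestFailed(Exception):
    """Hypothetical client-side error carrying the server message and HTTP status."""

    def __init__(self, message, status=None):
        super().__init__(message)
        self.status = status


def is_transient(err):
    """Heuristic: permission errors are final, 'not initialized' errors may pass."""
    if err.status == 403:
        return False  # insufficient privileges: retrying will not help
    return any(marker in str(err) for marker in TRANSIENT_MARKERS)


def send_with_retry(send, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(); retry with exponential backoff only on transient failures."""
    for attempt in range(attempts):
        try:
            return send()
        except RequestFailed as err:
            if not is_transient(err) or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With this shape, a request that fails while SG is still loading is retried after 0.5 s, 1 s, 2 s and so on, while a privilege failure is raised to the caller immediately.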

