zen unicast handshake failed with an exception caused by "bad header found" (HeaderHelper)

Does anyone know what can cause a “bad header found”? I found this in my client logs.

[2017-07-14T09:57:22,686][WARN ][o.e.d.z.UnicastZenPing] [client_hostname.domain] [1] failed send ping to {#zen_unicast_hostname.domain:9031_0#}{mXFkFeVJTs-zZkaYmDc3-g}{hostname.domain}{XXX.XXX.XXX.XXX:9301} java.lang.IllegalStateException: handshake failed with {#zen_unicast_hostname.domain:9031_0#}{mXFkFeVJTs-zZkaYmDc3-g}{hostname.domain}{XXX.XXX.XXX.XXX:9301}
at org.elasticsearch.transport.TransportService.handshake(TransportService.java:386) ~[elasticsearch-5.4.2.jar:5.4.2]

Caused by: org.elasticsearch.transport.RemoteTransportException: [master_hostname.domain][XXX.XXX.XXX.XXX:9301][internal:transport/handshake]

Caused by: org.elasticsearch.ElasticsearchException: bad header found

at com.floragunn.searchguard.transport.SearchGuardRequestHandler.messageReceivedDecorate(SearchGuardRequestHandler.java:158) ~[?:?]

The problem is that I’m not clear on what exactly is calling what here, so I don’t know where to start fixing it :S

I can see the spot in HeaderHelper.java that detects a ConfigConstants.SG_CONFIG_PREFIX at the start of a header, but it’s not clear to me where this is coming from, or what I can do about it.

Oops, I think my initial assessment was wrong. It’s not the sg prefix that’s the problem; it’s an exception thrown while inspecting that part. I don’t see the “invalid header found” message, just “Error validating headers”, which probably means the headers are null, or something like that…

Still, if anyone has any idea which requests are even happening, please let me know so I can try to add the proper headers :)

I’m not sure if it matters, but in the client logs, before that bad header error shows up, zen discovery is trying to do the ping, but interesting things are showing up in the transport trace log entries…

Someone (/XXX.XXX.XXX.XXX:48012) speaks transport plaintext instead of ssl, will close the channel

[TRACE] […tracer] [client_hostname] [internal:transport/handshake] received response from [host blah blah]

[TRACE] […UnicastZenPing][client_hostname] closing connection to [host blah blah] due to failure

… … full ping responses {none}

starting to ping

failed to ping

That kind of stuff

“Bad header” typically occurs if either the OID is missing from the node certificates or searchguard.nodes_dn is not set correctly.
You can try

searchguard.nodes_dn:
  - '*'

just to see if it’s working, but we do not recommend using this in production or in a sensitive environment.
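
If the wildcard works, a tighter setting would list the node certificate DNs explicitly. A minimal sketch, assuming a hypothetical DN (the CN, OU, O and C values below are placeholders, not taken from your setup; use the subject DN of your own node certificates):

```yaml
# elasticsearch.yml (sketch; the DN below is a hypothetical placeholder)
searchguard.nodes_dn:
  - 'CN=node1.example.com,OU=Ops,O=Example Inc,C=DE'
```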

Can you post the output of “keytool -list -v -keystore nodekeystore.jks”?

(In reply to Steve Haertel’s message of Friday, 14 July 2017 17:50:53 UTC+2, the transport trace excerpt above.)