Hi
Our cluster has been using JKS truststore and keystore files for both the admin/HTTP and the node (transport) certificates. We now want to switch the transport certificates for node-to-node communication to PEM files (with OpenSSL), while keeping the existing truststore/keystore for the admin/HTTP certificates.
I followed the guides for setting up PEM certs, and the requirements are installed and configured. I stopped the whole cluster; the first node started without an issue, but when the second node starts, it can’t connect to the first one (logs below). The cert has the SAN attribute, and hostname verification is disabled (cert and config below). I tried this with and without explicitly defining “searchguard.nodes_dn” in the configuration file.
Any solutions or hints are much appreciated.
Logs
[2019-05-09T15:50:15,548][WARN ][o.e.n.Node ] [es-warm2] timed out while waiting for initial discovery state - timeout: 30s
[2019-05-09T15:50:15,557][INFO ][o.e.h.n.Netty4HttpServerTransport] [es-warm2] publish_address {10.1.2.189:9200}, bound_addresses {127.0.0.1:9200}, {127.0.1.1:9200}, {10.1.2.189:9200}
[2019-05-09T15:50:15,557][INFO ][o.e.n.Node ] [es-warm2] started
[2019-05-09T15:50:15,558][INFO ][c.f.s.SearchGuardPlugin ] [es-warm2] 4 Search Guard modules loaded so far: [Module [type=DLSFLS, implementing class=com.floragunn.searchguard.configuration.SearchGuardFlsDlsIndexSearcherWrapper], Module [type=REST_MANAGEMENT_API, implementing class=com.floragunn.searchguard.dlic.rest.api.SearchGuardRestApiActions], Module [type=MULTITENANCY, implementing class=com.floragunn.searchguard.configuration.PrivilegesInterceptorImpl], Module [type=AUDITLOG, implementing class=com.floragunn.searchguard.auditlog.impl.AuditLogImpl]]
[2019-05-09T15:50:15,615][WARN ][o.e.d.z.ZenDiscovery ] [es-warm2] not enough master nodes discovered during pinging (found [[Candidate{node={es-warm2}{eDxGfvIsQP-LhzJvQI0O7A}{NXGBOSpWRHGy1B_N6vqr1A}{10.1.2.189}{10.1.2.189:9300}{xpack.installed=true, box_type=warm, cabinet=hypervisor9.c1}, clusterStateVersion=-1}]], but needed [3]), pinging again
[2019-05-09T15:50:23,703][ERROR][c.f.s.t.SearchGuardRequestHandler] [es-warm2] ElasticsearchException[Illegal parameter in http or transport request found.
This means that one node is trying to connect to another with
a non-node certificate (no OID or searchguard.nodes_dn incorrect configured) or that someone
is spoofing requests. Check your TLS certificate setup as described here: See http://docs.search-guard.com/latest/troubleshooting-tls]
[2019-05-09T15:50:25,616][WARN ][o.e.d.z.ZenDiscovery ] [es-warm2] not enough master nodes discovered during pinging (found [[Candidate{node={es-warm2}{eDxGfvIsQP-LhzJvQI0O7A}{NXGBOSpWRHGy1B_N6vqr1A}{10.1.2.189}{10.1.2.189:9300}{xpack.installed=true, box_type=warm, cabinet=hypervisor9.c1}, clusterStateVersion=-1}]], but needed [3]), pinging again
[2019-05-09T15:50:25,672][WARN ][o.e.d.z.UnicastZenPing ] [es-warm2] [5] failed send ping to {10.1.1.120:9300}{h8Gu7bRQR8GvKK2nyzYu6Q}{es-hot1.c1}{10.1.1.120:9300}
java.lang.IllegalStateException: handshake failed with {10.1.1.120:9300}{h8Gu7bRQR8GvKK2nyzYu6Q}{es-hot1.c1}{10.1.1.120:9300}
at org.elasticsearch.transport.TransportService.handshake(TransportService.java:418) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.transport.TransportService.handshake(TransportService.java:386) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.discovery.zen.UnicastZenPing$PingingRound.getOrConnect(UnicastZenPing.java:371) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.discovery.zen.UnicastZenPing$3.doRun(UnicastZenPing.java:476) [elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:759) [elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.6.0.jar:6.6.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: org.elasticsearch.transport.RemoteTransportException: [es-hot1][10.1.1.120:9300][internal:transport/handshake]
Caused by: org.elasticsearch.ElasticsearchException: Illegal parameter in http or transport request found.
This means that one node is trying to connect to another with
a non-node certificate (no OID or searchguard.nodes_dn incorrect configured) or that someone
is spoofing requests. Check your TLS certificate setup as described here: See http://docs.search-guard.com/latest/troubleshooting-tls
at com.floragunn.searchguard.ssl.util.ExceptionUtils.createBadHeaderException(ExceptionUtils.java:57) ~[?:?]
at com.floragunn.searchguard.transport.SearchGuardRequestHandler.messageReceivedDecorate(SearchGuardRequestHandler.java:201) ~[?:?]
at com.floragunn.searchguard.ssl.transport.SearchGuardSSLRequestHandler.messageReceived(SearchGuardSSLRequestHandler.java:141) ~[?:?]
at com.floragunn.searchguard.SearchGuardPlugin$7$1.messageReceived(SearchGuardPlugin.java:645) ~[?:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1288) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.common.util.concurrent.EsExecutors$1.execute(EsExecutors.java:140) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1246) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1110) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:913) ~[elasticsearch-6.6.0.jar:6.6.0]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:53) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1436) ~[?:?]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1203) ~[?:?]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[?:?]
... 1 more
ES config (elasticsearch.yml, unrelated settings removed)
node.name: es-warm2
node.master: true
node.data: true
node.attr.box_type: warm
xpack.graph.enabled: false
xpack.ml.enabled: false
xpack.monitoring.enabled: false
xpack.security.enabled: false
xpack.watcher.enabled: false
xpack.logstash.enabled: false
bootstrap.memory_lock: true
searchguard.nodes_dn:
- 'CN=*.elk.byte.nl'
- 'CN=elk.byte.nl'
searchguard.ssl.transport.enable_openssl_if_available: true
searchguard.ssl.transport.pemkey_filepath: ssl-transport.key
searchguard.ssl.transport.pemcert_filepath: ssl-transport.cert
searchguard.ssl.transport.pemtrustedcas_filepath: ssl-transport.ca
searchguard.ssl.transport.enforce_hostname_verification: false
searchguard.ssl.http.enabled: true
searchguard.ssl.http.keystore_filepath: node-keystore.jks
searchguard.ssl.http.truststore_filepath: truststore.jks
searchguard.ssl.http.keystore_password: ***************
searchguard.ssl.http.truststore_password: ****************
searchguard.enterprise_modules_enabled: true
searchguard.authcz.admin_dn:
- ...
Cert info (from sgtlsdiag; hashes, validation time, etc. stripped)
========================================================================
/etc/elasticsearch/ssl-transport.cert
------------------------------------------------------------------------
Certificate 1
------------------------------------------------------------------------
Subject DN [RFC2253]: CN=*.elk.byte.nl,OU=PositiveSSL Wildcard,OU=Domain Control Validated
Issuer DN [RFC2253]: CN=Sectigo RSA Domain Validation Secure Server CA,O=Sectigo Limited,L=Salford,ST=Greater Manchester,C=GB
Key Usage: digitalSignature keyEncipherment
Signature Algorithm: SHA256WITHRSA
Version: 3
Extended Key Usage: id_kp_serverAuth id_kp_clientAuth
Basic Constraints: -1
SAN:
dNSName: *.elk.byte.nl
dNSName: elk.byte.nl
------------------------------------------------------------------------
Trust anchor:
C=GB,ST=Greater Manchester,L=Salford,O=Sectigo Limited,CN=Sectigo RSA Domain Validation Secure Server CA
========================================================================
/etc/elasticsearch/ssl-transport.ca
------------------------------------------------------------------------
Certificate 1
------------------------------------------------------------------------
Subject DN [RFC2253]: CN=Sectigo RSA Domain Validation Secure Server CA,O=Sectigo Limited,L=Salford,ST=Greater Manchester,C=GB
Issuer DN [RFC2253]: CN=USERTrust RSA Certification Authority,O=The USERTRUST Network,L=Jersey City,ST=New Jersey,C=US
Key Usage: digitalSignature keyCertSign cRLSign
Signature Algorithm: SHA384WITHRSA
Version: 3
Extended Key Usage: id_kp_serverAuth id_kp_clientAuth
Basic Constraints: 0
SAN: (none)
------------------------------------------------------------------------
Certificate 2
------------------------------------------------------------------------
Subject DN [RFC2253]: CN=USERTrust RSA Certification Authority,O=The USERTRUST Network,L=Jersey City,ST=New Jersey,C=US
Issuer DN [RFC2253]: CN=USERTrust RSA Certification Authority,O=The USERTRUST Network,L=Jersey City,ST=New Jersey,C=US
Key Usage: keyCertSign cRLSign
Signature Algorithm: SHA384WITHRSA
Version: 3
Extended Key Usage: null
Basic Constraints: 2147483647
SAN: (none)
------------------------------------------------------------------------
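A minimal sketch for checking the searchguard.nodes_dn entries from the config against the subject DN reported by sgtlsdiag. It assumes that each entry is compared against the complete DN string, with `*` acting as a wildcard; that is an assumption about Search Guard's behaviour, not something the logs confirm. Python's `fnmatchcase` only stands in for whatever matcher Search Guard actually uses, and `CANDIDATE` is a hypothetical entry added purely for illustration.

```python
from fnmatch import fnmatchcase

# Subject DN of ssl-transport.cert, copied from the sgtlsdiag output.
SUBJECT_DN = "CN=*.elk.byte.nl,OU=PositiveSSL Wildcard,OU=Domain Control Validated"

# Entries currently listed under searchguard.nodes_dn in elasticsearch.yml.
CONFIGURED = ["CN=*.elk.byte.nl", "CN=elk.byte.nl"]

# Hypothetical entry (not in the original config) that also covers the OU parts.
CANDIDATE = "CN=*.elk.byte.nl,OU=*"


def matches_any(dn, patterns):
    """Return True if the whole DN string matches at least one pattern.

    Assumption: matching is done on the full DN, with '*' as a wildcard.
    fnmatchcase is only a stand-in for Search Guard's own matcher.
    """
    return any(fnmatchcase(dn, pattern) for pattern in patterns)


if __name__ == "__main__":
    print("configured entries match:", matches_any(SUBJECT_DN, CONFIGURED))
    print("candidate entry matches: ", matches_any(SUBJECT_DN, [CANDIDATE]))
```

Under that full-string assumption, neither configured entry covers the trailing OU components of the subject DN, while the hypothetical candidate does. If Search Guard matches differently (for example on the CN alone), this check says nothing about the failure.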
System info:
- Debian GNU/Linux 8.10 (jessie)
- Java:
  - OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-1~bpo8+1-b11)
  - OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
- ES version: 6.6.0
- SG Version: 6.6.0-24.1