How to add nodes to an ES+SG cluster

Hi All,

I am not sure what I need to do to add nodes to an ES+SG cluster. I have one node working successfully with SG and LDAP.

As far as I understand:

  1. Each node should have the same cluster.name in elasticsearch.yml

  2. Each node should have searchguard.nodes_dn listing the nodes’ certificates

  3. Each node should set

discovery.zen.minimum_master_nodes

node.max_local_storage_nodes

Anything else?

I do not understand the following things:

  1. Should there be any relationship between the node certificates on each node? In particular, should they all be signed by the same root certificate, or is this not needed?

  2. Should the sgadmin tool be run separately on each node? In that case, would each node have its own SG index? I thought there should be a single SG index for the cluster. But if one does not run sgadmin on a node, how does that node learn about its certificates?

  3. How can I specify which role each node plays? I want one node to submit jobs and do nothing else, and the other nodes to do the heavy computations.

  4. Do I understand correctly that only the node interfacing with users needs to worry about the REST API, and the rest of the nodes only need to worry about the transport API used for inter-node communication?

  5. What about LDAP for authentication and authorization? Is it only needed on the node interfacing with users, or should it be configured on the other nodes as well?

  6. Is it OK to take the working configuration of the login node, copy it to the other nodes, and just replace the node names and host certificates for the transport layer?

  7. Can the transport and REST certificates be different and signed by different root certificates?

Thank you,

Igor

I think you are mixing Elasticsearch and Search Guard related questions:

  1. Each node in elasticsearch.yml should have the same cluster.name

  3. Each node should have set

discovery.zen.minimum_master_nodes:

node.max_local_storage_nodes

These are Elasticsearch-related questions. Sure, each node has to have the same cluster name, otherwise you end up with multiple clusters. The right value for minimum master nodes depends on how many master-eligible nodes you run in the cluster, and the max local storage nodes setting also depends on your setup. Please refer to the Elasticsearch documentation regarding the different types of nodes: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
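
As a rough illustration of the Elasticsearch side, here is a minimal sketch for a pre-7.x cluster that still uses Zen discovery. The cluster name `my-cluster` and the count of three master-eligible nodes are assumptions made for the example, not values from this thread; the quorum is conventionally (master-eligible nodes / 2) + 1.

```yaml
# elasticsearch.yml (sketch; names and counts are assumptions)
cluster.name: my-cluster              # hypothetical name, must be identical on every node
node.name: md01                       # must be unique per node

# Assuming 3 master-eligible nodes: quorum = (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2

# Only matters when several nodes on one host share the same data path
node.max_local_storage_nodes: 1
```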

  2. Each node should have searchguard.nodes_dn listing nodes’ certificates

Either that, or you add an OID value to the SAN part of your certificates. The easiest option, though, is to list the DNs of the node certificates. You can also use wildcards here. See “Configuring node certificates” in the docs.
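
A sketch of what such a DN list could look like. The `example.com` DNs are placeholders chosen for illustration; the real values have to match the subject DNs of your own node certificates exactly.

```yaml
# elasticsearch.yml (sketch; the DNs are placeholders)
searchguard.nodes_dn:
  - 'CN=node1.example.com,OU=ES,O=SG,DC=example,DC=com'
  - 'CN=node2.example.com,OU=ES,O=SG,DC=example,DC=com'

# Wildcard variant covering every node certificate with the same suffix:
# searchguard.nodes_dn:
#   - 'CN=*.example.com,OU=ES,O=SG,DC=example,DC=com'
```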

  1. Should there be any relationship between node certificates on each node? In particular should they all be signed by the same root certificate or this is not needed?

Absolutely, otherwise TLS would not make sense. All certificates have to be signed by the same root or intermediate CA.

  2. Should sgadmin tool run separately on each node? But in that case they would have each their own SG index? I thought there should be a single SG index for the cluster. But if one does not run sgadmin tool, how would each node learn about its certificates?

No, you only need to run it against one (any) node in your cluster. Indices in Elasticsearch do not work the way you describe: there is only one logical index per cluster, regardless of which node(s) / shard(s) the actual data resides on. You run sgadmin with an admin certificate. This admin certificate is validated against the truststore and against the list of admin certificates in elasticsearch.yml. If everything is OK, the SG index is updated and the changes are propagated throughout the cluster.

  3. How can I specify which role each node plays? I want one node to submit jobs and do nothing else and the other nodes to do heavy computations.

I don’t fully understand this question, but again this is related to Elasticsearch rather than Search Guard.
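
If "submit jobs and do nothing else" is read as a coordinating-only node (an assumption about the intent), the pre-7.x node role flags could be set roughly as follows. This is a sketch, not a recommendation for any particular cluster size.

```yaml
# elasticsearch.yml on the user-facing node (sketch, pre-7.x role flags)
# Coordinating-only: accepts and routes requests, holds no data, never master
node.master: false
node.data: false
node.ingest: false

# On a worker node that should store data and do the heavy lifting instead:
# node.master: false
# node.data: true
# node.ingest: false
```

At least one node in the cluster still has to be master-eligible (`node.master: true`).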

  4. Do I understand correctly that only the node interfacing users need to worry about rest api? The rest of the nodes need only to worry about transport api used for internode communication?

What do you mean by “worry about”? If you only want users to interact with one specific node, then they would use the REST API only on this node, that’s correct. All nodes need to be able to communicate with each other on transport though.

  5. What about LDAP for authentication and authorization? This is only needed on the node interfacing users? Should it be configured on other nodes?

As outlined in 2), the SG configuration is stored in an Elasticsearch index and the settings are propagated to each node automatically.

  6. Is it OK just to take a working configuration of the login node, copy it to other nodes and just replace node names and host certificates for transport layer?

Yes, that would be possible.

  7. Can transport and rest certificates be different and signed by different root certificates?

Yes, you can use different certificates including root CAs for REST and transport.
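
A sketch of how separate certificates for the two layers might be configured with PEM files. All file names are hypothetical; paths are assumed to be relative to the Elasticsearch config directory.

```yaml
# elasticsearch.yml (sketch; file names are hypothetical)

# Transport layer (node-to-node), signed by an internal root CA
searchguard.ssl.transport.pemcert_filepath: md01-transport.pem
searchguard.ssl.transport.pemkey_filepath: md01-transport.key
searchguard.ssl.transport.pemtrustedcas_filepath: internal-root-ca.pem

# REST layer (client-facing), which may chain to a different root CA
searchguard.ssl.http.enabled: true
searchguard.ssl.http.pemcert_filepath: md01-http.pem
searchguard.ssl.http.pemkey_filepath: md01-http.key
searchguard.ssl.http.pemtrustedcas_filepath: public-root-ca.pem
```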

Hi Jochen,

  1. Should there be any relationship between node certificates on each node? In particular should they all be signed by the same root certificate or this is not needed?

Absolutely, otherwise TLS would not make sense. All certificates have to be signed by the same root or intermediate CA.

OK, I generated all the certificates using the TLS offline tool.

I inserted the snippets it generated into elasticsearch.yml.

I start Elasticsearch on node md01 and run sgadmin there. Then I start Elasticsearch on node md02, and the two nodes do not see each other, judging by the logs and by what Kibana shows.

The logs and configuration files are attached.

Perhaps I should delete the SG index, start both nodes, see whether they can find each other without SG, and only then run sgadmin?

Do I understand correctly that deleting the SG index would disable SG, and that the SG entries in the configuration file would not confuse ES?

Thank you,

Igor

md02.log (45.1 KB)

md01.log (49.2 KB)

elasticsearch_md01.yml (6.06 KB)

elasticsearch_md02.yml (6.03 KB)

I tried deleting the SG index and putting

searchguard.disabled: true

into elasticsearch.yml on both nodes.

The nodes still do not see each other.

What could be the problem?

There are several network interfaces on each node.

Perhaps I need to tell ES explicitly which interface to use for inter-node communication?

Thank you,

Igor
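
For reference, Elasticsearch does let the transport and HTTP layers be bound to specific addresses. A minimal sketch, assuming the cluster-internal interface has the placeholder address 192.0.2.10 (not an address from this thread):

```yaml
# elasticsearch.yml (sketch; the address is a placeholder)

# Bind and publish the transport layer on the cluster-internal interface only
transport.host: 192.0.2.10

# The HTTP (REST) layer can listen elsewhere, for example on all interfaces
http.host: 0.0.0.0
```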

Do I understand correctly that each node can have its own storage, that storage does not have to be shared, and that the data directory path can be the same on different nodes while pointing to different physical storage? Do nodes learn about each other only through the network and not through what is stored on disk?

Without SG, is having the same cluster.name the only thing needed for nodes to find each other?

OK, this is what allows the nodes to be discovered:

discovery.zen.ping.unicast.hosts: ["md01", "md02", "md03", "hd01", "hd02", "hd03", "hd04"]

However, once I re-enabled SG, the nodes stopped seeing each other again, and there is a certificate complaint in both logs:

===============

[2018-03-29T11:46:40,821][ERROR][c.f.s.s.t.SearchGuardSSLNettyTransport] [md02] SSL Problem General SSLEngine problem
javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1529) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:813) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781) ~[?:?]
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624) ~[?:1.8.0_162]
    at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:281) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1215) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) ~[?:?]
    at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728) ~[?:?]
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:330) ~[?:?]
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:322) ~[?:?]
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1614) ~[?:?]
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) ~[?:?]
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052) ~[?:?]
    at sun.security.ssl.Handshaker$1.run(Handshaker.java:992) ~[?:?]
    at sun.security.ssl.Handshaker$1.run(Handshaker.java:989) ~[?:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_162]
    at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1467) ~[?:?]
    at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1364) ~[?:?]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1272) ~[?:?]
    ... 19 more
Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching md01 found.
    at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:214) ~[?:?]
    at sun.security.util.HostnameChecker.match(HostnameChecker.java:96) ~[?:?]
    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455) ~[?:?]
    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436) ~[?:?]
    at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252) ~[?:?]
    at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136) ~[?:?]
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1601) ~[?:?]
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) ~[?:?]
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052) ~[?:?]
    at sun.security.ssl.Handshaker$1.run(Handshaker.java:992) ~[?:?]
    at sun.security.ssl.Handshaker$1.run(Handshaker.java:989) ~[?:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_162]
    at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1467) ~[?:?]
    at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1364) ~[?:?]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1272) ~[?:?]
    ... 19 more
[2018-03-29T11:46:40,835][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [md02] no known master node, scheduling a retry
[2018-03-29T11:46:41,584][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd03 (hd03/hd03) with hostnameVerificationResovleHostName: true
[2018-03-29T11:46:41,584][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is md01 (md01/md01) with hostnameVerificationResovleHostName: true
[2018-03-29T11:46:41,583][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is md03 (md03/md03) with hostnameVerificationResovleHostName: true
[2018-03-29T11:46:41,585][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd04 (hd04/hd04) with hostnameVerificationResovleHostName: true
[2018-03-29T11:46:41,585][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd01 (hd01/hd01) with hostnameVerificationResovleHostName: true
[2018-03-29T11:46:41,586][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd02 (hd02/hd02) with hostnameVerificationResovleHostName: true
[2018-03-29T11:46:41,830][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [md02] no known master node, scheduling a retry

===============

Should there be any connection between the CN in

===

searchguard.nodes_dn:

  - CN=md01.rcc.local,OU=ES,O=SG,DC=rcc,DC=local

===

and

===

node.name: md01

network.host: [localhost,md01,hadoop.rcc.uchicago.edu]

discovery.zen.ping.unicast.hosts: ["md01", "md02", "md03", "hd01", "hd02", "hd03", "hd04"]

===

?

When using ping, both md01 and md01.rcc.local resolve to the same address.

When the certificates were generated with the TLS offline tool, this configuration was used for md01:

===

nodes:

  - name: md01

    dn: CN=md01.rcc.local,OU=ES,O=SG,DC=rcc,DC=local

    dns: md01.rcc.local

    ip: 172.25.180.171

===

If I do

====

searchguard.ssl.transport.enforce_hostname_verification: false

searchguard.ssl.transport.resolve_hostname: false

====

then nodes can see each other.

So what’s the problem? What do I need to change in my configuration files to be able to set the above two parameters to true?
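
One possible reading of the logged error `No subject alternative DNS name matching md01 found`: the nodes connect to each other by the short name `md01`, while the certificate's SAN only contains `md01.rcc.local`. A sketch of two ways to line these up, assuming the TLS offline tool accepts a list under `dns`; the `md02.rcc.local` name in the alternative is an assumption, not taken from the thread.

```yaml
# Option A: TLS tool node entry whose SAN covers both the FQDN and the short name
nodes:
  - name: md01
    dn: CN=md01.rcc.local,OU=ES,O=SG,DC=rcc,DC=local
    dns:
      - md01.rcc.local
      - md01
    ip: 172.25.180.171

# Option B (elasticsearch.yml instead): keep the certificates and connect by FQDN
# discovery.zen.ping.unicast.hosts: ["md01.rcc.local", "md02.rcc.local"]
```

With either option the certificates (A) or the host list (B) would have to be updated on every node before hostname verification is switched back on.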

On Thursday, March 29, 2018 at 11:59:15 AM UTC-5, iv…@uchicago.edu wrote:

Should there be any connection between CN in

searchguard.nodes_dn:

  • CN=md01.rcc.local,OU=ES,O=SG,DC=rcc,DC=local

===

and

===

node.name: md01

network.host: [localhost,md01,hadoop.rcc.uchicago.edu]

discovery.zen.ping.unicast.hosts: [“md01”, “md02”, “md03”, “hd01”, “hd02”, “hd03”, “hd04”]

===

?

When using ping, both md01 and md01.rcc.local are resolved to the same address

When certificates were generated with TLS offline tool, the configuration that was used for the md01:

===

nodes:

  • name: md01

    dn: CN=md01.rcc.local,OU=ES,O=SG,DC=rcc,DC=local

    dns: md01.rcc.local

    ip: 172.25.180.171

===

On Thursday, March 29, 2018 at 11:52:40 AM UTC-5, iv…@uchicago.edu wrote:

However, once I re-enabled SG, the nodes do not see each other again and there is some certificate complain in both logs:

[2018-03-29T11:46:40,821][ERROR][c.f.s.s.t.SearchGuardSSLNettyTransport] [md02] SSL Problem General SSLEngine p
roblem

javax.net.ssl.SSLHandshakeException: General SSLEngine problem

    at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1529) ~[?:?]

    at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535) ~[?:?]

    at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:813) ~[?:?]

    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781) ~[?:?]

    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624) ~[?:1.8.0_162]

    at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:281) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1215) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:…) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:…) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:34…) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:…) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:…) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]

    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]

    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]

Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem

    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) ~[?:?]

    at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728) ~[?:?]

    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:330) ~[?:?]

    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:322) ~[?:?]

    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1614) ~[?:?]

    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) ~[?:?]

    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052) ~[?:?]

    at sun.security.ssl.Handshaker$1.run(Handshaker.java:992) ~[?:?]

    at sun.security.ssl.Handshaker$1.run(Handshaker.java:989) ~[?:?]

    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_162]

    at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1467) ~[?:?]

    at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1364) ~[?:?]

    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1272) ~[?:?]

    ... 19 more

Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching md01 found.

    at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:214) ~[?:?]

    at sun.security.util.HostnameChecker.match(HostnameChecker.java:96) ~[?:?]

    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455) ~[?:?]

    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436) ~[?:?]

    at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252) ~[?:?]

    at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136) ~[?:?]

    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1601) ~[?:?]

    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) ~[?:?]

    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052) ~[?:?]

    at sun.security.ssl.Handshaker$1.run(Handshaker.java:992) ~[?:?]

    at sun.security.ssl.Handshaker$1.run(Handshaker.java:989) ~[?:?]

    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_162]

    at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1467) ~[?:?]

    at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1364) ~[?:?]

    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1272) ~[?:?]

    ... 19 more

[2018-03-29T11:46:40,835][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [md02] no known master node, scheduling a retry

[2018-03-29T11:46:41,584][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd03 (hd03/hd03) with hostnameVerificationResovleHostName: true

[2018-03-29T11:46:41,584][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is md01 (md01/md01) with hostnameVerificationResovleHostName: true

[2018-03-29T11:46:41,583][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is md03 (md03/md03) with hostnameVerificationResovleHostName: true

[2018-03-29T11:46:41,585][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd04 (hd04/hd04) with hostnameVerificationResovleHostName: true

[2018-03-29T11:46:41,585][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd01 (hd01/hd01) with hostnameVerificationResovleHostName: true

[2018-03-29T11:46:41,586][DEBUG][c.f.s.s.t.S.ClientSSLHandler] Hostname of peer is hd02 (hd02/hd02) with hostnameVerificationResovleHostName: true

[2018-03-29T11:46:41,830][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [md02] no known master node, scheduling a retry
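
Reading the trace bottom-up, the innermost `Caused by` is the informative one: md02 connected to md01, and the certificate md01 presented has no subject alternative name (SAN) DNS entry matching `md01`, so hostname verification rejected it. The cleaner fix is to reissue each node certificate with a SAN that matches the name other nodes use to reach it. As a stopgap, Search Guard's transport hostname checks can be relaxed in elasticsearch.yml. The fragment below is a minimal sketch, assuming the certificates are otherwise trusted by every node; it is not a recommended production setup.

```yaml
# elasticsearch.yml on every node (sketch; this relaxes a security check)

# Do not require the peer certificate's SAN to match the peer hostname
# on the transport (node-to-node) layer.
searchguard.ssl.transport.enforce_hostname_verification: false

# Only takes effect while hostname verification is enabled:
# false = do not resolve the peer hostname against DNS before comparing.
searchguard.ssl.transport.resolve_hostname: false
```

With verification switched off, the `No subject alternative DNS name matching md01 found` error should no longer block the handshake, but any certificate signed by a trusted CA is then accepted regardless of which host presents it.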

===============

On Thursday, March 29, 2018 at 11:42:03 AM UTC-5, iv…@uchicago.edu wrote:

OK this is what allows nodes to be discovered:

discovery.zen.ping.unicast.hosts: ["md01", "md02", "md03", "hd01", "hd02", "hd03", "hd04"]

On Thursday, March 29, 2018 at 11:28:27 AM UTC-5, iv…@uchicago.edu wrote:

Do I understand correctly that each node can have its own storage, that it does not have to be shared, and that the data directory can have the same path on different nodes while pointing to different physical storage? Do nodes learn about each other only through the network and not through what is stored on disk?

Without SG, is having the same cluster.name the only thing nodes need to find each other?
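
For reference, each Elasticsearch node keeps its data in its own local directory, and nodes find each other over the network rather than through anything on disk. With zen discovery across separate hosts, a matching cluster.name is necessary but not sufficient: each node also needs a list of hosts to contact. A minimal per-node sketch follows; the cluster name and data path are hypothetical placeholders, and the host list is the one used elsewhere in this thread.

```yaml
# elasticsearch.yml on md02 (sketch; cluster name and path are hypothetical)

cluster.name: my-cluster            # must be identical on every node
node.name: md02                     # unique per node

# The same path on every host is fine; the storage behind it is local
# to each machine and is not shared between nodes.
path.data: /var/lib/elasticsearch

# Seed hosts that zen discovery contacts to find the rest of the cluster.
discovery.zen.ping.unicast.hosts: ["md01", "md02", "md03", "hd01", "hd02", "hd03", "hd04"]
```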

On Thursday, March 29, 2018 at 11:21:52 AM UTC-5, iv…@uchicago.edu wrote:

Perhaps I should delete the SG index, start both nodes, see if they can find each other without SG, and only then run sgadmin?

Do I understand correctly that deleting the SG index would disable SG, and that the SG entries in the configuration file will not confuse ES?

I tried deleting the SG index and putting

searchguard.disabled: true

into elasticsearch.yml on both nodes.

The nodes still do not see each other.

What could be the problem?

There are several network interfaces on each node.

Perhaps I need to tell ES explicitly which interface to use for internode communication?
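
On the interface question: by default Elasticsearch binds only to loopback addresses, so nodes on different machines cannot reach each other until a non-loopback address is configured. The sketch below shows two ways to pin the interface in elasticsearch.yml; the interface name `eth1` and the IP address are hypothetical and need to be replaced with the values for the cluster network.

```yaml
# elasticsearch.yml (sketch; eth1 and the address below are hypothetical)

# Bind and publish both HTTP and transport on the addresses of one interface.
network.host: _eth1_

# Alternative: pin only node-to-node (transport) traffic to one address
# and leave HTTP on its own setting.
# transport.host: 10.0.0.12
```

Whatever address is published here should be the one that the names in discovery.zen.ping.unicast.hosts resolve to on the other nodes.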

Thank you,

Igor