Hi,
I am using the Elasticsearch & Kibana OSS distributions with the Search Guard plugins (ELK 7.0.1).
Within the cluster, when Kibana talks to Elasticsearch, we do not want TLS.
Is there a way to disable node-to-node encryption and TLS while still having authentication for Elasticsearch?
If yes, could you please help me with the required configurations for this?
Thanks & Regards,
Shivani
Hi,
no, TLS on the transport layer is one of the main building blocks regarding the Search Guard security architecture and thus cannot be disabled. Disabling inter-node TLS would open the Elasticsearch cluster to all sorts of attack scenarios.
It would be really helpful for us to understand what your concerns are regarding inter-node TLS. Why don’t you want to enable it for your use case?
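For reference, here is a minimal sketch of the transport-layer TLS block that Search Guard expects in elasticsearch.yml; the file names, paths and password are placeholders (they mirror the configuration posted later in this thread). As far as I know, only transport TLS is mandatory, while TLS on the HTTP/REST layer (the Kibana-to-Elasticsearch connection) can stay disabled:

```yaml
# Transport layer (node-to-node) TLS -- mandatory with Search Guard
searchguard.ssl.transport.pemcert_filepath: tls_file/node.pem          # node certificate
searchguard.ssl.transport.pemkey_filepath: tls_file/node.key           # node private key
searchguard.ssl.transport.pemkey_password: changeme                    # placeholder password
searchguard.ssl.transport.pemtrustedcas_filepath: tls_file/root-ca.pem # root CA
searchguard.ssl.transport.enforce_hostname_verification: false

# HTTP/REST layer TLS is optional; leaving it disabled keeps plain HTTP plus basic auth
searchguard.ssl.http.enabled: false
```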
lewis
October 9, 2019, 4:14am
4
Hi jkressin,
I have the same problem as shivani.aggarwal2195. While using TLS we occasionally get ‘javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure’, which causes nodes to drop out and makes the cluster unstable. Is that a bug, or something else? We just want to use basic auth, so is there any way to work around this problem? It looks similar to the error described here: http://xwiz.cn/2018-05-09-java-ssl-ciphersuite . Looking forward to your reply.
thanks!
In order to debug this, I would need to see your elasticsearch.yml configuration and the full stack trace from the Elasticsearch log file.
If you have trouble setting up TLS, I highly recommend using our TLS offline tool. It provides an easy way to generate production-ready certificates that can be used with Search Guard:
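For what it's worth, the offline TLS tool is driven by a single YAML file. Below is a rough sketch of such a file, under the assumption that the field names still match the tool's example configuration (please verify them against the current documentation; all DNs, hostnames and IPs below are made up):

```yaml
# tlsconfig.yml -- example input for the offline TLS tool
# (generate certificates with: ./sgtlstool.sh -c tlsconfig.yml -ca -crt)
ca:
  root:
    dn: CN=root.ca.example.com,OU=CA,O=Example Com,DC=example,DC=com
    keysize: 2048
    validityDays: 3650
    pkPassword: auto          # let the tool generate a password
    file: root-ca.pem
defaults:
  validityDays: 730
  pkPassword: auto
  httpsEnabled: true          # also create HTTP/REST certificates
nodes:
  - name: node1
    dn: CN=node1.example.com,OU=Ops,O=Example Com,DC=example,DC=com
    dns: node1.example.com
    ip: 10.0.2.1
clients:
  - name: admin
    dn: CN=admin.example.com,OU=Ops,O=Example Com,DC=example,DC=com
    admin: true               # marks this certificate as an admin certificate
```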
lewis
October 10, 2019, 10:37am
9
The ES configuration is as below:
http.cors.allow-headers: "Authorization,X-Requested-With,Content-Length,Content-Type"
xpack.security.enabled: false
xpack.ml.enabled: false
searchguard.ssl.transport.pemcert_filepath: tls_file/node.pem
searchguard.ssl.transport.pemkey_filepath: tls_file/node.key
searchguard.ssl.transport.pemkey_password: search-guard-pk
searchguard.ssl.transport.pemtrustedcas_filepath: tls_file/root-ca.pem
searchguard.ssl.transport.enforce_hostname_verification: false
searchguard.ssl.transport.resolve_hostname: false
searchguard.ssl.http.enabled: true
searchguard.ssl.http.pemcert_filepath: tls_file/node_http.pem
searchguard.ssl.http.pemkey_filepath: tls_file/node_http.key
searchguard.ssl.http.pemkey_password: search-guard-pk
searchguard.ssl.http.pemtrustedcas_filepath: tls_file/root-ca.pem
searchguard.nodes_dn:
  - 'CN=node.xxxx.com,OU=Ops,O=XXXX Com, Inc.,DC=xxxx,DC=com'
searchguard.authcz.admin_dn:
  - 'CN=search-admin.xxxx.com,OU=Ops,O=xxx Com, Inc.,DC=xxx,DC=com'
searchguard.enterprise_modules_enabled: false
lewis
October 10, 2019, 10:38am
10
Thank you so much for your replies. We did use the TLS offline tool to generate the certificates, Search Guard has taken effect, and the cluster runs well, but this problem appears intermittently, maybe once every day or two. The trace looks like the one below.
For now we have removed Search Guard from our production cluster, so it is hard to get more information. Any suggestions for us?
The Elasticsearch version is 6.5.3, and the Search Guard version is also 6.5.3.
If you have used the TLS tool, the cluster runs fine, and the exception only happens every one or two days, I don't think it is a general configuration problem. If it were, you would see more exceptions, most probably already at node startup.
My best guess at the moment is that this is due to network issues, probably latency or a timeout (a sketch of the relevant transport timeout settings follows the linked issues below). See also here:
[Linked GitHub issue, elastic/elasticsearch -- opened 16 Oct 2018, closed 14 Jun 2019; labels: >test-failure, :Security/TLS, v6.4.4, v8.0.0-alpha1]
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.4+multijob-windows-compatibility/64/consoleFull has failed with:
```
12:42:32 1> io.netty.handler.codec.DecoderException: javax.net.ssl.SSLException: Received close_notify during handshake
12:42:32 1> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
12:42:32 1> Caused by: javax.net.ssl.SSLException: Received close_notify during handshake
12:42:32 1> at sun.security.ssl.Alerts.getSSLException(Alerts.java:208) ~[?:?]
12:42:32 1> at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1666) ~[?:?]
12:42:32 1> at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1634) ~[?:?]
12:42:32 1> at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1776) ~[?:?]
12:42:32 1> at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1083) ~[?:?]
12:42:32 1> at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907) ~[?:?]
12:42:32 1> at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781) ~[?:?]
12:42:32 1> at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624) ~[?:1.8.0_181]
12:42:32 1> at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:281) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1215) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[netty-handler-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
12:42:32 1> ... 15 more
```
reproduction line of the build above with build id `20181016071459-6169C3F9` (does not reproduce locally):
```
gradlew :x-pack:qa:tribe-tests-with-security:test \
-Dtests.seed=E96200A29197A00B \
-Dtests.class=org.elasticsearch.xpack.security.SecurityTribeTests \
-Dtests.security.manager=true \
-Dtests.locale=de-LU \
-Dtests.timezone=Australia/Tasmania
```
reproduction line of build id `20181008232140-2FF07565` (does not reproduce locally):
```
./gradlew :x-pack:plugin:security:test \
-Dtests.seed=103C6D419C2C4438 \
-Dtests.class=org.elasticsearch.xpack.security.transport.netty4.SimpleSecurityNetty4ServerTransportTests \
-Dtests.method="testSendRandomRequests" \
-Dtests.security.manager=true \
-Dtests.locale=sq \
-Dtests.timezone=MST7MDT \
-Dcompiler.java=11 \
-Druntime.java=8
```
reproduction line of build id `20181007194343-B582AB07` (does not reproduce locally):
```
./gradlew :x-pack:qa:tribe-tests-with-security:test \
-Dtests.seed=E256452AFB8FDE46 \
-Dtests.class=org.elasticsearch.xpack.security.SecurityTribeTests \
-Dtests.method="testRetrieveRolesOnTribeNode" \
-Dtests.security.manager=true \
-Dtests.locale=lv-LV \
-Dtests.timezone=Africa/Nouakchott
```
So far we have the following build ids that failed with this error:
| build id | JDK version | OS / Distro | branch |
|-------------------------|-------------|----------------|--------|
| 20181016071459-6169C3F9 | 10+46 | Windows 2012 | 6.4 |
| 20181010113501-73ECBC82 | 10+46 | Oracle Linux 7 | 6.4 |
| 20181008232140-2FF07565 | 8 | Ubuntu 16.04 | master |
| 20181007194343-B582AB07 | 10+46 | CentOS 7 | 6.4 |
This exception is raised when a sender sends `close_notify` to the recipient to indicate it will not send
any more messages on this connection (see also [RFC5246](https://tools.ietf.org/html/rfc5246#section-7.2.1)). The question is why this happens during the SSL handshake.
I wonder whether this could be caused by a similar issue in Netty as addressed in https://github.com/elastic/elasticsearch/pull/30337 although I did not find anything related in recent Netty tickets. From the mix of failures on different operating systems I think we can rule out the OS.
[Linked GitHub issue, Ballerina -- opened 1 May 2018, closed 3 Jul 2018; labels: Type/Bug, Team/StandardLibs]
**Description:**
When an HTTPS service is invoked with a JMeter client (thread count = 100), it was observed that some of the requests failed with a "javax.net.ssl.SSLException: Received close_notify during handshake" exception.
**Steps to reproduce:**
1. Use the artefacts in the following repository to reproduce the issue - https://github.com/sashikamw/ballerina-stabilitytest/tree/master/ServiceScenarios/Scenario05_HTTPS/released
**Affected Versions:**
ballerina-runtime-0.970.0
**OS, DB, other environment details and versions:**
Ubuntu 16.04
**Observations**
1. Some of the HTTPS calls are failing
```
summary + 880 in 00:00:30 = 29.4/s Avg: 3403 Min: 1549 Max: 10759 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 22485 in 00:12:46 = 29.3/s Avg: 3395 Min: 1405 Max: 27986 Err: 6 (0.03%)
summary + 895 in 00:00:30 = 29.8/s Avg: 3348 Min: 1506 Max: 22261 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 23380 in 00:13:16 = 29.4/s Avg: 3394 Min: 1405 Max: 27986 Err: 6 (0.03%)
summary + 892 in 00:00:30 = 29.8/s Avg: 3371 Min: 1516 Max: 11212 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 24272 in 00:13:46 = 29.4/s Avg: 3393 Min: 1405 Max: 27986 Err: 6 (0.02%)
summary + 892 in 00:00:30 = 29.7/s Avg: 3364 Min: 1464 Max: 12046 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 25164 in 00:14:16 = 29.4/s Avg: 3392 Min: 1405 Max: 27986 Err: 6 (0.02%)
```
2. Client Side error (Jmeter)
```
javax.net.ssl.SSLException: Received close_notify during handshake
at sun.security.ssl.Alerts.getSSLException(Alerts.java:219)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1901)
at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2002)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1125)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:553)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:414)
at org.apache.jmeter.protocol.http.sampler.LazySchemeSocketFactory.connectSocket(LazySchemeSocketFactory.java:97)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
at org.apache.jmeter.protocol.http.sampler.hc.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:318)
at org.apache.jmeter.protocol.http.sampler.MeasuringConnectionManager$MeasuredConnection.open(MeasuringConnectionManager.java:114)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:610)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:445)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:835)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.executeRequest(HTTPHC4Impl.java:697)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:455)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:74)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1189)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1178)
at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:490)
at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:416)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:250)
at java.lang.Thread.run(Thread.java:745)
```
Do you see any load spikes on the machine(s) when this happens? Any network issues?
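If transient latency or timeouts turn out to be the trigger, it may help to make the transport layer a bit more forgiving. A small sketch for elasticsearch.yml, assuming the Elasticsearch 6.x setting names; the values are illustrative only and do not change anything about TLS itself:

```yaml
# Keep idle node-to-node connections alive with lightweight pings
transport.ping_schedule: 5s
# Allow more time for (re)connecting to a node; the 6.x default is 30s
transport.tcp.connect_timeout: 60s
```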
lewis
October 12, 2019, 5:01am
13
Yes, some ‘ping timeout’ errors did occur at that time, but since removing Search Guard the cluster has had no problems so far. So we think that if TLS has some potential weaknesses, even short network glitches might turn into a serious problem. We will also look into the suggestions above.
Thanks so much!
lewis
October 15, 2019, 11:26am
14
As mentioned above, some ‘ping timeout’ errors did occur at that time, and the cluster has been fine since we removed Search Guard. As a temporary solution, is there a way to use Search Guard without TLS and with basic auth only? Any suggestions?
Thanks so much!
This suggests that your cluster/network is probably already operating at its limits. TLS adds some performance overhead, of course; the amount varies depending on your machines (e.g. hardware support for encryption or not) and the chosen ciphers and encryption algorithms. It's probably anywhere between 5% and 15%.
As with other security solutions for ES, TLS on the transport layer is a central part of the security infrastructure and cannot be turned off.
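One thing that can address both the handshake_failure from the ciphersuite article linked earlier and part of the TLS overhead is to pin the protocol and cipher suites explicitly, so that only fast, hardware-accelerated AES-GCM suites are negotiated. A sketch, assuming the searchguard-ssl setting names as I recall them; the cipher list is only an example:

```yaml
# Restrict transport TLS to TLSv1.2 and AES-GCM cipher suites
searchguard.ssl.transport.enabled_protocols:
  - "TLSv1.2"
searchguard.ssl.transport.enabled_ciphers:
  - "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
  - "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
```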
lewis
October 29, 2019, 8:38am
16
Thank you so much. Our cluster seems to work well without any auth, and the network and other machine indicators also look fine, which is surprising. We will keep following the SG forum.
system
Closed
November 19, 2019, 8:39am
17
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.