external_elasticsearch audit logging throwing errors on bulk write rejects from the target cluster

brian · November 26, 2018, 8:14pm

ELK: 6.4.2
SG: 23.1

(Have a busy cluster)

Search Guard is configured for external_elasticsearch audit logging. The target cluster is at times busy being hammered by Logstash and will reject bulk writes (Logstash just keeps retrying). When an audit write is rejected, Search Guard logs the exception shown below. Is there an automatic retry option in Search Guard audit logging to help avoid this? I have tried increasing the searchguard.audit.threadpo settings, but there was no noticeable improvement. I would like to avoid raising the bulk write thread pool sizes on the remote cluster nodes if possible (but will if needed).

searchguard.audit.config.enable_ssl: false

searchguard.audit.config.http_endpoints: xx.xx.xx.xx:9200,xx.xx.xx.xx:9200,xx.xx.xx.xx:9200

searchguard.audit.config.index: rollover-custom-searchguard

searchguard.audit.config.verify_hostnames: false

searchguard.audit.type: external_elasticsearch

[2018-11-26T13:46:36,818][ERROR][c.f.s.h.HttpClient ] ElasticsearchStatusException[Elasticsearch exception [type=es_rejected_execution_exception, reason=rejected execution of org.elasticsearch.transport.TransportService$7@7a8b7873 on EsThreadPoolExecutor[name = xxxxxxxxxxxxxxx-es-01/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@7c6c4359[Running, pool size = 16, active threads = 16, queued tasks = 201, completed tasks = 775680898]]]]

org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=es_rejected_execution_exception, reason=rejected execution of org.elasticsearch.transport.TransportService$7@7a8b7873 on EsThreadPoolExecutor[name = xxxxxxxxxxxxxxx-es-01/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@7c6c4359[Running, pool size = 16, active threads = 16, queued tasks = 201, completed tasks = 775680898]]]

at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) ~[elasticsearch-6.4.2.jar:6.4.2]

at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1406) ~[?:?]

at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1382) ~[?:?]

at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1269) ~[?:?]

at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1231) ~[?:?]

at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:587) ~[?:?]

at com.floragunn.searchguard.httpclient.HttpClient.index(HttpClient.java:193) ~[?:?]

at com.floragunn.searchguard.auditlog.sink.ExternalESSink.doStore(ExternalESSink.java:175) ~[?:?]

at com.floragunn.searchguard.auditlog.sink.AuditLogSink.store(AuditLogSink.java:56) ~[?:?]

at com.floragunn.searchguard.auditlog.routing.AsyncStoragePool.lambda$submit$0(AsyncStoragePool.java:60) ~[?:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]

at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_191]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]

Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://xx.xx.xx.xx:9200], URI [/rollover-custom-searchguard/auditlog?refresh=true&timeout=1m], status line [HTTP/1.1 429 Too Many Requests]

{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[xxxxxxxxxxxxxxx-es-01][xx.xx.xx.xx:9300][indices:data/write/bulk[s]]"}],"type":"es_rejected_execution_exception","reason":"rejected execution of org.elasticsearch.transport.TransportService$7@7a8b7873 on EsThreadPoolExecutor[name = xxxxxxxxxxxxxxx-es-01/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@7c6c4359[Running, pool size = 16, active threads = 16, queued tasks = 201, completed tasks = 775680898]]"},"status":429}

at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:920) ~[?:?]

at org.elasticsearch.client.RestClient.performRequest(RestClient.java:227) ~[?:?]

at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1256) ~[?:?]
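
A minimal sketch of the kind of retry-on-reject behaviour asked about above: retry a single index call with capped exponential backoff when the target answers 429. This is not Search Guard code and not an existing Search Guard option. The names index_with_retry, RejectedError, and the send callable are hypothetical and used only for illustration, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time


class RejectedError(Exception):
    """Hypothetical stand-in for an HTTP 429 / es_rejected_execution_exception."""


def index_with_retry(send, doc, max_attempts=5, base_delay=0.1,
                     max_delay=5.0, sleep=time.sleep):
    """Call send(doc); on RejectedError, back off and try again.

    Returns the number of attempts that were needed.
    Re-raises RejectedError once max_attempts have all been rejected.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(doc)
            return attempt
        except RejectedError:
            if attempt == max_attempts:
                raise
            # Exponential backoff (0.1s, 0.2s, 0.4s, ...) capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter so that many writers do not retry at the same instant.
            sleep(delay * random.uniform(0.5, 1.0))
```

Here send would be whatever function performs the actual index request and raises RejectedError on a 429 response. The sleep parameter is injectable only so the backoff can be exercised without real waiting.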