Kibana 7.6 memory issue with SG

Hi,

We’re coming from 7.2 with SG, which worked fine, and have been trying for a few days to deploy 7.6 with SG, without success.

We’re using Kibana 7.6 and ES 7.6 with their respective SG versions in slightly modified Dockerfiles.

Increasing memory like this does not help:
ENV NODE_OPTIONS="--max-old-space-size=8192"

We also checked in the container to make sure the env variable was set, and it indeed was.
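
A minimal sketch of that kind of check, assuming a POSIX shell inside the container. The helper name old_space_limit_mb is made up for illustration, and it only reports what is written in a NODE_OPTIONS-style string, not what a running Node process actually applies.

```shell
# Hypothetical helper: print the old-space limit (in MB) found in a
# NODE_OPTIONS-style string, or nothing if the flag is absent.
# Both spellings are accepted: --max-old-space-size and --max_old_space_size.
old_space_limit_mb() {
  printf '%s\n' "$1" | tr ' ' '\n' |
    sed -n -E 's/^--max[-_]old[-_]space[-_]size=([0-9]+)$/\1/p' |
    tail -n 1
}

# Example call inside the container:
#   old_space_limit_mb "$NODE_OPTIONS"
```

To see the limit a Node process really ends up with, something like node -p 'require("v8").getHeapStatistics().heap_size_limit / 1048576' started with the same environment should print it in MB.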

The VM has 8 vCPU and 16GB RAM, so that should be more than enough.

If needed I can share the whole setup privately, but it is really a very bare-bones ES + Kibana + SG setup with minimal settings for certificates and such.

Any help is very much appreciated!

kibana    | {"type":"log","@timestamp":"2020-04-24T11:38:01Z","tags":["info","optimize"],"pid":6,"message":"Optimizing and caching bundles for core, graph, monitoring, space_selector, ml, dashboardViewer, apm, maps, canvas, infra, siem, uptime, lens, searchguard-login, searchguard-customerror, searchguard-configuration, searchguard-multitenancy, searchguard-accountinfo, searchguard-signals, kibana, stateSessionStorageRedirect, status_page and timelion. This may take a few minutes"}
kibana    | FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
kibana    |  1: 0x8fa090 node::Abort() [/usr/share/kibana/bin/../node/bin/node]
kibana    |  2: 0x8fa0dc  [/usr/share/kibana/bin/../node/bin/node]
kibana    | 
kibana    | <--- Last few GCs --->
kibana    | 
kibana    | [6:0x2b1e110]    73743 ms: Scavenge 1344.6 (1422.1) -> 1343.8 (1422.6) MB, 2.1 / 0.0 ms  (average mu = 0.205, current mu = 0.152) allocation failure 
kibana    | [6:0x2b1e110]    73750 ms: Scavenge 1344.7 (1422.6) -> 1343.9 (1423.1) MB, 2.2 / 0.0 ms  (average mu = 0.205, current mu = 0.152) allocation failure 
kibana    | [6:0x2b1e110]    73756 ms: Scavenge 1344.8 (1423.1) -> 1344.0 (1423.6) MB, 2.4 / 0.0 ms  (average mu = 0.205, current mu = 0.152) allocation failure 
kibana    | 
kibana    | 
kibana    | <--- JS stacktrace --->
kibana    | 
kibana    | ==== JS stack trace =========================================
kibana    | 
kibana    |     0: ExitFrame [pc: 0x18a842c5be1d]
kibana    | Security context: 0x34a12719e6e9 <JSObject>
kibana    |     1: 0x37e2a5f07a31 <Symbol: Symbol.split>(aka [Symbol.split]) [0x34a127189641](this=0x1376f3c15471 <JSRegExp <String[5]: \r?\n>>,0x1376f3c153c9 <String[17]: _defineProperties>,0x37e2a5f026f1 <undefined>)
kibana    |     2: split [0x34a1271906c9](this=0x1376f3c153c9 <String[17]: _defineProperties>,0x1376f3c15471 <JSRegExp <String[5]: \r?\n>>)
kibana    |     3: w(aka w) [0x7499b8...
kibana    | 
kibana    |  3: 0xb0052e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/share/kibana/bin/../node/bin/node]
kibana    |  4: 0xb00764 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/share/kibana/bin/../node/bin/node]
kibana    |  5: 0xef4c72  [/usr/share/kibana/bin/../node/bin/node]
kibana    |  6: 0xef4d78 v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [/usr/share/kibana/bin/../node/bin/node]
kibana    |  7: 0xf00e52 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/share/kibana/bin/../node/bin/node]
kibana    |  8: 0xf01784 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/share/kibana/bin/../node/bin/node]
kibana    |  9: 0xf043f1 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/share/kibana/bin/../node/bin/node]
kibana    | 10: 0xecd4e6 v8::internal::Factory::AllocateRawArray(int, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
kibana    | 11: 0xecdd6a v8::internal::Factory::NewFixedArrayWithFiller(v8::internal::Heap::RootListIndex, int, v8::internal::Object*, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
kibana    | 12: 0xece2f7 v8::internal::Factory::NewFixedArrayWithHoles(int, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
kibana    | 13: 0x11a78ab v8::internal::Runtime_RegExpSplit(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/share/kibana/bin/../node/bin/node]
kibana    | 14: 0x18a842c5be1d
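
Reading the "Last few GCs" lines in the trace above: the heap is at roughly 1344 MB, with a total heap size of about 1423 MB in parentheses, when the process aborts. That is nowhere near the 8192 MB requested. One possible reading (an assumption, not something confirmed here) is that the flag did not take effect in the process that crashed. The sketch below uses a hypothetical helper, gc_heap_sizes, to pull those two numbers out of such log lines so they can be compared with the configured limit.

```shell
# Hypothetical helper: read Kibana/Node log lines on stdin and print
# "<heap MB after GC> <total heap MB after GC>" for each GC trace line.
gc_heap_sizes() {
  sed -n -E 's/.*(Scavenge|Mark-sweep) [0-9.]+ \([0-9.]+\) -> ([0-9.]+) \(([0-9.]+)\) MB.*/\2 \3/p'
}

# Example call, assuming the container is named "kibana":
#   docker logs kibana 2>&1 | gc_heap_sizes | tail -n 3
```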

Hi. Try this

NODE_OPTIONS="--max-old-space-size=12288" ./bin/kibana

This is not related to how SG works. Look, here is the same error:


If I need 12GB of RAM to start an empty Kibana instance when using SG then that is most definitely related to SearchGuard and a problem, even if it’s not necessarily the fault of SearchGuard that this happens.

That being said, I already tried setting this as high as 16000, which is the theoretical limit of the VM, and it still happened.

Strange. I run Kibana + SG on my laptop every day with a much smaller heap size and I don’t see this error.

$ echo $NODE_OPTIONS
--max_old_space_size=4096

Do you have any other processes on the VM that can occupy a large portion of the RAM? Maybe you don’t have enough RAM at the time you run Kibana?

Please provide the following data

  1. The exact version of Kibana. For example, v7.6.2
  2. File kibana.yml
  3. File elasticsearch.yml
  4. File elasticsearch/plugins/search-guard-7/sgconfig/sg_config.yml

  1. 7.6.2
  2. kibana.yml (507 Bytes)
  3. elasticsearch.yml (1.1 KB)
  4. sg_config.yml (1.3 KB)

No, there are no other memory-hungry processes running. There is definitely enough RAM available. Watching how much RAM the container consumes, Kibana crashes at somewhere between 2 and 3 GB, which is nowhere close to the limit we had set. htop also looked OK.

Just to emphasize this: this setup worked completely fine in 7.2.1.

One more question, what OS do you use in the VM?

Hi Sergey,

CentOS 7

Thank you for the help!

FWIW, I initially had this issue when I updated to 7.6.2 as well. For me, it was crashing while optimizing plugins (including Search Guard). I run my Kibana pods with 4G memory and set:

export NODE_OPTIONS="--max-old-space-size=2048"

and it was sufficient to resolve the issue. Additionally, I observed that Kibana optimizes plugins every time it starts. Is this correct?

@sascha.herzinger
Have you tried adding the --max-old-space-size=4096 option directly in /kibana/bin/kibana? Someone from Elastic suggested this. I am just trying to narrow down the scope of the problem; maybe something happened to the env variable. Can you share your Dockerfile?

Also, try to enter the Kibana docker container. Delete the bundles and start Kibana.

rm -rf kibana/optimize/bundles/*
./kibana/bin/kibana

@sascha.herzinger @Doug_Renze I see that Kibana optimizes the plugins on every start. Kibana also forks multiple child processes on start for the thread-loader. Notice that each process has --max_old_space_size=4096. I presume Kibana expects each process to use heap space up to that limit if required. These workers should exit after Kibana finishes the optimization phase.

$ ps aux | grep kibana
sergiibondarenko 70061 101.6  1.2  4748396 197772   ??  Rs    9:53AM   0:30.92 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
sergiibondarenko 70065  94.2  1.2  4752064 202120   ??  Rs    9:53AM   0:30.08 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
sergiibondarenko 70063  90.7  1.7  4827268 278248   ??  Rs    9:53AM   0:28.98 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
sergiibondarenko 70062  88.4  1.3  4770064 220660   ??  Rs    9:53AM   0:30.23 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
sergiibondarenko 70059  86.3  1.2  4752156 201796   ??  Rs    9:53AM   0:33.02 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
sergiibondarenko 70060  86.2  1.2  4752096 202820   ??  Rs    9:53AM   0:30.91 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
sergiibondarenko 70064  79.8  1.2  4750812 200388   ??  Rs    9:53AM   0:30.78 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node/bin/node --no-warnings --max-http-header-size=65536 --max_old_space_size=4096 /Users/sergiibondarenko/Development/kibana/dist/kibana-7.6.2-darwin-x86_64/node_modules/thread-loader/dist/worker.js 20
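
To put a number on that: seven workers, each capped at 4096 MB, give a theoretical upper bound of 28672 MB of old space, although in the output above each worker is only using roughly 200 to 280 MB of resident memory. Here is a rough sketch for getting the same figures from any ps listing. worker_heap_budget is a made-up helper name, and the old-space cap is only part of what a Node process can use, so treat the total as an estimate.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin, count the
# thread-loader workers and add up their --max_old_space_size caps.
worker_heap_budget() {
  awk '
    /thread-loader\/dist\/worker\.js/ {
      n++
      # Accept both the hyphen and the underscore spelling of the flag.
      if (match($0, /--max[-_]old[-_]space[-_]size=[0-9]+/)) {
        flag = substr($0, RSTART, RLENGTH)
        sub(/.*=/, "", flag)   # keep only the number after "="
        total += flag
      }
    }
    END { printf "%d workers, up to %d MB of old space in total\n", n, total }
  '
}

# Example call while Kibana is optimizing:
#   ps aux | worker_heap_budget
```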

But I see a user reported the workers didn’t stop for him.

Wow, editing the kibana script actually works.
I don’t understand why, but it does.

Before:
NODE_OPTIONS="--no-warnings --max-http-header-size=65536 ${NODE_OPTIONS}" NODE_ENV=production exec "${NODE}" "${DIR}/src/cli" ${@}

After:
NODE_OPTIONS="--no-warnings --max-http-header-size=65536 ${NODE_OPTIONS}" NODE_ENV=production exec "${NODE}" --max-old-space-size=4096 "${DIR}/src/cli" ${@}

These two should by all means be the same. env returns NODE_OPTIONS=--max-old-space-size=4096
Could this be a Node.js issue? It looks like Node.js handles options from the env variable differently from options passed on the command line.
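
For anyone who wants to apply the same change while building an image instead of editing the script by hand, here is a rough sketch. patch_kibana_launcher is a hypothetical helper, and it assumes the launcher line looks exactly like the "Before" line above; other Kibana versions may differ, in which case the script is left unchanged.

```shell
# Hypothetical helper: add --max-old-space-size=<MB> to the node command
# line in a Kibana launcher script. Usage: patch_kibana_launcher FILE MB
patch_kibana_launcher() {
  script="$1"
  limit_mb="$2"
  # Do nothing if the flag is already on the node command line.
  if grep -qF -- '"${NODE}" --max-old-space-size=' "$script"; then
    return 0
  fi
  # Insert the flag between the node binary and the CLI entry point.
  # sed keeps a backup of the original next to it with a .bak suffix.
  sed -i.bak -e 's|exec "[$]{NODE}" "[$]{DIR}/src/cli"|exec "${NODE}" --max-old-space-size='"$limit_mb"' "${DIR}/src/cli"|' "$script"
}

# Example call (the path is an assumption based on the stack trace above):
#   patch_kibana_launcher /usr/share/kibana/bin/kibana 4096
```

In a Dockerfile this could run in a RUN step after the plugins are installed. Running it twice is harmless, since the second call finds the flag and returns early.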

Thank you so much for your help

I can’t reproduce it in Docker. You can try it here: https://git.floragunn.com/search-guard/search-guard-labs/-/tree/7.6.x

The Docker resources:
CPU=4
RAM=8 GB
SWAP=1 GB