Describe the issue:
Running sgctl update-config against a two-node ES 9.3.3 + SG FLX 4.1.10 cluster hangs
indefinitely. When the management thread pool size is 1, the ES management thread on the master node blocks inside ConfigUpdateAction.nodeOperation() and never completes.
The problem does not occur on ES 9.2 or ES 8.19. The root cause is an ES 9.3 behaviour change: auto_put_mapping is now dispatched to the
MANAGEMENT thread pool. In earlier versions it ran on a different pool, so the deadlock did
not occur.
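
For readers unfamiliar with this failure mode, here is a minimal sketch of the general pattern, written in Python rather than taken from the Search Guard or Elasticsearch code. It assumes the blocked operation waits synchronously for a second task that is queued on the same pool. The names nested_wait, outer and inner are hypothetical, and a timeout stands in for the indefinite hang so that the script terminates.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def nested_wait(pool_size: int, timeout: float = 0.5) -> str:
    """Run a task that blocks on a second task submitted to the same pool."""
    pool = ThreadPoolExecutor(max_workers=pool_size)

    def inner() -> str:
        # Stand-in for the follow-up work (here: the mapping update).
        return "mapping updated"

    def outer() -> str:
        # Stand-in for the operation that occupies a pool thread and then
        # waits for follow-up work queued on the SAME pool (an assumption
        # about the real code path, used only for illustration).
        follow_up = pool.submit(inner)
        try:
            return follow_up.result(timeout=timeout)
        except FutureTimeout:
            # With a single worker, inner can never start while outer waits.
            follow_up.cancel()
            return "timed out"

    try:
        return pool.submit(outer).result()
    finally:
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print("pool size 1:", nested_wait(1))
    print("pool size 2:", nested_wait(2))
```

With one worker the outer task holds the only thread, so the inner task stays queued until the wait gives up. With two workers the inner task runs on the second thread and the outer task returns normally.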
Hi @pablo. For 9.2 I tested with 4.0.1-es-9.2.6, and for 8.19 I tested with 4.0.1-es-8.19.13.
Both of those worked.
With 4.1.0-es-9.3.3 the deadlock occurs.
We discovered the deadlock in an integration test environment. The attached file shows how to reproduce it.
Thank you @issac.garcia for sharing the info. Could you tell me why you need to set node.processors to 1? Is it for testing, or do you have limited resources?
I don’t need to set node.processors to 1; our testing cluster was using 1 by default when we discovered the deadlock. It took us a while to understand the issue because it didn’t happen on previous ES versions. We changed the setting and no longer have the problem. I am just sharing what we found so that you can improve your software.