Cluster stopped working, sgadmin hanging

Hey, so I have been testing SG for a while now. I shut off the cluster for a few weeks and then started up the instances. Unfortunately, SG is just hanging.

Both master nodes say:

[2017-09-08T08:42:11,409][ERROR][c.f.s.a.BackendRegistry ] Not yet initialized (you may need to run sgadmin)

As well as both data nodes:

[2017-09-08T08:42:11,596][ERROR][c.f.s.a.BackendRegistry ] Not yet initialized (you may need to run sgadmin)
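For anyone checking several nodes at once, here is a minimal Python sketch that spots this error in log lines. It is not part of Search Guard; it only assumes the log line layout shown above, and the helper names `is_uninitialized_error` and `count_uninitialized` are made up for illustration.

```python
import re

# Matches log lines laid out like the ones quoted above:
# [timestamp][LEVEL][logger ] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\]"
    r"\s*(?P<message>.*)$"
)


def is_uninitialized_error(line):
    """Return True if the line is the 'Not yet initialized' BackendRegistry error."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return False
    return (
        match.group("level") == "ERROR"
        and match.group("logger").endswith("BackendRegistry")
        and match.group("message").startswith("Not yet initialized")
    )


def count_uninitialized(lines):
    """Count how many of the given log lines carry that error."""
    return sum(1 for line in lines if is_uninitialized_error(line))
```

Feeding it the lines of each node's log shows quickly whether all nodes are stuck in the same state.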

So I looked in the logs to see which Elasticsearch instance is the designated master, and I see that our elastic-master-01 is the elected master. I run the sgadmin command (which is saved in a bash script so we run the same one each time) and it just hangs on:

Contacting elasticsearch cluster 'x-cluster' and wait for YELLOW clusterstate ...

Eventually failing with:

  • Try running sgadmin.sh with -icl and -nhnv (If thats works you need to check your clustername as well as hostnames in your SSL certificates)
  • If this is not working, try running sgadmin.sh with --diagnose and see diagnose trace log file)

However, the above does not work either. If I run with --red-cluster, I see a lot of fatals.
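To make the suggested invocation concrete, here is a rough Python sketch that assembles an sgadmin.sh command line with the -icl, -nhnv and --diagnose flags from the hint. The keystore and truststore file names are placeholders, not values from this cluster, and the password options are left out on purpose. The config directory is the one sgadmin reports for a Search Guard 5 install.

```python
import shlex


def build_sgadmin_command(config_dir, keystore, truststore, diagnose=True):
    """Build an sgadmin.sh argument list using the flags from the hint above."""
    command = [
        "./sgadmin.sh",
        "-cd", config_dir,   # directory holding the sg_*.yml files
        "-ks", keystore,     # admin keystore (placeholder name)
        "-ts", truststore,   # truststore (placeholder name)
        "-icl",              # ignore the cluster name
        "-nhnv",             # disable hostname verification
    ]
    if diagnose:
        command.append("--diagnose")  # write the diagnose trace log file
    return command


cmd = build_sgadmin_command(
    "/usr/share/elasticsearch/plugins/search-guard-5/sgconfig/",
    "admin-keystore.jks",   # hypothetical file name
    "truststore.jks",       # hypothetical file name
)
# Print a copy-and-paste friendly version of the command.
print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the arguments in a list makes it easy to pass them to `subprocess.run` later without worrying about shell quoting.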

The diagnose log file tells me some interesting things:

ClusterHealthRequest:
{
  "error" : "Can not start an object, expecting field name (context: Object)"
}
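As a side note on the YELLOW wait: the check comes down to comparing the cluster's reported health status against a minimum level. The sketch below shows that comparison in Python on a `_cluster/health` style JSON body. It is only an illustration, not how sgadmin is implemented, and `meets_minimum_status` is a made-up helper name.

```python
import json

# Elasticsearch health levels, ordered from worst to best.
STATUS_ORDER = {"red": 0, "yellow": 1, "green": 2}


def meets_minimum_status(health_response, minimum="yellow"):
    """Return True if a cluster health JSON body reports at least `minimum`."""
    health = json.loads(health_response)
    status = health.get("status")
    if status not in STATUS_ORDER:
        # No usable status, e.g. an error body like the one in the diagnose log.
        return False
    return STATUS_ORDER[status] >= STATUS_ORDER[minimum.lower()]
```

An error body with no "status" field, like the one above, is treated as not meeting the minimum, which matches the hang seen here.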

Can you post (or mail) the logfiles of your nodes and your elasticsearch.yml?

On 08.09.2017 at 10:47, anthony.cleaves@actual-experience.com wrote:

Registered Office: Actual Experience plc
Quay House, The Ambury, Bath BA1 1UA,
Registered No. 06838738, VAT No. 971 9696 56

The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer. Although we routinely screen for viruses, addressees should check this e-mail and any attachment for viruses. We make no warranty as to absence of viruses in this e-mail or any attachments.

--
You received this message because you are subscribed to the Google Groups "Search Guard Community Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to search-guard+unsubscribe@googlegroups.com.
To post to this group, send email to search-guard@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/search-guard/cdf5a8db-8fff-4f48-a49d-ac1fb6404975%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Of course. I am just going to upgrade to v16 first to see if that makes any difference. Then I can email them; is there a specific email address?

On Friday, 8 September 2017 09:57:22 UTC+1, Search Guard wrote:

Can you post (or mail) the logfiles of your nodes and your elasticsearch.yml?

Scrap that, I got myself muddled up with versioning.

On Friday, 8 September 2017 10:02:14 UTC+1, anthony...@actual-experience.com wrote:

Of course, I am just going to upgrade to v16 to see if that makes any difference. Then I can email them, any specific email address?

I upgraded to 5.5.0.16 and it’s now working?

Clustername: x-cluster
Clusterstate: GREEN
Number of nodes: 4
Number of data nodes: 2
searchguard index does not exists, attempt to create it ... done (auto expand replicas is on)
Populate config from /usr/share/elasticsearch/plugins/search-guard-5/sgconfig/
Will update 'config' with /usr/share/elasticsearch/plugins/search-guard-5/sgconfig/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with /usr/share/elasticsearch/plugins/search-guard-5/sgconfig/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with /usr/share/elasticsearch/plugins/search-guard-5/sgconfig/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with /usr/share/elasticsearch/plugins/search-guard-5/sgconfig/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with /usr/share/elasticsearch/plugins/search-guard-5/sgconfig/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Done with success
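Since we run sgadmin from a script anyway, a check like the following could confirm that all five config types were written. This is a sketch in Python that only assumes the "SUCC: Configuration for ... created or updated" wording from the output above; `missing_config_types` is a made-up helper name.

```python
import re

# The five config types that appear in the sgadmin output above.
EXPECTED_TYPES = ("config", "roles", "rolesmapping", "internalusers", "actiongroups")

SUCC_LINE = re.compile(r"^SUCC: Configuration for '(?P<name>\w+)' created or updated$")


def missing_config_types(output):
    """Return the expected config types that have no SUCC line in the output."""
    updated = set()
    for line in output.splitlines():
        # Normalize curly quotes, which mail clients tend to introduce.
        normalized = line.strip().replace("\u2018", "'").replace("\u2019", "'")
        match = SUCC_LINE.match(normalized)
        if match:
            updated.add(match.group("name"))
    return [name for name in EXPECTED_TYPES if name not in updated]
```

An empty list means every type reported success; anything else names the types to look at.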
