I am upgrading ELK (with Search Guard) from version 6.6.1 to 7.0.1 in a Kubernetes environment using Helm charts.
Since the Search Guard (SG) configuration needs to be migrated from version 6 to 7, I have written a post-upgrade job in my Helm chart that runs:
./sgadmin.sh -migrate /usr/share/elasticsearch/sg_migrate -ts $TRUSTSTORE_FILEPATH -ks $CLIENT_KEYSTORE_FILEPATH -cn $CLUSTER_NAME -kspass $KS_PWD -tspass $TS_PWD -h $HOSTNAME -nhnv
Sometimes, at post-upgrade, the SG migration fails with the error below:
Contacting elasticsearch cluster 'logging-elk' and wait for YELLOW clusterstate ... Clustername: logging-elk Clusterstate: GREEN Number of nodes: 2 Number of data nodes: 0 searchguard index does not exists, attempt to create it ... done (0-all replicas) ERR: Seems cluster is already migrated
It did not detect any data nodes even though the data nodes were up. It then tries to create the searchguard-7 index, and if I run the migrate job again after this, it fails with the error: "searchguard index already exists, ERR: Seems cluster is already migrated".
But my SG configuration files are still in the SG-6 format. Because of this, I cannot even run sgadmin: the files are in the SG-6 format while the SG index is in the 7 format. At that point the cluster becomes unreachable.
How can I guarantee that the migrate job runs only after the data nodes are detected, so that I never end up in this state?
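For context, one workaround I am considering is a wrapper around the migrate step that polls the Elasticsearch `_cluster/health` API (which reports `number_of_data_nodes`) and only calls sgadmin once at least one data node has joined. This is a minimal sketch, not a tested fix; `ES_URL`, the retry limits, and the function names are my own assumptions, and in practice curl would need the same certificates the job already mounts:

```shell
#!/bin/sh
# Hypothetical pre-check for the migrate job: block until the cluster
# reports at least one data node, then run the sgadmin migration.

ES_URL="${ES_URL:-https://localhost:9200}"   # assumed endpoint, adjust per chart

# Extract number_of_data_nodes from a /_cluster/health JSON response.
data_nodes_in() {
  printf '%s' "$1" | sed -n 's/.*"number_of_data_nodes":\([0-9][0-9]*\).*/\1/p'
}

# Poll cluster health (up to 60 tries, 5 s apart) until a data node appears.
wait_for_data_nodes() {
  tries=0
  while [ "$tries" -lt 60 ]; do
    health=$(curl -sk "$ES_URL/_cluster/health" || true)
    n=$(data_nodes_in "$health")
    [ "${n:-0}" -ge 1 ] 2>/dev/null && return 0
    echo "waiting for data nodes (attempt $tries)..." >&2
    tries=$((tries + 1))
    sleep 5
  done
  echo "timed out waiting for data nodes" >&2
  return 1
}

# Intended use in the job (same sgadmin flags as above):
# wait_for_data_nodes && ./sgadmin.sh -migrate /usr/share/elasticsearch/sg_migrate ...
```

If the check times out, the job exits non-zero instead of letting sgadmin create the searchguard-7 index against a cluster with no data nodes, so Kubernetes can retry the job. Is this a reasonable approach, or is there a built-in way to make sgadmin itself wait for data nodes?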