Resolve without severity option selected

Hi Team,
Is there any way to configure a resolve action without selecting a severity?

Hi Ankita,

No, there is no way. Signals only knows whether there is a problem when a severity is defined, so it can only tell that a problem has been resolved in that case.

Why would you want to have a resolve action without severity?

Nils

Hi Nils,
I am attaching my CPU alert below.

{
  "severity": {
    "value": "data._value[-1].cpu_usage_round",
    "order": "ascending",
    "mapping": [
      {
        "level": "critical",
        "threshold": 10
      }
    ]
  },
  "checks": [
    {
      "type": "search",
      "name": "mysearch",
      "target": "mysearch",
      "request": {
        "body": {
          "size": 0,
          "query": {
            "bool": {
              "must": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-3m",
                      "lte": "now"
                    }
                  }
                }
              ]
            }
          },
          "aggregations": {
            "bucketAgg": {
              "terms": {
                "field": "agent.hostname.keyword",
                "size": 500,
                "order": {
                  "metricAgg": "desc"
                }
              },
              "aggregations": {
                "metricAgg": {
                  "avg": {
                    "field": "system.load.1"
                  }
                },
                "Account": {
                  "terms": {
                    "field": "cloud.account.id.keyword"
                  }
                },
                "ID": {
                  "terms": {
                    "field": "cloud.instance.id.keyword"
                  }
                },
                "Region": {
                  "terms": {
                    "field": "cloud.region.keyword"
                  }
                },
                "Time": {
                  "terms": {
                    "field": "@timestamp"
                  }
                }
              }
            }
          }
        },
        "indices": [
          "metricbeateoprodsec-7.17.6-2022.11.17"
        ]
      }
    },
    {
      "type": "transform",
      "name": "data_normalization",
      "source": "def hosts=data.mysearch.aggregations.bucketAgg.buckets;\r\nreturn hosts.stream().filter(h->{ def cpu_usage=h.metricAgg.value; return cpu_usage>10; }).map(h->{def cpu_usage=h.metricAgg.value; def cpu_usage_round=BigDecimal.valueOf(cpu_usage).setScale(2, RoundingMode.HALF_EVEN); def acc=h.Account.buckets[0].key; def reg=h.Region.buckets[0].key; def id=h.ID.buckets[0].key; def date=h.Time.buckets[0].key_as_string; return['host': h.key, 'cpu_usage': cpu_usage, 'cpu_usage_round': cpu_usage_round, 'acc': acc, 'reg': reg, 'date': date, 'id': id];}).collect(Collectors.toList());",
      "lang": "painless"
    }
  ],
  "resolve_actions": [
    {
      "type": "index",
      "name": "myelasticsearch",
      "index": "cpu_alerts_index_crirical_clear",
      "checks": [],
      "resolves_severity": [
        "critical"
      ]
    },
    {
      "type": "webhook",
      "name": "mywebhook",
      "request": {
        "method": "POST",
        "url": "https://rtpcool01lx.msts.ericsson.net:1522/probe/webhook",
        "body": "{\n\"Type\": \"Notification\",\n\"json.Message.Hostname\": \"{{#data._value}}{{host}},{{/data._value}}\",\n\"Text\": \"Severity level is now {{severity.level}}. The value has decreased from to {{severity.value}}%\",\n\"Message.Region\": \"{{#data._value}}{{reg}},{{/data._value}}\",\n\"Message.AWSAccountId\": \"{{#data._value}}{{acc}},{{/data._value}}\",\n\"Message.StateChangeTime\": \"{{execution_time}}\",\n\"Message.NewStateValue\": \"clear\",\n\"Message.Trigger.MetricName\": \"CPU Utilization %\",\n\"Message.AlarmName\": \"{{#data._value}}{{host}},{{/data._value}}\",\n\"Message.Trigger.Namespace\": \"AWS/EC2\",\n\"Message.AlarmDescription\": \"CPU Utilization % For Hosts\",\n\"Message.Source\": \"ELK\",\n\"Message.Trigger.Dimensions.0.value\": \"{{#data._value}}{{id}},{{/data._value}}\",\n\"Message.Trigger.Threshold\": \"80%\",\n\"Message.NewStateReason\": \"CPU Utilization has breached the threshold {{#data._value}}{{cpu_usage_round}}%,{{/data._value}}\"\n}",
        "headers": {
          "Content-type": "application/json"
        }
      },
      "checks": [],
      "resolves_severity": [
        "critical"
      ]
    }
  ],
  "active": true,
  "_meta": {
    "last_edit": {
      "user": "admin",
      "date": "2022-11-17T14:11:09.302Z"
    }
  },
  "trigger": {
    "schedule": {
      "interval": [
        "3m"
      ],
      "timezone": "Europe/Berlin"
    }
  },
  "_tenant": "_main",
  "actions": [
    {
      "type": "index",
      "name": "myelasticsearch",
      "index": "cpu_alerts_index",
      "checks": [],
      "severity": [
        "critical"
      ],
      "throttle_period": "1s"
    },
    {
      "type": "webhook",
      "name": "mywebhook",
      "request": {
        "method": "POST",
        "url": "https://rtpcool01lx.msts.ericsson.net:1522/probe/webhook",
        "body": " {\n \"Type\": \"Notification\",\n \"Text\": \"{{#data._value}}{{host}},{{/data._value}}\",\n \"Message.Region\": \"{{#data._value}}{{reg}},{{/data._value}}\",\n \"Message.AWSAccountId\": \"{{#data._value}}{{id}},{{/data._value}}\",\n \"Message.StateChangeTime\": \"{{execution_time}}\",\n \"Message.Trigger.MetricName\": \"CPU Utilization %\",\n \"Message.AlarmName\": \"{{#data._value}}{{host}},{{/data._value}}\",\n \"Message.NewStateValue\": \"{{severity.level}}\",\n \"Message.Trigger.Namespace\": \"AWS/EC2\",\n \"Message.AlarmDescription\": \"CPU Utilization % For Hosts\",\n \"Message.Trigger.Dimensions.0.value\": \"{{#data._value}}{{id}},{{/data._value}}\",\n \"Message.Trigger.Threshold\": \"80%\",\n \"Message.Source\": \"ELK\",\n \"Message.NewStateReason\": \"CPU Utilization has breached the threshold {{#data._value}}{{cpu_usage_round}}%,{{/data._value}}\"\n }",
        "headers": {
          "Content-type": "application/json"
        }
      },
      "checks": [],
      "severity": [
        "critical"
      ],
      "throttle_period": "1s"
    }
  ],
  "_id": "cpu_critical_alert_watch"
}

I want to send the data to Elasticsearch and to the webhook, both when the alert fires and when it resolves. However, this query returns multiple hosts at once. If no host matches the condition, or the CPU usage drops below the threshold, the clear alert is not sent and I get the error "Index -1 out of bound 0".

Please support.

//Ankita

Can someone guide me on how to set the severity so that a clear event is sent instead of the index out of bounds error?

You have defined the severity with this expression

 "value": "data._value[-1].cpu_usage_round",

Thus, you are using -1 as array index. This will always lead to an index out of bounds error. Why did you choose -1 here?

Hi,
What should be used in the severity value instead of -1?

-1 means all.

In Painless, an array index only references a single element. If you want to perform aggregate operations, you might be able to use some functions. Painless shares a number of functions and classes with Java.

You might be able to do something like:

data._value.stream().map(e -> e.cpu_usage_round).max((a, b) -> a.compareTo(b)).get();

This should get you the maximum value from the array. Note: I did not test it.
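To make the idea concrete outside of Signals, here is a minimal sketch in plain Java rather than Painless. It assumes that data._value is a list of maps as built by the transform in the watch above, and that an empty list should fall back to 0 so that no element lookup fails when no host matches. The class name, the method name, and the fallback value are hypothetical and only serve the illustration; whether the Signals severity expression accepts the same calls has not been verified here.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Signals or Painless.
final class SeverityValueSketch {

    // Returns the highest cpu_usage_round among the rows,
    // or ZERO (an assumed fallback) when no host matched.
    static BigDecimal maxCpuUsage(List<Map<String, Object>> rows) {
        return rows.stream()
                .map(row -> (BigDecimal) row.get("cpu_usage_round"))
                .max(BigDecimal::compareTo)
                .orElse(BigDecimal.ZERO);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
                Map.of("host", "host-a", "cpu_usage_round", new BigDecimal("12.50")),
                Map.of("host", "host-b", "cpu_usage_round", new BigDecimal("47.25")));
        System.out.println(maxCpuUsage(rows));      // highest value among the matching hosts
        System.out.println(maxCpuUsage(List.of())); // fallback when no host matched
    }
}
```

Using orElse instead of get avoids an exception on an empty list, which is the situation described earlier in the thread where no host is above the threshold.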

Hi Nils,
Can I DM you to explain better?

//Ankita

Hi Nils,
Following are my requirements listed below:
I have 30 hosts in my infrastructure. I want to be notified with an alert if the CPU usage of any host breaches the threshold. That means I want a separate alert for each host, since I am sending the data to Netcool via webhook.

Requirement.txt (55.6 KB)

//Ankita

As this is just unpaid support, most discussions should be done in the public part of the forum.

I do not think that you can get per-host notifications by using severity levels in a single watch that scans all hosts.

I see two options:

  • Create a watch for each host
  • Give up per-host notifications and aggregate hosts with issues in one notification
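As a rough illustration of the second option, the following plain Java sketch folds all hosts with issues into one notification text. It assumes the same row shape as the transform in the watch above (maps with host and cpu_usage_round); the class name, the method name, and the message wording are hypothetical. In the watch itself this would more likely be done in the transform or in the Mustache template of the action.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration; not part of Signals.
final class AggregatedNotificationSketch {

    // Builds one message that lists every affected host with its rounded CPU value.
    static String summarize(List<Map<String, Object>> rows) {
        if (rows.isEmpty()) {
            return "No host above the CPU threshold";
        }
        return rows.stream()
                .map(row -> row.get("host") + " (" + row.get("cpu_usage_round") + "%)")
                .collect(Collectors.joining(", ", "CPU threshold breached on: ", ""));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
                Map.of("host", "host-a", "cpu_usage_round", new BigDecimal("12.50")),
                Map.of("host", "host-b", "cpu_usage_round", new BigDecimal("47.25")));
        // prints "CPU threshold breached on: host-a (12.50%), host-b (47.25%)"
        System.out.println(summarize(rows));
    }
}
```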