Signals runtime data not accessible

Hi,
A few days ago we upgraded/patched our cluster with Search Guard 7.x Bugfix Release July 2021 for ELK 7.9.3.
Since then there seems to be a problem with accessing runtime data in checks of type condition.
For example, take this simple watch.

watch
    [
      {
        "type": "search",
        "name": "mysearch",
        "target": "mysearch",
        "request": {
          "indices": [
            "kubernetes-prod-logs"
          ],
          "body": {
            "from": 0,
            "size": 0,
            "query": {
              "bool": {
                "filter": {
                  "range": {
                    "@timestamp": {
                      "gte": "now-5m",
                      "lte": "now"
                    }
                  }
                },
                "must": [
                  {
                    "term": {
                      "log.level": "ERROR"
                    }
                  }
                ]
              }
            }
          }
        }
      },
      {
        "type": "condition",
        "name": "mycondition",
        "source": "data.mysearch.hits.total.value > 0"
      }
    ]

Response
    {
      "watch": {
        "id": "__inline_watch",
        "tenant": "signals-prod"
      },
      "data": {
        "mysearch": {
          "_shards": {
            "total": 198,
            "failed": 0,
            "successful": 198,
            "skipped": 189
          },
          "hits": {
            "hits": [],
            "total": {
              "value": 692,
              "relation": "eq"
            },
            "max_score": null
          },
          "took": 42,
          "timed_out": false
        }
      },
      "severity": null,
      "trigger": {
        "triggered_time": null,
        "scheduled_time": null,
        "previous_scheduled_time": null,
        "next_scheduled_time": null
      },
      "execution_time": "2021-08-19T06:57:30.413567858Z"
    }

result
    {
      "tenant": "signals-prod",
      "watch_id": "test",
      "status": {
        "code": "NO_ACTION",
        "detail": "No action needed due to check mycondition"
      },
      "execution_start": "2021-08-19T07:07:57.096Z",
      "execution_end": "2021-08-19T07:07:59.566Z",
      "actions": [],
      "node": "pias-essignals002.srv24246aca-kvm.signals",
      "_id": "pk48XXsBW_l-8rou7kbO",
      "_index": ".signals_log_2021.08.19"
    }

Watches created before the patch work correctly, but new ones don't.

If I change the condition to
"source": "data.mysearch.hits.total.value >= 0"
then the action is executed.
If I change it to
"source": "data.mysearch.hits.total.value >= 1"
then the action is not executed.
I also tried with aggs in the search:

    "aggs": {
      "levelcount": {
        "value_count": {
          "field": "log.level"
        }
      }
    }

with the condition
"source": "data.mysearch.aggregations.levelcount.value >= 0"
and then I got this exception:

      "error": {
        "check": "Condition mycondition",
        "message": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null",
        "detail": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "data.mysearch.aggregations.levelcount.value >= 0",
            " ^---- HERE"
          ],
          "script": "data.mysearch.aggregations.levelcount.value >= 0",
          "lang": "painless",
          "position": {
            "offset": 26,
            "start": 0,
            "end": 48
          },
          "caused_by": {
            "type": "null_pointer_exception",
            "reason": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null"
          }
        }
      },

    }

Any idea what could be wrong?

Hello @marwojt

Regarding your first query, do you get any results when you change the size to more than 0 (e.g. 5)?
Have you tried running the query against a keyword field rather than a text field?

If I change the size, I get docs in the response. I also tried a condition on hits.hits.length, but that didn't work either.
The log.level field is a keyword field.

This will access the text value and not the keyword.

Could you try the following instead?

"log.level.keyword": "ERROR"

I've changed it as you suggested, but it doesn't help: still no action is executed.

query
    [
      {
        "type": "static",
        "name": null,
        "target": "constants",
        "value": {
          "threshold_test": 0
        }
      },
      {
        "request": {
          "indices": [
            "kubernetes-prod-logs"
          ],
          "body": {
            "size": 1,
            "query": {
              "bool": {
                "must": {
                  "range": {
                    "@timestamp": {
                      "gte": "now/m-5m",
                      "lt": "now/m"
                    }
                  }
                }
              }
            },
            "aggs": {
              "levelcount": {
                "value_count": {
                  "field": "log.level.keyword"
                }
              }
            }
          }
        },
        "type": "search",
        "name": "mysearch",
        "target": "mysearch"
      },
      {
        "name": "mycondition",
        "source": "data.mysearch.hits.total.value > 0",
        "type": "condition",
        "lang": "painless"
      }
    ]
response
    {
      "watch": {
        "id": "__inline_watch",
        "tenant": "signals-prod"
      },
      "data": {
        "mysearch": {
          "_shards": {
            "total": 207,
            "failed": 0,
            "successful": 207,
            "skipped": 198
          },
          "hits": {
            "hits": ,
            "total": {
              "value": 10000,
              "relation": "gte"
            },
            "max_score": 3
          },
          "took": 1601,
          "timed_out": false,
          "aggregations": {
            "levelcount": {
              "value": 0
            }
          }
        },
        "constants": {
          "threshold_test": 0
        }
      },
      "severity": null,
      "trigger": {
        "triggered_time": null,
        "scheduled_time": null,
        "previous_scheduled_time": null,
        "next_scheduled_time": null
      },
      "execution_time": "2021-08-23T11:43:18.787588482Z"
    }

I've checked Signals with other indices and it works fine.
But I have a group of indices for which Signals doesn't work. It looks like there is some problem with parsing JSON.

sg_singnals_log.txt (20.0 KB)

@marwojt

It seems like the indices from the log file have malformed data. How do you collect the logs and push them to Elasticsearch?
The Signals watch performs a GET API query, which you can test in Dev Tools or with the curl command.
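
If you prefer a script over Dev Tools or curl, here is a minimal Python sketch of the same test. It only assumes what is in the watch above (the index name and the search body); the base URL, user name, password and basic authentication are placeholders I made up, so adjust them to your cluster. It uses POST, which the _search endpoint accepts as well as GET with a body.

```python
import base64
import json
import urllib.request

# Search body copied from the "search" check of the first watch in this thread.
SEARCH_BODY = {
    "from": 0,
    "size": 0,
    "query": {
        "bool": {
            "filter": {"range": {"@timestamp": {"gte": "now-5m", "lte": "now"}}},
            "must": [{"term": {"log.level": "ERROR"}}],
        }
    },
}


def build_search_request(base_url, index, body, user, password):
    """Build (but do not send) a _search request equivalent to the watch's check.

    base_url, user and password are hypothetical placeholders, not values
    taken from this thread.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_search(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (not executed here, since it needs a reachable cluster):
#   req = build_search_request("https://localhost:9200", "kubernetes-prod-logs",
#                              SEARCH_BODY, "some_user", "some_password")
#   print(run_search(req)["hits"]["total"])
```

If hits.total in this response matches what the watch shows under data.mysearch, the search itself is fine and the problem is more likely in how the runtime data reaches the condition.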

Hi,
The query from the watch works fine in Dev Tools and also in the Signals UI when it is executed.
I suppose it is something with the .signals_watches mapping: signals_watches_mapping.json (19.4 KB).
Could you tell me which field (text or keyword) is used for checks.request.indices when the watch is executed?

    "index": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "indices": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },

@marwojt The .signals_watches mapping should not have any impact on this. Could you run the exact same query with size set to 1 and paste the result here? I suspect the data returned might provide some insight. Please redact any sensitive details.

Hi,
I have new insights. It looks like the issue is not related to the mapping, as I had thought.
It is related to indices with specific aliases and to creating the SearchInput.
I have indices managed by ILM, for example
kopernik-k8s-nonprod-dps-logs-2021.08.31-000011 with aliases (kopernik-k8s-nonprod-dps-logs, kubernetes-nonprod-logs).
If I run a watch on this data, no matter whether I specify the index or the alias, I get an error like this:
https://forum.search-guard.com/uploads/short-url/emA4QUGQtA6zncqh6P0OF5Cg5nC.txt

If I create an index with the same data but a different name and no aliases, for example
kop-k8s-nonprod-dps-logs-2021.08.31-000011
it works fine.
If I add either of the aliases (kopernik-k8s-nonprod-dps-logs, kubernetes-nonprod-logs), it stops working.
Even if I add a one-word alias such as "kopernik" or "kubernetes", it stops working.

As a workaround, we are indexing the same data into short-named indices like "kop-dps-logs" without aliases and running the watches on them.

Hi @marwojt

This error reports specific indices that might be attached to your alias.
Have you checked the shard/replica health status of those indices?
Have you tried removing those indices from the queried alias?

From the logs:

    "caused_by": {
      "type": "i_o_exception",
      "reason": "java.util.concurrent.ExecutionException: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\\' (code 92)): was expecting double-quote to start field name\n at [Source: (StringReader); line: 1, column: 3]",
      "caused_by": {
        "type": "execution_exception",
        "reason": "execution_exception: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\\' (code 92)): was expecting double-quote to start field name\n at [Source: (StringReader); line: 1, column: 3]",
        "caused_by": {
          "type": "i_o_exception",
          "reason": "Unexpected character ('\\' (code 92)): was expecting double-quote to start field name\n at [Source: (StringReader); line: 1, column: 3]"
        }
      }
    }

Would it be possible for you to check the Elasticsearch logs for entries corresponding to the above error? A stack trace of that error would be very helpful.
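
For reference, the message in the log excerpt (a backslash, code 92, where a parser expected the double quote that starts a field name) is the kind of error a JSON parser reports when the quotes of an object carry literal backslashes. The Python sketch below only illustrates that class of error with the standard json module. The escaped input is constructed for the illustration and is an assumption, not data from this cluster, and Jackson counts columns differently, so the reported position will not match the log exactly.

```python
import json

# A well-formed object, reusing the term filter from the watch in this thread.
valid = json.dumps({"term": {"log.level": "ERROR"}})

# Hypothetical damaged input: every quote is preceded by a literal backslash,
# as happens when serialized JSON is escaped a second time and then parsed
# without being unescaped first.
escaped = valid.replace('"', '\\"')


def first_parse_error(text):
    """Return the JSONDecodeError raised for text, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err
    return None


ok = first_parse_error(valid)      # None: the original text parses
bad = first_parse_error(escaped)   # error raised at the first backslash

print(ok)
print(bad.pos, repr(escaped[bad.pos]), ord(escaped[bad.pos]))
```

The parser stops at the character right after the opening brace, which is the backslash with code 92. That is consistent with the error appearing so early in line 1 of the log, but the stack trace is still needed to see which string is being parsed.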