DLS performance

Hi

We are setting up a proof of concept for our central logging system for application logs. We have multiple applications and multiple developers. Typically one developer is responsible for a single application, but in some cases a developer is responsible for several.

One of the requirements for the central logging system is that developers should only see the logs of the applications they are responsible for.

I have implemented DLS-based authorization as in this example (sg_roles.yml):
app1:
  cluster:
    - CLUSTER_COMPOSITE_OPS_RO
  indices:
    '?kibana-*':
      '*':
        - READ
    'applog-*':
      '*':
        - READ
        - SEARCH
        - indices:data/read/field_stats
      _dls_: '{ "term": { "app": "app1" } }'

We have other applications, so there are multiple roles (app2, app3, etc.), each one filtering the logs for one application.
Users are mapped to one or more roles (access to the logs of one or more applications).

Authorization with DLS is working correctly: every user with role app1 sees only the logs of application app1. Users with roles app1 and app2 see the logs of both applications, so everything works as expected, even for users with access to multiple application logs.
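
The behaviour described here (a user with roles app1 and app2 sees both sets of logs) is consistent with the DLS queries of all of a user's roles being combined with a logical OR. The following is only a sketch of that idea in Python, not Search Guard's actual implementation; the `ROLE_DLS` mapping and the `combine_dls` function are hypothetical names introduced for illustration.

```python
import json

# Hypothetical mapping of role name -> DLS query string, mirroring sg_roles.yml.
ROLE_DLS = {
    "app1": '{ "term": { "app": "app1" } }',
    "app2": '{ "term": { "app": "app2" } }',
}

def combine_dls(roles, role_dls=ROLE_DLS):
    """Return one query matching documents allowed by ANY of the given roles."""
    clauses = [json.loads(role_dls[r]) for r in roles if r in role_dls]
    if not clauses:
        return None  # none of the given roles has a DLS query in the mapping
    if len(clauses) == 1:
        return clauses[0]
    # OR the per-role filters together: at least one must match.
    return {"bool": {"should": clauses, "minimum_should_match": 1}}

if __name__ == "__main__":
    print(json.dumps(combine_dls(["app1", "app2"]), indent=2))
```

For a user with a single role the result is just that role's term filter; for several roles it is a `bool`/`should` query over all of them.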

The problem we are having is the performance penalty when DLS is enabled.
Without DLS, a search for the last 100 records takes about 20 ms (note that the query includes the exact same term filter for app1 as the DLS criteria).
With DLS enabled, the same search takes 900-1000 ms.

GET applog-*/_search
{
  "size": 100,
  "query": {
    "bool": {
      "filter": [
        { "term": { "app": "app1" } },
        { "term": { "env": "PROD" } },
        { "range": { "@timestamp": { "gte": "now-15m" } } }
      ]
    }
  },
  "sort": [
    { "@timestamp": { "order": "desc" } }
  ]
}
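
One way to make the comparison repeatable is to run the same search several times as a user without DLS and as a user with DLS, and compare the `took` value that Elasticsearch returns in each search response. The sketch below uses only the Python standard library. The base URL, user names and passwords are placeholders you would supply yourself, and it assumes HTTP basic authentication and a certificate that Python already trusts (a cluster with a private CA would need an SSL context passed to `urlopen`).

```python
import base64
import json
import statistics
import urllib.request

def build_query(app, env="PROD", window="now-15m", size=100):
    """Build the search body from the post: two term filters plus a time range."""
    return {
        "size": size,
        "query": {"bool": {"filter": [
            {"term": {"app": app}},
            {"term": {"env": env}},
            {"range": {"@timestamp": {"gte": window}}},
        ]}},
        "sort": [{"@timestamp": {"order": "desc"}}],
    }

def search_took(base_url, user, password, body, index="applog-*"):
    """Run one search and return the server-side 'took' value in milliseconds."""
    req = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["took"]

def summarize(samples):
    """Median and maximum of a list of 'took' samples (milliseconds)."""
    return {"median_ms": statistics.median(samples), "max_ms": max(samples)}
```

Calling `search_took` in a loop for each user and passing the collected values to `summarize` gives one median per user. Looking at the first run separately from the later runs helps to tell a cold cache apart from a constant DLS overhead.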

Could you please help me troubleshoot this issue?

Regards,
Mitja

Elasticsearch Version: 6.2.4
Search Guard Version: 6.2.4-22.1
Java™ SE Runtime Environment (build 1.8.0_131-b11)
OS is Oracle Linux Server release 7.3

Before digging deeper into your setup: The SG version you are running (22.1) is quite old. We have added a couple of DLS performance improvements in 22.3. Can you please upgrade to this version and see if the situation changes?

Changelogs:


Hi

I have installed the newer version 22.3 of the SG plugin, but in the first few ad-hoc executions I did not notice any improvement.

Before running a proper test case, I also fixed the following warning that I noticed in the sg_admin output:

Legacy index 'searchguard' detected.
See http://docs.search-guard.com/v6/upgrading-5-6 for more details.

I guess this was left over from the upgrade and I didn't pay attention to it.

After recreating the SG index and rerunning the test cases, performance returned to normal.

The same query now runs as expected (around 40 ms).

I don't know whether this is the result of recreating the SG index in the proper version 6 format or whether some cache kicked in after the first few queries, but querying with DLS enabled now performs well.

Regards,

Mitja

That was due to the legacy SG5 index. We support the old index format, but only for making it possible to upgrade from 5 → 6 via a rolling restart without any outage. After the upgrade is completed you should add the index with sgadmin again (as you did) so it is written in the new SG6 format.
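
Because the legacy-index warning is easy to miss in a long sgadmin run, a small check over the captured output can flag it. This is only a sketch: it assumes the output has been captured into a string, and the message wording is taken from the post above, so it may differ in other versions. The function name `has_legacy_index_warning` is made up for this example.

```python
import re

# Matches lines such as: Legacy index 'searchguard' detected.
# The wording comes from the sgadmin output quoted in this thread.
LEGACY_PATTERN = re.compile(r"Legacy index\s+\S+\s+detected", re.IGNORECASE)

def has_legacy_index_warning(output):
    """Return True if any line of the captured output contains the warning."""
    return any(LEGACY_PATTERN.search(line) for line in output.splitlines())

if __name__ == "__main__":
    sample = "Legacy index 'searchguard' detected.\nSee the upgrade docs for more details."
    print(has_legacy_index_warning(sample))
```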
