When using Search Guard with a 3rd party Elasticsearch plugin I am hitting issues that might represent a security risk. At this point I am not sure whether the issue is caused by SG or by the 3rd party plugin, but the plugin is quite simple. Details, including code to reproduce the issue, follow (sorry for the long email…):
I am using the Prometheus exporter plugin; this plugin adds a new REST API to the ES node. I am aware that this plugin is not officially listed among the supported plugins, but first, I think it would be very nice to add it there (and I can volunteer to do the necessary PRs if needed), and second, the plugin code is so simple that I am afraid many other plugins may suffer from similar security issues.
I wanted to configure SG to control access to the Prometheus exporter REST API, which is exposed at /_prometheus/metrics on each ES node that has this plugin installed. After looking into the plugin code (examples of how the plugin makes requests to ES [3a, 3b]) and doing some experiments, I realized that SG requires a client accessing the ES node to have the following two permissions (cluster health and nodes stats):
(and in newer versions also indices stats… but let's leave this out for now)
So far so good. I created a new user that has the required permissions and, as expected, it can pull Prometheus metrics from the ES node. However, I also discovered that users that do not have the required permissions can pull metrics as well. The opposite also happens: clients with the appropriate permissions are rejected. As the following code example demonstrates, access to the Prometheus REST API may or may not work as expected depending on which user called the REST API before (but it is probably more complicated than that).
Here is the example code:
Using SG 2.4.4.x, ES 2.4.4.x
Imagine I define three SG internal users: ‘first_perm’, ‘second_perm’ and ‘both_perms’. The only user that should always be allowed to pull from the Prometheus REST API is ‘both_perms’. Each of the other two users is always missing either the first or the second of the two required permissions.
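The intended access rules for these three users can be written down as a small model. This is only a minimal Python sketch, not the actual Search Guard configuration: the labels `HEALTH` and `NODES_STATS` are hypothetical stand-ins for the two required permissions, and the assumption that ‘first_perm’ holds the cluster health one and ‘second_perm’ the nodes stats one is mine.

```python
# Hypothetical labels for the two permissions; these are NOT the real
# Search Guard action names.
HEALTH = "cluster_health"
NODES_STATS = "nodes_stats"

# Permissions held by each internal user. Which of the two permissions
# counts as "first" and which as "second" is an assumption.
USER_PERMISSIONS = {
    "first_perm": {HEALTH},
    "second_perm": {NODES_STATS},
    "both_perms": {HEALTH, NODES_STATS},
}

# Permissions each REST endpoint is expected to need.
ENDPOINT_REQUIRES = {
    "/_cluster/health": {HEALTH},
    "/_nodes/stats": {NODES_STATS},
    "/_prometheus/metrics": {HEALTH, NODES_STATS},
}


def expected_access(user, endpoint):
    """Return True if the user holds every permission the endpoint needs."""
    return ENDPOINT_REQUIRES[endpoint] <= USER_PERMISSIONS[user]
```

Under this model only ‘both_perms’ passes the check for /_prometheus/metrics, regardless of who made a request before.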
In the end I created a simple test script that demonstrates two different access cases.
Case 1:
user ‘first_perm’ calls the API first
then the ‘second_perm’ user
and ‘both_perms’ comes last
Case 2:
the order of users is reversed
first comes ‘both_perms’
As you can see, apart from the call to the Prometheus REST API, each user also calls “raw” cluster health and nodes stats to demonstrate whether it can access them as well.
The only difference between the code base in the two cases is the order in which the users make their calls.
Surprisingly, the output of both cases is dramatically different.
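The comparison between the two cases could be automated roughly as follows. This is a minimal Python sketch and not the real test.sh: `fetch(user, endpoint)` and `should_allow(user, endpoint)` are hypothetical caller-supplied functions, the order of endpoints per user is an assumption, and a call is treated as allowed when it returns HTTP 200.

```python
# User order in the two cases, as described above.
CASE_1 = ["first_perm", "second_perm", "both_perms"]
CASE_2 = list(reversed(CASE_1))

# Endpoints each user calls; the order within one user is an assumption.
ENDPOINTS = ["/_prometheus/metrics", "/_cluster/health", "/_nodes/stats"]


def run_case(users, fetch):
    """Call every endpoint as each user, keeping the given user order.

    `fetch(user, endpoint)` is a hypothetical helper that performs the
    HTTP request and returns the status code.
    """
    return [(user, ep, fetch(user, ep)) for user in users for ep in ENDPOINTS]


def unexpected(results, should_allow):
    """Return the calls whose outcome differs from the expected one.

    A call counts as allowed when its status code is 200 (assumption).
    """
    return [
        (user, ep, status)
        for user, ep, status in results
        if (status == 200) != should_allow(user, ep)
    ]
```

If access control did not depend on request order, `unexpected(run_case(CASE_1, fetch), should_allow)` and the same call with `CASE_2` would both return an empty list.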
What I found broken (and what I consider a security risk):
First of all, complete logs can be found here:
Case 1:
output of test.sh script (starting at line 2205)
Elasticsearch log (starting at line 2242)
Case 2:
output of test.sh script (starting at line 2203)
Elasticsearch log (starting at line 2255)
For case 1:
Both the ‘first_perm’ and ‘second_perm’ users are denied access to the Prometheus REST API (that is expected), and each can access one of cluster/health or nodes/stats accordingly. But the ‘both_perms’ user is not allowed to hit the Prometheus REST API (while at the same time it can hit both cluster/health and nodes/stats individually). Further inspection of the ES log reveals an interesting detail. When ‘both_perms’ makes a request to the Prometheus API, we can see that it is pulled from the internal backend (looks ok), but the permissions evaluated are those of the ‘second_perm’ user (!!!), so the request is rejected…
For case 2:
This time the ‘both_perms’ user comes first and it is allowed to access the Prometheus REST API (yea, cool!). Then the ‘second_perm’ user comes and it is also allowed to access it (!!!, it shouldn’t be). The same applies to the ‘first_perm’ user too (!!!). Further inspection of the ES log shows that both the ‘first_perm’ and the ‘second_perm’ user are evaluated as ‘both_perms’.
Complete code examples can be found at https://github.com/lukas-vlcek/es-sg-prometheus-integration in branches ‘case_1’ and ‘case_2’. For each commit, Travis runs the code (downloads ES, installs and configures SG and the Prometheus plugin, and runs test.sh). The logs can then be fully inspected on Travis. This can also be run locally (requires Docker).
My main questions:
Is this an issue with SG or with the Prometheus plugin?
Am I doing anything wrong in my code?