feat(Backend): Add Opensearch as a backend provider
Showing 13 changed files with 447 additions and 3 deletions.
@@ -0,0 +1,97 @@
# Opensearch

## Backend

Using the `opensearch` backend class, you can query any metrics available in Opensearch to create an SLO.

```yaml
backends:
  opensearch:
    url: ${OPENSEARCH_URL}
```

Note that `url` can be either a single string (when connecting to a single node) or a list of strings (when connecting to multiple nodes):

```yaml
backends:
  opensearch:
    url: https://localhost:9200
```

```yaml
backends:
  opensearch:
    url:
      - https://localhost:9200
      - https://localhost:9201
```
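
As a rough illustration of how the `url` setting could reach the client, the sketch below maps a backend configuration to client keyword arguments. The helper name `client_kwargs` is hypothetical; the only behavior assumed is that `url` is handed over unchanged as `hosts`, whether it is a single string or a list.

```python
import copy


def client_kwargs(backend_config):
    """Return keyword arguments for the search client (hypothetical helper)."""
    conf = copy.deepcopy(backend_config)  # leave the caller's config untouched
    url = conf.pop("url", None)
    if url:
        # A single string and a list of strings are both passed through as-is.
        conf["hosts"] = url
    return conf


single = client_kwargs({"url": "https://localhost:9200"})
multi = client_kwargs(
    {"url": ["https://localhost:9200", "https://localhost:9201"]}
)
```

Passing the value through without normalizing it keeps the configuration close to what the client itself accepts.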

The following method is available to compute SLOs with the `opensearch` backend:

* The `good_bad_ratio` method computes the ratio between two metrics:

  * **Good events**, i.e. events we consider as 'good' from the user perspective.
  * **Bad or valid events**, i.e. events we consider either as 'bad' from the user perspective, or all events we consider as 'valid' for the computation of the SLO.

This method is often used for availability SLOs, but can be used for other purposes as well (see examples).
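
To make the ratio concrete, here is a minimal sketch of how a good/bad pair of counts could be turned into an SLI value. The function name `sli_from_counts` and the event counts are made up for illustration; they are not part of the backend.

```python
def sli_from_counts(good_count, bad_count):
    """Return good / (good + bad), or None when there are no events."""
    total = good_count + bad_count
    if total == 0:
        return None  # no events in the window, so no ratio can be computed
    return good_count / total


# Illustrative numbers: 990 good events and 10 bad events in the window.
sli = sli_from_counts(good_count=990, bad_count=10)
meets_goal = sli >= 0.99  # compare against an SLO goal of 99%
```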

**SLO example:**

```yaml
backend: opensearch
method: good_bad_ratio
service_level_indicator:
  index: my-index
  date_field: '@timestamp'
  query_good:
    must:
      range:
        api-response-time:
          lt: 350
  query_bad:
    must:
      range:
        api-response-time:
          gte: 350
```

Additional info:

* `date_field`: the document field used to filter events by time. It has to be mapped as an Opensearch date type (for example `date`).

**→ [Full SLO config](../../samples/opensearch/slo_opensearch_latency_sli.yaml)**

You can also use the `query_valid` field, which identifies all valid events, instead of the `query_bad` field, which identifies bad events.
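
The relationship between the two optional fields can be sketched as follows, assuming the event counts have already been fetched. The function name `bad_count` and its parameters are hypothetical; the logic mirrors the idea that bad events are either counted directly or derived as valid events minus good events.

```python
def bad_count(good_count, bad_hits=None, valid_hits=None):
    """Return the number of bad events from whichever count is available."""
    if bad_hits is not None:
        # `query_bad` was configured: its hit count is the bad event count.
        return bad_hits
    if valid_hits is not None:
        # `query_valid` was configured: bad = valid - good.
        return valid_hits - good_count
    raise ValueError("`query_bad` or `query_valid` is required.")


direct = bad_count(good_count=90, bad_hits=10)
derived = bad_count(good_count=90, valid_hits=100)
```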

The query clauses entered in the `query_good`, `query_bad` or `query_valid` fields are wrapped in a `bool` query, together with a `range` filter on the `window` specified in your Error Budget Policy steps.

The full `Opensearch` query body for the `query_bad` above will therefore look like:

```json
{
  "query": {
    "bool": {
      "must": {
        "range": {
          "api-response-time": {
            "gte": 350
          }
        }
      },
      "filter": {
        "range": {
          "@timestamp": {
            "gte": "now-3600s/s",
            "lt": "now/s"
          }
        }
      }
    }
  },
  "track_total_hits": true
}
```
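
The body above can be reproduced with a few lines of Python. This is a simplified sketch, assuming a one-hour window and a query without a `filter` clause of its own; the standalone `build_query` function here is illustrative and does not import the backend.

```python
import json


def build_query(query, window, date_field):
    """Wrap `query` in a bool query and add a time range filter."""
    # Shallow-copy the clauses so the caller's dict is left unchanged.
    body = {"query": {"bool": dict(query)}, "track_total_hits": True}
    body["query"]["bool"]["filter"] = {
        "range": {date_field: {"gte": f"now-{window}s/s", "lt": "now/s"}}
    }
    return body


query_bad = {"must": {"range": {"api-response-time": {"gte": 350}}}}
body = build_query(query_bad, window=3600, date_field="@timestamp")
print(json.dumps(body, indent=2))
```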

### Examples

Complete SLO samples using the `opensearch` backend are available in [samples/opensearch](../../samples/opensearch). Check them out!
@@ -0,0 +1,25 @@
apiVersion: sre.google.com/v2
kind: ServiceLevelObjective
metadata:
  name: open-search-availability
  labels:
    service_name: opensearch
    feature_name: opensearch-availability
    slo_name: availability
spec:
  description: 99% of the elements are valid
  backend: opensearch
  method: good_bad_ratio
  exporters: []
  service_level_indicator:
    index: my-index
    date_field: '@timestamp'
    query_good:
      must:
        term:
          status: 200
    query_bad:
      must_not:
        term:
          status: 200
  goal: 0.99
@@ -0,0 +1,27 @@
apiVersion: sre.google.com/v2
kind: ServiceLevelObjective
metadata:
  name: open-search-latency
  labels:
    service_name: opensearch
    feature_name: opensearch-latency
    slo_name: latency
spec:
  description: 99% of the elements are valid
  backend: opensearch
  method: good_bad_ratio
  exporters: []
  service_level_indicator:
    index: my-index
    date_field: '@timestamp'
    query_good:
      must:
        range:
          api-response-time:
            lt: 350
    query_bad:
      must:
        range:
          api-response-time:
            gte: 350
  goal: 0.99
@@ -0,0 +1,143 @@
"""
`opensearch.py`
Opensearch backend implementation.
"""

import copy
import logging

from opensearchpy import OpenSearch

from slo_generator.constants import NO_DATA

LOGGER = logging.getLogger(__name__)


class OpensearchBackend:
    """Backend for querying metrics from OpenSearch.
    Args:
        client(opensearchpy.OpenSearch): Existing OS client.
        os_config(dict): OS client configuration.
    """

    def __init__(self, client=None, **os_config):
        self.client = client
        if self.client is None:
            conf = copy.deepcopy(os_config)
            url = conf.pop("url", None)
            basic_auth = conf.pop("basic_auth", None)
            api_key = conf.pop("api_key", None)
            if url:
                conf["hosts"] = url
            if basic_auth:
                conf["basic_auth"] = (basic_auth["username"], basic_auth["password"])
            if api_key:
                conf["api_key"] = (api_key["id"], api_key["value"])

            self.client = OpenSearch(**conf)

    # pylint: disable=unused-argument
    def good_bad_ratio(self, timestamp, window, slo_config):
        """Query two timeseries, one containing 'good' events, one containing
        'bad' events.
        Args:
            timestamp(int): UNIX timestamp.
            window(int): Window size (in seconds).
            slo_config(dict): SLO configuration.
                spec:
                    method: "good_bad_ratio"
                    service_level_indicator:
                        query_good(dict): the search query to look for good events
                        query_bad(dict): the search query to look for bad events
                        query_valid(dict): the search query to look for valid events
        Returns:
            tuple: good_event_count, bad_event_count
        """
        measurement = slo_config["spec"]["service_level_indicator"]
        index = measurement["index"]
        query_good = measurement["query_good"]
        query_bad = measurement.get("query_bad")
        query_valid = measurement.get("query_valid")
        date_field = measurement.get("date_field")

        good = OS.build_query(query_good, window, date_field)
        bad = OS.build_query(query_bad, window, date_field)
        valid = OS.build_query(query_valid, window, date_field)

        good_events_count = OS.count(self.query(index, good))

        if query_bad is not None:
            bad_events_count = OS.count(self.query(index, bad))
        elif query_valid is not None:
            bad_events_count = OS.count(self.query(index, valid)) - good_events_count
        else:
            raise ValueError("`query_bad` or `query_valid` is required.")

        return good_events_count, bad_events_count

    def query(self, index, body):
        """Query Opensearch server.
        Args:
            index(str): Index to query.
            body(dict): Query body.
        Returns:
            dict: Response.
        """
        return self.client.search(index=index, body=body)

    @staticmethod
    def count(response):
        """Count events in an Opensearch response.
        Args:
            response(dict): Opensearch query response.
        Returns:
            int: Event count.
        """
        try:
            return response["hits"]["total"]["value"]
        except KeyError as exception:
            LOGGER.warning("Couldn't find any values in timeseries response")
            LOGGER.debug(exception, exc_info=True)
            return NO_DATA

    @staticmethod
    def build_query(query, window, date_field):
        """Build Opensearch query.
        Add window to existing query.
        Args:
            query(dict): Existing query body.
            window(int): Window in seconds.
            date_field(str): Field to filter time on.
        Returns:
            dict: Query body with range clause added.
        """
        if query is None:
            return None
        # Copy the user query so the SLO config is not mutated between
        # error budget steps; each call then starts from the original clauses.
        body = {"query": {"bool": copy.deepcopy(query)}, "track_total_hits": True}
        range_filter = {
            "range": {
                date_field: {
                    "gte": f"now-{window}s/s",
                    "lt": "now/s",
                }
            }
        }

        bool_query = body["query"]["bool"]
        existing = bool_query.get("filter")
        if existing is None:
            bool_query["filter"] = range_filter
        elif isinstance(existing, list):
            bool_query["filter"] = existing + [range_filter]
        else:
            # A single user-supplied filter clause: keep it next to the window.
            bool_query["filter"] = [existing, range_filter]

        return body


OS = OpensearchBackend
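
For readers who want to see what `count` works with, the sketch below runs the same lookup against a hand-written response. It is a standalone approximation: the local `NO_DATA = -1` is an assumed stand-in for `slo_generator.constants.NO_DATA`, and both sample responses are made up for illustration.

```python
NO_DATA = -1  # assumption: stand-in for slo_generator.constants.NO_DATA


def count(response):
    """Return the total hit count, or NO_DATA if the response lacks it."""
    try:
        return response["hits"]["total"]["value"]
    except KeyError:
        return NO_DATA


# Shape of a search response when `track_total_hits` is enabled.
response = {"hits": {"total": {"value": 42, "relation": "eq"}, "hits": []}}
found = count(response)

# A response without a `hits` section falls back to NO_DATA.
missing = count({"took": 3})
```

Setting `track_total_hits` in the query body matters here, because without it the reported total may be capped rather than exact.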