Generic method for passing data for embeddable charts


Problem to solve

Users typically have a suite of tools that they use to monitor their tech stack. In addition, many of our customers use tools other than Prometheus for time-series monitoring. Time-series data is best consumed visually in a chart, so we want to provide an intuitive way for our customers to visualize data from any tool in a GitLab issue.

Intended users

Sasha the Software Developer
Devon the DevOps Engineer
Sidney the Systems Administrator

Further details

This work contributes to the Incident Management Vision

Proposal

Conclusions

Overall Goal

The goal is to create a generic flow for users to generate GitLab charts from their datasources.

Sample use case

Users rely on different monitoring tools that produce data. At some point they might need to visualize this data in GitLab, e.g. by embedding a chart into an issue description or a comment.

Problem

GitLab charts consume data in a predefined format. To visualize user data, we need a way to transform it into a format our charts can digest. Because users rely on different systems to generate their data, the data formats differ. We need to find a generic way for users to:

Authenticate against their datasource

Transform their data into the predefined chart format

Summary:

Option 1: Even though we want a more generic way, it's good to have a safe option. The most straightforward approach is to create a dedicated adapter for each datasource. From a developer’s standpoint, this would be a DataSourceClass inheriting from some BasicAdapterClass and defining the authentication fields, data transformation, validation rules and chart option rules for that specific datasource. The flow would be similar to adding Project Integrations.
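To make the adapter idea more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names, the `auth_fields` and `chart_options` attributes, the imaginary datasource payload, and the series shape (`name` plus a list of `[x, y]` pairs). It is not an existing GitLab interface.

```python
from abc import ABC, abstractmethod
from typing import Any


class BasicAdapterClass(ABC):
    """Hypothetical base class: each datasource adapter declares its own rules."""

    # Names of the credential fields the settings form would ask for.
    auth_fields: tuple = ()
    # Default chart options (axis labels, etc.) for this datasource.
    chart_options: dict = {}

    @abstractmethod
    def transform(self, raw: dict) -> list:
        """Convert the datasource response into chart series."""

    def validate(self, series: list) -> bool:
        """Check that every series has a string name and [x, y] pairs."""
        return all(
            isinstance(s.get("name"), str)
            and all(len(point) == 2 for point in s.get("data", []))
            for s in series
        )


class ExampleDatasourceAdapter(BasicAdapterClass):
    """Adapter for an imaginary datasource returning {"points": [{"t": ..., "v": ...}]}."""

    auth_fields = ("api_token",)
    chart_options = {"x_axis_label": "Time", "y_axis_label": "Value"}

    def transform(self, raw: dict) -> list:
        data = [[p["t"], p["v"]] for p in raw["points"]]
        return [{"name": raw.get("name", "series"), "data": data}]


adapter = ExampleDatasourceAdapter()
payload: Any = {"name": "cpu", "points": [{"t": 1, "v": 0.5}, {"t": 2, "v": 0.7}]}
series = adapter.transform(payload)
is_valid = adapter.validate(series)
```

Each new datasource would add one subclass like `ExampleDatasourceAdapter`, which is where the per-datasource engineering effort comes from.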

Pros:

It's a reliable way. No need for smart guessing or complicated logic. Rules are strict and specific.

Grafana extends itself through data source plugins, so this is an already verified solution.

Reusable. Once an adapter is created, it can be used by any user.

Cons:

Someone actually has to create these adapters. It can be done on GitLab’s side or by contributors extending our codebase.

It’s not very “generic”. On our side we can create a couple of adapters for the most commonly used datasources, but each additional case would still require engineering effort.

Option 2: Create a user interface that allows users to add as many different datasources for charts as they want

Assumption: for starters we’ll assume that each datasource can provide data in JSON format, and limit charts to line & area as they require data in pretty much the same format

Data Transformation

For a proof of concept, a basic UI with 2 user inputs (sample JSON and transformation rules) was created. When the user clicks the convert JSON button, their JSON is converted with the provided rules and sample charts are rendered.

To describe data transformation rules, the https://www.npmjs.com/package/awesome-json2json mapper tool was selected. It allows mapping field to field, providing default values, formatter functions, etc. There are other tools out there, but this one seemed the most intuitive to start with. It simply proves that we can transform user data into any desired structure. We can investigate other tools in the scope of a separate task. The mapper function would be saved on the BE side with some identifier and used for future data transformations.
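As a rough illustration of what such mapping rules do (field-to-field mapping, default values, formatter functions), here is a small Python sketch. It does not reproduce the awesome-json2json API; the rule shape (`path`, `default`, `formatter`) and the function names are hypothetical.

```python
from typing import Any


def get_path(obj: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as "meta.host" through nested dicts."""
    current = obj
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def apply_template(source: dict, template: dict) -> dict:
    """Build a new dict by resolving every rule in the template against source.

    A rule is either a dotted path string, or a dict with a "path" and
    optional "default" and "formatter" entries (hypothetical rule shape).
    """
    result = {}
    for target_key, rule in template.items():
        if isinstance(rule, str):
            result[target_key] = get_path(source, rule)
            continue
        value = get_path(source, rule["path"], rule.get("default"))
        formatter = rule.get("formatter", lambda v: v)
        result[target_key] = formatter(value)
    return result


sample = {"meta": {"host": "web-1"}, "cpu": "0.42"}
template = {
    "name": "meta.host",                            # plain field-to-field mapping
    "value": {"path": "cpu", "formatter": float},   # convert the string to a number
    "unit": {"path": "meta.unit", "default": "%"},  # fall back when the field is missing
}
mapped = apply_template(sample, template)
```

A declarative template like this, unlike a free-form function, can be stored as data and evaluated without executing user-supplied code, provided the formatters come from a fixed list.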

Questions/ Concerns:

Creating a mapping function does require some engineering involvement. As a future improvement we could create, for instance, a drag&drop UI to map field to field without any coding. Should we actually do that, or would a little coding do no harm and probably be more flexible?

Besides, allowing users to add code that would actually be executed might raise security concerns. +1 for a UI where we generate the formatting rules ourselves.

Before creating transformation rules we need to know what kind of chart we want to get, as each chart has a different input data format. There should be a step where we ask the user what kind of chart they want.

Our charts accept a number of rendering options, e.g. axis label, axis data type, formatter, colors, etc. Should we provide the user with inputs for basic config? Datasources themselves rarely contain chart config options.

A chart datasource might have thousands of entries, which would probably cause performance issues. Should we limit datasource size?
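On the last question, one possible way to cap the size is to downsample before rendering rather than reject the datasource. The sketch below keeps evenly spaced points; the function name and the choice of strategy are assumptions, and other options (rejecting large responses, averaging buckets) would work too.

```python
def limit_points(points: list, max_points: int) -> list:
    """Return at most max_points items, keeping evenly spaced samples."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if len(points) <= max_points:
        return list(points)
    # Pick one sample per equally sized slice of the input.
    step = len(points) / max_points
    return [points[int(i * step)] for i in range(max_points)]


small = limit_points(list(range(10)), 5)
large = limit_points(list(range(10000)), 1000)
```

Plain striding can hide short spikes, so it is only a starting point for discussion.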

Investigating sample datasources for auth types and json data availability

As a next step, to check that we can request sample data from a 3rd party, @ck3g created an endpoint that accepts a url parameter and requests data from the provided URL. An input for the URL was added on the FE side.

Datasources test:

Prometheus

API: response format is JSON, example request: http://localhost:9090/api/v1/query?query=up&time=2015-07-01T20:10:51.781Z

Auth: By default doesn't have basic auth; users implement auth for each instance. The test request was made unauthenticated against the demo Prometheus server. Details...
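For reference, a Prometheus query response wraps its samples in `data.result`, with a single `value` pair for instant queries and a `values` list for range queries. The sketch below shows how such a response could be mapped to chart series. The target series shape (`name` plus `[timestamp, value]` pairs) and the function name are assumptions, and the sample payload is hand-written rather than captured from a real server.

```python
def prometheus_to_series(response: dict) -> list:
    """Convert a Prometheus query response into [{"name", "data": [[ts, value], ...]}]."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    series = []
    for item in response["data"]["result"]:
        # Instant queries ("vector") carry one sample under "value";
        # range queries ("matrix") carry a list of samples under "values".
        samples = item.get("values") or [item["value"]]
        labels = item.get("metric", {})
        name = ",".join(f"{k}={v}" for k, v in sorted(labels.items())) or "series"
        # Sample values arrive as strings, so convert them to floats.
        data = [[ts, float(value)] for ts, value in samples]
        series.append({"name": name, "data": data})
    return series


sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "prometheus"},
                "value": [1435781451.781, "1"],
            }
        ],
    },
}
series = prometheus_to_series(sample_response)
```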


InfluxDB

API: response format is JSON, example request: https://hueylewis-d0e61193.influxcloud.net:8086/query?pretty=true&u=sharlatenok@gmail.com&p=pass&db=test&q=SELECT "value" FROM "some"."autogen"."cpu"

Auth: By default doesn't have basic auth. The test request was made using user/password as query params against my InfluxDB cloud instance. Supports HTTP authentication, JWT Tokens, and basic authentication. Details...

Grafana

API: response format is JSON, example request: /api/datasources/:datasourceId.

Auth: Currently you can authenticate via an API Token or via a Session cookie (acquired using regular login or oauth), basic auth if enabled. Details...

Graphite

API: response format can be JSON; other formats are supported, and svg from the render API might be interesting. Metrics API

Auth: Didn't find related info

Splunk

API: With few exceptions, REST API responses use the Atom Syndication Format, known as an Atom Feed (basically XML).

Auth: Username and password, Splunk authentication tokens, enterprise edition supports basic auth. Details...

Sumo Logic

API: response is JSON. The Search Job API seems the most suitable for rendering charts, though it's just an assumption

Auth: Access ID and access key, Base64 encoded access id and access key. Details

Sentry

API: JSON, API List

Auth: API Key, Auth Token. Some endpoints support DSN Authentication. Details...

Datadog

API: JSON format. Query timeseries

Auth: Data retrieval requires API key + App key. Details...

Loggly

API: JSON, Retrieving data API

Auth: Username + password or API token. Details...

Fluentd

API: JSON Monitoring API

Auth: Didn't find related info

To sum up, all 10 services provide an API for querying data; 9 out of 10 return responses in JSON format and 1 (Splunk) in XML.

Authentication types: No auth, Basic Auth, JWT tokens, API key (API Token), Session cookie (acquired using regular login or oauth), username + password, Access ID + access key, API Key + App Key. Some support several methods of authentication for their API.
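Most of these authentication types boil down to attaching either headers or query parameters to the request. A sketch of how the selected auth type could be turned into request parts follows. The auth type names and credential keys are hypothetical, and session cookies are left out because they need a login round trip first.

```python
import base64


def build_auth(auth_type: str, credentials: dict) -> dict:
    """Return {"headers": ..., "params": ...} to attach to a datasource request."""
    headers = {}
    params = {}
    if auth_type == "none":
        pass
    elif auth_type == "basic":
        # Standard HTTP Basic auth: base64("user:password").
        raw = f"{credentials['username']}:{credentials['password']}".encode()
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode()
    elif auth_type == "bearer":
        # Covers JWT and other tokens sent as a bearer token.
        headers["Authorization"] = f"Bearer {credentials['token']}"
    elif auth_type == "header_keys":
        # For services with custom key headers: the user supplies names and values.
        headers.update(credentials)
    elif auth_type == "query_params":
        # Credentials passed in the URL, as in the InfluxDB test request.
        params.update(credentials)
    else:
        raise ValueError(f"Unsupported auth type: {auth_type}")
    return {"headers": headers, "params": params}


basic = build_auth("basic", {"username": "user", "password": "pass"})
query = build_auth("query_params", {"u": "user", "p": "pass"})
```

Pairs such as Access ID + access key or API key + App key would fit the `basic` or `header_keys` branches depending on how the service expects them to be sent.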

How all that can work together:

User navigates to "Chart datasource settings page"

The UI might resemble the images above

Selects authentication type (from provided options)

Enters auth info dependent on auth type selected in previous step

Enters sample datasource url

GitLab tries to retrieve data with the sample URL and auth data

When data is retrieved successfully, the user selects the type of chart and we show them the JSON format required for chart rendering (optionally chart options: axis labels, colors, ...) plus a sample mapper function (to be edited by the user).

When the mapper is updated, we transform the data, validate the format and render a sample chart if possible.

User enters "DatasourceTemplateID", clicks "Save credentials and mapper fn".

For embedding we can use the usual flow: /embed [list of saved DatasourceTemplateIDs] url

We request data via the provided URL, transform it, and render the chart if possible.
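The flow above (fetch with saved credentials, apply the saved mapper, validate, then render) can be sketched as a small pipeline. The `DatasourceTemplate` record, its fields, the series shape and the injected `fetch` function are all assumptions for illustration. The HTTP call is stubbed so the example runs without a network.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DatasourceTemplate:
    """Hypothetical saved record: where to fetch data and how to map it."""
    template_id: str
    url: str
    auth: dict
    mapper: Callable[[Any], list]


def is_valid_series(series: Any) -> bool:
    """Accept a non-empty list of {"name": str, "data": [[x, y], ...]} items."""
    if not isinstance(series, list) or not series:
        return False
    for item in series:
        if not isinstance(item, dict) or not isinstance(item.get("name"), str):
            return False
        data = item.get("data")
        if not isinstance(data, list):
            return False
        if not all(isinstance(p, (list, tuple)) and len(p) == 2 for p in data):
            return False
    return True


def build_chart_data(template: DatasourceTemplate, fetch: Callable[[str, dict], Any]) -> list:
    """Fetch raw data, transform it with the saved mapper, and validate the result."""
    raw = fetch(template.url, template.auth)
    series = template.mapper(raw)
    if not is_valid_series(series):
        raise ValueError(f"Mapper for {template.template_id} produced an invalid chart format")
    return series


def fake_fetch(url: str, auth: dict) -> Any:
    # Stand-in for the backend endpoint that requests data from the provided URL.
    return {"rows": [{"ts": 1, "cpu": 0.5}, {"ts": 2, "cpu": 0.7}]}


template = DatasourceTemplate(
    template_id="cpu-usage",
    url="https://example.com/metrics",
    auth={"token": "secret"},
    mapper=lambda raw: [{"name": "cpu", "data": [[r["ts"], r["cpu"]] for r in raw["rows"]]}],
)
chart_data = build_chart_data(template, fake_fetch)
```

Validation happens after the mapper so that a broken or outdated mapper surfaces as a clear error instead of an empty chart.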

Pros

Flexible and generic given that most datasources support JSON API

Cons

Has many corner cases to consider

Requires engineering involvement on the user's side if we do not create some intuitive UI for data mapping

Security

FYI: @ClemMakesApps @sarahwaldner