Endpoints
The Edge Analytics API is a versioned JSON API whose endpoints are organized into two main groups:
`v1/`
: grouping all the new metrics endpoints along with the query endpoint.

`legacy/`
: containing all endpoints ported from the Legacy Stats API to ease the migration to the Edge Analytics API. More info about the migration process here.
This section of the documentation focuses on the new set of endpoints (under `v1/`) because the legacy set of endpoints inherits its interface from the Legacy Stats API.
API base URL: https://api.system73.com/analytics/
Remember
The API Swagger/OpenAPI reference is available at https://api.system73.com/analytics/docs/.
Metrics endpoints (`v1/metrics/`)
These endpoints share the same interface and capabilities: each of them provides a metric and lets you filter the results with a predefined set of filters.
Data filters
You can filter metric results by the properties described below. These filters are optional, and any combination of them can be applied.
Time interval
You must specify the timeframe of the metric to be calculated with the parameters `from_date` and `to_date`. The time boundaries are strings containing a date and time in ISO-8601 format with a precision of up to seconds. Microseconds are truncated from the input.
Valid inputs

- `"2012-01-01T10:12:30.000Z"`
- `"2012-01-01T10:12:30"`
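The seconds-precision rule can also be applied client-side before sending a request. A minimal sketch (not part of the API itself) showing how a client might normalize boundaries the same way the API does:

```python
from datetime import datetime

def normalize_boundary(value: str) -> str:
    """Parse an ISO-8601 boundary and drop sub-second precision,
    mirroring the truncation the API applies to its input."""
    # Replace a trailing 'Z' so fromisoformat() accepts it on Python < 3.11.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return dt.replace(microsecond=0).isoformat()

print(normalize_boundary("2012-01-01T10:12:30.000Z"))  # 2012-01-01T10:12:30+00:00
print(normalize_boundary("2012-01-01T10:12:30"))       # 2012-01-01T10:12:30
```

Both valid inputs above normalize to the same seconds-precision form.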
Granularity
You can choose any query granularity from the list of supported granularities:

- `second`
- `minute`
- `fifteen_minute`
- `thirty_minute`
- `hour`
- `day`
- `week`
- `month`
- `quarter`
- `year`
- `all` (buckets everything into a single time bucket)
However, depending on the data layer, the following restrictions apply:

- Realtime: you can use granularities from `second` to `thirty_minute` for time intervals of up to 48h.
- Historical: the finest supported granularity is `minute` because of the characteristics of the historical datasources.
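The restrictions above can be expressed as a small validation helper. The granularity names and the 48h limit come from this page; everything else in this sketch (including reading the realtime rule as allowing only `second` through `thirty_minute`) is an assumption:

```python
from datetime import timedelta

# Granularities ordered from finest to coarsest, as listed above.
GRANULARITIES = ["second", "minute", "fifteen_minute", "thirty_minute",
                 "hour", "day", "week", "month", "quarter", "year", "all"]

def validate(layer: str, granularity: str, interval: timedelta) -> bool:
    """Return True if the (layer, granularity, interval) combination
    satisfies the documented restrictions."""
    idx = GRANULARITIES.index(granularity)
    if layer == "realtime":
        # second .. thirty_minute, for intervals of up to 48h.
        return (idx <= GRANULARITIES.index("thirty_minute")
                and interval <= timedelta(hours=48))
    if layer == "historical":
        # The finest supported granularity is minute.
        return idx >= GRANULARITIES.index("minute")
    raise ValueError(f"unknown data layer: {layer}")
```

For example, `validate("realtime", "second", timedelta(hours=1))` passes, while `validate("historical", "second", timedelta(days=30))` does not.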
Data layer
You must choose the data layer (`realtime` or `historical`) depending on the granularity and the query time interval restrictions.
Take into account the supported query granularity and data retention of each data layer, as described in the Architecture section.
Tip
The performance of a query is always affected by the combination of data layer, time interval and granularity. As a rule of thumb, fine query granularities should be used on short time intervals, while longer time intervals benefit from coarser granularities.
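Putting the filters together, a request to a metrics endpoint might be assembled as below. The `from_date`/`to_date` parameters are documented above; the metric path (`buffer_health`) and the `granularity`/`data_layer` parameter names are illustrative assumptions — check the OpenAPI reference for the actual names:

```python
from urllib.parse import urlencode

BASE_URL = "https://api.system73.com/analytics/"

def build_metrics_url(metric: str, from_date: str, to_date: str,
                      granularity: str, data_layer: str) -> str:
    # Hypothetical query-string layout; the real parameter names are
    # defined in the OpenAPI reference at /analytics/docs/.
    params = urlencode({
        "from_date": from_date,
        "to_date": to_date,
        "granularity": granularity,
        "data_layer": data_layer,
    })
    return f"{BASE_URL}v1/metrics/{metric}?{params}"

url = build_metrics_url("buffer_health", "2022-12-30T00:00:00",
                        "2022-12-31T00:00:00", "minute", "realtime")
```

Note the combination respects the restrictions above: a `minute` granularity over a 24h interval is valid on the realtime layer.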
Query endpoint (`v1/query/sql`)
This endpoint allows you to get metrics from custom SQL queries using the constructs supported by the Druid SQL dialect. In general, this dialect offers much of the same functionality you would get from traditional relational databases, but it is highly recommended to read the linked documentation to become familiar with Druid specifics.
Requirements
Additionally, a series of restrictions must be followed in order to produce valid queries:
- Schema fields

  You can only reference datasource fields belonging to your Analytics tier. Check the Schema section of the Data Model for more information.

  Example

  A customer in the Lite tier cannot issue the following query because the field `ConnectionType` belongs to the Advanced tier.

  ```sql
  SELECT TIME_FLOOR(__time, 'PT1M') AS "timestamp",
         AVG(BufferHealth) AS "avg_buffer_health"
  FROM "sdk-emea-realtime"
  WHERE CustomerId = '1234'
    AND ConnectionType = 'ETHERNET'
    AND __time >= '2022-12-30'
    AND __time < '2022-12-31'
  GROUP BY 1
  ```
- `CustomerId` and region

  Queries must always filter by your assigned `CustomerId`. Also, you need to select the datasource that corresponds to your data deployment region.

  Example

  ```sql
  SELECT TIME_FLOOR(__time, 'P1D') AS "timestamp",
         AVG(BufferHealth) AS "avg_buffer_health"
  FROM "sdk-americas-historical-hour"
  WHERE CustomerId = '1234'
    AND __time >= '2022-01-01T00:00:00'
    AND __time < '2022-04-01T00:00:00'
  GROUP BY 1
  ```
Important
Please send an email to support@system73.com to request your assigned `CustomerId` and corresponding region.
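A client-side sketch of issuing such a query follows. The endpoint path comes from this page; the JSON payload shape (`{"query": ...}`) and the bearer-token header are assumptions — consult the OpenAPI reference for the exact request format:

```python
import json
from urllib.request import Request

BASE_URL = "https://api.system73.com/analytics/"

def build_sql_request(sql: str, token: str) -> Request:
    # Hypothetical payload/auth shape; verify against the OpenAPI docs.
    body = json.dumps({"query": sql}).encode()
    return Request(
        BASE_URL + "v1/query/sql",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )

sql = """
SELECT TIME_FLOOR(__time, 'P1D') AS "timestamp",
       AVG(BufferHealth) AS "avg_buffer_health"
FROM "sdk-americas-historical-hour"
WHERE CustomerId = '1234'
  AND __time >= '2022-01-01T00:00:00' AND __time < '2022-04-01T00:00:00'
GROUP BY 1
"""
req = build_sql_request(sql, token="<your-token>")
```

Note the query satisfies both requirements: it filters by `CustomerId` and targets the datasource of the assumed deployment region.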
Best practices
As in other big data solutions, the following recommendations will help you make the most of your queries while ensuring acceptable performance levels and resource usage:
- Avoid `SELECT *` or including many fields.
- Always include a filter on `__time` to limit the amount of data that needs to be scanned.
- Avoid queries spanning large intervals of time (e.g., >48h) on realtime datasources.
- Avoid fine granularities on large time interval queries.
- Try to avoid subqueries underneath joins: they affect both performance and scalability.
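The first two recommendations are easy to check mechanically before submitting a query. A naive string-based sketch (not a SQL parser, and not part of the API):

```python
def lint_query(sql: str) -> list[str]:
    """Flag obvious violations of the best practices above.
    A simple substring check, illustrative only."""
    warnings = []
    if "select *" in sql.lower():
        warnings.append("avoid SELECT *")
    if "__time" not in sql:
        warnings.append("add a __time filter to limit scanned data")
    return warnings

print(lint_query('SELECT * FROM "sdk-emea-realtime"'))
# ['avoid SELECT *', 'add a __time filter to limit scanned data']
```

A real pre-flight check would also need the time interval and granularity to enforce the remaining recommendations.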