Commit 1d144cf8 authored by Thorsten Simons

1.4.2 - added queries related to Tenant / Namespace / protocol

parent 69f7a002
Release History
===============
**1.4.2 2019-01-11**
* added queries related to Tenant / Namespace / protocol
**1.4.1 2019-01-04**
* very minor cosmetic changes to the result XLSX file (index sheet)
@@ -8,26 +8,31 @@ Built-in queries
$ hcprequestanalytics showqueries
available queries:
500_highest_throughput The 500 records with the highest throughput (Bytes/sec) for objects >= 1 Byte
500_largest The records with the 500 largest requests
500_worst_latency The records with the 500 worst latencies
clientip No. of records per client IP address
clientip_httpcode No. of records per http code per client IP address
clientip_request_httpcode No. of records per http code per request per client IP address
count No. of records, overall
day No. of records per day
day_hour No. of records per hour per day
day_hour_req No. of records per request per hour per day
day_req No. of records per request per day
day_req_httpcode No. of records per http code per request per day
node No. of records per node
node_req No. of records per request per node
node_req_httpcode No. of records per http code per request per node
percentile_req No. of records per request analysis, including percentiles for size and latency
percentile_throughput_b No. of records per request, with percentiles on throughput (Bytes/sec) for objects >= 10MB
req No. of records per request
req_httpcode No. of records per http code per request
req_httpcode_node No. of records per node per http code per request
500_highest_throughput The 500 records with the highest throughput (Bytes/sec)
500_largest_req_httpcode_node The records with the 500 largest requests by req, httpcode, node
500_largest_size The records with the 500 largest requests sorted by size
500_worst_latency The records with the 500 worst latencies
clientip No. of records per client IP address
clientip_httpcode No. of records per http code per client IP address
clientip_request_httpcode No. of records per http code per request per client IP address
count No. of records, overall
day No. of records per day
day_hour No. of records per hour per day
day_hour_req No. of records per request per hour per day
day_req No. of records per request per day
day_req_httpcode No. of records per http code per request per day
node No. of records per node
node_req No. of records per request per node
node_req_httpcode No. of records per http code per request per node
percentile_req No. of records per request analysis, including percentiles for size and latency
percentile_throughput_128kb No. of records per request, with percentiles on throughput (Bytes/sec) for objects >= 128KB
req No. of records per request
req_httpcode No. of records per http code per request
req_httpcode_node No. of records per node per http code per request
ten_ns_proto_httpcode No. of records per Tenant / Namespace / protocol / http code
ten_ns_proto_percentile_req No. of records per Tenant / Namespace / protocol, including percentiles for size and latency
ten_ns_proto_user_httpcode No. of records per Tenant / Namespace / protocol / user / http code
ten_proto_httpcode No. of records per Tenant / protocol / http code
.. Tip::
@@ -75,26 +80,34 @@ You can check the available queries, including the additional ones::
$ hcprequestanalytics -d dbfile.db -a addqueries showqueries
available queries:
500_highest_throughput The 500 records with the highest throughput (Bytes/sec) for objects >= 1 Byte
500_largest The records with the 500 largest requests
500_worst_latency The records with the 500 worst latencies
clientip No. of records per client IP address
clientip_httpcode No. of records per http code per client IP address
clientip_request_httpcode No. of records per http code per request per client IP address
count No. of records, overall
day No. of records per day
day_hour No. of records per hour per day
day_hour_req No. of records per request per hour per day
day_req No. of records per request per day
day_req_httpcode No. of records per http code per request per day
node No. of records per node
node_req No. of records per request per node
node_req_httpcode No. of records per http code per request per node
percentile_req No. of records per request analysis, including percentiles for size and latency
percentile_throughput_b No. of records per request, with percentiles on throughput (Bytes/sec) for objects >= 10MB
req No. of records per request
req_httpcode No. of records per http code per request
req_httpcode_node No. of records per node per http code per request
500_highest_throughput The 500 records with the highest throughput (Bytes/sec)
500_largest_req_httpcode_node The records with the 500 largest requests by req, httpcode, node
500_largest_size The records with the 500 largest requests sorted by size
500_worst_latency The records with the 500 worst latencies
add_count count all records
add_node_req_http node-per-request-per-httpcode analysis
add_req_count count records per request
clientip No. of records per client IP address
clientip_httpcode No. of records per http code per client IP address
clientip_request_httpcode No. of records per http code per request per client IP address
count No. of records, overall
day No. of records per day
day_hour No. of records per hour per day
day_hour_req No. of records per request per hour per day
day_req No. of records per request per day
day_req_httpcode No. of records per http code per request per day
node No. of records per node
node_req No. of records per request per node
node_req_httpcode No. of records per http code per request per node
percentile_req No. of records per request analysis, including percentiles for size and latency
percentile_throughput_128kb No. of records per request, with percentiles on throughput (Bytes/sec) for objects >= 128KB
req No. of records per request
req_httpcode No. of records per http code per request
req_httpcode_node No. of records per node per http code per request
ten_ns_proto_httpcode No. of records per Tenant / Namespace / protocol / http code
ten_ns_proto_percentile_req No. of records per Tenant / Namespace / protocol, including percentiles for size and latency
ten_ns_proto_user_httpcode No. of records per Tenant / Namespace / protocol / user / http code
ten_proto_httpcode No. of records per Tenant / protocol / http code
Rules:
@@ -108,7 +121,7 @@ Rules:
`SQLite3 SELECT rules <https://www.sqlite.org/lang_select.html>`_
* You can use all the column names listed below, the aggregate functions
offered by `SQLite <https://www.sqlite.org/lang_aggfunc.html>`_ as
well as the special aggregate function(s) listed below
well as the non-standard function(s) listed below
Columns in the ``logrecs`` table
@@ -149,6 +162,19 @@ Non-standard SQL functions
Calculates the throughput (in bytes/second) from an object's size and the
internal latency.
* ``getNamespace(path, namespace)``
Extracts the name of the Namespace (bucket, container) from the ``path`` and
``namespace`` database columns.
* ``getTenant(namespace)``
Extracts the name of the Tenant from the ``namespace`` database column.
* ``getProtocol(namespace)``
Extracts the access protocol used from the ``namespace`` database column.
Non-standard SQL aggregate functions
------------------------------------
@@ -167,6 +167,12 @@ class DB():
'namespace': rec[10],
'latency': int(rec[11])
}
# DEBUG!!!
if _r['namespace'].find('.') == -1:
print(rec)
_admin['start'] = _tsnum if _tsnum < _admin['start'] else _admin['start']
_admin['end'] = _tsnum if _tsnum > _admin['end'] else _admin['end']
@@ -341,6 +347,9 @@ def runquery(db, qtitle, query):
con.row_factory = sqlite3.Row
con.create_aggregate("percentile", 2, PercentileFunc)
con.create_function("tp", 2, tpfunc)
con.create_function("getNamespace", 2, getNamespace)
con.create_function("getTenant", 1, getTenant)
con.create_function("getProtocol", 1, getProtocol)
cur = con.cursor()
cur.execute(query)
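The ``PercentileFunc`` aggregate registered above is not part of this change set.
As a rough, hypothetical sketch (the shipped implementation may well differ), a
user-defined aggregate for Python's ``sqlite3`` is simply a class exposing
``step()`` and ``finalize()``::

    import sqlite3

    class PercentileSketch:
        """Illustrative nearest-rank percentile aggregate - not the shipped code."""

        def __init__(self):
            self.values = []
            self.pctl = None

        def step(self, value, pctl):
            # called once per row with the column value and the requested percentile
            if value is not None:
                self.values.append(value)
            self.pctl = pctl

        def finalize(self):
            # called once per group; return the nearest-rank percentile
            if not self.values:
                return None
            self.values.sort()
            idx = min(len(self.values) - 1,
                      max(0, round(len(self.values) * self.pctl / 100) - 1))
            return self.values[idx]

    # quick self-contained demo against an in-memory database
    con = sqlite3.connect(':memory:')
    con.create_aggregate("percentile", 2, PercentileSketch)
    con.execute("CREATE TABLE t(v INTEGER)")
    con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 101)])
    print(con.execute("SELECT percentile(v, 95) FROM t").fetchone())   # (95,)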
@@ -407,3 +416,58 @@ def tpfunc(size, duration):
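    # 'duration' (the internal latency) is taken as milliseconds, hence the
    # * 1000 factor to yield the documented bytes/second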
    ret = size / duration * 1000
    # print('size={} / latency={} = {} KB/sec'.format(size, duration, ret))
    return ret
def getNamespace(path, field):
    """
    Extract the Namespace name from a log record's namespace field.

    :param path: the path field
    :param field: the mentioned log record's namespace field
    :return: the Namespace
    """
    # make sure we state if we can't find details!
    if '.' not in field:
        if '@' not in field:
            return 'n/a'
        else:
            return path.split('/')[1]
    try:
        return field.split('.')[0]
    except IndexError:
        return ''


def getTenant(field):
    """
    Extract the Tenant name from a log record's namespace field.

    :param field: the mentioned log record's namespace field
    :return: the Tenant
    """
    try:
        if '.' in field:
            return field.split('.')[1].split('@')[0]
        else:
            return field.split('@')[0]
    except IndexError:
        return 'n/a'


def getProtocol(field):
    """
    Extract the Protocol name from a log record's namespace field.

    :param field: the mentioned log record's namespace field
    :return: the protocol
    """
    try:
        _p = field.split('@')[1]
        if _p.lower() == 'hs3':
            return 'S3'
        elif _p.lower() == 'hswift':
            return 'Swift'
    except IndexError:
        return 'native REST'
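Taken together, the three helpers decompose a log record's ``namespace`` field;
for a purely hypothetical field value such as ``ns1.tenant1@hs3`` (made up for
illustration) they behave as follows::

    getNamespace('/rest/some/object', 'ns1.tenant1@hs3')   # -> 'ns1'
    getTenant('ns1.tenant1@hs3')                           # -> 'tenant1'
    getProtocol('ns1.tenant1@hs3')                         # -> 'S3'

    # a field containing neither '.' nor '@':
    getNamespace('/rest/some/object', 'admin')             # -> 'n/a'
    getTenant('admin')                                     # -> 'admin'
    getProtocol('admin')                                   # -> 'native REST'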
@@ -253,3 +253,71 @@ query : SELECT request, count(*),
FROM logrecs where size >= 131072 GROUP BY request
freeze pane : C5
[ten_ns_proto_httpcode]
comment : No. of records per Tenant / Namespace / protocol / http code
query : SELECT getTenant(namespace) as Tenant,
getNamespace(path, namespace) as Namespace,
getProtocol(namespace) as Protocol, httpcode,
count(*)
FROM logrecs GROUP BY getTenant(namespace),
getNamespace(path, namespace),
getProtocol(namespace), httpcode
freeze pane : D5
[ten_ns_proto_percentile_req]
comment : No. of records per Tenant / Namespace / protocol, including percentiles for size and latency
query : SELECT getTenant(namespace) as Tenant,
getNamespace(path, namespace) as Namespace,
getProtocol(namespace) as Protocol, request, count(*),
min(size), avg(size), max(size),
percentile(size, 10) as 'pctl-10 (size)',
percentile(size, 20) as 'pctl-20 (size)',
percentile(size, 30) as 'pctl-30 (size)',
percentile(size, 40) as 'pctl-40 (size)',
percentile(size, 50) as 'pctl-50 (size)',
percentile(size, 60) as 'pctl-60 (size)',
percentile(size, 70) as 'pctl-70 (size)',
percentile(size, 80) as 'pctl-80 (size)',
percentile(size, 90) as 'pctl-90 (size)',
percentile(size, 95) as 'pctl-95 (size)',
percentile(size, 99) as 'pctl-99 (size)',
percentile(size, 99.9) as 'pctl-99.9 (size)',
min(latency), avg(latency),
max(latency),
percentile(latency, 10) as 'pctl-10 (latency)',
percentile(latency, 20) as 'pctl-20 (latency)',
percentile(latency, 30) as 'pctl-30 (latency)',
percentile(latency, 40) as 'pctl-40 (latency)',
percentile(latency, 50) as 'pctl-50 (latency)',
percentile(latency, 60) as 'pctl-60 (latency)',
percentile(latency, 70) as 'pctl-70 (latency)',
percentile(latency, 80) as 'pctl-80 (latency)',
percentile(latency, 90) as 'pctl-90 (latency)',
percentile(latency, 95) as 'pctl-95 (latency)',
percentile(latency, 99) as 'pctl-99 (latency)',
percentile(latency, 99.9) as 'pctl-99.9 (latency)'
FROM logrecs GROUP BY getTenant(namespace),
getNamespace(path, namespace),
getProtocol(namespace), request
freeze pane : E5
[ten_ns_proto_user_httpcode]
comment : No. of records per Tenant / Namespace / protocol / user / http code
query : SELECT getTenant(namespace) as Tenant,
getNamespace(path, namespace) as Namespace,
getProtocol(namespace) as Protocol, user, httpcode,
count(*)
FROM logrecs GROUP BY getTenant(namespace),
getNamespace(path, namespace),
getProtocol(namespace), user, httpcode
freeze pane : E5
[ten_proto_httpcode]
comment : No. of records per Tenant / protocol / http code
query : SELECT getTenant(namespace) as Tenant,
getProtocol(namespace) as Protocol, httpcode,
count(*)
FROM logrecs GROUP BY getTenant(namespace),
getProtocol(namespace), httpcode
freeze pane : C5
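To show how the new query sections and the helper functions play together, here
is a minimal, hypothetical sketch that runs the ``[ten_proto_httpcode]`` query on
its own against a database built by the tool (``dbfile.db``, as used in the
documentation above); the two stand-in helpers are simplified copies included
only to keep the snippet self-contained::

    import sqlite3

    # simplified stand-ins mirroring getTenant()/getProtocol() above
    def getTenant(field):
        return (field.split('.')[1].split('@')[0] if '.' in field
                else field.split('@')[0])

    def getProtocol(field):
        if '@' not in field:
            return 'native REST'
        gw = field.split('@')[1].lower()
        return 'S3' if gw == 'hs3' else 'Swift' if gw == 'hswift' else gw

    con = sqlite3.connect('dbfile.db')   # an already loaded database is assumed
    con.create_function("getTenant", 1, getTenant)
    con.create_function("getProtocol", 1, getProtocol)

    sql = """SELECT getTenant(namespace) AS Tenant,
                    getProtocol(namespace) AS Protocol, httpcode, count(*)
               FROM logrecs
              GROUP BY getTenant(namespace), getProtocol(namespace), httpcode"""
    for row in con.execute(sql):
        print(row)
    con.close()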
@@ -25,6 +25,8 @@ import csv
import xlsxwriter
from time import asctime, localtime, strftime
from version import Gvars
class Csv(object):
"""
@@ -237,6 +239,10 @@ class Xlsx(Csv):
'align': 'center',
'font_size': 12,
'bg_color': 'yellow'})
footer2 = self.wb.add_format({'bold': False,
'italic': True,
'align': 'center',
'font_size': 12})
# headline
self.contentws.merge_range(1, 1, 1, 5, 'Content', title)
@@ -276,13 +282,18 @@
# footer
self.contentws.merge_range(row+2, 1, row+2, 5,
'created {} from log records starting at {}, ending at {}'
'created {} from log records starting at {},'
' ending at {}'
.format(asctime(),
strftime('%a %b %d %H:%M:%S %Y',
localtime(start)),
strftime('%a %b %d %H:%M:%S %Y',
localtime(end))),
footer)
self.contentws.merge_range(row+3, 1, row+3, 5,
'--- {} {} ---'.format(Gvars.s_description,
Gvars.Version),
footer2)
# Thu Oct 5 20:48:16 2017
# '%a %b %d %H:%M:%S %Y'
@@ -27,8 +27,8 @@ class Gvars:
"""
# version control
s_version = "1.4.1"
s_builddate = '2019-01-04'
s_version = "1.4.2"
s_builddate = '2019-01-11'
s_build = "{}/Sm".format(s_builddate)
s_minPython = "3.5"
s_description = "hcprequestanalytics"