Commit f2622710, authored by Elger Jonker, committed by Johan Bloemberg

updated readme, faster ratings rebuild

parent 83d633c9
......@@ -17,58 +17,99 @@ Donate to this project safely, easily and quickly by clicking on an amount below
<a href="https://useplink.com/payment/qaCyn8t6Tar7c5zVS6Fa/500" target="_blank">&euro;500</a>
<a href="https://useplink.com/payment/qaCyn8t6Tar7c5zVS6Fa" target="_blank">&euro;other</a>
# Requirements
# System requirements
Linux or MacOS capable of running Python3 and git.
Download and install git and python3 to get started.
# Software Requirements
- [git](https://git-scm.com/downloads)
- [python3](https://www.python.org/downloads/)
- tox `pip install tox`
Download and install git and python3 to get started:
- [git](https://git-scm.com/downloads) (download and install)
- [python3](https://www.python.org/downloads/) (download and install)
# Obtaining the software
In a directory of your choosing:
After installation of above tools, all following steps use the command line:
git clone --recursive https://github.com/failmap/admin/
cd admin
sudo easy_install pip # installs pip, a python package manager, with the command pip3
If you need a specific branch, for example "mapwebsite"
git checkout mapwebsite
# Obtaining the software
This repository uses [submodules](https://git-scm.com/docs/git-submodule) to pull in external
dependencies. If you have not cloned the repository with `--recursive` or you need to restore
the submodules to the expected state run:
In a directory of your choosing:
git submodule update
git clone --recursive https://github.com/failmap/admin/ # downloads the software
cd admin # enter the directory of the downloaded software
# Quickstart
It is advised to work within a Python virtualenv or use `direnv` (see below) to keep project
dependencies isolated and managed. (todo: how)
The commands below result in a failmap installation that is suitable for testing and development. It
can handle thousands of urls while remaining modestly responsive.
pip3 install -e .
failmap-admin migrate
failmap-admin createsuperuser
failmap-admin load-dataset testdata # slow, get a coffee
failmap-admin rebuild-ratings # slow, also a tea
failmap-admin runserver
If you need a faster, more robust installation, please contact us.
Now visit the [website](http://127.0.0.1:8000/) and/or the
[administrative interface ](http://127.0.0.1:8000/admin/) at http://127.0.0.1:8000
pip3 install -e . # downloads requirements needed to run this software
failmap-admin migrate # creates the database
failmap-admin createsuperuser # create a user to view the admin interface
failmap-admin load-dataset testdata # loads a series of sample data into the database
failmap-admin rebuild-ratings # calculate the scores that should be displayed on the map
failmap-admin runserver # finally starts the server
Now visit the [map website](http://127.0.0.1:8000/) and/or the
[admin website](http://127.0.0.1:8000/admin/) at http://127.0.0.1:8000
# Scanning services (beta)
Onboarding handles all new urls with an initial (fast) scan. The tls scanner slowly gets results
from qualys. Screenshot service makes many gigabytes of screenshots.
These services help fill the database with accurate up to date information. Run each one of them in
a separate command line window and keep them running.
failmap-admin onboard-service # handles all new urls with an initial (fast) scan
failmap-admin scan-tls-qualys-service # slowly gets results from qualys
failmap-admin screenshot-service # makes many gigabytes of screenshots
# Using the software
## The map website
The website is the site intended for humans. There are some controls on the website, such as the
time slider, Twitter links and the option to inspect organizations by clicking on them.
Using the map website should be straightforward.
## The admin website
Use the admin website to perform standard [data-operations](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete),
run a series of actions on the data and read documentation of the internal workings of the failmap software.
The admin website is split into four key parts:
1. Authentication and Authorization
This stores information about who can enter the admin interface and what they can do.
2. Map
Contains all information that is presented to normal humans.
This information is automatically filled based on the scans that have been performed over time.
3. Organizations
Lists of organizations, coordinates and internet addresses.
4. Scanners
Lists of endpoints and assorted scans on these endpoints.
# Troubleshooting getting started
If you need a specific branch, for example "mapwebsite"
git checkout mapwebsite
This repository uses [submodules](https://git-scm.com/docs/git-submodule) to pull in
external dependencies. If you have not cloned the repository with `--recursive` or you need to
restore the submodules to the expected state run:
git submodule update
failmap-admin onboard-service
failmap-admin scan-tls-qualys-service
failmap-admin screenshot-service
# Development
# Code quality / Testing
## Code quality / Testing
This project sticks to default pycodestyle/pyflakes configuration to maintain code quality.
......@@ -97,7 +138,7 @@ Pytest allows to drop into Python debugger when a tests fails. To enable run:
tox -- --pdb
# Direnv / Virtualenv
## Direnv / Virtualenv
This project has [direnv](https://direnv.net/) configuration to automatically manage the Python
virtual environment. Install direnv and run `direnv allow` to enable.
......
......@@ -63,17 +63,6 @@ def rate_organizations(create_history=False):
rate_organization(o, when)
def rate_urls(create_history=False):
times = get_weekly_intervals() if create_history else [
datetime.now(pytz.utc)]
urls = Url.objects.filter(is_dead=False)
for when in times:
for url in urls:
rate_url(url, when)
def rate_organizations_efficient(create_history=False):
os = Organization.objects.all().order_by('name')
if create_history:
......@@ -94,43 +83,25 @@ def rate_organization_efficient(organization, create_history=False):
else:
rate_organization(organization, datetime.now(pytz.utc))
def rate_organization_urls_efficient(organization, create_history=False):
def rerate_existing_urls_of_organization(organization):
UrlRating.objects.all().filter(url__organization=organization).delete()
urls = Url.objects.filter(is_dead=False, organization=organization).order_by('url')
for url in urls:
rerate_url_with_timeline(url)
if create_history:
for url in urls:
times = significant_times(url=url)
for time in times:
rate_url(url, time)
else:
for url in urls:
rate_url(url, datetime.now(pytz.utc))
def rate_urls_efficient(create_history=False):
def rerate_existing_urls():
UrlRating.objects.all().delete()
urls = Url.objects.filter(is_dead=False).order_by('url')
for url in urls:
rate_timeline(timeline(url), url)
if create_history:
for url in urls:
times = significant_times(url=url)
for time in times:
rate_url(url, time)
else:
for url in urls:
rate_url(url, datetime.now(pytz.utc))
def clear_organization_and_urls(organization):
UrlRating.objects.all().filter(url__organization=organization).delete()
OrganizationRating.objects.all().filter(organization=organization).delete()
def clear_all_organization_ratings():
OrganizationRating.objects.all().delete()
def rerate_url_with_timeline(url):
UrlRating.objects.all().filter(url=url).delete()
rate_timeline(timeline(url), url)
def timeline(url):
"""
Searches for all significant points in time at which something changed. The goal is to save
......@@ -245,7 +216,9 @@ def timeline(url):
scan_date = scan_date.date()
timeline[scan_date]["tls_qualys_scan"] = {}
timeline[scan_date]["tls_qualys_scan"]["scanned"] = True
ratings = list(tls_qualys_scans.filter(rating_determined_on__date=scan_date))
# prevent a query, below query could be rewritten, which is faster
# ratings = list(tls_qualys_scans.filter(rating_determined_on__date=scan_date))
ratings = [x for x in tls_qualys_scans if x.rating_determined_on.date() == scan_date]
timeline[scan_date]["tls_qualys_scan"]["ratings"] = ratings
endpoints = [x.endpoint for x in ratings]
timeline[scan_date]["tls_qualys_scan"]["endpoints"] = endpoints
......@@ -259,7 +232,9 @@ def timeline(url):
scan_date = scan_date.date()
timeline[scan_date]["generic_scan"] = {}
timeline[scan_date]["generic_scan"]["scanned"] = True
ratings = generic_scans.filter(rating_determined_on__date=scan_date)
# prevent a query, below query could be rewritten, which is faster
# ratings = generic_scans.filter(rating_determined_on__date=scan_date)
ratings = [x for x in generic_scans if x.rating_determined_on.date() == scan_date]
timeline[scan_date]["generic_scan"]["ratings"] = list(ratings)
endpoints = [x.endpoint for x in ratings]
timeline[scan_date]["generic_scan"]["endpoints"] = endpoints
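The list comprehensions in the two hunks above replace one database query per scan date with an in-memory pass over scans that were already fetched. Each pass still walks the full list, so a possible follow-up (an assumption, not part of this commit) is to bucket the scans by date once and look them up afterwards. In this sketch `group_by_date` and the `SimpleNamespace` rows are hypothetical stand-ins; only the `rating_determined_on` and `endpoint` attribute names come from the diff.

```python
from collections import defaultdict
from datetime import date, datetime, timezone
from types import SimpleNamespace


def group_by_date(scans):
    """Bucket already-fetched scans by the calendar date of rating_determined_on."""
    buckets = defaultdict(list)
    for scan in scans:
        buckets[scan.rating_determined_on.date()].append(scan)
    return buckets


# Hypothetical rows standing in for scans that were loaded with a single query.
scans = [
    SimpleNamespace(endpoint="a", rating_determined_on=datetime(2017, 9, 1, 8, 0, tzinfo=timezone.utc)),
    SimpleNamespace(endpoint="b", rating_determined_on=datetime(2017, 9, 1, 17, 30, tzinfo=timezone.utc)),
    SimpleNamespace(endpoint="c", rating_determined_on=datetime(2017, 9, 2, 9, 15, tzinfo=timezone.utc)),
]

by_date = group_by_date(scans)

# Same result as: [x for x in scans if x.rating_determined_on.date() == scan_date]
ratings = by_date[date(2017, 9, 1)]
endpoints = [x.endpoint for x in ratings]
```

With the buckets built once, each per-date lookup is a dictionary access instead of another full pass over the scan list.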
......@@ -440,7 +415,7 @@ def rate_timeline(timeline, url):
}""".strip()
url_rating_json = url_rating_template % (url.url, sum(scores), ",".join(endpoint_jsons))
logger.debug("On %s this would score: %s " % (moment, sum(scores)))
logger.debug("On %s this would score: %s " % (moment, sum(scores)), )
save_url_rating(url, moment, sum(scores), url_rating_json)
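The debug call in this hunk formats its message with `%` before handing it to the logger. As a side note, the standard `logging` module also accepts the arguments separately and only formats them when the record is actually emitted. The sketch below illustrates that form; `log_score`, `ListHandler` and the logger name are hypothetical and exist only to make the example self-contained.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("ratings_sketch")
logger.setLevel(logging.DEBUG)
handler = ListHandler()
logger.addHandler(handler)


def log_score(moment, scores):
    # Arguments are passed separately; logging interpolates them only
    # if a DEBUG record is really going to be handled.
    logger.debug("On %s this would score: %s", moment, sum(scores))


log_score("2017-09-01", [10, 20])
```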
......
from django.core.management.base import BaseCommand
from failmap_admin.map.determineratings import (default_ratings, rate_organizations_efficient,
rate_urls_efficient)
rerate_existing_urls)
from failmap_admin.map.models import OrganizationRating, UrlRating
......@@ -16,12 +16,8 @@ class Command(BaseCommand):
# It has now been refactored into a command, so it's easier to work with.
def handle(self, *args, **options):
UrlRating.objects.all().delete()
rate_urls_efficient(create_history=True)
rate_urls_efficient() # this should not do anything anymore...
rerate_existing_urls()
OrganizationRating.objects.all().delete()
default_ratings()
rate_organizations_efficient(create_history=True)
print("Making the most recent organization rating, should not have any effect.")
rate_organizations_efficient() # this should not do anything anymore...
......@@ -58,32 +58,13 @@ class Command(DumpDataCommand):
# this only works for sqlite.
if "sqlite" in settings.DATABASES["default"]["ENGINE"]:
# http://www.sqlite.org/pragma.html#pragma_foreign_key_check
logger.info(
'Checking for foreign key issues, and generating possible SQL to remediate issues.')
cursor = connection.cursor()
cursor.execute('''PRAGMA foreign_key_check;''')
rows = cursor.fetchall()
if rows:
logger.error("Cannot create export. There are incomplete foreign keys. "
"See information above to fix this. "
"Please fix these issues manually and try again.")
for row in rows:
logger.info("%25s %6s %25s %6s" % (row[0], row[1], row[2], row[3]))
logger.error(
"Here are some extremely crude SQL statements that might help fix the problem.")
for row in rows:
logger.info("DELETE FROM %s WHERE id = \"%s\";" % (row[0], row[1]))
if rows:
if not sqlite_has_correct_referential_integrity():
logger.error("Fix above issues. Not proceeding.")
return
else:
logger.warning("This export might have incorrect integrity: no foreign key check for "
logger.error("This export might have incorrect integrity: no foreign key check for "
"this engine was implemented. Loaddata might not accept this import. "
"Perform a key check manually.")
"Perform a key check manually and then alter this code to continue.")
return
filename = "failmap_dataset_%s.yaml" % datetime.now(pytz.utc).strftime("%Y%m%d_%H%M%S")
......@@ -115,3 +96,29 @@ class Command(DumpDataCommand):
logger.debug(options)
super(Command, self).handle(*app_labels, **options)
def sqlite_has_correct_referential_integrity():
# http://www.sqlite.org/pragma.html#pragma_foreign_key_check
logger.info(
'Checking for foreign key issues, and generating possible SQL to remediate issues.')
cursor = connection.cursor()
cursor.execute('''PRAGMA foreign_key_check;''')
rows = cursor.fetchall()
if rows:
logger.error("Cannot create export. There are incomplete foreign keys. "
"See information above to fix this. "
"Please fix these issues manually and try again.")
for row in rows:
logger.info("%25s %6s %25s %6s" % (row[0], row[1], row[2], row[3]))
logger.error(
"Here are some extremely crude SQL statements that might help fix the problem.")
for row in rows:
logger.info("DELETE FROM %s WHERE id = \"%s\";" % (row[0], row[1]))
if rows:
return False
return True
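To see what `PRAGMA foreign_key_check` reports, here is a minimal standalone sketch using Python's built-in `sqlite3` module instead of the Django connection. The `organization` and `url` tables and the `foreign_key_violations` helper are hypothetical and are not the failmap schema. Each returned row holds the child table, the rowid of the offending row, the parent table and the index of the violated foreign key, which matches the four columns logged by the function above.

```python
import sqlite3


def foreign_key_violations(connection):
    """Return the rows reported by SQLite's PRAGMA foreign_key_check."""
    return connection.execute("PRAGMA foreign_key_check;").fetchall()


connection = sqlite3.connect(":memory:")
# Make sure enforcement is off, so the orphaned row below can be inserted.
connection.execute("PRAGMA foreign_keys = OFF;")
connection.executescript("""
    CREATE TABLE organization (id INTEGER PRIMARY KEY);
    CREATE TABLE url (
        id INTEGER PRIMARY KEY,
        organization_id INTEGER REFERENCES organization(id)
    );
    INSERT INTO organization (id) VALUES (1);
    INSERT INTO url (id, organization_id) VALUES (1, 1);
    INSERT INTO url (id, organization_id) VALUES (2, 99);
""")

violations = foreign_key_violations(connection)
for table, rowid, parent, fk_index in violations:
    print("%s row %s references a missing row in %s" % (table, rowid, parent))
```

An empty result means the check found no dangling references, which corresponds to the `return True` branch of `sqlite_has_correct_referential_integrity`.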