[wip] basic importing works, displaying on map doesn't after one update

parent 0afe2639
Mapping Information / GeoJSON
=============================

Please go through the resources in order to understand how to add districts, points and lines of interest to the map.
1. Learn about how maps work: https://www.youtube.com/watch?v=2lR7s1Y6Zig
2. Which projections OpenStreetMap uses: http://openstreetmapdata.com/info/projections
3. Fail Map / OpenStreetMap use GeoJSON, described at: http://geojson.org (a minimal example follows below)
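For orientation, a minimal (hand-written, illustrative) GeoJSON Feature with a Polygon geometry looks like this; a real map file is a FeatureCollection of many such features:

```json
{
    "type": "Feature",
    "properties": {"name": "Example district"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[4.0, 52.0], [4.1, 52.0], [4.1, 52.1], [4.0, 52.0]]]
    }
}
```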
A harder challenge is to get data for a country in GeoJSON. Some countries publish a (large) map for use with mapping tools (GIS tools). A free mapping tool to work with these is QGIS, available at qgis.org.

Any map you download will probably contain way too many details. The next steps describe the process of converting a large map to something smaller, so it uses (way) less data:
1. Download a map, for example the administrative regions of the Netherlands: https://www.pdok.nl/nl/producten/pdok-downloads/basis-registratie-kadaster/bestuurlijke-grenzen-actueel
2. Open the map in QGIS. It will look heavily distorted, as an unfamiliar (but correct) projection is used.
3. It's possible to change the projection on the fly. Look for the tiny globe icon and choose something like "Mercator (EPSG 3857)".
4. After you're happy with the projection and "the world makes sense again", remove complexities from the map.
5. Reducing complexities shrinks the file size from 8 megabytes to hundreds of kilobytes.
6. Vector > Geometry Tools > Simplify Geometries, enter 500.0000 or something like that. Let the algorithm do its job.
7. Now right-click the simplified layer and export it. You can export it to GeoJSON that the Fail Map software understands. The projection is called "WGS84 (EPSG 4326)". (A command-line alternative is sketched below this list.)
8. Import the data into the database, a procedure not yet described.
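If you prefer the command line over the QGIS GUI, the reprojection and GeoJSON export from step 7 can also be done with GDAL's ogr2ogr; a sketch with placeholder file names:

```
ogr2ogr -f GeoJSON -t_srs EPSG:4326 municipalities.geojson bestuurlijke_grenzen.shp
```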
### Wish: OSM-only solution

OSM contains the administrative regions etc., and it's possible to determine addresses and such. This is currently being developed.

There are APIs that let one search through the data and export the right things, but this has not been implemented yet. It would be the easiest way to get updated mapping data from all around the world with fewer special steps (the data is always there, from everywhere =)
There are converters that reduce OSM to GeoJSON (but they don't reduce complexity, afaik):

1. https://tyrasd.github.io/osmtogeojson/
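osmtogeojson also ships as a command-line tool (installable via npm, and used by the importer further down); a typical invocation with placeholder file names:

```
npm install -g osmtogeojson
osmtogeojson region.osm > region.geojson
```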
### Using Overpass

http://overpass-turbo.eu/

Query all municipalities; this data is updated daily:
```overpass
[out:json][timeout:300];
area[name="Nederland"]->.gem;
relation(area.gem)["type"="boundary"][admin_level=8];
out geom;
```
(With `way(r);` added before the output statement, it omits relation tags and the result is smaller; see the variant below.)
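A sketch of that smaller variant (untested; standard Overpass QL recursion from the relations down to their member ways):

```overpass
[out:json][timeout:300];
area[name="Nederland"]->.gem;
relation(area.gem)["type"="boundary"][admin_level=8];
way(r);
out geom;
```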
This delivers a 52-megabyte file with all coordinates that make up a municipality, with various tags:
```json
"tags": {
"admin_level": "8",
"authoritative": "yes",
"boundary": "administrative",
"name": "Heemskerk",
"population": "39303",
"ref:gemeentecode": "396",
"source": "dataportal.nl",
"type": "boundary",
"wikidata": "Q9926",
"wikipedia": "nl:Heemskerk"
}
```
The file also contains "markers" inside each region that show metadata.

The regions need to be reduced or simplified so that less data is transferred. Would Overpass be able to do this?
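As far as I know, Overpass cannot simplify geometries server-side. The importer added in this commit therefore downsamples client-side with the Ramer-Douglas-Peucker algorithm via the `rdp` package; a minimal sketch with made-up coordinates:

```python
from rdp import rdp

# One polygon ring as [longitude, latitude] pairs (illustrative data).
ring = [[4.000, 52.000], [4.001, 52.0003], [4.010, 52.010], [4.030, 52.015], [4.000, 52.000]]

# Points deviating less than epsilon (in degrees, as this is WGS84 data)
# from the simplified line are dropped.
print(rdp(ring, epsilon=0.001))
```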
### Other things
Additionally, there is an awesome world map in GeoJSON, available here:
https://github.com/datasets/geo-countries/blob/master/data/countries.geojson
### Requesting a flag

Requesting the flag of a country starts from its area (query fragment):

```overpass
( area["ISO3166-1"="NL"][admin_level=2]; )->.a;
```
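A guess at how this fragment could be completed (my assumption, not part of the original notes): pivot from the area back to the country relation and output its tags, which may include a `flag` entry:

```overpass
[out:json];
( area["ISO3166-1"="NL"][admin_level=2]; )->.a;
rel(pivot.a);
out tags;
```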
@@ -47,6 +47,9 @@
flag/column and given it's an extra table, some additional columns can be added for administrative purposes. Such a table should be made for every history entity.

A sample of another solution:
https://stackoverflow.com/questions/3874199/how-to-store-historical-data
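For context, the stacking pattern this commit relies on boils down to "never overwrite, select the newest record per entity at a given moment"; a sketch using the coordinate table from the query later in this commit:

```sql
-- Newest (living) coordinate per organization as of a given date. Old rows are
-- kept and marked dead instead of being updated in place.
SELECT MAX(id) AS stacked_coordinate_id, organization_id
FROM coordinate
WHERE created_on <= '2018-01-01'
GROUP BY organization_id;
```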
## Stacking / History Support in Django?
Django itself does not support stacking queries: it does have latest and
......
import json
import logging
import subprocess
from datetime import datetime
from typing import Dict

import pytz
import requests
from django.db import transaction

from failmap.organizations.models import Coordinate, Organization
from rdp import rdp

from ..celery import app

log = logging.getLogger(__package__)


@transaction.atomic
def update_coordinates(country: str = "NL", organization_type: str = "municipality"):

    log.info("Attempting to update coordinates for: %s %s" % (country, organization_type))

    # you are about to load 50 megabytes of data. Or MORE! :)
    data = get_osm_data(country, organization_type)
    log.info("Received coordinate data. Starting with: %s" % json.dumps(data)[0:200])

    resampling_resolution = get_sampling_resolution(country, organization_type)

    for feature in data["features"]:

        if "properties" not in feature:
            log.debug("Feature misses 'properties', cannot process it.")
            continue

        if "name" not in feature["properties"]:
            log.debug("Feature does not contain a name, cannot relate feature to existing data.")
            continue

        log.info("Resampling path for %s" % feature["properties"]["name"])
        task = (resample.s(feature, resampling_resolution) | store_updates.s(country, organization_type))
        task.apply_async()

        # synchronous alternative:
        # feature = resample(feature, resampling_resolution)
        # store_updates(feature, country, organization_type)
@app.task
def resample(feature: Dict, resampling_resolution: float = 0.001):
    # Downsample the coordinates using the RDP algorithm,
    # mainly to reduce 50 megabytes to about 150 kilobytes.

    if feature["geometry"]["type"] == "Polygon":
        log.debug("Original length: %s" % len(feature["geometry"]["coordinates"][0]))
        # A Polygon is a list of rings: the outer boundary plus optional holes.
        for i, ring in enumerate(feature["geometry"]["coordinates"]):
            feature["geometry"]["coordinates"][i] = rdp(ring, epsilon=resampling_resolution)
        log.debug("Resampled length: %s" % len(feature["geometry"]["coordinates"][0]))

    if feature["geometry"]["type"] == "MultiPolygon":
        # A MultiPolygon is a list of polygons, each being a list of rings.
        for i, polygon in enumerate(feature["geometry"]["coordinates"]):
            for j, ring in enumerate(polygon):
                feature["geometry"]["coordinates"][i][j] = rdp(ring, epsilon=resampling_resolution)

    return feature
def get_sampling_resolution(country: str = "NL", organization_type: str = "municipality") -> float:
    if country == "NL" and organization_type == "municipality":
        return 0.0001

    return 0.0001
@app.task
def store_updates(feature: Dict, country: str = "NL", organization_type: str = "municipality"):
    """
    Handles the storing / administration of coordinates in failmap using the stacking pattern.

    "properties": {
        "@id": "relation/47394",
        "admin_level": "8",
        "authoritative": "yes",
        "boundary": "administrative",
        "name": "Heemstede",
        "ref:gemeentecode": "397",
        "source": "dataportal",
        "type": "boundary",
        "wikidata": "Q9928",
        "wikipedia": "nl:Heemstede (Noord-Holland)"
    },

    Coordinates: [[[x, y], [a, b]]]
    """
    properties = feature["properties"]
    coordinates = feature["geometry"]

    # check if the organization is part of the database
    try:
        matching_organization = Organization.objects.get(name=properties["name"],
                                                         country=country,
                                                         type__name=organization_type,
                                                         is_dead=False)
    except Organization.DoesNotExist:
        log.info("Organization from OSM does not exist in failmap, create it using the admin interface: '%s'" %
                 properties["name"])
        log.info("This might happen with neighboring countries (and the Antilles for the Netherlands).")
        log.info("Developers might experience this when using test data.")
        log.info(properties)
        return

    # check if we're dealing with the right Feature:
    if country == "NL" and organization_type == "municipality":
        if properties.get("boundary", "-") != "administrative":
            log.info("Feature did not contain properties matching this type of organization.")
            log.info("Missing boundary:administrative")
            return

    # todo: Dutch municipalities can be matched via their gemeentecode.

    # Check if the current coordinates are the same; if so, don't do anything.
    # It is possible that an organization has multiple coordinates. Since we always get a single multipolygon
    # back, we'll then just overwrite all of them with the single one.
    old_coordinates = Coordinate.objects.filter(organization=matching_organization, is_dead=False)
    if old_coordinates.count() == 1 and old_coordinates[0].area == coordinates["coordinates"]:
        log.info("Retrieved coordinates are the same, not changing anything.")
        return

    message = ""

    if old_coordinates.count() > 1:
        message = "Automated import does not support multiple coordinates per organization. " \
                  "New coordinates will be created."

    if old_coordinates.count() == 1:
        message = "New data received in automated import. DEAL WITH IT!"

    log.info(message)

    # stacking pattern: never overwrite, mark the old coordinates as dead and create a new one.
    for old_coordinate in old_coordinates:
        old_coordinate.is_dead = True
        old_coordinate.is_dead_since = datetime.now(pytz.utc)
        old_coordinate.is_dead_reason = message
        old_coordinate.save()

    new_coordinate = Coordinate()
    new_coordinate.created_on = datetime.now(pytz.utc)
    new_coordinate.organization = matching_organization
    new_coordinate.creation_metadata = "Automated import via OSM."
    new_coordinate.geojsontype = coordinates["type"]  # Polygon or MultiPolygon
    new_coordinate.area = coordinates["coordinates"]
    new_coordinate.save()
    log.info("Stored new coordinates!")
# todo: storage dir for downloaded file (can be temp)
def get_osm_data(country: str = "NL", organization_type: str = "municipality"):
    """
    Runs an Overpass query that results in a set with administrative borders and points with metadata.

    Test data: "/Applications/XAMPP/xamppfiles/htdocs/failmap/admin/failmap/map/map_updates/Netherlands.geojson"

    :return: dictionary
    """
    filename = "%s_%s_%s.osm" % (country, organization_type, datetime.now().date())

    # to test this without connecting to a server: reuse the data downloaded today(!)
    download_and_convert = False

    if country == "NL" and organization_type == "municipality":
        if download_and_convert:
            # returns an OSM file, which needs to be converted afterwards.
            # while JSON is nearly instant, a large text file with even less data takes way more time.
            response = requests.post("https://www.overpass-api.de/api/interpreter",
                                     data={"data": 'area[name="Nederland"]->.gem; '
                                                   'relation(area.gem)["type"="boundary"][admin_level=8]; '
                                                   'out geom;',
                                           "submit": "Query"}, stream=True)

            log.info("Writing received data to file.")
            with open(filename, 'wb') as handle:
                for block in response.iter_content(1024):
                    handle.write(block)

            # convert the file:
            log.info("Converting OSM to GeoJSON")
            try:
                # shell=True can only be somewhat safe if no input is susceptible to manipulation;
                # in this case the filename and all related info is verified.
                subprocess.check_call("osmtogeojson %s > %s" % (filename, filename + ".geojson"), shell=True)
            except subprocess.CalledProcessError:
                log.info("Error while converting to GeoJSON.")
            except OSError:
                log.info("osmtogeojson not found.")

        return json.load(open(filename + ".geojson"))

    raise NotImplementedError(
        "Combination of country and organization_type does not have a matching OSM query implemented.")
import logging

from django.core.management.base import BaseCommand

from ...geojson import update_coordinates

log = logging.getLogger(__package__)


class Command(BaseCommand):
    help = "Connects to OSM and gets a set of coordinates."

    # https://nl.wikipedia.org/wiki/Gemeentelijke_herindelingen_in_Nederland#Komende_herindelingen
    # Running this every month is fine too :)
    def handle(self, *app_labels, **options):
        update_coordinates()
# -*- coding: utf-8 -*-
# Generated by Django 1.11.8 on 2018-01-19 11:44
from __future__ import unicode_literals

import django.db.models.deletion
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('map', '0007_auto_20171127_1456'),
    ]

    operations = [
        migrations.AlterField(
            model_name='organizationrating',
            name='organization',
            field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE,
                                    to='organizations.Organization'),
        ),
    ]
@@ -364,30 +364,7 @@ var failmap = {
        // if there is one already, overwrite the attributes...
        if (failmap.geojson) {
            failmap.geojson.eachLayer(function (layer) { failmap.recolormap(json, layer); });
            vueMap.loading = false;
        } else {
            failmap.geojson = L.geoJson(json, {
@@ -402,6 +379,68 @@ var failmap = {
            });
        },
        recolormap: function (json, layer) {
            // overwrite some properties
            // a for loop is not ideal.
            var existing_feature = layer.feature;

            for (var i = 0; i < json.features.length; i++) {
                var new_feature = json.features[i];
                if (existing_feature.properties.organization_name !== new_feature.properties.organization_name)
                    continue;

                // note: compare by value; the coordinate arrays are always distinct objects,
                // so a direct !== comparison would report a change on every update.
                if (JSON.stringify(new_feature.geometry.coordinates) !== JSON.stringify(existing_feature.geometry.coordinates)) {
                    console.log("Shape changed for " + new_feature.properties.organization_name);

                    // it is not possible to change the shape of a layer :(
                    // therefore we have to remove this layer and replace it with the new one.
                    // removing once doesn't work: you will still get the old layer(!)
                    layer.removeFrom(failmap.map);
                    layer.remove();
                    failmap.map.removeLayer(layer);
                    failmap.geojson.removeLayer(layer);

                    // the new item already has the color we need.
                    L.geoJson(new_feature, {
                        style: failmap.style,
                        pointToLayer: failmap.pointToLayer,
                        onEachFeature: failmap.onEachFeature
                    }).addTo(failmap.map);

                    // we should not manipulate anything else.
                    continue;
                }

                // only the color changed
                existing_feature.properties.Overall = new_feature.properties.Overall;
                existing_feature.properties.color = new_feature.properties.color;

                // make the transition
                if (existing_feature.geometry.type === "MultiPolygon")
                    layer.setStyle(failmap.style(layer.feature));

                if (existing_feature.geometry.type === "Point") {
                    if (layer.feature.properties.color === "red")
                        layer.setIcon(failmap.redIcon);
                    if (layer.feature.properties.color === "orange")
                        layer.setIcon(failmap.orangeIcon);
                    if (layer.feature.properties.color === "green")
                        layer.setIcon(failmap.greenIcon);
                    if (layer.feature.properties.color === "gray")
                        layer.setIcon(failmap.grayIcon);
                }
            }
        },
        showreport: function (e) {
            let organization_id = e.target.feature.properties['organization_id'];
            if (failmap.map.isFullscreen()) {
......
"""Import modules containing tasks that need to be auto-discovered by Django Celery."""
from . import geojson
# explicitly declare the imported modules as this modules 'content', prevents pyflakes issues
__all__ = [geojson]
@@ -687,28 +687,32 @@ def map_data(request, weeks_back=0):
        rating,
        organization.name,
        organizations_organizationtype.name,
        coordinate_stack.area,
        coordinate_stack.geoJsonType,
        organization.id,
        calculation,
        high,
        medium,
        low
    FROM map_organizationrating
    INNER JOIN
        (SELECT MAX(id) as stacked_organization_id FROM map_organizationrating
         WHERE `when` <= '%s' GROUP BY organization_id) as x
        ON x.stacked_organization_id = map_organizationrating.id
    INNER JOIN
        organization ON organization.id = map_organizationrating.organization_id
    INNER JOIN
        organizations_organizationtype ON organizations_organizationtype.id = organization.type_id
    INNER JOIN
        (SELECT MAX(id) as stacked_coordinate_id, area, geoJsonType, organization_id FROM coordinate stacked_coordinate
         WHERE stacked_coordinate.created_on <= '%s' GROUP BY organization_id) as coordinate_stack
        ON coordinate_stack.organization_id = map_organizationrating.organization_id
    GROUP BY coordinate_stack.area, organization.name
    ORDER BY `when` ASC
    ''' % (when, when, )

    # print(sql)
    # with the new solution, you only get just ONE result per organization...
    cursor.execute(sql)
    rows = cursor.fetchall()
......
@@ -90,6 +90,7 @@ class PromiseAdminInline(CompactInline):
class ActionMixin:
    """Generic Mixin to add Admin Button for Organization/Url/Endpoint Actions.
    This class is intended to be added to ModelAdmin classes so all Actions are available without duplicating code.

    Action methods as described in:
@@ -143,7 +144,7 @@ class ActionMixin:
class OrganizationAdmin(ActionMixin, ImportExportModelAdmin, admin.ModelAdmin):
    list_display = ('name', 'type', 'country', 'created_on', 'is_dead')
    search_fields = ('name', 'country', 'type__name')
    list_filter = ('name', 'type__name', 'country')  # todo: type is now listed as name, confusing
    fields = ('name', 'type', 'country', 'twitter_handle')
@@ -349,9 +350,10 @@ class OrganizationTypeAdmin(ImportExportModelAdmin, admin.ModelAdmin):
    fields = ('name', )
class CoordinateAdmin(admin.ModelAdmin):
    list_display = ('organization', 'geojsontype', 'created_on', 'is_dead', 'is_dead_since')
    search_fields = ('organization__name', 'geojsontype')
    list_filter = ('organization', 'geojsontype')
    fields = ('organization', 'geojsontype', 'area')
......
@@ -3,21 +3,24 @@ from copy import deepcopy
from datetime import datetime
from typing import List

import pytz
from django.core.management.commands.dumpdata import Command as DumpDataCommand
from django.db import transaction

from failmap.organizations.models import Coordinate, Organization, Promise, Url

log = logging.getLogger(__package__)

# the geography update needs an OSM connection, otherwise it takes too long to process all new coordinates.
# woot.


class Command(DumpDataCommand):
    help = "Starting point for merging organizations"

    # https://nl.wikipedia.org/wiki/Gemeentelijke_herindelingen_in_Nederland#Komende_herindelingen
    def handle(self, *app_labels, **options):
        # making sure this is not run in production yet.
        raise NotImplementedError

        merge_date = datetime(year=2018, month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
"""
......@@ -71,7 +74,9 @@ class Command(DumpDataCommand):
merge(["Hoogezand-Sappemeer", "Menterwolde", "Slochteren"], "Midden-Groningen", merge_date)
def merge(organization_names: List[str], target_organization_name: str, when: datetime,
# implies that name + country + organization_type is unique.
@transaction.atomic
def merge(source_organizations_names: List[str], target_organization_name: str, when: datetime,
organization_type: str="Municipality", country: str="NL"):
"""
Keeping historical data correct is important.
......@@ -84,7 +89,6 @@ def merge(organization_names: List[str], target_organization_name: str, when: da
- A "created_on" is needed, so the new organization is not displayed in the past.
- A "deleted_on" is needed on the previous organizations (with date + reason) so they are not shown in the future.
Situation 2: an existing organization gets all the goodies:
Solution 1: use the existing organization record:
You cannot copy the history of the urls, as that changes the existing organization. It Can have the urls but they
......@@ -133,55 +137,73 @@ def merge(organization_names: List[str], target_organization_name: str, when: da
:return:
"""
    new_organization = Organization()
    new_organization.type = organization_type
    new_organization.country = country
    new_organization.name = target_organization_name
    new_organization.created_on = when
    new_organization.is_dead = False
    # todo: ? store information that the organization stems from a previous organization.

    # Get the currently existing organization.
    # Implies that name + country + organization_type is unique.
    # Might result in an Organization.DoesNotExist, which means the transaction is rolled back.
    original_target = Organization.objects.get(
        name=target_organization_name, country=country, type=organization_type, is_dead=False)
    new_organization.twitter_handle = original_target.twitter_handle
    log.info("Creating a new %s, with information from the merged organization." % target_organization_name)
    original_target.is_dead = True
    original_target.is_dead_since = when
    original_target.is_dead_reason = "Merged with other organizations, using a similar name."
    original_target.save()

    # save the clone of the organization.
    new_organization.save()

    for source_organizations_name in source_organizations_names:
        # a model instance is not a context manager, so plain assignment instead of "with":
        source_organization = Organization.objects.get(
            name=source_organizations_name,
            country=country,
            type=organization_type, is_dead=False)
        # copy the coordinates from all to-be-merged organizations into the target
        for coordinate in Coordinate.objects.all().filter(organization=source_organization):
            cloned_coordinate = deepcopy(coordinate)
            cloned_coordinate.id = None
            cloned_coordinate.organization = new_organization
            cloned_coordinate.save()
        # copy promises that are still active
        for promise in Promise.objects.all().filter(
                organization=source_organization,
                expires_on__gte=datetime.now(pytz.utc)):
            cloned_promise = deepcopy(promise)
            cloned_promise.id = None
            cloned_promise.organization = new_organization
            cloned_promise.save()

        # All living urls should be copied too (including everything below them): otherwise you might alter data
        # from the other organization and you would have a lot of "old stuff" to carry along. There would also be
        # more problems with urls that are shared amongst organizations. It's better to manage that on a copy
        # than to alter the original.
        # Should we make copies of all urls? And for what? Or should we state somewhere when a URL was owned
        # by an organization? Like n-n with extra information about when that is valid? That would result in
        # even more administration overhead.
        # No: this should not answer "since when did what organization gain what access to a url".
        # This question might arise in the future, but it has never been an issue.