Investigate use of Wikipedia as a data source
Wikipedia has a wealth of knowledge that could be used either as a data source or for seeding the DB.
Goal
Evaluate Wikipedia as a data source. Find out what kinds of graphs can be generated and whether the data has to be cleaned up or filtered to produce comprehensible graphs.
Tasks
- Write a Wikipedia scraper that generates a dependency graph (topic X depends on the following topics)
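A rough sketch of what the graph-building part of such a scraper might look like. It assumes that a topic "depends on" every article its page links to, which is only one possible definition and is not settled in this issue. The names extract_links, build_dependency_graph and fetch_wikitext are hypothetical; fetch_wikitext is a caller-supplied function that returns the raw wikitext of a page, so the sketch makes no claim about which API or dump is used to get it.

```python
import re
from collections import deque
from typing import Callable, Dict, List

# Captures the target of an internal wiki link: [[Target]], [[Target|label]]
# or [[Target#Section]]. Only the part before "|" or "#" is kept.
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)")


def extract_links(wikitext: str) -> List[str]:
    """Return unique internal link targets in order of first appearance."""
    seen: Dict[str, None] = {}
    for match in LINK_RE.finditer(wikitext):
        target = match.group(1).strip()
        # Assumption: namespaced links (File:, Category:, ...) are not topics.
        if target and ":" not in target:
            seen.setdefault(target, None)
    return list(seen)


def build_dependency_graph(
    root: str,
    fetch_wikitext: Callable[[str], str],
    max_depth: int = 1,
) -> Dict[str, List[str]]:
    """Breadth-first crawl from root; pages up to max_depth links away are fetched.

    Returns a mapping of topic -> topics it links to (its assumed dependencies).
    """
    graph: Dict[str, List[str]] = {}
    queue = deque([(root, 0)])
    while queue:
        topic, depth = queue.popleft()
        if topic in graph:
            continue  # already fetched via another path
        graph[topic] = extract_links(fetch_wikitext(topic))
        if depth + 1 <= max_depth:
            for dependency in graph[topic]:
                queue.append((dependency, depth + 1))
    return graph
```

Passing the fetch function in keeps the crawl logic testable against canned pages, and leaves the choice between live requests and a database dump open. A real page links to hundreds of articles, so some filtering will most likely be needed before the graph is readable.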
Notes
The importance of a dependent topic could be calculated from how many dependents it has in turn. For example, if topicA depends on dependentTopic, which in turn depends on subDependents, then dependentTopic.importance = sum(subDependents). A minimum importance could then be used to determine which dependentTopic is shown on the graph.
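As an illustration of the note above, here is a minimal sketch of the importance calculation and the threshold filter. It reads sum(subDependents) as "the number of sub-dependents", which is an assumption; a recursive or weighted sum would be equally consistent with the note. The graph is assumed to be a plain mapping of topic to the list of topics it depends on, and the function names importance and visible_dependents are hypothetical.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]


def importance(graph: Graph, topic: str) -> int:
    """Importance of a topic = number of topics it depends on (assumed reading)."""
    return len(graph.get(topic, []))


def visible_dependents(graph: Graph, topic: str, min_importance: int) -> List[str]:
    """Dependents of topic whose importance reaches the minimum threshold."""
    return [
        dependent
        for dependent in graph.get(topic, [])
        if importance(graph, dependent) >= min_importance
    ]


# Hand-made example data, not taken from Wikipedia.
example_graph: Graph = {
    "topicA": ["dependentTopic1", "dependentTopic2"],
    "dependentTopic1": ["sub1", "sub2", "sub3"],
    "dependentTopic2": ["sub1"],
}

shown = visible_dependents(example_graph, "topicA", min_importance=2)
```

With a minimum importance of 2, only dependentTopic1 (three sub-dependents) would be drawn for topicA; dependentTopic2 (one sub-dependent) would be hidden.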
Edited by Average Dude