Search Wikipedia articles by title and add them to a public Elasticsearch server

Added the following functions to the qary.etl.elastic module:

  1. load_article_from_wikipedia(title) finds a Wikipedia article by title and loads it section by section
  2. build_elasticsearch_record(page, section_list) creates the records for an Elasticsearch index and adds a keyword field to each
  3. add_articles_to_elasticsearch(articles, index) adds the articles to Elasticsearch, section by section
  4. parse_and_index_wikipedia_article(title) runs steps 1, 2 and 3 in a single call
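
For reviewers, here is a rough sketch of the record-building idea behind steps 2 and 3. It is not the code in this merge request: the Section class, build_section_records, to_bulk_actions, the field names and the meaning of the keyword field are all assumptions made for illustration. It needs no network access and no Elasticsearch client.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List


@dataclass
class Section:
    """Hypothetical stand-in for one section of a Wikipedia page."""
    title: str
    text: str


def build_section_records(page_title: str, sections: List[Section]) -> List[Dict[str, Any]]:
    """Turn every non-empty section into one flat, indexable record."""
    records = []
    for number, section in enumerate(sections):
        text = section.text.strip()
        if not text:
            continue  # nothing to index for an empty section
        records.append({
            "title": page_title,
            "section_number": number,
            "section_title": section.title,
            "text": text,
            # Assumption: the keyword field holds a normalized page title.
            "keyword": page_title.lower(),
        })
    return records


def to_bulk_actions(records: List[Dict[str, Any]], index: str) -> Iterator[Dict[str, Any]]:
    """Wrap records as action dicts with _index, _id and _source keys.

    This is the shape accepted by elasticsearch.helpers.bulk(client, actions).
    """
    for record in records:
        yield {
            "_index": index,
            "_id": f"{record['title']}#{record['section_number']}",
            "_source": record,
        }


if __name__ == "__main__":
    demo = [Section("Summary", "Short intro."), Section("History", "Some history.")]
    for action in to_bulk_actions(build_section_records("Example", demo), "wikipedia"):
        print(action["_id"], action["_source"]["section_title"])
```

Deriving the document id from the page title and section number means that re-indexing the same article overwrites its sections instead of duplicating them. Whether the merged code does this is not stated here.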

The qary.secrets file is a temporary file that contains the Elasticsearch server information (host and port).
