...
 
# Setting up .gitignore
# Ignore json files
*.json
# Ignore data directory
data
# Ignore tmp directory
tmp
__pycache__
Copyright 2018 Kathryn Elliott
Thesis - MRes by Kathryn Elliott is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at https://gitlab.com/kathrynee/thesis-mres.
You may obtain a copy of the License at: https://creativecommons.org/licenses/by-sa/4.0/
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.
# The Expert in the Aisles: Exploring supermarket narratives in Coles and Woolworths magazines from 2009-2018 using machine learning techniques
## Background
This is the code repository for my Master of Research thesis project at
Macquarie University. My project uses topic modelling, a form of machine
learning, together with close reading to analyse the supermarket narratives
found in the magazines released by Australia's two major supermarkets, Coles
and Woolworths.
## Abstract
In Australia, supermarkets dominate our food landscape, with over eighty-four
percent of weekly food purchases occurring at the supermarket. The majority of
this shopping occurs at either Coles or Woolworths. Given Coles and Woolworths'
dominance in food retailing, the messages they promote about food form important
narratives that both reflect and reproduce broader cultural and social beliefs
about taste. This thesis uses a combination of topic modelling, a type of machine
learning, and close reading to analyse the supermarket narratives found in the
Coles and Woolworths magazines, _Coles Magazine_ and Woolworths' _Fresh_, published
between 2009 and 2018. My analysis of these narratives demonstrates how
supermarkets are positioning themselves as food and lifestyle authorities, ready
to instruct their customers on how to be good moral citizens through their
consumption choices. Although the supermarket duopoly was subjected
to intense external scrutiny and criticism from multiple sources during this
period, my research finds that this had little impact on their magazine
narratives. Finally, my research highlights the benefits and analytical richness
to be gained from combining topic modelling and close reading when performing
content analysis on a large corpus of text.
## Text corpus
Magazines were manually downloaded as PDFs from the supermarket websites:
https://www.coles.com.au/magazine and
https://www.woolworths.com.au/shop/recipes/fresh-magazine/. Due to copyright
restrictions I am unable to make this corpus available; however, I have provided
a list of magazines, together with their URLs, in the file
`/text-corpus.md`. The Docsplit version of my corpus was uploaded to OSF
(https://osf.io/hzn2a/) and made available to the examiners of my thesis. Again,
due to copyright I cannot make this publicly available; however, please contact
me if you would like to discuss access.
## Pre-processing
The PDFs were converted into text using Docsplit 0.7.6
(http://documentcloud.github.io/docsplit/) and then processed using the
following steps: tokenisation, stopword removal, lemmatisation, part-of-speech
tagging, n-gram generation and text segmentation.
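As a rough illustration of these steps, here is a minimal pre-processing sketch
using gensim and NLTK. It is a simplified sketch rather than the actual pipeline
used in this repository; the function name and sample text are placeholders, and
lemmatisation and part-of-speech tagging (done with spaCy elsewhere in the code)
are omitted for brevity.
```python
# A minimal sketch of the pre-processing steps, assuming the NLTK stopwords
# corpus has already been downloaded (nltk.download('stopwords')).
from gensim.utils import simple_preprocess
from gensim.models.phrases import Phrases, Phraser
from nltk.corpus import stopwords

stop_words = set(stopwords.words('english'))

def preprocess(documents):
    # Tokenise and lowercase each document, dropping punctuation and accents.
    tokenised = [simple_preprocess(doc, deacc=True) for doc in documents]
    # Remove stopwords.
    tokenised = [[w for w in doc if w not in stop_words] for doc in tokenised]
    # Join frequently co-occurring token pairs into bigrams.
    bigram_mod = Phraser(Phrases(tokenised, min_count=5, threshold=100))
    return [bigram_mod[doc] for doc in tokenised]

# `documents` would be the page-level text extracted by Docsplit.
print(preprocess(["Fresh seasonal recipes for the whole family."]))
```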
## Topic modelling
I used gensim 3.4 to topic model my corpus. Gensim is an open-source,
Python-based suite of topic modelling tools. While the website documentation is
basic, the site has an excellent online forum which is very welcoming to
newcomers and beginners. The author of Gensim, Radim Řehůřek
(https://radimrehurek.com/), is also active on this forum and maintains the
Gensim code base.
The topic modelling code is run with various options. To print those options,
run the following:
```bash
cd gensim/gensim-tutorial
./run-topic-model --help
```
For example:
```bash
./run-topic-model --trigrams --bigrams --pos-tags NOUN ADJ --min-topic 2 --max-topic 10 data/corpus-2009-2010.json
```
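At its core, a run like this builds a gensim dictionary and bag-of-words corpus
from the pre-processed documents and then fits an LDA model for each topic count
in the requested range. The sketch below shows only that central step; the
variable names and toy documents are illustrative, not taken from
`run-topic-model`.
```python
# Minimal gensim LDA sketch; `texts` stands in for the pre-processed,
# tokenised documents.
import gensim
import gensim.corpora as corpora

texts = [["supermarket", "recipe", "fresh", "family"],
         ["seasonal", "fruit", "vegetable", "recipe"]]

id2word = corpora.Dictionary(texts)               # token <-> id mapping
corpus = [id2word.doc2bow(doc) for doc in texts]  # bag-of-words corpus

lda_model = gensim.models.ldamodel.LdaModel(corpus=corpus,
                                            id2word=id2word,
                                            num_topics=2,
                                            random_state=100,
                                            passes=10)

for topic_id, keywords in lda_model.print_topics():
    print(topic_id, keywords)
```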
## Results
Multiple iterations of the LDA topic modelling process were run over my text
corpus, with different parameters and sections of the corpus used for each
model. After each iteration, the results were examined and compared manually,
and these findings were fed back into the modelling, helping me refine the
parameters for the next iteration. I have included the logs from all of these
runs.
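One signal I used when comparing iterations was topic coherence. A hedged sketch
of that comparison is below; it assumes `models` is a list of fitted LDA models
and that `texts` (tokenised documents) and `dictionary` come from the earlier
pre-processing steps.
```python
# Sketch: compare a list of fitted LDA models by their c_v coherence score.
from gensim.models import CoherenceModel

for model in models:
    coherence = CoherenceModel(model=model,
                               texts=texts,
                               dictionary=dictionary,
                               coherence='c_v').get_coherence()
    print("{} topics -> coherence {:.4f}".format(model.num_topics, coherence))
```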
My thesis project combines topic modelling and close reading. Once I had
refined my topic models, I close read the top 15 documents from each topic (a
sketch of how such documents can be identified follows the list below).
Copies of these documents can be found in the following directories:
* Documents from the whole corpus: corpus-20181013T125118
* Documents from Woolworths corpus 2009--2010: 1541383605.266563
* Documents from Woolworths corpus 2011--2014: 1541383605.257328
* Documents from Woolworths corpus 2015--2018: 1541383605.254759
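The sketch below shows one way of identifying the most representative documents
for each topic with gensim; it assumes `lda_model`, `corpus` and a parallel list
`doc_names` from the modelling step, and is illustrative rather than the exact
code used in this repository.
```python
# Sketch: rank documents by their probability under each topic and keep the
# top 15 per topic. `lda_model`, `corpus` and `doc_names` are assumed inputs.
from collections import defaultdict

TOP_N = 15
docs_per_topic = defaultdict(list)

for doc_id, bow in enumerate(corpus):
    for topic_id, prob in lda_model.get_document_topics(bow):
        docs_per_topic[topic_id].append((prob, doc_names[doc_id]))

for topic_id, scored_docs in sorted(docs_per_topic.items()):
    top_docs = sorted(scored_docs, reverse=True)[:TOP_N]
    print("Topic {}: {}".format(topic_id, [name for _, name in top_docs]))
```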
## spaCy
I did some earlier work using spaCy, trialling it for the natural language
processing of my text; the directory contains multiple versions of the
scripts I tested. I did not end up using spaCy.
#!/bin/bash
# Count occurrences of each word from a word list across a corpus directory.
word_file=$1
corpus_dir=$2

if [[ -z $word_file ]] || [[ -z $corpus_dir ]]; then
    echo "usage: $0 <word file> <corpus directory>"
    exit 1
fi

# Skip comment lines and blank lines in the word file.
for word in $(grep -Ev "^(#|$)" "$word_file"); do
    count=$(grep --recursive -Fi "$word" "$corpus_dir" | wc -l)
    echo "$word -> $count"
done
#!/usr/bin/env python3
import json
import argparse
from pprint import pprint

DATASET_FILE = './data/input/newsgroups.100.json'

parser = argparse.ArgumentParser(description='View sample data.')
parser.add_argument('--file', type=argparse.FileType('r'), default=DATASET_FILE)
parser.add_argument('--post', type=int, default=0)
args = parser.parse_args()
print(args)

with args.file as dataset:
    newsgroups = json.load(dataset)

# Posts in the JSON "content" mapping are keyed by the post number as a string.
pprint(newsgroups["content"][str(args.post)])
#pprint(newsgroups["target"][args.post])
#pprint(newsgroups["target_names"][args.post])
#!/usr/bin/env python3
import re
import json
import sys
import click


@click.command()
@click.argument('input', type=click.File('r'))
@click.argument('output', type=click.File('w'))
def clean_newsgroup_posts(input, output):
    """Clean every post in the input JSON and write the result to output."""
    inputjson = json.load(input)
    outputjson = inputjson
    for line in inputjson["content"]:
        print(".", end='', flush=True)  # progress indicator, one dot per post
        outputjson["content"][line] = clean_newsgroup_post(inputjson["content"][line])
    json.dump(outputjson, output, sort_keys=True, indent=4)
    return outputjson


def clean_newsgroup_post(line):
    # Strip newsgroup headers, signature/decoration lines, formatting
    # characters, email addresses, extra whitespace and single quotes.
    line = re.sub('(From|Subject|Nntp-Posting-Host|NNTP-Posting-Host|Organization|Lines|Article-I.D.|Distribution|Expires|Reply-To|X-Newsreader|Originator): .*', '', line.rstrip())
    line = re.sub('(Thanks|Regards|---|====|____|\*\*\*\*|####|\%\%\%\%|\.\.\.\.|~~~~|==>).*', '', line.rstrip())
    line = re.sub('(\n|\t|>>|\\\\|\||^^^|!-*-!-*-)', ' ', line.rstrip())
    line = re.sub('(\S*@\S*\s?)', '', line.rstrip())  # Removing emails
    line = re.sub('\s+', ' ', line.rstrip())  # Removing new lines
    line = re.sub("\'", "", line)  # Removing single quotes
    return line


# Sample post, kept for manually testing clean_newsgroup_post().
line = "From: lerxst@wam.umd.edu (where's my thing)\nSubject: WHAT car is this!?\nNntp-Posting-Host: rac3.wam.umd.edu\nOrganization: University of Maryland, College Park\nLines: 15\n\n I was wondering if anyone out there could enlighten me on this car I saw\nthe other day. It was a 2-door sports car, looked to be from the late 60s/\nearly 70s. It was called a Bricklin. The doors were really small. In addition,\nthe front bumper was separate from the rest of the body. This is \nall I know. If anyone can tellme a model name, engine specs, years\nof production, where this car is made, history, or whatever info you\nhave on this funky looking car, please e-mail.\n\nThanks,\n- IL\n ---- brought to you by your neighborhood Lerxst ----\n\n\n\n\n"
#print(clean_newsgroup_post(line))

clean_newsgroup_posts()
| Document Name | Original URL |
| --- | --- |
| 201404-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1404_Colesmag.pdf |
| 201408-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1408_Colesmag.pdf |
| 201412-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1412_Colesmag.pdf |
| 201402-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1401_Colesmag.pdf |
| 201407-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1407_Colesmag.pdf |
| 201406-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1406_Colesmag.pdf |
| 201403-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1403_Colesmag.pdf |
| 201405-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1405_Colesmag.pdf |
| 201411-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1411_Colesmag.pdf |
| 201410-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1409_Colesmag.pdf |
| 201504-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1504_Colesmag.pdf |
| 201508-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1508_Colesmag.pdf |
| 201512-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1512_Colesmag.pdf |
| 201502-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1501_Colesmag.pdf |
| 201507-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1507_Colesmag.pdf |
| 201506-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1506_Colesmag.pdf |
| 201503-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1503_Colesmag.pdf |
| 201505-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1505_Colesmag.pdf |
| 201511-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1511_Colesmag.pdf |
| 201510-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1510_Colesmag.pdf |
| 201509-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1509_Colesmag.pdf |
| 201604-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1604_Colesmag.pdf |
| 201608-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Colesmag_Aug_2016.pdf |
| 201612-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_Dec16.pdf |
| 201602-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1601_Colesmag.pdf |
| 201607-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1607_Colesmag.pdf |
| 201606-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1606_Colesmag.pdf |
| 201603-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/1603_Colesmag.pdf |
| 201605-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/COLESMAGMAY2016.pdf |
| 201611-coles.pdf | https://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_Nov16.pdf |
| 201610-coles.pdf | https://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag_Oct%2016.pdf |
| 201609-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag_Sept2016_salesfinder.pdf |
| 201704-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles_Mag_April_2017_eh3ks.pdf |
| 201708-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag%20August%202017_k8u7y3h.pdf |
| 201712-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles_Dec2017%20Mag_h8k39i.pdf |
| 201702-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag%20Feb17_d83hk.pdf |
| 201707-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag%20July%202017_k8h6d.pdf |
| 201706-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag%20June%202017_l20sf5r.pdf |
| 201703-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles_Mag_March2017_j8s3h.pdf |
| 201705-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20mag%20May%202017_dfj3ksls.pdf |
| 201711-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles_Nov%20Mag_2017_7d93k.pdf |
| 201710-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles%20Mag%20October%202017_83jdw.pdf |
| 201709-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_September2017_9k73jw3.pdf |
| 201808-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Col_August%202018%20Magzine_NE88ZNNS.pdf |
| 201807-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_July2018_PVYW34C8.pdf |
| 201806-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_Jun2018_JFEISA3JSF.pdf |
| 201805-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Cole%20May%202018%20Magazine_u7g3m9sk2.pdf |
| 201804-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_April2018_f3j59djwa.pdf |
| 201803-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/ColesMag_March2018_y7j0l3nd7.pdf |
| 201802-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Coles_Feb18_Mag_83h0u62d1.pdf |
| 201801-coles.pdf | http://d2g5na3xotdfpc.cloudfront.net/uploads/pdf/Jan%202018%20Coles%20Mag_u318ns02.pdf |
#!/usr/bin/env python3

# This work is based on a tutorial from Machine Learning Plus:
# https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/
# (My Bibtex ref: @SP18). The program works on a sample dataset of newsgroup
# postings, to pre-process the text, run it through an LDA topic model, produce
# various outputs and plot some of them in different ways.

# TODO: Add and improve comments. A number of sections are not explained and,
# in re-arranging the code, many of the existing comments are now outdated or
# unnecessary.

# Import all the packages needed for this program
import os  # So I can test each section of my code, with exit()
import io
import sys
import re
import json
import pandas
from pprint import pprint

# Import gensim packages for LDA topic modelling
import gensim
import gensim.corpora as corpora
from gensim.utils import simple_preprocess
from gensim.models.phrases import Phrases, Phraser  # building bigrams & trigrams
from gensim.models import CoherenceModel

# From nltk import the stopwords library
from nltk.corpus import stopwords

# Import spacy for lemmatisation
import spacy

# Import plotting tools, to use in visualising data
import pyLDAvis
import pyLDAvis.gensim
import matplotlib.pyplot as plt

import logging
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

ALLOWED_POS_TAGS = ['NOUN', 'ADJ', 'VERB', 'ADV']

# Load the spaCy English model for lemmatisation (parser and NER disabled).
nlp = spacy.load('en', disable=['parser', 'ner'])


# Define functions for stopwords, bigrams, trigrams and lemmatisation.

def remove_stopwords(texts):
    # 5. Prepare nltk stopwords, so they can be removed from the text corpora,
    # adding a few additional stopwords which should also be removed.
    stop_words = stopwords.words('english')
    stop_words.extend(['from', 'subject', 're', 'edu', 'use'])
    return [[word for word in simple_preprocess(str(doc)) if word not in stop_words]
            for doc in texts]


def make_bigrams(texts):
    # bigram_mod is a Phraser built from the tokenised corpus elsewhere.
    return [bigram_mod[doc] for doc in texts]


def make_trigrams(texts):
    # trigram_mod is a Phraser built on top of the bigram model elsewhere.
    return [trigram_mod[bigram_mod[doc]] for doc in texts]


def lemmatization(texts, allowed_postags=ALLOWED_POS_TAGS):
    """https://spacy.io/api/annotation"""
    texts_out = []
    for sent in texts:
        doc = nlp(" ".join(sent))
        texts_out.append([token.lemma_ for token in doc if token.pos_ in allowed_postags])
    return texts_out


def generate_models(corpus, id2word, min_topic_count, max_topic_count):
    model_list = []
    for n in range(min_topic_count, max_topic_count):
        print("\n>>> Generating " + str(n) + " topics <<<\n")
        # 12. Building the Topic Model
        # In addition to the corpus and dictionary, need to provide the number
        # of topics. alpha and eta are hyperparameters which affect the sparsity
        # of the topics. In gensim, both default to a 1.0/num_topics prior.
        # `chunksize` is the number of documents to be used in each training
        # chunk. `update_every` determines how often the model parameters should
        # be updated. `passes` is the total number of training passes.
        lda_model = gensim.models.ldamodel.LdaModel(corpus=corpus,
                                                    id2word=id2word,
                                                    num_topics=n,
                                                    random_state=100,
                                                    update_every=1,
                                                    chunksize=100,
                                                    passes=10,
                                                    alpha='auto',
                                                    per_word_topics=True)
        model_list.append(lda_model)
    return model_list


def find_maximally_coherent_model(models, lemmatised_data, dictionary, min_topic_count, max_topic_count):
    # Compute a coherence score for each model and keep the most coherent one.
    max_model = None
    max_coherence = None
    for n in range(min_topic_count, max_topic_count):
        model = models[n - min_topic_count]
        coherence_model = CoherenceModel(model=model,
                                         texts=lemmatised_data,
                                         dictionary=dictionary,
                                         coherence='c_v')
        coherence = coherence_model.get_coherence()
        if max_coherence is None or coherence > max_coherence:
            max_coherence = coherence
            max_model = model
    return max_model


def print_topic_keywords(models, min_topic_count, max_topic_count):
    # Print the keywords for each topic of each model.
    for n in range(min_topic_count, max_topic_count):
        print(">>> keywords for each topic <<<")
        pprint(models[n - min_topic_count].print_topics())


def format_topics_sentences(ldamodel, corpus, texts):
    # Init output
    sent_topics_df = pandas.DataFrame()
    # Get the main topic in each document
    for i, row in enumerate(ldamodel[corpus]):
        row = sorted(row, key=lambda x: (x[1]), reverse=True)
        # Get the dominant topic, percentage contribution and keywords
        # for each document
        for j, (topic_num, prop_topic) in enumerate(row):
            if j == 0:  # => dominant topic
                wp = ldamodel.show_topic(topic_num)
                topic_keywords = ", ".join([word for word, prop in wp])
                sent_topics_df = sent_topics_df.append(
                    pandas.Series([int(topic_num), round(prop_topic, 4), topic_keywords]),
                    ignore_index=True)
            else:
                break
    sent_topics_df.columns = ['Dominant_Topic', 'Perc_Contribution', 'Topic_Keywords']
    # Add the original text to the end of the output
    contents = pandas.Series(texts)
    sent_topics_df = pandas.concat([sent_topics_df, contents], axis=1)
    return sent_topics_df
#!/usr/bin/env python3
import argparse
import logging
import sys
from topic_model import TopicModel
parser = argparse.ArgumentParser(description='Process some files.')
parser.add_argument('filename', metavar='filename', help='Data file to process', type=str)
parser.add_argument('--min-topic-count', metavar='N', default=5, help='Minimum topic count', type=int)
parser.add_argument('--max-topic-count', metavar='N', default=20, help='Maximum topic count', type=int)
parser.add_argument('--topic-step', metavar='N', default=1, help='The topic step. For example, a step of 2 would give 1, 3, 5', type=int)
parser.add_argument('--pos-tags', metavar='[POS tags]', default=None, help='The Part Of Speech tags to extract from the text', nargs='*')
parser.add_argument('--log-level', metavar='STRING', default="info", help='The log level (debug, info, etc)', )
parser.add_argument('--export-doc-map', metavar='STRING', default=None, help='Export the document-document id mapping to given file', )
parser.add_argument('--log-to-console', dest='log_to_console', action='store_true', help='Write log messages to the console, rather than a log file')
parser.add_argument('--tfidf', dest='tfidf', action='store_true', help='Process the term document matrix using tf-idf')
parser.add_argument('--trigrams', dest='trigrams', action='store_true', help='Generate tri-grams')
parser.add_argument('--bigrams', dest='bigrams', action='store_true', help='Generate bi-grams')
parser.set_defaults(tfidf=False)
parser.set_defaults(trigrams=False)
parser.set_defaults(bigrams=False)
parser.set_defaults(log_to_console=False)
args = vars(parser.parse_args())
doc_map_filename = args.pop('export_doc_map')
topic_model = TopicModel(args.pop('filename'), **args)
if doc_map_filename:
    topic_model.export_document_id_map(doc_map_filename)
    sys.exit()
topic_model.run_lda()
topic_model.calculate_coherence_models()
topic_model.export_document_id_map()
topic_model.export_all_topics_per_documents()
topic_model.export_all_topic_visualisation_data()
topic_model.print_topics(extended=True)
from topic_model.topic_model import TopicModel
mv "Coles - April 2014 - Offer valid Tue 1 Apr - Sat 7 Jun 2025.pdf" 201404-coles.pdf
mv "Coles - August 2014 - Offer valid Fri 1 Aug - Sat 7 Jun 2025.pdf" 201408-coles.pdf
mv "Coles - December 2014 - Offer valid Mon 1 Dec - Sat 7 Jun 2025.pdf" 201412-coles.pdf
mv "Coles - January_February 2014 - Offer valid Wed 1 Jan - Sat 7 Jun 2025.pdf" 201402-coles.pdf
mv "Coles - July 2014 - Offer valid Tue 1 Jul - Sat 7 Jun 2025.pdf" 201407-coles.pdf
mv "Coles - June 2014 - Offer valid Sun 1 Jun - Sat 7 Jun 2025.pdf" 201406-coles.pdf
mv "Coles - March 2014 - Offer valid Sat 1 Mar - Sat 7 Jun 2025.pdf" 201403-coles.pdf
mv "Coles - May 2014 - Offer valid Thu 1 May - Sat 7 Jun 2025.pdf" 201405-coles.pdf
mv "Coles - November 2014 - Offer valid Sat 1 Nov - Sat 7 Jun 2025.pdf" 201411-coles.pdf
mv "Coles - September_October 2014 - Offer valid Mon 1 Sep - Sat 7 Jun 2025.pdf" 201410-coles.pdf
mv "Coles - April 2015 - Offer valid Wed 1 Apr - Sat 7 Jun 2025.pdf" 201504-coles.pdf
mv "Coles - August 2015 - Offer valid Sat 1 Aug - Sat 7 Jun 2025.pdf" 201508-coles.pdf
mv "Coles - December 2015 - Offer valid Tue 1 Dec - Sat 7 Jun 2025.pdf" 201512-coles.pdf
mv "Coles - January_February 2015 - Offer valid Thu 1 Jan - Sat 7 Jun 2025.pdf" 201502-coles.pdf
mv "Coles - July 2015 - Offer valid Wed 1 Jul - Sat 7 Jun 2025.pdf" 201507-coles.pdf
mv "Coles - June 2015 - Offer valid Mon 1 Jun - Sat 7 Jun 2025.pdf" 201506-coles.pdf
mv "Coles - March 2015 - Offer valid Sun 1 Mar - Sat 7 Jun 2025.pdf" 201503-coles.pdf
mv "Coles - May 2015 - Offer valid Fri 1 May - Sat 7 Jun 2025.pdf" 201505-coles.pdf
mv "Coles - November 2015 - Offer valid Sun 1 Nov - Sat 7 Jun 2025.pdf" 201511-coles.pdf
mv "Coles - October 2015 - Offer valid Thu 1 Oct - Sat 7 Jun 2025.pdf" 201510-coles.pdf
mv "Coles - September 2015 - Offer valid Tue 1 Sep - Sat 7 Jun 2025.pdf" 201509-coles.pdf
mv "Coles - April 2016 - Offer valid Fri 1 Apr - Sat 7 Jun 2025.pdf" 201604-coles.pdf
mv "Coles - August 2016 - Offer valid Thu 4 Aug - Mon 31 Dec 2035.pdf" 201608-coles.pdf
mv "Coles - December 2016 - Offer valid Thu 1 Dec - Sun 1 Dec 2030.pdf" 201612-coles.pdf
mv "Coles - January_February 2016 - Offer valid Fri 1 Jan - Sat 7 Jun 2025.pdf" 201602-coles.pdf
mv "Coles - July 2016 - Offer valid Thu 7 Jul - Wed 31 Dec 2025.pdf" 201607-coles.pdf
mv "Coles - June 2016 - Offer valid Wed 1 Jun - Fri 31 Jan 2025.pdf" 201606-coles.pdf
mv "Coles - March 2016 - Offer valid Tue 1 Mar - Sat 7 Jun 2025.pdf" 201603-coles.pdf
mv "Coles - May 2016 - Offer valid Sun 1 May - Wed 31 Dec 2025.pdf" 201605-coles.pdf
mv "Coles - November 2016 - Offer valid Thu 3 Nov - Thu 31 Dec 2020.pdf" 201611-coles.pdf
mv "Coles - October 2016 - Offer valid Thu 6 Oct - Tue 31 Dec 2030.pdf" 201610-coles.pdf
mv "Coles - September 2016 - Offer valid Thu 1 Sep - Tue 31 Dec 2030.pdf" 201609-coles.pdf
mv "Coles - April - Offer valid Thu 6 Apr - Tue 31 Dec 2030.pdf" 201704-coles.pdf
mv "Coles - August 2017 - Offer valid Thu 3 Aug - Tue 31 Dec 2030.pdf" 201708-coles.pdf
mv "Coles - December 2017 - Offer valid Thu 7 Dec - Tue 31 Dec 2030.pdf" 201712-coles.pdf
mv "Coles - February 2017 - Offer valid Thu 26 Jan - Tue 31 Dec 2030.pdf" 201702-coles.pdf
mv "Coles - July 2017 - Offer valid Thu 6 Jul - Tue 31 Dec 2030.pdf" 201707-coles.pdf
mv "Coles - June 2017 - Offer valid Thu 1 Jun - Tue 31 Dec 2030.pdf" 201706-coles.pdf
mv "Coles - March 2017 - Offer valid Thu 2 Mar - Tue 31 Dec 2030.pdf" 201703-coles.pdf
mv "Coles - May 2017 - Offer valid Thu 4 May - Tue 31 Dec 2030.pdf" 201705-coles.pdf
mv "Coles - November 2017 - Offer valid Mon 6 Nov - Tue 31 Dec 2030.pdf" 201711-coles.pdf
mv "Coles - October 2017 - Offer valid Thu 5 Oct - Tue 31 Dec 2030.pdf" 201710-coles.pdf
mv "Coles - September 2017 - Offer valid Thu 7 Sep - Tue 31 Dec 2030.pdf" 201709-coles.pdf
mv "Cole May 2018 Magazine_u7g3m9sk2.pdf" 201805-coles.pdf
mv "Coles - February 2018 - Offer valid Mon 8 Feb - Thu 31 Dec 2030.pdf" 201802-coles.pdf
mv "Coles-January-2018.pdf" 201801-coles.pdf
mv "ColesMag_April2018_f3j59djwa.pdf" 201804-coles.pdf
mv "ColesMag_Jun2018_JFEISA3JSF.pdf" 201806-coles.pdf
mv "ColesMag_March2018_y7j0l3nd7.pdf" 201803-coles.pdf
mv "Col_August 2018 Magzine_NE88ZNNS.pdf" 201808-coles.pdf
mv "ColesMag_July2018_PVYW34C8.pdf" 201807-coles.pdf
python~=3.5
numpy~=1.14
pandas~=0.23
spacy~=2.0
nltk~=3.3
gensim~=3.4
pyLDAvis~=2.1
matplotlib~=2.2
click~=6.7