Scraping scheduler, testing, DB and cache integration
This is the final PR for the server side.
To test, go to `server/Fake News`, copy this folder, and perform the following steps:
- Install and activate a virtualenv (e.g. `python3 -m venv venv && source venv/bin/activate`).
- Install all required libraries:
  `pip install -r requirements.txt`
- Also install the NLTK data files. Open a Python terminal and run:
  `import nltk`
  `nltk.download('punkt')`
  `nltk.download('wordnet')`
  `nltk.download('stopwords')`
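If you prefer a non-interactive setup, the same downloads can be scripted; this is a minimal sketch, equivalent to the four commands above:

```python
# one-shot fetch of the NLTK data used above (same packages as the steps list)
import nltk

for pkg in ("punkt", "wordnet", "stopwords"):
    nltk.download(pkg)
```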
- Install PostgreSQL for the DB and cache:
  `sudo apt install postgresql postgresql-contrib`
- Create a DB:
  `sudo -u postgres createdb fakeNewsDB`
  If this fails because your PostgreSQL role does not exist yet, create one first (use your machine's username):
  `sudo -u postgres createuser "usernameOfUrPC"`
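For reference, the Flask app's connection string has to point at this database. A minimal sketch, assuming Flask-SQLAlchemy (the migration commands below suggest it); the user and host in the URI are placeholders for your local setup:

```python
# config sketch -- Flask-SQLAlchemy assumed; URI credentials are placeholders
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://postgres@localhost/fakeNewsDB"
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db = SQLAlchemy(app)
```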
- To install the Chrome driver for news web scraping (if not installed already):
  `sudo apt-get install chromium-chromedriver`
  If there is a path error, add the driver path to the Selenium line:
  `from selenium import webdriver`
  `driver = webdriver.Chrome("/usr/lib/chromium-browser/chromedriver")`
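Note: Selenium 4+ dropped the positional driver path, so the line above no longer works on newer installs; the Service-based equivalent (same chromedriver path assumed) looks like this:

```python
# Selenium 4+ equivalent -- the positional path argument was removed upstream
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

service = Service("/usr/lib/chromium-browser/chromedriver")
driver = webdriver.Chrome(service=service)
```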
- To create all tables and handle migrations (do this once):
  `python API_manager.py db init`
  `python API_manager.py db migrate`
  `python API_manager.py db upgrade`
  `python API_manager.py runserver`
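For context, these subcommands follow the Flask-Script plus (pre-3.x) Flask-Migrate pattern, so API_manager.py presumably wires them up roughly like this sketch; the `from app import app, db` line is an assumption about the repo layout:

```python
# API_manager.py sketch -- Flask-Script + Flask-Migrate assumed
from flask_script import Manager
from flask_migrate import Migrate, MigrateCommand

from app import app, db  # hypothetical import path; adjust to this repo

migrate = Migrate(app, db)
manager = Manager(app)                     # Flask-Script provides `runserver`
manager.add_command("db", MigrateCommand)  # provides init/migrate/upgrade

if __name__ == "__main__":
    manager.run()
```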
- To test whether data is committed to the DB properly or not:
- Log in to the DB:
  `sudo -u postgres psql`
- Connect to the DB, then list all tables present:
  `\c fakeNewsDB`
  `\dt`
- To open a relation/table:
  `SELECT * FROM "table_name";`
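The same check can be scripted instead of using psql; a minimal sketch with psycopg2, where the connection parameters and table name are placeholders:

```python
# quick commit check -- psycopg2 sketch; credentials and table are placeholders
import psycopg2

conn = psycopg2.connect(dbname="fakeNewsDB", user="postgres", host="localhost")
cur = conn.cursor()
cur.execute('SELECT COUNT(*) FROM "news"')  # hypothetical table name
print("rows committed:", cur.fetchone()[0])
cur.close()
conn.close()
```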
Tables after testing: [screenshot]
API request struct: [screenshot]