Commit c1d21e16 authored by Mitchell Murphy

Merge branch '2.0.0' into 'master'

v2.0.0

See merge request !2
parents 0a207d56 50b7a83b
Pipeline #303455859 passed with stage
in 53 seconds
......@@ -117,4 +117,7 @@ dmypy.json
### Custom Files ###
# Mac Finder files
.DS_Store
\ No newline at end of file
.DS_Store
# VS Code Settings
.vscode/
\ No newline at end of file
image: python:3.7
image: continuumio/miniconda3:latest
stages:
- build
- test
variables: #change pip location to local dir - only local items can be cached
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
cache: #cache pip & venv so packages can also be cached
paths:
- .cache/pip
- venv/
setup:
stage: build
script:
- pip install virtualenv
- virtualenv venv
- venv/bin/pip install -r requirements.txt
before_script:
- conda create -n dirta-science python=3.9 poetry=1.1.6
- source activate dirta-science
- poetry install --no-dev
test_repo_setup:
stage: test
script:
- venv/bin/python tests/test_repo_setup.py
- poetry run python tests/test_repo_setup.py
test_repo_unittests:
stage: test
script:
- cd ..
- dirta-science/venv/bin/python -m cookiecutter --no-input dirta-science
- poetry run python -m cookiecutter --no-input ../dirta-science
- cd "repository name"
- pip install .
- python tests/test_data.py
VENV_DIR ?= venv
PYTHON = $(VENV_DIR)/bin/python
pyenv: requirements.txt
pip3 install virtualenv
test -d venv || python3 -m virtualenv venv
$(VENV_DIR)/bin/pip install -r requirements.txt
env: pyproject.toml
conda create -n dirta-science python=3.9 poetry=1.1.6
tests:
$(PYTHON) tests/test_repo_setup.py
\ No newline at end of file
poetry run python tests/test_repo_setup.py
......@@ -2,7 +2,9 @@
A standardised directory structure for Data Science projects.
## Template Architecture
# General Use
## Template Layout
├── LICENSE
......@@ -14,8 +16,8 @@ A standardised directory structure for Data Science projects.
├── data
│ ├── interim <- Intermediate data that has been preprocessed/transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── sample <- Sample data for which experiments can be executed on.
│ ├── processed <- The final, canonical data sets for modelling.
│ └── raw <- Raw data for which experiments can be executed on.
├── docs <- Terminology, manuals, and all other explanatory materials
......@@ -31,26 +33,24 @@ A standardised directory structure for Data Science projects.
├── tests <- Tests for project package
└── {package name} <- Source code following PEP 8 syntax styling guide.
└── {package name} <- Project package.
├── data <- Module for importing/generating data.
├── features <- Module for preprocessing and cleaning raw data.
├── models <- Module for training and implementing models.
└── visualisation <- Module for visualising results.
## Getting Started
### Prerequisites
## Prerequisites
* [GNU Make](https://www.gnu.org/software/make/) (symlink: `make`)
* [Python3.7 or above](https://www.python.org/downloads/)
* pip - installed with Python3
* [Miniconda](https://docs.conda.io/en/latest/miniconda.html)
* Any version of Miniconda/Anaconda that supports Python3.9 is okay.
### Installing
## Creating a Template
Install cookiecutter, e.g. via pip:
Install cookiecutter in your base environment:
```
pip install cookiecutter==1.6.0
conda install cookiecutter=1.7.2
```
Create your project with the data science template:
......@@ -58,21 +58,26 @@ Create your project with the data science template:
cookiecutter https://gitlab.com/mwtmurphy/dirta-science
```
## Development
Prompts will then follow on screen for the remaining project setup. After this, you're free to tackle your project. Enjoy!
# Developers Guide
Install the requirements:
## Setup
After installing the prerequisites stated above, create the virtual environment for this project with:
```
pip install -r requirements.txt
make env
```
Or, create a virutal environment for developing this project:
Activate the environment and install dependencies with:
```
make pyenv
conda activate dirta-science
poetry install
```
This is then accessible through `source venv/bin/activate` (Mac/Linux) or `source venv/Scripts/activate` (Windows + Git Bash).
Should you only want to install core dependencies, add the flag `--no-dev`. After this, you're free to develop your changes.
## Running Tests
......@@ -86,6 +91,12 @@ make tests
* **Mitchell Murphy**
* Maintainer
* Updated template to work with Conda & Poetry (v2.0.0)
* Updated template to work with virtualenv and added CRISP-DM documentation (v1.0.0)
Contributors are advised to follow [PEP 8 guidelines](https://www.python.org/dev/peps/pep-0008/) for code layout.
Outstanding issues (tasks, bugs, etc.) can be found in the Issues tracker. Log any completed issues in the contributors list above.
## License
......@@ -94,5 +105,5 @@ This project is licensed under the [MIT License](./LICENSE).
## Acknowledgements
* This project is a personalised theme for [Cookiecutter](https://cookiecutter.readthedocs.io/en/latest/).
* This theme is created from experience with [Cookiecutter Data Science](https://drivendata.github.io/cookiecutter-data-science/).
* This theme is originally based on [Cookiecutter Data Science](https://drivendata.github.io/cookiecutter-data-science/).
* The template documents are based on the [CRISP-DM data mining guide](docs/crisp_dm.pdf).
\ No newline at end of file
......@@ -4,6 +4,5 @@
"pkg_name": "{{ cookiecutter.repo_name.lower().replace(' ', '_').replace('-', '_') }}",
"project_description": "short description for project",
"project_author": "project author",
"python_symlink": ["python3", "python"],
"pip_symlink": ["pip3", "pip"]
"python_version": "3.9"
}
\ No newline at end of file
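The `pkg_name` default above is a Jinja expression that normalises `repo_name`, and the CI and Makefile templates in this commit derive the Conda environment name from it with `.replace('_', '-')`. A minimal Python sketch of the same string transformations follows. The function names `to_pkg_name` and `to_env_name` are hypothetical and exist only for illustration; they are not part of the template.

```python
def to_pkg_name(repo_name: str) -> str:
    """Mirror the cookiecutter.json expression: lowercase, then map spaces and hyphens to underscores."""
    return repo_name.lower().replace(" ", "_").replace("-", "_")


def to_env_name(pkg_name: str) -> str:
    """Mirror the template expression used for the Conda environment name."""
    return pkg_name.replace("_", "-")


pkg = to_pkg_name("Repository Name")
print(pkg)               # repository_name
print(to_env_name(pkg))  # repository-name
```

Keeping underscores in the package name makes it importable in Python, while the hyphenated form is used only for the environment name.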
This diff is collapsed.
[tool.poetry]
name = "dirta-science"
version = "2.0.0"
description = "Data Science Project Template"
authors = ["mwtmurphy <hi@mwtmurphy.com>"]
license = "MIT"
[tool.poetry.dependencies]
python = "^3.9"
cookiecutter = "^1.7.3"
[tool.poetry.dev-dependencies]
pylint = "^2.8.2"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
cookiecutter==1.6.0
pylint==2.3.1
\ No newline at end of file
......@@ -63,7 +63,7 @@ class TestDirtaScience(unittest.TestCase):
data = input_file.read().split("\n")
self.assertEqual(ARGS["project_title"], data[0].replace("# ", ""))
self.assertEqual(ARGS["project_description"], data[2].replace(".", ""))
self.assertEqual(ARGS["project_author"], data[67].replace("*", "").strip())
self.assertEqual(ARGS["project_author"], data[88].replace("*", "").strip())
def map_dir_names(dir_branch: typing.Tuple[str, list, list]) -> typing.List[list]:
......
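The hard-coded index moves from 67 to 88, presumably because the generated README changed length in this commit, so the assertion needs updating whenever README lines are added or removed. One alternative, shown here only as a hedged sketch and not as the repository's approach, is to locate the author line relative to the `## Contributors` heading. The helper name `find_author` and the sample README text are hypothetical.

```python
import typing


def find_author(lines: typing.List[str]) -> typing.Optional[str]:
    """Return the first bold bullet after the '## Contributors' heading, without the asterisks."""
    in_section = False
    for line in lines:
        if line.strip() == "## Contributors":
            in_section = True
            continue
        if in_section and line.startswith("#"):
            break  # reached the next heading without finding an author
        if in_section and line.strip().startswith("* **"):
            return line.replace("*", "").strip()
    return None


sample = (
    "# Title\n\nDescription.\n\n## Contributors\n\n"
    "* **project author**\n    * Maintainer\n\n## License\n"
)
print(find_author(sample.split("\n")))  # project author
```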
......@@ -101,4 +101,7 @@ venv/
### Additional Files ###
# Mac Finder styling
.DS_Store
\ No newline at end of file
.DS_Store
# VS Code settings
.vscode/
\ No newline at end of file
image: python:3.7
image: continuumio/miniconda3:latest
stages:
- build
- test
before_script:
- conda create -n {{ cookiecutter.pkg_name.replace('_', '-') }} python=3.9 poetry=1.1.6
- source activate {{ cookiecutter.pkg_name.replace('_', '-') }}
- poetry install --no-dev
variables: #change pip location to local dir - only local items can be cached
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
cache: #cache pip & venv so packages can also be cached
paths:
- .cache/pip
- venv/
setup:
stage: build
script:
- pip install virtualenv
- virtualenv venv
- venv/bin/pip install -e .
{{cookiecutter.pkg_name}}_data:
{{ cookiecutter.pkg_name }}_data:
stage: test
script:
- venv/bin/python tests/test_data.py
\ No newline at end of file
- poetry run python tests/test_data.py
\ No newline at end of file
VENV_DIR ?= venv
PYTHON = $(VENV_DIR)/bin/python
pyenv: requirements.txt
{{cookiecutter.pip_symlink}} install virtualenv
test -d venv || {{cookiecutter.python_symlink}} -m virtualenv venv
$(VENV_DIR)/bin/pip install -r requirements.txt
env: pyproject.toml
conda create -n {{ cookiecutter.pkg_name.replace('_', '-') }} python={{ cookiecutter.python_version }} poetry=1.1.6
tests:
test -d venv || make pyenv
$(PYTHON) -m unittest discover -s tests -p "test*.py"
poetry run python -m unittest discover -s tests -p "test*.py"
# {{cookiecutter.project_title}}
# {{ cookiecutter.project_title }}
{{cookiecutter.project_description}}.
{{ cookiecutter.project_description }}.
## Project Architecture
# General Use
## Project Layout
├── LICENSE
......@@ -14,8 +16,8 @@
├── data
│ ├── interim <- Intermediate data that has been preprocessed/transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── sample <- Sample data for which experiments can be executed on.
│ ├── processed <- The final, canonical data sets for modelling.
│ └── raw <- Sample data for which experiments can be executed on.
├── docs <- Terminology, manuals, and all other explanatory materials
......@@ -31,41 +33,60 @@
├── tests <- Tests for project package
└── {{cookiecutter.pkg_name}} <- Source code following PEP 8 syntax styling guide.
└── {{ cookiecutter.pkg_name }} <- Project package.
├── data <- Module for importing/generating data.
├── features <- Module for preprocessing and cleaning raw data.
├── models <- Module for training and implementing models.
└── visualisation <- Module for visualising results.
## Getting Started
~ A guide to getting this project up and running on a local machine ~
### Prerequisites
## Prerequisites
This project requires:
* [GNU Make](https://www.gnu.org/software/make/) (symlink: `make`)
* [Python3.7 or above](https://www.python.org/downloads/) (symlink: `{{ cookiecutter.python_symlink }}`)
* pip - installed with Python3 (symlink: `{{ cookiecutter.pip_symlink }}`)
* [Miniconda](https://docs.conda.io/en/latest/miniconda.html)
* Any version of Miniconda/Anaconda that supports Python3.9 is okay.
~ Refer to any additional software needed to be installed prior to setting up the development environment, and how to install it, here. ~
### Installing
## User Guide
~ A guide for how to use the end product of this project. ~
# Developers Guide
## Setup
Run `make pyenv` in order to set up the virtual environment for this project. You can then activate the environment by executing `source venv/bin/activate` (Mac/Linux) or `source venv/scripts/activate` (Windows + Git Bash).
After installing the prerequisites stated above, create the virtual environment for this project with:
```
make env
```
Activate the environment and install dependencies with:
```
conda activate {{ cookiecutter.pkg_name.replace('_', '-') }}
poetry install
```
Should you only want to install core dependencies, add the flag `--no-dev`. After this, you're free to develop your changes.
~ Provide a step-by-step guide to getting any additional aspects of the development environment running. ~
## Running Tests
Run `make tests` in order to execute the unit tests (found in `tests`).
All tests are located in [tests](/tests) and can be run with:
```
make tests
```
~ Explain the automated tests for this project ~
## Contributors
* **{{cookiecutter.project_author}}**
* **{{ cookiecutter.project_author }}**
* Maintainer
Contributors are advised to follow [PEP 8 guidelines](https://www.python.org/dev/peps/pep-0008/) for code layout.
......@@ -74,10 +95,10 @@ Outstanding issues (tasks, bugs, etc.) can be found in the Issues tracker. Log a
## License
~ Reference to the license - or absence of - in this project ~
~ Reference to the license used in this project ~
## Acknowledgements
This information architecture is based on v1.0.0 of Dirta Science.
This project layout is based on v2.0.0 of [Dirta Science](https://gitlab.com/mwtmurphy/dirta-science).
~ A hat tip to any other research, inspirations, code, etc. used in this project ~
\ No newline at end of file
[tool.poetry]
name = "{{ cookiecutter.pkg_name }}"
version = "0.1.0"
description = "{{ cookiecutter.project_description }}"
authors = ["{{ cookiecutter.project_author }} <you@example.com>"]
[tool.poetry.dependencies]
python = "^{{ cookiecutter.python_version }}"
python-dotenv = "^0.17.1"
[tool.poetry.dev-dependencies]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
#core packages
python-dotenv
#exploration packages
jupyter
matplotlib
numpy
pandas
scipy
seaborn
#local package
-e .
\ No newline at end of file
import setuptools
setuptools.setup(
name="{{cookiecutter.pkg_name}}",
name="{{ cookiecutter.pkg_name }}",
version="0.1.0",
description="",
license="",
author="{{cookiecutter.project_author}}",
author="{{ cookiecutter.project_author }}",
author_email="",
install_requires=[
"python-dotenv"
],
packages=[
"{{cookiecutter.pkg_name}}",
"{{cookiecutter.pkg_name}}.data",
"{{cookiecutter.pkg_name}}.features",
"{{cookiecutter.pkg_name}}.models",
"{{cookiecutter.pkg_name}}.visualisation"
],
packages=setuptools.find_packages(),
zip_safe=False
)
'''test script for {{cookiecutter.pkg_name}}.data'''
'''test script for {{ cookiecutter.pkg_name }}.data'''
#external imports
import os
......@@ -6,18 +6,18 @@ import unittest
from unittest.mock import patch
#internal imports
from {{cookiecutter.pkg_name}} import data, errors
from {{ cookiecutter.pkg_name }} import data
#unit tests
class TestGetEnvVars(unittest.TestCase):
def test_bad_arg(self):
'''Validate argument error raise when incorrect variable type
provided as argument to get_env_vars'''
self.assertRaises(errors.ArgError, data.get_env_vars, None)
self.assertRaises(errors.ArgError, data.get_env_vars, 1)
self.assertRaises(TypeError, data.get_env_vars, None)
self.assertRaises(TypeError, data.get_env_vars, 1)
def test_bad_env_vars(self):
self.assertRaises(errors.EnvError, data.get_env_vars, "DUMMY VAR")
self.assertRaises(KeyError, data.get_env_vars, "DUMMY VAR")
def test_expected_outcome(self):
with patch.dict('os.environ', {'KEY1': 'value_1'}):
......
......@@ -5,11 +5,11 @@ import dotenv
import os
import typing
from {{cookiecutter.pkg_name}} import errors
from {{ cookiecutter.pkg_name }} import errors
#functions
def get_env_vars(*args: str) -> typing.Union[typing.Tuple[str, ...], str]:
'''Returns list of envrionment variable values respective for each
'''Returns list of environment variable values respective for each
key provided, or a string if only a single variable is requested'''
try:
if len(args) > 1:
......@@ -18,15 +18,14 @@ def get_env_vars(*args: str) -> typing.Union[typing.Tuple[str, ...], str]:
env_vars = os.environ[args[0]]
except KeyError:
raise errors.EnvError("Environment variable not found. Ensure all variables are loaded.")
raise KeyError("Environment variable not found. Ensure all variables are loaded.")
except TypeError:
raise errors.ArgError("Invalid argument provided. Key expected as type 'str'.")
raise TypeError("Invalid argument provided. Key expected as type 'str'.")
return env_vars
def load_env_vars() -> bool:
'''Returns true if environment variables loaded from .env file
expected at the root directory of the project, else false'''
'''Returns true if function executes successfully, else false'''
env_loaded = dotenv.load_dotenv(dotenv.find_dotenv())
return env_loaded
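After this change, `get_env_vars` re-raises the built-in `KeyError` and `TypeError` instead of the package's custom error types, which is what the updated unit tests assert. The sketch below is a self-contained approximation for illustration: the multi-argument branch is elided in the diff, so the tuple comprehension is an assumption rather than the repository's code.

```python
import os
import typing
from unittest.mock import patch


def get_env_vars(*args: str) -> typing.Union[typing.Tuple[str, ...], str]:
    """Return one value for a single key, or a tuple of values for several keys."""
    try:
        if len(args) > 1:
            # Assumption: this branch is not shown in the diff.
            env_vars = tuple(os.environ[arg] for arg in args)
        else:
            env_vars = os.environ[args[0]]
    except KeyError:
        raise KeyError("Environment variable not found. Ensure all variables are loaded.")
    except TypeError:
        raise TypeError("Invalid argument provided. Key expected as type 'str'.")
    return env_vars


# Patch the environment so the example does not depend on the host machine.
with patch.dict("os.environ", {"KEY1": "value_1", "KEY2": "value_2"}):
    print(get_env_vars("KEY1"))          # value_1
    print(get_env_vars("KEY1", "KEY2"))  # ('value_1', 'value_2')
```

A missing key such as `"DUMMY VAR"` raises `KeyError`, and a non-string key such as `None` or `1` raises `TypeError`, matching the assertions in `tests/test_data.py`.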