Commit 39a0fe1a authored by Ricardo Avila

Merge branch 'authoring'

parents bd8099c1 d81b297b
Pipeline #213912310 failed with stage
in 1 minute and 56 seconds
......@@ -11,8 +11,8 @@ header:
- title: CV
url: /cv.html
- title: <i class="fas fa-bookmark"></i>
url: /links.html
- title: Notes
url: /notes.html
notes-nav:
- title: <i class="fas fa-flask"></i> Bioinformatics
......
......@@ -7,10 +7,16 @@ sidebar:
nav: notes-nav
---
- [Creating DataFrames](#creating-dataframes)
- [Cleaning Data](#cleaning-data)
- [Exploring Data](#exploring-data)
- [Grouping and Transforming Data](#grouping-and-transforming-data)
<!-- vim-markdown-toc GitLab -->
* [Creating DataFrames](#creating-dataframes)
* [Cleaning Data](#cleaning-data)
* [Exploring Data](#exploring-data)
* [Indexing and Selecting Data](#indexing-and-selecting-data)
* [Grouping and Transforming Data](#grouping-and-transforming-data)
<!-- vim-markdown-toc -->
## Creating DataFrames
......@@ -21,6 +27,8 @@ cols = ["col1", "col2", "col3" ]
df = pd.read_csv("data.tab", sep="\t", names=cols)
```
## Cleaning Data
Rename several DataFrame columns:
```python
......@@ -31,14 +39,24 @@ df = df.rename(columns = {
})
```
## Cleaning Data
Get a report of all duplicate records in a DataFrame, based on specific columns:
```python
dupes = df[df.duplicated(['col1', 'col2', 'col3'], keep=False)]
```
Remove duplicates by column, keeping first entry:
```python
df = df.drop_duplicates(subset='col', keep='first')
```
Reset a DataFrame's index to continuous integers, without saving the old index (e.g. after deleting records):
```python
df.reset_index(drop=True, inplace=True)
```
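A minimal sketch of how the three cleaning steps above fit together, assuming a small made-up DataFrame (the values and the `key_cols` name are illustrative only):

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 are exact duplicates.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "c"],
    "col2": [1, 1, 2, 3],
    "col3": ["x", "x", "y", "z"],
})
key_cols = ["col1", "col2", "col3"]

# keep=False marks every member of a duplicate group, not just the repeats.
dupes = df[df.duplicated(key_cols, keep=False)]

# Keep the first row of each group, then renumber the index from 0.
deduped = df.drop_duplicates(subset=key_cols, keep="first")
deduped = deduped.reset_index(drop=True)
```

Here the result is reassigned instead of using `inplace=True`; both styles work, and reassignment makes chaining easier.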
Clean up missing values in multiple DataFrame columns:
```python
......@@ -50,28 +68,55 @@ df = df.fillna({
})
```
Rename several DataFrame columns:
```python
df = df.rename(columns = {
'col1 old name':'col1 new name',
'col2 old name':'col2 new name',
'col3 old name':'col3 new name',
})
```
## Exploring Data
Sort dataframe on a column:
Sort DataFrame on a column:
```python
df.sort_values(by='col1', ascending=False)
```
## Indexing and Selecting Data
Accessing an index from a multi-index DataFrame:
```python
pandas.MultiIndex.get_level_values
```
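As a rough usage sketch, `get_level_values` is called on the index itself and accepts either a level name or a position. The level names `letter` and `number` below are made up for illustration:

```python
import pandas as pd

# Hypothetical two-level index.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["letter", "number"]
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# Select a level by name...
letters = df.index.get_level_values("letter")
# ...or by integer position.
numbers = df.index.get_level_values(1)
```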
## Grouping and Transforming Data
Group by two columns and count total in groups:
```
```python
df.groupby(['col1', 'col2']).count()
```
Spreadsheet-style pivot with aggregation
Group and then aggregate all columns with a function:
```python
table = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc=np.sum)
df.groupby('col').agg(function)
```
Accessing an index from a multi-index DataFrame:
Group and aggregate specific columns with specific functions:
```python
pandas.MultiIndex.get_level_values
```
\ No newline at end of file
df.groupby('col').agg({"col1": np.sum, "col2": pd.Series.nunique})
```
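A small worked example of per-column aggregation, assuming made-up data. It uses the string aliases `"sum"` and `"nunique"`, which pandas also accepts in place of `np.sum` and `pd.Series.nunique`:

```python
import pandas as pd

# Hypothetical data: two groups in "col".
df = pd.DataFrame({
    "col": ["x", "x", "y"],
    "col1": [1, 2, 3],
    "col2": ["a", "a", "b"],
})

# Sum col1 and count distinct col2 values within each group.
summary = df.groupby("col").agg({"col1": "sum", "col2": "nunique"})
```

The result has one row per group, with only the columns named in the dictionary.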
Spreadsheet-style pivot with aggregation:
```python
table = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc=np.sum)
```
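To make the column names in the pivot call concrete, here is a sketch with invented data. The values in `A`, `B`, `C`, and `D` are assumptions, and `aggfunc="sum"` is the string alias for the same aggregation:

```python
import pandas as pd

# Hypothetical data matching the column names used above.
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar"],
    "B": ["one", "one", "two", "two"],
    "C": ["small", "large", "small", "small"],
    "D": [1, 2, 3, 4],
})

# Rows are (A, B) pairs, columns are the distinct values of C,
# and each cell holds the sum of D for that combination.
table = pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"], aggfunc="sum")
```

Combinations that never occur in the data, such as `bar`/`two`/`large` here, come out as `NaN`.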
---
layout: notes
title: Python Tricks
aside:
toc: true
sidebar:
nav: notes-nav
---
<!-- vim-markdown-toc GitLab -->
* [Imports](#imports)
<!-- vim-markdown-toc -->
# Imports
Add a folder to the path. Useful for calling a module that is in a parent directory:
```python
import sys
sys.path.append("../")
```
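Note that `"../"` is resolved relative to the current working directory, not the script's location. A possible variant, sketched here with a hypothetical helper named `add_parent_to_path`, builds an absolute path instead:

```python
import sys
from pathlib import Path


def add_parent_to_path(start: Path) -> str:
    """Append the parent of `start` to sys.path (once) and return it."""
    parent = str(start.resolve().parent)
    if parent not in sys.path:
        sys.path.append(parent)
    return parent


# Equivalent to sys.path.append("../"), but stored as an absolute path.
added = add_parent_to_path(Path.cwd())
```

Inside a script, passing `Path(__file__).parent` instead of `Path.cwd()` would make the import work regardless of where the script is launched from.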
......@@ -11,6 +11,8 @@ sidebar:
<!-- vim-markdown-toc GitLab -->
* [Installing Docker](#installing-docker)
* [Ubuntu installation (July 2020)](#ubuntu-installation-july-2020)
* [Fedora installation (January 2020)](#fedora-installation-january-2020)
* [Getting Docker Images](#getting-docker-images)
* [Running Containers](#running-containers)
* [Interacting with containers](#interacting-with-containers)
......@@ -18,8 +20,9 @@ sidebar:
* [Clearing Non-running Containers](#clearing-non-running-containers)
* [Mounting a Host Filesystem](#mounting-a-host-filesystem)
* [Fix for permissions issues](#fix-for-permissions-issues)
* [Mapping Network Ports](#mapping-network-ports)
* [Creating a Docker Image](#creating-a-docker-image)
* [By modifying an existing image](#by-modifying-an-existing-image)
* [Interactively from an existing image](#interactively-from-an-existing-image)
* [Using a docker file](#using-a-docker-file)
* [Docker Compose](#docker-compose)
......@@ -27,6 +30,23 @@ sidebar:
## Installing Docker
Installation steps have changed quite frequently with new releases of Docker. I will do my best to keep this up to date, but no guarantees.
### Ubuntu installation (July 2020)
Older versions of Docker were called docker, docker.io, or docker-engine. Now, the packages to install are docker-ce, docker-ce-cli, and containerd.io.
Official documentation:
[https://docs.docker.com/engine/install/ubuntu/](https://docs.docker.com/engine/install/ubuntu/)
For Ubuntu 20.04, follow instructions [here](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-20-04).
### Fedora installation (January 2020)
sudo dnf install docker
sudo systemctl enable docker.service
sudo systemctl start docker.service
......@@ -64,7 +84,7 @@ See installed images: `docker images`
## Running Containers
Use `docker run` to run an application inside a container.
Use `docker run` to initially run a container.
Common options:
......@@ -77,6 +97,35 @@ Common options:
-t
: tty
-p
: make a port available outside container
ple as:
FROM fedora:latest
CMD env
Docker file reference: https://docs.docker.com/engine/reference/builder/
Best practices: https://docs.docker.com/develop/develop-images/dockerfile_best-practices/
In a directory with a Dockerfile run:
sudo docker build -t "my-image" .
If the build is successful you can see my-image in docker images output.
Docker Compose
https://docs.docker.com/compose/
https://developer.fedoraproject.org/tools/docker/compose.html
Docker Compose is a tool to orchestrate Docker containers using a simple YAML file which describes your whole setup.
sudo dnf install docker-compose
--name
: assign container a name
......@@ -85,6 +134,8 @@ Common options:
### Interacting with containers
Once a container is running, you can easily start or stop it.
Start a container (can use an assigned name): `docker start`
Stop a container: `docker stop`
......@@ -99,6 +150,8 @@ To show only running containers use:
docker container ls
The old, shorter way of doing this is `docker ps`.
Options:
-a
......@@ -107,6 +160,8 @@ Options:
-s
: see container sizes
## Clearing Non-running Containers
Containers that are not running are not taking any system resources besides disk space.
......@@ -125,9 +180,10 @@ Use the `-v` flag:
docker run -v /host/directory:/container/directory -other -options image_name command_to_run
### Fix for permissions issues
SELinux can cause issues with mounting volumes
In Fedora, SELinux can cause issues with mounting volumes.
The solution is to issue a SELinux rule:
`chcon -Rt svirt_sandbox_file_t /path/to/volume`
......@@ -158,9 +214,13 @@ For more info: [https://docs.docker.com/storage/volumes/](https://docs.docker.co
docker run -v /home/ravila/:/home/ravila/:Z -it mcr.microsoft.com/powershell
```
## Mapping Network Ports
docker run --name mycontainer -p 8080:8080 -p 8000:8000 image_name
## Creating a Docker Image
### By modifying an existing image
### Interactively from an existing image
If you change anything (like install new packages) in the running container and exit the container the changes are not automatically saved. If you want to save them in an image, use docker commit.
......
......@@ -10,13 +10,14 @@ sidebar:
<!-- vim-markdown-toc GitLab -->
* [Managing remotes](#managing-remotes)
* [Managing Remotes](#managing-remotes)
* [Submodules](#submodules)
* [Staging and Commits](#staging-and-commits)
* [Branching](#branching)
<!-- vim-markdown-toc -->
## Managing remotes
## Managing Remotes
Add a named remote (you can have multiple remotes):
......@@ -57,6 +58,14 @@ git submodule init
git submodule update
```
## Staging and Commits
Remove a file or folder from the staging area:
```
git reset HEAD -- <file or folder>
```
## Branching
Deleting a local branch:
......
......@@ -23,7 +23,7 @@ For example, we may want to run Music Player Daemon (mpd) on startup as a user i
systemctl disable mpd.service
systemctl --user enable mpd.service
## Creating Custom Systemd Services
## Creating Systemd Services
User services are stored under: `~/.config/systemd/user/`. Place any custom scripts that you write in this folder.
......
......@@ -17,7 +17,7 @@ excerpt: Ricardo Avila is a bioinformatics nerd, artist, and lover of open sourc
</div>
</div>
I grew up in the sunny city of El Paso, Texas, and graduated from the University of Texas at El Paso (UTEP), where I studied a Bachelor's of Science in Biochemistry, and a Master's degree in Bioinformatics. My current research involves machine learning methods for drug discovery, but I'm also very interested in protein structure and molecular simulation.
I grew up in the sunny city of El Paso, Texas, and graduated from the University of Texas at El Paso (UTEP), where I studied a Bachelor's of Science in Biochemistry, and a Master's degree in Bioinformatics. I am currently a research programmer at Scripps, in San Diego, CA, where I work on making biological data more [FAIR](https://en.wikipedia.org/wiki/FAIR_data) (findable, accessible, interoperable, and reusable) for the scientific community.
In my free time I like drawing, taking photographs, reading science fiction, and playing music. For more of that, you can visit my second homepage:
[ravilart.com](https://ravilart.com){:.button.button--primary.button--pill}
......@@ -35,37 +35,4 @@ I'm friendly! If you have any questions or comments, feel free to shoot me an em
📧 ravila@protonmail.com
🔑️ PGP Encryption Key
```
-----BEGIN PGP PUBLIC KEY BLOCK-----
xsBNBFXNTwsBCAC9q61TUmfDEPN2XEhSKh9iObxFykKQ+hBZcXjwGKPnfbZS
Pb5ppxyktZFh7p6kzPvq87spAFbYrVOwp9ymveamYBleC1PgtzIsk5o1dD85
dWxduwtrnQzyl8wAONtwO5/ib8vxa+7vwpM1oPHFRym09F2VtxVT9Uk10oh7
1WgZxCSTIlOEac+83IOJDehQhr4uc8p61OR7fPulMcCrd8JSdlEluNcVFQb0
A8F7LSslDL7/QgZ8EVYRWrMba7IgEDEusbAAQxADCz9RLlXLJg7+tgXEAUQR
r7UUfDtmaq2oqM2ib8iT0QlEL+aGDmnfV9mn5+IVn+KUpT646fNJ5MevABEB
AAHNLXJhdmlsYUBwcm90b25tYWlsLmNvbSA8cmF2aWxhQHByb3Rvbm1haWwu
Y29tPsLAfwQQAQgAKQUCWN6HnQYLCQcIAwIJED7J3nUiw3a4BBUICgIDFgIB
AhkBAhsDAh4BAAoJED7J3nUiw3a4zWQH/0JhbiSEKjRGK2pZcsfmozCRewa1
gaBqnp8wJqeK060k4BwC48W/+9aUR0FivlkU7KV5N7PR9QQejg67hd9IluVG
7DOp4RzttOPZlUuCA4cTANLWzic0XiFPYwUJHcsKUBltdm4L8wmcfmVlr+pV
+OcuxaR26s2hODzS4FQrdI1tpX39rXqj6b9ocPfJBJ213H2YM08In0VPf6G5
mDqQXmXPQlXklNXveQ+Q9ZEpcN+L/0FnkMBJpO2PPwuEHGuWDZGo9XD625oa
0eDCW5CkmLNKZiXQmi5iV54pjxRemNXCUqz4DSWIDs6VpwQKoYlwkMCZ6cwK
EBtsUD7O2MqBt2jOwE0EVc1PCwEIALCtY/cYpDA3g73A99f5Sz0Mbiju4NFF
jNZf6ZENjuO2MlBc03lnt+1TFkDmEGWLMol5euL6fIE53sHyE7WWuKM26QVA
KxXI9SX9esc50GFjgChQFJk2LjBB46U0cMhtRTnmwWNj9wIwiMihg2j5SKnG
qI1K16U9dfBsAYbgCgnGwEf3+DXXXZnUO0ma3fyU7MGaAxRQafTgMLIvepI3
QD6DunlZaLu/pptTSOKHAVnwZbxce52JSoJPx6YXfHNmhnjgZ7lAxXXuwqXr
XvMbe4N9XChRdGLzGAcwddTuLkywbAObmtG2FLKLC5SxvdpJAvY3eHkpo1LH
LTblmtxbIPcAEQEAAcLAaQQYAQgAEwUCWN6HnQkQPsnedSLDdrgCGwwACgkQ
PsnedSLDdrjIHgf+N0UhiYsEK4TFERTxUxePFv5QzoTIPwe5ByYq5zdCRSVz
X/5z49wTe+gt0+4/WhNoJ6I0cQPyuV6PZ/m92XdPMB0BtPS3TI4HRNVJJUWZ
j8fvOTlWruKifZ0MR7gJT1ChxXEIBPe6bE9cYdqo2lO5xzLEkI6IdXXjORU7
bv6Bh6e9T3S2LffDYYwxEnNgEZCjgB4JXgDNIi6d2ocfahgokd+n4XLI8Ga0
JrjTY19rB1QZykc2ruSekxuZwTel/lnwrG6JHh6NjVuRBAZIP0tq9XIEsPBh
4zp5trNXsdnYcHU/pJhgdKIUvXMJKQlh2p21Pm9itr+AXV/chCtpzu058A==
=dmYu
-----END PGP PUBLIC KEY BLOCK-----
```
[🔑️ PGP Encryption Key](/assets/publickey.pgp){:.button.button--primary.button--rounded.button--lg}
-----BEGIN PGP PUBLIC KEY BLOCK-----
xsBNBFXNTwsBCAC9q61TUmfDEPN2XEhSKh9iObxFykKQ+hBZcXjwGKPnfbZS
Pb5ppxyktZFh7p6kzPvq87spAFbYrVOwp9ymveamYBleC1PgtzIsk5o1dD85
dWxduwtrnQzyl8wAONtwO5/ib8vxa+7vwpM1oPHFRym09F2VtxVT9Uk10oh7
1WgZxCSTIlOEac+83IOJDehQhr4uc8p61OR7fPulMcCrd8JSdlEluNcVFQb0
A8F7LSslDL7/QgZ8EVYRWrMba7IgEDEusbAAQxADCz9RLlXLJg7+tgXEAUQR
r7UUfDtmaq2oqM2ib8iT0QlEL+aGDmnfV9mn5+IVn+KUpT646fNJ5MevABEB
AAHNLXJhdmlsYUBwcm90b25tYWlsLmNvbSA8cmF2aWxhQHByb3Rvbm1haWwu
Y29tPsLAfwQQAQgAKQUCWN6HnQYLCQcIAwIJED7J3nUiw3a4BBUICgIDFgIB
AhkBAhsDAh4BAAoJED7J3nUiw3a4zWQH/0JhbiSEKjRGK2pZcsfmozCRewa1
gaBqnp8wJqeK060k4BwC48W/+9aUR0FivlkU7KV5N7PR9QQejg67hd9IluVG
7DOp4RzttOPZlUuCA4cTANLWzic0XiFPYwUJHcsKUBltdm4L8wmcfmVlr+pV
+OcuxaR26s2hODzS4FQrdI1tpX39rXqj6b9ocPfJBJ213H2YM08In0VPf6G5
mDqQXmXPQlXklNXveQ+Q9ZEpcN+L/0FnkMBJpO2PPwuEHGuWDZGo9XD625oa
0eDCW5CkmLNKZiXQmi5iV54pjxRemNXCUqz4DSWIDs6VpwQKoYlwkMCZ6cwK
EBtsUD7O2MqBt2jOwE0EVc1PCwEIALCtY/cYpDA3g73A99f5Sz0Mbiju4NFF
jNZf6ZENjuO2MlBc03lnt+1TFkDmEGWLMol5euL6fIE53sHyE7WWuKM26QVA
KxXI9SX9esc50GFjgChQFJk2LjBB46U0cMhtRTnmwWNj9wIwiMihg2j5SKnG
qI1K16U9dfBsAYbgCgnGwEf3+DXXXZnUO0ma3fyU7MGaAxRQafTgMLIvepI3
QD6DunlZaLu/pptTSOKHAVnwZbxce52JSoJPx6YXfHNmhnjgZ7lAxXXuwqXr
XvMbe4N9XChRdGLzGAcwddTuLkywbAObmtG2FLKLC5SxvdpJAvY3eHkpo1LH
LTblmtxbIPcAEQEAAcLAaQQYAQgAEwUCWN6HnQkQPsnedSLDdrgCGwwACgkQ
PsnedSLDdrjIHgf+N0UhiYsEK4TFERTxUxePFv5QzoTIPwe5ByYq5zdCRSVz
X/5z49wTe+gt0+4/WhNoJ6I0cQPyuV6PZ/m92XdPMB0BtPS3TI4HRNVJJUWZ
j8fvOTlWruKifZ0MR7gJT1ChxXEIBPe6bE9cYdqo2lO5xzLEkI6IdXXjORU7
bv6Bh6e9T3S2LffDYYwxEnNgEZCjgB4JXgDNIi6d2ocfahgokd+n4XLI8Ga0
JrjTY19rB1QZykc2ruSekxuZwTel/lnwrG6JHh6NjVuRBAZIP0tq9XIEsPBh
4zp5trNXsdnYcHU/pJhgdKIUvXMJKQlh2p21Pm9itr+AXV/chCtpzu058A==
=dmYu
-----END PGP PUBLIC KEY BLOCK-----
......@@ -5,7 +5,7 @@ aside:
toc: true
---
__Last Updated__: 02/20/2020.
__Last Updated__: 11/09/2020.
## Education
......@@ -14,6 +14,9 @@ __Last Updated__: 02/20/2020.
## Appointments
### Scripps Research
- `07/2020 - Present` Research Programmer, [Su](http://sulab.org) / [Wu](https://wulab.io) Labs
### GlaxoSmithKline
- `08/2019 - 02/2020` Co-op, Medical Insights: Data Analysis and Visualization / DevOps Engineering
- `05/2019 - 08/2019` Intern, Protein Design and Informatics
......@@ -25,6 +28,10 @@ __Last Updated__: 02/20/2020.
- `08/2016 - 08/2017` Graduate Research Assistant, [Dr. Ming-Ying Leung](http://www.math.utep.edu/Faculty/mleung/home.php), UTEP Bioinformatics Program
- `03/2014 - 08/2015` Undergraduate Research Assistant, [Dr. Ricardo Bernal](https://science.utep.edu/chemistry/rbernal/), UTEP Department of Chemistry
## Publications
Xian, Y., Avila, R., Pant, A., Yang, Z., & Xiao, C. (2020). The Role of Tape Measure Protein in Nucleocytoplasmic Large DNA Virus Capsid Assembly. Viral Immunology. [doi:10.1089/vim.2020.0038](http://doi.org/10.1089/vim.2020.0038)
## Presentations
### Posters
......@@ -54,13 +61,13 @@ man OX2 receptor (hOX2R) [19th UTEP/NMSU Workshop on Mathematics, Computer Scien
- MySQL
- Java
### DevOps & Cloud Computing
### Other Technologies
- Git
- Containerization: Docker, Singularity
- Docker
- High Performance Computing (HPC)
- Databricks
- Microsoft Azure: Azure DevOps, Azure Data Factory
- GitLab Continuous Integration
- Microsoft Azure
- Continuous Integration
### Protein & Molecular Informatics
- Cheminformatic toolkits: RDKit, OpenEye
......@@ -70,5 +77,5 @@ man OX2 receptor (hOX2R) [19th UTEP/NMSU Workshop on Mathematics, Computer Scien
### Structural Biology
- Cryo-EM image processing: Relion 3, EMAN2
- Molecular Biology: Protein expression, Protein purification, Crystal screening.
- Molecular Biology: Protein expression, Protein purification.
---
layout: article
title: Links
show_date: false
---
## Ricardo's Notes
My notes on various topics:
[Ricardo's Notebook](/notes/bioinformatics/e-value-bitscore){:.button.button--primary.button--rounded.button--lg}
## Blogs
A curated list of my favorite blogs:
- [Is life worth living?](https://iwatobipen.wordpress.com/)
- [Practical Cheminformatics (Pat Walters)](http://practicalcheminformatics.blogspot.com/)
- [Thomas Kainrad](https://tkainrad.dev/)
- [Paul Stamatiou](https://paulstamatiou.com/)
- [Half Integer](http://blog.rogerluo.me/)
---
layout: notes
title: Ricardo's Notes
show_date: false
aside:
toc: false
sidebar:
nav: notes-nav
---
{%- if page.sidebar.nav -%}
{%- assign _sidebar_nav = site.data.navigation[page.sidebar.nav] -%}
{%- if _sidebar_nav -%}
{%- for _item in _sidebar_nav -%}
<h2>{{ _item.title }}</h2>
{%- if _item.children -%}
{%- for _child in _item.children -%}
{%- include snippets/get-nav-url.html path=_child.url -%}
{%- assign _nav_url = __return -%}
{%- include snippets/get-nav-url.html path=page.url -%}
{%- assign _page_url = __return -%}
{%- if _nav_url == _page_url -%}
<li class="toc-h2 active"><a href="{{ _nav_url }}">{{ _child.title }}</a></li>
{%- else -%}
<li class="toc-h2"><a href="{{ _nav_url }}">{{ _child.title }}</a></li>
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- endif -%}