# Django Searchable Encrypted Fields
This package is for you if you would like to encrypt model field data "in app", i.e. before it is sent to the database.

**Why another encrypted field package?**

1. We use AES-256 encryption with GCM mode (via the Pycryptodome library).
2. Encryption keys never leave the app.
3. It is easy to generate appropriate encryption keys with `secrets.token_hex(32)` from the standard library.
4. You can make 'exact' search lookups when also using the SearchField.

## Install & Setup
```shell
$ pip install django-searchable-encrypted-fields
```
```python
# in settings.py
INSTALLED_APPS += ["encrypted_fields"]

# A list of hex-encoded 32 byte keys
# You only need one unless/until rotating keys
FIELD_ENCRYPTION_KEYS = [
    "f164ec6bd6fbc4aef5647abc15199da0f9badcc1d2127bde2087ae0d794a9a0b"
]
```

## Intro
This package provides two types of model field for Django.
1. A series of **EncryptedField** classes which can be used by themselves and work just like their regular Django counterparts. Contents are transparently encrypted/decrypted.
2. A **SearchField** which can be used in conjunction with any EncryptedField. Values are concatenated with a `hash_key` and then hashed with SHA256 before being stored in a separate field. This means 'exact' searches can be performed.
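
To make the SearchField idea concrete, here is a minimal standard-library sketch of "concatenate with a `hash_key`, then SHA256". The function name `search_hash`, the concatenation order and the UTF-8 encoding are assumptions for illustration only; the package's internal byte layout may differ.

```python
import hashlib


def search_hash(value: str, hash_key: str) -> str:
    """Illustrative only: combine a value with a hash key and SHA-256 the result."""
    # Assumption: plain string concatenation, encoded as UTF-8.
    return hashlib.sha256((value + hash_key).encode("utf-8")).hexdigest()


# Hypothetical keys, in the style produced by secrets.token_hex(32).
key_a = "f414ed6bd6fbc4aef5647abc15199da0f9badcc1d2127bde2087ae0d794a8a0a"
key_b = "0a" * 32

stored = search_hash("Jo", key_a)

# The same value and key always give the same digest, so 'exact' lookups work.
same_again = search_hash("Jo", key_a)
# A different value, or a different key, gives an unrelated digest.
different_value = search_hash("jo", key_a)
different_key = search_hash("Jo", key_b)
```

Because only a digest is stored, this kind of scheme supports equality lookups but not partial matches or ordering.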

This is probably best demonstrated by example:

## Using a stand-alone EncryptedField
```python
from django.db import models
from encrypted_fields import fields


class Person(models.Model):
    favorite_number = fields.EncryptedIntegerField(help_text="Your favorite number.")
```
You can use all the usual field arguments and add validators as normal.
Note, however, that `primary_key`, `unique` and `db_index` are not supported because they do not make sense for encrypted data.
### Migrations
Always add a new EncryptedField and do a data-migration, rather than alter an existing regular Django model field.
See the `encrypted_fields_test` app for an example.
### Included EncryptedField classes
The following are included:
```python
"EncryptedFieldMixin",
"EncryptedTextField",
"EncryptedCharField",
"EncryptedEmailField",
"EncryptedIntegerField",
"EncryptedDateField",
"EncryptedDateTimeField",
"EncryptedBigIntegerField",
"EncryptedPositiveIntegerField",
"EncryptedPositiveSmallIntegerField",
"EncryptedSmallIntegerField",
```
Note that, although untested, you should be able to extend other regular Django model field classes like this:
```python
class EncryptedIPAddressField(EncryptedFieldMixin, models.GenericIPAddressField):
    pass
```

## Using a SearchField along with an EncryptedField
### Philosophy
The SearchField is responsible for:
1. Providing the input for its EncryptedField
2. Displaying (returning) the EncryptedField's value
3. Storing the searchable hashed version of the input

The EncryptedField is the "real" field and so should be the appropriate field type for the expected input. It does all the under-the-hood things you would expect, e.g.:
* Providing validation/validators for the input
* Converting the input and database values to the appropriate Python object
* Encryption/decryption

### Example usage
```python
from django.db import models
from encrypted_fields import fields


def get_hash_key():
    # This must return a suitable string, e.g. from secrets.token_hex(32)
    return "f414ed6bd6fbc4aef5647abc15199da0f9badcc1d2127bde2087ae0d794a8a0a"


class Person(models.Model):
    # null may be True or False, as appropriate for your data
    _name_data = fields.EncryptedCharField(max_length=50, default="", null=True)
    name = fields.SearchField(hash_key=get_hash_key, encrypted_field_name="_name_data")
    favorite_number = fields.EncryptedIntegerField()
    city = models.CharField(max_length=255)  # regular Django model field
```
You can then use it like:
```python
# "Jo" is hashed and stored in 'name' as well as symmetrically encrypted and stored in '_name_data'
Person.objects.create(name="Jo", favorite_number=7, city="London")

person = Person.objects.get(name="Jo")
assert person.name == "Jo"
assert person.favorite_number == 7

person = Person.objects.get(city="London")
assert person.name == "Jo"  # the data is taken from '_name_data', which decrypts it first.
```
You can safely update like this:
```python
person.name = "Simon"
person.save()
```
But when using `update()` you need to provide the value to both fields:
```python
Person.objects.filter(name="Jo").update(name="Bob", _name_data="Bob")
```
### Please note:
A SearchField inherits the validators, default value and default formfield (widget) from its associated EncryptedField. So:

1. Do not add validators to the SearchField (they will be ignored); add them to the associated EncryptedField instead.
2. Use `null=`, `blank=` and `default=` on the EncryptedField, not the SearchField.
3. Do not include the EncryptedField in forms, only include the SearchField.
4. Typically you should avoid `editable=False` in the EncryptedField - it prevents validation.
5. You can override the SearchField widget in a `ModelForm` as usual (see the `encrypted_fields_test` app).
6. By convention, declare the EncryptedField *before* the SearchField in your Model.

**Note** Although unique validation (and unique constraints at the database level) for an EncryptedField makes little sense, it is possible to add `unique=True` to a SearchField.

An example of when this makes sense is in a custom user model, where the `username` field is replaced with an `EncryptedCharField` and `SearchField`. Please see the custom user model in `encrypted_fields_test.models` and its tests for an example.

Please let us know if you have problems when doing this.
## Migrations: Add Search/EncryptedFields to your model, don't alter existing fields
You are encouraged to look at the demo migrations in the `encrypted_fields_test` app.

**Stand alone EncryptedFields:** 

Be careful not to change/alter a pre-existing regular Django field to be an
EncryptedField. The data for existing rows will be unencrypted in the database and
appear 'corrupted' when trying to decrypt/fetch it.
Instead, add the new EncryptedField to the model and do a data-migration
to transfer data from the old field.

**SearchField with EncryptedField:**

The same goes for SearchFields: add the new SearchField and new EncryptedField to the model. Then do a data-migration to transfer data from the old field to the SearchField (the SearchField will populate the new EncryptedField automatically).
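
As a rough sketch of what the data-migration step might look like, the function below is the kind of callable you would pass to Django's `migrations.RunPython`. The app label `people`, the model `Person`, the old plain field `old_name` and the function name are all hypothetical; the demo migrations in the `encrypted_fields_test` app remain the reference.

```python
def copy_old_name_to_search_field(apps, schema_editor):
    """Forward step of a data migration: move plain values into the SearchField."""
    # Hypothetical app label and model name. Use the historical model from
    # `apps` rather than importing the model class directly.
    Person = apps.get_model("people", "Person")
    for person in Person.objects.all():
        # Assigning to the SearchField is expected to populate its
        # EncryptedField as well when the instance is saved.
        person.name = person.old_name
        person.save()
```

In a migration file this would typically be wired up as `migrations.RunPython(copy_old_name_to_search_field, migrations.RunPython.noop)`, placed after the migration that adds the new fields and before the one that removes the old field.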

**IMPORTANT!** Never add a SearchField and point it to an **existing** EncryptedField, or your SearchField will have the wrong value, and you might lose all your data! How? Why? When adding a new field to a model, Django will update each existing row's new field to have the default value. Note that the default value might be `None` or `""` even if `default=` is not defined in your field. If the new field is a SearchField then it will be saved with the EncryptedField's default value. This is almost certainly not what you want, even if you did define a default for it.
## Generating Encryption Keys
You can use `secrets` from the standard library. It will print appropriate hex-encoded keys to the terminal, ready to be used in `settings.FIELD_ENCRYPTION_KEYS` or as a hash_key for a SearchField:
```shell
$ python manage.py shell
>>> import secrets
>>> secrets.token_hex(32)
```
Note: Thanks to Andrew Mendoza for the suggestion.

Note: encryption keys **must** be hex-encoded and 32 bytes long.
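
If you want to sanity-check a key before putting it in your settings, a small helper along these lines may be useful. `is_valid_key` is a hypothetical name and is not part of this package; the sketch only checks the "hex-encoded, 32 bytes" requirement stated above.

```python
import secrets


def is_valid_key(key) -> bool:
    """Return True if `key` is a 64-character hex string, i.e. exactly 32 bytes."""
    try:
        # bytes.fromhex tolerates whitespace, so the length of the string
        # itself is checked as well as the decoded length.
        return len(key) == 64 and len(bytes.fromhex(key)) == 32
    except (TypeError, ValueError):
        return False


new_key = secrets.token_hex(32)  # 32 random bytes, rendered as 64 hex characters
```

For example, `is_valid_key(new_key)` is true, while a key that is too short or that contains non-hex characters is rejected.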

**Important**: use different hash_key values for each SearchField and make sure they are different from any keys in `settings.FIELD_ENCRYPTION_KEYS`.
## Rotating Encryption Keys
If you want to rotate the encryption key just prepend `settings.FIELD_ENCRYPTION_KEYS` with a new key. This new key (the first in the list) will be used for encrypting/decrypting all data. If decrypting data fails (because it was encrypted with an older key), each key in the list is tried.
Model instances will start using the new encryption key the next time they are accessed.

You can do a data-migration, simply fetching and saving all objects, to force a complete rotation to the new encryption key.
See the `encrypted_fields_test` app for an example.

Be sure to keep all old encryption keys in the list until you are certain all objects have rotated to the new key.
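
The "newest key first, then fall back to older keys" behaviour described in this section can be sketched with the standard library alone. Note that this is a toy stand-in and not the package's code: `toy_seal` and `toy_open` only attach and verify an HMAC-SHA256 tag (they do not encrypt anything), standing in for the authentication failure you would get from AES-GCM with the wrong key. All names here are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def toy_seal(key_hex: str, message: bytes) -> bytes:
    """Toy stand-in for encryption: prefix an HMAC tag. Does NOT encrypt."""
    tag = hmac.new(bytes.fromhex(key_hex), message, hashlib.sha256).digest()
    return tag + message


def toy_open(key_hex: str, blob: bytes) -> bytes:
    """Toy stand-in for decryption: verify the tag, raise ValueError on mismatch."""
    tag, message = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(bytes.fromhex(key_hex), message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    return message


def open_with_rotation(keys: Sequence[str], blob: bytes) -> bytes:
    """Try each key in list order (newest first) until one succeeds."""
    for key in keys:
        try:
            return toy_open(key, blob)
        except ValueError:
            continue
    raise ValueError("no key in the list could open the data")


old_key = "11" * 32
new_key = "22" * 32
keys = [new_key, old_key]  # the new key is prepended, the old key is kept

blob = toy_seal(old_key, b"Jo")              # data written before the rotation
recovered = open_with_rotation(keys, blob)   # falls back to old_key
resealed = toy_seal(keys[0], recovered)      # re-saving uses the first key
```

This also shows why old keys must stay in the list until every row has been re-saved: with only `new_key` present, `blob` could no longer be opened.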
## Compatibility
`django-searchable-encrypted-fields` is tested with Django (3.2, 4.0, 4.1) on Python (3.8, 3.9) using SQLite and PostgreSQL (11 and 12).

Test coverage is at 96%.

## More on testing
Please see the `encrypted_fields_test` app (in the GitLab repo) for some example admin site and model form implementations. Just run `pip install -r requirements.txt`, `python manage.py migrate` and `python manage.py runserver` to get started using SQLite.

There is also a basic Django REST Framework implementation with a `ModelSerializer` and `ModelViewSet`.

In our test app, the `User` model uses a SearchField for the username. This means that when creating a superuser you must provide the `--username` argument: `python manage.py createsuperuser --username bob` to avoid an error.

Final note of interest: the tox test suite runs `python manage.py makemigrations` for every environment with an empty initial migration directory. This helps ensure the test app will work as expected in all tested environments.