Use Faker to generate data
Currently we're using static data for the faking functions. The data are stored in CSV files here: https://gitlab.com/dalibo/postgresql_anonymizer/tree/master/data
These data come from various sources and it's hard to determine their degree of "fakeness". This is a naïve approach, and some tools are already able to produce fake data in a better way, for example https://github.com/joke2k/faker and https://github.com/guedes/faker_fdw
The basic goal is to generate a new dataset of fake data before each new release, but I'm open to other ideas too.
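
To make the idea concrete, here is a rough sketch of what a pre-release generation script could look like with the Python Faker library. The file name `identity.csv`, the column choice (first name, last name, city), the row count and the seed are all assumptions for illustration; they are not the actual layout of the CSV files in the data directory.

```python
import csv


def fake_rows(fake, count):
    """Yield (first_name, last_name, city) tuples from a Faker-like object."""
    for _ in range(count):
        yield (fake.first_name(), fake.last_name(), fake.city())


def write_csv(rows, out):
    """Write rows to a file-like object as CSV and return the row count."""
    writer = csv.writer(out)
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written


def main():
    # Imported here so the helpers above stay usable without Faker installed.
    from faker import Faker

    Faker.seed(42)          # hypothetical seed, keeps a release reproducible
    fake = Faker("en_US")   # hypothetical locale
    # "identity.csv" is a placeholder name, not an existing project file.
    with open("identity.csv", "w", newline="", encoding="utf-8") as out:
        write_csv(fake_rows(fake, 1000), out)


if __name__ == "__main__":
    main()
```

Seeding the generator would give the same dataset for a given release, which might make the files easier to review; dropping the seed would produce a fresh dataset on every run instead.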
-
Add documentation and a link to https://gitlab.com/dalibo/postgresql_faker
Edited by damien clochard