This project is archived. Its data is read-only.
Meltano UI converts uppercase hash keys to snake case
This is not a bug per se, as it concerns a new option I introduced with `tap-adwords`. But it blocks further support of `tap-adwords` in Meltano UI, so we should address it one way or another.

### Origin of the issue

In `tap-adwords`, we need a more complex way of defining the value of a configuration parameter: depending on the Streams (Entities) a user chooses to extract, you have to define a set of primary keys for each one.

Because the list of selected Streams and the list of primary keys for each one are dynamic and can vary in size, I have opted to define them with a value that is a hash of arrays. This is valid both in JSON (how Taps define their configuration options) and in YAML (how Meltano defines the configuration options for each Tap, which are then translated to JSON).

So, the newly added configuration parameter looks like this in JSON:

```
{
  ...
  ...
  "conversion_window_days": 0,
  "primary_keys": {
    "KEYWORDS_PERFORMANCE_REPORT": ["campaignID", "adGroupID", "keywordID", "day"],
    "AD_PERFORMANCE_REPORT": ["campaignID", "adGroupID", "adID", "day"]
  }
}
```

While it is defined as follows in our `discovery.yml`:

```
extractors:
  - name: tap-adwords
    label: Google Ads
    description: Advertising Platform
    namespace: tap_adwords
    ...
    ...
    ...
    settings:
      ...
      ...
      ...
      - name: primary_keys
        label: Primary Keys
        value:
          KEYWORDS_PERFORMANCE_REPORT:
            - campaignID
            - adGroupID
            - keywordID
            - day
          AD_PERFORMANCE_REPORT:
            - campaignID
            - adGroupID
            - adID
            - day
        kind: hidden
        description: Primary Keys for the selected Entities (Streams)
```

### Problem (bug)

The aforementioned setup works **without** issues when `tap-adwords` is run using the CLI. All the configuration parameters are converted correctly and passed to the Tap.
You can check what happens by running `meltano config`:

```bash
(venv) Yanniss-MacBook-Pro:cli-tap-adwords iroussos$ meltano config tap-adwords
{
  ...,
  'conversion_window_days': 0,
  'primary_keys': {
    'KEYWORDS_PERFORMANCE_REPORT': ['campaignID', 'adGroupID', 'keywordID', 'day'],
    'AD_PERFORMANCE_REPORT': ['campaignID', 'adGroupID', 'adID', 'day']
  }
}
```

But when I tried to run the same Tap using Meltano UI (initialize a new Meltano project, run Meltano UI, and add and run the Tap from inside Meltano UI), I realized that something was off: the primary keys were not added to the created tables.

I checked the configuration Meltano stores in this case and got the following:

```
(venv) Yanniss-MacBook-Pro:tap-adwords-ui iroussos$ meltano config tap-adwords
{
  ...,
  'conversion_window_days': 0,
  'primary_keys': {
    'a_d__performanc_e__repor_t': ['campaignID', 'adGroupID', 'adID', 'day'],
    'keyword_s__performanc_e__repor_t': ['campaignID', 'adGroupID', 'keywordID', 'day']
  }
}
```

All the keys in the hash have been converted to a weird (incorrect) version of their snake case counterpart. So it is clear that this is not happening in Meltano Core; the conversion is made somewhere in Meltano UI.

As this is a hidden configuration parameter with a predefined value (the hash of arrays), when the Tap is run through the CLI, it fetches the definition from `discovery.yml` and directly uses it for generating the final configuration.
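To spot this kind of difference without reading the two dumps by eye, the configurations can be compared key by key. The following is only a minimal sketch: it assumes the two `meltano config tap-adwords` outputs above have been copied into Python dicts, and `key_paths`, `cli_config` and `ui_config` are hypothetical names used for illustration, not part of Meltano.

```python
def key_paths(value, prefix=()):
    """Yield the path of every hash key found anywhere inside `value`."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield prefix + (key,)
            yield from key_paths(child, prefix + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from key_paths(child, prefix + (index,))


# Values copied by hand from the CLI project output (assumption: no other keys matter).
cli_config = {
    "conversion_window_days": 0,
    "primary_keys": {
        "KEYWORDS_PERFORMANCE_REPORT": ["campaignID", "adGroupID", "keywordID", "day"],
        "AD_PERFORMANCE_REPORT": ["campaignID", "adGroupID", "adID", "day"],
    },
}

# Values copied by hand from the Meltano UI project output.
ui_config = {
    "conversion_window_days": 0,
    "primary_keys": {
        "a_d__performanc_e__repor_t": ["campaignID", "adGroupID", "adID", "day"],
        "keyword_s__performanc_e__repor_t": ["campaignID", "adGroupID", "keywordID", "day"],
    },
}

cli_paths = set(key_paths(cli_config))
ui_paths = set(key_paths(ui_config))

# Sort by repr so that paths mixing strings and list indexes stay comparable.
missing = sorted(cli_paths - ui_paths, key=repr)
unexpected = sorted(ui_paths - cli_paths, key=repr)

print("missing in UI config:", missing)
print("unexpected in UI config:", unexpected)
```

Only the two entries under `primary_keys` show up in either list, which matches the observation that the values themselves are stored correctly.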
My bet is that when an Extractor is added and its configuration is saved, something slightly different happens that also converts the keys:

* Each configuration parameter is fetched from `discovery.yml`
* Some magic happens here --> this is where the wrong conversion happens, 99.999%
* The values are stored in our internal storage
* When the Tap (or `meltano config`) runs, those values are used

### Additional Investigation of the issue and way to reproduce it

Note here that everything else works as expected: the hash and the arrays are properly converted to their JSON counterparts. So this does not seem to be an issue of some part of our code expecting only primitive values.

I was pretty sure that this happens to **ALL** hash keys no matter how deep they are, so I ran the following experiment:

1. I initialized a new Meltano Project
1. I copy-pasted the `discovery.yml` from our master
1. I updated `tap-carbon-intensity` as follows:

```
extractors:
  - name: tap-carbon-intensity
    label: Carbon Emissions Intensity
    description: National Grid ESO's Carbon Emissions Intensity API
    namespace: tap_carbon
    docs: 'https://meltano.com/plugins/extractors/carbon-intensity.html'
    pip_url: 'git+https://gitlab.com/meltano/tap-carbon-intensity'
    capabilities:
      - discover
    settings:
      - name: hashes_fail
        label: Hash Fail
        value:
          yannis:
            - BIG_THING_HERE
            - AD_PERFORMANCE_REPORT
            - keywordID
            - key1: 111
              key2: 328976
              BIG_KEY_3: 8387387
          CAPITAL_KEY_YAY:
            - campaignID
            - adGroupID
            - keywordID
            - day
        kind: hidden
        description: Hash Keys fail
```

So basically I added one hidden configuration parameter that is a hash of arrays, but with an additional hash inside one of the two arrays (the one with `BIG_KEY_3`).

I added `tap-carbon-intensity` and then checked what happened:

```
{
  'hashes_fail': {
    'capita_l__ke_y__ya_y': ['campaignID', 'adGroupID', 'keywordID', 'day'],
    'yannis': [
      'BIG_THING_HERE',
      'AD_PERFORMANCE_REPORT',
      'keywordID',
      {
        'bi_g__ke_y_2': 8387387,
        'key1': 111,
        'key2': 328976
      }
    ]
  }
}
```

And there it is! Whatever causes this issue recursively visits all entries and converts all **hash keys** to this weird snake case format:

* Key inside the hash: `CAPITAL_KEY_YAY` --> 'capita_l__ke_y__ya_y'
* Key inside a hash that is inside an array inside another hash: `BIG_KEY_3` --> 'bi_g__ke_y_2'

Everything else is left untouched (e.g. `BIG_THING_HERE` or `AD_PERFORMANCE_REPORT` as array values), so this only happens to keys.

Once more, the generated JSON correctly follows the YAML definition of a config value that is a hash with arrays, with an additional hash somewhere deep down, so the structure itself seems to be handled without issues.
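The behaviour described above can be mimicked with a small recursive walker. This is only a sketch under stated assumptions: the code in Meltano UI that performs the conversion has not been identified in this issue, so `mangle_key` is a hypothetical rule chosen purely because it reproduces the converted keys reported above (an underscore is inserted before the first and the last letter of every uppercase run, except at the very start of the key, and the result is lowercased). `convert_keys` and `sample` are illustrative names as well, and the sample only uses keys whose converted form appears in the outputs above.

```python
import re

# Hypothetical rule, NOT Meltano's actual code: match the empty position before
# an uppercase letter that either follows a non-uppercase character or is the
# last letter of an uppercase run, but never the very start of the key.
_BOUNDARY = re.compile(r"(?<!^)(?:(?<![A-Z])(?=[A-Z])|(?=[A-Z](?![A-Z])))")


def mangle_key(key):
    """Reproduce the observed key conversion for a single hash key."""
    return _BOUNDARY.sub("_", key).lower()


def convert_keys(value):
    """Apply mangle_key to every hash key at any depth; leave values alone."""
    if isinstance(value, dict):
        return {mangle_key(k): convert_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(item) for item in value]
    return value


# Illustrative value: a hash of arrays with one more hash nested in an array.
sample = {
    "CAPITAL_KEY_YAY": ["campaignID", "adGroupID", "keywordID", "day"],
    "yannis": [
        "BIG_THING_HERE",
        "AD_PERFORMANCE_REPORT",
        {"key1": 111, "KEYWORDS_PERFORMANCE_REPORT": 328976},
    ],
}

converted = convert_keys(sample)
print(converted)
```

In this sketch the uppercase strings inside the arrays pass through unchanged while the keys at both nesting levels are rewritten, which is the same pattern as in the experiment. If the real conversion is a generic key-casing step applied to the whole settings payload, a possible direction would be to exclude the contents of setting values from it, but that remains to be confirmed against the actual code.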