Commit dda054d7 authored by Mike Ledger

edit

parent 050b14ab
@@ -2,6 +2,25 @@
This is a command-line tool for manipulating map-like CSV data, where the first
column is treated as a "key" for that row.
## Motivation
The use case that motivated it was to manipulate a dictionary of terms from a
project that we were tasked to clean up, with very little time to actually do
it.
The dictionary was simply all the title-case(*) terms that we could
automatically pull out of a particular version of the project. This led to
there being thousands of spurious terms (e.g., from words at the beginning of a
sentence), and terms that were only partially detected, and a proportional
amount of manual work to cull or edit them.
The tool allowed me to:
1. Delete from version A+1 the same terms that were deleted in version A
2. For terms that were edited rather than deleted in version A, replace the
unedited term in A+1 (if it exists) with the edited version from A
3. Detect entirely new terms
4. Combine the official project dictionaries with the title-case one I'd
produced
## Usage
```shell
@@ -34,6 +53,15 @@ Available options:
'"LABEL": EXPR'.
```
## Examples
Take the rows that exist only in `v1`, in addition to the rows that exist in
both `v1` and `v2` (preferring `v2`).
```shell
$ csvmaps -i v1.csv -i v2.csv --expr '(($1 *| $2) +| ($1 - $2))'
```
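To make the intent of that expression concrete, here is a rough Python model of it, with each CSV file represented as a dict from key to the rest of the row. Only `A - B` is defined in this README excerpt; the meanings given to `*|` (intersection, right side wins) and `+|` (union, right side wins) are assumptions inferred from the description above, and the function names and sample data are hypothetical.

```python
# Hypothetical model: a CSV "map" is a dict from key (first column)
# to the remaining columns of that row.

def difference(a, b):
    """Rows of a whose keys are not in b (the documented `A - B`)."""
    return {k: v for k, v in a.items() if k not in b}

def intersect_prefer_right(a, b):
    """ASSUMED meaning of `A *| B`: keys present in both, values taken from b."""
    return {k: b[k] for k in a if k in b}

def union_prefer_right(a, b):
    """ASSUMED meaning of `A +| B`: all keys, b's value wins on a conflict."""
    return {**a, **b}

# Made-up sample data standing in for v1.csv and v2.csv.
v1 = {"Apple": ["fruit"], "Bolt": ["fastener"], "Crane": ["bird"]}
v2 = {"Bolt": ["lightning"], "Crane": ["machine"], "Dune": ["sand"]}

# Mirrors (($1 *| $2) +| ($1 - $2))
result = union_prefer_right(intersect_prefer_right(v1, v2), difference(v1, v2))
```

Under these assumptions, `result` keeps `Apple` from `v1` unchanged, takes the `v2` rows for `Bolt` and `Crane`, and omits `Dune`, which exists only in `v2`.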
### MAPEXPR syntax
`$N`
@@ -107,21 +135,3 @@ Available options:
`A - B`
: Returns the rows of A whose keys are not in B.
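
As an illustration of that definition on actual CSV text, the following Python sketch keys each row by its first column and keeps the rows of A whose key does not appear in B. It is not how `csvmaps` is implemented; the helper names are hypothetical, and it assumes the input has no header row.

```python
import csv
import io

def read_keyed(text):
    """Parse CSV text into a dict keyed by the first column of each row.

    Assumes there is no header row; blank lines are skipped.
    """
    return {row[0]: row for row in csv.reader(io.StringIO(text)) if row}

def rows_not_in(a, b):
    """Rows of a whose keys are not in b, in a's original order."""
    return [row for key, row in a.items() if key not in b]

a = read_keyed("Apple,fruit\nBolt,fastener\n")
b = read_keyed("Bolt,lightning\n")
rows = rows_not_in(a, b)
```

Here `rows` holds only the `Apple` row, since `Bolt` is a key in both inputs.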
## Motivation
The use case that motivated it was to manipulate a dictionary of terms from a
project that we were tasked to clean up, with very little time to actually do
it.
The dictionary was simply all the title-case(*) terms that we could
automatically pull out of a particular version of the project. This led to
there being thousands of spurious terms (e.g., from words at the beginning of a
sentence), and terms that were only partially detected, and a proportional
amount of manual work to cull or edit them.
The tool allowed me to:
1. Delete from version A+1 the same terms that were deleted in version A
2. For terms that were edited rather than deleted in version A, replace the
unedited term in A+1 (if it exists) with the edited version from A
3. Detect entirely new terms
4. Combine the official project dictionaries with the title-case one I'd
produced