Exploratory Sprint: Meltano for NoSQL databases

Many U.S. public sector customers store their data in NoSQL databases (MongoDB, Cassandra, MarkLogic), Hadoop-based stores (HBase, Cloudera, Accumulo), or graph databases (Neo4j, OrientDB). Most of the data coming in to those agencies arrives in unstructured formats. The ETL process needed to push it into structured SQL databases has become cumbersome, so they have adopted unstructured or document-based databases with indexing strategies that let them search and exploit the data.

Because of this, many of those agencies have built home-grown tools for data analysis. To compete with those tools, Meltano will likely need to be able to analyze unstructured and disparate data.

Edited May 01, 2019 by Danielle Morrill