Skip to content

[#760] Update QueryGenerator.m to generate INSERT/DELETE/UPDATE queries and CONSTRAINTS

Ganesh Mahesh requested to merge zylog1O1/YDBOcto:ydbocto760 into master

Issues

  1. Update QueryGenerator.m to generate INSERT, DELETE and UPDATE queries along with SELECT queries
  2. Enhance QueryGenerator.m to generate CONSTRAINTS (UNIQUE and CHECK)

Implementation of INSERT, DELETE and UPDATE query generation by QueryGenerator.m

Changes

  • QueryGenerator.m is updated to generate INSERT,DELETE and UPDATE queries
  • run_query_generator() chooses whether to execute QueryGenerator.m to produce read-write queries or just read queries with the help of an additional argument
  • 80% of the time QueryGenerator.m invocation will be to generate read queries (SELECT queries)
  • When read-write query generation is chosen, the distribution of queries are as follows
    • 80% SELECT queries
    • Equal probability of INSERT, DELETE and UPDATE queries in the remaining 20% of the queries
  • This change introduces INSERT queries of the following format
    • INSERT INTO tableName (VALUES(col1,col2,col3),(col1,col2,col3),..)
  • This change introduces DELETE queries of the following format
    • DELETE FROM tableName WHERE expression
  • This change introduces UPDATE queries of the following format
    • UPDATE tableName SET col1=value,col2=value,..
    • UPDATE tableName SET col1=value,col2=value,.. WHERE expression
  • Updated test_helpers.bats.in->run_query_generator() to not include unnecessary - and 0 in query file name. Also, stderr/stdout output from QueryGenerator.m is now redirected to QG_stderr.txt/QG_stdout.txt to avoid cluttering stdout/stderr.
  • start_rocto invocation is changed in the following way because it is now possible for the queries to be read or write type. Having -aw allows schema changes and readwrite on the tables.
    -       test_port=$(start_rocto 1344 --allowschemachanges)
    +       test_port=$(start_rocto 1344 quiet -aw)
  • names table definition is updated in test/fixtures/names.sql to have VARCHAR(30) instead of TEXT(30) as Postgres doesn't support TEXT(30) and during duplicate table definition generation we parse this .sql file to generate a CREATE TABLE query. This query has to work in both Postgres and Octo as the queries go through cross check interface. Hence the change.
  • Now that test_query_generator.bats.in generates DML operations, cmake/bats-tests.cmake is updated to use ADD_BATS_TEST_DML macro for test_query_generator.bats.in such that hello_dp test execution isn't effected by the DML operations
  • The new design of run_query_generator() is as follows
    • Invoke QueryGenerator.m with an additional parameter to inform the type of queries to be generated
      • When QueryGenerator.m includes write queries, additional queries to copy the original table to a different table whose name will be of the form tableNameSubtestname (customersTQG01) are also generated to a file. It also generates DROP TABLE queries for the newly created table as well. In total 3 files will be generated, tabledefinition-TQG**.sql(table definition file), drop-tabledefinition-TQG**.sql (drop table definition file) and the actual query file.
      • run_query_generator() executes table definition files first and captures Postgres and Octo specific outputs in separate files.
      • Then the Octo and Postgres cross check function is invoked on the query file generated previously.
      • Then run_query_generator() executes DROP TABLE queries to provide a clean state for the next test execution.
  • Changes done to QueryGenerator.m itself
    • First query generator sets up some common variables required for SELECT and other query generation
    • Selects whether to generate only SELECT queries or a mix of SELECT/INSERT/DELETE/UPDATE based on additional argument received
    • In the later case genDuplicateTable() is invoked to parse Octo and Postgres table definitions and generate queries to create duplicate table with different name. This is done to avoid manipulating the original table.
    • readSql(), fillDefaultNamesLastnameTbl() and fillDefaultCompositeTbl() is updated to store table related information in lvn's under the new duplicate table name when write queries are being generated.
    • primaryKeyColumn(tableName,columnName)=columnType lvn is maintained to help generation of query elements which modify KEY columns
    • genSelectQuery() is used to generate select query
    • genWriteQuery() is used to generator INSERT/DELETE/UPDATE query
    • genSelectQuery() has the old code which generated SELECT queries previously
    • genWriteQuery() further makes use of genInsertQuery(), genDeleteQuery() and genUpdateQuery() to generate INSERT, DELETE and UPDATE queries
    • Each of the above routines take care of writing the generated queries to the output file
    • The output file will contain a mixture of SELECT, INSERT, DELETE and UPDATE when the QueryGenerator.m chooses to include write queries.

Test

  • hello_db.bats.in is updated to include more DROP TABLE statements for PSQL. These are specific to TQG and TJC subtests since these call run_query_generator() and query generator now generates duplicate tables for write queries. In case of an error these duplicate tables might not be deleted. Hence this change.
  • TC051.ref is updated as names table definition was updated from having a TEXT field to VARCHAR field
  • Removed unnecessary - usage in argument for run_query_generator() in test_jdbc_connection.bats.in
  • Removed unnecessary - usage in argument for run_query_generator() in test_query_generator.bats.in

Implementation of UNIQUE and CHECK constraint included CREATE TABLE statements by QueryGenerator.m

Changes

  • Mainly the table definition queries which previous description explained now will include constraints (UNIQUE and CHECK)
  • test framework is updated to perform cross check between Octo and Postgres - UNIQUE, PRIMARY and CHECK related ERROR messages.
  • QueryGenerator.m chooses to include CONSTRAINTS 10% of the time when its generating read and write queries. Note read and write queries will be opted to be generated with a probability of 20%.
  • Octo's Table definitions (names.sql, customers.sql ...) and in some cases Postgres table definition (postgres-nullnames.sql) are updated to have similar column definition as its counterpart.

More information regarding the changes done is listed below.

Changes done to test_helpers:

  • test_helpers.bats.in is updated by appending || true to Octo and Postgres execution commands in run_query_in_octo_and_postgres_and_crosscheck(). This helps to prevent a test abort when the Octo or Postgres command results in an error output.
  • The result obtained from Postgres and Octo execution in run_query_in_octo_and_postgres_and_crosscheck() is now passed through a sed command when insert, and update statements are being processed. The sed command processes the output and renders a formatted output which can easily be compared. This sed command is based off of the constraint violation cross check done in a3e3ffbe. Three different patterns of the sed command is maintained as the output with JDBC driver, PSQL and octo differs a little bit. For example, JDBC driver/PSQL output will not have [ERROR] ERR_CHECK_CONSTRAINT_VIOLATION included as part of its output when a check constraint violation happens, where as it is included in octo's output.
    • More details about the sed command is below
      • The sed command formats Postgres and Octo output such that the following errors can be compared. They are expected to have the following text when comparison is performed. Note in case of Rocto or JDBC driver usage the ERROR tag starting with ERR_ is not present in the final text as Octo's output doesn't include this tag in these cases.
        • ERR_CHECK_CONSTRAINT_VIOLATION - New row for table [table name] violates check constraint
        • ERR_DUPLICATE_KEY_VALUE - Duplicate Key Value violates UNIQUE constraint
        • ERR_NULL_COL_VALUE - NULL value in column [column name] violates NOT NULL constraint
      • Following are additional formatting performed
        • id -> ID
        • Lines starting with DETAIL is removed
        • Lines having NOTICE is removed
        • Lines starting with Failing row contains is removed
      • (of relation .) pattern is used to skip of relation "orderstqg01" string in the error message for ERR_NULL_COL_VALUE. This additional string was noticed in psql (PostgreSQL) 14.5 (Ubuntu 14.5-1ubuntu1) and psql (PostgreSQL) 13.8 (Debian 13.8-0+deb11u1). Refer to !1203 (comment 1218239838) for more details.
  • PSQL client related code in run_query_in_octo_and_postgres_and_crosscheck() is updated to include a sed command which copies over ERROR lines from error file to output file. When an error occurs (like ERR_CHECK_CONSTRAINT_VIOLATION) its output is directed to the stderr which in this case was a file. So, to be able to compare results the error message had to be copied over to the output file.
  • run_query_generator() is also modified to now split the table definition queries (those created by QueryGenerator.m) before execution. This is done because if an INSERT query (one among the table definition queries generated by QueryGenerator.m) fails (due to constraint violation) then the queries after that were not allowed to execute. Even with || true usage this is the case as the execution stopped at the point where ERROR occurred, which in this case was the INSERT query. To avoid such scenarios, table defining queries from QueryGenerator.m is split making use of split_queries.py. By doing this each query is executed separately and even if one fails the others can still be executed.
  • run_query_generator() is modified to run the same table definition file generated by QueryGenerator.m on both Octo and Postgres. Previously two different table definition files were generated by QueryGenerator.m. This helped to avoid two split_queries.py invocation. Also, drop table definition is now directly created by QueryGenerator.m, so commands in this routine to extract the DROP table queries from table definition files have been removed.

Changes done to run_multiple_query_files.java:

  • Even with || true appended to run_java() invocation in run_query_in_octo_and_postgres_and_crosscheck(), the execution halted after first ERROR output. This is because run_java() makes use of run_multiple_query_files.java to execute a query and the class in the .java file definition wasn't implemented to continue query execution even after a query issued an ERROR. Code in it is updated to catch an SqlException and let the test execution continue. When the exception is caught the exception message which includes the ERROR message is written to the output file itself. After this change run_java() is enabled to be used to perform ERROR cross check which was required to implement Constraint testing in QueryGenerator.m.

Changes to QueryGenerator.m:

  • writeTableDefinition() introduced in previous commit is updated to parse in such in a way which makes constraint addition possible on columns of the table definition.
  • addConstraint() is introduced to update the columnNameAndType string by appending it with CHECK/UNIQUE constraint.
    • This function chooses to add a constraint only 10% of the time
    • CHECK and UNIQUE is chosen with equal probability
    • UNIQUE and CHECK is placed at the end of the column definition passed to this function.
    • CHECK constraint is formed out of a simple expression which composes of a column, an entry from that column and a comparison operator i.e. CHECK (col op value)
  • All the code which got introduced in the previous commit to select a unique value when updating a primary key column has been removed which resulted in simplification of lot of code.
  • table definition file opening and closing is moved closer to its usage as its previous placement allowed for addConstraint() to unintentionally write debug information to the table definition file.
  • genDuplicateTable() introduced in the previous commit is modified to create a single table definition file for both Octo and Postgres. This was only possible because of the removal of differences between Octo and Postgres table definition files. The change to original table definition files has been listed under section - Changes to DDL files needs to be documented along with information on how and why it effects tests. Since both table definition are mostly same, this function makes use of Postgres table definition file to create the new table definition.
  • writeTableDefinition() which does the actual creation of new table definition file is updated to write DROP table queries to a separate file such that the caller doesn't have to extract these queries from the table definition files like how it was originally. Changes to DDL files needs to be documented along with information on how and why it effects tests
  • names.sql
    • The changes done makes column definitions in names.sql same as that of postgres-names.sql.
  • postgres-nullnames.sql
    • The column definition for salary in nullnames and nullnamesb were changed from NUMERIC(20, 2) to NUMERIC
    • This made QueryGenerator.m parsing changes to generate the new table definition simple. This change had no impact on the test system and the change makes postgres-nullnames.sql same as nullnames.sql (Octo's table definition).
  • nullcharnames.sql
    • The changes done makes column definitions in nullcharnames.sql same as that of postgres-nullcharnames.sql.
  • customers.sql
    • The changes done makes column definitions in customers.sql same as that of postgres-customers.sql.
  • test_createtable
    • TC030 - customers.sql changes above resulted in this change
  • test_discard_all
    • TDA01 - customers.sql changes above results in this change
  • test_drop_table
    • TDT04 - customers.sql changes above results in this change

Additional usecases implemented in addition to those discussed above

  • Place UNIQUE and CHECK constraint at the end of table definition and apply it on one or more columns
  • List constraint definition and column definition in genDuplicateTable() in mixed order. This way constraints appearing in between column definition gets tested. At present its only added at the end of column column definition and at the end of table definition.
  • CHECK constraint additions
    • coalesce, nullIf, greatest and Least based CHECK constraint
    • samevalue() based CHECK constraint
    • NULL value based CHECK constraint
    • 0 or 0.0 value based CHECK constraints
    • CHECK(col IS NOT NULL) type of constraint
    • multiple column based CHECK constraint
  • UNIQUE with multiple columns is also added in this commit
  • Include queries like - INSERT INTO tableName (col1,col2,col3) (VALUES(col1,col2,col3),(col1,col2,col3),..)
  • Update SET with nullif,greatest,least,coalesce,samevalue() and other operator based expressions

Future TODO

  • ; #FUTURE_TODO: Include subquery and Case statements in UPDATE's SET list entry
  • ; #FUTURE_TODO: Include queries like - INSERT INTO tableName (SELECT col1,col2,col3 ...)
  • ; #FUTURE_TODO: Include queries like - INSERT INTO tableName (col1,col3,col3) (SELECT col1,col2,col3 ..)
  • ; #FUTURE_TODO: Include queries with UPDATE's SET list entry having more versatile expressions than a simple literal value from the column itself

Misc

  • test_helpers.bats.in -> Postgres and Octo's cross check function -> Error related sed command is updated to account for duplicate column error and NOT NULL constraint violation
  • The sed command is updated in the following way
    • The sed command formats Postgres and Octo output such that the following errors can be compared. They are expected to have the following text when comparison is performed. Note in case of Rocto or JDBC driver usage the ERROR tag starting with ERR_ is not present in the final text as Octo's output doesn't include this tag in these cases.
      • ERR_NULL_COL_VALUE - NULL value in column [column name] violates NOT NULL constraint
      • ERR_DUPLICATE_COLUMN - column '[column name]' specified more than once
    • Following are additional formatting performed
      • Lines containing LINE is removed
      • Lines starting with ^ is removed
Edited by Ganesh Mahesh

Merge request reports