Merge pull request #4854 from bamaer/4117
updated table input docs. #4117
hansva authored Feb 1, 2025
2 parents 84a3ada + fff3a82 commit fa2a082
Showing 1 changed file with 62 additions and 11 deletions.
@@ -24,9 +24,14 @@ under the License.

[%noheader,cols="3a,1a", role="table-no-borders" ]
|===
|
a|
== Description
The Table Input transform is used to read information from a database, using a connection and SQL. Basic SQL statements can be generated automatically by clicking Get SQL select statement. SQL queries can be parameterized through variables and can accept input from previous transform fields.

The Table Input transform is used to read information from a relational database using a connection and a SQL query.


SQL queries can be parameterized through variables and can accept input from previous transform fields. +

|
== Supported Engines
[%noheader,cols="2,1a",frame=none, role="table-supported-engines"]
@@ -38,17 +43,35 @@ The Table Input transform is used to read information from a database, using a c
!===
|===

Table input does not pass input data to the output, only fields inside the query are returned to the pipeline so all other variables and data will be lost. You can solve this by adding the variable as a field in the query or put a Get variables transform behind the table input.
== Get all columns from a single table

To use data fields from a transform, you will have to select it in the dropdown "Insert data from transform". Note that if you are using a parametrized query using question marks, you must limit the stream to exactly the fields you need as input to the table input. This can also be done with a Select values transform.
Basic SQL statements can be generated automatically by clicking the `Get SQL select statement` button.

TIP: If you are getting unexpected query results, use the clear database cache icon (broom), or you can use the Hop menu icon: Tools / Clear DB Cache. In addition, click OK, save the pipeline, close and re-open the pipeline.
The database explorer will open and let you select a table directly or from the list of available database schemas.

TIP: A cartesian join transform will combine a different number of fields from multiple table inputs without requiring key join fields.
Views can be shown and selected as views or as tables. Check your database and JDBC driver documentation for more details.

TIP: Using the "insert data from transform" drop down will block until the transform selected has completed.
After selecting a table, a new popup dialog will ask you which SQL statement to generate:

* select all columns individually, e.g. `select col_a, col_b, col_c from my_table;`
* select all columns with a SQL wildcard, e.g. `select * from my_table;`


== Accept input from previous fields

== Examples
The Table Input transform can accept fields from a previous transform specified in the `Insert data from transform` option.

If a transform is selected, the SQL query in this transform will be specified as a https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html[JDBC Prepared Statement^].

Please note that in addition to allowing parameterization, the most important advantage of prepared statements is that they help prevent SQL injection attacks. As a side effect, not all elements in a SQL query can be parameterized through prepared statements.

Use `?` to replace parameters in your SQL query with the values provided by your input transform. These values will be used in the order provided by the input transform. Use a xref:pipeline/transforms/selectvalues.adoc[Select Values] transform to modify your input stream to provide the correct fields in the correct order to your `Table Input` transform.

The `?` option can't be used to specify values for a SQL `IN` clause.

The prepared statement `?` parameters can be used in combination with the `Use variables in your SQL statement` option described below.
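
The positional binding and the `IN` limitation described above are general prepared-statement behavior, so they can be illustrated outside Hop. The following is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for a JDBC connection; it is not Hop code, and the table `my_table`, its columns, and the helper `select_by_ids` are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],
)

# Values bind to the `?` markers strictly in the order they are supplied.
rows = conn.execute(
    "SELECT id, name FROM my_table WHERE id >= ? AND name <> ? ORDER BY id",
    (2, "gamma"),
).fetchall()


def select_by_ids(connection, ids):
    """A single `?` binds exactly one value, so an IN list needs one marker per value."""
    markers = ", ".join("?" for _ in ids)
    sql = f"SELECT id FROM my_table WHERE id IN ({markers}) ORDER BY id"
    return [r[0] for r in connection.execute(sql, list(ids))]


in_rows = select_by_ids(conn, [1, 3])
```

Swapping the two bound values in the first query would change its meaning, which is why the order of the incoming fields matters.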

=== Examples

*Example to use data row field(s) parameterized query:*

@@ -58,6 +81,27 @@ TIP: Using the "insert data from transform" drop down will block until the trans

* Insert data from transform: <point to the transform with the fields>

*Example using a date range*:

``SELECT * FROM customers WHERE changed_date BETWEEN ? AND ?``

This SQL statement expects two calendar dates that define a range. Both are read from the transform selected in the `Insert data from transform` option. The target date range can be provided using the Get System Info transform. For example, if you want to read all customers whose data was changed yesterday, you can get a target range for yesterday and read the customer data.
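
To make the date-range idea concrete, here is a small sketch in Python with the built-in `sqlite3` module. It only mimics what an upstream transform would supply: the helper `yesterday_range`, the sample rows, and the storage of `changed_date` as text are assumptions made for this illustration, not part of Hop.

```python
import sqlite3
from datetime import datetime, time, timedelta


def yesterday_range(now):
    """Return (start, end) strings covering the whole day before `now`."""
    day = now.date() - timedelta(days=1)
    start = datetime.combine(day, time.min)
    end = datetime.combine(day, time(23, 59, 59))
    return start.isoformat(sep=" "), end.isoformat(sep=" ")


# Hypothetical sample data; changed_date is stored as sortable text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, changed_date TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        (1, "2025-01-30 18:00:00"),
        (2, "2025-01-31 09:30:00"),
        (3, "2025-02-01 00:00:01"),
    ],
)

# A fixed "now" keeps the example reproducible.
start, end = yesterday_range(datetime(2025, 2, 1, 12, 0, 0))

# The first `?` receives the range start, the second the range end.
changed = conn.execute(
    "SELECT id FROM customers WHERE changed_date BETWEEN ? AND ? ORDER BY id",
    (start, end),
).fetchall()
```

Only the customer changed on the previous day falls inside the range; the two values must arrive in start, end order for `BETWEEN` to match anything.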


== Use variables in your SQL statement

SQL queries in the `Table Input` transform can be parameterized through the use of variables.

To enable variable replacement in your query, check the `Replace variables in script` checkbox.

This option replaces the variables in your SQL statement with their values before sending the query to the database. Make sure to apply the correct quotes for your query statement, e.g. `select '{openvar}variable value{closevar}' as free_text_col, col_a, col_b from my_table`.

Variable replacement gives you full control over the SQL statement you send to the database, but it does not protect against SQL injection the way the `Accept input from previous fields` option described above does.

The `Use variables in your SQL statement` option can be used in combination with the prepared statement `?` parameters described above.
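
The difference between textual variable replacement and bound parameters can be sketched as follows. This is an illustration in plain Python, not Hop's implementation: `string.Template` stands in for variable replacement, `sqlite3` stands in for the database, and the table `my_table`, the column `col_a`, and the variable name `PRM_NAME` are hypothetical.

```python
import sqlite3
from string import Template

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col_a TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?)", [("x",), ("y",)])

# Textual replacement: the value is pasted into the SQL before it is sent.
query = Template(
    "SELECT col_a FROM my_table WHERE col_a = '${PRM_NAME}' ORDER BY col_a"
)
benign_sql = query.substitute(PRM_NAME="x")
hostile_sql = query.substitute(PRM_NAME="x' OR '1'='1")

benign_rows = conn.execute(benign_sql).fetchall()
# The injected OR clause becomes part of the statement and matches every row.
hostile_rows = conn.execute(hostile_sql).fetchall()

# The same hostile value passed as a bound parameter is treated as plain data.
bound_rows = conn.execute(
    "SELECT col_a FROM my_table WHERE col_a = ? ORDER BY col_a",
    ("x' OR '1'='1",),
).fetchall()
```

The replaced query returns every row for the hostile value, while the bound version returns none, which is why values from untrusted sources are better passed through `?` parameters.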

=== Examples

*Example to use a variable value parameterized query*:

This example uses an integer. If you were using a string, you would most likely have to use the syntax ``{openvar}PRM_NAME{closevar}``
@@ -68,11 +112,18 @@ This examples uses an integer. If you were using a string you most likely will h

* Insert data from transform: <empty>

*Example using a date range*:
== Pro Tips

TIP: The Table input transform does not pass input data to the output. Only fields inside the query are returned to the pipeline, so all other variables and data will be lost. You can solve this by adding the variable as a field in the query or by putting a Get variables transform behind the table input.

TIP: If you are getting unexpected query results, use the clear database cache icon (broom), or you can use the Hop menu icon: Tools / Clear DB Cache. In addition, click OK, save the pipeline, close and re-open the pipeline.

TIP: A cartesian join transform will combine a different number of fields from multiple table inputs without requiring key join fields.

TIP: Using the "insert data from transform" drop down will block until the transform selected has completed.


``SELECT * FROM customers WHERE changed_date BETWEEN ? AND ?``

This SQL statement requests two calendar dates, to create a range, that are read from the Insert data from transform option. The target date range can be provided using the Get System Info transform. For example, if you want to read all customers that have had their data changed yesterday, you can get a target range for yesterday and read the customer data.
== Options

[options="header"]