[Bug]: Hop GUI + Get file names - error in showing output fields #4843

Open
dave-csc opened this issue Jan 27, 2025 · 6 comments

@dave-csc
Contributor

dave-csc commented Jan 27, 2025

Apache Hop version?

2.11.0, 2.12.0 SNAPSHOT (2025-01-28)

Java version?

17.0.2

Operating system

Linux

What happened?

The Get file names transform emits only its own set of output fields and discards any input fields after processing.

However, Hop GUI shows the input fields as if they were propagated: you can select them in subsequent transforms, but executing the pipeline will most likely fail with an error such as "Field *** not defined in the input stream".
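
The mismatch described above can be illustrated with a small model. This is only a sketch under stated assumptions: it does not use Hop's API, fields are plain lists of names, `some_input_field` is a hypothetical upstream field, and the field names other than `filename` are illustrative.

```python
# Simplified model of the reported behaviour; NOT Hop's real API.
# Illustrative subset of the fields produced by Get file names (assumption).
GET_FILE_NAMES_FIELDS = ["filename", "short_filename", "path"]


def fields_shown_in_gui(input_fields):
    """Design time, as reported: input fields are listed as if passed through."""
    return list(input_fields) + GET_FILE_NAMES_FIELDS


def fields_emitted_at_runtime(input_fields):
    """Run time, as reported: only the transform's own fields are emitted."""
    return list(GET_FILE_NAMES_FIELDS)


def phantom_fields(input_fields):
    """Fields selectable in the GUI that do not exist in the executed rows."""
    runtime = set(fields_emitted_at_runtime(input_fields))
    return [f for f in fields_shown_in_gui(input_fields) if f not in runtime]


print(phantom_fields(["some_input_field"]))  # ['some_input_field']
```

Any field returned by `phantom_fields` is one that a downstream transform can be configured with, yet will not find when the pipeline runs.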

Issue Priority

Priority: 2

Issue Component

Component: Hop Gui, Component: Transforms

@nadment
Contributor

nadment commented Jan 27, 2025

Can you provide a basic example pipeline to illustrate the problem?

@dave-csc
Contributor Author

Hi @nadment, here's a basic pipeline:

  1. Add a transform that generates a single row with a static path (using e.g. a Data grid or a Get variables): name the field dst_file
  2. Link this transform to a Get file names, specify the file selection in order to look for a single file
  3. Link the latter transform to a Process files, specify Operation = Move, Source file = filename (generated by Get file names), Target file = dst_file (as named above)

The Process files transform fails because the field dst_file is not found in the input stream:

2025/01/28 12:30:07 - Sposta file.0 - ERROR: Unexpected error
2025/01/28 12:30:07 - Sposta file.0 - ERROR: org.apache.hop.core.exception.HopException: 
2025/01/28 12:30:07 - Sposta file.0 - Couldn't find field 'dst_file' in row!
2025/01/28 12:30:07 - Sposta file.0 - 
2025/01/28 12:30:07 - Sposta file.0 - 	at org.apache.hop.pipeline.transforms.processfiles.ProcessFiles.processRow(ProcessFiles.java:87)
2025/01/28 12:30:07 - Sposta file.0 - 	at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:54)
2025/01/28 12:30:07 - Sposta file.0 - 	at java.base/java.lang.Thread.run(Thread.java:833)
2025/01/28 12:30:07 - Sposta file.0 - Finished processing (I=0, O=0, R=1, W=0, U=0, E=1)
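
For readers without a Hop installation, the three steps and the resulting failure can be approximated as follows. This is a rough simulation, not Hop code: rows are modelled as dictionaries, the function names are hypothetical, and both paths are made up for illustration.

```python
class FieldNotFound(Exception):
    """Stands in for the HopException seen in the log above."""


def generate_row():
    # Step 1: a single row carrying a static target path (made-up value).
    yield {"dst_file": "/tmp/target/report.csv"}


def get_file_names(rows, matched_files):
    # Step 2, as observed in this issue: incoming rows are not forwarded,
    # so only the transform's own fields appear in the output.
    for name in matched_files:
        yield {"filename": name}


def process_files(rows, source_field, target_field):
    # Step 3: both configured fields must exist in every incoming row.
    for row in rows:
        for field in (source_field, target_field):
            if field not in row:
                raise FieldNotFound(f"Couldn't find field '{field}' in row!")
        yield (row[source_field], row[target_field])


def pipeline_error():
    """Run the three steps; return the error message, or None on success."""
    rows = get_file_names(generate_row(), ["/tmp/source/report.csv"])
    try:
        list(process_files(rows, "filename", "dst_file"))
    except FieldNotFound as err:
        return str(err)
    return None


print(pipeline_error())  # Couldn't find field 'dst_file' in row!
```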

However, Hop GUI still presents dst_file as available in several places:

  • select the Get file names, and then Show output fields: the dst_file field is listed
  • select the Process files, and then Show input fields: the dst_file field is listed
  • when configuring the Process files transform, dst_file is a selectable field

This issue is found in the latest Hop snapshot, too.

Note: in this specific scenario, you can successfully move the file by swapping the first two transforms. The bug is that subsequent transforms list fields that are not actually usable at run time.
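
A minimal sketch of why the swapped order works, assuming the second transform (for example a Get variables) appends its field to every incoming row. The function names and paths are hypothetical, and rows are modelled as dictionaries rather than Hop rows.

```python
def get_file_names(matched_files):
    # Now first in the pipeline: nothing upstream can be lost.
    for name in matched_files:
        yield {"filename": name}


def add_static_field(rows, name, value):
    # Stands in for a transform that appends one field to each incoming row.
    for row in rows:
        yield {**row, name: value}


def plan_moves(rows, source_field, target_field):
    # Stands in for Process files with Operation = Move; nothing is moved here.
    return [(row[source_field], row[target_field]) for row in rows]


rows = add_static_field(
    get_file_names(["/tmp/source/report.csv"]),  # made-up path
    "dst_file",
    "/tmp/target/report.csv",  # made-up path
)
moves = plan_moves(rows, "filename", "dst_file")
print(moves)  # [('/tmp/source/report.csv', '/tmp/target/report.csv')]
```

Both fields reach the last transform because the field is added after Get file names rather than before it.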

@nadment
Contributor

nadment commented Jan 28, 2025

It works as expected. Did you add the definition in “Selected files”?

I admit that the user interface isn't very user-friendly.

Image

@nadment
Contributor

nadment commented Jan 28, 2025

The user interface needs to be redesigned. Would a simple add button in the folder/file list be more understandable and practical?

Image

@dave-csc
Contributor Author

Hi @nadment,

my report wasn't about the interface (even though I agree with the proposed redesign); it's about the fact that you can't actually access fields declared before the Get file names transform.

To check this, try replacing the Dummy in your example with a Write to log: in Hop GUI you can select the fields generated in the Data grid for logging, but at execution time the selected fields aren't found in the input.

@nadment
Contributor

nadment commented Jan 29, 2025

OK, I can reproduce the problem with the attached pipeline:
“Is filename defined in a field” checked -> OK
“Is filename defined in a field” not checked -> ERROR

test-get-file-names.zip
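
The two outcomes above can be summarised in a small model. It is a sketch based only on what this thread reports, not on Hop's source code: the assumption is that input fields travel along when the file name comes from a field, are dropped otherwise, and that the GUI lists them in both cases. All names are hypothetical.

```python
OWN_FIELDS = ["filename"]  # illustrative subset of the transform's own fields


def runtime_fields(input_fields, filename_from_field):
    if filename_from_field:
        # Each input row is read, so its fields accompany the file fields.
        return list(input_fields) + OWN_FIELDS
    # Static file selection: only the transform's own fields are emitted.
    return list(OWN_FIELDS)


def gui_fields(input_fields, filename_from_field):
    # As reported: the GUI lists the input fields regardless of the option.
    return list(input_fields) + OWN_FIELDS


def is_consistent(input_fields, filename_from_field):
    """True when the GUI and the executed pipeline agree on the field list."""
    return gui_fields(input_fields, filename_from_field) == runtime_fields(
        input_fields, filename_from_field
    )


for checked in (True, False):
    status = "OK" if is_consistent(["dst_file"], checked) else "ERROR"
    print(f"Is filename defined in a field = {checked}: {status}")
```

Under these assumptions, only the unchecked case shows a difference between what the GUI lists and what the executed rows contain.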
