Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function in Y axis in heatmaps results in duplicate columns #31974

Open
3 tasks done
Phyks opened this issue Jan 23, 2025 · 4 comments
Open
3 tasks done

Function in Y axis in heatmaps results in duplicate columns #31974

Phyks opened this issue Jan 23, 2025 · 4 comments
Labels
viz:charts:heatmap Related to the Heatmap chart

Comments

@Phyks
Copy link

Phyks commented Jan 23, 2025

Bug description

Starting from Superset 4.1 (and confirmed in 4.1.1), probably related to the heatmaps updates, it no longer works to define Y axis as a function on a field.

Having Y axis set to foobar (with foobar a field in your dataset) works, but setting it to REPLACE('foobar', '', '') (a SQL function of your field) results in duplicating all the fields (X axis and Y axis) in the produced SQL query and then crashing.

I'm using Spark as a datasource, with Apache Hive connector.

Best

Screenshots/recordings

No response

Superset version

4.1.1

Python version

3.11

Node version

18 or greater

Browser

Firefox

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Copy link

dosubot bot commented Jan 23, 2025

Hey @Phyks! I'm here to assist you with any bugs, questions, or contributions. Let's tackle this issue together!

The issue you're experiencing with duplicate columns when using a SQL function in the Y-axis configuration for heatmaps in Superset 4.1.1 seems to be related to changes in the heatmap SQL query generation logic. While I couldn't find specific details about changes in version 4.1 that might cause this, the SQL query generation for heatmaps is handled by the buildQuery function in the file superset-frontend/plugins/plugin-chart-echarts/src/Heatmap/buildQuery.ts [1].

The expected behavior when using SQL functions in axis configurations is that the SQL expression should be supported and incorporated into the query without causing duplication [2]. However, it seems there might be a bug affecting this functionality in your case.

Since you're using Spark with the Apache Hive connector, it's also possible that there might be specific issues or limitations with this setup, although I couldn't find any documented ones related to heatmaps.

To address this, you might want to check the buildQuery logic for any recent changes or consider creating a minimal reproducible example to further investigate the issue.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Other

@dosubot dosubot bot added the viz:charts:heatmap Related to the Heatmap chart label Jan 23, 2025
@rusackas
Copy link
Member

Seems to be working on master when I try it. Can you give us an example/screenshot with an example dataset? Here I'm using Vehicle Sales.

Image

Before REPLACE:
Image

After:
Image

@Phyks
Copy link
Author

Phyks commented Jan 24, 2025

Non-working example for me in 4.1.1:

  • In SQLLab, run:
SELECT 'Jan' AS Qtr1, 'Apr' AS Qtr2
UNION ALL SELECT 'Feb' AS Qtr1, 'May' AS Qtr2
UNION ALL SELECT 'Mar' AS Qtr1, 'Jun' AS Qtr2

and save as a new dataset.

  • In Charts:

Image

It works with the "legacy" heatmap though. When looking at the generated SQL query, it looks like

SELECT `Qtr1` AS `Qtr1`, REPLACE(Qtr2, '', '') AS `My column`, COUNT(*) AS `count`, `Qtr1` AS `Qtr1`, REPLACE(Qtr2, '', '') AS `My column` 
FROM (SELECT 'Jan' AS Qtr1, 'Apr' AS Qtr2
UNION ALL SELECT 'Feb' AS Qtr1, 'May' AS Qtr2
UNION ALL SELECT 'Mar' AS Qtr1, 'Jun' AS Qtr2
) AS `virtual_table` GROUP BY `Qtr1`, REPLACE(Qtr2, '', '') ORDER BY `Qtr1` ASC, `My column` ASC
 LIMIT 1000;

which has duplicates in the outer SELECT.

Thanks

@Phyks
Copy link
Author

Phyks commented Jan 24, 2025

It also seems impossible to import a legacy heatmap in latest superset (4.1.1):

  • If I export a new heatmap chart, the YAML file has a viz type "heatmap v2".
  • If I export a legacy heatmap chart, the YAML file has a viz type "heatmap".

But when importing a "heatmap" type into Superset 4.1.1, it seems to always be promoted to a "heatmap v2" (migration logic?). Therefore, it seems impossible to export and reimport a legacy heatmap chart as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
viz:charts:heatmap Related to the Heatmap chart
Projects
None yet
Development

No branches or pull requests

2 participants