-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Function in Y axis in heatmaps results in duplicate columns #31974
Comments
Hey @Phyks! I'm here to assist you with any bugs, questions, or contributions. Let's tackle this issue together! The issue you're experiencing with duplicate columns when using a SQL function in the Y-axis configuration for heatmaps in Superset 4.1.1 seems to be related to changes in the heatmap SQL query generation logic. While I couldn't find specific details about changes in version 4.1 that might cause this, the SQL query generation for heatmaps is handled by the The expected behavior when using SQL functions in axis configurations is that the SQL expression should be supported and incorporated into the query without causing duplication [2]. However, it seems there might be a bug affecting this functionality in your case. Since you're using Spark with the Apache Hive connector, it's also possible that there might be specific issues or limitations with this setup, although I couldn't find any documented ones related to heatmaps. To address this, you might want to check the To continue talking to Dosu, mention @dosu. Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Other |
Non-working example for me in 4.1.1:
SELECT 'Jan' AS Qtr1, 'Apr' AS Qtr2
UNION ALL SELECT 'Feb' AS Qtr1, 'May' AS Qtr2
UNION ALL SELECT 'Mar' AS Qtr1, 'Jun' AS Qtr2 and save as a new dataset.
It works with the "legacy" heatmap though. When looking at the generated SQL query, it looks like SELECT `Qtr1` AS `Qtr1`, REPLACE(Qtr2, '', '') AS `My column`, COUNT(*) AS `count`, `Qtr1` AS `Qtr1`, REPLACE(Qtr2, '', '') AS `My column`
FROM (SELECT 'Jan' AS Qtr1, 'Apr' AS Qtr2
UNION ALL SELECT 'Feb' AS Qtr1, 'May' AS Qtr2
UNION ALL SELECT 'Mar' AS Qtr1, 'Jun' AS Qtr2
) AS `virtual_table` GROUP BY `Qtr1`, REPLACE(Qtr2, '', '') ORDER BY `Qtr1` ASC, `My column` ASC
LIMIT 1000; which has duplicates in the outer SELECT. Thanks |
It also seems impossible to import a legacy heatmap in latest superset (4.1.1):
But when importing a "heatmap" type into Superset 4.1.1, it seems to always be promoted to a "heatmap v2" (migration logic?). Therefore, it seems impossible to export and reimport a legacy heatmap chart as is. |
Bug description
Starting from Superset 4.1 (and confirmed in 4.1.1), probably related to the heatmaps updates, it no longer works to define Y axis as a function on a field.
Having Y axis set to
foobar
(withfoobar
a field in your dataset) works, but setting it toREPLACE('foobar', '', '')
(a SQL function of your field) results in duplicating all the fields (X axis and Y axis) in the produced SQL query and then crashing.I'm using Spark as a datasource, with Apache Hive connector.
Best
Screenshots/recordings
No response
Superset version
4.1.1
Python version
3.11
Node version
18 or greater
Browser
Firefox
Additional context
No response
Checklist
The text was updated successfully, but these errors were encountered: