After modifying PostgreSQLLexer.g4 and PostgreSQLParser.g4 to include LineComment and BlockComment in the commentstmt rule and removing -> channel(HIDDEN), why is the enterCommentstmt method in PostgreSQLParserBaseListener.java still not being executed? #4376

drakshayanin · 2025-01-08T13:17:29Z

I am working with PostgreSQLLexer.g4 and PostgreSQLParser.g4. I have generated the support files from these grammars, including PostgreSQLParserBaseListener.java. That file declares the enterCommentstmt method, which I have overridden, but my override is never executed.

@Override
public void enterCommentstmt(PostgreSQLParser.CommentstmtContext ctx) {}
After reviewing the Lexer file, I found the following definitions:

LineComment: '--' ~ [\r\n]* -> channel(HIDDEN);

BlockComment:
('/*' ('/'* BlockComment | ~ [/*] | '/'+ ~ [/*] | '*'+ ~ [/*])* '*'* '*/') -> channel(HIDDEN);
In PostgreSQLLexer.g4, I have removed the -> channel(HIDDEN) as follows:

LineComment: '--' ~ [\r\n]* ;

BlockComment:
('/*' ('/'* BlockComment | ~ [/*] | '/'+ ~ [/*] | '*'+ ~ [/*])* '*'* '*/');
In PostgreSQLParser.g4, I added LineComment and BlockComment to the commentstmt rule as shown below:

commentstmt
: LineComment
| BlockComment;
However, after making these changes, the enterCommentstmt method is still not executed. How should I proceed?

kaby76 · 2025-01-08T14:16:42Z

You don't give the input. It's impossible to answer. Likely, your input does not parse, but we can't do anything without the input to "see" the parse tree.

drakshayanin · 2025-01-08T14:23:15Z

Input.sql:

-- Sample SELECT statement
SELECT id, name, salary
FROM employees
WHERE salary > 50000;

-- Sample INSERT statement
INSERT INTO employees (id, name, salary)
VALUES (1, 'John Doe', 55000);

kaby76 · 2025-01-08T23:54:17Z

The parse fails on line 2. The parse tree is incomplete.

I would recommend that you introduce a "comment" mode in the lexer. There is no way your changes can work.

Also, backing up a bit, you are trying to fit a square peg in a round hole. Why are you trying to extract comments through Antlr visitors or listeners? The comments are on the token stream as HIDDEN.
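
To make the point about HIDDEN concrete, here is a minimal sketch in plain Java with no ANTLR dependency. The `Tok` and `ChannelDemo` classes are hypothetical stand-ins, not part of the ANTLR runtime or the generated PostgreSQL classes; only the channel numbers (0 for the default channel, 1 for hidden) mirror ANTLR's constants. It illustrates that the lexer still buffers comment tokens, while the parser only consumes tokens on the default channel, which is why the comments never show up as parse tree nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an ANTLR Token: only the text and channel matter here.
final class Tok {
    static final int DEFAULT_CHANNEL = 0; // mirrors Token.DEFAULT_CHANNEL
    static final int HIDDEN_CHANNEL = 1;  // mirrors Token.HIDDEN_CHANNEL
    final String text;
    final int channel;

    Tok(String text, int channel) {
        this.text = text;
        this.channel = channel;
    }
}

final class ChannelDemo {
    // Roughly what a lexer would buffer for the input "-- note\nSELECT 1;".
    static List<Tok> sampleTokens() {
        return Arrays.asList(
            new Tok("-- note", Tok.HIDDEN_CHANNEL),
            new Tok("\n", Tok.HIDDEN_CHANNEL),
            new Tok("SELECT", Tok.DEFAULT_CHANNEL),
            new Tok(" ", Tok.HIDDEN_CHANNEL),
            new Tok("1", Tok.DEFAULT_CHANNEL),
            new Tok(";", Tok.DEFAULT_CHANNEL));
    }

    // Texts of the tokens on one channel, in stream order.
    static List<String> textsOnChannel(List<Tok> tokens, int channel) {
        List<String> out = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.channel == channel) {
                out.add(t.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> tokens = sampleTokens();
        System.out.println("buffered tokens: " + tokens.size());
        System.out.println("parser sees: " + textsOnChannel(tokens, Tok.DEFAULT_CHANNEL));
        System.out.println("hidden: " + textsOnChannel(tokens, Tok.HIDDEN_CHANNEL));
    }
}
```

The comment is never lost: it sits in the same buffer as the tokens the parser consumed, just on a different channel.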

drakshayanin · 2025-01-09T06:37:50Z

I need the code chunk so that the comments can also be read as data. I have modified the lexer file as shown below, but now the support files are no longer generated when I run the tool.

COMMENT_MODE: // A custom lexer mode for comments
{ // Switch to the COMMENT_MODE for comment handling
'--' ~[\r\n]* -> skip; // line comment (skip)
'/*' .*? '*/' -> skip; // block comment (skip)
};

LineComment: '--' ~ [\r\n]* -> pushMode(COMMENT_MODE));

BlockComment:
('/*' ('/'* BlockComment | ~ [/*] | '/'+ ~ [/*] | '*'+ ~ [/*])* '*'* '*/') -> pushMode(COMMENT_MODE)
;

kaby76 · 2025-01-10T14:29:48Z

It doesn't work because the input does not parse. (You should print out the parse tree, parse result, and tokens for your input.) You could fix this by inserting a semi-colon after the comment in the lexer mode to make the comment really look like a "statement", or redo the grammar even more to allow "comment statements" to not require a following semi-colon. But I would not do any of this. Antlr parse trees don't contain intertoken or "hidden" tokens. And changing the grammar to make comments work in the parse is going to introduce all sorts of problems.

Here's what you should do:

Leave the grammar unmodified.
Write a visitor or listener for the parse tree node you want to examine, and look directly in the token stream for comment tokens within the token interval that the node covers.

For example, if you override the visitor or listener for stmtmulti, you can get the token index for the left-most leaf node for each stmt or the token index for each SEMI. You'll have to write a recursive function (or visitor) to go down the tree and get this token index. Then, use CommonTokenStream.getTokens().get(index) to get at the comment token(s), checking the token type to make sure you get comments, not whitespace. Then, you can sync up the comment with the following statement. Or just iterate over CommonTokenStream.getTokens() and look for LineComment and BlockComment.
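
The backwards scan described above can be sketched without the ANTLR runtime. Everything in this block is a hypothetical model: `LexToken`, `CommentCollector`, and the sample token list are made up for illustration and loosely follow the Input.sql posted earlier. With the real runtime, the list would come from `CommonTokenStream.getTokens()`, the start index from `ctx.getStart().getTokenIndex()`, and `CommonTokenStream.getHiddenTokensToLeft(index)` should perform a similar scan, though you would still need to filter out whitespace by token type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical token model: kind, text, and whether it is on the hidden channel.
final class LexToken {
    enum Kind { LINE_COMMENT, BLOCK_COMMENT, WHITESPACE, OTHER }

    final Kind kind;
    final String text;
    final boolean hidden;

    LexToken(Kind kind, String text, boolean hidden) {
        this.kind = kind;
        this.text = text;
        this.hidden = hidden;
    }
}

final class CommentCollector {
    // Walk left from a statement's first token while tokens are hidden,
    // keeping only comments and skipping whitespace.
    static List<String> commentsBefore(List<LexToken> tokens, int startIndex) {
        List<String> found = new ArrayList<>();
        for (int i = startIndex - 1; i >= 0 && tokens.get(i).hidden; i--) {
            LexToken t = tokens.get(i);
            if (t.kind == LexToken.Kind.LINE_COMMENT || t.kind == LexToken.Kind.BLOCK_COMMENT) {
                found.add(t.text);
            }
        }
        Collections.reverse(found); // restore source order
        return found;
    }

    // Abbreviated token list for two commented statements.
    static List<LexToken> sample() {
        return Arrays.asList(
            new LexToken(LexToken.Kind.LINE_COMMENT, "-- Sample SELECT statement", true), // 0
            new LexToken(LexToken.Kind.WHITESPACE, "\n", true),                           // 1
            new LexToken(LexToken.Kind.OTHER, "SELECT", false),                           // 2
            new LexToken(LexToken.Kind.WHITESPACE, " ", true),                            // 3
            new LexToken(LexToken.Kind.OTHER, "id", false),                               // 4
            new LexToken(LexToken.Kind.OTHER, ";", false),                                // 5
            new LexToken(LexToken.Kind.WHITESPACE, "\n\n", true),                         // 6
            new LexToken(LexToken.Kind.LINE_COMMENT, "-- Sample INSERT statement", true), // 7
            new LexToken(LexToken.Kind.WHITESPACE, "\n", true),                           // 8
            new LexToken(LexToken.Kind.OTHER, "INSERT", false),                           // 9
            new LexToken(LexToken.Kind.OTHER, ";", false));                               // 10
    }

    public static void main(String[] args) {
        List<LexToken> tokens = sample();
        // Indices 2 and 9 are where the two statements start in this model.
        System.out.println(commentsBefore(tokens, 2));
        System.out.println(commentsBefore(tokens, 9));
    }
}
```

Stopping at the first non-hidden token is what ties each comment to the statement that follows it rather than to the one before.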
