Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.lang.StackOverflowError during "Requesting Java AST from selection" #3273

Closed
nlisker opened this issue Nov 9, 2024 · 36 comments
Closed
Assignees
Labels
bug Something isn't working jpms
Milestone

Comments

@nlisker
Copy link

nlisker commented Nov 9, 2024

Eclipse soft crashes during hover over some fields with a SOE and asks to restart. This also breaks the compilation, as after this error, many red lines appear saying that various methods can't be found suddenly.
image
image

The first lines of the stack traces are different every time, but the recursion is always the same. Here is the log with various cases of these errors: crashes.log

Here is the infinite recursion (I don't know what the entry point was):

	at org.eclipse.jdt.internal.compiler.Compiler.accept(Compiler.java:365)
	at org.eclipse.jdt.core.dom.CompilationUnitResolver.accept(CompilationUnitResolver.java:226)
	at org.eclipse.jdt.core.dom.CompilationUnitResolver.accept(CompilationUnitResolver.java:221)
	at org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment.askForType(LookupEnvironment.java:391)
	at org.eclipse.jdt.internal.compiler.lookup.PackageBinding.getTypeOrPackage(PackageBinding.java:275)
	at org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.findImport(CompilationUnitScope.java:643)
	at org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.resolveImports(CompilationUnitScope.java:325)
	at org.eclipse.jdt.internal.compiler.lookup.CompilationUnitScope.checkAndSetImports(CompilationUnitScope.java:284)
	at org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment$CompleteTypeBindingsSteps.perform(LookupEnvironment.java:191)
	at org.eclipse.jdt.internal.compiler.lookup.LookupEnvironment.completeTypeBindings(LookupEnvironment.java:602)
	at org.eclipse.jdt.internal.compiler.Compiler.accept(Compiler.java:365)
	at org.eclipse.jdt.core.dom.CompilationUnitResolver.accept(CompilationUnitResolver.java:226)
	at org.eclipse.jdt.core.dom.CompilationUnitResolver.accept(CompilationUnitResolver.java:221)

While I can reproduce this 100% of the times, unfortunately I didn't manage to create a minimal example despite many hours of chopping off parts of the big project where this occurs. This is because the error becomes harder and harder to reproduce the more code is removed, which means that there is a certain load affecting this issue.
I can't post a whole project here, but if there is a way to send the project or post it somewhere private I will do so.

Tested on
Eclipse 4.33 and also build id: I20241101-1800
Java 22
Win 10

@srikanth-sankaran
Copy link
Contributor

I can't post a whole project here, but if there is a way to send the project or post it somewhere private I will do so.

I don't have a place to suggest, but if you organize such a place, I offer to investigate. My ideal reproducer will be a plain vanilla SDK project without layers like maven, gradle etc if possible - I am not familiar with these and it will impede progress. Thanks

@srikanth-sankaran
Copy link
Contributor

srikanth-sankaran commented Nov 12, 2024

I need a few more days though - this week looks tough. I have been burning the midnight candle at both ends if you don't mind my mixing the metaphor :), trying to squeeze in improved support for sealed types as well as switch expressions in 4.34.

I'll let you know once I am able to spare some cycles (later half of next week looks likely

@jukzi jukzi added the bug Something isn't working label Nov 12, 2024
@nlisker
Copy link
Author

nlisker commented Nov 12, 2024

I sent you an invite to the private repo with instructions in the only open issue there. Thank you for taking a look!

Amber features take priority :)

@srikanth-sankaran
Copy link
Contributor

srikanth-sankaran commented Nov 12, 2024

See also

#1081
#1490
#2301

the former has some reduced test case. The last one has a junit too. Need to see if they are all related or even the same

@srikanth-sankaran
Copy link
Contributor

I have verified I am able to access the private repo, clone the repo and import the projects and see the instructions. No luck yet reproducing, but I will spend more time on this next week

@srikanth-sankaran
Copy link
Contributor

In the meantime you may want to check if the hover that triggers the crash is over some symbol statically imported given static imports feature prominently in the other tickets with SOE.

@nlisker
Copy link
Author

nlisker commented Nov 13, 2024

I'll try to narrow it down. For now I'm even getting multiple occurrences
image

@nlisker
Copy link
Author

nlisker commented Nov 13, 2024

Also, the crash can also happen on clicking a tab in the editor (switching files), not just on hover.

@srikanth-sankaran
Copy link
Contributor

This looks nasty. Hopefully we can get to the bottom of it soon.

Deeper stack traces may help.

Also at some point we can consider a version of eclipse that would log some data. We need to identify what to log.

@nlisker
Copy link
Author

nlisker commented Nov 13, 2024

I think managed to narrow it down to a Web Tools feature that is installed in the background. When an Eclipse with that component opens the project it somehow corrupts it, then any Eclipse that will open it should see the same issue. I'll update the reproduction instructions. Even if it's a bug with a plugin, it bubbles to a full JDT stack trace as shown in the first comment (which is why it's so hard to find where the problem is).

@nlisker
Copy link
Author

nlisker commented Nov 13, 2024

Instructions updated with screenshots.

@nlisker
Copy link
Author

nlisker commented Nov 25, 2024

Any update on this?

@srikanth-sankaran
Copy link
Contributor

Not yet. I'll spend time on it tomorrow.

@srikanth-sankaran
Copy link
Contributor

Drowned in this #3310 🙂 but I can context switch tomorrow

@srikanth-sankaran srikanth-sankaran self-assigned this Nov 26, 2024
@srikanth-sankaran srikanth-sankaran added this to the 4.35 M1 milestone Nov 26, 2024
@srikanth-sankaran
Copy link
Contributor

There are some recent, completely unrelated changes to JDT that are blocking me from setting up a development/debugging environment - obviously it is not enough to reproduce, I need to be able to debug the reproduced set up. I need to work with a team mate to get unblocked but he is unavailable today. Will check with him tomorrow. This has bubbled to near the top of priority list, so hopefully will be resolved soon. Stay tuned. Thanks for your patience

@srikanth-sankaran
Copy link
Contributor

(The issue referred #3190 (comment))

@srikanth-sankaran
Copy link
Contributor

Screenshot 2024-11-28 152521

After installing WST and adding JBoss server, Eclipse wants to restart and upon restart I get this dialog.

Is this expected or materially affects things ?? Sorry a bit clueless here

@nlisker
Copy link
Author

nlisker commented Nov 28, 2024

Never seen this. I use JDK 22/23. The entries there say "requires JavaSE version 21" but you are using 20.

@srikanth-sankaran
Copy link
Contributor

Never seen this. I use JDK 22/23. The entries there say "requires JavaSE version 21" but you are using 20.

I have no idea where that JDK20 reference is coming from, but that doesn't seem to matter,

Despite that, fairly quickly I am able to reproduce the SOE, so the instructions are good.

Investigating ...

@srikanth-sankaran
Copy link
Contributor

srikanth-sankaran commented Nov 29, 2024

@stephan-herrmann, Per your expression of interest to own JPMS implementation, I request you to take over this - the smoking gun per my basic investigation points there (SplitPackageBinding et al)

@nlisker - could you please grant permissions to Stephan so he can access the private repo - thanks

Stephan, The instructions to reproduce are clear and you can fairly consistently and quickly reproduce the SOE.

What I am finding is that we have trouble handling static imports that import symbols internal to the CU doing the imports:

Two entries that recur in the problem scenario are:

import static com.sanctorum.game.shared.triggerable.spell.SpellIndex.Characteristic.*; // imports symbols internal to this CUD

<many elided imports>

public sealed interface SpellIndex {
// lots of details elided 
	enum Characteristic {
            // enumerators elided
	}
}

And

import static com.sanctorum.game.shared.entities.Minion.Characteristics.*; // imports symbols internal to this CUD

<Elided imports>

public interface Minion extends BoardGameEntity {

	enum Characteristics {
	 // elided enumerators
	}

In the latter case, resolving the import import static com.sanctorum.game.shared.entities.Minion.Characteristics.*; causes another fresh CompilationUnitDeclaration for Minion.java to be added to org.eclipse.jdt.internal.compiler.Compiler.unitsToProcess notwithstanding the fact that the import is originating from the self same CUD.

This somehow seems to be connected to the package com.sanctorum.game.shared.entities being a SplitPackageBinding
(from <unnamed>, com.sanctorum.game.shared)

I am hoping we get lucky and the same fix solves the other SOEs in #1490 and #1081

@srikanth-sankaran
Copy link
Contributor

This renders Eclipse unusable pretty quickly in the problem scenario :-(

@nlisker
Copy link
Author

nlisker commented Nov 29, 2024

Thanks, added Stephan.

I still have a few confusing points here.

  • If the errors come from these static imports, why could I not reproduce it only on the shared module? The error happens on a different module, and even more so only on that specific one I uploaded. The project has more modules that import the shared one, but they don't exhibit this error.
  • Why could we not reproduce this on a "plain" Eclipse and needed the WST+servers setup to cause this?
  • Why, after the error happens once, opening it in a "plain" Eclipse is suddenly reproducible? It looks like something got corrupted.
  • I couldn't manage to create a small reproducible. The more I removed the less frequent the error became even if the offending classes were there. Is it triggered when a certain amount of data is processed?

@nlisker
Copy link
Author

nlisker commented Nov 29, 2024

This renders Eclipse unusable pretty quickly in the problem scenario :-(

Yes, after many hours of figuring out how to reliably and most-minimally reproduce it, I stopped working on the offending module and never came back :) Hopefully I will be able to go back to that part of the code soon.

BTW, is there a workaround? Imports are not strictly necessary I believe, so if I can temporarily fully qualify the classes and get rid of the problematic imports I will be happy to do so until this is resolved.

@srikanth-sankaran
Copy link
Contributor

Let us wait for @stephan-herrmann to analyze this - I know very little about JPMS and ECJ's implementation having pretended for a long while Java 9 didn't happen at all and we hop skip jumped from Java 8 to 10 :) But if I were to venture a guess: Could it be that the Split package situation arises only when WST server is installed ? (I will admit I don't even know if this question makes sense.)

@stephan-herrmann
Copy link
Contributor

added Stephan

reproduced also here :)

@stephan-herrmann
Copy link
Contributor

In the latter case, resolving the import import static com.sanctorum.game.shared.entities.Minion.Characteristics.*; causes another fresh CompilationUnitDeclaration for Minion.java to be added to org.eclipse.jdt.internal.compiler.Compiler.unitsToProcess notwithstanding the fact that the import is originating from the self same CUD.

This somehow seems to be connected to the package com.sanctorum.game.shared.entities being a SplitPackageBinding (from , com.sanctorum.game.shared)

Yep, this is quite close to the root bug, I believe. I can see that we are finding the type in question both as a SourceTypeBinding associated with a proper SourceModuleBinding and as an SourceType which is reported to belong to the unnamed module. This latter association is wrong, because the enclosing project is modular.

This happens because the resolved classpath entry towards the project has only these extra attributes: [attributes:gradle_scope=main,gradle_used_by_scope=main,test], where it should in fact also have module=true.

The entry is resolved from the following container:

        <classpathentry kind="con" path="org.eclipse.buildship.core.gradleclasspathcontainer">
                <attributes>
                        <attribute name="org.eclipse.jst.component.dependency" value="/WEB-INF/lib"/>
                </attributes>
        </classpathentry>

Two questions:

  • does the gradleclasspathcontainer properly set module=true on resolved entries, normally?
  • if so, does any wst stuff interfere with this container resolution?
    Once the resolved container is persisted this would explain that the bug sticks even when the set of plug-ins that trigger the bug is no longer used.

Possible directions of action:

  • ensure that JDT recognizes the schizophrenia of the same type appearing with different module associations, while remaining capable of detecting truly duplicate types from different modules. Wish me luck!
  • chase down the source of the incomplete resolved container entry.

Stay tuned.

@nlisker
Copy link
Author

nlisker commented Nov 30, 2024

I think I can already answer "does the gradleclasspathcontainer properly set module=true on resolved entries, normally?".

The container itself is always non-modular, but the contained artifacts might or might not be. Here is a typical scenario:
image

This is because tests might want to run in a non-modular setup. It is actually a downstream configuration from Eclipse only allowing 1 module-info per project and not per source. I can probably re-find the discussions about the setup of this container on the Gradle/Buildship GitHub.

@nlisker
Copy link
Author

nlisker commented Nov 30, 2024

Also might be of interest:

                <attributes>
                        <attribute name="org.eclipse.jst.component.dependency" value="/WEB-INF/lib"/>
                </attributes>

JST (Java Server Tools) builds on top of WST (Web Server Tools). When the Java servers are added, it installs JST. This might answer why I found a connection to the WST components that are installed in the background.

@stephan-herrmann
Copy link
Contributor

stephan-herrmann commented Nov 30, 2024

Next find:

The CancelableNameEnvironment used by ASTParser implicitly adds ALL-UNNAMED to the required modules, based on !excludeTestCode and presence of a "test" source folder. This is where the unnamed module initially enters the picture. We have, however, a mixed setup from the outset: a shared modular project referenced from a non-modular project.

stephan-herrmann added a commit to stephan-herrmann/eclipse.jdt.core that referenced this issue Nov 30, 2024
stephan-herrmann added a commit to stephan-herrmann/eclipse.jdt.core that referenced this issue Nov 30, 2024
stephan-herrmann added a commit to stephan-herrmann/eclipse.jdt.core that referenced this issue Nov 30, 2024
+ resilience: avoid accepting a sourceType being completed already
+ reduce fix for 544306 to modular projects

Fixes eclipse-jdt#3273
stephan-herrmann added a commit to stephan-herrmann/eclipse.jdt.core that referenced this issue Nov 30, 2024
change of strategy:
+ better hiding of modules seen via the classpath

Fixes eclipse-jdt#3273
@stephan-herrmann
Copy link
Contributor

Yep, this is quite close to the root bug, I believe. I can see that we are finding the type in question both as a SourceTypeBinding associated with a proper SourceModuleBinding and as an SourceType which is reported to belong to the unnamed module. This latter association is wrong, because the enclosing project is modular.

The final (?) fix in #3373 actually goes the opposite direction: Given that the focus project is not modular, and given that the shared project is on the classpath, not modulepath, then no module (other than JRE) should be relevant.

@nlisker
Copy link
Author

nlisker commented Dec 1, 2024

Thank you Stephan. When the fix will be available in the nightly builds I will test it.

@stephan-herrmann
Copy link
Contributor

For posterity some attempts at answering:

I still have a few confusing points here.

  • If the errors come from these static imports, why could I not reproduce it only on the shared module? The error happens on a different module, and even more so only on that specific one I uploaded. The project has more modules that import the shared one, but they don't exhibit this error.

Static imports pointing into their own compilation unit is one ingredient.

Another ingredient is that different parts of the implementation had different views on the same class and package: To resolve the import, the class was searched in the unnamed module (because the client project is not modular, and the classpath entry pointing to the shared project didn't specify module=true). When the class was found we noticed that the containing package fragment root (here: source folder) contains module-info.java so we associate the class with that module. But when the recursive lookup tries to resolve its import it still doesn't find its own class via the unnamed module.

This is the schizophrenia I mentioned elsewhere: the implementation "thought" that those are two different packages and classes, one set in the unnamed module the other set in a named module, where in fact both sets are the same.

  • Why could we not reproduce this on a "plain" Eclipse and needed the WST+servers setup to cause this?

It is possible that WST+servers influence resolution of the gradle classpath container.

Which influence exactly this is, I did not investigate, since the error is clearly in JDT.

For completeness: other required ingredients for the bug to occur are: the focus project must have a source folder marked as "test". The shared project must have some non-modular dependency.

  • Why, after the error happens once, opening it in a "plain" Eclipse is suddenly reproducible? It looks like something got corrupted.

The resolved entries (what you seen when you drill into the container in the package explorer) are persisted. So until the container needs resolving again this persistent state will also determine behavior in a WST-less workbench.

  • I couldn't manage to create a small reproducible. The more I removed the less frequent the error became even if the offending classes were there. Is it triggered when a certain amount of data is processed?

The number of necessary factors (see above) for reproducing illustrates why extracting a small reproducer was near impossible without debugging.

@nlisker
Copy link
Author

nlisker commented Dec 1, 2024

Thanks for the detailed answers. One question I have remaining is about

The resolved entries (what you seen when you drill into the container in the package explorer) are persisted.

Where are they persisted? I'm wondering what I would need to clean/recompute in such a case.

@stephan-herrmann
Copy link
Contributor

Thanks for the detailed answers. One question I have remaining is about

The resolved entries (what you seen when you drill into the container in the package explorer) are persisted.

Where are they persisted? I'm wondering what I would need to clean/recompute in such a case.

I believe this is in file WORKSPACE/.metadata/.plugins/org.eclipse.core.resources/.projects/YOUR_PROJECT/org.eclipse.jdt.core/state.dat. I have never tried removing such a file, a clean build should be the safer option to achieve the same ... have you tried that?

@nlisker
Copy link
Author

nlisker commented Dec 2, 2024

I've updated to I20241201-1800 and at least for 5 minutes I don't get a crash, which is a record time, so looks fixed! Took me a bit to realize that the nightly builds quietly switched from the I-34 to the I-35 update repository last month.

a clean build should be the safer option to achieve the same ... have you tried that?

In the following scenario:
Open with plain Eclipse -> no crash
Open with servers Eclipse -> crash
Open with plain Eclipse -> crash
A clean-rebuild on the plain Eclipse did not help. Closing the project and re-opening did not help either. So did removing the project only from the workspace and re-importing. This was part of the confusion on where the data is stored. What did work was removing from the workspace + from the disk and re-cloning from the remote repo.

By the way, a reminder to check if the issues from #3273 (comment) were also solved in case you didn't.

@stephan-herrmann
Copy link
Contributor

I've updated to I20241201-1800 and at least for 5 minutes I don't get a crash, which is a record time, so looks fixed!

Thanks for confirming!

A clean-rebuild on the plain Eclipse did not help.

Sorry. Of course it would be nice if we could explain the persistence of the problem, but with the SOE fixed, I don't plan to invest more into this issue.

By the way, a reminder to check if the issues from #3273 (comment) were also solved in case you didn't.

Thanks for the reminder. Nothing I could easily check on my own, but I pinged the respective reporters for re-testing.

jarthana added a commit that referenced this issue Dec 6, 2024
* Remove nolonger support converterJclMin (#3332)

Versions older than 1.8 are not supported and thus shouldn't be needed
anymore.

* Fix and enable Java50Tests#testMissingRequiredBinaries

* Version bump(s) for 4.35 stream

* Test failures in I-Builds due to less diagnostics being emitted (#3357)

* Fixes #3356

* Textual problem indicator goes wild with lamda (#3358)

* jclMin23: ignore missing serializable problem

"The serializable class Long does not declare a static final
serialVersionUID field of type"

* Fix FieldLocator and MethodLocator to support local/anonymous classes (#3314)

- Fix FieldLocator.reportDeclaration() and
  MethodLocator.reportDeclaration() to find the anonymous or local
  type for the declaration rather than to return
- add new tests to JavaSearchBugsTests
- fixes #3308

* Codegen Primitives in record comonent patterns to be enabled with null check before calling accessor(#3361)

Before generating an invoke of accessor of a record component, do a null check if primitive conversions are involved.

* java.lang.StackOverflowError during "Requesting Java AST from selection" (#3373)

+ resilience: avoid accepting a sourceType being completed already
+ better hiding of modules seen via the classpath

Fixes #3273

* Add NoSuchFieldError to converterJclMin18 (#3368)

Add NoSuchFieldError to converterJclMin18
 + build the jar
 + avoid new warning

Enable couple of tests fixed by this.

---------
Co-authored-by: Stephan Herrmann <[email protected]>

* Incorrect control flow analysis causes statement subsequent to a switch statement to be flagged unreachable under some circumstances(#3377)

* Fixes #3376

* Remove unused api problem filter

* Update tycho build to 4.0.10

* Completion for unimported types doesn't work.

- use MissingTypesGuesser to select all missing types and offer their completion
- Fixes #1502

* [Enhanced Switch] Wrong error message: Cannot switch on a value of type Integer... at levels that don't support enhanced switch (#3380)

* Fixes #3379

---------

Co-authored-by: Александър Куртаков <[email protected]>
Co-authored-by: Eclipse JDT Bot <[email protected]>
Co-authored-by: Srikanth Sankaran <[email protected]>
Co-authored-by: Jörg Kubitz <[email protected]>
Co-authored-by: Jeff Johnston <[email protected]>
Co-authored-by: Manoj  N Palat <[email protected]>
Co-authored-by: Stephan Herrmann <[email protected]>
Co-authored-by: Snjeza <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working jpms
Projects
None yet
Development

No branches or pull requests

4 participants