Avoid UUID.randomUUID() in startup code #5450

Open
wants to merge 1 commit into
base: master
from

Contributor

@geoand geoand commented Jan 16, 2025

Motivation:

This is done because bootstrapping the plumbing
needed by the JDK to produce a UUID value
is expensive, so it doesn't make sense to
pay this cost when the property isn't actually
needed

We are making an effort in Quarkus to improve startup time even further by eliminating various bottlenecks across the board.
The first call to UUID.randomUUID() is definitely heavy (as shown in the following flamegraph), and if we can avoid it in startup code (as we have in the development branch of Quarkus), it would be nice.

[Image: flamegraph "uuid"]

P.S. Ideally we would like to have this in Vert.x 4 as well.

@@ -45,7 +45,7 @@ static File setupCacheDir(String fileCacheDir) {

// the cacheDir will be suffixed a unique id to avoid eavesdropping from other processes/users
// also this ensures that if process A deletes cacheDir, it won't affect process B
- String cacheDirName = fileCacheDir + "-" + UUID.randomUUID();
+ String cacheDirName = fileCacheDir + "-" + System.nanoTime();
Contributor

nanoTime is not absolute: it's relative to the process. That means another application can get the same value when it starts, without needing to start at the same moment. Is that what you expect?

Contributor Author

> nanoTime is not absolute - it's relative to the process

Correct. But if we do think it can be problematic, I'm happy to use Random.getRandom() or System.currentTimeMillis()

Contributor

@franz1981 franz1981 Jan 16, 2025

If we use Math.random() it gets better, but it is still not guaranteed to be unique, because it still uses System::nanoTime, and Random per se doesn't guarantee uniqueness across processes (try printing new Random(42).nextInt() by running it twice in 2 different processes...)
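
A minimal sketch of the seeding point above, using only java.util.Random. The class name SeededRandomDemo is hypothetical and exists only for illustration. A Random built from a fixed seed is fully deterministic, so two separate JVM processes running this print the same number.

```java
import java.util.Random;

final class SeededRandomDemo {
    // Returns the first int produced by a Random constructed with the given seed.
    static int firstInt(long seed) {
        return new Random(seed).nextInt();
    }

    public static void main(String[] args) {
        // Same seed, same sequence: both lines print the same value,
        // and a second process running this program prints it again.
        System.out.println(firstInt(42));
        System.out.println(firstInt(42));
    }
}
```

A random value therefore only lowers the chance of a collision; any uniqueness guarantee has to come from somewhere else, such as the file system refusing to create a directory that already exists.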

Contributor Author

I am pretty sure we are not looking for such a strong guarantee here, but I'll let the maintainers be the judge of that

Contributor

@tsegismont tsegismont Jan 16, 2025

How about an optimistic attempt? Something like (simplifying):

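for (;;) {
  try {
    String cacheDirName = fileCacheDir + "-" + System.nanoTime();
    // createDirectory (not createDirectories) throws if the directory already exists
    Files.createDirectory(Paths.get(cacheDirName));
    break;
  } catch (FileAlreadyExistsException ignore) {
    // name already taken, retry with a new value
  }
}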

Contributor

In fact, you could use a random instead of System.nanoTime, I think it would be faster
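
The optimistic attempt with a random suffix could look roughly like the sketch below. It is not the code in the PR: the class OptimisticCacheDir and the method createUnique are hypothetical names, and ThreadLocalRandom is only one possible source of random values. It relies on Files.createDirectory failing with FileAlreadyExistsException when the name is already taken, so the file system provides the uniqueness check.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ThreadLocalRandom;

final class OptimisticCacheDir {
    // Hypothetical helper: tries random suffixes until a new directory is created.
    // The parent directory of fileCacheDir must already exist.
    static Path createUnique(String fileCacheDir) throws IOException {
        for (;;) {
            long suffix = ThreadLocalRandom.current().nextLong();
            Path candidate = Paths.get(fileCacheDir + "-" + Long.toUnsignedString(suffix));
            try {
                // Fails instead of silently succeeding when the path already exists.
                return Files.createDirectory(candidate);
            } catch (FileAlreadyExistsException ignore) {
                // Another process or an earlier run owns this name; try another suffix.
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("cache-demo");
        Path dir = createUnique(base.resolve("vertx-cache").toString());
        System.out.println("created " + dir);
    }
}
```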

Contributor Author

@franz1981 is Random.nextLong() faster than System.nanoTime()?

Contributor Author

Updated

Contributor

Nope. Or better: nanoTime usually (if not on the cloud with unreliable time sources) uses an instruction called rdtsc, which is as cheap as reading a memory area

@@ -36,7 +36,7 @@ public DefaultDeploymentManager(VertxImpl vertx) {
}

private String generateDeploymentID() {
- return UUID.randomUUID().toString();
+ return Long.valueOf(System.nanoTime()).toString();
Contributor

This needs to be globally unique when running in clustered mode with HA enabled

So perhaps something like:

    if (vertx.isClustered() && vertx.haManager()!=null) {
      return UUID.randomUUID().toString();
    }
    // Use a counter?

Contributor Author

👌

Contributor Author

Updated

Contributor

It's pretty common to deploy verticles concurrently. Even when Vert.x is not clustered, the returned value should be unique.

Contributor Author

?

Contributor Author

I have updated it to use Random, is that what you meant?

Contributor

I meant incrementing an AtomicLong counter instead of using a random value (uniqueness is guaranteed and it shouldn't change the perf results you got)
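
A rough sketch of the counter idea combined with the clustered/HA check from earlier in this thread. The class DeploymentIds and the boolean clusteredWithHa are hypothetical stand-ins for the real DefaultDeploymentManager and for vertx.isClustered() && vertx.haManager() != null; the code actually pushed to the PR may differ.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

final class DeploymentIds {
    // Shared counter: getAndIncrement is atomic, so concurrent deployments get distinct IDs.
    private final AtomicLong counter = new AtomicLong();
    // Hypothetical stand-in for: vertx.isClustered() && vertx.haManager() != null
    private final boolean clusteredWithHa;

    DeploymentIds(boolean clusteredWithHa) {
        this.clusteredWithHa = clusteredWithHa;
    }

    String generateDeploymentID() {
        if (clusteredWithHa) {
            // IDs must be globally unique across nodes, so pay the UUID cost here.
            return UUID.randomUUID().toString();
        }
        // Unique within this instance only, which is enough when not clustered.
        return Long.toString(counter.getAndIncrement());
    }

    public static void main(String[] args) {
        DeploymentIds local = new DeploymentIds(false);
        System.out.println(local.generateDeploymentID());
        System.out.println(local.generateDeploymentID());
    }
}
```

In the non-clustered path UUID.randomUUID() is never called, so the startup cost discussed in this PR is avoided there.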

Contributor Author

Got it, fixed

Member

vietj commented Jan 16, 2025

What seems to take time is the initialization of SecureRandom.getDefaultPRNG due to loading providers. I think we could generate a faster UUID by using a given provider
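
One possible reading of this suggestion, sketched under assumptions: build the version 4 UUID by hand from a random source chosen by the caller, instead of going through UUID.randomUUID() and the default SecureRandom. The class FastUuid and the method randomUuid are hypothetical and not part of Vert.x or the JDK. Whether a specific SecureRandom provider actually avoids the startup cost is not established in this thread; the sketch only shows how the version and variant bits are set.

```java
import java.util.Random;
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

final class FastUuid {
    // Hypothetical helper: builds a version 4 UUID from any java.util.Random,
    // including a SecureRandom obtained from an explicitly chosen provider.
    static UUID randomUuid(Random source) {
        long msb = source.nextLong();
        long lsb = source.nextLong();
        // Set the 4-bit version field to 4 (randomly generated UUID).
        msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L;
        // Set the two variant bits to binary 10 (the RFC 4122 layout).
        lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        // ThreadLocalRandom is not cryptographically strong; it only keeps the sketch cheap.
        UUID id = randomUuid(ThreadLocalRandom.current());
        System.out.println(id + " version=" + id.version() + " variant=" + id.variant());
    }
}
```

If the unique suffix is also meant to be hard for other users to guess, as the source comment about eavesdropping suggests, a non-secure source like ThreadLocalRandom would weaken that property.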

Contributor Author

geoand commented Jan 16, 2025

> What seems to take time is the initialization of SecureRandom.getDefaultPRNG due to loading providers. I think we could generate a faster UUID by using a given provider

But those are not public APIs, no?

Member

vietj commented Jan 16, 2025

I think we should have a way to specify the exact cache dir (e.g. FileSystemOptions#exactFileCacheDir); when none is provided, it uses a UUID. Quarkus would specify it in VertxOptions. This would easily be back-ported
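
A minimal sketch of the selection logic this proposal implies. Everything here is hypothetical: CacheDirResolver is not a Vert.x class, and exactFileCacheDir is only the name floated in the comment above, not a confirmed option. The point is that UUID.randomUUID() runs only when no exact directory is supplied.

```java
import java.util.UUID;

final class CacheDirResolver {
    // Hypothetical: exactFileCacheDir mirrors the option name proposed above.
    static String resolve(String fileCacheDir, String exactFileCacheDir) {
        if (exactFileCacheDir != null) {
            // Caller takes responsibility for uniqueness; no UUID cost is paid.
            return exactFileCacheDir;
        }
        // Default behavior: suffix the base directory with a unique id.
        return fileCacheDir + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(resolve("/tmp/vertx-cache", "/tmp/my-app-cache"));
        System.out.println(resolve("/tmp/vertx-cache", null));
    }
}
```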

Contributor Author

geoand commented Jan 16, 2025

Sure, that would make sense for us too

@geoand geoand force-pushed the remove-uuid branch 2 times, most recently from b28445c to 7c9fc8f, January 17, 2025 07:19
Contributor Author

geoand commented Jan 17, 2025

I have updated the PR per suggestions

Contributor Author

geoand commented Jan 21, 2025

Is there anything else you would like me to do for this one?

@tsegismont
Contributor

> I think we should have a way to specify the exact cache dir (e.g. FileSystemOptions#exactFileCacheDir); when none is provided, it uses a UUID. Quarkus would specify it in VertxOptions. This would easily be back-ported

For usability, it seems to me adding a boolean to the options would be enough (it's what's computed in the end to determine if a UUID should be added to the path).

But it's a matter of taste so I'm fine with keeping an extra dir option if you choose so @vietj

4 participants