StorageEntity cache not be cleared #353

Open
54446776 opened this issue Dec 24, 2024 · 2 comments
54446776 commented Dec 24, 2024

Environment Details

  • EclipseStore Version: 2.0.0
  • JDK version: 21.0.4
  • OS: Mac M1
  • Used frameworks: no

Describe the bug

Create the database with the configuration below:

static EmbeddedStorageFoundation<?> foundationForLarge(String graphDbPath)
{
    var builder = EmbeddedStorageConfiguration.Builder();
    builder.setStorageDirectory(graphDbPath);
    builder.setEntityCacheThreshold(100_000);
    builder.setEntityCacheTimeout(Duration.of(1, ChronoUnit.MINUTES));
    builder.setHousekeepingInterval(Duration.of(100, ChronoUnit.MILLIS));
    builder.setHousekeepingTimeBudget(Duration.of(50, ChronoUnit.MILLIS));
    var foundation = builder.createEmbeddedStorageFoundation();
    registerBasicTypeHandlers(foundation);
    return foundation;
}

Issue a full garbage collection every minute:

@Cleanup
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleWithFixedDelay(() -> {
    manager.issueFullGarbageCollection();
    log.info("storage gc");
}, 1, 1, MINUTES);

The total StorageEntity count stays the same and never decreases.

I debugged the code and found one possible reason:

clearEntityCache always returns false:

public final boolean clearEntityCache(
    final long          cacheSize,
    final long          evalTime ,
    final StorageEntity e)
{
    final long ageInMs = evalTime - e.lastTouched();

    return ageInMs >= this.timeoutMs
        || this.threshold - cacheSize < e.cachedDataLength() * (ageInMs >> C16) << (e.hasReferences() ? 0 : 1);
}

This is because checkForCacheClear refreshes the touch time in every housekeeping cycle:

void checkForCacheClear(final StorageEntity.Default entry, final long evalTime)
{
	if(this.entityCacheEvaluator.clearEntityCache(this.usedCacheSize, evalTime, entry))
	{
		// use ensure method for that for purpose of uniformity / simplicity
		this.ensureNoCachedData(entry);
	}
	else
	{
		// if the loaded entity data can stay in memory, touch the entity to mark now as its last use.
		entry.touch();
	}
}

So housekeeping refreshes the touch time every 100 ms, and ageInMs therefore always stays below 60,000 ms (it can even be negative). As a result, the StorageEntity is never cleared, neither by the housekeeping thread nor by a manually issued garbage collection.

The more objects are inserted or modified, the more entities accumulate, which eventually leads to an OOM.
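
The reasoning above can be checked with a toy model. The sketch below copies the evaluator expression from the clearEntityCache snippet and assumes that C16 is 16, since its value is not shown in this issue. The Entity class and the evaluate and simulate methods are hypothetical stand-ins, not EclipseStore types. It replays housekeeping cycles that touch the entity whenever it is not cleared.

```java
// Toy model of the cache evaluation described above; not EclipseStore code.
final class CacheEvalSketch
{
    // Assumption: C16 is a shift of 16 bits; the issue does not show its value.
    static final int C16 = 16;

    // Hypothetical stand-in for the StorageEntity state used by the evaluator.
    static final class Entity
    {
        long    lastTouched;
        long    cachedDataLength;
        boolean hasReferences;
    }

    // Same expression as clearEntityCache above, with the evaluator fields passed in.
    static boolean evaluate(
        final long timeoutMs, final long threshold,
        final long cacheSize, final long evalTime, final Entity e)
    {
        final long ageInMs = evalTime - e.lastTouched;
        return ageInMs >= timeoutMs
            || threshold - cacheSize < e.cachedDataLength * (ageInMs >> C16) << (e.hasReferences ? 0 : 1);
    }

    // Runs one evaluation every intervalMs; returns the time of the first clear, or -1.
    static long simulate(
        final long timeoutMs, final long threshold, final long cacheSize,
        final long intervalMs, final long durationMs, final boolean touchOnKeep)
    {
        final Entity e = new Entity();
        e.lastTouched      = 0;
        e.cachedDataLength = 1_000;
        e.hasReferences    = true;

        for(long now = intervalMs; now <= durationMs; now += intervalMs)
        {
            if(evaluate(timeoutMs, threshold, cacheSize, now, e))
            {
                return now;
            }
            if(touchOnKeep)
            {
                e.lastTouched = now; // models entry.touch()
            }
        }
        return -1;
    }

    public static void main(final String[] args)
    {
        // Cache below threshold, touched on every cycle: not cleared within 10 minutes.
        System.out.println(simulate(60_000, 100_000, 50_000, 100, 600_000, true));   // -1
        // Same setup without the touch: cleared once the timeout is reached.
        System.out.println(simulate(60_000, 100_000, 50_000, 100, 600_000, false));  // 60000
        // Cache above threshold: cleared in the first cycle even with touching.
        System.out.println(simulate(60_000, 100_000, 150_000, 100, 600_000, true));  // 100
    }
}
```

Under these assumptions, an entity younger than the timeout is cleared only when the used cache size exceeds the threshold, because ageInMs >> 16 is 0 for any age below 65,536 ms.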

@fh-ms

54446776 changed the title from "StorageEntity will no be cleared" to "StorageEntity cache not be cleared" on Dec 24, 2024

54446776 commented Jan 17, 2025

I now doubt my judgment above: the checkForCacheClear method does not have a bug.

The constantly refreshing entry.lastTouch timestamp is not the direct cause of the OOM.

Let me share what I have found in the meantime; GC is really hard:

  1. Each persistent object must have a StorageEntity instance.

  2. The StorageEntity instance holds an off-heap pointer pointing to the binary array of the persistent object. It acts as a cache so that when deserialization is needed, the memory cache can be directly used without disk IO.

  3. Therefore, the StorageEntity instance, acting as a cache, will be cleared due to idle timeout or if the total cache size exceeds the threshold. Importantly, it is the off-heap memory that gets cleared, not the StorageEntity instance itself. As long as the persistent object is still in use (cannot be garbage collected), the StorageEntity instance will always exist.

  4. Refreshing the entry.lastTouch timestamp delays the clearing, so more off-heap memory accumulates, which may cause OOMs when memory pressure is very high.

  5. The StorageEntity instance itself is a simple object. Even if the number keeps increasing, it only slowly pushes up the heap memory usage.

  6. If the checkForCacheClear method is modified as follows:

void checkForCacheClear(final StorageEntity.Default entry, final long evalTime)
{
    if(this.entityCacheEvaluator.clearEntityCache(this.usedCacheSize, evalTime, entry))
    {
        // use ensure method for that for purpose of uniformity / simplicity
        this.ensureNoCachedData(entry);
    }
    else
    {
        controlledTouch(entry, evalTime);
    }
}

// Proposed: refresh the touch time only while the entity was touched recently.
private static final long ACTIVE_INTERVAL = 1000; // 1000 ms

private void controlledTouch(final StorageEntity.Default entry, final long evalTime)
{
    if(evalTime - entry.lastTouched() < ACTIVE_INTERVAL)
    {
        entry.touch();
    }
}

Off-heap memory would then be released sooner, which makes memory usage more predictable and the program state more deterministic.
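
One way to explore how the proposed controlledTouch behaves is a toy model. The sketch below assumes that only the timeout clause matters (the cache stays below the threshold) and that nothing but housekeeping touches the entity after t = 0. The names firstClear and visitIntervalMs are hypothetical, and how often EclipseStore actually re-evaluates a single entity is not stated in this issue.

```java
// Toy model of the proposed controlledTouch; not EclipseStore code.
final class ControlledTouchSketch
{
    static final long ACTIVE_INTERVAL = 1_000; // ms, as in the proposal above

    // Replays evaluations of one entity every visitIntervalMs (hypothetical parameter)
    // and returns the time at which the timeout clears it, or -1 if it never does.
    static long firstClear(final long visitIntervalMs, final long timeoutMs, final long durationMs)
    {
        long lastTouched = 0; // the entity is touched once at t = 0

        for(long evalTime = visitIntervalMs; evalTime <= durationMs; evalTime += visitIntervalMs)
        {
            final long ageInMs = evalTime - lastTouched;
            if(ageInMs >= timeoutMs)
            {
                return evalTime; // models ensureNoCachedData(entry)
            }
            if(ageInMs < ACTIVE_INTERVAL)
            {
                lastTouched = evalTime; // models entry.touch() inside controlledTouch
            }
        }
        return -1;
    }

    public static void main(final String[] args)
    {
        // Evaluated every 100 ms: each touch keeps the next evaluation inside the active window.
        System.out.println(firstClear(100, 60_000, 600_000));   // -1
        // Evaluated every 2 s: the touch is skipped and the timeout is reached.
        System.out.println(firstClear(2_000, 60_000, 600_000)); // 60000
    }
}
```

In this model the idle entity is cleared at the timeout only when it is evaluated less often than ACTIVE_INTERVAL. Whether that matches real housekeeping depends on how many entities one time budget covers, which this sketch does not try to reproduce.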

Could the StorageEntity instances themselves be written to disk, for example via mmap? Then even millions or billions of persisted objects would not cause an OOM.
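
To make the mmap idea concrete, here is a minimal sketch using the JDK's FileChannel.map. It is not how EclipseStore stores entities; the class name, the three-long record layout (objectId, filePosition, length) and the roundTrip helper are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical fixed-size entity table in a memory-mapped file; not EclipseStore code.
final class MappedEntityTableSketch
{
    // Assumed record layout: objectId, filePosition, length (three longs).
    static final int RECORD_BYTES = 3 * Long.BYTES;

    private final MappedByteBuffer buffer;

    MappedEntityTableSketch(final Path file, final int capacity) throws IOException
    {
        this.buffer = map(file, capacity);
    }

    // A single mapping is limited to Integer.MAX_VALUE bytes; larger tables need several.
    private static MappedByteBuffer map(final Path file, final int capacity) throws IOException
    {
        try(FileChannel channel = FileChannel.open(file,
            StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE))
        {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, (long)capacity * RECORD_BYTES);
        }
    }

    void put(final int slot, final long objectId, final long filePosition, final long length)
    {
        final int offset = slot * RECORD_BYTES;
        this.buffer.putLong(offset, objectId);
        this.buffer.putLong(offset + Long.BYTES, filePosition);
        this.buffer.putLong(offset + 2 * Long.BYTES, length);
    }

    long[] get(final int slot)
    {
        final int offset = slot * RECORD_BYTES;
        return new long[]{
            this.buffer.getLong(offset),
            this.buffer.getLong(offset + Long.BYTES),
            this.buffer.getLong(offset + 2 * Long.BYTES)
        };
    }

    // Writes one record to a temporary file and reads it back.
    static long[] roundTrip(final long objectId, final long filePosition, final long length)
    {
        try
        {
            final Path file = Files.createTempFile("entity-table", ".bin");
            file.toFile().deleteOnExit();
            final MappedEntityTableSketch table = new MappedEntityTableSketch(file, 1_000);
            table.put(7, objectId, filePosition, length);
            return table.get(7);
        }
        catch(final IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args)
    {
        System.out.println(Arrays.toString(roundTrip(42L, 4_096L, 128L))); // [42, 4096, 128]
    }
}
```

With such a layout the operating system pages records in and out on demand, so the table's footprint is not bound by the Java heap. The trade-off is that each lookup may hit the disk, and the sketch leaves out everything a real entity index needs, such as lookup by object id, growth and concurrent access.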


sblommers commented Jan 17, 2025

Thank you @54446776 for your investigation and your suggestion. I know how much time and work it takes to test this in an actual production use case and come up with a logical explanation of the inner workings and a possible fix/workaround/improvement. Very much appreciated. I need to make some time to process your comment and will do another test if I can get some spare time (midnight perhaps 😃)
