Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shrink Loc from 12 to 4 bytes #20777

Merged
merged 2 commits into from
Jan 31, 2025
Merged

Conversation

dkorpel
Copy link
Contributor

@dkorpel dkorpel commented Jan 25, 2025

Instead of storing separate filename, line, column fields in Loc, make it a 4-byte index into a data structure.

Firstly, this adds easy fileOffset access to Loc without needing version (DMDLIB) anymore (which increases Loc.sizeof from 12 to 16). Secondly, this reduces memory consumption and page faults:

When compiling Phobos unittests:

Command being timed: "dmdmaster -i=std -c -unittest -version=StdUnittest -preview=dip1000 std/package.d"
User time (seconds): 27.00
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:33.57
Maximum resident set size (kbytes): 16069584
Minor (reclaiming a frame) page faults: 3521363

Command being timed: "dmdbranch -i=std -c -unittest -version=StdUnittest -preview=dip1000 std/package.d"
User time (seconds): 27.23
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:33.84
Maximum resident set size (kbytes): 15001208
Minor (reclaiming a frame) page faults: 3254426

So RAM usage basically goes from 16GB to 15Gb.

As you can see the total time is still slightly larger, because retrieving a loc's line / column data now requires a binary search instead of just reading a variable. However, this can still be optimized further.

I'll document this more later, for now I'm focusing on testing if it works.

Edit: documentation on how it works can be found in comment above struct BaseLoc

@dkorpel dkorpel added the Severity:Refactoring No semantic changes to code label Jan 25, 2025
@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @dkorpel! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#20777"

@dkorpel dkorpel force-pushed the loc-smaller-1 branch 2 times, most recently from c7bbd75 to c06d923 Compare January 25, 2025 17:15
@dkorpel
Copy link
Contributor Author

dkorpel commented Jan 25, 2025

@ibuclaw @kinke How much do LDC and GDC rely on constructing Loc from file, line and column arguments? If it's rare, I can keep an inefficient version of the constructor, or would exposing the resolved SourceLoc type to C++ help?

@kinke
Copy link
Contributor

kinke commented Jan 25, 2025

After a quick glance, it looks like LDC only ever creates default-constructed Loc instances from C++. - The 1 GB reduction isn't bad. 👍

@ibuclaw
Copy link
Member

ibuclaw commented Jan 25, 2025

I see a few instances my end. At a quick glance, only one use of which I can't rewrite in a minute.

Loc loc = (m->md != NULL) ? m->md->loc
  : Loc (m->srcfile.toChars (), 1, 0);

-> Getting a location for debug info DW_TAG_module - m->md can be NULL when there's no module declaration in the source file. Maybe could use a magic unknown location, but will take some time to stop and consider the consequences. :-)

@dkorpel
Copy link
Contributor Author

dkorpel commented Jan 25, 2025

Would it help if I exposed static Loc singleFilename(const char* filename) to C++? That function creates a Loc for a filename with no line/column information, and is currently used in dmd.mars.createModule to intialize module locations.

@dkorpel dkorpel force-pushed the loc-smaller-1 branch 5 times, most recently from e72dcd1 to 954bfe3 Compare January 25, 2025 23:03
@dkorpel
Copy link
Contributor Author

dkorpel commented Jan 25, 2025

Does anyone know how to debug this segfault on Ubuntu 22.04 x86, DMD (latest)?

Program received signal SIGSEGV, Segmentation fault.
0x568f811d in dmd.dstruct.StructDeclaration.this(dmd.location.Loc, dmd.identifier.Identifier, bool) ()
#0  0x568f811d in dmd.dstruct.StructDeclaration.this(dmd.location.Loc, dmd.identifier.Identifier, bool) ()
#1  0x56acebf1 in dmd.parse.Parser!(dmd.astcodegen.ASTCodegen, dmd.lexer.Lexer).Parser.parseAggregate() ()
#2  0x56ad506c in dmd.parse.Parser!(dmd.astcodegen.ASTCodegen, dmd.lexer.Lexer).Parser.parseDeclarations(bool, dmd.parse.PrefixAttributes!(dmd.astcodegen.ASTCodegen).PrefixAttributes*, const(char)*) ()
#3  0x56abaaaa in dmd.parse.Parser!(dmd.astcodegen.ASTCodegen, dmd.lexer.Lexer).Parser.parseDeclDefs(int, dmd.dsymbol.Dsymbol*, dmd.parse.PrefixAttributes!(dmd.astcodegen.ASTCodegen).PrefixAttributes*) ()
#4  0x56ab944e in dmd.parse.Parser!(dmd.astcodegen.ASTCodegen, dmd.lexer.Lexer).Parser.parseModuleContent() ()
#5  0x568d6d84 in dmd.dmodule.Module.parseModule!(dmd.astcodegen.ASTCodegen).parseModule() ()
#6  0x568d3645 in dmd.dmodule.Module.load(ref const(dmd.location.Loc), dmd.identifier.Identifier[], dmd.identifier.Identifier, dmd.globals.ImportPathInfo) ()
#7  0x5692711f in dmd.dsymbolsem.load(dmd.dimport.Import, dmd.dscope.Scope*) ()
#8  0x569265cd in ImportAllVisitor::visit(Import*) ()
#9  0x568af96e in Import::accept(Visitor*) ()
#10 0x56926aaa in ImportAllVisitor::visit(Module*) ()
#11 0x568d4bfd in Module::accept(Visitor*) ()
#12 0x5683ae16 in dmd.main.tryMain(uint, const(char)**, ref dmd.globals.Param) ()
#13 0x568383e1 in D main ()
 ... runnable/test42.d               -fPIC (-inline -release -g -O)
==============================
Test 'runnable/test42.d' failed. The logged output:
/home/runner/work/dmd/dmd/generated/linux/release/32/dmd -conf= -m32 -Irunnable  -fPIC  -od/home/runner/work/dmd/dmd/compiler/test/test_results/runnable/d -of/home/runner/work/dmd/dmd/compiler/test/test_results/runnable/d/test42_0  runnable/test42.d 
Segmentation fault (core dumped)
==============================
Test 'runnable/test42.d' failed: Expected rc == 0, but exited with rc == 139

@thewilsonator
Copy link
Contributor

I would first try dustmiting runnable/test42.d

@kinke
Copy link
Contributor

kinke commented Jan 26, 2025

You could hack

dmd/ci/run.sh

Line 69 in af002fc

generated/build -j$N MODEL=$MODEL HOST_DMD=$DMD DFLAGS="$CI_DFLAGS" ENABLE_RELEASE=${ENABLE_RELEASE:-1} dmd
setting ENABLE_RELEASE=0 and ENABLE_DEBUG=1, which hopefully still fails and gives line info.

@dkorpel
Copy link
Contributor Author

dkorpel commented Jan 26, 2025

I would first try dustmiting runnable/test42.d

Can you run dustmite on CI? I don't have a 32-bit pc, so I can't reproduce it locally. Maybe I should try setting up a VM.

You could hack ... setting ENABLE_RELEASE=0 and ENABLE_DEBUG=1

Thanks, I'll give it a try

Edit: the stack trace is now more informative

Program received signal SIGSEGV, Segmentation fault.
0x568cec8f in StructDeclaration::zeroInit(bool) (this=0xf7bfff00, v=false) at src/dmd/dstruct.d-mixin-117:121
121	src/dmd/dstruct.d-mixin-117: No such file or directory.
#0  0x568cec8f in StructDeclaration::zeroInit(bool) (this=0xf7bfff00, v=false) at src/dmd/dstruct.d-mixin-117:121
#1  0x568cf07d in _D3dmd7dstruct17StructDeclaration6__ctorMFSQBp8location3LocCQCg10identifier10IdentifierbZCQDkQDjQDe (this=0xf7bfff00, inObject=true, id=0xf6ea2a00, loc=...) at src/dmd/dstruct.d:122
...

@thewilsonator
Copy link
Contributor

thewilsonator commented Jan 26, 2025

Hmm, dust mite is probably overkill, you might be able to manually bisect the main function. Or simply prepend printf("%d\n, __LINE__); fflush(stdout); to the start of each of testXYZ(); case in main.

@dkorpel dkorpel force-pushed the loc-smaller-1 branch 5 times, most recently from 065be3d to 2587c22 Compare January 26, 2025 13:36
Comment on lines 36 to 40
// https://github.com/dlang/dmd/pull/20777#issuecomment-2614128849
static if (size_t.sizeof == 4) version(linux)
{
uint dummy;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I found a fix 🤦

Clearly the optimization is just too good for i386 Linux to handle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want DigitalMars instead of Linux.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure is specific to Ubuntu x86, Windows x86 works fine. It's also specific to DMD compiled with DMD latest, not DMD bootstrap. It happens when calling StructDeclaration.setZeroInit(bool), which is a simple bit field setter. this is not null, and setting this.bitFields = 0 doesn't trigger it, so what kind of memory access even triggers the segfault? I truly can't explain this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the stack frame is really shallow (13 frames), I can't imagine it hitting the guard page

Copy link
Contributor

@kinke kinke Jan 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a bug in the old DMD bootstrap compiler, then please use both version(linux) (or Posix, I can't imagine this to be Linux specific) and version(DigitalMars), so that the ugly workaround doesn't apply when using LDC or GDC (incl. x86 builds of those compilers themselves, as they hardly use DMD as a host compiler).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also specific to DMD compiled with DMD latest, not DMD bootstrap.

Oh sorry, I misread that - so even worse, a regression in recent versions?!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#20786 indicates that the regression was introduced in DMD v2.105.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the investigation!

version(DigitalMars)

How do I put that in the C++ header?

Copy link
Member

@ibuclaw ibuclaw Jan 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By chance, I do have a debian 12 x86 VM to hand, and couldn't reproduce the segfault with the host compiler being either 2.104, 2.105, or 2.109.

Not sure it's really worth getting an ubuntu 22.04 x86 VM set up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkorpel dkorpel marked this pull request as ready for review January 26, 2025 16:04
@dkorpel dkorpel force-pushed the loc-smaller-1 branch 2 times, most recently from 521cd1b to 78c8d3b Compare January 30, 2025 19:00
Copy link
Member

@ibuclaw ibuclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few last remarks for gdc/ldc compat. Otherwise looks fine.

compiler/src/dmd/dstruct.d Outdated Show resolved Hide resolved
compiler/src/dmd/location.d Outdated Show resolved Hide resolved
compiler/src/dmd/location.d Outdated Show resolved Hide resolved
@thewilsonator thewilsonator merged commit be8668e into dlang:master Jan 31, 2025
41 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Severity:Refactoring No semantic changes to code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants