Refine types based on debug metadata #191

Draft: wants to merge 33 commits into base: master
Changes from 3 commits

Commits (33):
9798e97 Initial work on refining types (frabert, Oct 22, 2021)
49e2601 Visit global variables, improve return types (frabert, Oct 25, 2021)
b32d10a Implement typedef printing (frabert, Oct 25, 2021)
55d3a13 Improve type refinement for fields and arguments (frabert, Oct 26, 2021)
2b1e41d Fix struct members (frabert, Oct 26, 2021)
7db935e Add explanation for checking argument count (frabert, Oct 28, 2021)
1067041 Add unit test for `ASTBuilder::CreateTypedefDecl` (frabert, Oct 28, 2021)
0486438 Fix varargs debug type analysis (frabert, Oct 29, 2021)
171e90e Use more debug info for prototypes (frabert, Oct 29, 2021)
09f8a2e Fix function argument type refinement (frabert, Oct 31, 2021)
bdd21e4 Default to signed integers (frabert, Oct 31, 2021)
7c474eb Fix tests (frabert, Oct 31, 2021)
e0e6efe Desugar types for Z3 conversion (frabert, Nov 1, 2021)
175c073 Initial work on refining types (frabert, Oct 22, 2021)
b55a8d0 Visit global variables, improve return types (frabert, Oct 25, 2021)
287e137 Implement typedef printing (frabert, Oct 25, 2021)
2292193 Improve type refinement for fields and arguments (frabert, Oct 26, 2021)
c3da195 Fix struct members (frabert, Oct 26, 2021)
36cb7c8 Add explanation for checking argument count (frabert, Oct 28, 2021)
33294d3 Add unit test for `ASTBuilder::CreateTypedefDecl` (frabert, Oct 28, 2021)
e878d42 Fix varargs debug type analysis (frabert, Oct 29, 2021)
6b4d302 Use more debug info for prototypes (frabert, Oct 29, 2021)
b231b24 Fix function argument type refinement (frabert, Oct 31, 2021)
e02bd8a Default to signed integers (frabert, Oct 31, 2021)
3676aa5 Fix tests (frabert, Oct 31, 2021)
a9b5371 Desugar types for Z3 conversion (frabert, Nov 1, 2021)
0386c2b Merge branch 'use-debug-types' of github.com:lifting-bits/rellic into… (frabert, Nov 1, 2021)
af69ad5 Merge branch 'master' into use-debug-types (frabert, Nov 4, 2021)
8f39d9d Add utility functions (frabert, Nov 4, 2021)
35599ba Fix bugs (frabert, Nov 4, 2021)
2905038 Merge branch 'master' into use-debug-types (frabert, Nov 8, 2021)
dbd0d82 Add void to ptr casts (frabert, Nov 8, 2021)
38991a6 Use plain `char` when asking for `signed char` (frabert, Nov 8, 2021)
2 changes: 2 additions & 0 deletions include/rellic/AST/IRToASTVisitor.h
@@ -31,6 +31,7 @@ using IRToValDeclMap = std::unordered_map<llvm::Value *, clang::ValueDecl *>;
using IRToStmtMap = std::unordered_map<llvm::Value *, clang::Stmt *>;
using DIToTypedefMap =
std::unordered_map<llvm::DIDerivedType *, clang::TypedefNameDecl *>;
using ArgToTempMap = std::unordered_map<llvm::Argument *, clang::VarDecl *>;

class IRToASTVisitor : public llvm::InstVisitor<IRToASTVisitor> {
private:
@@ -43,6 +44,7 @@ class IRToASTVisitor : public llvm::InstVisitor<IRToASTVisitor> {
DIToTypedefMap typedef_decls;
IRToStmtMap stmts;
DebugInfoCollector &dic;
ArgToTempMap temp_decls;

clang::Expr *GetOperandExpr(llvm::Value *val);
clang::QualType GetQualType(llvm::Type *type, llvm::DIType *ditype);
1 change: 1 addition & 0 deletions include/rellic/BC/Util.h
@@ -24,6 +24,7 @@ namespace rellic {
// Serialize an LLVM object into a string.
std::string LLVMThingToString(llvm::Value *thing);
std::string LLVMThingToString(llvm::Type *thing);
std::string LLVMThingToString(llvm::DIType *thing);

// Try to verify a module.
bool VerifyModule(llvm::Module *module);
38 changes: 36 additions & 2 deletions lib/AST/IRToASTVisitor.cpp
@@ -66,6 +66,7 @@ clang::QualType IRToASTVisitor::GetQualType(llvm::Type *type,
}
return ast_ctx.getTypedefType(tdef_decl);
} break;
case llvm::dwarf::DW_TAG_inheritance:
case llvm::dwarf::DW_TAG_member: {
return GetQualType(type, derived->getBaseType());
};
@@ -526,8 +527,41 @@ void IRToASTVisitor::VisitArgument(llvm::Argument &arg) {
ditype = ditype_array[arg.getArgNo() + 1];
}
}

auto argtype{arg.getType()};
if (arg.hasByValAttr()) {
auto byval{arg.getAttribute(llvm::Attribute::ByVal)};
argtype = byval.getValueAsType();
}
// Create a declaration
parm = ast.CreateParamDecl(fdecl, GetQualType(arg.getType(), ditype), name);
parm = ast.CreateParamDecl(fdecl, GetQualType(argtype, ditype), name);
}

// This function fixes function types for those functions that have arguments
// that are passed by value using the `byval` attribute.
// They need special treatment because those arguments, instead of actually
// being passed by value, are instead passed "by reference" from a bitcode point
// of view, with the caveat that the actual semantics are more like "create a
// copy of the reference before calling, and pass a pointer to that copy
// instead" (this is done implicitly).
// Thus, we need to convert a function type like
// i32 @do_foo(%struct.foo* byval(%struct.foo) align 4 %f)
// into
// i32 @do_foo(%struct.foo %f)
static llvm::FunctionType *GetFixedFunctionType(llvm::Function &func) {
std::vector<llvm::Type *> new_arg_types{};

for (auto &arg : func.args()) {
if (arg.hasByValAttr()) {
auto ptrtype{llvm::cast<llvm::PointerType>(arg.getType())};
new_arg_types.push_back(ptrtype->getElementType());
} else {
new_arg_types.push_back(arg.getType());
}
}

return llvm::FunctionType::get(func.getReturnType(), new_arg_types,
func.isVarArg());
}

void IRToASTVisitor::VisitFunctionDecl(llvm::Function &func) {
@@ -546,7 +580,7 @@ void IRToASTVisitor::VisitFunctionDecl(llvm::Function &func) {

DLOG(INFO) << "Creating FunctionDecl for " << name;
auto tudecl{ast_ctx.getTranslationUnitDecl()};
auto ftype{func.getFunctionType()};
auto ftype{GetFixedFunctionType(func)};
auto type{GetQualType(ftype, dic.GetIRFuncToDITypeMap()[&func])};
decl = ast.CreateFunctionDecl(tudecl, type, name);

3 changes: 3 additions & 0 deletions lib/AST/StructFieldRenamer.cpp
@@ -60,6 +60,9 @@ bool StructFieldRenamer::VisitRecordDecl(clang::RecordDecl *decl) {
// FIXME(frabert): Is a clash between field names actually possible?
// Can this mechanism actually be left out?
auto name{di_field->getName().str()};
if (di_field->getTag() == llvm::dwarf::DW_TAG_inheritance) {

Review comment (Contributor):

Does this make an explicit field out of the base type in the case of inheritance? Can you add a comment here that shows what a simple C++ example would look like, and what we would generate as a result?

Review comment (Contributor):

Can you add some examples that use multiple inheritance and virtual inheritance?

// Multiple inheritance
struct Base1 {
  int foo;
};
struct Base2 {
  float bar;
};
struct Derived : Base1, Base2 {
};

// Virtual inheritance
struct Base1 {
  int foo;
};
struct Base2 : Base1 {
  float bar;
};
struct Base3 : Base1 {
  float bar;
};
struct Derived : virtual Base2, virtual Base3 {
};

Also, here is a particularly thorny example that shows where embedding the base as a field of the derived struct breaks down:
C++: https://godbolt.org/z/bM4vrq6fW
C: https://godbolt.org/z/fYarYo5he

See this SO post for more detail: https://stackoverflow.com/questions/52818411/will-the-padding-of-base-class-be-copied-into-the-derived-class
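
To make the padding concern concrete, here is a minimal sketch comparing real inheritance with the "base as first field" encoding. `Base`, `Derived`, `Embedded`, and `SameSize` are hypothetical names chosen for illustration; the `Base_base` field name follows the `<base type name>_base` scheme used in this diff. On ABIs that reuse the tail padding of a non-POD base (the Itanium C++ ABI does this), `Derived` can end up smaller than `Embedded`, so the offset of `c` differs between the two. The exact sizes are ABI-dependent and are not guaranteed.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical base class. The user-provided constructor makes it non-POD,
// which allows some ABIs to place derived members in its tail padding.
struct Base {
  Base() : a(0), b(0) {}
  int a;
  char b;
};

// Real inheritance: `c` may be placed inside Base's tail padding.
struct Derived : Base {
  char c;
};

// The "explicit field" encoding: the base becomes the first member.
// An ordinary member's padding is not reused for the fields that follow it.
struct Embedded {
  Base Base_base;
  char c;
};

// Prints both sizes and returns true when they happen to match.
bool SameSize() {
  // Embedded holds a full Base plus one more byte that cannot overlap it.
  assert(sizeof(Embedded) > sizeof(Base));
  std::printf("sizeof(Derived)=%zu sizeof(Embedded)=%zu\n", sizeof(Derived),
              sizeof(Embedded));
  return sizeof(Derived) == sizeof(Embedded);
}
```

If `SameSize()` returns false on a given target, a decompiled struct built by embedding the base would place `c` at a different offset than the original binary does.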

name = di_field->getBaseType()->getName().str() + "_base";
}
if (seen_names.find(name) == seen_names.end()) {

Review comment (Contributor):

What if seen_names were a map from name to unsigned? Then you could have:

auto &name_count = seen_names[name];
if (name_count) {
  name = name + "_" + std::to_string(name_count);
}
++name_count;
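
The suggestion above can be sketched as a self-contained helper. This is only an illustration under assumptions: `UniquifyName` and `UniquifyNameExample` are hypothetical names that are not part of rellic, and the map type is a guess at what "a map from name to unsigned" would mean here.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: returns `name` unchanged on first use and
// `name_<n>` on the n-th repeat, tracking per-name counts in `seen_names`.
std::string UniquifyName(
    std::unordered_map<std::string, unsigned> &seen_names, std::string name) {
  // operator[] value-initializes the count to 0 on first lookup.
  auto &name_count = seen_names[name];
  if (name_count) {
    name = name + "_" + std::to_string(name_count);
  }
  ++name_count;
  return name;
}

// Usage sketch: repeated field names receive numeric suffixes.
void UniquifyNameExample() {
  std::unordered_map<std::string, unsigned> seen_names;
  assert(UniquifyName(seen_names, "foo") == "foo");
  assert(UniquifyName(seen_names, "foo") == "foo_1");
  assert(UniquifyName(seen_names, "foo") == "foo_2");
}
```

One caveat with this scheme: a generated name such as `foo_1` is not itself recorded under its own key, so it could still collide with a field that is genuinely named `foo_1`.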

seen_names.insert(name);
decl_field->setDeclName(ast.CreateIdentifier(name));
4 changes: 4 additions & 0 deletions lib/BC/Util.cpp
@@ -55,6 +55,10 @@ std::string LLVMThingToString(llvm::Type *thing) {
return DoLLVMThingToString(thing);
}

std::string LLVMThingToString(llvm::DIType *thing) {
return DoLLVMThingToString(thing);
}

// Try to verify a module.
bool VerifyModule(llvm::Module *module) {
std::string error;
1 change: 1 addition & 0 deletions tests/tools/decomp/byval_struct.c
@@ -14,4 +14,5 @@ int main() {
struct foo f = {atoi("1"), atoi("2"), atoi("3"), atoi("4")};
long long x = get_3x(f);
printf("%lld %lld\n", f.x, x);
return 0;
}
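
For context on why this test exists, the `byval` semantics described in the `GetFixedFunctionType` comment in lib/AST/IRToASTVisitor.cpp can be sketched in plain C++. This is a rough model, not rellic or LLVM code: the struct layout and the functions `Get3xByValue`, `Get3xLowered`, `CallLowered`, and `CallerIsUnchanged` are hypothetical and only loosely modeled on the `struct foo` and `get_3x` seen in the test.

```cpp
#include <cassert>

// Hypothetical struct, loosely modeled on the `struct foo` in the test.
struct foo {
  long long x, y, z, w;
};

// Source-level view: the callee receives its own copy of `f`.
long long Get3xByValue(foo f) {
  f.x *= 3;  // only mutates the callee's copy
  return f.x;
}

// Bitcode-level view of a `byval` argument: the callee receives a pointer,
// but that pointer refers to a copy made for this call.
long long Get3xLowered(foo *f) {
  f->x *= 3;
  return f->x;
}

// What a call site does implicitly for a `byval` argument.
long long CallLowered(const foo &original) {
  foo copy = original;         // implicit copy made before the call
  return Get3xLowered(&copy);  // the callee only ever sees the copy
}

// Both views leave the caller's struct untouched.
bool CallerIsUnchanged() {
  foo f{1, 2, 3, 4};
  long long a = Get3xByValue(f);
  long long b = CallLowered(f);
  assert(a == b);
  return f.x == 1;
}
```

Because the two forms behave the same from the caller's side, rewriting the pointer parameter back into a by-value struct parameter, as the diff does, recovers the source-level signature.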