Doc Request: Can you expand on failure modes of offset guessing in ABI mode? #151

Open
brianv0 opened this issue Dec 18, 2024 · 1 comment

brianv0 commented Dec 18, 2024

In the documentation, it reads:

The more fundamental reason to prefer the API mode is that the C libraries are typically meant to be used with a C compiler. You are not supposed to do things like guess where fields are in the structures.

This makes ABI mode sound extremely brittle, but other parts of the documentation make it sound robust provided your headers are accurate. Is the concern mostly about alignment and padding, which are more likely to come out wrong when ABI mode parses the headers?

I was also wondering whether there is an in-between mode here, e.g. using gdb's Python API and lookup_type, provided debug symbols are included.

arigo (Contributor) commented Dec 19, 2024

The ABI mode works if you carefully copy exactly the correct declarations in ffi.cdef(). For some C libraries, this is easy enough to do: just copy the C headers. The result is as robust as a compiled C program, in the sense that it works as long as no incompatible "binary-level" changes are made to the C library itself.

In other cases, though, the C headers can contain a mess of platform-specific #ifdefs, which means the declarations seen by the C compiler change depending on various parameters; or they contain #pragma or __attribute__ directives that influence the alignment and padding of structs. If you don't get these details exactly right in your ABI-mode cdef(), field offsets will be wrong and it will crash.
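
To make the padding point concrete, here is a minimal sketch. It uses Python's standard struct module as a stand-in for the layout computation, not cffi itself, and the struct { char tag; int value; } declaration is a made-up example. It simulates a library compiled with 1-byte packing whose data is then read with the default native layout, which is what a cdef() missing the packing information would assume.

```python
import struct

# Bytes as a library built with 1-byte packing would lay out the
# hypothetical struct { char tag; int value; }: no padding after `tag`.
packed_bytes = struct.pack("=ci", b"A", 42)

# "=" means no alignment padding; "@" means the platform's native alignment,
# which is what a declaration without any packing directive implies.
packed_size = struct.calcsize("=ci")
native_size = struct.calcsize("@ci")

# Read the packed bytes as if they used the native layout. The buffer is
# zero-padded to the native size so the read itself stays in bounds.
buf = packed_bytes.ljust(native_size, b"\0")
tag, value = struct.unpack("@ci", buf)

print("packed size:", packed_size, "native size:", native_size)
print("value read with the wrong layout:", value)
```

On platforms where int is aligned to more than one byte, the native layout places value at a different offset, so the value read back is no longer 42. In a real ABI-mode program there is no zero-padded buffer to keep the access in bounds, so the same mismatch can read or write past the end of the struct.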

Using gdb to load the debug symbols might be a cool idea. For now, it should be done as a third-party project, e.g. you would run it ahead of time and it would produce the content of the cdef(). I'm afraid there are many (maybe project-specific) issues to fix along the way, though. Maybe a more integrated solution would be nice for some projects, where gdb is run transparently at runtime whenever you try to access, say, a new function; but that's not a solution that works for everybody (it depends on gdb being installed on the runtime machine).
