Increase locality of the character encoding used for a given string. #141

Open
DarkArc opened this issue Oct 17, 2023 · 1 comment
@DarkArc

DarkArc commented Oct 17, 2023

It's currently quite difficult to implement visualization for a given const.str entry. The partition contains strings in several possible character encodings and does not specify which encoding is used for a particular element. Instead, the encoding has to be inferred from the context in which the string is referenced: e.g., src.word can reference a const.str entry as a wide character literal, while TextOffset contextually means a UTF-8 null-terminated identifier.

I think it would be a significant improvement if there was either a partition per encoding (e.g., const.utf-8-str, const.utf-16-str, etc.) or an encoding specified per line. This would also make it harder for folks -- such as myself :) -- to misinterpret the encoding of the contents of the partition.
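
To illustrate the ambiguity, here is a minimal sketch of what a visualizer runs into when it only has raw bytes. Everything in it is hypothetical: `Partition`, `read_narrow`, and `read_utf16le` are illustrative names, and the little-endian 16-bit layout for wide strings is an assumption made for the example, not something stated about the actual file format.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the raw bytes of a string partition.
using Partition = std::vector<std::uint8_t>;

// Interpret the bytes at `offset` as a NUL-terminated narrow (UTF-8) string.
std::string read_narrow(const Partition& p, std::size_t offset) {
    std::string out;
    while (offset < p.size() && p[offset] != 0) {
        out.push_back(static_cast<char>(p[offset]));
        ++offset;
    }
    return out;
}

// Interpret the same bytes as 16-bit little-endian code units (an assumed
// layout), stopping at a zero code unit or at the end of the partition.
std::u16string read_utf16le(const Partition& p, std::size_t offset) {
    std::u16string out;
    while (offset + 1 < p.size()) {
        const auto unit = static_cast<char16_t>(p[offset] | (p[offset + 1] << 8));
        if (unit == 0) break;
        out.push_back(unit);
        offset += 2;
    }
    return out;
}
```

Given the bytes `68 00 69 00 00 00`, `read_narrow` stops at the first zero byte and yields a one-character string, while `read_utf16le` yields two code units. Nothing in the bytes themselves says which reading is the intended one.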

@GabrielDosReis
Collaborator

I agree that the partition const.str has morphed over time to serve more generally as backing store for all character strings with the encoding moving into the entry of the character string descriptor. It makes sense to reconsider that and introduce dedicated partitions:
- const.str.utf-8
- const.str.utf-16
- const.str.utf-32

That would probably mean replacing the uses of TextOffset for string literals with a StrIndex with sorts:

  • StrSort::Utf8
  • StrSort::Utf16
  • StrSort::Utf32

TextOffset would be an index into const.str.utf-8 since all names are internalized as UTF-8 identifiers.
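
A rough sketch of what such a tagged index could look like follows. This is not the actual design: the 2-bit tag width, the bit layout, the member names, and `partition_name` are all assumptions made for illustration. The third sort is written as `Utf32` to pair with const.str.utf-32, which is also an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sort tag, one value per proposed partition.
enum class StrSort : std::uint32_t { Utf8 = 0, Utf16 = 1, Utf32 = 2 };

// Hypothetical tagged index: the low 2 bits hold the sort (assumed width),
// the remaining 30 bits hold the position within that sort's partition.
class StrIndex {
public:
    static constexpr std::uint32_t sort_bits = 2;
    static constexpr std::uint32_t max_position = (1u << (32 - sort_bits)) - 1;

    StrIndex(StrSort sort, std::uint32_t position)
        : bits_((position << sort_bits) | static_cast<std::uint32_t>(sort)) {
        assert(position <= max_position);  // position must fit in 30 bits
    }

    StrSort sort() const {
        return static_cast<StrSort>(bits_ & ((1u << sort_bits) - 1));
    }

    std::uint32_t position() const { return bits_ >> sort_bits; }

private:
    std::uint32_t bits_;
};

// Map a sort to the partition it would index into.
std::string partition_name(StrSort sort) {
    switch (sort) {
        case StrSort::Utf8:  return "const.str.utf-8";
        case StrSort::Utf16: return "const.str.utf-16";
        case StrSort::Utf32: return "const.str.utf-32";
    }
    return "";
}
```

With something along these lines, a reader of the file could pick the decoder from `index.sort()` alone instead of tracing how the string was referenced.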

@GabrielDosReis GabrielDosReis added this to the IFC 0.44 milestone Oct 23, 2023