LLVM verification #356

AFOliveira · 2024-12-13T13:48:23Z

Following up on #258.

I've been working on what @lenary proposed as a first approach, and I think this is an OK first mock-up. It is still a WIP, since many instructions still have bugs, but if you have any comments or recommendations, please let me know :).

I have not added LLVM as a submodule yet, because it may be easier licensing-wise to just point at it in the script. Usage is python3 riscv_parser.py <tablegen_json_file> <arch_inst_directory>. I've also used jq to improve the readability of the output of llvm-tblgen -I llvm/include -I llvm/lib/Target/RISCV llvm/lib/Target/RISCV/RISCV.td --dump-json -o <path-to-json-output>.
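For reference, the dump step could also be driven from Python instead of the shell. This is only a sketch: it assembles the llvm-tblgen command quoted above and re-indents the resulting JSON with the standard library in place of jq. The function names and the llvm_root parameter are hypothetical and are not taken from riscv_parser.py, and it assumes llvm-tblgen is on PATH.

```python
import json
import subprocess
from pathlib import Path


def tblgen_command(llvm_root, json_out):
    """Build the llvm-tblgen invocation quoted above, rooted at an LLVM checkout."""
    root = Path(llvm_root)
    riscv_dir = root / "llvm" / "lib" / "Target" / "RISCV"
    return [
        "llvm-tblgen",  # assumed to be on PATH
        "-I", str(root / "llvm" / "include"),
        "-I", str(riscv_dir),
        str(riscv_dir / "RISCV.td"),
        "--dump-json",
        "-o", str(json_out),
    ]


def dump_and_reindent(llvm_root, json_out):
    """Run llvm-tblgen, then rewrite the JSON with indentation (a stand-in for jq)."""
    subprocess.run(tblgen_command(llvm_root, json_out), check=True)
    data = json.loads(Path(json_out).read_text())
    Path(json_out).write_text(json.dumps(data, indent=2))
    return data
```

Calling dump_and_reindent("<path-to-llvm-checkout>", "<path-to-json-output>") would produce the file that is then passed to riscv_parser.py as <tablegen_json_file>.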

lenary

I think this is a good start, and is heading in the right direction. Some immediate comments before I look again in the next few days.

Is there anything specific you want feedback on?

Which instructions are you running into issues with so far?

Another overall thought I have is: if this is a test/validation suite, could this be structured using pytest? For instance, you could parse the YAML and the JSON to match up instruction descriptions, and use that to create parameterized fixtures (one instance per matched description) - the advantage of this is that you get all of the niceness of pytest's asserts, and pytest's test suite reports, without having to reimplement all the testcase management. This also makes it easier to split the test code that checks the encoding matches from tests that check the assembly strings match (for example, but we can think of others).
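A minimal sketch of what that structure could look like, assuming both sources have already been normalized into plain dictionaries. The dictionary layout, the "encoding" key and the helper names are hypothetical, and it uses pytest.mark.parametrize rather than parameterized fixtures to keep the example short:

```python
import pytest

# Hypothetical, already-normalized stand-ins for the UDB YAML and the
# llvm-tblgen JSON. Real data would need parsing into this shape first.
ADD_ENC = "0000000" + "-" * 10 + "000" + "-" * 5 + "0110011"
ADDI_ENC = "-" * 17 + "000" + "-" * 5 + "0010011"

YAML_INSTRS = {
    "add": {"encoding": ADD_ENC},
    "addi": {"encoding": ADDI_ENC},
}
LLVM_INSTRS = {
    "ADD": {"encoding": ADD_ENC, "AsmString": "add\t$rd, $rs1, $rs2"},
    "ADDI": {"encoding": ADDI_ENC, "AsmString": "addi\t$rd, $rs1, $imm12"},
}


def matched_pairs(yaml_instrs, llvm_instrs):
    """Pair descriptions by lower-cased name; names without a match are dropped."""
    llvm_by_name = {name.lower(): desc for name, desc in llvm_instrs.items()}
    return [
        (name, desc, llvm_by_name[name])
        for name, desc in sorted(yaml_instrs.items())
        if name in llvm_by_name
    ]


PAIRS = matched_pairs(YAML_INSTRS, LLVM_INSTRS)
IDS = [name for name, _, _ in PAIRS]


# One test instance per matched instruction, checking only the encoding.
@pytest.mark.parametrize("name,yaml_desc,llvm_desc", PAIRS, ids=IDS)
def test_encoding_matches(name, yaml_desc, llvm_desc):
    assert yaml_desc["encoding"] == llvm_desc["encoding"]


# A separate test for the assembly string, so failures are reported separately.
@pytest.mark.parametrize("name,yaml_desc,llvm_desc", PAIRS, ids=IDS)
def test_asm_mnemonic_matches(name, yaml_desc, llvm_desc):
    assert llvm_desc["AsmString"].split()[0] == name
```

With this layout, an encoding mismatch and an assembly-string mismatch for the same instruction show up as two distinct failures in the pytest report.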

ext/auto-inst/parsing.py

AFOliveira · 2024-12-13T15:40:16Z

I think this is a good start, and is heading in the right direction. Some immediate comments before I look again in the next few days.

Thanks for your feedback!

Is there anything specific you want feedback on?

Not yet, the point was just some general considerations about the initial approach.

Which instructions are you running into issues with so far?

I haven't found a pattern yet, but I'll try to fix this soon, and if I run into any issue I can't solve by myself I'll bring it up. Thanks!

Another overall thought I have is: if this is a test/validation suite, could this be structured using pytest? For instance, you could parse the YAML and the JSON to match up instruction descriptions, and use that to create parameterized fixtures (one instance per matched description) - the advantage of this is that you get all of the niceness of pytest's asserts, and pytest's test suite reports, without having to reimplement all the testcase management. This also makes it easier to split the test code that checks the encoding matches from tests that check the assembly strings match (for example, but we can think of others).

I'll take a look into pytest and see how to port this, thanks for the suggestion!

AFOliveira · 2024-12-19T09:29:54Z

Current known bugs and open items:

Can't parse complex immediate encodings from the YAML (e.g. 31|7|30-25|11-8).
Some instructions need to be fixed in the UDB, since this has already caught some errors.
Need to split the pytest run into separate unit tests rather than one big test.
Use AsmString from the JSON.

I'm working on both now.
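For the complex immediate case, one possible approach is to expand the location string into an explicit list of encoding bit positions. The sketch below assumes that "|" separates pieces, that "hi-lo" is an inclusive descending range, and that pieces are listed most significant first; the function name is hypothetical and this is not the parser used in the PR.

```python
def expand_location(location):
    """Expand a string such as '31|7|30-25|11-8' into a list of bit positions.

    Assumptions (not confirmed by the thread): '|' separates pieces, 'hi-lo'
    is an inclusive descending range, and pieces are most significant first.
    """
    bits = []
    for part in location.split("|"):
        part = part.strip()
        if "-" in part:
            hi, lo = (int(x) for x in part.split("-"))
            if hi < lo:
                raise ValueError(f"range must be written hi-lo: {part!r}")
            bits.extend(range(hi, lo - 1, -1))
        else:
            bits.append(int(part))
    if len(set(bits)) != len(bits):
        raise ValueError(f"bit position used twice in {location!r}")
    return bits


positions = expand_location("31|7|30-25|11-8")
```

Here positions holds twelve entries, one per immediate bit, which can then be compared position by position against the operand bits in the LLVM encoding.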

lenary · 2024-12-19T11:00:25Z

Somewhere, __pycache__ and *.pyc are missing from a .gitignore. :)

lenary · 2024-12-19T23:26:47Z

ext/auto-inst/parsing.py

+    instr_name = instr_name.lower().strip()
+
+    # Search through all entries in json_data
+    for key, value in json_data.items():


You can use the !instanceof metadata to search only through things that have an encoding; I pointed this out in the comment about the initial approach. This will save you iterating through aliases (and other data such as isel patterns).

From what I can see, !instanceof follows the structure below, but I'm finding it hard to understand how to use it for parsing. Could you please explain further?

{
  "!instanceof": {
    "AES_1Arg_Intrinsic": [ "int_arm_neon_aesimc", "int_arm_neon_aesmc" ],
    "AES_2Arg_Intrinsic": [ "int_arm_neon_aesd", "int_arm_neon_aese" ],
    "ALUW_rr": [ "ADDW", "ADD_UW", "DIVUW", "DIVW", "MULW", "PACKW", "REMUW", "REMW", "ROLW", "RORW", "SH1ADD_UW", "SH2ADD_UW", "SH3ADD_UW", "SLLW", "SRAW", "SRLW", "SUBW" ],
    "ALU_ri": [ "ADDI", "ANDI", "ORI", "SLTI", "SLTIU", "XORI" ],

So, let's assume you parsed the whole json object, into a python variable called json.

json["!instanceof"] is a map, from tablegen classes, to a list of definition names that are instances of that class (or its sub-classes). These definition names appear in the top-level json.

You would use code like the following:

for def_name in json["!instanceof"]["RVInstCommon"]:
    def_data = json[def_name]
    ...

This saves you having to look at all the tablegen data that is not an instruction (so an alias or a pattern or CSR or something).

Note you'll still have to look at isPseudo and isCodeGenOnly and potentially exclude items where one or both of those is true.
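Putting the two points together, a sketch of the filtered loop might look like the following. The helper name and the sample data are hypothetical (the sample is hand-written, not real llvm-tblgen output), and the variable is called tblgen rather than json to avoid shadowing Python's json module. It relies on the JSON dump representing bit fields as the integers 0 and 1.

```python
def real_instructions(tblgen):
    """Yield (name, record) for RVInstCommon instances that are neither
    pseudo-instructions nor codegen-only definitions."""
    for def_name in tblgen["!instanceof"].get("RVInstCommon", []):
        record = tblgen[def_name]
        # Bit fields are 0 or 1 in the dump, so truthiness is sufficient here.
        if record.get("isPseudo") or record.get("isCodeGenOnly"):
            continue
        yield def_name, record


# Hand-written stand-in for the parsed JSON dump; FOO_CG is a made-up name.
sample = {
    "!instanceof": {"RVInstCommon": ["ADD", "PseudoCALL", "FOO_CG"]},
    "ADD": {"isPseudo": 0, "isCodeGenOnly": 0},
    "PseudoCALL": {"isPseudo": 1, "isCodeGenOnly": 0},
    "FOO_CG": {"isPseudo": 0, "isCodeGenOnly": 1},
}

names = [name for name, _ in real_instructions(sample)]
```

Only ADD survives the filter in this sample; whether codegen-only definitions should always be excluded is a judgment call that the comment above leaves open.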

AFOliveira · 2024-12-22T14:58:18Z

Unit testing is now working.

TODO:
Solve complex immediates
Add the optimization @lenary proposed on code structure
Add this to the validation process

AFOliveira · 2024-12-23T14:59:50Z

Can I insert LLVM as a submodule or would this be another licensing problem?

AFOliveira · 2024-12-23T15:52:31Z

This is functional. What's still missing is how to handle LLVM and adding the test to the Rakefile. I will also see how I can optimize the code, e.g. with what @lenary proposed in one review.

lenary · 2024-12-23T18:47:14Z

Can I insert LLVM as a submodule or would this be another licensing problem?

I highly suggest using environment variables to point to LLVM, and skipping these tests if those environment variables are not set.
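One way this could look in a pytest suite is sketched below. The variable name LLVM_TBLGEN_JSON and the helper names are hypothetical, since the thread does not settle on a name; the skip itself uses pytest's standard skipif marker.

```python
import os
from pathlib import Path

import pytest

# Hypothetical variable name; the thread does not settle on one.
LLVM_JSON_ENV = "LLVM_TBLGEN_JSON"


def llvm_json_path(environ=os.environ):
    """Return the JSON dump path from the environment, or None if the
    variable is unset or does not point to an existing file."""
    value = environ.get(LLVM_JSON_ENV)
    if not value:
        return None
    path = Path(value)
    return path if path.is_file() else None


# Reusable marker: tests carrying it are reported as skipped, not failed.
requires_llvm = pytest.mark.skipif(
    llvm_json_path() is None,
    reason=f"{LLVM_JSON_ENV} is not set or does not point to a file",
)


@requires_llvm
def test_llvm_dump_is_not_empty():
    assert llvm_json_path().stat().st_size > 0
```

Contributors without an LLVM checkout then see the LLVM tests listed as skipped with the reason above, while CI can export the variable to enable them.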

AFOliveira · 2025-01-17T15:48:38Z

@dhower-qc as discussed on our last call, I added LLVM to CI with caching so it doesn't always have to build the table. For local runs I kept it optional: it only runs if the user has the file present. Is it OK this way?

AFOliveira · 2025-01-20T15:09:53Z

Sorry for the mess with force pushes, but I had some trouble finding what I was messing up with GH Actions caching. However, I believe this PR is ready to merge once #428 is solved and @dhower-qc and @lenary have reviewed it.

AFOliveira requested a review from lenary December 13, 2024 13:48

AFOliveira force-pushed the AFOliveira/LLVM branch from a56b0b1 to d9b50b2 Compare December 13, 2024 14:13

lenary reviewed Dec 13, 2024

AFOliveira mentioned this pull request Dec 18, 2024

Instruction description mismatch #360

Open

lenary reviewed Dec 19, 2024

This was referenced Dec 23, 2024

Acquire and release instructions have a bug #361

Open

Fix CSRRS #376

Merged

AFOliveira and others added 14 commits December 23, 2024 12:59

Add simple Docker environment variable

7f82b46

Signed-off-by: Afonso Oliveira <[email protected]>

Fix errors due to incorrect parsing of VM

7141a9c

Signed-off-by: Afonso Oliveira <[email protected]>

First Refactor to pytest

6e45c3b

Signed-off-by: Afonso Oliveira <[email protected]>

Allow 16 bit instructions for C extension

ac04c28

Signed-off-by: Afonso Oliveira <[email protected]>

Revert bad parsing

f1b8613

Signed-off-by: Afonso Oliveira <[email protected]>

Allow only one value

e3e7456

Signed-off-by: Afonso Oliveira <[email protected]>

Use AsmString instead of name

4355eb0

Signed-off-by: Afonso Oliveira <[email protected]>

Small Refactor on parsing.py

8492581

refactor to do unit tests

3bd1d2c

refactor to file name

6b9fdda

Modify to have seveal Unit tests instead of just one

831a10b

Signed-off-by: Afonso Oliveira <[email protected]>

Clean up and code reorganization

655e1d6

Signed-off-by: Afonso Oliveira <[email protected]>

Ensure it is not pseudo

d61fb3b

Signed-off-by: Afonso Oliveira <[email protected]>

Skip aq/rl instructions

5992124

Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira force-pushed the AFOliveira/LLVM branch from 956ad97 to 5992124 Compare December 23, 2024 12:59

add pytest to requirements

c2e6e92

Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira added 3 commits January 17, 2025 15:29

Fix caching

0fa5bdd

Signed-off-by: Afonso Oliveira <[email protected]>

Fix caching

56412ac

Signed-off-by: Afonso Oliveira <[email protected]>

Fix caching

b2748e0

Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira force-pushed the AFOliveira/LLVM branch from f467578 to b2748e0 Compare January 17, 2025 15:36

Add dependencie for smoke test

6968919

Signed-off-by: Afonso Oliveira <[email protected]>

Change logic for LLVM's path

25aa928

Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira force-pushed the AFOliveira/LLVM branch 6 times, most recently from f2d419f to 40af15d Compare January 20, 2025 11:40

Change CI logic for LLVM

5ee8496

Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira force-pushed the AFOliveira/LLVM branch 6 times, most recently from 6ca01a7 to e2a5b73 Compare January 20, 2025 13:13

Add cache and update paths

92f9c99

Signed-off-by: Afonso Oliveira <[email protected]>

AFOliveira force-pushed the AFOliveira/LLVM branch from e2a5b73 to 92f9c99 Compare January 20, 2025 13:53

AFOliveira and others added 2 commits January 20, 2025 14:51

Add corner case when implementation of LLVM does not need to follow t…

34e769e

…he ISA Spec Signed-off-by: Afonso Oliveira <[email protected]>

Merge branch 'main' into AFOliveira/LLVM

b3c523c

AFOliveira and others added 2 commits January 22, 2025 18:13

Merge branch 'main' into AFOliveira/LLVM

15b12f3

Work around for FENCE. ISA and compiler should treat it differently

af0acae

AFOliveira force-pushed the AFOliveira/LLVM branch from 5eff105 to af0acae Compare January 27, 2025 09:31

AFOliveira added 2 commits January 27, 2025 09:43

Set CM instruction length 16 bit instead of 32

b1e0ee6

Merge branch 'main' into AFOliveira/LLVM

f99e556
