Structure of rules in this repo #3

hazanasec · 2020-07-31T08:57:59Z

Need to agree on a structure of the rules i.e. /v6/java/6.2.5.yaml or /java/v6/6.2.5.yaml or /v6/6.2.5/java/rule.yaml

I think it makes sense to maybe go with language first? What will people look for when they first see the repo?

The text was updated successfully, but these errors were encountered:

clintgibler · 2020-08-01T00:30:04Z

This is a great question. My thoughts:

Top Folder Name

Language makes sense to be top level directory.

Why? Usually when you're running Semgrep on a repo you probably care more about running the rules for the languages contained in that repo, vs. "let me run all output encoding checks on this repo that's only Python."

Mid Folder Name

I'd prefer something like:

java/
  * v5_validation_sanitization_encoding/
  * v6_cryptography/
python/
  * v5_validation_sanitization_encoding/
  * v6_cryptography/

Personally I kind of like having the second level folder name include some context about what it is, so when you've navigating the tree structure it's more explicit than just "v6".

The user may not known what ASVS section corresponds to what off the top of their head, and having to reference another resource while they're navigating this repo or click into every directory seems unnecessarily high friction.

Rule Name

So language/vN_description/???.yml.

But what about after that? I'm not sure if we can just do 6.2.5.yaml, because it's likely that there will be some (many?) cases where there will need to be many rules, even with one language, to capture the same control.

For example, capturing how one can disable output encoding in Flask, Django, Jinja, etc. (all Python). Also, there may be various third-party libraries that provide similar functionality (e.g. crypto), and we want to flag the same issue in all of them (e.g. MD5).

Of course, we could try to put a series of patterns for the same control into the same rule (e.g. finding MD5 in every popular crypto library + stdlib), but that may get complicated, have what the rule does be less intuitive, as the check is less self contained (this rule finds exactly just this issue).

Maybe something like:

language/vN_description/6.2.5_short_description.yml or
language/vN_description/6.2.5/short_description_{a|b}.yml?

These are just some initial thoughts, happy to discuss 👍

Open Questions

One potential future issue I can foresee is that if we name every rule based on which ASVS control it corresponds to, when there's a new version of ASVS, we may have to rename things a bunch of places. For example, a chapter gets added, existing chapters get re-ordered, controls within a chapter get added/renamed/re-ordered, etc.

Not sure how to handle this in a way that makes future us minimally sad :P

clintgibler · 2020-08-01T00:34:57Z

Another potential structure is something like:

python/
  flask/
    v5_validation_sanitization_encoding/
    v6_cryptography/
  django/
    v5_validation_sanitization_encoding/
    v6_cryptography/

java/
  spring/
    v5_validation_sanitization_encoding/
    v6_cryptography/

What's nice about this is that as a consumer, it's structured very much by what you care about: I have these languages and these web app / frontend frameworks, so I'm going to use the rules in those folders. Potentially makes rule pack creation/maintenance easier.

The downside is that we'd be repeating the ASVS sections a bunch of times (for every language/framework combination, where the control is relevant) in the repo.

But that doesn't make our lives too much harder, and might make it easier for contributors to know where to put the rules they're writing.

hazanasec · 2020-08-01T17:02:26Z

Looks good, I agree the structure in the second comment looks very nice for the consumer, and totally agree with the /short_description_{a|b}.yml

I don't think there's too much we could do at the moment to future proof it at the moment unfortunately!

Perhaps we need to add this structure and the message structure into a readme?

danielcuthbert · 2020-08-02T08:42:15Z

We will try and make sure that the numbering doesn't change drastically between 4.x and 5.x, but /short_description_{a|b}.yml works pretty well I feel

clintgibler changed the title ~~Structure of rules~~ Structure of rules in this repo Aug 1, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Structure of rules in this repo #3

Structure of rules in this repo #3

hazanasec commented Jul 31, 2020

clintgibler commented Aug 1, 2020

clintgibler commented Aug 1, 2020

hazanasec commented Aug 1, 2020

danielcuthbert commented Aug 2, 2020

Structure of rules in this repo #3

Structure of rules in this repo #3

Comments

hazanasec commented Jul 31, 2020

clintgibler commented Aug 1, 2020

Top Folder Name

Mid Folder Name

Rule Name

Open Questions

clintgibler commented Aug 1, 2020

hazanasec commented Aug 1, 2020

danielcuthbert commented Aug 2, 2020