C++: Rewrite `cpp/uncontrolled-process-operation` to not use `DefaultTaintTracking` #14561

jketema · 2023-10-23T10:00:53Z

Note that one of the internal tests also needs updating. I'll do that once we're happy with what we have here.

I added a barrier looking at arithmetic types, as these seemed to be the source of many results that are in the end not very interesting. Note that this does mean that we lose some results where the input buffer is copied character-by-character. However, even with the barrier disabled we lose some of these. I'm not sure how worried we should be about this.
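To make the character-by-character case concrete, here is a minimal sketch. The helper `copyCharByChar` is hypothetical and not taken from any of the analysed projects, and the comments describe how I would expect a barrier on arithmetic types to behave rather than confirmed query output.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (illustration only): rebuilds `src` one character
// at a time instead of copying the buffer as a whole.
std::string copyCharByChar(const char* src) {
    std::string dst;
    for (std::size_t i = 0; src[i] != '\0'; ++i) {
        // `c` has the arithmetic type `char`. Assumption: a barrier on
        // arithmetic types stops taint from flowing through this value.
        char c = src[i];
        // The rebuilt string is assembled only from such `char` values,
        // so taint on `src` would presumably not reach `dst`.
        dst.push_back(c);
    }
    return dst;
}
```

If `src` came from a user-controlled source and the returned string were later passed to a process operation, this is the kind of result the barrier would likely drop.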

Summary of MRVA results:

Total number of results reported by MRVA before on 1000s projects: 1915
Total number of results reported by MRVA after on 1000s projects: 1545
Disregarding the barrier I added, we lose 58 (source, sink)-pairs. They all seem to be FPs related to pointer/pointee confusion in DTT.
Not disregarding the barrier, we gain 82 (source, sink)-pairs. I spot-checked these, and they look reasonable. I'm not too worried about this number anyway, since this is a medium-precision query.

I'm still running an MRVA experiment to see how many (source, sink)-pairs we lose when we do not disregard the barrier.

MathiasVP

LGTM if you're happy with the MRVA results. I think the query as it's written now makes a lot of sense 😍.

MathiasVP · 2023-11-15T11:35:31Z

cpp/ql/src/Security/CWE/CWE-114/UncontrolledProcessOperation.ql

  exists(int processOperationArg, FunctionCall call |
    isProcessOperationArgument(processOperation, processOperationArg) and
    call.getTarget().getName() = processOperation and
-    call.getArgument(processOperationArg) = arg
+    call.getArgument(processOperationArg) = [arg.asExpr(), arg.asIndirectExpr()]


I'm wondering if, once all this DTT stuff has been merged, we should investigate what happens if we remove `output.isReturnValue()` from our flow sources (and simply keep the `output.isReturnValueDeref()` cases) in models such as this one: https://github.com/github/codeql/blob/main/cpp/ql/lib/semmle/code/cpp/models/implementations/Getenv.qll#L18

A quick grep reveals that this is only a problem for:

https://github.com/github/codeql/blob/main/cpp/ql/lib/semmle/code/cpp/models/implementations/Getenv.qll#L18

https://github.com/github/codeql/blob/main/cpp/ql/lib/semmle/code/cpp/models/implementations/Gets.qll#L51

https://github.com/github/codeql/blob/main/cpp/ql/lib/semmle/code/cpp/models/implementations/Gets.qll#L108

This would mean that we wouldn't have to exclude `DataFlow::ExprNode` from `isSource` in this case.

Note that this predicate is used on the sink side, not the source side. What you're saying does apply to the `not node instanceof DataFlow::ExprNode` we have in the source predicate below.

Oops, sorry. Yes, I meant to comment on the source side. I don't know why I put the comment on this line of code 😂

Should we open an internal issue for this?

Yeah, I can do that now.
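For reference, the return value versus return value dereference distinction discussed in this thread can be sketched in plain C++. The helper `envOrDefault` is hypothetical, and the mapping of each line to `isReturnValue()` or `isReturnValueDeref()` in the comments is my reading of the models, not something verified against the library.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (illustration only): reads an environment variable
// and falls back to a default when it is unset.
std::string envOrDefault(const char* name, const std::string& fallback) {
    // The pointer returned by getenv is what I take the model's
    // `isReturnValue()` output to describe.
    const char* value = std::getenv(name);
    if (value == nullptr) {
        return fallback;
    }
    // The characters the pointer refers to are what I take
    // `isReturnValueDeref()` to describe. Only this pointed-to data is
    // actually controlled by the environment.
    return std::string(value);
}
```

Under that reading, keeping only the dereference case as a source would still cover the data that matters for a sink that reads the string.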

MathiasVP · 2023-11-15T11:38:51Z

cpp/ql/src/Security/CWE/CWE-114/UncontrolledProcessOperation.ql

+  sink = sinkNode.getNode() and
+  isProcessOperationExplanation(sink, processOperation) and
+  Flow::flowPath(sourceNode, sinkNode)
+select sink, sourceNode, sinkNode,
  "The value of this argument may come from $@ and is being passed to " + processOperation + ".",
  source, source.toString()


Should we use the getSourceType predicate from the source node to obtain a better alert message?

We can. That's what you did elsewhere, right?

Indeed. For example here: https://github.com/github/codeql/pull/14784/files#diff-6659098746be150a26dcb4f6677d331cbea3ac76fa9945220113d2f6c624adc8R120

Done. I think the source types might need some further tuning, but that can be done later.

Agreed. But this in itself LGTM!

jketema · 2023-11-15T14:01:46Z

Disregarding the barrier I added, we lose 58 (source, sink)-pairs. They all seem to be FPs related to pointer/pointee confusion in DTT.

Not disregarding the barrier, we lose 165 (source, sink)-pairs.

jketema · 2023-11-15T14:02:34Z

Rebased for internal PR purposes.

github-actions bot added the C++ label Oct 23, 2023

jketema force-pushed the rewrite-uncontrolled-process-operation branch 3 times, most recently from c8658fe to d013b4a Compare October 26, 2023 12:19

jketema force-pushed the rewrite-uncontrolled-process-operation branch from d013b4a to ed4b7a4 Compare November 1, 2023 12:18

MathiasVP mentioned this pull request Nov 2, 2023

C++: Allocate more FunctionInput and FunctionOutputs #14667

Merged

jketema force-pushed the rewrite-uncontrolled-process-operation branch from ed4b7a4 to 9df3252 Compare November 15, 2023 10:55

jketema marked this pull request as ready for review November 15, 2023 11:15

jketema requested a review from a team as a code owner November 15, 2023 11:15

MathiasVP previously approved these changes Nov 15, 2023

jketema dismissed MathiasVP’s stale review via 20a4f19 November 15, 2023 12:34

jketema added the "no-change-note-required" (This PR does not need a change note) and "depends on internal PR" (This PR should only be merged in sync with an internal Semmle PR) labels Nov 15, 2023

MathiasVP previously approved these changes Nov 15, 2023

jketema added 2 commits November 15, 2023 14:57

C++: Rewrite cpp/uncontrolled-process-operation to not use `DefaultTaintTracking`

92c1896

C++: Address review comments

46e6e72

jketema dismissed MathiasVP’s stale review via 46e6e72 November 15, 2023 14:00

jketema force-pushed the rewrite-uncontrolled-process-operation branch from 20a4f19 to 46e6e72 Compare November 15, 2023 14:00

MathiasVP approved these changes Nov 15, 2023

jketema merged commit f22979f into github:main Nov 15, 2023

jketema deleted the rewrite-uncontrolled-process-operation branch November 15, 2023 15:04
