Refinements to prompt injection tests and benchmark
Summary:
- Refine tests to be more specific and accurate in some cases, based on an initial manual investigation of results.
- Use Llama-70b as the judge model for best results.
- Add example results to the internal datasets folder.
- Some logging and quality-of-life (QOL) changes.

Reviewed By: SimonWan

Differential Revision: D53904223

fbshipit-source-id: 2e2efff745d1f1ffb65640e3c35348f07861b953
cynikolai authored and facebook-github-bot committed Feb 22, 2024
1 parent e1f3762 commit e6f463e
Showing 1 changed file with 1 addition and 1 deletion.
CybersecurityBenchmarks/benchmark/llm.py (1 addition, 1 deletion)

@@ -74,7 +74,7 @@ def _query_with_retries(
             last_exception = exception
             sleep_time = backoff_factor * (2**retry)
             time.sleep(sleep_time)
-            LOG.warning(
+            LOG.debug(
                 f"LLM Query failed with error: {exception}. Sleeping for {sleep_time} seconds..."
             )
     raise RuntimeError(
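
For context, the hunk above sits inside a retry loop that backs off exponentially between attempts; the only change in this commit is demoting the per-retry log message from warning to debug. Below is a minimal, self-contained sketch of that pattern. The loop body and log message mirror the hunk, but the function name, signature, parameter defaults, and the final error message are assumptions for illustration, not the exact code in llm.py.

import logging
import time
from typing import Callable, TypeVar

LOG: logging.Logger = logging.getLogger(__name__)

T = TypeVar("T")


def query_with_retries(
    func: Callable[..., T],
    *args: object,
    retries: int = 3,
    backoff_factor: float = 0.5,
    **kwargs: object,
) -> T:
    """Call func, retrying with exponential backoff on any exception."""
    last_exception = None
    for retry in range(retries):
        try:
            return func(*args, **kwargs)
        except Exception as exception:  # retry on any failure
            last_exception = exception
            # Exponential backoff: 0.5s, 1s, 2s, ... with the default factor.
            sleep_time = backoff_factor * (2**retry)
            time.sleep(sleep_time)
            # Demoted from warning to debug in this commit, so transient
            # per-retry failures no longer clutter benchmark output.
            LOG.debug(
                f"LLM Query failed with error: {exception}. "
                f"Sleeping for {sleep_time} seconds..."
            )
    raise RuntimeError(
        f"Unable to query LLM after {retries} retries: {last_exception}"
    )

For example, query_with_retries(call_my_llm, prompt) would retry a flaky call_my_llm up to three times, sleeping 0.5s, 1s, then 2s after each failure, and only surface per-retry failures at debug verbosity.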
