-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(ci): run benchmarks for each storage in separate workers #1536
chore(ci): run benchmarks for each storage in separate workers #1536
Conversation
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the WalkthroughThis pull request introduces a new Bash script Changes
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
thank you, i also had such an idea for a short time, but then lost sight of it again |
I faced troubles while contributing the testcontainers support in the other PR, and detected that the benchmark was failing for other reasons. I'm not sure the implications of the changes with the benchmarks website, so please feel free to take over and change what you need |
Marked it as ready to review, as the only failure now is caused by a panic in the MySQL module. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (8)
s3/init_test.go (2)
32-33
: Context handling
You have set up aBackground()
context for container orchestration. Consider if you need aWithTimeout
context to avoid potential indefinite waits in case of container runtime failures.
43-47
: Error handling
Panicking on container connection errors is acceptable for test code, but you may want to provide additional logging or uset.Fatal
for clarity..github/scripts/changed-modules.sh (6)
17-17
: Shellcheck SC2155
It's generally recommended to declare and assign variables in separate steps to avoid potential masking of return values in Bash.Here’s an example adjustment:
- readonly ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) + cd "$(dirname "${BASH_SOURCE[0]}")/../.." || exit 1 + readonly ROOT_DIR="$(pwd)"🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
27-27
: Shellcheck SC2053
Quote the string when comparing with==
to avoid unintended pattern matching.- if [[ $1 == $excluded_module ]]; then + if [[ "$1" == "$excluded_module" ]]; then🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
36-36
: Shellcheck SC2044
Usingfor
withfind
output can be fragile. Consider awhile
-based approach or-exec
.-for modFile in $(find "${ROOT_DIR}" -name "go.mod" ...); do +find "${ROOT_DIR}" -name "go.mod" ... | while read -r modFile; do ... done🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
45-45
: Shellcheck SC2207
When sorting an array in Bash,mapfile
or read loops are safer than splitting on IFS.- IFS=$'\n' modules=($(sort <<<"${modules[*]}")) + mapfile -t modules < <(printf '%s\n' "${modules[@]}" | sort)Also applies to: 52-52
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
72-72
: Regex-based array checks
Using=~
on quoted strings in Bash can result in literal matches. Consider unquoting the pattern or checking membership with a loop.🧰 Tools
🪛 Shellcheck (0.10.0)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
84-84
: Output formatting
The surrounding quotes inecho "["$(IFS=,; echo "${modified_modules[*]}" | sed 's/ /,/g')"]"
may inadvertently remove spacing. You might want to fully quote the entire echo command.🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
.github/workflows/benchmark.yml
is excluded by!**/*.yml
.github/workflows/test-s3.yml
is excluded by!**/*.yml
s3/go.mod
is excluded by!**/*.mod
s3/go.sum
is excluded by!**/*.sum
,!**/*.sum
📒 Files selected for processing (2)
.github/scripts/changed-modules.sh
(1 hunks)s3/init_test.go
(2 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
.github/scripts/changed-modules.sh
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[warning] 67-67: Variable was used as an array but is now assigned a string.
(SC2178)
[warning] 67-67: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
🔇 Additional comments (7)
s3/init_test.go (6)
4-4
: Use of context package: good practice for test container management
Leveraging thecontext
package for controlling container lifecycles is a great best practice.
9-9
: Dependency import
Importingtestcontainers-go
for spinning up MinIO containers is appropriate here and helps keep tests self-contained.
18-24
: Configurable MinIO environment variables
Defining constants for MinIO credentials and enabling override viaTEST_MINIO_IMAGE
is a robust way to parameterize tests.
27-30
: Environmental override
Smart approach to override the default MinIO image from an environment variable. This ensures flexibility across environments.
34-38
: Proper container setup
The container is configured with custom credentials. This is clean and makes the test more realistic.
51-51
: Dynamic credentials
Replacing hard-coded credentials with the newly defined constants is a good improvement for maintainability.Also applies to: 54-55
.github/scripts/changed-modules.sh (1)
57-57
: Shellcheck SC2124
${ALL_CHANGED_FILES[@]}
can collapse arrays into strings, possibly losing data. IfALL_CHANGED_FILES
is expected to be a string, consider capturing it differently.🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
Shellcheck SC2178 / SC2124 Assigning arrays to a string variable can lead to unexpected behavior.
Shellcheck SC2207 When sorting an array in Bash, mapfile or read loops are safer than splitting on IFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
.github/scripts/changed-modules.sh (4)
16-21
: Improve error handling in ROOT_DIR setupSplit the declaration and assignment to properly handle potential errors in directory resolution.
-readonly ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) +declare -r ROOT_DIR +ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) || exit 1🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
44-54
: Improve array sorting reliabilityUse mapfile for more reliable array sorting.
-IFS=$'\n' modules=($(sort <<<"${modules[*]}")) -unset IFS +mapfile -t modules < <(printf '%s\n' "${modules[@]}" | sort) -IFS=$'\n' allModules=($(sort <<<"${allModules[*]}")) -unset IFS +mapfile -t allModules < <(printf '%s\n' "${allModules[@]}" | sort)🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
80-84
: Improve output formatting reliabilityThe current output formatting might break with special characters.
-echo "["$(IFS=,; echo "${modified_modules[*]}" | sed 's/ /,/g')"]" +printf '[%s]\n' "$(IFS=,; echo "${modified_modules[*]}")"🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
1-84
: Excellent approach to CI optimization!The script effectively implements the PR's objective of running benchmarks for each storage in separate workers by:
- Intelligently detecting modified modules
- Supporting selective testing based on changes
- Enabling parallel execution of benchmarks
This will significantly improve the contribution experience by:
- Reducing CI execution time
- Isolating benchmark failures to specific modules
- Preventing unrelated test failures
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/scripts/changed-modules.sh
(1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
.github/scripts/changed-modules.sh
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 27-27: Quote the right-hand side of == in [[ ]] to prevent glob matching.
(SC2053)
[warning] 36-36: For loops over find output are fragile. Use find -exec or a while read loop.
(SC2044)
[warning] 45-45: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 52-52: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
(SC2207)
[warning] 57-57: Assigning an array to a string! Assign as array, or use * instead of @ to concatenate.
(SC2124)
[error] 72-72: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 72-72: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
[warning] 84-84: The surrounding quotes actually unquote this. Remove or escape them.
(SC2027)
[warning] 84-84: Quote this to prevent word splitting.
(SC2046)
🔇 Additional comments (1)
.github/scripts/changed-modules.sh (1)
1-15
: Well-documented script with clear test instructions!The documentation effectively covers different test scenarios with expected outputs, making it easier for contributors to validate changes.
They just went away 😂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
.github/scripts/changed-modules.sh (2)
16-24
: Improve error handling in ROOT_DIR assignmentThe current assignment could mask potential
cd
orpwd
errors.-readonly ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) +declare -r ROOT_DIR +ROOT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd) || exit 1🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
1-81
: Architectural consideration: Custom script vs third-party actionWhile third-party actions like
tj-actions/changed-files
can detect file changes, this script adds essential value by:
- Processing module-specific logic (extracting module names from paths)
- Handling excluded modules
- Formatting output specifically for matrix jobs
The custom implementation is justified as it handles use cases beyond simple file change detection.
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 62-62: Expanding an array without an index only gives the first element.
(SC2128)
[error] 69-69: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/scripts/changed-modules.sh
(1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
.github/scripts/changed-modules.sh
[warning] 17-17: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 62-62: Expanding an array without an index only gives the first element.
(SC2128)
[error] 69-69: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
⏰ Context from checks skipped due to timeout of 90000ms (15)
- GitHub Check: Compare (valkey)
- GitHub Check: Compare (sqlite3)
- GitHub Check: Compare (scylladb)
- GitHub Check: Compare (s3)
- GitHub Check: Compare (rueidis)
- GitHub Check: Compare (ristretto)
- GitHub Check: Compare (redis)
- GitHub Check: Compare (postgres)
- GitHub Check: Compare (pebble)
- GitHub Check: Compare (nats)
- GitHub Check: Compare (mongodb)
- GitHub Check: Compare (couchbase)
- GitHub Check: Compare (coherence)
- GitHub Check: Compare (clickhouse)
- GitHub Check: Analyse
🔇 Additional comments (5)
.github/scripts/changed-modules.sh (5)
1-15
: Well-documented test cases!The script includes clear examples of how to test different scenarios, which is excellent for maintainability.
25-32
: Clean and safe implementation!The
is_excluded
function follows shell scripting best practices with proper variable quoting and array iteration.
34-51
: Robust module discovery implementation!The code uses safe file processing patterns and proper array handling for sorting.
77-81
: Clean output formatting!The output formatting is well-documented and uses proper shell commands for consistent results.
52-75
:⚠️ Potential issueFix array handling and comparison logic
There are several issues in the modified files processing:
- Unsafe array iteration
- Inefficient array membership check
- Unquoted variable usage
-for file in $modified_files; do +for file in "${modified_files[@]}"; do if [[ $file == .github/* ]]; then modified_modules=("${allModules[@]}") break fi - module_name=$(echo $file | cut -d'/' -f1) + module_name=$(echo "$file" | cut -d'/' -f1) - if [[ ! ${modified_modules[@]} =~ ${module_name} ]]; then + if ! printf '%s\n' "${modified_modules[@]}" | grep -q "^\"${module_name}\"$"; then if is_excluded "$module_name"; then continue fi modified_modules+=("\"$module_name\"") fi doneLikely invalid or redundant comment.
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 62-62: Expanding an array without an index only gives the first element.
(SC2128)
[error] 69-69: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
.github/workflows/benchmark.yml
Outdated
modules_count: ${{ steps.set-modified-modules-count.outputs.modules_count }} | ||
steps: | ||
- name: Check out code into the Go module directory | ||
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the version here, as it was
steps: | ||
- name: Install MinIO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was removed, and now the tests fail @mdelapenya
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look at this today, as it should use the minio service provided by testcontainers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the error is not because of not having a dedicated minio service in the GH workflow, but this #1536 (comment)
.github/workflows/benchmark.yml
Outdated
- detect-modules | ||
strategy: | ||
matrix: | ||
module: ${{ fromJSON(needs.detect-modules.outputs.modules) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think minio for s3 benchmarks is missing here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the error is not because of not having a dedicated minio service in the GH workflow, but this #1536 (comment)
testStore = New( | ||
Config{ | ||
Bucket: bucket, | ||
Endpoint: "http://127.0.0.1:9000/", | ||
Endpoint: "http://" + conn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is not happy with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm debugging this atm, will post updates as commits, sorry about that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems an endpoint resolver is needed for the last S3 updates. IIRC I had to do the same for the S3 examples with localstack: https://docs.aws.amazon.com/sdk-for-go/v2/developer-guide/configure-endpoints.html#with-both
This is needed to avoid the BaseEndpoint to prepend the bucket name, as shown in the test errors
@@ -12,15 +15,44 @@ const ( | |||
|
|||
var testStore *Storage | |||
|
|||
const ( | |||
// minioImage is the default image used for running S3 in tests. | |||
minioImage = "docker.io/minio/minio:RELEASE.2024-08-17T01-24-54Z" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use latest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just copied the version from the current minio module. We could keep it as is, and bump the versions all together in a follow-up
@ReneWerner87 I think there is some flakiness in the benchmarks workflow that is not caused by this PR. In the latest build, it failed because of arangodb, and some other, but in previous builds it was not the case. My flair would say that there is something wrong in the client code for the given technology, but this is something I cannot confirm atm. |
Ok, will try it tomorrow locally with act |
my local tests were not successful we are currently discussing internally how to proceed |
@mdelapenya Could you update the workflow to use dorny/paths-filter instead of that shell script? There's also https://github.com/tj-actions/changed-files |
I'm currently at Gophercon SG, please let me update it once I'm back (although I'll try to find a gap in between) |
That's awesome, no rush! Enjoy your trip! |
* main: (22 commits) test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow bump pkg´s test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow test new release drafter combi workflow dynamic change detection based on the folders in the first level of the repository dynamic generation of the release drafter config for the changed folders dynamic change detection based on the folders in the first level of the repository dynamic generation of the release drafter config for the changed folders Add timeout Rename variable Simplify even more ...
Hi folks! I'm back and used the new GH action to detect changes. It's simply amazing! I'm going to use it in the testcontainers-go repo too. I ran this branch using |
oh awsome, we will check it in the next days, thx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM, I'm curious if after merged it will run all the benchmarks?
I have one concern about this too: what happens if somebody updates the benchmarks GH workflow? I'd like it to run all of them. For the release drafter PR, did it run all of them when merged? I tried to copy whatever done in that PR |
Thank you folks for this! I loved working with you simplifying the initial PR 🙇 I checked the merge commit, and it triggered the S3 benchmarks, only (as expected). Do we want to run all of them at merge time? Then I think we can add some conditions to detect a merge commit is processed and return the array with all packages instead. Anyway, I think the current state is great for the experience of contributing to one single store package! I'll continue adding testcontainers support for the rest of the packages. Cheers! |
@mdelapenya Thank you, I think I know why. The filters step should only be run during a Pull Request. |
What does this PR do?
It adds an initial job in the benchmark workflow in order to identify which modules where modified for a given build.
To detect the changes, we have created a shell script,
./github/scripts/changed-modules.sh
. This script receives one single env var as input parameter,ALL_CHANGED_FILES
, which on CI will be provided by a Github action, https://github.com/tj-actions/changed-files, that puts in there a list of the modified files, comparing the current PR changeset with the parent branch (main). We can tune this up to always check the latestmain
branch (open to discussion here).The script will compare all existing modules (looking up all the
go.mod
files of interest, no test files) with the modified files, building an array of modified modules. At the moment there is a modified file in the .github directory, all the modules will be included in the build.Finally, for running the benchmarks, we are adding a matrix for running them for the modified storage module, with the following rules:
ubuntu-latest
.UPDATE: I've added the minio module to S3, so that the benchmarks for S3 work again.
Why is this important?
Separate concerns and better troubleshoot of issues with benchmarks: at the moment, all benchmarks are executed in serie, which 1) slows down the build, 2) could cause a PR to fail if one benchmark for another module failed.
With these changes we try to improve the contribution experience when working on one module.
Summary by CodeRabbit
New Features
Chores