GitHub Actions workflow for running Automated Tests #62
c6944cc
to
d6440bc
Compare
Related Commits / PRs
Listing some possibly related commits from the codebase history which could be helpful. Note:
|
Starting off with getting demo-automated-testing.sh to run
Docker setup process in ubuntu vs in macOS
Specific details are available here in the official list of runner images. Where does startup_tests.py come from? It isn't in this repo?
|
Timeout causing jobs to get cancelled
Found a related commit here that highlights this issue. The main reason for using a timeout is that the docker compose scripts run without the -d flag, so the process stays active in the terminal. Ignoring this for now; will have to come back to it later.
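As a rough illustration of why a time limit is needed for a foreground command that never returns, here is a minimal sketch. It assumes GNU coreutils `timeout` is available on the machine (it may not be on a stock macOS install), and `sleep 5` is only a stand-in for a `docker compose up` run without `-d`; the workflow itself may rely on a job-level timeout instead.

```shell
#!/usr/bin/env bash
# Stand-in for a foreground "docker compose up" that does not return on its own.
status=0

# GNU timeout stops the command after 1 second and exits with status 124.
# "|| status=$?" records the status without aborting a script that uses "set -e".
timeout 1 sleep 5 || status=$?

if [ "$status" -eq 124 ]; then
  echo "command was stopped by the time limit"
else
  echo "command finished with status $status"
fi
```

A status of 124 distinguishes "stopped by the limit" from a genuine failure of the command itself.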
Demo-automated-testing.sh passed for ubuntu
Here is the successful workflow run. This is the output seen in the workflow logs:
I hope this is the right indicator that the current automated tests defined by the test script have passed.
Added macOS runner to run automated tests
MacOS docker setup finally passed using the same custom github action used in this PR. Timeout increased to 60 minutes
As seen in this commit as well, when Shankari changed the timeout to 30 minutes, it was because the |
Observations for Workflow Runs with macOS
Test 1: This macOS run passed - Pytest tests passed successfully.
Test 2: For details on triggering events on PR vs non-PR (or on push) events, see this comment.
However, in this macOS run, 1 test failed in the pytest startup_tests, with error message:
This failure has been observed before, as seen here.
Test 3: Testing again by pushing changes.
Test 4: Testing again in this run, specifically to observe the macOS runner. However, yet again 1 test failed and 1 test passed.
|
Queries before moving onto demo-iso15118-2-ac-plus-ocpp.sh
1. Which version of the bash script parameters to use? Shankari’s command for the AC script was valid as per this Readme.md.
But the latest code in the EVerest/everest-demo parent repo has the new single demo script, which was added in this commit. The bash command takes different arguments to choose which demo to run.
2. Should all the command versions run in the workflow? These scripts run docker containers without the -d flag, which causes them to stay up and running.
Concerns with current process for image build push
My understanding of the requirements of this PR w.r.t. automated testing
My understanding of the process for a non-PR or normal push event
What's my concern?
Summarized concern
What should happen for latest image with latest code from PR to be tested?
Observations from actions history
This is observed even for PRs that triggered workflow multiple times since multiple pushes were made:
|
First, I don't want a giant portmanteau PR that is hard to review and merge. This PR should do one thing and do it well, which is enabling automated tests. The High level comments:
This is only partially correct. We do want to test out new images, but not all changes are to the images. Some changes are to the demo scripts as well. Note that the PR that required multiple rounds of testing from @louisg1337 did not make any changes to the images.
Obviously the most recent one. Shankari's PR was started before the script was refactored. It wasn't changed since then. We should use the currently valid version of the script. However, we are going to focus on only the automated testing script for now, so this is irrelevant for this PR.
The challenge with using
I am not sure what you mean by this. Each separate run in the matrix runs in a separate VM, so I am not sure what you mean by the "same port".
As I said above, this is necessary but not sufficient
This is expected, since we only want to push valid images to the repo.
That is not the only option. A better option is to test the locally built image
|
I am not quite sure what these changes are doing or why the image build and push has been commented out.
.github/workflows/cicd.yaml
Outdated
# - name: Checkout
#   uses: actions/checkout@v4
#   with:
#     fetch-depth: 0

# - name: Ensure Docker image version is not referencing an existing release
why is all this commented out?
Finally, I am not sure why this change was introduced:
when the goal is
Note also that, given the time and resources required to run MacOS, I don't think we should run it on every pull request. The idea behind containerization/software based testing is that if the code works in one container, it should work in all containers. This is clearly not true 100% of the time, but if it works in one setup, we know that the application logic is right, and we just have to figure out the system-level library incompatibilities. |
Just to clarify, these are not changes that I introduced. These were commits that I found in this PR by Shankari and I had just listed the relevant commits that could help better understand the task at hand.
Alright, noted. |
Based on the previous conversations, I tried a few approaches and listing them down along with reasons why I didn't use a specific approach and finally stuck with one. Notes:
Finally decided to go ahead with Approach 3. Approach 1:
Hence not moving ahead with this approach. Approach 2:
The problem with this was that:
So adding just another build job with duplicated code from the 3rd job for Build + Push wasn’t that useful Approach 3:
Notes on: How I am Building the image locally now
Exit Code
|
c3bba26
to
36d7c1a
Compare
Testing Done in my Forked Repo
Done by Triggering Workflow Runs for a Success and Failure Test Case
Temporarily changed entrypoint in docker-compose.automated-tests.yml for the purposes of this testing
Similarly triggered a separate run for failure:
Test script used for testing
Success Test Case:
Failure Test Case
|
@MukuFlash03 This is not correct. I am not sure why you are running the |
Was incorrectly using the demo-automated-testing.sh. No need to modify the docker compose directories, file names in there. As noted in this comment by Shankari: link: EVerest#62 (comment) > The demo scripts are designed to be "single line demos" that people can run without having to check out any code Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
.github/workflows/cicd.yaml
Outdated
@@ -2,16 +2,47 @@ name: cicd

on:
  pull_request:
    branches:
      - main
    branches: [ main ]
extraneous change, can be reverted
.github/workflows/cicd.yaml
Outdated
  push:
    branches:
      - main
    branches: [ main ]
ditto
demo-scripts/charging-profile-0.json
Outdated
why do we need this file?
demo-scripts/charging-profile-1.json
Outdated
or this one?
or this one. All of these seem to be from a separate change by @louisg1337
I'm honestly not sure how this happened. My PR was created after @louisg1337's PR was merged, so the changes should already have been in my branch and should not appear as commits.
On comparing the timestamps, this is the workflow run that was triggered at the same timestamp that is now applied to all my commits.
The only possibility I can think of is that I forgot to sign off some commits and then ran the rebase command as per the documentation. Perhaps that somehow messed it up.
I'll exclude the changes from this PR.
Commits History
What's weird is that all my commits now have the same timestamp of Jul 3rd, 10:08 AM.
I definitely did not make all the code changes and commits in a single day.
Workflow Runs
On looking at the workflow runs, you can see they're spread out over the past 2-3 weeks.
.github/workflows/cicd.yaml
Outdated
echo "Running docker compose up..."
docker compose --project-name everest-ac-demo \
  --file "docker-compose.automated-tests.yml" up \
  --build \
Why are we building this separately? We build the image later (https://github.com/EVerest/everest-demo/pull/62/files#diff-2c9df400afaffa9270cab617792f08cc671e5f95ae192f1cf111ef4e990a4addL97) and the build takes a long time. We should not rebuild unless we have to.
I see that you wrote a fairly long description of the options here
#62 (comment)
but I wonder if you checked the documentation for the build-push action, and the "test before push" in particular
https://github.com/docker/build-push-action?tab=readme-ov-file
Addressing in below comments.
.github/workflows/cicd.yaml
Outdated
exit_code=$?
echo "Docker-compose up exit code from manager service: $exit_code"

echo "Running docker compose down..."
docker compose --project-name everest-ac-demo \
  --file "docker-compose.automated-tests.yml" down
I don't see how this action will fail if the test fails. we will just continue and build + push anyway. Have you actually tested that the workflow (not just the docker compose) will fail if the tests fail?
I see that you linked to a workflow where the workflow failed, but I don't understand how it works. The exit code of a step is the exit code of the last statement by default. If the manager service ends and then you run another docker compose after it, the exit code should be the exit code of the docker compose down. I would appreciate an explanation of why/how this works. And I don't see why down should be needed anyway in this context, given that the entire github action environment is throwaway.
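For the case where a cleanup command does have to run after the tests, one common way to keep the test result as the step's exit code is to record it before cleanup and return it afterwards. The sketch below is an illustration only, not the code in this PR: `run_then_cleanup` is a hypothetical helper, and `exit 2` and the `echo` are stand-ins for the two docker compose invocations.

```shell
#!/usr/bin/env bash
# run_then_cleanup TEST_CMD CLEANUP_CMD (hypothetical helper)
# Runs the test command, always runs the cleanup, then returns the test status.
run_then_cleanup() {
  test_status=0
  # "|| test_status=$?" records a failure without tripping "set -e".
  bash -c "$1" || test_status=$?
  # A cleanup failure must not mask the test result.
  bash -c "$2" || true
  return "$test_status"
}

# Stand-ins: the "tests" fail with status 2, the cleanup succeeds.
result=0
run_then_cleanup 'exit 2' 'echo "compose down"' || result=$?
echo "step exit code: $result"
```

With this shape, the last statement executed no longer decides the outcome; the recorded test status does.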
Yes, you're correct about not needing docker compose down.
I might have added it when testing the updated docker compose up with the different abort and exit code flags and forgot to remove it.
Also, yes, a step's exit code is that of the last command it runs.
With regards to the working of the exit code mechanism:
- Initially, when I had incorrectly modified the demo-automated-testing.sh script, I had included the docker compose commands inside it. I needed to pass the exit code from the docker compose up command to GitHub Actions and hence used a variable, exit_code=$?, to do that.
- Now, without demo-automated-testing.sh, when I'm directly executing docker compose up inside the workflow file as a job step, the behavior is still the same.
  - In the workflow runner instance, the exit_code variable still stores the exit code of the last command executed above it, which is the docker compose up command.
Now, bringing GitHub Actions into the picture and how it determines the exit_code to look at.
Setting the exit-on-error flag for the shell: set -e or set +e (documentation: https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html)
- The -e flag instructs the shell to exit immediately if a command returns a non-zero exit code.
- The +e flag, on the other hand, allows execution to continue despite non-zero exit codes.
- In our case, we want the -e flag to be used.
GitHub Actions runners use the -e flag by default (mentioned in the accepted answer in this discussion).
To summarize:
So considering the two scenarios:
- Success (exit code 0): The workflow run continues on to the steps after docker compose up.
- Failure (non-zero exit code): The workflow run fails immediately, and subsequent steps and jobs are not executed at all.
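The two scenarios can be reproduced locally with a small sketch. It makes no claim about the real workflow: `run_tests` is a hypothetical stand-in for docker compose up that fails with status 3, and `bash -e` mimics the shell invocation shown in the runner log quoted elsewhere in this thread (`/bin/bash --noprofile --norc -e -o pipefail {0}`).

```shell
#!/usr/bin/env bash
# Hypothetical step body: a failing test command followed by more commands.
step_body='
run_tests() { return 3; }   # stand-in for: docker compose up (tests fail)
run_tests
exit_code=$?                # not reached under -e when run_tests fails
echo "cleanup"              # stand-in for: docker compose down
'

# With -e (as on the runner), the shell exits at the failing command.
with_e=0
bash -e -c "$step_body" || with_e=$?

# Without -e, execution continues and the last command decides the status.
without_e=0
bash -c "$step_body" || without_e=$?

echo "with -e: $with_e, without -e: $without_e"
```

Under `-e` the step stops with the failing command's status and "cleanup" is never printed; without it, the trailing `echo` succeeds and hides the failure, which is the situation the review comment warned about.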
docker-compose.automated-tests.yml
Outdated
you should not need any changes to this file.
Addressing in below comments.
Addressing both these comments as they are related to the image build process:
The main reason I need changes to this file is because the github actions flow I'm proposing involves two jobs:
So, I needed to build the image locally first in the I tried two scenarios where I used only the build flag (in the docker compose up command) or only the build attribute (in the docker compose file).
In both cases, I observed that the image is always pulled first and not built locally. |
What I understood from this is: Both The reason for this is that:
The priority of image source as per this above info is:
But on observing the workflow runs, I saw that:
|
Transitioning into the second comment... The reason for the image always being built is the build flag in the docker compose up command. So we do need the build flag and also build attribute to specify context containing Dockerfile. |
So yes, while the workflow does work, tests run, containers exit....there is the main problem of the long build times. Currently, both jobs build images.
The advantage of this would be that the image layers built in the 1st job will be cached and these should be available to the 2nd job when it tries to build the image, thereby making the build process faster by not building in both jobs. Found some documentation and articles on how this might be possible.
Taking a look at these next. |
@MukuFlash03 I am not sure why you need all these links. we use docker build and push. docker build and push has support for testing in the middle. I linked to the documentation on testing. I would suggest you just use it and move on |
Signed-off-by: louisg1337 <[email protected]> Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Used the existing file template from this PR: EVerest#23 -------- Currently testing only with ubuntu-latest OS. The demo scripts used are: - demo-automated-testing.sh I have commented out the existing /demo-iso15118-2-ac-plus-ocpp201.sh since as per the Readme.md it requires arguments to be passed and I am unsure which set of arguments need to be passed for the purpose of the workflow. https://github.com/EVerest/everest-demo#step-1-run-the-demo ----- Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Not running on main since I will be making code pushes to my feature-branch: automate-tests-actions. Also, created a new branch automate-tests-merge that will be used to test workflow runs whenever a PR is created. This will simulate creating a PR for merging code changes into main in the parent repo. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Trying out workflow execution for both Ubuntu and MacOS. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
The macOS job was failing as the docker setup action is not suited for macOS ARM64 based images. https://github.com/douglascamata/setup-docker-macos-action?tab=readme-ov-file#arm64-processors-m1-m2-m3-series-used-on-macos-14-images-are-unsupported Hence, now using "macos-latest-large" (macOS 14 as of this commit) image as per the official runner images here: https://github.com/actions/runner-images/tree/main ----------- Also, this failure led to that specific job failing and all other jobs being cancelled including the job for ubuntu OS due to the "fail-fast" property being set to true as default. Hence to allow other jobs to go through, setting the fail-fast property to false: - https://github.com/orgs/community/discussions/27192#discussioncomment-3254964 - https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstrategyfail-fast Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
The run failed again but not due to ARM64 architecture but probably due to macos-14 not supported by GitHub action. The GitHub action for docker setup on MacOS mentions that only macos-12 and macos-13 versions are supported: https://github.com/douglascamata/setup-docker-macos-action#currently-supported-public-runner-images Hence, trying to change macOS to version 13 now. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Previous run failed yet again at the docker-verify state with error: ``` shell: /bin/bash --noprofile --norc -e -o pipefail {0} /Users/runner/work/_temp/f2669228-43bc-4f11-9c38-82bd32c1e3b4.sh: line 1: docker: command not found Error: Process completed with exit code 127. ``` Hence printing outputs as mentioned in the action Readme.md. ----- Ah! I see the step to install from the docker action was skipped as I had missed changing the macos version to v 13 in the if condition. Let's see if it works now. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
MacOS docker setup passed finally. Increasing the timeout since the Downloading and Extracting steps themselves have taken about 20 minutes in the macOS runner. But right now even for the demo-automated-tests.sh on MacOS its taking close to 30 minutes just for the download + extraction to complete. As seen in the commit history, when Shankari changed the timeout to 30 minutes, it was because the demo-iso15118-2-ac script itself was taking close to 30 minutes. EVerest@05528d4 Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
In cicd.yaml: - Added my feature branches as well to test triggering of workflows during development. Added a TODO to remove these later. - Commented outsteps of docker-build-and-push-images job since it would fail if version TAG is unchanged from existing pushed image. - Added a POST request to trigger workflow_dispatch to e2etest.yml via GitHub REST API. In e2etest.yml: - Set the trigger to only workflow_dispatch and removed push, pull_request triggers since we want the tests to run on the latest pushed images only which will happen when first workflow completes successfully. - But also to note that the first workflow pushes image only on certain conditions. - Need to handle this as well. - For now, I've done this by checking if event type is not PR (same check done in cicd.yaml). Detailed discussion on this in PR for this development. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Previous run was able to dispatch the e2etest workflow but the macos runner didn't have all the tests pass. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
MacOS runner abruptly failed in the docker setup step just after 7 minutes. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Failure message: ``` Using variable interpolation `${{...}}` with `github` context data in a `run:` step could allow an attacker to inject their own code into the runner. ``` Found a fix which mentions using environment variables instead of directly accessing Github context variables directly in executable statements like the `run` statement: cisagov/client-cert-update#53 (comment) Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
------ Created e2etest.yaml Used the existing file template from this PR: EVerest#23 -------- Currently testing only with ubuntu-latest OS. The demo scripts used are: - demo-automated-testing.sh I have commented out the existing /demo-iso15118-2-ac-plus-ocpp201.sh since as per the Readme.md it requires arguments to be passed and I am unsure which set of arguments need to be passed for the purpose of the workflow. https://github.com/EVerest/everest-demo#step-1-run-the-demo ----- Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
…e + Added exit_code Docker compose should now use locally built image. Problem was with ${DEMO_DIR}. Directly using docker file name works: ${DEMO_COMPOSE_FILE_NAME}. Also added exit_code and abort-container so that the docker compose up command exits whenever any one container exits. In our testing case, when manager container exits. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Was incorrectly using the demo-automated-testing.sh. No need to modify the docker compose directories, file names in there. As noted in this comment by Shankari: link: EVerest#62 (comment) > The demo scripts are designed to be "single line demos" that people can run without having to check out any code Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
Using “test before push” workflow mentioned in docker build push action official documentation. Link: https://docs.docker.com/build/ci/github-actions/test-before-push/ --- While this would work, a problem is that I need to put the docker compose step in the docker build push job. This means, it would run for all matrix jobs. In our case currently manager service image takes time to build, and manager depends on mqtt service image which is built pretty quickly. So we are good here. But consider this scenario. What if a matrix job B is dependent on matrix job A but matrix job A hasn’t finished building it’s image as yet, then matrix job B might use the older image which it would have pulled and not the locally built / updated image. In this case, the tests would be run on an older image, and could falsely pass the tests. --- For instance, in this test workflow run, I wanted to test the success / failure test case and used a different test script. This is copied over in the manager/Dockerfile. But since the automated tests step is part of the same matrix job, it would run for all services and not just manager which has the entry point (run-test.sh) script for automated tests. So, mqtt service image, in this case, also executed the docker compose command, it wasn’t able to find the test script in the manager image. The reason is that while I did include the test script in the manager/Dockerfile, this image is still being built, hence latest changes not yet available. So, the matrix job for the mqtt service uses the pulled image layers from the cache for the manager image (it pulled in the first place since the manager image was used as a part of the docker compose), and this pulled image did not have the test script file. 
Hence the workflow failed saying, no such file found: https://github.com/MukuFlash03/everest-demo/actions/runs/9960438276/job/27519676597 ---- One workaround I see then is to run the automated tests conditionally only for the manager service matrix job. Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
be1ff47
to
b7b24e2
Compare
I've gone ahead and made the change to implement the test before push workflow inside the docker build push job itself. I did take a look initially and one reason I was trying an alternative method was the fact that we are dealing with matrix jobs. More details in this comment. |
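Concerns with Including Testing inside Matrix job
I've used the “test before push” workflow mentioned in the docker build push action documentation. While this would work, a problem is that I need to put the docker compose step in the docker build push job. This means it would run for all matrix jobs.
In our case currently, the manager service image takes time to build, and manager depends on the mqtt service image, which is built pretty quickly. So we are good here.
But consider this scenario: what if matrix job B is dependent on matrix job A, but matrix job A hasn’t finished building its image yet? Then matrix job B might use the older image, which it would have pulled, and not the locally built / updated image. In this case, the tests would run on an older image and could falsely pass.
For instance, in this test workflow run, I wanted to test the success / failure test case and used a different test script. This is copied over in the manager/Dockerfile. But since the automated tests step is part of the same matrix job, it would run for all services and not just manager, which has the entry point (run-test.sh) script for automated tests.
So, the matrix job for the mqtt service also executed the docker compose command, and it wasn’t able to find the test script in the manager image. The reason is that while I did include the test script in the manager/Dockerfile, this image was still being built, so the latest changes were not yet available. The mqtt matrix job therefore used the pulled image layers from the cache for the manager image (it was pulled in the first place since the manager image is used as part of the docker compose), and this pulled image did not have the test script file.
Hence the workflow failed saying, no such file found.
One workaround I see then is to run the automated tests conditionally only for the manager service matrix job.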
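The conditional workaround could be sketched as a guard inside the step's script. This is only an assumption about how it might look: `should_run_tests` is a hypothetical helper, and the loop over `mqtt` and `manager` simulates the per-service matrix jobs (in the real workflow each job would see a single service name, and a step-level condition in the workflow file would be an alternative).

```shell
#!/usr/bin/env bash
# Hypothetical guard: only the manager job carries the test entry point.
should_run_tests() {
  [ "$1" = "manager" ]
}

# Simulate the matrix: each iteration plays the role of one matrix job.
decisions=""
for service in mqtt manager; do
  if should_run_tests "$service"; then
    # The docker compose test command would go here.
    decisions="${decisions}${service}:run "
  else
    decisions="${decisions}${service}:skip "
  fi
done
echo "$decisions"
```

This avoids running the tests against a pulled manager image from a job that did not build it, at the cost of tying the test step to one service name.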
|
The previous PR commit history got messed up due to rebasing to handle the commits that had missing signoffs. Created a new PR to bring over the final changes from there. PR link: EVerest#62 Signed-off-by: Mahadik, Mukul Chandrakant <[email protected]>
The changes from this PR are now moved over to a new PR #67 This was done to handle the commit history being messed up by the rebase commands to add signoffs to commits. |
The eventual goal is to have minimal manual intervention when testing latest images.
Essentially, we want to run the automated tests on every rebuild of the image so we know if the images are good enough to merge.
There are two scripts involved in running the demo tests:
demo-iso15118-2-ac-plus-ocpp.sh
demo-automated-testing.sh
Another requirement is that we need to run tests on different operating systems, hence the GitHub Actions workflow must run on at least ubuntu (for now) and MacOS later on.
MacOS with M1 chips has faced issues in the past and could be tricky to get configured.
Initial PR by shankari.
Manual testing steps highlighted in this PR.