Memory prediction changes #12

Merged
merged 76 commits into from
Jul 19, 2024

Conversation

Lehmann-Fabian
Member

No description provided.

friederici and others added 30 commits March 4, 2024 16:38
* collect task execution results and store them in the memory optimizer

* restructured TaskScaler

* cleanup debug logs

* cleanup Scheduler

* cleanup Task

* cleanup Task

* add hook after workflow is completed

* change MemoryOptimizer to be an interface, add two different Optimizers

* round suggestions to ceiling

* introduced LinearPredictor

* changed indentation to 4 spaces, like rest of the project uses

* fix mis-formatting

* fix typos

* initial implementation linearPredictor

* builder for observations

* added test for constant predictor

* remove wasted calculation from observation

* remove wasted calculation from observation

* add NonePredictorTest

* sanity checks for observations

* add negative case for ConstantPredictor

* assert rise and fall of suggestions

* added LinearPredictorTest

* avoid negative predictions

* use SimpleRegression for LinearPredictor

* fix naming to always be prediction, instead of suggestion

* fix naming

* fix some minor issues

* collect statistics

* removed Limits, rely solely on Requests instead

* added new CombiPredictor

* remove solved fixme

* csv export

* save statistics summary and csv into file in workflow baseDir

* added Tasks realtime to statistic, moved NfTrace reader to own utility class

* add test if trace file is missing and handling for that

* fix no resize when request was 0

* apply config only in dev profile

* statistics log execution and predictor

* log makespan

* add peak_vmem for sanity checks

* add unit tests for statistics

* added todos for missing testcases

* fix for-loop should continue, not break

* only invoke TaskScaler when config was given

* get memory predictor from config, not from environment

* removed double code

* prepare application.yml for merge

* prepare application.yml for merge

* fixed decimal separator

* fix decimal separator

* changed logging in dev profile

* improved predictor selection order

* added template for square predictor

* collect wasted in summary

* add wasted to statistics

* avoid updating tasks when no new model is available

* added new testcases

* change return value for missing file to -1

* changed sanity check

* fix constant predictor

* faster overprovisioning

* add wary predictor

* fix imports

* wary predictor

* filter realtime 0

* use vmem instead of rss

* correct tests

* require 4 successful observations

* ignore list feature

* never provide predictions lower than the lowest successful value

* prevent cws from getting stuck

* blacklist failed tasks

* removed flawed wasted from csv, added assigned node

* lower limit for request size 256MiB

* fix bad naming

* junit test for TaskScaler

* add remark for TaskScalerTest

* fix used predictor

* removed unimplemented square predictor

* removed unused generation feature from constant predictor

* removed unused generation feature

* cleanup classname

* removed testcase that is no longer in line with desired behaviour

* fixed comments

* add description to README

* add description to README

* catch exception that is thrown when InPlacePodVerticalScaling is not enabled

* add note on profiles in README

* always write log to file

* check reason for exception and improve error message, then disable task scaling

* fix comment

* fix formatting

* moved patchTaskMemory method

* add tracing note in README

* reduce loglevel

* change predictor interface to return BigDecimal

* extracted constant for lowest memory request value

* add o.taskName to log, when available
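
Several of the commits above describe guard rails around the linear prediction: at least 4 successful observations, no negative predictions, never below the lowest successful value, a 256MiB lower limit, rounding to the ceiling, and a `BigDecimal` return type. The sketch below shows one way these rules could fit together. It is not the code from this pull request: the class, method, and constant names are hypothetical, the choice of input size as the regressor is an assumption, and the least-squares fit is written by hand to stay dependency-free, whereas the commits mention `SimpleRegression`.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Optional;

// Illustrative sketch only; all names here are hypothetical.
final class MemoryPredictionSketch {

    // Values taken from the commit messages; constant names are assumptions.
    static final long LOWEST_REQUEST_BYTES = 256L * 1024 * 1024; // 256MiB
    static final int MIN_SUCCESSFUL_OBSERVATIONS = 4;

    /**
     * Predicts a memory request in bytes for nextInputSize from successful
     * observations (inputSizes[i], peakBytes[i]). Returns empty when there
     * are too few observations to fit a model.
     */
    static Optional<BigDecimal> predict(double[] inputSizes, double[] peakBytes,
                                        double nextInputSize) {
        int n = inputSizes.length;
        if (n != peakBytes.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        if (n < MIN_SUCCESSFUL_OBSERVATIONS) {
            return Optional.empty();
        }

        // Means of both series and the lowest successful peak.
        double meanX = 0, meanY = 0, lowestSuccessful = Double.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            meanX += inputSizes[i];
            meanY += peakBytes[i];
            lowestSuccessful = Math.min(lowestSuccessful, peakBytes[i]);
        }
        meanX /= n;
        meanY /= n;

        // Ordinary least squares: slope = Sxy / Sxx.
        double sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double dx = inputSizes[i] - meanX;
            sxx += dx * dx;
            sxy += dx * (peakBytes[i] - meanY);
        }
        // If all inputs are identical, fall back to a flat line at the mean.
        double slope = sxx == 0 ? 0 : sxy / sxx;
        double intercept = meanY - slope * meanX;
        double raw = intercept + slope * nextInputSize;

        // Clamp: never below the lowest successful value or the 256MiB floor.
        // This also rules out negative predictions.
        double clamped = Math.max(raw, Math.max(lowestSuccessful, LOWEST_REQUEST_BYTES));

        // Round up to whole bytes.
        return Optional.of(BigDecimal.valueOf(clamped).setScale(0, RoundingMode.CEILING));
    }

    public static void main(String[] args) {
        double mib = 1024 * 1024;
        double[] sizes = {1, 2, 3, 4};
        double[] peaks = {400 * mib, 500 * mib, 600 * mib, 700 * mib};
        System.out.println(predict(sizes, peaks, 5));
    }
}
```

With fewer than four observations the method returns an empty `Optional`, so a caller could leave the original request unchanged, in the spirit of the commit "avoid updating tasks when no new model is available".
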
Signed-off-by: Lehmann_Fabian <[email protected]>
…ory prediction and offset strategies

@Lehmann-Fabian Lehmann-Fabian merged commit 9997c2c into master Jul 19, 2024
6 checks passed
2 participants