Memory prediction changes #12

Lehmann-Fabian · 2024-07-19T13:10:35Z

No description provided.

* collect task execution results and store them in the memory optimizer
* restructured TaskScaler
* cleanup debug logs
* cleanup Scheduler
* cleanup Task
* cleanup Task
* add hook after workflow is completed
* change MemoryOptimizer to be an interface, add two different Optimizers
* round suggestions to ceiling
* introduced LinearPredictor
* changed indentation to 4 spaces, like rest of the project uses
* fix mis-formatting
* fix typos
* initial implementation linearPredictor
* builder for observations
* added test for constant predictor
* remove wasted calculation from observation
* remove wasted calculation from observation
* add NonePredictorTest
* sanity checks for observations
* add negative case for ConstantPredictor
* assert rise and fall of suggestions
* added LinearPredictorTest
* avoid negative predictions
* use SimpleRegression for LinearPredictor
* fix naming to always be prediction, instead of suggestion
* fix naming
* fix some minor issues
* collect statistics
* removed Limits, rely solely on Requests instead
* added new CombiPredictor
* remove solved fixme
* csv export
* save statistics summary and csv into file in workflow baseDir
* added Tasks realtime to statistic, moved NfTrace reader to own utility class
* add test if trace file is missing and handling for that
* fix no resize when request was 0
* apply config only in dev profile
* statistics log execution and predictor
* log makespan
* add peak_vmem for sanity checks
* add unit tests for statistics
* added todos for missing testcases
* fix for-loop should continue, not break
* only invoke TaskScaler when config was given
* get memory predictor from config, not from environment
* removed double code
* prepare application.yml for merge
* prepare application.yml for merge
* fixed decimal separator
* fix decimal separator
* changed logging in dev profile
* improved predictor selection order
* added template for square predictor
* collect wasted in summary
* add wasted to statistics
* avoid updating tasks when no new model is available
* added new testcases
* change return value for missing file to -1
* changed sanity check
* fix constant predictor
* faster overprovisioning
* add wary predictor
* fix imports
* wary predictor
* filter realtime 0
* use vmem instead of rss
* correct tests
* require 4 successful observations
* ignore list feature
* never provide predictions lower than the lowest successful value
* prevent cws from getting stuck
* blacklist failed tasks
* removed flawed wasted from csv, added assigned node
* lower limit for request size 256MiB
* fix bad naming
* junit test for TaskScaler
* add remark for TaskScalerTest
* fix used predictor
* removed unimplemented square predictor
* removed unused generation feature from constant predictor
* removed unused generation feature
* cleanup classname
* removed testcase that is no longer in line with desired behaviour
* fixed comments
* add description to README
* add description to README
* catch exception that is thrown when InPlacePodVerticalScaling is not enabled
* add note on profiles in README
* always write log to file
* check reason for exception and improve error message, then disable task scaling
* fix comment
* fix formatting
* moved patchTaskMemory method
* add tracing note in README
* reduce loglevel
* change predictor interface to return BigDecimal
* extracted constant for lowest memory request value
* add o.taskName to log, when available

Signed-off-by: Lehmann_Fabian <[email protected]>

friederici and others added 30 commits March 4, 2024 16:38

ConstantPredictor uses the maximum value

b4bd854

Signed-off-by: Lehmann_Fabian <[email protected]>

Error instead of fallback to none

4fe6f5a

Signed-off-by: Lehmann_Fabian <[email protected]>

No need to enable tracing manually

9640125

Signed-off-by: Lehmann_Fabian <[email protected]>

NonePredictor and Statistics are now covered by Nextflow traces.

9c3d1dd

Signed-off-by: Lehmann_Fabian <[email protected]>

Remove remaining statistic references

afa5b48

Signed-off-by: Lehmann_Fabian <[email protected]>

added method to get exitCode of task

18c45ba

Signed-off-by: Lehmann_Fabian <[email protected]>

Make getExitCode public

9274f12

Signed-off-by: Lehmann_Fabian <[email protected]>

Added endpoint to receive task metrics

385df66

Signed-off-by: Lehmann_Fabian <[email protected]>

add TaskMetrics to Task

c253c90

Signed-off-by: Lehmann_Fabian <[email protected]>

Refactor MemoryPrediction

a26ded3

Signed-off-by: Lehmann_Fabian <[email protected]>

Add tests

1a77e98

Signed-off-by: Lehmann_Fabian <[email protected]>

No error for normal behaviour

7b8e261

Signed-off-by: Lehmann_Fabian <[email protected]>

fix task was finished in wrong place

b598e1a

Signed-off-by: Lehmann_Fabian <[email protected]>

Improve logging. Format bytes as GB, MB, ...

ceae8d5

Signed-off-by: Lehmann_Fabian <[email protected]>

Add validity check

08e2898

Signed-off-by: Lehmann_Fabian <[email protected]>

Store old memory values and set new directly

7285259

Signed-off-by: Lehmann_Fabian <[email protected]>

Fix tests

0c730f6

Signed-off-by: Lehmann_Fabian <[email protected]>

Remove copyTasks

8d02fd3

Signed-off-by: Lehmann_Fabian <[email protected]>

Store a task's request in a variable and not in the pod.

8fb6268

Signed-off-by: Lehmann_Fabian <[email protected]>

Patch pods if requirements changed

0aad2f6

Signed-off-by: Lehmann_Fabian <[email protected]>

Changed log level in patchTask

c81041e

Signed-off-by: Lehmann_Fabian <[email protected]>

Check if featureGate InPlaceVerticalScaling is active

798cf7a

Signed-off-by: Lehmann_Fabian <[email protected]>

Use fabric capabilities to patch tasks

5c12e95

Signed-off-by: Lehmann_Fabian <[email protected]>

Scale Task to full Mb

4c89408

Signed-off-by: Lehmann_Fabian <[email protected]>

Update tests

5a454d9

Signed-off-by: Lehmann_Fabian <[email protected]>

Use version to avoid prediction if possible

6ff81a3

Signed-off-by: Lehmann_Fabian <[email protected]>

Fix that always the memoryVersion was used

2ec2fec

Signed-off-by: Lehmann_Fabian <[email protected]>

Consider memoryscaler params

755ce88

Signed-off-by: Lehmann_Fabian <[email protected]>

Polynomial Predictor

4f27c6b

Signed-off-by: Lehmann_Fabian <[email protected]>

Lehmann-Fabian added 29 commits April 10, 2024 17:51

allow for factor using StandardDeviationOffset

62db892

Signed-off-by: Lehmann_Fabian <[email protected]>

Refactor OffsetApplier

4919d24

Signed-off-by: Lehmann_Fabian <[email protected]>

Added MeanPredictor

2bfcc89

Signed-off-by: Lehmann_Fabian <[email protected]>

change log level

8b4367d

Signed-off-by: Lehmann_Fabian <[email protected]>

introduce linear regression with unequal loss

6b96b17

Signed-off-by: Lehmann_Fabian <[email protected]>

Added new ponder predictor

388a700

Signed-off-by: Lehmann_Fabian <[email protected]>

Set a label if memory was adapted

3973477

Signed-off-by: Lehmann_Fabian <[email protected]>

implemented leastfinishedfirst

a077257

Signed-off-by: Lehmann_Fabian <[email protected]>

Added tests for prioritizing strategies

8dfcec8

Signed-off-by: Lehmann_Fabian <[email protected]>

allow to update nodes

bcff656

Signed-off-by: Lehmann_Fabian <[email protected]>

TestProcess for prioritization tests

5e32833

Signed-off-by: Lehmann_Fabian <[email protected]>

Update dependencies

8ebdc2c

Signed-off-by: Lehmann_Fabian <[email protected]>

TestTask for prioritization tests

971266a

Signed-off-by: Lehmann_Fabian <[email protected]>

lff prioritize smaller tasks

70c46d7

Signed-off-by: Lehmann_Fabian <[email protected]>

Add new prioritization strategies

8ac0c1e

Signed-off-by: Lehmann_Fabian <[email protected]>

Check edge cases

34da84a

Signed-off-by: Lehmann_Fabian <[email protected]>

add getIndependetValue to Predictor interface

ed09aa8

Signed-off-by: Lehmann_Fabian <[email protected]>

remove getIndependetValue from LinearPredictor

ebc3477

Signed-off-by: Lehmann_Fabian <[email protected]>

Added lambda to UnequalLossFunction

6c167c2

Signed-off-by: Lehmann_Fabian <[email protected]>

also allow short form for linear regression

fc919c2

Signed-off-by: Lehmann_Fabian <[email protected]>

Add weighted PonderingPredictor

cc5e634

Signed-off-by: Lehmann_Fabian <[email protected]>

Restructure

9ca798f

Signed-off-by: Lehmann_Fabian <[email protected]>

Update dependencies

e0c05cb

Signed-off-by: Lehmann_Fabian <[email protected]>

Remove strategies from tests

125dc80

Signed-off-by: Lehmann_Fabian <[email protected]>

Unify naming

a1a73eb

Signed-off-by: Lehmann_Fabian <[email protected]>

Improve code quality

08b1680

Signed-off-by: Lehmann_Fabian <[email protected]>

Unify naming

d76eba2

Signed-off-by: Lehmann_Fabian <[email protected]>

Update readme and add scheduling and node assignment strategies + memory prediction and offset strategies

8c1f783

Signed-off-by: Lehmann_Fabian <[email protected]>

Merge branch 'master' into memoryPredictionChanges

71f7aab

Signed-off-by: Lehmann_Fabian <[email protected]>

Lehmann-Fabian merged commit 9997c2c into master Jul 19, 2024
6 checks passed
