MPS Methodology is a project that proposes applying the Multiple Predictors System (MPS) to forecast time series extracted from Microservice-Based Applications (MBAs). In the literature, several works have applied time series forecasting to predict performance degradation in MBAs. However, all of these studies use a single forecast model, which increases the risk of inaccurate estimates and can lead the application to undesirable scenarios such as unavailability. MPS emerges as an alternative to this problem since it uses multiple models in the forecast. The basic idea of the MPS ensemble is to combine the strengths of different learning algorithms to build a more accurate and reliable forecasting system.
More reliable and accurate forecasting systems are essential in proactive microservice auto-scaling systems. They improve the decision-making process of these systems by estimating microservice trends more reliably while mitigating incorrect adaptations triggered by inaccurate estimates. Consequently, microservices have lower operating costs, and customer satisfaction is maintained.
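To make the ensemble idea concrete, here is a minimal sketch (not the repository's implementation; the model choices, toy data, and simple averaging rule are illustrative assumptions) of combining the forecasts of several monolithic models trained on lagged features:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR


def make_lagged(series, lags):
    """Build a supervised dataset: each row holds `lags` past values, the target is the next value."""
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    return X, series[lags:]


# Toy series standing in for a monitored metric (e.g., CPU usage).
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 500)) + rng.normal(0, 0.1, 500)

X, y = make_lagged(series, lags=10)
X_train, X_test = X[:-50], X[-50:]
y_train, y_test = y[:-50], y[-50:]

# A heterogeneous pool of monolithic models (illustrative subset of the algorithms used in the project).
pool = [
    MLPRegressor(max_iter=1000, random_state=0),
    SVR(),
    RandomForestRegressor(n_estimators=100, random_state=0),
]
for model in pool:
    model.fit(X_train, y_train)

# Combine the individual forecasts; a simple mean is used here purely for illustration.
ensemble_forecast = np.mean([m.predict(X_test) for m in pool], axis=0)
rmse = np.sqrt(np.mean((ensemble_forecast - y_test) ** 2))
print(f"Ensemble RMSE: {rmse:.4f}")
```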
$ virtualenv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
All processed results are stored in the results folder.
The description of each folder and its respective content is given below:
Folder | Content description |
---|---|
Increasing | It contains the accuracy of monolithic models in Increasing workload. |
Decreasing | It contains the accuracy of monolithic models in Decreasing workload. |
Periodic | It contains the accuracy of monolithic models in Periodic workload. |
Random | It contains the accuracy of monolithic models in Random workload. |
Series 1 | It contains the accuracy of monolithic models in Series 1 workload. |
Series 2 | It contains the accuracy of monolithic models in Series 2 workload. |
Series 3 | It contains the accuracy of monolithic models in Series 3 workload. |
Series 4 | It contains the accuracy of monolithic models in Series 4 workload. |
Summary | It contains the accuracy summary of all approaches (monolithic, homogeneous, and heterogeneous). |
Summary/better_lags | It contains the best lag for each monolithic model. |
Summary/better_acurracy | It contains the best accuracy values for each monolithic model. |
Summary/better_pool_values | It contains the MPS accuracy values. |
Summary/better_pool_values_aggregate | It contains aggregated data from better_pool_values and better_acurracy. |
Summary/pool_size_homogeneous_analisys | It contains data from the optimal bagging-size analysis for each time series. |
Multiple Metrics | It contains a summary of results on additional metrics beyond RMSE. |
DM test | It contains data from the Diebold-Mariano (DM) statistical test. |
Download the models from OneDrive and save them inside the MPS Methodology folder. Models were uploaded externally due to their size.
$ unzip models.zip
Be patient: this process can take a while, roughly 15-30 minutes per performance metric.
$ rm -r results/; mkdir results/
$ python3 generate_initial_results.py --competence_measure rmse --deployment frontend --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metrics cpu memory responsetime traffic --workloads decreasing increasing random periodic
$ python3 generate_pool_results.py --competence_measure rmse --deployment frontend --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metric cpu memory traffic responsetime --workloads decreasing increasing random periodic
For the real-world time series, you have to execute the commands once per series:
$ python3 generate_initial_results.py --competence_measure rmse --deployment microservice1 --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metrics cpu memory responsetime traffic --workloads microservice1;
$ python3 generate_pool_results.py --competence_measure rmse --deployment microservice1 --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metric cpu memory traffic responsetime --workloads microservice1;
and
$ python3 generate_initial_results.py --competence_measure rmse --deployment microservice2 --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metrics cpu memory responsetime traffic --workloads microservice2;
$ python3 generate_pool_results.py --competence_measure rmse --deployment microservice2 --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metric cpu memory traffic responsetime --workloads microservice2;
and so on.
If desired, you can generate results for a specific metric by changing the command arguments. For example, to generate results only for the memory metric, run:
$ python3 generate_initial_results.py --competence_measure rmse --deployment frontend --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metrics memory --workloads decreasing increasing random periodic;
$ python3 generate_pool_results.py --competence_measure rmse --deployment frontend --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metric memory --workloads decreasing increasing random periodic;
or
$ python3 generate_initial_results.py --competence_measure rmse --deployment microservice1 --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metrics memory --workloads microservice1;
$ python3 generate_pool_results.py --competence_measure rmse --deployment microservice1 --lags 10 20 30 40 50 60 --learning_algorithms arima lstm xgboost svr rf mlp --metric memory --workloads microservice1;
Algorithm | Hyper-parameters | Source |
---|---|---|
ARIMA | Autoarima library | - |
LSTM | batch_size: [64, 128], epochs: [1, 2, 4, 8, 10], hidden_layers: [2, 3, 4, 5, 6], learning_rate: [0.05, 0.01, 0.001] | Coulson et al. |
MLP | hidden_layer_sizes: [5, 10, 15, 20], activation: [tanh, relu, logistic], solver: [lbfgs, sgd, adam], max_iter: [100, 500, 1000, 2000, 3000], learning_rate: [constant, adaptive] | Rubak |
RF | min_samples_leaf: [1, 5, 10], min_samples_split: [2, 5, 10, 15], n_estimators: [100, 500, 1000] | Espinosa et al. |
SVR | gamma: [0.001, 0.01, 0.1, 1], kernel: [rbf, sigmoid], epsilon: [0.1, 0.001, 0.0001], C: [0.1, 1, 10, 100, 1000, 10000] | de Oliveira et al. |
XGBoost | col_sample_by_tree: [0.4, 0.6, 0.8], gamma: [1, 5, 10], learning_rate: [0.01, 0.1, 1], max_depth: [3, 6, 10], n_estimators: [100, 150, 200], reg_alpha: [0.01, 0.1, 10], reg_lambda: [0.01, 0.1, 10], subsample: [0.4, 0.6, 0.8] | Mohamed and El-Gayar |
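As an illustration of how such grids can be explored, the SVR grid above could be searched with scikit-learn's GridSearchCV roughly as follows (a sketch only; the cross-validation scheme and scoring are assumptions, not necessarily the protocol used in the experiments):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

# SVR grid taken from the table above.
param_grid = {
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf", "sigmoid"],
    "epsilon": [0.1, 0.001, 0.0001],
    "C": [0.1, 1, 10, 100, 1000, 10000],
}

# Placeholder lagged features and targets for one time series.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.normal(size=200)

# TimeSeriesSplit keeps temporal order during cross-validation (an assumption,
# not necessarily the validation protocol used in the experiments).
search = GridSearchCV(
    SVR(),
    param_grid,
    cv=TimeSeriesSplit(n_splits=3),
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)
```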
The results for selecting the homogeneous pool size are summarised here. Also, the results of the best monolithic model by dataset and its lag size are available here and here, respectively.
The following table summarises both results for synthetic series.
Time Series | Workload | Best Monolithic | Monolithic Lag | Homogeneous Pool Size |
---|---|---|---|---|
CPU Usage | Decreasing | SVR | 30 | 20 |
CPU Usage | Increasing | MLP | 30 | 40 |
CPU Usage | Periodic | SVR | 10 | 50 |
CPU Usage | Random | MLP | 10 | 90 |
Memory | Decreasing | SVR | 50 | 20 |
Memory | Increasing | SVR | 20 | 10 |
Memory | Periodic | MLP | 10 | 20 |
Memory | Random | SVR | 10 | 10 |
Response Time | Decreasing | LSTM | 20 | 30 |
Response Time | Increasing | MLP | 60 | 30 |
Response Time | Periodic | MLP | 60 | 110 |
Response Time | Random | MLP | 20 | 140 |
Traffic | Decreasing | MLP | 50 | 110 |
Traffic | Increasing | MLP | 20 | 130 |
Traffic | Periodic | SVR | 60 | 50 |
Traffic | Random | MLP | 10 | 10 |
The following table summarises both results for real-world series.
Time Series | Workload | Best Monolithic | Monolithic Lag | Homogeneous Pool Size |
---|---|---|---|---|
CPU Usage | Decreasing | SVR | 10 | 20 |
CPU Usage | Increasing | SVR | 40 | 20 |
CPU Usage | Periodic | SVR | 20 | 20 |
CPU Usage | Random | MLP | 10 | 10 |
Memory | Decreasing | SVR | 40 | 150 |
Memory | Increasing | SVR | 10 | 110 |
Memory | Periodic | MLP | 60 | 10 |
Memory | Random | SVR | 10 | 130 |
Response Time | Decreasing | SVR | 10 | 20 |
Response Time | Increasing | MLP | 50 | 30 |
Response Time | Periodic | XGBoost | 20 | 30 |
Response Time | Random | RF | 40 | 100 |
Traffic | Decreasing | RF | 20 | 30 |
Traffic | Increasing | MLP | 10 | 60 |
Traffic | Periodic | SVR | 50 | 90 |
Traffic | Random | LSTM | 20 | 30 |
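To relate the "Monolithic Lag" and "Homogeneous Pool Size" columns: a homogeneous pool contains several instances of the same learning algorithm trained on bootstrap samples (bagging) of the lagged data. The sketch below uses scikit-learn's BaggingRegressor with illustrative values borrowed from the first synthetic row (SVR, lag 30, pool size 20); the wrapper and the toy series are assumptions, not the repository's exact procedure:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.svm import SVR


def make_lagged(series, lags):
    """Turn a univariate series into (lagged features, next-value targets)."""
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    return X, series[lags:]


rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=600))  # stand-in for a monitored metric

X, y = make_lagged(series, lags=30)       # lag value mirroring the table, for illustration
X_train, X_test = X[:-60], X[-60:]
y_train, y_test = y[:-60], y[-60:]

# Homogeneous pool: 20 bootstrap-trained SVR instances (pool size mirroring the table).
pool = BaggingRegressor(SVR(), n_estimators=20, random_state=1)
pool.fit(X_train, y_train)

rmse = np.sqrt(np.mean((pool.predict(X_test) - y_test) ** 2))
print(f"Homogeneous pool RMSE: {rmse:.4f}")
```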
The time series used in the research can be found here. We also plot all time series and create a description file; a sketch of how these descriptive statistics can be computed is given after the tables below.
The following table describes the synthetic series.
Metric | Series | Trend | Stationary | Frequency | Mean | Median | Std | Size |
---|---|---|---|---|---|---|---|---|
CPU Usage | Decreasing | ✓ | ✕ | Minutes | 244.278 | 260.606 | 44.418 | 4320 |
CPU Usage | Increasing | ✓ | ✓ | Minutes | 148.470 | 160.590 | 33.122 | 4321 |
CPU Usage | Periodic | ✕ | ✓ | Minutes | 221.668 | 272.340 | 89.298 | 4322 |
CPU Usage | Random | ✕ | ✕ | Minutes | 233.277 | 237.980 | 34.262 | 4323 |
Memory | Decreasing | ✓ | ✕ | Minutes | 1.34E+08 | 1.29E+08 | 1.29E+07 | 4324 |
Memory | Increasing | ✓ | ✕ | Minutes | 8.75E+07 | 8.64E+07 | 2.35E+07 | 4325 |
Memory | Periodic | ✕ | ✓ | Minutes | 1.04E+08 | 1.04E+08 | 7.88E+06 | 4326 |
Memory | Random | ✕ | ✓ | Minutes | 9.72E+07 | 9.74E+07 | 1.92E+06 | 4327 |
Response Time | Decreasing | ✓ | ✕ | Minutes | 514.907 | 561.575 | 162.394 | 4328 |
Response Time | Increasing | ✓ | ✓ | Minutes | 557.310 | 624.800 | 194.021 | 4329 |
Response Time | Periodic | ✕ | ✓ | Minutes | 561.310 | 691.467 | 296.526 | 4330 |
Response Time | Random | ✕ | ✕ | Minutes | 476.822 | 454.164 | 150.803 | 4331 |
Traffic | Decreasing | ✓ | ✕ | Minutes | 3046.082 | 3450.782 | 1147.120 | 4332 |
Traffic | Increasing | ✓ | ✓ | Minutes | 3226.507 | 3679.959 | 1338.634 | 4333 |
Traffic | Periodic | ✕ | ✓ | Minutes | 3169.468 | 3803.333 | 1865.883 | 4334 |
Traffic | Random | ✕ | ✕ | Minutes | 2378.145 | 2132.667 | 968.229 | 4335 |
The following table describes the real-world series.
Metric | Series | Trend | Stationary | Frequency | Mean | Median | Std | Size | Communication |
---|---|---|---|---|---|---|---|---|---|
CPU Usage | 1 | ✕ | ✕ | Seconds | 0.34 | 0.33 | 0.05 | 1,426 | RI, IC, IPC |
CPU Usage | 2 | ✕ | ✓ | Seconds | 0.34 | 0.40 | 0.11 | 1,427 | RI |
CPU Usage | 3 | ✕ | ✕ | Seconds | 0.18 | 0.15 | 0.07 | 1,420 | RI, IC, IPC |
CPU Usage | 4 | ✕ | ✕ | Seconds | 0.32 | 0.31 | 0.04 | 1,421 | RI, IC |
Memory | 1 | ✓ | ✕ | Seconds | 0.53 | 0.52 | 0.04 | 1,427 | RI, IC |
Memory | 2 | ✕ | ✓ | Seconds | 0.51 | 0.50 | 0.03 | 1,426 | RI |
Memory | 3 | ✓ | ✓ | Seconds | 0.52 | 0.52 | 0.00 | 1,426 | RI, IPC |
Memory | 4 | ✕ | ✓ | Seconds | 0.45 | 0.45 | 0.02 | 1,424 | RI, IC |
Response Time | 1 | ✕ | ✕ | Minutes | 1.00 | 1.02 | 0.20 | 720 | RI |
Response Time | 2 | ✕ | ✕ | Minutes | 23.75 | 23.95 | 3.06 | 720 | RI |
Response Time | 3 | ✓ | ✕ | Minutes | 59.94 | 58.24 | 14.79 | 721 | IC |
Response Time | 4 | ✕ | ✕ | Minutes | 470.38 | 371.96 | 255.56 | 715 | IC |
Traffic | 1 | ✓ | ✕ | Minutes | 222.39 | 220.58 | 12.42 | 721 | RI |
Traffic | 2 | ✕ | ✕ | Minutes | 50.75 | 54.37 | 11.69 | 721 | RI |
Traffic | 3 | ✕ | ✕ | Minutes | 111.60 | 44.29 | 87.34 | 713 | IC |
Traffic | 4 | ✕ | ✕ | Minutes | 258.10 | 255.12 | 40.43 | 721 | IPC |
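The summary statistics and the Stationary flag in the tables above can be reproduced approximately as follows (a sketch assuming statsmodels' ADF test with a 5% significance threshold; the actual tests and thresholds behind the description file may differ):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller


def describe_series(values, name):
    """Summary statistics plus an ADF-based stationarity flag for one series."""
    s = pd.Series(values).dropna()
    _, p_value, *_ = adfuller(s)
    return {
        "series": name,
        "mean": s.mean(),
        "median": s.median(),
        "std": s.std(),
        "size": len(s),
        "stationary": p_value < 0.05,  # assumption: 5% significance level
    }


# Example with a synthetic stand-in series.
rng = np.random.default_rng(2)
print(describe_series(rng.normal(size=720), "example"))
```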