Performance Output
Data. As input data we used the same set of models as in experiment 1 (see the paper).
Setup. This validation was run on a MacBook Pro (Retina, 13-inch, Early 2015) with a 2.7 GHz Intel Core i5 CPU and was organized in three main trials: a baseline trial (0), where we selected as input the 105 conceptual models used in the previous experiment; trial (1), where we duplicated the input; and trial (2), where we triplicated it. For each trial, to further increase the processing load, we also tested the approach with three different customizations, varying the partition size parameter (14, 10, and 6) and the min frequency for the mining task (10, 5, and 2). The set-up of this trial was inspired by performance tests in related work.
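The trial structure above (3 input sizes x 3 parameter customizations) can be sketched as a small timing harness. This is an illustrative sketch only: `run_pipeline` is a hypothetical placeholder for the actual partitioning and mining entry points of the tool, and the model identifiers are dummies.

```python
import time

# Baseline input: the 105 conceptual models (placeholder identifiers).
BASELINE_MODELS = [f"model_{i}" for i in range(105)]

# Trial (0) = baseline, trial (1) = duplicated, trial (2) = triplicated.
TRIALS = {0: BASELINE_MODELS, 1: BASELINE_MODELS * 2, 2: BASELINE_MODELS * 3}

# Customizations: (partition size, min frequency) pairs used in each trial.
CONFIGS = list(zip([14, 10, 6], [10, 5, 2]))

def run_pipeline(models, partition_size, min_frequency):
    """Hypothetical stand-in for the graph partitioning + mining steps;
    returns the elapsed wall-clock time of the run."""
    start = time.perf_counter()
    # ... graph partitioning and frequent-pattern mining would run here ...
    return time.perf_counter() - start

# Time every (trial, customization) combination: 3 x 3 = 9 runs.
results = {}
for trial, models in TRIALS.items():
    for partition_size, min_freq in CONFIGS:
        elapsed = run_pipeline(models, partition_size, min_freq)
        results[(trial, partition_size, min_freq)] = elapsed

print(f"{len(results)} timed runs")
```

Each entry of `results` would correspond to one cell of the timing table reported below.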
Results. The table below summarizes the results of the second experiment, reporting the time taken by the graph partitioning and mining steps. The average execution time remains acceptable, and processing time increases when: a) the number of models increases, b) the number of nodes per partition decreases, and c) the min frequency threshold is decreased, which allows a larger number of patterns to be found.
Threats to validity. The main threat to the validity of this experiment is the limited variety of the input data: the largest model dataset was created by triplicating the original one, and the final performance can be affected by the size of individual input models. Still, the 105 models we used as a baseline present a high level of variety, with some models having more than 1000 classes. The second threat is that we tested the approach on a single device. However, we consider this a minor issue, mainly because the device we used is not one of the newest set-ups and the final performance is still acceptable.