WMArchive aggregation
Based on the current schema here, we present possible aggregation metrics to collect and visualize.
- For all agents (meta_data.agent_ver, host)
  - total number of jobs run by each agent
  - total number of success/running/failed jobs
  - list of all acquisitionEra and acquisitionVersion values
  - performance metrics for each step
    - total CPU and RAM usage, average time
- Performance metrics for individual sites
  - get the list of sites from SiteDB and, for each site, the total number of success/running/failed jobs and the total CPU and RAM usage
- Total number of processed runs/lumis
- Total size of produced datasets
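To make the first group of metrics concrete, below is a minimal PySpark sketch of an aggregation job that counts jobs per agent and per job state. The HDFS paths and the exact field names (meta_data.agent_ver, meta_data.host, meta_data.jobstate) are assumptions made for illustration and must be matched to the actual WMArchive schema and data layout.

```python
# Minimal PySpark sketch of the per-agent aggregation described above.
# The record layout (meta_data.agent_ver, meta_data.host, meta_data.jobstate)
# and the HDFS paths are assumptions; adapt them to the real WMArchive schema.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="wmarchive_agent_aggregation")
sqlContext = SQLContext(sc)

# Read WMArchive FWJR records stored as Avro on HDFS
# (requires the spark-avro package on the cluster).
records = sqlContext.read.format("com.databricks.spark.avro") \
    .load("hdfs:///path/to/wmarchive/fwjr")
records.registerTempTable("fwjr")

# Count jobs per agent version, host and job state (success/running/failed).
agent_stats = sqlContext.sql("""
    SELECT meta_data.agent_ver AS agent_ver,
           meta_data.host      AS host,
           meta_data.jobstate  AS jobstate,
           COUNT(*)            AS njobs
    FROM fwjr
    GROUP BY meta_data.agent_ver, meta_data.host, meta_data.jobstate
""")

# Persist the aggregated counts as JSON documents for the web frontend.
agent_stats.toJSON().saveAsTextFile("hdfs:///path/to/wmarchive/aggregated/agents")
```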
We need to perform the following tasks to create the WMArchive aggregation framework:
- write code (MapReduce or Spark) to collect the aggregation statistics (see the sketch above)
- integrate the code into the production machinery (write and organize crontabs, schedule them on the analytics cluster, etc.)
- write code for the web frontend to visualize the data; the data should be presented in JSON format (see the sketch after this list)
- estimate the collection time, i.e. how long it will take to obtain N months of statistics
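As a sketch of the web frontend task, the snippet below shows one way the pre-aggregated results could be served as JSON over HTTP. Flask, the endpoint name and the file location are illustrative choices made for this example, not the actual WMArchive web stack.

```python
# Minimal sketch of serving pre-aggregated statistics as JSON.
# Flask, the route and the file path are illustrative assumptions,
# not the actual WMArchive service layout.
import json
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical location of the aggregated results produced by the Spark job
# (one JSON document per line).
AGG_FILE = "/data/wmarchive/aggregated/agents.json"

@app.route("/wmarchive/data/agents")
def agent_stats():
    "Return per-agent job statistics as a JSON document."
    with open(AGG_FILE) as istream:
        docs = [json.loads(line) for line in istream]
    return jsonify({"result": docs})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

A JavaScript visualization layer can then fetch this JSON endpoint and render the counts, while the estimate of the collection time can be obtained by timing the Spark job on one month of data and extrapolating to N months.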