WMArchive aggregation
Based on the current schema here, we present possible aggregation metrics to collect and visualize.
- For all agents (meta_data.agent_ver, host)
  - total number of jobs run by each agent
  - total number of success/running/failed jobs
  - list of all acquisitionEra and acquisitionVersion values
  - performance metrics for each step
    - total CPU and RAM usage, average time
- Performance metrics for individual sites
  - get the list of sites from SiteDB and, for each site, the total number of success/running/failed jobs and the total CPU and RAM usage
- Total number of processed runs/lumis
- Total size of produced datasets
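To make the first group of metrics concrete, below is a minimal PySpark sketch of an aggregation job that counts jobs per agent and per job state. The HDFS paths and the exact field names (meta_data.agent_ver, meta_data.host, meta_data.jobstate) are assumptions made for illustration and must be matched to the actual WMArchive schema and data layout.

```python
# Minimal PySpark sketch of the per-agent aggregation described above.
# The record layout (meta_data.agent_ver, meta_data.host, meta_data.jobstate)
# and the HDFS paths are assumptions; adapt them to the real WMArchive schema.
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="wmarchive_agent_aggregation")
sqlContext = SQLContext(sc)

# Read WMArchive FWJR records stored as Avro on HDFS
# (requires the spark-avro package on the cluster).
records = sqlContext.read.format("com.databricks.spark.avro") \
    .load("hdfs:///path/to/wmarchive/fwjr")
records.registerTempTable("fwjr")

# Count jobs per agent version, host and job state (success/running/failed).
agent_stats = sqlContext.sql("""
    SELECT meta_data.agent_ver AS agent_ver,
           meta_data.host      AS host,
           meta_data.jobstate  AS jobstate,
           COUNT(*)            AS njobs
    FROM fwjr
    GROUP BY meta_data.agent_ver, meta_data.host, meta_data.jobstate
""")

# Persist the aggregated counts as JSON documents for the web frontend.
agent_stats.toJSON().saveAsTextFile("hdfs:///path/to/wmarchive/aggregated/agents")
```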
We need to perform the following tasks to create the WMArchive aggregation framework:
- write code (MapReduce or Spark) to collect the aggregation statistics (see the sketch above)
- integrate the code into the production machinery (write and organize crontabs, schedule them on the analytics cluster, etc.)
- write code for the web frontend to visualize the data; the data should be presented in JSON format (see the sketch after this list)
- estimate the collection time, i.e. how long it will take to obtain N months of statistics
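As a sketch of the web frontend task, the snippet below shows one way the pre-aggregated results could be served as JSON over HTTP. Flask, the endpoint name and the file location are illustrative choices made for this example, not the actual WMArchive web stack.

```python
# Minimal sketch of serving pre-aggregated statistics as JSON.
# Flask, the route and the file path are illustrative assumptions,
# not the actual WMArchive service layout.
import json
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical location of the aggregated results produced by the Spark job
# (one JSON document per line).
AGG_FILE = "/data/wmarchive/aggregated/agents.json"

@app.route("/wmarchive/data/agents")
def agent_stats():
    "Return per-agent job statistics as a JSON document."
    with open(AGG_FILE) as istream:
        docs = [json.loads(line) for line in istream]
    return jsonify({"result": docs})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

A JavaScript visualization layer can then fetch this JSON endpoint and render the counts, while the estimate of the collection time can be obtained by timing the Spark job on one month of data and extrapolating to N months.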