Configuring Mesos
etc-collectd/mesos.yaml.example
etc-collectd/mesos-cli.yaml.example
etc-collectd/conf.d/mesos-exec.conf.example
etc-collectd/conf.d/mesos-plugin.conf.example
Enabling and configuring the Mesos plugin(s) is straightforward, and many settings default to acceptable values. The plugin is optional and only applies to hosts running a Mesos master or slave.
As with the Collectd collector, there are two options for running the Mesos collector.
- As a Python plugin
  - (+) More efficient use of resources.
  - (-) No ability to manipulate the host portion of the metric name.
- As an Exec plugin
  - (-) Less efficient use of resources; a process is spawned to collect metrics at each interval.
  - (+) Ability to manipulate the host portion of the metric name to assist in maintaining metric continuity.
Both the Python and Exec plugins support the same core set of configuration options. As with the CAdvisor plugin, only the file format differs between the two.
Setting | Default | Description |
---|---|---|
Host | "docker:gateway" | |
Port | - | Port for connections to the Mesos server (default is commented out, the port is set based on the role of the plugin using the Mesos defaults of master:5050 and slave:5051). |
Separator | "." | Separator character to use when using Mesos metric names as Collectd type-instances |
ConfigFile | "/etc/collectd/mesos.yaml" | Location of the main Mesos configuration file (context is the cadvisor-collectd container). |
TrackingName | "mesos.master" | Only applicable to the active Mesos master - provides a facility whereby the host portion of the metric name is consistent, regardless of which Mesos master instance is active, for master metric continuity. |
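
The port and separator rules in the table can be sketched in Python. This is a minimal illustration, not the plugin's source: `resolve_port` and `type_instance` are hypothetical helpers, and replacing `/` with the separator is an assumption about how Mesos metric names become Collectd type-instances.

```python
# Hypothetical helpers illustrating the table above; not taken from the plugin.

# Mesos default ports, as stated in the Port row of the table.
DEFAULT_PORTS = {"master": 5050, "slave": 5051}


def resolve_port(role, port=None):
    """Return the configured port, or the Mesos default for the given role."""
    if port is not None:
        return int(port)
    if role not in DEFAULT_PORTS:
        raise ValueError("unknown role: %r" % (role,))
    return DEFAULT_PORTS[role]


def type_instance(metric_name, separator="."):
    """Build a Collectd type-instance from a Mesos metric name.

    Assumption: the '/' in names such as 'master/cpus_percent' is
    replaced by the configured separator.
    """
    return metric_name.replace("/", separator)


if __name__ == "__main__":
    print(resolve_port("master"))                # Port left commented out
    print(resolve_port("slave", 15051))          # Port set explicitly
    print(type_instance("master/cpus_percent"))  # default separator
```

Leaving `Port` commented out, as in the examples on this page, corresponds to calling `resolve_port` with only the role.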
### Using the Python plugin
After copying the example configuration file, edit the copy: uncomment the block applicable to this system and update any other configuration options as needed.
cd etc-collectd/conf.d && cp mesos-plugin.conf.example mesos.conf
vi mesos.conf
Mesos Master system example
<LoadPlugin "python">
Globals true
</LoadPlugin>
<Plugin "python">
ModulePath "/opt/collectd/python/"
Import "mesos-master"
<Module "mesos-master">
Host "docker:gateway"
# Port 5050
TrackingName "mesos.master"
ConfigFile "/etc/collectd/mesos.yaml"
Separator "."
</Module>
</Plugin>
Mesos Slave system example
<LoadPlugin "python">
Globals true
</LoadPlugin>
<Plugin "python">
ModulePath "/opt/collectd/python/"
Import "mesos-slave"
<Module "mesos-slave">
Host "docker:gateway"
# Port 5051
ConfigFile "/etc/collectd/mesos.yaml"
Separator "."
</Module>
</Plugin>
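
For readers curious how a `<Module>` block reaches a Python plugin: Collectd passes the configuration callback a tree of nodes, each with `key`, `values` and `children` attributes. The sketch below shows one way such a block could be turned into settings. It is an illustration only; `Node`, `DEFAULTS` and `read_module_config` are hypothetical names, and `Node` is a stand-in so the code runs outside Collectd.

```python
from collections import namedtuple

# Stand-in for the config node Collectd hands to a Python plugin's
# configuration callback (attributes: key, values, children).
Node = namedtuple("Node", ["key", "values", "children"])

# Defaults taken from the settings table on this page.
DEFAULTS = {
    "Host": "docker:gateway",
    "Port": None,  # None means "use the Mesos default for the role"
    "Separator": ".",
    "ConfigFile": "/etc/collectd/mesos.yaml",
    "TrackingName": "mesos.master",
}


def read_module_config(config):
    """Merge the options found in a <Module> block over the defaults."""
    settings = dict(DEFAULTS)
    for child in config.children:
        if child.key in settings and child.values:
            settings[child.key] = child.values[0]
    # Collectd passes numeric option values as floats, so normalise the port.
    if settings["Port"] is not None:
        settings["Port"] = int(settings["Port"])
    return settings


if __name__ == "__main__":
    module = Node("Module", ("mesos-slave",), [
        Node("Host", ("docker:gateway",), []),
        Node("Port", (5051.0,), []),
    ])
    print(read_module_config(module))
```

Options that are commented out in `mesos.conf` simply never appear as child nodes, so the defaults remain in effect.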
### Using the Exec plugin
No changes to mesos.conf are required beyond copying the example file, which is enough to enable the plugin when Collectd (re)starts. The Exec plugin client has one additional configuration option, which must be set.
Setting | Default | Description |
---|---|---|
Role | "master" | Role of the daemon, master or slave, from which metrics will be collected. The role is implied when using the Python plugin, determined by which module is loaded. For the command line client, this must be explicitly set. |
cd etc-collectd/conf.d && cp mesos-exec.conf.example mesos.conf
cd .. && cp mesos-cli.yaml.example mesos-cli.yaml
vi mesos-cli.yaml
Mesos Master example
Role: "master"
Host: "docker:gateway"
# Port: 5050
Separator: "."
ConfigFile: "/etc/collectd/mesos.yaml"
TrackingName: "mesos.master"
Mesos Slave example
Role: "slave"
Host: "docker:gateway"
# Port: 5051
Separator: "."
ConfigFile: "/etc/collectd/mesos.yaml"
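
An Exec plugin reports values by printing `PUTVAL` lines in Collectd's plain text protocol, which is what makes it possible to rewrite the host portion of the metric name. The sketch below illustrates the idea. It is not the client's actual code: `putval_line` is a hypothetical helper, the `mesos-<role>` plugin naming is an assumption, and so is the choice to substitute `TrackingName` for the host only when the role is master.

```python
def putval_line(host, role, type_name, type_instance, value,
                interval=10, tracking_name=None):
    """Format one PUTVAL line for Collectd's plain text protocol.

    Assumption: for the master role, a TrackingName (when set) replaces
    the host portion so the metric name stays stable across failovers.
    """
    if role == "master" and tracking_name:
        host = tracking_name
    # Identifier layout: host/plugin-plugin_instance/type-type_instance
    identifier = "%s/mesos-%s/%s-%s" % (host, role, type_name, type_instance)
    # "N" asks Collectd to timestamp the value with the current time.
    return 'PUTVAL "%s" interval=%d N:%s' % (identifier, interval, value)


if __name__ == "__main__":
    print(putval_line("node1", "master", "gauge", "master.cpus_used", 2.5,
                      tracking_name="mesos.master"))
    print(putval_line("node7", "slave", "gauge", "slave.cpus_used", 1.0))
```

With a tracking name, whichever master is currently active reports under the same host label, so a graph of master metrics does not break when leadership moves.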
The main Mesos configuration file, mesos.yaml, controls the translation of Mesos metrics to Collectd metric types. It also provides a way to ignore specific Mesos metrics (informational, redundant, etc.).
cd etc-collectd && cp mesos.yaml.example mesos.yaml
vi mesos.yaml
#
# metrics definitions
#
# the basic premise is to only define metrics which
# are *not* collectd type 'gauge'. providing a more
# dynamic collection method.
#
# 1. metrics will show up when they are in /metrics/snapshot
# 2. changes to upstream metrics do not require the plugin to
# be changed, only the configuration.
#
# format:
# mesos_metric_name: collectd_type
#
# mesos_metric_name:
# - name of the metric as returned by /metrics/snapshot
#
# collectd_type:
# 1. as defined in Collectd's default types.db
# 2. as defined in mesos-types.db (custom types added for mesos)
# 3. ignore, *do not* submit metric to collectd
# (e.g. master/elected - information,
# system/mem_free_bytes, system/mem_total_bytes - redundant
# if system level metrics are already being collected by cadvisor)
#
default_metric_type: gauge
master/cpus_percent: percent
master/disk_percent: percent
master/dropped_messages: counter
master/elected: ignore
master/invalid_framework_to_executor_messages: counter
master/invalid_status_update_acknowledgements: counter
master/invalid_status_updates: counter
master/mem_percent: percent
master/messages_authenticate: counter
master/messages_deactivate_framework: counter
master/messages_exited_executor: counter
master/messages_framework_to_executor: counter
master/messages_kill_task: counter
master/messages_launch_tasks: counter
master/messages_reconcile_tasks: counter
master/messages_register_framework: counter
master/messages_register_slave: counter
master/messages_reregister_framework: counter
master/messages_reregister_slave: counter
master/messages_resource_request: counter
master/messages_revive_offers: counter
master/messages_status_update: counter
master/messages_status_update_acknowledgement: counter
master/messages_unregister_framework: counter
master/messages_unregister_slave: counter
master/recovery_slave_removals: counter
master/slave_registrations: counter
master/slave_removals: counter
master/slave_reregistrations: counter
master/tasks_failed: counter
master/tasks_finished: counter
master/tasks_killed: counter
master/tasks_lost: counter
master/uptime_secs: uptime
master/valid_framework_to_executor_messages: counter
master/valid_status_update_acknowledgements: counter
master/valid_status_updates: counter
registrar/registry_size_bytes: bytes
slave/cpus_percent: percent
slave/disk_percent: percent
slave/executors_terminated: counter
slave/invalid_framework_messages: counter
slave/invalid_status_updates: counter
slave/mem_percent: percent
slave/recovery_errors: counter
slave/tasks_failed: counter
slave/tasks_finished: counter
slave/tasks_killed: counter
slave/tasks_lost: counter
slave/valid_framework_messages: counter
slave/valid_status_updates: counter
system/cpus_total: ignore
system/load_15min: ignore
system/load_1min: ignore
system/load_5min: ignore
# set these to 'bytes' instead of 'ignore' if system level metrics
# are not already being collected by cadvisor
system/mem_free_bytes: ignore
system/mem_total_bytes: ignore
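
The lookup rules described in the comments of mesos.yaml can be sketched as follows. This is a minimal illustration, not the plugin's code: `classify` is a hypothetical helper, and the type map and snapshot are small hand-written samples standing in for the parsed YAML file and a `/metrics/snapshot` response.

```python
def classify(snapshot, type_map):
    """Return (metric_name, collectd_type, value) tuples to submit.

    Metrics mapped to 'ignore' are dropped. Metrics missing from the map
    fall back to default_metric_type ('gauge' if that key is absent).
    """
    default_type = type_map.get("default_metric_type", "gauge")
    selected = []
    for name in sorted(snapshot):
        collectd_type = type_map.get(name, default_type)
        if collectd_type == "ignore":
            continue
        selected.append((name, collectd_type, snapshot[name]))
    return selected


if __name__ == "__main__":
    # Sample entries in the same shape as mesos.yaml after parsing.
    type_map = {
        "default_metric_type": "gauge",
        "master/elected": "ignore",
        "master/tasks_failed": "counter",
        "master/uptime_secs": "uptime",
    }
    # Hand-written stand-in for a /metrics/snapshot response.
    snapshot = {
        "master/cpus_used": 2.0,
        "master/elected": 1.0,
        "master/tasks_failed": 3.0,
        "master/uptime_secs": 120.5,
    }
    for row in classify(snapshot, type_map):
        print(row)
```

Because unknown names fall through to the default type, a metric newly added upstream is collected as a gauge without any change to the plugin, and only non-gauge or unwanted metrics need an entry in the file.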