Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mixed concerns in metrics package #44

Open
Scooletz opened this issue Aug 4, 2017 · 12 comments
Open

Mixed concerns in metrics package #44

Scooletz opened this issue Aug 4, 2017 · 12 comments

Comments

@Scooletz
Copy link
Contributor

Scooletz commented Aug 4, 2017

This is issue is a result of a discussion with @danielmarbach

Problem

The NServiceBus.Metrics package mixes concerns, providing information that is not required by the package itself. The information is required by the PerformanceCounters package, but this could be addressed in another way, but not by polluting Metrics with unneeded information. Additionally, reporting to SC.Monitoring is now based on the transformed .Name of a probe, which doesn't seem to be the best way to identify a probe.

Description

Please find below details of this problem.

ProbeProperties

Below you can find a definition of a probe signalizing a successful processing of a message

[ProbeProperties(
   "# of msgs successfully processed / sec", 
   "The current number of messages processed successfully by the transport per second.")]
class MessageProcessingSuccessProbeBuilder : SignalProbeBuilder

The current schema of probe properties delivers information not the probe itself but the aggregation method and reporting period. These values clearly does not belong to this package as it knows nothing about the way the signal will be used. It might, for example, be aggregated or have its date of occurrence reported. What if, instead of aggregating per sec, we'd like to provide a total?

Reporting

The reporters for SC.Monitoring are created with the MetricType header constructed on the basis of the name with removed spaces.

 RawDataReporter CreateReporter(IDurationProbe probe)
        {
            return CreateReporter(
                w => probe.Register(v => w((long)v.TotalMilliseconds)),
                $"{probe.Name.Replace(" ", string.Empty)}",
                "LongValueOccurrence",
                (e, w) => LongValueWriter.Write(w, e));
        }

Possible solutions

Below you can find possible solutions

Breaking changes

Remove the name and the description from IProbe, leaving just Id in there. This would require following things to happen:

  • NSB.Metrics v2 would need to be released with new interfaces
  • IProbe.Id
  • ProbeProperties would turn into ProbeIdAttribute
  • NSB.Metrics.PeformanceCounters would require to include all the mappings from the newly assigned ids to names and descriptions. The identity map would need to be created and possibly break the build if no mapping exists for a probe that needs to be exposed
  • SC.Monitoring recognition of new MetricType values should be added
  • making ids better, for example, instead of # of msgs successfully processed / sec for the processing success, it should be processing-successful which clearly states what's the signal about, leaving the aggregation, etc for the consumer.

Not so breaking changes

We could use .Name as Id.

@Scooletz personal opinion: being explicit about id (yes, it would require to rerelease) is my preferable way of doing it.

@Scooletz
Copy link
Contributor Author

Scooletz commented Aug 4, 2017

Ping @Particular/metrics-maintainers

@danielmarbach
Copy link
Contributor

thanks @Scooletz for the great write-up of my initial weird thoughts!

@dvdstelt
Copy link
Member

ProbeProperties would turn into ProbeIdAttribute

Why would we still need an attribute? The attribute was previously only used to iterate over all probes to generate the script for the perfcounters.

@danielmarbach
Copy link
Contributor

You still need the attribute for the compile time inspection

@Scooletz
Copy link
Contributor Author

@danielmarbach I think that, with all the ordering/recreation/checking problems with PerformanceCounters package, @Particular/metrics-maintainers (correct me if I'm wrong) are more and more likely to drop the code generation and simply map/create performance counters by writing their data manually (name, description). If that is removed, there's no need for the attribute.

@danielmarbach
Copy link
Contributor

@Scooletz I think it would be easy to reflect over the new static const file and generate counters according to that. But maybe you are right and each counter requires manual mapping anyway. Though it would be a shame to drop and deprecate the build packages again. Maybe there is a good mapping we can come up with that works for the most counters in a generic way and then leave for those the generation on.

@Scooletz
Copy link
Contributor Author

After proposing Particular/NServiceBus.Metrics.PerformanceCounters#27 I think we can combine these approaches @danielmarbach . Let me provide a quick sketch:

  • we augment the way we create/recreate counters as in the PR above
  • we keep the map (id)->(counter name, counter description) in the NServiceBus.Metrics.PerformanceCounters
  • we change the generation to be based only for the counters from the map above

This, combined with the new approach for recreation. looks to me as the best option for both, keeping the packages truly separated and not putting too much effort in providing next counters.

@dvdstelt
Copy link
Member

I vote 👎 to keeping the generation and 👍 for a simple static script.

As I understand it, the original idea was that we generate the counters so that we could define new metrics in NServiceBus.Metrics without a new release to NServiceBus.Metrics.PerformanceCounters

I always thought that was a bad idea because now we need to make the end-user aware of the fact that if NServiceBus.Metrics gets a new release, they MIGHT need to rerun the script generation.
Besides the fact that NServiceBus.Metrics is aware of the names of the performance counters and the fact that it had to use a timer to reset the metrics to 0. With the new probes and identifiers both issues have been removed from NServiceBus.Metrics, although I'm not 100% if the counters are properly reset to 0. This is due to how they always worked and we can't suddenly change that behavior. We discussed a better behavior with new counters and those have been implemented by @Scooletz. I'm not sure we have, but if not, we really should check if everything is working as expected.

Besides that

  • How often do we add new metrics?
  • Out of those times, how often do we not need to update NServiceBus.Metrics.PerformanceCounters as well?
  • Out of those times, how often do we not need to properly inform end-users to rerun the generation script and rerun the generated script on their production servers?
  • No matter if we do generation, we still need a mapping of some metrics to performance counters for those metrics that result in 1 or 2 additional performance counters.
  • If we do a mapping of metrics identifiers to performance counters, what is the use of the generation at all, since we'll always need to update NServiceBus.Metrics.PerformanceCounters when adding a new metric to NServiceBus.Metrics?

@danielmarbach
Copy link
Contributor

I agree with @Scooletz that this is the best approach to move forward as of now but leave it up to the judgement of the maintainer group.

@tmasternak
Copy link
Member

The way I read @Scooletz comment is that we drop support for auto-detection and generation of perf counters script. This aligns with @dvdstelt comment as well.

If my understanding is correct I vote to follow this approach.

@ramonsmits
Copy link
Member

@Scooletz Nice issue about the responsibility of who defines the probe and who defines the metric as I noted prior to my holiday.

I wanted to add that we could even define a new metric in the performance counters package based on the current probes.

For example, currently we have the metric stating the successful rate but we could also add a total successful processed messages counter to the package. A new metric isn't bound specifically to a newly exposed probe.

We can also still maintain a human readable name that could be used in the downstream metric provider.

Its just that an ID would help in having a better mapping and identifies that the ID is more of less used as a key:

The mapping remains as a metric provider can have its own naming convention as Prometheus has.

although I'm not 100% if the counters are properly reset to 0. This is due to how they always worked and we can't suddenly change that behavior. We discussed a better behavior with new counters and those have been implemented by @Scooletz. I'm not sure we have, but if not, we really should check if everything is working as expected.

@dvdstelt Maybe this belongs in a separate issue?

@dvdstelt
Copy link
Member

Maybe this belongs in a separate issue?

@ramonsmits No, they need to be tested thoroughly by someone who knows how the counters worked and verify if they still work that way. Or have two versions side-by-side to check if they produce the same results. I'm not blaming anyone, because mistakes can be made and should be learned from; that's why I mention it. But the counters used to be in seconds, then changed to milliseconds and were changed back to seconds. That's what I meant with testing properly. We're now just making a bigger issue of it then neccesary. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants