
Supporting OpenMetrics #189

Open
@dmagliola

We should support OpenMetrics. I've started this work in a temporary branch and will start opening PRs.
This issue is the master list of things to do.

This is a breaking change that we should release as 3.0.
Before that, we should release minor versions that emit deprecation warnings for the things we'll be changing.

Todo

  • Basic Formatting hygiene

    • Format negotiation with OpenMetrics Accept Header
    • Add tests to the OpenMetrics formatter I wrote
    • Emit the trailing # EOF marker
    • Emit # UNIT metadata, including validation
    • Force the _total suffix on counters. This is a breaking change, released in 3 stages:
      • Warn on counters that are not suffixed with _total
      • (next major version) Automatically append _total to every counter. For counters declared with a _total suffix, strip it and warn, in preparation for the following major version.
      • (following major version) Raise an error on counters declared with a _total suffix
  • Timestamps Output

  • Exemplars

  • New Metric types

    • StateSet
    • Info
  • Document breaking changes (UPGRADING.md)

    • If you called Metric#values or Stores#all_values, you'll get objects that carry timestamps instead of plain floats
    • Naming conventions: you must pass the unit separately and not include it in the name. You must not add _total to your counters. We'll add both automatically
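
To make the three-stage _total rollout from the Todo list concrete, here is a minimal Ruby sketch. The module name `CounterNaming`, the `normalize` method and the stage symbols (`:warn`, `:force`, `:strict`) are hypothetical names invented for illustration; nothing here is existing client API, and the warning texts are placeholders.

```ruby
# Hypothetical helper, not part of the client library: models the staged
# handling of the _total suffix on counter names.
module CounterNaming
  SUFFIX = "_total"

  # stage is one of:
  #   :warn   - stage 1, only warn when the suffix is missing
  #   :force  - stage 2, always expose name + _total, warn if the caller passed it
  #   :strict - stage 3, reject names that already carry the suffix
  # Returns [exposed_name, warning_or_nil].
  def self.normalize(name, stage:)
    name = name.to_s
    suffixed = name.end_with?(SUFFIX)

    case stage
    when :warn
      # Expose the name untouched; nudge users toward the suffix.
      warning = suffixed ? nil : "counter '#{name}' should be suffixed with #{SUFFIX}"
      [name, warning]
    when :force
      # Strip a user-supplied suffix (with a warning), then add it back ourselves.
      base = suffixed ? name.delete_suffix(SUFFIX) : name
      warning = suffixed ? "declare '#{base}' without #{SUFFIX}; it is added automatically" : nil
      [base + SUFFIX, warning]
    when :strict
      raise ArgumentError, "counter '#{name}' must not be suffixed with #{SUFFIX}" if suffixed
      [name + SUFFIX, nil]
    else
      raise ArgumentError, "unknown stage: #{stage.inspect}"
    end
  end
end

name, warning = CounterNaming.normalize("http_server_requests_total", stage: :force)
warn(warning) if warning
puts name
```

In this sketch the exposed series name is identical in stages 2 and 3; only the treatment of a user-supplied suffix changes, which is what lets stage 2 act as the deprecation period for stage 3.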

Questions:

  1. Is this the most up-to-date documentation we have?

  2. The _created child series:
    a. Is it only supported for counter, summary and histogram?
    b. Is this the time the metric got declared / added to the registry? Or is this the time the metric was first observed?
    c. Do we report one _created value per metric or per time series? In other words, if I have:

    http_server_requests_total{code="200"} 10.0 1590078339.214435
    http_server_requests_total{code="400"} 1.0 1590078339.214435
    

    Should I have:

    http_server_requests_created{code="200"} 1590078336.554018
    http_server_requests_created{code="400"} 1590078336.554018
    

    Or just: http_server_requests_created 1590078336.554018
    And if the former, do we have a separate one per Histogram Bucket?

    Note: in the documentation, all examples happen to not have labels. Moreover, trying to figure this out by experimentation, I think there may be a problem with the Python parser. For Counters, it only accepts the second example (no labels for _created). For Histograms, it only accepts "all the labels minus le". Both kind of make sense independently, but they seem like they should be mutually exclusive.

  3. Exemplars
    a. Are they only available for counter and histogram?
    b. We report one exemplar out of the many the app could have recorded. Does it have to be the latest? Can it be an "unstable" pick among the many? (The reason I ask is that if it's OK for different scrapes to report different exemplars, even if the metric hasn't changed, then we don't need to share those exemplars between processes, and we can just keep them in the Metric object itself, instead of persisting them to the files.)
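
If the answer to 3b is that any exemplar is acceptable, the in-process option could look roughly like the sketch below. It assumes exemplars live only in the memory of the process that recorded them, so which exemplar a scrape sees depends on which process serves it. `Exemplar`, `InProcessExemplars` and `format_exemplar` are hypothetical names, not existing client API, and label-value escaping is left out for brevity.

```ruby
# Hypothetical sketch: none of these names exist in the client library.
Exemplar = Struct.new(:labels, :value, :timestamp)

# Keeps the most recent exemplar per time series in process memory only,
# so nothing is written to the shared store files.
class InProcessExemplars
  def initialize
    @mutex = Mutex.new
    @by_series = {}
  end

  # series_labels identifies the time series, e.g. { code: "200" }.
  # A newer exemplar for the same series replaces the previous one.
  def record(series_labels, exemplar_labels, value, timestamp = Time.now.to_f)
    @mutex.synchronize do
      @by_series[series_labels] = Exemplar.new(exemplar_labels, value, timestamp)
    end
  end

  # Returns the stored Exemplar, or nil if this process never saw one.
  def for_series(series_labels)
    @mutex.synchronize { @by_series[series_labels] }
  end
end

# Renders the exemplar portion that follows a sample on an OpenMetrics line:
#   # {label="value"} exemplar_value timestamp
# Assumption: label values need no escaping here.
def format_exemplar(exemplar)
  labels = exemplar.labels.map { |key, val| "#{key}=\"#{val}\"" }.join(",")
  "# {#{labels}} #{exemplar.value} #{exemplar.timestamp}"
end

store = InProcessExemplars.new
store.record({ code: "200" }, { trace_id: "abc123" }, 0.67)
exemplar = store.for_series({ code: "200" })
puts "http_server_requests_total{code=\"200\"} 10.0 #{format_exemplar(exemplar)}"
```

The trade-off in this design is the one raised in the question: it avoids cross-process persistence entirely, at the cost of scrapes possibly reporting different exemplars for an unchanged metric.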

Labels: breaking-change (Issues/PRs that break API and require a major version release)
