Description
What
For precisely 24 hours, the exporter returned HTTP 500 with error messages on the /array or /objectstore metrics path (depending on the exporter version in use):
An error has occurred while serving metrics:
105 error(s) occurred:
* collected metric "purefb_buckets_s3_specific_performance_throughput_iops" { label:{name:"dimension" value:"others_per_sec"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_throughput_iops" { label:{name:"dimension" value:"read_buckets_per_sec"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_throughput_iops" { label:{name:"dimension" value:"read_objects_per_sec"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_throughput_iops" { label:{name:"dimension" value:"write_buckets_per_sec"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_throughput_iops" { label:{name:"dimension" value:"write_objects_per_sec"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_latency_usec" { label:{name:"dimension" value:"usec_per_other_op"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_latency_usec" { label:{name:"dimension" value:"usec_per_read_bucket_op"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_latency_usec" { label:{name:"dimension" value:"usec_per_read_object_op"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
* collected metric "purefb_buckets_s3_specific_performance_latency_usec" { label:{name:"dimension" value:"usec_per_write_bucket_op"} label:{name:"name" value:"bucket1"} gauge:{value:0}} was collected before with the same name and label values
[...]
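For what it's worth, the "An error has occurred while serving metrics" wrapper and the 500 status are what client_golang's promhttp handler produces when the registry's Gather() reports errors and the default error handling (HTTPErrorOnError) is in effect. A minimal sketch of that wiring (assuming the exporter uses promhttp in the usual way; this is not its actual code), including the ContinueOnError alternative that would keep serving whatever gathered cleanly:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	reg := prometheus.NewRegistry()
	// ... collectors would be registered here ...

	// With the default ErrorHandling (HTTPErrorOnError), any Gather() error --
	// including "was collected before with the same name and label values" --
	// turns the whole scrape into an HTTP 500.
	// ContinueOnError would instead serve the metrics that gathered cleanly
	// and only report the duplicates in the error log.
	handler := promhttp.HandlerFor(reg, promhttp.HandlerOpts{
		ErrorHandling: promhttp.ContinueOnError,
	})

	http.Handle("/metrics", handler)
	log.Fatal(http.ListenAndServe(":9491", nil))
}
```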
How
We were running version 1.0.12 at the time, so the problem first showed up in the /array metrics path, affecting all the metrics relevant to array health. It started around April 14, 12:50 p.m. UTC, without any change on our side.
As part of several attempts to fix the issue, we made the following changes:
- Upgraded the exporter to 1.1.3 and changed the configuration to use the new metric paths (see the scrape configuration sketch after this list)
- Increased the scrape interval to 45s for all metric paths except for /array
- Restarted the exporter multiple times
- Changed the configuration to connect to only one FlashBlade cluster per exporter (previously two)
None of those changes had any immediate effect.
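Roughly what the Prometheus side looked like after those changes (a sketch only: the job names are made up, the target reuses the host and port from the logs below, and the paths simply reuse the ones mentioned above; the exact paths and any extra query parameters expected by each exporter version may differ):

```yaml
scrape_configs:
  # Array health metrics: left at the default scrape interval.
  - job_name: 'purefb_array'
    metrics_path: /array
    static_configs:
      - targets: ['host-1:9491']

  # Bucket / object store metrics: raised to 45s as described above.
  - job_name: 'purefb_objectstore'
    metrics_path: /objectstore
    scrape_interval: 45s
    static_configs:
      - targets: ['host-1:9491']
```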
After exactly 24 hours, the problem went away on its own.

The logs didn't show anything useful either:
● pure-exporter.service - Pure Exporter
Loaded: loaded (/etc/systemd/system/pure-exporter.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2025-04-09 20:38:50 UTC; 4 days ago
Main PID: 925921 (pure-exporter)
Tasks: 36 (limit: 120666)
Memory: 27.1M
CPU: 1h 54min 22.323s
CGroup: /system.slice/pure-exporter.service
└─925921 /opt/pure_exporter/1.0.12/pure-exporter --tokens=/opt/pure_exporter/tokens.yml
Apr 09 20:38:50 host-1 systemd[1]: Started Pure Exporter.
Apr 09 20:38:50 host-1 pure-exporter[925921]: 2025/04/09 20:38:50 Start Pure FlashBlade exporter development on 0.0.0.0:9491
I cloned the project and explored the code but could not find an obvious cause. It looks as if the exporter ends up collecting the same metric (same name and label values) more than once per scrape 🤔
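To make the error above reproducible outside the exporter, here is a minimal client_golang sketch (not the exporter's actual collector; the metric name and label values are simply copied from the error output) of a Collector whose Collect emits the same name/label combination twice, which makes Gather() fail with exactly this "was collected before with the same name and label values" error:

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// dupCollector emits the same (name, label values) pair twice per Collect
// call, which is the pattern the registry is complaining about.
type dupCollector struct {
	desc *prometheus.Desc
}

func (c *dupCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.desc
}

func (c *dupCollector) Collect(ch chan<- prometheus.Metric) {
	// Sending two samples with identical label values is the bug pattern;
	// it could happen e.g. if the same bucket appears twice in the API
	// response or if the same data is collected twice within one scrape.
	for i := 0; i < 2; i++ {
		ch <- prometheus.MustNewConstMetric(
			c.desc, prometheus.GaugeValue, 0,
			"others_per_sec", "bucket1",
		)
	}
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(&dupCollector{
		desc: prometheus.NewDesc(
			"purefb_buckets_s3_specific_performance_throughput_iops",
			"example description",
			[]string{"dimension", "name"}, nil,
		),
	})

	// Gather fails with: collected metric ... was collected before
	// with the same name and label values
	if _, err := reg.Gather(); err != nil {
		fmt.Println(err)
	}
}
```

Running this prints the same kind of error as the scrape output above, so the open question is which code path makes the exporter produce duplicate samples, and why it stopped after exactly 24 hours.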