… services to improve down detection logic Signed-off-by: theTibi <tkorocz@gmail.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## v3 #5492 +/- ##
==========================================
- Coverage 43.62% 43.54% -0.08%
==========================================
Files 413 413
Lines 42401 42928 +527
==========================================
+ Hits 18498 18694 +196
- Misses 22144 22362 +218
- Partials 1759 1872 +113
ademidoff
approved these changes
Jun 11, 2026
Problem
MongoDB/PostgreSQL/Valkey/Redis down alerts share the flaw fixed for MySQL in PMM-14193: when the DB and its agent/exporter go down together, the *_up metric disappears, the expression has no series to evaluate, and the alert silently resolves, exactly when it should fire.
Fix
Anchor on pmm_managed_inventory_agents (emitted by pmm-managed, always present). Any enabled service not currently reporting *_up == 1 (down or missing) stays at value 1 → Alerting; a baseline 0 keeps healthy ones Normal. Keying on service_id also fixes the old on(node_name) join for multi-instance nodes.
Updated: mongodb_down, postgresql_down, valkey_down, redis_down. They use the same expression as PMM-14193; only the metric, agent_type, and labels differ.
Not changed: agent_down. Its source, pmm_managed_inventory_agents{agent_type="pmm-agent"}, flips to 0 (doesn't vanish) when the host dies, so it has no missing-series issue.
Note: redis_down and valkey_down are identical (both use redis_up/valkey_exporter, with no service_type label to split them). This is pre-existing and preserved here.
Ref: PMM-14193
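To make the difference concrete, here is a minimal Python model of the two evaluation strategies described above. It is a sketch of the semantics only, not the alert expression from this PR: the Agent class, the function names, the service IDs, and the mongodb_exporter agent type value are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical stand-in for one pmm_managed_inventory_agents series."""
    service_id: str
    agent_type: str
    enabled: bool = True


def old_style_alerts(up_samples):
    """Model of the flawed approach: only series that exist can fire.

    up_samples maps service_id -> last *_up value. A service whose
    exporter died has no entry, so nothing is evaluated for it.
    """
    return {sid for sid, value in up_samples.items() if value == 0}


def anchored_alerts(inventory, up_samples, agent_type):
    """Model of the anchored approach: start from the inventory.

    Every enabled service of the given agent type starts at 1 (alerting)
    and drops to the baseline 0 only if it currently reports up == 1.
    """
    alerts = set()
    for agent in inventory:
        if agent.agent_type != agent_type or not agent.enabled:
            continue
        value = 0 if up_samples.get(agent.service_id) == 1 else 1
        if value == 1:
            alerts.add(agent.service_id)
    return alerts


# Two services on the same node; svc-b and its exporter went down
# together, so its *_up series is missing entirely.
inventory = [
    Agent("svc-a", "mongodb_exporter"),
    Agent("svc-b", "mongodb_exporter"),
]
up_samples = {"svc-a": 1}

missed = old_style_alerts(up_samples)
caught = anchored_alerts(inventory, up_samples, "mongodb_exporter")
```

In this model the old strategy returns an empty set because there is no series left for svc-b, while the anchored strategy reports svc-b. Iterating per service_id rather than per node is also what keeps two instances on one node from being conflated.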
Ticket number: PMM-0
Feature build: SUBMODULES-0