Symptom-based alerting focuses on user-visible problems or service-impacting symptoms rather than low-level resource metrics. In Prometheus and Site Reliability Engineering (SRE) practices, alerts should signal conditions that affect users' experience (such as high latency, request failures, or service unavailability) instead of merely reflecting internal resource states.
Among the options, API latency directly represents the performance perceived by end users. If API response times increase, user satisfaction suffers immediately, and the increase indicates a possible service degradation.
In contrast, metrics like disk space, CPU usage, or database memory utilization are cause-based metrics: they may correlate with problems but do not always translate into observable user impact.
Prometheus alerting best practices recommend alerting on symptoms (via RED metrics: Rate, Errors, Duration) while using cause-based metrics for deeper investigation and diagnosis, not for immediate paging alerts.
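The split between paging on symptoms and diagnosing with causes can be illustrated with a minimal Python sketch. It is not Prometheus code: the `Request` type, the function names, and the thresholds (1% error ratio, 0.5 s p99 latency) are assumptions chosen for illustration. It derives the three RED metrics from a window of request observations and bases the paging decision on those alone.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical shape, for illustration only)."""
    duration_s: float
    status: int


def red_metrics(requests, window_s):
    """Compute Rate, Errors, and Duration (p99) for one observation window."""
    total = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_s for r in requests)
    if total:
        # Nearest-rank p99 using integer arithmetic: rank = ceil(0.99 * total).
        rank = (99 * total + 99) // 100
        p99 = durations[rank - 1]
    else:
        p99 = 0.0
    return {
        "rate": total / window_s,
        "error_ratio": errors / total if total else 0.0,
        "p99_s": p99,
    }


def should_page(red, max_error_ratio=0.01, max_p99_s=0.5):
    """Page only on user-visible symptoms; thresholds are assumed values."""
    return red["error_ratio"] > max_error_ratio or red["p99_s"] > max_p99_s


def diagnostic_context(causes, high_water=0.9):
    """Cause-based metrics (CPU, disk, memory) are listed for diagnosis only."""
    return sorted(name for name, value in causes.items() if value > high_water)
```

In this sketch, a host at 95% CPU with healthy latency and error ratio produces diagnostic context but no page, while a window in which the slowest requests push p99 above the threshold pages regardless of resource usage.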
[References: Verified from Prometheus documentation: Alerting Best Practices, Symptom vs. Cause Alerting, and RED/USE Monitoring Principles sections.]