Please retain a "tombstone" that can be referenced to present a history of "Occurrences" for HIGH Alerts. Allow us an additional retention period so these don't hang around forever, but please don't cap the number of days; a smallint would be fine.
When a Monitoring Failed alert is cleared, its Occurrence disappears when DB maintenance purges "old cleared Alerts". If the retention period is short, say 14 days, the history of occurrences is soon lost.
A "tombstone" would at least keep the occurrences for trending.
Just keep the alert data in the Purge settings indefinitely. It's only the metadata of the alerts, none of the performance data, so it acts as the "tombstone" you described.
Thanks for the suggestion and for including details of your particular use case.
Due to the sometimes immense volume of data that can amass in the RG Monitor DB, one must reduce the retention period to 14 days, or even 7 if things get scary. Being hosted, we pay per GB, and like many, the $$$ aren't there to keep adding GB to satisfy the space hunger of a monitoring DB!
If I "Clear" an Alert, it is whacked after, say, 7 days. It may be "Cleared" as in, "I know what happened, have resolved it where possible, moving on..."
Normally that's enough, but if I start getting sporadic "Instance Unreachable", or "Monitoring Error" alerts outside any maintenance activities, then despite clearing them, I would like to save them to see if there's a trend or a specific time of day when these occur.
In support of this, would you allow us to tag a cleared Alert as "Retain Indefinitely" and, perhaps in a separate "folder", show when these exist (a count would be great).
In my case, the servers are hosted, so I need to monitor the failures, as they're typically of 3-minute duration. I also get some SSPI failures (DNS lookups failing). In combination, this may point to a network connectivity issue from the monitoring VM to the physical cluster, and from the cluster to the DNS servers.
Any help with longer term diagnosis is appreciated.
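In the meantime, the time-of-day trending described above can be approximated outside the product. Here is a minimal sketch that buckets occurrence timestamps by hour of day to spot clustering; the sample timestamps and the function names are purely illustrative assumptions, not an actual RG Monitor export or API.

```python
# Hedged sketch: count alert occurrences per hour of day to look for a
# time-of-day pattern. Input is any list of ISO-8601 timestamp strings,
# however you manage to extract them; this is not an RG Monitor feature.
from collections import Counter
from datetime import datetime

def occurrences_by_hour(timestamps):
    """Count occurrences per hour of day from ISO-8601 timestamp strings."""
    hours = Counter()
    for ts in timestamps:
        hours[datetime.fromisoformat(ts).hour] += 1
    return hours

def print_histogram(hours):
    # Simple text histogram: one row per hour with a bar of '#' marks.
    for hour in range(24):
        count = hours.get(hour, 0)
        print(f"{hour:02d}:00  {'#' * count}  ({count})")

if __name__ == "__main__":
    # Illustrative sample: three "Instance Unreachable" occurrences near 03:00
    # and one outlier mid-afternoon.
    sample = [
        "2024-05-01T03:12:00",
        "2024-05-02T03:40:00",
        "2024-05-03T03:05:00",
        "2024-05-03T14:55:00",
    ]
    print_histogram(occurrences_by_hour(sample))
```

A histogram like this makes a recurring 03:00 spike (say, overnight backups or DNS scavenging) obvious at a glance, which is exactly the trend the retained occurrences would feed.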