Triage and configure repeating alerts more efficiently
CONTEXT: Some alert types such as Long-running queries, Job Failure or Error Log Entries can trigger very frequently under certain circumstances.
PROBLEM: This causes a large number of alert emails to be sent, and the only way to ascertain that all these alerts are the same is to view the details screen for each of them.
• Group similar alerts in the alert inbox
• Raise alerts not on individual events, but on these events happening a specified number of times
• Mention identifying information sooner: e.g. directly in the alert inbox
• Suppress query-based alerts for specific databases (e.g. deadlocking 3rd party vendor databases)
• Have a "quick config" workflow that excludes the current lot of similar alerts without us having to write any regular expressions or other complicated config mechanisms
• When an alert configuration is applied, provide the option to close affected alerts.
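To make the first suggestion concrete, here is a minimal sketch of grouping similar alerts into one inbox row per alert type and object. The `Alert` record, its field names and the `group_alerts` helper are hypothetical and only illustrate the idea; they are not part of the product.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert record; the field names are illustrative only.
    alert_type: str
    target: str
    raised_at: datetime


def group_alerts(alerts):
    """Collapse alerts with the same type and target into one summary row."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.alert_type, alert.target)].append(alert)
    # One row per group: how many times it occurred and when it last occurred.
    return [
        {
            "alert_type": alert_type,
            "target": target,
            "count": len(items),
            "latest": max(a.raised_at for a in items),
        }
        for (alert_type, target), items in groups.items()
    ]
```

With something like this, fifty identical "Job failed" alerts on one job would show up as a single row with a count of 50, and could be opened or closed together.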
Thanks for the suggestion, it's a nice idea, and I'll be interested to see what others think (including what would be the key information people would like to see displayed for each alert type).
Tim Garcia commented
Some of the detail that is available when viewing an individual alert (i.e., clicking on the alert from the listing in the alert "inbox") can be compactly included in the listing so that it is unnecessary to click through to quickly triage the alert inbox. On my 1080P monitor, there is a large amount of wasted whitespace in that listing.
For example, for these alert types, here are the corresponding tiny bits of useful information that would be wonderful to see in the listing:
"Job duration unusual" -> "Deviation from baseline" (%)
"Blocked Process" -> "Blocking duration"
"Job failed" -> Date & time of last successful execution
Justin Whaley commented
In addition it would be nice if I could do some basic searches from the alert window. For instance, if I change a metric to not give me long running query alerts about sqlbackup, it would be nice if I could search alerts with sqlbackup in them so that I could clear them out.
A bump for this one. Looks like it's not been looked at in a while, but the issue is still a p.i.t.a!
Johnny Couture commented
Why not make it possible to disable *any* alert at the database level? It would make alerts easier to manage, for example for 3rd party databases. Thank you!
Allen Dunn commented
I meant 'order by' clicking on the column header. Like in ASP listviews or gridviews.
The filter options do not do this.
I am not asking for a way to reduce the list of items, just a way to list them in any order from the column headers.
Hi Jacob, the underlying problem seems to be that the left-hand navigation doesn't drill down all the way to the job level, so if you look at "Job duration unusual" in the alert list, you'll always see a mixture of objects.
The good news is that there is already a list of time-ordered alerts of the same type on the same object! On any alert details page, if you click the "Occurrences" tab, you will see a list of alerts on the same job, with a time stamp, linking you through to the relevant details pages. It's a bit bare, but I think it might just get the job done for you.
Hi Daniel, while what you've described provides great filter functionality, what I was after was the ability to sort alerts. I.e. I have filtered all alerts to display type: Job duration unusual. By default this orders the results by Time. But if I want to order by Object so I can view the particular times that a certain job's duration was unusual, I can't currently do that. Or is there another way to achieve what I'm after?
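To illustrate the ordering I mean, here is a minimal sketch that sorts already-filtered alert rows by Object first and Time second. The dictionary keys `object` and `time` and the `sort_alerts` helper are made up for the example.

```python
from datetime import datetime
from operator import itemgetter


def sort_alerts(rows, columns):
    """Sort alert rows (dicts) by the given column names, in priority order."""
    # itemgetter builds the sort key from the named columns of each row.
    return sorted(rows, key=itemgetter(*columns))


rows = [
    {"object": "JobB", "time": datetime(2024, 1, 1, 2)},
    {"object": "JobA", "time": datetime(2024, 1, 1, 3)},
    {"object": "JobA", "time": datetime(2024, 1, 1, 1)},
]
by_object = sort_alerts(rows, ["object", "time"])
```

Here `by_object` lists both JobA rows together in time order, followed by JobB, which is what clicking an "Object" column header would ideally do.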
I get spammed like crazy by this product. Driving me nuts! Need better filtering, especially on connection alerts.
Koen Wuyts commented
We have some jobs which run every 10 seconds. It is OK if they fail the first 6 times; only after the 7th failure would I want to be notified.
This goes for several other alerts as well, for instance "Machine Unreachable": we connect over WAN to certain servers, which means that at least 3 times a day I get a single connection error that is just due to WAN latency.
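The behaviour I have in mind could be sketched roughly as below. This assumes the failures have to be consecutive, so any success resets the count; the class name `ConsecutiveFailureGate` and its methods are hypothetical, not an existing setting.

```python
class ConsecutiveFailureGate:
    """Notify only once a run of consecutive failures reaches a threshold."""

    def __init__(self, threshold):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.failures = 0

    def record(self, succeeded):
        """Record one outcome; return True if a notification should be sent."""
        if succeeded:
            self.failures = 0  # any success resets the streak
            return False
        self.failures += 1
        # Fire exactly once, when the streak first reaches the threshold.
        return self.failures == self.threshold
```

With a threshold of 7, the first six failures stay quiet, the seventh sends one notification, and later failures in the same streak do not send another. A single WAN hiccup followed by a successful connection would never notify.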
Another vote for this. I have the same issue with a 3rd party database which I know has frequent deadlocks (50+ every night) thus 'real' deadlocks get lost in their 'noise'. Disabling per server isn't an option as I would still like to be informed of deadlocks in the other databases on the same server.
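As a rough sketch of what per-database suppression could look like: a mapping from alert type to the databases to ignore. The `should_notify` function, the mapping and the database names are invented for illustration only.

```python
def should_notify(alert_type, database, suppressed):
    """Return False when this alert type is suppressed for this database.

    `suppressed` maps an alert type to a set of database names. This
    structure is an assumption for illustration, not a product setting.
    """
    if database is None:
        # Alerts with no associated database are never suppressed here.
        return True
    return database not in suppressed.get(alert_type, set())


# Hypothetical configuration: ignore deadlocks in one vendor database only.
suppressed = {"Deadlock": {"VendorDb"}}
```

Deadlocks in `VendorDb` would be dropped, while deadlocks in every other database on the same server, and all other alert types on `VendorDb`, would still come through.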
Kevin Frazier commented
Deadlocks are one example. We need a way to distinguish third party databases from in-house ones.
Thanks for your suggestion David.
Thanks for your suggestion Joe.
Allen Dunn commented
Could we have a sort option added to the column headers of the alerts page?
It would be very useful to be able to list events by type, object or time for instance.
Allow filtering of alerts by name or description, e.g. filtering on the word "block" shows all alerts with "block" in them. Or filter on query text, etc.
Hi John, thanks for your suggestion.
You can already filter alerts per database by clicking the expand arrow next to a SQL Server instance in the left-hand "Monitored Entities" tree view, but some alert types are raised at the SQL Server instance level, and don't have a database associated with them. Did you have a particular problem scenario (or alert type) in mind with this suggestion?
John Baima commented
Sometimes we have a problem with just one customer (one db) and sometimes it is a lot of different databases. It would be nice to know that at a glance. We routinely get some errors I can safely ignore if it is database X. If it is not, then I really need to check it out. This would save me a lot of time.
Thanks for your feedback, P. Curd. We will definitely try to improve the 'Long running query' alert.