Triage and configure repeating alerts more efficiently
CONTEXT: Some alert types such as Long-running queries, Job Failure or Error Log Entries can trigger very frequently under certain circumstances.
PROBLEM: This causes a large number of alert emails to be sent, and the only way to ascetain that all these alerts are the same is to view the details screen for all of them.
• Group similar alerts in the alert inbox
• Raise alerts not on individual events, but on these events happening a specified number of times
• Mention identifying information sooner: e.g. directly in the alert inbox
• Suppress query-based alerts for specific databases (e.g. deadlocking 3rd party vendor databases)
• Have a "quick config" workflow that exludes the current lot of similar alerts without us having to write any regular expressions or other complicated config mechanisms
• When an alert configuration is applied, provide the option to close affected alerts.
P Curd commented
I have a big frustration at the moment with code that triggers the "long running query" alert and when I use "occurrences" in SQL Monitor I see ALL "long running query" alerts, not just the ones that are the same as the current alert. I think it would be much better to group the same queries together so it shows as an alert with a number of occurrences - this would allow targeting particularly bad code rather than having to sort through hundreds and hundreds of alerts trying to find which ones are new/ad-hoc and which are recurrent.
This would be a really useful addition, especially where something as noisy as Reporting Services has its database hosted on a server with other databases.
SQL2012 introduces a sluggish, blocking-prone "Maintenance Process" that operates on the SSISDB.
SSISDB is the object blocked and "SSIS ISSServerExec" is the process blocked.
Please give us database-savvy and process-name-savvy exclusion options, much like, at least for process names, those in place for Long Running Query Alerts.
Microsoft has a lot of work to do to make their SSIS maintenance and other SSIS artifacts better in SQL2012. Pending this, which is likely to a very long time, please consider these options for further reducing "noise".
ADAM, It has been almost a YEAR since your post. Any UPDATES?
Would that you could improve the Alert's "SQL Culprit". I have a UDF that is heavily used, so the snippet I see is that code, not the underlying code that used it, so I'm utterly blind and the Alert is nigh on useless. Is there ANYTHING you can do to "give me more about the context", please?
Thanks for your suggestion.
David Wimbush commented
We have a couple of databases from 3rd party apps that regularly deadlock themselves, typically trying to update tables that are undergoing housekeeping operations. I can't fix it so I don't really want to keep hearing about it. But I'd like the deadlocks to be recorded so I can still investigate if I get reports of serious issues.
Thanks for the feedback David.
It's an interesting suggestion and one that raises a further question:
Would you expect this behaviour for other alerts or just job alerts? For each job this seems straightforward, but for other alerts, If you take this to its logical conclusion you would potentially only have 1 alert for each type with a number against it?
Interested to know what would work best for you?
David Willow commented
Over the weekend I had several jobs that run every 5 minutes fail each time they ran. On Monday morning I discovered I had an alert for each failure.
It would be nice to have a single alert for each job that includes the number of failures instead of thousands of individual alerts.
I think this falls in with this request. Alerts on SQL Agent Jobs should include the job name, not jus the HEX ID. How hard could it be to run :
SELECT name FROM msdb.dbo.sysjobs_view WHERE job_id =
CAST(@JobIDVariable AS UNIQUEIDENTIFIER)
Thanks John for your feedback.
John Henderson commented
One I would especially like to see, is for Job failed alerts. I would like to be able to hover over the alert to see if it has ran successfully since the alert.
Thanks to you both for your feedback. We have plans to make some significant changes to Alert Inbox. We will update you when this happens.
Tim Garcia commented
Can I shout a big "amen, brother!" on this one? Too much clicking otherwise. :D
Andy Llewellyn commented
If I have more than one alert of the same type then it would be helpful to get more detail on the front screen.
When suppressing an alert for an instance or globally, it should be allowed to retroactively apply the new alert configuration to any uncleared alerts
Frank Coulter commented
It would be helpful if filtering or grouping of alerts based on a user defined query string was available in the Advanced Filter Option. Currenlty Long-running query does not actually show the database, and rather than having to look at each alert individually, it would be time saving option to be able to query on the SQL process fragment, so that it only shows alerts that qualify for a certain query string. For example: SQL Process Fragment like '%table5%' would only return alerts that have a SQL Process Fragment containing the text "table5" . This would allow us to clear the alerts all at once, rather than having to click into each alert and see what the fragment is. Additionally, if the alerts could be filtered by the database indicated in the details of the alert, that would be most beneficial as well.
Simon, thanks for your feedback and sincere apologies that we missed this feedback at the time - clearly it is a popular suggestion!
We have now logged this and are tracking it and I'll post here when we have any updates...
Thanks for your feedback Gareth. This has been suggested by other users also. We will update you soon on this.
Gareth Oakes commented
On the alerts tab rather than display every occurrence of every alert, group the alerts by alert type at the top level show alert type, count of alerts and last timestamp of that particular alert.
With the facility to expand and see every occurrence of that alert rather than see that by defaul as it can be quite overwhelming if you open the alerts and have a lot to work through, plus i normally work through them by type rather than in chronological order of occurrence.
Right now we get a lot of "long running querys" without the alert screen or the email mentioning what data object is causing this. When checking details the data is there (in the form of a "create procedure"), so it can't be too hard to extract this information!
This is important for two reasons:
1 (Most important) You want to have customized alert options for certain data objects! For example, we don't want alerts when a backup, index rebuild or a long running SP is running. We need a longer timeout for these alerts.
This can now be acciomplished with the global regexp exclude on particular querys on the whole database, but it doesn't work correctly and reg exps are hard and fragile.
2. When you get an email, you always want to know what is causing this