Sometimes things are slow. Very. Slow. So slow that your email router starts spitting dreadful messages:
#9628 – An error occurred while delivering the e-mail message with subject “None of your business” in mailbox foo@bar.baz for delivery to https://crm.bar.baz. System.ServiceModel.FaultException`1 [Microsoft.Xrm.Sdk.OrganizationServiceFault]: SQL timeout expired. (Fault Detail is equal to Microsoft.Xrm.Sdk.OrganizationServiceFault).
If you are on CRM Online, contact support, not much else you can do. On premises, however, it’s a different story. The most important bit of the message is highlighted.
tl;dr
If your email router keeps generating SQL Timeout errors, and you have a large number of email records (YMMV but think millions not thousands), then try adding the following index to EmailHashBase table:
CREATE NONCLUSTERED INDEX [indx_router] ON [dbo].[EmailHashBase] ( [ActivityId] ASC, [HashType] ASC ) ON [PRIMARY]
Longer version or how to fish
SQL timeout means we can skip CRM tracing and go straight to SQL Profiler.
- Start SQL Profiler, select T-SQL template. Make sure all columns are available, set database filter to your foobar_MSCRM database only to reduce noise. If possible, also filter the user by setting a condition on the account used to run email router.
- Ready? Start the trace, restart email router service (to force the operation without waiting for the polling interval). Keep your eye on the log, the moment you get the error message, stop SQL trace
- Now you have a massive trace to work with (your definition of “massive” may vary). What I usually do is go through the trace, select and copy bunches of select statements, paste into a SQL query window and run
- Sooner of later you’ll find a statement that just hangs. In my case it was something like
WITH Candidates ( ActivityId ) AS (SELECT ActivityId FROM EmailHashBase WITH (NOLOCK) WHERE ActivityId IN (SELECT ActivityId FROM EmailHashBase WITH (NOLOCK) WHERE HashType = 0 AND Hash IN (-101500970,-295814632) GROUP BY ActivityId HAVING COUNT(ActivityId) >= 3 ) AND HashType = 1 AND Hash IN (-248594596,-742304120) GROUP BY ActivityId HAVING COUNT(ActivityId) >= 2 ) SELECT TOP 1 a.ActivityId FROM ActivityPointerBase AS a WITH (NOLOCK) WHERE a.ActivityId IN ( SELECT ActivityId FROM Candidates AS c WHERE( (SELECT COUNT(*) FROM EmailHashBase AS h WHERE h.ActivityId = c.ActivityId AND h.HashType = 0 ) <= 3 ) ) ORDER BY a.ModifiedOn DESC go
- That’s where you need to possess some SQL skills or get someone who does. Find which part is exactly slow, check the execution plan, look at where clauses and indexes, think. What was suspicious in my case is that EmailHashBase table had over 2 million records but the only helpful index was on ActivityId and HashType condition for some reason pushed SQL to an index scan.
- After adding a new combined index on ActivityId and HashType, the entire statement execution dropped from few minutes to under 1 second. And the router was happy again.
So here you have it. Measuring your SQL Server performance, identifying bottlenecks (specific to your schema and deployment) and tackling them by either throwing more hardware at the problem or by doing some troubleshooting, is a very important part of all on premises installations. And the best part of adding custom indexes is that it’s 100% supported.
Awesome stuff George!
Why are you still on the router, mate? Server-side sync all the way, hybrid scenarios are now fully supported.
Switch over, but still useful to know for some “conservative” clients 🙂
Every now and then, we run into this issue for this query running very slow. I already had the Index in place as suggested but it did not help (It fixed the issue previously). All indexes and statistics were up to date.
After creating the below index, the query immediately started running better from 30+ seconds to less than 1 sec.
CREATE NONCLUSTERED INDEX [IX_email_router] ON [dbo].[EmailHashBase]
(
[ActivityId] ASC,
[HashType] ASC,
[Hash] ASC –Added
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO