Exchange Back Pressure and Transport Issues

Sometimes you’ll get a situation where your email will stop flowing on one of your Exchange servers. Most of the time, we’re worried about our database and log file drives becoming full, but we don’t necessarily look at the configuration of our Transport servers to see if the resources in those directories become full or taxed to the point where it causes, “Back Pressure”. Exchange has events setup to monitor when that threshold is crossed and transport functionality is hindered:

  • Event ID 15004: Increase in the utilization level for any resource (eg from Normal to Medium)
  • Event ID 15005: Decrease in the utilization level for any resource (eg from High to Medium)
  • Event ID 15006: High utilization for disk space (ie critically low free disk space)
  • Event ID 15007: High utilization for memory (ie critically low available memory)

If you think that your server is experiencing a back pressure event, you can look quickly through event viewer for these events with the following script:

In most cases, you get the 15006 event:

Event 15006, MSExchangeTransport
Microsoft Exchange Transport is rejecting message submissions because the available disk space has dropped below the configured threshold. The following resources are under pressure:
Used disk space (“C:\Microsoft\Exchange Server\V15\TransportRoles\data\Queue”)
Used disk space (“C:\Microsoft\Exchange Server\V15\TransportRoles\data”)

Overall Resources
The following components are disabled due to back pressure:
Mail resubmission from the Message Resubmission component.
Mail submission from Pickup directory
Mail submission from Replay directory
Mail resubmission from the Shadow Redundancy Component
Inbound mail submission from the Internet

Exchange uses the following formula to calculate the threshold at which these events fire:

100 * (hard disk size – fixed constant) / hard drive size

So, in order to get transport running again, get your C: Drive cleared so that back pressure is lifted off of the server and transport can run again. You should then get a 15005 Event:

Log Name: Application 
Source: MSExchangeTransport 
Date: 10/19/2017 2:21:52 PM 
Event ID: 15005 
Task Category: ResourceManager 
Level: Information 
Keywords: Classic 
User: N/A 
Computer: EX01.ldlnet.org 
Description: 
The resource pressure decreased from Medium to Low.No components disabled due to back pressure. 
The following resources are in normal state: 
Private bytes 
System memory 
Version buckets[C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue\mail.que] 
Jet Sessions[C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue\mail.que] 
Checkpoint Depth[C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue\mail.que] 
Queue database and disk space (“C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue\mail.que”) 
Used disk space (“C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue”) 
Used disk space (“C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data”) 
Overall Resources

Now that you’ve completed this, how to do setup your Exchange Environment to keep this from happening again? Well, I would pick a drive volume that you’d never have to worry about filling up, or give the transport its own drive volume. This can be accomplished with a .ps1 script that is installed in the default Scripts directory on your Exchange Server installation:

‘C:\Program Files\Microsoft\Exchange Server\Vxx\scripts’

The name of that file is Move-TransportDatabase.ps1 and it
changes the location of the transport directories, moves the Queue Database and restarts the Transport service automatically. Here is an example of how the script is executed when running Exchange PowerShell with elevated privileges and wanting to move all the services to the E: Drive:

So, that’s how you get your transport directories configured to relieve “back pressure”. In my experience, somebody was doing a PST export of a mailbox to the local C: Drive instead of a specific drive volume that wouldn’t affect the OS, Exchange, and Transport. That’s for another time though! Happy Troubleshooting!

Reference: Exchange 2016 – Back Pressure
Reference: A Guide To Back Pressure.
Reference: Change Exchange Server 2013/2016 Mail Queue Database Location

PowerShell Script to log NETLOGON Events 5719 and 5783, then test the Secure Channel to verify connectivity

In my support role, we would get nightly alerts showing disconnection to the PDC from other DCs and Exchange Servers, giving the following events:

DC02
10/31/2018 23:20:28 5719
NETLOGON
This computer was not able to set up a secure session with a domain controller in domain LDLNET due to the following:
The remote procedure call was cancelled.
This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.

ADDITIONAL INFO
If this computer is a domain controller for the specified domain, it sets up the secure session to the primary domain controller emulator in the specified domain. Otherwise, this computer sets up the
secure session to any domain controller in the specified domain.

DC03
10/31/2018 23:18:58 5783
NETLOGON
The session setup to the Windows NT or Windows 2000 Domain Controller \\DC01.LDLNET.ORG for the domain LDLNET is not responsive. The current RPC call from Netlogon on \\DC03 to \\DC01.ldlnet.org has been cancelled.

In order to validate the secure channel, you normally run the nltest command (you can also run the Test-ComputerSecureChannel PowerShell cmdlet) to verify the connectivity to the PDC on the secure channel. The scenario is though that multiple DCs or Exchange servers are having multiple events at a similar time due to a network hiccup that brought the secure channel offline between the two Servers.

Our team at the time was getting a lot of alerts generated and it was taking an inordinate amount of time to validate and test. In an effort to provide an efficient solution for this issue, I compiled a PowerShell ps1 script to first validate the events posted in the past three hours, and then secondly, test all the DCs and Exchange Servers for the Secure Channel Connectivity:

NOTE: This script needs to be run on a server that has the Exchange and Active Directory RSAT tools for PowerShell.

I can’t really put out the output since it will have customer PII, but you will see where it will list the DC/Exchange Server Name, show the events, then run the test. You can then troubleshoot from there. Also, know that the secure channel test will FAIL when run on the PDC Emulator DC. The PDC Emulator cannot run a secure channel test on itself.

Please, if you have any questions or comments, please leave some feedback! Happy Troubleshooting!