How to monitor LDAP, NTLM and Kerberos traffic to your domain controllers?
Troubleshooting high LSASS CPU?
Root cause:
LSASS CPU peaks can have several root causes:
- Circular nested groups in the domain (script to identify them => https://gallery.technet.microsoft.com/scriptcenter/fa4ccf4f-712e-459c-88b4-aacdb03a08d0 )
- Removal of cipher protocols on DCs (=> use IIS Crypto from Nartac Software to remediate)
- Malformed LDAP queries from applications (Linux/Unix/Java based)
- LDAP configuration problems in applications (Linux/Java based)
- conf, sssd.conf, … config problems on Linux/Unix
- Local antivirus on the domain controllers not configured to exclude DC folders and files (NTDS, Sysvol…)
- Centrify configuration settings not optimized on Linux/Unix
- Centrify ZPA not well configured
- VMware vCenter not well integrated with a Windows domain
- VMware ESX not well integrated with a Windows domain
- Storage appliances not well integrated with a Windows domain
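For the first cause in the list, circular nesting can be checked offline once group memberships have been exported. The following is only a minimal sketch: the `nested_members` dictionary and the function name are hypothetical, and the sample data is made up. It walks the nesting graph depth-first and reports each loop it closes.

```python
def find_group_cycles(nested_members):
    """Return one list of group names for each circular nesting found.

    nested_members maps a group name to the groups that are direct
    members of it (hypothetical input, e.g. built from an AD export).
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    cycles = []

    def visit(group, path):
        state[group] = IN_PROGRESS
        path.append(group)
        for child in nested_members.get(group, ()):
            child_state = state.get(child, UNSEEN)
            if child_state == IN_PROGRESS:
                # child is already on the current path: the nesting loops back
                cycles.append(path[path.index(child):] + [child])
            elif child_state == UNSEEN:
                visit(child, path)
        path.pop()
        state[group] = DONE

    for group in nested_members:
        if state.get(group, UNSEEN) == UNSEEN:
            visit(group, [])
    return cycles


# Made-up example: A contains B, B contains C, C contains A again.
sample = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
cycles = find_group_cycles(sample)
print(cycles)
```

The recursion is fine for typical nesting depths; a very deep hierarchy would need an iterative walk instead.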
DC fails logons or experiences LDAP timeouts:
http://blogs.msdn.com/b/spatdsg/archive/2011/06/24/dc-fails-logons-or-experiences-ldap-timeouts.aspx
Tips and tricks:
- Identify missing subnets and add them in dssite.msc (Active Directory Sites and Services)
- Add more CPU and RAM on domain controllers
- Move to Windows Server 2012 R2 domain controllers
- Disable NetBIOS on the DC. This may not be an option for everyone, in which case correct site/subnet mapping or DNS name resolution should also fix this kind of issue.
- Educate developers to perform the right LDAP queries
- Configure client applications properly (LDAP filters)
- At one customer, we saw the LDAP ATQ threads depleted by a high volume of LDAP clients using NTLM for authentication. These clients overloaded the Netlogon service, which ran into the MaxConcurrentApi bottleneck.
- By default, 4 threads per processor are allocated to the LDAP thread pool. This can be changed via the LDAP policies, specifically MaxPoolThreads (the maximum number of threads per processor that the domain controller creates for query execution). Set it to 8 per processor.
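To make the MaxPoolThreads arithmetic concrete, here is a small worked example. The function name and the 8-processor DC are assumptions chosen for illustration; only the per-processor figures (4 by default, 8 suggested) come from the tip above.

```python
def ldap_pool_threads(processors, threads_per_processor=4):
    """Total LDAP pool threads for a DC, given the per-processor setting."""
    return processors * threads_per_processor


# Hypothetical DC with 8 logical processors.
default_pool = ldap_pool_threads(8)      # default: 4 per processor
raised_pool = ldap_pool_threads(8, 8)    # MaxPoolThreads raised to 8 per processor
print(default_pool, raised_pool)
```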
Enable LDAP query logging using NTDS diagnostic values:
http://technet.microsoft.com/en-us/library/cc961809.aspx
with PowerShell script: https://gallery.technet.microsoft.com/scriptcenter/Event-1644-reader-Export-45205268
http://www.frickelsoft.net/blog/?p=243
Basically, you want to set the following registry values:
Path: HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics\15 Field Engineering
Type: DWORD
Value: 5
Path: HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Expensive Search Results Threshold
Type: DWORD
Value: 10000 (decimal – default value)
Path: HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Inefficient Search Results Threshold
Type: DWORD
Value: 1000 (decimal – default value)
Path: HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Search Time Threshold (msecs)
Type: DWORD
Value: 30000 (decimal – default value in milliseconds)
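One way to apply the same values consistently is to generate a .reg file and import it on a test DC first. The sketch below only builds the file text; the `SETTINGS` layout and the function name are hypothetical. The search-time value name is written with the "(msecs)" suffix used in Microsoft's KB article, which is worth verifying on your own DC before importing anything.

```python
NTDS = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS"

# Key path -> {value name: decimal DWORD value}, taken from the list above.
SETTINGS = {
    NTDS + r"\Diagnostics": {"15 Field Engineering": 5},
    NTDS + r"\Parameters": {
        "Expensive Search Results Threshold": 10000,
        "Inefficient Search Results Threshold": 1000,
        # Assumption: value name as documented by Microsoft, verify before use.
        "Search Time Threshold (msecs)": 30000,
    },
}


def build_reg_file(settings):
    """Return .reg file text; DWORDs are written as 8 hex digits."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, values in settings.items():
        lines.append(f"[{key}]")
        for name, number in values.items():
            lines.append(f'"{name}"=dword:{number:08x}')
        lines.append("")
    return "\r\n".join(lines)


reg_text = build_reg_file(SETTINGS)
print(reg_text)
```

Generating text rather than writing to the registry directly keeps the change reviewable before it touches a domain controller.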
Explanations:
Expensive LDAP calls are searches that visit a large number of entries. The default threshold is 10,000: an LDAP call that visits 10,000 or more entries is considered expensive. Once you find such a call in the logs, you can work out how to optimize it. For example, the query (displayName=*John*) against the root domain container visits every object in the domain that has a value in the displayName attribute, so it is considered expensive if 10,000 or more objects have displayName populated.
Inefficient LDAP calls are searches that return less than 10% of the entries they visit. For example, a query that visits 10,000 entries in Active Directory but returns only 100 is considered inefficient, because the returned entries are less than 10% of the visited entries. The default visited-entries threshold for an inefficient query is 1,000: a query that visits fewer than 1,000 entries is not considered inefficient, even if it returns no entries.
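The two definitions above can be written as a small classifier. This is a sketch of the rules as described here, not of the DC's internal logic; the function and parameter names are hypothetical, and the defaults are the thresholds quoted above.

```python
def classify_search(visited, returned,
                    expensive_threshold=10000,
                    inefficient_threshold=1000):
    """Label a search from its visited and returned entry counts."""
    labels = []
    # Expensive: visits at least the expensive threshold.
    if visited >= expensive_threshold:
        labels.append("expensive")
    # Inefficient: visits at least the inefficient threshold
    # and returns less than 10% of what it visited.
    if visited >= inefficient_threshold and returned < 0.10 * visited:
        labels.append("inefficient")
    return labels


print(classify_search(10000, 100))   # the example from the text
print(classify_search(900, 0))       # below the 1,000 visited-entries floor
```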
Search Time Threshold is available only on a 2012 R2 DC, or after you install KB 2800945 on Server 2012, Server 2008 R2 or Server 2008 domain controllers. The default value is 30000 milliseconds (30 seconds), which is too long; I recommend setting it to 5000 (5 seconds).
These registry changes do not require a reboot but are set per server, so rolling them out to an entire forest/domain is best done via Group Policy Preferences. Once set, the resulting logs appear in the Directory Service event log on the DC. They are not exactly parse-friendly but can be wrangled with some regex. The best part is that this requires no external utilities or code. Because the logging is very verbose, don’t forget to remove those values after the audit phase.
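As a sketch of the regex wrangling mentioned above, the snippet below pulls label/value pairs out of an event message. The sample text is illustrative only: it assumes a layout where each label sits on its own line followed by its value on the next line, and the labels themselves are assumptions to compare against a real event from your Directory Service log.

```python
import re

# Illustrative message text, not copied from a real DC.
SAMPLE_MESSAGE = """Internal event: A client issued a search operation with the following options.

Client:
10.0.0.15:52311
Starting node:
DC=example,DC=com
Filter:
(displayName=*John*)
Search scope:
subtree
Visited entries:
10000
Returned entries:
100
"""

# A label line ("Some words:") followed by the value on the next line.
FIELD_RE = re.compile(
    r"^(?P<label>[A-Za-z ]+):[ \t]*\r?\n(?P<value>[^\r\n]*)",
    re.MULTILINE,
)


def parse_search_event(message):
    """Return {lower-cased label: value} for each label/value pair."""
    fields = {}
    for match in FIELD_RE.finditer(message):
        fields[match.group("label").strip().lower()] = match.group("value").strip()
    return fields


fields = parse_search_event(SAMPLE_MESSAGE)
visited = int(fields["visited entries"])
returned = int(fields["returned entries"])
print(fields["client"], fields["filter"], visited, returned)
```

The first line of the message is skipped because its colon is followed by text on the same line rather than by a line break.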
Which tools can help?
- perfmon and Data Collector Sets: http://msdn.microsoft.com/en-us/magazine/cc163437.aspx
- logman, tracerpt
- Quest Change Auditor for LDAP (commercial software)
- A Server 2012 R2 DC, or KB 2800945 installed on Server 2012, Server 2008 R2 or Server 2008 domain controllers. The hotfix adds performance data to the Active Directory event log in Windows Server 2012, Windows Server 2008 SP2 and Windows Server 2008 R2 SP1: https://support.microsoft.com/en-us/kb/2800945
- https://support.microsoft.com/en-sg/help/3060643/how-to-use-event1644reader-ps1-to-analyze-ldap-query-performance-in-wi
Creating More Efficient Active Directory-Enabled Applications:
https://msdn.microsoft.com/en-us/library/ms808539.aspx
Web Resources:
http://blogs.msdn.com/b/spatdsg/archive/2011/06/24/dc-fails-logons-or-experiences-ldap-timeouts.aspx
https://gallery.technet.microsoft.com/scriptcenter/Event-1644-reader-Export-45205268
http://www.nextdirectiontech.com/samples/WhosUsingMyDirectory.pdf
https://support.quest.com/SolutionDetail.aspx?id=SOL72639&pr=
http://codeidol.com/community/ad/enabling-inefficient-and-expensive-ldap-query-logg/2294/
http://www.frickelsoft.net/blog/?p=243
https://blogs.technet.com/b/ad/archive/2009/07/06/referral-chasing.aspx
http://support.microsoft.com/kb/2550044
http://technet.microsoft.com/en-us/library/cc961809.aspx for more on enabling diagnostics logging.
http://msdn.microsoft.com/en-us/magazine/cc163437.aspx : Debugging And Performance Tuning With ETW
http://technet.microsoft.com/en-us/library/cc749337.aspx : creating Data Collector Sets
http://blogs.technet.com/b/askds/archive/2012/02/02/purging-old-nt-security-protocols.aspx