Tag Archive: performance

Troubleshooting Logs and Tools


SaRA tool to assess OUTLOOK client: https://diagnostics.outlook.com/#/

Also on CTRL + right click on OUTLOOK icon on the system tray! to get the connection status

Test connectivity from outside using: https://testconnectivity.microsoft.com/

Also check potential source of problems:

  • Check ADFS policies
  • Check set-CASmailbox – (post authentication) ; if POP or imap protocols are blocked for example
  • AzureAD Conditional access policies – (post authentication)
  • Authentication policies – in Exchange online (“new-authenticationpolicy”)- (pre authentication)
  • Client access rules – exchange online
  • Org level – IP blacklist – legacy authentication can be blocked
  • Org level – blacklist – EWS connections can be blocked
  • Org level – disable SMTP auth legacy – recommended
  • To protect from DDOS attack, enable ADFS extranet lockout protection and check audit log
  • IdFIX tool: https://www.microsoft.com/en-us/download/details.aspx?id=36832

Side-effect on Modern authentication:

If ADFS WAP and Internal servers are stopped ! What are the side-effects to access Outlook ??


  1. On clients with Modern authentication or ADAL! => thanks to the access tokens but we can limit the issues (valid 90 days!)
    1. If ADFS internal is restarted => Only => problem solved (no need WAP)
  2. But for OL 2010 or OL 2013 without ADAL, we are prompted to enter USER/PASSWORD (but without success)
    1. We need also the WAP working! And not only ADFS internal… to solve the problem on old clients not supporting ADAL


And also check logs:


In Exchange 2013, there are several logs in the logging folder. For Outlook clients one of the first logs to examine are the HTTP Proxy logs on CAS. The connection walk-through section shows the process that is used to connect to Exchange 2013. This complete process is logged in the HTTP Proxy log. Also, if it is possible, add Hosts file to the client for one specific CAS to reduce the number of logs.

The logs on CAS are located here by default: C:\Program Files\Microsoft\Exchange Server\V15\Logging\HttpProxy\RpcHttp

HTTP Proxy AutoDiscover Logs

Exchange 2013 has HTTP Proxy logs for AutoDiscover that are similar to the logs shown earlier that can be used to determine whether AutoDiscover is failing.

The logs on CAS are located here by default: C:\Program Files\Microsoft\Exchange Server\V15\Logging\HttpProxy\AutoDiscover

HTTP Error Logs

HTTP Error logs are failures that occur with HTTP.SYS before hitting IIS. However, not all errors for connections to web sites and app pools are seen in the httperr log. For example, if ASP.NET threw the error it may not be logged in the HTTP Error log. By default, HTTP error logs are located in C:\Windows\System32\LogFiles\HTTPERR. Information on the httperr log and codes can be found here.

IIS Logs

IIS logs can be used to review the connection for RPC/HTTP, MAPI/HTTP, EWS, OAB, and AutoDiscover. The full data for the MAPI/HTTP and RPC/HTTP is not always put in the IIS logs. Therefore, there is a possibility that the 200 connection successful may not be seen. IIS codes.

In Exchange 2013 IIS logs on the CAS should contain all user connections on port 443. IIS logs on the Mailbox server should only contain connections from the CAS server on port 444.

Most HTTP connections are first sent anonymously which results in a 401 challenge response. This response includes the authentication types available in the response header. The client should then try to connect again by using one of these authentication methods. Therefore, a 401 status found inside an IIS log does not necessarily indicate an error.

Note that an anonymous request is expected to show a 401 response. You can identify anonymous requests because the domain\username is not listed in the request.

RPC Client Access (RCA) Logs

The RCA logs can be used to find when a user has made a connection to their mailbox, or a connection to an alternate mailbox, errors that occur with the connection, and more information. RCA logs are located in the logging directory which is located at %ExchangeInstallPath%\Logging\RpcClientAccess. By default, these logs have a maximum size of 10MB and roll over when size limit is reached or at the end of the day (based on GMT), and the server keeps 1GB in the log directory.

Outlook ETL Logging (requires a support case with Microsoft to analyze the log) 

ETL logs are located in %temp%/Outlook Logging and are named Outlook-#####.ETL. The numbers are randomly generated by the system.

To enable Outlook logging

In the Outlook interface:

  • Open Outlook.
  • Click File, Options, Advanced.
  • Enable “Enable troubleshooting logging (requires restarting Outlook)”
  • Restart Outlook.

How to enable Outlook logging in the registry:

  • Browse to HKEY_CURRENT_USER\Software\Microsoft\Office\xx.0\Outlook\Options\Mail
  • DWORD: EnableLogging
  • Value: 1
  • Note: xx.0 is a placeholder for your version of Office. 15.0 = Office 2013, 14.0 = Office 2010

ExPerfwiz (Perfmon for Exchange)

You can use Perfmon for issues that you suspect are caused by performance. http://experfwiz.codeplex.com/

Exchange 2013 has daily performance logs that captures the majority of what is needed. These logs are by default located in C:\Program Files\Microsoft\Exchange Server\V15\Logging\Diagnostics\DailyPerformanceLogs

Log Parser Studio

Log Parser Studio is a GUI for Log Parser 2.2. LPS greatly reduces complexity when parsing logs. Additionally, it can parse many kinds of logs including IIS Logs, HTTPErr Logs, Event Logs (both live and EVT/EVTX/CSV), all Exchange protocol logs from 2003-2013, any text based logs, CSV logs and ExTRA traces that were converted to CSV logs. LPS can parse many GB of logs concurrently (we have tested with total log sizes of >60GB).

Blog with tips/how to about LPS: http://blogs.technet.com/b/karywa/

Exmon tool (aka Microsoft Exchange Server User Monitor)

We use this tool to get detailed information about client traffic.


Download sysmon:

NEW: Sysmon 10.42 is available ! : https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon and how to use it:

WMI detections: https://rawsec.lu/blog/posts/2017/Sep/19/sysmon-v610-vs-wmi-persistence/

MITRE framework – sysmon coverage:


Installation and usage:




List of web resources concerning Sysmon: https://github.com/MHaggis/sysmon-dfir

Motiba: https://blogs.technet.microsoft.com/motiba/2017/12/07/sysinternals-sysmon-suspicious-activity-guide/

Sysmon events table: https://rawsec.lu/blog/posts/2017/Sep/19/sysmon-events-table/

Mark russinovitch’s RSA conference: https://onedrive.live.com/view.aspx?resid=D026B4699190F1E6!2843&ithint=file%2cpptx&app=PowerPoint&authkey=!AMvCRTKB_V1J5ow

Sysmon config files explained: https://www.bsk-consulting.de/2015/02/04/sysmon-example-config-xml/

Hide sysmon from services:



View at Medium.com

Else other install guides:

Sysinternals Sysmon unleashed



Detecting APT with Sysmon:




Sysmon with Splunk:



Sysmon log analyzer/parsing sysmon event log:






WEF: https://docs.microsoft.com/en-us/windows/threat-protection/use-windows-event-forwarding-to-assist-in-instrusion-detection

logparser: http://www.microsoft.com/en-us/download/confirmation.aspx?id=24659

logparser GUI: http://lizard-labs.com/log_parser_lizard.aspx

Troubleshooting slow logons:



Logon process: http://fr.slideshare.net/ControlUp/understanding-troubleshooting-the-windows-logon-process

Tools for troubleshooting:



And powershell:


Analyze GPOs load time: http://www.controlup.com/script-library/Analyze-GPO-Extensions-Load-Time/ee682d01-81c4-4495-85a7-4c03c88d7263/


How to use Xperf, Xbootmgr, Procmon, WPA?

xperf;xbootmgr;xperfview comes from Windows ADK (Windows performance toolkit sub part). Procmon is a sysinternal tool.




Other interesting articles:






Windows Performance Analyzer (wpa.exe) youtube: https://www.youtube.com/watch?v=HGTlc_gWH_c

Xperf data collection tool: https://xperf123.codeplex.com/releases/view/66888


For boot tracing:


xbootmgr -trace boot -traceFlags BASE+CSWITCH+POWER -resultPath C:\TEMP

with boot phases:
xbootmgr -trace boot -traceflags base+latency+dispatcher -stackwalk profile+cswitch+readythread 
       -notraceflagsinfilename -postbootdelay 120 -resultPath C:\TEMP

For shutdown tracing:

xbootmgr -trace shutdown -noPrepReboot -traceFlags BASE+CSWITCH+DRIVERS+POWER -resultPath C:\TEMP

For Standby+Resume:

xbootmgr -trace standby -traceFlags BASE+CSWITCH+DRIVERS+POWER -resultPath C:\TEMP

For Hibernate+Resume:

xbootmgr -trace hibernate -traceFlags BASE+CSWITCH+DRIVERS+POWER -resultPath C:\TEMP

replace C:\TEMP with any temp directory on your machine as necessary to store the output files

Analyses of the boot trace:


To start create a summary xml file, run this command (replace the name with the name of your etl file)

xperf /tti -i boot_BASE+CSWITCH+POWER_1.etl -o summary_boot.xml -a boot

Analyses of the shutdown trace:

The shutdown is divided into this 3 parts:


To generate an XML summary of shutdown, use the -a shutdown action with Xperf:

xperf /tti -i shutdown_BASE+CSWITCH+DRIVERS+POWER_1.etl -o summary_shutdown.xml -a shutdown


Useful tools and techniques to monitor system performance on a Windows computer:
1- configure perfmon to capture data in a blg file format (using logman utility and task scheduler)
2- use the perform flowchart (VSBS document)
3- create a report using the VSBS powerpoint template
4- alternatively use also sysinternal tools, Server Performance Advisor
Which Tools?
Xperf, Xperfview (Win7 and greater): available from Windows ADK
Perfmon (NT up to 2003) : make performance monitor output file in .blg format; but load this output log file on the new Win7 perfmon


For more information on mmc.exe /32, refer to http://support.microsoft.com/kb/891238/en-us.Use sample rate: 5 min. Use a perfmon alert to notify IT admins; ex: Available MBytes reach 10MB or below

Performance and reliability monitor (evolution of perfmon) for Vista or greater (also called WRPM):
 perform /sys ; starts only the performance monitor, formerly system monitor
 perfmon /res ; starts only the resource monitor
 perfmon /report ; starts only the diagnostic report for 60sec and display the results
 perfmon /rel  ;  starts only the reliability monitor

Reliability Monitor Helps in historical tracking of software installation and un-installation, and miscellaneous failures over time

For more information on how to use Reliability Monitor to track multiple systems, refer to the following links:

– Using Reliability Monitor: http://technet.microsoft.com/en-us/library/cc722107.aspx

– Start Reliability Monitor: http://technet.microsoft.com/en-us/library/cc748864.aspx

– View Reliability Monitor on a Remote Computer: http://technet.microsoft.com/en-us/library/cc722052.aspx

– Enable Data Collection for Reliability Monitor: http://technet.microsoft.com/en-us/library/cc766393.aspx

– Understanding the System Stability Index: http://technet.microsoft.com/en-us/library/cc749032.aspx

How to rebuild perform counters?

– for XP up to 2003 use fist method or a ‘new’ dedicated tool called: performance counter rebuild wizard (PCRW)


For more information on KB300956, visit http://support.microsoft.com/default.aspx?scid=kb;EN-US;300956.

– for Vista or above: use lodctr command tool only

Sysinternal tools (www.microsoft.com/sysinternals): procexp; procmon are the two most important tools
typeperf : to extract perform counters in a txt file (used in conjonction of logman);Note:For more information about Typeperf, visit http://technet.microsoft.com/en-us/library/cc753182.aspx.
logman : command line utility of perfmon;
tracerpt : to export .etl in CSV

  Example: Logman create counter BlackBox -v mmddhhmm -cf counters.txt -si 05:00 -f bincirc – o “c:PerflogsBlackbox_%computername%” -max 250

relog : command line tool to re-sample or extract portion of perfmon file (blg …)

 Example: Relog SQL:<DSN-name>!<LogSetName> -f bin -o <output.blg> ;  check this blog to discover over explanations; http://blogs.technet.com/b/richard_macdonald/archive/2008/04/08/3032386.aspx

performance analysis of logs, PAL v2 (how to analyze perfmon log files): http://pal.codeplex.com/releases/view/51623
PAL requires:
   .net framework 2.x
Server Performance Advisor: SPA (w2k3 SP2 or R2); complementary to Perform
• Response times
• Failing requests
• Hung application
Resource Usage
• Rogue clients
• Bad scripts
• Out of resources
Tuning and configuration
• Incorrect cache size
• Password expiration policy
• Not enough dynamic ports


SPA is built into Windows Vista and Windows Server 2008 and does not need to be installed (new data collector set on Performance and Reliabiliy Monitor). Microsoft Windows 2000 or Microsoft Windows XP are not supported.

taskman (Task manager)
debugging tools and symbols configuration (windbg) or procexp: http://www.microsoft.com/whdc/devtools/debugging/default.mspx


How to script perfmon?

The following links provide additional information about script deployment methods for perfmon:

– http://technet2.microsoft.com/WindowsServer/f/?en/Library/46938289-edb5-468a-b03f-4e5985bf8fca1033.mspx

– http://technet2.microsoft.com/WindowsServer/f/?en/Library/e7b81ac6-23d3-434f-b33a-e940caf5c1a81033.mspx

– http://blogs.technet.com/richard_macdonald/archive/2008/04/08/3032386.aspx

How to use the Microsoft Symbol Server?

1. Make sure you have installed the latest version of Debugging Tools for Windows.
2. Start a debugging session.
3. Decide where to store the downloaded symbols (the “downstream store”). This can be a local drive or a UNC path.
4. Set the debugger symbol path as follows, substituting your downstream store path for DownstreamStore.SRV*DownstreamStore*http://msdl.microsoft.com/download/symbols

For example, to download symbols to c:websymbols, you would add the following to your symbol path:

Key OS Performance counters?

1- Exchange servers that are exceed kernel paged pool memory due to token use and are slow or even hang at an OS level.

2- SQL servers with slow performance due to disk speeds in the 50ms or higher range (.050) during the reported problem interval.

3- File/Print Servers that perform unacceptably slow usually due to Kernel Non-Pool paged being exceeded by the Server Service, or Disk performance worse than .25ms (.025) response time.

Key OS Performance Metrics Counter Guidelines(Sustained or during captured problem interval)
Logical Disk/Physical DiskBoth are to be captured and monitored due to today’s virtualized disk environments.%idle is a reasonable indicator of disk interface pressure.Note: On very large SAN’s offering a LUN totaling (100’s of drives) can have 0% idle and still be okay, but if we see normal levels and then 0% at the precise time of reported problems, then treat it as a valid issue %idle- 100% idle to 50% idle = Healthy- 49% idle to 20% idle = Warning or Monitor- 19% idle to 0% idle = Critical or Out of Spec%Avg. Disk Sec Read or Write

– 1ms to 15ms = Healthy (10ms for Exch/AD, 15ms for SQL)

15ms to 25ms = Warning or Monitor (25ms in general) – 26ms or greater = Critical or Out of Spec

Avg or Current Disk Queue Length

2 or under = Healthy 3-31 = Warning or Monitor. A possible issue, check read or write latency to confirm. 32 or higher = A likely issue, check read or write latency to confirm.

Memory– * = Windows does not have the ability to report maximum pool values in the OS without a debugger attached. – Please see the appendix chart document get an approximate maximum pool size given the amount of physical memory and boot.ini switches used. Note: Hotplug memory, Special Pool debug flag, Having 6GB or more but using the /Maxmem boot.ini switch to force 4GB of recognized memory (usually on Exchange servers) can reduce Pool Paged Bytes by as much as 100MB, so the appendix chart document is good estimation, but it should be understood that these maximum can vary a little depending on the server configuration in hardware and boot.ini switches chosen. Free System Page Table Entries- Greater than 10,000 free = Healthy– 9,999 to 5,000 free = Monitor – 4,999 or below = Critical or Out of SpecPool Non Paged Bytes*- Less that 60% of pool consumed=Healthy

– 61% – 80% of pool consumed = Warning or Monitor.

Greater than 80% pool consumed = Critical or Out of Spec. Pool Paged Bytes*

– Less that 60% of pool consumed=Healthy – 61% – 80% of pool consumed = Warning or Monitor.

Greater than 80% pool consumed = Critical or Out of Spec. Available Megabytes

50% of free memory available or more =Healthy

25% of free memory available = Monitor.

10% of free memory available = Warning – Less than 100MB or 5% of free memory available = Critical or out of spec

Pages per Second (4k per page, so 1000pps=4MB/sec)

– Less that 1000 pages/sec

sustained = Healthy– 1000-2500 pages/sec sustained = Caution or Monitor.

– Greater than 2500 pages/sec (

10.24MB/sec) sustained = Warning.– Greater than 5000 pages/sec peak = Warning to Critical and should be investigated.

Note regarding Hyper-V and performance:Note:For more information, refer to the Measuring Performance on Hyper-V article at http://msdn.microsoft.com/en-us/library/cc768535.aspx.
ProcessorAt this point it becomes important to identify the process, service, or driver causing the workload. %Processor Time (all instances)- Less than 60% consumed = Healthy- 51% – 90% consumed = Monitor or Caution- 91% – 100% consumed = Critical or Out of Spec%processor time + % idle time = 100%

%processor time = (%user time + %privileged time); %user time used by applications, %privileged time used by the kernel/system part


Sum of %processor time per process object = %user time

Network Interface-Due to electrical signaling limitations we do not expect to exceed 80% throughput on any bus. So if we see 80% of the interface consumed on either received or send, we expect to see the link saturated. Using the rule of thumb that we do not want to operate at > 80% of planned capacity, a maximum threshold of ~64% (80% * 80%) is the guideline for received and send, evaluating each independently. Current Bandwidth*Instances- Note bandwidth for calculation (100Mb, 10Mb, 1000Mb, etc)- Remember that Ethernet is approximately 80% usable due to collision, etc – so the usable ceiling is up to 80% of the interface as a typical guideline since we cannot guarantee a pure switched environment for all customers in all scenarios, all of the time. See note1Bytes Total/sec-

Less than 40% of the interface consumed = Healthy– 41%-64% of the interface consumed = Monitor or Caution.

– 65-100% of the interface consumed = Critical or Out of Spec

Output Queue Length

– 0 = Healthy

– 1-2 = Monitor or Caution.

– Greater than 2 = Critical or Out of Spec

ProcessThis is to detect possible leaks by applications. <process>Handle Count- If this process instance has greater than 500 handles it should be examined over time to see if it is legitimate allocation & de-allocation, or if it is a leak pattern over time
Private Bytes guideline rationaleThe goal is to catch many of the smaller memory footprint services (WinMgmt, SVCHost, or 3rd party applications) before they consume too many resources and begin to cause serious performance or stability issues.LSASS on an Active Directory DC, Exchange Server’s Store.exe, and SQL server’s Sqlserv.exe are expected to be 1+GB values and will need their own ruleFor servers with BackOffice or Large memory applications it may be better to create 2 Perfmon alert, MoM, or NetIQ rules for Private Bytes.The first rule should examine all processes except the very large memory applications, and then the second rule is adjusted for observed levels for the application greater tan 250MB as an average.


<process>Thread Count – If this process instance has greater than 500 threads it should be examined over time to see if it legitimate allocation & de-allocation, or if it is a leak pattern over time.<process>Private Bytes- If this process instance has greater than 250MB of use it should be examined over time to see if it is legitimate allocation & de-allocation, or it is a leak pattern over time.-

Note: Private bytes are not related to pool bytes in any way but very commonly code paths within an application that leak private bytes may leak pool bytes as well. This counter is a key in looking for the source of pool leaks.

Private Bytes is used instead of Working Set since a Private Bytes leak can be difficult to detect using the Working Set object because Private Bytes leaks can be paged out, etc.

<process>Working Set

– If this process instance has greater than 250MB of use it should be examined over time to see if it is legitimate allocation & de-allocation, or it is a leak pattern over time.


Threshold for switched networks and latency tolerant
• < 30 percent: Low utilization
• 30 to 60 percent: Significant utilization
• > 60 percent: High utilization
Threshold for shared networks and latency sensitive
• < 30 percent: Normal utilization
• > 30 percent: High utilization

The performance counters available to measure the Network Interface object are expressed in a mix of bits and bytes. To convert this value into a utilization percentage, you can use the following formula:

( ( “Bytes Total Per Second” * 8) / “Current Bandwidth” ) * 100

For the purpose of this workshop, a Windows Powershell function called Get- NICUtilPercent has been prepared to aid in this calculation. To use this function, copy and paste it (from the appendix of this module) into a Powershell command prompt. Then, run the following command by typing Get-NICUtilPercent followed by the value for Bytes Total / Sec followed by the value for Current Bandwidth, as shown below:

Get-NICUtilPercent -bytesTotal 6250000 -bandwidth 11000000

You can also shorten the command shown above as follows:

Get-NICUtilPercent 6250000 11000000

In either case, the command will return a percentage string, such as the one shown below:

45 percent

Because many NICs run at common speeds, the Get-NICUtilPercent function can accept

Web resources:



The version of Netlogon.dll that has tracing included is installed by default. To enable debug logging, set the debug flag that you want in the registry and restart the service by using the following steps:

  1. Start the Regedt32 program.
  2. Delete the Reg_SZ value of the following registry entry, create a REG_DWORD value with the same name, and then add the 2080FFFF hexadecimal value.
  3. At a command prompt, type net stop netlogon, and then type net start netlogon. This enables debug logging.
  4. To disable debug logging, change the data value to 0x0 in the following registry key:
  5. Quit Regedt32.
  6. Stop Net Logon, and then restart Net Logon.Notes
    • After you restart Net Logon, Net Logon-related activity may be logged to %windir%\debug\netlogon.log.
    • The MaximumLogFileSize registry entry can be used to specify the maximum size of the Netlogon.log file. By default, this registry entry does not exist, and the default maximum size of the Netlogon.log file is 20 MB. When the file reaches 20 MB, it is renamed to Netlogon.bak, and a new Netlogon.log file is created. This registry entry has the following parameters:Path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters
      Value Name: MaximumLogFileSize
      Value Type: REG_DWORD
      Value Data: <maximum log file size in bytes>
    • On Windows Server 2003-based computers, you can use the following Group Policy to configure the log file size:
      \Computer Configuration\Administrative Templates\System\Net Logon\Maximum Log File Size

Note As an alternate method, you can set the dbflag without using the registry. To do this run the following command from a command prompt: nltest /dbflag:0x2080ffff

Nltest is included as part of Windows Server 2008 and is also available as part of the Support Tools packages on the installation media for Windows Server 2003, Windows XP, and Windows 2000.

After you finish debugging, you can run the nltest /dbflag:0x0 command from a command prompt to reset the debug flag to 0. For more information, click the following article numbers to view the articles in the Microsoft Knowledge Base:


(http://support.microsoft.com/kb/247811/ )How domain controllers are located in Windows

(http://support.microsoft.com/kb/189541/ )Using the checked Netlogon.dll to track account lockouts