Windows Internals: Case of the Stalled Internet Explorer
Posted: July 7, 2009
I’m going to borrow from Mark Russinovich and describe how I found and resolved a problem with Internet Explorer. His posts start off as “Case of the xxx”, so I present “The Case of the Stalled Internet Explorer.”
I’d been using Internet Explorer 8 for a while and noticed a really annoying problem. To explain further, let me get into my browsing habits (no, not those kinds of habits!). I often start my day reading the Boston Globe in Outlook and clicking article links continually until there are no more links to click. Then I read each article in turn, after waiting for IE to settle down.
IE would often hang during the process; I took this for granted, as most people would read one article at a time, but I’m a multitasker who doesn’t often wait for pages to load before going on to the next thing.
The way I read RSS feeds using IE’s feed reader is even more interesting: I open a feed, skim it, figure out what articles are interesting to read in full (or whose comments!) and right-click on these articles to Open in New Tab. Repeat for each article I want to read. When I’m done with that feed, open another feed. Rinse and repeat.
That’s hard on any web browser (I use Firefox and Chrome but prefer IE, being a longtime IE user since IE1.0 shipped in the old Windows 95 Plus package.)
I noted something odd: Whenever I opened a tab, IE would stall for anywhere from 30 seconds to 5 minutes. The hand pointer would appear but you couldn’t click anything. Sometimes I would experience the “grayed-out” Screen of Death that Vista uses to indicate an unresponsive process.
Sometimes, certain web pages (like this Adobe download page) would hang completely.
IE 8 is different from earlier versions in that, to improve reliability, each tab or window spawns a new process. When a bad webpage crashes, it just affects the window or tab it’s in, rather than the whole browser.
When I monitored performance with Process Explorer, during these hangs, my CPU would stay at 50%—I have a dual core Athlon so that meant one of my cores was saturated—and one of the IE processes was using all of that core. Specifically, one of the threads in IE was using that core for many seconds at a time.
I asked around in the Sysinternals Forum. Molotov (the moderator) gave some standard troubleshooting tips: Empty Temporary Internet Files, run IE with no plugins (IE Safe Mode) and run IE logged in as a different user (and profile). Ran with no plugins: No change. Emptied Temporary Internet Files (leaving the cookies; this becomes an important fact later on).
IE did work normally under my other user profile. Fine. I knew now that Windows was solid. I wanted, though, to find out just what in my profile was causing the problem. I’d hate having to recreate my favorites and such from scratch.
I have the usual collection of browser plugins: The big three, Java, Adobe and Flash, and a few minor helper apps (Find as you Type). Someone in the forums suggested disabling the Java SSV2 Helper; there are reports of slow tab opens with this plugin.
Disabling the Java SSV2 plugin helped a lot, reducing my stalls to seconds rather than minutes. But not completely. Still hangs.
The Windows Performance team has a tool to analyze performance issues in Windows, using the new Event Tracing For Windows technology: Xperf. It can be used to analyze kernel and user code. It, like all other profiling tools, answers the question: What is the computer doing right now? Developers use xperf to find bottlenecks and hotspots in their code. We can use it to troubleshoot.
Xperf can be found at the Windows Performance Analysis Tools page. The latest stand-alone version is 4.1, but there’s a newer version, 4.5, that comes with the Windows 7 SDK. You’d need this if you run Windows 7, but the earlier version has the same features without downloading the entire 2 gig SDK.
Download it, and follow the instructions, particularly on symbol configuration.
When it’s installed and you’re ready to start a session, enter this command at a prompt:
xperf -on DiagEasy+PROFILE -stackwalk Profile
The command means: Turn tracing on and include DiagEasy, a set of common kernel tracing events, and PROFILE, to enable profiling. The -stackwalk option has xperf include stack information in the profiling.
I ran IE for 20 minutes and managed to reproduce my problem.
Stop the trace:
xperf -d merged.etl
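If you expect to repeat the start/reproduce/stop cycle a few times, it can help to wrap it in a small script. The following is only a sketch of that idea in Python: it uses just the xperf options shown above (-on, -stackwalk, -d), while the function names and the "press Enter" prompt are my own illustration, not part of xperf.

```python
import shutil
import subprocess


def start_trace_cmd(providers=("DiagEasy", "PROFILE"), stackwalk="Profile"):
    """Build the command line that turns kernel tracing on with stack capture."""
    return ["xperf", "-on", "+".join(providers), "-stackwalk", stackwalk]


def stop_trace_cmd(output="merged.etl"):
    """Build the command line that stops tracing and writes the merged trace."""
    return ["xperf", "-d", output]


def run(cmd):
    """Run the command if xperf is on PATH; otherwise just print what would run."""
    if shutil.which(cmd[0]) is None:
        print("xperf not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


def trace_session(output="merged.etl"):
    """Hypothetical helper: start tracing, wait for the user, then stop."""
    run(start_trace_cmd())
    input("Reproduce the problem, then press Enter to stop the trace...")
    run(stop_trace_cmd(output))
```

Calling trace_session() from a prompt with enough rights to start kernel tracing (typically an elevated one) would leave the trace in merged.etl, the same as typing the two commands by hand.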
And open the trace viewer:
On looking, I immediately saw something:
I looked at one of the graphs I had, CPU Sampling by Thread. You can select which threads you want to look at; this graph only shows IE threads. You can overlay other graphs; here I overlaid Process Lifetimes, which show when a process starts (red diamonds) and when it ends (blue diamonds).
You can see the CPU spike, and see that it coincides with a new instance of iexplore (pid 10672, though you can’t see it in the picture.)
Load symbols (by right-clicking on the graph and selecting Load Symbols) and wait. This is important for the next step. After xperf finishes grinding, right-click on the graph and select Summary Table. Here it is:
The left hand columns represent processes and threads. The right hand columns represent the amount of time each process and thread was active in the interval I selected (about 35 seconds). We can immediately see that an instance of iexplore (pid 10672) is taking most of the CPU time. Names like IEFRAME.DLL!LCIETab_ThreadProc tell us IE is opening a tab or trying to. You can expand each thread to show its stack. Eventually, the thread ends up at wininet.dll!CCookieJar::GetLocation, the function that uses up most of that thread’s CPU.
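To make the idea behind the summary table concrete, here is a toy sketch of how a sampling profiler's output can be rolled up. It is not how xperf is implemented, and the sample data is invented for illustration: only the two function names mentioned above come from my trace, and the frames prefixed with "hypothetical!" are placeholders.

```python
from collections import Counter

# Invented samples: each entry is one sampled call stack, outermost frame first.
# A real trace would contain thousands of these, one per profiling interrupt.
SAMPLES = (
    [["IEFRAME.DLL!LCIETab_ThreadProc",
      "hypothetical!OpenTab",
      "wininet.dll!CCookieJar::GetLocation"]] * 6
    + [["IEFRAME.DLL!LCIETab_ThreadProc", "hypothetical!OpenTab"]]
    + [["IEFRAME.DLL!LCIETab_ThreadProc", "hypothetical!Render"]]
)


def inclusive_weights(stacks):
    """Count samples in which a frame appears anywhere on the stack."""
    counts = Counter()
    for stack in stacks:
        for frame in set(stack):  # count a frame once per sample, even if recursive
            counts[frame] += 1
    return counts


def exclusive_weights(stacks):
    """Count samples in which a frame was the innermost (currently executing) one."""
    return Counter(stack[-1] for stack in stacks if stack)


if __name__ == "__main__":
    for frame, hits in exclusive_weights(SAMPLES).most_common():
        print(f"{hits:3d}  {frame}")
```

The thread entry point has the highest inclusive count because everything runs underneath it, but the exclusive count is what points at the function actually burning the CPU. That is the same reasoning as expanding the stack in the summary table until the time stops moving further down.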
I have an old collection of browser cookies. Many of these cookies came from my old Windows 95 machine, through XP and exported again to Vista—through three OS revisions, and possibly a fourth when I install Windows 7. A cookies.txt file I had backed up in 2006 shows nearly 1,500 cookies!
There is no way I need all those cookies to log into my favorite sites, especially since like most people, I only go to a handful or two of sites that I absolutely need to log into.
Deleted my cookies. Resolved. IE was much smoother! In fact, my machine has never run better. I will miss it when I go to Windows 7!
(From 1,500 cookies I am down to 50.)
There had to have been a corrupt cookie entry, or else I’d run into a limit in IE.
I can hear people say, why didn’t I just delete my cookies in the first place! Granted. But for myself and most people, cookies are a part of the system profile that is moved when we migrate to a new machine. They’re often forgotten about; web developers are supposed to set cookies to expire when they aren’t needed, but that doesn’t always happen.
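If you have an exported cookies.txt lying around, a quick audit shows how much has piled up before you delete anything. This is a minimal sketch that assumes the file is in the tab-separated Netscape cookie format (seven fields per line: domain, subdomain flag, path, secure flag, expiry as a Unix timestamp, name, value). I have not checked it against every browser's export, and the function names are mine.

```python
import time
from collections import Counter


def parse_cookies_txt(text):
    """Return (cookies, malformed_lines) from Netscape-format cookie text."""
    cookies, malformed = [], []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 7 or not fields[4].isdigit():
            malformed.append(line)  # wrong field count or non-numeric expiry
            continue
        domain, _subdomains, path, _secure, expires, name, _value = fields
        cookies.append(
            {"domain": domain, "path": path, "name": name, "expires": int(expires)}
        )
    return cookies, malformed


def summarize(cookies, now=None):
    """Return (cookies per domain, list of cookies whose expiry has passed)."""
    now = time.time() if now is None else now
    per_domain = Counter(c["domain"] for c in cookies)
    # Assumption: an expiry of 0 marks a session cookie, so it is not counted as expired.
    expired = [c for c in cookies if 0 < c["expires"] < now]
    return per_domain, expired
```

Reading the file with open("cookies.txt").read() and passing the text to parse_cookies_txt gives a count per site, a list of stale entries, and any lines that do not parse, which is where I would look first for a suspected corrupt entry.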
I learned from this that the tools for diagnosing problems in Windows are getting better and better, thanks to people like Mark. Next time, it may be a Windows service or a line-of-business app where I can’t just delete configuration to start from scratch. Then xperf can be really useful for the system administrator.