“Ned” the Gnome
In the spirit of Halloween, I want to discuss mythical creatures. What would the world be without J.R.R. Tolkien’s idea of smelly, leather-skinned Orcs or Greek Mythology’s gift of Pegasus, the winged stallion? Unfortunately, for each great mythical creature, like giant walking trees (that walk for hours—thank you Kevin Smith), there is a horrendous one. The dreadful creature I want to discuss today is the corrupt user profile.
I absolutely shudder when I hear the words “corrupt profile.” Like Superman, who is defeated by a glowing green rock—the corrupt profile is my kryptonite (Ned’s is the phrase Tips and Tricks). So, the purpose of this blog is to dispel the myth of the corrupt user profile.
Let me start by contradicting myself—there is actually such a thing as a corrupt user profile; however, it is extremely rare. I’ve spent over ten years at Microsoft and I’ve seen two—count them—two actual corrupt user profiles. I’ll identify the “real” corrupt profile later. First, let’s identify what is NOT a corrupt user profile because it’s more prevalent.
User profile load failure
Occasionally, users report their profiles not loading, or Windows informs users that it logged the user on with a temporary user profile. It’s rare for Windows to not load a user profile because it is a “corrupt” user profile. Typically, a user profile does not load because:
- A process is not playing nice in the sandbox—meaning some process other than Winlogon opened ntuser.dat exclusively thereby preventing Winlogon from loading the user profile.
- Windows cannot find the user profile. This is most often the case when using Roaming, Mandatory, or Super Mandatory user profiles. All these profile types require network connectivity. In these cases, no network means Windows will not loaded user profile.
- Configuration – Windows is configured, through a policy or otherwise, not to load the user profile. Profile quota size, or slow links are common causes for this scenario.
The most common scenario classified as the mythical corrupt profile is the first, and rightly so because is painfully difficult to diagnose. Configuration is the second most likely scenario that attributes to the mythical corrupt profile. It’s rare to associate unavailable user profiles as corrupt, or scenarios involving the awesome access is denied error message.
User settings missing
Another scenario that perpetuates the corrupted profile myth is one that involves user settings disappearing. It’s unlikely that user settings disappear; it’s more likely the user settings were not saved. A number of scenarios can lead to this possibility.
Most recently, I’ve seen a number of scenarios, mostly with Terminal Servers, where settings do not persist. Our case data show a trend of these scenarios using non-Microsoft profile management software. This software changes how Windows handles the user profile. Typically, these implementations treat the user profile as a local profile and then implements “magical magic” to roam user data back to a central location. This introduces a number of moving parts that must work correctly to ensure user settings are saved. Also, some of these non-Microsoft solutions allow you to partition portions of the user settings that persistent and those that do not. This allows control over which user settings roam through their solution and which settings do not. In these cases, verify the solution, third-party or otherwise, propagated the saved settings. However, this is not a corrupt user profile.
Multiple instances – last writer wins
Remember that Windows stores user settings in a registry file. The registry file is the smallest unit of roaming data. That means that Windows roams the entire user hive when the user logs off (or in the background with Windows 7). However, when a user logs on to multiple computers or has multiple sessions, then that user’s settings are only as good as the last session that writes to the central location.
Consider the following scenario. A user has a laptop and frequently uses Terminal Services. The user shares the same profile between these computers. On Friday, the user logs on their laptop—the profile is loaded. After some time, the user makes a Terminal Services connection and begins to work in that session. The user then disconnects the Terminal Services session and goes to lunch. When they return, they change their desktop background on their laptop. The user logs off at the end of the day and their saved user settings roam to the central location. On Monday, the user logs on expecting their new desktop background; however, they receive their old desktop background. You discover that idle Terminal Services sessions are configured to logoff after a preconfigured idle time. The session’s user settings have a later time stamp then the previous and therefore writes last, resulting in the user’s setting appearing as if they did not save. This is another reason why we encourageseparate user profiles for Terminal Services. So, add this experience to the list of mythical corrupt profiles.
Another scenario that perpetuates the corrupt profile myth is with misbehaving applications that “magically” work when you delete the user profile. This is not a corrupt user profile. There is a big different between corrupt data and unexpected data. It’s difficult to determine what is wrong in these scenarios.
Clearly it is related to user data because resetting the user data to blank or nothing restores the application’s performance to the expected behavior. These scenarios require a thorough understanding of the application, how it consumes user data, and the upper and lower limits of each setting. Deleting the entire user profile to accommodate a misbehaving application is a quick fix with huge ramifications. The “fix” for one application effectively breaks other applications. Also, deleting the user profile removes stored credentials, keys, and certificates that may be critical to the user.
A better approach is to create a new user and test the application with a new user profile. But deleting a user profile because an application or a feature of an application does not work is overlooking the larger issue. Resist the urge and instead break out Process Monitor, capture registry activity, and reproduce the issue. Inventory the registry keys the applications uses in the user’s hive. Review the values of each of the keys in a working and failing scenario and compare the two. Use the process of elimination to determine the setting and value that is causing the failure.
If time is not on your side and you know deleting the user profile resolves the problem, then create a virtual machine of the problematic computer so you can continue your investigation at a later time. Incorrect data stored in user settings does not make the profile corrupt.
Will the real corrupt profile please stand up
I’ve identified some of the common misconceptions that are associated with the corrupt profile mythology, and there are others. However, these scenarios consistently rise to the top. So, what is a real corrupt profile? I’m glad you asked.
A user profile is a predetermined folder structure and accompanying registry data. Microsoft Windows uses the registry data to describe and preserve the user environment. The folder structure is storage for user and application data, specifically for an individual user. Windows stores the profile on the local hard drive, loads the profile when the user logs on, and unloads the profile when the user logs off.
The preserved data that describes the user’s environment is nothing more than a registry hive. More specifically, the user’s registry portion of the profile is loaded into HKEY_CURRENT_USER. Registry hives, keys, value names, and values are stored in a specific structure that Windows recognizes as the registry. Each element within the structure has its own metadata, such as last write time and security descriptor. All of this information must adhere to the scope and limits of the structure. Consider the following example.
An application saves the position of its window in the user’s settings. Window locations are represented as coordinates on the screen. These coordinates are integer values. Integers are positive or negative values. However, the upper left corner of the screen is typically represented by the coordinate 0, 0. What if another application saved -12 and 0 as this data? Both numbers are valid integers. It meets the structure of a REG_DWORD, which is an integer data type for the registry. Yet, the application does not work correctly when this value is present in the registry. This is not a corrupt profile—its bad data; however, not in the context of the registry or the profile. The registry only cares that the value is within the scope of that data type.
So, an actual corrupt profile is when the structure of the registry hive no longer conforms to the expected structure. I’ve seen this two times in 13 years and in both cases it was not exclusive to the user’s registry. The corruption persisted throughout registry hives and multiple aspects of the computer did not function correctly. In both these cases, new users with new profile as well as existing user with existing profiles experienced the problem. However, it was noticeable that multiple aspects of the computer were behaving poorly. Ultimately, the problem was diagnosed to a non-windows binary. The binary overwrote heap memory that the registry used. The binary modified that data before it was committed to disk. Then, Windows committed modified memory to disk; thereby misaligning the registry structure—which is a real corrupt user profile.
Be wary when you hear a co-worker reporting a corrupt user profile. Ask them if they saw it during their most recent snark hunting trip or during their last encounter with a ravenous Bugblatter Beast. More likely—they’ve seen one of the manifestations we’ve described in this post. It’s a difficult and time consuming problem to troubleshoot and resolve. But some additional diligence will surface the real problem.