Cherwell IT Service Management Blog
Resources, Best Practices, and Solutions for ITSM Pros

20 Years of Software Identification Challenges Will Persist Well Into the Standardized Tagging Era

The following bylined article can also be found in the April issue of IAITAM’s ITAK Magazine and the April edition of FAST IiS Kaleidoscope.

Since the dawn of the desktop era, IT departments have struggled to keep track of software installed across their corporate networks. Accurate software inventories are crucial to ensuring installed applications are properly licensed, understanding whether or not they’re being used, and budgeting for future software purchases. Unfortunately, no standard methodology exists across applications and manufacturers for correlating installed program executables with actual application titles. This leaves asset managers and the software discovery tools they utilize with any number of half-complete approaches to application recognition.

The ISO/IEC 19770-2 software tagging standard was developed in response to licensing challenges stemming from inaccurate and incomplete software identification. It provides publishers with guidelines for “tagging” their applications in a standard way that makes identification straightforward, automated, and virtually foolproof for discovery tools. Yet despite the technical ease with which software tags can be implemented, publishers have been painfully slow to adopt the standard, and end users have not pressed vendors hard enough to spur them to action.

This glacial pace of industry adoption, combined with the reality that few, if any, applications on the average desktop utilize software tags, means the practical benefits of tagging will be years in the offing.

A few large software vendors such as Adobe and Symantec are beginning to tag newly released programs according to the ISO standard, and some branches of the U.S. federal government such as the GSA and DoD have committed to including tagging in their procurement requirements for commercial off-the-shelf software. But the reality for asset managers is that an accurate survey of installed applications based on the 19770-2 standard alone will be practically useless until most, if not all, new software releases comply with the standard, and until every “untagged” copy of software is either retired from the desktop or tagged by end-user organizations themselves.

Until that day arrives, many software inventory tools will continue to rely on inferior methods such as analysis of file headers, registry entries, or installer databases. Unfortunately, these approaches nearly always come up short by over-counting or under-counting installed applications (and in some cases, missing them altogether); presenting data that’s not consistent across applications, versions, editions, and manufacturers; and improperly correlating discovered file data with licensed application titles.

The difficulty for most discovery tools in properly identifying software titles means that end users are often saddled with the task of manually interpreting, validating, and normalizing significant portions of raw application data—a time-consuming and error-prone process that, if neglected or performed improperly, can come back to haunt them later in the form of the dreaded software audit.

While we wait for software tags to become commonplace on the desktop, we must be mindful of the pitfalls of the traditional methods of identification. At the very least, understanding which method(s) are utilized by one’s own discovery tool is key to understanding where to focus one’s reconciliation efforts when attempting to interpret presented data.

The Windows Installer Database (MSI) or, more likely, the subset of MSI data stored in the Windows Registry (visible from within Add/Remove Programs) is a source of information inventory tools commonly use to reveal what’s installed on a machine. However, applications installed using methods other than the Windows Installer often go undetected. In addition, data derived from this source often lacks adequate version granularity and/or can’t be correlated one-to-one with licensable application titles.
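The detection gap described above can be illustrated with a minimal sketch. It assumes a hypothetical list of records shaped like Add/Remove Programs entries (DisplayName, DisplayVersion, and InstallLocation are value names Windows stores for such entries) and a hypothetical list of executables found by a disk scan. The product names, paths, and the function name are invented for illustration; this is not how any particular discovery tool works.

```python
from pathlib import PureWindowsPath

# Hypothetical sample data: one installer-registered product, two executables on disk.
installer_entries = [
    {
        "DisplayName": "Acme Writer",
        "DisplayVersion": "9.0",
        "InstallLocation": r"C:\Program Files\Acme\Writer",
    },
]
executables_on_disk = [
    r"C:\Program Files\Acme\Writer\writer.exe",
    r"C:\Tools\portable\sketchpad.exe",  # copied onto the machine, never "installed"
]


def undetected_executables(entries, exe_paths):
    """Return executables that sit outside every registered install location."""
    roots = [
        PureWindowsPath(entry["InstallLocation"])
        for entry in entries
        if entry.get("InstallLocation")
    ]
    missing = []
    for raw in exe_paths:
        path = PureWindowsPath(raw)
        # An executable is "covered" if some install location is one of its parent folders.
        if not any(root in path.parents for root in roots):
            missing.append(raw)
    return missing


print(undetected_executables(installer_entries, executables_on_disk))
# prints ['C:\\Tools\\portable\\sketchpad.exe']
```

A tool that reads only the installer data would report Acme Writer and never see the portable application at all.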

File header analysis, another approach utilized by many computer inventory products, is tied directly to the application executable; however, this information is often outdated, incomplete, or inconsistent because publishers aren’t obligated to ensure the data is correct. In addition, different applications may share the same executable file(s), leading to confusion about which product a given file represents. Finally, because many applications consist of multiple executables (sometimes hundreds or even thousands, in the case of the Windows OS), examining individual file headers doesn’t necessarily indicate the relationship between the executables and the licensed product with which they are associated.
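The shared-executable problem can be shown with a short sketch. The observations below are hypothetical (file name, product that shipped it) pairs of the kind an inventory tool might accumulate; the file and product names are invented, and the function is only one simple way to flag the ambiguity.

```python
from collections import defaultdict

# Hypothetical observations: which product was seen shipping which file.
observations = [
    ("updater.exe", "Acme Writer"),
    ("updater.exe", "Acme Mail"),
    ("writer.exe", "Acme Writer"),
]


def ambiguous_files(obs):
    """Map each file shipped by more than one product to the products that ship it."""
    products_by_file = defaultdict(set)
    for file_name, product in obs:
        # Windows file names are case-insensitive, so compare in lower case.
        products_by_file[file_name.lower()].add(product)
    return {
        file_name: sorted(products)
        for file_name, products in products_by_file.items()
        if len(products) > 1
    }


print(ambiguous_files(observations))
# prints {'updater.exe': ['Acme Mail', 'Acme Writer']}
```

Finding updater.exe on a machine therefore says nothing, on its own, about which of the two products is installed.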

Some asset management software vendors have collected application inventory data from corporate networks over many years and developed proprietary software catalogs that enable discovered executables and other application data to be correlated with their licensed software titles. Because of the lack of a standard method for collecting and interpreting this data, it’s incumbent on the curators of such catalogs to continually update the content, manually validate the accuracy of new entries, and normalize the information for practical use. It therefore stands to reason that the utility of any software catalog is only as good as the curator’s commitment to expand and maintain it.
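In spirit, a catalog lookup of this kind might resemble the sketch below. The catalog contents, the choice of (file name, major version) as the key, and the function name are all assumptions made for illustration. Real catalogs are proprietary and far richer.

```python
# Hypothetical catalog: (executable name, major version) -> licensed title.
CATALOG = {
    ("writer.exe", "9"): "Acme Writer 9 Professional",
    ("writer.exe", "10"): "Acme Writer 10 Professional",
}


def recognize(discovered):
    """Split discovered (file name, version) pairs into recognized titles and leftovers."""
    recognized, needs_review = [], []
    for file_name, version in discovered:
        key = (file_name.lower(), version.split(".")[0])
        title = CATALOG.get(key)
        if title is not None:
            recognized.append(title)
        else:
            # Anything the catalog does not know falls to a human for validation.
            needs_review.append((file_name, version))
    return recognized, needs_review


print(recognize([("WRITER.EXE", "9.0.1"), ("sketchpad.exe", "2.3")]))
# prints (['Acme Writer 9 Professional'], [('sketchpad.exe', '2.3')])
```

The size of the leftover list is a rough measure of how well the catalog is being maintained: every entry in it is work passed on to the end user.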

Irrespective of how software is identified—and it’s often a combination of the methods discussed above—real value can only be achieved if discovered data is normalized and presented in a way that allows end users to effectively monitor their license positions.
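As a rough illustration of what normalization buys, the sketch below collapses several raw spellings of one product into a single title, counts each machine once, and compares installs against purchased licenses. The alias table, machine names, and entitlement figures are hypothetical, and the arithmetic assumes a simple one-license-per-machine model that many real license agreements do not follow.

```python
from collections import Counter

# Hypothetical alias table: raw discovered names -> one normalized title.
ALIASES = {
    "acme writer 9.0": "Acme Writer 9",
    "acme writer v9": "Acme Writer 9",
    "writer 9 (acme)": "Acme Writer 9",
}


def license_position(raw_records, entitlements):
    """Return licenses owned minus machines installed, per normalized title."""
    # A set removes duplicates, so two raw names for the same product
    # on the same machine are not counted twice.
    installs = {
        (machine, ALIASES.get(raw.strip().lower(), raw.strip()))
        for machine, raw in raw_records
    }
    counts = Counter(title for _, title in installs)
    return {title: entitlements.get(title, 0) - n for title, n in counts.items()}


records = [
    ("pc-01", "Acme Writer 9.0"),
    ("pc-01", "ACME Writer v9"),   # same install, reported by a second source
    ("pc-02", "Writer 9 (Acme)"),
]
print(license_position(records, {"Acme Writer 9": 1}))
# prints {'Acme Writer 9': -1}
```

Without the alias table, the same data would show three different products with one install each, and the one-license shortfall would be invisible.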

Unfortunately, we’re still a long way from a universal approach to application identification. Ironically, many of the very same software vendors who take a hard line with respect to license compliance have neither implemented tags nor announced plans to do so. Other publishers have tagged their software, but are doing so using their own proprietary syntax, a vendor-centric approach that only frustrates efforts to establish a widely embraced standard that can be utilized by asset management tools and relied upon by their end users.

Even if publishers were to adopt the ISO standard in earnest, it would take years for all the untagged applications residing on enterprise desktops to be retired and/or replaced with updated, and presumably tagged, versions.

This reality will leave end-user organizations with several options: 1) work with asset management technology that relies on one of the “traditional” recognition methodologies, ensure the limitations are well understood, and develop practices by which inaccurate data can be accounted for and corrected; 2) utilize a hybrid tool that identifies and normalizes data derived from both tagged and non-tagged software by combining the first option with tag-based recognition; or 3) rely exclusively on software tags by becoming familiar with the ISO/IEC 19770-2 standard and using this knowledge to retroactively assign tags to untagged applications.
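The hybrid option can be sketched as "trust a tag when one exists, otherwise fall back to a traditional guess." The sample tag below is hypothetical, and its element and attribute names (SoftwareIdentity, name, tagId, version, Entity, regid, role) are assumed to follow the layout of the 2015 revision of ISO/IEC 19770-2; they should be checked against the standard itself before being relied upon. The function name and the fallback title are likewise invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the 2015 revision of the tag schema.
SWID_NS = "http://standards.iso.org/iso/19770/-2/2015/schema.xsd"

# Hypothetical tag for an invented product.
SAMPLE_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Acme Writer" tagId="example.com-acme-writer-10" version="10.0">
  <Entity name="Acme Corp" regid="example.com" role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""


def identify(tag_xml, fallback_title):
    """Prefer the title declared in a tag; otherwise use the traditional guess."""
    if tag_xml:
        root = ET.fromstring(tag_xml)
        if root.tag == f"{{{SWID_NS}}}SoftwareIdentity":
            return {
                "title": root.get("name"),
                "version": root.get("version"),
                "source": "tag",
            }
    # No usable tag: the title comes from file headers, installer data, or a catalog.
    return {"title": fallback_title, "version": None, "source": "fallback"}


print(identify(SAMPLE_TAG, "writer.exe"))
# prints {'title': 'Acme Writer', 'version': '10.0', 'source': 'tag'}
print(identify(None, "Acme Writer 9"))
# prints {'title': 'Acme Writer 9', 'version': None, 'source': 'fallback'}
```

Recording the source alongside each title lets an asset manager see at a glance which entries still depend on a traditional method and therefore deserve closer reconciliation.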

Obviously, each of these options has its own costs, risks, and benefits, which brings us back to a basic, time-tested principle: establishing careful software license management processes and implementing technologies to effectively support those processes are, and will continue to be, critical in evaluating and minimizing compliance risk.