What is a Data Dictionary?

Keep up with the latest on information governance as this key strategy emerges for addressing a myriad of information management challenges in healthcare. This blog will highlight the trends and opportunities IG presents for ensuring information is treated as an organizational asset.

By Michelle Hermann, MS, RHIA


A data dictionary, as defined in the AHIMA Press textbook Health Information: Management of a Strategic Resource, is a “super catalog” that provides, for each data field or element, a list of information describing the field: where the data originates, the edits or rules that apply to the field, the type and width of the field, a description of any codes used, which applications or reports use the data element, and so on. This centralized repository brings together definitions from a wide variety of sources in an organization to permit a single view of the data, and it documents each element's meaning, relationships to other data, origin, usage, and format. As an information resource management tool, it aligns the organization and reduces confusion by giving designers, users, and administrators the metadata they need. Data definitions should describe and explain the meaning of each data element clearly and concisely.

Essentially, a data dictionary is a communication tool that helps technical and operational teams meet the daily operational needs of the organization. It enables different systems to transmit and share information through standardized definitions and data mapping. It is also a reference for all staff, and it makes onboarding new staff easier by giving them clearer requirements.
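
As a rough illustration of data mapping, the sketch below renames fields from two source systems to a shared set of standardized names. The system names ("registration", "billing") and their local field names are hypothetical and invented for this example; only the standardized names come from the example table later in this post.

```python
# Hypothetical local field names from two source systems, mapped to the
# standardized names an enterprise data dictionary would define.
FIELD_MAP = {
    "registration": {"pt_lname": "Patient Last Name", "med_rec_no": "MRN"},
    "billing": {"LastName": "Patient Last Name", "AcctMRN": "MRN"},
}


def to_standard(system: str, record: dict) -> dict:
    """Rename a source record's fields to the standardized names.

    Fields with no entry in the mapping are dropped.
    """
    mapping = FIELD_MAP[system]
    return {mapping[name]: value for name, value in record.items() if name in mapping}


reg = to_standard("registration", {"pt_lname": "Rivera", "med_rec_no": "0000012345"})
bill = to_standard("billing", {"LastName": "Rivera", "AcctMRN": "0000012345"})

# Once both records use the same names, they can be compared directly.
print(reg == bill)
```

With a mapping like this in place, each system can keep its own local field names while reports and interfaces work from one agreed vocabulary.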

To begin the development of a data dictionary, organizations should consider the following steps:

  1. Data stewards should be assigned in all functional domains/business units; they are essential to the development and standardization of data definitions. All data stewards within the organization should begin compiling a list of terms (field/attribute names) in their domains. They can do this by reviewing current systems and applications and exporting a listing of these terms. This will ensure the definitions are organized by domain across the organization.
  2. After compiling all terms, the data stewards should develop definitions in clear and unambiguous language. Then, the data stewards should meet with the teams to identify common terms and to refine and standardize their definitions.
  3. Once the terms and definitions are reviewed, vetted, and standardized, they should be integrated into a master list that will be used and published enterprise-wide. Data stewards from all domains should be involved to determine the final definitions and to document clear descriptive terminology.
  4. All teams will need to sign off on the enterprise-wide data dictionary. This step helps ensure the integrity of the data: leaders complete their final review and approve the terms for adoption in all areas.
  5. Then, the data dictionary should be published in a location that is easily accessible to all staff. Training and education should be provided for the workforce.
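
Steps 1 through 3 can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a prescribed method: the domain names, terms, and draft definitions below are hypothetical, and in practice the reconciliation itself is a conversation among data stewards rather than an automated step.

```python
from collections import defaultdict

# Hypothetical term lists compiled by data stewards in two domains (step 1),
# each with a draft definition (step 2).
domain_terms = {
    "registration": {"MRN": "Medical record number", "Gender": "Patient's sex"},
    "billing": {"MRN": "Patient account identifier", "Payer": "Insurance carrier"},
}


def find_common_terms(terms_by_domain):
    """Return terms defined in more than one domain, with each domain's definition."""
    seen = defaultdict(dict)
    for domain, terms in terms_by_domain.items():
        for term, definition in terms.items():
            seen[term][domain] = definition
    return {term: defs for term, defs in seen.items() if len(defs) > 1}


def needs_reconciliation(common):
    """Keep only the common terms whose definitions differ across domains."""
    return {term: defs for term, defs in common.items() if len(set(defs.values())) > 1}


common = find_common_terms(domain_terms)
conflicts = needs_reconciliation(common)

# Terms listed here go to the data stewards for a standardized definition (step 3).
print(sorted(conflicts))
```

The output is a worklist of terms that appear in more than one domain with differing definitions, which is where standardization effort is most needed.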

A data dictionary is usually organized in a spreadsheet format to include common elements such as:

  • Field or Attribute Name: a unique identifier used to label each attribute
  • Optional/Required: indicates if this information is an optional or required field
  • Type: defines the type of data that is in this field such as text, numeric, or date/time

These are the core elements, and it is not uncommon to find additional documentation such as the source of the information, the field length, and any default values. The table below shows an example of a data dictionary:


Field              | Type      | Format              | Length        | Description
-------------------|-----------|---------------------|---------------|------------
Patient Last Name  | Text      |                     | 25 characters | Last name
Patient First Name | Text      |                     | 25 characters | First name
Middle Initial     | Text      |                     | 1 character   | Middle initial
Gender             | Drop down | 0 (U), 1 (M), 2 (F) | 1 character   | Patient’s sex
MRN                | Numeric   | XXXXXXXXXX          | 10 characters | Medical record number to serve as patient’s unique identifier
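
To show how such a table can be put to work, the sketch below encodes the example rows as a small data structure and checks records against it. Several details are assumptions made for illustration: the `Field` class, the `required` flags (the example table does not say which fields are required), and the regular expressions, which interpret the Format column as "one of the codes 0, 1, 2" for Gender and "exactly ten digits" for MRN.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    data_type: str
    length: int
    description: str
    required: bool = True  # assumption: the example table does not specify this
    pattern: str = ""      # hypothetical regex derived from the Format column


DICTIONARY = [
    Field("Patient Last Name", "Text", 25, "Last name"),
    Field("Patient First Name", "Text", 25, "First name"),
    Field("Middle Initial", "Text", 1, "Middle initial", required=False),
    Field("Gender", "Drop down", 1, "Patient's sex", pattern=r"[012]"),
    Field("MRN", "Numeric", 10, "Medical record number", pattern=r"\d{10}"),
]


def validate(record: dict) -> list:
    """Return a list of problems found; an empty list means the record conforms."""
    problems = []
    for field in DICTIONARY:
        value = record.get(field.name, "")
        if not value:
            if field.required:
                problems.append(f"{field.name}: required value is missing")
            continue
        if len(value) > field.length:
            problems.append(f"{field.name}: longer than {field.length} characters")
        if field.pattern and not re.fullmatch(field.pattern, value):
            problems.append(f"{field.name}: does not match the expected format")
    return problems


good = {"Patient Last Name": "Rivera", "Patient First Name": "Ana",
        "Gender": "2", "MRN": "0000012345"}
bad = {"Patient Last Name": "Rivera", "Gender": "F", "MRN": "12345"}

print(validate(good))
for problem in validate(bad):
    print(problem)
```

Keeping the rules in one structure means the same definitions can drive data entry checks, interface validation, and report documentation, rather than being restated separately in each place.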


As you can see, a data dictionary provides critical information in a structured format to align all organizational users. The data definitions should be displayed on reports and dashboards to clearly describe the data and the context in which it is used. A shared vocabulary makes it easier for database developers, report writers, and end users to communicate, so that everyone reports data consistently and the needs of the organization are met. Data definitions should be addressed as part of an information governance program. This will help streamline data across its lifecycle so that it can be used more strategically and efficiently.


Michelle Hermann is director of health information management at Children’s Health System of Texas.
