Data Analytics Concepts for Health Information Professionals

In healthcare, health information (HI) professionals encounter various types of data essential for effective management, analysis, and utilization in healthcare delivery, decision-making, and research. The key types of data include:

  1. Structured Data: Organized in a specific format, such as tables or fields, structured data is commonly used for administrative tasks like billing and scheduling. Its consistent structure allows for easy organization, retrieval, and analysis.
  2. Unstructured Data: This type of data lacks a predefined format and includes free-form text, images, and other unorganized content. Unstructured data is prevalent in clinical documentation such as provider progress notes and imaging studies, and analyzing it typically requires specialized techniques, such as natural language processing for free text.
  3. Clinical Data: Relevant to a patient's medical condition and treatment, clinical data comprises diagnostic test results, diagnoses, medications, and other observations. It supports clinical decision making and enhances patient outcomes.
  4. Administrative Data: Associated with healthcare organization operations, administrative data includes billing, scheduling, and resource management information. It facilitates efficient healthcare operations and resource allocation.
  5. Electronic Health Records (EHRs): EHRs are electronic versions of patients' medical records, containing both clinical and administrative data. They provide a comprehensive view of patients' healthcare status and history, and are used for delivering healthcare services, managing data, and ensuring accurate information.
  6. Research Data: Research data is used in healthcare research, including data from clinical trials and epidemiological studies. It informs healthcare policies and improves outcomes. HI professionals contribute to managing and analyzing research data for evidence-based practices.

Sources of Healthcare Information

Healthcare data can be obtained from various sources, each with its own advantages and disadvantages. Some common sources include EHRs, claims data, clinical registries, and public health datasets. By optimizing the use of these diverse data sources, HI professionals can gather valuable insights to drive improvements in healthcare delivery and patient outcomes.

EHRs provide rich data for clinical care, research, and quality improvement. Advantages of EHRs include their completeness, accuracy, and timeliness. However, variability among systems and potential data entry errors are disadvantages.

Claims data, collected by health insurance companies, contains information on medical services, diagnoses, and treatments. It offers broad, consistently formatted coverage of billed services and large sample sizes. However, inaccuracies and incomplete data are potential drawbacks.

Clinical registries are databases focusing on specific conditions or procedures. They provide high-quality data and track patient outcomes over time. However, they may have limited scope and selection bias.

Public health datasets comprise data from public health agencies on disease surveillance, environmental health, and health behaviors. They offer large sample sizes and help identify trends. However, potentially incomplete or inaccurate data and limited availability can be disadvantages.

Data Analysis Techniques and Technologies

The aim of data analysis is to drive informed decision making. Many methods can be used to examine data. The most common are descriptive analytics (“what happened”), diagnostic analytics (“why something happened”), predictive analytics (“what could happen”), and prescriptive analytics (“what should happen”). Other techniques that may be used include regression analysis, cohort analysis, cluster analysis, time-series analysis, and text analysis. The ability to appropriately apply one or more techniques is key to ensuring that the purpose of the analysis is achieved.
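As a minimal sketch of descriptive analytics (“what happened”), the Python example below summarizes a handful of encounter records by hospital unit. The records, field names (`unit`, `length_of_stay`, `readmitted_30d`), and the function `summarize_by_unit` are hypothetical and chosen only for illustration; a real analysis would draw on an organization's own data model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical encounter records; field names and values are illustrative only.
encounters = [
    {"unit": "Cardiology", "length_of_stay": 4, "readmitted_30d": True},
    {"unit": "Cardiology", "length_of_stay": 2, "readmitted_30d": False},
    {"unit": "Orthopedics", "length_of_stay": 3, "readmitted_30d": False},
    {"unit": "Orthopedics", "length_of_stay": 5, "readmitted_30d": False},
    {"unit": "Orthopedics", "length_of_stay": 1, "readmitted_30d": True},
]


def summarize_by_unit(records):
    """Return encounter count, mean length of stay, and readmission rate per unit."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["unit"]].append(record)

    summary = {}
    for unit, rows in grouped.items():
        summary[unit] = {
            "encounters": len(rows),
            "mean_length_of_stay": mean(r["length_of_stay"] for r in rows),
            # True counts as 1 and False as 0, so the sum is the readmission count.
            "readmission_rate": sum(r["readmitted_30d"] for r in rows) / len(rows),
        }
    return summary


if __name__ == "__main__":
    for unit, stats in summarize_by_unit(encounters).items():
        print(unit, stats)
```

A summary like this answers only what happened. Explaining why one unit's readmission rate is higher would be a diagnostic question requiring additional data.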

To complete an analysis, an analyst may use one or more suitable tools such as Microsoft Excel, Microsoft Power BI, Qlik, SAS or Jupyter Notebook. Coding and expression languages such as SQL, R, Python or DAX are used in applications to perform advanced queries and calculations.
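To illustrate how a query language and a programming language can work together, the sketch below runs a SQL aggregation from Python using the standard-library `sqlite3` module. The table name `encounters`, its columns, the sample rows, and the function `encounters_per_diagnosis` are assumptions made for this example, not part of any particular system.

```python
import sqlite3


def encounters_per_diagnosis(rows):
    """Load rows into an in-memory SQLite table and count encounters per diagnosis code."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE encounters (patient_id TEXT, diagnosis_code TEXT)"
        )
        conn.executemany("INSERT INTO encounters VALUES (?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT diagnosis_code, COUNT(*) AS n
            FROM encounters
            GROUP BY diagnosis_code
            ORDER BY n DESC, diagnosis_code
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample rows: (patient identifier, diagnosis code).
sample_rows = [
    ("P001", "E11.9"),
    ("P002", "I10"),
    ("P003", "E11.9"),
    ("P004", "J45.909"),
    ("P005", "I10"),
    ("P006", "E11.9"),
]

if __name__ == "__main__":
    for code, count in encounters_per_diagnosis(sample_rows):
        print(code, count)
```

The same `GROUP BY` pattern applies whether the query runs in a notebook, a reporting tool, or a database client; only the connection details would differ.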

For HI professionals working as data analysts, these techniques and technologies can be applied to healthcare data in countless ways. Descriptive analytics techniques are often used to meet reporting needs in a healthcare organization. Physician performance or utilization management may be analyzed using Qlik, for example. Power BI may be used in predictive analytics for forecasting, as well as for population health management or hospital readmission analysis and prevention.

When many solutions can be considered for a particular problem, prescriptive analytics can recommend a course of action. For example, a language like Python may be used to identify the best staffing model or the appropriate dosing amount of a drug.
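A very simplified sketch of the prescriptive idea is shown below: given several candidate staffing models, choose the least expensive one that can still cover the expected patient volume. The model names, capacities, costs, and the function `recommend_staffing_model` are hypothetical; real staffing decisions involve many more constraints, such as acuity, skill mix, and regulations.

```python
def recommend_staffing_model(models, expected_patients):
    """Return the cheapest model whose capacity covers expected_patients, or None."""
    feasible = [m for m in models if m["capacity"] >= expected_patients]
    if not feasible:
        return None  # No candidate can cover the expected volume.
    return min(feasible, key=lambda m: m["cost_per_shift"])


# Hypothetical candidate staffing models for one unit and one shift.
candidate_models = [
    {"name": "lean", "capacity": 16, "cost_per_shift": 4000},
    {"name": "standard", "capacity": 24, "cost_per_shift": 5600},
    {"name": "surge", "capacity": 36, "cost_per_shift": 8200},
]

if __name__ == "__main__":
    choice = recommend_staffing_model(candidate_models, expected_patients=20)
    print(choice["name"] if choice else "no feasible model")
```

The expected patient volume would typically come from a predictive model, which is how predictive and prescriptive analytics often fit together.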

How Data Visualization Helps HI Professionals

Data visualization is the practice of representing information and data visually, aiming to make complex information easily understandable and reveal patterns, trends, and outliers. It is essential for effectively communicating data insights and supporting decision making, enhancing healthcare delivery and patient outcomes.

Various types of data visualizations exist, including charts, graphs, plots, infographics, and maps. Each visualization type serves a distinct purpose, and the appropriate one should be chosen based on the data and intended audience. Clear and effective data visualizations enable analysts and decision-makers to interpret insights and make informed decisions.

Effective data visualization is crucial for HI professionals because it enables them to convey insights efficiently. They use visualizations to simplify complex health information for stakeholders and to identify patterns and trends in healthcare data. Customizing visualizations for specific audiences ensures that healthcare providers and policymakers can easily comprehend the insights presented. Ultimately, clear data visualizations empower stakeholders to make informed decisions on healthcare delivery and policy, which can lead to improved patient outcomes.

Ethical Considerations When Managing Healthcare Data

HI professionals have a critical responsibility to uphold ethical standards when managing and utilizing healthcare data. To ensure the proper handling of this sensitive information, several ethical considerations must be taken into account. These include:

  1. Addressing Bias: Professionals must identify and rectify biases in healthcare data to prevent distortion. They should strive for representative and unbiased data by avoiding sampling, selection, and measurement biases.
  2. Obtaining Informed Consent: Prior to using individuals' healthcare data, informed consent must be obtained. Professionals should provide comprehensive information about data usage, access, and purpose to facilitate informed decisions.
  3. Privacy and Confidentiality: Safeguarding data and preserving confidentiality is critical. Compliance with laws and regulations, along with implementing security measures, helps prevent unauthorized access.
  4. Ensuring Data Accuracy: Accuracy and reliability are paramount in healthcare data. Quality control measures and data validation processes should be established to verify integrity.
  5. Promoting Fairness: Professionals should ensure equitable data utilization and avoid biases based on race, gender, age, or other factors to prevent discrimination.
  6. Fostering Transparency: Providing clear information about rights, data collection, use, and sharing builds trust and empowers individuals to make informed decisions.
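The data validation processes mentioned under data accuracy can be sketched in code. The example below checks a single record for missing required fields and implausible date sequences. The field names, the rules, and the function `validate_record` are assumptions for illustration only; an organization's actual validation rules would follow its own data governance policies.

```python
from datetime import date

# Hypothetical required fields for an inpatient encounter record.
REQUIRED_FIELDS = ("patient_id", "birth_date", "admit_date", "discharge_date")


def validate_record(record, today=None):
    """Return a list of human-readable problems found in one record."""
    today = today or date.today()
    problems = []

    # Completeness check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    birth = record.get("birth_date")
    admit = record.get("admit_date")
    discharge = record.get("discharge_date")

    # Plausibility checks: dates must occur in a sensible order.
    if isinstance(birth, date) and birth > today:
        problems.append("birth_date is in the future")
    if isinstance(admit, date) and isinstance(discharge, date) and discharge < admit:
        problems.append("discharge_date precedes admit_date")
    if isinstance(birth, date) and isinstance(admit, date) and admit < birth:
        problems.append("admit_date precedes birth_date")

    return problems


if __name__ == "__main__":
    sample = {
        "patient_id": "P001",
        "birth_date": date(1980, 5, 1),
        "admit_date": date(2023, 3, 10),
        "discharge_date": date(2023, 3, 8),
    }
    print(validate_record(sample))
```

Checks like these flag records for review rather than correcting them automatically, which keeps a person responsible for resolving each discrepancy.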

Future Trends in Data Analytics

The future of data analytics in healthcare organizations is one of expansion beyond information technology and reporting departments. Foremost among the emerging trends in data analytics are artificial intelligence (AI) and machine learning, which are now being applied to clinical, financial, and operational areas in healthcare.

AI is being used in clinical diagnostics, data mining of EHRs, workflow optimization, and development of tools such as medical chatbots designed to interact with a patient on a provider’s behalf. Other future trends in analytics are natural language processing, augmented analytics, analytics automation, and data as a service (DaaS).

As HI professionals look to a data-driven future in healthcare, investing in training and education that expands on a strong health information foundation is essential. Fundamental to an analyst’s skillset is the ability to think critically, solve problems, and recognize patterns within data.

While most data analysts will not rely heavily on statistics, a strong foundational knowledge of statistics will be helpful when performing work in data science. In addition to helpful credentials such as the RHIT, RHIA, and CHDA, knowledge of query and programming languages such as SQL, R, or Python, along with familiarity with NoSQL databases, will be critical for an analyst. The ability to use business intelligence tools such as Tableau, Power BI, or Qlik for reporting or visualization of data insights is also crucial.

In summary, being able to “speak healthcare” fluently and utilize the tools and techniques of data analysis will uniquely position HI professionals in the marketplace.


Shannon H. Houser, PhD, MPH, RHIA, FAHIMA, is professor, Department of Health Services Administration, the University of Alabama at Birmingham, Birmingham, AL.

Laura Blabac, MA, MS, RHIA, CHDA, is lead technical product manager, Enterprise Clinical Data & Platforms at O