The Project Goal

To understand and communicate disparities in chronic disease conditions in the Home Care Aide workforce of Washington State through statistical analysis and evaluation.

The Task

Over the course of 20 weeks, the team was responsible for completing three major tasks.

The Data

The data used in this project came from HCA medical insurance claims and included 1,824,238 total claims covering the years 2016, 2017, and 2018.

The data included demographic variables such as age, gender, and urban/rural status (determined with Rural-Urban Commuting Area (RUCA) codes from the Rural Health Research Center).

The Entity Relationship Diagram

Organizing the Data

The team designed an entity relationship diagram (ERD) to:

  • Understand the relationship between ICDs and Claims

  • Build a more organized table for future analysis

  • Clean up data and remove irrelevant information

  • Organize the data to be member-centric rather than claim-centric
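As a rough illustration of the last goal, the sketch below collapses claim-level rows into one record per member. The field layout (member ID, claim year, ICD code), the member IDs, and the sample codes are assumptions made for this example, not the project's actual schema.

```python
from collections import defaultdict

# Hypothetical claim rows: (member_id, claim_year, icd_code).
# The layout and values are illustrative assumptions, not the real data.
claims = [
    ("M001", 2016, "E11.9"),
    ("M001", 2016, "I10"),
    ("M001", 2017, "E11.9"),
    ("M002", 2017, "J45.909"),
    ("M002", 2017, "J45.909"),  # same diagnosis repeated on a second claim
]


def to_member_centric(claim_rows):
    """Collapse claim rows into one record per member: year -> set of ICD codes."""
    members = defaultdict(lambda: defaultdict(set))
    for member_id, year, icd_code in claim_rows:
        # A set keeps each diagnosis once per member and year,
        # no matter how many claims carried it.
        members[member_id][year].add(icd_code)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {member_id: dict(years) for member_id, years in members.items()}


members = to_member_centric(claims)
```

With this shape, each member appears once, and repeated diagnoses across claims no longer inflate counts in later analysis.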

Research Questions

Diving into the analysis
  • What is the prevalence of the different chronic diseases of interest by year?

  • What is the rate of comorbidity in the population?

  • What are the most common comorbidities in the claims data?

  • Describe the demographics and chronic conditions of the population by year.

  • What is the effect of demographic characteristics and having a chronic disease of interest on the likelihood of someone having a comorbidity?

  • What is the relationship between stability within the health insurance system and chronic disease?

Chronic Disease

Conditions that last 1 year or more and require ongoing medical attention and/or limit activities of daily living

Comorbidity

The simultaneous presence of two or more chronic diseases in a patient.
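Under this definition, a member counts as comorbid when two or more chronic conditions are recorded for them. The sketch below shows one way the comorbidity rate and the most common condition pairs could be computed from member-level data. The sample members are hypothetical, and using all members as the denominator is an assumption; the project may have defined the rate differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical member-level records: member_id -> set of chronic conditions.
member_conditions = {
    "M001": {"Diabetes", "Hypertension"},
    "M002": {"Asthma"},
    "M003": {"Diabetes", "Hypertension", "Obesity"},
    "M004": set(),
}


def comorbidity_rate(conditions_by_member):
    """Share of members with two or more chronic conditions.

    Assumption: the denominator is every member, including those
    with no chronic condition on record.
    """
    if not conditions_by_member:
        return 0.0
    comorbid = sum(1 for c in conditions_by_member.values() if len(c) >= 2)
    return comorbid / len(conditions_by_member)


def pair_counts(conditions_by_member):
    """Count how many members have each unordered pair of conditions."""
    counts = Counter()
    for conditions in conditions_by_member.values():
        # Sorting gives each pair a single canonical ordering.
        counts.update(combinations(sorted(conditions), 2))
    return counts


rate = comorbidity_rate(member_conditions)
pairs = pair_counts(member_conditions)
top_pair = pairs.most_common(1)
```

Counting pairs rather than full condition sets keeps the output readable when members have three or more conditions, at the cost of counting such members under several pairs.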

Chronic Diseases of Interest

  • Diabetes

  • Hypertension

  • Obesity

  • Cancer

  • Musculoskeletal conditions

  • Kidney Disease

  • Cardiovascular conditions

  • Asthma

  • Chronic obstructive pulmonary disorder (COPD)

  • Cholesterol

  • Mental health disorders

Analytic Strategies

After reshaping the data, the team used Python to carry out the following analyses:

  • Percentages

  • Kruskal-Wallis tests

  • Logistic regression
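To make the Kruskal-Wallis test concrete, the sketch below computes the tie-corrected H statistic by hand using only the standard library. It is not the team's code. The example groups (number of chronic conditions per member in two hypothetical demographic groups) are assumptions, since the page does not say which variables were compared.

```python
from collections import defaultdict


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")

    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Record the 1-based sorted positions of every distinct value,
    # so tied values can share the average of their positions as rank.
    positions = defaultdict(list)
    for index, value in enumerate(pooled, start=1):
        positions[value].append(index)
    avg_rank = {v: sum(p) / len(p) for v, p in positions.items()}

    # H = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1),
    # where R_i is the rank sum of group i and n_i is its size.
    rank_term = sum(sum(avg_rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n),
    # where t is the number of observations sharing each tied value.
    tie_term = sum(len(p) ** 3 - len(p) for p in positions.values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Hypothetical example: chronic-condition counts per member in two groups.
urban_counts = [0, 1, 1, 2, 3]
rural_counts = [1, 2, 2, 3, 4]
h_statistic = kruskal_wallis_h(urban_counts, rural_counts)
```

This sketch returns only the statistic. In practice, a library routine such as `scipy.stats.kruskal` also reports a p-value, which is what a comparison between groups would normally rely on.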

The team also created an interactive data dashboard using Tableau.

Limitations

  • The analysis was limited to three years of data

  • The medical claims data only included members enrolled with specific carriers

  • The analysis did not include pharmacy claims data, which could help determine whether members have a chronic condition

  • Further analysis with interaction terms is necessary to understand the interaction between the various chronic health conditions and the demographic factors

  • Claims include limited patient demographic data

  • Other demographic characteristics that are known to affect health outcomes were not present in the data set

Opportunities for Future Work

Include a more expansive data set

Integrate other socio-economic data from the US Census to understand the contributors to health disparities

Consider further questions and other potential chronic diseases of interest

Expand to include pharmacy claims data to classify chronic disease
