Premium project

Health Survey Data Analysis of BMI

Analyze health survey data to determine how BMI is associated with physical activity and smoking.

11 Tasks, 1,500 XP


Project Description

Surveys are often used to study health behavior and determine the risks of disease. Meanwhile, seemingly every day, news outlets publish a different "research says" article about how to lose weight (fast! with no effort at all!). In this project, you will use survey data from ~20,000 people sampled from the United States to explore health behaviors associated with lower Body Mass Index (BMI), a standardized measure used to classify healthy weight and obesity. Surveys with complex designs require special statistical methods that incorporate sampling weights and design factors into estimation and inference. Using these survey design methods, you will fit a multiple regression model that adjusts for confounders when testing whether physical activity is associated with lower BMI.

This project uses National Health and Nutrition Examination Survey (NHANES) data from ~20,000 participants surveyed in 2009-2012, available in the NHANES R package.
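The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's own solution. It assumes the `survey` package for design-based estimation and the `NHANESraw` data set from the `NHANES` package, with the column names given in that package's documentation (`WTMEC2YR`, `SDMVPSU`, `SDMVSTRA`, `BMI`, `PhysActive`, `SmokeNow`, `Age`). The 4-year weight `WTMEC4YR` and the adult age cutoff of 20 are assumptions made for this example.

```r
library(NHANES)  # provides NHANESraw, the raw data with design variables
library(survey)  # design-based estimation for complex surveys

# Assumption: NHANESraw pools two 2-year cycles (2009-2010 and 2011-2012),
# so the 2-year exam weight is halved to approximate a 4-year weight.
nhanes <- NHANESraw
nhanes$WTMEC4YR <- nhanes$WTMEC2YR / 2

# Declare the complex design: PSUs nested within strata, plus sampling weights.
design <- svydesign(
  data    = nhanes,
  ids     = ~SDMVPSU,
  strata  = ~SDMVSTRA,
  weights = ~WTMEC4YR,
  nest    = TRUE
)

# Subset the design object rather than the raw data frame, so that
# variance estimates still reflect the full sampling design.
# Assumption: "adult" is taken to mean age 20 or older.
adults <- subset(design, Age >= 20)

# Weighted mean BMI by physical activity status.
svyby(~BMI, ~PhysActive, adults, svymean, na.rm = TRUE)

# Multiple regression: physical activity, adjusting for current smoking
# as a possible confounder.
fit <- svyglm(BMI ~ PhysActive + SmokeNow, design = adults)
summary(fit)
```

Note that `SmokeNow` is missing for many participants, so the regression is fit only to the rows where it was recorded.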

Project Tasks

  1. Survey of BMI and physical activity
  2. Visualize survey weight and strata variables
  3. Specify the survey design
  4. Subset the data
  5. Visualizing BMI
  6. Is BMI lower in physically active people?
  7. Could there be confounding by smoking? (part 1)
  8. Could there be confounding by smoking? (part 2)
  9. Add smoking in the mix
  10. Incorporate possible confounding in the model
  11. What does it all mean?

Technologies

R

Topics

Data Manipulation, Probability & Statistics

Jessica Minnier

Assistant Professor of Biostatistics at Oregon Health & Science University

Jessica is an Assistant Professor of Biostatistics in the OHSU-PSU School of Public Health at Oregon Health & Science University. Her statistical research interests include risk prediction with high dimensional data sets and the analysis of genetic and other omics data. She is passionate about teaching R and programming, reproducible research, and open science.