Steps in a Statistical Study
I. Identifying the Question
What is it that I want to know?
Is it possible to find out what I want to know?
Is the data obtainable? (e.g., birth weight in relation to socioeconomic status, drug use, alcohol use, etc.)
Is it ethical to obtain such data? (e.g., the Tuskegee syphilis study)
If not, is there a reasonable substitute?
Are my assumptions reasonable?
Is my approach reasonable?
II. Designing a Study
Survey
Identify the population of interest
Obtain a representative sample of that population
Simple Random Sampling
Stratified Sampling (M-F, Age groups)
Systematic Sampling (class roster, census list)
Multi-Stage Sampling (Nielsen ratings)
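The first three sampling methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster, the function names, and the sample sizes are hypothetical, and multi-stage sampling is left out because it simply chains these steps (for example, sample regions first, then households within each sampled region).

```python
import random

def simple_random_sample(population, n, rng):
    """Simple random sampling: every subset of size n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Stratified sampling: a simple random sample drawn within each stratum."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], n_per_stratum))
    return sample

def systematic_sample(ordered_list, k, rng):
    """Systematic sampling: random start among the first k units, then every k-th unit."""
    start = rng.randrange(k)
    return ordered_list[start::k]

# Hypothetical class roster of 20 students: (id, sex)
roster = [(i, "F" if i % 2 == 0 else "M") for i in range(20)]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

srs = simple_random_sample(roster, 5, rng)
strat = stratified_sample(roster, lambda u: u[1], 3, rng)
syst = systematic_sample(roster, 4, rng)
```

Note that the stratified sample guarantees three students of each sex, while the simple random sample may by chance over-represent one group.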
Sources of Bias
Voluntary Response (Shere Hite M&M or Ann Landers)
Non-response bias (e.g., daytime phone calls miss people who are not at home)
Response bias (people lie)
Undercoverage
Observational Studies
Used when a designed experiment is not ethical
Subjects studied over a period of time in natural setting
Case/Control – Controls must be matched to cases (Love Canal)
Record variables of interest
Confounding is a major issue
Designing an Experiment
Researcher has control over the subjects or units in the study
An intervention takes place that otherwise would not occur
Randomization used to assign treatments
Strongest case for causality
Classic example: randomized clinical trial
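Random assignment of treatments might look like the following sketch. The subject IDs, the two treatment labels, and the helper name `randomize` are assumptions made for illustration; a real trial would typically follow a pre-registered protocol and might use blocking as well.

```python
import random

def randomize(units, treatments, rng):
    """Randomly assign units to treatments in (nearly) equal-sized groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance, not the researcher, decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical subject IDs S01..S12
subjects = [f"S{i:02d}" for i in range(1, 13)]
assignment = randomize(subjects, ["treatment", "control"], random.Random(42))
```

Because chance alone decides who receives the intervention, lurking variables tend to balance out across the groups, which is why this design gives the strongest case for causality.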
EDA – Exploratory Data Analysis
A data set exists
Explore for relationships, trends, differences
Prelude to a study
Summarize the data to support conclusions
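A first pass at exploring an existing data set often starts with summary statistics by group. The sketch below uses Python's standard `statistics` module; the two groups and their values are made up purely for illustration, and `summarize` is a hypothetical helper name.

```python
import statistics

def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Made-up measurements for two hypothetical groups
groups = {
    "A": [12, 15, 11, 14, 13],
    "B": [18, 17, 21, 19, 20],
}
summaries = {name: summarize(vals) for name, vals in groups.items()}
difference_in_means = summaries["B"]["mean"] - summaries["A"]["mean"]
```

A difference found this way is a lead to follow up with a designed study, not a conclusion in itself.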
III. Collecting Data
Identify variables
Identify types of variables
Discrete – countable values (number of days the temp. was above 70)
Continuous – any value within a range, not countable (gas gauge reading)
Identify limits of measurement or observation
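The distinction between variable types, and the effect of measurement limits, can be seen in a small sketch. The temperature readings below are made up for illustration, and the assumption that the thermometer reads only to the nearest degree is hypothetical.

```python
# Continuous variable: a temperature can take any value in a range.
# Made-up daily highs for one week.
daily_highs = [68.4, 71.2, 70.0, 73.9, 69.5, 75.1, 70.3]

# Discrete variable: a count of days can only be 0, 1, 2, ...
days_above_70 = sum(1 for t in daily_highs if t > 70)

# Limit of measurement: suppose the thermometer only reads whole degrees.
recorded = [round(t) for t in daily_highs]
days_above_70_recorded = sum(1 for t in recorded if t > 70)
```

With these values the two counts differ, because a reading of 70.3 is recorded as 70. The precision of the instrument can change the answer to the question being asked.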
IV. Analyzing the Data
V. Making Conclusions and Discussing Limitations
What do the data say about the original question and hypotheses?
What are the limitations of the study?
What conclusions does the study not make?
What new questions arise from this study?