Statistics: Methods to collect, analyze, present, and interpret data, and to make decisions.
Two Types of Statistics: Descriptive and Inferential.
Note: Probability lies between descriptive and inferential statistics.
Course Progression: Descriptive Statistics \to Probability \to Inferential Statistics
Population: The entire group of individuals that we want information about.
Sample: The subset of the population we examine to gather information.
Random Sample: Sample where each element in the population has a chance of being selected.
Sample Size (n): The number of observations in a sample.
Variable: Characteristic of a person or thing.
Observation (x_i): Value of a variable for an element.
Data Set (X): Collection of observations on one or more variables.
Two Types of Variables: Numerical and Categorical.
Important: Variable type is determined by how we use it.
- e.g.,
- Numerical Age: 19, 42, 69, etc.
- Categorical Age: Young, Average, Old
Note: Different types require different analyses.
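The age example above can be sketched in code. This is only an illustration: the helper `age_category` and its cutoffs (under 30 is "Young", 60 and over is "Old") are assumptions, since the notes do not say where the categories begin or end.

```python
# Hypothetical cutoffs: the notes do not define where "Young" ends or "Old" begins.
def age_category(age, young_below=30, old_from=60):
    """Map a numerical age to one of the categories Young, Average, Old."""
    if age < young_below:
        return "Young"
    if age >= old_from:
        return "Old"
    return "Average"

ages = [19, 42, 69]                           # age used as a numerical variable
categories = [age_category(a) for a in ages]  # same variable used categorically

# A mean is meaningful for the numerical form only.
mean_age = sum(ages) / len(ages)

print(categories)
print(mean_age)
```

The same three people yield numbers in one case and labels in the other, which is why the analysis that applies depends on how the variable is used.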
Simple R.S.: Select n objects at random from the population.
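A minimal sketch of simple random sampling, assuming a hypothetical population of 100 labelled elements and a sample size of n = 10. It uses `random.sample` from Python's standard library, which draws without replacement.

```python
import random

population = list(range(1, 101))  # hypothetical population of 100 elements
n = 10                            # sample size

rng = random.Random(0)            # fixed seed so the sketch is reproducible
sample = rng.sample(population, n)  # n distinct elements, drawn without replacement

print(sample)
```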
Stratified R.S.: Divide the population into strata (non-overlapping groups), then select a simple random sample from each stratum.
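A sketch of stratified sampling under stated assumptions: the population is a hypothetical list of (id, group) pairs with two strata, and the same number of elements is drawn from each stratum. Equal allocation is a choice made for this example, not something the notes prescribe.

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical population: (id, group) pairs; the group defines the stratum.
population = [(i, "male" if i % 2 == 0 else "female") for i in range(100)]

# Divide the population into non-overlapping strata.
strata = {}
for element in population:
    strata.setdefault(element[1], []).append(element)

# Select a simple random sample from every stratum.
n_per_stratum = 5  # assumed equal allocation
stratified_sample = []
for group, members in strata.items():
    stratified_sample.extend(rng.sample(members, n_per_stratum))

print(stratified_sample)
```

Every stratum contributes to the final sample, so no group is left out.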
Cluster R.S.: Divide the population into clusters (groups), select clusters at random, then select a simple random sample from each selected cluster.
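A sketch of cluster sampling, assuming a hypothetical population already divided into 10 clusters of 20 elements each. The numbers of clusters chosen (3) and of elements drawn per cluster (4) are arbitrary values for illustration.

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical population: 10 clusters, each holding 20 (cluster, index) elements.
clusters = {c: [(c, i) for i in range(20)] for c in range(10)}

# Step 1: select some of the clusters at random.
chosen = rng.sample(sorted(clusters), 3)

# Step 2: select a simple random sample within each chosen cluster.
cluster_sample = []
for c in chosen:
    cluster_sample.extend(rng.sample(clusters[c], 4))

print(chosen)
print(cluster_sample)
```

Only the chosen clusters contribute elements; the remaining clusters are never sampled.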
Note: Clusters vs. Strata (some vs. all): only some clusters are sampled, but all strata are.
- Clusters are representative of their population.
- e.g., You can describe the population using any cluster.
- Strata must be used together to describe their population.
- e.g., If you have male and female strata, you can’t just use the male stratum to describe the population.
Systematic R.S.: Select every kth element in the population.
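A sketch of systematic sampling, assuming a hypothetical ordered population of 100 elements and k = 10. Choosing the starting position at random among the first k elements is a common convention and an assumption here; the notes only say to take every kth element.

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility

population = list(range(1, 101))  # hypothetical ordered population
k = 10                            # sampling interval

start = rng.randrange(k)          # assumed: random start among the first k positions
systematic_sample = population[start::k]  # every kth element from the start

print(systematic_sample)
```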
Multi-Stage R.S.: A combination of at least two of the methods above (other than simple R.S.).
Note: In this course, we’ll assume the samples we’re given are good samples obtained through simple R.S., unless told otherwise.