Statistics is a branch of mathematics. It consists of tools and techniques that help describe, organize, and interpret information or data.

The Oregon Tech Libraries offer several books that can help you with statistics. Many of these books use tools such as R, Python, SPSS, and Tableau to analyze data. However, the general principles are the same for any tool you use.

There are two main ways to describe and interpret data:

**Descriptive Statistics** summarizes or describes data characteristics in a meaningful way. Data is usually described with measures of central tendency (e.g. mode, median, mean) or measures of spread (e.g. range, quartiles, standard deviation).

**Inferential Statistics** draws conclusions (predictions) from samples of a population in order to make generalizations about the entire population.

**Central Tendencies**

(an estimate of the "center" of a distribution of values)

__Median__: The "middle" of a sorted list of numbers

- Found by ordering the numbers and finding the middle number
- When the list has an even number of values, the 2 middle numbers are averaged to get the median
- Also known as the 50th percentile

__Mean__: The same as what most people call the average.

- Found by adding up a list of numbers and dividing by the count of numbers in the list

__Mode__: The most commonly occurring value

- Found by counting those numbers that repeat
- Those that repeat the most are the "mode"
- There may be more than one "mode"
- There may be no mode
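The three measures above can be sketched with Python's standard `statistics` module. The data values are made up for illustration and are not taken from any of the library's books.

```python
import statistics

values = [2, 3, 3, 5, 7, 10]  # made-up example data

# Median: the list has an even count, so the two middle values (3 and 5) are averaged
median = statistics.median(values)

# Mean: the sum of the values divided by how many there are
mean = statistics.mean(values)

# Mode: multimode returns a list, so ties (more than one mode) are all reported
modes = statistics.multimode(values)

print("median:", median)
print("mean:", mean)
print("modes:", modes)
```

Note that when no value repeats, `statistics.multimode` returns every value in the list. That result is usually read as "there is no mode."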

**Dispersion**

(spread of the values around the central tendency)

__Standard Deviation__: The extent of deviation for a group as a whole

- Find the mean
- Subtract the mean from the value of each item
- Square each result
- Sum all the squared results
- Divide that sum by a number that is 1 less than the total number of items
- Take the square root of the quotient
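The steps above can be followed one at a time in Python. This is a minimal sketch with made-up data; the variable names are only for illustration. The last line compares the result with `statistics.stdev`, which also divides by one less than the number of items.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data

mean = sum(values) / len(values)                    # find the mean
squared_diffs = [(v - mean) ** 2 for v in values]   # subtract the mean, then square
total = sum(squared_diffs)                          # sum the squared results
variance = total / (len(values) - 1)                # divide by (number of items - 1)
std_dev = math.sqrt(variance)                       # take the square root

print("standard deviation:", std_dev)
print("matches statistics.stdev:", math.isclose(std_dev, statistics.stdev(values)))
```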

__Quartiles__: Along with the median, these are values that divide your data into 4 parts.

- For the first quartile (Q1), find the middle number between the smallest number and the median
- For the third quartile (Q3), find the middle number between the largest number and the median
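One way to read the quartile rule above is as the median of the lower half and the median of the upper half of the sorted data. The sketch below assumes that reading. The function name `quartiles` and the data are hypothetical, and other tools (including Python's own `statistics.quantiles`) use slightly different conventions that can give different answers.

```python
import statistics

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    if len(data) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]                   # values below the middle position
    upper = ordered[len(ordered) - half:]    # values above the middle position
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

q1, q2, q3 = quartiles([1, 3, 4, 6, 7, 8, 10, 12, 15])  # made-up example data
print("Q1:", q1, "median:", q2, "Q3:", q3)
```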

**Distribution**

(summary of the frequency of individual values or ranges of values for a variable)

__Frequency Distribution__: A table showing the values in a sample and how often they occur

__Population__: The subjects of a particular study
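As a small illustration of the frequency distribution described above, Python's `collections.Counter` can tally how often each value occurs in a sample. The sample values here are made up.

```python
from collections import Counter

sample = ["A", "B", "A", "C", "B", "A"]  # made-up example sample

# Count how many times each distinct value appears
frequency = Counter(sample)

# Print the table, one value per row, in sorted order
for value, count in sorted(frequency.items()):
    print(f"{value}\t{count}")
```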

__Parameter__: A characteristic or property of a population

__Sample__: A subset of the population; the group from which data is actually collected

__Statistic__: A characteristic of a sample

**Estimation of parameters**

__Confidence intervals__: A range of values that gives the best estimate of a population parameter given the sample value. It is equivalent to the mean plus or minus the margin of error.

__Margin of error__:

- Calculated by taking the standard deviation
- Dividing it by the square root of the number of observations, which gives the standard error (use more than 30 observations if the standard deviation comes from the sample)
- Multiplying the result by the critical value for the chosen confidence level (1.96 for a 95% confidence level and 2.576 for a 99% confidence level)
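The calculation above can be sketched in a few lines of Python. The function name `confidence_interval` and the sample data are hypothetical. The sketch assumes the standard deviation is computed from the sample itself, so the sample has more than 30 observations.

```python
import math
import statistics

def confidence_interval(sample, z=1.96):
    """Return (low, high) as the mean plus or minus z * std dev / sqrt(n).

    z = 1.96 corresponds to a 95% confidence level, 2.576 to 99%.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    std_dev = statistics.stdev(sample)       # sample standard deviation
    margin = z * std_dev / math.sqrt(n)      # the margin added and subtracted
    return mean - margin, mean + margin

sample = [50 + (i % 7) for i in range(35)]   # made-up data, 35 observations
low, high = confidence_interval(sample)
print(f"95% interval: {low:.2f} to {high:.2f}")

low99, high99 = confidence_interval(sample, z=2.576)
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

The 99% interval comes out wider than the 95% interval because a larger multiplier is applied to the same standard error.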

**Hypothesis testing**

__Null hypothesis__: When comparing two populations, the assumption that there will be no difference between them.

__Regression analysis__: A statistical process for estimating relationships among variables. It is used to determine which independent variables (*x*) will have an impact on the main factor (dependent variable - *y*) you are trying to understand.
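Regression comes in many forms. As one simple illustration, the sketch below fits a straight line to a single independent variable using ordinary least squares. The function name `fit_line` and the data are hypothetical, and the data were chosen to lie exactly on a line so the result is easy to check.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]     # independent variable (made-up data)
ys = [3, 5, 7, 9, 11]    # dependent variable, generated as y = 2x + 1
slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
```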

__T-test__: Used to determine if there is a significant difference between the means of two groups.
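There are several variants of the t-test, and the definition above does not name one. As a rough sketch, the code below computes the test statistic for Welch's version, which does not assume the two groups have equal variances. The function name `welch_t` and the data are hypothetical. Turning the statistic into a p-value requires the t distribution, which a statistics package such as SciPy provides.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic: difference in means divided by its standard error."""
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)   # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error

group_a = [5, 6, 7, 8, 9]   # made-up data
group_b = [1, 2, 3, 4, 5]   # made-up data
t_stat = welch_t(group_a, group_b)
print("t statistic:", t_stat)
```

A t statistic far from zero suggests that the difference between the group means is large relative to the variation within the groups.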

__ANOVA__: Analysis of Variance. A test for the difference between two **or more** means.
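To make the idea concrete, here is a minimal sketch of the F statistic for a one-way ANOVA, which compares the variation between group means with the variation inside the groups. The function name `anova_f` and the data are hypothetical. As with the t-test, converting the statistic to a p-value needs a statistics package.

```python
def anova_f(*groups):
    """F statistic for a one-way ANOVA across two or more groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations

    # Variation of each group mean around the grand mean, weighted by group size
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])   # made-up data
print("F statistic:", f_stat)
```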

