Like every branch of mathematics, statistics uses many symbols and abbreviations. Below is a list of some that you may see when studying standard descriptive and inferential statistics, including probability and hypothesis testing. In general, English letters are used to represent values in a sample and Greek letters are used to represent values in a population.

**Sample Data**

Here are symbols used to represent the values of a set of data and the statistics that describe those data.

**n** The *number of values* in the sample.

**x** A *data point or value* in the sample; the measured value of the dependent variable for one value of the independent variable.

**f** The *frequency* with which a value occurs.

**x̄** The *mean* of a set of data values.

**Σx [Sigma x]** The *sum of* all of the values in the data set.

**s** The *standard deviation* of a set of data values. This is a measure of variability in the data. ($s^2$ is the *variance* of the data, but standard deviation is more commonly used.)

Consider the following table of values taken from a sample of ages of children in a pre-kindergarten class:

| Age (years) | Frequency (*f*) |
|---|---|
| 3 | 7 |
| 4 | 9 |
| 5 | 6 |

For this sample:

**x** = 3, 4 and 5

These are the data values (in this sample, ages).

**f** = 7, 9 and 6

These are the frequencies (how often each data value occurs).

**n** = 22

This is the sum of all the frequency (*f*) values.

**Σx** = 87

This is calculated by multiplying each data value by its frequency and adding the products: 3 ⋅ 7 + 4 ⋅ 9 + 5 ⋅ 6 = 87

**x̄** = 3.95 years

Formula: Σ*x*/*n*. Calculation: 87/22 ≈ 3.95

**s** = 0.79

This is a measure of roughly how far, on average, each value lies from the mean of the set of data.
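The sample statistics above can be reproduced with a short Python sketch. The variable names are illustrative, and the sketch assumes the sample form of the standard deviation (dividing by n − 1), which the text does not state explicitly but which matches the value 0.79.

```python
import math

# Frequency table from the example: age (x) -> frequency (f)
freq = {3: 7, 4: 9, 5: 6}

# n is the sum of the frequencies
n = sum(freq.values())

# Sigma x: each data value multiplied by its frequency, then summed
sum_x = sum(x * f for x, f in freq.items())

# Mean: Sigma x divided by n
mean = sum_x / n

# Sample variance (assumed n - 1 divisor) and standard deviation
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / (n - 1)
s = math.sqrt(variance)

print(n, sum_x, round(mean, 2), round(s, 2))  # 22 87 3.95 0.79
```

Dividing by n instead of n − 1 would treat the class as an entire population and gives a slightly smaller value.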

**Population Parameters**

The following symbols are used when referring to values that represent the entire population. In practice, these values are not always known or easy to obtain. When necessary, they may be estimated or hypothesized based on reasonable judgment.

**N** The *number of values* in the entire population from which a sample is drawn.

**µ [mu]** The *mean* of the population.

**σ [sigma]** The *standard deviation* of the population.

**Analysis of Data and Hypothesis Testing**

**$H_0$** The *null hypothesis* in a research study, which states that there is no difference between the things being measured.

$H_0$: There is no change in the rate at which a person types after drinking a cup of caffeinated coffee.

**$H_1$** The *alternative* or *research hypothesis* in a research study, which states that there is some difference between the things being measured. (Some studies use $H_a$ instead of $H_1$.)

$H_1$: There is a significant increase in the rate at which a person types after drinking a cup of caffeinated coffee.

**z** *Standard score for the z probability distribution* (used with large samples). The *z-score* is a measure of a value’s distance from the mean, expressed in standard deviations.

In a data set with mean (x̄) equal to 100 and standard deviation (*s*) equal to 15, a value of 130 has a z-score of 2 because 130 is two standard deviations above the mean: (130 − 100)/15 = 2.
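As a minimal sketch, the z-score in this example can be computed by subtracting the mean from the value and dividing by the standard deviation. The function name `z_score` is illustrative only.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev

# Values from the example: mean 100, standard deviation 15
print(z_score(130, 100, 15))  # 2.0
```

A negative result would indicate a value below the mean.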

**t** *Standard score for the t probability distribution* (used with small samples).

**F** *Test statistic for the F probability distribution* (used to compare two variances or standard deviations).

**r** *Correlation coefficient*. A measure of the relationship between two variables that is between –1 and 1. The closer the value is to –1 or 1, the stronger the relationship.
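To make the correlation coefficient concrete, here is a sketch that computes the Pearson form of *r*, which is an assumption about which coefficient is meant. The function name and both data sets are hypothetical, chosen so that the relationships are perfectly linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: y rises (or falls) in exact step with x
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(pearson_r(xs, rising))   # 1.0
print(pearson_r(xs, falling))  # -1.0
```

Real data rarely fall on a perfect line, so values of *r* usually land strictly between the two extremes.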

**α [alpha]** The threshold for concluding that a statistically significant outcome has occurred.

**p** The *p-value* is the probability of obtaining an outcome at least as extreme as the one observed if only random chance were at work (that is, if the null hypothesis were true). If the p-value is less than α, then the conclusion of the research is that there is evidence that the independent variable has a statistically significant effect on the dependent variable.

In a statistical test, if α = 0.05 and p = 0.031, then p < α, and we can say that there is a statistically significant effect.
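The decision rule can be sketched in a few lines of Python. The function names are illustrative, and the helper that converts a z-score into a p-value assumes a two-sided test on a standard normal distribution, which is only one of several possible test setups.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a z statistic (assumes a standard normal)."""
    # P(|Z| >= |z|) equals erfc(|z| / sqrt(2)) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))

def is_significant(p, alpha=0.05):
    """Apply the rule from the text: significant when p < alpha."""
    return p < alpha

# Values from the example: alpha = 0.05 and p = 0.031
print(is_significant(0.031, 0.05))  # True
```

Under these assumptions, a z-score of about 1.96 corresponds to a two-sided p-value of about 0.05.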

**Probability**

**P(A)** *Probability of a specific event, A,* occurring.

P(choosing a picture card from a standard deck of cards) = 12/52 = 3/13
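The card probability can be checked by listing the deck and counting, as in the sketch below. It assumes that "picture card" means jack, queen or king, and the list names are illustrative.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Build the 52-card deck as (rank, suit) pairs
deck = [(rank, suit) for rank in ranks for suit in suits]

# Assumption: picture cards are jacks, queens and kings
picture_cards = [card for card in deck if card[0] in {"J", "Q", "K"}]

# Probability = favorable outcomes / total outcomes, reduced automatically
p_picture = Fraction(len(picture_cards), len(deck))
print(p_picture)  # 3/13
```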

**${}_nP_r$** Number of *permutations* of *n* items selected *r* at a time.

${}_{12}P_5$ = the number of permutations (ordered arrangements) of 5 elements from a group of 12 elements

$${}_{12}P_5 = \frac{12!}{(12-5)!} = \frac{12!}{7!} = \frac{12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 95{,}040$$
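The permutation count can be verified with Python's standard library. This sketch computes it both from the factorial formula and with `math.perm` (available in Python 3.8 and later); the variable names are illustrative.

```python
import math

n, r = 12, 5

# Factorial formula: n! / (n - r)!
perm_from_formula = math.factorial(n) // math.factorial(n - r)

# Built-in equivalent
perm_builtin = math.perm(n, r)

print(perm_from_formula, perm_builtin)  # 95040 95040
```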

**${}_nC_r$** Number of *combinations* of *n* items selected *r* at a time.

${}_{12}C_5$ = the number of combinations (unordered arrangements) of 5 elements from a group of 12 elements

$${}_{12}C_5 = \frac{12!}{5!\,7!} = \frac{12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} = 792$$
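A similar sketch checks the combination count, using the factorial formula and `math.comb` (Python 3.8 and later). It also shows one way to see why the result is smaller than the permutation count: each unordered group of 5 can be arranged in 5! different orders. The variable names are illustrative.

```python
import math

n, r = 12, 5

# Factorial formula: n! / (r! * (n - r)!)
comb_from_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Built-in equivalent
comb_builtin = math.comb(n, r)

# Each combination corresponds to r! orderings, so nCr = nPr / r!
comb_from_perm = math.perm(n, r) // math.factorial(r)

print(comb_from_formula, comb_builtin, comb_from_perm)  # 792 792 792
```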
