Mean and median are measures of central tendency (a single value that represents the center of a dataset or data distribution). The mean is the most widely used measure of central tendency; however, if a dataset contains a value that is much larger or smaller than the rest, the mean will not give a representative result. In such cases, the median measures the central tendency more effectively.
In simple terms, the mean is the average of a set of values. However, several types of mean are used in mathematics, including the Pythagorean means, the generalized mean, and the weighted arithmetic mean. Unless stated otherwise, "mean" refers to the arithmetic mean, also known as the arithmetic average, which measures the central tendency of a finite set of numbers.
The median separates the higher half of the values from the lower half. That is why it is commonly known as the middle or middlemost value. To arrive at the median, a finite set of numbers must first be arranged in either ascending or descending order. The middle number after the arrangement is the median of that set of values. In a given population, half of the values will be less than or equal to the median, and the other half will be greater than or equal to it.
Mean vs. Median
Though both are measures of central tendency, the mean is the average of a given set of data, whereas the median is the middle value. The mean of a finite set of numbers is derived by adding all the numbers and dividing the total by the number of items/values in the set. The median is the middle value if the data has an odd number of observations. If the data set has an even number of values, the arithmetic mean of the two middle values is the median.
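The two calculations described above can be sketched in Python. This is a minimal illustration; the function name `median_by_hand` and the sample lists are made up for the example and are not part of the original text.

```python
from statistics import mean, median

def median_by_hand(values):
    """Sort the values, then pick the middle one (odd count)
    or average the two middle ones (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [7, 3, 9, 1, 5]        # hypothetical sample, 5 observations
even_data = [7, 3, 9, 1, 5, 11]   # hypothetical sample, 6 observations

print(sum(odd_data) / len(odd_data))  # mean: 25 / 5 = 5.0
print(median_by_hand(odd_data))       # middle of [1, 3, 5, 7, 9] is 5
print(median_by_hand(even_data))      # mean of 5 and 7 is 6.0
```

Python's standard `statistics.mean` and `statistics.median` give the same results, so the hand-written version is only useful for seeing the steps.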
Difference Between Mean And Median In Tabular Form
|Parameters of Comparison||Mean||Median|
|Meaning||The average of a set of data.||The middle value in a set of data.|
|Arithmetic or Positional?||A mean is an arithmetic average.||A median is a positional average.|
|Applicability||Mean is the preferred measure of central tendency in normal distributions.||The median is the best measure of central tendency in skewed distributions.|
|Outliers||Mean is affected by/sensitive to outliers (values that are extremely high or low compared to other values in a set).||The median is not affected by outliers and therefore, is ideal for measuring the central tendency of skewed distributions.|
|Formula||Mean = (x1 + x2 + … + xn) / n||With the values sorted in order: if n is odd, the median is the ((n + 1) / 2)th value; if n is even, the median is the mean of the (n/2)th and (n/2 + 1)th values.|
What Is Mean?
The mean is the simple average of a dataset. It is the most popular measure of central tendency, as it is easy to calculate. The mean is highly sensitive to extreme values in a dataset; therefore, it represents the central tendency well only when the data is roughly symmetric, as in a normal distribution. The symbol x̄ (x-bar) denotes the sample mean; the population mean is denoted by the symbol µ.
Types Of Mean
Different types of the mean are used in statistics, probability, geometry, and mathematical analysis. Some of the major types are as follows:
The arithmetic mean, geometric mean, and harmonic mean, together known as the Pythagorean means, are the most common.
Arithmetic Mean (AM)
Most people think of the arithmetic mean when they hear the word "mean." It is the simplest type to calculate, and therefore, it is the most popular as well. The arithmetic mean of a dataset is the sum of all the observations (or numbers) divided by the total number of observations. Discrete data are whole or concrete numbers with fixed values. The arithmetic mean for discrete data is calculated by summing the products of each observation and the frequency with which it occurs, and then dividing the total by the sum of the frequencies. Therefore,
x̄ = (∑fx) / N
Where N = ∑f, which is the sum of the frequencies.
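As a sketch of the frequency formula above, the snippet below computes (∑fx) / N from a table of values and their frequencies. The function name and the sample data are hypothetical.

```python
def mean_from_frequencies(freq_table):
    """Arithmetic mean of discrete data given as {value: frequency}."""
    total = sum(x * f for x, f in freq_table.items())  # sum of f*x
    n = sum(freq_table.values())                       # N = sum of f
    if n == 0:
        raise ValueError("frequencies must not sum to zero")
    return total / n

# Hypothetical data: goals scored per match and how often each count occurred.
goals = {0: 3, 1: 5, 2: 2}

# sum of f*x = 0*3 + 1*5 + 2*2 = 9, N = 10
print(mean_from_frequencies(goals))  # 0.9
```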
If the data can take any value between two points (an interval), it is continuous data. Age, weight, height, and so on are examples of continuous data. Such data is usually grouped into class intervals, for example 30-40, 40-50, and 50-60. To calculate the arithmetic mean of grouped data, the midpoint of each interval is calculated first. Next, each midpoint is multiplied by the frequency of its interval. Finally, the sum of the products is divided by the sum of the frequencies. Therefore,
x̄ = (∑fm) / N
Where m is the midpoint of a class interval, ∑fm is the sum of the products of the midpoints and their corresponding frequencies, and N is the sum of the frequencies.
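The grouped-data calculation can be sketched as follows, using the class intervals mentioned above. The frequencies and the function name are assumptions chosen for illustration.

```python
def grouped_mean(classes):
    """Mean of grouped data given as (lower, upper, frequency) tuples.

    Each class is represented by its midpoint, so the result is an
    estimate of the mean of the underlying observations.
    """
    total = 0.0
    n = 0
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2
        total += freq * midpoint   # accumulate f*m
        n += freq                  # accumulate N
    if n == 0:
        raise ValueError("frequencies must not sum to zero")
    return total / n

# Intervals from the text; the frequencies 4, 10, 6 are hypothetical.
ages = [(30, 40, 4), (40, 50, 10), (50, 60, 6)]

# sum of f*m = 4*35 + 10*45 + 6*55 = 920, N = 20
print(grouped_mean(ages))  # 46.0
```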
Geometric Mean (GM)
The geometric mean is calculated using the product of the observations in a data set instead of their sum as in the case of the arithmetic mean. It is the nth root of the product of n numbers. That is,
Geometric mean = (x1x2…xn)^(1/n)
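A minimal sketch of the nth-root formula above, assuming Python 3.8 or later (for `math.prod` and `statistics.geometric_mean`). The helper name and the sample values are hypothetical.

```python
from math import prod
from statistics import geometric_mean

def geometric_mean_by_hand(values):
    """nth root of the product of n positive numbers."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("geometric mean needs at least one positive number")
    return prod(values) ** (1 / len(values))

sample = [2, 8]  # hypothetical values; product is 16, square root is 4

print(geometric_mean_by_hand(sample))
print(geometric_mean(sample))  # standard-library equivalent
```

The two results may differ in the last decimal place because of floating-point rounding. For long lists, the standard-library function is the safer choice, since multiplying many numbers together can overflow.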
Harmonic Mean (HM)
The harmonic mean is useful when the numbers are expressed as rates or ratios. It is the reciprocal of the arithmetic mean of the reciprocals of the observations. That is,
Harmonic mean = n / (1/x1 + 1/x2 + … + 1/xn)
Moreover, AM ≥ GM ≥ HM for any set of positive numbers, where AM represents the arithmetic mean, GM represents the geometric mean, and HM represents the harmonic mean.
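The harmonic mean and the ordering AM ≥ GM ≥ HM can be checked with a small example. The scenario (two equal-length legs of a trip driven at 40 and 60 km/h) and the helper name are assumptions made for illustration.

```python
from math import sqrt
from statistics import harmonic_mean

def harmonic_mean_by_hand(values):
    """n divided by the sum of the reciprocals of n positive numbers."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs at least one positive number")
    return len(values) / sum(1 / x for x in values)

speeds = [40, 60]  # hypothetical speeds over two equal distances

am = sum(speeds) / len(speeds)       # arithmetic mean: 50.0
gm = sqrt(speeds[0] * speeds[1])     # geometric mean: about 48.99
hm = harmonic_mean_by_hand(speeds)   # harmonic mean: about 48

print(am, gm, hm)
print(am >= gm >= hm)  # True
```

The harmonic mean of about 48 km/h is the true average speed for the whole trip, which is why it suits rates better than the arithmetic mean of 50 km/h.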
Inequality Of Arithmetic And Geometric Mean
The AM-GM inequality states that the arithmetic mean of a set of non-negative numbers is greater than or equal to their geometric mean. The two means are equal if and only if every number in the given dataset is the same. For two numbers x and y, this relationship can be written as follows:
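(x + y)/2 ≥ √(xy) (i.e., AM ≥ GM).
The arithmetic mean is the same as the geometric mean only if x = y. For non-negative x and y, this relationship can be derived as follows:
(x - y)² ≥ 0 (since the square of a real number is never negative)
x² - 2xy + y² ≥ 0
x² + 2xy + y² ≥ 4xy (adding 4xy to both sides)
(x + y)² ≥ 4xy
x + y ≥ 2√(xy) (taking square roots, which is valid because x and y are non-negative)
(x + y)/2 ≥ √(xy)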
For n non-negative numbers, the AM-GM inequality is represented as follows:
(x1+x2+…+xn)/n ≥ (x1x2…xn)^(1/n). They are equal only if x1=x2=…=xn.
For skewed distributions, the mean will generally not be the same as the median (the middle value) or the mode (the most likely value). For example, the mean is pulled upwards when the central tendency of people's income is calculated, as it is affected by the small number of extremely large values in the dataset. The outliers do not affect the median, and the mode remains the most likely value. Though the median and mode may seem like better measures for skewed distributions, the mean is the preferred measure when it comes to exponential and Poisson distributions.
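The income example can be illustrated with a small made-up sample. The figures below are hypothetical (annual incomes in thousands) and include one extreme value.

```python
from statistics import mean, median, mode

# Hypothetical incomes in thousands; 500 is the outlier.
incomes = [30, 32, 35, 35, 38, 40, 500]

print(mean(incomes))    # 710 / 7, roughly 101.4, pulled up by the outlier
print(median(incomes))  # 35, the 4th of the 7 sorted values
print(mode(incomes))    # 35, the most frequent value

# Dropping the outlier barely moves the median but changes the mean a lot.
without_outlier = incomes[:-1]
print(mean(without_outlier))    # 210 / 6 = 35
print(median(without_outlier))  # 35.0
```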
Weighted Arithmetic Mean
A weighted average, or weighted arithmetic mean, is used when the values being averaged do not all carry the same importance. Each value is multiplied by its weight, and the sum of these products is divided by the sum of the weights. For example, to combine the means of several datasets of different sizes, the mean of each dataset is calculated first and then weighted by the number of observations in that dataset.
Weighted Arithmetic Mean = (w1x1+w2x2+…+wnxn) / (w1+w2+…+wn)
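A minimal sketch of the weighted mean formula above. The function name, the scores, and the weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of w*x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical exam scores; the first exam counts twice as much as the others.
scores = [80, 90, 60]
weights = [2, 1, 1]

# (2*80 + 1*90 + 1*60) / (2 + 1 + 1) = 310 / 4
print(weighted_mean(scores, weights))  # 77.5
```

With equal weights the result reduces to the ordinary arithmetic mean.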
Truncated Mean
The mean of a dataset is sensitive to outliers; to overcome this drawback, a truncated mean may be used. A portion of the sorted data at the top and the bottom (typically the same number of values at each end) is discarded, and the rest of the observations are used to calculate the arithmetic mean. The interquartile mean is an example of a truncated mean: the lowest 25% and the highest 25% of the values are dropped, and the arithmetic mean of the middle 50% is calculated.
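A truncated mean can be sketched as below. The function name and the data are hypothetical, and the parameter `k` (how many values to drop from each end) is an assumption of this sketch rather than a standard interface. Dropping the lowest and highest quarter of the values gives the interquartile mean, which is simple to do here because the sample size is divisible by 4.

```python
def truncated_mean(values, k):
    """Arithmetic mean after dropping the k smallest and k largest values."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value")
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [1, 4, 5, 6, 7, 8, 12, 100]  # hypothetical sample with one outlier

print(sum(data) / len(data))             # plain mean: 143 / 8 = 17.875
print(truncated_mean(data, 1))           # drops 1 and 100: 42 / 6 = 7.0
print(truncated_mean(data, len(data) // 4))  # drops a quarter per end: 26 / 4 = 6.5
```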
Mean of a function, mean of angles and cyclical quantities, Fréchet mean, triangular mean, and Swanson's rule are some of the other types of the mean.
What Is Median?
The median is the middle value after a dataset is arranged in ascending or descending order. The median is also referred to as the midpoint because of this calculation process. A common mistake is to forget to arrange the data before selecting the middle value, which gives a wrong result. The process varies slightly if the dataset has an even number of observations: after arranging the data, the mean of the two middle values is calculated to obtain the median. For continuous data grouped into class intervals, the median is estimated using the following formula:
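Median = L + ((n/2 - c.f.) / f) × i
Where
n = the total number of observations (the sum of the frequencies),
L = the lower limit of the median class (the class in which the cumulative frequency first reaches n/2),
c.f. = the cumulative frequency of the class preceding the median class,
f = the frequency of the median class, and
i = the class width.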
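The grouped-median formula can be sketched as follows. The function name, the class intervals, and the frequencies are hypothetical. The median class is taken to be the first class whose cumulative frequency reaches n/2, which is the usual convention but is stated here as an assumption.

```python
def grouped_median(classes):
    """Median estimate for grouped data given as (lower, upper, frequency)
    tuples sorted by lower limit: L + ((n/2 - c.f.) / f) * i."""
    n = sum(freq for _, _, freq in classes)
    if n == 0:
        raise ValueError("frequencies must not sum to zero")
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= n / 2:
            width = upper - lower
            return lower + (n / 2 - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median class not found")

# Hypothetical grouped data: n = 20, so n/2 = 10.
ages = [(30, 40, 5), (40, 50, 8), (50, 60, 7)]

# Median class is 40-50: L = 40, c.f. = 5, f = 8, i = 10
# 40 + ((10 - 5) / 8) * 10 = 46.25
print(grouped_median(ages))  # 46.25
```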
In a normal distribution, Mean = Median = Mode.
The median is usually chosen to measure central tendency when the dataset has extreme values (values that are too high or too low compared to the other values in the set). Though a truncated mean could be used for this purpose, the median is a more robust measure because it is not affected by how extreme the outlying values are.
History Of Median
An older concept closely related to the modern median is the mid-range. It is believed that the idea of a median may have been first proposed in Edward Wright's book Certaine Errors in Navigation (1599). However, Antoine Augustin Cournot, in 1843, was the first person to use the term median in the modern sense.
The median concept was applied in astronomy and its related fields and later to sociological and psychological phenomena. The median was referred to as the middle-most value and the medium until 1880; only in 1881 did Francis Galton use the English word median. Statisticians favored the median because of the ease with which it can be computed by hand, which made it popular in the 19th century.
Fields In Which The Median Is Applicable
The median is useful in several fields; some of the major fields in which it is applied are:
The median cost of housing indicates the typical value of a house. As the median is not affected by the extremely high values of a few properties, it is a better measure than the mean.
The median helps to assess whether a distribution is symmetric. In a perfectly symmetric distribution, the mean and the median are the same, so a large gap between the two indicates skew. A distribution is usually treated as approximately symmetric if the two values are close together.
Human Resource Management
Human resource managers calculate the median wage in various fields of business so that they can offer the best pay package to their potential employees. Calculating the median helps them ascertain a pay offer that is beyond what most companies offer. However, they are careful not to overpay, as it will diminish their organization’s profit leading to the unhappiness of shareholders.
Main Difference Between Mean And Median In Points
- The mean value is derived by dividing the sum of the given values by the number of observations. On the other hand, the median is simply the middle value of a given set of data.
- It is not necessary to find the median when calculating the mean, whereas it is necessary to find the mean of the two middle values in a dataset to calculate the median if the number of observations is even. The median can also be viewed as a fully trimmed mid-range, since it is what remains when all but the middle value (or the two middle values) are discarded.
- Calculating the median when the number of observations is even can be more complex than calculating the mean.
- In a skewed distribution, the mean value is far away from the median value due to the impact of outliers.
- The median is a more dependable measure of central tendency when it comes to skewed distributions. Mean is more suited to measure the central tendency in normal distributions.
- The sum of the observations in a data set determines the value of the mean. However, a particular position in a data set determines the value of a median. That is why the former is referred to as arithmetic average and the latter as positional average.
- Statisticians preferred using the median in the 19th century; however, since the 20th century, the mean has been the most popular measure of central tendency. Nevertheless, the median is considered to be more representative of a nation's actual income distribution.
One cannot say that the mean is a better measure of central tendency than the median or vice versa, as each suits different types of distributions. In certain cases, one cannot be used in place of the other. Because they measure different things, a direct comparison is not meaningful. The choice between the mean and the median should be based on what needs to be measured.