Average

From Infogalactic: the planetary knowledge core

In colloquial language, an average is the sum of a list of numbers divided by the number of numbers in the list. In mathematics and statistics, this would be called the arithmetic mean. However, the word "average" may also refer to the median, mode, or other central or typical value. In statistics, these are all known as measures of central tendency.

Calculation

Arithmetic mean


The most common type of average is the arithmetic mean. If n numbers are given, each denoted by a_i (where i = 1, 2, …, n), the arithmetic mean is the sum of the a_i divided by n, or

AM = \frac{1}{n}\sum_{i=1}^na_i = \frac{1}{n}\left(a_1 + a_2 + \cdots + a_n\right)

The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. One may find that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum 8. If we increase the number of terms in the list to 2, 8, and 11, the arithmetic mean is found by solving for the value of A in the equation 2 + 8 + 11 = A + A + A. One finds that A = (2 + 8 + 11)/3 = 7.
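The worked example above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the article.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the arithmetic mean of an empty list is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([2, 8]))      # 5.0
print(arithmetic_mean([8, 2]))      # 5.0, the order does not matter
print(arithmetic_mean([2, 8, 11]))  # 7.0
```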

Pythagorean means


Along with the arithmetic mean above, the geometric mean and the harmonic mean are known collectively as the Pythagorean means.

Geometric mean

The geometric mean of n non-negative numbers is obtained by multiplying them all together and then taking the nth root. In algebraic terms, the geometric mean of a_1, a_2, …, a_n is defined as

 GM= \sqrt[n]{\prod_{i=1}^n a_i} = \sqrt[n]{a_1 a_2 \cdots a_n}

For positive numbers, the geometric mean can be thought of as the antilog of the arithmetic mean of the logs of the numbers.

Example: The geometric mean of 2 and 8 is GM = \sqrt{2 \cdot 8} = 4
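The "antilog of the mean of the logs" view translates directly into code. The sketch below assumes non-negative inputs and uses the illustrative name `geometric_mean`; working with logarithms avoids overflow when many numbers are multiplied.

```python
import math


def geometric_mean(values):
    """Geometric mean computed as exp(mean(log(v)))."""
    values = list(values)
    if not values:
        raise ValueError("the geometric mean of an empty list is undefined")
    if any(v < 0 for v in values):
        raise ValueError("all values must be non-negative")
    if any(v == 0 for v in values):
        return 0.0  # a zero factor makes the whole product zero
    log_mean = sum(math.log(v) for v in values) / len(values)
    return math.exp(log_mean)


print(geometric_mean([2, 8]))  # about 4, up to floating-point rounding
```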

Harmonic mean

The harmonic mean of a non-empty collection of numbers a_1, a_2, …, a_n, all different from 0, is defined as the reciprocal of the arithmetic mean of the reciprocals of the a_i:

HM = \frac{1}{\frac{1}{n}\sum_{i=1}^n \frac{1}{a_i}} = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}

One example where the harmonic mean is useful is when examining the speed for a number of fixed-distance trips. For example, if the speed for going from point A to B was 60 km/h, and the speed for returning from B to A was 40 km/h, then the harmonic mean speed is given by

\frac{2}{\frac{1}{60} + \frac{1}{40}} = 48
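The speed example can be checked numerically. The sketch below computes the harmonic mean and compares it with total distance divided by total time; the 120 km leg length is an assumption made here for the check (any fixed distance gives the same result), and `harmonic_mean` is an illustrative name.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    values = list(values)
    if not values:
        raise ValueError("the harmonic mean of an empty list is undefined")
    if any(v == 0 for v in values):
        raise ValueError("all values must be different from 0")
    return len(values) / sum(1 / v for v in values)


print(harmonic_mean([60, 40]))  # about 48 (km/h)

# Cross-check with an assumed leg length of 120 km each way.
leg_km = 120
total_time_h = leg_km / 60 + leg_km / 40  # 2 h out, 3 h back
print(2 * leg_km / total_time_h)          # 48.0 (km/h)
```

The agreement between the two numbers is why the harmonic mean, not the arithmetic mean, gives the average speed over trips of equal distance.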

Inequality concerning AM, GM, and HM

A well-known inequality relating the arithmetic, geometric, and harmonic means of any set of positive numbers is

AM \ge GM \ge HM

The inequality is easy to remember because the alphabetical order of the letters A, G, and H is preserved in it. See Inequality of arithmetic and geometric means.

Thus, for the speeds of 60 km/h and 40 km/h in the harmonic mean example above: AM = 50 km/h, GM ≈ 49 km/h, and HM = 48 km/h.
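A short numerical check of the inequality for the two speeds is sketched below. The helper name `pythagorean_means` is illustrative, and the sketch assumes strictly positive inputs.

```python
import math


def pythagorean_means(values):
    """Return (AM, GM, HM) for a list of strictly positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("a non-empty list of positive numbers is required")
    n = len(values)
    am = sum(values) / n
    gm = math.exp(sum(math.log(v) for v in values) / n)
    hm = n / sum(1 / v for v in values)
    return am, gm, hm


am, gm, hm = pythagorean_means([60, 40])
print(am, gm, hm)   # roughly 50, 48.99 and 48
print(am > gm > hm)  # True
```

For lists whose elements are all equal, the three means coincide mathematically, so a floating-point comparison should allow a small tolerance in that case.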

Statistical location

The mode, the median, and the mid-range are often used in addition to the mean as estimates of central tendency in descriptive statistics.

Mode

Figure: Comparison of arithmetic mean, median and mode of two log-normal distributions with different skewness.


The most frequently occurring number in a list is called the mode. For example, the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. It may happen that there are two or more numbers which occur equally often and more often than any other number. In this case there is no agreed definition of mode. Some authors say they are all modes and some say there is no mode.
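Because authors disagree on how to treat ties, the sketch below simply returns every value that attains the highest count and leaves the interpretation to the caller. The function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in order of first appearance."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3, 3, 3, 4]))  # [3]
print(modes([1, 1, 2, 2, 3]))        # [1, 2], a tie
```

A caller following the "no mode" convention can treat any result with more than one element as undefined.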

Median


The median is the middle number of a group when the numbers are ranked in order. (If there is an even number of numbers, the mean of the middle two is taken.)

Thus to find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values, the median is the arithmetic mean of these two. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5.
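The pair-removal procedure described above can be sketched as follows. The name `median_by_trimming` is illustrative; in practice one would index the middle of the sorted list directly, but this version mirrors the description step by step.

```python
def median_by_trimming(values):
    """Sort, then repeatedly drop the lowest and highest values."""
    remaining = sorted(values)
    if not remaining:
        raise ValueError("the median of an empty list is undefined")
    while len(remaining) > 2:
        remaining = remaining[1:-1]  # remove one value from each end
    # One value left: it is the median. Two left: take their mean.
    return sum(remaining) / len(remaining)


print(median_by_trimming([1, 7, 3, 13]))  # 5.0
print(median_by_trimming([2, 8, 11]))     # 8.0
```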

Summary of types


Name | Equation or description
Arithmetic mean | \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i = \frac{1}{n} (x_1 + \cdots + x_n)
Median | The middle value that separates the higher half from the lower half of the data set
Geometric median | A rotation invariant extension of the median for points in R^n
Mode | The most frequent value in the data set
Geometric mean | \bigg(\prod_{i=1}^n x_i \bigg)^{\frac{1}{n}} = \sqrt[n]{x_1 \cdot x_2 \dotsb x_n}
Harmonic mean | \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}
Quadratic mean (or RMS) | \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} = \sqrt{\frac{1}{n}\left(x_1^2 + x_2^2 + \cdots + x_n^2\right)}
Cubic mean | \sqrt[3]{\frac{1}{n} \sum_{i=1}^{n} x_i^3} = \sqrt[3]{\frac{1}{n}\left(x_1^3 + x_2^3 + \cdots + x_n^3\right)}
Generalized mean | \sqrt[p]{\frac{1}{n} \cdot \sum_{i=1}^n x_{i}^p}
Weighted mean | \frac{ \sum_{i=1}^n w_i x_i}{\sum_{i=1}^n w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}
Truncated mean | The arithmetic mean of data values after a certain number or proportion of the highest and lowest data values have been discarded
Interquartile mean | A special case of the truncated mean, using the interquartile range
Midrange | \frac{1}{2}\left(\max x + \min x\right)
Winsorized mean | Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain

The table of mathematical symbols explains the symbols used in the table above.
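A few of the table's entries are easier to compare side by side in code. The sketch below covers the generalized (power) mean, which yields the arithmetic, quadratic, cubic and harmonic means for p = 1, 2, 3 and −1, together with the truncated and Winsorized means. All function names and the parameter `k` (the number of values affected at each end) are assumptions made for illustration, and the power mean is written for positive inputs only.

```python
def power_mean(values, p):
    """Generalized mean with exponent p (p != 0), for positive values."""
    values = list(values)
    if not values or p == 0:
        raise ValueError("need a non-empty list and a nonzero exponent")
    return (sum(v ** p for v in values) / len(values)) ** (1 / p)


def truncated_mean(values, k):
    """Discard the k lowest and k highest values, then average the rest."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Replace the k lowest and k highest values by the nearest kept value."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    low, high = ordered[k], ordered[-k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return sum(clipped) / len(clipped)


data = [1, 3, 7, 13, 100]
print(power_mean([2, 8], 1))     # arithmetic mean, 5.0
print(power_mean([2, 8], 2))     # quadratic mean, about 5.83
print(power_mean([2, 8], -1))    # harmonic mean, about 3.2
print(truncated_mean(data, 1))   # mean of 3, 7, 13, about 7.67
print(winsorized_mean(data, 1))  # mean of 3, 3, 7, 13, 13, which is 7.8
```

Both robust variants reduce the influence of the outlier 100, which pulls the plain arithmetic mean of `data` up to 24.8.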

Miscellaneous types

Other more sophisticated averages are: trimean, trimedian, and normalized mean, with their generalizations.[1]

One can create one's own average metric using the generalized f-mean:

y = f^{-1}\left(\frac{1}{n}\left[f(x_1) + f(x_2) + \cdots + f(x_n)\right]\right)

where f is any invertible function. The harmonic mean is an example of this using f(x) = 1/x, and the geometric mean is another, using f(x) = log x.
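The generalized f-mean can be sketched as a function that takes f and its inverse as arguments. The name `f_mean` and the requirement that the caller supply the inverse explicitly are choices made for this illustration.

```python
import math


def f_mean(values, f, f_inverse):
    """Apply f, take the arithmetic mean, then map back with f_inverse."""
    values = list(values)
    if not values:
        raise ValueError("the f-mean of an empty list is undefined")
    return f_inverse(sum(f(v) for v in values) / len(values))


# f(x) = x gives the arithmetic mean.
print(f_mean([2, 8], lambda x: x, lambda y: y))            # 5.0
# f(x) = log x gives the geometric mean.
print(f_mean([2, 8], math.log, math.exp))                  # about 4
# f(x) = 1/x gives the harmonic mean.
print(f_mean([60, 40], lambda x: 1 / x, lambda y: 1 / y))  # about 48
```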

However, this method for generating means is not general enough to capture all averages. A more general method[2] for defining an average takes any function g(x_1, x_2, …, x_n) of a list of arguments that is continuous, strictly increasing in each argument, and symmetric (invariant under permutation of the arguments). The average y is then the value that, when it replaces each member of the list, results in the same function value: g(y, y, …, y) = g(x_1, x_2, …, x_n). This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself. The function g(x_1, x_2, …, x_n) = x_1 + x_2 + ··· + x_n provides the arithmetic mean. The function g(x_1, x_2, …, x_n) = x_1 x_2 ··· x_n (where the list elements are positive numbers) provides the geometric mean. The function g(x_1, x_2, …, x_n) = −(x_1^{−1} + x_2^{−1} + ··· + x_n^{−1}) (where the list elements are positive numbers) provides the harmonic mean.[2]
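Because g is continuous and strictly increasing in each argument, the equation g(y, y, …, y) = g(x_1, x_2, …, x_n) has a unique solution between the smallest and largest list elements, so it can be found numerically by bisection. The sketch below is one possible way to do this, not a method given in the source; the name `average_from_g`, the fixed iteration count, and the convention that g receives a single list are all assumptions.

```python
import math


def average_from_g(g, values, iterations=200):
    """Solve g([y, y, ..., y]) == g(values) for y by bisection."""
    values = list(values)
    if not values:
        raise ValueError("the average of an empty list is undefined")
    n = len(values)
    target = g(values)
    low, high = min(values), max(values)  # the solution lies in this interval
    for _ in range(iterations):
        mid = (low + high) / 2
        if g([mid] * n) < target:
            low = mid   # g is increasing, so the solution is above mid
        else:
            high = mid
    return (low + high) / 2


print(average_from_g(sum, [2, 8]))                                  # about 5
print(average_from_g(math.prod, [2, 8]))                            # about 4
print(average_from_g(lambda xs: -sum(1 / x for x in xs), [60, 40]))  # about 48
```

The three calls recover the arithmetic, geometric and harmonic means from the three functions g listed in the paragraph above.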

Average percentage return and CAGR


A type of average used in finance is the average percentage return. It is an example of a geometric mean. When the returns are annual, it is called the Compound Annual Growth Rate (CAGR). For example, if we are considering a period of two years, and the investment return in the first year is −10% and the return in the second year is +60%, then the average percentage return or CAGR, R, can be obtained by solving the equation: (1 − 10%) × (1 + 60%) = (1 − 0.1) × (1 + 0.6) = (1 + R) × (1 + R). The value of R that makes this equation true is 0.2, or 20%. This means that the total return over the 2-year period is the same as if there had been 20% growth each year. Note that the order of the years makes no difference: the average percentage return for +60% followed by −10% is the same as that for −10% followed by +60%.
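The two-year example can be reproduced by taking the geometric mean of the growth factors and subtracting 1. The function name `average_percentage_return` is illustrative, and returns are passed as fractions (−0.10 for −10%).

```python
def average_percentage_return(returns):
    """Geometric mean of the growth factors (1 + r), minus 1."""
    returns = list(returns)
    if not returns:
        raise ValueError("at least one period is required")
    growth = 1.0
    for r in returns:
        growth *= 1 + r  # compound the per-period growth factors
    return growth ** (1 / len(returns)) - 1


print(average_percentage_return([-0.10, 0.60]))  # about 0.2, i.e. 20%
print(average_percentage_return([0.60, -0.10]))  # same value, order is irrelevant
```

By contrast, the arithmetic mean of −10% and +60% is 25%, which overstates the growth actually achieved.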

This method can be generalized to examples in which the periods are not equal. For example, consider a period of a half of a year for which the return is −23% and a period of two and a half years for which the return is +13%. The average percentage return for the combined period is the single year return, R, that is the solution of the following equation: (1 − 0.23)^{0.5} × (1 + 0.13)^{2.5} = (1 + R)^{0.5+2.5}, giving an average percentage return R of 0.0600 or 6.00%.
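The unequal-period calculation is sketched below. It assumes, as the exponents in the equation imply, that each quoted return is a per-year rate applied over the stated number of years; the function name `annualized_return` and the `(rate, years)` pair layout are illustrative choices.

```python
def annualized_return(periods):
    """periods: iterable of (annual_rate, years) pairs."""
    periods = list(periods)
    total_years = sum(years for _, years in periods)
    if total_years <= 0:
        raise ValueError("the total duration must be positive")
    growth = 1.0
    for rate, years in periods:
        growth *= (1 + rate) ** years  # growth factor over this period
    return growth ** (1 / total_years) - 1


r = annualized_return([(-0.23, 0.5), (0.13, 2.5)])
print(round(r, 4))  # 0.06
```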

Moving average


Given a time series, such as daily stock market prices or yearly temperatures, people often want to create a smoother series.[3] This helps to show underlying trends or perhaps periodic behavior. An easy way to do this is to choose a number n and create a new series by taking the arithmetic mean of the first n values, then moving forward one place and taking the mean of the next n values, and so on. This is the simplest form of moving average. More complicated forms use a weighted average. The weighting can be used to enhance or suppress various periodic behavior, and there is very extensive analysis of what weightings to use in the literature on filtering. In digital signal processing the term “moving average” is used even when the sum of the weights is not 1.0 (so the output series is a scaled version of the averages).[4] The reason for this is that the analyst is usually interested only in the trend or the periodic behavior. A further generalization is an “autoregressive moving average”. In this case the average also includes some of the recently calculated outputs. This allows samples from further back in the history to affect the current output.
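The simple and weighted forms can be sketched as follows. The function names are illustrative, and the output here has one entry per full window, so it is shorter than the input by n − 1 values; other conventions for the edges exist.

```python
def simple_moving_average(series, n):
    """Arithmetic mean of each run of n consecutive values."""
    series = list(series)
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and the series length")
    return [sum(series[i:i + n]) / n for i in range(len(series) - n + 1)]


def weighted_moving_average(series, weights):
    """Weighted sum over each window; weights need not sum to 1."""
    series = list(series)
    weights = list(weights)
    n = len(weights)
    if n == 0 or n > len(series):
        raise ValueError("need between 1 and len(series) weights")
    return [
        sum(w * x for w, x in zip(weights, series[i:i + n]))
        for i in range(len(series) - n + 1)
    ]


print(simple_moving_average([1, 2, 3, 4, 5], 3))                    # [2.0, 3.0, 4.0]
print(weighted_moving_average([1, 2, 3, 4, 5], [0.25, 0.5, 0.25]))  # [2.0, 3.0, 4.0]
```

With equal weights of 1/n the weighted version reduces to the simple moving average.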

History

Origin

The first recorded time that the arithmetic mean was extended from 2 to n cases for the use of estimation was in the sixteenth century. From the late sixteenth century onwards, it gradually became a common method for reducing errors of measurement in various areas.[5][6] At the time, astronomers wanted to know a real value from noisy measurements, such as the position of a planet or the diameter of the moon. Using the mean of several measured values, scientists assumed that the errors add up to a relatively small number when compared to the total of all measured values. The method of taking the mean for reducing observation errors was indeed mainly developed in astronomy.[5][7] A possible precursor to the arithmetic mean is the mid-range (the mean of the two extreme values), used for example in Arabian astronomy of the ninth to eleventh centuries, but also in metallurgy and navigation.[6]

However, there are various older, vaguer references to the use of the arithmetic mean (which are not as clear, but might reasonably have to do with our modern definition of the mean). In a text from the 4th century, it was written that (text in square brackets is possible missing text that might clarify the meaning):[8]

In the first place, we must set out in a row the sequence of numbers from the monad up to nine: 1, 2, 3, 4, 5, 6, 7, 8, 9. Then we must add up the amount of all of them together, and since the row contains nine terms, we must look for the ninth part of the total to see if it is already naturally present among the numbers in the row; and we will find that the property of being [one] ninth [of the sum] only belongs to the [arithmetic] mean itself...

Even older potential references exist. There are records that from about 700 BC, merchants and shippers agreed that damage to the cargo and ship (their "contribution" in case of damage by the sea) should be shared equally among themselves.[7] This might have been calculated using the average, although there seems to be no direct record of the calculation.

Etymology

According to the Oxford English Dictionary, "few words have received more etymological investigation."[9] In the 16th century, average meant a customs duty, or the like, and was used in the Mediterranean area. It came to mean the cost of damage sustained at sea. From that came an "average adjuster" who decided how to apportion a loss between the owners and insurers of a ship and cargo.

Marine damage is either particular average, which is borne only by the owner of the damaged property, or general average, where the owner can claim a proportional contribution from all the parties to the marine venture. The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean".

A second English usage, documented as early as 1674 and sometimes spelled "averish," is as the residue and second growth of field crops, which were considered suited to consumption by draught animals ("avers").[10]

The root is found in Arabic as awar, in Italian as avaria, in French as avarie and in Dutch as averij. It is unclear in which language the word first appeared.

There is an earlier (from at least the 11th century), unrelated use of the word. It appears to be an old legal term for a tenant's day labour obligation to a sheriff, probably anglicised from "avera" found in the English Domesday Book (1085).

References

  1. [citation text not recoverable]
  2. [citation text not recoverable]
  3. [citation text not recoverable]
  4. [citation text not recoverable]
  5. [citation text not recoverable]
  6. Eisenhart, Churchill. "The development of the concept of the best mean of a set of measurements from antiquity to the present day." Unpublished presidential address, American Statistical Association, 131st Annual Meeting, Fort Collins, Colorado. 1971.
  7. Bakker, Arthur. "The early history of average values and implications for education." Journal of Statistics Education 11.1 (2003): 17-26.
  8. Waterfield, Robin. "The Theology of Arithmetic: On the Mystical, Mathematical and Cosmological Symbolism of the First Ten Numbers." (1988). Page 70.
  9. [citation text not recoverable] (Subscription or UK public library membership required.)
  10. [citation text not recoverable]