Complete Statistics Course

The Language of Data
& Uncertainty

From central tendency to regression: every concept you need, with interactive visualizations.

Central Tendency | Dispersion | Distributions | Probability | Correlation | Regression | Hypothesis Testing
Module 01 · Central Tendency
Measures of Central Tendency
Where does the data "centre" itself? Mean, Median, Mode, and when to use each.
Concept Overview
What is Central Tendency?

A measure of central tendency gives us a single representative value that summarises an entire dataset. It answers: "What is the typical or central value?" The three main measures are the Mean, Median, and Mode, each with unique strengths and use-cases.

Arithmetic Mean
Sum of all values ÷ count. Most common, sensitive to outliers.
Median
Middle value when sorted. Robust to outliers. Best for income data.
Mode
Most frequent value. Used for categorical data & shoe sizes.
Geometric Mean
ⁿ√(x₁×x₂×…×xₙ). Used for growth rates, investment returns.
Harmonic Mean
n ÷ Σ(1/xᵢ). Used for rates and speeds.
Weighted Mean
Mean where values have different importance weights.
Measure 1
Arithmetic Mean (AM): The Average

The arithmetic mean is the sum of all observations divided by the number of observations. It is the most widely used measure and is the "balance point" of a distribution.

Arithmetic Mean
x̄ = (x₁ + x₂ + ... + xₙ) / n = Σxᵢ / n

For Grouped Data: x̄ = Σ(fᵢ × mᵢ) / Σfᵢ, where fᵢ is the frequency and mᵢ the midpoint of class i
1. List all values
Dataset: 12, 18, 25, 30, 15 → n = 5
2. Sum all values
Σx = 12 + 18 + 25 + 30 + 15 = 100
3. Divide by count
x̄ = 100 / 5 = 20
⚠ Outlier Problem: Salaries: ₹20k, ₹22k, ₹19k, ₹21k, ₹500k. Mean = ₹116.4k, which is completely misleading! One CEO salary skews the entire picture, while the median (₹21k) stays typical. This is why income is usually summarised with the median rather than the mean alone.
[Figure: Mean as Balance Point (interactive)]
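The outlier effect in the salary example can be checked in a few lines. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

# Salaries in thousands of rupees, taken from the outlier example above.
salaries = [20, 22, 19, 21, 500]

avg = mean(salaries)    # pulled upward by the single 500k salary
mid = median(salaries)  # middle value after sorting: 19, 20, 21, 22, 500

print(f"mean = {avg}k, median = {mid}k")
```

Four of the five salaries sit far below the mean, which is why the median is the safer summary here.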
Measure 2
Median: The Middle Value

The median is the middle value when data is arranged in ascending order. For an even number of observations, it's the mean of the two middle values. The median is not affected by extreme values (outliers), making it ideal for skewed distributions like income and house prices.

Median: Ungrouped Data
Odd n: Median = value at position (n+1)/2
Even n: Median = average of values at n/2 and (n/2)+1

For Grouped Data (interpolation within the median class):
Median = L + [(n/2 − cf) / f] × h
Grouped Data Formula Legend: L = lower boundary of median class | n = total frequency | cf = cumulative frequency before median class | f = frequency of median class | h = class width
[Figure: Mean vs Median in Skewed Data]
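The grouped-data formula can be sketched as a small function. The frequency table below is hypothetical (it does not come from the course), and the function assumes equal-width classes listed in ascending order.

```python
def grouped_median(lower_bounds, freqs, width):
    """Median = L + ((n/2 - cf) / f) * h for equal-width classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty frequency table")
    cf = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cf + f >= n / 2:  # first class that reaches n/2
            return lower + ((n / 2 - cf) / f) * width
        cf += f
    raise ValueError("median class not found")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (n = 30).
lower_bounds = [0, 10, 20, 30]
freqs = [5, 8, 12, 5]
med = grouped_median(lower_bounds, freqs, width=10)
print(round(med, 2))
```

Here n/2 = 15 falls in the 20-30 class (cf = 13, f = 12), so the median is 20 + (2/12) × 10.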
Measure 3
Mode: The Most Frequent Value

The mode is the value that appears most often in a dataset. A distribution can be unimodal (one mode), bimodal (two modes), or multimodal. Mode is the only measure applicable to categorical/nominal data (e.g., most popular colour, most common occupation).

Mode: Grouped Data (Czuber's Formula)
Mode = L + [(f₁ − f₀) / (2f₁ − f₀ − f₂)] × h

L = lower boundary of modal class | f₁ = modal class freq | f₀ = preceding class freq | f₂ = succeeding class freq | h = class width
Real Example: Shoe sizes: 7, 8, 7, 9, 7, 8, 6, 7. Mode = 7. The manufacturer should produce the most size-7 shoes. The mean (about 7.4) is useless here: you can't make size 7.4 shoes!
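The shoe-size example can be reproduced with the standard library. This is a sketch; `multimode` returns every value tied for the highest count, which also covers bimodal data.

```python
from collections import Counter
from statistics import mean, multimode

# Shoe sizes from the example above.
sizes = [7, 8, 7, 9, 7, 8, 6, 7]

counts = Counter(sizes)       # frequency of each size
modes = multimode(sizes)      # list of most frequent value(s)
average_size = mean(sizes)    # not a size anyone can buy

print(counts.most_common(), modes, average_size)
```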
Measure 4
Geometric Mean: For Growth Rates

The geometric mean is the nth root of the product of n values. It is used when values are multiplicative in nature, like compound interest, population growth rates, and investment returns. For positive values it is always ≤ the Arithmetic Mean (AM-GM inequality).

Geometric Mean
GM = ⁿ√(x₁ × x₂ × ... × xₙ) = (∏xᵢ)^(1/n)

Equivalent using logarithms:
log(GM) = [log(x₁) + log(x₂) + ... + log(xₙ)] / n
Example: Nifty 50 returns: Year 1 = +20%, Year 2 = −10%, Year 3 = +15%.
Using growth factors 1.20, 0.90, 1.15 → GM = ∛(1.20 × 0.90 × 1.15) = ∛1.242 ≈ 1.075 → 7.5% CAGR.
The arithmetic mean gives (20−10+15)/3 = 8.33%, which overstates returns!
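Both forms of the formula (nth root and logarithms) can be checked against the returns example. This is a minimal sketch with illustrative variable names.

```python
import math
from statistics import geometric_mean

# Growth factors for +20%, -10%, +15% from the example above.
factors = [1.20, 0.90, 1.15]

gm = geometric_mean(factors)                                       # nth-root form
gm_logs = math.exp(sum(math.log(x) for x in factors) / len(factors))  # log form
cagr_pct = (gm - 1) * 100             # compound annual growth rate in %
am_pct = (20 - 10 + 15) / 3           # naive arithmetic average in %

print(f"GM = {gm:.4f}, CAGR = {cagr_pct:.2f}%, AM = {am_pct:.2f}%")
```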
Measure 5
Harmonic Mean: For Rates & Speeds

The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals. It's used when dealing with rates such as speed, frequency, and P/E ratios in finance. It gives more weight to smaller values.

Harmonic Mean
HM = n / Σ(1/xᵢ) = n / (1/x₁ + 1/x₂ + ... + 1/xₙ)
Classic Speed Problem: A car travels 60 km at 30 km/h and 60 km at 60 km/h. Average speed = HM(30, 60) = 2/(1/30+1/60) = 2/(3/60) = 40 km/h. Check: 120 km in 2 h + 1 h = 3 h. The arithmetic mean gives 45 km/h, which is wrong!
Finance Use: When averaging P/E ratios across stocks in a portfolio, HM is more appropriate than AM because P/E is a ratio (price per unit of earnings).
Relationship: AM ≥ GM ≥ HM (for positive values)
Example: Values 2, 8 → AM = 5, GM = 4, HM = 3.2
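The speed problem and the AM ≥ GM ≥ HM ordering can be verified directly. This is a sketch using the standard `statistics` module.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Equal distances (60 km each) at 30 km/h and 60 km/h.
speeds = [30, 60]
hm_speed = harmonic_mean(speeds)

# Cross-check from first principles: total distance / total time.
total_distance = 60 + 60
total_time = 60 / 30 + 60 / 60
direct_speed = total_distance / total_time

# Ordering of the three means for the values 2 and 8.
values = [2, 8]
am, gm, hm = mean(values), geometric_mean(values), harmonic_mean(values)

print(hm_speed, direct_speed, am, round(gm, 6), hm)
```

The harmonic mean is the right average here only because the two legs cover equal distances; with equal times the arithmetic mean would apply instead.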
Measure 6
Weighted Mean: Importance Matters

When different values carry different levels of importance (weights), we use the weighted mean. It is the foundation of index numbers, GPA calculations, and portfolio return calculations.

Weighted Arithmetic Mean
x̄w = Σ(wᵢ × xᵢ) / Σwᵢ

Example: Marks (Values) and Credits (Weights) for 4 subjects

Subject | Marks (xᵢ) | Credits (wᵢ) | wᵢ × xᵢ
Economics | 85 | 4 | 340
Statistics | 72 | 3 | 216
Finance | 90 | 4 | 360
History | 65 | 2 | 130
Total | - | 13 | 1046
Weighted Mean = 1046 / 13 ≈ 80.46 (the simple mean is 78, a different result!)
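The GPA table above translates directly into code. This is a minimal sketch; the dictionary layout is just one convenient way to hold the marks and credits.

```python
# (marks, credits) per subject, copied from the table above.
subjects = {
    "Economics": (85, 4),
    "Statistics": (72, 3),
    "Finance": (90, 4),
    "History": (65, 2),
}

weighted_sum = sum(marks * credits for marks, credits in subjects.values())
total_credits = sum(credits for _, credits in subjects.values())

weighted_mean = weighted_sum / total_credits                      # 1046 / 13
simple_mean = sum(m for m, _ in subjects.values()) / len(subjects)

print(round(weighted_mean, 2), simple_mean)
```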
Summary
When to Use Which Measure?
Situation | Best Measure | Why
Exam scores, heights, temperatures | Arithmetic Mean | Symmetric, no outliers
Income, house prices, wealth | Median | Skewed distribution, outliers
Shoe sizes, favourite colour, most common profession | Mode | Categorical / nominal data
Investment returns, population growth, CAGR | Geometric Mean | Multiplicative growth
Speeds, rates, P/E ratio averaging | Harmonic Mean | Rate-based data
GPA, portfolio returns, index numbers | Weighted Mean | Unequal importance
[Figure: All Three on One Distribution, Symmetric vs Skewed]
A dataset has values: 2, 4, 4, 4, 5, 5, 7, 9. Which statement is TRUE?
Module 02 · Measures of Dispersion
How Spread Out is the Data?
Central tendency tells where data centres. Dispersion tells how much it varies.
Core Concept
Why Dispersion Matters

Two datasets can have the same mean but completely different spreads. Student A scores: 48, 50, 50, 52 (Mean=50). Student B scores: 10, 50, 90, 50 (Mean=50). Same mean, but Student B is wildly inconsistent!

Range
Max − Min. Simple but affected by outliers.
Mean Deviation
Average of |deviations from mean|. More informative than range.
Variance
Average of squared deviations. Foundation of statistics.
Standard Deviation
√Variance. Most used measure of spread.
IQR
Q3 − Q1. Middle 50% range. Robust to outliers.
CV
CV = (σ/x̄) × 100. Compares spread across datasets.
Range & Mean Deviation
Range = Xmax − Xmin
Mean Deviation (from mean) = Σ|xᵢ − x̄| / n
Mean Deviation (from median) = Σ|xᵢ − M| / n
Mean Deviation is more representative than Range because Range uses only 2 values while MD uses all values. MD from Median is always ≤ MD from Mean.
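Applied to the two students from the start of this module, the formulas above give the following. This is a sketch; the helper names `value_range` and `mean_deviation` are made up for illustration.

```python
from statistics import mean, median

def value_range(data):
    """Range = largest value minus smallest value."""
    return max(data) - min(data)

def mean_deviation(data, centre):
    """Average absolute deviation of the data from a chosen centre."""
    return sum(abs(x - centre) for x in data) / len(data)

student_a = [48, 50, 50, 52]
student_b = [10, 50, 90, 50]

range_a, range_b = value_range(student_a), value_range(student_b)
md_a = mean_deviation(student_a, mean(student_a))
md_b = mean_deviation(student_b, mean(student_b))
md_b_median = mean_deviation(student_b, median(student_b))

print(range_a, md_a, range_b, md_b, md_b_median)
```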
Variance & Standard Deviation

Standard deviation is the most important measure of dispersion in statistics. It quantifies the average spread of data around the mean. Population vs Sample formulas differ by the denominator (N vs n−1).

Variance & Standard Deviation
Population Variance: σ² = Σ(xᵢ − μ)² / N
Sample Variance: s² = Σ(xᵢ − x̄)² / (n−1) ← Bessel's correction
Standard Deviation: σ = √[Σ(xᵢ − μ)² / N]

Shortcut formula: σ² = Σxᵢ²/N − (Σxᵢ/N)² = E(X²) − [E(X)]²
[Figure: Standard Deviation, Low vs High Spread]
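The population and sample formulas, and the shortcut, can be compared on Student B's scores from earlier in this module. This is a sketch using the standard `statistics` module.

```python
from statistics import pstdev, pvariance, variance

scores = [10, 50, 90, 50]  # Student B, mean = 50
n = len(scores)

pop_var = pvariance(scores)   # divides by N
samp_var = variance(scores)   # divides by n - 1 (Bessel's correction)
pop_sd = pstdev(scores)       # square root of the population variance

# Shortcut: mean of squares minus square of the mean.
shortcut = sum(x * x for x in scores) / n - (sum(scores) / n) ** 2

print(pop_var, round(samp_var, 2), round(pop_sd, 2), shortcut)
```

The sample variance is larger than the population variance because it divides the same sum of squares by n − 1 instead of N.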
IQR & Box Plot

The Interquartile Range (IQR) is the range of the middle 50% of data. Q1 (25th percentile), Q2 (Median), Q3 (75th percentile). A box plot visualises the five-number summary: Min, Q1, Median, Q3, Max.

Quartiles & IQR
Q1 = value at (n+1)/4 position (in the sorted data)
Q3 = value at 3(n+1)/4 position (in the sorted data)
IQR = Q3 − Q1
Outlier bounds: < Q1 − 1.5×IQR or > Q3 + 1.5×IQR
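The position rule can be sketched as follows, using the dataset from the Module 01 quiz. Linear interpolation for fractional positions is an assumption here (the text only says "value at position"), and `value_at_position` is a hypothetical helper name.

```python
def value_at_position(sorted_data, pos):
    """Value at 1-based position `pos`, interpolating between neighbours."""
    lower = int(pos)       # whole part of the position
    frac = pos - lower     # fractional part
    if lower < 1:
        return sorted_data[0]
    if lower >= len(sorted_data):
        return sorted_data[-1]
    below, above = sorted_data[lower - 1], sorted_data[lower]
    return below + frac * (above - below)

data = sorted([2, 4, 4, 4, 5, 5, 7, 9])
n = len(data)

q1 = value_at_position(data, (n + 1) / 4)       # position 2.25
q3 = value_at_position(data, 3 * (n + 1) / 4)   # position 6.75
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(q1, q3, iqr, (low_fence, high_fence), outliers)
```

Other software may use a different quartile convention and give slightly different values for the same data.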
[Figure: Box Plot Visualisation]
Coefficient of Variation (CV)

CV allows comparison of spread across datasets with different units or scales. It expresses standard deviation as a percentage of the mean. Lower CV = more consistent.

Coefficient of Variation
CV = (σ / x̄) × 100%
Example: Stock A: Mean return 10%, SD = 2% → CV = 20%. Stock B: Mean return 20%, SD = 8% → CV = 40%. Stock A is more consistent relative to its return, so it carries less risk per unit of return.
Which measure of dispersion is most useful for comparing variability between two datasets with different units (e.g., height in cm vs weight in kg)?
Module 03 · Probability Distributions
The Shape of Data
Normal, binomial, Poisson: the mathematical models behind real-world phenomena.
Normal Distribution: The Bell Curve

The normal distribution is the most important distribution in statistics. Many natural phenomena (heights, exam scores, measurement errors) follow it approximately. It is symmetric, bell-shaped, and completely defined by its mean (μ) and standard deviation (σ).

Normal Distribution PDF
f(x) = [1 / (σ√(2π))] × e^[−(x−μ)² / (2σ²)]

Empirical Rule (68-95-99.7 Rule):
μ ± 1σ covers 68.27% of data
μ ± 2σ covers 95.45% of data
μ ± 3σ covers 99.73% of data
[Figure: Normal Distribution, 68-95-99.7 Rule]
Z-Score (Standardisation): z = (x − μ) / σ. Transforms any normal distribution to Standard Normal (μ=0, σ=1). Used to find probabilities using Z-tables.
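A z-score and the empirical rule can both be checked with `statistics.NormalDist`, which stands in for a Z-table. The exam figures (mean 70, SD 10, score 85) are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mu = 0, sigma = 1

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical exam: mean 70, SD 10, a student scores 85.
z = z_score(85, 70, 10)
p_below = std_normal.cdf(z)  # share of scores below 85

# Empirical rule: probability within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

print(z, round(p_below, 4), {k: round(v, 4) for k, v in within.items()})
```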
Skewness & Kurtosis

Skewness measures asymmetry. Kurtosis measures the "tailedness": how heavy the tails are compared to a normal distribution.

Pearson's Coefficient of Skewness
Sk = 3(Mean − Median) / σ

Positive Skew (right): Mean > Median > Mode → long right tail
Negative Skew (left): Mean < Median < Mode → long left tail
Symmetric: Mean = Median = Mode
[Figure: Negative | Symmetric | Positive Skew]
Kurtosis: Leptokurtic (K>3) = heavy tails, sharp peak (riskier in finance). Platykurtic (K<3) = light tails, flat peak. Mesokurtic (K=3) = normal distribution baseline.
Binomial Distribution

Models the number of successes in n independent Bernoulli trials, where each trial has probability p of success. Used for quality control, election polling, medical trials.

Binomial Distribution
P(X = k) = C(n,k) × p^k × (1−p)^(n−k)
Mean = np | Variance = np(1−p) | SD = √[np(1−p)]
[Figure: Binomial Distribution (n=10, p=0.4)]
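The probability formula and the mean and variance shortcuts can be checked numerically for the n = 10, p = 0.4 case shown in the figure. This is a minimal sketch; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.4
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the distribution itself.
dist_mean = sum(k * pk for k, pk in enumerate(pmf))
dist_var = sum((k - dist_mean) ** 2 * pk for k, pk in enumerate(pmf))

print(round(pmf[4], 4), round(dist_mean, 4), round(dist_var, 4))
```

The computed mean and variance should agree with np = 4 and np(1−p) = 2.4.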
Poisson Distribution

Models the number of events occurring in a fixed interval of time or space, when events occur at a constant average rate (λ). Used for: calls per hour, accidents per day, typos per page.

Poisson Distribution
P(X = k) = (e^(−λ) × λ^k) / k!
Mean = λ | Variance = λ | (Mean = Variance is a key property!)
Example: A call centre receives 3 calls/minute on average (λ=3). P(exactly 5 calls in a minute) = (e⁻³ × 3⁵) / 5! = (0.0498 × 243) / 120 ≈ 0.1008 (10.08%)
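The call-centre calculation can be reproduced as follows. This is a sketch; `poisson_pmf` is an illustrative helper, not a library function.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3  # average calls per minute
p_five = poisson_pmf(5, lam)

# Probability of at most 2 calls, summing the first three terms.
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))

print(round(p_five, 4), round(p_at_most_two, 4))
```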
Module 04 · Probability Theory
The Mathematics of Chance
Foundation of statistics, finance, and decision making under uncertainty.
Basic Probability Concepts

Probability is a number between 0 and 1 that measures how likely an event is to occur. P=0 means impossible, P=1 means certain.

Classical Probability (Laplace)
P(A) = Number of favourable outcomes / Total possible outcomes

P(A') = 1 − P(A) (Complement Rule)
0 ≤ P(A) ≤ 1 (Axiom of Probability)
Addition Rule
P(A∪B) = P(A) + P(B) − P(A∩B). For mutually exclusive: P(A∪B) = P(A) + P(B).
Multiplication Rule
P(A∩B) = P(A) × P(B|A). For independent: P(A∩B) = P(A) × P(B).
Conditional Probability
P(A|B) = P(A∩B) / P(B). "Probability of A given B has occurred."
Bayes' Theorem
P(A|B) = P(B|A)×P(A) / P(B). Updates beliefs with new evidence.
Bayes' Theorem: Deep Dive

Bayes' theorem is one of the most powerful ideas in all of statistics. It tells us how to update our prior beliefs when we receive new evidence.

Bayes' Theorem
P(H|E) = P(E|H) × P(H) / P(E)

P(H|E) = Posterior (belief after evidence)
P(H) = Prior (initial belief)
P(E|H) = Likelihood (how well H explains E)
P(E) = Marginal (total probability of E)
Medical Test Example: Disease prevalence = 1% (Prior P(D)=0.01). Test is 99% accurate (99% of sick people test positive and 99% of healthy people test negative). You test positive. What's the actual probability you have the disease?

P(D|+) = P(+|D)×P(D) / P(+) = (0.99×0.01) / (0.99×0.01 + 0.01×0.99) = 0.0099/0.0198 = 50%! Not 99% as most people intuitively assume.
[Figure: Probability Tree, Medical Test]
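The medical test calculation generalises to any prior and test accuracy. This is a minimal sketch; the function name and the second scenario (10% prevalence) are illustrative assumptions.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior                # P(+|D) * P(D)
    false_pos = (1 - specificity) * (1 - prior)   # P(+|not D) * P(not D)
    return true_pos / (true_pos + false_pos)      # denominator is P(+)

# Example from the text: 1% prevalence, 99% sensitivity and specificity.
p_rare = posterior(prior=0.01, sensitivity=0.99, specificity=0.99)

# Hypothetical comparison: same test, 10% prevalence.
p_common = posterior(prior=0.10, sensitivity=0.99, specificity=0.99)

print(round(p_rare, 4), round(p_common, 4))
```

The same test is far more convincing when the disease is more common, which shows how strongly the prior drives the posterior.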
Expected Value & Variance of a Random Variable
Expected Value (Discrete)
E(X) = Σ xᵢ × P(xᵢ)
Var(X) = E(X²) − [E(X)]² = Σ xᵢ² × P(xᵢ) − μ²
SD(X) = √Var(X)
Portfolio Expected Return: Scenario A: 20% return with 60% probability. Scenario B: −5% return with 40% probability. E(return) = 0.20×0.60 + (−0.05)×0.40 = 0.12 − 0.02 = 10% expected return.
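The expected value and variance formulas can be applied to the portfolio example. This is a sketch; outcomes are passed as hypothetical (value, probability) pairs.

```python
from math import sqrt

def expected_value(outcomes):
    """E(X) = sum of value * probability."""
    return sum(x * p for x, p in outcomes)

def variance(outcomes):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mu = expected_value(outcomes)
    return sum(x * x * p for x, p in outcomes) - mu**2

# (return, probability) pairs from the portfolio example.
scenarios = [(0.20, 0.60), (-0.05, 0.40)]

e_return = expected_value(scenarios)
var_return = variance(scenarios)
sd_return = sqrt(var_return)

print(round(e_return, 4), round(var_return, 4), round(sd_return, 4))
```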
A fair die is rolled. What is the Expected Value?
Module 05 · Correlation Analysis
How Variables Move Together
Pearson, Spearman, and the golden rule: correlation ≠ causation.
Pearson's Correlation Coefficient (r)

Pearson's r measures the strength and direction of a linear relationship between two continuous variables. It ranges from −1 to +1.

Pearson's r
r = Σ[(xᵢ−x̄)(yᵢ−ȳ)] / √[Σ(xᵢ−x̄)² × Σ(yᵢ−ȳ)²]
r = [nΣxy − ΣxΣy] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
r = +1.0: Perfect positive
r = +0.7: Strong positive
r = +0.3: Weak positive
r = 0.0: No correlation
r = −0.7: Strong negative
r = −1.0: Perfect negative
[Figure: Scatter Plots, Different r Values]
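The deviation form of Pearson's r can be sketched as below. The `hours` and `marks` data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = sum of cross-deviations / sqrt(product of squared deviations)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied vs marks scored.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r = pearson_r(hours, marks)
r_perfect = pearson_r(hours, [2 * h + 1 for h in hours])  # exact line

print(round(r, 4), round(r_perfect, 4))
```

For this data the cross-deviation sum is 6 and the squared-deviation sums are 10 and 6, so r = 6/√60.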
Spearman's Rank Correlation

Spearman's ρ (rho) is a non-parametric measure based on the ranks of data. Use it when data is ordinal, or when the relationship is monotonic but not necessarily linear.

Spearman's Rank Correlation
ρ = 1 − [6 × Σdᵢ²] / [n(n² − 1)]
where dᵢ = rank(xᵢ) − rank(yᵢ) (this formula assumes no tied ranks)
Use Spearman when: Data is ordinal (ranks, ratings) | Outliers are present | The relationship is monotonic but not linear | You're comparing rankings (e.g., judges' rankings in a competition).
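For two judges ranking the same contestants, the formula can be sketched like this. The rankings are hypothetical and the function assumes the inputs are already ranks with no ties.

```python
def spearman_rho(rank_x, rank_y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without tied ranks."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical rankings of five contestants by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
rho_reversed = spearman_rho(judge_a, judge_a[::-1])  # total disagreement

print(rho, rho_reversed)
```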
Correlation ≠ Causation! Ice cream sales and drowning deaths tend to be strongly correlated, but ice cream doesn't cause drowning! Both are driven by a confounding variable: hot summer weather.
Module 06 ยท Regression Analysis
Predicting with Lines & Curves
Build mathematical models that predict one variable from another.
Simple Linear Regression

Regression finds the best-fit line through data points. The "Ordinary Least Squares" (OLS) method minimises the sum of squared residuals (vertical distances from points to the line).

Regression Line: Y on X
ŷ = a + bx
b (slope) = [nΣxy − ΣxΣy] / [nΣx² − (Σx)²] = r × (σy/σx)
a (intercept) = ȳ − b×x̄

Note: Regression line always passes through (x̄, ȳ)
R² (Coefficient of Determination): R² = r² tells what % of variance in Y is explained by X. R²=0.81 means 81% of variation in Y is explained by the regression model.
[Figure: Regression Line, Scatter + Best Fit]
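The slope and intercept formulas, together with R², can be sketched as follows. The data are hypothetical and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using the OLS sum formulas."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)  # slope
    a = sy / n - b * sx / n                      # intercept = y-bar - b * x-bar
    return a, b

def r_squared(xs, ys, a, b):
    """Share of the variation in y explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
r2 = r_squared(xs, ys, a, b)
print(round(a, 4), round(b, 4), round(r2, 4))
```

With x̄ = 3 and ȳ = 4, the fitted line a + b × 3 returns 4, confirming that it passes through the point of means.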
Two Regression Lines

In statistics, there are two regression lines: Y on X (used to predict Y given X), and X on Y (used to predict X given Y). They are different unless r = ±1.

Two Regression Lines
Y on X: (y−ȳ) = r(σy/σx)(x−x̄) → use to predict Y
X on Y: (x−x̄) = r(σx/σy)(y−ȳ) → use to predict X

Both lines intersect at the point (x̄, ȳ)
Product of regression coefficients = r² → byx × bxy = r²
Finding r from regression coefficients: If byx = 0.8 and bxy = 0.2, then r = √(0.8 × 0.2) = √0.16 = 0.4. Note: r has the same sign as both coefficients (positive here).
The regression coefficient of Y on X is 1.6 and of X on Y is 0.4. What is the correlation coefficient r?
Module 07 · Hypothesis Testing
Is it Real or Just Random Chance?
The framework of scientific decision-making under uncertainty.
The Hypothesis Testing Framework

Hypothesis testing is a formal procedure to decide whether sample data provides enough evidence to reject a null hypothesis (H₀). We never "prove" H₀ true: we only reject it or fail to reject it.

1. State Hypotheses
H₀ (null): No effect / status quo. H₁ (alternative): There is an effect.
2. Choose Significance Level
α = 0.05 (5%) is most common. This is the probability of rejecting H₀ when it's true (Type I error).
3. Choose & Calculate Test Statistic
Z-test, t-test, chi-square, F-test: the choice depends on data type and sample size.
4. Find p-value or Critical Value
p-value = probability of getting results at least as extreme as those observed, assuming H₀ is true.
5. Make Decision
If p-value < α → Reject H₀. If p-value ≥ α → Fail to reject H₀.
Common Test Statistics
Z-test (known σ, large n): z = (x̄ − μ₀) / (σ/√n)
t-test (unknown σ, small n): t = (x̄ − μ₀) / (s/√n), df = n−1
Chi-square (goodness of fit): χ² = Σ[(O−E)²/E]
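The five steps can be walked through for a one-sample Z-test. This is a sketch under stated assumptions: the numbers (claimed mean 500, σ = 20, n = 64, sample mean 506) are hypothetical, and the test is two-sided.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: mu = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return z, p_value

alpha = 0.05                                                 # step 2
z, p_value = z_test(sample_mean=506, mu0=500, sigma=20, n=64)  # steps 3-4
decision = "reject H0" if p_value < alpha else "fail to reject H0"  # step 5

print(round(z, 2), round(p_value, 4), decision)
```

A one-tailed test would use a single tail probability instead of doubling it.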
Type I & Type II Errors
Decision | H₀ is TRUE | H₀ is FALSE
Reject H₀ | Type I Error (α), False Positive | Correct Decision (Power = 1−β)
Fail to Reject H₀ | Correct Decision (1−α) | Type II Error (β), False Negative
Type I Error (α): Convicting an innocent person. Rejecting H₀ when it's true. We control this directly with α.
Type II Error (β): Acquitting a guilty person. Failing to detect a real effect. Minimise by increasing sample size.
[Figure: Critical Region, One-Tailed vs Two-Tailed]
A drug company claims their drug works. The null hypothesis H₀ is "the drug has no effect." If the drug works but the test fails to detect it, this is a:
Module 08 · Sampling Theory
The Art of Representative Selection
How to draw conclusions about a population from a sample.
Population vs Sample

A population is the entire group of interest. A sample is a subset drawn from it. Statistics (from sample) are used to estimate Parameters (of population).

Measure | Population (Parameter) | Sample (Statistic)
Mean | μ (mu) | x̄ (x-bar)
Variance | σ² (sigma squared) | s²
Std Dev | σ (sigma) | s
Size | N | n
Proportion | P | p̂
Sampling Methods
Simple Random
Every member has equal chance. Like a lottery draw.
Systematic
Select every kth element. Simple but may have periodicity bias.
Stratified
Divide into strata (groups), sample proportionally from each.
Cluster
Divide into clusters, randomly select entire clusters.
Convenience
Use what's easiest. Fast but biased, so not recommended.
Central Limit Theorem (CLT): The Most Important Theorem

The CLT states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases (n ≥ 30 is a common rule of thumb), regardless of the shape of the population distribution, provided the population has a finite variance. This is why the normal distribution is everywhere!

Sampling Distribution of x̄
E(x̄) = μ (sample mean is unbiased estimator of population mean)
SE(x̄) = σ/√n (Standard Error of the Mean)

For large n, x̄ is approximately N(μ, σ²/n)
[Figure: Central Limit Theorem, Sample Size Effect]
Confidence Interval: x̄ ± z × (σ/√n). For 95% CI, z = 1.96. This means: we are 95% confident the true population mean lies within this range.
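The confidence interval formula can be sketched as below. The sample figures (mean 50, σ = 10, n = 100) are hypothetical, and σ is assumed known, as in the formula above.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """x-bar +/- z * sigma / sqrt(n), with z from the standard normal."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)                    # z times the standard error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50, known sigma 10, n = 100.
low, high = mean_confidence_interval(50, 10, 100)
low_99, high_99 = mean_confidence_interval(50, 10, 100, confidence=0.99)

print(round(low, 2), round(high, 2), round(low_99, 2), round(high_99, 2))
```

Raising the confidence level widens the interval, while a larger n narrows it through the standard error.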
Module 09 · Time Series Analysis
Patterns Through Time
Decompose trends, seasonality, and cycles in temporal data.
Components of Time Series
Trend (T)
Long-term upward/downward movement. GDP growing over decades.
Seasonal (S)
Regular pattern that repeats within a year. Ice cream sales in summer.
Cyclical (C)
Irregular fluctuations over 2 to 10 years. Business cycles.
Irregular (I)
Random, unpredictable variations. Natural disasters, wars.
Time Series Decomposition Models
Additive: Y = T + S + C + I (when seasonal variation is constant)
Multiplicative: Y = T × S × C × I (when seasonal variation grows with trend)
Moving Averages

A moving average smooths out short-term fluctuations to reveal the underlying trend. A 3-year moving average replaces each value with the average of it and its two neighbours.

Moving Average Smoothing
Simple Moving Average (3-period)
MA₃ = (Yₜ₋₁ + Yₜ + Yₜ₊₁) / 3

For even-period MAs (e.g. 4-point), a second centring average is needed.
Exponential Smoothing: Gives more weight to recent observations. Sₜ = αXₜ + (1−α)Sₜ₋₁ where α is the smoothing constant (0<α<1). Higher α = more responsive to recent changes.
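Both smoothing methods can be sketched in a few lines. The `sales` series is hypothetical, and seeding the exponential smoother with the first observation is an assumption, since the text does not say how S₀ is chosen.

```python
def centred_ma3(series):
    """3-period centred moving average; the two end points get no value."""
    return [
        (series[t - 1] + series[t] + series[t + 1]) / 3
        for t in range(1, len(series) - 1)
    ]

def exponential_smoothing(series, alpha):
    """S_t = alpha * X_t + (1 - alpha) * S_(t-1), seeded with S_0 = X_0."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical yearly sales figures.
sales = [10, 12, 14, 13, 15, 17]

ma = centred_ma3(sales)
es = exponential_smoothing(sales, alpha=0.5)
print(ma, es)
```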
Module 10 · Index Numbers
Measuring Change Over Time
The mathematics behind CPI, WPI, Sensex, and cost-of-living indices.
What are Index Numbers?

Index numbers are specialised averages that measure relative change in a variable (or group of variables) over time or between places. They reduce complex data to a single comparable number. The Consumer Price Index (CPI) measures inflation; the Sensex measures stock market performance.

Simple Price Index
P₀₁ = (P₁ / P₀) × 100
where P₀ = price in base year, P₁ = price in current year
Weighted Index Numbers
Laspeyres Index
Uses BASE YEAR quantities as weights. Tends to overstate inflation.
Paasche Index
Uses CURRENT YEAR quantities as weights. Tends to understate inflation.
Fisher's Ideal Index
Geometric mean of Laspeyres & Paasche. Called "ideal" because it satisfies both the time reversal and factor reversal tests.
Key Index Formulas
Laspeyres: L = [Σ(P₁Q₀) / Σ(P₀Q₀)] × 100
Paasche: Pa = [Σ(P₁Q₁) / Σ(P₀Q₁)] × 100
Fisher: F = √(L × Pa) ← Geometric Mean of L and Pa
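The three index formulas can be sketched for a basket of three commodities. All prices and quantities below are hypothetical, and the function names are illustrative.

```python
from math import sqrt

def laspeyres(p0, p1, q0):
    """Base-year quantities as weights."""
    return sum(a * q for a, q in zip(p1, q0)) / sum(a * q for a, q in zip(p0, q0)) * 100

def paasche(p0, p1, q1):
    """Current-year quantities as weights."""
    return sum(a * q for a, q in zip(p1, q1)) / sum(a * q for a, q in zip(p0, q1)) * 100

def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))

# Hypothetical base-year (0) and current-year (1) prices and quantities.
p0, q0 = [10, 20, 5], [5, 2, 10]
p1, q1 = [12, 25, 6], [4, 3, 10]

L = laspeyres(p0, p1, q0)
Pa = paasche(p0, p1, q1)
F = fisher(p0, p1, q0, q1)

# Time reversal check on ratios: forward index times backward index.
fisher_reversal = (F / 100) * (fisher(p1, p0, q1, q0) / 100)
laspeyres_reversal = (L / 100) * (laspeyres(p1, p0, q1) / 100)

print(round(L, 2), round(Pa, 2), round(F, 2))
print(round(fisher_reversal, 4), round(laspeyres_reversal, 4))
```

For this basket the Fisher product comes out as 1, while the Laspeyres product does not.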
Tests for Index Numbers (Fisher's tests):
• Unit Test: Index should be independent of units of measurement.
• Time Reversal Test: P₀₁ × P₁₀ = 1 (with the indices written as ratios, not percentages). Fisher satisfies this; Laspeyres & Paasche don't.
• Factor Reversal Test: Price index × Quantity index = Value index. Only Fisher satisfies this.
[Figure: India CPI Trend (approximate)]
Which index number is called Fisher's "Ideal" Index?