Simple Average or Price Relative Method, Weighted index method

Simple Average or Price Relatives Method

In this method, we find out the price relative of individual items and average out the individual values. Price relative refers to the percentage ratio of the value of a variable in the current year to its value in the year chosen as the base.

Price relative (R) = (P1÷P0) × 100

Here, P1= Current year value of the item with respect to the variable and P0= Base year value of the item with respect to the variable. Effectively, the formula for the index number according to this method is:

 P = ∑[(P1÷P0) × 100] ÷N

Here, N= Number of goods and P= Index number.
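The calculation can be sketched in a few lines of Python. This is only an illustration: the function name and the three sample prices are hypothetical and do not come from the text.

```python
def price_relative_index(current_prices, base_prices):
    """Simple average of price relatives: sum((P1 / P0) * 100) / N."""
    if len(current_prices) != len(base_prices) or not current_prices:
        raise ValueError("need two equal-length, non-empty price lists")
    # One price relative per good, expressed as a percentage of the base price
    relatives = [(p1 / p0) * 100 for p1, p0 in zip(current_prices, base_prices)]
    return sum(relatives) / len(relatives)


# Hypothetical prices for three goods (base year, then current year)
base_prices = [10.0, 20.0, 50.0]
current_prices = [12.0, 25.0, 55.0]

index = price_relative_index(current_prices, base_prices)
print(round(index, 2))  # price relatives are 120, 125 and 110
```

Each good counts equally in this method, whatever quantity is bought.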

Weighted index method

Weighted Aggregate Method

Here different goods are assigned weight according to the quantity bought. There are three well-known sub-methods based on the different views of economists as mentioned below:

Laspeyres' Method

Laspeyres was of the view that base year quantities must be chosen as weights. Therefore the formula is:

P = (∑P1Q0÷∑P0Q0)×100

Here, ∑P1Q0= Summation of prices of the current year multiplied by quantities of the base year taken as weights and ∑P0Q0= Summation of prices of the base year multiplied by quantities of the base year taken as weights.

Paasche Index Number

The Paasche Price Index is a price index used to measure the change in the price of a basket of goods and services relative to the base year, with observation year quantities taken as weights. Developed by German economist Hermann Paasche, the Paasche Price Index is commonly referred to as the “current weighted index.”

Formula for the Paasche Price Index

The formula for the index is as follows:

Paasche Price Index = (∑Pi,t Qi,t ÷ ∑Pi,0 Qi,t) × 100

Where:

  • Pi,0 is the price of the individual item at the base period and Pi,t is the price of the individual item at the observation period.
  • Qi,t is the quantity of the individual item at the observation period.
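As a rough sketch, the two weighted aggregate indices can be computed side by side. The function names and the two-good data set are hypothetical, chosen only to show how the choice of weights changes the result.

```python
def laspeyres_index(p0, p1, q0):
    """Base-year quantities as weights: sum(P1*Q0) / sum(P0*Q0) * 100."""
    numerator = sum(price * qty for price, qty in zip(p1, q0))
    denominator = sum(price * qty for price, qty in zip(p0, q0))
    return numerator / denominator * 100


def paasche_index(p0, p1, q1):
    """Observation-year quantities as weights: sum(P1*Q1) / sum(P0*Q1) * 100."""
    numerator = sum(price * qty for price, qty in zip(p1, q1))
    denominator = sum(price * qty for price, qty in zip(p0, q1))
    return numerator / denominator * 100


# Hypothetical data for two goods
p0 = [10, 20]   # base-year prices
p1 = [12, 25]   # observation-year prices
q0 = [5, 2]     # base-year quantities
q1 = [4, 3]     # observation-year quantities

laspeyres = laspeyres_index(p0, p1, q0)  # 110 / 90 * 100
paasche = paasche_index(p0, p1, q1)      # 123 / 100 * 100
print(round(laspeyres, 2), round(paasche, 2))
```

The prices are the same in both calls; only the quantities used as weights differ.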

Marshall Edgeworth Index Number

Calculation of Interest

Calculating an interest rate is not difficult. Knowing how to calculate it can help you compare loans and save money when making investment decisions. There is an easy formula for simple interest rates. If you know your loan amount and the interest amount you pay, you can work out the interest rate yourself.

Using the simple interest calculation formula, you can also see your interest payments in a year and calculate your annual percentage rate.

Here is the step by step guide to calculate the interest rate.

How to calculate interest rate?

Know the formula which can help you to calculate your interest rate.

Step 1

To calculate your interest rate, you need to know the interest formula I/Pt = r to get your rate. Here,

I = Interest amount paid in a specific time period (month, year etc.)

P = Principal amount (the money before interest)

t = Time period involved

r = Interest rate in decimal

You should remember this equation to calculate your basic interest rate.

Step 2

Once you put all the required values into the formula, you get the interest rate as a decimal. To express it as a percentage, multiply it by 100. For example, a decimal such as 0.11 is hard to read as an interest rate, so multiply it by 100 (0.11 × 100 = 11).

In this case, your interest rate is 11%.

Step 3

Apart from this, you can also calculate your time period involved, principal amount and interest amount paid in a specific time period if you have other inputs available with you.

Calculate interest amount paid in a specific time period, I = Prt.

Calculate the principal amount, P = I/rt.

Calculate time period involved t = I/Pr.
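The four rearrangements of the simple interest formula can be written as small Python functions. This is a minimal sketch: the function names and the loan figures are hypothetical, and the rate is always handled as a decimal.

```python
def interest_rate(interest, principal, time):
    """r = I / (P * t), returned as a decimal."""
    return interest / (principal * time)


def interest_amount(principal, rate, time):
    """I = P * r * t."""
    return principal * rate * time


def principal_amount(interest, rate, time):
    """P = I / (r * t)."""
    return interest / (rate * time)


def time_period(interest, principal, rate):
    """t = I / (P * r)."""
    return interest / (principal * rate)


# Hypothetical loan: 1,000 borrowed, 110 interest paid over 1 year
rate = interest_rate(interest=110, principal=1_000, time=1)
print(f"{rate * 100:.0f}%")  # decimal rate converted to a percentage
```

The value passed as `time` decides the unit of the result: years give an annual rate, months give a monthly rate.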

Step 4

Most importantly, you have to make sure that your time period and interest rate use the same unit of time.

For example, suppose you want to find the monthly interest rate on a loan after one year. If you put t = 1, the result is the interest rate per year. If you want the monthly interest rate, you have to express the elapsed time in months, so here the time period is 12 months.

Please remember, your time period should be in the same unit as the period over which the interest is paid. For example, if you are calculating monthly interest payments over a year, you have made 12 payments.

Also, check with your bank how often your interest is calculated (weekly, monthly, yearly etc.).

Step 5

You can rely on online calculators to get interest rates for complex loans, such as mortgages. You should also know the interest rate of your loan when you sign up for it.

For fluctuating rates, sometimes it becomes difficult to determine what a certain rate means. So, it is better to use free online calculators by searching “variable APR interest calculator”, “mortgage interest calculator” etc.

Calculation of interest when rate of interest and cash price is given

(1) Where Cash Price, Interest Rate and Installments are Given:

Illustration:

On 1st January 2003, A bought a television from a seller under the Hire Purchase System, the cash price of which was Rs 10,450, as per the following terms:

(a) Rs 3,000 to be paid on signing the agreement.

(b) Balance to be paid in three equal installments of Rs 3,000 at the end of each year,

(c) The rate of interest charged by the seller is 10% per annum.

You are required to calculate the interest paid by the buyer to the seller each year.

Solution:

Note:

  1. There is no time gap between the signing of the agreement and the cash down payment of Rs 3,000 (1.1.2003). Hence no interest is calculated on it. The entire amount goes to reduce the cash price.
  2. The interest in the last installment is taken at the differential figure of Rs 285.50 (3,000 – 2,714.50).
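One way to reproduce the figures mentioned in the notes is the short Python sketch below. It assumes the cash price is Rs 10,450, that interest is charged on the balance outstanding at the start of each year, and that the last year's interest is the balancing figure; the function name is hypothetical.

```python
def hire_purchase_schedule(cash_price, down_payment, installments, annual_rate):
    """Return (year, opening_balance, interest, principal_repaid) rows.

    Interest is charged on the opening balance of each year. In the final
    year it is taken as the balancing figure (installment minus the
    remaining balance), as described in note 2.
    """
    balance = cash_price - down_payment  # the down payment carries no interest
    rows = []
    for year, installment in enumerate(installments, start=1):
        if year == len(installments):
            interest = installment - balance
        else:
            interest = balance * annual_rate
        principal = installment - interest
        rows.append((year, balance, interest, principal))
        balance -= principal
    return rows


schedule = hire_purchase_schedule(10_450, 3_000, [3_000, 3_000, 3_000], 0.10)
for year, opening, interest, principal in schedule:
    print(f"Year {year}: opening balance {opening:.2f}, "
          f"interest {interest:.2f}, principal {principal:.2f}")
```

The third row starts from a balance of Rs 2,714.50 and shows interest of Rs 285.50, the two figures quoted in note 2.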

(2) Where Cash Price and Installments are Given but Rate of Interest is Omitted:

Where the rate of interest is not given and only the cash price and the total payments under hire purchase installments are given, then the total interest paid is the difference between the cash price of the asset and the total amount paid as per the agreement. This interest amount is apportioned in the ratio of amount outstanding at the end of each period.

Illustration:

Mr. A bought a machine under hire purchase agreement, the cash price of the machine being Rs 18,000. As per the terms, the buyer has to pay Rs 4,000 on signing the agreement and the balance in four installments of Rs 4,000 each, payable at the end of each year. Calculate the interest chargeable at the end of each year.
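A possible working for this illustration is sketched below in Python. It assumes the usual textbook convention that the total interest is shared in the ratio of the hire purchase price still unpaid at the start of each year; the function name is hypothetical.

```python
def apportion_interest(cash_price, down_payment, installments):
    """Split total interest in the ratio of the amounts outstanding each year."""
    total_interest = down_payment + sum(installments) - cash_price
    # Hire purchase price still unpaid at the start of each year
    outstanding = []
    remaining = sum(installments)
    for installment in installments:
        outstanding.append(remaining)
        remaining -= installment
    total_weight = sum(outstanding)
    return [total_interest * amount / total_weight for amount in outstanding]


yearly_interest = apportion_interest(18_000, 4_000, [4_000] * 4)
print(yearly_interest)  # outstanding amounts are in the ratio 4:3:2:1
```

Total payments are Rs 20,000 against a cash price of Rs 18,000, so the Rs 2,000 of interest is split as Rs 800, Rs 600, Rs 400 and Rs 200.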

(3) Where installments and Rate of Interest are Given but Cash Value of the Asset is Omitted:

In certain problems, the cash price is not given. It is necessary that we must first find out the cash price and interest included in the installments. The asset account is to be debited with the actual price of the asset. Under such situations, i.e. in the absence of cash price, the interest is calculated from the last year.

It may be noted that the amount of interest goes on increasing as we move back from the 3rd year to the 2nd year and from the 2nd year to the 1st year. Since the interest is included in the installments, knowing the rate of interest lets us find out the cash price.

Thus:

Let the cash price outstanding be: Rs 100

Interest @ 10% on Rs 100 for a year: Rs 10

Installment paid at the end of the year: Rs 110

Interest as a fraction of the installment = 10/110, i.e. 1/11.

Illustration:

I buy a television on Hire Purchase System.

The terms of payment are as follows:

Rs 2,000 to be paid on signing the agreement;

Rs 2,800 at the end of the first year;

Rs 2,600 at the end of the second year;

Rs 2,400 at the end of the third year;

Rs 2,200 at the end of the fourth year.

If interest is charged at the rate of 10% p.a., what was the cash value of the television?

Solution:
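The backward calculation described above can be sketched in Python as follows. The function name is hypothetical; the logic simply applies the 1/11 ratio (rate ÷ (1 + rate) in general) starting from the last installment.

```python
def cash_price_from_installments(down_payment, installments, annual_rate):
    """Work backward from the last installment to the cash price."""
    balance = 0.0  # principal outstanding after the final installment
    interest_by_year = []
    for installment in reversed(installments):
        amount_due = balance + installment  # principal plus one year's interest
        interest = amount_due * annual_rate / (1 + annual_rate)  # 1/11 at 10%
        interest_by_year.append(interest)
        balance = amount_due - interest  # principal at the start of that year
    interest_by_year.reverse()
    return down_payment + balance, interest_by_year


cash_price, interest_by_year = cash_price_from_installments(
    2_000, [2_800, 2_600, 2_400, 2_200], 0.10
)
print(round(cash_price, 2), [round(i, 2) for i in interest_by_year])
```

Under these assumptions the interest is Rs 200 in the fourth year, Rs 400 in the third, Rs 600 in the second and Rs 800 in the first, which gives a cash value of Rs 10,000 including the down payment.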

(4) Calculation of Cash Price when Reference to Annuity Table, the Rate of Interest and Installments are Given:

Sometimes the problem refers to an annuity table, which gives the present value of an annuity for a number of years at a certain rate of interest. In such cases the cash price is calculated by multiplying the amount of the installment by the present value factor and adding the product to the initial payment, if any.

Illustration:

A agrees to purchase a machine from a seller under Hire Purchase System by annual installment of Rs 10,000 over a period of 5 years. The seller charges interest at 4% p.a. on yearly balance.

N.B. The present value of Re 1 p.a. for five years at 4% is Rs 4.4518. Find out the cash price of the machine.

Solution:

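Present value of an installment of Re 1 p.a. for five years = Rs 4.4518

Present value of an installment of Rs 10,000 p.a. = Rs 4.4518 x 10,000 = Rs 44,518

As no initial payment is mentioned, the cash price of the machine is Rs 44,518.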
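The annuity table figure itself can be checked with a short Python sketch. The function name is hypothetical; the formula is the standard present value factor of Re 1 received at the end of each period.

```python
def annuity_factor(rate, periods):
    """Present value of 1 received at the end of each period."""
    return (1 - (1 + rate) ** -periods) / rate


factor = annuity_factor(0.04, 5)
cash_price = 10_000 * factor
print(round(factor, 4), round(cash_price, 2))
```

At 4% for five years the factor rounds to 4.4518, the value given in the problem.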

Determinants of the Value of Bonds

Bonds are fixed-income securities that represent a loan from an investor to a borrower, typically a corporation or government. When purchasing a bond, the investor lends money in exchange for periodic interest payments and the return of the bond’s face value at maturity. Bonds are used to finance various projects and operations, providing a predictable income stream for investors.

Valuation of Bonds

The method for valuation of bonds involves three steps as follows:

Step 1: Estimate the expected cash flows

Step 2: Determine the appropriate interest rate that should be used to discount the cash flows.

Step 3: Calculate the present value of the expected cash flows (Step 1) using the appropriate interest rate (Step 2), i.e. discount the expected cash flows.

Step 1: Estimating cash flows

Cash flow is the cash that is estimated to be received in future from investment in a bond. There are only two types of cash flows that can be received from investment in bonds i.e. coupon payments and principal payment at maturity.

The usual cash flow cycle of a bond is that coupon payments are received at regular intervals as per the bond agreement, and the final coupon plus the principal payment is received at maturity. There are some instances when bonds don’t follow this regular pattern. Unusual patterns may be the result of a different type of bond, such as a zero-coupon bond, in which there are no coupon payments. Considering such factors, it is important for an analyst to estimate the cash flows accurately for the purpose of bond valuation.

Step 2: Determine the appropriate interest rate to discount the cash flows

Once the cash flow for the bond is estimated, the next step is to determine the appropriate interest rate to discount the cash flows. The minimum interest rate that an investor should require is the interest available in the marketplace for default-free cash flows. Default-free cash flows are cash flows from a debt security that is regarded as completely safe, with practically no chance of default. Such securities are usually issued by the national government; in the USA, for example, they are U.S. Treasury securities.

Consider a situation where an investor wants to invest in bonds. If he is considering corporate bonds, he expects to earn a higher return from them than from U.S. Treasury securities. This is because a corporate bond might default, whereas a U.S. Treasury security is regarded as free of default risk. As he is taking a higher risk by investing in corporate bonds, he expects a higher return.

One may use single interest rate or multiple interest rates for valuation.

Step 3: Discounting the expected cash flows

Now that we already have values of expected future cash flows and interest rate used to discount the cash flow, it is time to find the present value of cash flows. Present Value of a cash flow is the amount of money that must be invested today to generate a specific future value. The present value of a cash flow is more commonly known as discounted value.

The present value of a cash flow depends on two determinants:

  • When the cash flow will be received, i.e. the timing of the cash flow; and
  • The required interest rate, more widely known as the Discount Rate (the rate determined in Step 2)

First, we calculate the present value of each expected cash flow. Then we add all the individual present values and the resultant sum is the value of the bond.

The formula to find the present value of one cash flow is:

Present value formula for Bond Valuation

Present Value n = Expected cash flow in period n / (1+i)^n

Here,

i = rate of return/discount rate on the bond
n = number of periods until the cash flow is received

By this formula, we get the present value of each individual cash flow received n periods from now. The next step is to add all the individual present values.

Bond Value = Present Value 1 + Present Value 2 + ……. + Present Value n
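The three steps can be put together in a small Python sketch. It assumes a plain bond with annual coupons and a single discount rate; the function name and the bond terms (face value 1,000, 8% coupon, 3 years) are hypothetical.

```python
def bond_value(face_value, coupon_rate, discount_rate, periods):
    """Sum of the present values of the coupons and the principal."""
    coupon = face_value * coupon_rate
    # Step 1: expected cash flows (the last one includes the principal)
    cash_flows = [coupon] * periods
    cash_flows[-1] += face_value
    # Steps 2 and 3: discount each cash flow at the chosen rate and add up
    return sum(cf / (1 + discount_rate) ** n
               for n, cf in enumerate(cash_flows, start=1))


par_bond = bond_value(1_000, 0.08, 0.08, 3)       # coupon rate equals discount rate
discount_bond = bond_value(1_000, 0.08, 0.10, 3)  # discount rate above coupon rate
print(round(par_bond, 2), round(discount_bond, 2))
```

When the discount rate equals the coupon rate the bond is worth its face value; a higher discount rate gives a value below face value.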

Sampling and Sampling Distribution

Sample design is the framework, or road map, that serves as the basis for the selection of a survey sample and affects many other important aspects of a survey as well. In a broad context, survey researchers are interested in obtaining some type of information through a survey for some population, or universe, of interest. One must define a sampling frame that represents the population of interest, from which a sample is to be drawn. The sampling frame may be identical to the population, or it may be only part of it and therefore subject to some undercoverage, or it may have an indirect relationship to the population.

Sampling is the process of selecting a subset of individuals, items, or observations from a larger population to analyze and draw conclusions about the entire group. It is essential in statistics when studying the entire population is impractical, time-consuming, or costly. Sampling can be done using various methods, such as random, stratified, cluster, or systematic sampling. The main objectives of sampling are to ensure representativeness, reduce costs, and provide timely insights. Proper sampling techniques enhance the reliability and validity of statistical analysis and decision-making processes.

Steps in Sample Design

While developing a sampling design, the researcher must pay attention to the following points:

  • Type of Universe:

The first step in developing any sample design is to clearly define the set of objects, technically called the Universe, to be studied. The universe can be finite or infinite. In a finite universe the number of items is certain, but in the case of an infinite universe the number of items is infinite, i.e., we cannot have any idea about the total number of items. The population of a city, the number of workers in a factory and the like are examples of finite universes, whereas the number of stars in the sky, the listeners of a specific radio programme, the throws of a die etc. are examples of infinite universes.

  • Sampling unit:

A decision has to be taken concerning a sampling unit before selecting sample. Sampling unit may be a geographical one such as state, district, village, etc., or a construction unit such as house, flat, etc., or it may be a social unit such as family, club, school, etc., or it may be an individual. The researcher will have to decide one or more of such units that he has to select for his study.

  • Source list:

It is also known as ‘sampling frame’ from which sample is to be drawn. It contains the names of all items of a universe (in case of finite universe only). If source list is not available, researcher has to prepare it. Such a list should be comprehensive, correct, reliable and appropriate. It is extremely important for the source list to be as representative of the population as possible.

  • Size of Sample:

This refers to the number of items to be selected from the universe to constitute a sample. This is a major problem before a researcher. The size of the sample should neither be excessively large nor too small. It should be optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. While deciding the size of the sample, the researcher must determine the desired precision as well as an acceptable confidence level for the estimate. The size of the population variance needs to be considered, because a larger variance usually calls for a bigger sample. The size of the population must be kept in view, for this also limits the sample size. The parameters of interest in a research study must be kept in view while deciding the size of the sample. Costs too dictate the size of the sample that we can draw. As such, the budgetary constraint must invariably be taken into consideration when we decide the sample size.

  • Parameters of interest:

In determining the sample design, one must consider the question of the specific population parameters which are of interest. For instance, we may be interested in estimating the proportion of persons with some characteristic in the population, or we may be interested in knowing some average or the other measure concerning the population. There may also be important sub-groups in the population about whom we would like to make estimates. All this has a strong impact upon the sample design we would accept.

  • Budgetary constraint:

Cost considerations, from practical point of view, have a major impact upon decisions relating to not only the size of the sample but also to the type of sample. This fact can even lead to the use of a non-probability sample.

  • Sampling procedure:

Finally, the researcher must decide the type of sample he will use i.e., he must decide about the technique to be used in selecting the items for the sample. In fact, this technique or procedure stands for the sample design itself. There are several sample designs (explained in the pages that follow) out of which the researcher must choose one for his study. Obviously, he must select that design which, for a given sample size and for a given cost, has a smaller sampling error.

Types of Samples

  • Probability Sampling (Representative samples)

Probability samples are selected in such a way as to be representative of the population. They provide the most valid or credible results because they reflect the characteristics of the population from which they are selected (e.g., residents of a particular community, students at an elementary school, etc.). There are two types of probability samples: random and stratified.

  • Random Sample

The term random has a very precise meaning. Each individual in the population of interest has an equal likelihood of selection. This is a very strict meaning: you can’t just collect responses on the street and have a random sample.

The assumption of an equal chance of selection means that sources such as a telephone book or voter registration lists are not adequate for providing a random sample of a community. In both these cases there will be a number of residents whose names are not listed. Telephone surveys get around this problem by random-digit dialling but that assumes that everyone in the population has a telephone. The key to random selection is that there is no bias involved in the selection of the sample. Any variation between the sample characteristics and the population characteristics is only a matter of chance.

  • Stratified Sample

A stratified sample is a mini-reproduction of the population. Before sampling, the population is divided by characteristics of importance for the research, for example, by gender, social class, education level, religion, etc. Then the population is randomly sampled within each category or stratum. If 38% of the population is college-educated, then 38% of the sample is randomly selected from the college-educated population.

Stratified samples are as good as or better than random samples, but they require fairly detailed advance knowledge of the population characteristics, and therefore are more difficult to construct.
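The difference between the two probability samples can be sketched with Python's standard `random` module. The population below is made up to match the 38% figure used above, and the function names are hypothetical.

```python
import random


def simple_random_sample(population, k, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, k)


def stratified_sample(population, key, k, rng):
    """Proportional allocation: each stratum supplies its population share.

    Rounding the shares can make the total differ slightly from k when the
    proportions are not whole numbers.
    """
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for members in strata.values():
        share = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical population of 1,000 people, 38% college-educated
population = [("college", i) for i in range(380)] + [("other", i) for i in range(620)]

sample = stratified_sample(population, key=lambda unit: unit[0], k=100, rng=rng)
college_count = sum(1 for unit in sample if unit[0] == "college")
print(college_count, len(sample))
```

The stratified sample contains exactly 38 college-educated people out of 100, whereas a simple random sample of 100 would match that share only on average.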

  • Non-probability Samples (Non-representative samples)

As they are not truly representative, non-probability samples are less desirable than probability samples. However, a researcher may not be able to obtain a random or stratified sample, or it may be too expensive. A researcher may not care about generalizing to a larger population. The validity of non-probability samples can be increased by trying to approximate random selection, and by eliminating as many sources of bias as possible.

  • Quota Sample

The defining characteristic of a quota sample is that the researcher deliberately sets the proportions of levels or strata within the sample. This is generally done to ensure the inclusion of a particular segment of the population. The proportions may or may not differ dramatically from the actual proportion in the population. The researcher sets a quota, independent of population characteristics.

Example: A researcher is interested in the attitudes of members of different religions towards the death penalty. In Iowa a random sample might miss Muslims (because there are not many in that state). To be sure of their inclusion, a researcher could set a quota of 3% Muslim for the sample. However, the sample will no longer be representative of the actual proportions in the population. This may limit generalizing to the state population. But the quota will guarantee that the views of Muslims are represented in the survey.

  • Purposive Sample

A purposive sample is a non-representative subset of some larger population, and is constructed to serve a very specific need or purpose. A researcher may have a specific group in mind, such as high-level business executives. It may not be possible to specify the population: its members would not all be known, and access will be difficult. The researcher will attempt to zero in on the target group, interviewing whoever is available.

  • Convenience Sample

A convenience sample is a matter of taking what you can get. It is an accidental sample. Although selection may be unguided, it probably is not random, using the correct definition of everyone in the population having an equal chance of being selected. Volunteers would constitute a convenience sample.

Non-probability samples are limited with regard to generalization. Because they do not truly represent a population, we cannot make valid inferences about the larger group from which they are drawn. Validity can be increased by approximating random selection as much as possible, and making every attempt to avoid introducing bias into sample selection.

Sampling Distribution

Sampling Distribution is a statistical concept that describes the probability distribution of a given statistic (e.g., mean, variance, or proportion) derived from repeated random samples of a specific size taken from a population. It plays a crucial role in inferential statistics, providing the foundation for making predictions and drawing conclusions about a population based on sample data.

Concepts of Sampling Distribution

A sampling distribution is the distribution of a statistic (not raw data) over all possible samples of the same size from a population. Commonly used statistics include the sample mean (X̄), sample variance, and sample proportion.

Purpose:

It allows statisticians to estimate population parameters, test hypotheses, and calculate probabilities for statistical inference.

Shape and Characteristics:

    • The shape of the sampling distribution depends on the population distribution and the sample size.
    • For large sample sizes, the Central Limit Theorem states that the sampling distribution of the mean will be approximately normal, regardless of the population’s distribution.

Importance of Sampling Distribution

  • Facilitates Statistical Inference:

Sampling distributions are used to construct confidence intervals and perform hypothesis tests, helping to infer population characteristics.

  • Standard Error:

The standard deviation of the sampling distribution, called the standard error, quantifies the variability of the sample statistic. Smaller standard errors indicate more reliable estimates.

  • Links Population and Samples:

It provides a theoretical framework that connects sample statistics to population parameters.

Types of Sampling Distributions

  • Distribution of Sample Means:

Shows the distribution of means from all possible samples of a population.

  • Distribution of Sample Proportions:

Shows the distribution of the proportion of a certain outcome across samples; it is used in binomial settings.

  • Distribution of Sample Variances:

Shows the distribution of the variances calculated from repeated samples.

Example

Consider a population of students’ test scores with a mean of 70 and a standard deviation of 10. If we repeatedly draw random samples of size 30 and calculate the sample mean, the distribution of those means forms the sampling distribution. This distribution will have a mean close to 70 and a reduced standard deviation (the standard error), here about 10 ÷ √30 ≈ 1.83.
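This example can be simulated with Python's standard library. The sketch assumes, purely for illustration, that the scores are normally distributed; the number of repetitions (5,000) and the seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumption for illustration: scores follow a normal distribution (mean 70, sd 10)
sample_means = [
    statistics.fmean(rng.gauss(70, 10) for _ in range(30))  # one sample of size 30
    for _ in range(5_000)                                   # repeated 5,000 times
]

mean_of_means = statistics.fmean(sample_means)
standard_error = statistics.stdev(sample_means)
theoretical_se = 10 / 30 ** 0.5  # population sd divided by sqrt(n)

print(round(mean_of_means, 2), round(standard_error, 2), round(theoretical_se, 2))
```

The simulated mean of the sample means should land close to 70, and their standard deviation close to the theoretical standard error.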

Present Value, Functions

Present Value (PV) concept refers to the current worth of a future sum of money or stream of cash flows, discounted at a specific interest rate. It reflects the principle that a dollar today is worth more than a dollar in the future due to its potential earning capacity.

PV = FV / (1+r)^n

where

FV is the future value,

r is the discount rate,

n is the number of periods until payment.

This concept is essential in finance for assessing investment opportunities and financial planning.

Functions of Present Value:

  • Valuation of Cash Flows:

PV allows investors and analysts to evaluate the worth of future cash flows generated by an investment. By discounting future cash flows to their present value, stakeholders can determine if the investment is financially viable compared to its cost.

  • Investment Decision Making:

In capital budgeting, PV is crucial for assessing whether to proceed with projects or investments. By comparing the present value of expected cash inflows to the initial investment (cost), decision-makers can prioritize projects that offer the highest returns relative to their costs.

  • Comparison of Investment Alternatives:

PV provides a standardized method for comparing different investment opportunities. By converting future cash flows into their present values, investors can effectively evaluate and contrast various investments, regardless of their cash flow patterns or timing.

  • Financial Planning:

Individuals and businesses use PV for financial planning and retirement savings. By calculating the present value of future financial goals (like retirement funds), individuals can determine how much they need to save and invest today to achieve those goals.

  • Debt Valuation:

PV is essential for valuing bonds and other debt instruments. The present value of future interest payments and the principal repayment is calculated to determine the fair market value of the bond. This valuation helps investors make informed decisions about purchasing or selling bonds.

  • Risk Assessment:

Present Value helps in assessing the risk associated with investments. Higher discount rates, which account for risk and uncertainty, lower the present value of future cash flows. This relationship allows investors to gauge the risk-return trade-off of different investments effectively.

Present Value of a Single Flow:

Used when we have a single future amount to be received after a certain time.

Formula:

PV = FV / (1+r)^n

Example:

You will receive ₹15,000 after 3 years. What is its present value if the discount rate is 10%?

Future Value (₹) | Years | Rate (%) | PV (₹)
15,000 | 3 | 10 | 11,270

Present Value of Uneven Cash Flows

This applies when cash flows are not equal each year. Each amount is discounted separately.

Example:

You will receive ₹2,000 in Year 1, ₹3,000 in Year 2, and ₹4,000 in Year 3. Discount rate = 10%

Year | Cash Flow (₹) | PV Factor @10% | Present Value (₹)
1 | 2,000 | 0.909 | 1,818
2 | 3,000 | 0.826 | 2,478
3 | 4,000 | 0.751 | 3,004
Total | | | 7,300
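Both calculations can be reproduced with a short Python sketch. The function names are hypothetical. The code discounts with unrounded factors, so the uneven cash flow total comes out a little above the ₹7,300 obtained from three-decimal PV factors.

```python
def present_value(future_value, rate, periods):
    """PV = FV / (1 + r)^n for a single future amount."""
    return future_value / (1 + rate) ** periods


def present_value_of_flows(cash_flows, rate):
    """cash_flows[0] arrives at the end of year 1, cash_flows[1] at year 2, ..."""
    return sum(present_value(cf, rate, year)
               for year, cf in enumerate(cash_flows, start=1))


single = present_value(15_000, 0.10, 3)
uneven = present_value_of_flows([2_000, 3_000, 4_000], 0.10)
print(round(single, 2), round(uneven, 2))
```

Discounting each amount separately in this way works for any pattern of cash flows, equal or not.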

Present Value of an Annuity (Ordinary Annuity):

Used when you receive equal payments at the end of each period for a specific number of years.

Formula:

PV = PMT × [1 − (1+r)^−n] / r

where PMT is the payment received each period.

Example:

You will receive ₹2,000 every year for 3 years. Discount rate = 10%

PV = 2,000 × [(1 − (1+0.10)^−3) / 0.10] = 2,000 × 2.487 = ₹4,974

Year | Payment (₹) | PV Factor @10% | PV (₹)
1 | 2,000 | 0.909 | 1,818
2 | 2,000 | 0.826 | 1,652
3 | 2,000 | 0.751 | 1,502
Total | | | 4,972

The table total is ₹2 lower than ₹4,974 only because the PV factors are rounded to three decimals.
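As a cross-check, the sketch below computes the annuity value in two ways: with the closed-form annuity formula and by discounting each payment year by year. The function names are hypothetical; with unrounded factors both routes give about ₹4,973.70.

```python
def annuity_pv_closed_form(payment, rate, periods):
    """payment * (1 - (1 + r)^-n) / r."""
    return payment * (1 - (1 + rate) ** -periods) / rate


def annuity_pv_by_year(payment, rate, periods):
    """Discount each end-of-year payment separately and add the results."""
    return sum(payment / (1 + rate) ** year for year in range(1, periods + 1))


closed_form = annuity_pv_closed_form(2_000, 0.10, 3)
by_year = annuity_pv_by_year(2_000, 0.10, 3)
print(round(closed_form, 2), round(by_year, 2))
```

The closed form is only a shortcut for the year-by-year sum, which is why the two results agree.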

Future Value, Functions, Types

Future Value (FV) is the value of a current asset at a future date based on an assumed rate of growth. The future value (FV) is important to investors and financial planners as they use it to estimate how much an investment made today will be worth in the future. Knowing the future value enables investors to make sound investment decisions based on their anticipated needs.

FV calculation allows investors to predict, with varying degrees of accuracy, the amount of profit that can be generated by different investments. The amount of growth generated by holding a given amount in cash will likely be different than if that same amount were invested in stocks; so, the FV equation is used to compare multiple options.

Determining the FV of an asset can become complicated, depending on the type of asset. Also, the FV calculation is based on the assumption of a stable growth rate. If money is placed in a savings account with a guaranteed interest rate, then the FV is easy to determine accurately. However, investments in the stock market or other securities with a more volatile rate of return can present greater difficulty.

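The Future Value (FV) formula assumes a constant rate of growth and a single upfront payment left untouched for the duration of the investment. The FV calculation can be done one of two ways depending on the type of interest being earned. If an investment earns simple interest, then the Future Value (FV) formula is:

FV = PV × (1 + r × t)

where PV is the amount invested, r is the interest rate per period and t is the number of periods. If the investment earns compound interest, the formula is FV = PV × (1+r)^t.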

  • Future value (FV) is the value of a current asset at some point in the future based on an assumed growth rate.
  • Investors are able to reasonably assume an investment’s profit using the future value (FV) calculation.
  • Determining the future value (FV) of a market investment can be challenging because of the market’s volatility.
  • There are two ways of calculating the future value (FV) of an asset: FV using simple interest and FV using compound interest.

Functions of Future Value:

  • Investment Growth Measurement:

FV is used to calculate how much an investment will grow over time. By applying a specified interest rate, investors can estimate the future worth of their initial investments or savings, helping them understand the potential returns.

  • Retirement Planning:

FV plays a critical role in retirement planning. Individuals can determine how much they need to save today to achieve a desired retirement income. By calculating the future value of regular contributions to retirement accounts, they can set realistic savings goals.

  • Loan Repayment Calculations:

For borrowers, FV is crucial in understanding the total amount owed on loans over time. It helps them visualize the long-term cost of borrowing, including interest payments, aiding in budgeting and financial decision-making.

  • Comparison of Investment Opportunities:

FV provides a standardized way to compare different investment options. By calculating the future value of various investment opportunities, investors can evaluate which options offer the highest potential returns over a specified period.

  • Education Funding:

Parents can use FV to plan for their children’s education expenses. By estimating future tuition costs and calculating how much they need to save now, parents can ensure they accumulate sufficient funds by the time their children enter college.

  • Inflation Adjustment:

FV helps investors account for inflation when planning for future expenses. By incorporating an expected inflation rate into future value calculations, individuals and businesses can better estimate the amount needed to maintain purchasing power over time.

Future Value of a Single Flow:

This occurs when a single sum of money is invested for a certain period at a given interest rate.

Formula:

FV = PV × (1+r)^n

Example:

Suppose ₹10,000 is invested for 3 years at 10% annual interest.

Year | Calculation | Future Value (₹)
3 | ₹10,000 × (1 + 0.10)^3 | ₹13,310
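The calculation above can be reproduced in a few lines of Python. This is a minimal sketch, assuming interest is compounded once per year; the function name future_value and its parameter names are illustrative and do not come from the original text.

```python
def future_value(pv: float, rate: float, years: int) -> float:
    """Future value of a single sum: FV = PV * (1 + r)^n (annual compounding assumed)."""
    return pv * (1 + rate) ** years


# Values from the example: 10,000 invested for 3 years at 10% per year.
principal = 10_000
rate = 0.10

# Show how the balance grows at the end of each year.
for year in range(1, 4):
    print(year, round(future_value(principal, rate, year), 2))

fv_3_years = future_value(principal, rate, 3)
```

The loop prints balances of about ₹11,000, ₹12,100, and ₹13,310, so the interest earned in each year is itself earning interest in the following years.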

Range and co-efficient of Range

The range is a measure of dispersion that represents the difference between the highest and lowest values in a dataset. It provides a simple way to understand the spread of data. While easy to calculate, the range is sensitive to outliers and does not provide information about the distribution of values between the extremes.

The range of a distribution measures the width (or spread) of the values taken by the corresponding random variable. For example, suppose X is the age of human beings and Y is the age of turtles. From general knowledge, we expect the range of Y to be larger than the range of X.

Humans typically live for about 50-60 years, while turtles can live for about 150-200 years. The values taken by Y are therefore spread from 0 up to 150 years or more, while those of X cover a much smaller interval. Thus, qualitatively, you have already understood what the range of a distribution means. The mathematical formula is:

Range = L – S

where

L: The largest/maximum value attained by the random variable under consideration

S: The smallest/minimum value.

Properties

  • The Range of a given distribution has the same units as the data points.
  • If a random variable is transformed into a new random variable by a change of scale and a shift of origin as:

Y = aX + b

where

Y: the new random variable

X: the original random variable

a,b: constants.

Then the ranges of X and Y can be related as:

R_Y = |a| × R_X

Clearly, the shift in origin doesn’t affect the shape of the distribution, and therefore its spread (or the width) remains unchanged. Only the scaling factor is important.

  • For a grouped class distribution, the Range is defined as the difference between the two extreme class boundaries.
  • A better measure of the spread of a distribution is the Coefficient of Range, given by:

Coefficient of Range (expressed as a percentage) = [(L – S) ÷ (L + S)] × 100

Here we take the ratio of the range to the sum of the two extreme values. Since it is a ratio, it is dimensionless, so one can use it to compare the spreads of two or more different distributions.

  • The range is an absolute measure of Dispersion of a distribution while the Coefficient of Range is a relative measure of dispersion.
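The properties above can be checked numerically. The following is a small sketch using made-up values; the function names data_range and coefficient_of_range, the sample list, and the constants a = -2 and b = 7 are all assumptions chosen for illustration.

```python
def data_range(values):
    """Range = L - S, the largest value minus the smallest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range as a percentage: (L - S) / (L + S) * 100.

    Only meaningful when L + S is not zero.
    """
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest) * 100


x = [12, 18, 25, 30, 45]          # hypothetical observations
a, b = -2, 7                      # hypothetical scale factor and shift
y = [a * value + b for value in x]

range_x = data_range(x)           # 45 - 12
range_y = data_range(y)           # equals |a| * range_x; b has no effect
coef_x = coefficient_of_range(x)  # (45 - 12) / (45 + 12) * 100
```

With these numbers the range of x is 33 and the range of y is 66, which is |a| times the range of x. The shift b drops out, as the property states.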

Because it considers only the end-points of a distribution, the range gives no information about the shape of the distribution between the extremes. Thus, we must move on to better measures of dispersion, such as the mean deviation.

Interquartile range (IQR)

The interquartile range is the middle half of the data. To visualize it, think about the median value that splits the dataset in half. Similarly, you can divide the data into quarters. Statisticians refer to these quarters as quartiles and denote them from low to high as Q1, Q2, Q3, and Q4. The lowest quartile (Q1) contains the quarter of the dataset with the smallest values. The upper quartile (Q4) contains the quarter of the dataset with the highest values. The interquartile range is the middle half of the data that is in between the upper and lower quartiles. In other words, the interquartile range includes the 50% of data points that fall in Q2 and Q3.

The IQR is the red area in the graph below.

The interquartile range is a robust measure of variability in a similar manner that the median is a robust measure of central tendency. Neither measure is influenced dramatically by outliers because they don’t depend on every value. Additionally, the interquartile range is excellent for skewed distributions, just like the median. As you’ll learn, when you have a normal distribution, the standard deviation tells you the percentage of observations that fall specific distances from the mean. However, this doesn’t work for skewed distributions, and the IQR is a great alternative.

I’ve divided the dataset below into quartiles. The interquartile range (IQR) extends from the low end of Q2 to the upper limit of Q3. For this dataset, the interquartile range runs from 21 to 39.
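As a rough illustration of how the IQR is computed and why it is robust, the sketch below uses Python's standard statistics.quantiles function. The data values are invented for this example and are not the dataset referred to above. Note that the code treats Q1 and Q3 as cut points (the 25th and 75th percentiles) that bound the middle half of the data, and that different quartile conventions can give slightly different answers; the default "exclusive" method is assumed here.

```python
from statistics import quantiles


def interquartile_range(values):
    """Return (Q1, Q3, IQR) using the default 'exclusive' quartile method."""
    q1, _median, q3 = quantiles(values, n=4)
    return q1, q3, q3 - q1


data = [4, 7, 9, 12, 15, 18, 20, 23, 26, 30, 41]   # hypothetical values
with_outlier = data[:-1] + [400]                   # replace 41 by an extreme value

q1, q3, iqr = interquartile_range(data)
_, _, iqr_outlier = interquartile_range(with_outlier)

# The ordinary range reacts strongly to the outlier; the IQR does not.
full_range = max(data) - min(data)
full_range_outlier = max(with_outlier) - min(with_outlier)
```

Here the quartile cut points are 9 and 26, so the IQR is 17 for both lists, while the ordinary range jumps from 37 to 396 once the extreme value is introduced.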

Skewness

Skewness is a statistical measure that indicates the degree and direction of asymmetry in a frequency distribution. When data is distributed evenly around the central value, the distribution is said to be symmetrical. However, if one side of the distribution extends farther than the other, the distribution is skewed.

In Business Statistics, skewness helps researchers and managers understand the nature of data distribution, identify trends, and make informed decisions. It is commonly used in the analysis of income, profits, wages, sales, investment returns, and market behavior.

Definition of Skewness

Skewness refers to the extent to which a distribution deviates from symmetry. It measures whether the observations are concentrated more on one side of the distribution than the other.

A distribution may be:

  • Symmetrical
  • Positively Skewed
  • Negatively Skewed

Types of Skewness

1. Symmetrical Distribution

A symmetrical distribution has equal frequencies on both sides of the central value.

Characteristics

  • Mean = Median = Mode
  • No skewness
  • Skewness coefficient = 0

Example: The distribution of heights of a large group of people often approximates a symmetrical distribution.


2. Positive Skewness (Right Skewness)

A distribution is positively skewed when the tail extends toward the right side.

Characteristics

  • Mean > Median > Mode
  • More observations are concentrated at lower values.
  • A few high values pull the mean to the right.

Example: Income distribution in many countries where a small number of people earn very high incomes.


3. Negative Skewness (Left Skewness)

A distribution is negatively skewed when the tail extends toward the left side.

Characteristics

  • Mean < Median < Mode
  • More observations are concentrated at higher values.
  • A few low values pull the mean to the left.

Example: Marks obtained in an easy examination where most students score high marks.
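The three cases above can be illustrated with a short sketch. The text does not say which skewness coefficient it has in mind, so the moment-based coefficient (third central moment divided by the 1.5th power of the variance) is assumed here; the function name and all data values are hypothetical.

```python
from statistics import mean, median


def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [20, 22, 25, 25, 27, 30, 35, 60, 95]   # a few very high values
left_skewed = [100 - v for v in right_skewed]         # mirror image of the above

for name, values in [("symmetric", symmetric),
                     ("right-skewed", right_skewed),
                     ("left-skewed", left_skewed)]:
    print(name, mean(values), median(values), moment_skewness(values))
```

For the right-skewed list the mean exceeds the median and the coefficient is positive; for the mirrored list the mean falls below the median and the coefficient is negative; for the symmetric list mean and median coincide and the coefficient is zero.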


Importance of Skewness

  • Helps Understand the Nature of Data Distribution

Skewness helps statisticians and business analysts understand whether a dataset is symmetrical or asymmetrical. It reveals the direction and degree of deviation from a normal distribution. By examining skewness, researchers can identify whether observations are concentrated toward higher or lower values. This understanding is essential for interpreting data accurately. In business statistics, knowing the nature of distribution helps managers evaluate performance, customer behavior, and market trends more effectively, leading to better analysis and decision-making.

  • Assists in Business Decision-Making

Business decisions often depend on accurate interpretation of statistical data. Skewness provides valuable insights into the distribution of sales, profits, costs, and customer preferences. By understanding whether data is positively or negatively skewed, managers can identify unusual patterns and take appropriate actions. It helps in resource allocation, strategic planning, and performance evaluation. Therefore, skewness serves as an important analytical tool that supports informed and rational decision-making in various business activities and organizational operations.

  • Useful in Forecasting and Planning

Forecasting future trends requires a proper understanding of past and present data. Skewness helps identify the distribution pattern of historical observations, enabling analysts to make more accurate predictions. If data is highly skewed, forecasting models may need adjustments to improve reliability. Businesses use skewness while planning production, inventory, marketing strategies, and financial investments. By understanding the direction of data concentration, organizations can anticipate future developments and prepare suitable plans, reducing uncertainty and improving operational efficiency.

  • Helps in Selecting Appropriate Statistical Methods

Many statistical techniques assume that data follows a normal or symmetrical distribution. Skewness helps determine whether these assumptions are valid. If a dataset is highly skewed, analysts may need to use alternative methods or transform the data before analysis. This ensures the accuracy and validity of statistical results. In research and business studies, selecting the correct analytical technique is crucial for drawing reliable conclusions. Therefore, skewness plays an important role in choosing suitable statistical tools and procedures.

  • Identifies the Presence of Extreme Values

Skewness helps detect the influence of extreme values or outliers in a dataset. A highly skewed distribution often indicates that a few observations are significantly larger or smaller than the majority. Identifying such values is important because they can affect averages, forecasts, and business decisions. Managers and researchers can investigate these unusual observations to determine whether they represent genuine trends or data errors. Thus, skewness contributes to more accurate data interpretation and enhances the quality of statistical analysis.

  • Useful in Financial and Investment Analysis

In finance, skewness is widely used to analyze investment returns, stock prices, and financial risks. Investors prefer to understand whether returns are concentrated around gains or losses. Positive and negative skewness provide information about potential opportunities and risks associated with investments. Financial analysts use skewness to evaluate portfolio performance and make informed investment decisions. Therefore, skewness is an important measure in risk assessment, helping businesses and investors manage uncertainty and improve financial planning.

  • Facilitates Comparison of Different Distributions

Skewness enables comparison between different datasets by showing the direction and degree of asymmetry. Two datasets may have similar averages but differ significantly in their distribution patterns. By measuring skewness, analysts can identify these differences and gain deeper insights into the data. Businesses often compare sales performance, customer behavior, employee productivity, and financial results using skewness measures. This comparative analysis helps managers understand relative performance and make more effective decisions based on statistical evidence.

  • Enhances Research and Market Analysis

Skewness is an important tool in research and market analysis because it provides information about consumer behavior, market demand, and economic conditions. Researchers use skewness to study patterns and identify trends within datasets. In marketing, understanding skewed distributions helps businesses segment customers and develop targeted strategies. It also assists in evaluating survey results and market responses. By offering a clearer picture of data behavior, skewness improves the quality of research findings and supports better business and policy decisions.

Limitations of Skewness

  • Highly Sensitive to Extreme Values

One of the major limitations of skewness is its sensitivity to extreme values or outliers. A few unusually large or small observations can significantly influence the skewness coefficient and create a misleading impression of the distribution. In business data, unusual sales figures, profits, or losses may distort the measure of skewness. As a result, the calculated value may not accurately represent the majority of observations. Therefore, analysts must carefully examine the presence of outliers before interpreting skewness and drawing conclusions from statistical data.

  • Does Not Measure Dispersion

Skewness measures only the asymmetry of a distribution and provides no information about the spread or variability of data. Two datasets may have the same skewness value but differ greatly in their dispersion. To understand the complete nature of a distribution, skewness must be used along with measures such as range, variance, and standard deviation. Relying solely on skewness can lead to incomplete analysis. Therefore, it should be considered as one aspect of statistical description rather than a comprehensive measure of data characteristics.

  • Different Methods May Give Different Results

There are several methods of measuring skewness, including Karl Pearson’s, Bowley’s, and Kelly’s coefficients. These methods are based on different statistical concepts and may produce different values for the same dataset. Such variations can create confusion in interpretation and comparison. Analysts may find it difficult to determine which measure best represents the distribution. Consequently, the existence of multiple methods reduces the uniformity of skewness measurement and sometimes complicates statistical analysis, especially when comparing results from different studies or datasets.

  • Difficult to Interpret Precisely

Although skewness indicates the direction and degree of asymmetry, its exact interpretation is often difficult. A positive or negative value shows the direction of skewness, but understanding the practical significance of a particular value may not be straightforward. For example, determining whether a skewness coefficient indicates moderate or severe asymmetry requires additional judgment. This complexity may create challenges for managers, researchers, and students. Therefore, skewness values should be interpreted carefully and in conjunction with graphical analysis and other statistical measures.

  • Not Reliable for Small Samples

Skewness may not provide reliable results when calculated from small samples. In small datasets, a few observations can greatly influence the measure, making it unstable and less representative of the population. Sampling fluctuations may cause skewness values to vary considerably from one sample to another. As a result, conclusions based on skewness from limited data may be misleading. For accurate interpretation, larger datasets are generally preferred. Therefore, analysts should exercise caution when using skewness to evaluate distributions based on small samples.

  • Cannot Fully Describe Distribution Shape

Skewness provides information only about asymmetry and does not fully describe the shape of a distribution. Other characteristics, such as kurtosis, modality, and dispersion, are also important for understanding data behavior. Two distributions may have identical skewness values but differ significantly in other aspects. Consequently, skewness alone cannot provide a complete picture of the dataset. Analysts must combine it with additional statistical measures and graphical tools to gain a thorough understanding of the distribution and make informed decisions.

  • Requires Accurate Data

The accuracy of skewness depends heavily on the quality of the data used. Errors in data collection, recording, classification, or tabulation can affect the calculated skewness coefficient and lead to incorrect conclusions. In business statistics, inaccurate sales, profit, or customer data may distort the measure of asymmetry. Therefore, reliable and properly verified data is essential for meaningful skewness analysis. This dependence on data accuracy represents a limitation because errors at any stage of data handling can reduce the usefulness of skewness measurements.

  • Limited Use When Used Alone

Skewness has limited usefulness when considered in isolation. While it provides information about asymmetry, it does not explain other important characteristics of the dataset. Effective statistical analysis requires the use of multiple measures, including averages, dispersion, and correlation. If skewness is used alone, analysts may overlook critical aspects of data behavior. Therefore, it should be regarded as a supplementary measure rather than a complete analytical tool. Combining skewness with other statistical techniques leads to more accurate interpretations and better decision-making.
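The point that different methods may give different results can be seen on a small example. The sketch below assumes the standard textbook forms of two of the coefficients named above, Karl Pearson's second coefficient, 3 × (mean – median) ÷ standard deviation, and Bowley's coefficient, (Q3 + Q1 – 2 × Q2) ÷ (Q3 – Q1). The formulas are not given in the text, the data is invented, and quartiles follow the default convention of statistics.quantiles.

```python
from statistics import mean, median, pstdev, quantiles


def pearson_skewness(values):
    """Karl Pearson's second coefficient: 3 * (mean - median) / std deviation."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


def bowley_skewness(values):
    """Bowley's quartile coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    q1, q2, q3 = quantiles(values, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)


incomes = [20, 22, 25, 25, 27, 30, 35, 60, 95]   # hypothetical, right-skewed

sk_pearson = pearson_skewness(incomes)
sk_bowley = bowley_skewness(incomes)
print(sk_pearson, sk_bowley)
```

Both coefficients are positive for this data, so they agree on the direction of skewness, but their magnitudes differ noticeably because one is built from the mean and standard deviation and the other from quartiles only.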

Kurtosis

Kurtosis is a statistical measure that describes the degree of peakedness or flatness of a frequency distribution in comparison with a normal distribution. It indicates how observations are concentrated around the mean and how the tails of the distribution behave.

In Business Statistics, kurtosis helps analysts understand the shape of a distribution and identify whether data contains extreme observations. It is widely used in finance, economics, market research, quality control, and risk analysis.

Definition of Kurtosis

Kurtosis is the measure of the shape of a distribution that indicates the extent to which observations cluster around the center and the thickness of the tails relative to a normal distribution.

The term Kurtosis was introduced by Karl Pearson.

Excess Kurtosis

An excess kurtosis is a metric that compares the kurtosis of a distribution against the kurtosis of a normal distribution. The kurtosis of a normal distribution equals 3. Therefore, the excess kurtosis is found using the formula below:

Excess Kurtosis = Kurtosis – 3

Types of Kurtosis

The types of kurtosis are determined by the excess kurtosis of a particular distribution. The excess kurtosis can take positive or negative values, as well as values close to zero.

1. Mesokurtic

Mesokurtic Distribution is a distribution that has the same degree of peakedness and tail thickness as a normal distribution. It serves as the standard or benchmark against which other types of kurtosis are compared. In a mesokurtic distribution, observations are moderately concentrated around the mean, and the tails are neither too heavy nor too light. The coefficient of kurtosis (β₂) is equal to 3, while excess kurtosis is 0. Many natural and social phenomena approximately follow a mesokurtic pattern. This type of distribution indicates a balanced spread of data without an unusual concentration of extreme values. In business statistics, mesokurtic distributions are often considered ideal because they reflect a normal and predictable pattern of observations.

Example: The distribution of examination scores in a large class often approximates a mesokurtic distribution.

2. Leptokurtic

Leptokurtic Distribution is more peaked than a normal distribution and has heavier tails. In this type of distribution, a large number of observations are concentrated near the mean, while the tails contain more extreme values than a normal distribution. The coefficient of kurtosis (β₂) is greater than 3, and excess kurtosis is positive. Because of its heavy tails, a leptokurtic distribution indicates a higher probability of extreme observations occurring. This characteristic is particularly important in finance and investment analysis, where sudden gains or losses may occur. In business statistics, leptokurtic distributions are useful for identifying situations involving high risk and volatility. The presence of a sharp peak and heavy tails suggests that observations cluster around the center but occasionally produce significant deviations from the average.

Example: Stock market returns often follow a leptokurtic distribution because extreme gains and losses occur more frequently than expected under a normal distribution.

3. Platykurtic

Platykurtic Distribution is flatter than a normal distribution and has lighter tails. In this type of distribution, observations are more evenly spread across the range of data, resulting in a broad and low central peak. The coefficient of kurtosis (β₂) is less than 3, while excess kurtosis is negative. Because the tails are lighter, extreme observations occur less frequently than in a normal distribution. A platykurtic distribution indicates greater dispersion and lower concentration of observations around the mean. In business statistics, such distributions may occur when data is uniformly distributed across different categories. The flatter shape suggests that observations are widely dispersed and that the likelihood of unusually high or low values is relatively small.

Example: The distribution of customer arrivals spread evenly throughout a day may exhibit a platykurtic pattern.
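The classification above can be sketched in code. The text uses the symbol β₂ without giving its formula, so the usual moment definition (fourth central moment divided by the squared variance) is assumed here. The tolerance used to call a sample mesokurtic is an arbitrary choice for illustration, and both datasets are invented.

```python
def kurtosis_beta2(values):
    """Coefficient of kurtosis: beta2 = m4 / m2**2 (population moments)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m4 / m2 ** 2


def classify(values, tolerance=0.1):
    """Label a sample by its excess kurtosis (beta2 - 3)."""
    excess = kurtosis_beta2(values) - 3
    if abs(excess) <= tolerance:       # arbitrary band around zero
        return "mesokurtic"
    return "leptokurtic" if excess > 0 else "platykurtic"


evenly_spread = list(range(1, 101))                   # values spread uniformly
heavy_tailed = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]    # tight centre, two extremes

print(classify(evenly_spread), kurtosis_beta2(evenly_spread) - 3)
print(classify(heavy_tailed), kurtosis_beta2(heavy_tailed) - 3)
```

The evenly spread values give a β₂ of roughly 1.8 (negative excess kurtosis, platykurtic), while the sample with a tight centre and two extreme values gives a β₂ above 3 (positive excess kurtosis, leptokurtic).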

Karl Pearson and Spearman Rank Correlation

Karl Pearson Coefficient of Correlation

Karl Pearson Coefficient of Correlation (also called the Pearson correlation coefficient or Pearson’s r) is a measure of the strength and direction of the linear relationship between two variables. It ranges from -1 to +1, where +1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship. Pearson’s r is calculated by dividing the covariance of the two variables by the product of their standard deviations. It is widely used in statistics to analyze the degree of correlation between paired data.

The following are the main properties of correlation.

1. Coefficient of Correlation lies between -1 and +1:

The coefficient of correlation cannot take a value less than -1 or more than +1. Symbolically,

-1 ≤ r ≤ +1, or |r| ≤ 1.

2. Coefficients of Correlation are independent of Change of Origin:

This property reveals that if we subtract any constant from all the values of X and Y, it will not affect the coefficient of correlation.

3. Coefficients of Correlation possess the property of symmetry:

The degree of relationship between two variables is symmetric, that is, the correlation of X with Y equals the correlation of Y with X: r_xy = r_yx.

4. Coefficient of Correlation is independent of Change of Scale:

This property reveals that if we divide or multiply all the values of X and Y by a positive constant, it will not affect the coefficient of correlation.

5. Co-efficient of correlation measures only linear correlation between X and Y.

6. If two variables X and Y are independent, coefficient of correlation between them will be zero.
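Several of these properties can be verified numerically. The sketch below computes r as the covariance divided by the product of the standard deviations, as described earlier; the function name pearson_r, the data, and the constants used for shifting and scaling are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r = covariance(X, Y) / (std(X) * std(Y))."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


x = [2, 4, 6, 8, 10]   # hypothetical paired observations
y = [1, 3, 4, 8, 9]

r = pearson_r(x, y)
r_shifted = pearson_r([v - 5 for v in x], [v + 100 for v in y])   # change of origin
r_scaled = pearson_r([3 * v for v in x], [v / 2 for v in y])      # change of scale
r_swapped = pearson_r(y, x)                                       # symmetry
r_flipped = pearson_r([-v for v in x], y)                         # negative scale factor

print(r, r_shifted, r_scaled, r_swapped, r_flipped)
```

The first four values agree up to floating-point rounding (about 0.979), confirming independence of origin and scale and the symmetry property. Multiplying only one variable by a negative constant keeps the magnitude but reverses the sign, which is why the scale property is usually stated for positive constants.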

Karl Pearson’s Coefficient of Correlation is a widely used mathematical method in which a numerical expression is used to calculate the degree and direction of the relationship between linearly related variables.

Pearson’s method, popularly known as the Pearsonian Coefficient of Correlation, is the most extensively used quantitative method in practice. The coefficient of correlation is denoted by “r”.

If the relationship between two variables X and Y is to be ascertained, then the following formula is used:

r = ∑(X – X̄)(Y – Ȳ) ÷ √[∑(X – X̄)² × ∑(Y – Ȳ)²]

Here, X̄ and Ȳ are the means of X and Y.

Properties of Coefficient of Correlation

  • The value of the coefficient of correlation (r) always lies between -1 and +1:

    r = +1, perfect positive correlation

    r = -1, perfect negative correlation

    r = 0, no correlation

  • The coefficient of correlation is independent of origin and scale. Independence of origin means that subtracting any constant from the given values of X and Y leaves the value of “r” unchanged. Independence of scale means that the value of “r” is unaffected if the values of X and Y are divided or multiplied by any positive constant.
  • The coefficient of correlation is the geometric mean of the two regression coefficients. Symbolically: r = ±√(b_xy × b_yx), where b_xy and b_yx are the regression coefficients of X on Y and of Y on X, and r takes their common sign.
  • The coefficient of correlation is zero when the variables X and Y are independent. However, the converse is not true.
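The last two properties can be illustrated with a short sketch. It assumes the usual least-squares regression coefficients, b_yx = ∑(X – X̄)(Y – Ȳ) ÷ ∑(X – X̄)² and b_xy = ∑(X – X̄)(Y – Ȳ) ÷ ∑(Y – Ȳ)², which the text does not spell out; all names and data are hypothetical.

```python
from math import copysign, sqrt


def deviation_sums(xs, ys):
    """Return sums of cross-products and squares of deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


# 1) r is the geometric mean of the two regression coefficients.
x = [2, 4, 6, 8, 10]
y = [1, 3, 4, 8, 9]
sxy, sxx, syy = deviation_sums(x, y)
r = sxy / sqrt(sxx * syy)
b_yx = sxy / sxx                     # regression coefficient of Y on X
b_xy = sxy / syy                     # regression coefficient of X on Y
r_from_regression = copysign(sqrt(b_yx * b_xy), b_yx)

# 2) Zero correlation does not imply independence: here v is fully
#    determined by u (v = u squared), yet r is zero.
u = [-2, -1, 0, 1, 2]
v = [value ** 2 for value in u]
suv, suu, svv = deviation_sums(u, v)
r_dependent = suv / sqrt(suu * svv)

print(r, r_from_regression, r_dependent)
```

The first two printed values coincide, and the third is zero even though v depends entirely on u. The relationship there is perfectly predictable but not linear, which is exactly the case the coefficient cannot detect.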

Assumptions of Karl Pearson’s Coefficient of Correlation

  • The relationship between the variables is “Linear”, which means when the two variables are plotted, a straight line is formed by the points plotted.
  • A large number of independent causes affect the variables under study, so that they follow a Normal Distribution. For example, variables like price, demand, and supply are affected by many such factors, so that a normal distribution is formed.
  • The variables are independent of each other.

Note: The coefficient of correlation measures not only the magnitude of the correlation but also its direction. For example, r = -0.67 shows that the correlation is negative, because the sign is “-”, and that its magnitude is 0.67.

Spearman Rank Correlation

Spearman rank correlation is a non-parametric test that is used to measure the degree of association between two variables.  The Spearman rank correlation test does not carry any assumptions about the distribution of the data and is the appropriate correlation analysis when the variables are measured on a scale that is at least ordinal.

The Spearman correlation between two variables is equal to the Pearson correlation between the rank values of those two variables; while Pearson’s correlation assesses linear relationships, Spearman’s correlation assesses monotonic relationships (whether linear or not). If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other.

Intuitively, the Spearman correlation between two variables will be high when observations have a similar (or identical for a correlation of 1) rank (i.e. relative position label of the observations within the variable: 1st, 2nd, 3rd, etc.) between the two variables, and low when observations have a dissimilar (or fully opposed for a correlation of −1) rank between the two variables.

The following formula is used to calculate the Spearman rank correlation (it applies when there are no tied ranks):

ρ = 1 – [6 × ∑di²] ÷ [n(n² – 1)]

where

ρ = Spearman rank correlation

di = the difference between the two ranks of the i-th observation

n = number of observations
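A minimal sketch of the rank-difference calculation is given below. It assumes there are no tied values, so each observation gets a distinct rank from 1 to n; the helper names ranks and spearman_rho and all data are made up for illustration.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), no ties assumed."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 5]
cubes = [value ** 3 for value in x]     # monotonic but not linear
mixed = [3, 1, 4, 2, 5]                 # partly agreeing ranks
reversed_order = [5, 4, 3, 2, 1]        # fully opposed ranks

rho_cubes = spearman_rho(x, cubes)
rho_mixed = spearman_rho(x, mixed)
rho_reversed = spearman_rho(x, reversed_order)
print(rho_cubes, rho_mixed, rho_reversed)
```

The cubic relationship is not linear, yet it gives ρ = 1 because the ranks agree exactly; the partly agreeing ranks give ρ = 0.5, and the fully reversed ranks give ρ = -1.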

Assumptions

The assumptions of the Spearman correlation are that data must be at least ordinal and the scores on one variable must be monotonically related to the other variable.
