BIET – MBA Programme, Davangere
1
Prof. Vijay K S Business Statistics and Analytics
Business Statistics and
Analytics
-: Working Manual:-
Name of the student:
Section:
USN Number:
Course Objectives:
1. To make the students learn about the applications of statistical tools and techniques
in decision making.
2. To emphasize the need for statistics and decision models in solving business
problems.
3. To enhance the knowledge on descriptive and inferential statistics.
4. To familiarize the students with analytical package MS Excel.
5. To develop analytical skills in students in order to comprehend and practice data
analysis at different levels.
Syllabus
Unit 1: (12 Hours) Introduction to Statistics: Meaning and Definition, functions, scope and
limitations, Collection and presentation of data, frequency distribution, measures of
central tendency - Mean, Median, Mode, Geometric mean, Harmonic mean,
Measures of dispersion: Range – Quartile Deviation – Mean Deviation – Standard Deviation
– Variance – Coefficient of Variation – Comparison of various measures of Dispersion
Unit 2: (8 Hours) Correlation and Regression: Scatter Diagram, Karl Pearson correlation,
Spearman’s Rank correlation (one way table only), simple and multiple regression
(problems on simple regression only)
Unit 3: (6 Hours) Probability Distribution: Concept and definition - Rules of probability –
Random variables – Concept of probability distribution – Theoretical probability
distributions: Binomial, Poisson, Normal and Exponential – Bayes' theorem (No
derivation) (Problems only on Binomial, Poisson and Normal).
Unit 4: (10 Hours) Time Series Analysis: Introduction - Objectives Of Studying Time Series
Analysis - Variations In Time Series - Methods Of Estimating Trend: Freehand Method -
Moving Average Method - Semi-Average Method – Least Square Method. Methods of
Estimating Seasonal Index: Method Of Simple Averages - Ratio To Trend Method - Ratio
To Moving Average Method
Unit 5: (8 Hours) Linear Programming: structure, advantages, disadvantages, formulation
of LPP, solution using Graphical method.
Transportation problem: basic feasible solution using NWCM, LCM, and VAM; unbalanced,
restricted and maximization problems.
Unit 6: (8 Hours) Project Management: Introduction – Basic difference between PERT &
CPM – Network components and precedence relationships – Critical path analysis –
Project scheduling – Project time-cost trade off – Resource allocation, Concept of project
crashing.
Set aside the following assumptions:
- Business Statistics is only about numbers
- Only people who are good at maths can do well in Business Statistics
- Analytics and Business Statistics are difficult subjects to pass
- Advanced mathematical ability is required to learn Business Statistics
PRACTICAL COMPONENT: (Student-Centered Learning)
- Students are expected to have attended basic Excel classes
- Students should be able to relate the concepts to application scenarios in their
profession.
- Students should demonstrate the application of the techniques covered in this
course.
COURSE OUTCOMES:
- Facilitate objective solutions in business decision making under subjective
conditions.
- Demonstrate different statistical techniques in business/real-life situations.
- Understand the importance of probability in decision making
- Understand the need and application of analytics.
- Understand and apply various data analysis functions for business problems
Unit – 1
Introduction to Statistics
Measures of Central Tendency
&
Measures of Dispersion
Unit 1: (12 Hours)
Introduction to Statistics: Meaning and Definition, functions, scope and limitations,
Collection and presentation of data, frequency distribution
Measures of central tendency - Mean, Median, Mode, Geometric mean, Harmonic mean,
Measures of dispersion: Range – Quartile Deviation – Mean Deviation - Standard
Deviation – Variance, Coefficient of Variation - Comparison of various measures of
Dispersion
Statistics:
Meaning and Definition:
In its simplest sense, statistics means facts expressed in numbers
Example: Average score in maths is 45
Definition:
- “The collection, representation, analysis and interpretation of the numerical data.”
- The term statistics may mean either numerical statements (statistical data) or
statistical methodology. When used in the sense of statistical data, it refers to the
quantitative aspects of things and is a numerical description.
- The art and science of collecting, analysing, presenting and interpreting data.
Characteristics of Statistics:
By statistics we mean
 Aggregates of facts
 Affected to a marked extent by a multiplicity of causes
 Enumerated and expressed in terms of numbers
 Collected with a reasonable standard of accuracy
 Collected and placed in relation to each other
Statistical Methods:
It is a science which deals with the methods of collecting, classifying, presenting,
comparing and interpreting numerical data collected to throw some light on any sphere
of enquiry.
Types of Statistical Methods:
Descriptive statistics: - It consists of procedures used to summarize and describe the
characteristics of a set of data.
Inferential Statistics: - It consists of procedures used to make inferences about population
characteristics on the basis of sample results.
Some Important terminologies:
- Data: Collection of observations of one or more variables of interest
- Population: A collection of all elements (Units or variable) of interest
- Sample: A subset of the population
- Variable: A characteristic, number, or quantity that increases or decreases over
time, or takes different values in different situations.
Functions of Statistics:
- To collect and present facts in a systematic manner
- To help in formulation and testing of hypothesis
- To help in facilitating the comparison of data
- To help in predicting future trends
- To help find the relationship between variables
- To simplify the mass of complex data
- To help to formulate policies
Scope and Importance of statistics:
1. Statistics and planning: Statistics is indispensable to planning in the modern age,
which is termed “the age of planning”. Almost all over the world, governments are resorting
to planning for economic development.
2. Statistics and economics: Statistical data and techniques of statistical analysis have
proved immensely useful in solving economic problems, such as wages, prices, time series
analysis and demand analysis.
3. Statistics and business: Statistics is an indispensable tool of production control.
Business executives are relying more and more on statistical techniques for studying the
needs and desires of their valued customers.
4. Statistics and industry: In industry, statistics is widely used in quality control. In
production engineering, statistical tools such as inspection plans and control charts are
used to find out whether the product conforms to the specifications or not.
5. Statistics and mathematics: Statistics and mathematics are intimately related; recent
advancements in statistical techniques are the outcome of wide applications of mathematics.
6. Statistics and modern science: In medical science, the statistical tools for collection,
presentation and analysis of observed facts relating to the causes and incidence of diseases
and the results of the application of various drugs and medicines are of great importance.
7. Statistics, psychology and education: In education and psychology, statistics has found
wide application, such as determining the reliability and validity of a test,
factor analysis etc.
8. Statistics and war: In war, the theory of decision functions can be of great assistance to
military personnel in planning “maximum destruction with minimum effort.”
Statistics in business and management:
1. Marketing: Statistical analysis is frequently used to provide information for making
decisions in the field of marketing. It is necessary first to find out what can be sold and then
to evolve a suitable strategy, so that the goods reach the ultimate consumer. A skilful
analysis of data on production, purchasing power, manpower, habits of competitors,
habits of consumers and transportation costs should be considered before any attempt to
establish a new market.
2. Production: In the field of production, statistical data and methods play a very important
role. Decisions about what to produce, how to produce, when to produce and for whom
to produce are based largely on statistical analysis.
3. Finance: Financial organizations, in discharging their finance function effectively,
depend very heavily on statistical analysis of facts and figures.
4. Banking: Banking institutions have found it increasingly necessary to establish research
departments within their organizations for the purpose of gathering and analysing
information, not only regarding their own business but also regarding the general economic
situation and every segment of business in which they may have an interest.
5. Investment: Statistics greatly assists investors in making clear and valid judgements in
their investment decisions, in selecting securities which are safe and have the best prospects
of yielding a good income.
6. Purchase: The purchase department, in discharging its function, makes use of
statistical data to frame suitable purchase policies, such as what to buy, what quantity to
buy, when to buy, where to buy and from whom to buy.
7. Accounting: Statistical data are also employed in accounting; particularly in the auditing
function, the technique of sampling and estimation is frequently used.
8. Control: The management control process combines statistical and accounting methods
in making the overall budget for the coming year, including sales, materials, labour and
other costs, net profits and capital requirements.
Limitations of Statistics:
- Does not study qualitative phenomenon
- Does not deal with individual items
- Statistical results are true only on an average
- Statistical data should be uniform and homogeneous
- Statistical results depend on the accuracy of the data
- Statistical conclusions are not universally true
- Statistical results can be interpreted only by a person who has sound knowledge of
statistics
Collection and presentation of data:
Statistical data: A set of information collected from a sample to draw a general
conclusion about the population
Statistical data may be classified as
- Primary Data – Collected first time by the researcher
o Sources – Interview, Observation, Indirect or oral investigation,
information from the local agents and correspondents, mail
questionnaires, through enumerations.
- Secondary Data – Already collected data
o Sources: Published statistics, Publication of research institutes,
Publication of business and financial institutes, newspaper and
periodicals, reports of various committees and commissions,
unpublished statistics
Presentation of Data: Classification is the process of arranging things or data in groups
or classes according to their resemblances and affinities; it gives expression to the unity
of attributes that may subsist among a diversity of individuals
Some Important classification
o Geographical (on the basis of area or region)
Example: Sales of the company
Region    Sales
North     450
South     310
East      281
West      114
o Chronological (on the basis of history, i.e. with respect to time)
Example: Sales reported by the departmental stores
Month     Sales (In lakhs)
Jan       45
Feb       31
March     28
April     11
o Qualitative (on the basis of character / attributes)
 Simple Classification: Classification is done into two classes
 Manifold Classification: The classification is based on more than
one attribute at a time
o Numerical, quantitative (on the basis of magnitude)
Marks     No of Students
0-10      45
10-20     31
20-30     28
30-40     11
Frequency distribution:
A frequency distribution is a statistical table, which shows the set of all distinct
values of the variable arranged in order of magnitude, either individually or in
groups with their corresponding frequencies.
Classification of Frequency distribution:
- Series of individual observation
Items are listed one after the other
Roll No.    Marks Obtained
1           83
2           53
3           72
4           61
- Discrete (Ungrouped) Frequency distribution
Variates (values of the variable) differ from each other by a definite amount
No of Kids Families
1 13
2 53
3 12
4 14
- Continuous Frequency distribution (Grouped frequency distribution)
Measurements are only approximations and are expressed in terms of
intervals with certain limits
Marks Students
0-5 1
5-10 13
10-15 8
15-20 5
Some technical terms in formulating the frequency distribution
o Class Limits: Smallest and largest values in the class
o Class Interval (width): The difference between the upper and lower limits of a
class
Methods of Forming Class Interval:
o Exclusive Method (Overlapping class limits: the upper limit of a class is the lower limit of the next)
Marks Students
0-5 1
5-10 13
o Inclusive Method (Non Overlapping)
Marks Students
0-4 1
5-9 13
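The exclusive method can be sketched in Python as follows. This is a minimal illustration only: the function name `frequency_table` and the list of marks are hypothetical and not part of the manual. Each class includes its lower limit and excludes its upper limit, so a mark of exactly 5 is counted in 5-10, not in 0-5.

```python
def frequency_table(values, start, width, n_classes):
    """Count values into exclusive classes [lower, upper):
    the lower limit is included, the upper limit is excluded."""
    table = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        count = sum(1 for v in values if lower <= v < upper)
        table.append((lower, upper, count))
    return table

# Hypothetical marks, made up purely for illustration.
marks = [3, 5, 7, 9, 10, 12, 14, 15, 18, 19]

for lower, upper, count in frequency_table(marks, start=0, width=5, n_classes=4):
    print(f"{lower}-{upper}: {count}")
```

The same counts could be produced in MS Excel with the FREQUENCY function, though its bin boundaries follow a different inclusion rule and should be checked against the table.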
Presenting Data:
Some of the diagrammatic representation of data
o One dimensional diagrams (Line and Bar)
o Two-dimensional diagram (Rectangle, square, pie)
o Three dimensional diagram (Cube, Sphere, Cylinder)
o Pictogram
o Cartogram
Measures of Central Tendency
Central Tendency:
 It is also termed the average
 Averages are sometimes referred to as measures of location
 Central tendency is the middle point of a distribution
 It is used to describe the inherent (essential) characteristics of a frequency
distribution
 An average condenses a huge, unwieldy (awkward or heavy) set of numerical
data into a single numerical value which is representative of the entire
distribution
 It gives us a bird’s eye view of the huge mass of numerical data
 Averages are typical values around which the other items of the distribution
assemble or congregate.
 These values lie between the two extreme observations of the
distribution and give us an idea about the concentration of the values in the
central part of the distribution
 It is very useful in
o Describing the distribution in a concise manner
o Comparative study of different distributions
o Computing other measures such as Dispersion
“Central tendency is the tendency (behaviour) of numerical data to move towards
a central value such as the Arithmetic Mean, Median, Mode, Geometric Mean or
Harmonic Mean.”
Example: The average score of a class in a particular subject is 65
 It implies that each student’s score has contributed to the average of 65 and thus, each
score is understood to move towards 65
Central Tendency / Central Location of curves
[Figure: frequency curves A, B and C]
 Curves A and C have the same central tendency (CT)
 The central tendency of curve B lies to the right of that of curves A and C
Dispersion: Dispersion is the spread of the data in a distribution i.e. the extent to which
the observations are scattered
Dispersion of curves
[Figure: frequency curves A and B]
Here curve B has a wider spread (dispersion) than curve A
Various Measures of Central Tendency
I. Mean (Arithmetic Mean / Simple Mean denoted as AM or 𝑿̅ )
II. Median (Denoted as Md also called as positional average)
III. Mode ( Denoted as Z or Mo)
IV. Geometric Mean (GM)
V. Harmonic Mean (HM)
The choice among the measures of central tendency listed above depends on the
nature of the data and the purpose of the analysis
I. MEAN (ARITHMETIC MEAN / SIMPLE MEAN)
 Most of the time it is referred to as the average of some given
values
 The arithmetic mean of a given set of observations is their sum divided by
the number of observations
 Mean = Sum of all values / Total number of values
Examples:
- Average winter temperature of New-York city
- Average corn yield from acre of land
Note: The data can be in any one of the following forms - Grouped and Ungrouped Data
1. Raw data: Example: The heights of 6 plants in the garden are 6, 5, 3, 4, 2, 7
2. Discrete frequency distribution: A discrete variable is one whose set of possible
values is finite or countable. Discrete variables are frequently counting variables, like the
number of cars owned, number of kids in the family etc.
Example:
X        f
0        3
1        12
2        18
3        7
Total    40
Where X = Number of children, f = Number of families
3. Continuous frequency distribution
a. Mutually Exclusive Class Intervals
b. Mutually Inclusive
c. Open Ended Class Intervals
a. Mutually Exclusive Class Intervals: Here the lower limit is included
and upper limit is excluded from the class interval.
CI       f
50-60    5
60-70    16
70-80    19
80-90    10
Total    50
CI = Class Interval
b. Mutually Inclusive Class Intervals: Here both the upper and lower limits
are included in the class interval
CI       f
50-59    4
60-69    17
70-79    20
80-89    8
90-99    1
Total    50
c. Open Ended Class: If the initial or final class interval is
indeterminate at its end
CI f
Below 60 5
60 – 70 16
70 -80 19
80-90 30
30
Some Points Regarding the Class Intervals
 To calculate the median and mode, mutually inclusive class
intervals must be converted to mutually exclusive class intervals.
 To calculate the Arithmetic Mean (AM), Geometric Mean (GM) and Harmonic Mean
(HM), conversion from mutually inclusive to exclusive is not necessary
 For all calculations, open ended class intervals must be converted into mutually
exclusive class intervals.
 Calculating the mean from ungrouped data or Raw Data or Individual
Observation
Ungrouped Data or Raw Data:
 Here the sample size is small
 We add all the observations and divide by their number to calculate the mean
 This is impractical when the number of observations is large, e.g. 5000
observations
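In general, if X1, X2, X3, ..., Xn are the given “n” observations, then their Arithmetic
Mean, usually denoted as $\bar{X}$, is given by
$\bar{X} = \frac{\sum X}{n}$
Where: $\sum X$ = sum of all the observations, n = number of observations
Example: The Arithmetic mean of 5, 8, 10, 15, 24 and 28 is
$\bar{X} = \frac{5+8+10+15+24+28}{6} = \frac{90}{6} = 15$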
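As a quick check of the raw-data formula, the example can be reproduced in Python. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration. The six values sum to 90, and dividing by the number of observations (6) gives 15.

```python
def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)

observations = [5, 8, 10, 15, 24, 28]

print(sum(observations))              # 90
print(len(observations))              # 6
print(arithmetic_mean(observations))  # 15.0
```

In MS Excel the equivalent calculation is the AVERAGE function applied to the range holding the observations.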
Calculating the mean from grouped data:
 This is used when the number of observations is large and difficult to handle individually
 Here, we have access to the frequency distribution of the data, not to every individual
observation
o Discrete frequency distribution
o Continuous frequency distribution
 A frequency distribution consists of data that are grouped by classes. Each
observation falls into one of the classes.
 To find the arithmetic mean of a continuous frequency distribution, we first
calculate the midpoint of each class
 Then multiply each midpoint by the frequency of observations in that class, sum
all these results, and divide the sum by the total number of observations in the
sample.
o Discrete frequency distribution
$\bar{X} = \frac{\sum fX}{N}$
Where f = frequency of the value X, N = $\sum f$ (total number of observations)
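o Continuous frequency distribution
$\bar{X} = \frac{\sum fX}{N}$
Where “X” is the middle value (midpoint) of each class
$\bar{X} = A + \frac{\sum fd}{N}$, with $d = X - A$
(Shortcut method)
$\bar{X} = A + \frac{h \sum fd}{N}$, with $d = \frac{X - A}{h}$
(Step Deviation method)
Where: A = assumed mean, h = common class width, f = class frequency, N = $\sum f$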
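The direct and step deviation calculations can be compared with a short Python sketch. It uses the mutually exclusive class-interval table shown earlier (50-60, 60-70, 70-80, 80-90 with frequencies 5, 16, 19, 10); the function names and the choice of assumed mean A = 75 are illustrative assumptions. Both methods should give the same mean, since the step deviation method only rescales the midpoints.

```python
def midpoints(classes):
    """Middle value of each class interval (lower, upper)."""
    return [(lower + upper) / 2 for lower, upper in classes]

def grouped_mean_direct(classes, freqs):
    """Direct method: sum of f * midpoint divided by N."""
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, midpoints(classes))) / n

def grouped_mean_step(classes, freqs, assumed_mean, width):
    """Step deviation method: A + h * sum(f * d) / N, with d = (X - A) / h."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints(classes)]
    return assumed_mean + width * sum(f * di for f, di in zip(freqs, d)) / n

classes = [(50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [5, 16, 19, 10]

direct = grouped_mean_direct(classes, freqs)
step = grouped_mean_step(classes, freqs, assumed_mean=75, width=10)
print(direct, step)  # both are approximately 71.8
```

Here the sum of f times midpoint is 3590 and N is 50, so the direct method gives 3590/50 = 71.8; with A = 75 and h = 10 the deviations are -2, -1, 0, 1 and the sum of f times d is -16, giving 75 + 10(-16)/50 = 71.8.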
Problems:
Q.No-1.1: Calculate AM for the raw data; 22, 28, 26, 24, 26, 15, 08, 09, 32, 20
Q.No-1.2: Calculate AM for the following distribution
X 0 1 2 3 4
f 8 23 45 24 7
Note: It is a discrete frequency distribution
Q.No-1.3: Calculate AM for the following distribution
CI 0-10 10-20 20-30 30-40 40-50
F 3 14 31 13 4
Note: It is a continuous frequency distribution
Q.No-1.4: Calculate AM for the following distribution table
CI 0 - 9 10-19 20-29 30-39 40-49 50-59
F 3 15 38 14 10 5
Q.No-1.5: Calculate AM for the following distribution table
Marks Below 20 Below 30 Below 40 Below 50
Students 12 35 48 60
Note: Since the class intervals are open ended, they should be converted into mutually
exclusive class intervals.
Q.No-1.6: Calculate AM for the following distribution table
Marks           Above 10  Above 20  Above 30  Above 40  Above 50  Above 60
No. Students    60        48        35        18        8         2
Q.No-1.7: The following is the frequency distribution of the number of telephone calls
received in 245 successive one-minute intervals at an exchange
No. of calls: 0 1 2 3 4 5 6 7
Frequency: 14 21 25 43 51 40 39 12
Obtain the mean number of calls per minute
Q.No-1.8: The following table gives salary per month of 450 employees in a factory,
Find mean salary of the employees
Salary (,000) 0-5 5-10 10-15 15-20 20-25 25-30
Employees 80 120 100 60 50 40
Q.No-1.9: The average monthly balances of 600 customers are given as follows. Find the
mean from this data
Class (in Dollars)    Frequency
0 - 49.99             78
50.00 - 99.99         123
100.00 - 149.99       187
150.00 - 199.99       82
200.00 - 249.99       51
250.00 - 299.99       47
300.00 - 349.99       13
350.00 - 399.99       9
400.00 - 449.99       6
450.00 - 499.99       4
Q.No-1.10: From the following data, find the AM
C.I          130-134  135-139  140-144  145-149  150-154  155-159  160-164
Employees    5        15       28       24       17       10       1
Q.No-1.11: Calculate the average no. of days the workers are absent in a company
No. Days Absent    Less than 5  5-10  10-15  15-20  20-25  25-30  30-35
No. of Workers     29           224   465    582    634    644    650
Q.No-1.12: Calculate the average age of employees with the help of the following
distribution table
Age above (Years)    20   25   30   35   40   45   50  55  60
f                    450  410  330  300  210  185  85  40  12
Q.No-1.13: Find the missing frequency from the following data, if 𝑋̅ = 15.38
X 10 12 14 16 18 20
f 3 7 x 20 8 5
Q.No-1.14: Find missing frequency in the following data given 𝑋̅ = 50 and N = 120
C.I 0-20 20-40 40-60 60-80 80-100
f 17 f1 32 f2 19
 Step deviation method for grouped or continuous frequency distribution:
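In case of a grouped or continuous frequency distribution with class intervals of equal
magnitude, the calculations are further simplified by taking $d = \frac{X - A}{h}$, where X is
the class midpoint, A is an assumed mean and h is the common class width. Then
$\bar{X} = A + \frac{h \sum fd}{N}$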
Q.No-1.15: Calculate the mean for the following frequency distribution
Marks              0-10  10-20  20-30  30-40  40-50  50-60  60-70
No. of Students    5     5      7      15     8      6      4
a) By the direct formula b) Step deviation method
Properties of Arithmetic Mean
1. The algebraic sum of the deviations of the given set of observations from their
Arithmetic mean is “Zero”
In simple words, the sum of deviations taken from the Arithmetic mean is always
“Zero”
Mathematically $\sum (X - \bar{X}) = 0$
For a frequency distribution $\sum f(X - \bar{X}) = 0$
2. The sum of squared deviations taken from the AM is the least among such sums
taken from any other measure of central tendency
Mathematically $\sum (X - \bar{X})^2$ is never greater than $\sum (X - M)^2$, $\sum (X - Z)^2$,
$\sum (X - GM)^2$ or $\sum (X - HM)^2$
3. Mean of the combined series: If we know the sizes and means of two component
series, then we can find the mean of the resultant series obtained on combining
the given series.
If two series of sizes N1 and N2 have means $\bar{X}_1$ and $\bar{X}_2$ respectively, then the mean
of the combined series of size N1 + N2 is given by
$\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}$
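The three properties of the Arithmetic Mean can be verified numerically with a small Python sketch. It reuses the raw-data values 5, 8, 10, 15, 24 and 28 from the earlier example; the helper names `sum_of_squares` and `combined_mean`, and the split of the data into two groups of three, are assumptions made only for illustration.

```python
from statistics import median

values = [5, 8, 10, 15, 24, 28]
mean = sum(values) / len(values)

# Property 1: deviations from the mean sum to zero.
deviation_sum = sum(x - mean for x in values)

def sum_of_squares(data, centre):
    """Sum of squared deviations of the data from a chosen centre."""
    return sum((x - centre) ** 2 for x in data)

# Property 2: squared deviations are smallest when taken from the mean
# (compared here with the median as one alternative centre).
ss_mean = sum_of_squares(values, mean)
ss_median = sum_of_squares(values, median(values))

def combined_mean(n1, mean1, n2, mean2):
    """Property 3: mean of two series combined, from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

first, second = values[:3], values[3:]
pooled = combined_mean(len(first), sum(first) / len(first),
                       len(second), sum(second) / len(second))

print(deviation_sum)       # 0.0
print(ss_mean, ss_median)  # 424.0 461.5
print(pooled)              # approximately 15, the mean of all six values
```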
Merits and Demerits of Arithmetic Mean:
Merits
 It is rigidly defined
 It is easy to calculate and understand i.e. the concept is familiar to most
people and intuitively clear
 It is based on all the observations
 Every data set has a mean. It is a measure that can be calculated and it is
unique because every data set has one and only one mean
 It is suitable for further mathematical treatment
 Of all averages, the Arithmetic mean is affected least by fluctuations of
sampling; that is, the Arithmetic mean is a stable average
 The mean is useful for performing statistical procedures such as
comparing the means from several data sets
Demerits
 The strongest drawback of the Arithmetic mean is that it is very much affected
by extreme observations. Two or three very large values of the variable may
disproportionately affect the value of the Arithmetic Mean
 The Arithmetic mean cannot be used in the case of open end classes such as
less than 10 or more than 70, since for such classes we cannot determine the
midpoint / mid value. In such cases Mode or Median may be used.
 It cannot be determined by inspection nor can it be located graphically
 It cannot be used if we are dealing with qualitative data / characteristics
such as Honesty, Beauty
In case of qualitative data, Median is the only average used
 The Arithmetic Mean cannot be obtained if a single observation is missing or
lost or is illegible, unless we drop it and compute the Arithmetic Mean
of the remaining values
 In an extremely asymmetrical (skewed) distribution, the arithmetic mean
is usually not representative of the distribution and hence not suitable as a
measure of location or Measure of Central Tendency
 Arithmetic mean may lead to wrong conclusions if the details of the data
from which it is obtained are not available.
 Arithmetic mean may not be one of the values which the variable actually
takes and is termed as fictitious average.
Weighted Arithmetic Mean
The usual AM gives equal importance to all items; but in most situations the mean should
be calculated based on the relative importance of the observations.
To make the computed average representative of the distribution, proper weightage
is given to the various items
Let W1, W2, W3, ..., Wn be the weights attached to variable values X1, X2, X3, ..., Xn
respectively. Then the Weighted Arithmetic Mean, usually denoted $\bar{X}_w$, is given by
$\bar{X}_w = \frac{W_1X_1 + W_2X_2 + W_3X_3 + \dots + W_nX_n}{W_1 + W_2 + W_3 + \dots + W_n}$
$\bar{X}_w = \frac{\sum WX}{\sum W}$
In case of a frequency distribution, if f1, f2, f3, ..., fn are the frequencies of the variable
values X1, X2, X3, ..., Xn respectively, then the weighted average / weighted
arithmetic mean is given by
$\bar{X}_w = \frac{W_1(f_1X_1) + W_2(f_2X_2) + W_3(f_3X_3) + \dots + W_n(f_nX_n)}{W_1f_1 + W_2f_2 + W_3f_3 + \dots + W_nf_n}$
$\bar{X}_w = \frac{\sum W(fX)}{\sum Wf}$
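The weighted arithmetic mean for values with attached weights can be sketched in Python as below. The function name `weighted_mean` and the three values and weights are hypothetical, chosen only to show how unequal weights pull the average away from the simple mean.

```python
def weighted_mean(values, weights):
    """Sum of W * X divided by the sum of the weights W."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical values and weights, for illustration only.
values = [40, 30, 20]
weights = [1, 2, 2]

print(weighted_mean(values, weights))    # 28.0  -> (40 + 60 + 40) / 5
print(weighted_mean(values, [1, 1, 1]))  # 30.0  -> equal weights give the simple mean
```

In MS Excel the same result can be obtained with SUMPRODUCT of the value and weight ranges divided by SUM of the weight range.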
Q.No-1.16: Calculate the Weighted Arithmetic Mean for the following distribution
Item Rice Wheat Sugar Jawar Oil Tea Salt
Price 40 30 33 35 25 250 15
Weight 1 0.5 0.2 0.5 0.25 0.1 0.05
Q.No-1.17: A candidate obtained the following percentage of marks in an examination:
English-60, Hindi-75, Mathematics-65, Physics-59, and Chemistry-55. Find the
candidate’s weighted arithmetic mean if weights of 1, 2, 1, 3, and 3 respectively are
allocated to the subjects.
Q.No-1.18: The mean annual salary of all employees in a company is Rs. 25,000. The mean
salary of female and male employees is Rs. 27,000 and Rs. 17,000 respectively. Find the
percentage of male and female employees in the company.
Q.No-1.19: The mean monthly salary paid to 77 employees in a company was Rs. 78. The
mean salary of 32 of them was Rs. 75, and that of another 25 was Rs. 82. What was the mean
salary of the remaining employees?
Q.No-1.20: The average daily income for a group of 50 persons in a factory was calculated to be
Rs. 169. It was later found that one value was recorded as 134 instead of the correct value
143. Calculate the correct average income.
Q.No-1.21: Calculate the mean score of a student using the weights assigned to the subjects
Physics, Chemistry, Maths, Biology, English and Hindi as 3, 2, 3, 0, 1, 1 respectively, using
the following marks
Subjects    Hindi  English  Physics  Chemistry  Maths  Biology
Marks       56     70       72       62         80     69
Q.No-1.22: The numbers 3.2, 5.8, 7.9 and 4.5 have frequencies X, (X+2), (X-3) and (X+6)
respectively. If the arithmetic mean is 4.876, find the value of X.
Q.No-1.23: Marks secured by 50 students in a test paper are given below
30 45 48 55 39 25 31 12 18 21 54 59 51
33 43 44 10 38 19 26 41 35 37 41 46 33
51 37 58 58 17 19 23 26 29 38 57 36 35
44 43 27 19 43 22 31 47 34 31 15 35 32
Prepare frequency table with class interval 10-19, 20-29, 30-39…….., and calculate the
value of the Arithmetic Mean from the frequency table obtained.
II. MEDIAN
Median is another measure of Central Tendency which locates the middle most value in
given set of data
Median is the measure of Central Tendency different from any of the means
Median is a single value from the data set that measures the central item in the data
Median is that value of the variable which divides the group in two equal parts, one part
comprising of the values greater than and the other less than Median
This single item is the middlemost or most central item in the set of numbers. As said
earlier half of the items lie above this point and the other half lie below it
In contrast to the Arithmetic mean, which is based on all the items of the distribution,
the median is only a positional average i.e. its value depends on the position occupied by a
value in the frequency distribution.
 Calculation of Median from raw data or ungrouped data
To find the Median of a data set, first arrange the data in ascending or descending order. If
the data set contains an odd number of items, the middle item of the array is the Median.
If there is an even number of items, the Median is the arithmetic mean of the two middle items
Median M = the $\left(\frac{n+1}{2}\right)$th item in the arranged array
Where n is the number of items in the array.
 Calculation of Median from Discrete frequency distribution
In this case
Median M = the $\left(\frac{N+1}{2}\right)$th observation of the discrete frequency distribution
Where N is the number of items in the distribution i.e. the sum of the frequencies
 Calculation of Median from Continuous frequency distribution
Median $M = L + \frac{\left(\frac{N}{2} - m\right) C}{f}$
Where L = Lower limit of the Median class
N = Total number of items
m = Cumulative frequency of the class preceding the median class
f = Frequency of the median class
C = Class width or magnitude of the median class
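The three median calculations can be sketched in Python as follows. The function names are illustrative assumptions; the data reuse tables shown earlier in this unit (the six plant heights, the number-of-children table and the 50-60 to 80-90 class-interval table). The discrete version follows the textbook convention of taking the first value whose cumulative frequency reaches (N+1)/2, and the grouped version assumes mutually exclusive classes listed in ascending order.

```python
def median_raw(values):
    """Middle item of the sorted data; mean of the two middle items if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def median_discrete(xs, freqs):
    """First value (xs ascending) whose cumulative frequency reaches (N + 1) / 2."""
    target = (sum(freqs) + 1) / 2
    cumulative = 0
    for x, f in zip(xs, freqs):
        cumulative += f
        if cumulative >= target:
            return x

def median_grouped(classes, freqs):
    """L + (N/2 - m) * C / f, evaluated in the median class."""
    half = sum(freqs) / 2
    before = 0  # cumulative frequency of the classes preceding the current one
    for (lower, upper), f in zip(classes, freqs):
        if before + f >= half:
            return lower + (half - before) * (upper - lower) / f
        before += f

print(median_raw([6, 5, 3, 4, 2, 7]))                  # 4.5
print(median_discrete([0, 1, 2, 3], [3, 12, 18, 7]))   # 2
print(median_grouped([(50, 60), (60, 70), (70, 80), (80, 90)],
                     [5, 16, 19, 10]))                 # 70 + 40/19, about 72.11
```

For the grouped table N/2 = 25, the cumulative frequencies are 5, 21, 40 and 50, so the median class is 70-80 with L = 70, m = 21, f = 19 and C = 10.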
Q.No-2.1: Find the Median of the observations 12, 15, 16, 82, 75
Q.No-2.2: Find the Median of the observations 12, 18, 13, 42, 63, 78
Q.No-2.3: Find the Median of the following distribution
X 10 20 30 40 50
f 3 8 13 9 7
Q.No-2.4: Find the Median of the following distribution
C.I 0-10 10-20 20-30 30- 40 40- 50
f 5 12 23 12 3
Q.No-2.5: Find the Median of the following distribution
C.I 10 - 14 15 - 19 20 - 24 25 – 29 30 – 34
f 2 5 8 4 1
Note: to calculate Median
- Cumulative frequency is a must
- Class intervals should be mutually exclusive
Q.No-2.6: Calculate Mean and Median from the following distribution
C.I 10-20 20-30 30- 40 40- 50 50-60 60-70 70-80 80-90
f 4 12 40 41 27 13 9 4
Q.No-2.7: Find the Median for the following data
Height (C.I) 125-129 130-134 135-139 140-144 145-149
No. of Items (f) 2 5 8 4 1
Q.No-2.8: Find the Median wage of labourers from the following
Wages           Above 0  Above 10  Above 20  Above 30  Above 40  Above 50  Above 60  Above 70
No. Labourers   650      500       425       375       300       275       250       100
Q.No-2.9: Find the missing frequency, if M=14
C.I 0-5 5-10 10-15 15-20 20-25 25-30
f 5 7 Q 8 6 4
Q.No-2.10: Calculate Median from the series
5 Men get less than Rs. 5
12 Men get less than Rs. 10
22 Men get less than Rs. 15
30 Men get less than Rs. 20
36 Men get less than Rs. 25
40 Men get less than Rs. 30
Q.No-2.11: Calculate the missing frequency from the following data having Median = 46
and N = 230
Variables 10-20 20-30 30- 40 40- 50 50-60 60-70 70-80
f 12 30 f1 65 f2 25 19
Merits and Demerits of Median
Merits:
 It is rigidly defined
 It is easy to calculate, even for a non-mathematical person
 Since it is a positional average, it is not affected by extreme observations,
which makes it useful for skewed distributions
 It can be computed even with open ended classes
 It can be located by simple inspection and even graphically
 It is the only average that can deal with qualitative characteristics
3. MODE:
 Mode is a measure of central tendency that is different from the mean
but somewhat like the median
 The mode is the value that is repeated most often in the data set
 The mode is defined as the most popular value in the given data
 Mode is the value which occurs most frequently in a set of observations and
around which the other items of the set cluster densely
 It is the value at the point around which the items tend to be most heavily
concentrated. It is regarded as the most typical of a series of values
 Mode is the value which has the greatest frequency density in its immediate
neighbourhood
 Mode is termed as the fashionable value of the distribution
o Example: The average size of shoe sold in a shop is 7
o The average Indian male is 5 feet 6 inches tall
Here the average refers to neither the mean nor the median but the mode, the most
frequent value in the distribution
 Mode is denoted as Mo or Z
 Calculation of “Mode”
o Mode (Z) is the most frequently occurring value - Raw Data
o Mode (Z) is the value corresponding to highest frequency in discrete
frequency distribution
o Mode (Z) in continuous frequency distribution is
$Z = L + \frac{(f - f_1) \times C}{2f - f_1 - f_2}$, computed in the modal class
Modal class = Class with highest frequency
L = Lower limit of the modal class
f = Frequency of the modal class
f1 = Frequency of the class preceding the modal class
f2 = Frequency of the class succeeding the modal class
C = Class width of the modal class
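As a rough illustration of the continuous-series formula Z = L + (f - f1) × C / (2f - f1 - f2), the following Python sketch locates the class with the highest frequency and applies the formula. The function name `grouped_mode` and the sample distribution are hypothetical. A missing neighbouring class is treated as having frequency 0, which is an assumption of this sketch rather than something stated in the manual.

```python
def grouped_mode(lower_limits, frequencies, class_width):
    """Mode of a continuous frequency distribution: L + (f - f1) * C / (2f - f1 - f2)."""
    # Index of the class with the highest frequency (first one if there is a tie)
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f = frequencies[i]
    f1 = frequencies[i - 1] if i > 0 else 0                      # preceding class
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # succeeding class
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2f - f1 - f2 is zero")
    return lower_limits[i] + (f - f1) * class_width / denominator

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with frequencies 4, 10, 6, 2
print(grouped_mode([0, 10, 20, 30], [4, 10, 6, 2], 10))   # 16.0
```

The class 10-20 has the highest frequency, so L = 10, f = 10, f1 = 4 and f2 = 6.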
Q.No-3.1: Find Mode 2, 6, 8, 12, 32, 25, 41, 63, 25
Q.No-3.2: Find the mode for following distribution table
X 0 1 2 3 4
f 7 13 25 10 3
Q.No-3.3: Find the mode for following distribution table
C.I 0-10 10-20 20-30 30-40 40-50
f 3 8 13 10 1
Q.No-3.4: Find the Mean, Median and Mode
CI 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
f 4 12 40 41 27 13 9 4
Merits and Demerits of Mode
Merits:
 Easy to calculate and understand; it can be found by mere inspection
 Not affected by extreme observations
 Convenient for open-ended classes
Demerits:
 Mode is not rigidly defined
 Mode is not suitable for further mathematical treatment
 Affected to a greater extent by fluctuations of sampling
Empirical Relationship between Mean (𝑿̅ ), Median (M) and Mode (Z) (Slightly Skewed)
A symmetrical distribution that contains only one mode always has the same value for the
Mean, Median and Mode
In the case of a positively or negatively skewed distribution, the Median can be taken to
measure the central tendency, since the Median is not as highly influenced by the frequency
of occurrence of a single value as the Mode is, nor is it pulled by extreme values as the
Mean is
Whenever the given distribution is slightly skewed, the Mean (𝑿̅ ), Median (M) and Mode (Z)
show the following relationship
Z = 3M – 2 𝑿̅
i.e. Mode = 3 Median – 2 Mean
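The empirical relationship can be rearranged to estimate any one of the three averages from the other two. A minimal Python sketch follows. The function names and the figures used are hypothetical, and the result is only an approximation that holds for slightly skewed distributions.

```python
def mode_from_mean_median(mean, median):
    """Empirical relationship: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean

def median_from_mean_mode(mean, mode):
    """Same relationship solved for the Median: (Mode + 2 * Mean) / 3."""
    return (mode + 2 * mean) / 3

# Hypothetical slightly skewed distribution with Mean = 30 and Median = 32
mode = mode_from_mean_median(30, 32)
print(mode)                               # 36
print(median_from_mean_mode(30, mode))    # 32.0
```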
Q.No-3.5: Find Mean, Median and Mode using Empirical Relationship
Size in Inches 5 10 25 20 25 30 35
f 1 3 13 17 27 36 38
GEOMETRIC MEAN
- GM is the nth root of the product of the n quantities of the series. It is obtained by
multiplying the values of the items together and extracting the root of the
product corresponding to the number of items.
- Thus, the square root of the product of two items and the cube root of the
product of three items are their geometric means
- It is never larger than the arithmetic mean
- If there are zeroes and negative numbers in the series, the geometric
mean cannot be used.
- Logarithms can be used to find the geometric mean to reduce large
numbers and to save time
- Appropriate in situations where, there is an average percentage rate of
change over a period of time.
- It is widely used in the construction of index numbers
Geometric Mean (GM) = $\sqrt[n]{x_1 x_2 x_3 x_4 \cdots x_n}$
When the number of items in the series is larger than 3, computing the GM directly is
difficult. To overcome this, the logarithm of each value is obtained. The logs
of all the values are added up and divided by the number of items. The antilog of this
quotient is the required GM.
Geometric Mean (GM) = Antilog $\left[ \frac{\log x_1 + \log x_2 + \log x_3 + \log x_4 + \cdots + \log x_n}{n} \right]$
= Antilog $\left[ \frac{1}{n} \sum_{i=1}^{n} \log x_i \right]$
Geometric Mean (GM) for Continuous case
GM = Antilog $\left[ \frac{\sum f \log x}{N} \right]$
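The logarithm method can be sketched in Python as follows. This is only an illustration: the function names and the sample values are hypothetical, base-10 logarithms are assumed (matching log tables), and for grouped data x is taken to be the class mid-point.

```python
import math

def geometric_mean(values):
    """GM = antilog of the average of the logs of the values."""
    if any(v <= 0 for v in values):
        raise ValueError("GM cannot be computed for zero or negative values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log                      # antilog (base 10)

def grouped_geometric_mean(midpoints, frequencies):
    """GM = antilog of (sum of f * log x) / N, with N = sum of f."""
    if any(x <= 0 for x in midpoints):
        raise ValueError("GM cannot be computed for zero or negative values")
    n = sum(frequencies)
    weighted_log = sum(f * math.log10(x) for x, f in zip(midpoints, frequencies))
    return 10 ** (weighted_log / n)

# Hypothetical values 1, 10, 100: the logs are 0, 1, 2, so the GM is 10
print(round(geometric_mean([1, 10, 100]), 4))
# Hypothetical mid-points 10 and 1000, each with frequency 1: the GM is 100
print(round(grouped_geometric_mean([10, 1000], [1, 1]), 4))
```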
Merits of GM
- It is based on all the observation in the series
- It is rigidly defined
- It is suited for averaging ratios and percentages
- It is less affected by extreme values
- It is useful for studying social and economic data
Demerits of GM
- It is not simple to understand
- It requires computational skill
- It cannot be computed if any items are zero or negative
- It has restricted applications
Problems:
4.1 Find the GM of the data 2, 4, 8
4.2 Find the GM of the data 2, 4, 8, 10 using logarithms
4.3 The annual rate of growth of output of a company in the last five years
is given below. Find the GM of the growth rate
Year Growth Rate Output at the end of the year
1998 5.0 105
1999 7.5 112.87
2000 2.5 115.69
2001 5.0 121.47
2002 10.0 133.61
4.4 Compared with the previous year, the overhead (OH) expenses went up by 32% in the year
2003, then increased by 40% in the next year and by 50% in the following year.
Calculate the average increase in overhead expenses. Let OH expenses be 100% in the base year
Year Growth Rate
2002 Base Year
2003 132
2004 140
2005 150
4.5 Consider the following time series of monthly sales of ABC Company for 4 months.
Find the average rate of change in sales per month
Month Sales
I 10,000
II 8,000
III 12,000
IV 15,000
4.6 Find the GM for the following data
Yield of Wheat in MT No. of Farms
1 – 10 3
11 – 20 16
21 – 30 26
31 – 40 31
41 – 50 16
51 – 60 8
Harmonic Mean
- It is the total number of items divided by the sum of the
reciprocals of the values of a variable
- It is a specialised average which solves problems involving variables
expressed in “Time rates” that vary according to time
- Example: Speed in km/hr., min/day, Price/chapter
- Harmonic mean (HM) is suitable only when time factor is a variable and
the act being performed remains constant
HM = $\frac{N}{\sum \frac{1}{x}}$
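A short Python sketch of the formula HM = N / ∑(1/x) is given below. The function name and the speeds used are hypothetical. The example is the classic case of equal distances covered at different speeds, where the harmonic mean gives the true average speed.

```python
def harmonic_mean(values):
    """HM = N divided by the sum of the reciprocals of the values."""
    if any(v == 0 for v in values):
        raise ValueError("HM cannot be computed when any item is zero")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: the same distance covered at 40 km/h and then at 60 km/h
speeds = [40, 60]
print(round(harmonic_mean(speeds), 2))        # average speed over the whole trip
print(sum(speeds) / len(speeds))              # arithmetic mean, for comparison
```

The harmonic mean here is 48 km/h, lower than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.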
Merits of Harmonic Mean
- It is based on all observation
- It is rigidly defined
- Suitable in case of series having wide dispersion
- It is suitable for further mathematical treatment
Demerits of Harmonic Mean
- It is not easy to compute
- Cannot be used when one of the items is zero
- It cannot represent distribution
Problems:
5.1: The daily income of 5 families in a very remote village is given below. Compute HM
Family Income (X)
1 85
2 90
3 70
4 50
5 60
5.2 A man travels by car for 3 days, covering 480 km each day. On the first day he
drives for 10 hrs at the rate of 48 KMPH, on the second day for 12 hrs at the rate of
40 KMPH, and on the third day for 15 hrs at the rate of 32 KMPH. Compute the HM and the
Weighted mean, and compare them.
5.3 Find the HM for the following data
Class Interval Frequency
0 – 10 5
10 – 20 15
20 – 30 25
30 – 40 8
40 - 50 7
MEASURES OF DISPERSION
 An average does not tell the full story. It is hardly a full representative of a mass
unless we know the manner in which the individual items scatter around it. A
further description of the series is necessary if we are to gauge how representative
the average is.
 The measure of central tendency must be supported and supplemented by some
other measure; one such measure is dispersion.
 The literal meaning of dispersion is “Scatteredness”
 We study dispersion to have an idea of homogeneity (Compactness) or
heterogeneity (Scatter) of the distribution.
 Why is dispersion an important characteristic to understand and measure?
o It enables us to judge the reliability of our measure of Central Tendency
o To tackle the problems associated with widely dispersed data
o We may wish to compare the dispersion of various samples
 Dispersion is the measure of the variation of the items
 It is a measure of the extent to which the individual items vary
 It is the degree of the scatter or variation of variables about a central value
 Degree to which numerical data tend to spread about an average value is called
variation or dispersion of data
- Curve A has got the least dispersion or variability
- Curve B has got less variability than curve C but more variability than curve A
- Curve C has got more dispersion / variability than curve A and curve B
Objectives or Significance of the measures of dispersion
 To find the reliability of an average
 To control the variation of the data from the central value
 To compare two or more sets of data regarding their variability
 To obtain other statistical measures for further analysis of data
Characteristics for an ideal measure of dispersion
 It should be rigidly defined
 Easy to calculate and easy to understand
 It should be based on all the observations
 It should be amenable to further mathematical treatment
 Less affected by possible fluctuation of sampling
 It should not be much affected by extreme observations
Measures of Dispersion
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation
1. RANGE
Range is the crude measure of dispersion, calculated as
R = High – Low
R = H – L
Its relative measure is called the co-efficient of Range = $\frac{H - L}{H + L}$
Example: Find range and its co-efficient 10, 13, 18, 8, 14, 16, 23, 25
2. QUARTILE DEVIATION
This is a measure of dispersion which considers the deviation between the upper and lower
quartiles
Inter quartile range = Q3 – Q1
Quartile Deviation (QD) = $\frac{Q_3 - Q_1}{2}$ (Semi-inter quartile range)
Co-efficient of QD = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
Example1: Find Quartile Deviation and its Co-efficient
13, 82, 65, 45, 58, 76, 18, 29, 34, 91
Example 2: Find semi-interquartile range and co-efficient of Quartile Deviation for the
following frequency distribution
Size 4-8 8-12 12-16 16-20 20-24 24-28 28-32 32-36 36-40
f 6 10 18 30 15 12 10 6 2
3. MEAN DEVIATION
It is a measure of dispersion which calculates the average distance between each observation
and a central value ($\bar{X}$ or M or Z)
If the Mean Deviation is calculated around $\bar{X}$, it is called the Mean Deviation about the mean.
Formulas
Mean Deviation = $\frac{\sum |X - A|}{n}$ or $\frac{\sum |d|}{n}$
Where A = Mean/Median/Mode
|d| = Mod d = |X - A|
In case of Frequency Distribution
Mean Deviation = $\frac{\sum f |X - A|}{N}$ or $\frac{\sum f |d|}{N}$, where N = $\sum f$
Relative Measure of Mean Deviation
Co-efficient of Mean Deviation = $\frac{\text{Mean Deviation}}{\text{Average about which it is calculated}}$
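The mean deviation formulas can be sketched in Python as below. The function names and the sample data are hypothetical. The same function covers raw data and frequency distributions by treating raw data as having a frequency of 1 for every value, which is a convenience of this sketch.

```python
def mean_deviation(values, about, frequencies=None):
    """Mean deviation = sum of f * |X - A| divided by the total frequency."""
    if frequencies is None:
        frequencies = [1] * len(values)       # raw data: every value counted once
    total = sum(frequencies)
    return sum(f * abs(x - about) for x, f in zip(values, frequencies)) / total

def coefficient_of_mean_deviation(values, about, frequencies=None):
    """Relative measure: mean deviation divided by the average it is taken about."""
    return mean_deviation(values, about, frequencies) / about

# Hypothetical raw data whose mean is 6
data = [2, 4, 6, 8, 10]
mean = sum(data) / len(data)
print(round(mean_deviation(data, mean), 4))
print(round(coefficient_of_mean_deviation(data, mean), 4))
```

For this data the absolute deviations from the mean are 4, 2, 0, 2 and 4, so the mean deviation is 12 / 5.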
Example 1: Find Mean Deviation about 𝑋̅ for the following data
18, 75, 56, 63, 36
Example 2: Find Mean Deviation about Median
Marks 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
Student 2 6 12 18 25 20 10 7
Example 3: Calculate M.D. about the Median and its coefficient
Size of Items 4 6 8 10 12 14 16
f 2 1 3 6 4 3 1
STANDARD DEVIATION & VARIANCE
The most comprehensive descriptions of dispersion are those that deal with the average
deviation from some measure of central tendency
Variance and Standard Deviation are the two measures for measuring dispersion in
statistics. Both of these tell us an average distance of any observation in the data set
from the mean of the distribution
VARIANCE and STANDARD DEVIATION of POPULATION
Population Variance
 Every population has a variance, which is symbolised by 𝜎2
(Sigma Squared)
 To calculate the population variance, we divide the sum of squared distances
between the mean and each item in the population by total number of items in
the population
 By squaring each distance, we make each number positive and, at the same
time, assign more weight to large deviations (distance between the mean and the
value)
Population Standard Deviation
 Population Standard deviation is denoted as 𝜎 (Sigma)
 It is simply the square root of population variance
 Standard deviation is the square root of the average of squared distances of the
observations from the mean
 While the variance is expressed in the square of the units used in the data,
standard deviation is in the same units as those used in the data
Formula for Variance and Standard Deviation
Formula for Raw Data:
Variance = $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$
Standard Deviation = $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$
$\sigma^2$ = Population Variance
$\sigma$ = Population Standard Deviation
X = Observation
$\bar{X}$ = Mean
N = Number of observations in the population
∑ = Sum of all the values
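Grouped data:
Variance = $\sigma^2 = \frac{\sum f (X - \bar{X})^2}{N}$ or $\frac{\sum f X^2}{N} - \bar{X}^2$ or $\frac{\sum f X^2}{N} - \left( \frac{\sum f X}{N} \right)^2$
Standard Deviation = $\sigma = \sqrt{\frac{\sum f (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum f X^2}{N} - \bar{X}^2}$ or
$\sigma = \sqrt{\frac{\sum f X^2}{N} - \left( \frac{\sum f X}{N} \right)^2}$
Where f = Frequency of each class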
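The definitional and shortcut forms of the variance give the same answer, which the following Python sketch illustrates. The function names and the data are hypothetical. Raw data is handled as a frequency distribution in which every frequency is 1.

```python
import math

def population_variance(values, frequencies=None):
    """Population variance: sum of f * (X - mean)^2 divided by N."""
    if frequencies is None:
        frequencies = [1] * len(values)       # raw data: every value counted once
    n = sum(frequencies)                      # N = total number of observations
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / n

def population_variance_shortcut(values, frequencies=None):
    """Same quantity via sum(f * X^2) / N - (sum(f * X) / N)^2."""
    if frequencies is None:
        frequencies = [1] * len(values)
    n = sum(frequencies)
    mean_of_squares = sum(f * x * x for x, f in zip(values, frequencies)) / n
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    return mean_of_squares - mean ** 2

raw = [2, 4, 4, 4, 5, 5, 7, 9]                # hypothetical raw data, mean = 5
variance = population_variance(raw)
print(variance, math.sqrt(variance))          # 4.0 2.0

# The same data written as a frequency distribution
print(population_variance_shortcut([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))   # 4.0
```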
Example 1: Calculate S.D for the following data
X 25 36 45 65 82 93 58 70
Example 2: Find S.D and Co-Efficient of SD for the following data
C.I 0-10 10-20 20-30 30-40 40-50 50-60 60-70
F 5 7 14 12 9 6 2
Example 3: Find Mean and SD for the following data
Age 10 20 30 40 50 60 70 80
No. of Persons 15 30 53 75 100 110 115 125
Example 4: The 15 Vessels were produced in one day and we test each vessel to
determine its purity. The data is given below, calculate the standard deviation.
The result of purity test on vessel and Observed percentage of impurity is as follows
0.04 0.14 0.17 0.19 0.22
0.06 0.14 0.17 0.21 0.24
0.12 0.15 0.18 0.21 0.25
CO-EFFICIENT OF VARIATION
If the Co-efficient of Variation (C.V) for a given data set is higher, the data is said to be
less consistent; on the other hand, if the C.V is lower, the variability in the data is less
and the data is more consistent.
C.V = $\frac{\sigma}{\bar{X}} \times 100$
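Comparing consistency with the C.V can be sketched in Python as follows. The function name and the two series are hypothetical. The population standard deviation from Python's standard `statistics` module is used, matching the formulas with N in the denominator.

```python
import statistics

def coefficient_of_variation(values):
    """C.V = (population SD / mean) * 100."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# Two hypothetical series with the same mean (50) but different spread
series_a = [48, 50, 52]
series_b = [10, 50, 90]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(round(cv_a, 2), round(cv_b, 2))

# The series with the smaller C.V is the more consistent one
print("A" if cv_a < cv_b else "B")            # A
```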
Example 1: The run scored by two batsmen A & B in 10 innings are as follows
A 10 115 5 75 7 120 36 84 29 19
B 45 12 76 42 4 50 37 48 130 0
Find I) the better run scorer II) the more consistent batsman
Example 2: Life of 2 models of refrigerator in recent survey are shown as follows.
What is the average life of each model?
Which model has greater uniformity?
Life (years) 0-2 2-4 4-6 6-8 8-10 10-12
A 5 16 13 7 5 4
B 2 7 12 19 9 1
Example 3: Two brands of tyres are tested with the following results
A) Which brand of tyre has the greater average life?
B) Compare the variability and state which brand of tyres you would use
Life 20-25 25-30 30-35 35-40 40-45
X 1 22 64 10 3
Y 0 24 76 0 0
Standard Deviation for combined series
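If n1 observations have mean $\bar{X}_1$ and SD $\sigma_1$, and n2 observations have mean $\bar{X}_2$ and SD
$\sigma_2$, then the combined SD of the (n1+n2) observations is calculated as
$\sigma = \sqrt{\frac{n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)}{n_1 + n_2}}$
Where $d_1 = \bar{X} - \bar{X}_1$ and $d_2 = \bar{X} - \bar{X}_2$
And $\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$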
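The combined-series formula can be checked against a direct calculation on pooled data. The Python sketch below does this for two small hypothetical groups. The function name is an assumption, and population standard deviations are used throughout.

```python
import math
import statistics

def combined_mean_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and SD of two groups from their sizes, means and SDs."""
    mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean - mean1                         # d1 = combined mean - mean of group 1
    d2 = mean - mean2                         # d2 = combined mean - mean of group 2
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return mean, math.sqrt(variance)

# Two hypothetical groups of raw observations
group1 = [1, 2, 3]
group2 = [4, 5, 6, 7]

mean, sd = combined_mean_sd(
    len(group1), statistics.mean(group1), statistics.pstdev(group1),
    len(group2), statistics.mean(group2), statistics.pstdev(group2),
)
# The formula should agree with the SD computed on all seven values together
print(round(sd, 6), round(statistics.pstdev(group1 + group2), 6))
```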
Example 1: The Mean and SD of marks obtained by 2 groups of students consisting of 50
each are given below. Calculate S.D of all 100 students
Group Mean Standard Deviation n
1 60 8 50
2 55 7 50
Example 2: Calculate the missing information from the following data
A B C Combined
Numbers 175 ? 225 500
SD ? 63 5.9 5.4
Mean 220 240 ? 235
Example 3: A shareholders research centre of India has conducted a research study on
price behaviour of 3 leading industries A, B, C. The results published in quarterly journal
are as follows
Shares Average Price Standard Deviation (SD) Current Selling Price (SP)
A 18.2 5.4 36.0
B 22.5 4.5 34.75
C 24.0 6.0 39.0
I. Which share in your opinion is more stable in value
II. If you are the holder of all 3 shares, which one would you dispose of at present?
Why?
Example 4: Following are the record of goals scored by team A in the football season.
No. of Goals 0 1 2 3 4
Matches 1 9 7 5 3
For team B the average number of goals scored per match was 2.5 with S.D = 1.25 goals
Find which team is to be considered more consistent.
Note: Standard deviation is understood to be best measure of dispersion as
1. It includes all observations in its calculations
a. Is more suitable as compared to other measurement of dispersion
b. Is not much affected by extreme values
c. It is rigidly defined
2. S.D is used in finding the normal probabilities of population
Some remarks on Standard Deviation
1. Mean Deviation can be computed by taking the deviation from any averages i.e.
Mean, Median and Mode, but Standard Deviation is always computed from
Arithmetic Mean
2. Standard Deviation of the variable X will be denoted by 𝜎𝑥 , This notation will be
useful when we have to deal with the standard deviation of two or more
variables
3. SD is always taken as the positive square roots
4. Since S.D depends on the numerical value of the deviations, the value of 𝜎 will
be greater if the values of x are scattered widely away from the mean
- Smaller the 𝜎 implies that distribution is homogeneous
- Larger value of 𝜎 implies distribution is heterogeneous
Mathematical Properties of Standard Deviation
1. Standard deviation is independent of change of origin but not of scale
2. Standard Deviation is the minimum value of the root mean square deviation
3. S.D is less than or equal to the Range
4. S.D is suitable for further mathematical treatment
5. The S.D of the first n natural numbers 1, 2, 3, 4, …, n is $\sqrt{\frac{n^2 - 1}{12}}$
6. The Empirical Rule:
a. For a symmetrical bell shaped distribution, we have approximately the
following properties
i. 68% of the observations lie in the range : Mean ± 𝜎
ii. 95% of the observations lie in the range : Mean ± 2𝜎
iii. 99.7% of the observations lie in the range : Mean ± 3𝜎
7. The approximate relationship between Quartile Deviation (QD), Mean Deviation
(MD) and Standard Deviation (𝜎) is
QD = $\frac{2}{3}\sigma$ and MD = $\frac{4}{5}\sigma$
QD : MD : SD :: 10 : 12 : 15
8. For any discrete distribution, the Standard Deviation is not less than the Mean Deviation
about the mean, i.e. SD ≥ Mean Deviation about mean
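Two of the properties listed above, the S.D of the first n natural numbers and SD ≥ Mean Deviation about the mean, can be verified numerically. The Python sketch below is only a spot check for one hypothetical value of n, not a proof, and the function names are assumptions.

```python
import math
import statistics

def sd_first_n_direct(n):
    """Population SD of 1, 2, ..., n computed from the data itself."""
    return statistics.pstdev(list(range(1, n + 1)))

def sd_first_n_formula(n):
    """Closed form: sqrt((n^2 - 1) / 12)."""
    return math.sqrt((n ** 2 - 1) / 12)

def mean_deviation_about_mean(values):
    """Average absolute deviation of the values from their mean."""
    mean = statistics.mean(values)
    return sum(abs(v - mean) for v in values) / len(values)

n = 10                                        # hypothetical choice of n
numbers = list(range(1, n + 1))
print(round(sd_first_n_direct(n), 6), round(sd_first_n_formula(n), 6))
print(sd_first_n_direct(n) >= mean_deviation_about_mean(numbers))   # True
```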
Question from Previous Question Papers:
1. Which is good measure of central tendency? Give any two reason
2. The following distribution gives the pattern of overtime work done by 100 employees of
a company. Find the mean and median.
Overtime(Hrs) 10-15 15-20 20-25 25-30 30-35 35-40
No. of Employees 11 20 35 20 8 6
3. The following data gives the prices X and Y of shares A and B respectively. Compute the
coefficient of variation of X and Y and state which is more stable in value.
Price of Shares A X 55 54 52 56 58 52 50 51 49
Price of Shares B Y 108 107 105 106 107 104 103 104 101
4. A Sample of 50 cars each of 2 makes X and Y is taken and average running life in years is
recorded
Life (No. of Years) No. of Cars (Make X) No. of Cars (Make Y)
0-5 8 6
5-10 12 10
10-15 17 20
15-20 10 12
20-25 3 2
a) Which of these two make gives higher average life?
b) Which of these makes has shown greater consistent performance? Use standard
deviation.
5. Calculate mean for the following frequency distribution
I) By direct method
II) By step-deviation method
Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70
No. of Students 6 5 8 15 7 6 3
6. Find the value of Mean, Median and Mode from the data given below
Wt (in kg) 93-97 98-102 103-107 108-112 113-117 118-122 123-127 128-132
No. of Students 3 5 12 17 14 6 3 1
7. Determine the missing frequency of the class interval 15-20, the mean being 19 units
X 5-10 10-15 15-20 20-25 25-30
F 2 2 ? 4 4
8. Find the standard deviation for the following data
CI 0-10 10-20 20-30 30-40 40-50 50-60 60-70
F 6 14 10 8 1 3 8
9. Calculate the quartiles from the following data
CI 0-10 10-20 20-30 30-40 40-50
F 3 8 20 12 7
10. Compute Mean, Median and Mode from the data pertaining to marks scored by 80
students in statistics. The test is of 140 marks?
Marks more than 00 20 40 60 80 100 120
No of Students 80 76 50 28 18 09 01
11. Find the value of X and Y from the following distribution
Mid Values 15 25 35 45 55 65
Frequency 10 X 15 20 Y 11
Note: N =82 and Median = 41
12. Calculate Q1 and Q3 from the following Distribution
CI 0-10 10-20 20-30 30-40 40-50 50-60
F 3 8 20 12 7 3
13. From the prices of shares of X and Y below find out which is more stable in values.
X 35 54 52 53 56 58 52 50 51 49
Y 108 107 105 106 107 104 103 104 105 101
14. The following data gives the prices X and Y of shares A and B respectively. Compare the
co-efficient of variation of X and Y and state which share is more stable in value.
Share A 55 54 52 56 58 52 50 51 49
Share B 108 107 105 106 107 104 103 104 101
Unit – 2
Correlation and Regression
Unit -2
Correlation and Regression: Scatter Diagram, Karl Pearson correlation, Spearman’s Rank
correlation (One way table only), Simple and multiple regression (Problems on simple
regression only)
Correlation:
Introduction:
Measures of central tendency and dispersion are confined to univariate distributions, i.e.
distributions involving only one variable. These measures are also used for the purpose of
comparison and analysis
In some distributions or sets of data, each unit assumes two values; we call this a bivariate
distribution. For example, an individual in a distribution has two variables, such as height and
weight. If we measure more than two variables on each unit of a distribution, it is called a
“Multivariate distribution”.
Some of the examples of “Bivariate distribution”
- The series of marks of individuals in two subjects in an examination
- The series of sales revenue and advertising expenditure of different
companies in a particular year.
- Imports and exports of cotton from 1989 to 1994
- The series of ages of husband and wives in a sample of selected married
couple
In a bivariate distribution, we may be interested to find if there is any relationship between
the two variables under study.
Correlation is a statistical tool which studies the relationship between two variables
Correlation analysis involves various methods and techniques used for studying and
measuring the extent of the relationship between the two variables.
Variables are said to be correlated if a change in one variable results in a corresponding
change in the other variable.
Some definition:
“When the relationship is of a quantitative nature, the appropriate statistical tool for
discovering and measuring the relationship and expressing it in a brief formula is known
as correlation”
“Correlation is an analysis of the covariation between two or more variables”.
Types of correlation
A) Positive and Negative Correlation
B) Linear and Non-Linear Correlation
A) Positive and Negative Correlation
If the values of two variables deviate in the same direction, correlation is said to be
“Positive or Direct” Correlation
Example:
- Heights and Weights
- The family income and expenditure on luxury goods
- Amount of rainfall and yield of crops
- Price and Supply of commodities
If the values of two variables deviate in opposite directions, correlation is said to be
“Negative or Inverse” Correlation
Example:
- Price and Demand of commodity
- Volume and pressure of perfect gas
- Sales of woollen garments and day temperature
B) Linear and Non-Linear Correlation
Correlation is linear if corresponding to a unit change in one variable, there is a
constant change in the other variables over the entire range of values
In general, two variables x and y are said to be linearly related if there exists a
relationship of the form
y = a + b x
- Here “b” is the slope of the straight line
- It is generally assumed that the relationship between two variables under study
is linear
The relationship between two variables is said to be non-linear or curvilinear if,
corresponding to a unit change in one variable, the other variable does not change at a
constant rate but at a fluctuating rate.
Correlation and Causation
Correlation analysis enables us to have an idea about the degree and direction of
relationship between the two variables under study
But it fails to reflect upon the cause and effect relationship between two variables
In a bivariate distribution, if the variables have cause and effect relationship, they are
bound to have high degree of correlation between them. That means, causation always
implies correlation.
However, the converse is not true, i.e. a fairly high degree of correlation
between the two variables need not imply a cause and effect relationship between them.
This high degree of correlation between variables may be due to
Mutual dependence
Both the variables being influenced by the same external factors
Pure chance
Methods of studying correlation:
Assuming that a linear relationship exists between two variables or series,
the commonly used methods for studying the correlation between two variables are
I. Scatter diagram method
II. Karl Pearson’s Co-efficient of Correlation (Covariance method)
III. Two way frequency table (Bivariate correlation method)
IV. Ranks method or Spearman’s Rank Correlation
V. Concurrent Deviation Method
I. Scatter Diagram Method:
- It is one of the simplest methods of diagrammatic representation of a
bivariate distribution and provides us one of the simplest tools for
ascertaining the correlation between two variables
- The “n” pairs of values of the two variables (for example, heights and
weights) are plotted as dots. The diagram of dots so obtained is known as a “Scatter Diagram”
- From the scatter diagram, we can form a fairly good, though rough, idea about
the relationship between the two variables.
Some points regarding scatter diagram:
- If the points are very dense, i.e. very close to each other, then a
fairly good amount of correlation may be expected between the two variables.
On the other hand, if the points are widely scattered, a poor correlation may
be expected between them.
- If the points on the scatter diagram reveal trend (either upward or
downward), the variables are said to be correlated and if no trend is
revealed, the variables are not correlated.
[Figures: scatter diagrams illustrating uncorrelated variables, perfect positive correlation, perfect negative correlation, low degree of positive correlation, low degree of negative correlation, high degree of positive correlation, high degree of negative correlation, and no correlation]
Remarks:
- The scatter diagram provides a rough idea about the relationship between two variables.
It is not affected by extreme observations, unlike the mathematical formulae.
However, this method is not suitable if the number of
observations is fairly large
- It does not provide an exact measure of the extent of the relationship between two
variables
- It provides only an approximate estimating line, or line of best fit, by the free hand
method
Karl Pearson Coefficient of Correlation:
This is also called “Covariance Method” or “Product moment correlation co-efficient”
- A mathematical method of measuring the intensity or the magnitude of linear
relationship between two variable series
- It was suggested by Karl Pearson, and this method is most widely used in the area
of practice
- Karl Pearson’s measure, also known as the Pearsonian correlation coefficient between
two variables, i.e. series X and Y, is usually denoted by r(x, y) or rxy or r
$r = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$
It is the ratio of the covariance between x and y, written as Cov (x, y), to
the product of the standard deviations of x and y
Here $\mathrm{Cov}(x, y) = \frac{1}{n} \sum (x - \bar{x})(y - \bar{y})$
$\sigma_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
$\sigma_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}}$
If $\bar{x}$ and $\bar{y}$ come out to be integers (i.e. whole numbers), then the following
formula is feasible to use
$r = \frac{\sum dx \cdot dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}}$
or
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \cdot \sum (y - \bar{y})^2}}$
If $\bar{x}$ and $\bar{y}$ are fractions, the above formula is cumbersome to apply, and
we should use the following formula
$r = \frac{n \sum xy - \sum x \cdot \sum y}{\sqrt{\left[ n \sum x^2 - (\sum x)^2 \right] \left[ n \sum y^2 - (\sum y)^2 \right]}}$
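The last formula, which works directly on the raw sums, translates into a few lines of Python. The sketch below is illustrative only: the function name `pearson_r` and the sample series are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r from raw sums: n*Sxy - Sx*Sy over the root of the two bracketed terms."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of observations")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical series
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))

# A perfectly linear relationship gives r = 1 (up to rounding)
print(round(pearson_r(x, [2, 4, 6, 8, 10]), 4))
```

For the first pair of series the sums are ∑x = 15, ∑y = 20, ∑xy = 66, ∑x² = 55 and ∑y² = 86, so r = 30 / √1500.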
Note:
1. Two variables are said to be correlated if there exists a cause and effect relationship
between them
Example:
- Yield and rainfall
- Production and price
- Demand and supply
2. Correlation is said to be positive if an increase in one variable results in an increase in the
other variable, or a decrease in one results in a decrease in the other.
Example: Demand and Production, Production and Supply, Income and Want
3. Correlation is said to be negative if an increase in one variable results in a decrease in the
other variable, or a decrease in one results in an increase in the other.
Example: Production and Price, Supply and Price
4. Correlation is said to be zero if the variables behave independently
Example: Decrease in tax rate and revenue
5. r lies between -1 and +1
 If r lies between 0 and 1, positive correlation exists
 If r is exactly 1, the correlation is perfect positive correlation
 If r lies between -1 and 0, negative correlation exists
 If r is -1, that implies perfect negative correlation
Problems:
C-1: Find Karl Pearson’s co-efficient of correlation for the following data
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60
C-2: Calculate KPCC for the following two variables
X 80 90 100 110 120 130 140 150 160
Y 15 15 16 19 17 18 16 18 14
C-3: Calculate Pearson’s Co-efficient of correlation for the following
X 6.9 8.5 5.8 8.6 9.6 8.0 9.7
Y 2.9 3.8 6.5 2.3 5.5 3.5 3.2
C-4: Calculate Karl Pearson’s Co-efficient of correlation between expenditure on
Advertising and sales from the data given below
Ad Expenses 39 65 62 90 82 75 25 98 36 78
Sales (Lakhs) 47 53 58 86 62 68 60 91 51 84
C-5: From the following table calculate the co-efficient of correlation by Karl Pearson’s
Method
X 6 2 10 4 8
Y 9 11 ? 8 7
Arithmetic mean of X and Y series are 6 & 8 respectively
C-6: Calculate the co-efficient of correlation between X and Y series from the following
data
X Y
No. of Pairs of Observation 15 15
Arithmetic Mean 25 18
Standard Deviation 3.01 3.03
Sum of Squared Deviation from Mean 136 138
C-7: The Co-efficient of correlation between X and Y is 0.48, the Co-variance of x, y is 36,
the variance of X is 16. Find the standard deviation of Y.
C-8: Given the following information
$r_{xy} = 0.8$, $\sum xy = 60$, $\sigma_y = 2.5$ and $\sum x^2 = 90$, where x and y are the deviations from the
respective means. Find the number of items.
Properties of Correlation Co-efficient
1. Pearson’s Correlation Co-efficient cannot exceed 1 numerically. In other words it
lies between -1 and +1
2. Correlation Co-efficient is independent of the change of origin and scale
3. Two Independent variables are uncorrelated but the converse is not true i.e.
uncorrelated variables need not necessarily be independent
Spearman’s Rank Correlation
Sometimes we come across statistical series in which the variables under consideration
are not capable of quantitative measurement but can be arranged in serial order. This
happens when we are dealing with qualitative characteristics such as honesty, beauty,
character, morality etc.
In such cases Karl Pearson’s Co-efficient of correlation cannot be used as such.
Variables are measurable, whereas attributes cannot be measured directly, i.e. quantifying
them is difficult. Spearman’s Co-efficient of rank correlation is more appropriate for such
data and is calculated using the following formula
It is denoted as $\rho$ (Rho)
$\rho = 1 - \dfrac{6 \sum D^2}{n^3 - n}$
D = Difference between ranks i.e. D = R1 - R2
n = Number of pairs of observations of x and y
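The formula above can be checked with a few lines of Python. This is only a minimal sketch: the function name `spearman_rho` is an illustrative choice, the ranks are taken from problem C-10 below, and the function assumes the ranks contain no ties.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation for untied ranks: 1 - 6*sum(D^2) / (n^3 - n)."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rank lists must have the same length")
    n = len(rank_x)
    # D = R1 - R2 for every pair of ranks
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Ranks from problem C-10
rank_x = list(range(1, 13))
rank_y = [12, 9, 6, 10, 3, 5, 4, 7, 8, 2, 11, 1]
rho = spearman_rho(rank_x, rank_y)
print(round(rho, 4))  # -0.4545
```

Here the sum of squared rank differences is 416 and n = 12, so the result is 1 - 2496/1716, a moderate negative rank correlation.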
Problems:
C-9: Find Spearman’s Rank correlation for the following data
X 12 18 32 45 21 30
Y 70 68 75 95 86 12
C-10: From the following data calculate co-efficient of Rank correlation
Ranks X 1 2 3 4 5 6 7 8 9 10 11 12
Y 12 9 6 10 3 5 4 7 8 2 11 1
C-11: Calculate the Spearman’s Co-efficient of rank for the following data
X 78 89 97 69 49 79 68 57
Y 125 137 156 112 107 136 123 108
Note: If X or Y contains repeated observations (tied ranks), Spearman’s rank correlation
co-efficient is calculated as
$\rho = 1 - \dfrac{6 \left( \sum D^2 + CF \right)}{n^3 - n}$
Where CF is the correction factor for the repeated observations and is given by
$CF = \dfrac{m^3 - m}{12}$
“m” is the number of times a value is repeated; one such term is added for every
repeated value in X and in Y
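The tie correction can be sketched in Python as follows. The helper names are illustrative, and the sketch assumes the usual convention that tied observations share the average of the ranks they would otherwise occupy, with rank 1 given to the largest value. The data are those of problem C-15.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their positions."""
    positions = {}
    for pos, value in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def correction_factor(values):
    """Sum of (m^3 - m) / 12 over every value repeated m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_rho_ties(x, y):
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(rank_x, rank_y))
    cf = correction_factor(x) + correction_factor(y)
    return 1 - 6 * (sum_d2 + cf) / (n ** 3 - n)


# Data from problem C-15: 20 is repeated twice in X, 30 three times in Y
x = [15, 20, 28, 12, 40, 60, 20, 80]
y = [40, 30, 50, 30, 20, 10, 30, 60]
rho_ties = spearman_rho_ties(x, y)
```

For these data the sum of squared rank differences is 81.5 and the two correction terms are 0.5 and 2, so the numerator equals the denominator (504) and the co-efficient works out to 0.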
Problems:
C-12: Find the spearman’s rank correlation co-efficient
X 78 89 89 69 59 79 68 57
Y 125 137 156 112 112 112 123 108
C-13: Find Spearman’s Rank Correlation Co-efficient
X 12 18 32 18 25 24 25 40 38 22
Y 16 15 28 16 24 22 28 36 34 19
C-14: Find the Co-efficient of rank correlation between X and Y
X 30 38 28 27 28 23 30 33 28 35
Y 29 27 22 29 20 29 18 21 27 22
C-15: Calculate Co-efficient of rank correlation
X 15 20 28 12 40 60 20 80
Y 40 30 50 30 20 10 30 60
Regression Analysis:
The technique of estimating one variable from the values of another variable, whenever
the two variables are correlated, is called regression.
I.e. if “x” and “y” are correlated, then estimating the values of “x” based on the values of “y”,
or estimating the values of “y” based on the values of “x”, is called Regression Analysis.
To estimate “x” values based on “y”, we use the regression equation of “x” on “y”,
given by
$(x - \bar{x}) = b_{xy} (y - \bar{y})$
Where $b_{xy}$ is the regression coefficient of x on y
$\bar{x}$, $\bar{y}$ are the means of x and y respectively
Similarly, to estimate “y” based on “x” we use the regression equation of “y” on “x”:
$(y - \bar{y}) = b_{yx} (x - \bar{x})$
Where $b_{yx}$ is the regression coefficient of y on x
$\bar{x}$, $\bar{y}$ are the means of x and y respectively
And
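$b_{xy} = r \cdot \dfrac{\sigma_x}{\sigma_y}$ or $b_{xy} = \dfrac{n \sum xy - (\sum x)(\sum y)}{n \sum y^2 - (\sum y)^2}$
$b_{yx} = r \cdot \dfrac{\sigma_y}{\sigma_x}$ or $b_{yx} = \dfrac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2}$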
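As an illustration, the two regression coefficients and the two regression lines can be computed directly from the sums in the formulas above. The sketch below uses the data of problem R-3; the function names are illustrative and not part of the manual.

```python
def regression_coefficients(x, y):
    """Return (b_xy, b_yx) from the raw-sum formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    b_xy = numerator / (n * sum_y2 - sum_y ** 2)  # x on y
    b_yx = numerator / (n * sum_x2 - sum_x ** 2)  # y on x
    return b_xy, b_yx


def estimate_y(x_new, x, y):
    """Regression line of y on x: (y - y_bar) = b_yx (x - x_bar)."""
    _, b_yx = regression_coefficients(x, y)
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    return y_bar + b_yx * (x_new - x_bar)


def estimate_x(y_new, x, y):
    """Regression line of x on y: (x - x_bar) = b_xy (y - y_bar)."""
    b_xy, _ = regression_coefficients(x, y)
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    return x_bar + b_xy * (y_new - y_bar)


# Data from problem R-3
x = [1, 3, 4, 8, 9, 11, 14]
y = [1, 2, 4, 5, 7, 8, 9]
b_xy, b_yx = regression_coefficients(x, y)
y_at_10 = estimate_y(10, x, y)
x_at_6 = estimate_x(6, x, y)
```

Both lines pass through the point of means, so estimating y at the mean of x returns the mean of y.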
R-1: Find the two regression equations for the following data
x 62 72 98 76 81 56 76 92 88 49
y 112 124 131 117 132 96 120 136 97 85
R-2: The following data relate to the experience of 8 operators and their performance rating (y).
Calculate the regression line of performance rating on experience and estimate the
probable performance rating of an operator who has 15 years of experience.
R-3: Fit a least square line of the following
X 1 3 4 8 9 11 14
Y 1 2 4 5 7 8 9
a) Obtain Co-efficient of X on Y and Y on X
b) Find the co-efficient of correlation between X and Y
c) Find Y and X, when X=10 and Y=6 respectively
Note:
1. Co-efficient of correlation $r = \pm\sqrt{b_{xy} \cdot b_{yx}}$
2. bxy and byx will never have opposite signs
3. r will be positive if bxy and byx are positive
4. r will be negative if bxy and byx are negative
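The four points in the note can be captured in a small helper. This is a sketch only; the function name and the coefficient values used in the example are hypothetical.

```python
import math


def correlation_from_coefficients(b_xy, b_yx):
    """r = +/- sqrt(b_xy * b_yx), taking the common sign of the two coefficients."""
    if b_xy * b_yx < 0:
        raise ValueError("regression coefficients cannot have opposite signs")
    magnitude = math.sqrt(b_xy * b_yx)
    return -magnitude if (b_xy < 0 or b_yx < 0) else magnitude


# Hypothetical coefficients, for illustration only
r_positive = correlation_from_coefficients(0.8, 0.45)    # about 0.6
r_negative = correlation_from_coefficients(-0.8, -0.45)  # about -0.6
```

Because r cannot exceed 1 numerically, the product of the two coefficients can never be greater than 1; this is a useful check when deciding which of two given equations is the line of y on x.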
R-4: The heights of fathers and sons are given in the following table. Find the two lines of
regression and estimate the expected average height of the son when the height of the
father is 67.5 inches
Height of father 65 66 67 67 68 69 71 73
Height of sons 67 68 64 68 72 70 69 70
R-5: The marks of the 8th Standard students in mathematics and statistics are as follows.
Find the regression line of marks in statistics on marks in mathematics; also estimate the
marks in statistics of a 9th student who has scored 90 in mathematics.
(X) Maths Scored 50 40 60 46 50 48 59 47
(Y) Statistics Scored 30 37 42 32 35 45 40 35
R-6: You are given the following information:
X Y
AM 36 85
SD 11 8
r = 0.66
Find
- The two regression equations
- The regression co-efficients
- The estimate of x when y = 100
R-7: Following is the information about advertisement expenditure and Sales
Advertising Expenditure Sales
AM 20 120
SD 5 25
r = 0.8
a) Calculate the two regression equations
b) Find the likely sales when advertisement expenses are 25 crores
c) What should be the advertisement budget when sales target is 150 crores
R-8: The two regression equations are 2y - x - 50 = 0 and 3y - 2x - 10 = 0
Find (i) r (ii) $\bar{x}$ and $\bar{y}$ (the point of intersection)
Question from Previous Question Papers:
1. The following data give the test scores and sales made by nine sales men during
certain period:
Test Scores: 14 19 24 21 26 22 15 20 19
Sales (’00 Rs) 31 36 48 37 50 45 33 41 39
Find the regression equation and also estimate the most probable sales volume of a
salesman making a score of 28.
2. From the following data calculate the rank correlation coefficient after making
adjustment for tied ranks
X 48 33 40 9 16 16 65 24 16 57
Y 13 13 24 6 15 4 20 9 6 19
3. The Following data relate to age of employees and the number of days they reported
sick in a month. Calculate Karl Pearson’s co-efficient of correlation and interpret it.
Age (Years) 30 32 35 40 48 50 52 55 57 61
Sick Days 1 0 2 5 2 4 6 5 7 8
4. The following data give the experience of machine operators and their performance
ratings.
Operator 1 2 3 4 5 6 7 8
Experience (in Years) 16 12 18 4 3 10 5 12
Performance Ratings 87 88 89 68 78 80 75 83
Calculate the regression line of performance rating on experience and estimate the
probable performance rating of an operator who has 7 years of experience
5. A Company wants to assess the impact of R and D expenditure (in Rs. 1000/-)
on its annual profit (in Rs. 1000/-). The following table presents the
information for last 8 years.
Year R and D Expenditure Annual Profit
2010 9 45
2011 7 42
2012 7 40
2013 1 60
2014 4 30
2015 5 34
2016 3 25
2017 3 20
Estimate the regression equations and predict the annual profit for the year
2020 for an allocated sum of Rs. 100,000/- as R and D expenditure.
6. The following table shows the ages (x) and blood pressure (y) of 8 persons.
X 52 63 45 36 72 65 47 25
y 62 53 51 25 19 43 60 33
Obtain the regression equation of y on x and find the expected blood pressure of a
person who is 49 years old.
7. The following data relate to the ages of husbands and wives:
Age of Husbands (Years) 25 28 30 32 35 36 38 39 42 55
Age of Wives (Years) 20 26 29 30 25 18 26 35 35 46
Find the regression equations and also find the most likely age of the husband when the
wife’s age is 25 years
8. Calculate Karl Pearson’s coefficient of correlation between expenditure on
advertising and sales from the data given below:
Advertisement expenditure (Rs.000’) 39 78 65 62 90 82 75 25 98 36
Sales (Rs. Lacs) 47 84 53 58 86 62 68 60 91 51
Comment on the results
9. Consider the following data, obtain the regression equations:
X 6 2 10 4 8
Y 9 11 5 8 7
10. A research company summarized advertising expenditure and sales results
as follows:
Adv. Exp. (Rs. in Crore) Sales (Rs. in Crore)
Mean 20 200
SD 18 17
11. The following table gives the age of cars of a certain make and annual
maintenance costs. Estimate the maintenance cost of a 7 year old car.
Age of Cars (in Years) 2 4 6 8
Maintenance cost (in Hundreds of Rs.) 10 20 25 30
12. A financial analyst wanted to find out whether inventory turnover influences
a company’s earnings per share (in %). A random sample of 7 companies
listed on the stock exchange was selected and the following data were recorded.
Company A B C D E F G
Inventory Turnover 4 5 7 8 6 3 5
Earning Per Share (%) 11 9 13 7 13 8 8
Unit – 3
Probability Distribution
Unit 3: (8 Hours)
Probability Distribution:
- Concept and definition – Rules of probability – Random variables – Concept of
probability distribution
- Theoretical probability distributions: Binomial, Poisson, Normal and Exponential –
Bayes’ theorem (No derivation) (Problems only on Binomial, Poisson and Normal)
Introduction:
The result or outcome of an experiment that is performed repeatedly under essentially
homogeneous and similar conditions falls into one of the following categories
- Unique or Certain
o Here the results are predictable with certainty; these are known as
deterministic or predictable phenomena
Example: Boyle’s Law: Pressure × Volume = Constant
o Most of the physical and chemical sciences are deterministic in nature
- Uncertain or Unpredictable
o Here the results cannot be predicted with certainty; these are known
as unpredictable or probabilistic phenomena.
Example: A sales manager’s sales target, the life of an electric bulb
o Such phenomena are frequently observed in Economics, Business and Social Sciences
A numerical measurement of uncertainty is provided by a very important branch of
statistics called the “Theory of Probability”.
Here, using mathematics and statistics, we try to present the conditions under which we can
make sensible numerical statements about uncertainty, and apply certain methods of
calculating numerical values of probabilities and expectations.
“Statistics is the science of decision making with calculated risks in the face of
uncertainty”
Many business decisions are based on variables which are not under the decision maker’s
control, and hence decisions may yield poor results.
Using mathematical models we can make better decisions, but only in a limited number of cases.
In such circumstances we make use of probability, i.e. calculations to predict
uncertain conditions or situations.
Probability is a measure of chance associated with occurrence of an event (That is
essential to make good business decisions)
Example: Demand for the products, which is newly launched.
Important terminologies in Probability:
1. Experiments
The term experiment refers to an act which can be repeated under the same
given conditions.
Random experiment: An experiment is called a random experiment if, when
conducted repeatedly under essentially homogeneous conditions, the result is not
unique or certain but may be any one of the various possible
outcomes.
Or
An experiment having random outcomes
Or
An experiment whose results depend on chance
Example: Tossing a coin, rolling a dice
2. Trial
Performing a random experiment is called a trial
Example: If a coin is tossed two times, that means two
trials
3. Event:
An outcome or combination of outcomes of an experiment is termed an event
Example: Tossing a coin – You may get H or T – These are events
4. Mutually Exclusive Events:
Two events are said to be mutually exclusive or incompatible when both cannot
happen simultaneously in a single trial; in other words, the occurrence of any
one of them prevents the occurrence of the other.
In other words, “if the happening of one event prevents the happening of the other,
such events are called mutually exclusive events”
Example: Tossing a coin leads to two events, Head (H) or Tail (T)
If head turns up in tossing a coin, then it prevents tail from turning up
and vice-versa
5. Independent and Dependent Events:
Two or more events are said to be independent when the outcome of one does not
affect, and is not affected by, the other.
Example: When a coin is tossed twice, getting a head in the first trial does
not affect the outcome of the next trial
Events are said to be dependent when the occurrence or non-occurrence of one event in
any one trial affects the probability of the other event in other trials
Example: Drawing cards without replacement.
6. Equally likely events:
Events are said to be equally likely when one doesn’t occur more often than the
others. This means none of them is expected to occur in preference of other.
In other words – equal chance of occurrence and importance for all the events to
occur
Example: When you roll a dice, occurrence of all the 6 faces i.e. 1, 2, 3, 4, 5, 6 are
equally likely
7. Simple and Compound Events:
In case of simple events we consider the probability of the happening or not
happening of single events
Compound events, we consider the joint occurrence of two or more events
8. Exhaustive Events:
Events are said to be exhaustive when their totality includes all the possible
outcomes of a random experiment.
In other words, the sum of the individual chances of occurrence is equal to 1
Example 1: Rolling a dice once, the possible outcomes are 1, 2, 3, 4, 5 and 6;
hence the exhaustive number of cases is 6
Example 2: If we roll two dice once, the exhaustive number of cases is 6² = 36
Similarly, rolling three dice leads to 216 outcomes, and the sum of the
probabilities of occurrence of all these events is 1
[Figure: a bag containing 10 red balls and 6 white balls, with the labels “Probability of drawing a red ball” and “Probability of drawing 3 white balls in the first draw and 3 black balls in second draw”]
9. Complementary events:
Let there be two events A and B, A is called the complementary event of B (and
Vice versa), if A and B are mutually exclusive and exhaustive.
Example: When the dice is thrown, the occurrence of an even number and
odd number are complementary events.
Simultaneous occurrence of two events A and B is generally written as AB
Definition of Mathematical Probability
Consider a random experiment with “N” outcomes which are mutually exclusive,
exhaustive and equally likely.
Let there be an event “A”, and let “m” of these outcomes be favourable to “A”;
then the probability of occurrence of “A” can be written as follows
$P(A) = \dfrac{m}{N} = \dfrac{\text{Outcomes favourable to A}}{\text{Total number of outcomes}}$
Example: Rolling of dice once, S = {1, 2, 3, 4, 5, 6}
The total number of outcomes N = 6
- Probability of getting odd numbers
$P(\text{Odd Numbers}) = \dfrac{m}{N} = \dfrac{3}{6} = \dfrac{1}{2} = 50\%$
That means the probability of getting the odd number is 50%
- Probability of getting the even numbers
$P(\text{Even Numbers}) = \dfrac{m}{N} = \dfrac{3}{6} = \dfrac{1}{2} = 50\%$
That means the probability of getting the even number is 50%
The probability of “A” not happening is the probability of the complementary event of “A”,
denoted as $\bar{A}$, $A^c$ or $A'$
i.e. $P(\bar{A}) = \dfrac{N - m}{N} = \dfrac{\text{Total outcomes} - \text{Favourable outcomes}}{\text{Total outcomes}}$
$P(\bar{A}) = \dfrac{N}{N} - \dfrac{m}{N} = 1 - \dfrac{m}{N} = 1 - P(A)$
P(𝐴̅) + P (A) = 1 i.e. the P(Failure) + P (Success) = 1
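The definition and the complement rule can be tried out with exact fractions. The following is a minimal sketch for one roll of a fair dice; the function name `classical_probability` is an illustrative choice.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """P(A) = m / N for equally likely, mutually exclusive, exhaustive outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))


sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair dice
odd = {1, 3, 5}

p_odd = classical_probability(odd, sample_space)
p_not_odd = 1 - p_odd               # complement rule: P(not A) = 1 - P(A)
print(p_odd, p_not_odd)             # 1/2 1/2
```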
Theorems of Probability or Rules of Probability
The two important theorems of probability
1. The addition theorem
2. The multiplication theorem
1. The addition Theorem of Probability:
- This is also called the “Or” probability: P (A Occurring) or P (B Occurring)
- This is also denoted as follows
o P(A) or P(B)
o P(A) ∪ P(B)
o P(A or B)
o P(A ∪ B)
- Here “Or” means “add”
- The addition theorem states that if two events A and B are mutually exclusive, the
probability of the occurrence of either A or B is the sum of the individual
probabilities of A and B
So symbolically P(A or B) = P(A) + P(B)
- When the events are mutually exclusive, i.e. disjoint, they cannot occur together,
so simply adding their individual probabilities gives the probability that either occurs
Mutually Exclusive events
P(A or B) = P(A ∪ B) = P(A) + P(B)
Similarly, for three or more mutually exclusive events, P(A or B or C) = P(A) + P(B) + P(C)
[Figure: Venn diagram of Event A and Event B, denoting “A Union B”]
- When events are “not mutually exclusive”, both events can occur together, i.e. there
is an overlap between them. The addition rule is then modified by subtracting the
probability of the overlap, which would otherwise be counted twice:
[Figure: Venn diagram showing the overlap of events A and B]
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
Or
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A and B)
Or
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(AB)
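Both forms of the addition rule can be verified by counting outcomes. The sketch below again uses one roll of a fair dice; the events chosen are illustrative examples and not taken from the manual.

```python
from fractions import Fraction

sample_space = set(range(1, 7))     # one roll of a fair dice


def prob(event):
    """Classical probability of an event within the sample space."""
    return Fraction(len(event & sample_space), len(sample_space))


# Not mutually exclusive: "even number" and "number above 3" overlap in {4, 6}
even = {2, 4, 6}
above_three = {4, 5, 6}
p_union_direct = prob(even | above_three)
p_union_rule = prob(even) + prob(above_three) - prob(even & above_three)

# Mutually exclusive: "1 or 2" and "5 or 6" cannot happen together
low = {1, 2}
high = {5, 6}
p_exclusive_direct = prob(low | high)
p_exclusive_rule = prob(low) + prob(high)
```

In both cases the value obtained by counting the union directly agrees with the value given by the rule (2/3 in each example).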
2. The Multiplication Theorem of Probability:
- This is also called the “And” probability
- Denoted as P(A) and P(B) for events A and B
- Also denoted as
o P(A & B)
o P(A ∩ B)
o P(A) ∩ P(B)
- This theorem states that if two events A and B are independent, the probability
that they both will occur is equal to the product of their individual probabilities
If A and B are independent, i.e. event “A” does not affect event “B”,
P(A ∩ B) = P(A) . P(B) or
P(A and B) = P(A) × P(B)
Similarly for three independent events
P(A ∩ B ∩ C) = P(A) × P(B) × P(C) or
P(A, B and C) = P(A) . P(B) . P(C)
- If two events A and B are dependent, i.e. the probability of B depends on whether A is
known to have occurred,
then
P(A ∩ B) = P(A) × P(B/A) ; P(A) ≠ 0
P(B ∩ A) = P(B) × P(A/B) ; P(B) ≠ 0
Or
$P(B/A) = \dfrac{P(AB)}{P(A)} = \dfrac{P(A \text{ and } B)}{P(A)}$
$P(A/B) = \dfrac{P(AB)}{P(B)} = \dfrac{P(A \text{ and } B)}{P(B)}$
P(B/A) and P(A/B) above are called conditional probabilities
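A standard illustration of dependent events is drawing two cards without replacement. The sketch below (an illustrative example, not one of the manual's problems) computes the probability that both cards are aces using P(A ∩ B) = P(A) × P(B/A), and then checks it by counting all ordered pairs of cards.

```python
from fractions import Fraction
from itertools import permutations

# Only "ace" versus "not an ace" matters for this example
deck = ["ace"] * 4 + ["other"] * 48

p_first_ace = Fraction(4, 52)                 # P(A)
p_second_ace_given_first = Fraction(3, 51)    # P(B/A): one ace is already gone
p_both_aces = p_first_ace * p_second_ace_given_first

# Check by enumerating every ordered pair of distinct card positions
ordered_pairs = list(permutations(range(52), 2))
favourable = sum(1 for i, j in ordered_pairs if deck[i] == "ace" and deck[j] == "ace")
p_by_counting = Fraction(favourable, len(ordered_pairs))

# Conditional probability recovered from the joint probability
p_conditional = p_both_aces / p_first_ace     # equals P(B/A)
```

Had the first card been replaced before the second draw, the two draws would be independent and the probability would be 4/52 × 4/52 instead.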
Problems:
1. Two unbiased coins are tossed once. Find the probability of getting
I) At least one head
II) At most one head
III) Two tails
2. An unbiased dice is rolled once. Find the probability of getting
I) Odd Numbers
II) Even Numbers
3. A pair of dice are rolled. Find the probability that the sum on the faces turning
up is
I) 7
II) At least 10
III) 12
IV) Neither 7 nor 10
4. A Bag contains 6 White, 4 Red and 8 Black marbles. 3 Marbles drawn randomly,
what is the probability that they are of
I) Same Colour
II) Different Colours
5. A Bag contain 7 White and 9 Black Marbles, 2 marbles are drawn randomly. What is
the probability that
I) They are of same colours?
II) They are of different colours?
6. A bag contains 5 Red, 3 white and 6 Green sticks. 3 sticks are drawn randomly. Find
the probability that
I) All are green II) 2 Red and 1 Green Sticks III) 3 White Sticks
7. 4 Cards are drawn from a pack of cards. Find the probability that 2 are Spades and 2
are Hearts.
8. From a pack of 52 cards, 4 are accidentally drawn. Find the chance that
I) They consist of a Jack, a Queen, a King and an Ace
II) They are one from each suit
III) 2 of them are Red and 2 of them are Black
9. In a College, 60% of the students play football and 50% of them play basketball. If a
student is selected randomly from the college, what is the probability that
I) He plays basketball or Football
II) Students play neither sports
10. A person is known to hit the target in 3 out of 4 shots, whereas another person is
known to hit the target in 2 out of 3 shots. Find the probability that the target is hit
when both of them try.
11. A salesman is known to sell the product in 3 out of 5 attempts, another salesman in 2
out of 5 attempts. Find the probability that
I) No sale is made when both try to sell the product, i.e. both
fail to sell
II) Either of them succeeds in selling the product
12. The odds against student “X” solving a B/S problem are 8:6 and the odds in favour of
student “Y” solving a B/S problem are 14:16
I) What is the chance that the problem will be solved if they both try
independently of each other?
II) What is the probability that neither solves the problem?
13. If the P(A) is 0.3, P(B) is 0.2 and P(C) = 0.1 and A B C are independent events. Find
the probability of occurrence of at-least one of the 3 events A, B and C
Bayes’ Theorem:
This theorem allows us to use new information to update the conditional probability of an
event. Bayes’ theorem in its simple form is given by
$P(A / B) = \dfrac{P(A \cap B)}{P(B)}$
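To see how this formula updates a probability, consider a hypothetical example that is not taken from the manual: machine M1 produces 60% of a factory's output with 2% defectives, and machine M2 produces the remaining 40% with 5% defectives. All of these numbers are assumptions for illustration. The sketch below finds the probability that a defective item came from M1.

```python
from fractions import Fraction

# Hypothetical prior shares of output and defect rates (assumed values)
p_m1, p_m2 = Fraction(60, 100), Fraction(40, 100)
p_def_given_m1, p_def_given_m2 = Fraction(2, 100), Fraction(5, 100)

# Joint probabilities via the multiplication theorem: P(M and D) = P(M) x P(D/M)
p_m1_and_def = p_m1 * p_def_given_m1
p_m2_and_def = p_m2 * p_def_given_m2

# P(D) is the sum over the mutually exclusive ways of getting a defective
p_def = p_m1_and_def + p_m2_and_def

# Bayes' theorem in its simple form: P(M1/D) = P(M1 and D) / P(D)
p_m1_given_def = p_m1_and_def / p_def
print(p_m1_given_def)  # 3/8
```

Before inspection the chance that an item came from M1 is 60%; once the item is known to be defective, it falls to 3/8 = 37.5%.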
Random Variable
Random variables are ways to map the outcomes of random processes to numbers, i.e. a way of
quantifying the outcomes of a random experiment. If you have a random process such as
flipping a coin, rolling a dice or measuring the rain that might fall tomorrow, you assign a
number to each outcome of the process; that is, you quantify the outcomes.
- A random variable is a function which takes real values determined by the
outcomes of the random experiment
- Random variables are denoted by the capital letters X, Y, Z
- The actual value which the variable assumes for a given outcome is not itself a random variable.
- The random variable is used for further mathematical operations on the outcomes and for the
purpose of notation.
Example: A random experiment where three coins are tossed simultaneously; each coin shows
H or T, so the sample space is
S = {H, T} × {H, T} × {H, T}
The total outcomes are as follows
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let us consider a variable “X” to quantify the outcomes of the above experiment. If “X” is
the number of heads obtained, then “X” takes any one of the values {0, 1, 2, 3}
Outcomes: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Values of X: 3 2 2 1 2 1 1 0
Hence the random variable is a function which takes real values which are determined by the
outcomes of the random experiment.
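The three-coin example can be reproduced by listing the sample space and applying the function X = number of heads to each outcome. This is a small sketch; the probabilities at the end assume fair coins, so that all eight outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space {H, T} x {H, T} x {H, T}
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

# The random variable X maps each outcome to its number of heads
X = {outcome: outcome.count("H") for outcome in sample_space}

# Probability of each value of X, assuming fair coins
counts = Counter(X.values())
distribution = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```

The printed pairs {x, p(x)} form the probability distribution of X: 0 and 3 heads each have probability 1/8, while 1 and 2 heads each have probability 3/8.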
Discrete and Continuous Random Variable:
If “X” assumes only a finite or countably infinite set of values, it is known as a Discrete
Random Variable
Example: Number of students in a college, marks obtained by the students in a test, number
of defective mangoes in a basket.
If “X” assumes an infinite and uncountable set of values, it is said to be a Continuous Random
Variable. Here we usually talk of the values in a particular interval and not at a point.
Example: Height or Weight of students in a classroom
Generally Discrete Random Variable represents counted data while Continuous Random
Variable represents measured data.
Probability Distribution of a Discrete Random Variable
Let us consider a Discrete Random Variable “X” which can take the possible values x1, x2,
x3, ……, xn. With each value of the variable X we associate a number
$p_i = P(X = x_i)$; i = 1, 2, 3, ……, n
where $p_i = P(X = x_i) \ge 0$ and $\sum p_i = p_1 + p_2 + p_3 + \dots + p_n = 1$
The function $p_i = P(X = x_i)$, or p(x), is called the Probability Mass Function of the random
variable X, and the set of all possible ordered pairs {x, p(x)} is called the Probability
Distribution of random variable X
The concept of probability distribution is analogous to that of frequency distribution. Just as a
frequency distribution tells us how the total frequency is distributed among different values (or
classes) of the variable, a probability distribution tells us how the total probability of 1 is
distributed among the various values which the random variable can take. It is usually represented
in a tabular form as given below.
Probability Distribution of Random Variable X
x     p(x)
x1    p1
x2    p2
x3    p3
.     .
.     .
xn    pn
Probability Distribution of a Continuous Random Variable
This is represented in the form of a frequency polygon drawn from the
grouped frequency distribution of a continuous variable. A frequency polygon gets
smoother and smoother as the sample size gets larger and the class intervals become more
numerous and narrow. Ultimately the density polygon becomes a smooth curve called the
density curve. The function that defines the curve is called the Probability Density
Function.
Concept of Probability Distribution
The probability distribution of a random variable may be
- A theoretical listing of outcomes and probabilities, which can be obtained from a mathematical
model representing some phenomenon or process of interest
- A subjective listing of outcomes associated with their subjective or contrived probabilities,
representing the degree of conviction of the decision maker as to the likelihood of the
possible outcomes.
- An empirical listing of outcomes and their observed relative frequencies
Here we are focusing on the theoretical listing of the outcomes and their probabilities.
Mathematical Expectations of Random Variable:
The expected value of “X”, denoted by E(X), is defined as
$E(X) = \sum [x \times p(x)]$
$E(X) = \bar{X}$
Hence the mathematical expectation of a random variable is nothing but its
Arithmetic mean
Variance, Standard Deviation and Mean
Mean = $E(X) = \sum x \, p(x)$
Variance = $\sigma_x^2 = E(X^2) - [E(X)]^2 = \sum x^2 \, p(x) - \left[ \sum x \, p(x) \right]^2$
Standard Deviation: $\sigma_x = \sqrt{\sum x^2 \, p(x) - \left[ \sum x \, p(x) \right]^2}$
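These three formulas translate directly into code. The sketch below applies them to the number shown by a fair dice; the function name is illustrative, and exact fractions are used so that the results can be compared with a hand calculation.

```python
import math
from fractions import Fraction


def mean_and_variance(distribution):
    """distribution maps each value x to p(x); the probabilities must total 1."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must add up to 1")
    mean = sum(x * p for x, p in distribution.items())          # E(X)
    e_x2 = sum(x * x * p for x, p in distribution.items())      # E(X^2)
    return mean, e_x2 - mean ** 2                                # E(X^2) - [E(X)]^2


# A fair dice: each face has probability 1/6
dice = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance = mean_and_variance(dice)
std_dev = math.sqrt(variance)
print(mean, variance)  # 7/2 35/12
```

The expectation 7/2 = 3.5 is not a value the dice can actually show; it is the long-run average over many throws.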
Problems:
1. A die is tossed twice. Getting “an odd number” is termed as success. Find the probability
distribution of the number of successes.
2. Two cards are drawn
A. Successively with replacement
B. Simultaneously (successively without replacement)
from a well shuffled deck of 52 cards. Find the probability distribution of the number of
aces.
3. Obtain the probability distribution of X, the number of heads in three tosses of a coin (Or
simultaneous toss of three coins)
4. Two dice are rolled at random. Obtain the probability distribution of the sum of the
numbers on them.
5. Four bad Apples are mixed accidentally with 20 good apples. Obtain the probability
distribution of the number of bad apples in a draw of 2 apples at random.
6. A die is thrown at random. Find the expectation of the number on it.
7. A random variable “X” has the following probability distribution. Find Mean and
Variance
x 4 5 6 7
P(x) 0.1 0.3 0.4 0.2
8. A random variable X has the following probability function
X -2 -1 0 1 2 3
P(X) 0.1 k 0.2 2k 0.3 k
Find k, the mean and the s.d. of X
Theoretical Probability Distribution
Theoretical probability distributions are functions of a random variable which
generate probabilities for given values of that random variable. In other words, probability
distributions are ready-to-use formulas (functions) for calculating the probabilities of a known
variable.
Amongst theoretical or expected frequency distributions the following are popular
1. Binomial Distribution
2. Poisson Distribution
3. Normal Distribution
Binomial Probability Distribution:
- It is also known as the “Bernoulli Distribution”: a probability distribution expressing the
probability of one set of dichotomous alternatives, i.e. success or failure.
- Conditions or assumptions of Binomial Distribution
o n, the number of trials, is finite
o Each trial results in two mutually exclusive and exhaustive outcomes, termed as
success and failure
o Trials are independent
o p, the probability of success, is constant for each trial; then q = 1 - p is the
probability of failure in any trial
- Bernoulli trial: A trial having only two outcomes
Example: Tossing a coin: H or T
Outcome of the game: Win or Lose
Business outcome: Success or failure
Let “x” be a binomial random variable with “n” trials and P(Success) = p; then the
probability of “x” successes is given by
$P(x) = {}^{n}C_{x} \, p^{x} \, q^{n-x}$
Where x = Number of successes in “n” trials
n = Number of trials
p = Probability of success in a single trial
q = (1 - p) = Probability of failure in a single trial
Here “n” and “p” are the parameters of the Binomial Distribution. Once their values are
known, the probability of any number of successes can be found from the above formula.
Constants of Binomial Distribution
- Mean = np
- Variance = npq
- Standard Deviation = $\sqrt{npq}$
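The binomial formula and its constants can be evaluated with Python's standard library. The sketch below takes a fair coin tossed 5 times (n = 5, p = 0.5) as an illustration; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


n, p = 5, 0.5                                    # a fair coin tossed 5 times

p_exactly_3 = binomial_pmf(3, n, p)              # exactly three heads
p_at_least_1 = 1 - binomial_pmf(0, n, p)         # complement of "no heads"
p_at_most_3 = sum(binomial_pmf(x, n, p) for x in range(4))

mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

print(p_exactly_3, p_at_least_1)                 # 0.3125 0.96875
```

"At least" and "at most" questions are answered by adding the individual probabilities, or more quickly by using the complement when only a few values are excluded.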
Problems on Binomial Distribution
14. A fair coin is tossed 5 times, what is the probability of getting
I) Exactly three head
II) At-least 1 head
III) No Heads
IV) At most 3 heads
15. A salesman makes a sale to 4 out of 10 (40% success) customers he contacts. If four
customers are contacted today, what is the probability that he makes exactly
two sales?
16. 20% of the bolts manufactured by machine are defective. Find the probability that
there are
I) No defective
II) At most two defective
III) At-least 1 defective
IV) Exactly one defective bolt when 5 bolts are chosen randomly
17. The probability of a man hitting a target is 1/5. What is the probability that he hits the target
I) At least once
II) At least thrice
if he aims 7 times at the target?
18. In hundred sets of 10 tosses of an unbiased coin, in how many sets should we expect
to get
I) 7 heads and 3 tails
II) At least 7 heads
Fitting up of distribution (Framing a probability Distribution)
- The process of obtaining expected frequency based on the theoretical probability
distribution is called fitting up of distribution.
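One common way of fitting a binomial distribution, sketched below, is to take n as the largest observed value of X, estimate p from the observed mean (mean = np), and multiply each binomial probability by the total frequency N. These estimation choices are assumptions of the sketch rather than a procedure stated in the manual; the data are those of problem 19 below.

```python
from math import comb


def fit_binomial(frequencies):
    """frequencies[x] is the observed frequency of X = x, for x = 0 .. n."""
    n = len(frequencies) - 1
    total = sum(frequencies)                                    # N
    mean = sum(x * f for x, f in enumerate(frequencies)) / total
    p = mean / n                                                # from mean = np
    q = 1 - p
    expected = [total * comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
    return p, expected


# Observed frequencies from problem 19 (X = 0, 1, ..., 5)
observed = [2, 10, 24, 38, 18, 8]
p_hat, expected = fit_binomial(observed)

for x, (obs, exp_freq) in enumerate(zip(observed, expected)):
    print(x, obs, round(exp_freq, 1))
```

The expected frequencies add up to the total observed frequency, which is a quick check on the arithmetic.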
19. Fit a Binomial distribution for the following data
X 0 1 2 3 4 5
f 2 10 24 38 18 8
20. Fit Binomial distribution for the following data
X 0 1 2 3 4 5
f 2 10 48 114 72 40
21. Fit a binomial distribution for the following data
X 0 1 2 3 4 5 6 7
f 7 6 19 35 30 23 7 1
22. Mean and Variance of Binomial distribution are 12 and 5 respectively. Find the
parameters of Binomial Distribution?
23. The mean and Standard Deviation of Binomial Distribution is 4 and √3 respectively.
Find n, p and q
24. Find the probability of Success for Binomial distribution if n=6 and 4P(X=4) =
P(X=2)
Poisson Probability Distribution
Poisson distribution may be expected in cases where the chance of any individual event being a
success is small. The distribution is used to describe the behaviour of rare events such as the
number of accidents on a road or the number of printing mistakes in a book.
It has been called “the Law of Improbable Events”
Let “x” be a discrete random variable with mean (λ), counting the occurrences of a rare event. “x” is said
to follow the Poisson probability distribution if its probability mass function (PMF) is given by
$P(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
Where x = 0, 1, 2, 3, 4, ……, ∞
λ = Parameter of the Poisson distribution, i.e. the mean (np) or the average number of
occurrences of the event.
The Poisson distribution is a discrete distribution with a single parameter λ. As λ
increases, the distribution shifts to the right.
All Poisson probability distributions are skewed to the right. This is the reason why the Poisson
distribution has been called the probability distribution of rare events.
Constants of the Poisson distribution
- The mean of Poisson distribution = 𝜆
- The standard deviation = √ 𝜆
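The Poisson formula is easy to evaluate numerically. The sketch below uses λ = 3 as an illustration, for which e^(-3) is approximately 0.04979; the function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!  for x = 0, 1, 2, ..."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 3                                   # average number of occurrences

p_none = poisson_pmf(0, lam)              # equals e^(-3)
p_exactly_2 = poisson_pmf(2, lam)
p_at_least_1 = 1 - p_none                 # complement of "no occurrence"

mean = lam
std_dev = sqrt(lam)

print(round(p_none, 5), round(p_exactly_2, 4))  # 0.04979 0.224
```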
Role of the Poisson distribution
Used for infrequently occurring events with respect to time, area, volume or similar units. Some
practical situations in which the Poisson distribution can be used are given below.
- Quality control statistics
- Biology, to count the number of bacteria
- Number of particles emitted from a radioactive substance
- Insurance – number of casualties
- Waiting time problems – number of incoming calls
Problems on Poisson distribution:
25. A skilled typist makes on average 2 typing mistakes per book of 100 pages. In a randomly
chosen book of 100 pages typed by the same typist, what is the probability that
I) There are no typing mistakes
II) There is at least one typing mistake
III) There are exactly 3 typing mistakes
26. In an express way, the average number of car accidents in a week are 3, on randomly
chosen week of the year, what is the probability that
I) There are no accidents reported
II) There are exactly 2 accidents in that week
27. In a world class manufacturing facility, the average number of occupational hazards
in a year is 2. Find the probability that in a given year of safety systems assessment
there are at most two occupational hazards.
28. On an average 2 flights crashed due to technical problem in this year. Find the
probability that in a given year
I) No crash
II) At least one crash
29. Number of customers demanding for information using RTI act in government office
in a week is 1.5 on an average. Find probability that in given week of the year
I) There is no demand
II) There are 3 demands
30. Accidents occurs on a particular stretch of highway t an average vales of 3 per week,
assuming Poisson probability. Find probability of exactly two accidents in a given
week( 𝑒−3
= 0.04979)
Fitting a Poisson distribution
31. A systematic sample of 200 pages was taken from a manuscript typed by a typist, and
the observed frequency distribution of the typing mistakes per page was found to be as
follows. Fit a Poisson distribution.
Number of typing mistakes 0 1 2 3 4
Number of pages 122 60 15 2 1
32. Calculate Mean and Variance of Poisson variable “X”,
if the probability P(X=4) = P(X=5)
33. For Poisson variable P(X=1) = P(X=2) find mean and P(X=0)
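Fitting a Poisson distribution means estimating λ by the sample mean and then computing the expected frequency N × P(x) for each value of x. The sketch below applies this to the frequency table of problem 31; the variable and function names are illustrative assumptions.

```python
import math

mistakes = [0, 1, 2, 3, 4]        # number of typing mistakes per page
pages = [122, 60, 15, 2, 1]       # observed number of pages

n = sum(pages)                                            # total pages
lam = sum(x * f for x, f in zip(mistakes, pages)) / n     # sample mean

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Expected (theoretical) frequencies under the fitted distribution
expected = [n * poisson_pmf(x, lam) for x in mistakes]

for x, obs, exp_f in zip(mistakes, pages, expected):
    print(x, obs, round(exp_f, 2))
```

Comparing the observed and expected columns side by side shows how closely the Poisson model describes the data.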
Poisson Approximation to Binomial distribution
- If X is a binomial variable with n very large (> 30) and p very small (< 0.1), then X
approximately follows a Poisson distribution with λ = np
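How close the approximation is can be seen by computing both distributions for the same case. This is a minimal sketch, assuming n = 200 and p = 0.02 (the figures of problem 34); the helper names are hypothetical.

```python
import math

n, p = 200, 0.02
lam = n * p  # Poisson parameter, lambda = np

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability of x occurrences with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# P(X < 2) = P(0) + P(1) under each model
exact = sum(binomial_pmf(x, n, p) for x in range(2))
approx = sum(poisson_pmf(x, lam) for x in range(2))

print(round(exact, 4), round(approx, 4))
```

The two results differ only in the third decimal place, which is why the simpler Poisson formula is preferred for large n and small p.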
34. 2% of the electric bulbs manufactured by a company are defective. Find the probability that
in a sample of 200 bulbs
I) Less than 2 bulbs are defective II) More than 3 bulbs are defective
35. On average, one in 400 chips is found to be defective. If chips are packed in boxes of 100,
what is the probability that any given box will contain
I) One defective
II) One or more defectives
III) Less than 2 defectives
Normal Probability Distribution
The normal distribution, also called the Normal Probability Distribution, is the most useful
theoretical distribution for continuous variables. Most of the data relating to economic and
business statistics, and even to the social and physical sciences, conform to this distribution.
Properties of the Normal Distribution
1. The normal curve is “bell shaped” and symmetrical in its appearance
2. The height of the normal curve is at its maximum at the mean. Hence the mean and mode
of the normal distribution coincide. Thus for a Normal Distribution Mean, Median and
Mode are all equal.
3. There is one maximum point of the normal curve, which occurs at the mean
4. Since there is only one maximum point, the normal curve is uni-modal, i.e. it has only one
Mode
5. As distinguished from the Binomial and Poisson distributions, where the variable is discrete, the
variable distributed according to the normal curve is continuous.
6. The first and third quartiles are equidistant from the Median
7. The area under the normal curve is distributed as follows
a. Mean ± 1σ covers 68.27% of the area, with 34.135% of the area lying on either side of the
Mean
b. Mean ± 2σ covers 95.45% of the area
c. Mean ± 3σ covers 99.73% of the area
8. The mean deviation is 4/5ths, or more precisely 0.7979, of the standard deviation
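The areas in property 7 and the ratio in property 8 can be verified with the error function from the Python standard library. This is a small sketch; the function name `area_within` is an assumption made for illustration.

```python
import math

def area_within(k: float) -> float:
    """Area under the normal curve between mean - k*sigma and mean + k*sigma."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k) * 100, 2))   # percentage of area covered

# Ratio of mean deviation to standard deviation for a normal distribution
md_ratio = math.sqrt(2 / math.pi)
print(round(md_ratio, 4))
```

The ratio sqrt(2/π) is where the figure 0.7979 in property 8 comes from.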
Conditions for Normality
- The causal forces must be numerous and of approximately equal weight
- Forces must be the same over the universe from which the observations are drawn. This
is the condition of homogeneity
- Forces affecting events must be independent of one another
- The condition of symmetry must hold
Normal Distribution Graph
Calculation of “Variables” in Normal Distribution
In order to calculate probabilities for a normal variable X, we transform X to the
standard normal variable Z by using
$$Z = \frac{x - \mu}{\sigma}$$
Therefore X ~ N(µ, σ) is transformed to Z ~ N(µ = 0, σ = 1), whose density function is
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^{2}}{2}}$$
The probability values corresponding to different values of Z are made available under normal
tables which are used to get probabilities for normal distribution.
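When a normal table is not at hand, the same probabilities can be computed from the Z transformation. The sketch below assumes mean = 45 and SD = 10 (the figures of problem 36); the function names are hypothetical.

```python
import math

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardise x: Z = (x - mu) / sigma."""
    return (x - mu) / sigma

def std_normal_cdf(z: float) -> float:
    """P(Z <= z) for the standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 45, 10

# P(X < 60): Z = 1.5
p_less_60 = std_normal_cdf(z_score(60, mu, sigma))

# P(35 < X < 60): difference of two cumulative probabilities
p_35_to_60 = p_less_60 - std_normal_cdf(z_score(35, mu, sigma))

print(round(p_less_60, 4), round(p_35_to_60, 4))
```

A normal table gives the area from 0 to Z, so the table values 0.4332 (Z = 1.5) and 0.3413 (Z = 1) add up to the same interval probability.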
Problems on Normal distribution:
36. The marks obtained by students in an examination follows normal distribution with
mean = 45 and SD =10. Calculate the probability that randomly chosen student has
scored
I) Less than 60 Marks
II) Between 60 and 80 Marks
III) Less than 40 Marks
IV) More than 70 Marks
V) Between 35 and 60 marks
37. The average daily sales of 500 branch offices were Rs. 1,50,000 with SD = Rs. 15,000. Assuming
the distribution to be normal, calculate how many branches have sales
I) Above Rs. 1,70,000
II) Between Rs. 1,20,000 and Rs. 1,40,000
III) Between Rs. 1,45,000 and Rs. 1,65,000
38. The distribution of monthly income of 500 workers follows a normal distribution with a mean
of Rs. 2000 and SD of Rs. 200. Estimate the number of workers with income
I) Exceeding Rs. 2300 per month
II) Between Rs. 1800 and Rs. 2300 per month
III) What is the lowest income among the highest-paid 25% of workers?
39. A banking recruitment board conducts qualifying exams for 1000 candidates and the
scores of the candidates follow the normal distribution with a mean of 52 marks and SD
= 6
I) Find the number of candidates scoring between 40 and 55 marks
II) If the recruitment board wishes to recruit only the top 10% of scorers, what is
the cut-off mark?
Questions from Previous Year Question Papers:
1. A Merchant’s file of 20 accounts contains 6 delinquent and 14 non-delinquent accounts.
An auditor randomly selects 5 of these accounts for examination:
A) What is the probability that the auditor finds exactly 2 – delinquent accounts?
B) Find the expected number of delinquent accounts in the sample selected.
2. The mean and standard deviation of the wages of 6000 workers engaged in a factory are
Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate:
a. Percentage of workers getting wages above Rs. 1600
b. Number of workers getting wages between Rs. 1100 and Rs. 1500
The relevant extract of the area table (area under the normal curve from 0 to Z) is given below
Z 0.25 0.5 0.6 0.75 1.00 1.25 1.5
Area 0.0987 0.1915 0.2257 0.2734 0.3413 0.3944 0.4332
3. The mean and standard deviation of the wages of 1000 workers engaged in a factory are
Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate
a. Percentage of workers getting wages above Rs. 1600
b. Number of workers getting wages between Rs. 600 and Rs. 900
4. The probability that a pen manufactured by a company will be defective is 1/10. If 12
such pens are manufactured, using binomial distribution find the probability that.
a. Exactly two will be defective
b. At least 3 will be defective
c. At most 3 will be defective
5. A Project yields an average cash flow of Rs. 500 Lakhs, with a standard deviation of Rs.
60 Lakhs, Calculate the following probabilities.
a. Cash flow will be more than 560 Lakhs
b. Cash Flow will be less than 420 Lakhs
6. The incidence of occupational diseases in an industry is such that a worker has a 20
percent chance of suffering from it. What is the probability that out of six workers, 4 or
more will contract the disease?
7. Suppose a life insurance company insures the lives of 5000 persons aged 42. Studies
show the probability that any 42-year-old person will die in a given year to be 0.001. Find
the probability that the company will have to pay at least two claims during a
given year. What is the probability that the company will have to pay zero claims?
8. In a manufacturing organization with 5000 employees, the mean wage of workers
is Rs. 8000/- per month with a standard deviation of Rs. 2000/-. Assuming a normal
distribution, estimate:
a. Number of workers getting salary below Rs. 6000/-
b. Number of workers getting salary above Rs. 10,000/-
c. Number of workers getting salary between Rs. 7000/- and Rs. 9000/-
9. Mean and standard deviation of wages of 1000 workers engaged in a factory are
Rs. 1200 and 400 respectively. Assuming the distribution to be normal, estimate
a. Percentage of workers getting wages above Rs. 1600
b. Number of workers getting wages between Rs. 600 and Rs. 900
The area under normal curve for different Z are given below
Z 0.5 0.75 1 1.5
Area 0.1915 0.2734 0.3413 0.4332
10. In a factory turning out fan blades, there is a small chance of 0.002 for any blade
to be defective. The blades are supplied in packets of 10. Use the Poisson distribution
to calculate the approximate number of packets containing no defective, one
defective and two defective blades respectively in a consignment of 10,000
packets.
Unit – 4
Time Series Analysis
Unit 4: (12 Hours)
Time Series Analysis: Introduction - Objectives Of Studying Time Series Analysis -
Variations In Time Series - Methods Of Estimating Trend: Freehand Method - Moving
Average Method - Semi-Average Method - Least Square Method. Methods of Estimating
Seasonal Index: Method of Simple Averages - Ratio to Trend Method - Ratio to Moving
Average Method
Introduction:
Forecasting is an important tool in any decision-making process. Forecasting is essential
in making many managerial decisions, such as deciding the raw material required for
production, the investment required for equipment purchase, sales forecasts, human resource
requirement forecasts, etc.
A time series is a set of numerical values of some variable recorded at regular intervals over time. The
series is usually tabulated or graphed in a manner that readily conveys the behaviour of
the variable under study.
Example:
The exports of a cement company from 2007 to 2017
Year Export (Tonnes)
2007 2
2008 3
2009 6
2010 10
2011 8
2012 7
2013 12
2014 14
2015 14
2016 18
2017 19
[Figure: line graph of Export (Tonnes) by year, 2007 to 2017]
The above graph suggests that the series is time dependent. The management of the
company is interested in determining how the series depends on time and in
developing a means of predicting future levels with some degree of reliability.
Objectives of Studying Time Series Analysis
1. The assumption underlying time series analysis is that the time series data
will behave in the future as it did in the past. Time series analysis is used to
detect the pattern underlying the data and to isolate the influencing factors, which in turn
are used to estimate the future accurately. Thus, time series analysis helps us to cope
with the uncertainty about the future.
2. To review and evaluate the progress made against plans that are based on time series
data. For example, the Finance Ministry of the Govt. of India (GOI) reviews the gross
domestic product (GDP) of the economy during the financial year and chalks out
strategies to further the growth.
Variations in Time Series / Components of a Time Series
In a typical time series there are four main components, which seem to be independent
of one another and seem to influence the time series data.
An important step in analysing a time series is to consider the types of data patterns. A
time series can contain some or all of the following elements. They are:
1. Trend (T)
2. Cyclical (C)
3. Seasonal (S)
4. Irregular (I)
1. Trend (T): The trend is the long-term pattern of a time series. A trend can be
positive or negative depending on whether the time series exhibits an increasing
long-term pattern or a decreasing long-term pattern. The rate of trend growth
usually varies over time.
2. Cyclical (C): Time series data may show up and down movement around a given
trend. For example, a business cycle over the years shows an upward trend and touches
its peak, and then it may slump and hit the bottom. The pattern repeats, but
not at regular intervals of time. The duration of a cycle depends on the type of
business or industry.
In brief, an upward and downward oscillation about the trend line, of uncertain
duration and magnitude, is called a cycle.
3. Seasonal (S): It is a special case of the cyclical component of a time series in which the
magnitude and duration of the cycle do not vary but recur at a regular interval
each year. Seasonality occurs when the time series exhibits regular variation
during the same period every year (for example, the same month or the same quarter)
4. Irregular or Random (I): This type of variation is unpredictable. It is
caused by short-term, unanticipated and non-recurring factors. These follow no
specific pattern.
Methods of Estimating Trend:
These are also called the forecasting methods of Time Series Analysis
Some of them are
1. Freehand Method
2. Moving Average Method
3. Semi-average Method
4. Least-Square Method
1. Freehand Method:
It is an easy method of estimating the trend. The first step is to plot the values of the time
series on a graph and then draw a trend line through these points such that the
line reflects the long-term trend of the data. This method does not require any
rigorous mathematical calculations.
Here the forecast can be obtained simply by extending the trend line. A trend line
fitted by the freehand method should conform to the following conditions.
- The trend line should be smooth: a straight line or a mix of long gradual curves
- The sum of the vertical deviations of the observations above the trend line
should be equal to the sum of the vertical deviations of the observations below
the trend line.
- The sum of squares of the vertical deviations of the observations from the
trend line should be as small as possible.
- The trend line should bisect the cycles so that the area above the trend line is
equal to the area below the trend line, not only for the entire series but as
much as possible for each full cycle.
Limitations:
The method involves personal bias, is very subjective and needs judgement.
Example:
Fit a trend line to the following data by using the freehand method
Year 2010 2011 2012 2013 2014 2015 2016 2017
Sales (Lakh) 80 90 92 83 94 99 92 104
2. Moving Average Method:
It is a very simple and flexible method. As the name suggests, in this method we
calculate a series of averages of successive overlapping groups. The number of
values to be included in an average is determined by a constant called the period. The
resulting averages smooth out the fluctuations and extreme values.
Two cases
- When the period is odd
- When the period is even
Example: The sales of a store for 11 years are given below. Find the 3-Year, 5-Year
and 7-Year moving averages
Year 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
Sales (Lakh) 7 13 19 25 31 37 43 49 55 61 67
Example: The sales of a store for 11 years are given below. Find the 4-Year and 6-Year
moving averages
Year 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017
Sales (Lakh) 7 13 19 25 31 37 43 49 55 61 67
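The two cases differ only in how the averages are placed against the years. With an odd period each average falls on the middle year of its group; with an even period it falls between two years, so adjacent averages are averaged again to centre them. The sketch below illustrates both cases on the sales data above; the function names are assumptions.

```python
def moving_average(values, period):
    """Averages of successive overlapping groups of `period` values."""
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

def centred_moving_average(values, period):
    """For an even period: average each pair of adjacent moving averages."""
    ma = moving_average(values, period)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]

years = list(range(2007, 2018))
sales = [7, 13, 19, 25, 31, 37, 43, 49, 55, 61, 67]

ma3 = moving_average(sales, 3)             # first value belongs to 2008
cma4 = centred_moving_average(sales, 4)    # first value belongs to 2009

print(list(zip(years[1:], ma3)))
print(list(zip(years[2:], cma4)))
```

Because this series rises by a constant amount each year, every moving average coincides with the actual value of its middle year.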
3. Semi-average Method:
This method is used to estimate the trend line if a linear function can describe the data
sufficiently. The procedure is as follows
1. Divide the given time series into two segments, leaving out the middle period if the number
of time periods is odd. If the number of time periods is even, divide the time series into
two segments leaving out the two time periods in the middle.
2. Find the average of the values of each segment and plot the two average points on the
graph against the middle time period of each segment.
3. Join the two points to plot the trend line. You can extend the trend line to predict the
value of a future time period.
4. The trend line equation is of the form $\hat{Y} = a + bx$. The intercept “a” and the slope “b” can
be found by using the expressions
$$\text{Slope} = b = \frac{\Delta Y}{\Delta X} = \frac{\text{Difference in the average values of the two segments}}{\text{Difference in the mid-periods of the two segments}}$$
Intercept = a = Average value of the first segment, with x measured from the mid-period of that segment
Example: The sales of a manufacturing firm from 2001 to 2011 are given below. Fit a trend
line by using the method of semi-averages and also estimate the sales for the year 2014.
Year 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
Sales (in Crores) 103 105 114 112 116 120 116 122 126 127 124
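The semi-average procedure can be sketched as follows for the example data above. With 11 years the middle year (2006) is left out, giving two segments of five years each; the variable names and the helper `trend` are illustrative assumptions.

```python
years = list(range(2001, 2012))
sales = [103, 105, 114, 112, 116, 120, 116, 122, 126, 127, 124]

half = len(years) // 2   # 5 periods in each segment; the middle year is dropped

first_years, first_sales = years[:half], sales[:half]      # 2001 to 2005
second_years, second_sales = years[-half:], sales[-half:]  # 2007 to 2011

avg1 = sum(first_sales) / half      # average of the first segment
avg2 = sum(second_sales) / half     # average of the second segment
mid1 = sum(first_years) / half      # mid-period of the first segment
mid2 = sum(second_years) / half     # mid-period of the second segment

slope = (avg2 - avg1) / (mid2 - mid1)

def trend(year):
    """Trend value, with x measured from the mid-period of the first segment."""
    return avg1 + slope * (year - mid1)

print(avg1, avg2, round(slope, 4), round(trend(2014), 2))
```

The estimate for 2014 is simply the trend line extended beyond the observed years, so it assumes the past linear growth continues.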
4. Least-Square Method:
The trend may be linear or curvilinear. Let us consider the values that can be described by
a straight line. These are called linear trends. The general equation for estimating a straight
line is $\hat{Y} = a + bx$
where Ŷ = estimated value of the dependent variable
a = y-intercept, i.e. the value of Ŷ when x = 0
b = slope of the trend line
x = independent variable, i.e. the time
When x is measured as the deviation from the middle of the time period, so that ∑x = 0, the
values of “a” and “b” can be found by
$$a = \frac{\sum y}{n}, \qquad b = \frac{\sum xy}{\sum x^{2}}$$
Example: The production of the firm over the years is given below
a) Fit a straight line trend using the method of least squares
b) Estimate the production figures for the year 2012
c) Use a graph to plot the actual and the estimated production
Year (X) 2005 2006 2007 2008 2009 2010 2011
Production (in ‘000s) 87 91 93 86 96 99 92
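The least squares formulas can be applied to this example as sketched below. The code assumes an odd number of equally spaced years, so that taking the middle year (2008) as origin makes ∑x = 0; the names `origin` and `trend` are hypothetical.

```python
years = [2005, 2006, 2007, 2008, 2009, 2010, 2011]
production = [87, 91, 93, 86, 96, 99, 92]

n = len(years)
origin = years[n // 2]                    # middle year, so deviations sum to zero
x = [year - origin for year in years]     # -3, -2, ..., 3

a = sum(production) / n                                    # a = sum(y) / n
b = sum(xi * yi for xi, yi in zip(x, production)) / sum(xi * xi for xi in x)

def trend(year):
    """Estimated production from the fitted line Y = a + b*x."""
    return a + b * (year - origin)

print(a, round(b, 4), round(trend(2012), 2))
```

Plotting `trend(year)` for each year against the actual figures gives the graph asked for in part (c).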
Methods of Estimating Seasonal Index
- Method of Simple Averages
- Ratio to trend method
- Ratio to moving average method
Method of simple averages
Example: The sales of lathes in the last three years are given below. Use the method of
simple averages to determine the seasonal index for each month.
Month Jan Feb Mar April May June July Aug Sep Oct Nov Dec
2009 16 17 19 19 24 24 21 29 30 34 34 39
2010 22 21 27 26 30 27 21 27 31 36 33 43
2011 28 28 38 39 39 33 33 37 41 50 44 56
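In the method of simple averages, the average for each month across the years is expressed as a percentage of the average of all the monthly averages. The following sketch applies this to the lathe sales above; the variable names are assumptions.

```python
months = ["Jan", "Feb", "Mar", "April", "May", "June",
          "July", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = {
    2009: [16, 17, 19, 19, 24, 24, 21, 29, 30, 34, 34, 39],
    2010: [22, 21, 27, 26, 30, 27, 21, 27, 31, 36, 33, 43],
    2011: [28, 28, 38, 39, 39, 33, 33, 37, 41, 50, 44, 56],
}

# Step 1: average of each month over the three years
monthly_avg = [sum(row[m] for row in sales.values()) / len(sales)
               for m in range(12)]

# Step 2: average of the twelve monthly averages
grand_avg = sum(monthly_avg) / 12

# Step 3: seasonal index = monthly average as a percentage of the grand average
seasonal_index = {months[m]: monthly_avg[m] / grand_avg * 100 for m in range(12)}

for month, index in seasonal_index.items():
    print(month, round(index, 2))
```

The twelve indexes add up to 1200; a value above 100 marks a month with above-average sales.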
Ratio to Trend Method
Example: The quarterly sales of a stationery store (Rs. in thousands) for five years, i.e.
2007 to 2011, are given below. Use the ratio to trend method to determine the seasonal indexes.
Ratio to moving Average method
Example: The quarterly sales for four years from 2008 to 2011 are given below. Use the ratio to
moving average method to determine the seasonal indexes.
Sales (Rs. in thousands) by quarter
Year I II III IV
2008 77 62 56 61
2009 85 64 62 79
2010 91 73 67 86
2011 102 80 74 95
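The ratio to moving average method can be sketched on the quarterly data above as follows: compute centred 4-quarter moving averages, express each actual value as a percentage of its moving average, average these ratios quarter by quarter, and adjust them so that they total 400. The variable names are illustrative assumptions, and the simple mean of the ratios is used here (some texts use the median instead).

```python
sales = {
    2008: [77, 62, 56, 61],
    2009: [85, 64, 62, 79],
    2010: [91, 73, 67, 86],
    2011: [102, 80, 74, 95],
}
series = [value for year in sorted(sales) for value in sales[year]]

# 4-quarter moving averages, then centred by averaging adjacent pairs
ma4 = [sum(series[i:i + 4]) / 4 for i in range(len(series) - 3)]
cma = [(a + b) / 2 for a, b in zip(ma4, ma4[1:])]   # cma[i] aligns with series[i + 2]

# Ratio of actual value to centred moving average, grouped by quarter (0 = Q I)
ratios = {q: [] for q in range(4)}
for i, centred in enumerate(cma):
    t = i + 2
    ratios[t % 4].append(series[t] / centred * 100)

raw_index = [sum(ratios[q]) / len(ratios[q]) for q in range(4)]

# Adjust so that the four indexes add up to 400
correction = 400 / sum(raw_index)
seasonal_index = [value * correction for value in raw_index]

print([round(value, 2) for value in seasonal_index])
```

The first two and the last two quarters have no centred moving average, so each quarter's index here rests on three ratios.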
1. Fit a straight line trend by the semi-average method for the following data: (June
2010)
Year 1994 1995 1996 1997 1998 1999 2000 2001 2002
Sales (in ‘000) 45 50 60 55 60 65 70 80 85
2. Calculate the trend values by the method of moving averages assuming a 4-year
cycle from the following data relating to sugar production in India. Also plot the
actual and trend values on a graph. (June 2010)
Year 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988
Sugar prodn (in Lakh tons) 75 62 76 78 94 84 96 128 116 76 102 168
3. Compute the trend values by finding four-yearly moving averages for the
following time series. Also graph the observed values and the trend values.
Year 1988 1989 1990 1991 1992 1993 1994 1995 1996
Sales (in ‘000) 103 104 107 101 102 104 105 99 100
4. Fit a straight line trend for the following data by the method of least squares.
Estimate the value for 1996.
Year: 1989 1990 1991 1992 1993 1994
Value: 10 8 12 9 11 12
5. What is Time Series? Discuss various components of Time Series.
6. Fit a trend line to the following data by the method of Semi-average (Draw a
Graph)
Year Sales of Firm A (thousand units)
1990 102
1991 105
1992 114
1993 110
1994 108
1995 116
1996 112
7. Below are given the figures of production (in thousand kgs.) of casting in a
factory
Years 1990 1991 1992 1993 1994 1995 1996
Production 12 10 14 11 13 15 16
Questions from Previous Year Question Papers:
1. Below are given the figures of production of a sugar factory. Values are in
thousand quintals
Year 2011 2012 2013 2014 2015 2016 2017
Production 80 90 92 83 94 99 92
Fit a straight line trend and show the trend line on graph. Estimate production in
2020.
2. The data on prices (in Rs. per kg) of a certain commodity during 2007 to
2011 are shown below. Compute the seasonal indexes by the average percentage
method.
Quarter 2007 2008 2009 2010 2011
1 45 48 49 52 60
2 54 56 63 65 70
3 72 63 70 75 84
4 60 56 65 72 66
3. From the following series of annual data, find the trend line by the method of
semi-averages. Also estimate the value for 1999.
Year 1990 1991 1992 1993 1994 1995 1996 1997 1998
Actual Value 170 231 261 267 278 302 299 298 340
4. The sales of a company in millions of rupees for the year 1994 -2001 are given
below
Year 1994 1995 1996 1997 1998 1999 2000 2001
Sales 550 560 555 585 540 525 545 585
a. Find the linear trend equation?
b. Estimate the sales for the year 1993?
c. Find the slope of the straight line trend?
d. Do the figures show a rising trend or a falling trend?
5. Taking deviations of the time variable, compute the trend values for the
following data by the method of least squares:
Days 1 2 3 4 5 6 7
Sales (Rs.) 20 30 40 20 50 60 80
6. With the help of following data, calculate the trend values by the method of least
squares and estimate the sales for the year 2011.
Years 2000 2001 2002 2003 2004 2005 2006
Sales (in Lakhs) 25 27 32 36 44 55 69
7. The following table relates to tourist arrivals (in millions) during 1994 to
2000 in India.
Years 1994 1995 1996 1997 1998 1999 2000
Tourist Arrivals 18 20 23 25 24 28 30
Fit a straight line trend by the method of least squares and estimate the number of
tourists that would arrive in the year 2004.
8. The sales of a company in millions of rupees for the years 1994-2001 are given
below:
Years: 1994 1995 1996 1997 1998 1999 2000 2001
Sales: 550 560 555 585 540 525 545 585
a. Find the linear trend equation
b. Estimate the sales for the year 1993
9. Calculate seasonal indices by the “ratio to moving averages” method from the
following data
Year 1st Quarter 2nd Quarter 3rd Quarter 4th Quarter
2005 68 62 61 63
2006 65 58 66 61
2007 68 63 63 67
10. Gross revenue data (Rs. In Million) for a travel agency for a 10 year period is as
follows
Years: 2000 01 02 03 04 05 06 07 08 09
Revenue: 3 6 10 8 7 12 14 14 18 19
Calculate a 3 year moving average for the revenue earned.
Unit – 5
Part A
Linear Programming
Unit 5: (8 Hours)
Linear Programming: structure, advantages, disadvantages, formulation of LPP, solution
using Graphical method.
Linear Programming
For making decisions in a business environment, model formulation is very important
because it represents the essence of the business decision problem.
Here formulation means the process of converting the verbal description and numerical
data into mathematical expressions, which represent the relevant relationships among
- Decision factors / variables
- Objectives that the firm wants to achieve – Objective function
- Restrictions – on the use of resources
“Linear programming is a particular type of technique used for economic allocation of
scarce and limited resources, such as Labour, Materials, Machine, Time, Warehouse, Space,
Capital, Energy to several competing activities such as Products, Services, Jobs, New
equipment, Projects etc.”
Linear Programming is one of the optimization techniques used to optimise business
variables like Profit, Cost, Sales, and Waste with the available limited resources.
Linear Programming uses mathematical modelling with the help of linear equations.
“Linear programming is one of the optimizing techniques used to maximize profit or
minimize the cost of a given function using linear equations.”
Two words comprise Linear Programming = Linear + Programming
a. Linear – Means a linear relationship among the variables in the model: a change in
one leads to a proportionate change in the other
b. Programming – Modelling or solving a problem mathematically
Structure of Linear Programming Model
General Structure of LP Model:
LP model consist of three components
1. Decision Variable / Courses of Action
2. Objective Function
3. Constraints
1. Decision Variable: To arrive at the optimal value of the objective function, we need to
evaluate the various alternatives, i.e. the various courses of action. If there is no
alternative, there is no need for LP
- These activities are denoted as $x_1, x_2, x_3, \dots, x_n$
- The values of these denote the extent to which each activity is performed
- They are under the control of the decision maker
- They are interrelated in terms of limited resources
- All decision variables are continuous, controllable and non-negative:
$$x_1 \ge 0,\ x_2 \ge 0,\ \dots,\ x_n \ge 0$$
2. Objective Function: Mathematical representation of the objective in terms of a
measurable quantity such as Profit, Cost, Revenue, Distance etc.
LPP aims to achieve the highest profit or lowest cost by utilizing the available
limited resources to the best possible extent
$$\text{Optimize (Minimize or Maximize)}\quad Z = C_1 x_1 + C_2 x_2 + C_3 x_3 + \dots + C_n x_n$$
Z = Measure of performance
$x_1, x_2, x_3, x_4, \dots, x_n$ = Decision variables
$C_1, C_2, C_3, \dots, C_n$ = Coefficients (contribution of one unit of each decision variable to Z)
The optimum value of the given objective function is obtained by the graphical
method or the simplex method.
3. Constraints:
- There are certain limitations (or constraints) on the use of resources,
for example Labour, Machine, Raw Materials, Space, Money etc., that limit the
degree to which the objective can be achieved
- Such constraints must be expressed as linear equalities or inequalities in
terms of the decision variables
- The solution of the LP model must satisfy these constraints
General Mathematical Model of Linear Programming Problem:
The General Linear Programming Problem / Model with “n” decision variables and “m”
constraints can be stated in the following form
Decision variables: $x_1, x_2, x_3, x_4, \dots, x_n$ (one should find these values)
Objective function:
$$Z = C_1 x_1 + C_2 x_2 + C_3 x_3 + \dots + C_n x_n$$
Subject to the linear constraints
$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + \dots + a_{1n} x_n \ &(\le, =, \ge)\ b_1 \\
a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + \dots + a_{2n} x_n \ &(\le, =, \ge)\ b_2 \\
&\ \ \vdots \\
a_{m1} x_1 + a_{m2} x_2 + a_{m3} x_3 + \dots + a_{mn} x_n \ &(\le, =, \ge)\ b_m
\end{aligned}$$
Such that $x_1, x_2, x_3, x_4, \dots, x_n \ge 0$ (non-negativity constraints)
Where
$C_1, C_2, C_3, \dots, C_n$ => Coefficients of profit / cost
$a_{11}, a_{12}, a_{13}, \dots, a_{mn}$ => Technical constants
$b_1, b_2, b_3, \dots, b_m$ => Availability or requirements
Assumptions of Linear Programming
- Certainty
- Divisibility
- Additivity
- Linearity
Certainty: The LP model assumes that all parameters, such as the availability of resources, the profit
or cost contribution of a unit of a decision variable and the consumption of resources by a unit
of a decision variable, are known and constant.
Divisibility (Continuity): The solution values of decision variables and resources are
assumed to take either whole-number (integer) or fractional values.
Additivity: The value of the objective function for the given values of the decision variables,
and the total sum of resources used, must be equal to the sum of the contributions (profit
or cost) earned from each decision variable and the sum of the resources used by each
decision variable.
Example 1: Total profit earned by the sale of two products A and B = Sum
of profit earned separately from A and B
Example 2: Resources consumed by A and B = Sum of resources used for A
and B individually
Linearity / Proportionality: All relationships in the LP model (both the objective function and
the constraints) must be linear
Example: If production of one unit of a product uses 5 hours of a particular
resource, then making 3 units of that product uses 3*5 = 15 hours of that resource
Advantages of Linear Programming
1. Helps in attaining the optimum use of productive resources
2. Improves the quality of decisions, since it is more objective than subjective
3. It provides possible and practical solutions
4. Highlights bottlenecks in the production processes
5. Linear Programming helps in re-evaluating a basic plan for changing
conditions
Limitations of Linear Programming / Disadvantages
1. Treats all relationships among decision variables as linear
2. There is no guarantee of getting an integer-valued solution
3. It doesn’t take into consideration the effect of time and uncertainty
4. Large-scale problems are difficult to solve even with the use of a computer; such a
problem may have to be fragmented into several small problems, each solved
separately
5. Parameters appearing in the model are assumed to be constant, but in real life they
are frequently neither known nor constant.
6. It deals with a single objective, whereas in real-life situations we may come across
conflicting multi-objective problems. In such cases a goal programming model is
used instead of linear programming
Application Areas of Linear Programming
1. Agriculture Application:
- Efficient production patterns can be specified by the Linear Programming
Model under regional land resources and national demand constraints
- Applied in agriculture planning – resource allocation
2. Military Application
- Selecting the number of defence units that should be used in a given attack in order to
provide the required level of protection at the lowest possible cost
3. Production Management
- Product mix – The objective is to maximise the total contribution subject
to all constraints
- Production Planning – To manage the operating cost
- Assembly-Line Balancing – To reduce total elapsed time
- Blending problem – Find the minimum cost blend
- Trim Loss – Minimise trim loss
4. Financial Management
- Portfolio Selection: To find the allocation which maximises the total
expected return or minimises the risk under certain limitations
- Profit Planning: Maximizing the profit margin from investment in
plant facilities and equipment
5. Marketing Management
- Media selection: Maximise the effective exposure subject to limitations of
budget and specified exposure rates to different market segments
- Travelling salesman problem – To find the shortest route
- Physical distribution – Locating the manufacturing plants and
distribution centres
6. Personnel Management
- Staffing Problem: Allocation of optimal resources, i.e. manpower, to a
particular job to reduce the overtime cost
- Determination of equitable salaries
- Job Evaluation and selection – Identifying a suitable person for a
specified job
Other applications of linear programming lie in the areas of administration, education,
fleet utilization, awarding contracts, hospital administration and capital budgeting
Guideline or Steps for Linear Programming Model Formulation:
Steps in Linear Programming (LP) Model formulation
1. Identify Decision Variable (“n” number of Decision variables)
- How many?
- How much quantity of the decision variable?
- Which?
2. Formulating the Objective Function
- Identify whether the objective function is to be maximised or minimised
- Maximize: Profit, Revenue, Margin, Viewers,
- Minimization: Cost, Time, Number of employee problem
- The function of the objective has to be written, like as follows
Z = C1X1 + C2X2 + C2X3……………..CnXn
3. Identify the Problem Data
- Here we need to provide the actual values for the decision variables
identified earlier. For this we need to know the information given in the
problem to determine those values
- These quantities constitute the problem data
4. Formulate the Constraints (“m” number of constraints)
- Express the constraints in terms of requirements and availability of each
resource
- Convert the verbal expression of the constraints imposed by the resource
availability into a linear equality or inequality in terms of the decision variables
defined in step 1
A11X1 + A12X2 + A13X3 + … + A1nXn (≤ / ≥ / =) b1
A21X1 + A22X2 + A23X3 + … + A2nXn (≤ / ≥ / =) b2
.
.
.
Am1X1 + Am2X2 + Am3X3 + … + AmnXn (≤ / ≥ / =) bm
Here the main aim is to translate a real life problem into a mathematical
model
5. Non Negativity Constraints:
X1, X2, X3…….Xn ≥ 0
X1, X2, ………. Xn
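The five steps above can be sketched in code. The following is a minimal illustration, assuming the data of Problem 1 of this unit (profits of Rs. 200 and Rs. 500, three "≤" resource constraints); the names profit, usage, available, objective and is_feasible are hypothetical helpers introduced here, not part of the manual.

```python
# Steps 1-5 of LP model formulation written out as data and two helper functions.
# Decision variables: x[0] = units of product A, x[1] = units of product B.
profit = [200, 500]            # objective coefficients C1, C2 (Rs. per unit)
usage = [[2, 3],               # labour units needed per unit of A, B
         [3, 4],               # power units needed per unit of A, B
         [2, 3]]               # raw material units needed per unit of A, B
available = [200, 500, 1000]   # right-hand sides b1, b2, b3


def objective(x):
    """Z = C1*X1 + C2*X2 + ... + Cn*Xn."""
    return sum(c * xj for c, xj in zip(profit, x))


def is_feasible(x):
    """True if x satisfies non-negativity and every '<=' constraint."""
    if any(xj < 0 for xj in x):
        return False
    return all(sum(a * xj for a, xj in zip(row, x)) <= b
               for row, b in zip(usage, available))


print(is_feasible([10, 20]), objective([10, 20]))
print(is_feasible([100, 10]))
```

The sketch only checks a candidate production plan against the model; it does not search for the optimal plan.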
Problems:
1. Suppose that a company produces 2 products, A and B. Product A earns a profit
of Rs. 200 per unit and Product B earns a profit of Rs. 500 per unit. The company
uses 3 main resources: Labour, Power and Raw Materials.
Product A uses 2 Units of Labour
3 Units of Power
2 Units of Raw Materials
Product B uses 3 Units of Labour
4 Units of Power
3 Units of Raw Materials
For a day's production, the company can use 200 Units of Labour, 500 Units of
Power and 1000 Units of Raw Materials. Formulate LPP.
2. A manufacturer produces 2 types of Model M1 and M2. Each model M1
requires 4 hours of grinding and 2 hours of polishing. Each model M2 requires
2 hours of grinding and 5 hours of polishing. The manufacturer has got 2
grinders and 3 polishers, each grinder works 40 hours a week, and each
polisher works 60 hours a week. The profit on M1 is 30 / Unit and Profit on M2
is 40 / Unit.
Formulate LPP.
3. A company manufactures 2 products Fibre and Viscose. 1 Kg of Fibre fetches a
profit of Rs. 400 and 1 Kg of Viscose gives a profit of Rs. 500. These 2 products
uses 3 basic resources Labour, Power and Raw Materials. To Produce 1 Kg of
Fibre it requires 1 Man days of Labour, 2 units of Power and 2 Units of Raw
Materials. To Produce 1 Kg of Viscose it requires 2 Man days of Labour, 3 Units
of Power and 1 Unit of Raw material is required. The resources are limited in
nature such that company can utilize 100 Man days of Labour, 500 Units of
Power and 800 Units of Raw materials. Formulate LPP
4. A paper mill produces 2 grades of paper namely X and Y. There is a
restriction on availability of raw materials that it cannot produce more than
400 tons of grade X and 300 tonnes of grade Y in a week. It requires 0.2 and
0.4 hours to produce a tonne of products X and Y respectively with
corresponding profits of Rs. 200 and Rs. 400 per tonne.
Formulate the above LPP. There are 160 production hours in a week
5. A person requires 10, 12 and 12 units of chemicals A, B and C respectively for
his garden. A liquid product contains 5, 2 and 1 units of A, B and C
respectively per jar. A dry product contains 1, 2 and 4 units of A, B and C
respectively per carton. If the liquid product is sold for Rs. 3/- per jar and
the dry product is sold for Rs. 2/- per carton, how many units of each product
should be purchased in order to minimise the cost and meet the
requirements? Formulate LPP
6. A Company produces two types of leather belts “A” and “B”. A is a superior
quality than B. The respective profits are Rs. 10 and Rs. 15 per belt. The supply
of raw materials is sufficient for making 850 belts per day. For belt A, a special
type of buckle is required and 500 pieces are available per day. There are 700
buckles available for belt B per day. Belt A needs twice as much time as that
required for belt B and Company can produce 500 belts if all of them were of
type A. Formulate LPP.
7. A company manufactures 2 products A and B. Each unit of B takes twice as long
to produce as one unit of A. If the company were to produce only A, it would have
time to produce 2000 units per day. The availability of raw materials is sufficient
to produce 1500 units per day of A and B combined. Product B uses special
ingredients, so only 600 units of B can be produced per day. If A costs Rs. 20
and B costs Rs. 50, formulate LPP
8. An animal food company must produce 200 KG of mixture consisting of
ingredients X1 and X2 daily. X1 cost Rs. 3 per Kg and X2 cost 8 per Kg. Not more
than 80 Kg of X1 can be used and at least 60 Kg of X2 must be used. Formulate
LP Model to minimise the cost.
9. A firm produces 3 products A, B and C. It uses 2 types of Raw materials I and II
of which 500 and 7500 units respectively are available. The raw materials
requirements per unit of products are given below
Requirements of Products
Raw Materials A B C
I 3 4 5
II 5 3 5
The labour time for each unit of product A is twice that of product B and 3
times that of product C. The entire labour force of the firm can produce the
equivalent of 3000 units.
The minimum demand for the 3 products is 600, 650 and 500 units respectively.
The ratio of the number of units produced must be equal to 2:3:4. Assuming
the profit per unit of A, B and C is 50, 50 and 80 respectively, formulate LPP
10. A retired person want to invest an amount of Rs. 30,000 in a fixed income
securities. His broker recommends investing in 2 bonds. Bond A yields 7 % and
bond B yields 10%. After some consideration he decides to invest at most of
Rs. 12,000 in bond B and At least of Rs. 6000 in bond A. He also wants the
amount invested in bond A to be at least equal to amount invested in bond B.
Formulate LP model to maximize returns on investment
11. A person has inherited Rs. 1,00,000 from his father-in-law that can be
invested in a combination of only 2 stock portfolios, with the maximum
investment allowed in either portfolio set at Rs. 75,000.
The first portfolio has an average return of 10% and the 2nd has 20%. In
terms of the risk factors associated with these portfolios, the first has a risk rating
of 4 (on a scale of 0-10) and the 2nd has 9. Since he wants to maximise the return,
he will not accept an average rate of return below 12% or a risk factor above
6. Formulate LPP
12. A manufacturer employs 3 inputs, man hours, machine hours and cloth
materials, to manufacture 2 types of dresses. Type A dress fetches him a profit
of Rs. 160 per piece, while type B fetches Rs. 180 per piece. The manufacturer has
enough man hours to manufacture 50 pieces of type A and 20 pieces of type B
dresses per day. The machine hours he possesses suffice only for 36 pieces of
type A and 24 pieces of type B. Cloth material available per day is limited but
sufficient for 30 pieces of either type of dress. Formulate LPP
13. A company produces 2 types of hats. Each hat of the first type requires twice
as much labour time as the 2nd type. If all hats are of the 2nd type only, the company
can produce a total of 500 hats per day. The market limits daily sales of
the 1st and 2nd types to 150 and 250 hats. Assuming that the profits per hat are
Rs. 8 for type A and Rs. 15 for type B, formulate LPP.
14. An animal food company must produce 200 Kg of the mixture of ingredients
X1 and X2 daily. X1 cost Rs. 3 / Kg and X2 cost Rs. 8 /Kg. Not more than 80 KG
of X1 can be used and at least 60 Kg of X2 must be used.
Formulate LPP
Solution to LPP using Graphical Method
Once the given business situation is transformed into an LPP, it is solved using
the graphical method as follows
Step 1: Plot the constraint equations on the graph sheet and locate the
region common to all the constraints. It is called the feasible region
Step 2: Identify the extreme points bounding the feasible region
Step 3: If a solution to the given LPP exists, it lies at one of these extreme points.
Substitute the values of each extreme point in the objective function.
The solution to the LPP is the extreme point at which the
value of the objective function is optimised
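The three steps can be mimicked numerically. The sketch below is an illustration under stated assumptions, not the manual's own procedure: it uses the data of Problem 19 of this unit (Max Z = 10X1 + 15X2), rewrites every constraint in "≤" form, intersects each pair of boundary lines, keeps the feasible intersections as extreme points, and evaluates Z at each. The function names extreme_points and z are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, c), meaning a*X1 + b*X2 <= c.
# A ">=" constraint is multiplied by -1 first: X1 - X2 >= -5 becomes -X1 + X2 <= 5.
constraints = [
    (2, 1, 26),
    (2, 4, 56),
    (-1, 1, 5),
    (-1, 0, 0),   # X1 >= 0
    (0, -1, 0),   # X2 >= 0
]


def extreme_points(cons):
    """Step 1 and 2: intersect every pair of boundary lines, keep feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:                       # parallel lines never intersect
            continue
        x = Fraction(c1 * b2 - c2 * b1, det)   # Cramer's rule
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in cons):
            points.add((x, y))
    return points


def z(point):
    """Objective function of Problem 19."""
    return 10 * point[0] + 15 * point[1]


# Step 3: evaluate Z at every extreme point and pick the best one.
best = max(extreme_points(constraints), key=z)
print(best, z(best))
```

For this data the feasible region has the extreme points (0, 0), (13, 0), (8, 10), (6, 11) and (0, 5), and the maximum Z = 230 occurs at X1 = 8, X2 = 10. Exact fractions are used so that points lying on a boundary are not rejected by rounding.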
15. Solve the following LPP using graphical method
Max Z = 8X1 + 16X2
Subject to X1 + X2 ≤ 200
X2 ≤ 125
3X1 + 6X2 ≤ 900
Where X1, X2 ≥ 0
16. Find solution to the following LPP graphically
Max Z = 10X1 + 8X2
Subject to 2X1 + X2 ≤ 20
X1 + 3X2 ≤ 30
X1 − 2X2 ≥ −15
Where X1, X2 ≥ 0
17. Find Maximum and Minimum values for the function
Z = 8X1 + 5X2
Subject to 3X1 − 2X2 ≥ 6
−2X1 + 7X2 ≥ 7
2X1 − 3X2 ≤ 6
Where X1, X2 ≥ 0
18. Solve the following LPP using Graphical Method
Z = 10X1 − 4X2
Subject to 2X1 − 6X2 ≤ 0
X1 − 2X2 ≤ 2
−3X1 − 3X2 ≥ −24
Where X1, X2 ≥ 0
19. Solve the following LPP using Graphical method
Max Z = 10X1 + 15X2
Subject to 2X1 + X2 ≤ 26
2X1 + 4X2 ≤ 56
X1 − X2 ≥ −5
Where X1, X2 ≥ 0
20. Solve the following LPP using graphical method
Max Z = 0.07X + 0.1Y
Subject to X + Y ≤ 30,000
Y ≤ 12,000
X ≥ 6,000
X − Y ≥ 0
Where X, Y ≥ 0
21. Solve Graphically
Max Z = 0.1X + 0.2Y
Subject to X + Y ≤ 1,00,000
Y ≤ 75,000
X ≤ 75,000
−2X + 3Y ≤ 0
−2X + 8Y ≥ 0
Where X, Y ≥ 0
22. Solve Graphically
Max Z = −150X1 − 100X2 + 2,80,000
Subject to 20 ≤ X1 ≤ 60
70 ≤ X2 ≤ 140
120 ≤ X1 + X2 ≤ 140
Where X1, X2 ≥ 0
Questions from Previous Year Question Papers:
1. A firm buys castings of P and Q type of parts and sells them as finished product after
machining, boring and polishing. The purchase cost for casting are Rs. 3 and Rs. 4 each for
parts P and Q and selling prices are Rs. 8 and Rs. 10 respectively. The per hour capacity of
machines used for machining, boring and polishing for two products is given below:
Capacity per hour    Part P    Part Q
Machining            30        50
Boring               30        45
Polishing            45        30
The running costs for machining, boring and polishing are Rs. 30, Rs. 22.50 and Rs. 22.50 per
hour respectively. Formulate LPP to find out the product mix to maximize the profit.
2. Mr. X has Rs. 1,00,000 that can be invested in a combination of only two stock portfolios with
the maximum investment allowed in either portfolio set at Rs. 75,000. The first portfolio has an
average return of 10% whereas the second has 20%. In terms of risk factors associated
with these portfolios, the first has a risk rating of 4 and the second has 9. Since he wants to
maximise his returns, he will not accept an average rate of return below 12% or a risk rating
above 6. How much should he invest in each portfolio? Formulate this as a linear programming
problem and solve it graphically.
3. Solve the following problem by using graphical method:
Minimize Z = 3X1 + 5X2
Subject to −3X1 + 4X2 ≤ 12
2X1 + 3X2 ≥ 12
2X1 − X2 ≥ −2
And X1 ≤ 4; X2 ≥ 2; X1, X2 ≥ 0
4. Solve the following LPP graphically
Maximize Z = 10X1 + 15X2
Subject to 2X1 + X2 ≤ 26
2X1 + 4X2 ≤ 56
X1 − X2 ≥ −5
X1, X2 ≥ 0
5. Solve the following LPP using Graphical method
Minimize Z = 20X1 + 10X2
Subject to the constraints
X1 + 2X2 ≤ 40
3X1 + X2 ≥ 30
4X1 + 3X2 ≥ 60
Such that X1, X2 ≥ 0
Unit – 5
Part A
Transportation Problem
Transportation problem: basic feasible solution using NWCM, LCM and VAM; unbalanced,
restricted and maximization problems.
Transportation Problem
Transportation Problem is a particular case of LPP used to minimise the transportation
cost involved in transporting goods from “m” different origins to “n” different
destinations under the existing supply and demand constraints.
It is to transport various amounts of a single homogeneous commodity that are initially
stored at various origins, to different destinations in such a way that total transportation
cost is minimum.
The cost of transporting one unit of the commodity from each source to each destination
is also known. The commodity is to be transported from various sources to different
destinations in such a way that the requirement of each destination is satisfied and at the
same time the total cost of transportation is minimized.
Example: Pepsi has manufacturing units in 4 cities in Karnataka and distributes to more than
40 distribution centres.
A typical transportation problem contains
• Inputs:
• Sources with availability (Supply)
• Destinations with requirements (Demand)
• Unit cost of transportation from various sources to destinations (Cost)
• Objective:
• To determine schedule of transportation to minimize total transportation cost.
A transportation problem can be stated mathematically as follows:
Let there be ‘m’ SOURCES and ‘n’ DESTINATIONS
Let ai: the availability at the i-th source
bj: the requirement of the j-th destination
cij: the cost of transporting one unit of commodity from the i-th source to the j-th destination
xij: the quantity of the commodity transported from the i-th source to the j-th destination (i = 1, 2, …, m; j = 1, 2, …, n)
Representation of General Transportation Problem
(each cell shows the quantity Xij with its unit cost Cij in brackets)
                      Destinations
Origin / Source       D1          D2          D3          …   Dn          Supply / Availability
S1                    X11 (C11)   X12 (C12)   X13 (C13)   …   X1n (C1n)   a1
S2                    X21 (C21)   X22 (C22)   X23 (C23)   …   X2n (C2n)   a2
S3                    X31 (C31)   X32 (C32)   X33 (C33)   …   X3n (C3n)   a3
…
Sm                    Xm1 (Cm1)   Xm2 (Cm2)   Xm3 (Cm3)   …   Xmn (Cmn)   am
Demand / Requirement  b1          b2          b3          …   bn          Σai = Σbj
The problem is to determine the values of xij such that total cost of transportation is
minimized.
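We assume that the total quantity available is the same as the total requirement, i.e. Σai
= Σbj. Depending on whether this holds, transportation problems are of two types:
• Balanced transportation problems (Σai = Σbj)
• Unbalanced transportation problems (Σai ≠ Σbj)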
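Written out with the symbols defined above, the balanced transportation problem takes the following linear programming form. This is the standard textbook statement, given here as a compact summary rather than as a derivation from the manual.

```latex
\begin{aligned}
\text{Minimise } Z &= \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}\,x_{ij} \\
\text{subject to } \sum_{j=1}^{n} x_{ij} &= a_i, \qquad i = 1, 2, \dots, m \quad \text{(supply at each source)} \\
\sum_{i=1}^{m} x_{ij} &= b_j, \qquad j = 1, 2, \dots, n \quad \text{(demand at each destination)} \\
x_{ij} &\ge 0 \qquad \text{for all } i, j
\end{aligned}
```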
Feasible Solution:
Any set of non-negative allocations which satisfies the row and column sums (Rim
Requirements) is called a feasible solution.
A feasible solution is called a basic feasible solution if the number of positive
allocations is equal to m+n-1, where “m” is the number of rows and “n” is the number of
columns in the transportation table.
Non-Degenerate Basic Feasible Solution:
Any basic feasible solution to a transportation problem containing “m” Origins and “n”
Destinations is said to be non-degenerate if it contains “m+n-1” occupied cells and each
allocation is in an independent position.
The allocations are said to be in independent positions if it is impossible to form a closed
path through the occupied cells.
A closed path is one formed by horizontal and vertical lines whose corner cells are all
occupied.
Degenerate Basic Feasible Solution:
If a basic feasible solution contains fewer than m+n-1 positive allocations, it is said to
be degenerate.
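The definitions above can be checked mechanically. The sketch below is a minimal illustration with made-up 2 x 2 data (not from the manual): the hypothetical helper check_allocation verifies the rim requirements and counts occupied cells against m+n-1. It does not test whether the occupied cells are in independent positions, which would need a separate closed-path search.

```python
def check_allocation(alloc, supply, demand):
    """alloc maps (row, column) -> quantity shipped.

    Returns (feasible, occupied, required), where required = m + n - 1.
    """
    m, n = len(supply), len(demand)
    row_sum, col_sum = [0] * m, [0] * n
    for (i, j), q in alloc.items():
        row_sum[i] += q
        col_sum[j] += q
    feasible = (all(q >= 0 for q in alloc.values())
                and row_sum == list(supply)      # row rim requirements
                and col_sum == list(demand))     # column rim requirements
    occupied = sum(1 for q in alloc.values() if q > 0)
    return feasible, occupied, m + n - 1


# Illustrative data only: 2 sources, 2 destinations.
supply, demand = [20, 30], [10, 40]
alloc = {(0, 0): 10, (0, 1): 10, (1, 1): 30}
feasible, occupied, required = check_allocation(alloc, supply, demand)
print(feasible, occupied, required)
```

If feasible is true and occupied is smaller than required, the solution is degenerate in the sense defined above.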
Solution to Transportation Problem.
Transportation problem can be solved in 3 Phases
I. To find the Initial Basic Feasible Solution (IBFS) using any of the following
methods
A. North West Corner method
B. Least Cost Method
C. Vogel’s Approximation Method
II. Test the IBFS for optimality using the Modified Distribution (MODI) / UV Method
III. Optimise the solution using the Stepping Stone algorithm
Initial Basic Feasible Solution (IBFS) by North West Corner Method
(NWCM):
Step 1: Locate the cell situated at the North West Corner of the given cost matrix and allocate
quantity “Xij” such that it is min (corresponding ai, bj)
Step 2: Again locate the North West cell in the reduced cost matrix and allocate as before
(Step 1)
Step 3: Continue allocating until all supply and demand are exhausted
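The steps above can be sketched as a short routine. This is an illustrative implementation, assuming a balanced problem and using the data of Problem 1 of this unit; the function name north_west_corner is a hypothetical helper.

```python
def north_west_corner(supply, demand):
    """Return {(row, column): quantity} for a balanced transportation problem."""
    supply, demand = list(supply), list(demand)   # work on copies
    i = j = 0
    alloc = {}
    while i < len(supply) and j < len(demand):
        q = min(supply[i], demand[j])             # Step 1: allocate min(ai, bj)
        alloc[(i, j)] = q
        supply[i] -= q
        demand[j] -= q
        # Step 2: move to the north-west cell of the reduced matrix.
        # If a row and a column are exhausted together, both are crossed out,
        # which gives fewer than m+n-1 cells (a degenerate solution).
        row_done, col_done = supply[i] == 0, demand[j] == 0
        if row_done:
            i += 1
        if col_done:
            j += 1
    return alloc


costs = [[5, 3, 1, 2],
         [2, 6, 3, 1],
         [6, 3, 1, 5],
         [6, 1, 2, 3]]
alloc = north_west_corner([30, 20, 40, 10], [20, 35, 25, 20])
total_cost = sum(costs[i][j] * q for (i, j), q in alloc.items())
print(alloc, total_cost)
```

For this data the method fills 7 cells (m+n-1 = 7) at a total cost of 370. The costs play no part in choosing the cells, which is why NWCM usually gives a poorer starting solution than LCM or VAM.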
1: Find the IBFS by North West Corner Method
D1 D2 D3 D4 Supply
O1 5 3 1 2 30
O2 2 6 3 1 20
O3 6 3 1 5 40
O4 6 1 2 3 10
Demand 20 35 25 20 100
Note: If an IBFS or any solution to a Transportation Problem contains m+n-1 allocated cells,
then the solution is said to be non-degenerate (the IBFS can be further improved for
optimization), where m is the number of rows and n is the number of columns of the given TP.
2: Obtain IBFS by North West Corner Method
P Q R S Supply
A 12 10 12 13 500
B 7 11 8 14 300
C 6 16 11 7 200
Demand 180 150 350 320 1000
3: Obtain IBFS by North West Corner Method
P Q R S Supply
A 6 4 1 5 14
B 8 9 2 7 16
C 4 3 6 2 5
Demand 6 10 15 4
4: Obtain IBFS by North West Corner Method
A B C D Supply
P 19 30 50 10 7
Q 70 30 40 60 9
R 40 8 70 20 18
Demand 5 8 7 14
5: Obtain IBFS by North West Corner Method
X Y Z Supply
A 50 40 80 400
B 80 70 40 400
C 60 70 60 500
D 60 60 60 400
E 30 50 40 800
Demand 800 600 1100
6: Obtain IBFS by North West Corner Method
A B C Supply
P 2 7 4 5
Q 3 3 1 8
R 5 4 7 7
S 1 6 2 14
Demand 7 9 18
Initial Basic Feasible Solution (IBFS) by Least Cost Method (LCM)
Step 1: Locate the cell with the lowest cost and allocate to it the minimum of the
corresponding ai and bj, i.e. min (ai, bj)
Step 2: Again locate the next least cost cell in the reduced cost matrix and allocate as
before
Step 3: Continue the process till all the allocations are made
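A minimal sketch of the Least Cost Method follows, assuming a balanced problem and using the cost data of Problem 9 of this unit. Ties between equal costs are broken by row and then column order, which is an assumption of this sketch; the manual does not state a tie-breaking rule. The function name least_cost is hypothetical.

```python
def least_cost(costs, supply, demand):
    """Return {(row, column): quantity} using the Least Cost Method."""
    supply, demand = list(supply), list(demand)   # work on copies
    # Visiting cells in ascending cost order is equivalent to repeatedly picking
    # the least cost cell of the reduced matrix: cells in an exhausted row or
    # column can receive nothing and are skipped.
    cells = sorted((costs[i][j], i, j)
                   for i in range(len(supply))
                   for j in range(len(demand)))
    alloc = {}
    for _, i, j in cells:
        q = min(supply[i], demand[j])
        if q > 0:
            alloc[(i, j)] = q
            supply[i] -= q
            demand[j] -= q
    return alloc


costs = [[19, 30, 50, 10],
         [70, 30, 40, 60],
         [40, 8, 70, 20]]
alloc = least_cost(costs, [7, 9, 18], [5, 8, 7, 14])
total_cost = sum(costs[i][j] * q for (i, j), q in alloc.items())
print(alloc, total_cost)
```

For this data the routine allocates 6 cells (m+n-1 = 6) with a total cost of 814.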
7: Find IBFS by Least Cost Method:
D1 D2 D3 D4 Supply
O1 1 2 3 4 6
O2 4 3 2 0 8
O3 0 2 2 1 10
Demand 4 6 8 6
8: Find IBFS by Least Cost Method:
P Q R S Supply
A 6 4 1 5 14
B 8 9 2 7 16
C 4 3 6 2 5
Demand 6 10 15 4
9: Find IBFS by Least Cost Method:
D1 D2 D3 D4 Supply
F1 19 30 50 10 7
F2 70 30 40 60 9
F3 40 8 70 20 18
Demand 5 8 7 14
10: Find IBFS by Least Cost Method:
D1 D2 D3 Supply
F1 48 60 56 140
F2 45 55 53 260
F3 50 65 60 150
F4 52 64 55 220
Demand 200 320 250
11: Find IBFS by North West Corner Method and Least Cost Method
D1 D2 D3 D4 Supply
F1 19 30 50 10 7
F2 70 30 40 60 9
F3 40 8 7 14 34
Demand 5 8 7 14 34
12: Find the IBFS by LCM Method and NWCM
W1 W2 W3 Supply
F1 48 60 56 140
F2 45 55 53 260
F3 50 65 60 150
F4 52 64 55 220
Demand 200 320 250 770
Initial Basic Feasible Solution by Vogel’s Approximation Method
Step 1: Calculate the penalty for each row and column as the difference
between the least cost and the next least cost in that row or column
Step 2: Find the row or column having the highest penalty. Locate the least cost cell in that
row or column and allocate min (ai, bj) to that cell
Step 3: Again find a fresh set of penalties for the reduced cost matrix and allocate as in
Step 2
Step 4: Continue allocating for the further reduced cost matrix until all allocations are made
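The four steps can be sketched as follows, assuming a balanced problem and using the cost data of Problem 18 of this unit. Two conventions are assumptions of this sketch and are not stated in the manual: when only one cost remains in a row or column its penalty is taken to be that cost, and ties in the highest penalty are broken in favour of the first row or column found. The names vogel and _penalty are hypothetical.

```python
def _penalty(values):
    """Difference between the two smallest costs (the cost itself if only one is left)."""
    ordered = sorted(values)
    return ordered[1] - ordered[0] if len(ordered) > 1 else ordered[0]


def vogel(costs, supply, demand):
    """Return {(row, column): quantity} using Vogel's Approximation Method."""
    supply, demand = list(supply), list(demand)   # work on copies
    rows, cols = list(range(len(supply))), list(range(len(demand)))
    alloc = {}
    while rows and cols:
        # Step 1: penalties for every remaining row and column.
        row_pen = [(_penalty([costs[i][j] for j in cols]), "row", i) for i in rows]
        col_pen = [(_penalty([costs[i][j] for i in rows]), "col", j) for j in cols]
        # Step 2: highest penalty, then the least cost cell in that line.
        _, kind, k = max(row_pen + col_pen, key=lambda t: t[0])
        if kind == "row":
            i, j = k, min(cols, key=lambda c: costs[k][c])
        else:
            i, j = min(rows, key=lambda r: costs[r][k]), k
        q = min(supply[i], demand[j])
        alloc[(i, j)] = q
        supply[i] -= q
        demand[j] -= q
        # Steps 3 and 4: cross out the exhausted line(s) and repeat.
        if supply[i] == 0:
            rows.remove(i)
        if demand[j] == 0:
            cols.remove(j)
    return alloc


costs = [[19, 30, 50, 10],
         [70, 30, 40, 60],
         [40, 8, 70, 20]]
alloc = vogel(costs, [7, 9, 18], [5, 8, 7, 14])
total_cost = sum(costs[i][j] * q for (i, j), q in alloc.items())
print(alloc, total_cost)
```

For this data the sketch gives a total cost of 779, compared with 814 by the Least Cost Method and 1015 by the North West Corner Method on the same table.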
13: Find the IBFS by Vogel’s Approximation Method
W1 W2 W3 Supply
F1 48 60 56 140
F2 45 55 53 260
F3 50 65 60 150
F4 52 64 55 220
Demand 200 320 250 770
14: Find IBFS by VAM
D E F G Supply
A 7 14 8 12 400
B 9 10 12 5 300
C 11 6 11 4 300
Demand 200 250 300 250
15: Find IBFS by VAM
D1 D2 D3 D4 Supply
A 11 13 17 14 250
B 16 18 14 10 300
C 21 24 13 10 400
Demand 200 225 275 250
16: A dairy firm has three plants located in a state. The daily milk production at each plant is as
follows:
Plant 1: 6 Million Litres
Plant 2: 1 Million Litres
Plant 3: 10 Million Litres
Each day, the firm must fulfil the needs of its four distribution centres. The minimum
requirement at each centre is as follows
Distribution Centre 1: 7 Million Litres
Distribution Centre 2: 5 Million Litres
Distribution Centre 3: 3 Million Litres
Distribution Centre 4: 2 Million Litres
The cost, in hundreds of rupees, of shipping one million litres from each plant to each
distribution centre is given in the following table.
D1 D2 D3 D4
P1 2 3 11 7
P2 1 0 6 1
P3 5 8 15 9
17: Find the initial basic feasible solution for the following transportation problem by
VAM
D1 D2 D3 D4 Supply
O1 3 3 4 1 100
O2 4 2 4 2 125
O3 1 5 3 2 75
Demand 120 80 75 25 300
18: Find the initial solution to the following transportation problem using VAM
D1 D2 D3 D4 Supply
S1 19 30 50 10 7
S2 70 30 40 60 9
S3 40 8 70 20 18
Demand 5 8 7 14
19: Determine an initial basic feasible solution to the following TP by using VAM
D1 D2 D3 D4 Supply
S1 21 16 15 3 11
S2 17 18 14 23 13
S3 32 27 18 41 19
Demand 6 10 12 15
20: Determine an initial basic feasible solution to the following TP by using VAM
D1 D2 D3 D4 Supply
S1 1 2 1 4 30
S2 3 3 2 1 50
S3 4 2 5 9 20
Demand 20 40 30 10
21: Determine an initial basic feasible solution to the following TP by using VAM
D1 D2 D3 D4 Supply
O1 6 4 1 5 14
O2 8 9 2 7 16
O3 4 3 6 2 5
Demand 6 10 15 4
21: Determine an initial basic feasible solution to the following TP by using VAM
A B C D E F Available
O1 9 12 9 6 9 10 5
O2 7 3 7 7 5 5 6
O3 6 5 9 11 3 11 2
O4 6 8 11 2 2 10 9
Requirement 4 4 6 2 4 2
Unbalanced Transportation Problem:
A TP is said to be unbalanced if total demand is not equal to total supply (∑ai ≠ ∑bj). Such
a TP is solved by transforming it into a balanced TP by adding a dummy row or column with
the required supply / demand to make it balanced, i.e.
If total supply (∑ai) < total demand (∑bj), a dummy row with supply = (∑bj − ∑ai)
is added.
If total supply (∑ai) > total demand (∑bj), a dummy column with demand = (∑ai −
∑bj) is added.
The TP is then solved using the known procedure
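The balancing rule can be sketched as below, using the data of Problem 22 of this unit. Assigning zero cost to every dummy cell is an assumption of this sketch (it is the usual convention, but the manual does not state the dummy costs); the function name balance is hypothetical.

```python
def balance(costs, supply, demand, dummy_cost=0):
    """Return (costs, supply, demand) with a dummy row or column added if needed."""
    costs = [list(row) for row in costs]          # work on copies
    supply, demand = list(supply), list(demand)
    gap = sum(supply) - sum(demand)
    if gap < 0:
        # Supply falls short: add a dummy row (source) supplying the shortfall.
        costs.append([dummy_cost] * len(demand))
        supply.append(-gap)
    elif gap > 0:
        # Supply is in excess: add a dummy column (destination) absorbing it.
        for row in costs:
            row.append(dummy_cost)
        demand.append(gap)
    return costs, supply, demand


costs = [[7, 14, 8, 12],
         [9, 10, 12, 5],
         [11, 6, 11, 4]]
b_costs, b_supply, b_demand = balance(costs, [400, 300, 300], [200, 450, 300, 250])
print(b_supply, b_demand)
```

Here total supply is 1000 and total demand is 1200, so a dummy row with supply 200 is added. Any quantity later allocated to the dummy row represents demand that remains unmet.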
22: Solve the following TP
D E F G Supply
A 7 14 8 12 400
B 9 10 12 5 300
C 11 6 11 4 300
Demand 200 450 300 250
23: Solve the following TP
D E F G Supply
A 7 14 8 12 400
B 9 10 12 5 300
C 11 6 11 4 300
Demand 200 450 300 250
24: A company is spending Rs. 1200 on transportation of its units from 3 plants to 4
destination centres. The supply and demand of units, with the unit cost of transportation, are as
follows. What can be the maximum saving by optimal scheduling?
1 2 3 4 Supply
P1 20 30 50 17 7
P2 70 35 40 60 10
P3 40 12 60 25 18
Demand 5 8 7 15
25: Problem
• Holiday shipments of iPods to distribution centres
• Production at 3 facilities,
• A, supply 200k
• B, supply 350k
• C, supply 150k
• Distribute to 4 centers,
• N, demand 160k
• S, demand 140k
• E, demand 300k
• W, demand 200k
Total demand ≠ total supply. Obtain initial solution in the following transportation
problem by using VAM method
N S E W
A 16 13 22 17
B 14 13 19 15
C 9 20 23 10
Restricted Transportation Problem:
• Sometimes in a transportation problem some routes may not be available. This
could be due to a variety of reasons like unfavourable weather conditions or a strike
on a particular route etc.
• In such a situation there is a restriction on the routes available for transportation.
• We assign a very large cost, represented by M, to each of the routes which are not
available.
• The effect of adding a large cost element is that such routes are
automatically eliminated from the final solution.
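The Big-M idea can be sketched in a few lines. The example uses the cost table of Problem 26 of this unit with the route from warehouse A to C1 prohibited; the numeric value chosen for M and the function name block_routes are assumptions of this sketch.

```python
BIG_M = 10**6   # illustrative value; M only needs to dwarf every real unit cost


def block_routes(costs, prohibited, big_m=BIG_M):
    """Return a copy of the cost matrix with prohibited (row, column) routes set to M."""
    return [[big_m if (i, j) in prohibited else c
             for j, c in enumerate(row)]
            for i, row in enumerate(costs)]


costs = [[7, 10, 5],     # warehouse A
         [12, 9, 4],     # warehouse B
         [7, 3, 11],     # warehouse C
         [9, 5, 7]]      # warehouse D
restricted = block_routes(costs, prohibited={(0, 0)})   # A -> C1 not allowed
print(restricted[0])
```

The modified matrix is then passed to NWCM, LCM or VAM as usual. Cost-based methods such as LCM and VAM will avoid the M cell unless no other allocation is possible.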
26: The XYZ Tobacco Company purchases tobacco and stores it in warehouses located in the following four
cities
C1 C2 C3 Supply
A 7 10 5
B 12 9 4
C 7 3 11
D 9 5 7
Demand 120 100 110
Because of railroad construction, shipments are temporarily prohibited from warehouse
at city A to company C1. i) Find the IBFS for XYZ tobacco Company
27. Solve the below transportation problem
Factory   Warehouse          Supply
          W1   W2   W3
F1 16 12 200
F2 14 8 18 160
F3 26 16 90
Demand 180 120 150 450
Maximization of Transportation Problem:
If the TP contains a profit matrix with an objective of maximization, it can be solved by
transforming the profit matrix into a cost matrix: every unit profit is subtracted from the
highest unit profit in the given profit matrix, and the resulting matrix is then minimised.
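The transformation can be sketched as follows, using the profit matrix of the problem that follows (warehouses X, Y, Z and markets A to D). The function name profit_to_cost is a hypothetical helper.

```python
def profit_to_cost(profit):
    """Subtract every unit profit from the highest unit profit in the matrix."""
    highest = max(max(row) for row in profit)
    return [[highest - p for p in row] for row in profit]


profit = [[12, 18, 6, 25],
          [8, 7, 10, 18],
          [14, 3, 11, 20]]
cost = profit_to_cost(profit)
for row in cost:
    print(row)
```

The highest unit profit is 25, so the equivalent cost matrix is [13, 7, 19, 0], [17, 18, 15, 7], [11, 22, 14, 5]. After solving the minimisation problem, the total profit is computed from the original profit matrix using the same allocations.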
11. Solve for maximum profit
A B C D Supply
X 12 18 6 25 200
Y 8 7 10 18 500
Z 14 3 11 20 300
Demand 180 320 100 400
Questions from Previous Year Question Papers:
1. Solve the following transportation problem for Maximum profit. Only by initial basic
feasible solution
Per Unit Profit (Rs)
Market
Warehouse A B C D
X 12 18 6 25
Y 8 7 10 18
Z 14 3 11 20
2. Use North West corner method (NWCM) and least cost method (LCM) to find an initial basic
feasible solution to the transportation problem.
D1 D2 D3 D4 Supply
S1 19 30 50 10 7
S2 70 30 40 60 9
S3 40 8 70 20 18
Demand 5 8 7 14 34
3. Solve the following transportation problem for maximum profit
Warehouse Per Unit profit (Rs.) Market
A B C D
X 12 18 6 25
Y 8 7 10 18
Z 14 3 11 20
Availability of Ware Houses Demand in the market
X: 200 Units A: 180 Units
Y: 500 Units B: 320 Units
Z: 300 Units C: 100 Units
D: 400 Units
4. For the following Transportation problem find initial solution using
1) North West Corner method 2) Least cost method
To
From     I    II   III   Supply
A        5    1    7     10
B        6    4    9     80
C        3    2    8     55
Demand   75   20   50
Available at warehouse Demand in the market
X: 200 Units A 180 Units
Y: 500 Units B 320 Units
Z: 300 Units C 100 Units
D 400 Units
5. A company has 3 factories S1, S2 and S3 with production capacities of 7, 9 and 18 units (in 100s)
per week of a product respectively. These units are to be shipped to four warehouses D1, D2,
D3 and D4 with requirements of 5, 8, 7 and 14 units (in 100s) per week respectively. The
transportation costs (in Rupees) per unit between factories and warehouses are given below
D1 D2 D3 D4 Supply
S1 19 30 50 10 7
S2 70 30 40 60 9
S3 40 8 70 20 18
Demand 5 8 7 14
Find initial solution using VAM Method
Unit – 6
Project Management
Syllabus:
Project Management:
Introduction – Basic difference between PERT & CPM – Network components and
precedence relationship – Critical path analysis – Project scheduling – Project Time-cost
trade off – Resource Allocation, basic concept of project crashing.
PERT & CPM
Project Management evolved as a new field with the development of two analytical
techniques for planning, scheduling and controlling of projects. These are the Critical Path
Method (CPM) and the Project Evaluation and Review Technique (PERT)
Application of PERT & CPM Techniques
These methods have been applied to a wide variety of problems in industries and have
found acceptance even in government organizations.
These include
- Construction of dam or Canal system in a region
- Construction of building / highways
- Maintenance of aeroplanes or Oil refinery
- Space flight
- Cost control of a project using PERT / CPM
- Designing a prototype of a machine
- Development of supersonic planes
Basic Definitions:
Activity: Any individual operation which utilizes resources and has a beginning and an
end is called an activity. An arrow is commonly used to represent an activity, with its
head indicating the direction of progress in the project. Activities are usually classified into the
following 4 categories
1. Predecessor Activity
Activities that must be completed immediately prior to the start of another activity are
called predecessor activities
2. Successor Activity
Activities that cannot be started until one or more other activities are completed,
but immediately succeed them, are called successor activities.
3. Concurrent Activity
Activities which can be accomplished concurrently are known as concurrent
activities. It may be noted that an activity can be a predecessor or a successor to an
event, or it may be concurrent with one or more of the other activities.
4. Dummy Activity:
An activity which does not consume any kind of resources but merely depicts the
technological dependence is called a dummy activity.
It may be noted that a dummy activity is inserted in the network to clarify the
activity pattern in the following two situations.
- To make activities with common starting and finishing points distinguishable
- To identify and maintain the proper precedence relationship between activities
that are not connected by events.
Events: An event represents a point in time signifying the completion of some activities
and the beginning of new ones. It is usually represented by a circle “O” in a network,
which is called a node or connector.
Events can be further classified into the following 3 categories
- Merge Event
When more than one activity comes and joins an event, such an event is known as a
merge event
- Burst Event
When more than one activity leaves an event, such an event is known as a burst event
- Merge and Burst Event
An event may be a merge and burst event at the same time, as with respect to some
activities it can be a merge event and with respect to some other activities it may be a
burst event
Basic difference between PERT and CPM
CPM | PERT
• CPM uses an activity-oriented network. | • PERT uses an event-oriented network.
• Durations of activities may be estimated with a fair degree of accuracy. | • Estimates of time for activities are not so accurate and definite.
• It is used extensively in construction projects. | • It is used mostly in research and development projects, particularly projects of non-repetitive nature.
• Deterministic concept is used. | • Probabilistic model concept is used.
• CPM can control both time and cost when planning. | • PERT is basically a tool for planning.
• In CPM, cost optimization is given prime importance. The time for the completion of the project depends upon cost optimization. The cost is not directly proportional to time. Thus, cost is the controlling factor. | • In PERT, it is assumed that cost varies directly with time. Attention is therefore given to minimizing the time so that minimum cost results. Thus in PERT, time is the controlling factor.
Rules to be followed during the construction of network
1. No single activity can be represented more than once in a network. The length of
an arrow has no significance.
2. The event numbered 1 is the start event and the event with the highest number is the
end event. Before an activity can be undertaken, all activities preceding it must
be completed. That is, the activities must follow a logical sequence (or
interrelationship).
3. In assigning numbers to events, there should not be any duplication of event
numbers in a network.
4. Dummy activities must be used only if necessary to reduce the complexity of
a network.
5. A network should have only one start event and one end event.
Some conventions of network diagram are shown in Figures below:
Errors in Construction of Network Diagram
Dummy Activity
A Dummy activity is an imaginary activity. It does not exist in the Project activities. It
is used in the network diagram to show dependency relationship or connectivity
between two or more activities. It is represented by a dotted arrow.
Procedure for drawing a CPM network.
1. Specify the individual activities.
From the Work Breakdown Structure, a listing can be made of all the activities in
the project. This listing can be used as the basis for adding sequence and duration
information in later steps.
2. Determine the sequence of those activities.
Some activities are dependent upon the completion of others. A listing of the
immediate predecessors of each activity is useful for constructing the CPM
network diagram.
3. Draw a network diagram.
Once the activities and their sequencing have been defined, the CPM diagram can
be drawn. CPM originally was developed as an activity on node (AON) network,
but some project planners prefer to specify the activities on the arcs.
4. Estimate the completion time for each activity.
The time required to complete each activity can be estimated using past
experience or the estimates of knowledgeable persons. CPM is a deterministic
model that does not take into account variation in the completion time, so only
one number can be used for an activity’s time estimate.
5. Identify the critical path
The critical path is the longest-duration path through the network. The
significance of the critical path is that the activities that lie on it cannot be delayed
without delaying the project. Because of its impact on the entire project, critical
path analysis is an important aspect of project planning.
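The forward pass behind step 5 can be sketched in a few lines of Python. This is only an illustration and not part of the original manual: the function name `critical_path` and the dictionary layout are assumptions made for this sketch, and the activity data is taken from Problem 1 in the Problems list of this chapter. If two paths tie for the longest duration, the sketch reports only one of them.

```python
# Activity table of Problem 1: name -> (duration in days, immediate predecessors)
PROBLEM_1 = {
    "A": (6, []),
    "B": (4, []),
    "C": (3, ["A"]),
    "D": (8, ["B"]),
    "E": (14, ["C"]),
    "F": (8, ["D"]),
    "G": (9, ["E", "F"]),
}


def critical_path(activities):
    """Return (project duration, critical path) for an activity table."""
    earliest_finish = {}

    def finish(name):
        # Earliest finish = own duration + latest earliest-finish of predecessors.
        if name not in earliest_finish:
            duration, predecessors = activities[name]
            start = max((finish(p) for p in predecessors), default=0)
            earliest_finish[name] = start + duration
        return earliest_finish[name]

    # The activity that finishes last fixes the project duration.
    last = max(activities, key=finish)
    # Walk back through the predecessor that finishes latest at each step.
    path = [last]
    while activities[path[-1]][1]:
        path.append(max(activities[path[-1]][1], key=finish))
    return finish(last), path[::-1]


duration, path = critical_path(PROBLEM_1)
print(duration, "-".join(path))
```

The dictionary keeps the predecessor list next to each duration, which mirrors the Activity / Predecessor Activity / Time layout used in the problem tables.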
Problems:
1. The following table gives the activities in a project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      6
B          -                      4
C          A                      3
D          B                      8
E          C                      14
F          D                      8
G          E, F                   9
2. The following table gives the activities in a project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      6
B          -                      2
C          A                      3
D          A                      4
E          C, B                   3
3. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      6
B          -                      4
C          A                      3
D          B                      8
E          B                      14
F          C, D                   8
4. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      6
B          -                      4
C          A                      3
D          B                      8
E          B, C                   14
F          E, D                   8
5. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      2
B          A                      4
C          A                      3
D          B                      6
E          C, D                   12
F          C, D                   6
G          E                      9
H          G                      3
6. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      2
B          -                      4
C          -                      3
D          A                      6
E          C                      12
F          A                      6
G          D, B, E                9
7. The following table gives the activities in a construction project.
- Draw a network diagram
- Determine critical path and project duration
Activity   Predecessor Activity   Time (Days)
A          -                      3
B          -                      4
C          A                      3
D          A                      6
E          B                      8
G          D, E                   6
H          D, E                   9
I          D, E                   3
J          C, G                   2
K          F, I                   3
8. Draw a network corresponding to the following information
Activity   1 – 2   1 – 3   2 – 6   3 – 4   3 – 5   4 – 6   5 – 6   5 – 7   6 – 7
Duration   4       6       8       7       4       6       5       19      10
A) Draw the network diagram
B) Determine the critical path
9. Draw a network diagram for the following information.
- Obtain the respective time estimates and calculate Total Float (TF), Free Float (FF)
and Independent Float (IF)
- Identify the critical activities
Event    Activity   Duration
1 – 2    A          4
1 – 3    B          6
2 – 6    C          8
3 – 4    D          7
3 – 5    E          4
4 – 6    F          6
5 – 6    G          5
5 – 7    H          17
6 – 7    J          10
Note:
Total Float (TF) = Latest Start Time (LST) – Earliest Start Time (EST) of the activity
Free Float (FF) = Total Float (TF) – Head Event Slack (HES)
Head Event Slack (HES) = Latest occurrence time – Earliest occurrence time of the activity's head (ending) event
Independent Float (IF) = Free Float (FF) – Tail Event Slack (TES); if the result is negative, IF is taken as zero
Tail Event Slack (TES) = Latest occurrence time – Earliest occurrence time of the activity's tail (starting) event
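The float formulas in the note can be checked with a short Python sketch. It is an illustration only and not part of the original manual: the function names `event_times` and `floats` are assumptions, head and tail event slack are taken as the latest minus the earliest occurrence time of the respective event, and the sketch assumes every activity runs from a lower-numbered event to a higher-numbered one. The durations are those of Problem 9.

```python
# Activities of Problem 9 as (tail event, head event): duration
DURATIONS = {
    (1, 2): 4, (1, 3): 6, (2, 6): 8, (3, 4): 7, (3, 5): 4,
    (4, 6): 6, (5, 6): 5, (5, 7): 17, (6, 7): 10,
}


def event_times(durations):
    """Earliest and latest occurrence time of every event."""
    events = sorted({event for arc in durations for event in arc})
    earliest = {event: 0 for event in events}
    # Forward pass: arcs sorted by tail event, so each tail is final when used.
    for (i, j), d in sorted(durations.items()):
        earliest[j] = max(earliest[j], earliest[i] + d)
    finish = max(earliest.values())
    latest = {event: finish for event in events}
    # Backward pass: arcs in reverse order, so each head is final when used.
    for (i, j), d in sorted(durations.items(), reverse=True):
        latest[i] = min(latest[i], latest[j] - d)
    return earliest, latest


def floats(durations):
    """Map each activity to (total float, free float, independent float)."""
    earliest, latest = event_times(durations)
    table = {}
    for (i, j), d in durations.items():
        total = latest[j] - earliest[i] - d                 # LST - EST
        free = total - (latest[j] - earliest[j])            # TF - head event slack
        independent = max(0, free - (latest[i] - earliest[i]))  # FF - tail event slack
        table[(i, j)] = (total, free, independent)
    return table


table = floats(DURATIONS)
critical = [arc for arc, (tf, _, _) in sorted(table.items()) if tf == 0]
for arc, values in sorted(table.items()):
    print(arc, values)
print("Critical activities:", critical)
```

Activities with zero total float form the critical path, which is why the last line filters on `tf == 0`.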
10. A small project consists of 7 activities with the following information
Activity   Preceding Activity   Duration
A          -                    4
B          -                    6
C          -                    8
D          A, B                 7
E          A, B                 4
F          C, D, E              6
G          C, D, E              5
11. The project has the following characteristics
- Construct the network diagram
- Calculate all time estimates, and find the length of the project and the critical path
activities using total float
Events   Activity   Time
1 – 2    A          2
1 – 4    B          2
1 – 7    C          1
2 – 3    D          4
3 – 6    E          1
4 – 5    F          5
4 – 8    G          8
5 – 6    H          4
6 – 9    I          3
7 – 8    J          3
8 – 9    K          5
12. Draw a network diagram from the following information
A < D, E ; B, D < F ; C < G ; B, D < H ; F, G < I
Construct the network diagram, find the critical path (CP) using total float (TF), and also
find the total project length.
Task   A    B   C    D    E    F    G    H   I
Time   23   8   29   16   24   18   19   4   10
13. With the following information calculate Total Float and Project duration
Activity             A B C D E F G H I J
Preceding Activity   - - AB B A C E F D F G H I
Duration             2 3 4 1 5 3 2 7 6 3
14. The following table gives activities of a network with time estimates
- Draw a network diagram
- Calculate time duration of the project
- Calculate the Variance of Critical path
- Find the probability that the project will be completed in 41 days
Events   Estimated Duration
         to (Optimistic Time)   tm (Most Likely Time)   tp (Pessimistic Time)
1 – 2    3                      6                       15
1 – 6    2                      5                       14
2 – 3    6                      12                      30
2 – 4    2                      5                       8
3 – 5    5                      11                      17
4 – 5    3                      6                       15
6 – 7    3                      9                       27
5 – 8    1                      4                       7
7 – 8    4                      19                      28
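The PERT calculation asked for in Problem 14 can be sketched in Python as follows. This is a hedged illustration, not the manual's own method: it assumes the standard PERT formulas, expected time te = (to + 4tm + tp) / 6 and variance = ((tp – to) / 6)^2, which are not restated in this section, and it treats the project length as approximately normal. The names `pert_estimate`, `longest_path` and `normal_cdf` are made up for this sketch.

```python
import math

# Problem 14: (tail event, head event) -> (optimistic, most likely, pessimistic)
ESTIMATES = {
    (1, 2): (3, 6, 15), (1, 6): (2, 5, 14), (2, 3): (6, 12, 30),
    (2, 4): (2, 5, 8), (3, 5): (5, 11, 17), (4, 5): (3, 6, 15),
    (6, 7): (3, 9, 27), (5, 8): (1, 4, 7), (7, 8): (4, 19, 28),
}


def pert_estimate(to, tm, tp):
    """Expected time and variance of one activity (standard PERT formulas)."""
    return (to + 4 * tm + tp) / 6, ((tp - to) / 6) ** 2


def longest_path(estimates, start, end):
    """Return (expected length, variance, events) of the longest start-to-end path."""
    best = {end: (0.0, 0.0, [end])}

    def visit(event):
        if event not in best:
            options = []
            for (i, j), times in estimates.items():
                if i == event:
                    te, var = pert_estimate(*times)
                    length, variance, path = visit(j)
                    options.append((te + length, var + variance, [event] + path))
            # Keep the outgoing route with the largest expected length.
            best[event] = max(options, key=lambda option: option[0])
        return best[event]

    return visit(start)


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


length, variance, path = longest_path(ESTIMATES, 1, 8)
z = (41 - length) / math.sqrt(variance)
print(path, length, variance, round(normal_cdf(z), 4))
```

Only the variances of the activities on the critical path are added, which is the usual PERT convention for the project variance.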
Questions from Previous Year Question Papers:
1. Given the following information on a small project: A is the first activity of the project and
precedes activities B and C. Activity D succeeds both B and C, whereas only C is required to
start activity E. D precedes F, while G succeeds E. H is the last activity of the project and succeeds
F and G. Draw a network diagram based on this information.
2. Draw a network diagram corresponding to the following information:
Activity   1-2   1-3   2-6   3-4   3-5   4-6   5-6   5-7   6-7
Duration   4     6     8     7     4     6     5     19    10
A) Draw a network diagram
B) Obtain early and late start time and completion times
C) Determine the critical path.
3. A small project is composed of 7 activities whose time estimates are listed in the table below in
weeks.
Activity   Optimistic Time   Pessimistic Time   Most Likely Time
1-2        1                 7                  1
1-3        1                 7                  4
1-4        2                 8                  2
2-5        1                 1                  1
3-5        2                 14                 5
4-6        2                 8                  5
5-6        3                 15                 6
a. Draw the network and find the expected project length.
b. What is the probability that the project will be completed at least 4 weeks earlier than
the expected time?
4. Tasks A, B, C, ..., H, I constitute a project. The precedence relationships are:
A<D; A<E; B<F; D<F; C<G; C<H; F<I; G<I.
Draw a network diagram to represent the project and find the critical path when the time in days of
each task is:
Task   A   B    C   D    E    F    G    H    I
Time   8   10   8   10   16   17   18   14   9
Identify the critical path with the help of EST, EFT, LST and LFT.
5. A project consists of nine activities whose time estimates (in weeks) and other
characteristics are given below.
Activity   Preceding Activity/ies   Time Estimates (Weeks)
                                    Most Optimistic   Most Likely   Most Pessimistic
A          -                        2                 4             6
B          -                        6                 6             6
C          -                        6                 12            24
D          A                        2                 5             8
E          A                        11                14            23
F          B, D                     8                 10            12
G          B, D                     3                 6             9
H          C, F                     9                 15            27
I          E                        4                 10            16
A) Show the PERT network for the project.
B) Identify the critical activities and find the expected project completion time and its
variance.
C) If the project is required to be completed by December 31 of a given year and the
manager wants to be 95% sure of meeting the deadline, when should he start the
project work? Given P(0 < Z < 1.645) = 0.45.
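Part C reduces to a one-line formula once the critical path is known: since P(0 < Z < 1.645) = 0.45 means P(Z < 1.645) = 0.95, the time to allow is the expected length plus 1.645 standard deviations. The Python sketch below is an illustration only. It assumes the standard PERT formulas for expected time and variance, and it assumes that the longest path by expected time is A-D-F-H, which should be confirmed from the network drawn in part A. The function names are made up for this sketch.

```python
import math

# (most optimistic, most likely, most pessimistic) in weeks for the activities
# on the assumed critical path A-D-F-H
PATH_ESTIMATES = {
    "A": (2, 4, 6),
    "D": (2, 5, 8),
    "F": (8, 10, 12),
    "H": (9, 15, 27),
}


def path_mean_and_sd(estimates):
    """Expected length and standard deviation of a path under PERT assumptions."""
    mean = sum((to + 4 * tm + tp) / 6 for to, tm, tp in estimates.values())
    variance = sum(((tp - to) / 6) ** 2 for to, tm, tp in estimates.values())
    return mean, math.sqrt(variance)


def weeks_to_allow(mean, sd, z=1.645):
    """Duration that is met with the probability corresponding to z."""
    return mean + z * sd


mean, sd = path_mean_and_sd(PATH_ESTIMATES)
print(round(mean, 2), round(sd, 2), round(weeks_to_allow(mean, sd), 2))
```

Counting that many weeks back from December 31 gives the latest start date that meets the 95% requirement under these assumptions.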
6. The following table gives the activities in a construction project
Activity   Immediate Predecessor   Time (Days)
A          -                       4
B          -                       6
C          -                       2
D          A                       5
E          C                       2
F          A                       7
G          D, B, E                 4
7. A small project is composed of 7 activities whose time estimates are given below
Activity   Time Estimates (Weeks)
           Most Optimistic   Most Likely   Most Pessimistic
1-2        1                 1             7
1-3        1                 4             7
1-4        2                 2             8
2-5        1                 1             1
3-5        2                 5             14
4-6        2                 5             8
5-6        3                 6             15
a) Draw the project network diagram.
b) Find the expected duration and variance of each activity. What is the expected length
and project standard deviation?
c) Calculate the probability of completing the project by 13 days
8. Draw a network corresponding to the following information
a) Draw the network
b) Obtain early and late start time and completion times
c) Determine the critical path
d) Determine the total float
Activity   1-2   1-3   2-6   3-4   3-5   4-6   5-6   5-7   6-7
Duration   4     6     8     7     4     6     5     19    10

Celine George
 
Myopathies (muscle disorders) for undergraduate
Myopathies (muscle disorders) for undergraduateMyopathies (muscle disorders) for undergraduate
Myopathies (muscle disorders) for undergraduate
Mohamed Rizk Khodair
 
How to Create A Todo List In Todo of Odoo 18
How to Create A Todo List In Todo of Odoo 18How to Create A Todo List In Todo of Odoo 18
How to Create A Todo List In Todo of Odoo 18
Celine George
 
dynastic art of the Pallava dynasty south India
dynastic art of the Pallava dynasty south Indiadynastic art of the Pallava dynasty south India
dynastic art of the Pallava dynasty south India
PrachiSontakke5
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
Junction Field Effect Transistors (JFET)
Junction Field Effect Transistors (JFET)Junction Field Effect Transistors (JFET)
Junction Field Effect Transistors (JFET)
GS Virdi
 
Redesigning Education as a Cognitive Ecosystem: Practical Insights into Emerg...
Redesigning Education as a Cognitive Ecosystem: Practical Insights into Emerg...Redesigning Education as a Cognitive Ecosystem: Practical Insights into Emerg...
Redesigning Education as a Cognitive Ecosystem: Practical Insights into Emerg...
Leonel Morgado
 
How to Configure Scheduled Actions in odoo 18
How to Configure Scheduled Actions in odoo 18How to Configure Scheduled Actions in odoo 18
How to Configure Scheduled Actions in odoo 18
Celine George
 
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast BrooklynBridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
i4jd41bk
 
Computer crime and Legal issues Computer crime and Legal issues
Computer crime and Legal issues Computer crime and Legal issuesComputer crime and Legal issues Computer crime and Legal issues
Computer crime and Legal issues Computer crime and Legal issues
Abhijit Bodhe
 
Link your Lead Opportunities into Spreadsheet using odoo CRM
Link your Lead Opportunities into Spreadsheet using odoo CRMLink your Lead Opportunities into Spreadsheet using odoo CRM
Link your Lead Opportunities into Spreadsheet using odoo CRM
Celine George
 
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE  BY sweety Tamanna Mahapatra MSc PediatricAPGAR SCORE  BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
SweetytamannaMohapat
 
Ancient Stone Sculptures of India: As a Source of Indian History
Ancient Stone Sculptures of India: As a Source of Indian HistoryAncient Stone Sculptures of India: As a Source of Indian History
Ancient Stone Sculptures of India: As a Source of Indian History
Virag Sontakke
 
Grade 3 - English - Printable Worksheet (PDF Format)
Grade 3 - English - Printable Worksheet  (PDF Format)Grade 3 - English - Printable Worksheet  (PDF Format)
Grade 3 - English - Printable Worksheet (PDF Format)
Sritoma Majumder
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and GuestsLDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDM Mia eStudios
 

Business statistics and analytics

  • 1. BIET – MBA Programme, Davangere 1 Prof. Vijay K S Business Statistics and Analytics Business Statistics and Analytics -: Working Manual:- Name of the student: Section: USN Number:
  • 2. Course Objectives:
1. To help students learn the applications of statistical tools and techniques in decision making.
2. To emphasize the need for statistics and decision models in solving business problems.
3. To enhance knowledge of descriptive and inferential statistics.
4. To familiarize students with the analytical package MS Excel.
5. To develop students' analytical skills so that they can comprehend and practice data analysis at different levels.
Syllabus
Unit 1: (12 Hours) Introduction to Statistics: Meaning and Definition, functions, scope and limitations, Collection and presentation of data, frequency distribution, measures of central tendency - Mean, Median, Mode, Geometric mean, Harmonic mean. Measures of dispersion: Range – Quartile Deviation – Mean Deviation – Standard Deviation – Variance – Coefficient of Variation – Comparison of various measures of Dispersion
Unit 2: (8 Hours) Correlation and Regression: Scatter Diagram, Karl Pearson correlation, Spearman’s Rank correlation (one way table only), simple and multiple regression (problems on simple regression only)
Unit 3: (6 Hours) Probability Distribution: Concept and definition – Rules of probability – Random variables – Concept of probability distribution – Theoretical probability distributions: Binomial, Poisson, Normal and Exponential – Bayes' theorem (No derivation) (Problems only on Binomial, Poisson and Normal).
Unit 4: (10 Hours) Time Series Analysis: Introduction – Objectives of Studying Time Series Analysis – Variations in Time Series – Methods of Estimating Trend: Freehand Method – Moving Average Method – Semi-Average Method – Least Square Method. Methods of Estimating Seasonal Index: Method of Simple Averages – Ratio to Trend Method – Ratio to Moving Average Method
Unit 5: (8 Hours) Linear Programming: structure, advantages, disadvantages, formulation of LPP, solution using Graphical method. Transportation problem: basic feasible solution using NWCM, LCM and VAM; unbalanced, restricted and maximization problems.
Unit 6: (8 Hours) Project Management: Introduction – Basic difference between PERT & CPM – Network components and precedence relationships – Critical path analysis – Project scheduling – Project time-cost trade off – Resource allocation, Concept of project crashing.
  • 3. Uncheck the following assumptions:
- Business Statistics is all about numbers only
- People who are good at maths can do well in Business Statistics
- Analytics and Business Statistics are difficult subjects to pass
- Advanced mathematical ability is required to learn Business Statistics
PRACTICAL COMPONENT: (Student-Centered Learning)
- Students are expected to have attended basic Excel classes
- Students should be able to relate the concepts to application scenarios in their profession
- Students should demonstrate the application of the techniques covered in this course
COURSE OUTCOMES:
- Facilitate objective solutions in business decision making under subjective conditions
- Demonstrate different statistical techniques in business/real-life situations
- Understand the importance of probability in decision making
- Understand the need for and application of analytics
- Understand and apply various data analysis functions to business problems
  • 4. Unit – 1: Introduction to Statistics, Measures of Central Tendency & Measures of Dispersion
  • 5. Unit 1: (12 Hours) Introduction to Statistics: Meaning and Definition, functions, scope and limitations, Collection and presentation of data, frequency distribution. Measures of central tendency - Mean, Median, Mode, Geometric mean, Harmonic mean. Measures of dispersion: Range – Quartile Deviation – Mean Deviation – Standard Deviation – Variance, Coefficient of Variation – Comparison of various measures of Dispersion
Statistics: Meaning and Definition:
In its simplest sense, statistics means facts expressed in numbers.
Example: The average score in maths is 45
Definition:
- “The collection, representation, analysis and interpretation of numerical data.”
- The term statistics means either a numerical statement or statistical methodology. When used in the sense of statistical data, it refers to quantitative aspects of things and is a numerical description.
- The art and science of collecting, analysing, presenting and interpreting data.
Characteristics of Statistics: By statistics we mean
- Aggregates of facts
- Affected to a marked extent by a multiplicity of causes
- Enumerated and expressed in terms of numbers
- Collected with a reasonable standard of accuracy
- Collected and placed in relation to each other
Statistical Methods: It is a science which deals with the methods of collecting, classifying, presenting, comparing and interpreting numerical data collected to throw some light on any sphere of enquiry.
Types of Statistical Methods:
Descriptive statistics: It consists of procedures used to summarize and describe the characteristics of a set of data.
Inferential statistics: It consists of procedures used to make inferences about population characteristics on the basis of sample results.
  • 6. Some important terminologies:
- Data: A collection of observations of one or more variables of interest
- Population: A collection of all elements (units or variables) of interest
- Sample: A subset of the population
- Variable: A characteristic, number, or quantity that increases or decreases over time, or takes different values in different situations
Functions of Statistics:
- To collect and present facts in a systematic manner
- To help in the formulation and testing of hypotheses
- To facilitate the comparison of data
- To help in predicting future trends
- To help find the relationship between variables
- To simplify a mass of complex data
- To help formulate policies
Scope and Importance of Statistics:
1. Statistics and planning: Statistics is indispensable to planning in the modern age, which is termed “the age of planning”. Governments almost all over the world are resorting to planning for economic development.
2. Statistics and economics: Statistical data and techniques of statistical analysis have proved immensely useful in solving economic problems such as wages, prices, time series analysis and demand analysis.
3. Statistics and business: Statistics is an indispensable tool of production control. Business executives rely more and more on statistical techniques for studying the needs and desires of their valued customers.
4. Statistics and industry: In industry, statistics is widely used in quality control. In production engineering, statistical tools such as inspection plans and control charts are used to find out whether the product conforms to the specifications.
5. Statistics and mathematics: Statistics and mathematics are intimately related; recent advancements in statistical techniques are the outcome of wide applications of mathematics.
6. Statistics and modern science: In medical science, statistical tools for the collection, presentation and analysis of observed facts relating to the causes and incidence of diseases, and to the results of applying various drugs and medicines, are of great importance.
7. Statistics, psychology and education: In education and psychology, statistics has found wide application, for example in determining the reliability and validity of a test, factor analysis, etc.
  • 7. 8. Statistics and war: In war, the theory of decision functions can be of great assistance to military personnel in planning “maximum destruction with minimum effort.”
Statistics in business and management:
1. Marketing: Statistical analysis is frequently used to provide information for marketing decisions. It is necessary first to find out what can be sold and then to evolve a suitable strategy so that the goods reach the ultimate consumer. A skilful analysis of data on production, purchasing power, manpower, habits of competitors, habits of consumers and transportation costs should be considered before any attempt to establish a new market.
2. Production: In the field of production, statistical data and methods play a very important role. Decisions about what to produce, how to produce, when to produce and for whom to produce are based largely on statistical analysis.
3. Finance: Financial organizations depend very heavily on statistical analysis of facts and figures to discharge their finance function effectively.
4. Banking: Banking institutions have found it increasingly necessary to establish research departments within their organizations for gathering and analysing information, not only regarding their own business but also regarding the general economic situation and every segment of business in which they may have an interest.
5. Investment: Statistics greatly assists investors in making clear and sound judgments in their investment decisions, in selecting securities which are safe and have the best prospects of yielding a good income.
6. Purchase: The purchase department makes use of statistical data to frame suitable purchase policies, such as what to buy, what quantity to buy, when to buy, where to buy and from whom to buy.
7. Accounting: Statistical data are also employed in accounting; particularly in the auditing function, the techniques of sampling and estimation are frequently used.
8. Control: The management control process combines statistical and accounting methods in making the overall budget for the coming year, including sales, materials, labour and other costs, net profits and capital requirements.
Limitations of Statistics:
- Does not study qualitative phenomena
- Does not deal with individual items
- Statistical results are true only on an average
- Statistical data should be uniform and homogeneous
- Statistical results depend on the accuracy of the data
- Statistical conclusions are not universally true
- Statistical results can be interpreted only by a person with sound knowledge of statistics
  • 8. Collection and presentation of data:
Statistical data: A set of information collected from a sample in order to draw a general conclusion about the population.
Statistical data may be classified as
- Primary Data – Collected for the first time by the researcher
  o Sources: Interview, observation, indirect or oral investigation, information from local agents and correspondents, mail questionnaires, enumeration
- Secondary Data – Already collected data
  o Sources: Published statistics, publications of research institutes, publications of business and financial institutes, newspapers and periodicals, reports of various committees and commissions, unpublished statistics
Presentation of Data: Arranging things or data in groups or classes according to their resemblances and affinities, giving expression to the unity of attributes that may subsist among a diversity of individuals.
Some important classifications
o Geographical (on the basis of area or region)
  Example: Sales of the company
  Region   Sales
  North    450
  South    310
  East     281
  West     114
o Chronological (on the basis of history, i.e. with respect to time)
  Example: Sales reported by the departmental stores
  Month    Sales (in lakhs)
  Jan      45
  Feb      31
  March    28
  April    11
o Qualitative (on the basis of character / attributes)
  - Simple classification: The classification is done into two classes
  - Manifold classification: The classification is based on more than one attribute at a time
o Numerical, quantitative (on the basis of magnitude)
  Marks    No. of Students
  0-10     45
  10-20    31
  20-30    28
  30-40    11
  • 9. Frequency distribution: A frequency distribution is a statistical table which shows the set of all distinct values of the variable arranged in order of magnitude, either individually or in groups, with their corresponding frequencies.
Classification of Frequency distribution:
- Series of individual observations: Items are listed one after the other
  Roll No.   Marks Obtained
  1          83
  2          53
  3          72
  4          61
- Discrete (ungrouped) frequency distribution: Variates differ from each other by a definite amount
  No. of Kids   Families
  1             13
  2             53
  3             12
  4             14
- Continuous frequency distribution (grouped frequency distribution): Measurements are only approximations and are expressed in terms of intervals with certain limits
  Marks    Students
  0-5      1
  5-10     13
  10-15    8
  15-20    5
Some technical terms in formulating the frequency distribution
o Class Limits: The smallest and largest values in the class
o Class Interval: The difference between the upper and lower limits of a class
Methods of Forming Class Intervals:
o Exclusive Method (overlapping: the upper limit of one class is the lower limit of the next)
  Marks    Students
  0-5      1
  5-10     13
o Inclusive Method (non-overlapping)
  Marks    Students
  0-4      1
  5-9      13
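The exclusive method of forming class intervals can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the marks used are hypothetical and not part of the manual, and the sketch assumes that a value equal to a class boundary goes into the class where it is the lower limit.

```python
from collections import Counter


def frequency_table(values, width, start=0):
    """Tally values into exclusive classes [lower, lower + width)."""
    # Integer division maps each value to the index of its class.
    counts = Counter((v - start) // width for v in values)
    return [(start + k * width, start + (k + 1) * width, counts[k])
            for k in sorted(counts)]


# Hypothetical marks of 12 students (illustrative, not from the manual).
marks = [3, 7, 8, 12, 5, 14, 9, 17, 10, 6, 11, 19]

for lower, upper, f in frequency_table(marks, width=5):
    print(f"{lower}-{upper}: {f}")
```

Note that a mark of 5 is counted in the class 5-10, not in 0-5, and that classes with no observations are simply left out of the table.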
  • 10. Presenting Data: Some diagrammatic representations of data
o One-dimensional diagrams (line and bar)
o Two-dimensional diagrams (rectangle, square, pie)
o Three-dimensional diagrams (cube, sphere, cylinder)
o Pictogram
o Cartogram
Measures of Central Tendency
Central Tendency:
- It is also termed the average
- Averages are sometimes referred to as measures of location
- Central tendency is the middle point of a distribution
- It is used to describe the inherent (essential) characteristics of a frequency distribution
- An average condenses a huge, unwieldy (awkward or heavy) set of numerical data into a single numerical value which is representative of the entire distribution
- It gives us a bird’s eye view of the huge mass of numerical data
- Averages are typical values around which the other items of the distribution congregate
- These values lie between the two extreme observations of the distribution and give us an idea about the concentration of the values in the central part of the distribution
- Central tendency is very useful in
  o Describing the distribution in a concise manner
  o Comparative study of different distributions
  o Computing other measures such as dispersion
“The tendency (behaviour) of numerical data to move towards a central value, such as the Arithmetic Mean, Median, Mode, Geometric Mean or Harmonic Mean, is called Central Tendency”
Example: The average score of a class in a particular subject is 65
- It implies that each student's score has contributed to the average of 65, and thus each score is understood to move towards 65
  • 11. Central Tendency / Central Location of the following curves
[Figure: curves A, B and C]
- Curves A and C have the same central tendency (CT)
- The central tendency of curve B lies to the right of that of curves A and C
Dispersion: Dispersion is the spread of the data in a distribution, i.e. the extent to which the observations are scattered
Dispersion of the following curves
[Figure: curves A and B]
Here curve B has a wider spread (dispersion) than curve A
Various Measures of Central Tendency
I. Mean (Arithmetic Mean / Simple Mean, denoted AM or $\bar{X}$)
II. Median (denoted Md, also called the positional average)
III. Mode (denoted Z or Mo)
IV. Geometric Mean (GM)
V. Harmonic Mean (HM)
Depending on the seriousness of the data analysis, we choose between the different measures of central tendency listed above
  • 12. I. MEAN (ARITHMETIC MEAN / SIMPLE MEAN)
- Most often it is referred to simply as the average of some given values
- The arithmetic mean of a given set of observations is their sum divided by the number of observations
- Mean = Sum of all values / Total number of values
Examples:
- Average winter temperature of New York City
- Average corn yield per acre of land
Note: The data can be in any one of the following forms (grouped and ungrouped data)
1. Raw data: Example: The heights of 6 plants in the garden are 6, 5, 3, 4, 2, 7
2. Discrete frequency distribution: A discrete variable is one whose set of possible values is finite. Discrete variables are frequently counting variables, like the number of cars owned, the number of kids in a family, etc.
  Example:
  X       f
  0       3
  1       12
  2       18
  • 13.
  3       7
  Total   40
  Where X = Number of children, f = Number of families
3. Continuous frequency distribution
  a. Mutually Exclusive Class Intervals
  b. Mutually Inclusive Class Intervals
  c. Open Ended Class Intervals
a. Mutually Exclusive Class Intervals: Here the lower limit is included in and the upper limit is excluded from the class interval.
  CI        f
  50 – 60   5
  60 – 70   16
  70 – 80   19
  80 – 90   10
  Total     50
  CI = Class Interval
b. Mutually Inclusive Class Intervals: Here both the upper and lower limits are included in the class interval.
  CI        f
  50-59     4
  60-69     17
  70-79     20
  80-89     8
  90-99     1
  Total     50
c. Open Ended Class: The initial or final class interval is indeterminate at its end.
  CI         f
  Below 60   5
  60 – 70    16
  70 – 80    19
  80 – 90    30
             30
Some Points Regarding the Class Intervals
- To calculate the median or mode, mutually inclusive class intervals must be converted to mutually exclusive class intervals.
- To calculate the Arithmetic Mean (AM), Geometric Mean (GM) and Harmonic Mean (HM), conversion from mutually inclusive to exclusive is not necessary.
- For all calculations, open-ended class intervals must be converted into mutually exclusive class intervals.
  • 14. Calculating the mean from ungrouped data (raw data or individual observations)
Ungrouped Data or Raw Data:
- Here the sample size is small
- We add all the observations to calculate the mean
- This is not practical if there are, say, 5000 observations, i.e. a large number of observations
In general, if X1, X2, X3, ..., Xn are “n” given observations, then their Arithmetic Mean, usually denoted $\bar{X}$, is given by
$\bar{X} = \frac{\sum X}{n}$
Where $\sum X$ is the sum of all the observations and $n$ is the number of observations.
Example: The Arithmetic Mean of 5, 8, 10, 15, 24 and 28 is
$\bar{X} = \frac{5 + 8 + 10 + 15 + 24 + 28}{6} = \frac{90}{6} = 15$
Calculating the mean from grouped data:
- This is used when the number of observations is large and the mean is difficult to compute directly
- Here we have access only to the frequency distribution of the data, not to every individual observation
  o Discrete frequency distribution
  o Continuous frequency distribution
- A frequency distribution consists of data that are grouped by classes. Each observation falls into one of the classes.
- To find the arithmetic mean of a continuous frequency distribution, we first calculate the midpoint of each class
- Then we multiply each midpoint by the frequency of observations in that class, sum all these results, and divide the sum by the total number of observations in the sample.
  o Discrete frequency distribution: $\bar{X} = \frac{\sum fX}{N}$
  • 15.
  o Continuous frequency distribution: $\bar{X} = \frac{\sum fX}{N}$, where “X” is the middle value (midpoint) of each class
  $\bar{X} = A + \frac{\sum fd}{N}$ (Shortcut method)
  $\bar{X} = A + h \frac{\sum fd}{N}$ (Step Deviation method)
  Where: $A$ is the assumed mean, $N = \sum f$ is the total frequency, $h$ is the common class width, and $d$ is the deviation from the assumed mean: $d = X - A$ in the shortcut method and $d = (X - A)/h$ in the step deviation method.
Problems:
Q.No-1.1: Calculate the AM for the raw data: 22, 28, 26, 24, 26, 15, 08, 09, 32, 20
Q.No-1.2: Calculate the AM for the following distribution
  X   0    1    2    3    4
  f   8    23   45   24   7
  Note: It is a discrete frequency distribution
Q.No-1.3: Calculate the AM for the following distribution
  CI   0-10   10-20   20-30   30-40   40-50
  f    3      14      31      13      4
  Note: It is a continuous frequency distribution
Q.No-1.4: Calculate the AM for the following distribution table
  CI   0-9   10-19   20-29   30-39   40-49   50-59
  f    3     15      38      14      10      5
Q.No-1.5: Calculate the AM for the following distribution table
  Marks      Below 20   Below 30   Below 40   Below 50
  Students   12         35         48         60
  Note: Since the table has open-ended (cumulative) class intervals, it should first be converted into mutually exclusive class intervals.
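As a rough check on the direct formulas, the first three problems above can be worked in Python. This is a sketch, not the manual's prescribed method (the course itself uses MS Excel); the helper names `mean_raw`, `mean_grouped` and `midpoints` are illustrative.

```python
def mean_raw(values):
    """Arithmetic mean of raw data: sum of values / number of values."""
    return sum(values) / len(values)


def mean_grouped(x, f):
    """Arithmetic mean of a frequency distribution: sum(f*X) / N."""
    return sum(fi * xi for xi, fi in zip(x, f)) / sum(f)


def midpoints(classes):
    """Middle value of each (lower, upper) class interval."""
    return [(lower + upper) / 2 for lower, upper in classes]


# Q.No-1.1: raw data
raw = [22, 28, 26, 24, 26, 15, 8, 9, 32, 20]
print(mean_raw(raw))  # 21.0

# Q.No-1.2: discrete frequency distribution
x = [0, 1, 2, 3, 4]
f = [8, 23, 45, 24, 7]
print(round(mean_grouped(x, f), 2))  # 1.99

# Q.No-1.3: continuous frequency distribution, X = class midpoints
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
f_ci = [3, 14, 31, 13, 4]
print(round(mean_grouped(midpoints(classes), f_ci), 2))  # 25.15
```

For Q.No-1.3 the midpoints are 5, 15, 25, 35 and 45, so the sum of fX is 1635 and N is 65.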
  • 16. Q.No-1.6: Calculate the AM for the following distribution table
  Marks          10 Above   20 Above   30 Above   40 Above   50 Above   60 Above
  No. Students   60         48         35         18         8          2
Q.No-1.7: The following is the frequency distribution of the number of telephone calls received in 245 successive one-minute intervals at an exchange. Obtain the mean number of calls per minute.
  No. of calls   0    1    2    3    4    5    6    7
  Frequency      14   21   25   43   51   40   39   12
Q.No-1.8: The following table gives the salary per month of 450 employees in a factory. Find the mean salary of the employees.
  Salary (,000)   0-5   5-10   10-15   15-20   20-25   25-30
  Employees       80    120    100     60      50      40
Q.No-1.9: The average monthly balances of 600 customers are given as follows. Find the mean from this data.
  Class (in Dollars)   Frequency
  0 – 49.99            78
  50.00 – 99.99        123
  100.00 – 149.99      187
  150.00 – 199.99      82
  200.00 – 249.99      51
  250.00 – 299.99      47
  300.00 – 349.99      13
  350.00 – 399.99      9
  400.00 – 449.99      6
  450.00 – 499.99      4
Q.No-1.10: From the following data find the AM.
  C.I         130-134   135-139   140-144   145-149   150-154   155-159   160-164
  Employees   5         15        28        24        17        10        1
Q.No-1.11: Calculate the average number of days the workers are absent in a company
  No. of Days Absent   Less than 5   5-10   10-15   15-20   20-25   25-30   30-35
  No. of Workers       29            224    465     582     634     644     650
  • 17. Q.No-1.12: Calculate the average age of employees with the help of the following distribution table
  Age above (years)   20    25    30    35    40    45    50   55   60
  f                   450   410   330   300   210   185   85   40   12
Q.No-1.13: Find the missing frequency from the following data, if $\bar{X} = 15.38$
  X   10   12   14   16   18   20
  f   3    7    x    20   8    5
Q.No-1.14: Find the missing frequencies in the following data, given $\bar{X} = 50$ and N = 120
  C.I   0-20   20-40   40-60   60-80   80-100
  f     17     f1      32      f2      19
- Step deviation method for grouped or continuous frequency distribution: In the case of a grouped or continuous frequency distribution with class intervals of equal magnitude, the calculations are further simplified by taking
  $\bar{X} = A + h \frac{\sum fd}{N}$
Q.No-1.15: Calculate the mean for the following frequency distribution
  Marks             0-10   10-20   20-30   30-40   40-50   50-60   60-70
  No. of Students   5      5       7       15      8       6       4
  a) By the direct formula
  b) By the step deviation method
Properties of Arithmetic Mean
1. The algebraic sum of the deviations of a given set of observations from their arithmetic mean is zero. In simple words, the sum of deviations taken from the arithmetic mean is always zero.
  Mathematically: $\sum (X - \bar{X}) = 0$
  For a frequency distribution: $\sum f(X - \bar{X}) = 0$
2. The sum of squared deviations taken from the AM is always the least among such sums taken from the other measures of central tendency.
  Mathematically, $\sum (X - \bar{X})^2$ is never greater than $\sum (X - M)^2$, $\sum (X - Z)^2$, $\sum (X - GM)^2$ or $\sum (X - HM)^2$
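The two methods asked for in Q.No-1.15, and the first property of the arithmetic mean, can be checked with a short Python sketch. The function names and the choice of assumed mean A = 25 are illustrative assumptions; any class midpoint could serve as A.

```python
def mean_direct(mid, f):
    """Direct formula: sum(f*X) / N, with X the class midpoints."""
    return sum(fi * xi for xi, fi in zip(mid, f)) / sum(f)


def mean_step_deviation(mid, f, assumed_mean, h):
    """Step deviation formula: A + h * sum(f*d) / N, with d = (X - A) / h."""
    d = [(xi - assumed_mean) / h for xi in mid]
    return assumed_mean + h * sum(fi * di for di, fi in zip(d, f)) / sum(f)


# Q.No-1.15: marks of 50 students in classes of width 10
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
f = [5, 5, 7, 15, 8, 6, 4]
mid = [(lower + upper) / 2 for lower, upper in classes]

direct = mean_direct(mid, f)
step = mean_step_deviation(mid, f, assumed_mean=25, h=10)
print(direct, step)  # 35.0 35.0

# Property 1: the frequency-weighted deviations from the mean sum to zero.
print(sum(fi * (xi - direct) for xi, fi in zip(mid, f)))  # 0.0
```

With A = 25 the step deviations are -2, -1, 0, 1, 2, 3, 4, the sum of fd is 50, and the mean is 25 + 10 × 50 / 50 = 35, the same as the direct result.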
  • 18. 3. Mean of the combined series: If we know the sizes and means of two component series, then we can find the mean of the resultant series obtained on combining the given series. If two groups of $N_1$ and $N_2$ observations have means $\bar{X}_1$ and $\bar{X}_2$ respectively, then the mean of the combined group of size $N_1 + N_2$ is given by
  $\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}$
Merits and Demerits of Arithmetic Mean:
Merits
- It is rigidly defined
- It is easy to calculate and understand, i.e. the concept is familiar to most people and intuitively clear
- It is based on all the observations
- Every data set has a mean. It is a measure that can be calculated, and it is unique because every data set has one and only one mean
- It is suitable for further mathematical treatment
- Of all averages, the arithmetic mean is affected least by fluctuations of sampling, which means that the arithmetic mean is a stable average
- The mean is useful for performing statistical procedures such as comparing the means of several data sets
Demerits
- The strongest drawback of the arithmetic mean is that it is very much affected by extreme observations. Two or three very large values of the variable may disproportionately affect the value of the arithmetic mean
- The arithmetic mean cannot be used in the case of open-end classes such as “less than 10” or “more than 70”, since for such classes we cannot determine the midpoint / mid value. In such cases the mode or median may be used
- It cannot be determined by inspection, nor can it be located graphically
- It cannot be used if we are dealing with qualitative data / characteristics such as honesty or beauty. In the case of qualitative data, the median is the only average used
- The arithmetic mean cannot be obtained if a single observation is missing, lost or illegible, unless we drop it and compute the arithmetic mean of the remaining values
- In extremely asymmetrical (skewed) distributions, the arithmetic mean is usually not representative of the distribution and hence not suitable as a measure of location or central tendency
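The combined-mean property stated on this slide can be illustrated with a small Python sketch. The group sizes, means and the two short series below are hypothetical values chosen for the illustration, not data from the manual.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Mean of two series combined: (N1*X1 + N2*X2) / (N1 + N2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical groups: 40 observations with mean 50, 60 with mean 70.
print(combined_mean(40, 50, 60, 70))  # 62.0

# Cross-check on two small hypothetical series: the formula agrees with
# the mean computed after pooling all observations.
a = [4, 6, 8]      # mean 6
b = [10, 20]       # mean 15
pooled = sum(a + b) / len(a + b)
print(combined_mean(len(a), sum(a) / len(a), len(b), sum(b) / len(b)), pooled)
```

The combined mean lies closer to the mean of the larger group, which is why the sizes as well as the means of the component series are needed.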
  • 19.
- The arithmetic mean may lead to wrong conclusions if the details of the data from which it is obtained are not available.
- The arithmetic mean may not be one of the values which the variable actually takes, and is then termed a fictitious average.
Weighted Arithmetic Mean
The usual AM gives equal importance to all items, but in most situations the mean should be calculated according to the importance of the observations. To make the computed average representative of the distribution, proper weightage is given to the various items.
Let W1, W2, W3, ..., Wn be the weights attached to the variable values X1, X2, X3, ..., Xn respectively. Then the Weighted Arithmetic Mean, usually denoted $\bar{X}_w$, is given by
  $\bar{X}_w = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3 + \dots + W_n X_n}{W_1 + W_2 + W_3 + \dots + W_n}$
  $\bar{X}_w = \frac{\sum WX}{\sum W}$
In the case of a frequency distribution, if f1, f2, f3, ..., fn are the frequencies of the variable values X1, X2, X3, ..., Xn respectively, then the weighted arithmetic mean is given by
  $\bar{X}_w = \frac{W_1 (f_1 X_1) + W_2 (f_2 X_2) + W_3 (f_3 X_3) + \dots + W_n (f_n X_n)}{W_1 + W_2 + W_3 + \dots + W_n}$
  $\bar{X}_w = \frac{\sum W(fX)}{\sum W}$
Q.No-1.16: Calculate the Weighted Arithmetic Mean for the following distribution
  Item     Rice   Wheat   Sugar   Jawar   Oil    Tea   Salt
  Price    40     30      33      35      25     250   15
  Weight   1      0.5     0.2     0.5     0.25   0.1   0.05
Q.No-1.17: A candidate obtained the following percentages of marks in an examination: English-60, Hindi-75, Mathematics-65, Physics-59, and Chemistry-55. Find the candidate’s weighted arithmetic mean if weights of 1, 2, 1, 3 and 3 respectively are allocated to the subjects.
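The basic weighted-mean formula, the sum of WX divided by the sum of W, can be tried on the two problems above with a brief Python sketch. The function name `weighted_mean` is illustrative, and the simple mean is printed only for comparison.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W*X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Q.No-1.17: English, Hindi, Mathematics, Physics, Chemistry
marks = [60, 75, 65, 59, 55]
weights = [1, 2, 1, 3, 3]
print(weighted_mean(marks, weights))   # 61.7
print(sum(marks) / len(marks))         # simple mean: 62.8

# Q.No-1.16: prices of Rice, Wheat, Sugar, Jawar, Oil, Tea, Salt
prices = [40, 30, 33, 35, 25, 250, 15]
price_weights = [1, 0.5, 0.2, 0.5, 0.25, 0.1, 0.05]
print(round(weighted_mean(prices, price_weights), 2))  # 42.73
```

In Q.No-1.17 the weighted mean (61.7) is below the simple mean (62.8) because the two lowest marks, Physics and Chemistry, carry the largest weights.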
• 20.
Q.No-1.18: The mean annual salary of all employees in a company is Rs. 25,000. The mean salaries of female and male employees are Rs. 27,000 and Rs. 17,000 respectively. Find the percentage of male and female employees in the company.
Q.No-1.19: The mean monthly salary paid to 77 employees in a company was Rs. 78. The mean salary of 32 of them was Rs. 75, and that of another 25 was Rs. 82. What was the mean salary of the remaining employees?
Q.No-1.20: The average daily income for a group of 50 persons in a factory was calculated to be Rs. 169. It was later found that one value was recorded as 134 instead of the correct value 143. Calculate the correct average income.
Q.No-1.21: Calculate the mean score of students using weights assigned to the subjects Physics, Chemistry, Maths, Biology, English and Hindi as 3, 2, 3, 0, 1, 1 respectively, using the following marks
Subjects | Hindi | English | Physics | Chemistry | Maths | Biology
Marks | 56 | 70 | 72 | 62 | 80 | 69
Q.No-1.22: The numbers 3.2, 5.8, 7.9 and 4.5 have frequencies X, (X+2), (X-3) and (X+6) respectively. If the arithmetic mean is 4.876, find the value of X.
Q.No-1.23: Marks secured by 50 students in a test paper are given below
30 45 48 55 39 25 31 12 18 21 54 59 51 33 43 44 10 38 19 26 41 35 37 41 46 33 51 37 58 58 17 19 23 26 29 38 57 36 35 44 43 27 19 43 22 31 47 34 31 15 35 32
Prepare a frequency table with class intervals 10-19, 20-29, 30-39, ..., and calculate the value of the Arithmetic Mean from the frequency table obtained.

2. MEDIAN
- Median is another measure of central tendency, which locates the middlemost value in a given set of data.
- Median is a measure of central tendency different from any of the means.
- Median is a single value from the data set that measures the central item in the data.
- Median is that value of the variable which divides the group into two equal parts, one part comprising the values greater than the median and the other the values less than the median.
- This single item is the middlemost or most central item in the set of numbers. As said earlier, half of the items lie above this point and the other half lie below it.
- In contrast to the arithmetic mean, which is based on all the items of the distribution, the median is only a positional average, i.e. its value depends on the position occupied by a value in the frequency distribution.
• 21.
- Calculation of Median from raw data or ungrouped data
To find the median of a data set, first arrange the data in ascending or descending order. If the data set contains an odd number of items, the middle item of the array is the median. If there is an even number of items, the median is the arithmetic mean of the two middle items.
Median $M = \left(\frac{x+1}{2}\right)^{\text{th}}$ item in the array, where $x$ is the number of items in the array.
- Calculation of Median from Discrete frequency distribution
In this case, Median $M = \left(\frac{N+1}{2}\right)^{\text{th}}$ observation, where N is the number of items in the distribution, i.e. the sum of the frequencies.
- Calculation of Median from Continuous frequency distribution
Median $M = L + \frac{\left(\frac{N}{2} - m\right) \times C}{f}$
Where
L = Lower limit of the median class
N = Total number of items
m = Cumulative frequency of the class preceding the median class
f = Frequency of the median class
C = Class width or magnitude of the median class

Q.No-2.1: Find the Median of the observations 12, 15, 16, 82, 75
Q.No-2.2: Find the Median of the observations 12, 18, 13, 42, 63, 78
Q.No-2.3: Find the Median of the following distribution
X | 10 | 20 | 30 | 40 | 50
f | 3 | 8 | 13 | 9 | 7
Q.No-2.4: Find the Median of the following distribution
C.I | 0-10 | 10-20 | 20-30 | 30-40 | 40-50
f | 5 | 12 | 23 | 12 | 3
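The continuous-distribution formula (lower limit of the median class, plus the shortfall from N/2 after the preceding cumulative frequency, scaled by class width over class frequency) can be sketched as follows. This is a minimal sketch that assumes equal-width, mutually exclusive classes; the function name `grouped_median` is hypothetical and the data are those of Q.No-2.4.

```python
def grouped_median(lower_limits, width, freqs):
    """Median of a continuous frequency distribution with equal class widths."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= half:
            # This is the median class: first class whose cumulative
            # frequency reaches N/2.
            return lower + (half - cumulative) * width / f
        cumulative += f
    raise ValueError("median class not found")


# Q.No-2.4: classes 0-10, 10-20, 20-30, 30-40, 40-50
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 12, 23, 12, 3]

median = grouped_median(lower_limits, 10, freqs)
print(round(median, 2))  # N = 55, median class 20-30, preceding cumulative frequency 17
```

For inclusive classes such as 10-14, 15-19 the class boundaries would first have to be adjusted (for example to 9.5-14.5) before a routine like this is applied.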
• 22.
Q.No-2.5: Find the Median of the following distribution
C.I | 10-14 | 15-19 | 20-24 | 25-29 | 30-34
f | 2 | 5 | 8 | 4 | 1
Note: to calculate the Median
- Cumulative frequency is a must
- Class intervals should be mutually exclusive
Q.No-2.6: Calculate Mean and Median from the following distribution
C.I | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90
f | 4 | 12 | 40 | 41 | 27 | 13 | 9 | 4
Q.No-2.7: Find the Median for the following data
Height (C.I) | 125-129 | 130-134 | 135-139 | 140-144 | 145-149
No. of Items (f) | 2 | 5 | 8 | 4 | 1
Q.No-2.8: Find the Median wage of labourers from the following
Wages | Above 0 | Above 10 | Above 20 | Above 30 | Above 40 | Above 50 | Above 60 | Above 70
No. of Labourers | 650 | 500 | 425 | 375 | 300 | 275 | 250 | 100
Q.No-2.9: Find the missing frequency, if M = 14
C.I | 0-5 | 5-10 | 10-15 | 15-20 | 20-25 | 25-30
f | 5 | 7 | Q | 8 | 6 | 4
Q.No-2.10: Calculate the Median from the series
5 men get less than Rs. 5
12 men get less than Rs. 10
22 men get less than Rs. 15
30 men get less than Rs. 20
36 men get less than Rs. 25
40 men get less than Rs. 30
Q.No-2.11: Calculate the missing frequencies from the following data, having Median = 46 and N = 230
Variables | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80
f | 12 | 30 | f1 | 65 | f2 | 25 | 19

Merits and Demerits of Median
• 23.
Merits:
- Rigidly defined
- Easy to calculate, even for a non-mathematical person
- Since it is a positional average, it is not affected by extreme observations; useful in skewed distributions
- Can be computed while dealing with open-ended classes
- Can be located by simple inspection and even graphically
- It is the only average which can deal with qualitative characteristics

3. MODE:
- Mode is a measure of central tendency that is different from the mean but somewhat like the median
- The mode is the value that is repeated most often in the data set
- The mode is defined as the most popular value in the given data, i.e. the value with the highest frequency
- Mode is the value which occurs most frequently in a set of observations and around which the other items of the set cluster densely
- It is the value at the point around which the items tend to be most heavily concentrated. It is regarded as the most typical of a series of values
- Mode is the value which has the greatest frequency density in its immediate neighbourhood
- Mode is termed the fashionable value of the distribution
  o Example: Average size of the shoe sold in a shop is 7
  o Average Indian male is 5 feet 6 inches
  Here the average refers to neither the mean nor the median but the mode, the most frequent value in the distribution
- Mode is denoted by Mo or Z
- Calculation of "Mode"
  o Raw data: Mode (Z) is the value that occurs most often
  o Discrete frequency distribution: Mode (Z) is the value corresponding to the highest frequency
  o Continuous frequency distribution: $Z = L + \frac{(f - f_1) \times C}{2f - f_1 - f_2}$, computed in the modal class
Modal class = Class with the highest frequency
L = Lower limit of the modal class
• 24.
f = Frequency of the modal class
f1 = Frequency of the class preceding the modal class
f2 = Frequency of the class succeeding the modal class
C = Class width of the modal class

Q.No-3.1: Find the Mode of 2, 6, 8, 12, 32, 25, 41, 63, 25
Q.No-3.2: Find the mode for the following distribution table
X | 0 | 1 | 2 | 3 | 4
f | 7 | 13 | 25 | 10 | 3
Q.No-3.3: Find the mode for the following distribution table
C.I | 0-10 | 10-20 | 20-30 | 30-40 | 40-50
f | 3 | 8 | 13 | 10 | 1
Q.No-3.4: Find the Mean, Median and Mode
C.I | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90
f | 4 | 12 | 40 | 41 | 27 | 13 | 9 | 4

Merits and Demerits of Mode
Merits:
- Easy to calculate and understand; it can be found merely by inspection
- Not affected by extreme observations
- Convenient for open-ended classes
Demerits:
- Mode is not rigidly defined
- Mode is not suitable for further mathematical treatment
- It is affected to a great extent by fluctuations of sampling

Empirical Relationship between Mean ($\bar{X}$), Median (M) and Mode (Z) (Slightly Skewed)
A symmetrical distribution that contains only one mode always has the same value for the mean, median and mode.
In case of Skewed distribution
In the case of a positively or negatively skewed distribution, the median can be taken as the measure of central tendency.
• 25.
This is because the median is not as strongly influenced by the frequency of occurrence of a single value as the mode is, nor is it pulled by extreme values as the mean is.
Whenever the given distribution is slightly skewed, the Mean ($\bar{X}$), Median (M) and Mode (Z) show the following relationship:
$Z = 3M - 2\bar{X}$, i.e. Mode = 3 Median – 2 Mean
Q.No-3.5: Find Mean, Median and Mode using the Empirical Relationship
Size in Inches | 5 | 10 | 15 | 20 | 25 | 30 | 35
f | 1 | 3 | 13 | 17 | 27 | 36 | 38

GEOMETRIC MEAN
- GM is the nth root of the product of the n quantities of the series. It is obtained by multiplying the values of the items together and extracting the root of the product corresponding to the number of items.
- Thus, the square root of the product of two items and the cube root of the product of three items are their geometric means
- It is never larger than the arithmetic mean
- If there are zeros or negative numbers in the series, the geometric mean cannot be used.
- Logarithms can be used to find the geometric mean, to reduce large numbers and to save time
- It is appropriate in situations where there is an average percentage rate of change over a period of time.
- It is widely used in the construction of index numbers
Geometric Mean (GM) $= \sqrt[n]{x_1 x_2 x_3 x_4 \dots x_n}$
When the number of items in the series is larger than 3, the process of computing the GM directly is difficult. To overcome this, the logarithm of each value is obtained. The logs of all the values are added up and divided by the number of items. The antilog of the ratio obtained is the required GM.
Geometric Mean (GM) $= \text{Antilog}\left[\frac{\log x_1 + \log x_2 + \log x_3 + \log x_4 + \dots + \log x_n}{n}\right] = \text{Antilog}\left[\frac{\sum_{i=1}^{n} \log x_i}{n}\right]$
Geometric Mean (GM) for the continuous case
$GM = \text{Antilog}\left[\frac{\sum f \log x}{N}\right]$
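The logarithm method described above can be sketched in Python. This is an illustrative sketch only: the function name `geometric_mean` is hypothetical, natural logarithms are assumed in place of the base-10 log tables (the base does not change the result as long as the matching antilog is used), and the data 2, 4, 8 are small illustrative values.

```python
import math


def geometric_mean(values):
    """GM via logs: antilog of the mean of the logarithms."""
    if not values:
        raise ValueError("values must not be empty")
    if any(v <= 0 for v in values):
        # The GM is undefined when any item is zero or negative.
        raise ValueError("all values must be positive")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)  # exp is the antilog for natural logs


gm = geometric_mean([2, 4, 8])
print(round(gm, 6))  # cube root of 2 * 4 * 8 = 64
```

Working through the logs avoids forming the full product, which can become very large when the series has many items.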
• 26.
Merits of GM
- It is based on all the observations in the series
- It is rigidly defined
- It is suited for averaging ratios
- It is less affected by extreme values
- It is useful for studying social and economic data
Demerits of GM
- It is not simple to understand
- It requires computational skill
- It cannot be computed if any items are zero or negative
- It has restricted applications

Problems:
4.1 Find the GM of the data 2, 4, 8
4.2 Find the GM of the data 2, 4, 8, 10 using logarithms
4.3 The annual growth rate of output of a company in the last five years is given below. Find the GM of the growth rate
Year | Growth Rate | Output at the end of the year
1998 | 5.0 | 105
1999 | 7.5 | 112.87
2000 | 2.5 | 115.69
2001 | 5.0 | 121.47
2002 | 10.0 | 133.61
4.4 Compared with the previous year, the overhead (OH) expenses went up by 32% in the year 2003, then increased by 40% in the next year and by 50% in the following year. Calculate the average increase in overhead expenses. Let the OH expenses be 100% in the base year
Year | Growth Rate
2002 | Base Year
2003 | 132
2004 | 140
2005 | 150
4.5 Consider the following time series of monthly sales of ABC Company for 4 months. Find the average rate of change in sales per month
Month | Sales
I | 10,000
II | 8,000
III | 12,000
IV | 15,000
• 27.
4.6 Find the GM for the following data
Yield of Wheat in MT | No. of Farms
1 – 10 | 3
11 – 20 | 16
21 – 30 | 26
31 – 40 | 31
41 – 50 | 16
51 – 60 | 8

Harmonic Mean
- It is the total number of items divided by the sum of the reciprocals of the values of the variable
- It is a specialised average which solves problems involving variables expressed as "time rates", i.e. rates that vary according to time
- Example: Speed in km/hr., min/day, Price/chapter
- Harmonic mean (HM) is suitable only when the time factor is the variable and the act being performed remains constant
$HM = \frac{N}{\sum \frac{1}{x}}$
Merits of Harmonic Mean
- It is based on all observations
- It is rigidly defined
- It is suitable in the case of series having wide dispersion
- It is suitable for further mathematical treatment
Demerits of Harmonic Mean
- It is not easy to compute
- It cannot be used when one of the items is zero
- It cannot represent the distribution

Problems:
5.1: The daily income of 5 families in a very remote village is given below. Compute the HM
Family | Income (X)
1 | 85
2 | 90
3 | 70
4 | 50
5 | 60
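The HM formula (number of items divided by the sum of reciprocals) can be sketched as below. This is an illustrative sketch, not the manual's prescribed working; the function name `harmonic_mean` is hypothetical and the data are the five incomes of Problem 5.1.

```python
def harmonic_mean(values):
    """HM = N / sum(1 / x); undefined when any item is zero."""
    if not values:
        raise ValueError("values must not be empty")
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when an item is zero")
    return len(values) / sum(1 / v for v in values)


# Daily incomes of the 5 families in Problem 5.1
incomes = [85, 90, 70, 50, 60]

hm = harmonic_mean(incomes)
am = sum(incomes) / len(incomes)
print(round(hm, 2), am)  # the HM comes out below the arithmetic mean of 71.0
```

Printing the arithmetic mean alongside shows the usual ordering for positive, unequal values: the harmonic mean is the smaller of the two.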
• 28.
5.2 A man travels by car for 3 days, covering 480 km each day. On the first day he drives for 10 hrs at the rate of 48 KMPH, on the second day for 12 hrs at the rate of 40 KMPH, and on the third day for 15 hrs at the rate of 32 KMPH. Compute the HM and the weighted mean, and compare them.
5.3 Find the HM for the following data
Class Interval | Frequency
0 – 10 | 5
10 – 20 | 15
20 – 30 | 25
30 – 40 | 8
40 – 50 | 7

MEASURES OF DISPERSION
- An average does not tell the full story. It is hardly a full representative of a mass unless we know the manner in which the individual items scatter around it. A further description of the series is necessary if we are to gauge how representative the average is.
- The measure of central tendency must be supported and supplemented by some other measure; one such measure is dispersion.
- The literal meaning of dispersion is "scatteredness"
- We study dispersion to have an idea of the homogeneity (compactness) or heterogeneity (scatter) of the distribution.
- Why is dispersion an important characteristic to understand and measure?
  o It enables us to judge the reliability of our measure of central tendency
  o To tackle the problems associated with widely dispersed data
  o We may wish to compare the dispersion of various samples
- Dispersion is the measure of the variation of the items
- It is a measure of the extent to which the individual items vary
- It is the degree of scatter or variation of the variable about a central value
- The degree to which numerical data tend to spread about an average value is called the variation or dispersion of the data
• 29.
- Curve A has the least dispersion or variability
- Curve B has less variability than curve C but more variability than curve A
- Curve C has more dispersion / variability than curve A and curve B

Objectives or Significance of the measures of dispersion
- To find the reliability of an average
- To control the variation of the data from the central value
- To compare two or more sets of data regarding their variability
- To obtain other statistical measures for further analysis of data
Characteristics of an ideal measure of dispersion
- It should be rigidly defined
- It should be easy to calculate and easy to understand
- It should be based on all the observations
- It should be amenable to further mathematical treatment
- It should be little affected by fluctuations of sampling
- It should not be much affected by extreme observations
Measures of Dispersion
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation

1. RANGE
Range is the crudest measure of dispersion, calculated as R = High – Low, i.e. R = H – L
Its relative measure is called the co-efficient of Range:
Co-efficient of Range $= \frac{H - L}{H + L}$
Example: Find the range and its co-efficient for 10, 13, 18, 8, 14, 16, 23, 25
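The range and its co-efficient for the example data above can be checked with a short Python sketch. The variable names are illustrative assumptions, not part of the manual.

```python
# Data from the range example: 10, 13, 18, 8, 14, 16, 23, 25
data = [10, 13, 18, 8, 14, 16, 23, 25]

high = max(data)  # H, the largest observation
low = min(data)   # L, the smallest observation

value_range = high - low                    # R = H - L
coefficient = (high - low) / (high + low)   # (H - L) / (H + L)

print(value_range, round(coefficient, 3))   # H = 25, L = 8
```

Only the two extreme observations enter the calculation, which is why the range is regarded as a crude measure: the other six values could change without affecting the result.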
• 30.
2. QUARTILE DEVIATION
This is a measure of dispersion which considers the deviation between the upper and lower quartiles.
Inter-quartile range $= Q_3 - Q_1$
Quartile Deviation (semi-inter-quartile range) $QD = \frac{Q_3 - Q_1}{2}$
Co-efficient of QD $= \frac{Q_3 - Q_1}{Q_3 + Q_1}$
Example 1: Find the Quartile Deviation and its Co-efficient for 13, 82, 65, 45, 58, 76, 18, 29, 34, 91
Example 2: Find the semi-inter-quartile range and co-efficient of Quartile Deviation for the following frequency distribution
Size | 4-8 | 8-12 | 12-16 | 16-20 | 20-24 | 24-28 | 28-32 | 32-36 | 36-40
f | 6 | 10 | 18 | 30 | 15 | 12 | 10 | 6 | 2

3. MEAN DEVIATION
It is a measure of dispersion which calculates the average distance between each observation and a central value ($\bar{X}$ or M or Z). If the Mean Deviation is calculated around $\bar{X}$, it is called the Mean Deviation about the mean.
Formulas
Mean Deviation $= \frac{\sum |X - A|}{n}$ or $\frac{\sum |d|}{n}$
Where A = Mean / Median / Mode and $|d| = |X - A|$ (the modulus, i.e. absolute value, of d)
In the case of a frequency distribution
Mean Deviation $= \frac{\sum f|X - A|}{n}$ or $\frac{\sum f|d|}{n}$, where n is the total frequency $\sum f$
Relative Measure of Mean Deviation
Co-efficient of Mean Deviation $= \frac{\text{Mean Deviation}}{\text{Average about which it is calculated}}$
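The mean deviation about the mean, and its co-efficient, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `mean_deviation` is hypothetical, and the five values 18, 75, 56, 63, 36 are used as illustrative raw data.

```python
def mean_deviation(values, centre):
    """Average absolute distance of the observations from a chosen centre A."""
    return sum(abs(x - centre) for x in values) / len(values)


data = [18, 75, 56, 63, 36]
mean = sum(data) / len(data)   # A is taken to be the arithmetic mean here

md = mean_deviation(data, mean)
coefficient = md / mean        # MD divided by the average it was measured from

print(round(mean, 2), round(md, 2), round(coefficient, 4))
```

Passing the median or mode as `centre` instead of the mean gives the mean deviation about that average; the co-efficient must then be divided by the same average.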
• 31.
Example 1: Find the Mean Deviation about $\bar{X}$ for the following data: 18, 75, 56, 63, 36
Example 2: Find the Mean Deviation about the Median
Marks | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90
Students | 2 | 6 | 12 | 18 | 25 | 20 | 10 | 7
Example 3: Calculate the M.D. about the Median and its co-efficient
Size of Items | 4 | 6 | 8 | 10 | 12 | 14 | 16
f | 2 | 1 | 3 | 6 | 4 | 3 | 1

STANDARD DEVIATION & VARIANCE
The most comprehensive descriptions of dispersion are those that deal with the average deviation from some measure of central tendency.
Variance and Standard Deviation are the two measures used for measuring dispersion in statistics. Both of these tell us an average distance of any observation in the data set from the mean of the distribution.

VARIANCE and STANDARD DEVIATION of POPULATION
Population Variance
- Every population has a variance, which is symbolised by $\sigma^2$ (sigma squared)
- To calculate the population variance, we divide the sum of the squared distances between the mean and each item in the population by the total number of items in the population
- By squaring each distance, we make each number positive and, at the same time, assign more weight to large deviations (distances between the mean and the value)
Population Standard Deviation
- Population standard deviation is denoted by $\sigma$ (sigma)
- It is simply the square root of the population variance
- Standard deviation is the square root of the average of the squared distances of the observations from the mean
- While the variance is expressed in the square of the units used in the data, the standard deviation is in the same units as those used in the data
• 32.
Formula for Variance and Standard Deviation
Formula for Raw Data:
Variance $= \sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$
Standard Deviation $= \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$
$\sigma^2$ = Population Variance
$\sigma$ = Population Standard Deviation
X = Observation
$\bar{X}$ = Mean
N = Number of observations in the population
$\sum$ = Sum of all the values
Grouped data:
Variance $= \sigma^2 = \frac{\sum f(X - \bar{X})^2}{N}$ or $\frac{\sum fX^2}{N} - \bar{X}^2$ or $\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2$
Standard Deviation $= \sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum fX^2}{N} - \bar{X}^2}$ or $\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2}$
Where f = Frequency of each class

Example 1: Calculate the S.D for the following data
X | 25 | 36 | 45 | 65 | 82 | 93 | 58 | 70
Example 2: Find the S.D and Co-efficient of S.D for the following data
C.I | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70
F | 5 | 7 | 14 | 12 | 9 | 6 | 2
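The two raw-data formulas for the population standard deviation (the definition, and the shortcut using the mean of the squares minus the square of the mean) can be compared in a short Python sketch. The function names are hypothetical, and the data are the eight values of Example 1 above.

```python
import math


def population_sd(values):
    """sigma = sqrt(sum((X - mean)^2) / N), the defining formula."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)


def population_sd_shortcut(values):
    """sigma = sqrt(sum(X^2) / N - mean^2), the shortcut formula."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


data = [25, 36, 45, 65, 82, 93, 58, 70]

sd = population_sd(data)
sd_short = population_sd_shortcut(data)
print(round(sd, 3), round(sd_short, 3))  # both formulas should agree
```

Both functions divide by N, matching the population formulas in this section; a sample standard deviation that divides by N - 1 would give a slightly larger value.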
• 33.
Example 3: Find the Mean and SD for the following data
Age | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80
No. of Persons | 15 | 30 | 53 | 75 | 100 | 110 | 115 | 125
Example 4: 15 vessels were produced in one day, and each vessel was tested to determine its purity. The observed percentages of impurity are given below; calculate the standard deviation.
0.04 0.14 0.17 0.19 0.22
0.06 0.14 0.17 0.21 0.24
0.12 0.15 0.18 0.21 0.25

CO-EFFICIENT OF VARIATION
If the co-efficient of variation for a given data set is high, the data are said to be less consistent; on the other hand, if the C.V is low, the variability in the data is less and the data are more consistent.
$C.V = \frac{\sigma}{\bar{X}} \times 100$
Example 1: The runs scored by two batsmen A & B in 10 innings are as follows
A | 10 | 115 | 5 | 75 | 7 | 120 | 36 | 84 | 29 | 19
B | 45 | 12 | 76 | 42 | 4 | 50 | 37 | 48 | 130 | 0
Find I) the better run scorer II) the more consistent batsman
Example 2: The lives of 2 models of refrigerator in a recent survey are shown below. What is the average life of each model? Which model has greater uniformity?
Life (years) | 0-2 | 2-4 | 4-6 | 6-8 | 8-10 | 10-12
A | 5 | 16 | 13 | 7 | 5 | 4
B | 2 | 7 | 12 | 19 | 9 | 1
Example 3: Two brands of tyres are tested with the following results. A) Which brand of tyre has the greater average life? B) Compare the variability and state which brand of tyres you would use
Life | 20-25 | 25-30 | 30-35 | 35-40 | 40-45
X | 1 | 22 | 64 | 10 | 3
Y | 0 | 24 | 76 | 0 | 0
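The C.V comparison can be sketched in Python using the runs of the two batsmen from Example 1. This is an illustrative sketch: the function name `coefficient_of_variation` is hypothetical, and the population standard deviation (dividing by N) is assumed, in line with the formulas of this unit.

```python
import math


def coefficient_of_variation(values):
    """C.V = (population SD / mean) * 100."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sd / mean * 100


runs_a = [10, 115, 5, 75, 7, 120, 36, 84, 29, 19]
runs_b = [45, 12, 76, 42, 4, 50, 37, 48, 130, 0]

mean_a = sum(runs_a) / len(runs_a)
mean_b = sum(runs_b) / len(runs_b)
cv_a = coefficient_of_variation(runs_a)
cv_b = coefficient_of_variation(runs_b)

# Higher mean -> better scorer; lower C.V -> more consistent batsman
better_scorer = "A" if mean_a > mean_b else "B"
more_consistent = "A" if cv_a < cv_b else "B"
print(mean_a, mean_b, round(cv_a, 2), round(cv_b, 2))
print(better_scorer, more_consistent)
```

The two questions are answered by different statistics: the averages decide who scores more, while the C.V values, being unit-free ratios, decide who is steadier.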
• 34.
Standard Deviation for combined series
If n1 observations have mean $\bar{X}_1$ and SD $\sigma_1$, and n2 observations have mean $\bar{X}_2$ and SD $\sigma_2$, then the combined SD of the (n1 + n2) observations is calculated as
$\sigma = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$
Where $d_1 = \bar{X} - \bar{X}_1$, $d_2 = \bar{X} - \bar{X}_2$ and $\bar{X} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$
Example 1: The Mean and SD of marks obtained by 2 groups of students consisting of 50 each are given below. Calculate the S.D of all 100 students
Group | Mean | Standard Deviation | n
1 | 60 | 8 | 50
2 | 55 | 7 | 50
Example 2: Calculate the missing information from the following data
 | A | B | C | Combined
Numbers | 175 | ? | 225 | 500
SD | ? | 63 | 5.9 | 5.4
Mean | 220 | 240 | ? | 235
Example 3: A shareholders' research centre of India has conducted a research study on the price behaviour of 3 leading industries A, B, C. The results published in its quarterly journal are as follows
Shares | Average Price | Standard Deviation (SD) | Current Selling Price (S P)
A | 18.2 | 5.4 | 36.0
B | 22.5 | 4.5 | 34.75
C | 24.0 | 6.0 | 39.0
I. Which share, in your opinion, is more stable in value?
II. If you are the holder of all 3 shares, which one would you dispose of at present? Why?
Example 4: Following is the record of goals scored by team A in the football season.
No. of Goals | 0 | 1 | 2 | 3 | 4
Matches | 1 | 9 | 7 | 5 | 3
For team B the average number of goals scored per match was 2.5 with S.D = 1.25 goals.
• 35.
Find which team is to be considered more consistent.
Note: Standard deviation is understood to be the best measure of dispersion as
1. It includes all observations in its calculation
   a. It is more suitable as compared to other measures of dispersion
   b. It is not much affected by extreme values
   c. It is rigidly defined
2. S.D is used in finding the normal probabilities of a population

Some remarks on Standard Deviation
1. Mean Deviation can be computed by taking the deviations from any average, i.e. Mean, Median or Mode, but Standard Deviation is always computed from the Arithmetic Mean
2. The Standard Deviation of the variable X is denoted by $\sigma_x$. This notation is useful when we have to deal with the standard deviations of two or more variables
3. SD is always taken as the positive square root
4. Since S.D depends on the numerical values of the deviations, the value of $\sigma$ will be greater if the values of x are scattered widely away from the mean
   - A smaller $\sigma$ implies that the distribution is homogeneous
   - A larger $\sigma$ implies that the distribution is heterogeneous

Mathematical Properties of Standard Deviation
1. Standard deviation is independent of change of origin but not of scale
2. Standard deviation is the minimum value of the root mean square deviation
3. S.D is never greater than the Range
4. S.D is suitable for further mathematical treatment
5. The S.D of the first n natural numbers 1, 2, 3, 4, ..., n is $\sqrt{\frac{n^2 - 1}{12}}$
6. The Empirical Rule:
   a. For a symmetrical bell-shaped distribution, we have approximately the following properties
      i. 68% of the observations lie in the range Mean ± $\sigma$
      ii. 95% of the observations lie in the range Mean ± 2$\sigma$
      iii. 99.7% of the observations lie in the range Mean ± 3$\sigma$
7. The approximate relationship between Quartile Deviation (QD), Mean Deviation (MD) and Standard Deviation ($\sigma$) is
$QD = \frac{2}{3}\sigma$, $MD = \frac{4}{5}\sigma$
• 36.
QD : MD : SD :: 10 : 12 : 15
8. For any discrete distribution, the Standard Deviation is not less than the Mean Deviation about the mean, i.e. SD ≥ Mean Deviation about the mean

Questions from Previous Question Papers:
1. Which is a good measure of central tendency? Give any two reasons
2. The following distribution gives the pattern of overtime work done by 100 employees of a company. Find the mean and median.
Overtime (Hrs) | 10-15 | 15-20 | 20-25 | 25-30 | 30-35 | 35-40
No. of Employees | 11 | 20 | 35 | 20 | 8 | 6
3. The following data give the prices X and Y of shares A and B respectively. Compute the co-efficient of variation of X and Y and state which is more stable in value.
Price of Shares A (X) | 55 | 54 | 52 | 56 | 58 | 52 | 50 | 51 | 49
Price of Shares B (Y) | 108 | 107 | 105 | 106 | 107 | 104 | 103 | 104 | 101
4. A sample of 50 cars each of 2 makes X and Y is taken and the average running life in years is recorded
Life (No. of Years) | No. of Cars, Make X | No. of Cars, Make Y
0-5 | 8 | 6
5-10 | 12 | 10
10-15 | 17 | 20
15-20 | 10 | 12
20-25 | 3 | 2
a) Which of these two makes gives the higher average life?
b) Which of these makes has shown more consistent performance? Use standard deviation.
5. Calculate the mean for the following frequency distribution I) by the direct method II) by the step-deviation method
Marks | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70
No. of Students | 6 | 5 | 8 | 15 | 7 | 6 | 3
6. Find the value of the Mean, Median and Mode from the data given below
Wt (in kg) | 93-97 | 98-102 | 103-107 | 108-112 | 113-117 | 118-122 | 123-127 | 128-132
No. of Students | 3 | 5 | 12 | 17 | 14 | 6 | 3 | 1
7. Determine the missing frequency of the class interval 15-20, the mean being 19 units
X | 5-10 | 10-15 | 15-20 | 20-25 | 25-30
F | 2 | 2 | ? | 4 | 4
• 37.
8. Find the standard deviation for the following data
CI | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70
F | 6 | 14 | 10 | 8 | 1 | 3 | 8
9. Calculate the quartiles from the following data
CI | 0-10 | 10-20 | 20-30 | 30-40 | 40-50
F | 3 | 8 | 20 | 12 | 7
10. Compute the Mean, Median and Mode from the data pertaining to marks scored by 80 students in statistics. The test is of 140 marks.
Marks more than | 0 | 20 | 40 | 60 | 80 | 100 | 120
No. of Students | 80 | 76 | 50 | 28 | 18 | 9 | 1
11. Find the values of X and Y from the following distribution
Mid Values | 15 | 25 | 35 | 45 | 55 | 65
Frequency | 10 | X | 15 | 20 | Y | 11
Note: N = 82 and Median = 41
12. Calculate Q1 and Q3 from the following distribution
CI | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60
F | 3 | 8 | 20 | 12 | 7 | 3
13. From the prices of shares X and Y below, find out which is more stable in value.
X | 35 | 54 | 52 | 53 | 56 | 58 | 52 | 50 | 51 | 49
Y | 108 | 107 | 105 | 106 | 107 | 104 | 103 | 104 | 105 | 101
14. The following data give the prices X and Y of shares A and B respectively. Compare the co-efficients of variation of X and Y and state which share is more stable in value.
Share A | 55 | 54 | 52 | 56 | 58 | 52 | 50 | 51 | 49
Share B | 108 | 107 | 105 | 106 | 107 | 104 | 103 | 104 | 101
• 38.
Unit – 2 Correlation and Regression
• 39.
Unit - 2 Correlation and Regression: Scatter Diagram, Karl Pearson correlation, Spearman's Rank correlation (one way table only), simple and multiple regression (problems on simple regression only)

Correlation: Introduction:
Measures of central tendency and dispersion are confined to univariate distributions, i.e. distributions involving only one variable. These measures are also used for the purpose of comparison and analysis.
In some distributions or sets of data, each unit assumes two values; we call such a distribution a bivariate distribution. For example, each individual in the distribution has two variables, such as height and weight. If we measure more than two variables on each unit of the distribution, it is called a "multivariate distribution".
Some examples of bivariate distributions:
- The series of marks of individuals in two subjects in an examination
- The series of sales revenue and advertising expenditure of different companies in a particular year
- Imports and exports of cotton from 1989 to 1994
- The series of ages of husbands and wives in a sample of selected married couples
In a bivariate distribution, we may be interested in finding out whether there is any relationship between the two variables under study. Correlation is a statistical tool which studies the relationship between two variables.
Correlation analysis involves various methods and techniques used for studying and measuring the extent of the relationship between the two variables. Variables are said to be correlated if a change in one variable results in a corresponding change in the other variable.
Some definitions:
"When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in a brief formula is known as correlation"
"Correlation is an analysis of the covariation between two or more variables".
• 40.
Types of correlation
A) Positive and Negative Correlation
B) Linear and Non-Linear Correlation

A) Positive and Negative Correlation
If the values of two variables deviate in the same direction, the correlation is said to be "positive" or "direct".
Example:
- Heights and weights
- Family income and expenditure on luxury goods
- Amount of rainfall and yield of crops
- Price and supply of commodities
If the values of two variables deviate in opposite directions, the correlation is said to be "negative" or "inverse".
Example:
- Price and demand of a commodity
- Volume and pressure of a perfect gas
- Sales of woollen garments and day temperature

B) Linear and Non-Linear Correlation
Correlation is linear if, corresponding to a unit change in one variable, there is a constant change in the other variable over the entire range of values.
In general, two variables x and y are said to be linearly related if there exists a relationship of the form
Y = a + b x
- Here "b" is the slope of the straight line
- It is generally assumed that the relationship between the two variables under study is linear
The relationship between two variables is said to be non-linear or curvilinear if, corresponding to a unit change in one variable, the other variable does not change at a constant rate but at a fluctuating rate.
• 41.
Correlation and Causation
Correlation analysis enables us to have an idea about the degree and direction of the relationship between the two variables under study, but it fails to reflect upon the cause and effect relationship between them.
In a bivariate distribution, if the variables have a cause and effect relationship, they are bound to have a high degree of correlation between them. That means causation always implies correlation.
However, the converse is not true, i.e. a fairly high degree of correlation between two variables need not imply a cause and effect relationship between them. A high degree of correlation between variables may be due to
- Mutual dependence
- Both the variables being influenced by the same external factors
- Pure chance

Methods of studying correlation:
Assuming that a linear relationship exists between the two variables or series, the commonly used methods for studying the correlation between them are
I. Scatter diagram method
II. Karl Pearson's Co-efficient of Correlation (Covariance method)
III. Two way frequency table (Bivariate correlation method)
IV. Ranks method or Spearman's Rank Correlation
V. Concurrent Deviation Method

I. Scatter Diagram Method:
- It is one of the simplest methods of diagrammatic representation of a bivariate distribution and provides one of the simplest tools for ascertaining the correlation between two variables
- The "n" pairs of values of the two variables (for example heights and weights) are plotted as dots. The diagram of dots so obtained is known as a "Scatter Diagram"
- From the scatter diagram, we can form a fairly good, though rough, idea about the relationship between the two variables.
Some points regarding the scatter diagram:
- If the points are very dense, i.e. very close to each other, a fairly good amount of correlation may be expected between the two variables.
• 42.
On the other hand, if the points are widely scattered, a poor correlation may be expected between them.
- If the points on the scatter diagram reveal a trend (either upward or downward), the variables are said to be correlated, and if no trend is revealed, the variables are not correlated.
[Figure: scatter diagrams illustrating uncorrelated data, perfect positive correlation, perfect negative correlation, low degree of positive correlation and low degree of negative correlation]
  • 43. [Figure: scatter diagrams. Panels: High degree of positive correlation; High degree of negative correlation; No Correlation]
Remarks:
- A scatter diagram provides only a rough idea about the relationship between two variables. Unlike the mathematical formulae, it is not affected by extreme observations. However, this method is not suitable if the number of observations is fairly large.
- It does not provide an exact measure of the extent of the relationship between the two variables.
- It provides only an approximate estimating line, or line of best fit, drawn by the freehand method.
  • 44. Karl Pearson Coefficient of Correlation:
This is also called the “Covariance Method” or the “Product moment correlation co-efficient”.
- A mathematical method of measuring the intensity or magnitude of the linear relationship between two variable series.
- It was suggested by Karl Pearson, and it is the most widely used method in practice.
- Karl Pearson’s measure, also known as the Pearsonian correlation coefficient between two variables (series) X and Y, is usually denoted by $r(x, y)$, $r_{xy}$ or simply $r$:
$$ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \, \sigma_y} $$
It is the ratio of the covariance between x and y, written as Cov(x, y), to the product of the standard deviations of x and y. Here
$$ \operatorname{Cov}(x, y) = \frac{1}{n} \sum (x - \bar{x})(y - \bar{y}), \qquad \sigma_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}, \qquad \sigma_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}} $$
If $\bar{x}$ and $\bar{y}$ come out to be integers (i.e. whole numbers), the following formula is convenient to use, where $d_x = x - \bar{x}$ and $d_y = y - \bar{y}$:
$$ r = \frac{\sum d_x \, d_y}{\sqrt{\sum d_x^2 \cdot \sum d_y^2}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \cdot \sum (y - \bar{y})^2}} $$
If $\bar{x}$ and $\bar{y}$ are fractions, the above formula is cumbersome to apply, and we should use the following formula instead:
$$ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[ n \sum x^2 - (\sum x)^2 \right] \left[ n \sum y^2 - (\sum y)^2 \right]}} $$
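The formulas above can be sketched in Python as a rough check that the covariance form and the raw-sums form agree. The function names and the small data set are illustrative assumptions, not part of the manual.

```python
from math import sqrt

def pearson_from_covariance(x, y):
    """r = Cov(x, y) / (sigma_x * sigma_y), dividing by n throughout."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)

def pearson_from_raw_sums(x, y):
    """r from sum(x), sum(y), sum(xy), sum(x^2), sum(y^2); no means needed."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Hypothetical data, chosen only so that both means are whole numbers.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_cov = pearson_from_covariance(x, y)
r_raw = pearson_from_raw_sums(x, y)
print(round(r_cov, 4), round(r_raw, 4))
```

Both functions should return the same value up to rounding, since the raw-sums formula is only an algebraic rearrangement of the definition.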
  • 45. Note:
1. Two variables are said to be correlated if a change in one is accompanied by a change in the other.
Example:
- Yield and rainfall
- Production and price
- Demand and supply
2. Correlation is said to be positive if an increase in one variable is accompanied by an increase in the other, or a decrease in one by a decrease in the other.
Example: Demand and Production, Production and Supply, Income and Want
3. Correlation is said to be negative if an increase in one variable is accompanied by a decrease in the other, or a decrease in one by an increase in the other.
Example: Production and Price, Supply and Price
4. Correlation is said to be zero if the variables behave independently of each other.
Example: Decrease in tax rate and revenue
5. r lies between -1 and +1
- If r lies between 0 and 1, positive correlation exists
- If r is exactly 1, the correlation is perfect positive correlation
- If r lies between -1 and 0, negative correlation exists
- If r is exactly -1, the correlation is perfect negative correlation
Problems:
C-1: Find Karl Pearson’s co-efficient of correlation for the following data
Price:   14  16  17  18  19  20  21  22  23
Demand:  84  78  70  75  66  67  62  58  60
C-2: Calculate KPCC for the following two variables
X:  80  90  100  110  120  130  140  150  160
Y:  15  15  16   19   17   18   16   18   14
C-3: Calculate Pearson’s Co-efficient of correlation for the following
X:  6.9  8.5  5.8  8.6  9.6  8.0  9.7
Y:  2.9  3.8  6.5  2.3  5.5  3.5  3.2
  • 46. C-4: Calculate Karl Pearson’s Co-efficient of correlation between expenditure on advertising and sales from the data given below
Ad Expenses:    39  65  62  90  82  75  25  98  36  78
Sales (Lakhs):  47  53  58  86  62  68  60  91  51  84
C-5: From the following table calculate the co-efficient of correlation by Karl Pearson’s Method
X:  6  2   10  4  8
Y:  9  11  ?   8  7
The arithmetic means of the X and Y series are 6 and 8 respectively.
C-6: Calculate the co-efficient of correlation between the X and Y series from the following data
                                      X     Y
No. of Pairs of Observation           15    15
Arithmetic Mean                       25    18
Standard Deviation                    3.01  3.03
Sum of Squared Deviation from Mean    136   138
C-7: The co-efficient of correlation between X and Y is 0.48, the covariance of x and y is 36, and the variance of X is 16. Find the standard deviation of Y.
C-8: Given the following information: $r_{xy} = 0.8$, $\sum xy = 60$, $\sigma_y = 2.5$ and $\sum x^2 = 90$, where x and y are the deviations from the respective means. Find the number of items.
Properties of Correlation Co-efficient
1. Pearson’s Correlation Co-efficient cannot exceed 1 numerically. In other words, it lies between -1 and +1.
2. The Correlation Co-efficient is independent of the change of origin and scale.
3. Two independent variables are uncorrelated, but the converse is not true, i.e. uncorrelated variables need not necessarily be independent.
Spearman’s Rank Correlation
Sometimes we come across statistical series in which the variables under consideration are not capable of quantitative measurement but can be arranged in serial order. This happens when we are dealing with qualitative characteristics such as honesty, beauty, character, morality etc.
  • 47. In such situations Karl Pearson’s Co-efficient of correlation cannot be used as such: variables are measurable, whereas attributes are difficult to quantify. Spearman’s Co-efficient of correlation is more appropriate here. It is denoted by $\rho$ (Rho) and is calculated using the following formula:
$$ \rho = 1 - \frac{6 \sum D^2}{n^3 - n} $$
D = Difference between ranks, i.e. $D = R_1 - R_2$
n = Number of pairs of observations of x and y
Problems:
C-9: Find Spearman’s Rank correlation for the following data
X:  12  18  32  45  21  30
Y:  70  68  75  95  86  12
C-10: From the following data calculate the co-efficient of Rank correlation
Ranks
X:  1   2  3  4   5  6  7  8  9  10  11  12
Y:  12  9  6  10  3  5  4  7  8  2   11  1
C-11: Calculate Spearman’s Co-efficient of rank correlation for the following data
X:  78   89   97   69   49   79   68   57
Y:  125  137  156  112  107  136  123  108
Note: If X and Y contain repeated observations, tied values are each given the average of the ranks they would otherwise occupy, and Spearman’s rank correlation co-efficient is calculated as
$$ \rho = 1 - \frac{6 \left( \sum D^2 + CF \right)}{n^3 - n} $$
where CF is the correction factor for the repeated observations, given by
$$ CF = \frac{m^3 - m}{12} $$
Here “m” is the number of times a value is repeated; one such term is added for every group of repeated values in either series.
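As a minimal sketch of the rank formula, including the correction factor for repeated values, the following Python code may help when checking hand calculations. The function names are hypothetical, and ranking in ascending order is an assumption; ranking in descending order gives the same coefficient as long as both series are ranked the same way. The data are the ranks of problem C-10.

```python
from collections import Counter

def average_ranks(values):
    """Rank in ascending order; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m repeated values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_rho(x, y):
    """rho = 1 - 6 (sum D^2 + CF) / (n^3 - n); CF is 0 when there are no ties."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    cf = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + cf) / (n ** 3 - n)

# Ranks from problem C-10 (no ties, so CF = 0 and sum D^2 = 416).
x_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
y_ranks = [12, 9, 6, 10, 3, 5, 4, 7, 8, 2, 11, 1]
rho = spearman_rho(x_ranks, y_ranks)
print(round(rho, 4))
```

For data with repeated values, `average_ranks([1, 2, 2, 3])` assigns the two tied items the rank 2.5 each, and `tie_correction` contributes (2³ − 2)/12 = 0.5 for that group.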
  • 48. Problems:
C-12: Find Spearman’s rank correlation co-efficient
X:  78   89   89   69   59   79   68   57
Y:  125  137  156  112  112  112  123  108
C-13: Find Spearman’s Rank Correlation Co-efficient
X:  12  18  32  18  25  24  25  40  38  22
Y:  16  15  28  16  24  22  28  36  34  19
C-14: Find the Co-efficient of rank correlation between X and Y
X:  30  38  28  27  28  23  30  33  28  35
Y:  29  27  22  29  20  29  18  21  27  22
C-15: Calculate the Co-efficient of rank correlation
X:  15  20  28  12  40  60  20  80
Y:  40  30  50  30  20  10  30  60
Regression Analysis:
The technique of estimating one variable from the values of another variable, whenever the two variables are correlated, is called regression. That is, if “x” and “y” are correlated, then estimating the values of “x” based on the values of “y”, or estimating the values of “y” based on the values of “x”, is called Regression Analysis.
To estimate “x” values based on “y”, we use the regression equation of “x” on “y”, given by
$$ (x - \bar{x}) = b_{xy} \, (y - \bar{y}) $$
where $b_{xy}$ is the regression coefficient of x on y, and $\bar{x}$, $\bar{y}$ are the means of x and y respectively.
Similarly, to estimate “y” based on “x”, we use the regression equation of “y” on “x”:
$$ (y - \bar{y}) = b_{yx} \, (x - \bar{x}) $$
where $b_{yx}$ is the regression coefficient of y on x, and $\bar{x}$, $\bar{y}$ are the means of x and y respectively.
And
  • 49. $$ b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y} \quad \text{or} \quad b_{xy} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum y^2 - (\sum y)^2} $$
$$ b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x} \quad \text{or} \quad b_{yx} = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2} $$
R-1: Find the two regression equations for the following data
x:  62   72   98   76   81   56  76   92   88  49
y:  112  124  131  117  132  96  120  136  97  85
R-2: The following data relate to the experience of 8 operators and their performance rating (y). Calculate the regression line of performance rating on experience and estimate the probable performance of an operator who has 15 years of experience.
R-3: Fit a least squares line to the following
X:  1  3  4  8  9  11  14
Y:  1  2  4  5  7  8   9
a) Obtain the regression co-efficients of X on Y and Y on X
b) Find the co-efficient of correlation between X and Y
c) Find Y and X, when X=10 and Y=6 respectively
Note:
1. Co-efficient of correlation $r = \pm\sqrt{b_{xy} \cdot b_{yx}}$, taking the sign common to the two regression coefficients
2. $b_{xy}$ and $b_{yx}$ will never be of opposite sign
3. r will be positive if $b_{xy}$ and $b_{yx}$ are positive
4. r will be negative if $b_{xy}$ and $b_{yx}$ are negative
R-4: The heights of fathers and sons are given in the following table. Find the two lines of regression and estimate the expected average height of the son when the height of the father is 67.5 inches
Height of father:  65  66  67  67  68  69  71  73
Height of sons:    67  68  64  68  72  70  69  70
R-5: The marks of the 8th Standard students in mathematics and statistics are as follows. Find the regression line of marks in statistics on marks in mathematics, and also find the marks of the 9th student in statistics if he has scored 90 in mathematics.
(X) Maths Scored:       50  40  60  46  50  48  59  47
(Y) Statistics Scored:  30  37  42  32  35  45  40  35
R-6: You are given the following information, with r = 0.66. Find:
  • 50. - The two regression equations
- The regression co-efficients
- An estimate of x when y=100
      X   Y
AM    36  85
SD    11  8
R-7: Following is the information about advertisement expenditure and sales
      Advertising Expenditure   Sales
AM    20                        120
SD    5                         25
r = 0.8
a) Calculate the two regression equations
b) Find the likely sales when advertisement expenses are 25 crores
c) What should be the advertisement budget when the sales target is 150 crores?
R-8: Two regression equations are 2y-x-50 = 0 and 3y-2x-10 = 0. Find (i) r (ii) $\bar{x}$ and $\bar{y}$ (the point of intersection)
Questions from Previous Question Papers:
1. The following data give the test scores and sales made by nine salesmen during a certain period:
Test Scores:     14  19  24  21  26  22  15  20  19
Sales (’00 Rs):  31  36  48  37  50  45  33  41  39
Find the regression equation and also estimate the most probable sales volume of a salesman making a score of 28.
2. From the following data calculate the rank correlation coefficient after making adjustment for tied ranks
X:  48  33  40  9  16  16  65  24  16  57
Y:  13  13  24  6  15  4   20  9   6   19
3. The following data relate to the age of employees and the number of days they reported sick in a month. Calculate Karl Pearson’s co-efficient of correlation and interpret it.
Age (Years):  30  32  35  40  48  50  52  55  57  61
Sick Days:    1   0   2   5   2   4   6   5   7   8
  • 51. 4. The following data give the experience of machine operators and their performance ratings.
Operator:               1   2   3   4   5   6   7   8
Experience (in Years):  16  12  18  4   3   10  5   12
Performance Ratings:    87  88  89  68  78  80  75  83
Calculate the regression line of performance rating on experience and estimate the probable performance rating if an operator has 7 years of experience.
5. A company wants to assess the impact of R and D expenditure (in Rs. 1000/-) on its annual profit (in Rs. 1000/-). The following table presents the information for the last 8 years.
Year   R and D Expenditure   Annual Profit
2010   9                     45
2011   7                     42
2012   7                     40
2013   1                     60
2014   4                     30
2015   5                     34
2016   3                     25
2017   3                     20
Estimate the regression equations and predict the annual profit for the year 2020 for an allocated sum of Rs. 100,000/- as R and D expenditure.
6. The following table shows the ages (x) and blood pressure (y) of 8 persons.
x:  52  63  45  36  72  65  47  25
y:  62  53  51  25  19  43  60  33
Obtain the regression equation of y on x and find the expected blood pressure of a person who is 49 years old.
7. The following data relate to the ages of husbands and wives:
Age of Husbands (Years):  25  28  30  32  35  36  38  39  42  55
Age of Wives (Years):     20  26  29  30  25  18  26  35  35  46
Find the regression equations and also find the most likely age of the husband when the wife’s age is 25 years.
8. Calculate Karl Pearson’s coefficient of correlation between expenditure on advertising and sales from the data given below:
Advertisement expenditure (Rs.000’):  39  78  65  62  90  82  75  25  98  36
Sales (Rs. Lacs):                     47  84  53  58  86  62  68  60  91  51
Comment on the results.
  • 52. 9. Consider the following data and obtain the regression equations:
X:  6  2   10  4  8
Y:  9  11  5   8  7
10. A research company summarized advertising expenditure and sales results as follows:
        Adv. Exp. (Rs. in Crore)   Sales (Rs. in Crore)
Mean    20                         200
SD      18                         17
11. The following table gives the age of cars of a certain make and their annual maintenance costs. Estimate the maintenance cost of a 7 year old car.
Age of Cars (in Years):                 2   4   6   8
Maintenance cost (in Hundreds of Rs.):  10  20  25  30
12. A financial analyst wanted to find out whether inventory turnover influences a company’s earnings per share (in %). A random sample of 7 companies listed on the stock exchange was selected and the following data were recorded.
Company:                A   B  C   D  E   F  G
Inventory Turnover:     4   5  7   8  6   3  5
Earning Per Share (%):  11  9  13  7  13  8  8
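The regression-coefficient formulas of this unit can be sketched in Python as follows. This is only an illustrative check under stated assumptions: the function names are hypothetical, and the data are the car-age figures of question 11 above, used to show how the regression line of y on x produces an estimate.

```python
def regression_y_on_x(x, y):
    """Return (b_yx, mean_x, mean_y) for the line (y - mean_y) = b_yx (x - mean_x)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    b_yx = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return b_yx, sx / n, sy / n

def estimate_y(x_new, b_yx, mean_x, mean_y):
    """Estimate y for a given x from the regression line of y on x."""
    return mean_y + b_yx * (x_new - mean_x)

# Question 11: age of cars (years) and maintenance cost (hundreds of Rs.).
age = [2, 4, 6, 8]
cost = [10, 20, 25, 30]

b_yx, mean_age, mean_cost = regression_y_on_x(age, cost)
cost_at_7 = estimate_y(7, b_yx, mean_age, mean_cost)
print(b_yx, cost_at_7)
```

Swapping the two arguments, `regression_y_on_x(cost, age)`, gives the coefficient of x on y, because the two formulas differ only in which variable supplies the denominator.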
  • 53. Unit – 3 Probability Distribution
  • 54. Unit 3: (8 Hours) Probability Distribution:
- Concept and definition – Rules of probability – Random variables – Concept of probability distribution
- Theoretical probability distributions: Binomial, Poisson, Normal and Exponential – Bayes’ theorem (No derivation) (Problems only on Binomial, Poisson and Normal)
Introduction:
The result or outcome of an experiment that is performed repeatedly under essentially homogeneous and similar conditions falls into one of the following categories:
- Unique or Certain
  o Here the results are predictable with certainty, and they are known as deterministic or predictable phenomena. Example: Boyle’s Law: Pressure × Volume = Constant
  o Most of the physical and chemical sciences are deterministic in nature
- Uncertain or Unpredictable
  o Here the results cannot be predicted with certainty, and they are known as unpredictable or probabilistic phenomena. Example: a sales manager’s sales target, the life of an electric bulb
  o These are frequently observed in Economics, Business and Social Sciences
A numerical measurement of uncertainty is provided by a very important branch of statistics called the “Theory of Probability”. Here, using mathematics and statistics, we try to present the conditions under which we can make sensible numerical statements about uncertainty, and we apply certain methods of calculating numerical values of probabilities and expectations.
“Statistics is the science of decision making with calculated risks in the face of uncertainty”
Many business decisions are based on variables which are not under control, and hence the decisions may yield poor results. Using mathematical models we can make better decisions, but only in a limited number of cases. In such circumstances we make use of probability, i.e. calculations that help us predict uncertain conditions or situations.
Probability is a measure of the chance associated with the occurrence of an event (which is essential for making good business decisions). Example: Demand for a newly launched product.
  • 55. Important terminologies in Probability:
1. Experiment
The term experiment refers to an act which can be repeated under the same given conditions.
Random experiment: An experiment is called a random experiment if, when conducted repeatedly under essentially homogeneous conditions, the result is not unique or certain but may be any one of the various possible outcomes. Or: an experiment having random outcomes. Or: an experiment whose results depend on chance.
Example: Tossing a coin, rolling a dice
2. Trial
Performing a random experiment is called a trial.
Example: If the experiment of tossing a coin is done two times, that means two trials
3. Event:
An outcome or combination of outcomes of an experiment is termed an event.
Example: Tossing a coin – You may get H or T – These are events
4. Mutually Exclusive Events:
Two events are said to be mutually exclusive or incompatible when both cannot happen simultaneously in a single trial; in other words, the occurrence of any one of them rules out the occurrence of the other. Put differently, “if the happening of one event prevents the happening of the other events, such events are called mutually exclusive events”.
Example: Tossing a coin leads to two events, Head (H) or Tail (T). If head turns up in tossing a coin, then head prevents tail from turning up, and vice-versa
  • 56. 5. Independent and Dependent Events:
Two or more events are said to be independent when the outcome of one does not affect, and is not affected by, the other.
Example: When a coin is tossed twice, the occurrence of a head in the first trial does not affect the outcome of the next trial
Events are said to be dependent when the occurrence or non-occurrence of one event in any one trial affects the probability of the other event in another trial.
Example: Drawing a card without replacement.
6. Equally likely events:
Events are said to be equally likely when one does not occur more often than the others. This means none of them is expected to occur in preference to the others. In other words, all the events have an equal chance of occurrence.
Example: When you roll a dice, all the 6 faces, i.e. 1, 2, 3, 4, 5, 6, are equally likely to occur
7. Simple and Compound Events:
In the case of simple events we consider the probability of the happening or not happening of a single event. In the case of compound events we consider the joint occurrence of two or more events.
8. Exhaustive Events:
Events are said to be exhaustive when their totality includes all the possible outcomes of a random experiment. In other words, the sum of their individual chances of occurrence is equal to 1.
Example 1: When a dice is rolled once, the possible outcomes are 1, 2, 3, 4, 5 and 6; hence the exhaustive number of cases is 6
Example 2: If we roll two dice once, the exhaustive number of cases is $6^2 = 36$. Similarly, rolling three dice leads to $6^3 = 216$ outcomes, and the sum of the probabilities of all these outcomes is 1
10 Red balls and 6 White balls: Probability of drawing 3 white balls in the first draw and 3 black balls in the second draw; Probability of drawing a red ball
  • 57. 9. Complementary events:
Let there be two events A and B. A is called the complementary event of B (and vice versa) if A and B are mutually exclusive and exhaustive.
Example: When a dice is thrown, the occurrence of an even number and the occurrence of an odd number are complementary events.
The simultaneous occurrence of two events A and B is generally written as AB.
Definition of Mathematical Probability
Let there be a random experiment with “N” outcomes which are mutually exclusive, exhaustive and equally likely. Let there be an event “A”, and let “m” of the outcomes be favourable to the event “A”. Then the probability of occurrence of “A” can be written as follows:
$$ P(A) = \frac{m}{N} = \frac{\text{Outcomes favourable to } A}{\text{Total number of outcomes}} $$
Example: Rolling a dice once
S = {1, 2, 3, 4, 5, 6}
The total number of outcomes N = 6
- Probability of getting an odd number: $P(\text{Odd}) = \frac{m}{N} = \frac{3}{6} = \frac{1}{2} = 50\%$. That means the probability of getting an odd number is 50%
- Probability of getting an even number: $P(\text{Even}) = \frac{m}{N} = \frac{3}{6} = \frac{1}{2} = 50\%$. That means the probability of getting an even number is 50%
The event that “A” does not happen is called the complementary event of “A”, denoted by $\bar{A}$, $A^c$ or $A'$, i.e.
$$ P(\bar{A}) = \frac{N - m}{N} = \frac{\text{Total outcomes} - \text{Favourable outcomes}}{\text{Total outcomes}} $$
$$ P(\bar{A}) = \frac{N}{N} - \frac{m}{N} = 1 - \frac{m}{N} = 1 - P(A) $$
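The classical definition above can be illustrated with a short Python sketch that counts favourable outcomes for the single-dice example. The helper name `probability` is an assumption made for this illustration; exact fractions are used so that the complement rule can be checked without rounding.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Classical probability m / N for equally likely outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

dice = [1, 2, 3, 4, 5, 6]  # sample space S for one roll, N = 6

p_odd = probability(lambda face: face % 2 == 1, dice)      # event A
p_not_odd = probability(lambda face: face % 2 != 1, dice)  # complement of A

print(p_odd, p_not_odd, p_odd + p_not_odd)
```

The last printed value is the sum of the probabilities of an event and its complement, which the complement rule says must equal 1.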
  • 58. $P(\bar{A}) + P(A) = 1$, i.e. P(Failure) + P(Success) = 1
Theorems of Probability or Rules of Probability
The two important theorems of probability are:
1. The addition theorem
2. The multiplication theorem
1. The Addition Theorem of Probability:
- This is also called the “Or” probability: P(A occurring) or P(B occurring)
- It is also denoted as follows
  o P(A) or P(B)
  o P(A) ∪ P(B)
  o P(A or B)
  o P(A ∪ B)
- Here “Or” means “add”
- The addition theorem states that if two events A and B are mutually exclusive, the probability of the occurrence of either A or B is the sum of the individual probabilities of A and B. Symbolically, P(A or B) = P(A) + P(B)
- When the events are mutually exclusive, i.e. when the events are disjoint, simply adding the probabilities of the two events gives the correct result
Mutually exclusive events: P(A or B) = P(A ∪ B) = P(A) + P(B)
Similarly, for three or more mutually exclusive events: P(A or B or C) = P(A) + P(B) + P(C)
[Figure: Venn diagram of two disjoint events, Event A and Event B. A ∪ B denotes “A Union B”]
  • 59. - When events are “not mutually exclusive”, there is a possibility of both events occurring together, and the addition rule of probability is modified. When the events are not mutually exclusive, i.e. when there is an overlap, the addition rule is
[Figure: Venn diagram of two overlapping events, with the overlap of the events marked]
P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
Or P(A or B) = P(A ∪ B) = P(A) + P(B) – P(A and B)
Or P(A or B) = P(A ∪ B) = P(A) + P(B) – P(AB)
2. The Multiplication Theorem of Probability:
- This is also called the “And” probability
- Denoted as P(A) and P(B) for events A and B
- Also denoted as
  o P(A & B)
  o P(A ∩ B)
  o P(A) ∩ P(B)
- This theorem states that if two events A and B are independent, the probability that they will both occur is equal to the product of their individual probabilities
If A and B are independent, i.e. event “A” does not affect event “B”:
P(A ∩ B) = P(A) . P(B) or P(A and B) = P(A) × P(B)
  • 60. Similarly, for three independent events:
P(A ∩ B ∩ C) = P(A) × P(B) × P(C) or P(A, B and C) = P(A) . P(B) . P(C)
- If two events A and B are dependent, i.e. the probability of B is considered only when A is known to have occurred, then
P(A ∩ B) = P(A) × P(B/A) ; P(A) ≠ 0
P(B ∩ A) = P(B) × P(A/B) ; P(B) ≠ 0
Or
$$ P(B/A) = \frac{P(AB)}{P(A)} = \frac{P(A \text{ and } B)}{P(A)}, \qquad P(A/B) = \frac{P(AB)}{P(B)} = \frac{P(A \text{ and } B)}{P(B)} $$
The above cases are also called conditional probability.
Problems:
1. Two unbiased coins are tossed once. Find the probability of getting
I) At least one head II) At most one head III) Two tails
2. An unbiased dice is rolled once. Find the probability of getting
I) Odd Numbers II) Even Numbers
3. A pair of dice is rolled. Find the probability that the sum of the faces turning up is
I) 7 II) At least 10 III) 12 IV) Neither 7 nor 10
4. A bag contains 6 White, 4 Red and 8 Black marbles. 3 marbles are drawn randomly. What is the probability that they are of
I) The same colour II) Different colours
  • 61. 5. A bag contains 7 White and 9 Black marbles. 2 marbles are drawn randomly. What is the probability that
I) They are of the same colour? II) They are of different colours?
6. A bag contains 5 Red, 3 White and 6 Green sticks. 3 sticks are drawn randomly. Find the probability that they are
I) All green II) 2 Red and 1 Green III) 3 White
7. 4 cards are drawn from a pack of cards. Find the probability that 2 are Spades and 2 are Hearts.
8. From a pack of 52 cards, 4 are accidentally drawn. Find the chance that
I) They consist of a Jack, a Queen, a King and an Ace II) They are one from each suit III) 2 of them are Red and 2 of them are Black
9. In a college, 60% of the students play football and 50% of them play basketball. If a student is selected randomly from the college, what is the probability that
I) The student plays basketball or football II) The student plays neither sport
10. One person is known to hit the target in 3 out of 4 shots, whereas another person is known to hit the target in 2 out of 3 shots. Find the probability that the target is hit.
11. A salesman is known to sell the product in 3 out of 5 attempts, and another salesman in 2 out of 5 attempts. Find the probability that
I) No sale will be effected when they both try to sell the product (i.e. both fail to sell) II) Either of them succeeds in selling the product
12. The odds against student “X” solving a B/S problem are 8:6, and the odds in favour of student “Y” solving a B/S problem are 14:16
I) What is the chance that the problem will be solved if they both try independently of each other? II) What is the probability that neither solves the problem?
13. If P(A) = 0.3, P(B) = 0.2 and P(C) = 0.1, and A, B, C are independent events, find the probability of occurrence of at least one of the 3 events A, B and C
Bayes’ Theorem:
This theorem allows us to use new information to update the conditional probability of an event. Bayes’ theorem in its simple form is given by
$$ P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B/A) \, P(A)}{P(B)} $$
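The addition rule, the multiplication rule, conditional probability and the simple form of Bayes’ theorem can all be checked by brute-force counting. The sketch below, with hypothetical helper names, enumerates the 36 outcomes of a pair of dice; the particular events (sum of 7, sum of 10, first dice even, second dice even) are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for one roll of a pair of dice: 36 equally likely outcomes.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

def sum_is_7(o):
    return o[0] + o[1] == 7

def sum_is_10(o):
    return o[0] + o[1] == 10

def first_even(o):
    return o[0] % 2 == 0

def second_even(o):
    return o[1] % 2 == 0

# Addition rule, mutually exclusive events (a sum cannot be both 7 and 10).
p_7_or_10 = prob(lambda o: sum_is_7(o) or sum_is_10(o))

# Addition rule, overlapping events: subtract P(A and B).
p_7_or_even = prob(lambda o: sum_is_7(o) or first_even(o))
p_7_and_even = prob(lambda o: sum_is_7(o) and first_even(o))

# Multiplication rule, independent events (the two dice do not interact).
p_both_even = prob(lambda o: first_even(o) and second_even(o))

# Conditional probability and Bayes' theorem.
p_10_and_even = prob(lambda o: sum_is_10(o) and first_even(o))
p_10_given_even = p_10_and_even / prob(first_even)
p_even_given_10 = p_10_given_even * prob(first_even) / prob(sum_is_10)

print(p_7_or_10, p_7_or_even, p_both_even, p_10_given_even, p_even_given_10)
```

Because every probability is an exact `Fraction`, each rule can be compared with the directly counted value using plain equality.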
  • 62. Random Variable
Random variables are ways to map the outcomes of random processes to numbers. They quantify the outcomes of a random experiment. If you have a random process such as flipping a coin, rolling a dice, or measuring the rain that might fall tomorrow, you are mapping the outcomes of these random processes to numbers, which means you are quantifying the outcomes.
- A random variable is a function which takes real values that are determined by the outcomes of the random experiment
- Random variables are denoted by the capital letters X, Y, Z
- The actual value which an outcome assumes is not itself a random variable
- The random variable is used for further mathematical operations on the outcomes and for the purpose of notation
Example: In a random experiment where three coins are tossed simultaneously, the sample space is
S = {H, T} × {H, T} × {H, T}
and the complete list of outcomes is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let us consider a variable “X” to quantify the outcomes of the above experiment. If “X” is the number of heads obtained, then “X” takes any one of the values {0, 1, 2, 3}
Outcomes:     HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
Values of X:  3    2    2    1    2    1    1    0
Hence the random variable is a function which takes real values that are determined by the outcomes of the random experiment.
Discrete and Continuous Random Variable:
If “X” assumes only a finite or countably infinite set of values, it is known as a Discrete Random Variable.
Example: Number of students in a college, marks obtained by the students in a test, number of defective mangoes in a basket.
If “X” assumes an infinite and uncountable set of values, it is said to be a Continuous Random Variable. Here we usually talk of the values in a particular interval and not at a point.
Example: Height or weight of students in a classroom
Generally, a Discrete Random Variable represents counted data while a Continuous Random Variable represents measured data.
  • 63. Probability Distribution of a Discrete Random Variable
Let us consider a Discrete Random Variable “X” which can take the possible values $x_1, x_2, x_3, \ldots, x_n$. With each value of the variable X we associate a number
$$ p_i = P(X = x_i), \quad i = 1, 2, 3, \ldots, n $$
where
$$ p_i = P(X = x_i) \ge 0 \quad \text{and} \quad \sum p_i = p_1 + p_2 + p_3 + \cdots + p_n = 1 $$
The function $p_i = P(X = x_i)$, or $p(x)$, is called the Probability Mass Function of the random variable X, and the set of all possible ordered pairs $\{x, p(x)\}$ is called the Probability Distribution of the random variable X.
The concept of a probability distribution is analogous to that of a frequency distribution. Just as a frequency distribution tells us how the total frequency is distributed among the different values (or classes) of the variable, a probability distribution tells us how the total probability of 1 is distributed among the various values which the random variable can take. It is usually represented in the tabular form given below.
Probability Distribution of Random Variable X
x:     x1  x2  x3  ...  xn
p(x):  p1  p2  p3  ...  pn
Probability Distribution of a Continuous Random Variable
This is represented in the form of a frequency polygon drawn from the grouped frequency distribution of a continuous variable. A frequency polygon gets smoother and smoother as the sample size gets larger and the class intervals become more numerous and narrower. Ultimately the frequency polygon becomes a smooth curve called the density curve. The function that defines the curve is called the Probability Density Function.
Concept of Probability Distribution
The probability distribution of a random variable may be
- A theoretical listing of outcomes and probabilities, which can be obtained from a mathematical model representing some phenomenon or process of interest
- A subjective listing of outcomes associated with their subjective or contrived probabilities, representing the degree of conviction of the decision maker as to the likelihood of the possible outcomes
- An empirical listing of outcomes and their observed relative frequencies
Here we focus on the theoretical listing of the outcomes and their probabilities.
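As an illustration of a theoretical listing, the three-coin experiment used earlier to introduce random variables can be enumerated in Python. This is a minimal sketch with variable names chosen for this example: it maps each outcome to X, the number of heads, and collects the resulting probability mass function.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of tossing three coins: the product {H, T} x {H, T} x {H, T}.
outcomes = ["".join(toss) for toss in product("HT", repeat=3)]

# The random variable X maps every outcome to its number of heads.
x_values = [outcome.count("H") for outcome in outcomes]

# Probability mass function: share of equally likely outcomes giving each x.
counts = Counter(x_values)
pmf = {x: Fraction(count, len(outcomes)) for x, count in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
```

The probabilities are non-negative and add up to 1, which are the two conditions a probability mass function has to satisfy.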
  • 64. Mathematical Expectation of a Random Variable:
The expected value of “X”, denoted by E(X), is defined as
$$ E(X) = \sum \left[ x \times p(x) \right] = \bar{X} $$
Hence the mathematical expectation of a random variable is nothing but its arithmetic mean.
Mean, Variance and Standard Deviation
$$ \text{Mean} = E(X) = \sum x \, p(x) $$
$$ \text{Variance} = \sigma_x^2 = E(X^2) - [E(X)]^2 = \sum x^2 \, p(x) - \left[ \sum x \, p(x) \right]^2 $$
$$ \text{Standard Deviation} = \sigma_x = \sqrt{\sum x^2 \, p(x) - \left[ \sum x \, p(x) \right]^2} $$
Problems:
1. A die is tossed twice. Getting “an odd number” is termed a success. Find the probability distribution of the number of successes.
2. Two cards are drawn from a well shuffled deck of 52 cards
A. Successively with replacement
B. Simultaneously (successively without replacement)
Find the probability distribution of the number of aces.
3. Obtain the probability distribution of X, the number of heads in three tosses of a coin (or a simultaneous toss of three coins)
4. Two dice are rolled at random. Obtain the probability distribution of the sum of the numbers on them.
5. Four bad apples are mixed accidentally with 20 good apples. Obtain the probability distribution of the number of bad apples in a draw of 2 apples at random.
6. A die is thrown at random. Find the expectation of the number on it.
7. A random variable “X” has the following probability distribution. Find the Mean and Variance
x:     4    5    6    7
p(x):  0.1  0.3  0.4  0.2
8. A r.v. X has the following probability function
x:     -2   -1  0    1   2    3
p(x):  0.1  k   0.2  2k  0.3  k
Find k, the Mean and the s.d. of X
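The mean and variance formulas can be applied mechanically once the distribution is written as value and probability pairs. The sketch below uses the distribution of problem 7 above; the function name is an assumption, and probabilities are entered as exact fractions so that the arithmetic can be compared with a hand calculation.

```python
from fractions import Fraction
from math import sqrt

def mean_and_variance(distribution):
    """Return (E(X), Var(X)) for a dict mapping each value x to p(x)."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must add up to 1")
    mean = sum(x * p for x, p in distribution.items())
    e_x2 = sum(x * x * p for x, p in distribution.items())  # E(X^2)
    return mean, e_x2 - mean ** 2

# Problem 7: x = 4, 5, 6, 7 with p(x) = 0.1, 0.3, 0.4, 0.2.
distribution = {
    4: Fraction(1, 10),
    5: Fraction(3, 10),
    6: Fraction(4, 10),
    7: Fraction(2, 10),
}

mean, variance = mean_and_variance(distribution)
std_dev = sqrt(variance)
print(float(mean), float(variance), std_dev)
```

The check on the total probability is a design choice: it catches a mistyped table before any moments are computed.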
  • 65. Theoretical Probability Distribution
Theoretical probability distributions are functions of a known random variable which generate probabilities for given values of the random variable. In other words, probability distributions are ready-to-use formulae (functions) for calculating the probability of a known variable.
Amongst theoretical or expected frequency distributions, the following are popular:
1. Binomial Distribution
2. Poisson Distribution
3. Normal Distribution
Binomial Probability Distribution:
- It is also known as the “Bernoulli Distribution”: a probability distribution expressing the probability of one set of dichotomous alternatives, i.e. success or failure.
- Conditions or assumptions of the Binomial Distribution
  o n, the number of trials, is finite
  o Each trial results in two mutually exclusive and exhaustive outcomes, termed success and failure
  o Trials are independent
  o p, the probability of success, is constant for each trial; then q = 1-p is the probability of failure in any trial
- Bernoulli trial: A trial having only two outcomes
Example: Tossing a coin: H or T; Outcome of a game: Win or Lose; Business outcome: Success or Failure
Let “x” be a binomial random variable with “n” trials and P(Success) = p. Then the probability of “x” successes is given by
$$ P(x) = {}^{n}C_{x} \; p^{x} \, q^{n-x} $$
where
x = Number of successes in “n” trials
n = Number of trials
p = Probability of success in a single trial
q = (1-p) = Probability of failure in a single trial
Here “n” and “p” are the parameters of the Binomial Distribution. They are the unknown values in the above formula. If we know the values of “n” and “p”, we are able to find the required probability.
Constants of Binomial Distribution
- Mean = np
- Variance = npq
  • 66. - Standard Deviation = $\sqrt{npq}$

    Problems on Binomial Distribution
    14. A fair coin is tossed 5 times. What is the probability of getting
        I) Exactly three heads
        II) At least 1 head
        III) No heads
        IV) At most 3 heads
    15. A salesman makes a sale to 4 out of 10 (40% success) customers he contacts. If four customers are contacted today, what is the probability that he makes exactly two sales?
    16. 20% of the bolts manufactured by a machine are defective. When 5 bolts are chosen randomly, find the probability that there are
        I) No defective bolts
        II) At most two defective bolts
        III) At least 1 defective bolt
        IV) Exactly one defective bolt
    17. The probability of a man hitting a target is 1/5. If he aims 7 times at the target, what is the probability that he hits it
        I) At least once
        II) At least thrice
    18. In hundred sets of 10 tosses of an unbiased coin, in how many sets should we expect to get
        I) 7 heads and 3 tails
        II) At least 7 heads

    Fitting of a distribution (Framing a Probability Distribution)
    - The process of obtaining expected frequencies based on a theoretical probability distribution is called fitting of a distribution.
    19. Fit a Binomial distribution to the following data
        X   0   1   2   3   4   5
        f   2  10  24  38  18   8
    20. Fit a Binomial distribution to the following data
        X   0   1   2    3   4   5
        f   2  10  48  114  72  40
    21. Fit a Binomial distribution to the following data
        X   0   1   2   3   4   5   6   7
        f   7   6  19  35  30  23   7   1
  • 67. 22. The mean and variance of a Binomial distribution are 12 and 5 respectively. Find the parameters of the Binomial distribution.
    23. The mean and standard deviation of a Binomial distribution are 4 and √3 respectively. Find n, p and q.
    24. Find the probability of success for a Binomial distribution if n = 6 and 4P(X=4) = P(X=2).

    Poisson Probability Distribution
    The Poisson distribution may be expected in cases where the chance of any individual event being a success is small. It is used to describe the behaviour of rare events, such as the number of accidents on a road or the number of printing mistakes in a book. It has been called “the Law of Improbable Events”.
    Let “x” be a discrete random variable, with mean $\lambda$, that counts the occurrences of a rare event. Then “x” is said to follow the Poisson probability distribution, and its probability mass function (PMF) is given by
        $P(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
    where x = 0, 1, 2, 3, 4, … and $\lambda$ is the parameter of the Poisson distribution, i.e. np or the average number of occurrences of the event.
    The Poisson distribution is a discrete distribution with a single parameter $\lambda$. As $\lambda$ increases, the distribution shifts to the right. All Poisson probability distributions are skewed to the right, which is why the Poisson distribution has been called the probability distribution of rare events.

    Constants of the Poisson distribution
    - The mean of the Poisson distribution = $\lambda$
    - The standard deviation = $\sqrt{\lambda}$

    Role of the Poisson distribution
    It is used for infrequently occurring events counted with respect to time, area, volume or similar units. Some practical situations in which the Poisson distribution can be used are given below.
    - Quality control statistics
    - Biology – counting the number of bacteria
    - Number of particles emitted from a radioactive substance
    - Insurance – number of casualties
    - Waiting time problems – number of incoming calls
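The binomial and Poisson formulas given above translate directly into code. The following Python sketch is an illustration only; the function names are assumptions, and the two sample calls use the figures from problem 14 (I) and from a Poisson variable with mean 3.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!, for x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Exactly three heads in 5 tosses of a fair coin: 10/32
p_three_heads = binomial_pmf(3, 5, 0.5)

# Exactly two occurrences when the mean is 3: e^-3 * 9 / 2
p_two_events = poisson_pmf(2, 3)

# "At least one" and "at most" questions use the same building block
p_at_least_one_head = 1 - binomial_pmf(0, 5, 0.5)
p_at_most_three_heads = sum(binomial_pmf(k, 5, 0.5) for k in range(4))

print(round(p_three_heads, 4), round(p_two_events, 4))
```

Expected frequencies for a fitted distribution are obtained by multiplying each probability by the total observed frequency.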
  • 68. Problems on Poisson distribution:
    25. A skilled typist makes an average of 2 typing mistakes per book of 100 pages. In a randomly chosen book of 100 pages typed by the same typist, what is the probability that
        I) There are no typing mistakes
        II) There is at least one typing mistake
        III) There are exactly 3 typing mistakes
    26. On an expressway, the average number of car accidents in a week is 3. For a randomly chosen week of the year, what is the probability that
        I) There are no accidents reported
        II) There are exactly 2 accidents in that week
    27. In a world-class manufacturing facility, the average number of occupational hazards in a year is 2. Find the probability that in a given year of safety systems assessment there are at most two occupational hazards.
    28. On an average, 2 flights crash due to technical problems in a year. Find the probability that in a given year there is
        I) No crash
        II) At least one crash
    29. The number of customers demanding information under the RTI Act in a government office is 1.5 per week on an average. Find the probability that in a given week of the year
        I) There is no demand
        II) There are 3 demands
    30. Accidents occur on a particular stretch of highway at an average rate of 3 per week. Assuming a Poisson distribution, find the probability of exactly two accidents in a given week ($e^{-3} = 0.04979$).

    Fitting a Poisson distribution
    31. A systematic sample of 200 pages was taken from a manuscript typed by a typist, and the observed frequency distribution of the typing mistakes per page was found to be as follows. Fit a Poisson distribution.
        Number of typing mistakes    0   1   2   3   4
        Number of pages            122  60  15   2   1
    32. Calculate the mean and variance of a Poisson variable “X” if P(X=4) = P(X=5).
    33. For a Poisson variable, P(X=1) = P(X=2). Find the mean and P(X=0).

    Poisson Approximation to the Binomial distribution
    - If “X” is a Binomial variable with “n” very large (> 30) and “p” very small (< 0.1), then X approximately follows the Poisson distribution with $\lambda = np$.
    34. 2% of the electric bulbs manufactured by a company are defective. Find the probability that in a sample of 200 bulbs
        I) Less than 2 bulbs are defective
        II) More than 3 bulbs are defective
  • 69. 35. On an average, one in 400 chips is found to be defective. If chips are packed in boxes of 100, what is the probability that any given box will contain
        I) One defective
        II) One or more defectives
        III) Less than 2 defectives

    Normal Probability Distribution
    The normal distribution, also called the Normal Probability Distribution, is the most useful theoretical distribution for continuous variables. Most of the data relating to economic and business statistics, or even the social and physical sciences, conform to this distribution.

    Properties of the Normal Distribution
    1. The normal curve is “bell shaped” and symmetrical in its appearance.
    2. The height of the normal curve is at its maximum at the mean. Hence the mean and mode of the normal distribution coincide. Thus for a Normal Distribution the Mean, Median and Mode are all equal.
    3. There is one maximum point of the normal curve, which occurs at the mean.
    4. Since there is only one maximum point, the normal curve is uni-modal, i.e. it has only one mode.
    5. As distinguished from the Binomial and Poisson distributions, where the variable is discrete, the variable distributed according to the normal curve is continuous.
    6. The first and third quartiles are equidistant from the Median.
    7. The area under the normal curve is distributed as follows
       a. Mean ± 1$\sigma$ covers 68.27% of the area; 34.135% of the area lies on either side of the Mean
       b. Mean ± 2$\sigma$ covers 95.45% of the area
       c. Mean ± 3$\sigma$ covers 99.73% of the area
    8. The mean deviation is 4/5ths, or more precisely 0.7979, of the standard deviation.

    Conditions for Normality
    - The causal forces must be numerous and of approximately equal weight
    - The forces must be the same over the universe from which the observations are drawn. This is the condition of homogeneity
    - The forces affecting events must be independent of one another
    - Condition of symmetry

    [Figure: Normal Distribution Graph]
  • 70. Calculation of Probabilities in Normal Distribution
    In order to calculate probabilities for a Normal variable “X”, we transform the Normal variable “X” to the Standard Normal variable “Z” by using
        $Z = \dfrac{x - \mu}{\sigma}$
    Therefore $X \sim N(\mu, \sigma)$ gets transformed to $Z \sim N(\mu = 0, \sigma = 1)$, with density
        $f(z) = \dfrac{1}{\sqrt{2\pi}} \, e^{-z^{2}/2}$
    The probability values corresponding to different values of Z are available in normal tables, which are used to obtain probabilities for the normal distribution.

    Problems on Normal distribution:
    36. The marks obtained by students in an examination follow a normal distribution with mean = 45 and SD = 10. Calculate the probability that a randomly chosen student has scored
        I) Less than 60 marks
        II) Between 60 and 80 marks
        III) Less than 40 marks
        IV) More than 70 marks
        V) Between 35 and 60 marks
    37. The average daily sales of 500 branch offices was Rs. 1,50,000 with SD = Rs. 15,000. Assuming the distribution to be normal, calculate how many branches have sales
        I) Above Rs. 1,70,000
        II) Between Rs. 1,20,000 and Rs. 1,40,000
        III) Between Rs. 1,45,000 and Rs. 1,65,000
    38. The monthly income of 500 workers follows a normal distribution with a mean of Rs. 2000 and an SD of Rs. 200. Estimate the number of workers with income
        I) Exceeding Rs. 2300 per month
        II) Between Rs. 1800 and Rs. 2300 per month
        III) What is the lowest income among the 25% of workers in the highest income group?
    39. A banking recruitment board conducts a qualifying exam for 1000 candidates, and the scores of the candidates follow a normal distribution with a mean of 52 marks and SD = 6.
        I) Find the number of candidates scoring between 40 and 55 marks
        II) If the recruitment board wishes to recruit only the top 10% of scorers, what is the cut-off mark?
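When a normal table is not at hand, the Z transformation above can be combined with the error function from Python's standard library to obtain the same areas. This is a sketch under stated assumptions: the helper names are illustrative, and the identity Φ(z) = ½[1 + erf(z/√2)] is used in place of a table lookup. The parameters are those of problem 36.

```python
import math

def z_score(x, mu, sigma):
    """Standardise x: Z = (x - mu) / sigma."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob_between(lo, hi, mu, sigma):
    """P(lo < X < hi) for X ~ N(mu, sigma)."""
    return (standard_normal_cdf(z_score(hi, mu, sigma))
            - standard_normal_cdf(z_score(lo, mu, sigma)))

mu, sigma = 45, 10  # marks distribution from problem 36

p_below_60 = standard_normal_cdf(z_score(60, mu, sigma))   # Z = 1.5
p_35_to_60 = normal_prob_between(35, 60, mu, sigma)        # Z from -1 to 1.5
within_one_sigma = normal_prob_between(mu - sigma, mu + sigma, mu, sigma)

print(round(p_below_60, 4), round(p_35_to_60, 4), round(within_one_sigma, 4))
# prints: 0.9332 0.7745 0.6827
```

Multiplying such a probability by the number of items (branches, workers, candidates) gives the expected count asked for in problems 37 to 39.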
  • 71. Questions from Previous Year Question Papers:
    1. A merchant’s file of 20 accounts contains 6 delinquent and 14 non-delinquent accounts. An auditor randomly selects 5 of these accounts for examination.
       A) What is the probability that the auditor finds exactly 2 delinquent accounts?
       B) Find the expected number of delinquent accounts in the sample selected.
    2. The mean and standard deviation of the wages of 6000 workers engaged in a factory are Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate:
       a. Percentage of workers getting wages above Rs. 1600
       b. Number of workers getting wages between Rs. 1100 and Rs. 1500
       The relevant extract of the area table (area under the normal curve from 0 to Z) is given below
        Z      0.25    0.5     0.6     0.75    1.00    1.25    1.5
        Area   0.0987  0.1915  0.2257  0.2734  0.3413  0.3944  0.4332
    3. The mean and standard deviation of the wages of 1000 workers engaged in a factory are Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate
       a. Percentage of workers getting wages above Rs. 1600
       b. Number of workers getting wages between Rs. 600 and Rs. 900
    4. The probability that a pen manufactured by a company will be defective is 1/10. If 12 such pens are manufactured, using the binomial distribution find the probability that
       a. Exactly two will be defective
       b. At least 3 will be defective
       c. At most 3 will be defective
    5. A project yields an average cash flow of Rs. 500 Lakhs, with a standard deviation of Rs. 60 Lakhs. Calculate the following probabilities.
       a. Cash flow will be more than Rs. 560 Lakhs
       b. Cash flow will be less than Rs. 420 Lakhs
    6. The incidence of an occupational disease in an industry is such that a worker has a 20 percent chance of suffering from it. What is the probability that out of six workers, 4 or more will contract the disease?
    7. Suppose a life insurance company insures the lives of 5000 persons aged 42. Studies show that the probability that any 42-year-old person will die in a given year is 0.001. Find the probability that the company will have to pay at least two claims during a given year. What is the probability that the company will have to pay zero claims?
    8. In a manufacturing organization with 5000 employees, the mean wage of workers is Rs. 8000/- per month with a standard deviation of Rs. 2000/-. Assuming a normal distribution, estimate:
       a. Number of workers getting salary below Rs. 6000/-
       b. Number of workers getting salary above Rs. 10,000/-
       c. Number of workers getting salary between Rs. 7000/- and Rs. 9000/-
  • 72. 9. The mean and standard deviation of the wages of 1000 workers engaged in a factory are Rs. 1200 and Rs. 400 respectively. Assuming the distribution to be normal, estimate
       a. Percentage of workers getting wages above Rs. 1600
       b. Number of workers getting wages between Rs. 600 and Rs. 900
       The areas under the normal curve for different values of Z are given below
        Z      0.5     0.75    1       1.5
        Area   0.1915  0.2734  0.3413  0.4332
    10. In a factory turning out fan blades, there is a small chance of 0.002 for any blade to be defective. The blades are supplied in packets of 10. Use the Poisson distribution to calculate the approximate number of packets containing no defective, one defective and two defective blades respectively in a consignment of 10,000 packets.
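Question 10 relies on the Poisson approximation to the binomial with λ = np. The sketch below, an illustration rather than a model answer, compares the Poisson expected frequencies with the exact binomial ones for the figures stated in that question; the function and variable names are assumptions.

```python
import math

def binomial_pmf(x, n, p):
    """Exact binomial probability nCx * p^x * (1-p)^(n-x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability e^(-lambda) * lambda^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

n, p, packets = 10, 0.002, 10_000   # blades per packet, P(defective), consignment size
lam = n * p                         # Poisson parameter, lambda = np = 0.02

poisson_counts = [packets * poisson_pmf(k, lam) for k in range(3)]
binomial_counts = [packets * binomial_pmf(k, n, p) for k in range(3)]

for k in range(3):
    print(f"{k} defective: Poisson {poisson_counts[k]:.2f}, "
          f"binomial {binomial_counts[k]:.2f}")
```

With p this small the two sets of expected frequencies agree closely, which is why the approximation is acceptable even though n here is only 10.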
  • 73. Unit – 4 Time Series Analysis
  • 74. Unit 4: (12 Hours) Time Series Analysis: Introduction - Objectives of Studying Time Series Analysis - Variations in Time Series - Methods of Estimating Trend: Freehand Method - Moving Average Method - Semi-Average Method - Least Square Method. Methods of Estimating Seasonal Index: Method of Simple Averages - Ratio to Trend Method - Ratio to Moving Average Method

    Introduction:
    Forecasting is an important tool in any decision-making process. Forecasting is essential in making many managerial decisions, such as deciding about the raw material required for production, the investment required for equipment purchase, the sales forecast, the human resource requirement forecast, etc.
    A time series is a set of numerical values of some variable recorded at regular periods over time. The series is usually tabulated or graphed in a manner that readily conveys the behaviour of the variable under study.
    Example: The exports of a cement company from 2007 to 2017
        Year    Export (Tonnes)
        2007     2
        2008     3
        2009     6
        2010    10
        2011     8
        2012     7
        2013    12
        2014    14
        2015    14
        2016    18
        2017    19
    [Figure: chart of Export (Tonnes) by year, 2007 to 2017]
    The above graph suggests that the series is time dependent. The management of the company is interested in determining how the series depends on time and in developing a means of predicting future levels with some degree of reliability.

    Objectives of Studying Time Series Analysis
    1. The assumption underlying time series analysis is that the time series data behave in the future as they did in the past. Time series analysis is used to detect the pattern underlying the data and to isolate the influencing factors, which in turn are used to estimate the future accurately. Thus, time series data help us to cope with uncertainty about the future.
  • 75. 2. Reviews and evaluations of the progress made in plans are based on time series data. For example, the Finance Ministry of the Govt. of India (GOI) reviews the gross domestic product (GDP) of the economy during the financial year and chalks out strategies to further the growth.

    Variations in Time Series / Components of a Time Series
    In a typical time series there are four main components, which seem to be independent of one another and seem to influence the time-series data. An important step in analysing a time series is to consider the types of data patterns. Time series data can contain some or all of the following elements:
    1. Trend (T)
    2. Cyclical (C)
    3. Seasonal (S)
    4. Irregular (I)

    1. Trend (T): The trend is the long-term pattern of a time series. A trend can be positive or negative depending on whether the time series exhibits an increasing or a decreasing long-term pattern. The rate of trend growth usually varies over time.
  • 76. 2. Cyclical (C): Time series data may show up and down movements around a given trend. For example, a business cycle over the years shows an upward trend and touches its peak, and then it may slump and hit the bottom. The pattern repeats, but not at a regular interval of time. The duration of a cycle depends on the type of business or industry. In brief, an upward and downward oscillation about the trend line, of uncertain duration and magnitude, is called a cycle.

    3. Seasonal (S): It is a special case of the cyclical component of a time series in which the magnitude and duration of the cycle do not vary but occur at a regular interval each year. Seasonality occurs when the time series exhibits regular variation during the same periods (the same month or the same quarter) every year.
  • 77. 4. Irregular or Random (I): This type of variation is unpredictable. It is caused by short-term, unanticipated and non-recurring factors. These variations follow no specific pattern.

    Methods of Estimating Trend:
    These are also called the forecasting methods of Time Series Analysis. Some of them are
    1. Freehand Method
    2. Moving Average Method
    3. Semi-average Method
    4. Least-Square Method

    1. Freehand Method: It is an easy method of estimating the trend. The first step is to plot the values of the time series on a graph and then draw a trend line through these points such that the line reflects the long-term trend of the data. This method does not require any rigorous mathematical calculations. Here the forecast can be obtained simply by extending the trend line. A trend line fitted by the freehand method should conform to the following conditions.
    - The trend line should be smooth – a straight line or a mix of long gradual curves
    - The sum of the vertical deviations of the observations above the trend line should be equal to the sum of the vertical deviations of the observations below the trend line
    - The sum of squares of the vertical deviations of the observations from the trend line should be as small as possible
    - The trend line should bisect the cycles so that the area above the trend line is equal to the area below the trend line, not only for the entire series but as much as possible for each full cycle
    Limitations: The method involves personal bias, is very subjective and needs judgement.
    Example: Fit a trend line to the following data by using the freehand method
        Year           2010  2011  2012  2013  2014  2015  2016  2017
        Sales (Lakh)     80    90    92    83    94    99    92   104
  • 78. 2. Moving Average Method: It is a very simple and flexible method. As the name suggests, in this method we calculate a series of averages of successive overlapping groups of values. The number of values to be included in each average is determined by a constant called the period. The resulting averages smooth out the fluctuations and extreme values.
    Two cases
    - When the period is odd
    - When the period is even
    Example: The sales of a store for 11 years are given below. Find the 3-year, 5-year and 7-year moving averages.
        Year           2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017
        Sales (Lakh)      7    13    19    25    31    37    43    49    55    61    67
    Example: The sales of a store for 11 years are given below. Find the 4-year and 6-year moving averages.
        Year           2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017
        Sales (Lakh)      7    13    19    25    31    37    43    49    55    61    67

    3. Semi-average Method: This method is used to estimate the trend line if a linear function can describe the data sufficiently. The procedure is as follows
    1. Divide the given time series into two segments, leaving out the middle period if the number of time periods is odd. If the number of time periods is even, divide the time series into two segments leaving out the two time periods in the middle.
    2. Find the average of the values of each segment and plot the two average points on the graph against the middle time period of each segment.
    3. Join the two points to plot the trend line. You can extend the trend line to predict the value for a future time period.
    4. The trend line equation is of the form $\hat{Y} = a + bx$. The intercept “a” and the slope “b” can be found by using the expressions
        Slope: $b = \dfrac{\Delta Y}{\Delta X} = \dfrac{\text{difference in the average values of the two segments}}{\text{difference in the mid-periods of the two segments}}$
        Intercept: a = average value of the first segment at its mid-period
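The moving-average and semi-average procedures above can be sketched in Python as follows. The function names are illustrative assumptions, the data are the store sales from the moving-average examples, and for an even number of periods this sketch simply splits the series into two equal halves, which is an assumption of the sketch rather than the rule stated in step 1.

```python
def moving_average(values, period):
    """Averages of successive overlapping groups of `period` values."""
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

def centered_moving_average(values, period):
    """For an even period: average each adjacent pair of moving averages
    so that every result lines up with an actual time period."""
    ma = moving_average(values, period)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]

def semi_average_trend(years, values):
    """Return a function year -> trend value fitted by the semi-average method.
    With an odd number of periods the middle one is left out."""
    n = len(values)
    half = n // 2
    y1 = sum(values[:half]) / half        # average of the first segment
    y2 = sum(values[n - half:]) / half    # average of the second segment
    x1 = sum(years[:half]) / half         # mid-period of the first segment
    x2 = sum(years[n - half:]) / half     # mid-period of the second segment
    slope = (y2 - y1) / (x2 - x1)
    return lambda year: y1 + slope * (year - x1)

years = list(range(2007, 2018))
sales = [7, 13, 19, 25, 31, 37, 43, 49, 55, 61, 67]

ma3 = moving_average(sales, 3)             # first value belongs to 2008
cma4 = centered_moving_average(sales, 4)   # first value belongs to 2009
trend = semi_average_trend(years, sales)

print(ma3[0], cma4[0], trend(2018))
# prints: 13.0 19.0 73.0
```

Because this particular series rises by exactly 6 each year, every method reproduces the original values; with irregular data the moving averages would smooth the fluctuations instead.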
  • 79. Example: The sales of a manufacturing firm from 2001 to 2011 are given below. Fit a trend line by using the method of semi-averages and also estimate the sales for the year 2014.
        Year               2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011
        Sales (in Crores)   103   105   114   112   116   120   116   122   126   127   124

    4. Least-Square Method: The trend may be linear or curvilinear. Let us consider values that can be described by a straight line. These are called linear trends. The general equation for estimating a straight line is $\hat{Y} = a + bx$, where
        $\hat{Y}$ = estimated value of the dependent variable
        a = y-intercept, i.e. the value of $\hat{Y}$ when x = 0
        b = slope of the trend line
        x = independent variable, i.e. the time
    When x is measured as a deviation from the middle time period, so that $\sum x = 0$, the values of “a” and “b” can be found from
        $a = \dfrac{\sum y}{n}$, $b = \dfrac{\sum xy}{\sum x^{2}}$
    Example: The production of a firm over the years is given below
        a) Fit a straight line trend using the method of least squares
        b) Estimate the production figure for the year 2012
        c) Use a graph to plot the actual and the estimated production
        Year (X)                2005  2006  2007  2008  2009  2010  2011
        Production (in ‘000s)     87    91    93    86    96    99    92
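The least-squares formulas a = Σy/n and b = Σxy/Σx² hold when time is measured as a deviation from the middle of the series. A minimal Python sketch using the production data from the example above follows; the function name and return convention are assumptions made for illustration.

```python
def least_squares_trend(years, values):
    """Fit Y = a + b*x, where x is the deviation of each year from the
    middle of the series so that the deviations sum to zero."""
    n = len(values)
    mid = sum(years) / n
    x = [year - mid for year in years]
    a = sum(values) / n
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, mid

years = list(range(2005, 2012))
production = [87, 91, 93, 86, 96, 99, 92]   # in '000s

a, b, mid = least_squares_trend(years, production)
estimate_2012 = a + b * (2012 - mid)

print(f"a={a:.2f}, b={b:.4f}, estimate for 2012={estimate_2012:.2f}")
# prints: a=92.00, b=1.2143, estimate for 2012=96.86
```

Trend values for the observed years are obtained the same way, by substituting each year's deviation from 2008 into a + bx.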
  • 80. Methods of Estimating Seasonal Index
    - Method of Simple Averages
    - Ratio to Trend Method
    - Ratio to Moving Average Method

    Method of Simple Averages
    Example: The sales of lathes in the last three years are given below. Use the method of simple averages to determine the seasonal index for each month.
        Month  Jan  Feb  Mar  April  May  June  July  Aug  Sep  Oct  Nov  Dec
        2009    16   17   19     19   24    24    21   29   30   34   34   39
        2010    22   21   27     26   30    27    21   27   31   36   33   43
        2011    28   28   38     39   39    33    33   37   41   50   44   56

    Ratio to Trend Method
    Example: The quarterly sales of a stationery store (Rs. in thousands) for five years, i.e. 2007 – 2011, are given below. Use the ratio to trend method to determine the seasonal indexes.

    Ratio to Moving Average Method
    Example: The quarterly sales for four years, 2008 – 2011, are given below. Use the ratio to moving average method to determine the seasonal indexes.
        Sales (Rs. in Thousands)
        Year    I   II  III   IV
        2008   77   62   56   61
        2009   85   64   62   79
        2010   91   73   67   86
        2011  102   80   74   95

    1. Fit a straight line trend by the semi-average method for the following data: (June 2010)
        Year              1994  1995  1996  1997  1998  1999  2000  2001  2002
        Sales (in ‘000)     45    50    60    55    60    65    70    80    85
    2. Calculate the trend values by the method of moving averages, assuming a 4-year cycle, from the following data relating to sugar production in India. Also plot the actual and trend values on a graph. (June 2010)
        Year                         1977  1978  1979  1980  1981  1982  1983  1984  1985  1986  1987  1988
        Sugar prodn (in Lakh tons)     75    62    76    78    94    84    96   128   116    76   102   168
  • 81. 3. Compute the trend values by finding four-yearly moving averages for the following time series. Also graph the observed values and the trend values.
        Year              1988  1989  1990  1991  1992  1993  1994  1995  1996
        Sales (in ‘000)    103   104   107   101   102   104   105    99   100
    4. Fit a straight line trend to the following data by the method of least squares. Estimate the value for 1996.
        Year:   1989  1990  1991  1992  1993  1994
        Value:    10     8    12     9    11    12
    5. What is a Time Series? Discuss the various components of a Time Series.
    6. Fit a trend line to the following data by the method of semi-averages (draw a graph)
        Year    Sales of Firm A (thousand units)
        1990    102
        1991    105
        1992    114
        1993    110
        1994    108
        1995    116
        1996    112
    7. Below are given the figures of production (in thousand kgs.) of castings in a factory
        Years        1990  1991  1992  1993  1994  1995  1996
        Production     12    10    14    11    13    15    16
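The manual gives only an example for the method of simple averages, so the sketch below assumes the usual textbook procedure: average each month across the years, divide by the average of those monthly averages, and multiply by 100. The function name is an assumption; the data are the lathe sales for 2009 to 2011 from the earlier example.

```python
months = ["Jan", "Feb", "Mar", "April", "May", "June",
          "July", "Aug", "Sep", "Oct", "Nov", "Dec"]

lathe_sales = {
    2009: [16, 17, 19, 19, 24, 24, 21, 29, 30, 34, 34, 39],
    2010: [22, 21, 27, 26, 30, 27, 21, 27, 31, 36, 33, 43],
    2011: [28, 28, 38, 39, 39, 33, 33, 37, 41, 50, 44, 56],
}

def seasonal_indices(data_by_year):
    """Seasonal index per period = (period average / grand average) * 100."""
    rows = list(data_by_year.values())
    n_periods = len(rows[0])
    period_averages = [sum(row[i] for row in rows) / len(rows)
                       for i in range(n_periods)]
    grand_average = sum(period_averages) / n_periods
    return [100 * avg / grand_average for avg in period_averages]

indices = dict(zip(months, seasonal_indices(lathe_sales)))

for month, index in indices.items():
    print(f"{month:<6}{index:7.2f}")
```

An index below 100 marks a month whose sales are typically below the yearly average, and the twelve indices add up to 1200.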
  • 82. Questions from Previous Year Question Papers:
    1. Below are given the figures of production of a sugar factory. Values are in thousand quintals.
        Year         2011  2012  2013  2014  2015  2016  2017
        Production     80    90    92    83    94    99    92
       Fit a straight line trend and show the trend line on a graph. Estimate the production in 2020.
    2. The data on prices (in Rs. per kg) of a certain commodity during 2007 to 2011 are shown below. Compute the seasonal indexes by the average percentage method.
        Quarter  2007  2008  2009  2010  2011
        1          45    48    49    52    60
        2          54    56    63    65    70
        3          72    63    70    75    84
        4          60    56    65    72    66
    3. From the following series of annual data, find the trend line by the method of semi-averages. Also estimate the value for 1999.
        Year          1990  1991  1992  1993  1994  1995  1996  1997  1998
        Actual Value   170   231   261   267   278   302   299   298   340
    4. The sales of a company in millions of rupees for the years 1994 – 2001 are given below
        Year   1994  1995  1996  1997  1998  1999  2000  2001
        Sales   550   560   555   585   540   525   545   585
       a. Find the linear trend equation
       b. Estimate the sales for the year 1993
       c. Find the slope of the straight line trend
       d. Do the figures show a rising trend or a falling trend?
    5. Taking deviations of the time variable, compute the trend values for the following data by the method of least squares:
        Days         1   2   3   4   5   6   7
        Sales (Rs.) 20  30  40  20  50  60  80
    6. With the help of the following data, calculate the trend values by the method of least squares and estimate the sales for the year 2011.
        Years             2000  2001  2002  2003  2004  2005  2006
        Sales (in Lakhs)    25    27    32    36    44    55    69
  • 83. 7. The following table relates to tourist arrivals (in millions) in India during 1994 to 2000.
        Years             1994  1995  1996  1997  1998  1999  2000
        Tourist Arrivals    18    20    23    25    24    28    30
       Fit a straight line trend by the method of least squares and estimate the number of tourists that would arrive in the year 2004.
    8. The sales of a company in millions of rupees for the years 1994 – 2001 are given below:
        Years:  1994  1995  1996  1997  1998  1999  2000  2001
        Sales:   550   560   555   585   540   525   545   585
       a. Find the linear trend equation
       b. Estimate the sales for the year 1993
    9. Calculate seasonal indices by the “ratio to moving averages” method from the following data
        Year  1st Quarter  2nd Quarter  3rd Quarter  4th Quarter
        2005           68           62           61           63
        2006           65           58           66           61
        2007           68           63           63           67
    10. Gross revenue data (Rs. in million) for a travel agency for a 10-year period are as follows
        Years:    2000  01  02  03  04  05  06  07  08  09
        Revenue:     3   6  10   8   7  12  14  14  18  19
       Calculate a 3-year moving average for the revenue earned.
  • 84. Unit – 5 Part A Linear Programming
  • 85. Unit 5: (8 Hours) Linear Programming: structure, advantages, disadvantages, formulation of LPP, solution using Graphical method.

    Linear Programming
    For making decisions in a business environment, model formulation is very important because it represents the essence of the business decision problem. Here formulation means the process of converting the verbal description and numerical data into mathematical expressions which represent the relevant relationships among
    - Decision factors / variables
    - Objectives that the firm wants to achieve – Objective function
    - Restrictions on the use of resources
    “Linear programming is a particular type of technique used for the economic allocation of scarce and limited resources, such as Labour, Materials, Machines, Time, Warehouse Space, Capital and Energy, to several competing activities such as Products, Services, Jobs, New equipment, Projects etc.”
    Linear Programming is one of the optimization techniques used to optimise business variables like Profit, Cost, Sales and Waste with the available limited resources. Linear Programming uses mathematical modelling with the help of linear equations.
    “Linear programming is one of the optimizing techniques used to maximize the profit or minimize the cost of a given function using linear equations.”
    Two words make up Linear Programming = Linear + Programming
    a. Linear – means a linear relationship among the variables in the model; a change in one leads to a proportionate change in the other
    b. Programming – modelling or solving a problem mathematically

    Structure of the Linear Programming Model
    General Structure of the LP Model: an LP model consists of three components
    1. Decision Variables / Courses of Action
    2. Objective Function
    3. Constraints

    1. Decision Variables: To arrive at the optimal value of the objective function, we need to evaluate the various alternatives, i.e. the various courses of action. If there are no alternatives, there is no need for LP.
  • 86. - These activities are denoted as $x_1, x_2, x_3, \dots, x_n$
    - The values of these denote the extent to which each activity is performed
    - They are under the control of the decision maker
    - They are interrelated in terms of the limited resources
    - All decision variables are continuous, controllable and non-negative: $x_1 \ge 0, x_2 \ge 0, \dots, x_n \ge 0$

    2. Objective Function: A mathematical representation of the objective in terms of a measurable quantity such as Profit, Cost, Revenue, Distance etc. An LPP aims to achieve the highest profit or the lowest cost by utilizing the available limited resources to the best possible extent.
        Optimize (Minimize or Maximize) $Z = C_1 x_1 + C_2 x_2 + C_3 x_3 + \dots + C_n x_n$
        Z = measure of performance
        $x_1, x_2, x_3, \dots, x_n$ = decision variables
        $C_1, C_2, C_3, \dots, C_n$ = coefficients (per-unit profit or cost contribution of each decision variable)
    The optimum value of the given objective function is obtained by the graphical method or the simplex method.

    3. Constraints:
    - There are certain limitations (or constraints) on the use of resources, e.g. Labour, Machines, Raw Materials, Space, Money etc., that limit the degree to which the objective can be achieved
    - Such constraints must be expressed as linear equalities or inequalities in terms of the decision variables
    - The solution of the LP model must satisfy these constraints
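The three components described above can be made concrete with a small two-variable problem. The Python sketch below uses a hypothetical LP (maximise Z = 3x₁ + 5x₂ subject to x₁ ≤ 4, 2x₂ ≤ 12, 3x₁ + 2x₂ ≤ 18, x₁, x₂ ≥ 0); the numbers are illustrative and do not come from the manual. It mimics the graphical method by finding the corner points of the feasible region and evaluating Z at each, relying on the standard result that an optimum of an LP, when one exists, occurs at a corner point.

```python
from itertools import combinations

# Hypothetical LP: maximise Z = 3*x1 + 5*x2.
# Each constraint (a1, a2, b) means a1*x1 + a2*x2 <= b.
constraints = [
    (1, 0, 4),     # x1 <= 4
    (0, 2, 12),    # 2*x2 <= 12
    (3, 2, 18),    # 3*x1 + 2*x2 <= 18
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]
objective = (3, 5)

def intersection(c1, c2):
    """Point where the boundary lines of two constraints meet (Cramer's rule)."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None                      # parallel lines never meet
    return (r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det

def is_feasible(point, tol=1e-9):
    """True if the point satisfies every constraint."""
    x1, x2 = point
    return all(a1 * x1 + a2 * x2 <= b + tol for a1, a2, b in constraints)

def z_value(point):
    return objective[0] * point[0] + objective[1] * point[1]

# Corner points are the feasible intersections of pairs of boundary lines
corner_points = set()
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and is_feasible(point):
        corner_points.add((round(point[0], 9), round(point[1], 9)))

best_point = max(corner_points, key=z_value)
best_value = z_value(best_point)
print(sorted(corner_points), best_point, best_value)
```

For this hypothetical problem the feasible region has five corner points and Z is largest at x₁ = 2, x₂ = 6, where Z = 36. On paper the same answer is read off the graph by drawing the constraint lines and comparing Z at each vertex.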
  • 87. General Mathematical Model of the Linear Programming Problem:
    The general Linear Programming Problem / Model with “n” decision variables and “m” constraints can be stated in the following form
        Decision variables: $x_1, x_2, x_3, \dots, x_n$ (one should find these values)
        Objective function: Optimize $Z = C_1 x_1 + C_2 x_2 + C_3 x_3 + \dots + C_n x_n$
        subject to the linear constraints
            $a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + \dots + a_{1n} x_n \; (\le, =, \ge) \; b_1$
            $a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + \dots + a_{2n} x_n \; (\le, =, \ge) \; b_2$
            $\vdots$
            $a_{m1} x_1 + a_{m2} x_2 + a_{m3} x_3 + \dots + a_{mn} x_n \; (\le, =, \ge) \; b_m$
        such that $x_1, x_2, x_3, \dots, x_n \ge 0$ (non-negativity constraints)
    where
        $C_1, C_2, C_3, \dots, C_n$ => profit or cost coefficients
        $a_{11}, a_{12}, a_{13}, \dots, a_{mn}$ => technical coefficients
        $b_1, b_2, b_3, \dots, b_m$ => availabilities or requirements

    Assumptions of Linear Programming
    - Certainty
    - Divisibility
    - Additivity
    - Linearity
  • 88. Certainty: The LP model assumes that all parameters, such as the availability of resources, the profit or cost contribution of a unit of a decision variable, and the consumption of resources by a unit of a decision variable, are known and constant.
    Divisibility (Continuity): The solution values of the decision variables and resources are assumed to take either whole number (integer) or fractional values.
    Additivity: The value of the objective function for the given values of the decision variables, and the total resources used, must be equal to the sum of the contributions (profit or cost) earned from each decision variable and the sum of the resources used by each decision variable respectively.
        Example 1: Total profit earned by the sale of two products A and B = sum of the profits earned separately from A and B
        Example 2: Resources consumed by A and B = sum of the resources used for A and B individually
    Linearity / Proportionality: All relationships in the LP model (both the objective function and the constraints) must be linear.
        Example: If production of one unit of a product uses 5 hours of a particular resource, then making 3 units of that product uses 3 × 5 = 15 hours of that resource

    Advantages of Linear Programming
    1. It helps in attaining the optimum use of productive resources
    2. It improves the quality of decisions, since it is more objective than subjective
    3. It provides possible and practical solutions
    4. It highlights bottlenecks in the production processes
    5. It helps in re-evaluating a basic plan for changing conditions

    Limitations of Linear Programming / Disadvantages
    1. It treats all relationships among decision variables as linear
    2. There is no guarantee of getting an integer-valued solution
    3. It does not take into consideration the effect of time and uncertainty
    4. Large-scale problems can be difficult to solve even with the use of a computer; such a problem may have to be fragmented into several smaller problems, each solved separately
    5. Parameters appearing in the model are assumed to be constant, but in real life they are frequently neither known nor constant
    6. It deals with a single objective, whereas in real-life situations we may come across conflicting multi-objective problems. In such cases a goal programming model is used instead of linear programming
Application Areas of Linear Programming
1. Agriculture Application
- Efficient production patterns can be specified by the Linear Programming model under regional land resources and national demand constraints
- Applied in agricultural planning: resource allocation
2. Military Application
- Number of defence units that should be used in a given attack in order to provide the required level of protection at the lowest possible cost
3. Production Management
- Product mix: the objective is to maximise the total contribution subject to all constraints
- Production planning: to manage the operating cost
- Assembly line balancing: to reduce total elapsed time
- Blending problem: find the minimum cost blend
- Trim loss: minimise trim loss
4. Financial Management
- Portfolio selection: to find the allocation which maximises the total expected return or minimises the risk under certain limitations
- Portfolio planning: maximising the profit margin from investment in plant facilities and equipment
5. Marketing Management
- Media selection: maximise the effective exposure subject to the limitation of budget and specified exposure rates to different market segments
- Travelling salesman problem: to find the shortest route
- Physical distribution: locating the manufacturing plants and distribution centres
6. Personnel Management
- Staffing problem: allocation of optimal resources, i.e. manpower, to a particular job to reduce the overtime cost
- Determination of equitable salaries
- Job evaluation and selection: identifying a suitable person for a specified job
Other applications of linear programming lie in the areas of administration, education, fleet utilization, awarding contracts, hospital administration and capital budgeting.
Guidelines or Steps for Linear Programming (LP) Model Formulation:
1. Identify the Decision Variables ("n" decision variables)
- How many?
- How much quantity of each decision variable?
- Which?
2. Formulate the Objective Function
- Identify whether the objective function is to be maximised or minimised
- Maximization: profit, revenue, margin, viewers
- Minimization: cost, time, number of employees
- The objective function is then written as follows:
Z = C1X1 + C2X2 + C3X3 + ... + CnXn
3. Identify the Problem Data
- Here we need to provide the actual values of the coefficients attached to the decision variables identified earlier. For this we use the information given in the problem
- These quantities constitute the problem data
4. Formulate the Constraints ("m" constraints)
- Express the constraints in terms of the requirements and availability of each resource
- Convert the verbal expression of the constraints imposed by resource availability into linear equalities or inequalities in terms of the decision variables defined in step 1
A11X1 + A12X2 + A13X3 + ... + A1nXn (≤ / ≥ / =) b1
A21X1 + A22X2 + A23X3 + ... + A2nXn (≤ / ≥ / =) b2
. . .
Am1X1 + Am2X2 + Am3X3 + ... + AmnXn (≤ / ≥ / =) bm
Here the main aim is to translate a real-life problem into a mathematical model.
5. Non-Negativity Constraints: X1, X2, X3, ..., Xn ≥ 0
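The formulation steps above can be sketched in code as a small data structure. This is only an illustration: the class name `LinearProgram` and its methods are hypothetical, and the figures (two products with profits of Rs. 200 and Rs. 500, using labour, power and raw material limited to 200, 500 and 1000 units) are read here as a maximisation model with "≤" constraints, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinearProgram:
    """Maximise c.x subject to A x <= b and x >= 0 (hypothetical helper)."""
    c: List[float]          # step 2: objective coefficients C1..Cn
    A: List[List[float]]    # step 4: technical coefficients, one row per constraint
    b: List[float]          # step 4: availabilities b1..bm

    def objective(self, x):
        # Z = C1*X1 + C2*X2 + ... + Cn*Xn
        return sum(ci * xi for ci, xi in zip(self.c, x))

    def is_feasible(self, x):
        # step 5: non-negativity, then every resource constraint
        if any(xi < 0 for xi in x):
            return False
        return all(
            sum(aij * xj for aij, xj in zip(row, x)) <= bi
            for row, bi in zip(self.A, self.b)
        )


# Assumed illustrative data: X1, X2 = units of products A and B per day
lp = LinearProgram(
    c=[200, 500],
    A=[[2, 3],   # labour
       [3, 4],   # power
       [2, 3]],  # raw material
    b=[200, 500, 1000],
)

plan = [40, 40]
print(lp.is_feasible(plan), lp.objective(plan))
```

Checking a candidate plan this way does not solve the LPP; it only confirms that the written model behaves as the verbal problem describes.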
Problems:
1. Suppose that a company produces 2 products, A and B. Product A earns a profit of Rs. 200 per unit and Product B earns a profit of Rs. 500 per unit. The company uses 3 main resources: Labour, Power and Raw Materials.
Product A uses 2 units of Labour, 3 units of Power and 2 units of Raw Materials.
Product B uses 3 units of Labour, 4 units of Power and 3 units of Raw Materials.
For a day's production the company can use 200 units of Labour, 500 units of Power and 1000 units of Raw Materials. Formulate the LPP.
2. A manufacturer produces 2 types of model, M1 and M2. Each model M1 requires 4 hours of grinding and 2 hours of polishing. Each model M2 requires 2 hours of grinding and 5 hours of polishing. The manufacturer has 2 grinders and 3 polishers; each grinder works 40 hours a week and each polisher works 60 hours a week. The profit on M1 is Rs. 30 per unit and the profit on M2 is Rs. 40 per unit. Formulate the LPP.
3. A company manufactures 2 products, Fibre and Viscose. 1 Kg of Fibre fetches a profit of Rs. 400 and 1 Kg of Viscose gives a profit of Rs. 500. These 2 products use 3 basic resources: Labour, Power and Raw Materials. To produce 1 Kg of Fibre requires 1 man-day of Labour, 2 units of Power and 2 units of Raw Materials. To produce 1 Kg of Viscose requires 2 man-days of Labour, 3 units of Power and 1 unit of Raw Materials. The resources are limited, such that the company can utilize 100 man-days of Labour, 500 units of Power and 800 units of Raw Materials. Formulate the LPP.
4. A paper mill produces 2 grades of paper, namely X and Y. Because of a restriction on the availability of raw materials, it cannot produce more than 400 tonnes of grade X and 300 tonnes of grade Y in a week. It requires 0.2 and 0.4 hours to produce a tonne of products X and Y respectively, with corresponding profits of Rs. 200 and Rs. 400 per tonne. There are 160 production hours in a week. Formulate the above LPP.
5. A person requires 10, 12 and 12 units of chemicals A, B and C respectively for his garden. A liquid product contains 5, 2 and 1 units of A, B and C respectively per jar. A dry product contains 1, 2 and 4 units of A, B and C respectively per carton. The liquid product is sold for Rs. 3 per jar and the dry product for Rs. 2 per carton. How many units of each product should be purchased in order to minimise the cost and meet the requirements? Formulate the LPP.
6. A company produces two types of leather belts, "A" and "B". A is of superior quality to B. The respective profits are Rs. 10 and Rs. 15 per belt. The supply of raw materials is sufficient for making 850 belts per day. For belt A, a special type of buckle is required and 500 pieces are available per day. There are 700 buckles available for belt B per day. Belt A needs twice as much time as belt B, and the company can produce 500 belts if all of them are of type A. Formulate the LPP.
7. A company manufactures 2 products, A and B. Each unit of B takes twice as long to produce as a unit of A. If the company were to produce only A, it would have time to produce 2000 units per day. The availability of raw materials is sufficient to produce 1500 units per day of A and B combined. Product B uses special ingredients, so only 600 units of B can be produced per day. A costs Rs. 20 and B costs Rs. 50. Formulate the LPP.
8. An animal food company must produce 200 Kg of a mixture consisting of ingredients X1 and X2 daily. X1 costs Rs. 3 per Kg and X2 costs Rs. 8 per Kg. Not more than 80 Kg of X1 can be used and at least 60 Kg of X2 must be used. Formulate the LP model to minimise the cost.
9. A firm produces 3 products, A, B and C. It uses 2 types of raw materials, I and II, of which 500 and 7500 units respectively are available. The raw material requirements per unit of the products are given below.

Raw Material   A   B   C
I              3   4   5
II             5   3   5

The labour time for each unit of product A is twice that of product B and 3 times that of product C. The entire labour force of the firm can produce the equivalent of 3000 units. The minimum demand for the 3 products is 600, 650 and 500 units respectively. The ratio of the number of units produced must be equal to 2:3:4. The profit per unit of A, B and C is 50, 50 and 80 respectively. Formulate the LPP.
10. A retired person wants to invest an amount of Rs. 30,000 in fixed income securities. His broker recommends investing in 2 bonds. Bond A yields 7% and bond B yields 10%. After some consideration he decides to invest at most Rs. 12,000 in bond B and at least Rs. 6,000 in bond A. He also wants the amount invested in bond A to be at least equal to the amount invested in bond B. Formulate the LP model to maximize the return on investment.
11. A person has inherited Rs. 1,00,000 from his father-in-law, which can be invested in a combination of only 2 stock portfolios, with the maximum investment allowed in either portfolio set at Rs. 75,000. The first portfolio has an average return of 10% and the second has 20%. In terms of the risk factors associated with these portfolios, the first has a risk rating of 4 (on a scale of 0-10) and the second has 9. Since he wants to maximise the return, he will not accept an average rate of return below 12% or a risk factor above 6. Formulate the LPP.
12. A manufacturer employs 3 inputs, man hours, machine hours and cloth material, to manufacture 2 types of dresses. A type A dress fetches him a profit of Rs. 160 per piece, while type B fetches Rs. 180 per piece. The manufacturer has enough man hours to manufacture 50 pieces of type A and 20 pieces of type B dresses per day. The machine hours he possesses suffice only for 36 pieces of type A and 24 pieces of type B. The cloth material available per day is limited, but sufficient for 30 pieces of either type of dress. Formulate the LPP.
13. A company produces 2 types of hats. Each hat of the first type requires twice as much labour time as a hat of the second type. If all hats are of the second type only, the company can produce a total of 500 hats per day. The market limits daily sales of the first and second types to 150 and 250 hats. The profits per hat are Rs. 8 for type A and Rs. 15 for type B. Formulate the LPP.
14. An animal food company must produce 200 Kg of a mixture of ingredients X1 and X2 daily. X1 costs Rs. 3 per Kg and X2 costs Rs. 8 per Kg. Not more than 80 Kg of X1 can be used and at least 60 Kg of X2 must be used. Formulate the LPP.
Solution to LPP using Graphical Method
Once the given business situation has been transformed into an LPP, it is solved using the graphical method as follows.
Step 1: Plot the constraint equations on the graph sheet and locate the region common to all the constraints. This is called the feasible region.
Step 2: Identify the extreme points bounding the feasible region.
Step 3: If a solution to the given LPP exists, it lies at one of these extreme points. Substitute the values of each extreme point into the objective function. The solution to the LPP is the extreme point at which the value of the objective function is optimized.
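The three steps can also be sketched in code for two-variable problems. The sketch below is an illustration, not the workbook's method of presentation: it assumes every constraint is of the "≤" type, the function names are hypothetical, and the data is that of the problem Max Z = 8X1 + 16X2 subject to X1 + X2 ≤ 200, X2 ≤ 125, 3X1 + 6X2 ≤ 900. Extreme points are found by intersecting every pair of boundary lines and keeping the intersections that satisfy all constraints.

```python
from itertools import combinations


def corner_points(constraints, eps=1e-9):
    """constraints: list of (a1, a2, b) meaning a1*X1 + a2*X2 <= b."""
    # Non-negativity written in the same form: -X1 <= 0 and -X2 <= 0
    lines = list(constraints) + [(-1, 0, 0), (0, -1, 0)]
    points = set()
    for (a1, a2, b), (c1, c2, d) in combinations(lines, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < eps:
            continue  # parallel boundary lines never meet
        # Cramer's rule for the intersection of the two boundary lines
        x1 = (b * c2 - a2 * d) / det
        x2 = (a1 * d - b * c1) / det
        # Steps 1 and 2: keep the point only if it lies in the feasible region
        if all(p * x1 + q * x2 <= r + eps for p, q, r in lines):
            points.add((round(x1, 6) + 0.0, round(x2, 6) + 0.0))
    return sorted(points)


def solve_graphically(objective, constraints):
    """Step 3: evaluate Z at each extreme point and keep the maximum."""
    c1, c2 = objective
    values = {p: c1 * p[0] + c2 * p[1] for p in corner_points(constraints)}
    best = max(values.values())
    return best, [p for p, z in values.items() if abs(z - best) < 1e-9]


best_z, optimal_points = solve_graphically(
    (8, 16), [(1, 1, 200), (0, 1, 125), (3, 6, 900)]
)
print(best_z, optimal_points)
```

For this data two extreme points give the same maximum, which is how alternative optimal solutions show up in the graphical method: every point on the edge joining them is also optimal.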
15. Solve the following LPP using the graphical method
Max Z = 8X1 + 16X2
Subject to
X1 + X2 ≤ 200
X2 ≤ 125
3X1 + 6X2 ≤ 900
where X1, X2 ≥ 0
16. Find the solution to the following LPP graphically
Max Z = 10X1 + 8X2
Subject to
2X1 + X2 ≤ 20
X1 + 3X2 ≤ 30
X1 − 2X2 ≥ −15
where X1, X2 ≥ 0
17. Find the maximum and minimum values of the function Z = 8X1 + 5X2
Subject to
3X1 − 2X2 ≥ 6
−2X1 + 7X2 ≥ 7
2X1 − 3X2 ≤ 6
where X1, X2 ≥ 0
18. Solve the following LPP using the graphical method
Z = 10X1 − 4X2
Subject to
2X1 − 6X2 ≤ 0
X1 − 2X2 ≤ 2
−3X1 − 3X2 ≥ −24
where X1, X2 ≥ 0
19. Solve the following LPP using the graphical method
Max Z = 10X1 + 15X2
Subject to
2X1 + X2 ≤ 26
2X1 + 4X2 ≤ 56
X1 − X2 ≥ −5
where X1, X2 ≥ 0
20. Solve the following LPP using the graphical method
Max Z = 0.07X + 0.1Y
Subject to
X + Y ≤ 30,000
Y ≤ 12,000
X ≥ 6,000
X − Y ≥ 0
where X, Y ≥ 0
21. Solve graphically
Max Z = 0.1X + 0.2Y
Subject to
X + Y ≤ 1,00,000
Y ≤ 75,000
X ≥ 75,000
−2X + 3Y ≤ 0
−2X + 8Y ≥ 0
where X, Y ≥ 0
22. Solve graphically
Max Z = −150X1 − 100X2 + 2,80,000
Subject to
20 ≤ X1 ≤ 60
70 ≤ X2 ≤ 140
120 ≤ X1 + X2 ≤ 140
where X1, X2 ≥ 0
Questions from Previous Year Question Papers:
1. A firm buys castings of P and Q type parts and sells them as finished products after machining, boring and polishing. The purchase costs for the castings are Rs. 3 and Rs. 4 each for parts P and Q, and the selling prices are Rs. 8 and Rs. 10 respectively. The per hour capacity of the machines used for machining, boring and polishing the two products is given below:

Capacity per hour
Parts        P    Q
Machining   30   50
Boring      30   45
Polishing   45   30

The running costs for machining, boring and polishing are Rs. 30, Rs. 22.50 and Rs. 22.50 per hour respectively. Formulate the LPP to find the product mix that maximizes the profit.
2. Mr. X has Rs. 1,00,000 that can be invested in a combination of only two stock portfolios, with the maximum investment allowed in either portfolio set at Rs. 75,000. The first portfolio has an average return of 10%, whereas the second has 20%. In terms of the risk factors associated with these portfolios, the first has a risk rating of 4 and the second has 9. Since he wants to maximise his returns, he will not accept an average rate of return below 12% or a risk rating above 6. How much should he invest in each portfolio? Formulate this as a linear programming problem and solve it graphically.
3. Solve the following problem by using the graphical method:
Minimize Z = 3X1 + 5X2
Subject to
−3X1 + 4X2 ≤ 12
2X1 + 3X2 ≥ 12
2X1 − X2 ≥ −2
and X1 ≤ 4; X2 ≥ 2; X1, X2 ≥ 0
4. Solve the following LPP graphically
Maximize Z = 10X1 + 15X2
Subject to
2X1 + X2 ≤ 26
2X1 + 4X2 ≤ 56
X1 − X2 ≥ −5
X1, X2 ≥ 0
5. Solve the following LPP using the graphical method
Minimize Z = 20X1 + 10X2
Subject to the constraints
X1 + 2X2 ≤ 40
3X1 + X2 ≥ 30
4X1 + 3X2 ≥ 60
such that X1, X2 ≥ 0
Unit – 5 Part A: Transportation Problem
Transportation problem: basic feasible solution using NWCM, LCM and VAM; unbalanced, restricted and maximization problems.
Transportation Problem
The Transportation Problem is a particular case of the LPP, used to minimise the cost of transporting goods from "m" different origins to "n" different destinations under the existing supply and demand constraints. The task is to transport various amounts of a single homogeneous commodity, initially stored at various origins, to different destinations in such a way that the total transportation cost is a minimum. The cost of transporting one unit of the commodity from each source to each destination is known. The commodity is to be transported from the various sources to the different destinations in such a way that the requirement of each destination is satisfied and, at the same time, the total cost of transportation is minimized.
Example: Pepsi has manufacturing units in 4 cities in Karnataka and distributes to more than 40 distribution centres.
A typical transportation problem contains
• Inputs:
  • Sources with availability (Supply)
  • Destinations with requirements (Demand)
  • Unit cost of transportation from the various sources to the destinations (Cost)
• Objective:
  • To determine a schedule of transportation that minimizes the total transportation cost.
A transportation problem can be stated mathematically as follows:
Let there be 'm' SOURCES and 'n' DESTINATIONS.
Let
$a_i$: the availability at the $i$-th source
$b_j$: the requirement of the $j$-th destination
$c_{ij}$: the cost of transporting one unit of the commodity from the $i$-th source to the $j$-th destination
$x_{ij}$: the quantity of the commodity transported from the $i$-th source to the $j$-th destination
(i = 1, 2, ..., m; j = 1, 2, ..., n)
Representation of General Transportation Problem
(each cell shows the quantity X with its unit cost C in brackets)

Origin / Source         D1          D2          D3          ...   Dn          Supply / Availability
S1                      X11 (C11)   X12 (C12)   X13 (C13)   ...   X1n (C1n)   a1
S2                      X21 (C21)   X22 (C22)   X23 (C23)   ...   X2n (C2n)   a2
S3                      X31 (C31)   X32 (C32)   X33 (C33)   ...   X3n (C3n)   a3
. .
Sm                      Xm1 (Cm1)   Xm2 (Cm2)   Xm3 (Cm3)   ...   Xmn (Cmn)   am
Demand / Requirements   b1          b2          b3          ...   bn          Σai = Σbj

The problem is to determine the values of xij such that the total cost of transportation is minimized. We assume that the total quantity available is the same as the total requirement, i.e. Σai = Σbj.
• Balanced transportation problems
• Unbalanced transportation problems
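Written out with the symbols defined above, the balanced transportation problem takes the following standard linear programming form. This is a sketch added for reference; the objective and constraints are the usual textbook statement rather than a formula quoted from the slides.

```latex
\begin{aligned}
\text{Minimise}\quad & Z = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij}\, x_{ij} \\
\text{subject to}\quad & \sum_{j=1}^{n} x_{ij} = a_i, \qquad i = 1, 2, \dots, m \quad \text{(supply at each source)} \\
& \sum_{i=1}^{m} x_{ij} = b_j, \qquad j = 1, 2, \dots, n \quad \text{(demand at each destination)} \\
& x_{ij} \ge 0 \quad \text{for all } i, j, \\
\text{with}\quad & \sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j .
\end{aligned}
```

Because total supply equals total demand, one of the m + n equality constraints is redundant, which is why a basic feasible solution has m + n − 1 allocations.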
Feasible Solution: Any set of non-negative allocations which satisfies the row and column sums (rim requirements) is called a feasible solution. A feasible solution is called a basic feasible solution if the number of non-negative allocations is equal to m+n-1, where "m" is the number of rows and "n" is the number of columns in the transportation table.
Non-Degenerate Basic Feasible Solution: A basic feasible solution to a transportation problem with "m" origins and "n" destinations is said to be non-degenerate if it contains "m+n-1" occupied cells and each allocation is in an independent position. The allocations are said to be in independent positions if it is impossible to form a closed path through them. A closed path is one traced along horizontal and vertical lines in which all the corner cells are occupied.
Degenerate Basic Feasible Solution: If a basic feasible solution contains fewer than m+n-1 non-negative allocations, it is said to be degenerate.
Solution to Transportation Problem
A transportation problem can be solved in 3 phases.
I. Find the Initial Basic Feasible Solution (IBFS) using any of the following methods
A. North West Corner Method
B. Least Cost Method
C. Vogel's Approximation Method
II. Test the IBFS for optimality using the Modified Distribution (MODI) / UV method
III. Optimise the solution using the stepping stone method
Initial Basic Feasible Solution (IBFS) by North West Corner Method (NWCM):
Step 1: Locate the cell at the north west corner of the given cost matrix and allocate the quantity "Xij" = min (corresponding ai, bj)
Step 2: Again locate the north west cell in the reduced cost matrix and allocate as before (Step 1)
Step 3: Continue allocating until all supplies and demands are exhausted
1: Find the IBFS by North West Corner Method

         D1   D2   D3   D4   Supply
O1        5    3    1    2     30
O2        2    6    3    1     20
O3        6    3    1    5     40
O4        6    1    2    3     10
Demand   20   35   25   20    100

Note: If the IBFS, or any solution to a transportation problem, contains m+n-1 allocated cells, the solution is said to be non-degenerate (the IBFS can be further improved for optimization), where m is the number of rows and n is the number of columns of the given TP.
2: Obtain IBFS by North West Corner Method

           P     Q     R     S   Supply
A         12    10    12    13     500
B          7    11     8    14     300
C          6    16    11     7     200
Demand   180   150   350   320    1000
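The North West Corner steps can be sketched in code as follows, using the data of problem 1 above. The function name is hypothetical, and the handling of the case where a row and a column are exhausted together (moving diagonally, which yields a degenerate solution) is a choice made for this sketch.

```python
def north_west_corner(costs, supply, demand):
    """Return ({(row, col): quantity}, total cost) for a balanced problem."""
    supply, demand = list(supply), list(demand)  # work on copies
    i = j = 0
    allocations = {}
    while i < len(supply) and j < len(demand):
        # Step 1: allocate min(ai, bj) to the current north west cell
        qty = min(supply[i], demand[j])
        allocations[(i, j)] = qty
        supply[i] -= qty
        demand[j] -= qty
        # Step 2: drop the exhausted row and/or column to get the reduced matrix
        row_done, col_done = supply[i] == 0, demand[j] == 0
        if row_done:
            i += 1
        if col_done:
            j += 1
    total = sum(q * costs[r][c] for (r, c), q in allocations.items())
    return allocations, total


costs = [[5, 3, 1, 2],
         [2, 6, 3, 1],
         [6, 3, 1, 5],
         [6, 1, 2, 3]]
supply = [30, 20, 40, 10]
demand = [20, 35, 25, 20]

allocations, total = north_west_corner(costs, supply, demand)
print(allocations)
print("Total cost:", total, "| occupied cells:", len(allocations))
```

Here the number of occupied cells can be compared with m + n − 1 = 7 to check whether the initial solution is non-degenerate.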
3: Obtain IBFS by North West Corner Method

          P    Q    R    S   Supply
A         6    4    1    5     14
B         8    9    2    7     16
C         4    3    6    2      5
Demand    6   10   15    4

4: Obtain IBFS by North West Corner Method

          A    B    C    D   Supply
P        19   30   50   10      7
Q        70   30   40   60      9
R        40    8   70   20     18
Demand    5    8    7   14

5: Obtain IBFS by North West Corner Method

           X     Y      Z   Supply
A         50    40     80     400
B         80    70     40     400
C         60    70     60     500
D         60    60     60     400
E         30    50     40     800
Demand   800   600   1100
6: Obtain IBFS by North West Corner Method

          A    B    C   Supply
P         2    7    4      5
Q         3    3    1      8
R         5    4    7      7
S         1    6    2     14
Demand    7    9   18

Initial Basic Feasible Solution (IBFS) by Least Cost Method (LCM)
Step 1: Locate the cell with the lowest cost and allocate to it the minimum of the corresponding ai and bj
Step 2: Again locate the next least cost cell in the reduced cost matrix and allocate as before
Step 3: Continue the process till all the allocations are made
7: Find IBFS by Least Cost Method:

         D1   D2   D3   D4   Supply
O1        1    2    3    4      6
O2        4    3    2    0      8
O3        0    2    2    1     10
Demand    4    6    8    6

8: Find IBFS by Least Cost Method:

          P    Q    R    S   Supply
A         6    4    1    5     14
B         8    9    2    7     16
C         4    3    6    2      5
Demand    6   10   15    4

9: Find IBFS by Least Cost Method:

         D1   D2   D3   D4   Supply
F1       19   30   50   10      7
F2       70   30   40   60      9
F3       40    8   70   20     18
Demand    5    8    7   14
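A minimal sketch of the Least Cost Method, applied to the data of problem 9 above. The function name is hypothetical. Sorting all cells by cost once and skipping cells whose row or column is already exhausted is equivalent to repeatedly picking the least cost cell of the reduced matrix; ties between equal costs are broken by row and then column order, which is an assumption of this sketch (other tie-breaking rules can give a different initial solution).

```python
def least_cost(costs, supply, demand):
    """Return ({(row, col): quantity}, total cost) for a balanced problem."""
    supply, demand = list(supply), list(demand)  # work on copies
    # Every cell as (cost, row, col), cheapest first
    cells = sorted(
        (costs[i][j], i, j)
        for i in range(len(supply))
        for j in range(len(demand))
    )
    allocations = {}
    for _, i, j in cells:
        qty = min(supply[i], demand[j])
        if qty > 0:  # zero means the row or column is already exhausted
            allocations[(i, j)] = qty
            supply[i] -= qty
            demand[j] -= qty
    total = sum(q * costs[r][c] for (r, c), q in allocations.items())
    return allocations, total


costs = [[19, 30, 50, 10],
         [70, 30, 40, 60],
         [40, 8, 70, 20]]
supply = [7, 9, 18]
demand = [5, 8, 7, 14]

lcm_allocations, lcm_total = least_cost(costs, supply, demand)
print(lcm_allocations)
print("Total cost:", lcm_total)
```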
10: Find IBFS by Least Cost Method:

          D1    D2    D3   Supply
F1        48    60    56     140
F2        45    55    53     260
F3        50    65    60     150
F4        52    64    55     220
Demand   200   320   250

11: Find IBFS by North West Corner Method and Least Cost Method

         D1   D2   D3   D4   Supply
F1       19   30   50   10      7
F2       70   30   40   60      9
F3       40    8   70   20     18
Demand    5    8    7   14     34

12: Find the IBFS by LCM and NWCM

          W1    W2    W3   Supply
F1        48    60    56     140
F2        45    55    53     260
F3        50    65    60     150
F4        52    64    55     220
Demand   200   320   250     770

Initial Basic Feasible Solution by Vogel's Approximation Method
Step 1: Calculate the penalty for each row and column as the difference between the least cost and the next least cost in that row or column
Step 2: Find the row or column having the highest penalty. Locate the least cost cell in that row or column and allocate min (ai, bj) to that cell
Step 3: Find a fresh set of penalties for the reduced cost matrix and allocate as in Step 2
Step 4: Continue allocating in the further reduced cost matrices until all allocations are made
13: Find the IBFS by Vogel's Approximation Method

          W1    W2    W3   Supply
F1        48    60    56     140
F2        45    55    53     260
F3        50    65    60     150
F4        52    64    55     220
Demand   200   320   250     770

14: Find IBFS by VAM

           D     E     F     G   Supply
A          7    14     8    12     400
B          9    10    12     5     300
C         11     6    11     4     300
Demand   200   250   300   250

15: Find IBFS by VAM

          D1    D2    D3    D4   Supply
A         11    13    17    14     250
B         16    18    14    10     300
C         21    24    13    10     400
Demand   200   225   275   250

16: A dairy firm has three plants located in a state. The daily milk production at each plant is as follows:
Plant 1: 6 million litres
Plant 2: 1 million litres
Plant 3: 10 million litres
Each day, the firm must fulfil the needs of its four distribution centres. The minimum requirement at each centre is as follows:
Distribution centre 1: 7 million litres
Distribution centre 2: 5 million litres
Distribution centre 3: 3 million litres
Distribution centre 4: 2 million litres
The cost, in hundreds of rupees, of shipping one million litres from each plant to each distribution centre is given in the following table.

     D1   D2   D3   D4
P1    2    3   11    7
P2    1    0    6    1
P3    5    8   15    9
17: Find the initial basic feasible solution for the following transportation problem by VAM

          D1   D2   D3   D4   Supply
O1         3    3    4    1     100
O2         4    2    4    2     125
O3         1    5    3    2      75
Demand   120   80   75   25     300

18: Find the initial solution to the following transportation problem using VAM

         D1   D2   D3   D4   Supply
S1       19   30   50   10      7
S2       70   30   40   60      9
S3       40    8   70   20     18
Demand    5    8    7   14

19: Determine an initial basic feasible solution to the following TP by using VAM

         D1   D2   D3   D4   Supply
S1       21   16   15    3     11
S2       17   18   14   23     13
S3       32   27   18   41     19
Demand    6   10   12   15

20: Determine an initial basic feasible solution to the following TP by using VAM

         D1   D2   D3   D4   Supply
S1        1    2    1    4     30
S2        3    3    2    1     50
S3        4    2    5    9     20
Demand   20   40   30   10

21: Determine an initial basic feasible solution to the following TP by using VAM

         D1   D2   D3   D4   Supply
O1        6    4    1    5     14
O2        8    9    2    7     16
O3        4    3    6    2      5
Demand    6   10   15    4

21: Determine an initial basic feasible solution to the following TP by using VAM

               A    B    C    D    E    F   Available
O1             9   12    9    6    9   10       5
O2             7    3    7    7    5    5       6
O3             6    5    9   11    3   11       2
O4             6    8   11    2    2   10       9
Requirement    4    4    6    2    4    2
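Vogel's Approximation Method can be sketched in code as below, using the data of problem 18 above. The function name is hypothetical, and two conventions are assumptions of this sketch rather than rules stated in the workbook: when only one cell remains in a row or column its penalty is taken as that cell's cost, and ties in the highest penalty go to the first row or column found (rows are scanned before columns).

```python
def vogel_approximation(costs, supply, demand):
    """Return ({(row, col): quantity}, total cost) for a balanced problem."""
    supply, demand = list(supply), list(demand)  # work on copies
    rows, cols = set(range(len(supply))), set(range(len(demand)))
    allocations = {}

    def penalty(values):
        # Difference between the least and next least cost (Step 1)
        v = sorted(values)
        return v[1] - v[0] if len(v) > 1 else v[0]

    while rows and cols:
        candidates = [(penalty(costs[i][j] for j in cols), "row", i)
                      for i in sorted(rows)]
        candidates += [(penalty(costs[i][j] for i in rows), "col", j)
                       for j in sorted(cols)]
        # Step 2: line with the highest penalty, then its least cost cell
        _, kind, k = max(candidates, key=lambda c: c[0])
        if kind == "row":
            i, j = k, min(sorted(cols), key=lambda c: costs[k][c])
        else:
            i, j = min(sorted(rows), key=lambda r: costs[r][k]), k
        qty = min(supply[i], demand[j])
        allocations[(i, j)] = qty
        supply[i] -= qty
        demand[j] -= qty
        # Steps 3 and 4: reduce the matrix and repeat
        if supply[i] == 0:
            rows.discard(i)
        if demand[j] == 0:
            cols.discard(j)

    total = sum(q * costs[r][c] for (r, c), q in allocations.items())
    return allocations, total


costs = [[19, 30, 50, 10],
         [70, 30, 40, 60],
         [40, 8, 70, 20]]
vam_allocations, vam_total = vogel_approximation(costs, [7, 9, 18], [5, 8, 7, 14])
print(vam_allocations)
print("Total cost:", vam_total)
```

The result is still only an initial basic feasible solution; it has to be tested for optimality in the next phase.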
Unbalanced Transportation Problem:
A TP is said to be unbalanced if total demand is not equal to total supply (Σai ≠ Σbj). Such a TP is solved by transforming it into a balanced TP, adding a dummy row or column with the supply or demand required to balance it. If total supply (Σai) < total demand (Σbj), a dummy row with supply = (Σbj − Σai) is added. If total supply (Σai) > total demand (Σbj), a dummy column with demand = (Σai − Σbj) is added. The TP is then solved using the known procedure.
22: Solve the following TP

           D     E     F     G   Supply
A          7    14     8    12     400
B          9    10    12     5     300
C         11     6    11     4     300
Demand   200   450   300   250

23: Solve the following TP

           D     E     F     G   Supply
A          7    14     8    12     400
B          9    10    12     5     300
C         11     6    11     4     300
Demand   200   450   300   250

24: A company is spending Rs. 1200 on transportation of its units from 3 plants to 4 distribution centres. The supply and demand of units, with the unit costs of transportation, are as follows. What is the maximum saving possible by optimal scheduling?

          1    2    3    4   Supply
P1       20   30   50   17      7
P2       70   35   40   60     10
P3       40   12   60   25     18
Demand    5    8    7   15
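The balancing rule can be sketched in code as follows, using the data of problem 22 above. The function name is hypothetical, and giving every dummy cell a transportation cost of zero is the usual convention, assumed here because the text does not state the dummy costs.

```python
def balance(costs, supply, demand):
    """Return a balanced (costs, supply, demand) by adding a dummy row or column."""
    costs = [list(row) for row in costs]  # work on copies
    supply, demand = list(supply), list(demand)
    gap = sum(supply) - sum(demand)
    if gap < 0:
        # Total supply < total demand: dummy row with supply = sum(bj) - sum(ai)
        costs.append([0] * len(demand))   # assumed zero cost for dummy cells
        supply.append(-gap)
    elif gap > 0:
        # Total supply > total demand: dummy column with demand = sum(ai) - sum(bj)
        for row in costs:
            row.append(0)                 # assumed zero cost for dummy cells
        demand.append(gap)
    return costs, supply, demand


costs = [[7, 14, 8, 12],
         [9, 10, 12, 5],
         [11, 6, 11, 4]]
b_costs, b_supply, b_demand = balance(costs, [400, 300, 300], [200, 450, 300, 250])
print(b_supply, b_demand)
```

Any quantity finally allocated to the dummy row represents demand that remains unmet; quantity allocated to a dummy column represents supply that stays at the source.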
25: Problem
• Holiday shipments of iPods to distribution centres
• Production at 3 facilities:
  • A, supply 200k
  • B, supply 350k
  • C, supply 150k
• Distribute to 4 centres:
  • N, demand 160k
  • S, demand 140k
  • E, demand 300k
  • W, demand 200k
Total demand ≠ total supply. Obtain the initial solution to the following transportation problem by using VAM.

      N    S    E    W
A    16   13   22   17
B    14   13   19   15
C     9   20   23   10
Restricted Transportation Problem:
- Sometimes in a transportation problem some routes may not be available. This could be due to a variety of reasons, such as unfavourable weather conditions or a strike on a particular route.
- In such a situation there is a restriction on the routes available for transportation.
- We assign a very large cost, represented by M, to each of the routes which are not available.
- The effect of adding a large cost element is that such routes are automatically eliminated in the final solution.
26: The XYZ Tobacco Company purchased and stores in warehouses located in the following four cities

         C1    C2    C3   Supply
A          7    10     5
B         12     9     4
C          7     3    11
D          9     5     7
Demand   120   100   110

Because of railroad construction, shipments are temporarily prohibited from the warehouse at city A to company C1.
i) Find the IBFS for the XYZ Tobacco Company
27. Solve the below transportation problem
Factory / Warehouse: W1 W2 W3 Supply
F1 16 12 200
F2 14 8 18 160
F3 26 16 90
Demand 180 120 150 450
Maximization of Transportation Problem:
If the TP contains a profit matrix with the objective of maximization, it can be solved by transforming the profit matrix into a cost matrix: each unit profit is subtracted from the highest unit profit in the given profit matrix.
28. Solve for maximum profit

           A     B     C     D   Supply
X         12    18     6    25     200
Y          8     7    10    18     500
Z         14     3    11    20     300
Demand   180   320   100   400
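The profit-to-cost transformation described above can be sketched as follows, using the profit matrix of the maximum-profit problem just given. The function name is hypothetical.

```python
def profit_to_cost(profits):
    """Subtract every unit profit from the highest unit profit in the matrix."""
    highest = max(max(row) for row in profits)
    return [[highest - p for p in row] for row in profits]


profits = [[12, 18, 6, 25],
           [8, 7, 10, 18],
           [14, 3, 11, 20]]
cost_matrix = profit_to_cost(profits)
for row in cost_matrix:
    print(row)
```

The resulting matrix is treated as an ordinary minimization problem when finding the allocations, but the total profit is then computed by multiplying the allocated quantities by the original unit profits, not by the transformed costs.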
Questions from Previous Year Question Papers:
1. Solve the following transportation problem for maximum profit, by initial basic feasible solution only.

Per unit profit (Rs.)
Warehouse / Market    A    B    C    D
X                    12   18    6   25
Y                     8    7   10   18
Z                    14    3   11   20

2. Use the North West Corner Method (NWCM) and the Least Cost Method (LCM) to find an initial basic feasible solution to the transportation problem.

         D1   D2   D3   D4   Supply
S1       19   30   50   10      7
S2       70   30   40   60      9
S3       40    8   70   20     18
Demand    5    8    7   14     34

3. Solve the following transportation problem for maximum profit

Per unit profit (Rs.)
Warehouse / Market    A    B    C    D
X                    12   18    6   25
Y                     8    7   10   18
Z                    14    3   11   20

Availability at warehouses    Demand in the market
X: 200 units                  A: 180 units
Y: 500 units                  B: 320 units
Z: 300 units                  C: 100 units
                              D: 400 units

4. For the following transportation problem find the initial solution using 1) North West Corner Method 2) Least Cost Method

From / To    I   II   III   Supply
A            5    1     7     10
B            6    4     9     80
C            3    2     8     55
Demand      75   20    50

Available at warehouse    Demand in the market
X: 200 units              A: 180 units
Y: 500 units              B: 320 units
Z: 300 units              C: 100 units
                          D: 400 units
5. A company has 3 factories S1, S2 and S3 with production capacities of 7, 9 and 18 units (in 100s) per week of a product respectively. These units are to be shipped to four warehouses D1, D2, D3 and D4 with requirements of 5, 8, 7 and 14 units (in 100s) per week respectively. The transportation costs (in rupees) per unit between factories and warehouses are given below.

         D1   D2   D3   D4   Supply
S1       19   30   50   10      7
S2       70   30   40   60      9
S3       40    8   70   20     18
Demand    5    8    7   14

Find the initial solution using VAM.
Unit – 6: Project Management
Syllabus: Project Management: Introduction – Basic difference between PERT & CPM – Network components and precedence relationship – Critical path analysis – Project scheduling – Project time-cost trade-off – Resource allocation, basic concept of project crashing.
PERT & CPM
Project Management evolved as a new field with the development of two analytical techniques for the planning, scheduling and controlling of projects. These are the Critical Path Method (CPM) and the Program Evaluation and Review Technique (PERT).
Application of PERT & CPM Techniques
These methods have been applied to a wide variety of problems in industry and have found acceptance even in government organizations. Applications include
- Construction of a dam or canal system in a region
- Construction of buildings / highways
- Maintenance of aeroplanes or oil refineries
- Space flight
- Cost control of a project using PERT / CPM
- Designing a prototype of a machine
- Development of supersonic planes
Basic Definitions:
Activity: Any individual operation which utilizes resources and has a beginning and an end is called an activity. An arrow is commonly used to represent an activity, with its head indicating the direction of progress in the project. Activities are usually classified into the following 4 categories.
1. Predecessor Activity
Activities that must be completed immediately prior to the start of another activity are called predecessor activities.
2. Successor Activity
Activities that cannot be started until one or more other activities are completed, but immediately succeed them, are called successor activities.
3. Concurrent Activity
Activities which can be accomplished concurrently are known as concurrent activities. It may be noted that an activity can be a predecessor or a successor to an event, or it may be concurrent with one or more other activities.
4. Dummy Activity
An activity which does not consume any kind of resource but merely depicts the technological dependence is called a dummy activity. A dummy activity is inserted in the network to clarify the activity pattern in the following two situations:
- To make activities with common starting and finishing points distinguishable
- To identify and maintain the proper precedence relationship between activities that are not connected by events
Events: An event represents a point in time signifying the completion of some activities and the beginning of new ones. It is usually represented by a circle "O" in a network, which is called a node or connector. Events can be further classified into the following 3 categories.
- Merge Event: When more than one activity comes to and joins an event, the event is known as a merge event
- Burst Event: When more than one activity leaves an event, the event is known as a burst event
- Merge and Burst Event: An event may be a merge and a burst event at the same time: with respect to some activities it is a merge event, and with respect to other activities it is a burst event
  • 115.
    Basic difference between PERT and CPM
    1. CPM: Uses an activity-oriented network.
       PERT: Uses an event-oriented network.
    2. CPM: Durations of activities may be estimated with a fair degree of accuracy.
       PERT: Estimates of time for activities are not so accurate and definite.
    3. CPM: It is used extensively in construction projects.
       PERT: It is used mostly in research and development projects, particularly projects of a non-repetitive nature.
    4. CPM: A deterministic concept is used.
       PERT: A probabilistic model concept is used.
    5. CPM: Can control both time and cost when planning.
       PERT: Is basically a tool for planning.
    6. CPM: Cost optimization is given prime importance. The time for completion of the project depends upon cost optimization. The cost is not directly proportional to time. Thus, cost is the controlling factor.
       PERT: It is assumed that cost varies directly with time. Attention is therefore given to minimizing the time so that minimum cost results. Thus, time is the controlling factor.
    Rules to be followed during the construction of a network
    1. No single activity can be represented more than once in a network. The length of an arrow has no significance.
    2. The event numbered 1 is the start event and the event with the highest number is the end event. Before an activity can be undertaken, all activities preceding it must be completed; that is, the activities must follow a logical sequence (interrelationship).
    3. In assigning numbers to events, there should not be any duplication of event numbers in a network.
    4. Dummy activities must be used only if it is necessary to reduce the complexity of a network.
    5. A network should have only one start event and one end event.
  • 116.
    [Figure: Some conventions of network diagrams]
    [Figure: Errors in construction of network diagrams]
  • 118.
    Dummy Activity
    A dummy activity is an imaginary activity; it does not exist among the project activities. It is used in the network diagram to show a dependency relationship or connectivity between two or more activities, and it is represented by a dotted arrow.
    Procedure for drawing a CPM network
    1. Specify the individual activities. From the Work Breakdown Structure, a listing can be made of all the activities in the project. This listing can be used as the basis for adding sequence and duration information in later steps.
    2. Determine the sequence of those activities. Some activities are dependent upon the completion of others. A listing of the immediate predecessors of each activity is useful for constructing the CPM network diagram.
    3. Draw a network diagram. Once the activities and their sequencing have been defined, the CPM diagram can be drawn. CPM was originally developed as an activity-on-node (AON) network, but some project planners prefer to specify the activities on the arcs.
    4. Estimate the completion time for each activity. The time required to complete each activity can be estimated using past experience or the estimates of knowledgeable persons. CPM is a deterministic model that does not take into account variation in the completion time, so only one number can be used for an activity’s time estimate.
    5. Identify the critical path. The critical path is the longest-duration path through the network. Its significance is that the activities that lie on it cannot be delayed without delaying the project. Because of its impact on the entire project, critical path analysis is an important aspect of project planning.
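The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not the manual's own method: it uses the activity table of Problem 1 from the problem set, treats the network as activity-on-node, and assumes every activity is listed after its predecessors. The function name `critical_path` and the dictionary layout are hypothetical choices made for this example.

```python
# Illustrative sketch: forward pass over an activity-on-node network.
# Data: Problem 1 of the problem set -> name: (duration in days, predecessors).
# Assumption: each activity appears after all of its predecessors.
activities = {
    "A": (6, []),
    "B": (4, []),
    "C": (3, ["A"]),
    "D": (8, ["B"]),
    "E": (14, ["C"]),
    "F": (8, ["D"]),
    "G": (9, ["E", "F"]),
}


def critical_path(activities):
    """Return (project duration, list of activities on the longest path)."""
    earliest_finish = {}
    driving_pred = {}  # predecessor that determines each activity's start
    for name, (duration, preds) in activities.items():
        start = 0
        driving_pred[name] = None
        for p in preds:
            if earliest_finish[p] > start:
                start = earliest_finish[p]
                driving_pred[name] = p
        earliest_finish[name] = start + duration

    # The activity that finishes last ends the critical path; walk back from it.
    node = max(earliest_finish, key=earliest_finish.get)
    duration = earliest_finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = driving_pred[node]
    return duration, path[::-1]


duration, path = critical_path(activities)
print(duration, path)
```

For this table the two paths are A-C-E-G (6 + 3 + 14 + 9 = 32 days) and B-D-F-G (4 + 8 + 8 + 9 = 29 days), so the sketch reports A-C-E-G with a project duration of 32 days.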
  • 119.
    Problems:
    1. The following table gives the activities in a project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      6
       B          -                      4
       C          A                      3
       D          B                      8
       E          C                      14
       F          D                      8
       G          E, F                   9
    2. The following table gives the activities in a project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      6
       B          -                      2
       C          A                      3
       D          A                      4
       E          C, B                   3
    3. The following table gives the activities in a construction project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      6
       B          -                      4
       C          A                      3
       D          B                      8
       E          B                      14
       F          C, D                   8
  • 120.
    4. The following table gives the activities in a construction project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      6
       B          -                      4
       C          A                      3
       D          B                      8
       E          B, C                   14
       F          E, D                   8
    5. The following table gives the activities in a construction project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      2
       B          A                      4
       C          A                      3
       D          B                      6
       E          C, D                   12
       F          C, D                   6
       G          E                      9
       H          G                      3
    6. The following table gives the activities in a construction project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      2
       B                                 4
       C                                 3
       D          A                      6
       E          C                      12
  • 121.
       Activity   Predecessor Activity   Time (Days)
       F          A                      6
       G          D, B, E                9
    7. The following table gives the activities in a construction project.
       - Draw a network diagram
       - Determine critical path and project duration
       Activity   Predecessor Activity   Time (Days)
       A          -                      3
       B                                 4
       C          A                      3
       D          A                      6
       E          B                      8
       G          D, E                   6
       H          D, E                   9
       I          D, E                   3
       J          C, G                   2
       K          F, I                   3
    8. Draw a network corresponding to the following information.
       Activity   1 – 2   1 – 3   2 – 6   3 – 4   3 – 5   4 – 6   5 – 6   5 – 7   6 – 7
       Duration   4       6       8       7       4       6       5       19      10
       A) Draw the network diagram
       B) Determine the critical path
    9. Draw a network diagram for the following information.
       - Obtain the respective time estimates and calculate Total Float (TF), Free Float (FF) and Independent Float (IF)
       - Identify the critical activities
       Event    Activity   Duration
       1 – 2    A          4
       1 – 3    B          6
       2 – 6    C          8
       3 – 4    D          7
       3 – 5    E          4
       4 – 6    F          6
       5 – 6    G          5
       5 – 7    H          17
       6 – 7    J          10
  • 122.
    Note:
    Total Float (TF) = Latest Start Time (LST) – Earliest Start Time (EST)
    Free Float (FF) = Total Float (TF) – Head Event Slack (HES)
    Head Event Slack (HES) = Latest Time – Earliest Time of the activity's head (ending) event
    Independent Float (IF) = Free Float (FF) – Tail Event Slack (TES)
    Tail Event Slack (TES) = Latest Time – Earliest Time of the activity's tail (starting) event
    10. A small project consists of 7 activities with the following information.
       Activity   Preceding Activity   Duration
       A          -                    4
       B          -                    6
       C          -                    8
       D          A, B                 7
       E          A, B                 4
       F          C, D, E              6
       G          C, D, E              5
    11. The project has the following characteristics.
       - Construct the network diagram
       - Calculate all time estimates, and find the length of the project and the critical path activities using total float
       Events   Activity   Time
       1 – 2    A          2
       1 – 4    B          2
       1 – 7    C          1
       2 – 3    D          4
       3 – 6    E          1
       4 – 5    F          5
       4 – 8    G          8
       5 – 6    H          4
       6 – 9    I          3
       7 – 8    J          3
       8 – 9    K          5
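As an illustration of the float definitions in the Note, the following Python sketch computes TF, FF and IF for the activity-on-arrow data of Problem 9. It is a sketch under stated assumptions rather than a prescribed solution: event numbers are assumed to increase along every arrow, and a negative independent float is reported as zero, which is a common textbook convention that the Note itself does not state. All variable names are chosen for this example only.

```python
# Illustrative sketch: event times and floats for an activity-on-arrow network.
# Data: Problem 9 -> name: (tail event, head event, duration).
# Assumption: every arrow runs from a lower to a higher event number.
activities = {
    "A": (1, 2, 4), "B": (1, 3, 6), "C": (2, 6, 8),
    "D": (3, 4, 7), "E": (3, 5, 4), "F": (4, 6, 6),
    "G": (5, 6, 5), "H": (5, 7, 17), "J": (6, 7, 10),
}

events = sorted({e for tail, head, _ in activities.values() for e in (tail, head)})

# Forward pass: earliest occurrence time of each event.
earliest = {e: 0 for e in events}
for tail, head, dur in sorted(activities.values()):
    earliest[head] = max(earliest[head], earliest[tail] + dur)

# Backward pass: latest occurrence time of each event.
project_length = earliest[events[-1]]
latest = {e: project_length for e in events}
for tail, head, dur in sorted(activities.values(), reverse=True):
    latest[tail] = min(latest[tail], latest[head] - dur)

floats = {}
for name, (tail, head, dur) in activities.items():
    head_slack = latest[head] - earliest[head]
    tail_slack = latest[tail] - earliest[tail]
    total = (latest[head] - dur) - earliest[tail]  # TF = LST - EST
    free = total - head_slack                      # FF = TF - HES
    indep = max(0, free - tail_slack)              # IF = FF - TES, floored at 0 (assumed convention)
    floats[name] = (total, free, indep)

# Activities with zero total float are critical.
critical = [name for name, (total, _, _) in floats.items() if total == 0]

for name, (total, free, indep) in floats.items():
    print(name, total, free, indep)
print(project_length, critical)
```

With this data the critical activities are B, D, F and J (path 1-3-4-6-7) and the project length is 29. Activity G, for example, has TF = 4, FF = 4 and IF = 2.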
  • 123.
    12. Draw a network diagram from the following information:
       A < D, E ; B, D < F ; C < G ; B, D < H ; F, G < I
       Construct the network diagram, find the critical path using total float, and find the total project length.
       Task   A    B   C    D    E    F    G    H   I
       Time   23   8   29   16   24   18   19   4   10
    13. With the following information, calculate Total Float and project duration.
       Activity: A B C D E F G H I J
       Preceding Activity: - - AB B A C E F D F G H I
       Duration: 2 3 4 1 5 3 2 7 6 3
    14. The following table gives the activities of a network with time estimates.
       - Draw a network diagram
       - Calculate the time duration of the project
       - Calculate the variance of the critical path
       - Find the probability that the project will be completed in 41 days
       Estimated Duration
       Events   to (Optimistic Time)   tm (Most Likely Time)   tp (Pessimistic Time)
       1 – 2    3                      6                       15
       1 – 6    2                      5                       14
       2 – 3    6                      12                      30
       2 – 4    2                      5                       8
       3 – 5    5                      11                      17
       4 – 5    3                      6                       15
       6 – 7    3                      9                       27
       5 – 8    1                      4                       7
       7 – 8    4                      19                      28
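The PERT calculation asked for in Problem 14 can be sketched as follows. The manual does not state the formulas in this section, so the sketch assumes the standard PERT estimates: expected time te = (to + 4 tm + tp) / 6 and variance ((tp – to) / 6)^2 for each activity, with the project duration treated as normally distributed around the critical path length. It also assumes event 1 is the start event and that event numbers increase along every arrow. Function and variable names are illustrative.

```python
from math import erf, sqrt

# Illustrative sketch of a PERT calculation on the Problem 14 data.
# (tail event, head event) -> (optimistic, most likely, pessimistic) in days
estimates = {
    (1, 2): (3, 6, 15), (1, 6): (2, 5, 14), (2, 3): (6, 12, 30),
    (2, 4): (2, 5, 8), (3, 5): (5, 11, 17), (4, 5): (3, 6, 15),
    (6, 7): (3, 9, 27), (5, 8): (1, 4, 7), (7, 8): (4, 19, 28),
}


def expected_time(to, tm, tp):
    """Standard PERT expected activity time."""
    return (to + 4 * tm + tp) / 6


def variance(to, tm, tp):
    """Standard PERT activity variance."""
    return ((tp - to) / 6) ** 2


# Forward pass; assumes event 1 is the start and arrows go low -> high number.
earliest = {1: 0.0}   # earliest expected time of each event
path_var = {1: 0.0}   # variance along the longest path into each event
for (tail, head), est in sorted(estimates.items()):
    finish = earliest[tail] + expected_time(*est)
    if finish > earliest.get(head, -1.0):
        earliest[head] = finish
        path_var[head] = path_var[tail] + variance(*est)

end_event = max(earliest)
expected_duration = earliest[end_event]
critical_variance = path_var[end_event]

# Probability of finishing within the target, using the normal approximation.
target_days = 41
z = (target_days - expected_duration) / sqrt(critical_variance)
probability = 0.5 * (1 + erf(z / sqrt(2)))

print(expected_duration, critical_variance, round(probability, 4))
```

Under these assumptions the critical path is 1-2-3-5-8 with an expected duration of 36 days and a variance of 25 (standard deviation 5), so z = (41 – 36) / 5 = 1 and the probability of completion within 41 days is about 0.84.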
  • 124.
    Questions from Previous Year Question Papers:
    1. Given the following information on a small project: A is the first activity of the project and precedes activities B and C. Activity D succeeds both B and C, whereas only C is required to start activity E. D precedes F, while G succeeds E. H is the last activity of the project and succeeds F and G. Draw a network diagram based on this information.
    2. Draw a network diagram corresponding to the following information:
       Activity   1-2   1-3   2-6   3-4   3-5   4-6   5-6   5-7   6-7
       Duration   4     6     8     7     4     6     5     19    10
       A) Draw a network diagram
       B) Obtain early and late start times and completion times
       C) Determine the critical path
    3. A small project is composed of 7 activities whose time estimates (in weeks) are listed in the table below.
       Activity   Optimistic Time   Pessimistic Time   Most Likely Time
       1-2        1                 7                  1
       1-3        1                 7                  4
       1-4        2                 8                  2
       2-5        1                 1                  1
       3-5        2                 14                 5
       4-6        2                 8                  5
       5-6        3                 15                 6
       a. Draw the network and find the expected project length.
       b. What is the probability that the project will be completed at least 4 weeks earlier than the expected time?
    4. Tasks A, B, C, ..., H, I constitute a project. The precedence relationships are: A<D; A<E; B<F; D<F; C<G; C<H; F<I; G<I. Draw a network diagram to represent the project and find the critical path when the time in days of each task is:
       Task   A   B    C   D    E    F    G    H    I
       Time   8   10   8   10   16   17   18   14   9
       Identify the critical path with the help of EST, EFT, LST and LFT.
    - A project consists of nine activities whose time estimates (in weeks) and other characteristics are given below.
       Time Estimates (Weeks)
       Activity   Preceding Activity/Activities   Most Optimistic   Most Likely   Most Pessimistic
       A          -                               2                 4             6
       B          -                               6                 6             6
       C          -                               6                 12            24
       D          A                               2                 5             8
       E          A                               11                14            23
  • 125.
       Activity   Preceding Activity/Activities   Most Optimistic   Most Likely   Most Pessimistic
       F          B, D                            8                 10            12
       G          B, D                            3                 6             9
       H          C, F                            9                 15            27
       I          E                               4                 10            16
       A) Show the PERT network for the project.
       B) Identify the critical activities and find the expected project completion time and its variance.
       C) If the project is required to be completed by December 31 of a given year and the manager wants to be 95% sure of meeting the deadline, when should he start the project work? Given P(0 < Z < 1.645) = 0.45
    - The following table gives the activities in a construction project.
       Activity   Immediate Predecessor   Time (Days)
       A          -                       4
       B          -                       6
       C          -                       2
       D          A                       5
       E          C                       2
       F          A                       7
       G          D, B, E                 4
    - A small project is composed of 7 activities whose time estimates are given below.
       Time Estimates (Weeks)
       Activity   Most Optimistic   Most Likely   Most Pessimistic
       1-2        1                 1             7
       1-3        1                 4             7
       1-4        2                 2             8
       2-5        1                 1             1
       3-5        2                 5             14
       4-6        2                 5             8
       5-6        3                 6             15
       a) Draw the project network diagram.
       b) Find the expected duration and variance of each activity. What is the expected length and project standard deviation?
       c) Calculate the probability of completing the project by 13 days.
    - Draw a network corresponding to the following information.
       a) Draw the network
       b) Obtain early and late start times and completion times
       c) Determine the critical path
       d) Determine the total float
       Activity   1-2   1-3   2-6   3-4   3-5   4-6   5-6   5-7   6-7
       Duration   4     6     8     7     4     6     5     19    10
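For the nine-activity PERT question above (activities A to I), part C asks how early the work must start to be 95% sure of meeting a deadline. One way to approach it, sketched below, is to add 1.645 standard deviations to the expected length of the critical path; the value 1.645 follows from the given P(0 < Z < 1.645) = 0.45, since 0.5 + 0.45 = 0.95. The sketch assumes the standard PERT formulas te = (to + 4 tm + tp) / 6 and variance ((tp – to) / 6)^2, an activity-on-node layout, and that each activity is listed after its predecessors. Variable names are illustrative.

```python
from math import sqrt

# Illustrative sketch: lead time needed for 95% confidence of meeting a deadline.
# name -> ((optimistic, most likely, pessimistic) in weeks, immediate predecessors)
# Assumption: each activity appears after all of its predecessors.
tasks = {
    "A": ((2, 4, 6), []),
    "B": ((6, 6, 6), []),
    "C": ((6, 12, 24), []),
    "D": ((2, 5, 8), ["A"]),
    "E": ((11, 14, 23), ["A"]),
    "F": ((8, 10, 12), ["B", "D"]),
    "G": ((3, 6, 9), ["B", "D"]),
    "H": ((9, 15, 27), ["C", "F"]),
    "I": ((4, 10, 16), ["E"]),
}

finish = {}    # earliest expected finish time of each activity
path_var = {}  # variance accumulated along the longest path into each activity
for name, ((to, tm, tp), preds) in tasks.items():
    te = (to + 4 * tm + tp) / 6      # standard PERT expected time
    var = ((tp - to) / 6) ** 2       # standard PERT variance
    driver = max(preds, key=lambda p: finish[p], default=None)
    finish[name] = (finish[driver] if driver else 0.0) + te
    path_var[name] = (path_var[driver] if driver else 0.0) + var

last = max(finish, key=finish.get)
expected_weeks = finish[last]
sigma = sqrt(path_var[last])

z_95 = 1.645  # from the given P(0 < Z < 1.645) = 0.45
lead_time_weeks = expected_weeks + z_95 * sigma

print(expected_weeks, round(sigma, 2), round(lead_time_weeks, 2))
```

Under these assumptions the critical path is A-D-F-H with an expected length of 35 weeks and a standard deviation of about 3.3 weeks, so the work would need to start roughly 40.4 weeks before December 31.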