Descriptive Statistics
• Measures of Central Tendency
• Measures of Dispersion
• Measures of Shape
• Basics of Probability
• Marginal Probability
• Bayes Theorem
• Probability Distributions
• Binomial
• Poisson
• Normal
▪ Raw Data
• Frequency Distribution - Histograms
• Cumulative Frequency Distribution
• Measures of Central Tendency
• Mean, Median, Mode
• Measures of Dispersion
• Range, IQR, Standard Deviation, coefficient of variation
• Normal distribution, Chebyshev Rule.
• Five number summary, boxplots, QQ plots, Quantile plot, scatter
plot.
• Visualization: scatter plot matrix.
• Correlation analysis
Data versus Information
• When analysts are bewildered by a plethora of data that makes no
sense on the surface, they look for methods to classify the data so
that it conveys meaning. The idea here is to help them draw the
right conclusions. Data needs to be arranged into information.
Raw Data
• Raw Data represent numbers and facts in the original format in
which the data have been collected. We need to convert the raw data
into information for decision making.
Frequency Distribution - Histograms
• In simple terms, frequency distribution is a summarized
table in which raw data are arranged into classes and frequencies.
• Frequency distribution focuses on classifying raw data into
information. It is a widely used data reduction technique in
descriptive statistics.
A histogram (also known as a frequency histogram) is a snapshot of the
frequency distribution.
A histogram is a graphical representation of the frequency distribution
in which the X-axis represents the classes and the Y-axis represents
the frequencies, drawn as bars.
A histogram depicts the pattern of the distribution emerging from the
characteristic being measured.
Histogram - Example
The inspection records of a hose assembly operation revealed a high
level of rejection. An analysis of the records showed that "leaks" were
a major contributing factor to the problem. It was decided to
investigate the hose clamping operation. The hose clamping force
(torque) was measured on twenty-five assemblies (figures in
foot-pounds). The data are given below. Draw the frequency histogram
and comment.
8 13 15 10 16
11 14 11 14 20
15 16 12 15 13
12 13 16 17 17
14 14 14 18 15
Histogram Example Solution
[Figure: frequency histogram. X-axis: Classes (8-11, 11-14, 14-17, 17-20, 20-23). Y-axis: Frequency (scale 0 to 15). Bar heights: 2, 7, 12, 3, 1.]
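The class counts in the histogram can be reproduced from the raw torque readings. The following is a minimal Python sketch; the function name `frequency_distribution` is illustrative, and it assumes each class includes its lower boundary and excludes its upper one (for example, 8-11 means 8 ≤ x < 11), which is the convention that matches the counts shown.

```python
# Torque readings (foot-pounds) from the hose clamping example.
torque = [8, 13, 15, 10, 16,
          11, 14, 11, 14, 20,
          15, 16, 12, 15, 13,
          12, 13, 16, 17, 17,
          14, 14, 14, 18, 15]

# Class boundaries. Assumption: each class is half-open, [lower, upper).
boundaries = [8, 11, 14, 17, 20, 23]


def frequency_distribution(values, edges):
    """Count how many values fall into each class [lo, hi)."""
    classes = zip(edges[:-1], edges[1:])
    return {f"{lo}-{hi}": sum(lo <= v < hi for v in values)
            for lo, hi in classes}


freq = frequency_distribution(torque, boundaries)

# Print a simple text histogram, one bar per class.
for label, count in freq.items():
    print(f"{label:>6} | {'#' * count} ({count})")
```

The tallest bar is the 14-17 class, which suggests that most clamping torques cluster in the middle of the observed range.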
Cumulative Frequency Distribution
A cumulative frequency distribution is a type of frequency distribution
that shows how many observations lie above or below the class
boundaries. The following table can be formulated from the previous
example of hose clamping force (torque):
Class    Frequency   Relative    Cumulative   Cumulative Relative
                     Frequency   Frequency    Frequency
8-11     2           0.08        2            0.08
11-14    7           0.28        9            0.36
14-17    12          0.48        21           0.84
17-20    3           0.12        24           0.96
20-23    1           0.04        25           1.00
Total    25          1.00
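The relative and cumulative columns follow mechanically from the class frequencies. A short Python sketch, with variable names chosen here for illustration, could look like this:

```python
from itertools import accumulate

# Class labels and frequencies from the hose clamping example.
classes = ["8-11", "11-14", "14-17", "17-20", "20-23"]
frequency = [2, 7, 12, 3, 1]

total = sum(frequency)

# Relative frequency: share of all observations in each class.
relative = [f / total for f in frequency]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequency))

# Cumulative relative frequency: running total as a share of all observations.
cumulative_relative = [c / total for c in cumulative]

for row in zip(classes, frequency, relative, cumulative, cumulative_relative):
    print("{:<6} {:>3} {:>5.2f} {:>3} {:>5.2f}".format(*row))
```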
What is Central Tendency?
• Whenever you measure things of the same kind, a fairly large number of such measurements
will tend to cluster around the middle value. Such a value is called a measure of "Central
Tendency". The other terms that are used synonymously are "Measures of Location", or
"Statistical Averages".
Arithmetic Mean
• Arithmetic Mean (called mean) is defined as the sum of all observations in a data set divided by
the total number of observations. For example, consider a data set containing the
following observations:
• In symbolic form, the mean is given by
$\bar{X} = \frac{\sum x}{n}$
$n$ = Total number of observations (sample size)
$\sum x$ = Sum of all X values in the data set
$\bar{X}$ = Arithmetic mean
Statistics.pdf
Arithmetic Mean - Example
The inner diameters of a particular grade of tire, based on
5 sample measurements, are as follows (figures in
millimeters):
565, 570, 572, 568, 585
Applying the formula $\bar{X} = \frac{\sum x}{n}$,
we get mean = (565+570+572+568+585)/5 = 572
Caution: Arithmetic Mean is affected by extreme values or
fluctuations in sampling. It is not the best average to use
when the data set contains extreme values (very high or
very low values).
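The calculation and the caution can both be checked in a few lines of Python. In this sketch the extra reading of 900 mm is a hypothetical extreme value, not part of the original data; it is added only to show how far a single outlier pulls the mean.

```python
from statistics import mean

# Inner diameters (mm) from the tire example.
diameters_mm = [565, 570, 572, 568, 585]

# Mean = sum of observations / number of observations.
x_bar = sum(diameters_mm) / len(diameters_mm)
print(x_bar)  # 572.0

# Hypothetical extreme value (an assumption for illustration only).
with_outlier = diameters_mm + [900]
mean_with_outlier = mean(with_outlier)
print(mean_with_outlier)  # noticeably larger than 572
```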
Median
• Median is the middlemost observation when you arrange data in ascending order of magnitude.
Median is such that 50% of the observations are above the median and 50% of the observations are below
the median.
• Median is a very useful measure for ranked data in the context of consumer preferences and rating. It is
not affected by extreme values (greater resistance to outliers).
Median = $\left(\frac{n+1}{2}\right)$th value of ranked data
n = Number of observations in the sample
Median - Example
Marks obtained by 7 students in Computer Science
Exam are given below: Compute the median.
45 40 60 80 90 65 55
Arranging the data in ascending order gives
40 45 55 60 65 80 90
Median = (n+1)/2 th value in this set = (7+1)/2 th
observation = 4th observation = 60
Hence Median = 60 for this problem.
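The ranking rule can be sketched in Python as follows. The function name `median_by_rank` is illustrative. For an even number of observations the (n+1)/2 position falls between two ranks; averaging the two neighbouring values is a common convention and is an assumption here, since the slides only show an odd-sized example.

```python
def median_by_rank(values):
    """Return the (n+1)/2-th value of the ranked data."""
    ranked = sorted(values)          # ascending order
    n = len(ranked)
    if n % 2 == 1:
        # Odd n: (n+1)/2 is a whole rank (1-based), so index it directly.
        return ranked[(n + 1) // 2 - 1]
    # Even n (assumed convention): average the two middle values.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


marks = [45, 40, 60, 80, 90, 65, 55]
print(median_by_rank(marks))  # 60
```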
Mode
Mode is the value that occurs most often. It has the maximum frequency
of occurrence. Mode also has resistance to outliers.
Mode is a very useful measure when, for example, you want to stock the most
popular shirt collar size during the festive season.
Mode - Example
The life in number of hours of 10 flashlight batteries are as
follows: Find the mode.
340 350 340 340 320 340 330 330 340 350
340 occurs five times. Hence, mode = 340.
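Counting occurrences is all that is needed to find the mode. A minimal Python sketch using the standard library's `Counter`:

```python
from collections import Counter

# Battery life (hours) for the 10 flashlight batteries.
battery_life_hours = [340, 350, 340, 340, 320, 340, 330, 330, 340, 350]

# Tally how often each value occurs and take the most frequent one.
counts = Counter(battery_life_hours)
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # 340 5
```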
Comparison of Mean, Median, Mode
Mean
• Affected by extreme values.
• Can be treated algebraically. That is, means of several groups can be combined.
Median
• Not affected by extreme values.
• Cannot be treated algebraically. That is, medians of several groups cannot be combined.
Mode
• Not affected by extreme values.
• Cannot be treated algebraically. That is, modes of several groups cannot be combined.
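The statement that means can be combined while medians cannot is easy to check numerically. The sketch below splits the five tire diameters from the mean example into two groups; the split itself is hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical split of the tire diameters (mm) into two groups.
group_a = [565, 570]
group_b = [572, 568, 585]
pooled = group_a + group_b

# Combined mean: weight each group mean by its group size.
combined_mean = (
    len(group_a) * mean(group_a) + len(group_b) * mean(group_b)
) / (len(group_a) + len(group_b))

print(combined_mean, mean(pooled))  # both 572

# The group medians alone do not determine the pooled median.
print(median(group_a), median(group_b), median(pooled))
```

The weighted group means reproduce the overall mean exactly, whereas the group medians (567.5 and 572) give no rule for recovering the pooled median of 570.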
Measures of Dispersion
• In simple terms, measures of dispersion indicate how large the
spread of the distribution is around the central tendency. They
answer the question "What is the magnitude of departure from the
average value for different groups having identical averages?".
Range
• Range is the simplest of all measures of dispersion. It is
calculated as the difference between maximum and
minimum value in the data set.
$\text{Range} = X_{\text{Maximum}} - X_{\text{Minimum}}$
Inter-Quartile Range (IQR)
IQR is the range computed on the middle 50% of the observations after
eliminating the highest and lowest 25% of
observations in a data set that is arranged in ascending
order. IQR is less affected by outliers.
IQR = Q3 - Q1
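Interquartile Range - Example
The following data represent the annual percentage
returns of 9 mutual funds.
Data Set: 12, 14, 11, 18, 10.5, 12, 14, 11, 9
Arranging in ascending order, the data set becomes
9, 10.5, 11, 11, 12, 12, 14, 14, 18
With n = 9, Q1 is the (n+1)/4 = 2.5th value, halfway between 10.5 and 11, so Q1 = 10.75.
Q3 is the 3(n+1)/4 = 7.5th value, halfway between 14 and 14, so Q3 = 14.
IQR = Q3 - Q1 = 14 - 10.75 = 3.25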
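The quartiles in this example correspond to positions (n+1)/4 and 3(n+1)/4 in the ranked data. Python's `statistics.quantiles` with `method="exclusive"` uses that same positioning rule, so it can serve as a check. Other software may use different quartile conventions and return slightly different values.

```python
from statistics import quantiles

# Annual percentage returns of the 9 mutual funds.
returns = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]

# n=4 asks for quartiles; "exclusive" places them at i*(n+1)/4.
q1, q2, q3 = quantiles(returns, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 10.75 12.0 14.0
print(iqr)         # 3.25
```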
Standard Deviation
To define standard deviation, you need to define another term
called variance. In simple terms, standard deviation is the square
root of variance.
Example for Standard Deviation
The following data represent the percentage return on investment
for 10 mutual funds per annum. Calculate the sample standard
deviation.
12, 14, 11, 18, 10.5, 11.3, 12, 14, 11, 9
Solution for the Example
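The worked solution from the slide is not reproduced in the text, so the following Python sketch is one way to carry out the calculation. It assumes the usual sample definition: variance is the sum of squared deviations from the mean divided by n - 1, and the standard deviation is its square root.

```python
from math import sqrt
from statistics import stdev

# Percentage return on investment for the 10 mutual funds.
returns = [12, 14, 11, 18, 10.5, 11.3, 12, 14, 11, 9]

n = len(returns)
x_bar = sum(returns) / n                       # sample mean

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in returns)

sample_variance = sum_sq_dev / (n - 1)         # divide by n - 1 for a sample
sample_sd = sqrt(sample_variance)

print(round(x_bar, 2))            # 12.28
print(round(sample_variance, 2))  # 6.33
print(round(sample_sd, 2))        # 2.52

# Cross-check against the standard library's sample standard deviation.
print(round(stdev(returns), 2))   # 2.52
```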
Standard Deviation Formula
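The formula from this slide did not survive in the text. The standard definitions, written with the symbols this deck uses elsewhere ($S$ and $\bar{X}$ for the sample, $\sigma$ and $\mu$ for the population), are given below as a reference; the population size $N$ is a symbol introduced here.

```latex
% Sample variance and sample standard deviation (n observations)
S^{2} = \frac{\sum (x - \bar{X})^{2}}{n - 1},
\qquad
S = \sqrt{\frac{\sum (x - \bar{X})^{2}}{n - 1}}

% Population variance and population standard deviation (N observations)
\sigma^{2} = \frac{\sum (x - \mu)^{2}}{N},
\qquad
\sigma = \sqrt{\frac{\sum (x - \mu)^{2}}{N}}
```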
Coefficient of Variation
(Relative Dispersion)
Coefficient of Variation (CV) is defined as the ratio of
Standard Deviation to Mean.
In symbolic form
$CV = \frac{S}{\bar{X}}$ for the sample data and $CV = \frac{\sigma}{\mu}$ for the population
Coefficient of Variation
Example
Consider two salespersons working in the same territory.
The sales performance of these two in the context of
selling PCs is given below. Comment on the results.
                                 Sales Person 1   Sales Person 2
Mean Sales (one-year average)    50 units         75 units
Standard Deviation               5 units          25 units
Interpretation for the Example
The CV is 5/50 = 0.10 or 10% for Sales Person 1
and 25/75 = 0.33 or 33% for Sales Person 2.
The moral of the story is "don't get carried away by
averages. Consider variation ("risk")."
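The comparison can be expressed as a small Python helper. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Ratio of standard deviation to mean (relative dispersion)."""
    return std_dev / mean


# Figures from the two-salesperson example (units of PCs sold).
cv_person_1 = coefficient_of_variation(5, 50)
cv_person_2 = coefficient_of_variation(25, 75)

print(f"Sales Person 1: {cv_person_1:.0%}")  # 10%
print(f"Sales Person 2: {cv_person_2:.0%}")  # 33%
```

Sales Person 2 has the higher average but also roughly three times the relative variability.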
The Empirical Rule
• The empirical rule approximates the variation of data in a bell-shaped distribution
• Approximately 68% of the data in a bell-shaped distribution
is within 1 standard deviation of the mean, or μ ± 1σ
• Approximately 95% of the data in a bell-shaped distribution
lies within two standard deviations of the mean, or μ ± 2σ
• Approximately 99.7% of the data in a bell-shaped distribution
lies within three standard deviations of the mean, or μ ± 3σ
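The 68%, 95%, and 99.7% figures can be recovered from the normal cumulative distribution function. The sketch below assumes an exactly normal distribution and uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution lying within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"within {k} sd: {share:.1%}")
```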
Chebyshev Rule
• Regardless of how the data are distributed, at least
$(1 - 1/k^{2}) \times 100\%$ of the values will fall within k standard
deviations of the mean (for k > 1)
• For example, when k = 2, at least 75% of the values of any data
set will be within μ ± 2σ
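The bound is simple enough to tabulate for a few values of k. A minimal Python sketch, with the function name `chebyshev_lower_bound` chosen here for illustration:

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the rule is only informative for k > 1")
    return (1 - 1 / k**2) * 100


for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1f}%")
```

For k = 2 the bound is 75%, noticeably weaker than the 95% of the empirical rule, because it makes no assumption about the shape of the distribution.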
The Five Number Summary
The five numbers that help describe the center, spread and shape
of data are:
▪ Xsmallest
▪ First Quartile (Q1)
▪ Median (Q2)
▪ Third Quartile (Q3)
▪ Xlargest
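As a sketch, the five numbers can be computed in Python for the mutual fund returns used in the interquartile range example. The quartiles here use the (n+1)-based positions (`method="exclusive"`), matching that example; other quartile conventions exist and may give slightly different values.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (Xsmallest, Q1, Median, Q3, Xlargest)."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)


# Annual percentage returns of the 9 mutual funds.
returns = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
summary = five_number_summary(returns)

print(summary)  # (9, 10.75, 12.0, 14.0, 18)
```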
Distribution Shape
[Figure: left-skewed, symmetric, and right-skewed distributions, each with Q1, Q2, and Q3 marked.]
Relationships among the five-number summary and distribution shape
Left-Skewed                              Symmetric                                Right-Skewed
Median - Xsmallest > Xlargest - Median   Median - Xsmallest ≈ Xlargest - Median   Median - Xsmallest < Xlargest - Median
Q1 - Xsmallest > Xlargest - Q3           Q1 - Xsmallest ≈ Xlargest - Q3           Q1 - Xsmallest < Xlargest - Q3
Median - Q1 > Q3 - Median                Median - Q1 ≈ Q3 - Median                Median - Q1 < Q3 - Median
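The three comparisons can be applied mechanically to a five-number summary. The Python sketch below is illustrative only: the function and key names are invented here, and it treats "≈" as exact equality, whereas real data would need a tolerance. The input is the five-number summary of the nine mutual fund returns from the interquartile range example (9, 10.75, 12, 14, 18).

```python
def shape_comparisons(x_min, q1, med, q3, x_max):
    """Apply the three left-versus-right span comparisons."""
    spans = {
        "median to extremes": (med - x_min, x_max - med),
        "quartiles to extremes": (q1 - x_min, x_max - q3),
        "median to quartiles": (med - q1, q3 - med),
    }
    verdicts = {}
    for name, (left, right) in spans.items():
        if left > right:
            verdicts[name] = "left-skewed"
        elif left < right:
            verdicts[name] = "right-skewed"
        else:
            verdicts[name] = "symmetric"   # exact equality; a simplification
    return verdicts


verdicts = shape_comparisons(9, 10.75, 12, 14, 18)
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```

All three comparisons point the same way for this data set, indicating a right-skewed distribution.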
Five Number Summary and The Boxplot
• The Boxplot: A graphical display of the data based on the five-number summary:
[Figure: example boxplot marking Xsmallest, Q1, Median, Q3, and Xlargest; each of the four segments between them holds 25% of the data.]
Five Number Summary: Shape of Boxplots
• If data are symmetric around the median, then the box
and central line are centered between the endpoints
• A boxplot can be shown in either a vertical or
horizontal orientation
[Figure: symmetric boxplot marking Xsmallest, Q1, Median, Q3, and Xlargest.]
Distribution Shape and The Boxplot
[Figure: boxplots for left-skewed, symmetric, and right-skewed distributions, each with Q1, Q2, and Q3 marked.]
Boxplot Example
The data are right skewed in the following plot
[Figure: horizontal boxplot on an axis from 0 to 30 with the marked values 0, 2, 3, 5, and 27.]
Box plot example showing an outlier
• The boxplot below of the same data shows the outlier value of 27 plotted separately
• A value is considered an outlier if it is more than 1.5 times the interquartile
range below Q1 or above Q3
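The 1.5 × IQR rule translates directly into two "fences". The data behind the boxplot are not given in the text, so this Python sketch uses the nine mutual fund returns from the interquartile range example and then appends a hypothetical value of 27 to show an outlier being flagged. The function names and the quartile method (`"exclusive"`) are assumptions.

```python
from statistics import quantiles


def outlier_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values):
    """Values lying below the lower fence or above the upper fence."""
    low, high = outlier_fences(values)
    return [v for v in values if v < low or v > high]


returns = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
print(outlier_fences(returns))        # (5.875, 18.875)
print(find_outliers(returns))         # []

# Hypothetical extra observation, added only for illustration.
print(find_outliers(returns + [27]))  # [27]
```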
Graphic Displays of Basic Statistical Descriptions
• Boxplot: graphic display of the five-number summary
• Histogram: the x-axis shows values, the y-axis represents frequencies
• Quantile plot: each value xi is paired with fi, indicating that approximately 100 fi % of
the data are ≤ xi
• Quantile-quantile (q-q) plot: graphs the quantiles of one univariate distribution
against the corresponding quantiles of another
• Scatter plot: each pair of values is a pair of coordinates and plotted as points in the
plane
Histograms Often Tell More than Boxplots
■ The two histograms shown on the left may have the same boxplot representation
■ The same values for: min, Q1, median, Q3, max
■ But they have rather different data distributions
Quantile Plot
• Displays all of the data (allowing the user to assess both the overall
behavior and unusual occurrences)
• Plots quantile information
• For data xi sorted in increasing order, fi indicates that
approximately 100 fi % of the data are below or equal to the
value xi
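The slides do not say how fi is computed. One common convention, assumed in this Python sketch, is fi = (i - 0.5) / n for the i-th smallest of n values. The function name `quantile_plot_points` is illustrative, and the data are the exam marks from the median example.

```python
def quantile_plot_points(values):
    """Pair each sorted value x_i with f_i = (i - 0.5) / n (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return [((i - 0.5) / n, x) for i, x in enumerate(ordered, start=1)]


marks = [45, 40, 60, 80, 90, 65, 55]
points = quantile_plot_points(marks)

# Each pair is (f_i, x_i); plotting x_i against f_i gives the quantile plot.
for f, x in points:
    print(f"{f:.3f}  {x}")
```

With this convention the middle value, 60, is paired with f = 0.5, which agrees with it being the median.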
Quantile-Quantile (Q-Q) Plot
• Graphs the quantiles of one univariate distribution against the corresponding quantiles of
another
• View: Is there a shift in going from one distribution to another?
• Example shows unit price of items sold at Branch 1 vs. Branch 2 for each quantile. Unit
prices of items sold at Branch 1 tend to be lower than those at Branch 2.
Scatter plot
Provides a first look at bivariate data to see clusters of points,
outliers, etc.
Each pair of values is treated as a pair of coordinates and plotted as
points in the plane.
Positively and Negatively Correlated Data
The left half fragment is positively correlated.
The right half is negatively correlated.
Uncorrelated Data
Visually Evaluating Correlation
Scatter plots showing the similarity (correlation) from -1 to 1.
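The -1 to 1 scale is the range of the correlation coefficient. The slides do not name a specific coefficient, so the Python sketch below assumes the Pearson correlation coefficient. The data are hypothetical and chosen so that the two extreme cases are exact.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]      # moves with x
falling = [10, 8, 6, 4, 2]     # moves against x

print(pearson_r(x, rising))    # 1.0
print(pearson_r(x, falling))   # -1.0
```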
Summary
• Histograms
• Measures of central tendency: mean, mode, median
• Measures of dispersion: range, IQR, variance, std deviation, coefficient of
variation.
• Normal distribution, Chebyshev Rule.
• Five number summary, boxplots, QQ plots, Quantile plot, scatter plot.
• Visualization: scatter plot matrix, parallel coordinates.
• Correlation analysis.
Basics of Probability
Bayes’ Theorem
• Bayes’ Theorem is used to revise previously calculated probabilities
based on new information.
• Developed by Thomas Bayes in the 18th Century.
• It is an extension of conditional probability.
$B_i$ = ith event of k mutually exclusive and collectively exhaustive events
$A$ = new event that might impact $P(B_i)$
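The formula itself did not survive in the text. In its standard form, with the symbols defined above, it reads $P(B_i \mid A) = \frac{P(A \mid B_i)\,P(B_i)}{\sum_{j=1}^{k} P(A \mid B_j)\,P(B_j)}$. The Python sketch below applies it to made-up numbers. The priors and likelihoods are hypothetical, and the function name `bayes_posterior` is illustrative.

```python
def bayes_posterior(priors, likelihoods):
    """Revise P(B_i) into P(B_i | A) given P(A | B_i) for each event B_i."""
    # Joint probabilities P(A and B_i) = P(A | B_i) * P(B_i).
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability of A across all mutually exclusive B_i.
    evidence = sum(joint)
    return [j / evidence for j in joint]


# Hypothetical inputs for two events B1 and B2 (illustration only).
priors = [0.6, 0.4]          # P(B1), P(B2)
likelihoods = [0.02, 0.05]   # P(A | B1), P(A | B2)

posteriors = bayes_posterior(priors, likelihoods)
print([round(p, 3) for p in posteriors])  # [0.375, 0.625]
```

Although B1 starts out more probable, observing A shifts the weight towards B2 because A is more likely under B2.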
Probability Distributions
In precise terms, a probability distribution is a total listing of the various
values the random variable can take along with the corresponding probability
of each value. A real life example could be the pattern of distribution of the
machine breakdowns in a manufacturing unit.
• The random variable in this example would be the various values the machine
breakdowns could assume.
• The probability corresponding to each value of the breakdown is the relative
frequency of occurrence of the breakdown.
• The probability distribution for this example is constructed by the actual
breakdown pattern observed over a period of time. Statisticians use the term
“observed distribution” of breakdowns.
P(x) is the probability of getting x successes in n trials
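The formula that this definition belongs to is not present in the text. Assuming it is the standard binomial probability, $P(x) = \binom{n}{x} p^{x} (1-p)^{n-x}$ with p the probability of success on a single trial, a Python sketch looks like this. The symbol p, the function name, and the coin-toss numbers are assumptions for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical example: exactly 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))  # 0.375

# The probabilities over all possible x add up to 1.
print(sum(binomial_pmf(x, 4, 0.5) for x in range(5)))  # 1.0
```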
• Poisson Distribution is another discrete distribution which also plays a major role
in quality control in the context of reducing the number of defects per standard
unit.
• Examples include number of defects per item, number of defects per
transformer produced, number of defects per 100 m² of cloth, etc.
• Other real life examples would include 1) the number of cars arriving at a
highway check post per hour; 2) the number of customers visiting a bank per
hour during the peak business period.
$P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
• P(x) = Probability of x events in an interval, given the value of λ
• λ = Average number of events per unit
• e = 2.71828 (the base of the natural logarithm)
• x = Number of events per unit, which can take values 0, 1, 2, 3, …, ∞
• λ is the parameter of the Poisson distribution.
If, on average, 6 customers arrive every two minutes at a
bank during busy working hours,
a) what is the probability that exactly four customers arrive
in a given minute?
b) what is the probability that more than three customers
will arrive in a given minute?
Sol: 6 customers arrive every two minutes.
Therefore, 3 customers arrive every minute on average.
That implies λ = 3.
a) P(X = 4) = $\frac{e^{-3} \, 3^{4}}{4!}$
b) P(X > 3) = 1 - P(X ≤ 3)
In the problem, the mean value is given as an input for a time
interval. This is one indication that the Poisson distribution
has to be applied.
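The two probabilities can be evaluated numerically with the Poisson formula $P(x) = e^{-\lambda} \lambda^{x} / x!$. A minimal Python sketch, with illustrative function and variable names:

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when the average is lam per interval."""
    return exp(-lam) * lam**x / factorial(x)


# 6 customers per two minutes is an average of 3 per minute.
lam = 6 / 2

# a) Exactly four customers in a given minute.
p_exactly_4 = poisson_pmf(4, lam)

# b) More than three customers: 1 - P(X <= 3).
p_more_than_3 = 1 - sum(poisson_pmf(x, lam) for x in range(4))

print(round(p_exactly_4, 4))    # 0.168
print(round(p_more_than_3, 4))  # 0.3528
```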
The Normal Distribution is the most widely used continuous
distribution.
Much of inferential statistics is based on the normal distribution.
When the sample size is reasonably large, the sampling distribution of
the mean is approximately normal for almost any dataset.
• The normal distribution is a continuous distribution that looks
like a bell.
• Statisticians use the expression “Bell Shaped Distribution”.
• The mean, the median, and the mode are all equal to one
another.
• It is symmetrical about its mean.
• If the tails of the normal distribution are extended, they
approach the horizontal axis without ever touching it.
• The normal distribution has two parameters, namely the
mean µ and the standard deviation σ.
Inferential Statistics
• Mutually exclusive Vs Independent Events.
• Conditional Probability.
• Bayes Theorem.
• Applying Probability Concepts.
• Applying Distribution Concepts.
Irrespective of the shape of the
distribution of the original population,
the sampling distribution of the mean
will approach a normal distribution as
the size of the sample increases and
becomes large
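This behaviour can be observed in a small simulation. The Python sketch below draws repeated samples from a strongly right-skewed population (an exponential distribution with mean 1, chosen here purely for illustration) and looks at the distribution of the sample means. The sample size, number of samples, and seed are arbitrary assumptions.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable


def sample_means(sample_size, n_samples=2000):
    """Means of n_samples random samples from an exponential population."""
    return [
        mean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


means_n50 = sample_means(50)

# The sample means centre on the population mean (1.0) ...
print(round(mean(means_n50), 3))

# ... and their spread is close to sigma / sqrt(n) = 1 / sqrt(50).
print(round(stdev(means_n50), 3), round(1 / 50**0.5, 3))
```

A histogram of `means_n50` would look roughly bell-shaped even though the underlying population is far from symmetric.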
Inferential Analysis
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Statistics.pdf
Ad

More Related Content

Similar to Statistics.pdf (20)

Statistics 3, 4
Statistics 3, 4Statistics 3, 4
Statistics 3, 4
Diana Diana
 
PG STAT 531 Lecture 2 Descriptive statistics
PG STAT 531 Lecture 2 Descriptive statisticsPG STAT 531 Lecture 2 Descriptive statistics
PG STAT 531 Lecture 2 Descriptive statistics
Aashish Patel
 
5.DATA SUMMERISATION.ppt
5.DATA SUMMERISATION.ppt5.DATA SUMMERISATION.ppt
5.DATA SUMMERISATION.ppt
chusematelephone
 
Measure of Dispersion - Grade 8 Statistics.ppt
Measure of Dispersion - Grade 8 Statistics.pptMeasure of Dispersion - Grade 8 Statistics.ppt
Measure of Dispersion - Grade 8 Statistics.ppt
KirbyRaeDiaz2
 
Statistics excellent
Statistics excellentStatistics excellent
Statistics excellent
National Institute of Biologics
 
Chapter 3 Ken Black 2.ppt
Chapter 3 Ken Black 2.pptChapter 3 Ken Black 2.ppt
Chapter 3 Ken Black 2.ppt
NurinaSWGotami
 
Estimation
EstimationEstimation
Estimation
Mmedsc Hahm
 
estimation
estimationestimation
estimation
Mmedsc Hahm
 
Ch2 Data Description
Ch2 Data DescriptionCh2 Data Description
Ch2 Data Description
Farhan Alfin
 
These is info only ill be attaching the questions work CJ 301 – .docx
These is info only ill be attaching the questions work CJ 301 – .docxThese is info only ill be attaching the questions work CJ 301 – .docx
These is info only ill be attaching the questions work CJ 301 – .docx
meagantobias
 
Standard deviation
Standard deviationStandard deviation
Standard deviation
M K
 
Answer the questions in one paragraph 4-5 sentences. · Why did t.docx
Answer the questions in one paragraph 4-5 sentences. · Why did t.docxAnswer the questions in one paragraph 4-5 sentences. · Why did t.docx
Answer the questions in one paragraph 4-5 sentences. · Why did t.docx
boyfieldhouse
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersion
Mayuri Joshi
 
Basic statistics 1
Basic statistics  1Basic statistics  1
Basic statistics 1
Kumar P
 
What is Descriptive Statistics and How Do You Choose the Right One for Enterp...
What is Descriptive Statistics and How Do You Choose the Right One for Enterp...What is Descriptive Statistics and How Do You Choose the Right One for Enterp...
What is Descriptive Statistics and How Do You Choose the Right One for Enterp...
Smarten Augmented Analytics
 
Measures of Dispersion .pptx
Measures of Dispersion .pptxMeasures of Dispersion .pptx
Measures of Dispersion .pptx
Vishal543707
 
Lecture. Introduction to Statistics (Measures of Dispersion).pptx
Lecture. Introduction to Statistics (Measures of Dispersion).pptxLecture. Introduction to Statistics (Measures of Dispersion).pptx
Lecture. Introduction to Statistics (Measures of Dispersion).pptx
NabeelAli89
 
determinatiion of
determinatiion of determinatiion of
determinatiion of
University of Balochistan
 
Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...
Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...
Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...
JuliusRomano3
 
DescriptiveStatistics.pdf
DescriptiveStatistics.pdfDescriptiveStatistics.pdf
DescriptiveStatistics.pdf
data2businessinsight
 
PG STAT 531 Lecture 2 Descriptive statistics
PG STAT 531 Lecture 2 Descriptive statisticsPG STAT 531 Lecture 2 Descriptive statistics
PG STAT 531 Lecture 2 Descriptive statistics
Aashish Patel
 
Measure of Dispersion - Grade 8 Statistics.ppt
Measure of Dispersion - Grade 8 Statistics.pptMeasure of Dispersion - Grade 8 Statistics.ppt
Measure of Dispersion - Grade 8 Statistics.ppt
KirbyRaeDiaz2
 
Chapter 3 Ken Black 2.ppt
Chapter 3 Ken Black 2.pptChapter 3 Ken Black 2.ppt
Chapter 3 Ken Black 2.ppt
NurinaSWGotami
 
Ch2 Data Description
Ch2 Data DescriptionCh2 Data Description
Ch2 Data Description
Farhan Alfin
 
These is info only ill be attaching the questions work CJ 301 – .docx
These is info only ill be attaching the questions work CJ 301 – .docxThese is info only ill be attaching the questions work CJ 301 – .docx
These is info only ill be attaching the questions work CJ 301 – .docx
meagantobias
 
Standard deviation
Standard deviationStandard deviation
Standard deviation
M K
 
Answer the questions in one paragraph 4-5 sentences. · Why did t.docx
Answer the questions in one paragraph 4-5 sentences. · Why did t.docxAnswer the questions in one paragraph 4-5 sentences. · Why did t.docx
Answer the questions in one paragraph 4-5 sentences. · Why did t.docx
boyfieldhouse
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersion
Mayuri Joshi
 
Basic statistics 1
Basic statistics  1Basic statistics  1
Basic statistics 1
Kumar P
 
What is Descriptive Statistics and How Do You Choose the Right One for Enterp...
What is Descriptive Statistics and How Do You Choose the Right One for Enterp...What is Descriptive Statistics and How Do You Choose the Right One for Enterp...
What is Descriptive Statistics and How Do You Choose the Right One for Enterp...
Smarten Augmented Analytics
 
Measures of Dispersion .pptx
Measures of Dispersion .pptxMeasures of Dispersion .pptx
Measures of Dispersion .pptx
Vishal543707
 
Lecture. Introduction to Statistics (Measures of Dispersion).pptx
Lecture. Introduction to Statistics (Measures of Dispersion).pptxLecture. Introduction to Statistics (Measures of Dispersion).pptx
Lecture. Introduction to Statistics (Measures of Dispersion).pptx
NabeelAli89
 
Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...
Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...
Mean_Median_Mode.ppthhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh...
JuliusRomano3
 

More from Shruti Nigam (CWM, AFP) (11)

Morph transition 1.pptx
Morph transition 1.pptxMorph transition 1.pptx
Morph transition 1.pptx
Shruti Nigam (CWM, AFP)
 
Morph transition 2.pptx
Morph transition 2.pptxMorph transition 2.pptx
Morph transition 2.pptx
Shruti Nigam (CWM, AFP)
 
Business forecasting project border
Business forecasting project borderBusiness forecasting project border
Business forecasting project border
Shruti Nigam (CWM, AFP)
 
Data analysis property area analysis via powerbi
Data analysis property area analysis via powerbiData analysis property area analysis via powerbi
Data analysis property area analysis via powerbi
Shruti Nigam (CWM, AFP)
 
Actionable results to enhance Employee satisfaction score analysis via Tableau
Actionable results to enhance Employee satisfaction score analysis via TableauActionable results to enhance Employee satisfaction score analysis via Tableau
Actionable results to enhance Employee satisfaction score analysis via Tableau
Shruti Nigam (CWM, AFP)
 
Data visualization intro2
Data visualization intro2Data visualization intro2
Data visualization intro2
Shruti Nigam (CWM, AFP)
 
Data visualization intro
Data visualization introData visualization intro
Data visualization intro
Shruti Nigam (CWM, AFP)
 
Finanacial institutions nature and role
Finanacial institutions nature and roleFinanacial institutions nature and role
Finanacial institutions nature and role
Shruti Nigam (CWM, AFP)
 
Mutual funds
Mutual fundsMutual funds
Mutual funds
Shruti Nigam (CWM, AFP)
 
Fs unit-i nbfc
Fs unit-i nbfcFs unit-i nbfc
Fs unit-i nbfc
Shruti Nigam (CWM, AFP)
 
NBFC MBA I SEMESTER
NBFC MBA I SEMESTERNBFC MBA I SEMESTER
NBFC MBA I SEMESTER
Shruti Nigam (CWM, AFP)
 
Ad

Recently uploaded (20)

2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
bastakwyry
 
Chapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhh
Chapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhhChapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhh
Chapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhh
ChrisjohnAlfiler
 
spssworksho9035530-lva1-app6891 (1).pptx
spssworksho9035530-lva1-app6891 (1).pptxspssworksho9035530-lva1-app6891 (1).pptx
spssworksho9035530-lva1-app6891 (1).pptx
clarkraal
 
50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd
emir73065
 
Process Mining at Rabobank - Organizational challenges
Process Mining at Rabobank - Organizational challengesProcess Mining at Rabobank - Organizational challenges
Process Mining at Rabobank - Organizational challenges
Process mining Evangelist
 
Lagos School of Programming Final Project Updated.pdf
Lagos School of Programming Final Project Updated.pdfLagos School of Programming Final Project Updated.pdf
Lagos School of Programming Final Project Updated.pdf
benuju2016
 
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfjOral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
maitripatel5301
 
717239550-Hotel-Management-Ppt-Final.pptx
717239550-Hotel-Management-Ppt-Final.pptx717239550-Hotel-Management-Ppt-Final.pptx
717239550-Hotel-Management-Ppt-Final.pptx
dharmendrasingh31102
 
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
Taqyea
 
定制(意大利Rimini毕业证)布鲁诺马代尔纳嘉雷迪米音乐学院学历认证
定制(意大利Rimini毕业证)布鲁诺马代尔纳嘉雷迪米音乐学院学历认证定制(意大利Rimini毕业证)布鲁诺马代尔纳嘉雷迪米音乐学院学历认证
定制(意大利Rimini毕业证)布鲁诺马代尔纳嘉雷迪米音乐学院学历认证
Taqyea
 
Volkswagen - Analyzing the World's Biggest Purchasing Process
Volkswagen - Analyzing the World's Biggest Purchasing ProcessVolkswagen - Analyzing the World's Biggest Purchasing Process
Volkswagen - Analyzing the World's Biggest Purchasing Process
Process mining Evangelist
 
E-Book-TOEFL-Masuk-PTN.pdf hahahahaahahahah
E-Book-TOEFL-Masuk-PTN.pdf hahahahaahahahahE-Book-TOEFL-Masuk-PTN.pdf hahahahaahahahah
E-Book-TOEFL-Masuk-PTN.pdf hahahahaahahahah
RyanRahardjo2
 
HershAggregator (2).pdf musicretaildistribution
HershAggregator (2).pdf musicretaildistributionHershAggregator (2).pdf musicretaildistribution
HershAggregator (2).pdf musicretaildistribution
hershtara1
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 
AWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdfAWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdf
philsparkshome
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
Process Mining at AE - Key success factors
Process Mining at AE - Key success factorsProcess Mining at AE - Key success factors
Process Mining at AE - Key success factors
Process mining Evangelist
 
Collibra DQ Installation setup and debug
Collibra DQ Installation setup and debugCollibra DQ Installation setup and debug
Collibra DQ Installation setup and debug
karthikprince20
 
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
CERTIFIED BUSINESS ANALYSIS PROFESSIONAL™
muhammed84essa
 
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
bastakwyry
 
Chapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhh
Chapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhhChapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhh
Chapter-3-PROBLEM-SOLVING.pdf hhhhhhhhhh
ChrisjohnAlfiler
 
spssworksho9035530-lva1-app6891 (1).pptx
spssworksho9035530-lva1-app6891 (1).pptxspssworksho9035530-lva1-app6891 (1).pptx
spssworksho9035530-lva1-app6891 (1).pptx
clarkraal
 
50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd
emir73065
 
Process Mining at Rabobank - Organizational challenges
Process Mining at Rabobank - Organizational challengesProcess Mining at Rabobank - Organizational challenges
Process Mining at Rabobank - Organizational challenges
Process mining Evangelist
 
Lagos School of Programming Final Project Updated.pdf
Lagos School of Programming Final Project Updated.pdfLagos School of Programming Final Project Updated.pdf
Lagos School of Programming Final Project Updated.pdf
benuju2016
 
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfjOral Malodor.pptx jsjshdhushehsidjjeiejdhfj
Oral Malodor.pptx jsjshdhushehsidjjeiejdhfj
maitripatel5301
 
717239550-Hotel-Management-Ppt-Final.pptx
Statistics.pdf

  • 2. • Measures of Central Tendency • Measures of Dispersion • Measures of Shape • Basics of Probability • Marginal Probability • Bayes Theorem • Probability Distributions • Binomial • Poisson • Normal
  • 3. ▪ Raw Data • Frequency Distribution - Histograms • Cumulative Frequency Distribution • Measures of Central Tendency • Mean, Median, Mode • Measures of Dispersion • Range, IQR, Standard Deviation, coefficient of variation • Normal distribution, Chebyshev Rule. • Five number summary, boxplots, QQ plots, Quantile plot, scatter plot. • Visualization: scatter plot matrix. • Correlation analysis
  • 4. Data versus Information • When analysts are bewildered by a plethora of data that makes no sense on the surface, they look for methods to classify the data so that it conveys meaning. The idea is to help them draw the right conclusions. Data needs to be arranged into information.
  • 5. Raw Data • Raw Data represent numbers and facts in the original format in which the data have been collected. We need to convert the raw data into information for decision making.
  • 6. Frequency Distribution - Histograms • In simple terms, frequency distribution is a summarized table in which raw data are arranged into classes and frequencies. • Frequency distribution focuses on classifying raw data into information. It is a widely used data reduction technique in descriptive statistics.
  • 7. A histogram (also known as a frequency histogram) is a snapshot of the frequency distribution. It is a graphical representation of the frequency distribution in which the X-axis represents the classes and the Y-axis represents the frequencies, drawn as bars. A histogram depicts the pattern of the distribution emerging from the characteristic being measured.
  • 8. Histogram - Example The inspection records of a hose assembly operation revealed a high level of rejection. An analysis of the records showed that "leaks" were a major contributing factor to the problem. It was decided to investigate the hose clamping operation. The hose clamping force (torque) was measured on twenty-five assemblies (figures in foot-pounds). The data are given below. Draw the frequency histogram and comment.
      8 13 15 10 16 11 14 11 14 20 15 16 12 15 13 12 13 16 17 17 14 14 14 18 15
  • 9. Histogram Example Solution [Histogram: x-axis = Classes (8-11, 11-14, 14-17, 17-20, 20-23); y-axis = Frequency; bar heights 2, 7, 12, 3, 1]
  • 10. Cumulative Frequency Distribution A type of frequency distribution that shows how many observations are above or below the lower boundaries of the classes. You can formulate the following from the previous example of hose clamping force (torque):
      Class    Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
      8-11         2             0.08                    2                       0.08
      11-14        7             0.28                    9                       0.36
      14-17       12             0.48                   21                       0.84
      17-20        3             0.12                   24                       0.96
      20-23        1             0.04                   25                       1.00
      Total       25             1.00
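The table above can be rebuilt directly from the raw torque readings. The following is a minimal sketch, assuming each class includes its lower boundary and excludes its upper boundary (the slides do not state the convention, but it matches the published counts); the function name `frequency_table` is illustrative only.

```python
# Torque readings (foot-pounds) from the hose clamping example.
torque = [8, 13, 15, 10, 16, 11, 14, 11, 14, 20, 15, 16, 12,
          15, 13, 12, 13, 16, 17, 17, 14, 14, 14, 18, 15]

# Class boundaries taken from the slide.
# Assumption: a class "lo-hi" holds values with lo <= x < hi.
edges = [8, 11, 14, 17, 20, 23]


def frequency_table(values, edges):
    """Return rows of (class, freq, rel freq, cum freq, cum rel freq)."""
    n = len(values)
    rows = []
    cumulative = 0
    for lo, hi in zip(edges, edges[1:]):
        freq = sum(lo <= v < hi for v in values)
        cumulative += freq
        rows.append((f"{lo}-{hi}", freq, freq / n, cumulative, cumulative / n))
    return rows


table = frequency_table(torque, edges)
for label, freq, rel, cum, cum_rel in table:
    print(f"{label:>6} {freq:>3} {rel:5.2f} {cum:>3} {cum_rel:5.2f}")
```

The frequency column gives the bar heights of the histogram, and the last column is the running total expressed as a share of the 25 readings.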
  • 11. What is Central Tendency? • Whenever you measure things of the same kind, a fairly large number of such measurements will tend to cluster around the middle value. Such a value is called a measure of "Central Tendency". The other terms that are used synonymously are "Measures of Location", or "Statistical Averages".
  • 12. Arithmetic Mean • The arithmetic mean (usually just called the mean) is defined as the sum of all observations in a data set divided by the total number of observations. For example, consider a data set containing the following observations: • In symbolic form, the mean is given by $\bar{X} = \frac{\sum x}{n}$, where $n$ = total number of observations (sample size), $\sum x$ = sum of all $x$ values in the data set, and $\bar{X}$ = arithmetic mean.
  • 15. Arithmetic Mean - Example The inner diameters of a particular grade of tire, based on 5 sample measurements, are as follows (figures in millimeters): 565, 570, 572, 568, 585. Applying the formula $\bar{X} = \frac{\sum x}{n}$, we get mean = (565 + 570 + 572 + 568 + 585)/5 = 572. Caution: the arithmetic mean is affected by extreme values or fluctuations in sampling. It is not the best average to use when the data set contains extreme values (very high or very low values).
  • 16. Median • The median is the middlemost observation when you arrange the data in ascending order of magnitude. The median is such that 50% of the observations are above it and 50% of the observations are below it. • The median is a very useful measure for ranked data in the context of consumer preferences and ratings. It is not affected by extreme values (greater resistance to outliers). Median = $\left(\frac{n+1}{2}\right)$th value of the ranked data, where $n$ = number of observations in the sample.
  • 17. Median - Example Marks obtained by 7 students in a Computer Science exam are given below. Compute the median. 45, 40, 60, 80, 90, 65, 55. Ranking the data (here in descending order) gives 90, 80, 65, 60, 55, 45, 40. Median = (n+1)/2 th value in this set = (7+1)/2 th observation = 4th observation = 60. Hence the median is 60 for this problem.
  • 18. Mode The mode is the value that occurs most often; it has the maximum frequency of occurrence. The mode is also resistant to outliers. It is a very useful measure when, for example, you want to stock the most popular shirt collar size during the festive season.
  • 19. Mode - Example The lives, in hours, of 10 flashlight batteries are as follows: 340, 350, 340, 340, 320, 340, 330, 330, 340, 350. Find the mode. The value 340 occurs five times. Hence, mode = 340.
  • 20. Comparison of Mean, Median, Mode
      Mean:   Affected by extreme values. Can be treated algebraically; that is, the means of several groups can be combined.
      Median: Not affected by extreme values. Cannot be treated algebraically; that is, the medians of several groups cannot be combined.
      Mode:   Not affected by extreme values. Cannot be treated algebraically; that is, the modes of several groups cannot be combined.
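The three worked examples above can be checked with Python's standard `statistics` module. This is a small sketch rather than part of the original slides; the variable names are illustrative, and the "extreme value" variant at the end is a hypothetical change (585 replaced by 685) added only to show why the mean is sensitive to outliers while the median is not.

```python
import statistics

# Data from the three worked examples in the slides.
diameters = [565, 570, 572, 568, 585]              # tire inner diameter, mm
marks = [45, 40, 60, 80, 90, 65, 55]               # exam marks
battery_hours = [340, 350, 340, 340, 320, 340, 330, 330, 340, 350]

mean_diameter = statistics.mean(diameters)         # sum / n
median_marks = statistics.median(marks)            # middle value after sorting
mode_hours = statistics.mode(battery_hours)        # most frequent value

print(mean_diameter, median_marks, mode_hours)

# Hypothetical variant: one extreme reading (685 instead of 585).
with_outlier = [565, 570, 572, 568, 685]
mean_with_outlier = statistics.mean(with_outlier)      # pulled upward
median_with_outlier = statistics.median(with_outlier)  # unchanged from the original data
print(mean_with_outlier, median_with_outlier)
```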
  • 22. Measures of Dispersion • In simple terms, measures of dispersion indicate how large the spread of the distribution is around the central tendency. It answers unambiguously the question " What is the magnitude of departure from the average value for different groups having identical averages?".
  • 23. Range • Range is the simplest of all measures of dispersion. It is calculated as the difference between the maximum and minimum values in the data set. Range = $X_{\max} - X_{\min}$
  • 24. Inter-Quartile Range (IQR) IQR = range computed on the middle 50% of the observations after eliminating the highest and lowest 25% of observations in a data set that is arranged in ascending order. IQR is less affected by outliers. IQR = Q3 - Q1
  • 25. Interquartile Range - Example The following data represent the annual percentage returns of 9 mutual funds. Data set: 12, 14, 11, 18, 10.5, 12, 14, 11, 9. Arranging in ascending order, the data set becomes 9, 10.5, 11, 11, 12, 12, 14, 14, 18. With n = 9, Q1 is the (n+1)/4 = 2.5th value, the average of 10.5 and 11, which is 10.75; Q3 is the 3(n+1)/4 = 7.5th value, the average of 14 and 14, which is 14. IQR = Q3 - Q1 = 14 - 10.75 = 3.25
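The quartile positions in this example can be reproduced with `statistics.quantiles`. This is a sketch under one assumption: the slides appear to use the (n+1)-based positioning rule, which corresponds to `method="exclusive"` in the standard library; other quartile conventions can give slightly different values.

```python
import statistics

# Annual percentage returns of the 9 mutual funds from the example.
returns = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]

# n=4 asks for the three cut points Q1, Q2 (median), Q3.
# Assumption: "exclusive" matches the (n+1)-based rule used on the slide.
q1, q2, q3 = statistics.quantiles(returns, n=4, method="exclusive")
iqr = q3 - q1

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```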
  • 26. Standard Deviation To define standard deviation, you need to define another term called variance. In simple terms, standard deviation is the square root of variance.
  • 29. Example for Standard Deviation The following data represent the percentage return on investment for 10 mutual funds per annum. Calculate the sample standard deviation. 12, 14, 11, 18, 10.5, 11.3, 12, 14, 11, 9
  • 30. Solution for the Example
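The worked solution is not reproduced in this transcript. As a hedged sketch, the calculation can be carried out as follows, assuming the usual sample definition in which the variance is the sum of squared deviations from the mean divided by n - 1 (the slide's own formula is not shown here, so this divisor is an assumption).

```python
import math

# Percentage returns of the 10 mutual funds from the example.
returns = [12, 14, 11, 18, 10.5, 11.3, 12, 14, 11, 9]

n = len(returns)
mean = sum(returns) / n

# Squared deviation of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in returns]

# Assumption: sample variance uses the n - 1 divisor.
sample_variance = sum(squared_deviations) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(f"mean={mean:.2f}, variance={sample_variance:.4f}, sd={sample_sd:.4f}")
```

Under that assumption the mean is 12.28 and the sample standard deviation is roughly 2.52 percentage points.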
  • 32. Coefficient of Variation (Relative Dispersion) The Coefficient of Variation (CV) is defined as the ratio of the standard deviation to the mean. In symbolic form, $CV = \frac{S}{\bar{X}}$ for sample data and $CV = \frac{\sigma}{\mu}$ for the population.
  • 33. Coefficient of Variation Example Consider two salespersons working in the same territory. Their sales performance in selling PCs is given below. Comment on the results.
      Salesperson       Mean Sales (one-year average)   Standard Deviation
      Sales Person 1    50 units                        5 units
      Sales Person 2    75 units                        25 units
  • 34. Interpretation for the Example The CV is 5/50 = 0.10, or 10%, for Sales Person 1 and 25/75 = 0.33, or 33%, for Sales Person 2. Sales Person 2 sells more on average but with far greater relative variability. The moral of the story is "don't get carried away by averages; consider variation (risk)".
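The comparison can be expressed as a short calculation. This is an illustrative sketch; the helper name `coefficient_of_variation` is not from the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion: standard deviation as a fraction of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean


# Figures from the salesperson example (units sold).
cv_person_1 = coefficient_of_variation(std_dev=5, mean=50)
cv_person_2 = coefficient_of_variation(std_dev=25, mean=75)

print(f"Sales Person 1: {cv_person_1:.0%}")
print(f"Sales Person 2: {cv_person_2:.0%}")
```

Because the CV is unit-free, it allows the two salespersons to be compared even though their average sales differ.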
  • 35. The Empirical Rule • The empirical rule approximates the variation of data in a bell-shaped distribution. • Approximately 68% of the data in a bell-shaped distribution is within 1 standard deviation of the mean, or $\mu \pm 1\sigma$.
  • 37. The Empirical Rule • Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of the mean, or $\mu \pm 2\sigma$. • Approximately 99.7% of the data in a bell-shaped distribution lies within three standard deviations of the mean, or $\mu \pm 3\sigma$.
  • 38. Chebyshev Rule • Regardless of how the data are distributed, at least $(1 - 1/k^2) \times 100\%$ of the values will fall within $k$ standard deviations of the mean (for $k > 1$). • For example, when $k = 2$, at least 75% of the values of any data set will be within $\mu \pm 2\sigma$.
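To see how the two rules relate, the sketch below compares the Chebyshev lower bound with the exact coverage of a normal distribution, computed from the error function. The function names are illustrative; the normal coverage formula erf(k/√2) is a standard result, not something stated on the slides.

```python
import math


def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's rule is informative only for k > 1")
    return 1 - 1 / k ** 2


def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


print(f"k=1: normal {normal_coverage(1):.4f}")
for k in (2, 3):
    print(f"k={k}: Chebyshev >= {chebyshev_lower_bound(k):.4f}, "
          f"normal {normal_coverage(k):.4f}")
```

Chebyshev's bound holds for any distribution, so it is looser than the empirical rule, which applies only to bell-shaped data.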
  • 39. The Five Number Summary The five numbers that help describe the center, spread and shape of data are: ▪ Xsmallest ▪ First Quartile (Q1) ▪ Median (Q2) ▪ Third Quartile (Q3) ▪ Xlargest
  • 41. Relationships among the five-number summary and distribution shape
      Left-Skewed                               Symmetric                                 Right-Skewed
      Median - Xsmallest > Xlargest - Median    Median - Xsmallest ≈ Xlargest - Median    Median - Xsmallest < Xlargest - Median
      Q1 - Xsmallest > Xlargest - Q3            Q1 - Xsmallest ≈ Xlargest - Q3            Q1 - Xsmallest < Xlargest - Q3
      Median - Q1 > Q3 - Median                 Median - Q1 ≈ Q3 - Median                 Median - Q1 < Q3 - Median
  • 42. Five Number Summary and The Boxplot • The Boxplot: a graphical display of the data based on the five-number summary. [Figure: boxplot marking Xsmallest, Q1, Median, Q3 and Xlargest; each of the four segments contains 25% of the data]
  • 44. Five Number Summary: Shape of Boxplots • If data are symmetric around the median, then the box and central line are centered between the endpoints. • A boxplot can be shown in either a vertical or horizontal orientation. [Figure: boxplot marking Xsmallest, Q1, Median, Q3 and Xlargest]
  • 45. Distribution Shape and The Boxplot [Figure: three boxplots, each marking Q1, Q2 and Q3, for right-skewed, left-skewed and symmetric data]
  • 46. Boxplot Example The data are right-skewed in the following plot. [Figure: boxplot with labeled values 0, 2, 3, 5 and 27]
  • 47. Box plot example showing an outlier • The boxplot below of the same data shows the outlier value of 27 plotted separately. • A value is considered an outlier if it is more than 1.5 times the interquartile range below Q1 or above Q3. [Figure: boxplot on an axis running from 0 to 30]
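The 1.5 × IQR rule can be sketched in a few lines. The data set below is hypothetical, chosen only so that one value (27) falls outside the upper fence; it is not the data behind the slide's plot. The quartile convention (`method="exclusive"`) is also an assumption, since the slides do not specify one.

```python
import statistics


def iqr_outliers(values):
    """Return (outliers, (lower_fence, upper_fence)) using the 1.5 x IQR rule."""
    # Assumption: (n+1)-based quartile positions ("exclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return outliers, (lower_fence, upper_fence)


# Hypothetical right-skewed sample with one extreme value.
sample = [0, 2, 2, 3, 3, 4, 5, 5, 27]
outliers, fences = iqr_outliers(sample)

print("fences:", fences)
print("outliers:", outliers)
```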
  • 48. Graphic Displays of Basic Statistical Descriptions • Boxplot: graphic display of the five-number summary • Histogram: the x-axis shows values, the y-axis represents frequencies • Quantile plot: each value xi is paired with fi, indicating that approximately 100 fi% of the data are ≤ xi • Quantile-quantile (q-q) plot: graphs the quantiles of one univariate distribution against the corresponding quantiles of another • Scatter plot: each pair of values is a pair of coordinates and is plotted as a point in the plane
  • 49. Histograms Often Tell More than Boxplots ■ The two histograms shown on the left may have the same boxplot representation ■ The same values for: min, Q1, median, Q3, max ■ But they have rather different data distributions
  • 50. Quantile Plot • Displays all of the data (allowing the user to assess both the overall behavior and unusual occurrences) • Plots quantile information • For data xi sorted in increasing order, fi indicates that approximately 100 fi% of the data are below or equal to the value xi
  • 51. Quantile-Quantile (Q-Q) Plot • Graphs the quantiles of one univariate distribution against the corresponding quantiles of another • View: is there a shift in going from one distribution to another? • The example shows the unit price of items sold at Branch 1 vs. Branch 2 for each quantile. Unit prices of items sold at Branch 1 tend to be lower than those at Branch 2.
  • 52. Scatter plot Provides a first look at bivariate data to see clusters of points, outliers, etc. Each pair of values is treated as a pair of coordinates and plotted as a point in the plane.
  • 53. Positively and Negatively Correlated Data The left half fragment is positively correlated. The right half is negatively correlated.
  • 55. Visually Evaluating Correlation Scatter plots showing correlation coefficients ranging from -1 to 1.
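The correlation values that such scatter plots illustrate are typically Pearson correlation coefficients. The sketch below computes one from scratch; the slides do not name the coefficient, so Pearson's r is an assumption, and the three small data sets are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: perfectly rising, perfectly falling, and noisy.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]
noisy = [2, 1, 4, 3, 5]

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_noisy = pearson_r(x, noisy)
print(r_rising, r_falling, r_noisy)
```

Values near +1 or -1 correspond to points lying close to a rising or falling straight line; values near 0 indicate no linear relationship.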
  • 57. Summary • Histograms • Measures of central tendency: mean, mode, median • Measures of dispersion: range, IQR, variance, std deviation, coefficient of variation. • Normal distribution, Chebyshev Rule. • Five number summary, boxplots, QQ plots, Quantile plot, scatter plot. • Visualization: scatter plot matrix, parallel coordinates. • Correlation analysis.
  • 66. Bayes’ Theorem • Bayes’ Theorem is used to revise previously calculated probabilities based on new information. • Developed by Thomas Bayes in the 18th century. • It is an extension of conditional probability. $B_i$ = $i$th event of $k$ mutually exclusive and collectively exhaustive events; $A$ = new event that might impact $P(B_i)$
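In the notation above, the standard form of the theorem is P(B_i | A) = P(A | B_i) P(B_i) / Σ_j P(A | B_j) P(B_j). The sketch below applies it to hypothetical numbers (three machines with assumed production shares and defect rates); none of these figures come from the slides.

```python
def bayes_posteriors(priors, likelihoods):
    """Revise priors P(B_i) into posteriors P(B_i | A) given P(A | B_i)."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Joint probabilities P(A and B_i) = P(A | B_i) * P(B_i).
    joints = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability P(A), valid because the B_i are mutually
    # exclusive and collectively exhaustive.
    p_a = sum(joints)
    return [j / p_a for j in joints]


# Hypothetical example: B_i = "item came from machine i", A = "item is defective".
priors = [0.5, 0.3, 0.2]          # assumed share of output per machine
likelihoods = [0.02, 0.03, 0.05]  # assumed defect rate per machine

posteriors = bayes_posteriors(priors, likelihoods)
print([round(p, 4) for p in posteriors])
```

After observing a defective item, the probability that it came from the third machine rises from its prior of 0.2, because that machine has the highest assumed defect rate.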
  • 70. In precise terms, a probability distribution is a total listing of the various values the random variable can take along with the corresponding probability of each value. A real life example could be the pattern of distribution of the machine breakdowns in a manufacturing unit. • The random variable in this example would be the various values the machine breakdowns could assume. • The probability corresponding to each value of the breakdown is the relative frequency of occurrence of the breakdown. • The probability distribution for this example is constructed by the actual breakdown pattern observed over a period of time. Statisticians use the term “observed distribution” of breakdowns.
  • 75. P(x) is the probability of getting x successes in n trials
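The formula itself is not reproduced in this transcript. The standard binomial probability is P(x) = C(n, x) p^x (1 - p)^(n - x), where p is the probability of success on a single trial. A minimal sketch, with a hypothetical n and p chosen only for illustration:

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


# Hypothetical example: 10 trials, success probability 0.5 per trial.
n, p = 10, 0.5
p_five = binomial_pmf(5, n, p)
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

print(f"P(X=5) = {p_five:.4f}, sum over all x = {total:.4f}")
```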
  • 78. • The Poisson distribution is another discrete distribution, which also plays a major role in quality control in the context of reducing the number of defects per standard unit. • Examples include the number of defects per item, the number of defects per transformer produced, the number of defects per 100 m² of cloth, etc. • Other real-life examples include 1) the number of cars arriving at a highway check post per hour; 2) the number of customers visiting a bank per hour during the peak business period
  • 80. $P(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$ • P(x) = probability of x events in an interval, given λ • λ = average number of events per unit • e = 2.71828 (the base of the natural logarithm) • x = number of events per unit, which can take the values 0, 1, 2, 3, … • λ is the parameter of the Poisson distribution.
  • 81. If, on average, 6 customers arrive every two minutes at a bank during busy working hours, a) what is the probability that exactly four customers arrive in a given minute? b) What is the probability that more than three customers arrive in a given minute? Solution: 6 customers arrive every two minutes, so on average 3 customers arrive every minute, which gives λ = 3. a) P(X = 4) b) P(X > 3) = 1 - P(X ≤ 3). The problem gives a mean value for a time interval as its input, which is one indication that the Poisson distribution should be applied.
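The two probabilities in this problem can be evaluated with the Poisson formula P(x) = e^(-λ) λ^x / x!. The sketch below is one way to do the arithmetic with λ = 3; the function name `poisson_pmf` is illustrative.

```python
import math


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean count is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 3  # 6 customers per two minutes = 3 customers per minute

# a) exactly four customers in a given minute
p_exactly_4 = poisson_pmf(4, lam)

# b) more than three customers = 1 - P(X <= 3)
p_at_most_3 = sum(poisson_pmf(x, lam) for x in range(4))
p_more_than_3 = 1 - p_at_most_3

print(f"P(X=4)  = {p_exactly_4:.4f}")
print(f"P(X>3)  = {p_more_than_3:.4f}")
```

This gives roughly 0.168 for part a) and roughly 0.353 for part b).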
  • 83. The normal distribution is the most widely used continuous distribution. Much of inferential statistics is based on the normal distribution. When the sample size is reasonably large, the sampling distribution of the mean is approximately normal for almost any dataset. • The normal distribution is a continuous distribution that looks like a bell. • Statisticians use the expression “Bell Shaped Distribution”. • The mean, the median, and the mode are all equal to one another. • It is symmetrical about its mean. • If the tails of the normal distribution are extended, they approach the horizontal axis ever more closely without actually touching it. • The normal distribution has two parameters, namely the mean µ and the standard deviation σ
  • 93. • Mutually exclusive Vs Independent Events. • Conditional Probability. • Bayes Theorem. • Applying Probability Concepts. • Applying Distribution Concepts.
  • 98. Irrespective of the shape of the distribution of the original population, the sampling distribution of the mean will approach a normal distribution as the size of the sample increases and becomes large
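This statement can be explored with a small simulation. The sketch below draws repeated samples from a strongly right-skewed population (an exponential distribution with mean 1, chosen here purely as an illustration) and looks at how the sample means behave as the sample size grows. The seed, sample sizes, and number of replications are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_means(sample_size, replications=2000):
    """Means of repeated samples drawn from an exponential(1) population."""
    means = []
    for _ in range(replications):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


means_small = sample_means(sample_size=5)
means_large = sample_means(sample_size=50)

# The sample means centre on the population mean (1.0), and their spread
# shrinks roughly like 1 / sqrt(sample_size).
sd_small = statistics.stdev(means_small)
sd_large = statistics.stdev(means_large)
print(f"n=5:  mean of means={statistics.fmean(means_small):.3f}, sd={sd_small:.3f}")
print(f"n=50: mean of means={statistics.fmean(means_large):.3f}, sd={sd_large:.3f}")
```

Plotting a histogram of `means_large` would show a shape much closer to a bell curve than the skewed population it was drawn from.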