BASIC STATISTICS
IN ONE HOUR
SESSION FLOW
What is Statistics?
Population & Sample
What is Data?
Types of Data
Level of Measurements
Summary Statistics
Types of Charts
Presentation of data
Univariate Analysis
Bivariate Analysis
Statistics
Statistics is the science concerned with developing and
studying methods for collecting, analysing, interpreting
and presenting data.
Population is the entire group that you want to draw conclusions about.
A sample is a subset of the population, chosen so that it reflects the characteristics of that population.
The method of selecting a sample from the population is called the sampling method.
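As a minimal sketch of one common sampling method, simple random sampling, the snippet below draws units without replacement using Python's standard `random` module. The population of customer IDs, the function name `simple_random_sample` and the seed are illustrative assumptions, not something the slides prescribe.

```python
import random

# Hypothetical population: 16 customer IDs, 10001 to 10016.
population = list(range(10001, 10017))

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(units, n)

sample = simple_random_sample(population, 5, seed=42)
print("Sample of 5 customers:", sorted(sample))
```

Other sampling methods (stratified, systematic, cluster) would need a different selection rule.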
What is Data?
Data is a collection of facts or information from which
conclusions may be drawn.
Data
Laal Singh Chaddha (Aamir Khan) is that passenger on your train who has a lot of
stories to tell, even if you don’t want to be part of it. That’s how the story starts by
Laal making the viewers the co-passengers on a train to Chandigarh and starting to
narrate his journey from a dim-witted guy wearing leg-braces to the front-page
celebrity of a famous magazine. Laal grows up with just one person Rupa (Kareena
Kapoor Khan) who actually gets him after his mother (Mona Singh).
Cust ID  Gender  Age  Region  Source     Payment           Product       Amount  Time Of Day
10001    Male    38   East    TV advt    Credit Card       Books            617  22:19
10002    Female  25   West    Email      Paypal            Clothing        3083  13:27
10003    Male    24   North   Email      Net Banking       Grocery         1762  14:27
10004    Male    33   West    Email      Paypal            Home Kitchen    2248  15:38
10005    Male    21   South   TV advt    Cash On Delivery  Grocery         1299  15:21
10006    Male    28   West    Web        Paypal            Mobile         13041  13:11
10007    Male    20   East    Email      Paypal            Mobile         14455  21:59
10008    Female  20   West    TV advt    Credit Card       Home Kitchen   13090  04:04
10009    Female  38   West    TV advt    Cash On Delivery  Grocery        16322  19:35
10010    Male    26   South   Newspaper  Credit Card       Grocery        11716  13:26
10011    Female  27   South   Newspaper  Paypal            Home Kitchen   18176  14:17
10012    Male    45   East    Newspaper  Credit Card       Books          15505  01:01
10013    Male    58   North   Email      Cash On Delivery  Books          21649  10:04
10014    Male    49   East    Email      Debit Card        Home Kitchen   18227  09:09
10015    Female  29   West    Email      Net Banking       Clothing       10971  05:05
10016    Male    19   West    TV advt    Credit Card       Clothing       12956  20:29
Types of Data
Qualitative or attribute data: the characteristic being studied is non-numeric.
E.g.: gender, religious affiliation, state of birth, condition of patient, words, images, videos.
Quantitative data: the characteristic being studied is numeric.
E.g.: time (in seconds) for a 400 m race, age of a corona patient, number of WBCs in a blood sample.
Quantitative Data
Types of Variables
Discrete variables can only assume certain values.
E.g.: number of pregnancies, number of missing teeth in children of a school, number of visits made by a doctor, the number of goals in a football match, the number of wickets taken by a bowler in a cricket match.
Continuous variables can assume any value within a specified range.
E.g.: the height of an athlete or the weight of a boxer, skull circumference, diastolic blood pressure, serum cholesterol.
Levels of Measurements
Categorical
• Nominal
• Ordinal
Scale / Numeric
• Interval
• Ratio
Nominal-Level Data
Properties:
• Observations of a qualitative variable can only
be classified and counted.
• There is no particular order to the labels.
E.g. Blood group, marital status, eye colour, gender, religion, favorite beverage, group membership
Ordinal-Level Data
Properties:
• Data classifications are represented by sets of
labels or names (high, medium, low) that have
relative values.
• Because of the relative values, the data
classified can be ranked or ordered.
E.g. Stage of disease, Severity of pain, level of
satisfaction, Likert scale
Interval-Level Data
Properties:
• Data classifications are ordered according to
the amount of the characteristic they possess.
• Equal differences in the characteristic are
represented by equal differences in the
measurements.
E.g. Temperature (in °C or °F), SAT score, shoe size, dress size, distance from landmark, geographical coordinates (longitude, latitude)
Ratio-Level Data
Properties:
• Data classifications are ordered according to the amount of the
characteristics they possess.
• Equal differences in the characteristic are represented by equal
differences in the numbers assigned to the classifications.
• The zero point is the absence of the characteristic and the ratio
between two numbers is meaningful.
E.g. Head circumference, time until death, height, weight, Kelvin temperature
Decide Level of Measurement
• Sex: nominal
• Blood group: nominal
• BMI: numerical
• BMI group: ordinal
• Number of courses: numerical
• Body temperature: numerical
Presentation
of Data
Frequency tables
Cross-tables
Graphs & Tables
Tables &
Cross-tables
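A frequency table counts how often each category occurs, and a cross-table counts each combination of two categorical variables. The sketch below builds both with `collections.Counter`, using the Gender and Region columns of the 16-customer table shown earlier in the deck; the variable names and the print layout are illustrative choices.

```python
from collections import Counter

# Gender and Region columns for customers 10001 to 10016, in table order.
gender = ["Male", "Female", "Male", "Male", "Male", "Male", "Male", "Female",
          "Female", "Male", "Female", "Male", "Male", "Male", "Female", "Male"]
region = ["East", "West", "North", "West", "South", "West", "East", "West",
          "West", "South", "South", "East", "North", "East", "West", "West"]

# Frequency table: count and percentage per region.
freq = Counter(region)
n = len(region)
for value, count in freq.most_common():
    print(f"{value:<6} {count:>2}  {100 * count / n:5.1f}%")

# Cross-table: count of each (gender, region) pair.
cross = Counter(zip(gender, region))
regions = sorted(set(region))
print("Gender  " + "  ".join(f"{r:>5}" for r in regions))
for g in sorted(set(gender)):
    print(f"{g:<7} " + "  ".join(f"{cross[(g, r)]:>5}" for r in regions))
```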
Types of
Charts
Pie Chart
The pie (circle) represents 100% of the variable and is divided into sectors.
The area of each sector is proportional to the relative frequency of the category it represents.
Bar Chart
Bar graphs are most commonly used to represent categorical variables.
The bars can be vertical or horizontal and can show either the frequency or the percentage of each category.
Histogram
It is similar to the bar chart, but there are no gaps between the bars because the variable is continuous.
The width of each bar corresponds to a range of values of the variable; in most cases, all bars are given the same width.
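To make the idea of equal-width bars concrete, the sketch below groups the Amount column of the customer table into bins and prints a text histogram. The bin width of 5000 and the left-closed bins `[lower, upper)` are assumptions made for illustration; other widths are equally valid.

```python
from collections import Counter

# Amount column for customers 10001 to 10016.
amount = [617, 3083, 1762, 2248, 1299, 13041, 14455, 13090,
          16322, 11716, 18176, 15505, 21649, 18227, 10971, 12956]

bin_width = 5000  # assumed width; every bar covers the same range

# Map each value to the index of the bin [i * width, (i + 1) * width).
bin_index = Counter(int(v // bin_width) for v in amount)
n_bins = max(bin_index) + 1
counts = [bin_index[i] for i in range(n_bins)]  # empty bins count as 0

for i, c in enumerate(counts):
    lower, upper = i * bin_width, (i + 1) * bin_width
    print(f"[{lower:>5}, {upper:>5})  {'#' * c}")
```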
Scatter Diagram
If we have two variables that are
numerical, the relationship between
them can be illustrated using a scatter
diagram.
It plots one variable against the other in
a two-way diagram. One variable is
represented on the horizontal axis and
the other is plotted on the vertical axis
with each dot representing one case.
Box-Whisker Plot
The boxplot (also called a box-and-whisker plot) is used to summarize numerical variables based on the five-number summary.
Those five numbers are the minimum, lower quartile, median, upper quartile, and maximum.
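The five-number summary behind a boxplot can be sketched with Python's `statistics` module, here on the Age column of the customer table. Quartiles have several competing conventions; this uses the default ("exclusive") method of `statistics.quantiles`, so other software may report slightly different quartiles.

```python
import statistics

# Age column for customers 10001 to 10016.
age = [38, 25, 24, 33, 21, 28, 20, 20, 38, 26, 27, 45, 58, 49, 29, 19]

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, median, q3 = statistics.quantiles(age, n=4)
five_number = (min(age), q1, median, q3, max(age))

print("min, Q1, median, Q3, max =", five_number)
```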
Which Chart?
               ONLY ONE VARIABLE   SCALE          CATEGORICAL
SCALE          HISTOGRAM           SCATTER PLOT   BOX-PLOT
CATEGORICAL    PIE / BAR           BOX-PLOT       MULTIPLE / STACKED
Statistical
Analysis
Univariate Analysis
Bivariate Analysis
Multivariate Analysis
Univariate
Analysis
Univariate analysis is the most basic kind of statistical analysis: the data contain just one variable.
Its main objective is to describe the data and find the patterns in it.
Some of the measures in Univariate Analysis:
• Central Tendency
• Dispersion
• Skewness
• Kurtosis
Central Tendency
Measures that indicate the approximate centre of the data are called Measures of Central Tendency.
The Mean of a variable can be computed as the sum of the observed values divided by the number of observations.
The Median is the point at the centre of the data, where half of the values are above, and half are below it.
The Mode is the most frequently occurring value in the dataset.
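The three measures can be computed directly with Python's standard `statistics` module. The sketch below uses the Age column of the customer table; `multimode` is used instead of `mode` because this column has two equally frequent values.

```python
import statistics

# Age column for customers 10001 to 10016.
age = [38, 25, 24, 33, 21, 28, 20, 20, 38, 26, 27, 45, 58, 49, 29, 19]

mean_age = statistics.mean(age)        # sum of values / number of values
median_age = statistics.median(age)    # middle of the sorted values
modes_age = statistics.multimode(age)  # all most frequent values

print("mean  :", mean_age)
print("median:", median_age)
print("modes :", sorted(modes_age))
```

The mean is larger than the median here because a few older customers pull it upwards.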
Dispersion
Measures that describe the spread of the data around the central tendency are called Measures of Dispersion.
The Range is the difference between the largest and smallest values.
The Inter-Quartile Range is the difference between the upper quartile and the lower quartile.
The Variance is the average of the squared deviations from the mean.
Standard deviation is calculated as the square root of the variance.
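A short sketch of these four measures on the Age column of the customer table follows. It uses the population variance (dividing by the number of observations), which matches the "average of squared deviations" wording; the sample variance, which divides by one less, is shown for comparison. The quartiles rely on the default method of `statistics.quantiles`, which is one convention among several.

```python
import statistics

# Age column for customers 10001 to 10016.
age = [38, 25, 24, 33, 21, 28, 20, 20, 38, 26, 27, 45, 58, 49, 29, 19]

value_range = max(age) - min(age)

q1, _, q3 = statistics.quantiles(age, n=4)
iqr = q3 - q1

pop_variance = statistics.pvariance(age)     # divides by n
pop_std_dev = statistics.pstdev(age)         # square root of pop_variance
sample_variance = statistics.variance(age)   # divides by n - 1

print("range            :", value_range)
print("IQR              :", iqr)
print("variance (n)     :", pop_variance)
print("std deviation    :", round(pop_std_dev, 2))
print("variance (n - 1) :", round(sample_variance, 2))
```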
Skewness
Figure panels: Normal distribution, Positively Skewed, Negatively Skewed
Skewness is a measure of symmetry, or more precisely, the lack of
symmetry.
Kurtosis
Kurtosis is a statistical measure used to describe the degree to which
observations cluster in the tails or the peak of a frequency distribution.
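Skewness and kurtosis are commonly computed from central moments. The sketch below uses the simple moment-based definitions (third and fourth standardized moments, with no small-sample correction); statistical packages often apply adjusted formulas, so their numbers may differ. The function names and the two small datasets are hypothetical.

```python
def central_moment(data, k):
    """Average of (x - mean) ** k over the data."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Third standardized moment: 0 for symmetric data."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Fourth standardized moment: about 3 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]       # hypothetical, perfectly symmetric
right_tailed = [1, 1, 1, 2, 10]   # hypothetical, long right tail

print("skewness (symmetric)   :", skewness(symmetric))
print("skewness (right tailed):", round(skewness(right_tailed), 3))
print("kurtosis (symmetric)   :", round(kurtosis(symmetric), 3))
```

A positive skewness indicates a longer right tail (positively skewed) and a negative value a longer left tail (negatively skewed).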
Choosing Summary Statistics
Type of Variable               Summary statistic (measure of spread)
Scale, normally distributed    Mean (Standard deviation)
Scale, skewed data             Median (Interquartile range)
Categorical, ordinal           Median (Interquartile range)
Categorical, nominal           Mode (None)
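The decision rule in this table can be written as a small lookup. This is only a sketch of the rule as stated on the slide; the function name `choose_summary` and its arguments are hypothetical.

```python
def choose_summary(level, skewed=False):
    """Return (centre, spread) for a variable's level of measurement.

    level  : "nominal", "ordinal", "interval" or "ratio"
    skewed : only relevant for scale (interval / ratio) variables
    """
    if level in ("interval", "ratio"):
        if skewed:
            return ("median", "interquartile range")
        return ("mean", "standard deviation")
    if level == "ordinal":
        return ("median", "interquartile range")
    if level == "nominal":
        return ("mode", None)  # no measure of spread for nominal data
    raise ValueError(f"unknown level of measurement: {level!r}")

print(choose_summary("ratio"))
print(choose_summary("ratio", skewed=True))
print(choose_summary("nominal"))
```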
Bivariate
Analysis
Bivariate analysis is the analysis of the relationship between two variables or attributes.
It explores whether the two variables are related and how strong that relationship is, and looks for differences between the variables and possible causes of those differences.
Some of the measures in Bivariate Analysis:
• Correlation
• Regression
• Time Series
Correlation
If two variables change simultaneously because of a direct or indirect cause-effect relationship, there is a correlation between them.
Positive Correlation
If the two variables change in the same direction.
E.g. Temperature and sales of ice-cream
Negative Correlation
If the two variables change in opposite directions.
E.g. Temperature and sales of woollen clothes
Correlation Coefficient
Correlation coefficient is a statistical measure that indicates the extent to which two or more variables fluctuate in relation to each other.
Scatter Plot
A scatterplot is a type of data display that shows the relationship between two numerical variables.
Karl Pearson
It measures the linear association between two numeric variables.
Spearman
It measures the association between the ranks assigned to individual items of two variables, so it also captures monotonic relationships that are not linear.
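The two coefficients can be sketched in plain Python. Pearson's coefficient is the covariance divided by the product of the standard deviations, and Spearman's coefficient is computed here as Pearson's coefficient applied to ranks (ties receive the average rank). The function names and the small datasets are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation: linear association between x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]   # hypothetical: increasing but not linear in x

print("Pearson :", round(pearson(x, y), 4))
print("Spearman:", round(spearman(x, y), 4))
```

Because `y` always increases with `x`, the rank-based coefficient reaches 1 while Pearson's coefficient stays slightly below 1.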
Regression
Regression is the functional relationship between two or more variables, such that we can estimate the value of the dependent variable for given values of the independent variable(s).
If this functional relationship is linear in nature, it is called Linear Regression.
The regression line is given as
$y = a + b_{yx}\,x$
where $a$ is the intercept and $b_{yx}$ is the regression coefficient, which measures the change in the dependent variable $y$ for a unit change in the independent variable $x$.
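For a single independent variable, the regression line is usually estimated by least squares: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept makes the line pass through the point of means. The sketch below assumes that method; the function name `fit_line` and the data are hypothetical, chosen so that the points lie exactly on y = 2 + 3x.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx       # regression coefficient (slope)
    a = my - b * mx     # intercept: line passes through (mean x, mean y)
    return a, b

x = [1, 2, 3, 4, 5]
y = [5, 8, 11, 14, 17]  # hypothetical data lying on y = 2 + 3x

a, b = fit_line(x, y)
print(f"y = {a:.2f} + {b:.2f} x")

# Estimate the dependent variable for a new value of x.
x_new = 6
print("estimated y at x = 6:", a + b * x_new)
```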
Time Series
A time series is a time-ordered sequence of observations taken at regular intervals (e.g. hourly, daily, weekly, monthly, quarterly, annually).
Examples of Time Series
• Daily: stock price, temperature
• Weekly: retail sales of a departmental store
• Monthly: unemployment rate, consumer price index
• Quarterly: GDP of a country
• Yearly: production of crops
Multivariate
Analysis
Multivariate analysis is the analysis of the relationships among more than two variables or attributes.
Some of the measures in Multivariate Analysis:
• Multiple Correlation
• Multiple Regression
• Discriminant Analysis
• ANOVA
• Structural Equation Modelling
References
https://ncert.nic.in/textbook.php?kest1=7-9
Std_11 - Google Drive
Std_12 - Google Drive
https://cdn1.byjus.com/wp-content/uploads/2020/07/GSEB-Class-12-Statistics-Part-1-Textbook-Commerce-Stream.pdf
https://schools.freshersnow.com/wp-content/uploads/2021/12/Std-12-Statistics-Part-2-E.M.pdf
THANK YOU
Dr Parag Shah | M.Sc., M.Phil., Ph.D. (Statistics)
pbshah@hlcollege.edu
www.paragstatistics.wordpress.com
Basic Statistics in 1 hour.pptx

  • 2. SESSION FLOW: What is Statistics?; Population & Sample; What is Data?; Types of Data; Levels of Measurement; Summary Statistics; Types of Charts; Presentation of Data; Univariate Analysis; Bivariate Analysis
  • 3. Statistics Statistics is the science concerned with developing and studying methods for collecting, analysing, interpreting and presenting data.
  • 4. Population is the entire group that you want to draw conclusions about. Sample is a subset of a population that contains characteristics of that population.
  • 5. The method of selecting a sample from the population is called the sampling method.
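As an illustration of one possible sampling method, the sketch below draws a simple random sample without replacement. The slide does not name a specific method, so simple random sampling, the seed, and the sample size of 5 are assumptions; the population is the list of customer IDs 10001 to 10016 from the sample table.

```python
import random

# Assumed population: customer IDs 10001..10016 from the sample table.
population = list(range(10001, 10017))

# Fixed seed (an arbitrary choice) so the draw is reproducible.
rng = random.Random(42)

# Simple random sampling without replacement; k=5 is an assumed sample size.
sample = rng.sample(population, k=5)

print(sample)
```

Every customer has the same chance of being selected, which is what makes the sample "simple random".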
  • 6. What is Data? Data is a collection of facts or information from which conclusions may be drawn.
  • 7. Data
    Laal Singh Chaddha (Aamir Khan) is that passenger on your train who has a lot of stories to tell, even if you don’t want to be part of it. That’s how the story starts by Laal making the viewers the co-passengers on a train to Chandigarh and starting to narrate his journey from a dim-witted guy wearing leg-braces to the front-page celebrity of a famous magazine. Laal grows up with just one person Rupa (Kareena Kapoor Khan) who actually gets him after his mother (Mona Singh).
    Cust ID | Gender | Age | Region | Source | Payment | Product | Amount | Time Of Day
    10001 | Male | 38 | East | TV advt | Credit Card | Books | 617 | 22:19
    10002 | Female | 25 | West | Email | Paypal | Clothing | 3083 | 13:27
    10003 | Male | 24 | North | Email | Net Banking | Grocery | 1762 | 14:27
    10004 | Male | 33 | West | Email | Paypal | Home Kitchen | 2248 | 15:38
    10005 | Male | 21 | South | TV advt | Cash On Delivery | Grocery | 1299 | 15:21
    10006 | Male | 28 | West | Web | Paypal | Mobile | 13041 | 13:11
    10007 | Male | 20 | East | Email | Paypal | Mobile | 14455 | 21:59
    10008 | Female | 20 | West | TV advt | Credit Card | Home Kitchen | 13090 | 04:04
    10009 | Female | 38 | West | TV advt | Cash On Delivery | Grocery | 16322 | 19:35
    10010 | Male | 26 | South | Newspaper | Credit Card | Grocery | 11716 | 13:26
    10011 | Female | 27 | South | Newspaper | Paypal | Home Kitchen | 18176 | 14:17
    10012 | Male | 45 | East | Newspaper | Credit Card | Books | 15505 | 01:01
    10013 | Male | 58 | North | Email | Cash On Delivery | Books | 21649 | 10:04
    10014 | Male | 49 | East | Email | Debit Card | Home Kitchen | 18227 | 09:09
    10015 | Female | 29 | West | Email | Net Banking | Clothing | 10971 | 05:05
    10016 | Male | 19 | West | TV advt | Credit Card | Clothing | 12956 | 20:29
  • 8. Types of Data Qualitative or Attribute data: the characteristic being studied is non-numeric. E.g.: gender, religious affiliation, state of birth, condition of patient, words, images, videos. Quantitative data: the characteristic being studied is numeric. E.g.: time (in seconds) for a 400 m race, age of a corona patient, no. of WBC in a blood sample.
  • 9. Quantitative Data Discrete variables: can only assume certain values. E.g.: no. of pregnancies, no. of missing teeth in children of a school, no. of visits made by a doctor, the number of goals in a football match, the number of wickets taken by a bowler in a cricket match. Continuous variables: can assume any value within a specified range. E.g.: the height of an athlete or the weight of a boxer, skull circumference, diastolic blood pressure, serum cholesterol.
  • 12. Nominal-Level Data Properties: • Observations of a qualitative variable can only be classified and counted. • There is no particular order to the labels. E.g. Blood group, Marital status, Eye colour, Gender, Religion, Favorite beverage, Group membership
  • 13. Ordinal-Level Data Properties: • Data classifications are represented by sets of labels or names (high, medium, low) that have relative values. • Because of the relative values, the data classified can be ranked or ordered. E.g. Stage of disease, Severity of pain, level of satisfaction, Likert scale
  • 14. Interval-Level Data Properties: • Data classifications are ordered according to the amount of the characteristic they possess. • Equal differences in the characteristic are represented by equal differences in the measurements. E.g. Temperature (in Celsius or Fahrenheit), SAT score, Shoe size, Dress size, distance from landmark, geographical coordinates (longitudes, latitudes)
  • 15. Ratio-Level Data Properties: • Data classifications are ordered according to the amount of the characteristic they possess. • Equal differences in the characteristic are represented by equal differences in the numbers assigned to the classifications. • The zero point is the absence of the characteristic and the ratio between two numbers is meaningful. E.g. Head circumference, Time until death, Height, Weight, Kelvin temperature
  • 18. Decide Level of Measurement
  • 19. • Sex: nominal • Blood group: nominal • BMI: numerical • BMI group: ordinal • Number of courses: numerical • Body temperature: numerical
  • 23. Pie Chart The pie (circle) represents 100% of the variable and is divided into sectors. The area of each sector represents the frequency of each category in the variable it represents.
  • 24. Bar Chart Bar graphs are most commonly used to represent categorical variables. They can be vertical or horizontal and can show the frequency or the percentage of each category.
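A pie chart or bar chart starts from a frequency table of the categories. As a minimal sketch, the code below counts the Region column of the sample table (customers 10001 to 10016) and prints a text bar chart; the text rendering is only an illustrative stand-in for a plotting library.

```python
from collections import Counter

# Region column for customers 10001..10016, copied from the sample table.
regions = [
    "East", "West", "North", "West", "South", "West", "East", "West",
    "West", "South", "South", "East", "North", "East", "West", "West",
]

counts = Counter(regions)                 # frequency of each category
total = sum(counts.values())
percentages = {region: 100 * n / total for region, n in counts.items()}

# One row per category: frequency, percentage, and a bar of '#' characters.
for region, n in counts.most_common():
    print(f"{region:<6} {n:>2} {percentages[region]:5.1f}% {'#' * n}")
```

The same counts give the sector sizes of a pie chart, since each sector is proportional to the category's share of the total.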
  • 25. Histogram It is similar to the bar chart, but there are no gaps between the bars as the variable is continuous. The width of each bar of the histogram relates to a range of values for the variable, but in most cases, the width is kept the same.
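Building a histogram means grouping a numeric variable into ranges and counting the values in each. The sketch below bins the Age column of the sample table; the bin width of 10 years is an assumption chosen for illustration, not something the slide specifies.

```python
from collections import Counter

# Age column for customers 10001..10016, copied from the sample table.
ages = [38, 25, 24, 33, 21, 28, 20, 20, 38, 26, 27, 45, 58, 49, 29, 19]

bin_width = 10  # assumed equal bin width, in years

# Map each age to the lower edge of its bin, e.g. 38 -> 30, then count.
bins = Counter((age // bin_width) * bin_width for age in ages)

# Print the bins in order; adjacent ranges touch, so there are no gaps.
for lower in sorted(bins):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * bins[lower]}")
```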
  • 26. Scatter Diagram If we have two variables that are numerical, the relationship between them can be illustrated using a scatter diagram. It plots one variable against the other in a two-way diagram. One variable is represented on the horizontal axis and the other is plotted on the vertical axis with each dot representing one case.
  • 27. Box-Whisker Plot The boxplot (also called Box and Whisker plot) is used to summarize numerical variables based on the five-number summary. Those five numbers are minimum, maximum, median, upper quartile, and lower quartile.
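The five-number summary behind a box plot can be sketched as follows, using the Amount column of the sample table. Quartile conventions differ between textbooks and software; this sketch assumes the default ("exclusive") method of Python's `statistics.quantiles`, so the quartiles may differ slightly from values computed by other tools.

```python
import statistics

# Amount column for customers 10001..10016, copied from the sample table.
amounts = [617, 3083, 1762, 2248, 1299, 13041, 14455, 13090,
           16322, 11716, 18176, 15505, 21649, 18227, 10971, 12956]


def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4)  # three quartile cut points
    return min(data), q1, median, q3, max(data)


summary = five_number_summary(amounts)
print(summary)
```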
  • 28. Which Chart?
    First variable | Only one variable | With a scale variable | With a categorical variable
    Scale | Histogram | Scatter plot | Box-plot
    Categorical | Pie / Bar | Box-plot | Multiple / Stacked
  • 31. Univariate Analysis Univariate analysis is a basic kind of analysis technique for statistical data. Here the data contains just one variable. The main objective of the univariate analysis is to describe the data in order to find out the patterns in the data. Some of the measures in Univariate Analysis: • Central Tendency • Dispersion • Skewness • Kurtosis
  • 32. Central Tendency Measures that indicate the approximate centre of the data are called Measures of Central Tendency. The Mean of a variable is the sum of the observed values divided by the number of observations. The Median is the point at the centre of the ordered data, where half of the values are above it and half are below it. The Mode is the most frequently occurring value in the dataset.
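As a small worked example, the three measures can be computed for the Age column of the sample table with Python's standard `statistics` module. `multimode` is used rather than `mode` because more than one value can tie for most frequent.

```python
import statistics

# Age column for customers 10001..10016, copied from the sample table.
ages = [38, 25, 24, 33, 21, 28, 20, 20, 38, 26, 27, 45, 58, 49, 29, 19]

mean_age = statistics.mean(ages)        # sum of values / number of values
median_age = statistics.median(ages)    # middle of the sorted values
modes_age = statistics.multimode(ages)  # all most-frequent values

print(mean_age, median_age, modes_age)
```

Here the mean (31.25) is larger than the median (27.5), which hints that a few older customers pull the average upward.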
  • 33. Dispersion Measures that describe the spread of the data around the central tendency are called Measures of Dispersion. The Range is the difference between the largest and smallest values. The Inter-Quartile Range is the difference between the upper quartile and the lower quartile. The Variance is the average of the squared deviations from the mean. The Standard Deviation is the square root of the variance.
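A minimal sketch of these four measures on the Age column of the sample table follows. It assumes the population form of the variance (dividing by n, matching "average of squared deviations") and the default quartile method of `statistics.quantiles`; a sample variance would divide by n - 1 instead.

```python
import statistics

# Age column for customers 10001..10016, copied from the sample table.
ages = [38, 25, 24, 33, 21, 28, 20, 20, 38, 26, 27, 45, 58, 49, 29, 19]

value_range = max(ages) - min(ages)          # largest minus smallest

q1, q2, q3 = statistics.quantiles(ages, n=4)  # quartile cut points
iqr = q3 - q1                                 # upper minus lower quartile

pvar = statistics.pvariance(ages)  # mean squared deviation from the mean
pstd = statistics.pstdev(ages)     # square root of the variance

print(value_range, iqr, pvar, pstd)
```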
  • 34. Skewness Skewness is a measure of symmetry, or more precisely, the lack of symmetry. (Figure panels: Normal distribution, Positively Skewed, Negatively Skewed.)
  • 35. Kurtosis Kurtosis is a statistical measure used to describe the degree to which observations cluster in the tails or the peak of a frequency distribution.
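Skewness and kurtosis can be sketched with moment-based formulas. The slides do not state which definitions they use, so the choices here are assumptions: skewness as the third central moment divided by the 1.5 power of the second, and excess kurtosis as the fourth central moment divided by the square of the second, minus 3 (so that a normal distribution scores 0). The helper names are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean)**k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness; 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]   # one large value stretches the right tail

print(skewness(symmetric), skewness(right_tailed))
print(excess_kurtosis(symmetric))
```

Both functions assume the data are not all identical, since the variance would then be zero.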
  • 36. Choosing Summary Statistics
    Type of variable | Summary statistic (measure of spread)
    Scale, normally distributed | Mean (Standard deviation)
    Scale, skewed data | Median (Interquartile range)
    Categorical, ordinal | Median (Interquartile range)
    Categorical, nominal | Mode (None)
  • 37. Bivariate Analysis Bivariate analysis examines the relationship between two variables or attributes. It explores whether the two variables are related and how strong the relationship is, in order to find any differences between them and the causes of those differences. Some of the measures in Bivariate Analysis: • Correlation • Regression • Time Series
  • 38. Correlation If two variables change together because of a direct or indirect cause-effect relationship, there is a correlation between the variables. Positive Correlation: the two variables change in the same direction. E.g. temperature and sales of ice cream. Negative Correlation: the two variables change in opposite directions. E.g. temperature and sales of woollen clothes.
  • 39. Correlation Coefficient The correlation coefficient is a statistical measure that indicates the extent to which two or more variables fluctuate in relation to each other. Scatter Plot: a scatterplot is a type of data display that shows the relationship between two numerical variables. Karl Pearson: measures the linear association between two numeric variables. Spearman: measures the linear association between the ranks assigned to the individual items of two variables.
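The two coefficients can be sketched in plain Python as follows. Spearman's coefficient is computed here as Pearson's coefficient applied to ranks, with tied values given the average of their ranks; the tie-handling rule and the function names are assumptions made for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)   # assumes neither variable is constant


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                       # extend the run of tied values
        avg = (i + j) / 2 + 1            # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]                    # increases with x, but not linearly
print(pearson(x, y), spearman(x, y))
```

In this example the ranks agree perfectly, so Spearman's coefficient is 1, while Pearson's is below 1 because the relationship is curved rather than a straight line.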
  • 40. Regression Regression is the functional relationship between two or more variables, such that we can estimate the value of the dependent variable for given value(s) of the independent variable(s). If this functional relationship is linear in nature, it is called Linear Regression. The regression line is given as $y = a + b_{yx}\,x$, where $a$ is the intercept and $b_{yx}$ is the regression coefficient, which measures the change in the dependent variable $y$ for a unit change in the independent variable $x$.
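A minimal sketch of fitting the regression line of y on x by ordinary least squares is shown below. The slide does not say how a and b_yx are estimated, so least squares is an assumption; the function name and the small data set are hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b_yx) for the line y = a + b_yx * x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b_yx = sxy / sxx              # change in y per unit change in x
    a = mean_y - b_yx * mean_x    # the line passes through (mean_x, mean_y)
    return a, b_yx


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

a, b_yx = fit_line(x, y)
predicted = a + b_yx * 5          # estimate y for a new value x = 5
print(a, b_yx, predicted)
```

Once a and b_yx are known, the dependent variable can be estimated for any given value of the independent variable, as the last two lines show.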
  • 41. Time Series A time series is a time-ordered sequence of observations taken at regular intervals (e.g. hourly, daily, weekly, monthly, quarterly, annually). Examples of Time Series: • Daily: stock price, temperature • Weekly: retail sales of a departmental store • Monthly: unemployment rate, consumer price index • Quarterly: GDP of a country • Yearly: production of crops
  • 42. Multivariate Analysis Multivariate analysis examines the relationships among more than two variables or attributes. Some of the measures in Multivariate Analysis: • Multiple Correlation • Multiple Regression • Discriminant Analysis • ANOVA • Structural Equation Modelling
  • 43. References: https://ncert.nic.in/textbook.php?kest1=7-9; Std_11 - Google Drive; Std_12 - Google Drive; https://cdn1.byjus.com/wp-content/uploads/2020/07/GSEB-Class-12-Statistics-Part-1-Textbook-Commerce-Stream.pdf; https://schools.freshersnow.com/wp-content/uploads/2021/12/Std-12-Statistics-Part-2-E.M.pdf
  • 44. THANK YOU Dr Parag Shah | M.Sc., M.Phil., Ph.D. ( Statistics) pbshah@hlcollege.edu www.paragstatistics.wordpress.com