Descriptive Statistics in
cardiovascular research
THE SIMPLEST WAY OF HANDLING DATA
Statistics in general
SUBCLAUSE USED FOR
• Collection
• Analysis
• Interpretation
• Presentation
• Reasoning
• Discussion
• Calculation
• Scientific inference
STATISTICS IN GENERAL
DESCRIPTIVE
INFERENTIAL
Descriptive Statistics
• Data analysis begins with the calculation of descriptive
statistics for the research variables
• These statistics summarize various aspects of the
data, giving details about the sample and providing
information about the population from which the sample
was drawn
• Each variable's type determines the nature of the
descriptive statistics that one calculates and the manner
in which one reports or displays those statistics
• Put simply, they describe what is going on in our data
Inferential Statistics
• Trying to reach conclusions that extend beyond the immediate data
alone (to INFER)
• We use inferential statistics to try to infer from the sample data what
the population might think or experience
• Make judgments of the probability that an observed difference
between groups is a dependable one or one that might have
happened by chance in this study
• Make inferences from our data to more general conditions
http://www.socialresearchmethods.net/kb/statinf.php
KEYWORDS
• POPULATION [Orientation]
• SAMPLE [Representative]
• VARIABLES [Characteristics]
• PARAMETERS [quantities that define a statistical model]
DISPLAY OF DESCRIPTIVE STATISTICS
TABLES GRAPHS
CHARTS
CIRCLE
DOT PLOTS
BOX-AND-WHISKER PLOTS
SCATTERPLOT
SURVIVAL PLOTS
BLAND-ALTMAN PLOTS
TABLE-1
Medical Statistics Part-I:Descriptive statistics
Variables:
DISCRETE
• Take only certain values (fixed and readily countable)
• Examples of discrete variables commonly encountered in
cardiovascular research include species, strain, racial/ethnic group,
sex, education level, treatment group, hypertension status,
and New York Heart Association class
CONTINUOUS
• Can take an infinite number of values
• Fixed intervals between adjacent values
• They can be manipulated mathematically, taking sums and
differences
• Examples: age, height, weight, blood pressure, measures of cardiac
structure and function, blood chemistries, and survival time
Discrete variables (categorical)
NOMINAL (UNORDERED)
Take values such as yes/no, human/dog/mouse, female/male,
treatment A/B/C; a nominal variable that takes only 2 possible
values is called binary. One may apply numbers as labels for
nominal categories, but there is no natural ordering.
ORDINAL (ORDERED)
Take naturally ordered values such as New York Heart Association
class (I, II, III, or IV), hypertension status (optimal, normal,
high-normal, or hypertensive), or education level (less than high
school, high school, college, graduate school).
Categorical
• A categorical variable (sometimes called a nominal variable) is one
that has two or more categories, but there is no intrinsic ordering to
the categories. For example, gender is a categorical variable
having two categories (male and female) and there is no intrinsic
ordering to the categories. Hair color is also a categorical variable
having a number of categories (blonde, brown, brunette, red, etc.)
and again, there is no agreed way to order these from highest to
lowest. A purely categorical variable is one that simply allows you to
assign categories, but you cannot clearly order the categories. If the
variable has a clear ordering, then that variable would be an
ordinal variable, as described below.
Ordinal
• An ordinal variable is similar to a categorical variable. The difference between
the two is that there is a clear ordering of the categories. For example, suppose
you have a variable, economic status, with three categories (low, medium and
high). In addition to being able to classify people into these three categories,
you can order the categories as low, medium and high. Now consider a
variable like educational experience (with values such as elementary school
graduate, high school graduate, some college and college graduate). These
also can be ordered as elementary school, high school, some college, and
college graduate. Even though we can order these from lowest to highest, the
spacing between the values may not be the same across the levels of the
variable. Say we assign scores 1, 2, 3 and 4 to these four levels of educational
experience and we compare the difference in education between categories
one and two with the difference in educational experience between categories
two and three, or the difference between categories three and four. The
difference between categories one and two (elementary and high school) is
probably much bigger than the difference between categories two and three
(high school and some college).
Ordinal
• In this example, we can order people by level of educational
experience, but the size of the difference between categories is
inconsistent (because the spacing between categories one and
two is bigger than that between categories two and three). If these
categories were equally spaced, then the variable would be an interval
variable.
Continuous variables
• Continuous variables can have an
infinite number of different values
between two given points. A count,
such as the number of children in a
family, cannot be measured on a
continuous scale. If height were being
measured, though, the variable
would be continuous, as there are
an unlimited number of possible values
even between 1 and 1.1 meters.
Descriptive statistics for discrete variables
• Absolute frequencies (raw counts) for each category
• Relative frequencies (proportions or percentages of the total
number of observations)
• Cumulative frequencies for successive categories of ordinal
variables
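As a rough illustration of the three frequency summaries listed above, the sketch below tabulates a made-up sample of New York Heart Association classes. The sample values and the helper name `frequency_table` are assumptions for illustration only and do not come from the slides.

```python
from collections import Counter


def frequency_table(values, ordered_categories):
    """Return (category, absolute, relative, cumulative relative) rows."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0  # running count, used for the cumulative frequency
    for category in ordered_categories:
        absolute = counts.get(category, 0)
        running += absolute
        rows.append((category, absolute, absolute / total, running / total))
    return rows


# Hypothetical sample of NYHA classes (an ordinal variable).
nyha = ["I", "II", "II", "III", "I", "II", "IV", "III", "II", "I"]
table = frequency_table(nyha, ["I", "II", "III", "IV"])

for category, absolute, relative, cumulative in table:
    print(f"{category:>3}  n={absolute}  {relative:.0%}  cumulative {cumulative:.0%}")
```

Cumulative frequencies are only meaningful here because the categories are passed in their natural order; for a nominal variable the last column would be arbitrary.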
Collection
• Formal sampling
• Recording responses to experimental conditions
• Observing a process repeatedly over time
Descriptive statistics for continuous variables
• Location statistics [CENTRAL TENDENCY]
  • MEAN
  • MEDIAN
  • MODE
  • QUANTILES
• Dispersion statistics
  • VARIANCE = S²
  • STANDARD DEVIATION = S = √S²
  • RANGE
  • INTERQUARTILE RANGE
• Shape statistics
  • SKEWNESS
  • KURTOSIS
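A minimal sketch of the location and dispersion statistics named above, using Python's standard `statistics` module. The blood pressure values are invented for illustration, and the sample variance shown uses the divisor n - 1, which is an assumption since the slides do not state the formula.

```python
import statistics

# Hypothetical systolic blood pressures (mmHg) for a small sample.
sbp = [118, 122, 125, 125, 130, 134, 141]

# Location statistics
mean = statistics.mean(sbp)
median = statistics.median(sbp)
mode = statistics.mode(sbp)

# Dispersion statistics
variance = statistics.variance(sbp)   # sample variance S², divisor n - 1
std_dev = statistics.stdev(sbp)       # S, the square root of S²
value_range = max(sbp) - min(sbp)

print(f"mean={mean:.1f} median={median} mode={mode}")
print(f"variance={variance:.1f} sd={std_dev:.1f} range={value_range}")
```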
ROBUST
• MEDIAN is robust: not strongly affected by outliers or by extreme
changes to a small portion of the data
• MEAN is sensitive (not robust) to those conditions
• MODE is robust to outliers, but it may be affected by data
collection operations, such as rounding or digit preference, that
alter data precision
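The contrast between the mean and the median can be seen with a small made-up example. The ages below are hypothetical, and the extreme value stands in for something like a data-entry error.

```python
import statistics

# Hypothetical ages (years).
ages = [52, 55, 58, 60, 61, 63, 65]
# The same sample with one extreme value appended.
ages_with_outlier = ages + [650]

# How far each statistic moves when the extreme value is added.
mean_shift = statistics.mean(ages_with_outlier) - statistics.mean(ages)
median_shift = statistics.median(ages_with_outlier) - statistics.median(ages)

print(f"mean moves by {mean_shift:.1f} years")
print(f"median moves by {median_shift:.1f} years")
```

In this sketch a single extreme observation drags the mean far from the bulk of the data, while the median barely changes.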
QUANTILES
• Quantiles combine aspects of ordered data and cumulative
frequencies
• The p-th quantile (0≤p≤1) is the value at or below which a
proportion p of the ordered data falls
• When 100p is an integer, the quantiles are called percentiles
• The median, or 0.50 quantile, is the 50th percentile; the 0.99 quantile is
the 99th percentile
• Three specific percentiles are widely used in descriptive statistics
[100p is an integer multiple of 25]
  • Q1 = first quartile (25th percentile, 0.25 quantile)
  • Q2 = second quartile (50th percentile, 0.50 quantile), the median
  • Q3 = third quartile (75th percentile, 0.75 quantile)
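The three quartiles can be sketched with `statistics.quantiles` from the Python standard library. The BMI values are hypothetical, and the `"inclusive"` interpolation method is only one of several conventions; other software may return slightly different quartiles for the same data.

```python
import statistics

# Hypothetical BMI values (kg/m²).
bmi = [19.5, 21.0, 22.4, 23.1, 24.8, 26.0, 27.3, 29.9, 33.2]

# n=4 splits the ordered data into four parts and returns the
# three cut points Q1, Q2 and Q3.
q1, q2, q3 = statistics.quantiles(bmi, n=4, method="inclusive")

print(f"Q1 (25th percentile) = {q1}")
print(f"Q2 (50th percentile, median) = {q2}")
print(f"Q3 (75th percentile) = {q3}")
```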
INTERQUARTILE RANGE [IQR]
• It is a single number
• Defined as IQR = Q3 - Q1
• Variance and standard deviation
are affected (increased) by the presence of extreme observations;
the IQR is not: it is robust
SKEWNESS [skewness coefficient]
For a given data set:
• A symmetric distribution has skewness = 0
• A distribution with a more pronounced tail in 1 direction than the
other is skewed (left tail, skewness<0; right tail, skewness>0)
• In a symmetric distribution, the mean = median
• A right- (left-) skewed distribution typically has its mean greater than
(less than) the median
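The slides do not give a formula for the skewness coefficient. The sketch below assumes one common moment-based definition, g1 = m3 / m2^1.5, where m2 and m3 are the second and third central moments; the function name and the data are illustrative only.

```python
import statistics


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (one common definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]  # hypothetical, long right tail

print(skewness(symmetric))      # zero for this symmetric sample
print(skewness(right_skewed))   # positive: tail on the right
print(statistics.mean(right_skewed), statistics.median(right_skewed))
```

For the right-skewed sample the mean sits above the median, in line with the rule of thumb on the slide.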
Kurtosis
• A measure of the “peakedness” of a distribution
• A Gaussian distribution (also called “normal”) with a bell-shaped
frequency curve has a kurtosis coefficient of 0 (that is, excess kurtosis,
measured relative to the normal distribution)
• Positive kurtosis indicates a sharper peak with longer/fatter tails and
relatively more variability due to extreme deviations
• A negative kurtosis coefficient indicates broader shoulders with
shorter/thinner tails
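As with skewness, no formula appears on the slide. This sketch assumes the moment-based excess kurtosis, g2 = m4 / m2^2 - 3, which is 0 for a normal distribution; the function name and both samples are made up for illustration.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis g2 = m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Evenly spread values: broad shoulders, short tails.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Mostly central values plus two extreme deviations: sharp peak, long tails.
heavy_tailed = [5, 5, 5, 5, 5, 5, 5, 5, -20, 30]

print(excess_kurtosis(flat))          # negative
print(excess_kurtosis(heavy_tailed))  # positive
```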
GRAPHS [complementary to tables]
DOT PLOT of a continuous variable (BMI)
The dot plot is a simple graph that is used mainly with small data
sets to show individual values of sample data in 1 dimension.
Box-and-whisker plot = box plot
The graph displays the values of the quartiles (Q1, Q2, Q3) by
a rectangular box. The ends of the box
correspond to Q1 and Q3, such that
the length of the box is the interquartile range
(IQR = Q3 - Q1). There is a line drawn inside the box at
the median, Q2, and a symbol is plotted
at the mean. Traditionally, “whiskers” (thin lines)
extend out to, at most, 1.5 times the box length
from both ends of the box: they connect all
values outside the box that are not more than 1.5 IQR away
from the box, and they must end at an observed
value. Beyond the whiskers are outliers, identified
individually by symbols such as circles or asterisks.
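The numbers behind a box plot can be computed without drawing anything. The sketch below follows the 1.5 x IQR whisker rule described above; the function name `box_plot_summary`, the BMI values and the `"inclusive"` quartile method are assumptions for illustration, and plotting software may use a different quartile convention.

```python
import statistics


def box_plot_summary(values):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        "mean": statistics.mean(data),
        # Whiskers end at the most extreme observed values inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


bmi = [19, 21, 22, 23, 25, 26, 27, 30, 45]  # hypothetical BMI values (kg/m²)
summary = box_plot_summary(bmi)
print(summary)
```

Note that the upper whisker stops at an observed value (30 here), not at the fence itself, and the single value beyond it is reported as an outlier.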
Univariate Analysis: Look at one variable at a time for 3 features
• Distribution: frequency in % / bar diagram / histogram
• Central Tendency: mean, median, mode
• Dispersion: range, standard deviation, variance
Correlation [r] is a single number that shows
the degree of relationship between 2
variables. It ranges from -1 to +1.
r is also called Karl Pearson’s
coefficient of correlation
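A minimal sketch of Pearson's r, written out from its usual definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name and the paired measurements are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired measurements: age (years) and systolic pressure (mmHg).
age = [35, 42, 50, 58, 63, 70]
sbp = [118, 121, 127, 133, 138, 145]

r = pearson_r(age, sbp)
print(f"r = {r:.3f}")
```

A value near +1 means the two variables rise together almost linearly, a value near -1 means one falls as the other rises, and a value near 0 means little linear relationship.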
This is just the beginning
With best wishes
