Basics of statistics
Conducted by Dept. of Biostatistics, NIMHANS
From 28 to 30 Sept, 2015
"It is easy to lie with statistics,
But it is hard to tell the truth without statistics."
–Andrejs Dunkels
Topics covered
• Introduction
• Types of statistics
• Definitions
• Variable & Types
• Variable scales
• Description of data
• Distribution of sample & population
• Measures of center, dispersion & shape
• Properties of Normal distribution
• Testing of hypothesis
• Types of error
• Estimation of sample size
• Various tests to be used
• Central limit theorem
• Parametric tests- t-test, ANOVA, Post Hoc, Correlation & Regression
• Non Parametric tests
• Tests for categorical data
• Summary of tests to be used
• Qualitative vs Quantitative research
• Qualitative research
• Software packages
Statistics
• Consists of a body of methods for collecting and analyzing data.
• It provides methods for-
– Design- planning and carrying out research studies
– Description- summarizing and exploring data
– Inference- making predictions & generalizing about phenomena
represented by data.
Types of statistics
• 2 major types of statistics
• Descriptive statistics- It consists of methods for organizing and
summarizing information.
– Includes- graphs, charts, tables & calculation of averages, percentiles
• Inferential statistics- It consists of methods for drawing and measuring the
reliability of conclusions about a population based on information obtained from a sample.
– Includes- point estimation, interval estimation, hypothesis testing.
• Both are interrelated. Necessary to use methods of descriptive statistics to
organize and summarize the information obtained before methods of
inferential statistics can be used.
Population & Sample
• Basic concepts in statistics.
• Population- It is the collection of all individuals or items under
consideration in a statistical study
• Sample- It is the part of the population from which information is
collected.
• Population always represents the target of an investigation. We learn about
population by sampling from the collection.
• Parameters- used to summarize the features of the population under
investigation.
• Statistic- it describes a characteristic of the sample, which can then be
used to make inference about unknown parameters.
Variable & types
• Variable- a characteristic that varies from one person or thing to another.
• Types- Qualitative/ Quantitative, Discrete/ Continuous, Dependent/
Independent
• Qualitative data- the variables that yield non numerical data.
– Eg- sex, marital status, eye colour
• Quantitative data- the variables that yield numerical data
– Eg- height, weight, number of siblings.
• Discrete variable- the variable has only a countable number of distinct
possible values.
– Eg- number of car accidents, number of children
• Continuous variable- the variable can take any value within a range (its unit is divisible).
– Eg- weight, length, temperature.
• Independent variable- a variable that does not depend on the other variables under study.
– Eg- age, sex.
• Dependent variable- depends on the independent variable.
– Eg- weight of a newborn, stress
Variable scales
• Variables can also be described according to the scale on which they are
defined.
• Nominal scale- the categories are merely names. They do not have a
natural order.
– Eg- male/female, yes/no
• Ordinal scale- the categories can be put in order. But the difference
between two adjacent categories may not be the same as between another two.
– Eg- mild/ Moderate/ Severe.
• Interval scale- the differences between values are comparable. The
variable does not have an absolute zero.
– Eg- temperature (in °C or °F), calendar time
• Ratio scale- the variable has an absolute zero & the differences between
values are comparable.
– Eg- stress using PSS, insomnia using ISI
• Nominal & Ordinal scales are used to describe Qualitative data.
• Interval & Ratio scales are used to describe Quantitative data.
Describing data
• Qualitative data-
– Frequency- number of observations falling into particular class/
category of the qualitative variable.
– Frequency distribution- table listing all classes & their frequencies.
– Graphical representation- Pie chart, Bar graph.
– Nominal data best displayed by pie chart
– Ordinal data best displayed by bar graph
• Quantitative data-
– Can be presented by a frequency distribution.
– If the discrete variable has a lot of different values, or if the data is a
continuous variable then data can be grouped into classes/ categories.
– Class intervals- contiguous ranges that together cover the values between
the minimum & maximum.
– Class limits- end points of class interval.
– Class frequency- number of observations in the data that belong to
each class interval.
– Usually presented as a Histogram or a Bar graph.
Population & Sample distribution
• Population distribution- frequency distribution of the population.
• Sample distribution- frequency distribution of the sample.
• Sample distribution is a blurry photo of the population distribution.
• As the sample size ↑, the sample distribution becomes a closer representation
of the population distribution.
• Sample or population distribution can be summarized by describing its
shape (based on the graph).
• It can be Symmetric or Nonsymmetric/ Skewed to left/ right based on its
tail.
Properties of Numerical data & Measures
• Central tendency- Mean, Median, Mode
• Dispersion- Range, Interquartile Range, Standard Deviation
• Shape- Skewness, Kurtosis
Measures of center
• Central tendency- In any distribution, majority of the observations pile up,
or cluster around in a particular region.
– Includes- Mean, Median & Mode.
• Mean- sum of observed values in a data divided by the number of
observations
• Median- observation in the data set that divides the data set into half.
• Mode- value of the data set which occurs with greatest frequency
• Mean can be applied only to Quantitative data; Median needs data that can at least be ordered (Ordinal or Quantitative).
• Mode can be used either to Qualitative or Quantitative data.
What to choose?
• Qualitative variable- Mode.
• Quantitative with symmetric distribution- Mean.
• Quantitative with skewed distribution- Median.
• Outlier- observation that falls far from the rest of the data. Mean gets
highly influenced by the outlier.
• We use sample mean, median & mode to estimate the population mean,
median & mode.
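The effect of an outlier on the three measures can be seen in a small sketch. It uses Python's standard `statistics` module; the data values and variable names are made up for illustration.

```python
import statistics

# Hypothetical lengths of hospital stay in days; the last value is an outlier.
stay_days = [2, 3, 3, 4, 5, 6, 40]

mean_stay = statistics.mean(stay_days)      # pulled upward by the outlier
median_stay = statistics.median(stay_days)  # middle value, barely affected
mode_stay = statistics.mode(stay_days)      # most frequent value

print("mean:", mean_stay, "median:", median_stay, "mode:", mode_stay)

# Mode is the only one of the three that works for qualitative (nominal) data.
eye_colour = ["brown", "blue", "brown", "green", "brown"]
mode_colour = statistics.mode(eye_colour)
print("most common eye colour:", mode_colour)
```

Here the single value of 40 drags the mean well above the median, which is why the median is preferred for skewed data.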
Measures of dispersion
• Dispersion- It is the spread/ variability of values about the measures of
central tendency. They quantify the variability of the distribution.
• Measures include-
– Range
– Sample interquartile range
– Standard deviation
• Mostly used for quantitative data
• Range- difference between the largest observed value in the data set and
the smallest one.
– Only the two extreme values are used, so a great deal of information is ignored.
• Interquartile range- difference between the first & third quartiles of the
variable.
– Percentile- divides the observed values into hundredths/ 100 equal
parts.
– Deciles- divides the observed values into tenths/ 10 equal parts
– Quartiles- divides the observed values into 4 equal parts. Q1 divides
the bottom 25% of observed values from top 75%...
• Standard deviation- square root of the average squared deviation of
observed values from the mean of the variable (the sample version divides by n − 1).
– It is defined using the sample mean & its value gets strongly affected by a
few extreme observations.
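A minimal sketch of the three measures of dispersion, using only the standard library. The data are invented, and `statistics.quantiles` uses one particular quartile convention (its default "exclusive" method); other software may report slightly different quartiles.

```python
import math
import statistics

# Hypothetical test scores.
scores = [4, 8, 15, 16, 23, 42]

# Range: largest minus smallest observed value.
data_range = max(scores) - min(scores)

# Quartiles: n=4 returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Sample standard deviation written out from its definition:
# square root of the summed squared deviations divided by (n - 1).
n = len(scores)
mean = sum(scores) / n
sd_manual = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

# The library function gives the same sample standard deviation.
sd_library = statistics.stdev(scores)

print(data_range, iqr, round(sd_manual, 2))
```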
Shape
• Skewness- Lack of symmetry in distribution. It can be interpreted from
frequency polygon.
• Properties-
– Mean, median & mode fall at different points.
– Quartiles are not equidistant from median.
– Curve is not symmetrical but stretched more to one side.
• Distribution may be positively or negatively skewed. Limits for Pearson’s
coefficient of skewness are ± 3.
• Kurtosis- convexity of a curve.
– Gives an idea about the flatness/ peakedness of the curve.
Normal distribution
• Bell shaped symmetric distribution.
• Why is it important?
– Many things are normally distributed, or very close to it.
– It is easy to work with mathematically
– Most inferential statistical methods make use of properties of the
normal distribution.
• Mean = Median = Mode
• 68.3% of the values lie within 1 SD of the mean.
• 95.4% of the values lie within 2 SD of the mean.
• 99.7% of the values lie within 3 SD of the mean.
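The shares within 1, 2 and 3 SD can be checked numerically. A minimal sketch using `statistics.NormalDist` from the standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Probability that a normal value lies within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.2f}%")
```

Because the curve is symmetric, the same shares hold for any mean and SD, not only for the standard normal used here.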
Tests to check normal distribution
1. Checking measures of Central tendency, Skewness & Kurtosis.
2. Graphical evaluation- normal plot, frequency polygon.
3. Statistical tests-
– Kolmogorov-Smirnov test
– Shapiro-Wilk test
– Lilliefors test
– Pearson’s chi-squared test
• Shapiro-Wilk has the best power for a given significance.
• If not normally distributed?- correction by transformation of the data- log
transformation, square root transformation.
Hypothesis testing
• Aim of doing a study is to check whether the data agree with certain
predictions. These predictions are called hypotheses.
• Hypotheses arise from the theory that drives the research.
• Significance test- it is a way of statistically testing a hypothesis by
comparing the data values.
– It consists of two hypotheses- Null (H0) & Alternative hypothesis (H1).
– Null hypothesis is usually a statement that the parameter has a value
corresponding to, in some sense, no effect.
– Alternative hypothesis is a hypothesis that contradicts the null hypothesis.
– Hypotheses are formulated before collecting the data.
• Significance test analyzes the strength of sample evidence against the null
hypothesis.
• The test is conducted to investigate whether the data contradicts the null
hypothesis, suggesting alternative hypothesis is true.
• Test statistic- statistic calculated from the sample data to test the null
hypothesis.
• p-value- is the probability, if H0 were true, that the test statistic would be
at least as extreme as the value actually observed. The smaller the p-value,
the more strongly the data contradicts H0.
• When p-value ≤ 0.05, the data is conventionally taken to sufficiently contradict H0.
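As an illustration of how a test statistic turns into a p-value, here is a minimal sketch of a two-sided z test. It assumes the population SD is known, which is a simplification, and all the numbers (`mu0`, `sigma`, `n`, `sample_mean`) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Probability, under H0, of a z statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# H0: population mean is 100. Assumed known SD of 15, sample of 36.
mu0, sigma, n = 100, 15, 36
sample_mean = 106

# Test statistic: distance of the sample mean from mu0 in standard errors.
z = (sample_mean - mu0) / (sigma / sqrt(n))
p_value = two_sided_p_from_z(z)

print(f"z = {z:.2f}, p = {p_value:.4f}")
reject_h0 = p_value <= 0.05
```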
Types of error
• Type I/ α error- Rejecting a true null hypothesis.
– We may conclude that the difference is significant, when in fact there is no
real difference.
– Its probability is α, the level of significance: the maximum p-value at which
H0 is still rejected. α is kept low, usually 5%, so H0 is rejected when p < 0.05.
• Type II/ β error- Failing to reject a false null hypothesis.
– We may conclude that the difference is not significant, when in fact there is
a real difference.
– Its probability is β. Power of the test is 1 − β & indicates the sensitivity of the test.
• For a fixed sample size it is not possible to reduce both type I & II errors, so α error
is fixed at a tolerable limit & β error is minimized by ↑ sample size.
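How the sample size drives the β error can be sketched with an approximate power calculation. It assumes a two-sided one-sample z test with known SD and ignores the negligible far tail; the function name `approx_power` and the numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect, sigma, n, alpha=0.05):
    """Approximate power (1 - beta) for a true mean shift of `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha is 0.05
    return z.cdf(abs(effect) * sqrt(n) / sigma - z_crit)


# alpha stays fixed at 0.05 while the sample size grows.
for n in (20, 50, 100):
    power = approx_power(effect=5, sigma=15, n=n)
    print(f"n = {n:3d}: power = {power:.2f}, beta = {1 - power:.2f}")
```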
Estimation of Sample size
• Small sample- fails to detect clinically important effects (lack of Power)
• Large sample- identifies differences which have no clinical relevance.
• Calculation is based on (formulas not included)-
– Estimation of mean
– Estimation of proportions
– Comparison in two means
– Comparison in two proportions
• Checklist- level of significance, power, study design, statistical procedure.
• Minimum sample size required for statistical analysis- 50 (rule of thumb).
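The slides do not include the formulas. As one example, a widely used textbook approximation for comparing two means is n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The sketch below implements that approximation under the assumption of equal group sizes and a common SD; it is not necessarily the formula the authors had in mind, and the function name is made up.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a mean difference delta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # level of significance (two-sided)
    z_beta = z.inv_cdf(power)           # power = 1 - beta
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical study: detect a 5-point difference when the SD is 10.
print(n_per_group_two_means(delta=5, sigma=10))              # 80% power
print(n_per_group_two_means(delta=5, sigma=10, power=0.90))  # 90% power
```

The checklist items map directly to the arguments: level of significance (`alpha`), power, and the expected effect and variability from the study design.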
Basic theorem in statistics
• Central limit theorem-
– States that the distribution of the sum/ average of a large number of
independent, identically distributed variables will be approximately
normal.
• Why is this important?
– Basis of many statistical procedures.
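The theorem can be illustrated by simulation. A minimal sketch, assuming an exponential population (strongly right-skewed, mean 1, SD 1); the seed, sample size and number of repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50
REPEATS = 2000


def one_sample_mean():
    """Mean of one sample drawn from a skewed exponential population."""
    return statistics.mean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])


sample_means = [one_sample_mean() for _ in range(REPEATS)]

centre = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)

# For a normal distribution about 68% of values lie within 1 SD of the mean.
share_within_1sd = sum(
    abs(m - centre) <= spread for m in sample_means
) / REPEATS

print(f"mean of sample means: {centre:.3f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"share within 1 SD:    {share_within_1sd:.3f}")
```

Although the population itself is far from normal, the sample means cluster around the population mean with an SD close to 1/√50, and roughly two thirds of them fall within 1 SD.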
Parametric tests
• These are statistical tests that make assumptions about the parameters
(defining properties) of the population distribution.
• They can be used when-
– Data follows normal distribution.
– Sample size is large enough for Central limit theorem to lead to
normality of averages.
– Data is not normal, but can be transformed to normality.
• Some situations where data does not follow normal distribution-
– Outcome is an ordinal variable.
– Presence of definite outliers
– Outcome has clear limits of demarcation.
Tests to be used
Scale type        Permissible statistics
Nominal           Mode; Chi-Square test
Ordinal           Mode/ Median
Interval, Ratio   Mean, Standard Deviation; t-test, ANOVA, Post hoc, Correlation, Regression
• One sample t-Test- compares the sample mean with the population mean.
• Independent t-test- compares the means of two independent samples.
• Dependent t-test- compares the means of paired samples (before-after, pre-post).
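The three t statistics can be written out from their textbook definitions. This is a sketch using only the standard library: it computes the statistic but not the p-value, the independent version assumes equal variances (pooled), and the function names and data are illustrative.

```python
from math import sqrt
import statistics


def one_sample_t(sample, mu0):
    """t for a sample mean against a hypothesized population mean mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


def paired_t(before, after):
    """Dependent t: a one-sample t on the within-pair differences."""
    differences = [a - b for b, a in zip(before, after)]
    return one_sample_t(differences, 0)


def independent_t(x, y):
    """Independent t with pooled variance (equal variances assumed)."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / sqrt(
        pooled_var * (1 / nx + 1 / ny)
    )


print(one_sample_t([5, 7, 9], mu0=5))
print(paired_t(before=[10, 12, 14], after=[11, 15, 16]))
print(independent_t([1, 2, 3], [4, 5, 6]))
```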
ANOVA
• t- Test- difference between 2 means.
– If there are more than 2 means, running a t test on every pair inflates the
overall α (type I) error, which is a serious flaw.
• So when there are >2 means to be compared we use ANOVA.
• Types-
– One way- studies the effect of one factor.
– Two way- studies the effects of two factors.
• Assumptions of ANOVA- Normality, homogeneity of variance, independence of observations.
• ANCOVA- It is a blend of ANOVA & Regression. It compares group means
after adjusting for a continuous covariate that is linearly related to the
outcome.
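For one way ANOVA, the F statistic is the ratio of between-group to within-group variability. A minimal sketch from the textbook definition, with invented data and a made-up function name; it stops at the F statistic and does not compute a p-value.

```python
import statistics


def one_way_anova_f(groups):
    """F = mean square between groups / mean square within groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variability of the observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Three hypothetical treatment groups.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.2f}")
```

A large F means the group means differ by more than the scatter within groups would explain.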
Post Hoc
• Latin phrase, means- “after this” or “after the event”
• Why do Post hoc tests?
– ANOVA tells whether there is an overall difference between groups,
but it does not tell which specific group differed.
– Post hoc tests tell where the difference occurred between groups.
• Different Post hoc tests-
– Bonferroni
– Fisher’s least significant difference (LSD)
– Tukey’s honestly significant difference (HSD)
– Scheffe post hoc tests
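The simplest of these, Bonferroni, divides α by the number of comparisons. The sketch below shows why some correction is needed; the familywise error formula assumes independent comparisons, which is only an approximation for pairwise tests on the same groups.

```python
from math import comb


def familywise_error(alpha, m):
    """Chance of at least one type I error across m independent tests."""
    return 1 - (1 - alpha) ** m


k_groups = 4
m = comb(k_groups, 2)  # number of pairwise comparisons among 4 groups

uncorrected = familywise_error(0.05, m)
bonferroni_alpha = 0.05 / m
corrected = familywise_error(bonferroni_alpha, m)

print(f"{m} comparisons")
print(f"familywise error at alpha 0.05 each: {uncorrected:.3f}")
print(f"Bonferroni alpha per comparison:     {bonferroni_alpha:.4f}")
print(f"familywise error after correction:   {corrected:.3f}")
```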
Correlation & Regression
• Correlation- denotes association between 2 quantitative variables.
– Assume that the association is linear (i.e., one variable ↑/ ↓ a fixed
amount for a unit ↑/ ↓ in the other).
– Degree of association is measured by a correlation coefficient, r.
– r is measured on a scale from -1 through 0 to +1.
– When both variables ↑ together, then r is + & when 1 variable ↑ and the other
↓, then r is -.
• Graphically- Scatter diagrams, usually independent variable is plotted
on the x-axis & dependent on the y-axis.
• Limitation- it does not say anything about Cause & Effect relationship.
– Beware of spurious/ nonsense correlation.
• Correlation-
– Strength/ degree of association.
• Regression-
– Nature of association (eg- if x & y related, it means if x changes by
certain amount then y changes on an average by certain amount).
– Expresses the linear relationship between variables.
– Regression coefficient- β
– Types- Linear, Non linear, Stepwise
• Regression coefficient gives a better summary of the relationship between
the two variables than Correlation coefficient.
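The difference between the two summaries shows up when both are computed from the same data. A minimal sketch of Pearson's r and the simple linear regression line by least squares; the data and the function name are invented.

```python
from math import sqrt


def pearson_r_and_line(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    r = sxy / sqrt(sxx * syy)            # strength of association, -1 to +1
    slope = sxy / sxx                    # regression coefficient
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


x = [1, 2, 3, 4, 5]   # independent variable
y = [2, 4, 5, 4, 5]   # dependent variable
r, slope, intercept = pearson_r_and_line(x, y)
print(f"r = {r:.3f}; y = {intercept:.1f} + {slope:.1f} x")
```

r has no units and only says how tightly the points follow a line; the slope says by how much y changes on average for a unit change in x.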
Non Parametric tests
• Also called as “Distribution free tests”, because they are based on fewer
assumptions.
• Advantages-
– When data does not follow normal distribution.
– When the average is better represented by median.
– Sample size is small.
– Presence of outliers.
– Relatively simple to conduct
Tests
Purpose                                         Parametric test           Non Parametric test
Testing a mean against a hypothesized value     One sample t test         Sign test
Comparison of means of 2 groups                 Independent t test        Mann Whitney U test
Means of related samples                        Paired t test             Wilcoxon Signed rank test
Comparison of means of > 2 groups               ANOVA                     Kruskal Wallis test
Comparison of means of > 2 related groups       Repeated measures ANOVA   Friedman’s test
Relationship between 2 quantitative variables   Pearson’s correlation     Spearman’s correlation
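As an example of how little a non parametric test assumes, the sign test only counts how many observations fall above and below the hypothesized value. A minimal sketch with an exact two-sided binomial p-value; the data and function name are invented, and ties are simply dropped.

```python
from math import comb


def sign_test_p(sample, hypothesized_median):
    """Exact two-sided sign test p-value; values equal to the median are dropped."""
    plus = sum(1 for x in sample if x > hypothesized_median)
    minus = sum(1 for x in sample if x < hypothesized_median)
    n = plus + minus
    k = min(plus, minus)
    # Under H0 each sign is + or - with probability 1/2.
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * one_tail)


# Hypothetical scores; H0: the median is 10.
scores = [12, 15, 11, 18, 14, 16, 13, 17, 19, 9]
p_value = sign_test_p(scores, hypothesized_median=10)
print(f"p = {p_value:.4f}")
```

Only the signs are used, not the sizes of the differences, so outliers have no extra influence on the result.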
Chi-Square test
• Used for analysis of categorical data.
• Other tests- Fisher exact probability test, McNemar’s test.
• Requirements of Chi-Square-
– Samples should be independent
– Sample size should be reasonably large (n > 40)
– Expected cell frequency should not be < 5.
• Yates’ correction- if expected cell frequency is < 5
• Fisher exact probability test- used when sample size is small (n < 20)
• McNemar’s test- used when there are two related samples or there are
repeated measurements
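For a 2×2 table the Chi-Square statistic and the expected cell frequencies can be worked out by hand. A minimal sketch without Yates' correction; the counts are hypothetical, and the p-value line relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Return (chi2, p_value, expected) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d

    # Expected count = row total * column total / grand total.
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    observed = [a, b, c, d]

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail, 1 degree of freedom only
    return chi2, p_value, expected


# Hypothetical counts: improved / not improved in two treatment groups.
chi2, p_value, expected = chi_square_2x2(30, 20, 15, 35)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
print("smallest expected count:", min(expected))
```

Checking `min(expected)` against 5 is the requirement listed above for using the uncorrected test.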
RR & OR
• Relative Risk (RR)-
– It is the ratio of incidence rate among exposed to the incidence rate
among not exposed.
– used in RCTs & Cohort studies
– Values- <1 - risk of disease is less among exposed
– >1 – risk of disease is more among exposed
– =1 – equal risk among exposed & non exposed
• Odds Ratio (OR)-
– Ratio of odds of exposure among the cases to odds of exposure among
controls. Used for rare diseases/ events
– Used in case control & retrospective studies (no meaning in calculating
the risk of getting the disease)
– Values- >1- more among cases, <1- more among controls
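Both ratios follow directly from the definitions above. A minimal sketch with hypothetical counts; the function and argument names are made up for illustration.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence among exposed divided by incidence among not exposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Cohort study: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
rr = relative_risk(30, 100, 10, 100)

# Case control study: 40 of 100 cases and 20 of 100 controls were exposed.
or_value = odds_ratio(40, 20, 60, 80)

print(f"RR = {rr:.2f}, OR = {or_value:.2f}")
```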
Qualitative v/s Quantitative
Quantitative research
• Seeks to confirm hypotheses
• Highly structured methods used
• Uses closed ended, numerical methods of collecting data
• Study design is fixed & subject to statistical assumptions
Qualitative research
• Seeks to explore phenomena
• Semi-structured methods used
• Uses open ended, textual methods
• Study design is flexible, iterative & subject to textual analysis
Qualitative research
• Provides complex descriptions & information about issues such as
contradictory behavior, belief, opinions, emotions & relationships.
• Methods used are-
– Phenomenology
– Ethnography
– Grounded theory
• Designs used-
– Case studies
– Comparative designs
– Snapshots
– Retrospective & Longitudinal studies
Statistical software packages
Quantitative research
• SPSS by IBM
• R by R Foundation
• GenStat by VSN International
• Mathematica by Wolfram Research
• Minitab, MATLAB, NMath Stats, etc.
Qualitative research
• ATLAS.ti
• NVivo
• MAXQDA
• NUD*IST
• ANTHROPAC
"An approximate answer to the right problem is worth a good deal,
more than an exact answer to an approximate problem." -- John Tukey
Virag Sontakke
 
Myopathies (muscle disorders) for undergraduate
Myopathies (muscle disorders) for undergraduateMyopathies (muscle disorders) for undergraduate
Myopathies (muscle disorders) for undergraduate
Mohamed Rizk Khodair
 
How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18
Celine George
 
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFAExercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Dr. Nasir Mustafa
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
YSPH VMOC Special Report - Measles Outbreak Southwest US 5-3-2025.pptx
YSPH VMOC Special Report - Measles Outbreak  Southwest US 5-3-2025.pptxYSPH VMOC Special Report - Measles Outbreak  Southwest US 5-3-2025.pptx
YSPH VMOC Special Report - Measles Outbreak Southwest US 5-3-2025.pptx
Yale School of Public Health - The Virtual Medical Operations Center (VMOC)
 
How to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo SlidesHow to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo Slides
Celine George
 
spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)
Mohamed Rizk Khodair
 
apa-style-referencing-visual-guide-2025.pdf
apa-style-referencing-visual-guide-2025.pdfapa-style-referencing-visual-guide-2025.pdf
apa-style-referencing-visual-guide-2025.pdf
Ishika Ghosh
 
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptxLecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Lecture 2 CLASSIFICATION OF PHYLUM ARTHROPODA UPTO CLASSES & POSITION OF_1.pptx
Arshad Shaikh
 
How to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo SlidesHow to Add Customer Note in Odoo 18 POS - Odoo Slides
How to Add Customer Note in Odoo 18 POS - Odoo Slides
Celine George
 
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
Drive Supporter Growth from Awareness to Advocacy with TechSoup Marketing Ser...
TechSoup
 
03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.
MCH
 
CNS infections (encephalitis, meningitis & Brain abscess
CNS infections (encephalitis, meningitis & Brain abscessCNS infections (encephalitis, meningitis & Brain abscess
CNS infections (encephalitis, meningitis & Brain abscess
Mohamed Rizk Khodair
 
How to Create A Todo List In Todo of Odoo 18
How to Create A Todo List In Todo of Odoo 18How to Create A Todo List In Todo of Odoo 18
How to Create A Todo List In Todo of Odoo 18
Celine George
 
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
Nguyen Thanh Tu Collection
 
Herbs Used in Cosmetic Formulations .pptx
Herbs Used in Cosmetic Formulations .pptxHerbs Used in Cosmetic Formulations .pptx
Herbs Used in Cosmetic Formulations .pptx
RAJU THENGE
 
Rococo versus Neoclassicism. The artistic styles of the 18th century
Rococo versus Neoclassicism. The artistic styles of the 18th centuryRococo versus Neoclassicism. The artistic styles of the 18th century
Rococo versus Neoclassicism. The artistic styles of the 18th century
Gema
 
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast BrooklynBridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
i4jd41bk
 
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and GuestsLDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDM Mia eStudios
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
Myopathies (muscle disorders) for undergraduate
Myopathies (muscle disorders) for undergraduateMyopathies (muscle disorders) for undergraduate
Myopathies (muscle disorders) for undergraduate
Mohamed Rizk Khodair
 
How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18How to Configure Public Holidays & Mandatory Days in Odoo 18
How to Configure Public Holidays & Mandatory Days in Odoo 18
Celine George
 
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFAExercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Dr. Nasir Mustafa
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
How to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo SlidesHow to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo Slides
Celine George
 
spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)spinal cord disorders (Myelopathies and radiculoapthies)
spinal cord disorders (Myelopathies and radiculoapthies)
Mohamed Rizk Khodair
 

Basics of statistics

  • 1. Basics of statistics Conducted by Dept. of Biostatistics, NIMHANS From 28 to 30 Sept, 2015 "It is easy to lie with statistics, But it is hard to tell the truth without statistics." –Andrejs Dunkels
  • 2. Topics covered • Introduction • Types of statistics • Definitions • Variable & Types • Variable scales • Description of data • Distribution of sample & population • Measures of center, dispersion & shape • Properties of Normal distribution • Testing of hypothesis • Types of error • Estimation of sample size • Various tests to be used • Central limit theorem • Parametric tests- t-test, ANOVA, Post Hoc, Correlation & Regression • Non Parametric tests • Tests for categorical data • Summary of tests to be used • Qualitative vs Quantitative research • Qualitative research • Software packages
  • 3. Statistics • Consists of a body of methods for collecting and analyzing data. • It provides methods for- – Design- planning and carrying out research studies – Description- summarizing and exploring data – Inference- making predictions & generalizing about phenomena represented by data.
  • 4. Types of statistics • 2 major types of statistics • Descriptive statistics- It consists of methods for organizing and summarizing information. – Includes- graphs, charts, tables & calculation of averages, percentiles • Inferential statistics- It consists of methods for drawing and measuring the reliability of conclusions about population based on information obtained. – Includes- point estimation, interval estimation, hypothesis testing. • Both are interrelated. Necessary to use methods of descriptive statistics to organize and summarize the information obtained before methods of inferential statistics can be used.
  • 5. Population & Sample • Basic concepts in statistics. • Population- It is the collection of all individuals or items under consideration in a statistical study • Sample- It is the part of the population from which information is collected. • Population always represents the target of an investigation. We learn about population by sampling from the collection.
  • 6. • Parameters- used to summarize the features of the population under investigation. • Statistic- describes a characteristic of the sample, which can then be used to make inferences about unknown parameters.
  • 7. Variable & types • Variable- a characteristic that varies from one person or thing to another. • Types- Qualitative/ Quantitative, Discrete/ Continuous, Dependent/ Independent • Qualitative data- variables that yield non-numerical data. – Eg- sex, marital status, eye colour • Quantitative data- variables that yield numerical data. – Eg- height, weight, number of siblings.
  • 8. • Discrete variable- the variable has only a countable number of distinct possible values. – Eg- number of car accidents, number of children • Continuous variable- the variable can take any value within a range, limited only by the precision of measurement. – Eg- weight, length, temperature. • Independent variable- the variable does not depend on the other variables in the study. – Eg- age, sex. • Dependent variable- depends on the independent variable. – Eg- weight of a newborn, stress
  • 9. Variable scales • Variables can also be described according to the scale on which they are defined. • Nominal scale- the categories are merely names. They do not have a natural order. – Eg- male/female, yes/no • Ordinal scale- the categories can be put in order, but the difference between two adjacent categories need not equal the difference between another two. – Eg- mild/ Moderate/ Severe.
  • 10. • Interval scale- the differences between values are comparable. The variable does not have an absolute zero. – Eg- temperature, time • Ratio scale- the variable has an absolute zero and the differences between values are comparable. – Eg- stress using PSS, insomnia using ISI • Nominal & Ordinal scales are used to describe Qualitative data. • Interval & Ratio scales are used to describe Quantitative data.
  • 11. Describing data • Qualitative data- – Frequency- number of observations falling into particular class/ category of the qualitative variable. – Frequency distribution- table listing all classes & their frequencies. – Graphical representation- Pie chart, Bar graph. – Nominal data best displayed by pie chart – Ordinal data best displayed by bar graph
  • 12. • Quantitative data- – Can be presented by a frequency distribution. – If the discrete variable has a lot of different values, or if the data is a continuous variable then data can be grouped into classes/ categories. – Class interval- covers the range between maximum & minimum values. – Class limits- end points of class interval. – Class frequency- number of observations in the data that belong to each class interval. – Usually presented as a Histogram or a Bar graph.
  • 13. Population & Sample distribution • Population distribution- frequency distribution of the population. • Sample distribution- frequency distribution of the sample. • Sample distribution is a blurry photo of the population distribution. • As the sample size ↑, the sample distribution becomes a closer representation of the population distribution. • A sample or population distribution can be summarized by describing its shape (based on the graph). • It can be Symmetric or Nonsymmetric/ Skewed to left/ right based on its tail.
  • 14. Properties of Numerical data & Measures • Central tendency- Mean, Median, Mode • Dispersion- Range, Interquartile Range, Standard Deviation • Shape- Skewness, Kurtosis
  • 15. Measures of center • Central tendency- In any distribution, the majority of the observations pile up, or cluster, around a particular region. – Includes- Mean, Median & Mode. • Mean- sum of the observed values divided by the number of observations. • Median- the middle value of the ordered data set, which divides it into two halves. • Mode- value of the data set which occurs with the greatest frequency. • Mean can be applied only to Quantitative data; Median also needs data that can at least be put in order. • Mode can be used for either Qualitative or Quantitative data.
  • 16. What to choose? • Qualitative variable- Mode. • Quantitative with symmetric distribution- Mean. • Quantitative with skewed distribution- Median. • Outlier- observation that falls far from the rest of the data. Mean gets highly influenced by the outlier. • We use sample mean, median & mode to estimate the population mean, median & mode.
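The effect of an outlier on the mean, compared with the median and mode, can be seen in a small sketch. The data values below are made up for illustration and are not from the slides; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical sample (made-up values): hours of sleep for 7 people
sleep_hours = [6, 7, 7, 8, 6, 7, 5]

mean_value = statistics.mean(sleep_hours)      # sum / number of observations
median_value = statistics.median(sleep_hours)  # middle value of the sorted data
mode_value = statistics.mode(sleep_hours)      # most frequent value

# Add one extreme observation to see how each measure reacts
with_outlier = sleep_hours + [20]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print(f"mean {mean_value:.2f} -> {mean_with_outlier:.2f}")
print(f"median {median_value} -> {median_with_outlier}")
print(f"mode {mode_value}")
```

In this sketch the mean shifts noticeably once the extreme value is added while the median stays where it was, which is the reason the median is preferred for skewed data.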
  • 17. Measures of dispersion • Dispersion- It is the spread/ variability of values about the measures of central tendency. They quantify the variability of the distribution. • Measures include- – Range – Sample interquartile range – Standard deviation • Mostly used for quantitative data • Range- difference between the largest observed value in the data set and the smallest one. – Because it uses only the two extreme values, a great deal of information is ignored.
  • 18. • Interquartile range- difference between the third & first quartiles of the variable (Q3 − Q1). – Percentile- divides the observed values into hundredths/ 100 equal parts. – Deciles- divides the observed values into tenths/ 10 equal parts – Quartiles- divides the observed values into 4 equal parts. Q1 divides the bottom 25% of observed values from top 75%... • Standard deviation- it is a kind of average distance of the observed values from the mean of the variable, calculated as the square root of the average squared deviation. – It is defined using the sample mean & its value gets strongly affected by a few extreme observations.
  • 19. Shape • Skewness- Lack of symmetry in distribution. It can be interpreted from frequency polygon. • Properties- – Mean, median & mode fall at different points. – Quartiles are not equidistant from median. – Curve is not symmetrical but stretched more to one side. • Distribution may be positively or negatively skewed. Pearson's coefficient of skewness usually lies within ± 3. • Kurtosis- convexity of a curve. – Gives an idea about the flatness/ peakedness of the curve.
  • 20. Normal distribution • Bell shaped symmetric distribution. • Why is it important? – Many things are normally distributed, or very close to it. – It is easy to work with mathematically – Most inferential statistical methods make use of properties of the normal distribution. • Mean = Median = Mode • 68.3% of the values lie within 1SD of the mean. • 95.4% of the values lie within 2SD of the mean. • 99.7% of the values lie within 3SD of the mean.
  • 21. Tests to check normal distribution 1. Checking measures of Central tendency, Skewness & Kurtosis. 2. Graphical evaluation- normal plot, frequency polygon. 3. Statistical tests- – Kolmogorov-Smirnov test – Shapiro-Wilk test – Lilliefors test – Pearson’s chi-squared test • Shapiro-Wilk generally has the best power for a given significance level. • If not normally distributed?- correction by transformation of the data- log transformation, square root transformation.
  • 22. Hypothesis testing • Aim of doing a study is to check whether the data agree with certain predictions. These predictions are called hypotheses. • Hypotheses arise from the theory that drives the research. • Significance test- it is a way of statistically testing a hypothesis by comparing the data values. – It consists of two hypotheses- Null (H0) & Alternative hypothesis (H1). – Null hypothesis is usually a statement that the parameter has a value corresponding to, in some sense, no effect. – Alternative hypothesis is a hypothesis that contradicts the null hypothesis. – Hypotheses are formulated before collecting the data.
  • 23. • Significance test analyzes the strength of sample evidence against the null hypothesis. • The test is conducted to investigate whether the data contradict the null hypothesis, suggesting the alternative hypothesis is true. • Test statistic- statistic calculated from the sample data to test the null hypothesis. • p-value- is the probability, if H0 were true, that the test statistic would be at least as extreme as the value actually observed. The smaller the p-value, the more strongly the data contradict H0. • When p-value ≤ 0.05, data sufficiently contradict H0.
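The three measures of dispersion can be computed as in the minimal sketch below. The data are invented for illustration. Quartiles are taken with the default "exclusive" method of `statistics.quantiles`, which is one of several conventions; other software may give slightly different quartiles for the same data.

```python
import statistics

# Hypothetical data set (made-up values)
data = [2, 4, 4, 5, 7, 9, 11]

# Range: largest minus smallest observation
data_range = max(data) - min(data)

# Quartiles split the ordered data into 4 parts; IQR = Q3 - Q1
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Sample standard deviation: sqrt( sum((x - mean)^2) / (n - 1) )
sd = statistics.stdev(data)

print(f"range={data_range}, IQR={iqr}, SD={sd:.3f}")
```

The range uses only two observations, the interquartile range ignores the outer quarters of the data, and the standard deviation uses every value, which is why it is the one most affected by extreme observations.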
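The percentages quoted for 1, 2 and 3 SD can be checked from the cumulative distribution function of the standard normal distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is only illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Percentage of values within 1, 2 and 3 SD, rounded to one decimal
within = {k: round(100 * share_within(k), 1) for k in (1, 2, 3)}
print(within)
```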
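As an illustration of how a p-value follows from a test statistic, the sketch below assumes the simplest case: a test statistic z that follows the standard normal distribution under H0 (a z-test), with a two-sided alternative. The function name is hypothetical and the slides do not prescribe this particular test.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) under H0, assuming Z is standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_at_1_96 = two_sided_p_value(1.96)  # very close to the 0.05 cut-off
p_at_1_00 = two_sided_p_value(1.0)   # well above 0.05

print(f"z=1.96 -> p={p_at_1_96:.4f}")
print(f"z=1.00 -> p={p_at_1_00:.4f}")
```

A larger absolute value of the test statistic gives a smaller p-value, that is, stronger evidence against H0.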
  • 24. Types of error • Type I/ α error- Rejecting a true null hypothesis. – We may conclude that difference is significant, when in fact there is no real difference. – Its probability is α, the level of significance, which is the maximum p-value at which H0 is rejected. α is kept low, mostly at 5%, so H0 is rejected when p<0.05. • Type II/ β error- Failing to reject a false null hypothesis. – We may conclude that difference is not significant, when in fact there is real difference. – Power of the test is 1 − β & indicates sensitivity of the test. • Not possible to reduce both type I & II together for a fixed sample size, so α error is fixed at a tolerable limit & β error is minimized by ↑ sample size.
  • 25. Estimation of Sample size • Small sample- fails to detect clinically important effects (lack of Power) • Large sample- may identify differences which have no clinical relevance. • Calculation is based on (formulas not included)- – Estimation of mean – Estimation of proportions – Comparison of two means – Comparison of two proportions • Checklist- level of significance, power, study design, statistical procedure. • Rule of thumb for minimum sample size required for statistical analysis- 50.
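The slides leave out the formulas. As one hedged example, a commonly used normal-approximation formula for comparing two means with equal group sizes is n per group = 2σ²(z₁₋α/₂ + z₁₋β)² / δ², where σ is the assumed common standard deviation and δ the smallest difference worth detecting. The function name and the input values below are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two means.

    Normal-approximation formula; sigma is the assumed common SD and
    delta the difference in means to be detected (both hypothetical inputs).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

# Hypothetical planning values: SD of 10 units, difference of 5 units
print(n_per_group(sigma=10, delta=5))              # 80% power
print(n_per_group(sigma=10, delta=5, power=0.90))  # 90% power
```

Asking for higher power, a smaller α or a smaller detectable difference all increase the required sample size.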
  • 26. Basic theorem in statistics • Central limit theorem- – States that the distribution of the sum/ average of a large number of independent, identically distributed variables will be approximately normal. • Why is this important? – Basis of many statistical procedures.
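The theorem can be illustrated with a short simulation. The sketch below draws repeated samples from an exponential distribution (strongly skewed, with mean 1 and SD 1) and looks at the distribution of the sample means. The sample size, number of replicates and seed are arbitrary choices for the illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50    # observations per sample
REPLICATES = 2000   # number of samples drawn

def one_sample_mean():
    """Mean of one sample from a skewed (exponential, rate 1) population."""
    draws = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(draws)

sample_means = [one_sample_mean() for _ in range(REPLICATES)]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = 1 / SAMPLE_SIZE ** 0.5  # population SD / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean 1)")
print(f"SD of sample means:   {sd_of_means:.3f} (theory {expected_sd:.3f})")
```

Although the individual observations are far from normal, the sample means cluster symmetrically around the population mean with a spread close to σ/√n, which is what makes normal-based procedures usable for averages.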
  • 27. Parametric tests • These are statistical tests that make assumptions about the parameters (defining properties) of the population distribution. • They can be used when- – Data follow a normal distribution. – Sample size is large enough for Central limit theorem to lead to normality of averages. – Data are not normal, but can be transformed. • Some situations where data do not follow normal distribution- – Outcome is an ordinal variable. – Presence of definite outliers – Outcome has clear limits of demarcation.
  • 28. Tests to be used • Scale type- permissible statistics & tests – Nominal- Mode; Chi-Square test – Ordinal- Mode/ Median – Interval & Ratio- Mean, Standard Deviation; t-test, ANOVA, Post hoc, Correlation, Regression • Types of t-test – One sample t-test- compares the sample mean with the population mean – Independent t-test- compares the means of two independent samples – Dependent t-test- compares the means of paired samples (before-after, pre-post)
  • 29. ANOVA • t-test- difference between 2 means. – If there are more than 2 means, then doing a t-test on every pair inflates the overall α (Type I) error, which creates a serious flaw. • So when there are >2 means to be compared we use ANOVA. • Types- – One way- studies the effect of one factor. – Two way- studies the effects of two factors. • Assumptions of ANOVA- normality, homogeneity of variance, independence of observations. • ANCOVA- It is a blend of ANOVA & Regression. In other words, it compares group means after adjusting for a continuous covariate.
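As a worked illustration of the one sample t-test, the statistic is t = (sample mean − hypothesized mean) / (s / √n) with n − 1 degrees of freedom. The sketch below uses made-up data and a made-up hypothesized mean; it computes only the statistic, which would then be compared with the t distribution using tables or a statistics package.

```python
import math
import statistics

# Hypothetical sample and hypothesized population mean (made-up values)
sample = [12, 14, 15, 13, 16]
hypothesized_mean = 12

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)      # uses n - 1 in the denominator
standard_error = sample_sd / math.sqrt(n)

t_statistic = (sample_mean - hypothesized_mean) / standard_error
degrees_of_freedom = n - 1

print(f"t = {t_statistic:.3f} on {degrees_of_freedom} degrees of freedom")
```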
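The idea behind one way ANOVA is to compare the variability between group means with the variability within groups. A minimal sketch of the F statistic, using three small made-up groups, is given below; a real analysis would also check the assumptions and obtain the p-value from the F distribution.

```python
import statistics

# Hypothetical scores for three groups (made-up values)
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_values = [x for group in groups for x in group]
grand_mean = statistics.mean(all_values)

# Between-group sum of squares: group size times squared distance
# of each group mean from the grand mean
ss_between = sum(
    len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
)
# Within-group sum of squares: squared distance of each value
# from its own group mean
ss_within = sum(
    (x - statistics.mean(g)) ** 2 for g in groups for x in g
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
```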
  • 30. Post Hoc • Latin phrase, means- “after this” or “after the event” • Why do Post hoc tests? – ANOVA tells whether there is an overall difference between groups, but it does not tell which specific group differed. – Post hoc tests tell where the difference occurred between groups. • Different Post hoc tests- – Bonferroni – Fisher’s least significant difference (LSD) – Tukey’s honestly significant difference (HSD) – Scheffe post hoc tests
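Of the post hoc methods listed, Bonferroni is the simplest to sketch: the significance level is divided by the number of pairwise comparisons. The example below assumes 3 groups and α = 0.05; the family-wise error figure additionally assumes the comparisons are independent, which is only an approximation in practice.

```python
import math

alpha = 0.05
number_of_groups = 3  # hypothetical number of groups

# Number of pairwise comparisons among k groups: k(k-1)/2
comparisons = math.comb(number_of_groups, 2)

# Chance of at least one false positive if every pair is tested at alpha
# (assumes independent comparisons)
family_wise_error = 1 - (1 - alpha) ** comparisons

# Bonferroni: test each pair at alpha / number of comparisons
bonferroni_alpha = alpha / comparisons

print(f"{comparisons} comparisons, uncorrected error {family_wise_error:.3f}")
print(f"Bonferroni-adjusted level per comparison: {bonferroni_alpha:.4f}")
```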
  • 31. Correlation & Regression • Correlation- denotes association between 2 quantitative variables. – Assume that the association is linear (i.e., one variable ↑/ ↓ a fixed amount for a unit ↑/ ↓ in the other). – Degree of association is measured by a correlation coefficient, r. – r is measured on a scale from -1 through 0 to +1. – When both variables ↑ together, then r is + & when 1 variable ↑ and the other ↓, then r is -. • Graphically- Scatter diagrams, usually independent variable is plotted on the x-axis & dependent on the y-axis. • Limitation- it does not say anything about Cause & Effect relationship. – Beware of spurious/ nonsense correlation.
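Pearson's correlation coefficient can be written as r = Sxy / √(Sxx · Syy), where Sxy is the sum of products of deviations from the means. The sketch below computes it by hand on made-up paired data so that each step is visible.

```python
import math
import statistics

# Hypothetical paired observations (made-up values)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = statistics.mean(x)
mean_y = statistics.mean(y)

# Sums of squares and cross-products of deviations from the means
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)

r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.3f}")
```

The positive value reflects that y tends to rise as x rises; it says nothing about whether x causes y.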
  • 32. • Correlation- – Strength/ degree of association. • Regression- – Nature of association (eg- if x & y related, it means if x changes by certain amount then y changes on an average by certain amount). – Expresses the linear relationship between variables. – Regression coefficient- β – Types- Linear, Non linear, Stepwise • Regression coefficient gives a better summary of the relationship between the two variables than Correlation coefficient.
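For simple linear regression of y on x, the least-squares regression coefficient is β = Sxy / Sxx and the intercept is ȳ − β·x̄. The sketch below uses made-up paired data; the names `slope` and `intercept` are illustrative.

```python
import statistics

# Hypothetical paired observations (made-up values)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = statistics.mean(x)
mean_y = statistics.mean(y)

s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)

slope = s_xy / s_xx                  # regression coefficient (beta)
intercept = mean_y - slope * mean_x  # value of y predicted at x = 0

print(f"y = {intercept:.1f} + {slope:.1f} * x")
```

Here the slope says that y changes on average by 0.6 units for each unit change in x, which is the "nature of association" that the correlation coefficient alone does not give.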
  • 33. Non Parametric tests • Also called “Distribution free tests”, because they are based on fewer assumptions. • Useful- – When data does not follow normal distribution. – When the average is better represented by median. – When sample size is small. – In the presence of outliers. – Relatively simple to conduct
  • 34. Tests • Purpose- Parametric test/ Non Parametric test – Testing mean against a hypothesized value- One sample t test/ Sign test – Comparison of means of 2 groups- Independent t test/ Mann Whitney U test – Means of related samples- Paired t test/ Wilcoxon Signed rank test – Comparison of means of > 2 groups- ANOVA/ Kruskal Wallis test – Comparison of means of > 2 related groups- Repeated measures ANOVA/ Friedman’s test – Assessing the relationship between 2 quantitative variables- Pearson’s correlation/ Spearman’s correlation
  • 35. Chi-Square test • Used for analysis of categorical data. • Other tests- Fisher exact probability test, McNemar’s test. • Requirements of Chi-Square- – Samples should be independent – Sample size should be reasonably large (n >40) – Expected cell frequency should not be < 5. • Yates’ correction- if expected cell frequency is < 5 • Fisher exact probability test- used when sample size is small (n < 20) • McNemar’s test- used when there are two related samples or there are repeated measurements
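The chi-square statistic compares observed cell counts with the counts expected if rows and columns were independent: χ² = Σ (O − E)² / E, with E = row total × column total / N. A minimal sketch for a made-up 2×2 table (without Yates' correction) follows.

```python
# Hypothetical 2x2 table of counts (made-up values)
# rows: group A / group B, columns: outcome present / absent
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell under independence
expected = [
    [r * c / grand_total for c in col_totals] for r in row_totals
]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)
smallest_expected = min(min(row) for row in expected)

print(f"chi-square = {chi_square:.2f}, smallest expected = {smallest_expected}")
```

For a 2×2 table there is 1 degree of freedom, for which the 5% critical value is 3.84, so the statistic in this sketch would be judged significant at that level. The smallest expected count is worth checking against the requirement listed above.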
  • 36. RR & OR • Relative Risk (RR)- – It is the ratio of incidence rate among exposed to the incidence rate among not exposed. – used in RCTs & Cohort studies – Values- <1 - risk of disease is less among exposed – >1 – risk of disease is more among exposed – =1 – equal risk among exposed & non exposed • Odds Ratio (OR)- – Ratio of odds of exposure among the cases to odds of exposure among controls. Used for rare diseases/ events – Used in case control & retrospective studies (no meaning in calculating the risk of getting the disease) – Values- >1- more among cases, <1- more among controls
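Both measures can be read off a 2×2 table of exposure against disease. In the sketch below the counts are invented; with cells a (exposed, diseased), b (exposed, not diseased), c (unexposed, diseased) and d (unexposed, not diseased), RR = [a/(a+b)] / [c/(c+d)] and OR = ad / bc.

```python
# Hypothetical counts (made-up values)
a, b = 30, 70   # exposed: diseased, not diseased
c, d = 10, 90   # unexposed: diseased, not diseased

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

relative_risk = risk_exposed / risk_unexposed
odds_ratio = (a * d) / (b * c)

print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```

When the disease is rare, b and d are close to the row totals and the odds ratio approximates the relative risk; in this made-up table the disease is common, so the two differ noticeably.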
  • 37. Qualitative v/s Quantitative Quantitative research • Seeks to confirm hypotheses • Highly structured methods used • Uses closed ended, numerical methods of collecting data • Study design is fixed & subject to statistical assumptions Qualitative research • Seeks to explore phenomena • Semi-structured methods used • Uses open ended, textual methods • Study design is flexible, iterative & subject to textual analysis
  • 38. Qualitative research • Provides complex descriptions & information about issues such as contradictory behavior, belief, opinions, emotions & relationships. • Methods used are- – Phenomenology – Ethnography – Grounded theory • Designs used- – Case studies – Comparative designs – Snapshots – Retrospective & Longitudinal studies
  • 39. Statistical software packages Quantitative research • SPSS by IBM • R by R Foundation • GenStat by VSN International • Mathematica by Wolfram Research • Minitab, MATLAB, NMath Stats etc. Qualitative research • ATLAS.ti • NVivo • MAXQDA • NUD*IST • ANTHROPAC
  • 40. "An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." -- John Tukey