Introduction and
Descriptive Statistics
Review Statistics and Probability
Modified by:
Dr. Achmad Nizar Hidayanto
Nur Fitriah Ayuning Budi
Khumaisa Nuraini
Learning Outcomes
1. Review key statistical and research terms
2. Review the concept of central tendency
3. Review the concept of variability
Introduction to Statistics
PowerPoint Lecture Slides
Essentials of Statistics for the
Behavioral Sciences
Eighth Edition
by Frederick J Gravetter and Larry B. Wallnau
1.1 Statistics, Science and
Observations
• “Statistics” means “statistical procedures”
• Uses of Statistics
– Organize and summarize information
– Determine exactly what conclusions are
justified based on the results that were
obtained
• Goals of statistical procedures
– Accurate and meaningful interpretation
– Provide standardized evaluation procedures
1.2 Populations and Samples
• Population
– The set of all the individuals of interest in a
particular study
– Vary in size; often quite large
• Sample
– A set of individuals selected from a population
– Usually intended to represent the population
in a research study
Figure 1.1
Relationship between population and sample
Variables and Data
• Variable
– Characteristic or condition that changes or has
different values for different individuals
• Data (plural)
– Measurements or observations of a variable
• Data set
– A collection of measurements or observations
• A datum (singular)
– A single measurement or observation
– Commonly called a score or raw score
Parameters and Statistics
• Parameter
– A value, usually a
numerical value, that
describes a population
– Derived from
measurements of
the individuals in
the population
• Statistic
– A value, usually a
numerical value, that
describes a sample
– Derived from
measurements of
the individuals in
the sample
Descriptive & Inferential Statistics
• Descriptive statistics
– Summarize data
– Organize data
– Simplify data
• Familiar examples
– Tables
– Graphs
– Averages
• Inferential statistics
– Study samples to make
generalizations about
the population
– Interpret experimental
data
• Common terminology
– “Margin of error”
– “Statistically significant”
Sampling Error
• Sample is never identical to population
• Sampling Error
– The discrepancy, or amount of error, that
exists between a sample statistic and the
corresponding population parameter
• Example: Margin of Error in Polls
– “This poll was taken from a sample of registered
voters and has a margin of error of plus-or-minus 4
percentage points” (Box 1.1)
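A minimal Python sketch of sampling error. The population and the sample below are made-up illustrative values, not data from the text; the point is only that the sample statistic (M) differs from the population parameter (μ).

```python
from statistics import mean

# Hypothetical population of 10 scores (illustrative values only).
population = [2, 4, 4, 5, 6, 6, 7, 8, 9, 9]

# One possible sample of n = 4 individuals selected from that population.
sample = [4, 6, 8, 9]

mu = mean(population)  # population parameter
m = mean(sample)       # sample statistic

# Sampling error: the discrepancy between the statistic and the parameter.
sampling_error = m - mu
print(mu, m, sampling_error)
```

A different sample of the same size would usually give a different discrepancy.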
Figure 1.2
A demonstration of sampling error
Figure 1.3
Role of statistics in experimental research
1.3 Data Structures, Research
Methods, and Statistics
• Individual Variables
– A variable is observed
– “Statistics” describe the observed variable
– Category and/or numerical variables
– Descriptive statistics
• Relationships between variables
– Two variables observed and measured
– One of two possible data structures used to
determine what type of relationship exists
Relationships Between Variables
• Data Structure I: The Correlational Method
– One group of participants
– Measurement of two variables for each
participant
– Goal is to describe type and magnitude of the
relationship
– Patterns in the data reveal relationships
– Non-experimental method of study
Figure 1.4
Data structures for studies evaluating the
relationship between variables
Correlational Method Limitations
• Can demonstrate the existence of a
relationship
• Does not provide an explanation for the
relationship
• Most importantly, does not demonstrate a
cause-and-effect relationship between the
two variables
Relationships Between Variables
• Data Structure II: Comparing two (or
more) groups of Scores
– One variable defines the groups
– Scores are measured on second variable
– Both experimental and non-experimental
studies use this structure
Figure 1.5
Data structure for studies comparing groups
Experimental Method
• Goal of Experimental Method
– To demonstrate a cause-and-effect
relationship
• Manipulation
– The level of one variable is determined by the
experimenter
• Control rules out influence of other
variables
– Participant variables
– Environmental variables
Figure 1.6
The structure of an experiment
Independent/Dependent Variables
• Independent Variable is the variable
manipulated by the researcher
– Independent because no other variable in the
study influences its value
• Dependent Variable is the one observed
to assess the effect of treatment
– Dependent because its value is thought to
depend on the value of the independent
variable
Experimental Method: Control
• Methods of control
– Random assignment of subjects
– Matching of subjects
– Holding level of some potentially influential variables
constant
• Control condition
– Individuals do not receive the experimental treatment.
– They either receive no treatment or they receive a neutral,
placebo treatment
– Purpose: to provide a baseline for comparison with the
experimental condition
• Experimental condition
– Individuals do receive the experimental treatment
Non-experimental Methods
• Non-equivalent Groups
– Researcher compares groups
– Researcher cannot control who goes into which
group
• Pre-test / Post-test
– Individuals measured at two points in time
– Researcher cannot control influence of the
passage of time
• Independent variable is quasi-independent
Figure 1.7
Two examples of non-experimental studies
1.4 Variables and Measurement
• Scores are obtained by observing and
measuring variables that scientists use to
help define and explain external behaviors
• The process of measurement consists of
applying carefully defined measurement
procedures for each variable
Constructs & Operational Definitions
• Constructs
– Internal attributes
or characteristics
that cannot be
directly observed
– Useful for
describing and
explaining behavior
• Operational definition
– Identifies the set of
operations required to
measure an external
(observable) behavior
– Uses the resulting
measurements as both
a definition and a
measurement of a
hypothetical construct
Discrete and Continuous
Variables
• Discrete variable
– Has separate, indivisible categories
– No values can exist between two neighboring
categories
• Continuous variable
– Has an infinite number of possible values
between any two observed values
– Every interval is divisible into an infinite
number of equal parts
Figure 1.8
Example: Continuous Measurement
Real Limits of Continuous
Variables
• Real Limits are the boundaries of each
interval representing scores measured on
a continuous number line
– The real limit separating two adjacent scores
is exactly halfway between the two scores
– Each score has two real limits
• The upper real limit marks the top of the
interval
• The lower real limit marks the bottom of the
interval
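A small sketch of real limits in Python. The function name `real_limits` and the `unit` parameter (the size of one measurement step) are assumptions made for this illustration.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2  # each limit lies exactly halfway to the adjacent score
    return score - half, score + half

print(real_limits(150))        # measured to the nearest whole number
print(real_limits(31.4, 0.1))  # measured to the nearest tenth
```

For a weight recorded as 150 to the nearest whole number, the interval runs from 149.5 to 150.5.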
Scales of Measurement
• Measurement assigns individuals or events to
categories
– The categories can simply be names such as
male/female or employed/unemployed
– They can be numerical values such as 68 inches
or 175 pounds
• The complete set of categories makes up a
scale of measurement
• Relationships between the categories determine
different types of scales
Scales of Measurement
• Nominal
– Characteristics: label and categorize; no quantitative distinctions
– Examples: gender; diagnosis; experimental or control
• Ordinal
– Characteristics: categorizes observations; categories organized by size or magnitude
– Examples: rank in class; clothing sizes (S, M, L, XL); Olympic medals
• Interval
– Characteristics: ordered categories; intervals between categories of equal size; arbitrary or absent zero point
– Examples: temperature; IQ; golf scores (above/below par)
• Ratio
– Characteristics: ordered categories; equal intervals between categories; absolute zero point
– Examples: number of correct answers; time to complete task; gain in height since last year
Central Tendency
PowerPoint Lecture Slides
Essentials of Statistics for the Behavioral
Sciences
Seventh Edition
by Frederick J Gravetter and Larry B. Wallnau
1.5 Overview of central tendency
• Central tendency
– A single score to define the “center” of a
distribution
• Purpose: find the single score that is most
typical or best represents the entire group
Figure 1.9
What is the “center” of each distribution?
1.6 The Mean
• The mean is the sum of all the scores
divided by the number of scores in the
data.
Population mean: $\mu = \frac{\sum X}{N}$
Sample mean: $M = \frac{\sum X}{n}$
The Mean: Three definitions
• Sum of the scores divided by the number
of scores in the data
• Amount each individual receives when the
total is divided equally among all: M = ΣX / n
• The balance point for the distribution
Figure 1.10
Computing the Mean from a
Frequency Distribution Table
Quiz Score (X) f fX
10 1 10
9 2 18
8 4 32
7 0 0
6 1 6
Total n = Σf = 8 ΣfX = 66
M = ??
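Working the table through gives M = ΣfX / Σf = 66 / 8 = 8.25. A short Python sketch of the same calculation follows; the dictionary layout is only one convenient way to hold the table.

```python
# Frequency distribution from the table: quiz score X -> frequency f
freq = {10: 1, 9: 2, 8: 4, 7: 0, 6: 1}

n = sum(freq.values())                        # n = Σf
sum_fx = sum(x * f for x, f in freq.items())  # ΣfX
M = sum_fx / n                                # mean = ΣfX / Σf

print(n, sum_fx, M)  # 8 66 8.25
```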
The Weighted Mean
• Combine two sets of scores
• Three steps:
– Determine the combined sum of all the scores
– Determine the combined number of scores
– Divide the sum of scores by the total number
of scores
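The three steps above can be sketched in Python. The function name `weighted_mean` and the two samples are hypothetical: sample 1 has n = 12 and ΣX = 72 (M = 6), sample 2 has n = 8 and ΣX = 56 (M = 7).

```python
def weighted_mean(groups):
    """groups: list of (sum_of_scores, number_of_scores) pairs, one per sample."""
    total_sum = sum(s for s, _ in groups)  # step 1: combined sum of all scores
    total_n = sum(n for _, n in groups)    # step 2: combined number of scores
    return total_sum / total_n             # step 3: divide sum by number

overall = weighted_mean([(72, 12), (56, 8)])
print(overall)  # 6.4
```

The result (6.4) is closer to 6 than to 7 because the larger sample carries more weight; simply averaging the two means would give 6.5.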
Overall (weighted) mean: $M = \frac{\sum X_1 + \sum X_2}{n_1 + n_2}$
Characteristics of the Mean
• Changing the value of any score changes the
mean.
• Introducing a new score or removing a score
usually changes the mean.
• Adding or subtracting a constant from each
score changes the mean by the same constant.
• Multiplying or dividing each score by a constant
multiplies or divides the mean by
that constant.
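A quick check of the last two characteristics in Python, using made-up scores with M = 6:

```python
from statistics import mean

scores = [3, 5, 7, 9]  # hypothetical scores, M = 6
m = mean(scores)

plus_four = mean([x + 4 for x in scores])    # add a constant to every score
times_two = mean([x * 2 for x in scores])    # multiply every score by a constant

print(m, plus_four, times_two)
```

Adding 4 to every score moves the mean from 6 to 10, and doubling every score doubles the mean to 12.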
Figure 1.11
1.7 The Median
• The median is the midpoint of the scores
in a distribution when they are listed in
order from smallest to largest.
• The median divides the scores into two
groups of equal size.
Figure 1.12
Figure 1.13
The Precise Median for a
Continuous Variable
• A continuous variable can be infinitely divided
• The precise median is located in the interval
defined by the real limits of the value.
• It may be necessary to determine the fraction of
the interval needed to divide the distribution
exactly in half.
• $\text{fraction} = \frac{\text{number needed to reach 50\%}}{\text{number in the interval}}$
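A sketch of the interpolation in Python. The function name `precise_median`, the `unit` parameter (width of one score interval), and the example scores are assumptions for illustration. The special case where exactly half the scores fall at or below a value is handled by taking the midpoint to the next observed value, which is a choice made here rather than something stated on the slide.

```python
from collections import Counter

def precise_median(scores, unit=1.0):
    """Interpolated median for a continuous variable recorded in steps of `unit`."""
    counts = Counter(scores)
    half = len(scores) / 2   # number of scores needed to reach 50%
    below = 0                # scores located below the current interval
    values = sorted(counts)
    for i, value in enumerate(values):
        in_interval = counts[value]  # number of scores in this interval
        if below + in_interval == half:
            # Exactly 50% lie at or below this value: use the midpoint
            # between this value and the next observed value.
            return (value + values[i + 1]) / 2
        if below + in_interval > half:
            fraction = (half - below) / in_interval
            lower_real_limit = value - unit / 2
            return lower_real_limit + fraction * unit
        below += in_interval
    raise ValueError("scores must not be empty")

print(precise_median([1, 2, 3, 4, 4, 4, 4, 6]))  # 3.75
```

In the example, three scores lie below the interval for X = 4, one more is needed to reach 50%, and four scores share the interval, so the fraction is 1/4 and the median is 3.5 + 0.25 = 3.75.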
Figure 1.14
Median, Mean, and Middle
• Mean is the balance point of a distribution
– Defined by distances
– Often is not the midpoint of the scores
• Median is the midpoint of a distribution
– Defined by number of scores
– Often is not the balance point of the scores
• Both measure central tendency, using two
different concepts of middle or “central.”
Figure 1.15
1.8 The Mode
• The mode is the score or category that has
the greatest frequency of any in the
frequency distribution
– Can be used with any scale of measurement
– Corresponds to an actual score in the data
– The only one used with nominal data
• It is possible to have more than one mode
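A minimal Python sketch showing that the mode works for nominal categories as well as numbers, and that a distribution can have more than one mode. The helper name `modes` and the data are illustrative.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the greatest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes(["red", "blue", "red", "green"]))  # ['red']
print(modes([2, 3, 3, 5, 7, 7]))               # [3, 7]
```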
Figure 1.16
1.9 Selecting a Measure of Central
Tendency
• Mean
– Appropriate to choose when: no situation precludes it
– Should not be used when: extreme scores; skewed distribution; undetermined values; open-ended distribution; ordinal scale; nominal scale
• Median
– Appropriate to choose when: extreme scores; skewed distribution; undetermined values; open-ended distribution; ordinal scale
– Should not be used when: nominal scale
• Mode
– Appropriate to choose when: nominal scales; discrete variables; describing shape
– Should not be used when: interval or ratio data, except to accompany mean or median
Figure 1.17
Figure 1.18
Means or Medians in a Line Graph
Figure 1.19
Means or Medians in a Bar Graph
1.10 Central Tendency and the
Shape of the Distribution
• Symmetrical distributions
– Mean and median have same value
– If exactly one mode, it has same value as the
mean and the median
– Distribution may have more than one mode,
or no mode at all
Figure 1.20
Central Tendency in Skewed
Distributions
• Mean is found far toward the long tail (positive or
negative)
• Median is found toward the long tail, but not as
far as the mean
• Mode is found near the piled-up scores.
• If positively skewed, order from left to right is
mode, median, mean;
• If negatively skewed, order from left to right is
mean, median, mode
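The ordering can be checked on small made-up data sets in Python. Both lists below are hypothetical and chosen only to show one clear case of each skew.

```python
from statistics import mean, median, mode

# Positively skewed: scores pile up on the left, long tail to the right.
positive = [1, 1, 1, 2, 2, 3, 4, 10]
# Negatively skewed: scores pile up on the right, long tail to the left.
negative = [1, 7, 8, 9, 9, 10, 10, 10]

for label, scores in (("positive", positive), ("negative", negative)):
    print(label, "mode:", mode(scores), "median:", median(scores), "mean:", mean(scores))
```

For the first list the order from left to right is mode (1), median (2), mean (3); for the second it is mean (8), median (9), mode (10).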
Figure 1.21
Variability
PowerPoint Lecture Slides
Essentials of Statistics for the Behavioral
Sciences
Seventh Edition
by Frederick J Gravetter and Larry B. Wallnau
1.11 Overview
• Variability can be defined several ways
– A quantitative measure of the differences
between scores
– Describes the degree to which the scores are
spread out or clustered together
• Purposes of Measure of Variability
– Describe the distribution
– Measure how well an individual score
represents the distribution
Figure 1.22
Population Distributions: Height, Weight
Three Measures of Variability
• The Range
• The Standard Deviation
• The Variance
1.12 The Range
• The distance covered by the scores in a
distribution
– From smallest value to highest value
• For continuous data, real limits are used
• For discrete variables range is number of
categories
range = URL for Xmax − LRL for Xmin (URL = upper real limit, LRL = lower real limit)
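A small Python sketch of the range computed from real limits. The function name `range_continuous`, the `unit` parameter, and the scores are assumptions for illustration.

```python
def range_continuous(scores, unit=1.0):
    """Range using real limits: upper real limit of Xmax minus lower real limit of Xmin."""
    upper_real_limit = max(scores) + unit / 2
    lower_real_limit = min(scores) - unit / 2
    return upper_real_limit - lower_real_limit

print(range_continuous([3, 7, 12, 8, 5, 10]))  # 10.0
```

With whole-number scores from 3 to 12 the result is 12.5 − 2.5 = 10, which matches the count of categories (3, 4, ..., 12).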
1.13 Standard Deviation and
Variance for a Population
• Most common and most important measure
of variability
– A measure of the standard, or average, distance from
the mean
– Describes whether the scores are clustered closely
around the mean or are widely scattered
• Calculation differs for population and samples
Developing the Standard Deviation
• Step One: Determine the Deviation Score (distance
from the mean) for each score:
Deviation score = X − μ
• Step Two: Calculate Mean (Average) of Deviations
– Deviations sum to 0 because the mean is the balance point of the
distribution
– The Mean (Average) Deviation will always equal 0;
another method must be found
Developing the Standard Deviation (2)
• Step Three: Get rid of negatives in
Deviations:
– Square each deviation score
– Using the squared values, compute the Mean
Squared Deviation, known as the Variance
• Variability is now measured in squared
units and is called the Variance.
• Population variance equals the mean squared
deviation: it is the average squared
distance from the mean
Developing the Standard Deviation (3)
• Step Four:
– Variance measures the average squared
distance from the mean, which is not quite the goal
• Correct for having squared all the
deviations by taking the square root of the
variance
$\text{Standard Deviation} = \sqrt{\text{Variance}}$
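The four steps can be traced in Python on a small hypothetical population (N = 5, μ = 6). The variable names are illustrative.

```python
from math import sqrt

scores = [1, 9, 5, 8, 7]  # hypothetical population
N = len(scores)
mu = sum(scores) / N      # population mean: 30 / 5 = 6

# Step 1: deviation score for each X
deviations = [x - mu for x in scores]
# Step 2: the mean deviation is always 0, so it cannot measure variability
mean_deviation = sum(deviations) / N
# Step 3: square the deviations and average them (the variance)
variance = sum(d ** 2 for d in deviations) / N
# Step 4: take the square root to return to the original units
std_dev = sqrt(variance)

print(mean_deviation, variance, round(std_dev, 2))
```

Here the squared deviations sum to 40, so the variance is 40 / 5 = 8 and the standard deviation is about 2.83.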
Figure 1.23
Calculation of the Variance
Formulas for Population
Variance and Standard Deviation
• $\text{Variance} = \frac{\text{sum of squared deviations}}{\text{number of scores}}$
• SS (sum of squares) is the sum of the
squared deviations of scores from the
mean
• Two equations for computing SS
Two formulas for SS
Definitional Formula
$SS = \sum (X - \mu)^2$
• Find each deviation
score (X − μ)
• Square each deviation
score, (X − μ)²
• Sum up the squared
deviations
Computational Formula
$SS = \sum X^2 - \frac{(\sum X)^2}{N}$
• Square each score and
sum the squared scores
• Find the sum of scores,
square it, divide by N
• Subtract the second
part from the first
Population Variance: Formula
and Notation
Formula
$\text{variance} = \frac{SS}{N}$
$\text{standard deviation} = \sqrt{\frac{SS}{N}}$
Notation
• Lowercase Greek letter
sigma is used to denote
the standard deviation of
a population:
σ
• Because the standard
deviation is the square
root of the variance, we
write the variance of a
population as σ²
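A Python sketch, assuming a made-up population of N = 4 scores, showing that the definitional and computational formulas for SS agree and then computing σ² and σ from SS. The function names are hypothetical.

```python
from math import sqrt

def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)

def ss_computational(scores):
    """SS as (sum of squared scores) minus (squared sum of scores divided by N)."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

scores = [1, 0, 6, 1]             # hypothetical population, mean = 2
ss = ss_definitional(scores)
variance = ss / len(scores)       # population variance = SS / N
sigma = sqrt(variance)            # population standard deviation

print(ss, ss_computational(scores), variance)  # 22.0 22.0 5.5
```

The computational form avoids working with deviation scores, which is convenient when the mean is not a whole number.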
Figure 1.24
Graphic Representation of Mean and Standard Deviation
1.14 Standard Deviation and
Variance for a Sample
• Goal of inferential statistics:
– Draw general conclusions about population
– Based on limited information from a sample
• Samples differ from the population
– Samples tend to have less variability than the population
– Computing the Variance and Standard
Deviation in the same way as for a population
would give a biased estimate of the
population values
Figure 1.25
Population of Adult Heights
Variance and Standard Deviation
for a Sample
• Sum of Squares (SS) is computed as
before
• Formula has n-1 rather than N in the
denominator
• Notation uses s instead of σ
$\text{variance of sample} = s^2 = \frac{SS}{n - 1}$
$\text{standard deviation of sample} = s = \sqrt{\frac{SS}{n - 1}}$
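A minimal Python sketch of the sample formulas, using a hypothetical sample of n = 7 scores. The function name `sample_variance` is illustrative; Python's standard `statistics` module also offers `variance` (n − 1 denominator) and `pvariance` (N denominator).

```python
from math import sqrt

def sample_variance(scores):
    """Sample variance: SS divided by n - 1 (the degrees of freedom)."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)
    return ss / (n - 1)

scores = [1, 6, 4, 3, 8, 7, 6]  # hypothetical sample, M = 5, SS = 36
s2 = sample_variance(scores)
s = sqrt(s2)

print(s2, round(s, 2))
```

With SS = 36 and n − 1 = 6, the sample variance is 6 and the sample standard deviation is about 2.45.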
Degrees of Freedom
• Population variance
– Mean is known
– Deviations are computed from a known mean
• Sample variance as estimate of population
– Population mean is unknown
– Using sample mean restricts variability
• Degrees of freedom
– Number of scores in sample that are
independent and free to vary
– Degrees of freedom (df) = n – 1
1.15 More about Variance and
Standard Deviation
• Unbiased estimate of a population
parameter
– Average value of statistic is equal to parameter
– Average value uses all possible samples of a
particular size n
• Biased estimate of a population parameter
– Systematically overestimates or
underestimates (as with variance) the
population parameter
Table 4.1 Biased & Unbiased
Estimates
(Sample statistics: mean, biased variance using n, unbiased variance using n − 1)
Sample   1st Score   2nd Score   Mean    Biased (used n)   Unbiased (used n − 1)
1        0           0           0.00    0.00              0.00
2        0           3           1.50    2.25              4.50
3        0           9           4.50    20.25             40.50
4        3           0           1.50    2.25              4.50
5        3           3           3.00    0.00              0.00
6        3           9           6.00    9.00              18.00
7        9           0           4.50    20.25             40.50
8        9           3           6.00    9.00              18.00
9        9           9           9.00    0.00              0.00
Totals                           36.00   63.00/9           126.00/9
Actual σ² = 14
This is an adaptation of Table 4.1
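The table can be reproduced in Python by listing every possible sample of n = 2. This sketch assumes the population consists of the scores 0, 3, and 9 (the slide does not list the population, but these are the only scores in the table and they give σ² = 14).

```python
from itertools import product

population = [0, 3, 9]  # assumed population: mean 4, variance 14

def variance(sample, denominator):
    """Sum of squared deviations from the sample mean, divided by `denominator`."""
    m = sum(sample) / len(sample)
    ss = sum((x - m) ** 2 for x in sample)
    return ss / denominator

samples = list(product(population, repeat=2))  # all 9 ordered samples of n = 2
biased = [variance(s, 2) for s in samples]     # divide by n
unbiased = [variance(s, 1) for s in samples]   # divide by n - 1

avg_biased = sum(biased) / len(samples)
avg_unbiased = sum(unbiased) / len(samples)
print(avg_biased, avg_unbiased)  # 7.0 14.0
```

Averaged over all nine samples, dividing by n gives 7, which underestimates σ² = 14, while dividing by n − 1 gives exactly 14.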
Figure 1.26
Sample of n = 20, M = 36, and s = 4
Transformations of Scale
• Adding a constant to each score
– The mean changes by that constant
– The standard deviation is unchanged
• Multiplying each score by a constant
– The mean is multiplied by that constant
– The standard deviation is also multiplied by
that constant
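Both transformations can be checked in Python with a small hypothetical population. The constants 10 and 3 are arbitrary positive values chosen for illustration.

```python
from statistics import mean, pstdev

scores = [1, 9, 5, 8, 7]            # hypothetical population, mean = 6
shifted = [x + 10 for x in scores]  # add a constant to each score
scaled = [x * 3 for x in scores]    # multiply each score by a constant

print(mean(scores), round(pstdev(scores), 2))
print(mean(shifted), round(pstdev(shifted), 2))
print(mean(scaled), round(pstdev(scaled), 2))
```

Adding 10 moves the mean from 6 to 16 and leaves the standard deviation alone; multiplying by 3 moves the mean to 18 and triples the standard deviation.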
Variance and Inferential
Statistics
• Goal of inferential statistics: To detect
meaningful and significant patterns in
research results
• Variability in the data influences how easy it
is to see patterns
– High variability obscures patterns that would
be visible in low variability samples
– Variability is sometimes called error variance
Figure 1.27
Experiments with high and low variability
1. Review Statistics and Probability.pdf

  • 1. Introduction and Descriptive Statistics Review Statistics and Probability Modifiedby: Dr. AchmadNizar Hidayanto Nur FitriahAyuning Budi KhumaisaNuraini
  • 2. Learning Outcomes • Review key statistical and research terms 1 • Review the concept of central tendency 2 • Review the concept of variability 3
  • 3. Introduction to Statistics PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Eighth Edition by Frederick J Gravetter and Larry B. Wallnau
  • 4. 1.1 Statistics, Science and Observations • “Statistics” means “statistical procedures” • Uses of Statistics – Organize and summarize information – Determine exactly what conclusions are justified based on the results that were obtained • Goals of statistical procedures – Accurate and meaningful interpretation – Provide standardized evaluation procedures
  • 5. 1.2 Populations and Samples • Population – The set of all the individuals of interest in a particular study – Vary in size; often quite large • Sample – A set of individuals selected from a population – Usually intended to represent the population in a research study
  • 6. Figure 1.1 Relationship between population and sample
  • 7. Variables and Data • Variable – Characteristic or condition that changes or has different values for different individuals • Data (plural) – Measurements or observations of a variable • Data set – A collection of measurements or observations • A datum (singular) – A single measurement or observation – Commonly called a score or raw score
  • 8. Parameters and Statistics • Parameter – A value, usually a numerical value, that describes a population – Derived from measurements of the individuals in the population • Statistic – A value, usually a numerical value, that describes a sample – Derived from measurements of the individuals in the sample
  • 9. Descriptive & Inferential Statistics • Descriptive statistics – Summarize data – Organize data – Simplify data • Familiar examples – Tables – Graphs – Averages • Inferential statistics – Study samples to make generalizations about the population – Interpret experimental data • Common terminology – “Margin of error” – “Statistically significant”
  • 10. Sampling Error • Sample is never identical to population • Sampling Error – The discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter • Example: Margin of Error in Polls – “This poll was taken from a sample of registered voters and has a margin of error of plus-or-minus 4 percentage points” (Box 1.1)
  • 11. Figure 1.2 A demonstration of sampling error
  • 12. Figure 1.3 Role of statistics in experimental research
  • 13. 1.3 Data Structures, Research Methods, and Statistics • Individual Variables – A variable is observed – “Statistics” describe the observed variable – Category and/or numerical variables – Descriptive statistics • Relationships between variables – Two variables observed and measured – One of two possible data structures used to determine what type of relationship exists
  • 14. Relationships Between Variables • Data Structure I: The Correlational Method – One group of participants – Measurement of two variables for each participant – Goal is to describe type and magnitude of the relationship – Patterns in the data reveal relationships – Non-experimental method of study
  • 15. Figure 1.4 Data structures for studies evaluating the relationship between variables
  • 16. Correlational Method Limitations • Can demonstrate the existence of a relationship • Does not provide an explanation for the relationship • Most importantly, does not demonstrate a cause-and-effect relationship between the two variables
  • 17. Relationships Between Variables • Data Structure II: Comparing two (or more) groups of Scores – One variable defines the groups – Scores are measured on second variable – Both experimental and non-experimental studies use this structure
  • 18. Figure 1.5 Data structure for studies comparing groups
  • 19. Experimental Method • Goal of Experimental Method – To demonstrate a cause-and-effect relationship • Manipulation – The level of one variable is determined by the experimenter • Control rules out influence of other variables – Participant variables – Environmental variables
  • 20. Figure 1.6 The structure of an experiment
  • 21. Independent/Dependent Variables • Independent Variable is the variable manipulated by the researcher – Independent because no other variable in the study influences its value • Dependent Variable is the one observed to assess the effect of treatment – Dependent because its value is thought to depend on the value of the independent variable
  • 22. Experimental Method: Control • Methods of control – Random assignment of subjects – Matching of subjects – Holding level of some potentially influential variables constant • Control condition – Individuals do not receive the experimental treatment. – They either receive no treatment or they receive a neutral, placebo treatment – Purpose: to provide a baseline for comparison with the experimental condition • Experimental condition – Individuals do receive the experimental treatment
  • 23. Non-experimental Methods • Non-equivalent Groups – Researcher compares groups – Researcher cannot control who goes into which group • Pre-test / Post-test – Individuals measured at two points in time – Researcher cannot control influence of the passage of time • Independent variable is quasi-independent
  • 24. Figure 1.7 Two examples of non-experimental studies Insert NEW Figure 1.7
  • 25. 1.4 Variables and Measurement • Scores are obtained by observing and measuring variables that scientists use to help define and explain external behaviors • The process of measurement consists of applying carefully defined measurement procedures for each variable
  • 26. Constructs & Operational Definitions • Constructs – Internal attributes or characteristics that cannot be directly observed – Useful for describing and explaining behavior • Operational – Identifies the set of operations required to measure an external (observable) behavior – Uses the resulting measurements as both a definition and a measurement of a hypothetical construct
  • 27. Discrete and Continuous Variables • Discrete variable – Has separate, indivisible categories – No values can exist between two neighboring categories • Continuous variable – Have an infinite number of possible values between any two observed values – Every interval is divisible into an infinite number of equal parts
  • 29. Real Limits of Continuous Variables • Real Limits are the boundaries of each interval representing scores measured on a continuous number line – The real limit separating two adjacent scores is exactly halfway between the two scores – Each score has two real limits • The upper real limit marks the top of the interval • The lower real limit marks the bottom of the interval
  • 30. Scales of Measurement • Measurement assigns individuals or events to categories – The categories can simply be names such as male/female or employed/unemployed – They can be numerical values such as 68 inches or 175 pounds • The complete set of categories makes up a scale of measurement • Relationships between the categories determine different types of scales
  • 31. Scales of Measurement Scale Characteristics Examples Nominal •Label and categorize •No quantitative distinctions •Gender •Diagnosis •Experimental or Control Ordinal •Categorizes observations •Categories organized by size or magnitude •Rank in class •Clothing sizes (S,M,L,XL) •Olympic medals Interval •Ordered categories •Interval between categories of equal size •Arbitrary or absent zero point •Temperature •IQ •Golf scores (above/below par) Ratio •Ordered categories •Equal interval between categories •Absolute zero point •Number of correct answers •Time to complete task •Gain in height since last year
  • 32. Central Tendency PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Seventh Edition by Frederick J Gravetter and Larry B. Wallnau
  • 33. 1.5 Overview of central tendency • Central tendency – A single score to define the “center” of a distribution • Purpose: find the single score that is most typical or best represents the entire group
  • 34. Figure 1.9 What is the “center” of each distribution?
  • 35. 1.6 The Mean • The mean is the sum of all the scores divided by the number of scores in the data. Population Mean Sample Mean N X    n X M  
  • 36. The Mean: Three definitions • Sum of the scores divided by the number of scores in the data • Amount each individual receives when total is divided equally among all: M = ∑X / n • The balance point for the distribution
  • 38. Computing the Mean from a Frequency Distribution Table Quiz Score (X) f fX 10 1 10 9 2 18 8 4 32 7 0 0 6 1 6 Total n = Σf = 8 ΣfX = 66 M = ??
  • 39. The Weighted Mean • Combine two sets of scores • Three steps: – Determine the combined sum of all the scores – Determine the combined number of scores – Divide the sum of scores by the total number of scores • Overall (weighted) mean: $M = \frac{\sum X_1 + \sum X_2}{n_1 + n_2}$
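The three steps can be sketched as a small Python function. The function name `weighted_mean` and the two groups used in the example (n = 12 with M = 6, and n = 8 with M = 7) are made-up illustrations, not data from the slides.

```python
import math

def weighted_mean(sum1, n1, sum2, n2):
    """Overall mean of two groups, given each group's sum of scores and size."""
    combined_sum = sum1 + sum2        # step 1: combined sum of all scores
    combined_n = n1 + n2              # step 2: combined number of scores
    return combined_sum / combined_n  # step 3: divide sum by count

# Hypothetical groups: n1 = 12 with M1 = 6 (so ΣX1 = 72); n2 = 8 with M2 = 7 (so ΣX2 = 56).
overall = weighted_mean(72, 12, 56, 8)
print(overall)                        # 128 / 20

# Simply averaging the two means (6.5) would ignore the unequal group sizes.
assert not math.isclose(overall, (6 + 7) / 2)
```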
  • 40. Characteristics of the Mean • Changing the value of any score changes the mean. • Introducing a new score or removing a score usually changes the mean. • Adding or subtracting a constant from each score changes the mean by the same constant. • Multiplying or dividing each score by a constant multiplies or divides the mean by that constant.
  • 42. 1.7 The Median • The median is the midpoint of the scores in a distribution when they are listed in order from smallest to largest. • The median divides the scores into two groups of equal size.
  • 45. The Precise Median for a Continuous Variable • A continuous variable can be infinitely divided • The precise median is located in the interval defined by the real limits of the value. • It may be necessary to determine the fraction of the interval needed to divide the distribution exactly in half. • $\text{fraction} = \frac{\text{number needed to reach 50\%}}{\text{number in the interval}}$
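A minimal Python sketch of this interpolation follows. It assumes scores are measured to the nearest `unit` (1 by default), so each value occupies the interval between its real limits; the function name `precise_median` and the example scores are illustrative, not taken from the slides.

```python
from collections import Counter

def precise_median(scores, unit=1.0):
    """Interpolated median: the point on the continuous scale with 50% of scores below it."""
    if not scores:
        raise ValueError("scores must not be empty")
    half = len(scores) / 2                 # number of scores needed to reach 50%
    counts = Counter(scores)
    below = 0                              # scores below the current interval
    for value in sorted(counts):
        f = counts[value]                  # number of scores in this interval
        if below + f >= half:
            lower_real_limit = value - unit / 2
            fraction = (half - below) / f  # number needed / number in the interval
            return lower_real_limit + fraction * unit
        below += f

# Three scores lie below 3.5; one of the four 4s is needed to reach 50%,
# so the median sits 1/4 of the way into the interval 3.5 to 4.5.
print(precise_median([1, 2, 3, 4, 4, 4, 4, 6]))   # 3.75
```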
  • 47. Median, Mean, and Middle • Mean is the balance point of a distribution – Defined by distances – Often is not the midpoint of the scores • Median is the midpoint of a distribution – Defined by number of scores – Often is not the balance point of the scores • Both measure central tendency, using two different concepts of middle or “central.”
  • 49. 1.8 The Mode • The mode is the score or category that has the greatest frequency of any in the frequency distribution – Can be used with any scale of measurement – Corresponds to an actual score in the data – The only one used with nominal data • It is possible to have more than one mode
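As a brief illustration, Python's standard `statistics.multimode` (available from Python 3.8) returns every value tied for the greatest frequency, which covers both nominal data and the case of more than one mode. The colour data below are invented for the example.

```python
from statistics import multimode

# Nominal data: the mode is the only measure of central tendency that applies.
colours = ["red", "blue", "red", "green", "blue"]
modes = multimode(colours)   # all categories tied for the highest frequency
print(modes)                 # ['red', 'blue']  (two modes)

# Numerical data with a single mode.
print(multimode([2, 3, 3, 3, 5, 7]))   # [3]
```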
  • 51. 1.9 Selecting a Measure of Central Tendency • Mean – Appropriate to choose when: no situation precludes it – Should not be used when: extreme scores; skewed distribution; undetermined values; open-ended distribution; ordinal scale; nominal scale • Median – Appropriate to choose when: extreme scores; skewed distribution; undetermined values; open-ended distribution; ordinal scale – Should not be used when: nominal scale • Mode – Appropriate to choose when: nominal scales; discrete variables; describing shape – Should not be used when: interval or ratio data, except to accompany mean or median
  • 53. Figure 1.18 Means or Medians in a Line Graph
  • 54. Figure 1.19 Means or Medians in a Bar Graph
  • 55. 1.10 Central Tendency and the Shape of the Distribution • Symmetrical distributions – Mean and median have same value – If exactly one mode, it has same value as the mean and the median – Distribution may have more than one mode, or no mode at all
  • 57. Central Tendency in Skewed Distributions • Mean is found far toward the long tail (positive or negative) • Median is found toward the long tail, but not as far as the mean • Mode is found near the piled-up scores. • If positively skewed, order from left to right is mode, median, mean; • If negatively skewed, order from left to right is mean, median, mode
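The ordering for a positively skewed distribution can be checked on a small example. The scores below are invented for illustration: most pile up at the low end and one extreme score forms the long right-hand tail.

```python
from statistics import mean, median, mode

# Made-up positively skewed scores: piled up near 1, with a long tail toward 9.
scores = [1, 1, 1, 2, 2, 3, 4, 9]

mo = mode(scores)      # near the piled-up scores
md = median(scores)    # pulled toward the tail, but less than the mean
m = mean(scores)       # pulled furthest toward the tail

print(mo, md, m)       # 1 2.0 2.875
assert mo < md < m     # left to right: mode, median, mean
```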
  • 59. Variability PowerPoint Lecture Slides Essentials of Statistics for the Behavioral Sciences Seventh Edition by Frederick J Gravetter and Larry B. Wallnau
  • 60. 1.11 Overview • Variability can be defined several ways – A quantitative measure of the differences between scores – Describes the degree to which the scores are spread out or clustered together • Purposes of Measure of Variability – Describe the distribution – Measure how well an individual score represents the distribution
  • 62. Three Measures of Variability • The Range • The Standard Deviation • The Variance
  • 63. 1.12 The Range • The distance covered by the scores in a distribution – From smallest value to highest value • For continuous data, real limits are used: $\text{range} = \text{URL for } X_{\max} - \text{LRL for } X_{\min}$ • For discrete variables, the range is the number of categories
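A short Python sketch of the real-limits version of the range follows. The function name `score_range`, the `unit` parameter (measurement precision, assumed to be 1), and the example scores are illustrative assumptions.

```python
def score_range(scores, unit=1.0):
    """Range using real limits: URL of the largest score minus LRL of the smallest."""
    upper_real_limit = max(scores) + unit / 2
    lower_real_limit = min(scores) - unit / 2
    return upper_real_limit - lower_real_limit

# Scores from 3 to 12 measured in whole units: 12.5 - 2.5 = 10.
print(score_range([3, 7, 12, 8, 5]))   # 10.0
```

For whole-number scores this equals Xmax − Xmin + 1, which matches counting the categories from 3 through 12.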
  • 64. 1.13 Standard Deviation and Variance for a Population • Most common and most important measure of variability – A measure of the standard, or average, distance from the mean – Describes whether the scores are clustered closely around the mean or are widely scattered • Calculation differs for population and samples
  • 65. Developing the Standard Deviation • Step One: Determine the deviation score (distance from the mean) for each score: $\text{deviation score} = X - \mu$ • Step Two: Calculate the mean (average) of the deviations – Deviations sum to 0 because μ is the balance point of the distribution – The mean (average) deviation will always equal 0; another method must be found
  • 66. Developing the Standard Deviation (2) • Step Three: Get rid of negatives in the deviations: – Square each deviation score – Using the squared values, compute the mean squared deviation, known as the variance • Variability is now measured in squared units and is called the variance. – Population variance equals the mean squared deviation – Variance is the average squared distance from the mean
  • 67. Developing the Standard Deviation (2) • Step Four: – Variance measures the average squared distance from the mean, which is not quite the goal • Correct for having squared all the deviations by taking the square root of the variance: $\text{standard deviation} = \sqrt{\text{variance}}$
  • 69. Formulas for Population Variance and Standard Deviation • $\text{variance} = \frac{\text{sum of squared deviations}}{\text{number of scores}}$ • SS (sum of squares) is the sum of the squared deviations of scores from the mean • Two equations for computing SS
  • 70. Two formulas for SS • Definitional Formula: $SS = \sum (X - \mu)^2$ – Find each deviation score $(X - \mu)$ – Square each deviation score, $(X - \mu)^2$ – Sum up the squared deviations • Computational Formula: $SS = \sum X^2 - \frac{(\sum X)^2}{N}$ – Square each score and sum the squared scores – Find the sum of scores, square it, divide by N – Subtract the second part from the first
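The two formulas should always give the same SS. The Python sketch below checks this on a small invented population of N = 5 scores; the function names are illustrative.

```python
def ss_definitional(scores):
    """SS = Σ(X - μ)²: sum the squared deviations from the mean."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)

def ss_computational(scores):
    """SS = ΣX² - (ΣX)² / N: works from the raw scores, no deviations needed."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

# Made-up population: ΣX = 30, μ = 6, ΣX² = 220.
population = [1, 9, 5, 8, 7]
print(ss_definitional(population))    # 25 + 9 + 1 + 4 + 1 = 40.0
print(ss_computational(population))   # 220 - 900/5 = 40.0
```

The computational form is convenient when the mean is not a whole number, since it avoids squaring many fractional deviations.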
  • 71. Population Variance: Formula and Notation • Formula – $\text{variance} = \frac{SS}{N}$ – $\text{standard deviation} = \sqrt{\frac{SS}{N}}$ • Notation – Lowercase Greek letter sigma is used to denote the standard deviation of a population: σ – Because the standard deviation is the square root of the variance, we write the variance of a population as σ²
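As a hedged illustration of these formulas, the Python snippet below computes σ² and σ for a small invented population and cross-checks the result against the standard library's `statistics.pvariance` and `statistics.pstdev`. The variable names and data are assumptions made for the example.

```python
import math
from statistics import pvariance, pstdev

population = [1, 9, 5, 8, 7]          # made-up scores, N = 5, μ = 6
N = len(population)
mu = sum(population) / N

ss = sum((x - mu) ** 2 for x in population)   # SS = Σ(X - μ)² = 40
sigma_sq = ss / N                             # σ² = SS / N = 8
sigma = math.sqrt(sigma_sq)                   # σ = sqrt(SS / N), about 2.83

# The standard library's population functions use the same N denominator.
assert math.isclose(sigma_sq, pvariance(population))
assert math.isclose(sigma, pstdev(population))
print(sigma_sq, round(sigma, 2))              # 8.0 2.83
```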
  • 72. Figure 1.24 Graphic Representation of Mean and Standard Deviation
  • 73. 1.14 Standard Deviation and Variance for a Sample • Goal of inferential statistics: – Draw general conclusions about population – Based on limited information from a sample • Samples differ from the population – Samples have less variability – Computing the Variance and Standard Deviation in the same way as for a population would give a biased estimate of the population values
  • 74. Figure 1.25 Population of Adult Heights
  • 75. Variance and Standard Deviation for a Sample • Sum of Squares (SS) is computed as before • Formula has n-1 rather than N in the denominator • Notation uses s instead of σ • Variance of sample: $s^2 = \frac{SS}{n - 1}$ • Standard deviation of sample: $s = \sqrt{\frac{SS}{n - 1}}$
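The only change from the population case is the denominator. A minimal Python sketch, using invented scores treated as a sample, is shown below; `statistics.variance` and `statistics.stdev` are the standard library's sample (n − 1) versions and serve as a cross-check.

```python
import math
from statistics import variance, stdev

sample = [1, 9, 5, 8, 7]              # made-up scores, now treated as a sample
n = len(sample)
M = sum(sample) / n                   # sample mean, M = 6

ss = sum((x - M) ** 2 for x in sample)   # SS computed as before = 40
s_sq = ss / (n - 1)                      # s² = SS / (n - 1) = 10
s = math.sqrt(s_sq)                      # s = sqrt(SS / (n - 1)), about 3.16

assert math.isclose(s_sq, variance(sample))
assert math.isclose(s, stdev(sample))
print(s_sq, round(s, 2))                 # 10.0 3.16
```

Dividing the same SS by n would give 8, so the n − 1 denominator yields a larger value, which compensates for samples being less variable than their population.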
  • 76. Degrees of Freedom • Population variance – Mean is known – Deviations are computed from a known mean • Sample variance as estimate of population – Population mean is unknown – Using sample mean restricts variability • Degrees of freedom – Number of scores in sample that are independent and free to vary – Degrees of freedom (df) = n – 1
  • 77. 1.15 More about Variance and Standard Deviation • Unbiased estimate of a population parameter – Average value of statistic is equal to parameter – Average value uses all possible samples of a particular size n • Biased estimate of a population parameter – Systematically overestimates or underestimates (as with variance) the population parameter
  • 78. Table 4.1 Biased & Unbiased Estimates (this is an adaptation of Table 4.1)
                                        Sample Statistics
      Sample   1st Score   2nd Score    Mean     Biased (used n)   Unbiased (used n-1)
      1        0           0            0.00      0.00               0.00
      2        0           3            1.50      2.25               4.50
      3        0           9            4.50     20.25              40.50
      4        3           0            1.50      2.25               4.50
      5        3           3            3.00      0.00               0.00
      6        3           9            6.00      9.00              18.00
      7        9           0            4.50     20.25              40.50
      8        9           3            6.00      9.00              18.00
      9        9           9            9.00      0.00               0.00
      Totals                           36.00     63.00/9           126.00/9
      Actual σ² = 14
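The table can be reproduced by enumerating every sample of n = 2 drawn with replacement from the population {0, 3, 9}, which is the population implied by the scores in the table. The sketch below is an illustration of that enumeration, not the textbook's own procedure; variable names are assumptions.

```python
from itertools import product

population = [0, 3, 9]
N = len(population)
mu = sum(population) / N
sigma_sq = sum((x - mu) ** 2 for x in population) / N   # actual σ²

biased, unbiased = [], []
for sample in product(population, repeat=2):   # all 9 ordered samples of size 2
    n = len(sample)
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    biased.append(ss / n)                      # divides by n
    unbiased.append(ss / (n - 1))              # divides by n - 1

avg_biased = sum(biased) / len(biased)         # 63 / 9
avg_unbiased = sum(unbiased) / len(unbiased)   # 126 / 9

print(sigma_sq, avg_biased, avg_unbiased)      # 14.0 7.0 14.0
```

Averaged over all nine samples, the n − 1 version equals σ² = 14, while the n version averages 7 and systematically underestimates it.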
  • 79. Figure 1.26 Sample of n = 20, M = 36, and s = 4
  • 80. Transformations of Scale • Adding a constant to each score – The Mean is changed – The standard deviation is unchanged • Multiplying each score by a constant – The Mean is changed – Standard Deviation is also changed – The Standard Deviation is multiplied by that constant
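These two rules can be checked numerically. The Python sketch below uses invented scores and the standard library's population standard deviation (`statistics.pstdev`); the constants 10 and 3 are arbitrary choices for illustration.

```python
import math
from statistics import mean, pstdev

scores = [1, 3, 5, 7]                    # made-up scores: mean 4, SD = sqrt(5)

shifted = [x + 10 for x in scores]       # add a constant to each score
scaled = [x * 3 for x in scores]         # multiply each score by a constant

# Adding 10 moves the mean by 10 but leaves the spread untouched.
assert mean(shifted) == mean(scores) + 10
assert math.isclose(pstdev(shifted), pstdev(scores))

# Multiplying by 3 multiplies both the mean and the standard deviation by 3.
assert mean(scaled) == mean(scores) * 3
assert math.isclose(pstdev(scaled), pstdev(scores) * 3)
```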
  • 81. Variance and Inferential Statistics • Goal of inferential statistics: To detect meaningful and significant patterns in research results • Variability in the data influences how easy it is to see patterns – High variability obscures patterns that would be visible in low variability samples – Variability is sometimes called error variance
  • 82. Figure 1.27 Experiments with high and low variability