Research Methods, Design, and
Analysis
Thirteenth Edition
Chapter 15
Descriptive Statistics
Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved
Learning Objectives
15.1 Describe the purpose of descriptive statistics.
15.2 Explain the concept of a frequency distribution.
15.3 Differentiate among the types of graphic
representations of data and when they should be used.
15.4 Calculate the mean, median, and mode of a data set.
15.5 Calculate the variance and standard deviation of a data
set.
15.6 Summarize the techniques used to determine
relationships among variables.
Field of Statistics (1 of 2)
• Two broad categories
– Descriptive statistics
– Inferential statistics
Figure 15.1 Major divisions
of the field of statistics.
Field of Statistics (2 of 2)
• Descriptive statistics
– The type of statistical analysis focused on describing,
summarizing, or explaining a set of data
– Allows you to make sense of your set of data and to make
the key characteristics easily understandable to others
• Inferential statistics
– The type of statistical analysis focused on making inferences
about populations based on sample data
– Subdivided into
▪ Estimation
– Point estimation
– Interval estimation
▪ Hypothesis testing
Let's Begin...
• In this chapter, we will explain descriptive statistical analysis.
• In Chapter 16, we will explain inferential statistical analysis.
• We assume no prior knowledge of the material.
• Both chapters are written so that everyone can understand the
material.
• Discussion requires very little mathematical background.
• Focus on showing you
– What statistical procedures to select to understand your data
– How to interpret and communicate your results
• Before moving to the next section please read Exhibit 15.1 to
see why you must always conduct your statistical analyses
intelligently.
Exhibit 15.1 (1 of 6)
Simpson’s Paradox
• Demonstrate how statistical analysis, if not conducted
properly, can deceive people.
• Example is based on a real case of purported gender
discrimination at the University of California, Berkeley,
several decades ago.
• Written up in Science (Bickel, 1975)
• Data shown below refer to men and women admitted to
graduate school in the Department of Psychology at a
hypothetical university.
Exhibit 15.1 (2 of 6)
Simpson’s Paradox
Combined or “Aggregated” Results
         Number Applied   Number Admitted   Percentage Admitted
Men      180              99                55
Women    100              45                45
• 55% of the men who applied to this department were
admitted to graduate school.
• Only 45% of the women who applied were admitted.
• Assume that their qualifications were the same.
Exhibit 15.1 (3 of 6)
Simpson’s Paradox
• If this were the case, you might conclude that gender
discrimination has occurred because men had a much
higher rate of acceptance than women.
• Assume that the 280 students applying to the Psychology
Department applied to two different graduate programs.
– Doctoral program in clinical psychology
– Doctoral program in experimental psychology
• The researcher decides to break down the data separately
for each program and obtains the two tables shown next.
Exhibit 15.1 (4 of 6)
Simpson’s Paradox
Results Separated by Program (“Disaggregated Results”)
Clinical Psychology Program
         Number Applied   Number Admitted   Percentage Admitted
Men      60               9                 15
Women    60               12                20
Experimental Psychology Program
         Number Applied   Number Admitted   Percentage Admitted
Men      120              90                75
Women    40               32                80
• What do you see in these two program tables?
– Women (not men) had the higher acceptance rates in both degree
programs!
• If there is any discrimination, it is in favor of the women applicants.
Exhibit 15.1 (5 of 6)
Simpson’s Paradox
• What’s going on?
– The overall/combined data suggested one conclusion.
– When the data were more carefully analyzed (they were
“disaggregated” in the clinical and experimental program tables), a
completely different conclusion became apparent.
• How could it be that opposite conclusions are suggested in the two
exhibits based on the same data?
– A statistical phenomenon known as Simpson’s paradox
– Women tended to apply to the program that was harder to get into
– Men tended to apply to the program that was easier to get into
– Aggregated data produced one conclusion
– Disaggregated data produced the opposite and more accurate
conclusion.
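The reversal can be checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the counts from the two program tables and computes the admission rates both ways. The names `programs`, `rate`, `by_program`, and `aggregated` are assumptions made for this example. Note that summing the program tables gives 44 admitted women, one fewer than the 45 shown in the combined table, so the aggregated rate for women comes out as 44% here; the reversal itself is unaffected.

```python
# Counts copied from the two program tables in Exhibit 15.1: (applied, admitted).
programs = {
    "clinical": {"men": (60, 9), "women": (60, 12)},
    "experimental": {"men": (120, 90), "women": (40, 32)},
}


def rate(applied, admitted):
    """Admission rate as a percentage."""
    return 100 * admitted / applied


# Disaggregated view: one rate per program and gender.
by_program = {
    name: {gender: rate(*counts) for gender, counts in groups.items()}
    for name, groups in programs.items()
}

# Aggregated view: add up applicants and admits across programs first.
aggregated = {}
for gender in ("men", "women"):
    applied = sum(programs[p][gender][0] for p in programs)
    admitted = sum(programs[p][gender][1] for p in programs)
    aggregated[gender] = rate(applied, admitted)

for name, rates in by_program.items():
    print(name, rates)
print("aggregated", aggregated)
```

Women have the higher rate inside each program, yet men have the higher rate once the programs are pooled, because most men applied to the program with the higher admission rate.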
Exhibit 15.1 (6 of 6)
Simpson’s Paradox
• Moral of this story
– Be cautious when you examine and interpret
descriptive data.
– Always look at the data.
▪ Critically
▪ In multiple ways
▪ Until you are able to draw the most warranted
conclusion
Descriptive Statistics (1 of 4)
15.1 Describe the Purpose of Descriptive Statistics
• Data set - a set of data, where the rows are “cases” and the
columns are “variables”
• The researcher uses descriptive statistics to understand and
summarize the key numerical characteristics of the data set.
• Example
– Calculate the averages of your treatment and control group
scores in an experiment.
– If you conducted a survey, you might want to know the
frequencies of the responses for each question.
– You might want to use graphs to communicate some of your
results pictorially.
Descriptive Statistics (2 of 4)
15.1 Describe the Purpose of Descriptive Statistics
• In the next chapter (inferential statistics)
– Will learn how to determine if the difference between the treatment
and control groups means is statistically significant
– If other observed results are statistically significant
• In this chapter
– Focus on taking whatever set of data you currently have and
showing how to summarize the key characteristics of the data
• Key question in descriptive statistics
– How can I communicate the important characteristics of my data?
▪ One way would be to supply a printout of all of your data, but
that would be very inefficient.
Descriptive Statistics (3 of 4)
15.1 Describe the Purpose of Descriptive Statistics
• Data set in Table 15.1 will be used in several places in this chapter.
– “College graduate data set.”
– Hypothetically say
▪ Data came from a survey research study
▪ Conducted with 25 recent college graduates
▪ You asked participants
– Starting salaries, undergraduate GPA, college major (you
only surveyed three majors), gender, the SAT scores they
had when they entered college, number of days they
believe they missed during college
• Goal in this survey research study
– To determine what variables predicted the starting salaries of
psychology, philosophy, and business majors
Table 15.1 (1 of 3)
Hypothetical Data Set for Nonexperimental Research for 25 Recent
College Graduates
• Four quantitative variables
– Salary
– GPA
– SAT scores
– Days of school missed
• Two categorical variables
– College major
– Gender
• Standard format
– Cases in rows
– Variables in columns
Table 15.1 (2 of 3)
Hypothetical Data Set for Nonexperimental Research for 25 Recent
College Graduates
Person  Salary  GPA  Major  Gender  SAT    Days Missed
1       24,000  2.5  1      0       1,110  36
2       25,000  2.5  1      0       1,100  26
3       27,500  3.0  1      0       1,300  31
4       28,500  2.4  2      1       1,100  18
5       30,500  3.0  2      0       1,150  26
6       30,500  2.9  2      1       1,130  18
7       31,000  3.1  1      0       1,180  16
8       31,000  3.3  1      0       1,160  11
9       31,500  2.9  2      0       1,170  25
10      32,000  3.6  1      0       1,250  12
11      32,000  2.6  1      1       1,230  26
12      32,500  3.1  2      0       1,130  21
13      32,500  3.2  2      1       1,200  17
14      32,500  3.0  3      1       1,150  14
Table 15.1 (3 of 3)
Hypothetical Data Set for Nonexperimental Research for 25 Recent
College Graduates
Person  Salary  GPA  Major  Gender  SAT    Days Missed
15      33,000  3.7  1      0       1,260  29
16      33,500  3.1  2      1       1,170  21
17      33,500  2.7  2      1       1,140  22
18      34,500  3.0  3      0       1,240  14
19      35,500  3.1  3      0       1,330  16
20      36,500  3.5  2      1       1,220  0
21      37,500  3.4  3      1       1,150  4
22      38,500  3.2  2      0       1,270  10
23      38,500  3.0  3      1       1,300  0
24      40,500  3.3  3      1       1,280  5
25      41,500  3.5  3      1       1,330  2
Note: For the categorical variable “major,” 1 = psychology, 2 = philosophy, and 3 =
business. For the categorical variable “gender,” 0 = male and 1 = female.
Descriptive Statistics (4 of 4)
15.1 Describe the Purpose of Descriptive Statistics
• Enter data into a spreadsheet such as
– Excel (which can be used by a statistical program such
as SPSS)
– SPSS
• We used the popular statistical program SPSS for most of
the analyses in this and the next chapter.
• Most universities provide access to SPSS or another
statistical program in their computer labs.
Frequency Distributions
15.2 Explain the Concept of a Frequency Distribution
• Frequency distribution - data arrangement in which the frequency of
each unique data value is shown
– First column shows the unique data values for the variable.
– Second column shows the frequencies for each of these values.
– Third column shows the percentages.
• Example - Table 15.2
– Variable - starting salary
– Lowest salary is $24,000.
– Highest is $41,500.
– Most frequently occurring salary − $32,500
▪ Three of the 25 recent graduates had this starting salary
– 4% of the 25 cases had a salary of $24,000.
– 8% of the cases had a salary of $32,000.
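As a rough illustration, a frequency distribution like the one described above can be built from the salary column of Table 15.1 with the standard library. The variable names (`salaries`, `counts`, `frequency_table`) are assumptions made for this sketch, not part of the original materials.

```python
from collections import Counter

# Starting salaries of the 25 graduates in Table 15.1.
salaries = [24000, 25000, 27500, 28500, 30500, 30500, 31000, 31000, 31500,
            32000, 32000, 32500, 32500, 32500, 33000, 33500, 33500, 34500,
            35500, 36500, 37500, 38500, 38500, 40500, 41500]

counts = Counter(salaries)   # frequency of each unique salary
n = len(salaries)            # number of cases

# One row per unique value: (value, frequency, percentage of cases).
frequency_table = [(value, counts[value], 100 * counts[value] / n)
                   for value in sorted(counts)]

for value, freq, pct in frequency_table:
    print(f"{value:>7,} {freq:>3} {pct:5.1f}%")
```

The first printed row is the lowest salary ($24,000, 4% of cases), and the row for $32,500 has the largest frequency (3).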
Graphic Representations of Data (1 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Graphs
– Pictorial representations of data
– Can be used for one or more variables
– Used to help communicate the nature of data
– Example
▪ Program evaluators often include graphs in their
reports because their clients often like to see
graphic representations of the data.
Graphic Representations of Data (2 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Bar Graphs
– Graph that uses vertical bars
to represent the data values
of a categorical variable
Figure 15.2 A bar graph of
undergraduate major.
Graphic Representations of Data (3 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Figure 15.2
– Bar graph of the categorical variable college major
– Horizontal axis shows the three categories in the variable.
– Frequencies of each category are shown on the vertical axis.
– Bars provide graphical representations of the frequencies of the three
majors.
▪ 8 psychology majors
▪ 10 philosophy majors
▪ 7 business majors
– Can easily convert these numbers into percentages
▪ 32% were psychology majors (8 divided by 25).
▪ 40% were philosophy majors (10 divided by 25).
▪ 28% were business majors (7 divided by 25).
Graphic Representations of Data (4 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Histograms
– Graph depicting frequencies and distribution of a
quantitative variable
– A presentation of a frequency distribution in bar format
– Advantage over a frequency distribution
▪ More clearly shows the shape of the distribution
– Histogram for starting salary in Figure 15.3
– In contrast to bar graphs, the bars in histograms are
placed next to each other with no space in between.
Figure 15.3
Histogram of Starting Salary
Graphic Representations of Data (5 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Line Graphs
– A graph relying on the
drawing of one or more
lines connecting data points
– A useful way to graphically
depict the distribution of a
quantitative variable
– Line graph of starting salary
in Figure 15.4
– Useful to visually show and
aid in the interpretation of
interaction effects
Figure 15.4 Line graph of starting
salary.
Graphic Representations of Data (6 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Line Graph interaction example
– Conduct an experiment to test a new social skills
training program.
– Pretest–posttest control group design
– DV = the number of appropriate social interactions
– IV = social skills training (training versus no training)
– Data shown in Table 15.3
– Some results of this hypothetical experiment are shown
in Figure 15.5.
Table 15.3 (1 of 2)
Hypothetical Data Set for Experimental Research Study Examining the
Effectiveness of Social Skills Training
Person Pretest Scores Treatment Condition Posttest Scores
1 3 1 4
2 4 1 4
3 2 1 3
4 1 1 2
5 1 1 2
6 0 1 0
7 2 1 2
8 4 1 4
9 4 1 4
10 3 1 4
11 2 1 3
12 5 1 5
13 3 1 3
14 3 1 3
15 2 2 4
16 3 2 5
Table 15.3 (2 of 2)
Hypothetical Data Set for Experimental Research Study Examining the
Effectiveness of Social Skills Training
Person Pretest Scores Treatment Condition Posttest Scores
17 1 2 2
18 2 2 4
19 1 2 2
20 2 2 4
21 2 2 3
22 3 2 5
23 5 2 6
24 2 2 4
25 4 2 2
26 4 2 5
27 2 2 4
28 5 2 6
Note: Pretest = number of appropriate interactions at the beginning of the experiment;
posttest = number of appropriate interactions after the experimental intervention; treatment
condition = 1 for control group (did not receive social skills training) and 2 for treatment
group (did receive social skills training).
Figure 15.5
Line Graph of Results from Pretest–Posttest Control Group Design
Studying Effectiveness of Social Skills Treatment
Graphic Representations of Data (7 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Line Graph interaction example
– Both groups started low on the number of appropriate skills they
exhibited.
– At the end of the study
▪ After the treatment group received social skills training
▪ The participants in the treatment group have higher scores
than the participants in the control group.
– The number of appropriate social skills
▪ Increased for the treatment group
▪ No (or very little) increase for the control group
– Treatment seems to work.
– You must also determine if the result is statistically significant.
Graphic Representations of Data (8 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Scatterplots
– A graphical depiction of the relationship between two
quantitative variables
– Dependent variable on the vertical axis
– Independent or predictor variable on the horizontal axis
– Dots within the graph represent the cases (i.e., participants)
in the data set.
– Scatterplot of the two quantitative variables grade point
average and starting salary in Figure 15.6
▪ Appears to be a positive relationship between GPA and
starting salary
▪ As GPA increases, starting salary also increases.
Figure 15.6
Scatterplot of Starting Salary by College GPA (Positive Relationship)
Graphic Representations of Data (9 of 9)
15.3 Differentiate Among the Types of Graphic Representations of Data
and When They Should Be Used
• Scatterplots
– Positive relationship
▪ The data values tend to start at the bottom left side of the
graph and end at the top right side.
– Scatterplot of days of school missed during college and
starting salary is shown in Figure 15.7.
▪ Appears to be a negative relationship between days
missed and starting salary
▪ As days missed increases, starting salary decreases.
– Negative relationship
▪ The data values tend to start at the top left side of the
graph and end at the lower right side.
Figure 15.7
Scatterplot of Starting Salary by Days Missed (Negative Relationship)
Measures of Central Tendency (1 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Measure of central tendency
– Numerical value expressing what is typical of the values of a
quantitative variable
• One of the most important ways to describe and understand data
• Example
– College GPA is the value expressing what is typical for your
grades.
• Three most common measures of central tendency
– Mode
– Median
– Mean
Measures of Central Tendency (2 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Mode
– The most frequently occurring number
– Most basic, and the crudest, measure of central tendency
– Example
▪ 0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 9, 10
▪ mode is 5
▪ occurs three times
– If there is a tie for the most frequently occurring number
▪ Need to report both
▪ Point out that the data for the variable are bimodal.
Measures of Central Tendency (3 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Mode
– Practice
▪ Determine the mode for the following set of numbers
▪ 1, 2, 2, 5, 5, 7, 10, 10, 10
▪ If you said 10, then you are right
▪ The mode in this case is not a very good indicator of the
central tendency of the data.
– If the data are normally distributed
▪ Most people fall toward the center of the distribution of
numbers.
▪ The mode works much better than in this case
– In practice, research psychologists rarely use the mode.
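A minimal sketch of finding the mode with Python's standard library, using the two data sets from the slides. The third list, `tied`, is hypothetical data added only to show what a bimodal result looks like.

```python
from statistics import multimode

first = [0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 9, 10]   # example from the slides
practice = [1, 2, 2, 5, 5, 7, 10, 10, 10]       # practice set from the slides
tied = [1, 2, 2, 3, 4, 4]                       # hypothetical data with a tie

# multimode returns a list holding every most-frequent value.
print(multimode(first))     # [5]
print(multimode(practice))  # [10]
print(multimode(tied))      # [2, 4]  -> report both; the data are bimodal
```

`multimode` is used here rather than `mode` because it returns all tied values, which matches the advice to report both modes when there is a tie.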
Measures of Central Tendency (4 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Median
– The center point in an ordered set of numbers
– Odd number of numbers
▪ the median is the middle number
▪ example
– 1, 2, 3, 4, 5
– median is 3
– Even number of numbers
▪ The median is the average of the two centermost numbers
▪ Example
– 1, 2, 3, 4
– Median is 2.5 (i.e., the average of 2 and 3 is 2.5)
Measures of Central Tendency (5 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Median
– An interesting property of the median is that it is not
affected by the size of the highest and lowest numbers
▪ Example
– The median of 1, 2, 3, 4, 5 is the same as the
median of 1, 2, 3, 4, 500
– In both cases the median is 3!
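The median examples above can be reproduced in a short sketch. The comparison with the mean is added here only for contrast and is not part of the slide; the variable names are assumptions.

```python
from statistics import mean, median

odd_count = [1, 2, 3, 4, 5]
even_count = [1, 2, 3, 4]
with_extreme = [1, 2, 3, 4, 500]

print(median(odd_count))     # 3   (the middle number)
print(median(even_count))    # 2.5 (average of 2 and 3)
print(median(with_extreme))  # 3   (unchanged by the extreme value)

# For contrast: the mean is pulled toward the extreme value.
print(mean(odd_count), mean(with_extreme))  # 3 102
```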
Measures of Central Tendency (6 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Mean
– The arithmetic average
– The average of 1, 2, and 3 = 2
▪ (1+2+3)/ 3
– Psychologists sometimes refer to the mean as $\bar{X}$ (called X bar)
– Our formula for getting the mean: $\bar{X} = \frac{\sum X}{n}$
– X stands for the variable you are using
– n is the number of numbers you have
– $\sum$ is a sum sign (add up the numbers that follow it)
Measures of Central Tendency (7 of 7)
15.4 Calculate the Mean, Median, and Mode of a Data Set
• Mean
– Simple case where the three values of our variable are 1, 2,
and 3: $\bar{X} = \frac{1 + 2 + 3}{3} = \frac{6}{3} = 2$
– Psychologists frequently calculate the means for the groups
that they want to compare
▪ e.g., the mean performance level for treatment and
control groups
– Figure 15.5
▪ Each of the four points in the graph is a group mean
– Means for the treatment and the control groups at the
pretest
– Means for these two groups at the posttest
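As an illustration, the four group means can be recomputed from the data in Table 15.3, assuming Figure 15.5 plots those data. The names `rows` and `group_means` are assumptions made for this sketch.

```python
from statistics import mean

# (pretest, condition, posttest) for persons 1-28 in Table 15.3.
# Condition 1 = control group, 2 = treatment group.
rows = [
    (3, 1, 4), (4, 1, 4), (2, 1, 3), (1, 1, 2), (1, 1, 2), (0, 1, 0), (2, 1, 2),
    (4, 1, 4), (4, 1, 4), (3, 1, 4), (2, 1, 3), (5, 1, 5), (3, 1, 3), (3, 1, 3),
    (2, 2, 4), (3, 2, 5), (1, 2, 2), (2, 2, 4), (1, 2, 2), (2, 2, 4), (2, 2, 3),
    (3, 2, 5), (5, 2, 6), (2, 2, 4), (4, 2, 2), (4, 2, 5), (2, 2, 4), (5, 2, 6),
]

group_means = {}
for condition, label in ((1, "control"), (2, "treatment")):
    pretest = [pre for pre, cond, post in rows if cond == condition]
    posttest = [post for pre, cond, post in rows if cond == condition]
    group_means[label] = (mean(pretest), mean(posttest))

for label, (pre_mean, post_mean) in group_means.items():
    print(f"{label:<10} pretest {pre_mean:.2f}  posttest {post_mean:.2f}")
```

The control group moves from about 2.64 to about 3.07, while the treatment group moves from about 2.71 to 4.00.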
Measures of Variability (1 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Also important to find out how much your data values are
spread out
• Variability
– Numerical value expressing how spread out or how much
variation is present in the values of a quantitative variable
• If all of the data values for a variable are the same, then there
is no variability.
– Example: 4, 4, 4, 4, 4, 4, 4, 4, 4, 4
• There is variability in these numbers
– 1, 2, 3, 3, 4, 4, 4, 6, 8, 10
• The more different your numbers, the more variability you have.
Measures of Variability (2 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Which of the following sets of data has the most variability present?
– Group one: 44, 45, 45, 45, 46, 46, 47, 47, 48, 49
– Group two: 34, 37, 45, 51, 58, 60, 77, 88, 90, 98
– The data for group two have more variability than group one.
• Homogeneous - little variability in scores in a group
• Heterogeneous - a lot of variability in scores in a group
• Three measures of variability
– Range
– Variance
– Standard deviation
Measures of Variability (3 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Range
– The highest number minus the lowest number
– The simplest measure of variability, but also the most crude
– Formula
▪ Range = H – L
– H is the highest number
– L is the lowest number
– Example
▪ Data for group one shown in the previous section
– Range is equal to 5 (49 − 44)
▪ Range for group two
– Range is 64 (98 − 34)
– Crude index of variability because it takes into account only two numbers
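The range formula (Range = H – L) is a one-liner. The helper name `value_range` is an assumption made for this sketch.

```python
group_one = [44, 45, 45, 45, 46, 46, 47, 47, 48, 49]
group_two = [34, 37, 45, 51, 58, 60, 77, 88, 90, 98]


def value_range(values):
    """Range = highest number minus lowest number."""
    return max(values) - min(values)


print(value_range(group_one))  # 5
print(value_range(group_two))  # 64
```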
Measures of Variability (4 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Variance and Standard Deviation
– Two most popular measures of variability
– Superior to the range because they take into account
all of the data values for a variable
– Both provide information about the dispersion or
variation around the mean value of a variable
– Variance - the average deviation of data values from
their mean in squared units
▪ Is popular because it has nice mathematical
properties
Measures of Variability (5 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Variance and Standard Deviation
– Standard deviation - the square root of the variance
▪ Turns the variance into more meaningful units
▪ An approximate indicator of the average distance that your data
values are from their mean
▪ Example
– if you have a mean of 5
– a standard deviation of 2
– data values tend to be approximately 2 units above or below 5
– For the variance and the standard deviation
▪ The larger the value, the greater the data are spread out
▪ The smaller the value, the less the data are spread out
– How to calculate the variance and standard deviation in Table 15.4
Table 15.4 (1 of 3)
Calculating the Variance and Standard Deviation
        (1)             (2)          (3)                        (4)
        $X$             $\bar{X}$    $X - \bar{X}$              $(X - \bar{X})^2$
        2               6            −4                         16
        4               6            −2                         4
        6               6            0                          0
        8               6            2                          4
        10              6            4                          16
Sums    $\sum X = 30$                $\sum (X - \bar{X}) = 0$   $\sum (X - \bar{X})^2 = 40$
Table 15.4 (2 of 3)
Calculating the Variance and Standard Deviation
Steps:
1. Insert your data values in the X column.
2. Calculate the mean of the values in column 1, and place this value in
column 2. In our example, the mean is 6: $\bar{X} = \frac{30}{5} = 6$.
3. Subtract the values in column 2 from the values in column 1, and
place these into column 3.
4. Square the numbers in column 3 (i.e., multiply the number by itself),
and place these in column 4. (Note: You can ignore the minus signs
in column 3 because a negative number multiplied by a negative
number produces a positive number.)
Table 15.4 (3 of 3)
Calculating the Variance and Standard Deviation
Steps:
5. Insert the appropriate values into the following formula for the variance:
Variance $= \frac{\sum (X - \bar{X})^2}{n}$
where $\sum (X - \bar{X})^2$ is the sum of the numbers in column 4, and n is the number of numbers.
In this example, the variance $= \frac{\sum (X - \bar{X})^2}{n} = \frac{40}{5} = 8$
6. The standard deviation is the square root of the variance.
In this example, the variance is 8 (see step 5), and the standard deviation is 2.83
(i.e., the square root of 8 = 2.83).
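The six steps of Table 15.4 can be sketched directly in code. This follows the table's formula, which divides by n; the variable names are assumptions made for the example. Python's `statistics.pvariance` and `statistics.pstdev` use the same divide-by-n definition, whereas `statistics.variance` divides by n − 1 and would give a different number.

```python
from math import sqrt
from statistics import pstdev, pvariance

X = [2, 4, 6, 8, 10]                          # step 1: data values (column 1)
n = len(X)

x_bar = sum(X) / n                            # step 2: mean (column 2)
deviations = [x - x_bar for x in X]           # step 3: X minus the mean (column 3)
squared = [d ** 2 for d in deviations]        # step 4: squared deviations (column 4)
variance = sum(squared) / n                   # step 5: sum of column 4 divided by n
standard_deviation = sqrt(variance)           # step 6: square root of the variance

print(x_bar, sum(deviations), sum(squared))   # 6.0 0.0 40.0
print(variance, round(standard_deviation, 2)) # 8.0 2.83

# Cross-check against the standard library's divide-by-n functions.
print(pvariance(X), round(pstdev(X), 2))
```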
Measures of Variability (6 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Standard Deviation and the Normal Curve
– If the data were fully normally distributed, the standard
deviation would have additional meaning.
– Examine the standard normal distribution in Figure
15.8
▪ The normal curve or normal distribution has a bell
shape.
▪ It is high in the middle and it tapers off to the left and
the right.
Figure 15.8
Areas Under the Normal Distribution
Z scores:          −3    −2    −1    0     1     2     3
Percentile ranks:  0.1   2     16    50    84    98    99.9
Measures of Variability (7 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Standard Deviation and the Normal Curve
– If the data were fully normally distributed
▪ You would be able to apply the 68, 95, 99.7 percent rule
– 68% of the cases fall within one standard deviation from
the mean.
– 95% fall within two standard deviations.
– 99.7% fall within three standard deviations.
– It is important to understand that sample data are never fully normally
distributed.
– For this reason, the normal distribution can be called the theoretical normal distribution.
– The normal distribution also has many applications in more
advanced statistics courses.
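The 68, 95, 99.7 percent rule can be checked against the theoretical normal distribution using `statistics.NormalDist` (available in Python 3.8 and later). This is only a sketch; `standard_normal` and `within` are names chosen for the example.

```python
from statistics import NormalDist

# Theoretical normal distribution with mean 0 and standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of cases within k standard deviations of the mean.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(f"within {k} SD: {100 * proportion:.1f}%")
```

The printed values are roughly 68.3%, 95.4%, and 99.7%, which the rule rounds to 68, 95, and 99.7.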
Measures of Variability (8 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• z scores
– A score that has been transformed into standard deviation units
– Transformed from their original “raw scores” into a new “standardized”
metric
– Mean of zero and a standard deviation of one
– Data values now can be interpreted in terms of how far they are from their
mean
– If a data value is +1.00, we can say that this value falls one standard
deviation above the mean.
– A value of +2.00 falls two standard deviations above the mean
– A value of -1.5 falls one and a half standard deviations below the mean
– “Standardized units” or “z scores” were used with the normal curve just
shown in Figure 15.8.
Measures of Variability (9 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Formula: $z = \frac{X - \bar{X}}{SD}$
• To use this formula
– Convert raw scores to z scores
– Need to know the mean and standard deviation
Measures of Variability (10 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Example
– Set of scores - 2, 4, 6, 8, 10
– Mean = 6
– SD = 2.83
– Convert 10 to a z score: $z = \frac{10 - 6}{2.83} = 1.413$
– Convert 2 to a z score: $z = \frac{2 - 6}{2.83} = -1.413$
Measures of Variability (11 of 11)
15.5 Calculate the Variance and Standard Deviation of a Data Set
• Negative sign indicates that the number is below the mean.
• All of the z scores for our set of five numbers
– −1.413, −.707, 0, +.707, +1.413
– The average of these numbers is zero.
• Key point
– You can take any set of numbers.
– Convert the numbers to z scores.
– They will always have a mean of zero and a standard deviation of one.
• Helps psychologists when
– They want to compare scores across different variables and different data
sets.
– They want to know how far a data value falls above or below the mean.
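A short sketch of the z-score conversion for the five example scores, confirming that the results have a mean of zero and a standard deviation of one. It uses the divide-by-n standard deviation (`pstdev`), as in Table 15.4; the variable names are assumptions. The slides report ±1.413 because they divide by the rounded value 2.83; with the unrounded standard deviation the result is about ±1.414.

```python
from statistics import mean, pstdev

scores = [2, 4, 6, 8, 10]
m = mean(scores)      # 6
sd = pstdev(scores)   # about 2.83 (divide-by-n standard deviation)

# z = (raw score - mean) / standard deviation
z_scores = [(x - m) / sd for x in scores]

print([round(z, 3) for z in z_scores])
print(round(mean(z_scores), 10), round(pstdev(z_scores), 10))  # mean 0, SD 1

# Using the rounded SD from the slides reproduces their figures.
print(round((10 - 6) / 2.83, 3), round((2 - 6) / 2.83, 3))  # 1.413 -1.413
```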
Examining Relationships Among
Variables (1 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Rarely is a psychologist interested in a single variable.
• Typically are interested in determining whether IVs and DVs are related
• Use IVs to “explain variance” in DVs
• Determining what IVs predict or cause changes in DVs is perhaps the primary
goal of science.
• Practitioners can apply this knowledge to produce changes in the world.
– Use new psychotherapy techniques to reduce mental illness
– To determine how to predict who is “at risk” for future problems so that
early interventions can be started
Examining Relationships Among
Variables (2 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Describe several approaches used to examine relationships
among two or more variables.
• Vast majority of the time
– DV in psychological research is a quantitative variable.
▪ e.g., Response time, performance level, level of stress
• Most of the indexes of relationship described here are used for
quantitative DVs.
• Will explain one exception in which you have a categorical DV
and a categorical IV
Examining Relationships Among
Variables (3 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Unstandardized and Standardized Difference Between Group Means
– Example, in our college graduate data set
▪ Mean (i.e., the average) starting salary for males is
$34,791.67
▪ Mean starting salary for females is $31,269.23
▪ Unstandardized difference between these two means
– $34,791.67 − $31,269.23 = $3,522.44
▪ “There appears to be a sizable relationship between gender
and starting salary such that males have higher salaries than
females”
– The difference between the means is often transformed
into a standardized measure.
Examining Relationships Among
Variables (4 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Cohen’s d
– The difference between two means in standard
deviation units
– One of many effect size indicators
• Effect size indicator
– Index of magnitude or strength of a relationship or
difference between means
Examining Relationships Among
Variables (5 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Cohen’s d formula: d = (M1 − M2) / SD
– M1 is the mean for group 1
– M2 is the mean for group 2
– SD is the standard deviation of either group
▪ Traditionally it’s the control group’s standard deviation in an
experiment.
▪ Some researchers prefer a pooled standard deviation.
• Rough starting point for interpreting d
– d = .2 as “small”
– d = .5 as “medium”
– d = .8 as “large”
Examining Relationships Among
Variables (6 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Calculate Cohen’s d to compare the average male and female starting salaries
– Gender is the categorical IV
– Starting salary is the quantitative DV
– Mean starting salary
▪ Males = $34,791.67
▪ Females = $31,269.23
▪ Unstandardized difference between the means is $3,522.44
▪ Standard deviation for females = $4,008.40
– d = $3,522.44 / $4,008.40 = .88
– Mean starting salary for males is .88 standard deviations above the mean
for females
– By the criteria for interpretation, this is a “large” difference between the means
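The calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard deviation for females is used as the SD in the denominator, as it is here. The function name `cohens_d` is our own and does not come from any library.

```python
def cohens_d(mean_1, mean_2, sd):
    """Standardized difference between two means: (M1 - M2) / SD."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean_1 - mean_2) / sd

# Values from the college graduate data set on this slide
male_mean = 34791.67
female_mean = 31269.23
female_sd = 4008.40  # SD of the comparison group (females)

unstandardized_diff = male_mean - female_mean
d = cohens_d(male_mean, female_mean, female_sd)

print(f"Unstandardized difference: ${unstandardized_diff:,.2f}")
print(f"Cohen's d: {d:.2f}")
```

Run as written, the sketch reports an unstandardized difference of $3,522.44 and a d of 0.88, matching the values above.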
Exhibit 15.2 (1 of 4)
Using Cohen’s d in a Pretest–Posttest Control-Group Experimental
Research Design
• IV = treatment and control conditions
• Purpose - treatment to improve the social skills of the participants
• DV = number of appropriate interactions in a 1-hour observation
session (pretest and posttest)
• Figure 15.5
– Pretest and posttest means for the treatment and control groups
– It appears that the treatment worked
– After the intervention the social skills performance of the treatment
group improved quite a bit more than that for the control group
– At the pretest, the two groups’ means were similar.
▪ Suggesting that random assignment to the groups worked well
Exhibit 15.2 (2 of 4)
Using Cohen’s d in a Pretest–Posttest Control-Group Experimental
Research Design
• Calculate Cohen’s d for pretest and posttest means
– Pretest mean for the treatment group (M1) = 2.71
– Pretest mean for the control group (M2) = 2.64
– Standard deviation (SD) for the control group = 1.39
– Posttest mean for the treatment group (M1) = 4.00
– Posttest mean for the control group (M2) = 3.07
– Standard deviation (SD) for the control group = 1.27
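Applying the Cohen's d formula to these values, with the control group's standard deviation as SD, gives the following worked arithmetic. Results are rounded to two decimals.

```latex
\begin{aligned}
d_{\text{pretest}}  &= \frac{M_1 - M_2}{SD} = \frac{2.71 - 2.64}{1.39} = \frac{0.07}{1.39} \approx .05 \\
d_{\text{posttest}} &= \frac{M_1 - M_2}{SD} = \frac{4.00 - 3.07}{1.27} = \frac{0.93}{1.27} \approx .73
\end{aligned}
```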
Exhibit 15.2 (3 of 4)
Using Cohen’s d in a Pretest–Posttest Control-Group Experimental
Research Design
• Interpreting these data
– The difference between the means was very small at
the pretest
▪ Standardized mean difference (Cohen’s d) = .05
▪ The treatment group mean was only .05 of a standard
deviation above the control group mean
– The posttest Cohen’s d was .73
▪ Indicates the treatment group mean was .73
standard deviation units above the control group
mean
▪ A moderately large difference
Exhibit 15.2 (4 of 4)
Using Cohen’s d in a Pretest–Posttest Control-Group Experimental
Research Design
• Although the results just presented appear to support the
efficacy of the social skills training, we still cannot trust
this experimental finding.
• Problem - the observed differences between the means
might represent nothing more than random or chance
fluctuation in the data.
• In the next chapter on inferential statistics, we will check
to see if this difference is statistically significant.
Examining Relationships Among
Variables (7 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Correlation Coefficient
– When you have a quantitative DV and a quantitative IV, you need to obtain either
▪ A correlation coefficient
▪ Or a regression coefficient
– Correlation coefficient - index indicating the strength and direction of linear
relationship between two quantitative variables
▪ A numerical index ranging from −1.00 to +1.00
▪ Absolute size of the number indicates the strength
▪ Sign (positive or negative) indicates the direction of relationship
▪ Endpoints, −1.00 and +1.00, stand for “perfect” correlations
– Strongest possible correlations
▪ Zero indicates no correlation
Figure 15.9
Strength and Direction of a Correlation Coefficient
Examining Relationships Among
Variables (8 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• “Which correlation is stronger, +.20 or +.70?”
– The latter is stronger because +.70 is farther away from zero.
• “Which correlation is stronger, +.20 or −.70?”
– The latter because −.70 is farther away from zero.
• “Which correlation is stronger, +.50 or −.70?”
– The latter because −.70 is farther from zero.
• When judging the relative strength of two correlation coefficients,
ignore the sign and determine which number is farther from zero.
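The rule for comparing strength amounts to comparing absolute values. The short sketch below illustrates it with the three pairs from this slide. The function name `stronger_correlation` is hypothetical, and returning the second value when the two are equally strong is an arbitrary choice.

```python
def stronger_correlation(r_a, r_b):
    """Return whichever coefficient is farther from zero (the sign is ignored)."""
    for r in (r_a, r_b):
        if not -1.0 <= r <= 1.0:
            raise ValueError("a correlation must lie between -1.00 and +1.00")
    return r_a if abs(r_a) > abs(r_b) else r_b

# The three comparisons asked on this slide
for pair in [(0.20, 0.70), (0.20, -0.70), (0.50, -0.70)]:
    print(pair, "->", stronger_correlation(*pair))
```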
Examining Relationships Among
Variables (9 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Negative correlation
– Correlation in which values of two variables tend to move in
opposite directions
– Example
▪ The more hours students spend partying the night before an
exam, the lower their test grades tend to be
• Positive correlation
– Correlation in which values of two variables tend to move in the
same direction
– Example
▪ The more hours students spend studying for a test, the higher
their test grades tend to be.
Examining Relationships Among
Variables (10 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Checkpoint questions
– “Is the correlation between education and income
positive or negative?”
▪ Positive because the two variables tend to move in
the same direction
– “Is the correlation between empathy and aggression
positive or negative?”
▪ Negative because people with more empathy tend
to be less aggressive
Examining Relationships Among
Variables (11 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Scatterplots are used to visually determine the direction of
correlations.
• Figure 15.6 - scatterplot of college GPA and starting salary
– As college GPA increases, starting salary also tends to increase.
– The correlation coefficient is +.61.
– A moderately strong positive correlation
• Figure 15.7 - scatterplot of days missed during college and starting
salary
– As the number of days missed during college increases, starting
salary tends to decrease
– Correlation coefficient is −.81
– A strong negative correlation
Figure 15.10
Correlations of Different Strengths and Directions
Examining Relationships Among
Variables (12 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Pearson correlation coefficient
– Works only if your data are linearly
related
• Curvilinear relationship - a nonlinear
(curved) relationship between two
quantitative variables
• If you calculate the Pearson correlation
coefficient on a curved relationship,
– It generally will tell you that your
variables are not related.
– When in fact they are related.
– You would draw an incorrect
conclusion about the relationship.
Figure 15.11 A curvilinear relationship.
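To see why the Pearson coefficient can miss a curved relationship, consider the small sketch below. The data are hypothetical and chosen for illustration only: Y is completely determined by X, but along a U-shaped curve. The helper `pearson_r` is our own implementation of the usual deviation-score formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical data: Y depends entirely on X, but the pattern is U-shaped
x_values = [-3, -2, -1, 0, 1, 2, 3]
y_values = [x ** 2 for x in x_values]

r_curved = pearson_r(x_values, y_values)
print(f"Pearson r for the U-shaped data: {r_curved:.2f}")
```

The coefficient comes out at zero even though the two variables are perfectly related, which is why a scatterplot should be inspected before relying on the Pearson coefficient.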
Exhibit 15.3 (1 of 7)
How to Calculate the Pearson Correlation Coefficient
• Earlier we showed how to obtain z scores.
• A z score tells you how far a data value is from the mean of its variable
– Example
▪ A z score of +2.00 says that the score is two SDs above the mean
▪ A z score of −2.00 says the score is two SDs below the mean
• To use the following formula for calculating the correlation coefficient
– r = Σ(Z_X Z_Y) / n
– First convert your IV (X) and DV (Y) data values to z scores
▪ Z_X = z score of the value of the X or IV
▪ Z_Y = z score of the value of the Y or DV
▪ n = number of cases
Exhibit 15.3 (2 of 7)
How to Calculate the Pearson Correlation Coefficient
• Positive relationship
– Some cases have low X
and low Y values.
– Some have high X and
high Y values.
– Pattern provides a
positive value for the
numerator of the
formula.
(a) Positive correlation
Exhibit 15.3 (3 of 7)
How to Calculate the Pearson Correlation Coefficient
• Negative relationship
– Some cases have low X
and high Y values.
– Some have high X and
low Y values.
– Pattern provides a
negative value for the
numerator of the
formula.
(b) Negative correlation
Exhibit 15.3 (4 of 7)
How to Calculate the Pearson Correlation Coefficient
• Researchers do not calculate correlation coefficients by
hand these days.
• It is helpful to calculate the correlation coefficient once to
get a better feel for how the numerical value is produced.
• Table showing how to calculate the correlation between
two variables
• At the end of the chapter, we list a practice exercise
where you can apply this procedure to obtain your own
correlation coefficient.
Exhibit 15.3 (5 of 7)
How to Calculate the Pearson Correlation Coefficient
Step 1. Convert the X and Y variable scores to z scores. We
already obtained the z scores for the X variable when we
introduced the concept of z scores. Here are those z scores:
‒1.413, ‒.707, 0, +.707, +1.413. Using that same procedure,
here are the z scores for variable Y: ‒1.750, ‒.343, +.453,
+.453, +1.187.
Step 2. Calculate the sum of the cross products of the z
scores (i.e., ΣZ_X Z_Y). A three-column procedure works well for
this step: list Z_X, Z_Y, and their product Z_X Z_Y for each case.
Step 3. Divide the sum of the third column (i.e., ΣZ_X Z_Y) by
the number of cases (i.e., n).
Exhibit 15.3 (6 of 7)
How to Calculate the Pearson Correlation Coefficient
r = Σ(Z_X Z_Y) / n = 4.713 / 5 = .943
Exhibit 15.3 (7 of 7)
How to Calculate the Pearson Correlation Coefficient
• Correlation between hours spent studying (X) and test
grades (Y) is +.943
• The two variables are very strongly correlated.
• As the number of hours spent studying increases, so do
test grades.
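The three steps of this exhibit can be sketched in Python as follows. The z scores are the ones listed in Step 1. The helper names are our own. `z_scores` assumes the standard deviation is computed by dividing by n, which is the assumption under which dividing the cross-product sum by n yields Pearson's r.

```python
from math import sqrt

def z_scores(values):
    """Step 1: convert raw scores to z scores (SD computed by dividing by n)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r_from_z(z_x, z_y):
    """Steps 2 and 3: sum the z-score cross products, then divide by n."""
    if len(z_x) != len(z_y):
        raise ValueError("X and Y must have the same number of cases")
    cross_products = [zx * zy for zx, zy in zip(z_x, z_y)]
    return sum(cross_products) / len(cross_products)

# z scores listed in Step 1 of the exhibit
z_x = [-1.413, -0.707, 0.0, 0.707, 1.413]
z_y = [-1.750, -0.343, 0.453, 0.453, 1.187]

cross_product_sum = sum(zx * zy for zx, zy in zip(z_x, z_y))
r = pearson_r_from_z(z_x, z_y)

print(f"Sum of cross products: {cross_product_sum:.3f}")
print(f"r = {r:.3f}")
```

With raw scores instead of z scores, the same result would be obtained by calling `pearson_r_from_z(z_scores(x), z_scores(y))`.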
Examining Relationships Among
Variables (13 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Partial Correlation Coefficient
– The correlation between two quantitative variables
controlling for one or more variables
– Widely used in areas of psychology where the use of
experiments for some research questions is difficult
▪ e.g., Personality, social, and developmental
psychology
– Good, strong theory is required to use
partial correlation analysis.
– The researcher must know the variable(s) that he or
she needs to control for.
Examining Relationships Among
Variables (14 of 31)
15.6 Summarize the Techniques Used to Determine Relationships Among
Variables
• Partial correlation example
– In applied social psychology
▪ The relationship between
– The number of hours spent viewing or playing violence
– The number of aggressive acts performed
▪ Want to control for variables such as
– Personality type
– School grades
– Exposure to violence in the family
– Exposure to violence in the neighborhood
▪ Built on Bandura, Ross, & Ross (1963)
– Classic experimental research showing that children act aggressively
after being exposed to an adult model acting aggressively
Examining Relationships Among
Variables (15 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Value of the partial correlation coefficient indicates
– The strength and direction of relationship between two variables
– After controlling for the influence of one or more other variables
• Just like with the Pearson correlation coefficient
– Partial correlation coefficient has a range of −1.00 to +1.00, where
– Zero indicates there is no relationship
– Sign indicates the direction of the relationship
– Key difference
▪ Partial correlation coefficient indicates the linear relationship between
two variables
▪ After controlling for another variable
Examining Relationships Among
Variables (16 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Researchers use statistical programs to calculate partial
correlation coefficients.
• If you are curious how to calculate the partial correlation
coefficient (or the regression coefficients discussed in the next
section), we recommend
– Cohen, Cohen, West, and Aiken (2003)
– Keith (2019)
• Called “partial” correlation coefficient because the technique
statistically removes or “partials” out the influence of the other
variables statistically controlled for
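For readers who want a feel for the calculation, the sketch below uses the standard first-order formula for a partial correlation with a single control variable Z: r_xy.z = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)). The slides do not present this formula, so treat it as supplementary. The three input correlations are hypothetical values chosen only for illustration.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between X and Y, controlling for Z."""
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("X or Y is perfectly correlated with Z")
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical correlations, chosen only for illustration
r_xy = 0.50  # X with Y
r_xz = 0.40  # X with the control variable Z
r_yz = 0.30  # Y with the control variable Z

r_partial = partial_correlation(r_xy, r_xz, r_yz)
print(f"Zero-order r = {r_xy:.2f}, partial r = {r_partial:.3f}")
```

When Z is unrelated to both X and Y, the function returns the ordinary correlation unchanged, which matches the idea that there is nothing to partial out.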
Examining Relationships Among
Variables (17 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Regression Analysis
– When all variables are quantitative, the technique called regression
analysis is often appropriate.
– Regression analysis - use of one or more quantitative IVs to explain or
predict the values of a single quantitative DV
– Two main types of regression analysis
▪ Simple regression - regression analysis with one DV and one IV
▪ Multiple regression - regression analysis with one DV and two or more
IVs
– Regression equation - the equation that defines a regression line
– Regression line - the line of “best fit” based on a regression equation
– Regression analysis can be used with curvilinear data.
– We only discuss linear relationships in this text.
Figure 15.12
Regression Line Showing the Relationship Between GPA and Starting
Salary
Examining Relationships Among
Variables (18 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Figure 15.12 - scatterplot of college GPA and starting salary with the
regression line inserted
• Two important characteristics of a line
– Slope - tells you how steep the line is
– Y-intercept - the point at which a regression line crosses the Y (vertical)
axis
• Regression equation: Ŷ = b0 + b1X1
– Ŷ (called Y-hat) is the predicted value of the DV
– b0 is the Y-intercept
– b1 is the slope (it’s called the regression coefficient)
– X1 is the single IV
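As a rough illustration of where the Y-intercept and slope come from, the sketch below fits a line by ordinary least squares, the usual criterion for a line of “best fit.” The slides do not spell out the fitting method, so that choice is an assumption. The function name and the data are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sum_xy / sum_xx        # slope (the regression coefficient)
    b0 = mean_y - b1 * mean_x   # Y-intercept
    return b0, b1

# Hypothetical data that fall exactly on the line Y = 1 + 2X
x_values = [1, 2, 3, 4]
y_values = [3, 5, 7, 9]

b0, b1 = fit_simple_regression(x_values, y_values)
print(f"Y-hat = {b0:.2f} + {b1:.2f} * X1")
```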
Examining Relationships Among
Variables (19 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Regression equation for the regression line shown in
Figure 15.12: Ŷ = $9,405.55 + $7,687.48(X1)
– DV (Y) is starting salary
– IV (X1) is GPA
• Researchers rarely, if ever, calculate the regression
equation by hand!
• Y-intercept is $9,405.55; this is the predicted starting
salary if a person had a GPA of 0.
Examining Relationships Among
Variables (20 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Regression coefficient - the slope or change in Y given a one unit
change in X
• Regression coefficient or slope in our example is $7,687.48.
– Starting salary is expected to increase by $7,687.48 for every one
unit increase in GPA.
– Or decrease by $7,687.48 for every one unit decrease in GPA
• Example
– A student with a 3 on the GPA variable (i.e., a B) is predicted to
start at a salary of $7,687 more than a student with a 2 (i.e., a C).
– Used the traditional grading scale (A = 4, B = 3, C = 2, D = 1, F =
0)
Examining Relationships Among
Variables (21 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Regression equation can be used to obtain predicted values for the DV for specific
values of the IV.
– Example
▪ Let’s see what the predicted starting salary is for
– A student with a college GPA of 3 (i.e., a B average)
Ŷ = $9,405.55 + $7,687.48(3.00)   We inserted the GPA value of 3.00
Ŷ = $9,405.55 + $23,062.44   We multiplied $7,687.48 by 3.00
Ŷ = $32,467.99   We added $9,405.55 and $23,062.44
– Expected starting salary is $32,467.99
– Someone with a C average (i.e., a GPA value of 2)
• Insert a 2 into the equation and solve it
• Predicted starting salary is $24,780.51
▪ Notice that the difference between the starting salary for someone with a C
and a B is equal to the value of the regression coefficient.
▪ $32,467.99 − $24,780.51 = $7,687.48
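The predictions on this slide can be reproduced with a short sketch. The intercept and slope are the values reported for the GPA and starting salary example. The function name `predict_starting_salary` is our own.

```python
INTERCEPT = 9405.55  # b0: predicted starting salary at a GPA of 0
SLOPE = 7687.48      # b1: change in predicted salary per one-unit change in GPA

def predict_starting_salary(gpa):
    """Y-hat = b0 + b1 * X1 for the GPA and starting salary example."""
    return INTERCEPT + SLOPE * gpa

salary_b = predict_starting_salary(3.0)  # B average
salary_c = predict_starting_salary(2.0)  # C average

print(f"B average: ${salary_b:,.2f}")
print(f"C average: ${salary_c:,.2f}")
print(f"Difference: ${salary_b - salary_c:,.2f}")
```

The difference between the two predictions equals the slope because the two GPA values are exactly one unit apart.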
Examining Relationships Among
Variables (22 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Multiple regression
– Multiple regression equation includes one regression
coefficient for each IV.
• Useful difference between the simple and multiple regression
– Multiple regression coefficient shows the relationship
between the DV and the IV controlling for the other IVs in
the equation.
– Analogous to the idea discussed earlier with partial
correlation
– Multiple regression coefficient is called the partial regression
coefficient.
Examining Relationships Among
Variables (23 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Useful difference between the simple and multiple regression
– Simple regression analogous to a Pearson correlation which
does not control for any confounding variables
– Multiple regression provides one way that you can control
for one or more variables.
▪ The difference in the actual values of the correlation and
regular (unstandardized) regression coefficients
– Correlation coefficients are in standardized units
that vary from −1.00 to +1.00
– Regular regression coefficients are in natural units
Examining Relationships Among
Variables (24 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Multiple Regression example
– The partial correlation coefficient expressing the relationship between
starting salary and GPA controlling for SAT scores is .413
– The partial regression coefficient is $4,788.90
▪ Controlling for SAT scores, each unit change in GPA is predicted to
lead to a $4,788.90 change in starting salary.
– Using the data from our hypothetical college student data set, we used
SPSS to provide the following multiple regression equation,
based on the DV of starting salary and the IVs of GPA and high school
SAT
– Ŷ = −$12,435.59 + $4,788.90(X1) + $25.56(X2)
– X1 = GPA
– X2 = high school SAT
Examining Relationships Among
Variables (25 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• First partial regression coefficient in the preceding regression
equation is $4,788.90
– After controlling for SAT scores, starting salary increases by
$4,788.90 for each one-unit increase in GPA
• Second partial regression coefficient is $25.56.
– After controlling for GPA, starting salary increases by
$25.56 for each one-unit increase in SAT
Examining Relationships Among
Variables (26 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Obtain a predicted starting salary using our multiple regression equation by
inserting the values for GPA and SAT and solving for Y-hat.
– B student
– 1100 on SAT
Ŷ = −$12,435.59 + $4,788.90(3) + $25.56(1100)   We inserted a 3 for GPA and 1100 for SAT
Ŷ = −$12,435.59 + $14,366.70 + $25.56(1100)   We multiplied $4,788.90 times 3
Ŷ = −$12,435.59 + $14,366.70 + $28,116.00   We multiplied $25.56 times 1100
Ŷ = $30,047.11   We added the two positive numbers and subtracted the negative number
– Predicted starting salary is $30,047.11.
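The same prediction can be sketched in Python with a small function that accepts any number of IVs. The intercept is entered as a negative number, which is what the slide's final step (subtracting the negative number) and its result of $30,047.11 imply. The function name `predict` is hypothetical.

```python
def predict(intercept, coefficients, values):
    """Y-hat = b0 + b1*X1 + b2*X2 + ... for any number of IVs."""
    if len(coefficients) != len(values):
        raise ValueError("need one value per regression coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, values))

intercept = -12435.59                    # b0 (negative in this equation)
partial_coefficients = [4788.90, 25.56]  # b1 for GPA, b2 for high school SAT

# A B student (GPA of 3) who scored 1100 on the SAT
y_hat = predict(intercept, partial_coefficients, [3, 1100])
print(f"Predicted starting salary: ${y_hat:,.2f}")
```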
Examining Relationships Among
Variables (27 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Contingency Tables
– A categorical DV and a categorical IV
– Construct a contingency table (also called cross-tabulation)
– Contingency table - table used to examine the relationship between categorical
variables
– Two-dimension contingency table
▪ Two variables
▪ Rows represent the categories of one of the variables.
▪ Columns represent the categories of the other variable.
– Various types of information can be placed into the cells of a contingency table.
▪ Cell frequencies
▪ Cell percentages
▪ Row percentages
▪ Column percentages
Table 15.5
Personality Type by Gender Contingency Tables
(a) Contingency Table Showing Cell Frequencies (hypothetical data)
Personality type    Gender: Female    Gender: Male
Type A              2,972             2,460
Type B              1,921               971
Total               4,893             3,431
(b) Contingency Table Showing Column Percentages (based on the data in part (a))
Personality type    Gender: Female    Gender: Male
Type A              60.7%             71.7%
Type B              39.3%             28.3%
Total               100%              100%
Examining Relationships Among
Variables (28 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Contingency Tables
– Column variable is gender (i.e., female or male)
– Row variable is personality type
▪ Type-A personality
– Likely to be impatient, competitive, irritable, high achieving, engage in multitasking,
and feel a sense of urgency
▪ Type-B personality
– Likely to be cooperative, less competitive, more relaxed, more patient, more
satisfied, and easygoing
– Research question
▪ Whether there is a relationship between gender and personality types
– Does gender seem to predict personality type?
– Do you think that women tend to be type A more than men tend to be type A?
– Very difficult to determine how the two variables are related based on cell frequencies alone
Examining Relationships Among
Variables (29 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Table 15.5(b)
– Calculated what are called column percentages for females and
males
▪ Type A personality column percentages
– 60.7% of females were type A.
– 71.7% of the men were type A.
– Men had a greater rate of type-A personality than women.
▪ Type-B personality column percentages
– 39.3% for females
– 28.3% for men
– Women have a higher rate of type-B personality than men.
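Column percentages like these are obtained by dividing each cell frequency by its column total. The sketch below shows one way to do that with the cell frequencies from Table 15.5(a). The nested-dictionary layout and the function name `column_percentages` are our own choices for illustration.

```python
# Cell frequencies from Table 15.5(a): rows are personality types, columns are gender
frequencies = {
    "Type A": {"Female": 2972, "Male": 2460},
    "Type B": {"Female": 1921, "Male": 971},
}

def column_percentages(table):
    """Divide each cell by its column total and express the result as a percentage."""
    columns = list(next(iter(table.values())).keys())
    totals = {col: sum(row[col] for row in table.values()) for col in columns}
    return {
        row_name: {col: 100 * row[col] / totals[col] for col in columns}
        for row_name, row in table.items()
    }

percentages = column_percentages(frequencies)
for row_name, row in percentages.items():
    cells = ", ".join(f"{col} {pct:.1f}%" for col, pct in row.items())
    print(f"{row_name}: {cells}")
```

Because the percentages are calculated down the columns, the comparison is made across each row, for example 60.7% of females versus 71.7% of males for type A.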
Examining Relationships Among
Variables (30 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• We recommend that you
– Make your predictor variable (IV) the column variable and
your DV the row variable
– Calculate column percentages and compare the rates
across the rows
• In order to correctly read a contingency table, you need to
remember these two simple rules.
– If the percentages are calculated down the columns, then
compare across the rows.
– If the percentages are calculated across the rows, then
compare down the columns.
Examining Relationships Among
Variables (31 of 31)
15.6 Summarize the Techniques Used to Determine Relationships
Among Variables
• Rates are frequently reported in
– The news
– Some types of research (e.g., epidemiology)
• More advanced research
– Add another (a third) IV
– Construct the two-way table.
Copyright
This work is protected by United States copyright laws and is
provided solely for the use of instructors in teaching their courses
and assessing student learning. Dissemination or sale of any part
of this work (including on the World Wide Web) will destroy the
integrity of the work and is not permitted. The work and materials
from it should never be made available to students except by
instructors using the accompanying text in their classes. All
recipients of this work are expected to abide by these restrictions
and to honor the intended pedagogical purposes and the needs of
other instructors who rely on these materials.
Ad

More Related Content

Similar to research methods for business, descriptive statistics (20)

MAC411(A) Analysis in Communication Researc.ppt
MAC411(A) Analysis in Communication Researc.pptMAC411(A) Analysis in Communication Researc.ppt
MAC411(A) Analysis in Communication Researc.ppt
PreciousOsoOla
 
Statistics for Management.pdf
Statistics for Management.pdfStatistics for Management.pdf
Statistics for Management.pdf
Sankaranarayanan196410
 
Teaching Correct Statistical Methods in the Era of Knowledge Sharing
Teaching Correct Statistical Methods  in the Era of Knowledge SharingTeaching Correct Statistical Methods  in the Era of Knowledge Sharing
Teaching Correct Statistical Methods in the Era of Knowledge Sharing
Johnny Amora
 
Stat11t Chapter1
Stat11t Chapter1Stat11t Chapter1
Stat11t Chapter1
gueste87a4f
 
Stat11t chapter1
Stat11t chapter1Stat11t chapter1
Stat11t chapter1
raylenepotter
 
Statistical analysis and Statistical process in 2023 .pptx
Statistical analysis and Statistical process in 2023 .pptxStatistical analysis and Statistical process in 2023 .pptx
Statistical analysis and Statistical process in 2023 .pptx
Fayaz Ahmad
 
Report on students' socio-economic background
Report on students' socio-economic backgroundReport on students' socio-economic background
Report on students' socio-economic background
Shourav Mahmud
 
Data analysis presentation by Jameel Ahmed Qureshi
Data analysis presentation by Jameel Ahmed QureshiData analysis presentation by Jameel Ahmed Qureshi
Data analysis presentation by Jameel Ahmed Qureshi
Jameel Ahmed Qureshi
 
050325Online SPSS.pptx spss social science
050325Online SPSS.pptx spss social science050325Online SPSS.pptx spss social science
050325Online SPSS.pptx spss social science
NurFatin805963
 
Business Research & Statitics part II.pptx
Business Research & Statitics part II.pptxBusiness Research & Statitics part II.pptx
Business Research & Statitics part II.pptx
milkesashobe430
 
Business Statistics for Managers with SPSS[1].pptx
Business Statistics for Managers with SPSS[1].pptxBusiness Statistics for Managers with SPSS[1].pptx
Business Statistics for Managers with SPSS[1].pptx
profgnagarajan
 
Introduction to business statistics
Introduction to business statisticsIntroduction to business statistics
Introduction to business statistics
Aakash Kulkarni
 
QUANTITATIVE-DATA.pptx
QUANTITATIVE-DATA.pptxQUANTITATIVE-DATA.pptx
QUANTITATIVE-DATA.pptx
ViaFortuna
 
The role of statistics and the data analysis process.ppt
The role of statistics and the data analysis process.pptThe role of statistics and the data analysis process.ppt
The role of statistics and the data analysis process.ppt
JakeCuenca10
 
analysis plan.ppt
analysis plan.pptanalysis plan.ppt
analysis plan.ppt
SamsonOlusinaBamiwuy
 
Lecture 2 practical_guidelines_assignment
Lecture 2 practical_guidelines_assignmentLecture 2 practical_guidelines_assignment
Lecture 2 practical_guidelines_assignment
Daria Bogdanova
 
Introduction to business statistics
Introduction to business statisticsIntroduction to business statistics
Introduction to business statistics
Aakash Kulkarni
 
statistical analysis of questionnaires
statistical analysis of questionnairesstatistical analysis of questionnaires
statistical analysis of questionnaires
Mohamed Afifi
 
Research and Statistics Report- Estonio, Ryan.pptx
Research  and Statistics Report- Estonio, Ryan.pptxResearch  and Statistics Report- Estonio, Ryan.pptx
Research and Statistics Report- Estonio, Ryan.pptx
RyanEstonio
 
7INNOVA Lesson 5 slides for T26
7INNOVA Lesson 5 slides for T267INNOVA Lesson 5 slides for T26
7INNOVA Lesson 5 slides for T26
Maykalavaane Narayanan
 
MAC411(A) Analysis in Communication Researc.ppt
MAC411(A) Analysis in Communication Researc.pptMAC411(A) Analysis in Communication Researc.ppt
MAC411(A) Analysis in Communication Researc.ppt
PreciousOsoOla
 
Teaching Correct Statistical Methods in the Era of Knowledge Sharing
Teaching Correct Statistical Methods  in the Era of Knowledge SharingTeaching Correct Statistical Methods  in the Era of Knowledge Sharing
Teaching Correct Statistical Methods in the Era of Knowledge Sharing
Johnny Amora
 
Stat11t Chapter1
Stat11t Chapter1Stat11t Chapter1
Stat11t Chapter1
gueste87a4f
 
Statistical analysis and Statistical process in 2023 .pptx
research methods for business, descriptive statistics

  • 1. Research Methods, Design, and Analysis Thirteenth Edition Chapter 15 Descriptive Statistics Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved
  • 2. Descriptive Statistics
  • 3. Learning Objectives 15.1 Describe the purpose of descriptive statistics. 15.2 Explain the concept of a frequency distribution. 15.3 Differentiate among the types of graphic representations of data and when they should be used. 15.4 Calculate the mean, median, and mode of a data set. 15.5 Calculate the variance and standard deviation of a data set. 15.6 Summarize the techniques used to determine relationships among variables.
  • 4. Field of Statistics (1 of 2) • Two broad categories – Descriptive statistics – Inferential statistics Figure 15.1 Major divisions of the field of statistics.
  • 5. Field of Statistics (2 of 2) • Descriptive statistics – The type of statistical analysis focused on describing, summarizing, or explaining a set of data – Allows you to make sense of your set of data and to make the key characteristics easily understandable to others • Inferential statistics – The type of statistical analysis focused on making inferences about populations based on sample data – Subdivided into ▪ Estimation – Point estimation – Interval estimation ▪ Hypothesis testing
  • 6. Let's Begin... • In this chapter, we will explain descriptive statistical analysis. • In Chapter 16, we will explain inferential statistical analysis. • We assume no prior knowledge of the material. • Both chapters are written so that everyone can understand the material. • Discussion requires very little mathematical background. • Focus on showing you – What statistical procedures to select to understand your data – How to interpret and communicate your results • Before moving to the next section please read Exhibit 15.1 to see why you must always conduct your statistical analyses intelligently.
  • 7. Exhibit 15.1 (1 of 6) Simpson’s Paradox • Demonstrate how statistical analysis, if not conducted properly, can deceive people. • Example is based on a real case of purported gender discrimination at the University of California, Berkeley, several decades ago. • Written up in Science (Bickel, 1975) • Data shown below refer to men and women admitted to graduate school in the Department of Psychology at a hypothetical university.
  • 8. Exhibit 15.1 (2 of 6) Simpson’s Paradox Combined or “Aggregated” Results
               Number Applied   Number Admitted   Percentage Admitted
        Men    180              99                55
        Women  100              45                45
    • 55% of the men who applied to this department were admitted to graduate school. • Only 45% of the women who applied were admitted. • Assume that their qualifications were the same.
  • 9. Exhibit 15.1 (3 of 6) Simpson’s Paradox • If this were the case, you might conclude that gender discrimination has occurred because men had a much higher rate of acceptance than women. • Assume that the 280 students applying to the Psychology Department applied to two different graduate programs. – Doctoral program in clinical psychology – Doctoral program in experimental psychology • The researcher decides to break down the data separately for each program and obtains the two tables shown next.
  • 10. Exhibit 15.1 (4 of 6) Simpson’s Paradox Results Separated by Program (“Disaggregated Results”)
    Clinical Psychology Program
               Number Applied   Number Admitted   Percentage Admitted
        Men    60               9                 15
        Women  60               12                20
    Experimental Psychology Program
               Number Applied   Number Admitted   Percentage Admitted
        Men    120              90                75
        Women  40               32                80
    • What do you see in these two program tables? – Women (not men) had the higher acceptance rates in both degree programs! • If there is any discrimination, it is in favor of the women applicants.
  • 11. Exhibit 15.1 (5 of 6) Simpson’s Paradox • What’s going on? – The overall/combined data suggested one conclusion. – When the data were more carefully analyzed (they were “disaggregated” in the clinical and experimental program tables), a completely different conclusion became apparent. • How could it be that opposite conclusions are suggested in the two exhibits based on the same data? – A statistical phenomenon known as Simpson’s paradox – Women tended to apply to the program that was harder to get into – Men tended to apply to the program that was easier to get into – Aggregated data produced one conclusion – Disaggregated data produced the opposite and more accurate conclusion.
  • 12. Exhibit 15.1 (6 of 6) Simpson’s Paradox • Moral of this story – Be cautious when you examine and interpret descriptive data. – Always look at the data. ▪ Critically ▪ In multiple ways ▪ Until you are able to draw the most warranted conclusion
  • 13. Descriptive Statistics (1 of 4) 15.1 Describe the Purpose of Descriptive Statistics • Data set - a set of data, where the rows are “cases” and the columns are “variables” • The researcher uses descriptive statistics to understand and summarize the key numerical characteristics of the data set. • Example – Calculate the averages of your treatment and control group scores in an experiment. – If you conducted a survey, you might want to know the frequencies of the responses for each question. – Want to use graphs to pictorially communicate some of your results
  • 14. Descriptive Statistics (2 of 4) 15.1 Describe the Purpose of Descriptive Statistics • In the next chapter (inferential statistics) – Will learn how to determine if the difference between the treatment and control groups means is statistically significant – If other observed results are statistically significant • In this chapter – Focus on taking whatever set of data you currently have and showing how to summarize the key characteristics of the data • Key question in descriptive statistics – How can I communicate the important characteristics of my data? ▪ One way would be to supply a printout of all of your data, but that would be very inefficient.
  • 15. Descriptive Statistics (3 of 4) 15.1 Describe the Purpose of Descriptive Statistics • Data set in Table 15.1 will be used in several places in this chapter. – “College graduate data set.” – Hypothetically say ▪ Data came from a survey research study ▪ Conducted with 25 recent college graduates ▪ You asked participants – Starting salaries, undergraduate GPA, college major (you only surveyed three majors), gender, the SAT scores they had when they entered college, number of days they believe they missed during college • Goal in this survey research study – To determine what variables predicted the starting salaries of psychology, philosophy, and business majors
  • 16. Table 15.1 (1 of 3) Hypothetical Data Set for Nonexperimental Research for 25 Recent College Graduates • Four quantitative variables – Salary – GPA – SAT scores – Days of school missed • Two categorical variables – College major – Gender • Standard format – Cases in rows – Variables in columns
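The reversal described in Exhibit 15.1 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are copied from the two disaggregated program tables, while the function and variable names are made up for this example. Note that summing the program tables gives 44 women admitted, one fewer than the 45 shown in the aggregated table, so the combined rate for women prints as 44% here.

```python
# Counts copied from the disaggregated tables in Exhibit 15.1:
# program -> group -> (number applied, number admitted)
applicants = {
    "clinical": {"men": (60, 9), "women": (60, 12)},
    "experimental": {"men": (120, 90), "women": (40, 32)},
}


def rate(applied, admitted):
    """Percentage of applicants who were admitted."""
    return 100 * admitted / applied


def program_rates(data):
    """Admission rate per group within each program (disaggregated view)."""
    return {
        program: {group: rate(*counts) for group, counts in groups.items()}
        for program, groups in data.items()
    }


def combined_rates(data):
    """Admission rate per group after pooling all programs (aggregated view)."""
    totals = {}
    for groups in data.values():
        for group, (applied, admitted) in groups.items():
            prev_applied, prev_admitted = totals.get(group, (0, 0))
            totals[group] = (prev_applied + applied, prev_admitted + admitted)
    return {group: rate(*counts) for group, counts in totals.items()}


for program, rates in program_rates(applicants).items():
    print(program, rates)      # women have the higher rate in each program
print("combined", combined_rates(applicants))  # men have the higher pooled rate
```

The pooled rates favor men only because most men applied to the program with the higher admission rate, which is the pattern the exhibit attributes to Simpson's paradox.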
  • 17. Table 15.1 (2 of 3) Hypothetical Data Set for Nonexperimental Research for 25 Recent College Graduates
        Person  Salary  GPA  Major  Gender  SAT    Days Missed
        1       24,000  2.5  1      0       1,110  36
        2       25,000  2.5  1      0       1,100  26
        3       27,500  3.0  1      0       1,300  31
        4       28,500  2.4  2      1       1,100  18
        5       30,500  3.0  2      0       1,150  26
        6       30,500  2.9  2      1       1,130  18
        7       31,000  3.1  1      0       1,180  16
        8       31,000  3.3  1      0       1,160  11
        9       31,500  2.9  2      0       1,170  25
        10      32,000  3.6  1      0       1,250  12
        11      32,000  2.6  1      1       1,230  26
        12      32,500  3.1  2      0       1,130  21
        13      32,500  3.2  2      1       1,200  17
        14      32,500  3.0  3      1       1,150  14
  • 18. Table 15.1 (3 of 3) Hypothetical Data Set for Nonexperimental Research for 25 Recent College Graduates
        Person  Salary  GPA  Major  Gender  SAT    Days Missed
        15      33,000  3.7  1      0       1,260  29
        16      33,500  3.1  2      1       1,170  21
        17      33,500  2.7  2      1       1,140  22
        18      34,500  3.0  3      0       1,240  14
        19      35,500  3.1  3      0       1,330  16
        20      36,500  3.5  2      1       1,220  0
        21      37,500  3.4  3      1       1,150  4
        22      38,500  3.2  2      0       1,270  10
        23      38,500  3.0  3      1       1,300  0
        24      40,500  3.3  3      1       1,280  5
        25      41,500  3.5  3      1       1,330  2
    Note: For the categorical variable “major,” 1 = psychology, 2 = philosophy, and 3 = business. For the categorical variable “gender,” 0 = male and 1 = female.
  • 19. Descriptive Statistics (4 of 4) 15.1 Describe the Purpose of Descriptive Statistics • Enter data into a spreadsheet such as – Excel (which can be used by a statistical program such as SPSS) – SPSS • We used the popular statistical program SPSS for most of the analyses in this and the next chapter. • Most universities provide access to SPSS or another statistical program in their computer labs.
  • 20. Frequency Distributions 15.2 Explain the Concept of a Frequency Distribution • Frequency distribution - a data arrangement in which the frequency of each unique data value is shown – First column shows the unique data values for the variable. – Second column shows the frequencies for each of these values. – Third column shows the percentages. • Example - Table 15.2 – Variable - starting salary – Lowest salary is $24,000. – Highest is $41,500. – Most frequently occurring salary is $32,500. ▪ Three of the 25 recent graduates had this starting salary. – 4% of the 25 cases had a salary of $24,000. – 8% of the cases had a salary of $32,000.
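A frequency distribution like the one described for starting salary can be built by counting each unique value. The following is a minimal sketch, not the SPSS procedure the chapter uses: the salaries are copied from Table 15.1, and the function name is an assumption made for this example.

```python
from collections import Counter

# Starting salaries of the 25 graduates, copied from Table 15.1.
salaries = [
    24000, 25000, 27500, 28500, 30500, 30500, 31000, 31000, 31500,
    32000, 32000, 32500, 32500, 32500, 33000, 33500, 33500, 34500,
    35500, 36500, 37500, 38500, 38500, 40500, 41500,
]


def frequency_distribution(values):
    """Return (value, frequency, percentage) rows sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(value, counts[value], 100 * counts[value] / n)
            for value in sorted(counts)]


# Print the three columns: unique value, frequency, percentage.
for value, freq, pct in frequency_distribution(salaries):
    print(f"{value:>7,} {freq:>3} {pct:>5.1f}%")
```

Each row corresponds to one unique salary, so the first row is the lowest salary and the last row is the highest.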
  • 21. Graphic Representations of Data (1 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Graphs – Pictorial representations of data – Can be used for one or more variables – Used to help communicate the nature of data – Example ▪ Program evaluators often include graphs in their reports because their clients often like to see graphic representations of the data.
  • 22. Graphic Representations of Data (2 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Bar Graphs – Graph that uses vertical bars to represent the data values of a categorical variable Figure 15.2 A bar graph of undergraduate major.
  • 23. Graphic Representations of Data (3 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Figure 15.2 – Bar graph of the categorical variable college major – Horizontal axis shows the three categories in the variable. – Frequencies of each category are shown on the vertical axis. – Bars provide graphical representations of the frequencies of the three majors. ▪ 8 psychology majors ▪ 10 philosophy majors ▪ 7 business majors – Can easily convert these numbers into percentages ▪ 32% were psychology majors (8 divided by 25). ▪ 40% were philosophy majors (10 divided by 25). ▪ 28% were business majors (7 divided by 25).
  • 24. Graphic Representations of Data (4 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Histograms – Graph depicting frequencies and distribution of a quantitative variable – A presentation of a frequency distribution in bar format – Advantage over a frequency distribution ▪ More clearly shows the shape of the distribution – Histogram for starting salary in Figure 15.3 – In contrast to bar graphs, the bars in histograms are placed next to each other with no space in between.
  • 25. Figure 15.3 Histogram of Starting Salary
  • 26. Graphic Representations of Data (5 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Line Graphs – A graph relying on the drawing of one or more lines connecting data points – A useful way to graphically depict the distribution of a quantitative variable – Line graph of starting salary in Figure 15.4 – Useful to visually show and aid in the interpretation of interaction effects Figure 15.4 Line graph of starting salary.
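The counts and percentages behind the bar graph of college major can be reproduced from the coded "Major" column. This is a rough sketch under stated assumptions: the codes are copied from Table 15.1, the text-based bars are only a stand-in for Figure 15.2, and all names in the code are hypothetical.

```python
from collections import Counter

# "Major" column from Table 15.1, persons 1 through 25.
majors = [1, 1, 1, 2, 2, 2, 1, 1, 2, 1, 1, 2, 2, 3,
          1, 2, 2, 3, 3, 2, 3, 2, 3, 3, 3]

# Coding given in the note to Table 15.1.
labels = {1: "psychology", 2: "philosophy", 3: "business"}


def major_summary(codes):
    """Map each major label to (frequency, percentage of all cases)."""
    counts = Counter(codes)
    n = len(codes)
    return {labels[code]: (counts[code], 100 * counts[code] / n)
            for code in sorted(counts)}


# One row per category: a bar of '#' characters, the count, the percentage.
for label, (freq, pct) in major_summary(majors).items():
    print(f"{label:<11} {'#' * freq:<10} {freq:>2} ({pct:.0f}%)")
```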
  • 27. Graphic Representations of Data (6 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Line Graph interaction example – Conduct an experiment to test a new social skills training program. – Pretest–posttest control group design – DV = the number of appropriate social interactions – IV = social skills training (training versus no training) – Data shown in Table 15.3 – Some results of this hypothetical experiment are shown in Figure 15.5.
  • 28. Table 15.3 (1 of 2) Hypothetical Data Set for Experimental Research Study Examining the Effectiveness of Social Skills Training
        Person  Pretest Scores  Treatment Condition  Posttest Scores
        1       3               1                    4
        2       4               1                    4
        3       2               1                    3
        4       1               1                    2
        5       1               1                    2
        6       0               1                    0
        7       2               1                    2
        8       4               1                    4
        9       4               1                    4
        10      3               1                    4
        11      2               1                    3
        12      5               1                    5
        13      3               1                    3
        14      3               1                    3
        15      2               2                    4
        16      3               2                    5
  • 29. Table 15.3 (2 of 2) Hypothetical Data Set for Experimental Research Study Examining the Effectiveness of Social Skills Training
        Person  Pretest Scores  Treatment Condition  Posttest Scores
        17      1               2                    2
        18      2               2                    4
        19      1               2                    2
        20      2               2                    4
        21      2               2                    3
        22      3               2                    5
        23      5               2                    6
        24      2               2                    4
        25      4               2                    2
        26      4               2                    5
        27      2               2                    4
        28      5               2                    6
    Note: Pretest = number of appropriate interactions at the beginning of the experiment; posttest = number of appropriate interactions after the experimental intervention; treatment condition = 1 for control group (did not receive social skills training) and 2 for treatment group (did receive social skills training).
  • 30. Figure 15.5 Line Graph of Results from Pretest–Posttest Control Group Design Studying Effectiveness of Social Skills Treatment
  • 31. Graphic Representations of Data (7 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Line Graph interaction example – Both groups started low on the number of appropriate skills they exhibited. – At the end of the study ▪ After the treatment group received social skills training ▪ The participants in the treatment group have higher scores than the participants in the control group. – The number of appropriate social skills ▪ Increased for the treatment group ▪ No (or very little) increase for the control group – Treatment seems to work. – You must also determine if the result is statistically significant.
  • 32. Graphic Representations of Data (8 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Scatterplots – A graphical depiction of the relationship between two quantitative variables – Dependent variable on the vertical axis – Independent or predictor variable on the horizontal axis – Dots within the graph represent the cases (i.e., participants) in the data set. – Scatterplot of the two quantitative variables grade point average and starting salary in Figure 15.6 ▪ Appears to be a positive relationship between GPA and starting salary ▪ As GPA increases, starting salary also increases.
  • 33. Figure 15.6 Scatterplot of Starting Salary by College GPA (Positive Relationship)
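The four points that a line graph such as Figure 15.5 connects are the pretest and posttest means of each group. The sketch below computes them from the scores in Table 15.3; the variable and function names are assumptions made for this illustration, and it describes the pattern only, without testing statistical significance.

```python
from statistics import mean

# (pretest, posttest) pairs copied from Table 15.3.
control = [  # treatment condition 1: no social skills training
    (3, 4), (4, 4), (2, 3), (1, 2), (1, 2), (0, 0), (2, 2),
    (4, 4), (4, 4), (3, 4), (2, 3), (5, 5), (3, 3), (3, 3),
]
treatment = [  # treatment condition 2: received social skills training
    (2, 4), (3, 5), (1, 2), (2, 4), (1, 2), (2, 4), (2, 3),
    (3, 5), (5, 6), (2, 4), (4, 2), (4, 5), (2, 4), (5, 6),
]


def group_means(pairs):
    """Return (mean pretest, mean posttest) for one group."""
    pre = mean(p for p, _ in pairs)
    post = mean(q for _, q in pairs)
    return pre, post


for name, pairs in (("control", control), ("treatment", treatment)):
    pre, post = group_means(pairs)
    print(f"{name:<9} pretest={pre:.2f} posttest={post:.2f} gain={post - pre:.2f}")
```

Lines that are not parallel, here a steeper rise for the treatment group than for the control group, are what the slides refer to as an interaction pattern.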
  • 34. Graphic Representations of Data (9 of 9) 15.3 Differentiate Among the Types of Graphic Representations of Data and When They Should Be Used • Scatterplots – Positive relationship ▪ The data values tend to start at the bottom left side of the graph and end at the top right side. – Scatterplot of days of school missed during college and starting salary is shown in Figure 15.7. ▪ Appears to be a negative relationship between days missed and starting salary ▪ As days missed increases, starting salary decreases. – Negative relationship ▪ The data values tend to start at the top left side of the graph and end at the lower right side.
  • 35. Figure 15.7 Scatterplot of Starting Salary by Days Missed (Negative Relationship)
  • 36. Measures of Central Tendency (1 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Measure of central tendency – Numerical value expressing what is typical of the values of a quantitative variable • One of the most important ways to describe and understand data • Example – College GPA is the value expressing what is typical for your grades. • Three most common measures of central tendency – Mode – The median – The mean.
  • 37. Measures of Central Tendency (2 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Mode – The most frequently occurring number – Most basic, and the crudest, measure of central tendency – Example ▪ 0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 9, 10 ▪ mode is 5 ▪ occurs three times – If there is a tie for the most frequently occurring number ▪ Need to report both ▪ Point out that the data for the variable are bimodal.
  • 38. Measures of Central Tendency (3 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Mode – Practice ▪ Determine the mode for the following set of numbers ▪ 1, 2, 2, 5, 5, 7, 10, 10, 10 ▪ If you said 10, then you are right ▪ The mode in this case is not a very good indicator of the central tendency of the data. – If the data are normally distributed ▪ Most people fall toward the center of the distribution of numbers. ▪ The mode works much better than in this case – In practice, research psychologists rarely use the mode.
  • 39. Measures of Central Tendency (4 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Median – The center point in an ordered set of numbers – Odd number of numbers ▪ the median is the middle number ▪ example – 1, 2, 3, 4, 5 – median is 3 – Even number of numbers ▪ The median is the average of the two centermost numbers ▪ Example – 1, 2, 3, 4 – Median is 2.5 (i.e., the average of 2 and 3 is 2.5)
  • 40. Measures of Central Tendency (5 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Median – An interesting property of the median is that it is not affected by the size of the highest and lowest numbers ▪ Example – The median of 1, 2, 3, 4, 5 is the same as the median of 1, 2, 3, 4, 500 – In both cases the median is 3!
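The direction of the relationships shown in the two scatterplots can be checked numerically. The sketch below uses the sign of the sample covariance as a rough indicator of direction; this is an illustrative choice for this example, not a technique taken from the slides. The data are copied from Table 15.1 and the function name is an assumption.

```python
# Columns copied from Table 15.1, persons 1 through 25.
salary = [24000, 25000, 27500, 28500, 30500, 30500, 31000, 31000, 31500,
          32000, 32000, 32500, 32500, 32500, 33000, 33500, 33500, 34500,
          35500, 36500, 37500, 38500, 38500, 40500, 41500]
gpa = [2.5, 2.5, 3.0, 2.4, 3.0, 2.9, 3.1, 3.3, 2.9, 3.6, 2.6, 3.1, 3.2,
       3.0, 3.7, 3.1, 2.7, 3.0, 3.1, 3.5, 3.4, 3.2, 3.0, 3.3, 3.5]
days_missed = [36, 26, 31, 18, 26, 18, 16, 11, 25, 12, 26, 21, 17, 14,
               29, 21, 22, 14, 16, 0, 4, 10, 0, 5, 2]


def sample_covariance(xs, ys):
    """Sample covariance: positive when xs and ys tend to rise together."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def direction(xs, ys):
    """Label the relationship by the sign of the covariance."""
    cov = sample_covariance(xs, ys)
    return "positive" if cov > 0 else "negative" if cov < 0 else "none"


print("GPA vs. salary:", direction(gpa, salary))
print("Days missed vs. salary:", direction(days_missed, salary))
```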
  • 41. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Central Tendency (6 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Mean – The arithmetic average – The average of 1, 2, and 3 = 2 ▪ (1 + 2 + 3) / 3 – Psychologists sometimes refer to the mean as \(\bar{X}\) (called X bar) – Our formula for getting the mean: \(\bar{X} = \frac{\sum X}{n}\) – X stands for the variable you are using – n is the number of numbers you have – \(\sum\) is a sum sign (add up the numbers that follow it)
  • 42. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Central Tendency (7 of 7) 15.4 Calculate the Mean, Median, and Mode of a Data Set • Mean – Simple case where the three values of our variable are 1, 2, and 3: \(\bar{X} = \frac{1 + 2 + 3}{3} = \frac{6}{3} = 2\) – Psychologists frequently calculate the means for the groups that they want to compare ▪ e.g., the mean performance level for treatment and control groups – Figure 15.5 ▪ Each of the four points in the graph is a group mean – Means for the treatment and the control groups at the pretest – Means for these two groups at the posttest
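The three measures of central tendency described on the preceding slides can be checked with a short Python sketch. It uses the standard-library `statistics` module and the example data quoted on the slides; the variable names are illustrative choices, not part of the original material.

```python
import statistics

# Example data sets quoted on the slides above.
mode_data = [0, 2, 3, 4, 5, 5, 5, 7, 8, 8, 9, 10]
odd_data = [1, 2, 3, 4, 5]
even_data = [1, 2, 3, 4]
outlier_data = [1, 2, 3, 4, 500]

mode_value = statistics.mode(mode_data)            # most frequently occurring number
median_odd = statistics.median(odd_data)           # middle number of an odd-length set
median_even = statistics.median(even_data)         # average of the two centermost numbers
median_outlier = statistics.median(outlier_data)   # the extreme value 500 does not move it
mean_value = statistics.mean([1, 2, 3])            # (1 + 2 + 3) / 3

print("mode:", mode_value)
print("medians:", median_odd, median_even, median_outlier)
print("mean:", mean_value)
```

Replacing 5 with 500 leaves the median at 3, which is the property of the median highlighted on the slides.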
  • 43. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (1 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Also important to find out how much your data values are spread out • Variability – Numerical value expressing how spread out or how much variation is present in the values of a quantitative variable • If all of the data values for a variable are the same, then there is no variability. – Example: 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 • There is variability in these numbers – 1, 2, 3, 3, 4, 4, 4, 6, 8, 10 • The more different your numbers, the more variability you have.
  • 44. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (2 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Which of the following sets of data has the most variability present? – Group one: 44, 45, 45, 45, 46, 46, 47, 47, 48, 49 – Group two: 34, 37, 45, 51, 58, 60, 77, 88, 90, 98 – The data for group two have more variability than the data for group one. • Homogeneous - little variability in scores in a group • Heterogeneous - a lot of variability in scores in a group • Three of the types of variability – Range – Variance – Standard deviation
  • 45. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (3 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Range – The highest number minus the lowest number – The simplest measure of variability, but also the most crude – Formula ▪ Range = H – L – H is the highest number – L is the lowest number – Example ▪ Data for group one shown in the previous section – Range is equal to 5 (49 − 44) ▪ Range for group two – Range is 64 (98 − 34) – Crude index of variability because it takes into account only two numbers
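As a quick check of the range formula (Range = H − L), the following Python sketch recomputes the two ranges from the slide. The helper name `value_range` is a hypothetical choice for illustration.

```python
# Group data quoted on the slides above.
group_one = [44, 45, 45, 45, 46, 46, 47, 47, 48, 49]
group_two = [34, 37, 45, 51, 58, 60, 77, 88, 90, 98]


def value_range(values):
    """Return the highest number minus the lowest number (H - L)."""
    return max(values) - min(values)


range_one = value_range(group_one)
range_two = value_range(group_two)
print(range_one, range_two)
```

Only the two extreme values enter the calculation, which is why the slides call the range a crude index of variability.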
  • 46. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (4 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Variance and Standard Deviation – Two most popular measures of variability – Superior to the range because they take into account all of the data values for a variable – Both provide information about the dispersion or variation around the mean value of a variable – Variance - the average deviation of data values from their mean in squared units ▪ Is popular because it has nice mathematical properties
  • 47. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (5 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Variance and Standard Deviation – Standard deviation - the square root of the variance ▪ Turns the variance into more meaningful units ▪ An approximate indicator of the average distance that your data values are from their mean ▪ Example – if you have a mean of 5 – a standard deviation of 2 – data values tend to be approximately 2 units above or below 5 – For the variance and the standard deviation ▪ The larger the value, the greater the data are spread out ▪ The smaller the value, the less the data are spread out – How to calculate the variance and standard deviation in Table 15.4
  • 48. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Table 15.4 (1 of 3) Calculating the Variance and Standard Deviation
        |      | (1) \(X\)       | (2) \(\bar{X}\) | (3) \(X - \bar{X}\) | (4) \((X - \bar{X})^2\)      |
        |      | 2               | 6               | −4                  | 16                           |
        |      | 4               | 6               | −2                  | 4                            |
        |      | 6               | 6               | 0                   | 0                            |
        |      | 8               | 6               | 2                   | 4                            |
        |      | 10              | 6               | 4                   | 16                           |
        | Sums | \(\sum X = 30\) |                 | 0                   | \(\sum (X - \bar{X})^2 = 40\) |
  • 49. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Table 15.4 (2 of 3) Calculating the Variance and Standard Deviation Steps: 1. Insert your data values in the X column. 2. Calculate the mean of the values in column 1, and place this value in column 2. In our example, the mean is 6: \(\bar{X} = \frac{30}{5} = 6\). 3. Subtract the values in column 2 from the values in column 1, and place these into column 3. 4. Square the numbers in column 3 (i.e., multiply the number by itself), and place these in column 4. (Note: You can ignore the minus signs in column 3 because a negative number multiplied by a negative number produces a positive number.)
  • 50. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Table 15.4 (3 of 3) Calculating the Variance and Standard Deviation Steps: 5. Insert the appropriate values into the following formula for the variance: \(\text{Variance} = \frac{\sum (X - \bar{X})^2}{n}\), where \(\sum (X - \bar{X})^2\) is the sum of the numbers in column 4, and n is the number of numbers. In this example, the variance is \(\frac{\sum (X - \bar{X})^2}{n} = \frac{40}{5} = 8\). 6. The standard deviation is the square root of the variance. In this example, the variance is 8 (see step 5), and the standard deviation is 2.83 (i.e., the square root of 8 = 2.83).
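The six steps of Table 15.4 can be followed column by column in a minimal Python sketch. It divides by n, as the steps do; note that Python's `statistics.variance` divides by n − 1 instead, so `statistics.pvariance` is the library function that matches this table. Variable names are illustrative.

```python
import math

# Column 1: the data values from Table 15.4.
scores = [2, 4, 6, 8, 10]
n = len(scores)

# Column 2: the mean of the values in column 1.
mean = sum(scores) / n

# Column 3: each value minus the mean.
deviations = [x - mean for x in scores]

# Column 4: the squared deviations.
squared_deviations = [d ** 2 for d in deviations]

# Step 5: variance = sum of column 4 divided by n.
variance = sum(squared_deviations) / n

# Step 6: standard deviation = square root of the variance.
standard_deviation = math.sqrt(variance)

print(mean, sum(deviations), sum(squared_deviations))
print(variance, round(standard_deviation, 2))
```

The deviations in column 3 sum to zero, which is why they have to be squared before they are averaged.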
  • 51. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (6 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Standard Deviation and the Normal Curve – If the data were fully normally distributed, the standard deviation would have additional meaning. – Examine the standard normal distribution in Figure 15.8 ▪ The normal curve or normal distribution has a bell shape. ▪ It is high in the middle and it tapers off to the left and the right.
  • 52. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Figure 15.8 Areas Under the Normal Distribution Z scores −3 −2 −1 0 1 2 3 Percentile ranks 0.1 2 16 50 84 98 99.9
  • 53. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (7 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Standard Deviation and the Normal Curve – If the data were fully normally distributed ▪ You would be able to apply the 68, 95, 99.7 percent rule – 68% of the cases fall within one standard deviation from the mean. – 95% fall within two standard deviations. – 99.7% fall within three standard deviations. – It is important to understand sample data are never fully normally distributed. – Can be called the theoretical normal distribution. – The normal distribution also has many applications in more advanced statistics courses.
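The 68, 95, 99.7 percent rule can be verified against the theoretical normal distribution. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Theoretical standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of cases falling within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The rounded percentages are approximately 68, 95, and 99.7. As the slide notes, these figures describe the theoretical curve; sample data will only approximate them.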
  • 54. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (8 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • z scores – A score that has been transformed into standard deviation units – Transformed from their original “raw scores” into a new “standardized” metric – Mean of zero and a standard deviation of one – Data values now can be interpreted in terms of how far they are from their mean – If a data value is +1.00, we can say that this value falls one standard deviation above the mean. – A value of +2.00 falls two standard deviations above the mean – A value of -1.5 falls one and a half standard deviations below the mean – “Standardized units” or “z scores” were used with the normal curve just shown in Figure 15.8.
  • 55. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (9 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Formula: \(z = \frac{X - \bar{X}}{SD}\) • To use this formula – Convert raw scores to z scores – Need to know the mean and standard deviation
  • 56. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (10 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Example – Set of scores - 2, 4, 6, 8, 10 – Mean = 6 – SD = 2.83 – Convert 10 to a z score: \(z = \frac{10 - 6}{2.83} = 1.413\) – Convert 2 to a z score: \(z = \frac{2 - 6}{2.83} = -1.413\)
  • 57. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Measures of Variability (11 of 11) 15.5 Calculate the Variance and Standard Deviation of a Data Set • Negative sign indicates that the number is below the mean. • All of the z scores for our set of five numbers – −1.413, −.707, 0, +.707, +1.413 – The average of these numbers is zero. • Key point – You can take any set of numbers. – Convert the numbers to z scores. – They will always have a mean of zero and a standard deviation of one. • Helps psychologists when – They want to compare scores across different variables and different data sets. – They want to know how far a data value falls above or below the mean.
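The key point above (any set of numbers converted to z scores has a mean of zero and a standard deviation of one) can be confirmed with a short Python sketch. It uses the unrounded standard deviation, so the extreme z scores come out as ±1.414 rather than the ±1.413 obtained on the slides with the rounded value 2.83. Variable names are illustrative.

```python
import statistics

scores = [2, 4, 6, 8, 10]

mean = statistics.mean(scores)
# Population standard deviation (divides by n), matching Table 15.4.
sd = statistics.pstdev(scores)

# z score: distance from the mean in standard deviation units.
z_scores = [(x - mean) / sd for x in scores]

print([round(z, 3) for z in z_scores])
print(round(statistics.mean(z_scores), 6), round(statistics.pstdev(z_scores), 6))
```

The same transformation applied to any other variable puts it on the same standardized metric, which is what makes scores comparable across variables and data sets.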
  • 58. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (1 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Rarely is a psychologist interested in a single variable. • Typically are interested in determining whether IVs and DVs are related • Use IVs to “explain variance” in DVs • Determining what IVs predict or cause changes in DVs is perhaps the primary goal of science. • Practitioners can apply this knowledge to produce changes in the world. – Use new psychotherapy techniques to reduce mental illness – To determine how to predict who is “at risk” for future problems so that early interventions can be started
  • 59. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (2 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Describe several approaches used to examine relationships among two or more variables. • Vast majority of the time – DV in psychological research is a quantitative variable. ▪ e.g., Response time, performance level, level of stress • Most of the indexes of relationship described here are used for quantitative DVs. • Will explain one exception in which you have a categorical DV and a categorical IV
  • 60. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (3 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Unstandardized and Standardized Difference Between Group Means – Example, in our college graduate data set ▪ Mean (i.e., the average) starting salary for males is $34,791.67 ▪ Mean starting salary for females is $31,269.23 ▪ Unstandardized difference between these two means – $34,791.67 − $31,269.23 = $3,522.44 ▪ “There appears to be a sizable relationship between gender and starting salary such that males have higher salaries than females” – The difference between the means is often transformed into a standardized measure.
  • 61. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (4 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Cohen’s d – The difference between two means in standard deviation units – One of many effect size indicators • Effect size indicator – Index of magnitude or strength of a relationship or difference between means
  • 62. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (5 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Cohen’s d formula: \(d = \frac{M_1 - M_2}{SD}\) – \(M_1\) is the mean for group 1 – \(M_2\) is the mean for group 2 – SD is the standard deviation of either group ▪ Traditionally it’s the control group’s standard deviation in an experiment. ▪ Some researchers prefer a pooled standard deviation. • Rough starting point for interpreting d – d = .2 as “small” – d = .5 as “medium” – d = .8 as “large”
  • 63. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (6 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Calculate Cohen’s d to compare the average male and female incomes – Gender is the categorical IV – Starting salary is the quantitative DV – Mean starting salary ▪ Males = $34,791.67 ▪ Females = $31,269.23 ▪ Unstandardized difference between the means is $3,522.44 ▪ Standard deviation for females = $4,008.40 – d = $3,522.44 / $4,008.40 = .88 – Mean starting salary for males is .88 standard deviations above the mean for females – By the criteria for interpretation, this is a “large” difference between the means
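The salary comparison above can be reproduced with a small Python sketch. The function name `cohens_d` is a hypothetical helper; following the slide, the standard deviation for females is used as the denominator.

```python
def cohens_d(mean_1, mean_2, sd):
    """Difference between two means expressed in standard deviation units."""
    return (mean_1 - mean_2) / sd


# Values quoted on the slide above.
male_mean = 34791.67
female_mean = 31269.23
female_sd = 4008.40

d_salary = cohens_d(male_mean, female_mean, female_sd)
print(round(d_salary, 2))
```

Choosing a pooled standard deviation instead, as some researchers prefer, would change the denominator and therefore the value of d.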
  • 64. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.2 (1 of 4) Using Cohen’s D in a Pretest–Posttest Control-Group Experimental Research Design • IV = treatment and control conditions • Purpose - treatment to improve the social skills of the participants • DV = number of appropriate interactions in a 1-hour observation session (pretest and posttest) • Figure 15.5 – Pretest and posttest means for the treatment and control groups – It appears that the treatment worked – After the intervention the social skills performance of the treatment group improved quite a bit more than that for the control group – At the pretest, the two groups’ means were similar. ▪ Suggesting that random assignment to the groups worked well
  • 65. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.2 (2 of 4) Using Cohen’s D in a Pretest–Posttest Control-Group Experimental Research Design • Calculate Cohen’s d for pretest and posttest means – Pretest mean for the treatment group (M1) = 2.71 – Pretest mean for the control group (M2) = 2.64 – Standard deviation (SD) for the control group = 1.39 – Pretest \(d = \frac{2.71 - 2.64}{1.39} = .05\) – Posttest mean for the treatment group (M1) = 4.00 – Posttest mean for the control group (M2) = 3.07 – Standard deviation (SD) for the control group = 1.27 – Posttest \(d = \frac{4.00 - 3.07}{1.27} = .73\)
  • 66. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.2 (3 of 4) Using Cohen’s D in a Pretest–Posttest Control-Group Experimental Research Design • Interpreting these data – The difference between the means was very small at the pretest ▪ Standardized mean difference (Cohen’s d) = .05 ▪ The treatment group mean was only .05 of a standard deviation larger than the control group mean – The posttest Cohen’s d was .73 ▪ Indicates the treatment group mean was .73 standard deviation units above the control group mean ▪ A moderately large difference
  • 67. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.2 (4 of 4) Using Cohen’s D in a Pretest–Posttest Control-Group Experimental Research Design • Although the results just presented appear to support the efficacy of the social skills training, we still cannot trust this experimental finding. • Problem - the observed differences between the means might represent nothing more than random or chance fluctuation in the data. • In the next chapter on inferential statistics, we will check to see if this difference is statistically significant.
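The pretest and posttest effect sizes reported in Exhibit 15.2 follow from the means and control-group standard deviations listed there. A minimal Python sketch, with a hypothetical helper name and the control group's standard deviation as the denominator:

```python
def cohens_d(treatment_mean, control_mean, control_sd):
    """Standardized mean difference using the control group's SD."""
    return (treatment_mean - control_mean) / control_sd


# Values quoted in Exhibit 15.2.
d_pretest = cohens_d(2.71, 2.64, 1.39)
d_posttest = cohens_d(4.00, 3.07, 1.27)

print(round(d_pretest, 2), round(d_posttest, 2))
```

These are descriptive effect sizes only. As the exhibit cautions, they do not show whether the posttest difference is larger than chance fluctuation would produce.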
  • 68. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (7 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Correlation Coefficient – When you have a quantitative DV and a quantitative IV, you need to obtain either ▪ A correlation coefficient ▪ Or a regression coefficient – Correlation coefficient - index indicating the strength and direction of linear relationship between two quantitative variables ▪ A numerical index ranging from −1.00 to +1.00 ▪ Absolute size of the number indicates the strength ▪ Sign (positive or negative) indicates the direction of relationship ▪ Endpoints, −1.00 and +1.00, stand for “perfect” correlations – Strongest possible correlations ▪ Zero indicates no correlation
  • 69. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Figure 15.9 Strength and Direction of a Correlation Coefficient
  • 70. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (8 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • “Which correlation is stronger, +.20 or +.70?” – The latter is stronger because +.70 is farther away from zero. • “Which of these correlations is stronger, +.20 or −.70?” – The latter because −.70 is farther away from zero. • “Which correlation is stronger, +.50 or −.70?” – The latter because −.70 is farther from zero. • When judging the relative strength of two correlation coefficients – Ignore the sign and determine which number is farther from zero.
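The rule on this slide (ignore the sign, compare distance from zero) amounts to comparing absolute values. A small Python sketch, where `stronger` is a hypothetical helper name:

```python
def stronger(r_a, r_b):
    """Return whichever correlation coefficient is farther from zero."""
    return r_a if abs(r_a) > abs(r_b) else r_b


# The three comparisons posed on the slide above.
print(stronger(0.20, 0.70))
print(stronger(0.20, -0.70))
print(stronger(0.50, -0.70))
```

The sign is kept in the returned value because it still carries the direction of the relationship; only the strength comparison ignores it.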
  • 71. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (9 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Negative correlation – Correlation in which values of two variables tend to move in opposite directions – Example ▪ The more hours students spend partying the night before an exam, the lower their test grades tend to be • Positive correlation – Correlation in which values of two variables tend to move in the same direction – Example ▪ The more hours students spend studying for a test, the higher their test grades tend to be.
  • 72. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (10 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Checkpoint questions – “Is the correlation between education and income positively or negatively correlated?” ▪ Positive because the two variables tend to move in the same direction – “Is the correlation between empathy and aggression positive or negative?” ▪ Negative because people with more empathy tend to be less aggressive
  • 73. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (11 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Scatterplots are used to visually determine the direction of correlations. • Figure 15.6 - scatterplot of college GPA and starting salary – As college GPA increases, starting salary also tends to increase. – The correlation coefficient is +.61. – A moderately strong positive correlation • Figure 15.7 - scatterplot of days missed during college and starting salary – As the number of days missed during college increases, starting salary tends to decrease – Correlation coefficient is −.81 – A strong negative correlation
  • 74. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Figure 15.10 Correlations of Different Strengths and Directions
  • 75. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (12 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Pearson correlation coefficient – Works only if your data are linearly related • Curvilinear relationship - a nonlinear (curved) relationship between two quantitative variables • If you calculate the Pearson correlation coefficient on a curved relationship, – It generally will tell you that your variables are not related. – When in fact they are related. – You would draw an incorrect conclusion about the relationship. Figure 15.11 A curvilinear relationship.
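To illustrate why the Pearson coefficient can miss a curvilinear relationship, the sketch below computes r for a perfectly U-shaped pattern. The data are hypothetical (Y is simply X squared) and the helper `pearson_r` is written out by hand rather than taken from any library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: Y depends completely on X, but along a curve.
x_values = [-3, -2, -1, 0, 1, 2, 3]
y_values = [x ** 2 for x in x_values]

r_curved = pearson_r(x_values, y_values)
print(r_curved)
```

Here Y is fully determined by X, yet r is zero, so a scatterplot is worth inspecting before relying on the coefficient.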
  • 76. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (1 of 7) How to Calculate the Pearson Correlation Coefficient • Earlier we showed how to obtain z scores. • A z score tells you how far a data value is from the mean of its variable – Example ▪ A z score of +2.00 says that the score is two SDs above the mean ▪ A z score of −2.00 says the score is two SDs below the mean • To use the following formula for calculating the correlation coefficient: \(r = \frac{\sum Z_X Z_Y}{n}\) – First convert your IV (X) and DV (Y) data values to z scores ▪ \(Z_X\) = z score of the value of the X or IV ▪ \(Z_Y\) = z score of the value of the Y or DV ▪ n = number of cases
  • 77. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (2 of 7) How to Calculate the Pearson Correlation Coefficient • Positive relationship – Some cases have low X and low Y values. – Some have high X and high Y values. – Pattern provides a positive value for the numerator of the formula. (a) Positive correlation
  • 78. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (3 of 7) How to Calculate the Pearson Correlation Coefficient • Negative relationship – Some cases have low X and high Y values. – Some have high X and low Y values. – Pattern provides a negative value for the numerator of the formula. (b) Negative correlation
  • 79. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (4 of 7) How to Calculate the Pearson Correlation Coefficient • Researchers do not calculate correlation coefficients by hand these days. • It is helpful to calculate the correlation coefficient once to get a better feel for how the numerical value is produced. • Table showing how to calculate the correlation between two variables • At the end of the chapter, we list a practice exercise where you can apply this procedure to obtain your own correlation coefficient.
  • 80. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (5 of 7) How to Calculate the Pearson Correlation Coefficient Step 1. Convert the X and Y variable scores to z scores. We already obtained the z scores for the X variable when we introduced the concept of z scores. Here are those z scores: −1.413, −.707, 0, +.707, +1.413. Using that same procedure, here are the z scores for variable Y: −1.750, −.343, .453, .453, 1.187. Step 2. Calculate the sum of the cross products of the z scores (i.e., \(\sum Z_X Z_Y\)). A three-column procedure (\(Z_X\), \(Z_Y\), and their product \(Z_X Z_Y\)) works well for this step. Step 3. Divide the sum of the third column (i.e., \(\sum Z_X Z_Y\)) by the number of cases (i.e., n).
  • 81. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (6 of 7) How to Calculate the Pearson Correlation Coefficient \(r = \frac{\sum Z_X Z_Y}{n} = \frac{4.713}{5} = .943\)
  • 82. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Exhibit 15.3 (7 of 7) How to Calculate the Pearson Correlation Coefficient • Correlation between hours spent studying (X) and test grades (Y) is +.943 • The two variables are very strongly correlated. • As the number of hours spent studying increases, so do test grades.
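The three steps of Exhibit 15.3 can be traced in a few lines of Python. The sketch starts from the z scores listed in Step 1 (the raw Y scores are not given on the slides), forms the cross products, and divides their sum by n. Variable names are illustrative.

```python
# Step 1: z scores as listed in Exhibit 15.3.
z_x = [-1.413, -0.707, 0.0, 0.707, 1.413]   # hours spent studying (X)
z_y = [-1.750, -0.343, 0.453, 0.453, 1.187] # test grades (Y)

# Step 2: third column, the cross product of each pair of z scores.
cross_products = [zx * zy for zx, zy in zip(z_x, z_y)]
sum_cross_products = sum(cross_products)

# Step 3: divide the sum of the cross products by the number of cases.
n = len(z_x)
r = sum_cross_products / n

for zx, zy, product in zip(z_x, z_y, cross_products):
    print(f"{zx:7.3f} {zy:7.3f} {product:7.3f}")
print(round(sum_cross_products, 3), round(r, 3))
```

Every pair here is either low on both variables or high on both, so each cross product is zero or positive, which is the pattern the exhibit describes for a positive correlation.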
  • 83. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (13 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Partial Correlation Coefficient – The correlation between two quantitative variables controlling for one or more variables – Widely used in areas of psychology where the use of experiments for some research questions is difficult ▪ e.g., Personality, social, and developmental psychology • Good, strong theory is required to use partial correlation analysis. – The researcher must know the variable(s) that he or she needs to control for.
  • 84. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (14 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Partial correlation example – In applied social psychology ▪ The relationship between – The number of hours spent viewing or playing violence – The number of aggressive acts performed ▪ Want to control for variables such as – Personality type – School grades – Exposure to violence in the family – Exposure to violence in the neighborhood ▪ Built on Bandura, Ross, & Ross (1963) – Classic experimental research showing that children act aggressively after being exposed to an adult model acting aggressively
  • 85. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (15 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Value of the partial correlation coefficient indicates – The strength and direction of relationship between two variables – After controlling for the influence of one or more other variables • Just like with the Pearson correlation coefficient – Partial correlation coefficient has a range of −1.00 to +1.00 – Zero indicates there is no relationship – Sign indicates the direction of the relationship – Key difference ▪ Partial correlation coefficient indicates the linear relationship between two variables ▪ After controlling for another variable
  • 86. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (16 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Researchers use statistical programs to calculate partial correlation coefficients. • If you are curious how to calculate the partial correlation coefficient (or the regression coefficients discussed in the next section), we recommend – Cohen, Cohen, West, and Aiken (2003) – Keith (2019) • Called “partial” correlation coefficient because the technique statistically removes or “partials” out the influence of the other variables statistically controlled for
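The slides leave the computation to statistical programs. For the special case of a single control variable, a commonly taught first-order formula expresses the partial correlation through the three pairwise Pearson correlations. The sketch below assumes that formula; the function name and the three input correlations are hypothetical illustration values, not figures from the college graduate data set.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z.

    Assumed formula: (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical pairwise correlations, chosen only for illustration.
r_partial = partial_correlation(r_xy=0.50, r_xz=0.40, r_yz=0.30)
print(round(r_partial, 3))

# If the control variable is unrelated to both X and Y, nothing is removed.
print(partial_correlation(r_xy=0.50, r_xz=0.0, r_yz=0.0))
```

With more than one control variable, the calculation is usually handled by software, as the slide recommends.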
  • 87. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (17 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Regression Analysis – When all variables are quantitative, the technique called regression analysis is often appropriate. – Regression analysis - use of one or more quantitative IVs to explain or predict the values of a single quantitative DV – Two main types of regression analysis ▪ Simple regression - regression analysis with one DV and one IV ▪ Multiple regression - regression analysis with one DV and two or more IVs – Regression equation - the equation that defines a regression line – Regression line - the line of “best fit” based on a regression equation – Regression analysis can be used with curvilinear data. – We only discuss linear relationships in this text.
  • 88. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Figure 15.12 Regression Line Showing the Relationship Between GPA and Starting Salary
  • 89. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (18 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Figure 15.12 - scatterplot of college GPA and starting salary with the regression line inserted • Two important characteristics of a line – Slope - tells you how steep the line is – Y-intercept - the point at which a regression line crosses the Y (vertical) axis • Regression equation: \(\hat{Y} = b_0 + b_1 X_1\) – \(\hat{Y}\) (called Y-hat) is the predicted value of the DV – \(b_0\) is the Y-intercept – \(b_1\) is the slope (it’s called the regression coefficient) – \(X_1\) is the single IV
  • 90. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (19 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Regression equation for the regression line shown in Figure 15.12: \(\hat{Y} = \$9{,}405.55 + \$7{,}687.48\,X_1\) – DV (Y) is starting salary – IV (\(X_1\)) is GPA • Researchers rarely, if ever, calculate the regression equation by hand! • Y-intercept is $9,405.55; this is the predicted starting salary if a person had a GPA of 0.
  • 91. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (20 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Regression coefficient - the slope or change in Y given a one unit change in X • Regression coefficient or slope in our example is $7,687.48. – Starting salary is expected to increase by $7,687.48 for every one unit increase in GPA. – Or decrease by $7,687.48 for every one unit decrease in GPA • Example – A student with a 3 on the GPA variable (i.e., a B) is predicted to start at a salary of $7,687 more than a student with a 2 (i.e., a C). – Used the traditional grading scale (A = 4, B = 3, C = 2, D = 1, F = 0)
  • 92. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Examining Relationships Among Variables (21 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Regression equation can be used to obtain predicted values for the DV for specific values of the IV. – Example ▪ Let’s see what the predicted starting salary is for a student with a college GPA of 3 (i.e., a B average) – Ŷ = $9,405.55 + $7,687.48(3.00) (we inserted the GPA value of 3.00) – Ŷ = $9,405.55 + $23,062.44 (we multiplied $7,687.48 by 3.00) – Ŷ = $32,467.99 (we added $9,405.55 and $23,062.44) – Expected starting salary is $32,467.99 ▪ Someone with a C average (i.e., a GPA value of 2) – Insert a 2 into the equation and solve it – Predicted starting salary is $24,780.51 ▪ Notice that the difference between the starting salary for someone with a C and a B is equal to the value of the regression coefficient. ▪ $32,467.99 − $24,780.51 = $7,687.48
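The predictions worked out above can be reproduced with a short Python sketch built from the Y-intercept ($9,405.55) and slope ($7,687.48) reported on the slides. The constant and function names are hypothetical choices for illustration.

```python
# Coefficients reported on the slides for the line in Figure 15.12.
Y_INTERCEPT = 9405.55   # predicted starting salary at a GPA of 0
SLOPE = 7687.48         # change in predicted salary per one-unit change in GPA


def predict_salary(gpa):
    """Predicted starting salary from the simple regression equation."""
    return Y_INTERCEPT + SLOPE * gpa


salary_b = predict_salary(3.00)   # B average
salary_c = predict_salary(2.00)   # C average

print(round(salary_b, 2), round(salary_c, 2))
print(round(salary_b - salary_c, 2))
```

The difference between the two predictions equals the slope, because the two GPA values are exactly one unit apart.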
  • 93. Examining Relationships Among Variables (22 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Multiple regression – Multiple regression equation includes one regression coefficient for each IV. • Useful difference between the simple and multiple regression – Multiple regression coefficient shows the relationship between the DV and the IV controlling for the other IVs in the equation. – Analogous to the idea discussed earlier with partial correlation – Multiple regression coefficient is called the partial regression coefficient.
  • 94. Examining Relationships Among Variables (23 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Useful difference between the simple and multiple regression – Simple regression is analogous to a Pearson correlation, which does not control for any confounding variables – Multiple regression provides one way that you can control for one or more variables. ▪ The difference in the actual values of the correlation and regular (unstandardized) regression coefficients – Correlation coefficients are in standardized units that vary from −1.00 to +1.00 – Regular regression coefficients are in natural units
  • 95. Examining Relationships Among Variables (24 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Multiple Regression example – The partial correlation coefficient expressing the relationship between starting salary and GPA controlling for SAT scores is .413 – The partial regression coefficient is $4,788.90 ▪ Controlling for SAT scores, each unit change in GPA is predicted to lead to a $4,788.90 change in starting salary. – Using the data from our hypothetical college student data set, we used SPSS to provide the following multiple regression equation: Ŷ = −$12,435.59 + $4,788.90(X₁) + $25.56(X₂) – Based on the DV of starting salary and the IVs of GPA and high school SAT – X₁ = GPA – X₂ = high school SAT
  • 96. Examining Relationships Among Variables (25 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • First partial regression coefficient in the preceding regression equation is $4,788.90 – After controlling for SAT scores, starting salary increases by $4,788.90 for each one-unit increase in GPA • Second partial regression coefficient is $25.56. – After controlling for GPA, starting salary increases by $25.56 for each one-unit increase in SAT
  • 97. Examining Relationships Among Variables (26 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Obtain a predicted starting salary using our multiple regression equation by inserting the values for GPA and SAT and solving for Y-hat. – B student – 1100 on SAT – Ŷ = −$12,435.59 + $4,788.90(3) + $25.56(1100) (we inserted a 3 for GPA and 1100 for SAT) – Ŷ = −$12,435.59 + $14,366.70 + $25.56(1100) (we multiplied $4,788.90 by 3) – Ŷ = −$12,435.59 + $14,366.70 + $28,116.00 (we multiplied $25.56 by 1100) – Ŷ = $30,047.11 (we added the two positive numbers and subtracted the negative number) – Predicted starting salary is $30,047.11.
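The multiple regression prediction can be reproduced the same way. In this sketch the intercept is taken as negative, which is what the final step ("subtracted the negative number") and the result of $30,047.11 imply; the names used are illustrative assumptions.

```python
# Coefficients for the multiple regression of starting salary on
# GPA (X1) and high school SAT (X2), as quoted on the slides.
INTERCEPT = -12435.59  # negative intercept, implied by the worked result
B_GPA = 4788.90        # partial regression coefficient for GPA
B_SAT = 25.56          # partial regression coefficient for SAT


def predict_salary(gpa: float, sat: float) -> float:
    """Return the predicted starting salary (Y-hat), rounded to cents."""
    return round(INTERCEPT + B_GPA * gpa + B_SAT * sat, 2)


predicted = predict_salary(gpa=3, sat=1100)  # B student with an SAT of 1100
print(f"Predicted starting salary: ${predicted:,.2f}")
```

Each coefficient is applied to its own predictor while the other is held at the value supplied, which is the sense in which a partial regression coefficient "controls for" the other IV.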
  • 98. Examining Relationships Among Variables (27 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Contingency Tables – A categorical DV and a categorical IV – Construct a contingency table (also called cross-tabulation) – Contingency table - table used to examine the relationship between categorical variables – Two-dimension contingency table ▪ Two variables ▪ Rows represent the categories of one of the variables. ▪ Columns represent the categories of the other variable. – Various types of information can be placed into the cells of a contingency table. ▪ Cell frequencies ▪ Cell percentages ▪ Row percentages ▪ Column percentages
  • 99. Table 15.5 Personality Type by Gender Contingency Tables
    (a) Contingency Table Showing Cell Frequencies (hypothetical data)
        Personality Type    Female    Male
        Type A              2,972     2,460
        Type B              1,921       971
        Total               4,893     3,431
    (b) Contingency Table Showing Column Percentages (based on the data in part (a))
        Personality Type    Female    Male
        Type A              60.7%     71.7%
        Type B              39.3%     28.3%
        Total               100%      100%
  • 100. Examining Relationships Among Variables (28 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Contingency Tables – Column variable is gender (i.e., female or male) – Row variable is personality type ▪ Type-A personality – Likely to be impatient, competitive, irritable, high achieving, engage in multitasking, and feel a sense of urgency ▪ Type-B personality – Likely to be cooperative, less competitive, more relaxed, more patient, more satisfied, and easygoing – Research question ▪ Whether there is a relationship between gender and personality types – Does gender seem to predict personality type? – Do you think that women tend to be type A more than men tend to be type A? – Very difficult to determine how the two variables are related based on cell frequencies alone
  • 101. Examining Relationships Among Variables (29 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Table 15.5(b) – Calculated what are called column percentages for females and males ▪ Type-A personality column percentages – 60.7% of females were type A. – 71.7% of the men were type A. – Men had a greater rate of type-A personality than women. ▪ Type-B personality column percentages – 39.3% for females – 28.3% for men – Women have a higher rate of type-B personality than men.
  • 102. Examining Relationships Among Variables (30 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • We recommend that you – Make your predictor variable (IV) the column variable and your DV the row variable – Calculate column percentages and compare the rates across the rows • In order to correctly read a contingency table, you need to remember these two simple rules. – If the percentages are calculated down the columns, then compare across the rows. – If the percentages are calculated across the rows, then compare down the columns.
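As a rough illustration of the recommendation above, the column percentages in Table 15.5(b) can be derived from the cell frequencies in Table 15.5(a). The data structure and function name below are illustrative assumptions; only the counts come from the table.

```python
# Cell frequencies from Table 15.5(a), keyed by the column variable (gender),
# then by the row variable (personality type).
counts = {
    "Female": {"Type A": 2972, "Type B": 1921},
    "Male": {"Type A": 2460, "Type B": 971},
}


def column_percentages(table):
    """Convert each column's cell frequencies to percentages of that column's total."""
    result = {}
    for column, cells in table.items():
        total = sum(cells.values())
        result[column] = {
            row: round(100 * n / total, 1) for row, n in cells.items()
        }
    return result


percentages = column_percentages(counts)

# Percentages were calculated down the columns, so compare across the rows.
for row in ("Type A", "Type B"):
    female = percentages["Female"][row]
    male = percentages["Male"][row]
    print(f"{row}: Female {female}% vs. Male {male}%")
```

Because each column is divided by its own total, the two groups can be compared directly even though the sample contains more women than men.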
  • 103. Examining Relationships Among Variables (31 of 31) 15.6 Summarize the Techniques Used to Determine Relationships Among Variables • Rates are frequently reported in – The news – Some types of research (e.g., epidemiology) • More advanced research – Add another (a third) IV – Construct the two-way table.
  • 104. Copyright © 2020, 2014, 2011 Pearson Education, Inc. All Rights Reserved Copyright This work is protected by United States copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the World Wide Web) will destroy the integrity of the work and is not permitted. The work and materials from it should never be made available to students except by instructors using the accompanying text in their classes. All recipients of this work are expected to abide by these restrictions and to honor the intended pedagogical purposes and the needs of other instructors who rely on these materials.

Editor's Notes

  • #2: If this PowerPoint presentation contains mathematical equations, you may need to check that your computer has the following installed: 1) Math Type Plugin 2) Math Player (free versions available) 3) NVDA Reader (free versions available)
  • #3: Long Description: The details are as follows: • Frequency Distributions • Graphic Representations 1. Bar Graphs 2. Histograms 3. Line Graphs 4. Scatterplots • Central Tendency 1. Mode 2. Median 3. Mean • Variability 1. Range 2. Variance 3. Standard Deviation 4. Z-scores • Relations Among Variables 1. Difference Between Means 2. Correlation Coefficient 3. Partial Correlation Coefficient 4. Regression Analysis 5. Contingency Tables
  • #5: Long Description: Classification is as follows: Statistics: Descriptive statistics and Inferential statistics. Inferential statistics: Estimation and Hypothesis testing. Estimation: Point estimation and interval estimation.
  • #23: Long Description: The chart shows the undergraduate majors on the x-axis and “Frequency” on the y-axis. The chart shows vertical bars against each undergraduate major. The heights of the bars differ and they represent the frequencies. The chart shows gaps between the bars. The frequency for each major is as given below. Psychology: 8 Philosophy: 10 Business: 7
  • #26: Long Description: The histogram shows the starting salary on the x-axis. Its values range from 20,000.00 to 45,000.00 with an increment of 5000. The y-axis shows the frequency. Its values range from 0 to 8, in increments of 2. The chart shows vertical bars for various starting salaries. The details are as follows (approximate data): 25,000: 1 28,000: 2 33,000: more than 6 37,000: 2 39,000: 3 More than 40,000: 2.
  • #27: Long Description: The x-axis of the chart shows the starting salary. Its values range from 24,000 to 40,500 with random increments. The y-axis shows the frequency ranging from 0 to 3 with an increment of 1. The chart shows a line. It keeps fluctuating, going up and down at times and staying flat at times. The details are as follows (approximate data): 24000: 1 27500: 1 30500: 2 31500: 1 32500: 3 33500: 2 35500: 1 37500: 1 3900: 2 40500: 1
  • #31: Long Description: It shows “Time of measurement” on the x-axis. It shows the values “Pretest” and “Posttest.” The y-axis shows the mean number of appropriate interactions. Its values range from 2.00 to 4.00, in increments of 0.50. The chart shows 2 lines. One line represents “No skills training (control)” and the other line represents “Skills training (treatment).” The chart shows two plots for each line. The approximate mean number of appropriate interactions is as given below. No skills training Pretest: 2.60 Posttest: 3.10 Skills training Pretest: 2.70 Posttest: 4.00
  • #34: Long Description: The chart shows “College G P A” on the x-axis. The values shown on the x-axis are from 2.50 to 3.75. The y-axis shows the starting salary. Its values range from 20,000 to 45,000, in increments of 5,000. The chart shows many plots mapping the various GPA scores with various starting salaries. There is no pattern in the spread of plots on the chart.
  • #36: Long Description: It shows “Days missed” on the x-axis. Its values range from 0 to 40, in increments of 10. The y-axis shows the starting salary. Its values range from 20,000 to 45,000, in increments of 5,000. The chart shows many plots mapping the number of days missed with various starting salaries. The plots show a downward trend.
  • #53: Long Description: The chart shows a bell-shaped curve over a horizontal line. The line shows “Mean” in the center. To its right are the values “1 S D”, “2 S D”, and “3 S D.” They are equally spaced. Similarly, to the left of the mean are “negative 1 S D”, “negative 2 S D”, and “negative 3 S D.” The chart shows vertical lines at each of the values on the horizontal line and the percentage of the area of the curve between the lines as given below. To the left of negative 3 S D: 0.13 percent Between negative 3 S D and negative 2 S D: 2.15 percent Between negative 2 S D and negative 1 S D: 13.59 percent Between negative 1 S D and Mean: 34.13 percent Between Mean and 1 S D: 34.13 percent Between 1 S D and 2 S D: 13.59 percent Between 2 S D and 3 S D: 2.15 percent To the right of 3 S D: 0.13 percent The chart also shows the area under the curve between the negative and positive values of the same standard deviation as given below. Between negative 1 S D and 1 S D: 68.26 percent Between negative 2 S D and 2 S D: 95.44 percent Between negative 3 S D and 3 S D: 99.74 percent.
  • #70: Long Description: The chart shows a horizontal number line from negative 1.0 to 1.0, in increments of 0.1. The values from negative 1.0 to 0 are labeled “Negative correlation.” The value 0 is labeled “Zero correlation” and the values between 0 and 1 are labeled “Positive correlation.” The chart also shows 2 pairs of arrows to indicate the strength of correlation. Stronger An arrow pointing from 0 to negative 1.0 An arrow pointing from 0 to 1.0 Weaker An arrow pointing from negative 1.0 to 0 An arrow pointing 1.0 towards 0
  • #75: Long Description: The chart shows 7 different types of correlations. Each correlation is represented by the first quadrant of a graph and plots that map the values in both the axes. The names of the correlations, their coefficients, and the patterns are as given below. No correlation r equals 0 The plots are random and a circle surrounds the plots. Perfect positive correlation r equals 1.00 All plots form a straight line sloping upward. Strong positive correlation r equals 0.75 The plots show a rising trend and an ellipse surrounds the plots. Weak positive correlation r equals 0.30 The plots show a rising trend. An ellipse surrounds the plots. Its middle is wider than in the earlier case. Perfect negative correlation r equals negative 1.00 All plots form a straight line sloping downward. Strong negative correlation r equals negative 0.75 The plots show a declining trend and an ellipse surrounds the plots. Weak negative correlation r equals negative 0.30 The plots show a declining trend. An ellipse surrounds the plots. Its middle is wider than in the earlier case.
  • #78: Long Description: It shows a horizontal and a vertical axis intersecting at their centers. The two axes form 4 quadrants. The left end of the horizontal axis shows the letter “Y bar”. The lower end of the vertical axis shows the letter “X bar”. The top-right quadrant shows the text “High X; High Y.” The quadrant has plots that show a rising trend from the bottom left corner to the top right corner. The bottom-left quadrant shows the text “Low X; Low Y.” The quadrant has plots that show a rising trend from the bottom left corner to the top right corner.
  • #79: Long Description: The chart shows a horizontal and a vertical axis intersecting at their centers. The two axes form 4 quadrants. The left end of the horizontal axis shows the letter “Y bar”. The lower end of the vertical axis shows the letter “X bar”. The top-left quadrant shows the text “Low X; High Y.” The quadrant has plots that show a declining trend from the top left corner towards the bottom right corner. The bottom-right quadrant shows the text “High X; Low Y.” The quadrant has plots that show a declining trend from the top left corner to the bottom right corner.
  • #82: Long Description: A table calculates cross products of z scores using the following formula. z scores for variable X times z scores for variable Y equals cross products of Z scores, or Z sub X times Z sub Y = Z sub X Z sub Y. Five calculations are as follows. 1. negative 1.413 times negative 1.750 = 2.473. 2. negative 0.707 times negative 0.343 = 0.243. 3. 0 times 0.453 = 0. 4. 0.707 times 0.453 = 0.320. 5. 1.413 times 1.187 = 1.677. sigma sum Z sub X Z sub Y = 4.713. This is the sum you need for the formula.
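The cross-product sum described in this note can be recomputed with a few lines of Python. The z scores are the ones listed in the note, with the fifth z score for X taken as 1.413 (the value consistent with the stated product of 1.677). The last step assumes that "the formula" is r = ΣZxZy / N, which this section does not show; treat that line as an assumption rather than the deck's stated method.

```python
# z scores for X and Y as listed in the long description of the table.
z_x = [-1.413, -0.707, 0.0, 0.707, 1.413]
z_y = [-1.750, -0.343, 0.453, 0.453, 1.187]

# Cross product for each case: Z_X times Z_Y.
cross_products = [zx * zy for zx, zy in zip(z_x, z_y)]
total = sum(cross_products)

for zx, zy, product in zip(z_x, z_y, cross_products):
    print(f"{zx:7.3f} x {zy:7.3f} = {product:6.3f}")
print(f"Sum of cross products: {total:.3f}")

# Assumption: correlation computed as the mean cross product of z scores.
r = total / len(z_x)
print(f"r (assuming r = sum / N): {r:.3f}")
```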
  • #89: Long Description: The chart shows “College G P A” on the x-axis (ranging from 0.00 to 4.00 with an increment of 1) and “Starting salary” on the y-axis (ranging from 0 to 50000 with an increment of 10000). The plots on the chart show a rising trend with a strong correlation and are concentrated around the point (3.00, 30,000). The chart shows a line sloping upward. It starts from a point on the y-axis above the origin and passes through the plots.
  • #93: Long Description: The details are as follows: Y cap equals 9,405.55 dollars plus 7,687.48 dollars left parenthesis 3.00 right parenthesis. We inserted the GPA value of 3.00. Y cap equals 9,405.55 dollars plus 23,062.44 dollars. We multiplied 7,687.48 dollars by 3.00. Y cap equals 32,467.99 dollars. We added 9,405.55 dollars and 23,062.44 dollars.
  • #98: Long Description: The details are as follows: Y cap equals negative 12,435.59 dollars plus 4,788.90 dollars left parenthesis 3 right parenthesis plus 25.56 dollars left parenthesis 1100 right parenthesis. We inserted a 3 for GPA and 1100 for SAT. Y cap equals negative 12,435.59 dollars plus 14,366.70 dollars plus 25.56 dollars left parenthesis 1100 right parenthesis. We multiplied plus 4,788.90 times 3. Y cap equals negative 12,435.59 dollars plus 14,366.70 dollars plus 28,116.00. We multiplied plus 25.56 times 1100. Y cap equals 30,047.11 dollars. We added two positive numbers and subtracted the negative number.