Basic-Statistics in Research Design Presentation

BASIC STATISTICS ON
RESEARCH DESIGN
GRACE P. PRINCIPE

TYPES OF DATA IN STATISTICS
1. Continuous Data
2. Discrete Data
3. Nominal Data
4. Interval Data
5. Categorical Data

CONTINUOUS DATA
are data which come from an interval of possible outcomes.
Examples of continuous data include:
• the amount of rain, in inches, that falls in a randomly
selected storm
• the weight, in pounds, of a randomly selected student
• the square footage of a randomly selected three-bedroom
house

DISCRETE DATA
data with a finite or countably infinite number of possible
outcomes.
Examples of discrete data include:
• the number of siblings a randomly selected person has
• the total on the faces of a pair of six-sided dice
• the number of students you need to ask before you find
one who loves

NOMINAL DATA
values are grouped into categories that have no meaningful
order. For example, gender and political affiliation are
nominal level variables. Members in the group are assigned a
label in that group and there is no hierarchy. Typical
descriptive statistics associated with nominal data are
frequencies and percentages.

INTERVAL DATA
is a type of data which is measured along a scale, in which
each point is placed at an equal distance (interval) from one
another. Interval data is one of the two types of numerical data
(the other is ratio data).
An example of interval data is the data collected on a
thermometer—its gradation or markings are equidistant.

CATEGORICAL DATA
Categorical variables represent types of data which may be
divided into groups. Examples of categorical variables are
race, sex, age group, and educational level. While the latter
two variables may also be considered in a numerical manner
by using exact values for age and highest grade completed, it
is often more informative to categorize such variables into a
relatively small number of groups.
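
As a minimal sketch of how an exact age can be turned into a categorical variable, the Python snippet below bins ages into a small number of groups. The function name `age_group`, the cut-off values, and the sample ages are assumptions made for illustration; they do not come from the presentation.

```python
def age_group(age):
    """Map an exact age in years to a hypothetical age-group label."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 18:
        return "under 18"
    if age < 40:
        return "18-39"
    if age < 60:
        return "40-59"
    return "60 and over"


# Made-up ages, used only to show the conversion from numeric to categorical.
ages = [15, 22, 37, 41, 58, 63, 70]
groups = [age_group(a) for a in ages]
print(groups)
```

The choice of boundaries depends on the research question; fewer, wider groups are easier to summarize but hide more detail.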

STATISTICS
the science concerned with developing and studying methods
for collecting, analyzing, interpreting and presenting empirical
data. Statistics is a highly interdisciplinary field; research in
statistics finds applicability in virtually all scientific fields and
research questions in the various scientific fields motivate the
development of new statistical methods and theory.

STATISTICS AND ITS TYPES
Statistics is a collection of methods for
planning experiments, obtaining data,
analyzing, interpreting, and drawing
conclusions based on the data (Alferes &
Duro, 2010). It is divided into two main areas:
Descriptive and Inferential.

DESCRIPTIVE STATISTICS
• summarizes or describes the essential characteristics of a
known set of data.
• uses brief descriptive coefficients that summarize a given
data set, which can represent either an entire population
or a sample of it.
• For example, the Department of Health conducts a tally to
determine the number of CoViD-19 cases per day in the
Philippines.

INFERENTIAL STATISTICS
• uses sample data to make inferences about a population. It
consists of generalizing from samples to populations,
performing hypothesis testing, determining relationships
among variables, and making predictions.
• For example, suppose you want to find out whether Filipinos
are willing to get the CoViD-19 vaccine. In such a case, a
smaller sample of the population is surveyed. The results
are analyzed, and the conclusions are extended to the whole population.
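
One way to picture generalizing from a sample to a population is to estimate a proportion together with a margin of error. The sketch below uses the normal-approximation 95% interval, which is a technique chosen here for illustration and not one named in the presentation. The survey counts (240 "yes" answers out of 400 respondents) and the function name `proportion_ci` are hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Return (p_hat, lower, upper) using the normal approximation.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical survey: 240 of 400 sampled people say they would get the vaccine.
p_hat, low, high = proportion_ci(240, 400)
print(f"sample proportion = {p_hat:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The interval is only as trustworthy as the sampling: it assumes the respondents were selected at random from the population of interest.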

TOOLS IN DESCRIPTIVE
STATISTICS
Frequency Distribution is a collection of
observations produced by sorting them into
classes and showing their frequency or
numbers of occurrences in each class. For
example, twenty-five students were given a
blood test to determine their blood types.

From the given data, here
is how to organize them
using a frequency
distribution.
Data set of the blood types
of twenty-five students.
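
The tally behind a frequency distribution can be sketched in a few lines of Python. The original data table is not reproduced in this text, so the 25 blood types below are hypothetical values invented for illustration.

```python
from collections import Counter

# Hypothetical blood types for 25 students (not the data from the slide).
blood_types = [
    "A", "B", "B", "AB", "O",
    "O", "O", "B", "AB", "B",
    "B", "B", "O", "A", "O",
    "A", "O", "O", "O", "AB",
    "AB", "A", "O", "B", "A",
]

# Count how many observations fall into each class.
frequency = Counter(blood_types)
total = sum(frequency.values())

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for blood_type in ["A", "B", "AB", "O"]:
    count = frequency[blood_type]
    print(f"{blood_type:<6}{count:>10}{count / total:>9.0%}")
```

Each row of the printed table is one class with its frequency and its share of the total, which is the same layout a hand-made tally sheet would produce.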

MEASURES OF CENTRAL TENDENCY OR
POSITION OR AVERAGE
When scores and other measures have been tabulated into a
frequency distribution, the next task is to calculate a measure
of central tendency or central position.
This measure of central tendency is synonymous with the
word “average”. An average is a typical value that tends to
describe the set of data.

MEAN
Mean, or simply the average, is the most frequently used
measure and can be described as the arithmetic average of all
scores or groups of scores in a distribution. It is
computed by adding all the scores or data and then dividing
by the total number of cases.

MEDIAN
Median is the middle-most value in a list of items arranged in
increasing or decreasing order. If the number of items is
odd, there will be exactly one item in the middle. If
the number of items is even, the median is
the average of the two middle items.

MODE
Mode is the score or group of scores that
occurs most frequently. Some distributions
have no mode at all. Others may have
more than one mode. When a
distribution has two modes, the term used is
bimodal.
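
The three averages described above can be checked with Python's standard `statistics` module. This is a minimal sketch; the score lists are hypothetical and chosen so that the second one has an even number of items and two modes.

```python
import statistics

# Hypothetical scores with an odd number of items and a single mode.
scores = [12, 15, 15, 17, 18, 20, 15]

mean = statistics.mean(scores)        # sum of scores divided by number of cases
median = statistics.median(scores)    # middle value after sorting
modes = statistics.multimode(scores)  # list of the most frequent value(s)

# Hypothetical bimodal data with an even number of items.
bimodal_scores = [2, 3, 3, 5, 7, 7]
bimodal_median = statistics.median(bimodal_scores)  # average of the two middle items
bimodal_modes = statistics.multimode(bimodal_scores)

print("mean:", mean, "median:", median, "modes:", modes)
print("bimodal median:", bimodal_median, "bimodal modes:", bimodal_modes)
```

`statistics.multimode` is used instead of `statistics.mode` because it returns every most-frequent value, which makes the bimodal case visible.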

Laboratory tests reveal
the incubation period
(measured in days) of
a virus among the 30
infected residents of Brgy.
Malinis.
In dealing with this,
arrange the given data
from highest to lowest or
vice versa.


MEASURES OF VARIATION/
DISPERSION
The previous section focused on averages,
or measures of central tendency. The
averages are supposed to be the central
scores of a given set of data. However,
not all features of a given data set may
be reflected by the averages. Suppose
two different groups of 5 students are
given identical 20-item quizzes in Science.
The data below were the
results.

MEASURES OF VARIATION/
DISPERSION
The averages of each
group are as follows.
As shown in the second table, the two
groups have the same averages. But
the groups still show an obvious difference:
Group 2 has more widely scattered data
than Group 1. This characteristic,
called variability or dispersion, is not
reflected by averages. The three basic
measures of dispersion are range,
variance, and standard deviation.

RANGE
is the simplest measure of dispersion to calculate.
It is computed by getting the difference between the
highest/largest value and lowest/smallest value in
each set of data. A larger range suggests greater
variation or dispersion. On the other hand, a
smaller range suggests less variation or
dispersion.

VARIANCE
measures how far a data set is spread out. It is
mathematically defined as the average of the
squared differences from the mean.
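
Written as a formula, the verbal definition above corresponds to the population variance. The symbols are assumptions introduced here for illustration: $x_1, \dots, x_N$ are the data values, $N$ is the number of cases, and $\mu$ is their mean.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i ,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2
```

When the data are a sample used to estimate the variance of a larger population, the sum of squared differences is commonly divided by $n - 1$ instead of $n$.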

STANDARD DEVIATION
is the most commonly used measure of dispersion. It
indicates how closely the values of the given data set are
clustered around the mean. It is computed by getting the
positive square root of variance. The lower value of standard
deviation means that the values of the given set of data are
spread over a smaller range around the mean. On the other
hand, greater value means that the values of the given set of
data are spread over a larger range around the mean.
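
The sketch below computes the range, variance, and standard deviation for two groups that share the same mean but differ in spread. The quiz scores are hypothetical, since the original tables are not reproduced in this text, and the population formulas (`pvariance`, `pstdev`) are used to match the "average of the squared differences" definition.

```python
import statistics

# Hypothetical scores on a 20-item quiz; both groups have a mean of 15.
group_1 = [14, 15, 15, 15, 16]
group_2 = [10, 12, 15, 18, 20]


def describe(scores):
    """Return mean, range, population variance, and population standard deviation."""
    return {
        "mean": statistics.mean(scores),
        "range": max(scores) - min(scores),
        "variance": statistics.pvariance(scores),
        "std_dev": statistics.pstdev(scores),
    }


summary_1 = describe(group_1)
summary_2 = describe(group_2)
print("Group 1:", summary_1)
print("Group 2:", summary_2)
```

Both groups report the same mean, yet every measure of dispersion is larger for the second group, which is the difference the averages alone cannot show.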

USED IN HYPOTHESIS TESTING
• To determine whether a predictor variable has a statistically
significant relationship with an outcome variable and
estimate the difference between two or more groups.
• To determine what type of statistical tool is appropriate.
• To choose the test that fits the types of predictor or
independent variables and outcome/dependent variables
you have collected.

TOOLS IN INFERENTIAL
STATISTICS
Statistical tests are used to derive a generalization about the
population from the sample. A statistical test is a formal technique
that relies on a probability distribution to judge the
reasonableness of a hypothesis. Hypothesis tests concerning
differences are classified as parametric and non-parametric tests.
A parametric test is one that assumes information about the population
parameters. A non-parametric test, on the other hand, is used when the
researcher has no information about the population parameters.

PARAMETRIC TESTS
usually have stricter requirements than non-parametric tests
and can make stronger inferences from the data. They can
only be conducted with data that adhere to the standard
assumptions of statistical tests.
The most common types of parametric tests include
regression tests, comparison tests, and correlation tests.

PARAMETRIC
TESTS
Flowchart that will help us
determine the appropriate
statistical tool for
parametric tests

EXAMPLE
The Effect of the Amount of Chlorine on the Color of Algae. First identify
your independent and dependent variables, how many there are, and
their type, whether qualitative/categorical or quantitative/numeric. After
identifying these, look at the diagram above to find the appropriate
parametric test. In the given problem, the amount of chlorine is the
independent variable; it is numeric or quantitative, and 2 or more amounts
of chlorine may be used in the experiment. The dependent variable is the
color of algae; it is categorical, since color may vary. So, looking at the
above diagram, logistic regression is the appropriate tool.

NON-PARAMETRIC TEST
They don’t make as many assumptions about
the data and are useful when one or more
common statistical assumptions are violated.
However, the inferences they make aren’t as
strong as with parametric tests.

NON-PARAMETRIC
TEST
The table shows how to
determine the
appropriate non-
parametric tool to be
used.

Statistical tools are complex,
especially for beginners.
However, according to Grobman
(2017), the most commonly used
in science investigatory projects
are chi-square, t-tests, and
correlations. In determining
whether there is a statistically
significant relationship between
the independent and dependent
variables, we consider the
standard rule of thumb: if the
p-value is lower than 0.05, we
reject the null hypothesis and
accept the alternative
hypothesis.
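
The 0.05 rule of thumb can be written as a small decision function. This is a minimal sketch: the function name `decide`, the wording of the returned strings, and the example p-values are assumptions made for illustration, and the p-value itself would come from whichever statistical test was run.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule of thumb: reject the null hypothesis when p < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical p-values, as if reported by a chi-square test or t-test.
print(decide(0.012))
print(decide(0.27))
```

The threshold `alpha` is a convention, so it is kept as a parameter in case a study calls for a stricter or looser level.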

Licensed statisticians
play a vital role in
computing and
interpreting the results
of the data gathered. In
any investigation, it is
important to consult
them to ensure that
your results are
statistically correct. SPSS
and Stata are among
the most common
software packages they use.
