BASIC STATISTICS ON
RESEARCH DESIGN
GRACE P. PRINCIPE
TYPES OF DATA IN STATISTICS
1. Continuous Data
2. Discrete Data
3. Nominal Data
4. Interval Data
5. Categorical Data
CONTINUOUS DATA
are data which come from an interval of possible outcomes.
Examples of continuous data include:
• the amount of rain, in inches, that falls in a randomly
selected storm
• the weight, in pounds, of a randomly selected student
• the square footage of a randomly selected three-bedroom
house
DISCRETE DATA
data with a finite or countably infinite number of possible
outcomes.
Examples of discrete data include:
• the number of siblings a randomly selected person has
• the total on the faces of a pair of six-sided dice
• the number of students you need to ask before you find
one who loves
NOMINAL DATA
values are grouped into categories that have no meaningful
order. For example, gender and political affiliation are
nominal level variables. Members in the group are assigned a
label in that group and there is no hierarchy. Typical
descriptive statistics associated with nominal data are
frequencies and percentages.
INTERVAL DATA
is a type of data which is measured along a scale, in which
each point is placed at an equal distance (interval) from one
another. Interval data is one of the two types of numerical data
(the other is ratio data).
An example of interval data is the data collected on a
thermometer—its gradation or markings are equidistant.
CATEGORICAL DATA
Categorical variables represent types of data which may be
divided into groups. Examples of categorical variables are
race, sex, age group, and educational level. While the latter
two variables may also be considered in a numerical manner
by using exact values for age and highest grade completed, it
is often more informative to categorize such variables into a
relatively small number of groups.
STATISTICS
the science concerned with developing and studying methods
for collecting, analyzing, interpreting and presenting empirical
data. Statistics is a highly interdisciplinary field; research in
statistics finds applicability in virtually all scientific fields and
research questions in the various scientific fields motivate the
development of new statistical methods and theory.
STATISTICS AND ITS TYPES
Statistics is a collection of methods for
planning experiments, obtaining data, and
then analyzing, interpreting, and drawing
conclusions based on the data (Alferes &
Duro 2010). It is divided into two main areas:
Descriptive and Inferential.
DESCRIPTIVE STATISTICS
• summarizes or describes the essential characteristics of a
known set of data.
• consists of brief descriptive coefficients that summarize a
given data set, which can represent either the entire
population or a sample of it.
• For example, the Department of Health conducts a tally to
determine the number of CoViD-19 cases per day in the
Philippines.
INFERENTIAL STATISTICS
• uses sample data to make inferences about a population. It
consists of generalizing from samples to populations,
performing hypothesis testing, determining relationships
among variables, and making predictions.
• For example, suppose you want to find out whether Filipinos
are willing to get the CoViD-19 vaccine. In such a case, a
smaller sample of the population is surveyed. The results are
analyzed, and the conclusions are extended to the whole population.
TOOLS IN DESCRIPTIVE
STATISTICS
Frequency Distribution is a collection of
observations produced by sorting them into
classes and showing their frequency or
numbers of occurrences in each class. For
example, twenty-five students were given a
blood test to determine their blood types.
From the given data, here
is how to organize them
using a frequency
distribution.
Data set: blood types
of twenty-five students.
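The tallying step can be sketched in a few lines of Python. The slide's data table was not preserved, so the 25 blood types below are hypothetical values made up for illustration, and `frequency_distribution` is an assumed helper name, not something from the presentation.

```python
from collections import Counter

# Hypothetical blood types of 25 students (illustrative only;
# these are NOT the values from the original slide).
blood_types = [
    "A", "B", "B", "AB", "O", "O", "O", "B", "AB", "B",
    "B", "B", "O", "A", "O", "A", "O", "O", "O", "AB",
    "AB", "A", "O", "B", "A",
]

def frequency_distribution(observations):
    """Sort observations into classes; return {class: (frequency, percent)}."""
    counts = Counter(observations)
    total = len(observations)
    return {cls: (n, 100 * n / total) for cls, n in counts.most_common()}

table = frequency_distribution(blood_types)
for cls, (freq, pct) in table.items():
    print(f"{cls:<3} {freq:>3} {pct:>5.0f}%")
```

Each class (blood type) appears once with its frequency and its share of the total, which is the layout a frequency distribution table typically uses.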
MEASURES OF CENTRAL TENDENCY OR
POSITION OR AVERAGE
When scores and other measures have been tabulated into a
frequency distribution, the next task is to calculate a measure
of central tendency or central position.
This measure of central tendency is synonymous with the
word “average”. An average is a typical value that tends to
describe the set of data.
MEAN
The mean, or simply the average, is the most frequently used
measure and can be described as the arithmetic average of all
scores or groups of scores in a distribution. It is computed
by adding all the scores or data and then dividing the sum
by the total number of cases.
MEDIAN
The median is the middle-most value in a list of items arranged in
increasing or decreasing order. If the number of items is odd,
there will be exactly one item in the middle. If the number of
items is even, the midpoint is determined by getting the
average of the two middle items.
MODE
The mode is the score or group of scores that
occurs most frequently. Some distributions
don’t have a mode at all. Others may have
more than one mode. When a distribution
has two modes, it is called
bimodal.
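As a minimal sketch of the three averages, the example below uses Python's standard `statistics` module on a small set of hypothetical scores (the values are made up for illustration and do not come from the presentation).

```python
from statistics import mean, median, multimode

# Hypothetical scores, chosen so the list has an even length and two modes.
scores = [2, 4, 4, 5, 7, 8, 8, 10]

average = mean(scores)     # sum of all scores divided by the number of cases
middle = median(scores)    # even count: average of the two middle items (5 and 7)
modes = multimode(scores)  # every value tied for the highest frequency

print("mean:", average)
print("median:", middle)
print("modes:", modes, "(bimodal)" if len(modes) == 2 else "")
```

`multimode` is used instead of `mode` because it returns all tied values, which makes a bimodal distribution visible.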
Laboratory tests reveal
the incubation period
(measured in days) of
a virus among the 30
infected residents of Brgy.
Malinis.
In dealing with this,
arrange the given data
from highest to lowest or
vice versa.
MEASURES OF VARIATION/
DISPERSION
The previous section focused on averages,
or measures of central tendency. The
averages are supposed to be the central
scores of a given set of data. However,
not all features of a given data set are
reflected by the averages. Suppose
two different groups of 5 students are
given identical 20-item quizzes in Science.
The results are shown
below.
MEASURES OF VARIATION/
DISPERSION
The averages of the two
groups are as follows.
As shown in the second table, the two
groups have the same averages. Yet
the groups show an obvious difference.
Group 2 has more widely scattered data
compared to Group 1. This characteristic
called variability or dispersion is not
reflected by averages. The three basic
measures of dispersion are range,
variance, and standard deviation.
RANGE
is the simplest measure of dispersion to calculate.
It is done by getting the difference between the
highest/largest value and lowest/smallest value in
each set of data. A larger range suggests greater
variations or dispersion. On the other hand, a
smaller range suggests lesser variations or
dispersion.
VARIANCE
measures how far a data set is spread out. It is
mathematically defined as the average of the
squared differences from the mean.
STANDARD DEVIATION
is the most commonly used measure of dispersion. It
indicates how closely the values of the given data set are
clustered around the mean. It is computed by getting the
positive square root of the variance. A lower standard
deviation means that the values of the data set are
spread over a smaller range around the mean. A higher
standard deviation means that the values are spread over
a larger range around the mean.
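The three measures can be compared side by side with a short sketch. The quiz scores below are hypothetical (the original tables were not preserved); they are chosen so that both groups have the same mean. The variance here follows the definition given above, the average of the squared differences from the mean, which is the population variance (`pvariance`); sample variance, which divides by n - 1, would use `statistics.variance` instead.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical 20-item quiz scores for two groups of 5 students.
group_1 = [14, 15, 15, 15, 16]
group_2 = [6, 10, 15, 20, 24]

def describe_spread(scores):
    """Return the mean plus the three basic measures of dispersion."""
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),  # highest minus lowest value
        "variance": pvariance(scores),       # average squared deviation from the mean
        "std_dev": pstdev(scores),           # positive square root of the variance
    }

for name, scores in (("Group 1", group_1), ("Group 2", group_2)):
    print(name, describe_spread(scores))
```

Both groups report a mean of 15, but Group 2 has a much larger range, variance, and standard deviation, which is the difference the averages alone hide.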
USED IN HYPOTHESIS TESTING
• To determine whether a predictor variable has a statistically
significant relationship with an outcome variable and
estimate the difference between two or more groups.
• To determine what type of statistical tool is appropriate.
• To choose the test that fits the types of predictor or
independent variables and outcome/dependent variables
you have collected.
TOOLS IN INFERENTIAL
STATISTICS
Statistical tests are used to derive a generalization about the
population from the sample. A statistical test is a formal technique
that relies on a probability distribution to judge the
reasonableness of a hypothesis. Hypothesis tests of differences
are classified as parametric and non-parametric tests.
A parametric test assumes that the data come from a population
distribution described by known parameters. A non-parametric test,
on the other hand, makes no such assumption about the population.
PARAMETRIC TESTS
usually have stricter requirements than non-parametric tests
and can make stronger inferences from the data. They can
only be conducted with data that adhere to the standard
assumptions of statistical tests.
The most common types of parametric tests include
regression tests, comparison tests, and correlation tests.
PARAMETRIC
TESTS
Flowchart that will help us
determine the appropriate
statistical tool for
parametric tests
EXAMPLE
The Effect of the Amount of Chlorine on the Color of Algae. First identify
your independent and dependent variables, how many there are, and
their type, whether qualitative/categorical or quantitative/numeric. After
identifying these, look at the diagram above to find the right
parametric test. In the given problem, the amount of chlorine is the
independent variable; it is numeric (quantitative), and 2 or more amounts
of chlorine may be used in the experiment. The dependent variable is the
color of algae; it is categorical, and the color may vary. So, looking at the
above diagram, logistic regression is the appropriate tool.
NON-PARAMETRIC TEST
They don’t make as many assumptions about
the data and are useful when one or more
common statistical assumptions are violated.
However, the inferences they make aren’t as
strong as with parametric tests.
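To make the idea concrete, here is a minimal sketch of one non-parametric test, an exact permutation test for a difference in means. This particular test is chosen here for illustration and is not named in the presentation; the two groups of measurements are hypothetical. It assumes only that, if the null hypothesis is true, group labels are interchangeable, and it makes no assumption about the shape of the population distribution.

```python
from itertools import combinations
from statistics import mean

def permutation_test(group_a, group_b):
    """Exact two-sided permutation test for a difference in means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and returns the share of splits whose absolute mean
    difference is at least as large as the observed one (the p-value).
    """
    pooled = list(group_a) + list(group_b)
    indices = range(len(pooled))
    observed = abs(mean(group_a) - mean(group_b))
    total = 0
    extreme = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding in ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical measurements for two small groups.
treated = [12, 14, 15, 16, 18]
control = [5, 6, 7, 8, 9]
p_value = permutation_test(treated, control)
print(f"p-value = {p_value:.4f}")
```

Full enumeration is only practical for small samples; with larger groups a random subset of relabelings is usually sampled instead.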
NON-PARAMETRIC
TEST
The table shows how to
determine the
appropriate non-
parametric tool to be
used.
Statistical tools can be complex,
especially for beginners.
However, according to Grobman
(2017), the tools most commonly used
in science investigatory projects
are chi-square tests, t-tests, and
correlations. To determine
whether there is a statistically
significant relationship between
the independent and dependent
variables, a common rule of
thumb is applied: if the p-value
is lower than 0.05, we
reject the null hypothesis in
favor of the alternative
hypothesis.
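The rule of thumb above can be sketched with a chi-square test of independence on a 2x2 table, written with the standard library only. The counts are hypothetical, `chi_square_2x2` is an assumed helper name, and no continuity correction is applied. For a 2x2 table the test has 1 degree of freedom, in which case the p-value can be computed exactly with `math.erfc`.

```python
import math

ALPHA = 0.05  # conventional significance level from the rule of thumb

def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (chi-square statistic, p-value). The p-value formula
    erfc(sqrt(chi2 / 2)) is valid only for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are two groups, columns are two outcomes.
observed = [[30, 10],
            [20, 20]]
chi2, p_value = chi_square_2x2(observed)
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, decision: {decision}")
```

Note that a p-value at or above 0.05 means the null hypothesis is not rejected; it does not prove that the null hypothesis is true.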
Licensed statisticians
play a vital role in
computing and
interpreting the results
of the data gathered. In
any investigation, it is
important to consult
them to ensure that
your results are
statistically correct. SPSS
and Stata are some of
the most common
software packages they use.
THANK YOU
