2. WHAT IS STATISTICS?
Statistics is widely used in business and economics. It
plays an important role in the exploration of new
markets for a product, the forecasting of business
trends, the control and maintenance of product
quality, the improvement of employer-employee
relationships, and the analysis of data concerning
insurance, investment, sales, employment,
transportation, communications, auditing and
accounting procedures.
3. STATISTICS is the branch of mathematics that deals
with the theory and method of collecting, organizing,
presenting, analyzing and interpreting data.
Two Main Divisions/Phases of Statistics
1. DESCRIPTIVE STATISTICS refers to summary
statistics that quantitatively describe or summarize
features of a collection of data under investigation.
The goal is to describe: numerical measures are used to
tell about features of a set of data.
4. EXAMPLES:
• The average, or measure of the center of a data set, consisting of the mean, median, mode, or midrange
• The spread of a data set, which can be measured with the range or standard deviation
• Overall descriptions of data such as the five-number summary
• Measurements such as skewness and kurtosis
• The exploration of relationships and correlation between paired data
• The presentation of statistical results in graphical form
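The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch; the quiz scores are hypothetical values chosen only for illustration, and the quartiles use the module's "inclusive" method, which is one of several conventions.

```python
import statistics

# Hypothetical quiz scores, used only for illustration.
scores = [70, 75, 75, 80, 85, 90, 99]

# Measures of center
mean = statistics.mean(scores)              # arithmetic average
median = statistics.median(scores)          # middle value of the sorted data
mode = statistics.mode(scores)              # most frequent value
midrange = (min(scores) + max(scores)) / 2  # halfway between the extremes

# Measures of spread
data_range = max(scores) - min(scores)      # largest minus smallest
sample_sd = statistics.stdev(scores)        # sample standard deviation (n - 1)

# Five-number summary: minimum, Q1, median, Q3, maximum
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
five_number = (min(scores), q1, q2, q3, max(scores))

print(mean, median, mode, midrange, data_range, round(sample_sd, 2))
print(five_number)
```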
5. 2. INFERENTIAL STATISTICS refers to statistical tools
that are used to examine the relationships
between variables within a sample and then
make generalizations or predictions about how
those variables will relate in a larger population.
• Example:
Tests of significance or hypothesis testing where scientists
make a claim about the population by analyzing a statistical
sample. By design, there is some uncertainty in this process.
This can be expressed in terms of a level of significance.
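A test of significance can be sketched with an exact binomial test. The scenario is an assumption made for illustration: a coin is flipped 20 times and lands heads 16 times, and the claim being tested (the null hypothesis) is that the coin is fair. The level of significance of 0.05 is a conventional choice, not something fixed by the method.

```python
from math import comb

n_flips = 20   # hypothetical sample: 20 coin flips
n_heads = 16   # observed number of heads in the sample
alpha = 0.05   # chosen level of significance

# Under the null hypothesis (fair coin), P(X = k) = C(n, k) / 2**n.
upper_tail = sum(comb(n_flips, k) for k in range(n_heads, n_flips + 1)) / 2**n_flips

# Two-sided p-value: by symmetry of the fair-coin distribution,
# results at least this extreme in either direction are twice the upper tail.
p_value = min(1.0, 2 * upper_tail)

# Reject the claim of a fair coin if the p-value is below alpha.
reject_h0 = p_value < alpha
print(round(p_value, 4), reject_h0)
```

A p-value below the level of significance means that a sample this extreme would be unusual if the claim were true; it does not remove the uncertainty, it only quantifies it.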
6. Two Branches of Statistics
1. Statistical Theory – is concerned with the
formulation of theories, principles, and
formulas which are used as bases in the
solution of problems related to Statistics.
2. Statistical Methods – is concerned with the
application of the theories, principles and
formulas in the solution of everyday
problems.
7. OTHER STATISTICAL TERMS:
• POPULATION – the set of all conceivable observations of a certain
phenomenon. It refers to the totality of the observations. The
population size is denoted by capital N.
• SAMPLE – a finite number of items selected from a population,
ideally possessing the same characteristics as the population from
which it was taken. The sample size is denoted by lowercase n.
• PARAMETERS – are characteristics/measures computed from the
population
• STATISTIC/S – are characteristics/measures computed from the
sample
8. • VARIABLE – refers to a fundamental quantity that changes in
value from one observation to another within a given domain and
under a given set of conditions. Variables may be represented
by the letters X, Y, etc.
• DISCRETE VARIABLE - is a variable whose value is obtained by
counting.
• CONTINUOUS VARIABLE- is a variable whose value is obtained
by measuring.
• CONSTANT – refers to fundamental quantities that do not
change in value.
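The difference between a parameter and a statistic can be illustrated with a short sketch. The population of ages below is generated at random and is purely hypothetical; the fixed seed is only there to make the run reproducible.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: ages of N = 1,000 people (a discrete variable).
population = [random.randint(18, 65) for _ in range(1000)]
N = len(population)

# Sample: n = 50 items drawn from the population without replacement.
sample = random.sample(population, 50)
n = len(sample)

mu = statistics.mean(population)  # parameter: computed from the population
x_bar = statistics.mean(sample)   # statistic: computed from the sample

print(f"N = {N}, n = {n}")
print(f"population mean = {mu:.2f}, sample mean = {x_bar:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean; a different sample gives a different statistic, while the parameter stays constant.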
9. FOUR LEVELS OF DATA
MEASUREMENT
Nominal – also called the categorical variable scale, is defined as a scale
used for labeling variables into distinct classifications and doesn’t involve a
quantitative value or order. This scale is the simplest of the four variable
measurement scales. Arithmetic on these variables is not meaningful because
the categories have no numerical value; only counts and the mode apply.
(ex. sex, gender, place of residence, political affiliation)
Ordinal – a variable measurement scale used to depict the order of the
values but not the size of the difference between them. These
scales are generally used to depict non-mathematical ideas such as
frequency, satisfaction, happiness, a degree of pain, etc.
10. • The ordinal scale keeps the labeling (descriptive) quality of the nominal
scale and adds an intrinsic order, so each value has a relative position.
It has no origin, that is, no fixed start or “true zero”, and the distance
between values can’t be calculated.
Examples:
High school class ranking: 1st, 9th, 87th…
Socioeconomic status: poor, middle class, rich.
The Likert Scale: strongly disagree, disagree, neutral, agree, strongly agree.
Level of Agreement: yes, maybe, no.
Time of Day: dawn, morning, noon, afternoon, evening, night.
Political Orientation: left, center, right.
11. • Interval Scale is defined as a numerical scale in which both the order
of the values and the difference between them are known.
Variables that have familiar, constant, and computable differences are
classified using the interval scale. The name is a reminder of its primary
role: ‘interval’ means ‘distance between two entities’, which is what
this scale measures.
• These scales are effective because they open the door to statistical
analysis of the data. The mean, median, or mode can be used to
measure central tendency on this scale. Its only drawback is that
there is no pre-decided starting point or true zero value.
• The interval scale has all the properties of the ordinal scale and, in
addition, allows the difference between values to be calculated.
Its main characteristic is the equidistant difference between
adjacent points on the scale.
12. Interval Scale Examples
Temperature is the standard example of an interval scale. Time is another common
example, as the values are already established, constant, and measurable;
calendar years also fall under this category of measurement scales.
There are situations where attitude scales are considered to be interval scales.
The Likert scale, Net Promoter Score, Semantic Differential Scale,
Bipolar Matrix Table, etc. are often treated as interval scales in
practice, although the Likert scale is listed above as ordinal because
the distances between its response categories are not guaranteed to be equal.
Celsius Temperature.
Fahrenheit Temperature.
IQ (intelligence scale).
SAT scores.
Time on a clock with hands.
13. • Ratio Scale: 4th Level of Measurement
• is defined as a variable measurement scale that not only gives
the order of the values but also makes the difference between
them known, along with information on the value of true zero.
It assumes that the variable has a true zero point, that equal
differences between values mean the same thing everywhere on
the scale, and that the values have a specific order.
• With a true zero, a wide range of inferential and descriptive
analysis techniques can be applied to the variables. The ratio
scale does everything that the nominal, ordinal, and interval
scales can do, and it also establishes the value of absolute zero,
so ratios are meaningful (for example, 80 kg is twice 40 kg).
The best examples of ratio scales are weight and height. In
market research, a ratio scale is used to calculate market share,
annual sales, the price of an upcoming product, the number of
consumers, etc.
14. • Examples of Ratio scale
Age
Weight
Height
Sales Figures
Ruler measurements.
Income earned in a week
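The practical difference between the interval and ratio scales can be checked numerically. This is a small sketch under one assumption not made in the slides: the Kelvin scale is used as the ratio-scale counterpart of Celsius, because it has a true zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale Celsius reading to ratio-scale Kelvin."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0  # hypothetical temperature readings

# Differences are meaningful on an interval scale: both scales agree on 10 degrees.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not meaningful on an interval scale: 20 °C is not "twice as hot"
# as 10 °C, because 0 °C is not a true zero.
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c, round(difference_k, 2))
print(naive_ratio, round(kelvin_ratio, 4))
```

The Celsius ratio of 2.0 is an artifact of where zero happens to sit on that scale; on a scale with a true zero the same two temperatures differ by only a few percent.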
15. STEPS IN A STATISTICAL INQUIRY OR INVESTIGATION
Start with a problem, then:
1. Collection of data
2. Presentation of data
3. Analysis of data
4. Interpretation of data
16. DATA COLLECTION AND DATA PRESENTATION
What are DATA?
• Data are plain facts, usually raw numbers, words, measurements,
observations or just description of things. Think of a spreadsheet full
of numbers with no meaningful description. In order for these
numbers to become information, they must be interpreted to have
meaning.
TWO TYPES OF DATA
1. QUALITATIVE DATA is descriptive in nature, e.g., color, shape
2. QUANTITATIVE DATA is numerical information, e.g., weight, height
17. DATA COLLECTION
• Data collection is concerned with the accurate
gathering of data; although methods may differ
depending on the field, the emphasis on ensuring
accuracy remains the same. The primary goal of any
data collection is to capture quality data or evidence
that supports rich data analysis and may lead to
credible and conclusive answers to the questions that
have been posed.
18. METHODS OF DATA COLLECTION
1. THE INTERVIEW or DIRECT METHOD
The researcher or interviewer gets the needed data
from the respondent or interviewee verbally and directly,
through face-to-face contact.
2. THE QUESTIONNAIRE or INDIRECT METHOD
The questionnaire is a tool for data gathering and
research that consists of a set of questions, possibly of
different question types, used to collect information
from the respondents for the purpose of either a survey or
a statistical analysis study.
19. 3. REGISTRATION METHOD
This method is used by the government, for example in the records of births at the
Philippine Statistics Authority (PSA) and the registration records at the COMELEC.
4. OBSERVATION
This method is a way of collecting data through observing. The observer gains
firsthand knowledge by being in and around the social setting that is being
investigated.
5. EXPERIMENTATION
An experiment is a procedure carried out to support, refute, or validate a
hypothesis. An experiment is a method that most clearly shows cause-and-effect
because it isolates and manipulates a single variable, in order to clearly show its
effect.
20. DATA PRESENTATION
Once data has been collected, it has to be classified and organized in such a way that it becomes easily
readable and interpretable, that is, converted to information.
TYPES OF DATA PRESENTATION
1. TEXTUAL PRESENTATION
This type of presentation combines text and figures in a statistical report.
Example: news item in the newspaper
2. TABULAR PRESENTATION
This type of presentation uses tables consisting of vertical columns and horizontal rows
with headings describing these rows and columns. The data are presented in a more concise and orderly
manner.
Example: frequency table
3. GRAPHICAL PRESENTATION
This is one of the most effective means of presenting statistical data because important relationships
are brought out more clearly in graphs.
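A frequency table, the example given for tabular presentation, can be built with `collections.Counter` from the Python standard library. The survey responses below are hypothetical and serve only to show the layout of rows, columns, and headings.

```python
from collections import Counter

# Hypothetical raw survey responses (qualitative data).
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]

freq = Counter(responses)    # counts how often each category occurs
total = sum(freq.values())

# One row per category: (category, frequency, percent of total),
# ordered from most to least frequent.
rows = [(category, count, 100 * count / total)
        for category, count in freq.most_common()]

# Tabular presentation: column headings followed by one line per row.
lines = [f"{'Response':<10}{'Freq':>6}{'Percent':>9}"]
for category, count, pct in rows:
    lines.append(f"{category:<10}{count:>6}{pct:>8.1f}%")
table = "\n".join(lines)
print(table)
```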
21. DIFFERENT TYPES OF GRAPHS COMMONLY USED IN DATA
PRESENTATION
1. BAR GRAPH
A bar chart or bar graph is a chart or graph that presents
categorical data with rectangular bars with heights or lengths
proportional to the values that they represent. The bars can
be plotted vertically or horizontally.
22. 2. LINE GRAPH
A line graph is a graphical display of information that changes
continuously over time. A line graph may also be referred to as
a line chart. Within a line graph, there are points connecting
the data to show a continuous change. The lines in a line graph
can descend and ascend based on the data. We can use a line
graph to compare different events, situations, and information.
23. 3. PIE GRAPH
A pie chart is a circular chart divided into wedge-like sectors, illustrating
proportion. Each wedge represents a proportionate part of the whole, and the total
value of the pie is always 100 percent.
Pie charts can make the size of portions easy to understand at a glance.
They're widely used in business presentations and education to show the proportions
among a large variety of categories including expenses, segments of a population, or
answers to a survey.
24. 4. SCATTER DIAGRAM
A scatter diagram, also called a scatterplot, is a type of plot or
mathematical diagram using Cartesian coordinates to display values for typically two
variables for a set of data. If the points are coded (color/shape/size), one additional
variable can be displayed. The data are displayed as a collection of points, each having
the value of one variable determining the position on the horizontal axis and the value
of the other variable determining the position on the vertical axis.
25. 5. PICTOGRAPH/PICTOGRAM
A pictograph is a chart or graph that uses pictures to represent data. A pictograph
is one of the simplest forms of data visualization.
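The arithmetic behind two of these graphs can be sketched without a plotting library. The expense figures are hypothetical, and the text bars (one "#" per 20 units) are only a stand-in for a drawn bar graph.

```python
# Hypothetical monthly expenses by category (illustrative values only).
expenses = {"Rent": 500, "Food": 300, "Transport": 120, "Other": 80}
total = sum(expenses.values())

# Bar graph: bar length proportional to the value (one '#' per 20 units).
bar_lines = [f"{name:<10}{'#' * (value // 20)} {value}"
             for name, value in expenses.items()]

# Pie graph: each sector's share of the whole, as (percent, angle in degrees).
# The percents sum to 100 and the angles sum to 360.
pie_sectors = {name: (100 * value / total, 360 * value / total)
               for name, value in expenses.items()}

print("\n".join(bar_lines))
for name, (percent, angle) in pie_sectors.items():
    print(f"{name:<10}{percent:>5.1f}%  {angle:>6.1f} degrees")
```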
26. TWO TYPES OF SAMPLING
• Probability sampling
• Simple random
• Systematic
• Stratified
• Cluster
• Non-probability sampling
• Convenience/Accidental
• Judgmental/Purposive
• Quota
• Snowball
27. PROBABILITY VS NON-PROBABILITY SAMPLING
1. Probability or Random Sampling
Gives every element of the population a known, nonzero chance of being
included in the sample. In simple random sampling, these chances are equal.
2. Non-Probability Sampling
The samples are selected in a process that does not give all the
individuals in the population a known chance of being selected.
Samples are selected on the basis of their accessibility or by the
purposive personal judgment of the researcher.
29. PROBABILITY-BASED SAMPLING
Systematic Sampling
Step 1. Identify the population (N)
Step 2. Identify the sample size (n) to be drawn from the population
Step 3. Divide N by n to find the sampling interval k; every kth element is selected
Example
Population is 1,000. Desired sample size is 100. Sampling interval is 1,000 / 100 = 10.
Get a random start from 1 to 10 in the list as the first sample, then take every 10th
element in the list after it.
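The three steps of systematic sampling can be sketched as follows. The function name `systematic_sample` and the numbered population list are assumptions made for illustration; the seeded generator only makes the run reproducible.

```python
import random

def systematic_sample(population, n, rng=random):
    """Select n elements by taking every kth element after a random start."""
    N = len(population)
    k = N // n                  # Step 3: sampling interval
    start = rng.randrange(k)    # random start among the first k positions (0-based)
    return [population[start + i * k] for i in range(n)]

# Example from the slide: N = 1,000 and n = 100, so the interval is 10.
population = list(range(1, 1001))   # hypothetical list of numbered elements
sample = systematic_sample(population, 100, random.Random(0))

print(sample[:5])
```

Because the list here is numbered from 1, the first selected element is the random start itself and each later element is exactly 10 positions further down the list.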
30. PROBABILITY-BASED SAMPLING
Stratified Sampling
Used to ensure that different groups in the population are adequately represented in the sample
Step 1. Identify the population and divide the population into different groups or strata according to criteria.
Step 2. Decide on the sample size or the actual percentage of the population to be considered as the sample.
Step 3. Get a proportion of sample from each group
Step 4. Select the respondents by random sampling
Example: Population = 2,000; desired sample size = 10% of the population
Proportion of sample per stratum = 10%
500 students x .10 = 50
600 businessmen x .10 = 60
400 teachers x .10 = 40
500 farmers x .10 = 50
Total sample = 200
Select the 200 by random sampling within each stratum.
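The stratified example above can be reproduced in a few lines. The member names (such as `student_0`) are placeholders invented for this sketch; only the stratum sizes and the 10% fraction come from the example.

```python
import random

rng = random.Random(1)  # seeded generator so the sketch is reproducible

# Step 1: the population divided into strata (placeholder member names).
strata = {
    "students": [f"student_{i}" for i in range(500)],
    "businessmen": [f"businessman_{i}" for i in range(600)],
    "teachers": [f"teacher_{i}" for i in range(400)],
    "farmers": [f"farmer_{i}" for i in range(500)],
}

# Step 2: the same sampling fraction is applied to every stratum.
fraction = 0.10

# Steps 3 and 4: take a proportional number from each stratum at random.
sample = {}
for name, members in strata.items():
    size = round(len(members) * fraction)
    sample[name] = rng.sample(members, size)

sizes = {name: len(chosen) for name, chosen in sample.items()}
total_sample = sum(sizes.values())
print(sizes, total_sample)
```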
31. PROBABILITY-BASED SAMPLING
Cluster Sampling
Often called geographic sampling
Used in large scale surveys
The population is divided into multiple groups called clusters. Clusters
are then selected by simple random or systematic sampling, and data
are collected and analyzed from the selected clusters.
Example: the population includes the elementary schools in a province.
The province is first divided into districts, which are treated as clusters
and are randomly selected. From the selected districts, schools are picked
at random, and then classes and then students are selected at random.
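The district-and-school example can be sketched as a two-stage selection. The province layout (8 districts with 5 schools each) and the numbers selected at each stage are hypothetical; the further stages (classes, then students) would repeat the same pattern.

```python
import random

rng = random.Random(7)  # seeded generator so the sketch is reproducible

# Hypothetical province: 8 districts (clusters), each with 5 schools.
province = {
    f"District {d}": [f"School {d}-{s}" for s in range(1, 6)]
    for d in range(1, 9)
}

# Stage 1: randomly select whole clusters (districts).
chosen_districts = rng.sample(sorted(province), 3)

# Stage 2: within each selected district, randomly select schools.
chosen_schools = {district: rng.sample(province[district], 2)
                  for district in chosen_districts}

for district, schools in chosen_schools.items():
    print(district, schools)
```

Only the selected districts are ever visited, which is what makes cluster sampling practical for large-scale surveys.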
32. NON-PROBABILITY SAMPLING
1. Accidental or Convenience Sampling
The researcher selects the subjects that are most readily accessible or
available.
2. Purposive Sampling
Subjects are selected based on the needs of the study.
33. NON-PROBABILITY-BASED SAMPLING
3. Quota Sampling
The researcher takes a sample that is in proportion to some characteristic or trait of the
population.
The population is divided into groups or strata (the basis may be age, gender, education
level, race, religion, etc.).
Samples are taken from each group to meet a quota.
Care is taken to maintain the correct proportions representative of the population.
Example :
The population consists of 60% female and 40% male.
The desired sample size is 200.
Therefore, the sample should consist of ____ females and ____ males.
34. NON-PROBABILITY-BASED SAMPLING
A study on science teaching is to be conducted in high schools of a region.
There are 4,641 teachers grouped according to area of specialization.
There are 2,243 biology teachers, 1,406 chemistry teachers and 992 physics
teachers.
The desired sample size is 300.
Select the sample according to the Quota Sampling technique.
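One way to set the quotas for this exercise is proportional allocation. The sketch below is an assumption about how to handle rounding, not a method prescribed by the slides: it rounds each group's share down and then gives the remaining slots to the groups with the largest fractional parts, so the quotas always add up to the desired sample size. The function name `proportional_quotas` is hypothetical.

```python
def proportional_quotas(group_sizes, sample_size):
    """Split sample_size across groups in proportion to their sizes."""
    total = sum(group_sizes.values())
    exact = {g: sample_size * size / total for g, size in group_sizes.items()}

    # Round every share down first.
    quotas = {g: int(share) for g, share in exact.items()}
    leftover = sample_size - sum(quotas.values())

    # Hand the remaining slots to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for g in by_remainder[:leftover]:
        quotas[g] += 1
    return quotas

# Figures from the exercise: 4,641 teachers, desired sample size 300.
teachers = {"biology": 2243, "chemistry": 1406, "physics": 992}
quotas = proportional_quotas(teachers, 300)
print(quotas)
```

The quotas only fix how many respondents come from each group. In quota sampling the respondents who fill each quota are then chosen non-randomly, for example by accessibility.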
35. NON-PROBABILITY-BASED SAMPLING
4. Snowball Sampling
This type of sampling starts with known sources of information, who or
which will in turn give other sources of information. As this goes on,
data accumulates.
This is used to find socially devalued urban populations such as drug
addicts, alcoholics, child abusers and criminals because they are usually
hidden from outsiders.
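The referral process in snowball sampling can be modeled as a walk over a network of contacts. Everything in this sketch is hypothetical: the referral map, the single seed respondent "A", and the stopping rule of a maximum sample size.

```python
from collections import deque

# Hypothetical referral map: each respondent names other possible respondents.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
    "F": [],
}

def snowball_sample(seeds, referrals, max_size):
    """Start from known sources and follow their referrals until max_size is reached."""
    recruited = []
    seen = set(seeds)          # everyone already named, to avoid duplicates
    queue = deque(seeds)       # people waiting to be interviewed
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return recruited

sample = snowball_sample(["A"], referrals, max_size=5)
print(sample)
```

The sample depends entirely on who the first sources know, which is why snowball sampling is classified as non-probability sampling.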