Adinew Handiso
Wachamo University
College of medicine and health science
Department of Public Health
Introduction To Biostatistics
Contents
• Introduction
• Descriptive Statistics
• Demography and vital statistics
• Probability and Probability Distributions
• Sampling and Sampling Distributions
• Statistical Estimation and Hypothesis Testing
• Introduction to correlation and regression
INTRODUCTION
Statistics can be defined in two senses.
Plural sense: Statistics is defined as aggregates of
numerically expressed facts (figures) collected in a
systematic manner for a predetermined purpose.
Singular sense: Statistics is the science of collecting,
organizing, presenting, analyzing and interpreting
numerical data to make decisions on the basis of
that analysis.
Biostatistics: The application of statistical methods to the
biological and medical sciences and to public health.
Concerned with the interpretation of biological data and the
communication of information derived from these data
Has a central role in medical investigations
Classifications of statistics
Depending on how data are used, statistics is classified
into two main branches:
1. Descriptive statistics
2. Inferential statistics
Descriptive statistics:
• Ways of organizing and summarizing data
• Helps to identify the general features and trends in a set of data and to extract useful information
• Uses numerical and graphical methods to look for patterns in the data set
Example: tables, graphs, numerical summary measures
Inferential statistics:
• Techniques by which inferences about population parameters are drawn from sample statistics
• The statistics observed in the sample are used to infer the corresponding population parameters
• Methods used for drawing conclusions about a population based on the information obtained from a sample of observations drawn from that population
Example: principles of probability, estimation, hypothesis testing
There are five stages or steps in any statistical investigation.
• Collection of data
• Organization of data
• Presentation of data
• Analysis of data
• Interpretation of data
1. Collection of data: the process of measuring, gathering, and assembling
the raw data upon which the statistical investigation is to be based.
2. Organization of data: if an investigator has collected data through a
survey, the data must be edited to correct any apparent
inconsistencies, ambiguities, and recording errors.
Stages of Statistical Investigation
3. Presentation of data: the organized data can now be presented in the
form of tables or diagrams or graphs. This presentation in an orderly
manner facilitates the understanding as well as analysis of data.
4. Analysis of data: the basic purpose of data analysis is to make the data useful
for drawing conclusions. Analysis can involve complex and
sophisticated mathematical techniques. Calculating measures of central tendency
and measures of dispersion are examples of analysis activities.
5. Interpretation of data: interpretation means drawing conclusions from
the data, which form the basis of decision making. This is the stage where
we draw valid conclusions from the results obtained through data analysis.
Some Basic Terms in Statistics
• Population: It is the collection of all possible
observations of a specified characteristic of interest
• Sample: is a portion or part of the population taken so
that some generalization about the population can be
made.
• Sampling: The process or method of sample selection
from the population.
• Sample size: The number of elements or observations to
be included in the sample.
• Census: Complete enumeration or observation of the
elements of the population.
Target population:
• A collection of items that have something in common and about which we wish to draw conclusions at a particular time
• The whole group of interest
Example: All hospitals in Ethiopia
Study population:
• The subset of the target population that has at least some chance of being sampled
• The specific population group from which samples are drawn and data are collected
Sample:
• A subset of the population, that is, a given collection of observations or measurements taken from the population
Example: In a study of the prevalence of HIV among adolescents in Ethiopia, a random
sample of adolescents in Yeka Kifle Ketema of Addis Ababa (AA) was included.
Target population: All adolescents in Ethiopia
Study population: All adolescents in Addis Ababa
Sample: Adolescents in Yeka Kifle Ketema
Sample survey: The technique of collecting information from a
portion of the population.
Census survey: is the collection of data from every element in a
population.
Parameter: a descriptive measure computed from the data of a
population
Statistic: a descriptive measure computed from the data of a
sample.
Statistical data: it refers to numerical descriptions of things. These
descriptions may take the form of counts or measurements.
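The distinction between a parameter and a statistic can be illustrated with a minimal sketch. The blood-pressure values below are hypothetical and invented purely for illustration; they do not come from any study, and the sample size of 4 and the seed are arbitrary choices.

```python
import random
from statistics import mean

# Hypothetical population: systolic blood pressure (mmHg) of 10 patients.
population = [118, 125, 130, 110, 142, 135, 121, 128, 115, 138]

# Parameter: a descriptive measure computed from the whole population.
population_mean = mean(population)

# Statistic: the same measure computed from a sample of that population.
rng = random.Random(1)                 # fixed seed so the sketch is repeatable
sample = rng.sample(population, k=4)   # simple random sample, without replacement
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The sample mean will generally differ from the population mean; inferential statistics is concerned with how much it is likely to differ.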
Variable: a characteristic that takes on different values in different
persons, places, or things.
• A characteristic or property that changes or varies over time and/or across the individuals or objects under consideration.
• A quality or quantity that varies from one member of a sample or population to another.
• It can also be defined as the generic characteristic being measured or observed, e.g., HIV status, heart rate, the heights of adult males, the weights of preschool children…
Data: the raw material of statistics. Data may be defined as sets of values or
observations resulting from the process of counting or from taking a
measurement.
• A set of related observations (measurements) or facts collected to draw conclusions. It can come from either a sample or a population.
For example: when a hospital administrator counts the number of patients,
1. Qualitative variables: variables whose values are non-numeric categories and
cannot be measured numerically.
Examples: stages of breast cancer (I, II, III, or IV), blood type, marital
status, etc.
2. Quantitative variables: variables that can be measured (or
counted) and expressed numerically.
Examples: weight, height, number of car accidents, etc.
Quantitative variables are divided into two types:
• Discrete variables
• Continuous variables
Types of Variables
Discrete variable: can take only a limited number of distinct
values (usually whole numbers).
• Characterized by gaps or interruptions between the values it can take (integers).
• The values are not just labels but actual measurable quantities.
Examples:
• Number of patients in a hospital
• The number of bedrooms in your house
• Number of students attending a conference
• Number of household members (family size)
Continuous variable: can take an infinite number of possible
values in any given interval.
• Usually obtained by measurement, not by counting.
• Does not have gaps or interruptions between possible values.
Examples:
• Weight is continuous since it can take any value within a range (e.g., 30.75 kg)
• Height of seedlings
• Temperature measurements, etc.
Measurement scale refers to the kind of values assigned to the data, based
on the properties of order, distance, and a fixed zero.
There are four types of scales of measurement.
Nominal scale:
• The lowest measurement level you can use, from a statistical point of view.
• As the name implies, it simply places data into categories, without any order or structure.
• Nominal scales are used for labeling variables; the labels have no quantitative value and no logical order. This means: no magnitude, unordered categories, and numbers used only to represent categories. Averages are meaningless; look at the frequency or proportion in each category.
Examples:
Political party preference (Republican, Democrat, or Other)
Marital status (married, single, widowed, divorced), sex (male or female)
Regions of Ethiopia (region 1, 2, …), etc.
Scales of Measurement
Ordinal scale:
• Data have a logical order, but the differences between values are not constant.
• With ordinal scales, the order of the values is what is important and significant; the size of the differences between them is not really known. The simplest ordinal scale is a ranking.
• Ordinal scales are typically measures of non-numeric concepts like satisfaction, happiness, discomfort, etc. This means: no magnitude, ordered categories, and numbers used only to represent categories. Order matters, magnitude does not, and differences between categories are meaningless.
Examples: rating scales (excellent, very good, good, fair, poor),
academic qualification (BSc, MSc, PhD), T-shirt size (small, medium, large),
military rank (from Private to General), etc.
Interval scale:
• Level of measurement that classifies data that can be ranked and for which differences are meaningful. However, there is no meaningful zero, so ratios are meaningless. That is, interval scales are numeric scales in which we know not only the order but also the exact differences between the values.
• Addition and subtraction are meaningful
• Multiplication and division are not meaningful
Most common examples are: IQ, temperature, calendar dates
Example 1: Degrees Fahrenheit
The difference between 20 and 30 degrees is the same as that between 50
and 60 degrees.
Example 2: Years
The difference between 2006 and 2008 is the same as that between 2008 and 2010.
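The Fahrenheit example can be checked numerically. The sketch below is only an illustration; the helper name `fahrenheit_to_celsius` and the 40 °F versus 20 °F comparison are assumptions added here. It shows that equal differences stay equal after a change of scale, while a ratio does not survive it, which is why ratios are not meaningful on an interval scale.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9

# Differences are meaningful: 20 -> 30 °F and 50 -> 60 °F are the same
# step, whichever scale the temperatures are expressed in.
diff_low = fahrenheit_to_celsius(30) - fahrenheit_to_celsius(20)
diff_high = fahrenheit_to_celsius(60) - fahrenheit_to_celsius(50)

# Ratios are not: 40 °F looks like "twice" 20 °F, but the same two
# temperatures give a different ratio once expressed in Celsius.
ratio_f = 40 / 20
ratio_c = fahrenheit_to_celsius(40) / fahrenheit_to_celsius(20)

print("differences in °C:", round(diff_low, 3), round(diff_high, 3))
print("ratio in °F:", ratio_f, "ratio in °C:", round(ratio_c, 3))
```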
Ratio scale:
• The most detailed and objectively interpretable of the measurement scales.
• An interval scale that has a true zero point (i.e., zero on the scale represents a total absence of the variable being measured).
• Level of measurement that classifies data that can be ranked, for which differences are meaningful, and for which there is a true zero.
Example:
salary of employees, price of goods, age, weight, height, etc.
Note: Ratio and interval level data are classified as quantitative variables,
and nominal and ordinal level data are classified as qualitative variables.
Exercise
For each of the following variables indicate whether it is quantitative
or qualitative and specify the measurement scale
1. Status of student- undergraduate, postgraduate.
2. Number of children in a family
3. Time to complete a statistics test
4. Number of cigarettes smoked per day
5. Opinion of students about stats classes: very unhappy, unhappy,
neutral, happy, ecstatic
6. Smoking status: smoker, non-smoker
7. Attendance: present, absent
8. Class of mark: pass, fail
Some of the most common uses of statistics are:
• It condenses and summarizes complex data
• It facilitates comparison of data
• Statistical methods are very helpful in formulating and testing hypotheses and in developing new theories.
• It uses sampling and estimation methods to study the factors related to compliance and outcome.
• It helps the researcher arrive at a scientific judgment about a hypothesis.
• Statistics helps in predicting future trends: it is extremely useful for analyzing past and present data and predicting some future trends.
Uses of Statistics
As a science, statistics has its own limitations. The following are some of
them:
• Statistics does not deal with single (individual) values.
• Statistics does not deal with qualitative characteristics: it is not applicable to qualitative characteristics such as beauty, honesty, poverty, etc.
• Statistical data are only approximately, not mathematically, correct.
• Statistics can easily be misused and therefore should be used by experts.
Limitations of Statistics
Statistical data may be classified into two categories, depending on the
source:
1) Primary data
2) Secondary data
Primary data: data collected directly from the items or individual respondents
by the researcher for the purpose of a study.
Secondary data: data that were collected and statistically treated by other
people or organizations, and that are then used by someone else
for a different purpose.
Types of Data
Data collection is the process of gathering and measuring information on the variables
of interest to answer research questions and evaluate outcomes. Data collection
techniques allow us to systematically collect data about our objects of
study (people, objects, and phenomena) and about the setting in which
they occur. Common data collection techniques include:
1. Observation
2. Interview
3. Questionnaire
4. Focus group discussions (FGD)
5. Planned experimentation
6. Document analysis
Data Collection Methods
Observation: a technique that involves systematically selecting,
watching, and recording the behavior of people or other phenomena.
Interview: a conversation between two people that is initiated
by the interviewer in order to obtain the required information. All
respondents are asked the same list of questions. Answers to the
questions posed during an interview can be recorded by writing
them down (either during the interview itself or immediately after
it), by tape-recording the responses, or by a
combination of both.
Questionnaire: a list of questions in written form that is aimed
at discovering particular information. The investigator prepares a
number of questions pertaining to the field of enquiry. The success
of a questionnaire depends on designing it properly
and on securing the cooperation of the respondents.
Focus group discussion: a good way to gather
information by bringing together people who have similar
backgrounds or experiences to discuss a specific topic of interest. It
is important for a deeper understanding of the phenomena being
studied.
Planned experimentation: the required statistical information can be
collected by conducting a planned experiment in a laboratory or at an experimental
site.
Document analysis: gathering information by studying and analyzing
already available sources. Such sources can be published or unpublished.
Examples
Official publications of the Central Statistical Authority
Publications of the Ministry of Health and other ministries
International publications, such as those of WHO, the World Bank, and UNICEF
Records of hospitals or any health institutions, etc.
Reading assignment: discuss the advantages and disadvantages of the above
data collection methods relative to each other.
• Having collected and edited the data, the next important step is to organize it, that is, to present it in a readily comprehensible, condensed form that helps in drawing inferences from it.
• The presentation of data is broadly classified into the following two categories:
Tabular presentation
Diagrammatic and graphic presentation
• The process of arranging data into classes or categories according to similarities is technically called classification.
• Classification is a preliminary step that prepares the ground for proper presentation of data.
2. Descriptive statistics
2.1 METHODS OF DATA ORGANIZATION & PRESENTATION
Raw data: recorded information in its original collected form,
whether it be counts or measurements.
Array: data put in an ascending or descending order of
magnitude.
Grouped data: is a form of data presented in the form of a
frequency distribution.
Frequency: is the number of values in a specific class of the
distribution.
Frequency distribution: is the organization of raw data in table
form using classes and frequencies.
Relative frequency: is the frequency of a class divided by the
total number of observations.
Relative cumulative frequency: is the cumulative frequency
divided by the total frequency.
Definitions of some Terminologies
There are three basic types of frequency distributions
Categorical frequency distribution
Ungrouped frequency distribution
Grouped frequency distribution
1. Categorical Frequency Distribution:
Used for data that can be placed in specific categories, such as nominal or
ordinal data.
Example 2.1: A social worker collected the following data on marital status
for 25 persons (M = married, S = single, W = widowed, D = divorced).
Types of Frequency distribution
Example 1: Construct a categorical frequency distribution (FD) for the blood
types of 25 students given below.
A B B AB O A
O O B AB B A
B B O A O AB
A O O O AB O B
Class (1)   Tally (2)   Frequency (3)   Percent (4)

Class   Tally       Frequency   Percent
M       ||||| |     6           24
S       ||||| ||    7           28
D       ||||| ||    7           28
W       |||||       5           20
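For the blood-type data of the 25 students (Example 1), the class, frequency, and percent columns can be filled in with a short script. This is a minimal sketch using Python's standard `collections.Counter`; the variable names are illustrative only.

```python
from collections import Counter

# Blood types of the 25 students, copied from Example 1.
blood_types = (
    "A B B AB O A "
    "O O B AB B A "
    "B B O A O AB "
    "A O O O AB O B"
).split()

counts = Counter(blood_types)   # frequency of each class
total = len(blood_types)        # total number of observations

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for blood_type in ("A", "B", "AB", "O"):
    frequency = counts[blood_type]
    percent = 100 * frequency / total
    print(f"{blood_type:<6}{frequency:>10}{percent:>9.0f}")
```

The percent column is the relative frequency of each class multiplied by 100, so the percentages add up to 100.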
An ungrouped frequency distribution is a table of all the raw score values that
occur in the data, along with their corresponding frequencies.
Constructing ungrouped frequency distribution
• First find the smallest and largest raw score in the collected data.
• Arrange the data in order of magnitude and count the frequency.
• To facilitate counting one may include a column of tallies.
Example 2.2: The following data are the ages in years of 20 women who
attended health education last year:
30, 41, 39, 41, 32, 29, 35, 31, 30, 36, 33, 36, 32, 42, 30, 35, 37, 32, 30,
and 41. Construct a frequency distribution for these data.
Arrange the data in increasing order: 29, 30, 30, 30, 30, ….
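The construction steps for Example 2.2 can be sketched as follows. This is one possible way to do the ordering, tallying, and counting, not the only one; the variable names are illustrative.

```python
from collections import Counter

# Ages (years) of the 20 women in Example 2.2.
ages = [30, 41, 39, 41, 32, 29, 35, 31, 30, 36,
        33, 36, 32, 42, 30, 35, 37, 32, 30, 41]

ordered = sorted(ages)          # arrange the data in order of magnitude
frequency = Counter(ordered)    # count how often each value occurs

print(f"{'Age':>3} {'Tally':<5} Frequency")
for age in sorted(frequency):
    tally = "|" * frequency[age]    # one stroke per observation
    print(f"{age:>3} {tally:<5} {frequency[age]}")
```

Only values that actually occur are listed; the smallest and largest raw scores are `ordered[0]` and `ordered[-1]`.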
2. Ungrouped Frequency Distribution
When the range of the data is large, the data must be grouped
into classes that are more than one unit in width.
• Grouped frequency distribution: a frequency distribution
in which several values are grouped into one class.
• Class limits: separate one class in a grouped frequency
distribution from another. The limits can actually appear
in the data, and there is a gap between the upper limit of one
class and the lower limit of the next.
• Unit of measurement (U): the distance between two
possible consecutive measurements. It is usually taken as 1, 0.1,
0.01, etc.
• Class boundaries: separate one class in a grouped
frequency distribution from another.
3. Grouped Frequency Distribution
• The boundaries have one more decimal place than the raw data
and therefore do not appear in the data. There is no gap
between the upper boundary of one class and the lower boundary of
the next class.
Lower class boundary = Lower class limit – U/2
Upper class boundary = Upper class limit + U/2
• Class width: the difference between the upper and lower class
boundaries of any class. It is also the difference between the
lower limits of any two consecutive classes or the difference
between any two consecutive class marks.
• Class mark (midpoint): the average of the lower and upper
class limits, or equivalently the average of the lower and upper class boundaries.
• Cumulative frequency: the number of observations less
than or equal to (or greater than or equal to) a specific value.
• Cumulative frequency above: the total frequency of all
values greater than or equal to the lower class boundary of a
given class.
• Cumulative frequency below: the total frequency of all values
less than or equal to the upper class boundary of a given class.
Cumulative Frequency Distribution (CFD): it is the tabular arrangement of
class interval together with their corresponding cumulative frequencies. It can be
more than or less than type, depending on the type of cumulative frequency used.
Guidelines for classes
1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive.
3. The classes must be all inclusive or exhaustive. This means that all
data values must be included.
4. The classes must be continuous. There are no gaps in a frequency
distribution.
5. The classes must be equal in width. The exception here is the first or
last class.
Steps for constructing Grouped frequency Distribution
1. Find the largest and smallest values.
2. Compute the range: R = Maximum – Minimum.
3. Select the number of classes desired, usually between 5 and 20, or
use Sturges' rule k = 1 + 3.322 log10(n), where k is the number of classes
and n is the total number of observations.
4. Find the class width by dividing the range by the number of classes
and rounding up, not off: w = R/k.
5. Pick a suitable starting point less than or equal to the minimum
value. The starting point is the lower limit of the first class.
Continue to add the class width to this lower limit to get the rest of
the lower limits.
6. To find the upper limit of the first class, subtract U from the lower limit of
the second class. Then continue to add the class width to this upper limit
to find the rest of the upper limits.
7. Find the boundaries by subtracting U/2 units from the lower limits and
adding U/2 units to the upper limits.
8. Tally the data
9. Find the frequencies
10. Find the cumulative frequencies. Depending on what you're trying to
accomplish, it may not be necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative
frequencies
Example 2.3: The following data are the number of minutes taken to
travel from home to work by a group of automobile workers.
28 25 48 37 41 19 32 26 16 23 23 29 36
31 26 21 32 25 31 43 35 42 38 33 28
Construct a frequency distribution for these data.
Solution: Arrange the data in increasing order:
16, 19, 21, 23, 23, 25, 25, 26, 26, 28, 28, 29, 31, 31, 32, 32, 33, 35, 36, 37,
38, 41, 42, 43 and 48.
Step 1: Find the highest and the lowest values: H = 48, L = 16
Step 2: Range = 48 – 16 = 32
Step 3: Select the number of classes desired using Sturges' formula:
k = 1 + 3.322 log10(25) = 5.64 ≈ 6
Step 4: Find the class width: w = 32/6 = 5.33 ≈ 6 (rounding up)
Step 5: Select the starting point: 16, 22, 28, 34, 40, 46 are the lower class limits.
Step 6: Find the upper class limits: 21, 27, 33, 39, 45, 51 are the upper class limits.
Combining step 5 and step 6, one can construct the following classes.
Class limits
16 – 21
22 – 27
28 – 33
34 – 39
40 – 45
46 – 51
Step 7: Find the class boundaries (here U = 1).
E.g., for class 1: Lower class boundary = 16 – U/2 = 15.5
Upper class boundary = 21 + U/2 = 21.5
Then continue adding w to both boundaries to obtain the remaining boundaries.
Doing so gives the following classes.
Class boundary
15.5 – 21.5
21.5 – 27.5
27.5 – 33.5
33.5 – 39.5
39.5 – 45.5
45.5 – 51.5
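Steps 1 to 7 for the travel-time data can be reproduced with a short script. This is a sketch under two assumptions: the unit of measurement is taken as U = 1 because the times are whole minutes, and the Sturges value is rounded to the nearest whole number to match the k ≈ 6 used in the example.

```python
import math

# Travel times (minutes) from Example 2.3.
times = [28, 25, 48, 37, 41, 19, 32, 26, 16, 23, 23, 29, 36,
         31, 26, 21, 32, 25, 31, 43, 35, 42, 38, 33, 28]
unit = 1  # U: assumed, since the data are recorded in whole minutes

data_range = max(times) - min(times)                   # steps 1-2
k = round(1 + 3.322 * math.log10(len(times)))          # step 3: Sturges' rule
width = math.ceil(data_range / k)                      # step 4: round up, not off

lower_limits = [min(times) + i * width for i in range(k)]     # step 5
upper_limits = [low + width - unit for low in lower_limits]   # step 6
boundaries = [(low - unit / 2, up + unit / 2)                 # step 7
              for low, up in zip(lower_limits, upper_limits)]

for (low, up), (lb, ub) in zip(zip(lower_limits, upper_limits), boundaries):
    print(f"{low}-{up}   {lb}-{ub}")
```

For these data the six classes cover every observation. If the range divided by k happens to be a whole number, the last class can stop short of the maximum, so it is worth checking that the last upper limit is at least the largest value.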
Step 8: tally the data.
Class limit   Class boundary   Class mark   Tally        f    <f   >f   rf
16-21         15.5-21.5        18.5         |||          3    3    25   0.12
22-27         21.5-27.5        24.5         ||||| |      6    9    22   0.24
28-33         27.5-33.5        30.5         ||||| |||    8    17   16   0.32
34-39         33.5-39.5        36.5         ||||         4    21   8    0.16
40-45         39.5-45.5        42.5         |||          3    24   4    0.12
46-51         45.5-51.5        48.5         |            1    25   1    0.04
Table 2.1: The distribution of the time in minutes spent by
automobile workers to travel from home to work.
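The frequency, cumulative frequency, and relative frequency columns of the travel-time distribution can be checked with the sketch below. The class limits are hard-coded from steps 5 and 6 of the example, and the variable names are illustrative.

```python
# Travel times (minutes) from Example 2.3.
times = [28, 25, 48, 37, 41, 19, 32, 26, 16, 23, 23, 29, 36,
         31, 26, 21, 32, 25, 31, 43, 35, 42, 38, 33, 28]
class_limits = [(16, 21), (22, 27), (28, 33), (34, 39), (40, 45), (46, 51)]

n = len(times)
# Frequency: number of observations falling inside each class.
freq = [sum(low <= t <= up for t in times) for low, up in class_limits]
# Class mark: average of the lower and upper class limits.
marks = [(low + up) / 2 for low, up in class_limits]

# Less-than cumulative frequency: running total of the frequencies.
less_than, running = [], 0
for f in freq:
    running += f
    less_than.append(running)

# More-than cumulative frequency: observations in this class or above.
more_than = [n - c + f for c, f in zip(less_than, freq)]
# Relative frequency: class frequency divided by the number of observations.
rel_freq = [f / n for f in freq]

for row in zip(class_limits, marks, freq, less_than, more_than, rel_freq):
    print(row)
```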
Home Work
Construct a frequency distribution for the
following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Graphic and Diagrammatic presentation of
data
Graphs
Histogram: A graph in which the classes are marked on the X axis
(horizontal axis) and the frequencies are marked along the Y axis
(vertical axis).
• The height of each bar represents the class frequencies and the width
of the bar represents the class width.
• The bars are drawn adjacent to each other.
Frequency polygon: a graph that consists of line segments connecting
the points formed by the class marks and the frequencies.
• Can be constructed from a histogram by joining the midpoints of the tops of the
bars.
Cumulative frequency graph (ogive): a line graph of the cumulative frequencies
plotted against the class boundaries.
Histograms…
Class boundary   Class mark   No. of workers
15.5 – 21.5      18.5         3
21.5 – 27.5      24.5         6
27.5 – 33.5      30.5         8
33.5 – 39.5      36.5         4
39.5 – 45.5      42.5         3
45.5 – 51.5      48.5         1
Figure 2.5: Distribution of number of minutes spent by the automobile workers.
Frequency Polygon…
Figure 2.6: Distribution of number of minutes spent by the automobile workers.
Class boundaries   Less-than cumulative frequency   Class boundaries   More-than cumulative frequency
Less than 15.5     0                                More than 15.5     25
Less than 21.5     3                                More than 21.5     22
Less than 27.5     9                                More than 27.5     16
Less than 33.5     17                               More than 33.5     8
Less than 39.5     21                               More than 39.5     4
Less than 45.5     24                               More than 45.5     1
Less than 51.5     25                               More than 51.5     0
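The points plotted on the two ogives can be derived from the class frequencies. The sketch below only computes the coordinates (class boundary, cumulative frequency); drawing them is left to whatever plotting tool is at hand. The names `less_ogive` and `more_ogive` are illustrative.

```python
# Class boundaries and class frequencies of the travel-time distribution.
boundaries = [15.5, 21.5, 27.5, 33.5, 39.5, 45.5, 51.5]
freq = [3, 6, 8, 4, 3, 1]
n = sum(freq)

# Less-than type: observations below each boundary.
less_than = [sum(freq[:i]) for i in range(len(boundaries))]
# More-than type: observations above each boundary.
more_than = [n - c for c in less_than]

# Coordinates of the points joined by line segments on each ogive.
less_ogive = list(zip(boundaries, less_than))
more_ogive = list(zip(boundaries, more_than))

print(less_ogive)
print(more_ogive)
```

The less-than ogive rises from 0 to the total frequency, and the more-than ogive falls from the total frequency to 0.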
Cumulative Frequency Polygon (Ogive)…
Figure 2.7: Cumulative frequency graph of number of minutes spent by the automobile
workers.
• It is easier to understand and interpret data when they are
presented graphically than using words or a frequency table. A
graph can present data in a simple and clear way.
Importance of Diagrammatic Representation
• They have greater attraction
• They facilitate comparison
• They are easily understandable
Diagrammatic Representation of Data
• Bar charts and pie chart are commonly used for qualitative or
quantitative discrete data.
• Histograms, frequency polygons and cumulative frequency graph
are used for quantitative continuous data.
• Pie-chart: is a circle divided by radial lines into sections or
sectors so that the area of each sector is proportional to the size
of the figure represented.
Pie-chart construction
• Calculate the percentage frequency of each component: percent = (frequency of the component / total frequency) × 100
• Calculate the angle of each sector in degrees: angle = (frequency of the component / total frequency) × 360
Example 2.4: The following data are the blood types of 50
volunteers at a blood plasma donation clinic:
O A O AB A A O O B A O A AB B O O O A B A A O A A
O B A O AB A O O A B A A A O B O O A O A B O AB A O B
Present the data using a pie chart
Solution: The classes of the frequency distribution are A, B, O and
AB. Count the number of donors for each of the blood types.
Pie-Chart…
Blood type    A       B      O       AB     Total
Frequency     19      8      19      4      50
Percent       38      16     38      8      100
Angle (°)     136.8   57.6   136.8   28.8   360
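The percent and angle rows for the blood-type data can be computed from the frequencies, assuming the usual convention that each sector's angle is its share of the total frequency multiplied by 360 degrees. A minimal sketch:

```python
# Frequencies of the blood types of the 50 donors in Example 2.4.
frequencies = {"A": 19, "B": 8, "O": 19, "AB": 4}
total = sum(frequencies.values())

# Percentage share of each blood type.
percent = {group: 100 * f / total for group, f in frequencies.items()}
# Sector angle in degrees: the same share applied to a full circle.
angle = {group: 360 * f / total for group, f in frequencies.items()}

for group in frequencies:
    print(f"{group:<3}{frequencies[group]:>4}{percent[group]:>7.0f}%{angle[group]:>8.1f}")
```

The angles should add up to 360 degrees, which is a quick check on the arithmetic.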
Figure 2.1: Pie-chart of the data on blood types of donors.
• Bar diagrams are used to represent and compare the frequency
distribution of discrete variables and attributes or categorical
series.
• When we represent data using a bar diagram, all the bars must
have equal width and the distance between bars must be equal.
Bar Chart
Figure 2.2: Bar chart of the data on blood types of donors.
Example 2.5: Present the blood types of the 50 volunteers at the blood
plasma donation clinic from Example 2.4 using a bar chart.
MessahVDiana
 
Veterinary Pharmacology and Toxicology Notes for Diploma Students
Veterinary Pharmacology and Toxicology Notes for Diploma StudentsVeterinary Pharmacology and Toxicology Notes for Diploma Students
Veterinary Pharmacology and Toxicology Notes for Diploma Students
Sir. Stymass Kasty
 
TH'e Oncology Meds for Medicine Services
TH'e Oncology Meds for Medicine ServicesTH'e Oncology Meds for Medicine Services
TH'e Oncology Meds for Medicine Services
Theoncologymeds
 
ECG. OR EKG ELECTRO CARDIO GRAPHY .pptx
ECG. OR EKG  ELECTRO CARDIO GRAPHY .pptxECG. OR EKG  ELECTRO CARDIO GRAPHY .pptx
ECG. OR EKG ELECTRO CARDIO GRAPHY .pptx
rekhapositivity
 
MSUS musculoskeletal ultrasound On The wrist basic level Marwa Abo ELmaaty Be...
MSUS musculoskeletal ultrasound On The wrist basic level Marwa Abo ELmaaty Be...MSUS musculoskeletal ultrasound On The wrist basic level Marwa Abo ELmaaty Be...
MSUS musculoskeletal ultrasound On The wrist basic level Marwa Abo ELmaaty Be...
Internal medicine department, faculty of Medicine Beni-Suef University Egypt
 
Breaking Down the Duties of a Prior Authorization Pharmacist.docx
Breaking Down the Duties of a Prior Authorization Pharmacist.docxBreaking Down the Duties of a Prior Authorization Pharmacist.docx
Breaking Down the Duties of a Prior Authorization Pharmacist.docx
Portiva
 
MIZAN 8 May 2025 Probabation period special.pdf
MIZAN 8  May 2025 Probabation period special.pdfMIZAN 8  May 2025 Probabation period special.pdf
MIZAN 8 May 2025 Probabation period special.pdf
Government Tibbi College and Hospital, Patna
 
Physiology of Central Nervous System - Somatosensory Cortex
Physiology of Central Nervous System - Somatosensory CortexPhysiology of Central Nervous System - Somatosensory Cortex
Physiology of Central Nervous System - Somatosensory Cortex
MedicoseAcademics
 
UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...
UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...
UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...
DHANASHREE KOLHEKAR
 
Navitor: An Intra-annular self-expanding valve option
Navitor: An Intra-annular self-expanding valve optionNavitor: An Intra-annular self-expanding valve option
Navitor: An Intra-annular self-expanding valve option
Duke Heart
 
Ophthalmological notes for dental students
Ophthalmological notes for dental studentsOphthalmological notes for dental students
Ophthalmological notes for dental students
KafrELShiekh University
 
MSUS On The Knee basic level Marwa Abo ELmaaty Besar.pdf
MSUS On The Knee basic level Marwa Abo ELmaaty Besar.pdfMSUS On The Knee basic level Marwa Abo ELmaaty Besar.pdf
MSUS On The Knee basic level Marwa Abo ELmaaty Besar.pdf
Internal medicine department, faculty of Medicine Beni-Suef University Egypt
 
GIT DIAGNOSTIC STUDIES, gastro intestinal system
GIT DIAGNOSTIC STUDIES, gastro intestinal systemGIT DIAGNOSTIC STUDIES, gastro intestinal system
GIT DIAGNOSTIC STUDIES, gastro intestinal system
36MariaTheresMathew
 
SPLEEN, OMENTAL BURSAE AND INTESTINES kenn.pptx
SPLEEN, OMENTAL BURSAE AND INTESTINES kenn.pptxSPLEEN, OMENTAL BURSAE AND INTESTINES kenn.pptx
SPLEEN, OMENTAL BURSAE AND INTESTINES kenn.pptx
Keneth Hesbon
 
2025 lobotomy vs nasal surgery comparison
2025 lobotomy vs nasal surgery comparison2025 lobotomy vs nasal surgery comparison
2025 lobotomy vs nasal surgery comparison
yilef94631
 
Physiology of Defense System - B-Cell Immunity
Physiology of Defense System - B-Cell ImmunityPhysiology of Defense System - B-Cell Immunity
Physiology of Defense System - B-Cell Immunity
MedicoseAcademics
 
ASD ( Atrial Septal Defect)... An Overview
ASD ( Atrial Septal Defect)... An OverviewASD ( Atrial Septal Defect)... An Overview
ASD ( Atrial Septal Defect)... An Overview
dramiraaref
 
Pathogen Recognition and their presentation to immune system.pptx
Pathogen Recognition and their presentation to immune system.pptxPathogen Recognition and their presentation to immune system.pptx
Pathogen Recognition and their presentation to immune system.pptx
AbhisekManna3
 
Acute Myocarditis and Management in children.pptx
Acute Myocarditis and Management  in children.pptxAcute Myocarditis and Management  in children.pptx
Acute Myocarditis and Management in children.pptx
Mohammad ALktifan
 
Physiology of Pain and thermal sensations
Physiology of Pain and thermal sensationsPhysiology of Pain and thermal sensations
Physiology of Pain and thermal sensations
MedicoseAcademics
 
Lung ultrasound essential for BLUE protocol
Lung ultrasound essential for BLUE protocolLung ultrasound essential for BLUE protocol
Lung ultrasound essential for BLUE protocol
MessahVDiana
 
Veterinary Pharmacology and Toxicology Notes for Diploma Students
Veterinary Pharmacology and Toxicology Notes for Diploma StudentsVeterinary Pharmacology and Toxicology Notes for Diploma Students
Veterinary Pharmacology and Toxicology Notes for Diploma Students
Sir. Stymass Kasty
 
TH'e Oncology Meds for Medicine Services
TH'e Oncology Meds for Medicine ServicesTH'e Oncology Meds for Medicine Services
TH'e Oncology Meds for Medicine Services
Theoncologymeds
 
ECG. OR EKG ELECTRO CARDIO GRAPHY .pptx
ECG. OR EKG  ELECTRO CARDIO GRAPHY .pptxECG. OR EKG  ELECTRO CARDIO GRAPHY .pptx
ECG. OR EKG ELECTRO CARDIO GRAPHY .pptx
rekhapositivity
 
Breaking Down the Duties of a Prior Authorization Pharmacist.docx
Breaking Down the Duties of a Prior Authorization Pharmacist.docxBreaking Down the Duties of a Prior Authorization Pharmacist.docx
Breaking Down the Duties of a Prior Authorization Pharmacist.docx
Portiva
 
Physiology of Central Nervous System - Somatosensory Cortex
Physiology of Central Nervous System - Somatosensory CortexPhysiology of Central Nervous System - Somatosensory Cortex
Physiology of Central Nervous System - Somatosensory Cortex
MedicoseAcademics
 
UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...
UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...
UNIT 5- Metabolite Identification. IN VITRO-IN VIVO Approaches, Protocols & S...
DHANASHREE KOLHEKAR
 
Navitor: An Intra-annular self-expanding valve option
Navitor: An Intra-annular self-expanding valve optionNavitor: An Intra-annular self-expanding valve option
Navitor: An Intra-annular self-expanding valve option
Duke Heart
 
Ophthalmological notes for dental students
Ophthalmological notes for dental studentsOphthalmological notes for dental students
Ophthalmological notes for dental students
KafrELShiekh University
 
GIT DIAGNOSTIC STUDIES, gastro intestinal system
GIT DIAGNOSTIC STUDIES, gastro intestinal systemGIT DIAGNOSTIC STUDIES, gastro intestinal system
GIT DIAGNOSTIC STUDIES, gastro intestinal system
36MariaTheresMathew
 
Ad

Biostatistics ppt itroductionchapter 1.pptx

  • 1. Adinew Handiso, Wachamo University, College of Medicine and Health Science, Department of Public Health. Introduction to Biostatistics
  • 2. Contents: Introduction; Descriptive Statistics (Demography and vital statistics); Probability and Probability Distributions; Sampling and Sampling Distributions; Statistical Estimation and Hypothesis Testing; Introduction to Correlation and Regression
  • 3. INTRODUCTION. Statistics can be defined in two senses. Plural sense: statistics are aggregates of numerically expressed facts (figures) collected in a systematic manner for a predetermined purpose. Singular sense: statistics is the science of collecting, organizing, presenting, analyzing and interpreting numerical data in order to make decisions on the basis of the analysis.
  • 4. Biostatistics: the application of statistical methods to the biological and medical sciences and to public health. It is concerned with the interpretation of biological data and the communication of information derived from these data, and it has a central role in medical investigations. Classification of statistics: depending on how the data are used, statistics is classified into two main branches: 1. Descriptive statistics 2. Inferential statistics
  • 5. Descriptive statistics: ways of organizing and summarizing data; helps to identify the general features and trends in a set of data and to extract useful information; uses numerical and graphical methods to look for patterns in the data set. Examples: tables, graphs, numerical summary measures. Inferential statistics: techniques by which inferences about population parameters are drawn from sample statistics; the observed sample statistics are inferred to the corresponding population parameters; methods used for drawing conclusions about a population based on the information obtained from a sample of observations drawn from that population. Examples: principles of probability, estimation, hypothesis testing.
  • 6. Stages of Statistical Investigation. There are five stages in any statistical investigation: collection, organization, presentation, analysis and interpretation of data. 1. Collection of data: the process of measuring, gathering and assembling the raw data upon which the statistical investigation is to be based. 2. Organization of data: if an investigator has collected data through a survey, the data must be edited to correct any apparent inconsistencies, ambiguities and recording errors.
  • 7. Stages of Statistical Investigation (continued). 3. Presentation of data: the organized data can now be presented in the form of tables, diagrams or graphs. Presenting the data in an orderly manner facilitates both understanding and analysis. 4. Analysis of data: the basic purpose of data analysis is to make the data useful for drawing conclusions. Analysis usually involves highly complex and sophisticated mathematical techniques. Calculating measures of central tendency and measures of dispersion are activities of analysis. 5. Interpretation of data: interpretation means drawing conclusions from the data that form the basis of decision making. This is the stage where we draw valid conclusions from the results obtained through data analysis.
  • 8. Some Basic Terms in Statistics. Population: the collection of all possible observations of a specified characteristic of interest. Sample: a portion of the population taken so that some generalization about the population can be made. Sampling: the process or method of selecting a sample from the population. Sample size: the number of elements or observations to be included in the sample. Census: complete enumeration or observation of the elements of the population.
  • 9. Target population: a collection of items that have something in common and about which we wish to draw conclusions at a particular time; the whole group of interest. Example: all hospitals in Ethiopia. Study population: the subset of the target population that has at least some chance of being sampled; the specific population group from which samples are drawn and data are collected. Sample: a subset of the population, that is, a given collection of observations or measurements taken from the population.
  • 10. Example: in a study of the prevalence of HIV among adolescents in Ethiopia, a random sample of adolescents in Yeka Kifle Ketema of Addis Ababa (AA) was included. Target population: all adolescents in Ethiopia. Study population: all adolescents in Addis Ababa. Sample: adolescents in Yeka Kifle Ketema. Sample survey: the technique of collecting information from a portion of the population. Census survey: the collection of data from every element in a population. Sampling: the process or method of selecting a sample from the population.
  • 11. Sample size: the number of elements or observations to be included in the sample. Parameter: a descriptive measure computed from the data of a population. Statistic: a descriptive measure computed from the data of a sample. Statistical data: numerical descriptions of things. These descriptions may take the form of counts or measurements.
  • 12. Variable: a characteristic that takes on different values in different persons, places or things; a characteristic or property that changes or varies over time and/or for different individuals or objects under consideration; a quality or quantity that varies from one member of a sample or population to another. It can also be defined as the generic characteristic being measured or observed, e.g., HIV status, heart rate, the heights of adult males, the weights of preschool children. Data: the raw material of statistics. Data may be defined as sets of values or observations resulting from the process of counting or from taking a measurement; a set of related observations (measurements) or facts collected to draw conclusions. It can be either a sample or a population. For example: when a hospital administrator counts the number of patients, 1
  • 13. Types of Variables. 1. Qualitative variables: variables whose values are nonnumeric categories and cannot be measured numerically. Examples: stage of breast cancer (I, II, III or IV), blood type, marital status, etc. 2. Quantitative variables: variables that can be measured (or counted) and expressed numerically. Examples: weight, height, number of car accidents, etc. Quantitative variables are divided into two types: discrete variables and continuous variables.
  • 14. Types of Variables… Discrete variable: can take only a limited number of distinct values (usually whole numbers). It is characterized by gaps or interruptions between the values it can take. The values are not just labels but actual measurable quantities. Examples: number of patients in a hospital, number of bedrooms in your house, number of students attending a conference, number of members in a household (family size).
  • 15. Types of Variables… Continuous variable: can take an infinite number of possible values in any given interval. Its values are usually obtained by measurement, not by counting, and it has no gaps or interruptions. Examples: weight is continuous since it can take on any value (e.g., 30.75 kg); height of seedlings; temperature measurements, etc.
  • 16. Scales of Measurement. A measurement scale refers to the kind of value assigned to the data, based on the properties of order, distance and fixed zero. There are four types of scales of measurement. Nominal scale: the lowest level of measurement from a statistical point of view. As the name implies, it simply places data into categories, without any order or structure. Nominal scales are used for labeling variables, without any quantitative value and with no logical order. This means: no magnitude; unordered categories; numbers are used only to represent categories. Averages are meaningless; look at the frequency or proportion in each category. Examples: political party preference (Republican, Democrat or other), marital status (married, single, widowed, divorced), sex (male or female), regions of Ethiopia (region 1, 2, …), etc.
  • 17. Ordinal scale: the data have a logical order, but the differences between values are not constant. With ordinal scales, the order of the values is what is important and significant; the differences between them are not really known. The simplest ordinal scale is a ranking. Ordinal scales typically measure non-numeric concepts such as satisfaction, happiness, discomfort, etc. This means: no magnitude; ordered categories; numbers are used only to represent categories. Order matters, magnitude does not, and differences between categories are meaningless. Examples: rating scales (excellent, very good, good, fair, poor), academic qualification (BSc, MSc, PhD), T-shirt size (small, medium, large), military rank (from Private to General), etc.
  • 18. Interval scale: a level of measurement that classifies data that can be ranked and for which differences are meaningful. However, there is no meaningful zero, so ratios are meaningless. Interval scales are numeric scales in which we know not only the order but also the exact differences between the values. Addition and subtraction are possible; multiplication and division are not. The most common examples are IQ, temperature and calendar dates. Example 1: degrees Fahrenheit. The difference between 20 and 30 degrees is the same as that between 50 and 60 degrees. Example 2: years. The difference between 2006 and 2008 is the same as that between 2008 and 2010.
  • 19. Ratio scale: the most detailed and objectively interpretable of the measurement scales. It is an interval scale that has a true zero point (i.e., zero on the scale represents a total absence of the variable being measured). It is the level of measurement that classifies data that can be ranked, for which differences are meaningful, and for which there is a true zero. Examples: salary of employees, price of goods, age, weight, height, etc. Note: ratio and interval level data are classified as quantitative variables, and nominal and ordinal level data are classified as qualitative variables.
  • 20. Exercise. For each of the following variables, indicate whether it is quantitative or qualitative and specify the measurement scale. 1. Status of student: undergraduate, postgraduate. 2. Number of children in a family. 3. Time to complete a statistics test. 4. Number of cigarettes smoked per day. 5. Opinion of students about stats classes: very unhappy, unhappy, neutral, happy, ecstatic. 6. Smoking status: smoker, non-smoker. 7. Attendance: present, absent. 8. Class of mark: pass, fail.
  • 21. Uses of Statistics. Some of the most common uses of statistics are: it condenses and summarizes complex data; it facilitates comparison of data; statistical methods are very helpful in formulating and testing hypotheses and in developing new theories; sampling and estimation methods are used to study the factors related to compliance and outcome; it helps the researcher to arrive at a scientific judgment about a hypothesis; it helps in predicting future trends, since statistics is extremely useful for analyzing past and present data and predicting future trends.
  • 22. Limitations of Statistics. As a science, statistics has its own limitations. Some of them are: statistics does not deal with single (individual) values; statistics does not deal with qualitative characteristics, that is, it is not applicable to qualitative characteristics such as beauty, honesty, poverty, etc.; statistical data are only approximately and not mathematically correct; statistics can easily be misused and should therefore be used by experts.
  • 23. Types of Data. Statistical data may be classified into two categories, depending on the source: 1) primary data and 2) secondary data. Primary data: data collected directly from the items or individual respondents by the researcher for the purpose of a study. Secondary data: data that had been collected and statistically treated by other people or organizations, and that are then used by others for a different purpose.
  • 24. Data Collection Methods. Data collection is the process of gathering and measuring information on the variables of interest in order to answer research questions and evaluate outcomes. Data collection techniques allow us to systematically collect data about our objects of study (people, objects and phenomena) and about the setting in which they occur. Data collection techniques include: 1. Observation 2. Interview 3. Questionnaire 4. Focus group discussion (FGD) 5. Planned experimentation 6. Document analysis
  • 25. Data Collection Methods… Observation: a technique that involves systematically selecting, watching and recording the behavior of people or other phenomena. Interview: a conversation between two people that is initiated by the interviewer in order to obtain the required information. All respondents are asked the same list of questions. Answers to the questions posed during an interview can be recorded by writing them down (either during the interview itself or immediately afterwards), by tape-recording the responses, or by a combination of both.
  • 26. Data Collection Methods… Questionnaire: a list of questions in written form that is aimed at discovering particular information. The investigator prepares a number of questions pertaining to the field of enquiry. The success of a questionnaire depends on designing it properly and on acquiring the cooperation of the respondents. Focus group discussion: a good way to bring together people with similar backgrounds or experiences to discuss a specific topic of interest. It is important for a deeper understanding of the phenomena being studied.
  • 27. Data Collection Methods… Planned experimentation: the desired statistical information can be collected by conducting a planned experiment in laboratories or at experiment sites. Document analysis: gathering information by studying and analyzing already available sources, which can be published or unpublished. Examples: official publications of the Central Statistical Authority; publications of the Ministry of Health and other ministries; international publications such as those of WHO, the World Bank and UNICEF; records of hospitals or other health institutions, etc. Reading assignment: discuss the advantages and disadvantages of the above data collection methods with respect to each other.
  • 28. 2. Descriptive statistics. 2.1 METHODS OF DATA ORGANIZATION & PRESENTATION. Having collected and edited the data, the next important step is to organize it, that is, to present it in a readily comprehensible, condensed form that helps in drawing inferences from it. The presentation of data is broadly classified into two categories: tabular presentation, and diagrammatic and graphic presentation. The process of arranging data into classes or categories according to similarities is technically called classification. Classification is a preliminary step that prepares the ground for proper presentation of data.
  • 29. Definitions of Some Terminologies. Raw data: recorded information in its original collected form, whether counts or measurements. Array: data put in ascending or descending order of magnitude. Grouped data: data presented in the form of a frequency distribution. Frequency: the number of values in a specific class of the distribution. Frequency distribution: the organization of raw data in table form using classes and frequencies. Relative frequency: the frequency of a class divided by the total number of observations. Relative cumulative frequency: the cumulative frequency divided by the total frequency.
  • 30. Types of Frequency Distribution. There are three basic types of frequency distribution: categorical, ungrouped and grouped. 1. Categorical frequency distribution: used for data that can be placed in specific categories, such as nominal or ordinal data. Example 2.1: a social worker collected the following data on marital status for 25 persons (M = married, S = single, W = widowed, D = divorced).
  • 31. Cont… Example 1: construct a categorical frequency distribution for the blood types of 25 students given below: A B B AB O A O O B AB B A B B O A O AB A O O O AB O B. Table columns: Class (1), Tally (2), Frequency (3), Percent (4).
  • 32. Categorical frequency distribution of marital status:
        Class | Frequency | Percent
        M     | 6         | 24
        S     | 7         | 28
        D     | 7         | 28
        W     | 5         | 20
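The blood type example on slide 31 can be tallied the same way as the marital status table. The following is a minimal sketch, assuming Python; the function name `categorical_frequency_table` is hypothetical and chosen only for illustration, and the data are copied from the slide.

```python
from collections import Counter

# Blood types of the 25 students, copied from the example on slide 31.
blood_types = ("A B B AB O A O O B AB B A B B O A O AB A O "
               "O O AB O B").split()


def categorical_frequency_table(values):
    """Return (class, frequency, percent) rows for categorical data."""
    counts = Counter(values)          # tally each category
    total = len(values)
    # Percent = class frequency / total number of observations * 100.
    return [(cls, freq, 100 * freq / total)
            for cls, freq in sorted(counts.items())]


table = categorical_frequency_table(blood_types)
for cls, freq, pct in table:
    print(f"{cls:<3} {freq:>3} {pct:>6.1f}%")
```

The frequencies should add up to 25 and the percentages to 100, which is a quick check that no observation was missed in the tally.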
  • 33. 2. Ungrouped Frequency Distribution. It is a table of all the raw score values that could possibly occur in the data, along with their corresponding frequencies. Constructing an ungrouped frequency distribution: first find the smallest and largest raw scores in the collected data; arrange the data in order of magnitude and count the frequency of each value; to facilitate counting, one may include a column of tallies. Example 2.2: the following data are the ages in years of 20 women who attended health education last year: 30, 41, 39, 41, 32, 29, 35, 31, 30, 36, 33, 36, 32, 42, 30, 35, 37, 32, 30 and 41. Construct a frequency distribution for these data. Arrange the data in increasing order: 29, 30, 30, 30, 30, …
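The counting step of Example 2.2 can be sketched as follows. This is an illustrative sketch in Python, not part of the original slides; the function name `ungrouped_frequency_table` is hypothetical, and only values that actually occur in the data are listed.

```python
from collections import Counter

# Ages in years of the 20 women, copied from Example 2.2.
ages = [30, 41, 39, 41, 32, 29, 35, 31, 30, 36,
        33, 36, 32, 42, 30, 35, 37, 32, 30, 41]


def ungrouped_frequency_table(values):
    """Return (value, frequency) rows in increasing order of value."""
    counts = Counter(values)
    return [(value, counts[value]) for value in sorted(counts)]


age_table = ungrouped_frequency_table(ages)
for value, freq in age_table:
    # The row of asterisks plays the role of the tally column.
    print(f"{value:>3} {'*' * freq:<5} {freq}")
```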
  • 34. 3. Grouped Frequency Distribution. When the range of the data is large, the data must be grouped into classes that are more than one unit in width. Grouped frequency distribution: a frequency distribution in which several values are grouped into one class. Class limits: separate one class in a grouped frequency distribution from another. The limits could actually appear in the data, and there are gaps between the upper limit of one class and the lower limit of the next. Unit of measurement (U): the distance between two possible consecutive measures. It is usually taken as 1, 0.1, 0.01, etc. Class boundaries: separate one class in a grouped frequency distribution from another.
  • 35. Grouped Frequency Distribution… The boundaries have one more decimal place than the raw data and therefore do not appear in the data. There is no gap between the upper boundary of one class and the lower boundary of the next class. Lower class boundary = lower class limit - U/2. Upper class boundary = upper class limit + U/2. Class width: the difference between the upper and lower class boundaries of any class. It is also the difference between the lower limits of any two consecutive classes, or the difference between any two consecutive class marks.
  • 36. Grouped Frequency Distribution… Class mark (midpoint): the average of the lower and upper class limits, or equivalently of the lower and upper class boundaries. Cumulative frequency: the number of observations less than, or more than, or equal to a specific value. Cumulative frequency above: the total frequency of all values greater than or equal to the lower class boundary of a given class. Cumulative frequency below: the total frequency of all values less than or equal to the upper class boundary of a given class.
  • 37. Grouped Frequency Distribution… Cumulative frequency distribution (CFD): the tabular arrangement of class intervals together with their corresponding cumulative frequencies. It can be of the "more than" or the "less than" type, depending on the type of cumulative frequency used. Guidelines for classes: 1. There should be between 5 and 20 classes. 2. The classes must be mutually exclusive. 3. The classes must be all-inclusive or exhaustive, meaning that all data values must be included. 4. The classes must be continuous; there are no gaps in a frequency distribution. 5. The classes must be equal in width; the exception is the first or last class.
  • 38. Grouped Frequency Distribution… Steps for constructing a grouped frequency distribution: 1. Find the largest and smallest values. 2. Compute the range: R = maximum - minimum. 3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule k = 1 + 3.322 log10(n), where k is the number of classes and n is the total number of observations. 4. Find the class width by dividing the range by the number of classes and rounding up, not off: w = R/k. 5. Pick a suitable starting point less than or equal to the minimum value. The starting point is the lower limit of the first class. Keep adding the class width to this lower limit to get the rest of the lower limits.
  • 39. Grouped Frequency Distribution… 6. To find the upper limit of the first class, subtract U from the lower limit of the second class. Then keep adding the class width to this upper limit to find the rest of the upper limits. 7. Find the boundaries by subtracting U/2 from the lower limits and adding U/2 to the upper limits. 8. Tally the data. 9. Find the frequencies. 10. Find the cumulative frequencies. Depending on what you are trying to accomplish, this may not be necessary. 11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
  • 40. Grouped Frequency Distribution… Example 2.3: the following data are the number of minutes taken to travel from home to work by a group of automobile workers: 28 25 48 37 41 19 32 26 16 23 23 29 36 31 26 21 32 25 31 43 35 42 38 33 28. Construct a frequency distribution for these data. Solution: arrange the data in increasing order: 16, 19, 21, 23, 23, 25, 25, 26, 26, 28, 28, 29, 31, 31, 32, 32, 33, 35, 36, 37, 38, 41, 42, 43 and 48. Step 1: find the highest and the lowest values: H = 48, L = 16. Step 2: range = 48 - 16 = 32.
  • 41. Grouped Frequency Distribution… Step 3: select the number of classes using Sturges' formula: k = 1 + 3.322 log10(25) = 5.64 ≈ 6. Step 4: find the class width: w = 32/6 = 5.33 ≈ 6 (rounding up). Step 5: select the starting point: 16, 22, 28, 34, 40, 46 are the lower class limits. Step 6: find the upper class limits: 21, 27, 33, 39, 45, 51. Combining steps 5 and 6 gives the following class limits: 16-21, 22-27, 28-33, 34-39, 40-45, 46-51.
  • 42. Grouped Frequency Distribution… Step 7: find the class boundaries. For class 1, for example: lower class boundary = 16 - U/2 = 15.5 and upper class boundary = 21 + U/2 = 21.5. Then keep adding w to both boundaries to obtain the remaining boundaries. This gives the following class boundaries: 15.5-21.5, 21.5-27.5, 27.5-33.5, 33.5-39.5, 39.5-45.5, 45.5-51.5. Step 8: tally the data.
  • 43. Grouped Frequency Distribution… Table 2.1: The distribution of the time in minutes spent by automobile workers to travel from home to work.
        Class limits | Class boundaries | Class mark | f | <f | >f | rf
        16-21        | 15.5-21.5        | 18.5       | 3 | 3  | 25 | 0.12
        22-27        | 21.5-27.5        | 24.5       | 6 | 9  | 22 | 0.24
        28-33        | 27.5-33.5        | 30.5       | 8 | 17 | 16 | 0.32
        34-39        | 33.5-39.5        | 36.5       | 4 | 21 | 8  | 0.16
        40-45        | 39.5-45.5        | 42.5       | 3 | 24 | 4  | 0.12
        46-51        | 45.5-51.5        | 48.5       | 1 | 25 | 1  | 0.04
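The construction steps of Example 2.3 can be sketched in code as a check on Table 2.1. This is a minimal sketch, assuming Python and whole-number data (unit of measurement U = 1); the function name `grouped_frequency_table` and the dictionary keys are hypothetical. The number of classes from Sturges' rule is rounded to the nearest integer, as the worked example does, and an extra class is added only if the computed classes would not reach the maximum.

```python
import math

# Travel times in minutes, copied from Example 2.3.
minutes = [28, 25, 48, 37, 41, 19, 32, 26, 16, 23, 23, 29, 36,
           31, 26, 21, 32, 25, 31, 43, 35, 42, 38, 33, 28]


def grouped_frequency_table(values, unit=1):
    """Build class limits, boundaries, marks and frequencies (integer data)."""
    n = len(values)
    low, high = min(values), max(values)           # step 1
    data_range = high - low                        # step 2
    k = round(1 + 3.322 * math.log10(n))           # step 3: Sturges' rule
    width = math.ceil(data_range / k)              # step 4: round up, not off

    rows, lower, cumulative = [], low, 0           # step 5: start at the minimum
    while lower <= high or len(rows) < k:
        upper = lower + width - unit               # step 6: upper class limit
        freq = sum(lower <= v <= upper for v in values)   # steps 8 and 9
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - unit / 2, upper + unit / 2),  # step 7
            "mark": (lower + upper) / 2,
            "f": freq,
            "less_than": cumulative + freq,        # step 10: <f
            "more_than": n - cumulative,           # step 10: >f
            "rf": freq / n,                        # step 11
        })
        cumulative += freq
        lower += width
    return rows


rows = grouped_frequency_table(minutes)
for r in rows:
    print(r["limits"], r["boundaries"], r["mark"],
          r["f"], r["less_than"], r["more_than"], round(r["rf"], 2))
```

Starting the first class at the minimum value is one of several acceptable choices under step 5; any starting point less than or equal to the minimum would also satisfy the guideline.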
  • 44. Homework: construct a frequency distribution for the following data: 11 29 6 33 14 31 22 27 19 20 18 17 22 38 23 21 26 34 39 27
  • 45. Graphic and Diagrammatic presentation of data
    Graphs
    Histogram: A graph in which the classes are marked on the X axis (horizontal axis) and the frequencies along the Y axis (vertical axis).
    • The height of each bar represents the class frequency and the width of the bar represents the class width.
    • The bars are drawn adjacent to each other.
    Frequency polygon: A graph that consists of line segments connecting the points plotted at the class marks and their frequencies.
    • Can be constructed from a histogram by joining the mid-points of the tops of the bars.
    Cumulative frequency graph (ogive): A graph of the cumulative frequencies plotted against the class boundaries; it may be drawn as a polygon or as a smooth freehand curve.
  • 46. Histograms…
    Class boundary   Class mark   No. of workers
    15.5 – 21.5      18.5         3
    21.5 – 27.5      24.5         6
    27.5 – 33.5      30.5         8
    33.5 – 39.5      36.5         4
    39.5 – 45.5      42.5         3
    45.5 – 51.5      48.5         1
    Figure 2.5: Distribution of number of minutes spent by the automobile workers.
  • 47. Frequency Polygon…
    Figure 2.6: Distribution of number of minutes spent by the automobile workers.
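The vertices of the frequency polygon are the points (class mark, frequency). The sketch below lists them for the travel-time data. Closing the polygon on the horizontal axis, one class width beyond the first and last class marks, is a common convention assumed here; it is not stated in the slides.

```python
class_marks = [18.5, 24.5, 30.5, 36.5, 42.5, 48.5]
frequencies = [3, 6, 8, 4, 3, 1]
w = 6  # class width from Step 4

# Vertices of the polygon: one point per class
points = list(zip(class_marks, frequencies))

# Assumed convention: add zero-frequency points one class width
# before the first and after the last class mark
closed = [(class_marks[0] - w, 0)] + points + [(class_marks[-1] + w, 0)]

for x, y in closed:
    print(x, y)
```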
  • 48. Cumulative Frequency Polygon (Ogive)…
    Class boundaries   Less-than cumulative frequency   Class boundaries   More-than cumulative frequency
    Less than 15.5     0                                More than 15.5     25
    Less than 21.5     3                                More than 21.5     22
    Less than 27.5     9                                More than 27.5     16
    Less than 33.5     17                               More than 33.5     8
    Less than 39.5     21                               More than 39.5     4
    Less than 45.5     24                               More than 45.5     1
    Less than 51.5     25                               More than 51.5     0
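Both cumulative columns follow from the class frequencies by running sums, as this short sketch shows. The variable names are illustrative.

```python
from itertools import accumulate

boundaries = [15.5, 21.5, 27.5, 33.5, 39.5, 45.5, 51.5]
frequencies = [3, 6, 8, 4, 3, 1]
n = sum(frequencies)

# Number of observations below each boundary (starts at 0)
less_than = [0] + list(accumulate(frequencies))

# Number of observations above each boundary (starts at n)
more_than = [n - c for c in less_than]

for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"Less than {b}: {lt:2d}    More than {b}: {mt:2d}")
```

Plotting `less_than` or `more_than` against `boundaries` gives the points of the less-than or more-than ogive.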
  • 49. Cumulative Frequency Polygon (Ogive)…
    Figure 2.7: Cumulative frequency graph of number of minutes spent by the automobile workers.
  • 50. Diagrammatic Representation of Data
    • Data are usually easier to understand and interpret when presented graphically than when described in words or listed in a frequency table. A graph can present data in a simple and clear way.
    Importance of Diagrammatic Representation
    • Diagrams attract more attention
    • They facilitate comparison
    • They are easily understandable
  • 51. Diagrammatic Representation of Data…
    • Bar charts and pie charts are commonly used for qualitative or quantitative discrete data.
    • Histograms, frequency polygons and cumulative frequency graphs are used for quantitative continuous data.
    • Pie chart: a circle divided by radial lines into sectors so that the area of each sector is proportional to the size of the component it represents.
    Pie-chart construction
    • Calculate the percentage frequency of each component. It is given by: percentage = (frequency of the component / total frequency) × 100.
    • Calculate the angle in degrees of each sector. It is given by: angle = (frequency of the component / total frequency) × 360°.
  • 52. Pie-Chart…
    Example 2.4: The following data are the blood types of 50 volunteers at a blood plasma donation clinic:
    O A O AB A A O O B A O A AB B O O O A B A A O A A O B A O AB A O O A B A A A O B O O A O A B O AB A O B
    Present the data using a pie chart.
    Solution: The classes of the frequency distribution are A, B, O and AB. Count the number of donors for each blood type.
    Blood type   A       B      O       AB     Total
    Frequency    19      8      19      4      50
    Percent      38      16     38      8      100
    Angle (°)    136.8   57.6   136.8   28.8   360
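The counts, percentages and sector angles for Example 2.4 can be reproduced with the sketch below, using the usual definitions percent = (f / n) × 100 and angle = (f / n) × 360°. The variable names are illustrative.

```python
from collections import Counter

# Blood types of the 50 volunteers in Example 2.4
donors = ("O A O AB A A O O B A O A AB B O O O A B A A O A A O "
          "B A O AB A O O A B A A A O B O O A O A B O AB A O B").split()

counts = Counter(donors)
n = len(donors)

summary = {}
for blood_type in ["A", "B", "O", "AB"]:
    f = counts[blood_type]
    summary[blood_type] = {
        "frequency": f,
        "percent": f / n * 100,   # share of the total, in percent
        "angle": f / n * 360,     # sector angle, in degrees
    }

for blood_type, s in summary.items():
    print(f"{blood_type:>2}: f={s['frequency']:2d}  {s['percent']:.0f}%  {s['angle']:.1f} degrees")
```

The angles add up to 360°, which is a quick check that no category has been missed.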
  • 53. Pie-Chart…
    Figure 2.1: Pie-chart of the data on blood types of donors.
  • 54. Bar Chart
    • Bar diagrams are used to represent and compare the frequency distributions of discrete variables and of attributes (categorical series).
    • In a bar diagram, all the bars must have equal width and the distance between bars must be equal.
  • 55. Bar Chart…
    Example 2.5: Present the blood types of the 50 volunteers at the blood plasma donation clinic from Example 2.4 using a bar chart.
    Figure 2.2: Bar chart of the data on blood types of donors.
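Since the figure itself is not reproduced in this transcript, a rough text version of the bar chart can be printed from the frequencies in Example 2.4. This is only an illustrative sketch: each `#` stands for one donor, and every bar occupies one line so that the bars have equal width and spacing.

```python
# Frequencies of blood types from Example 2.4
frequencies = {"A": 19, "B": 8, "O": 19, "AB": 4}

# One line per category: label, bar of '#' characters, then the count
lines = [f"{blood_type:>2} | {'#' * f} {f}" for blood_type, f in frequencies.items()]

print("\n".join(lines))
```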