2. CONTENTS
Introduction
Definition
Branches of Biostatistics
Uses/ application of biostatistics
Some important terms
Sampling
Collection of data
Presentation of data
- distribution
Summarization of data
- measures of central tendency
- dispersion
- probability
Summary
References
09/06/2025 2
3. introduction
Statistics has been derived from the Latin
word status.
Statistics today refers to either quantitative
information or to a method of dealing with
quantitative or qualitative information.
Statistics may be defined as the discipline
concerned with the treatment of numerical
data derived from groups of individuals.
4. BIOSTATISTICS
Biostatistics is a method of collecting, organizing,
tabulating, analyzing and interpreting
data related to living organisms and
human beings.
Bios (life) + Metron (measure) = Biometry (measurement of life)
5.
“When you can measure what you are speaking
about and express it in numbers, you know
something about it; but when you cannot
measure it, when you cannot express it in
numbers, your knowledge is of a meagre and
unsatisfactory kind.”
- LORD KELVIN
6. Branches of Biostatistics
• Descriptive Biostatistics
Methods of producing quantitative summaries
of information in biological sciences.
Tabulation and Graphical presentation
7. Branches of Biostatistics…
Inferential Biostatistics
Methods of making generalizations about a
larger group based on information about a sample
of that group in biological sciences.
Primarily performed in two ways:
• Estimation
• Testing of hypothesis
8. Uses/ APPLICATIONS of
biostatistics
In Physiology and Anatomy:
1. To define what is normal or healthy for a
population and to find limits of normality in
variables.
2. To find the difference between means and
proportions of normal at two places or in
different periods.
9. In Pharmacology:
1. To find the action of drug.
2. To compare the action of two different drugs
or two successive dosages of the same drug.
3. To find the relative potency of a new drug
with respect to a standard drug.
10. In Medicine:
1. To compare the efficacy of a particular drug,
operation or line of treatment.
2. To find an association between two
attributes.
3. To identify signs and symptoms of a disease
or syndrome.
11. • Most people have heard the statistic that Heart
disease is the leading cause of death in America today*.
• But how do we know this fact to be true?
• Where did that information come from?
* [source: Centers for Disease Control and Prevention, USA]
In medicine
12. • Back in 1948, when a lot wasn't known about the factors
leading to heart disease and stroke, a health research
study -- known as the Framingham Heart Study -- was
done on 5,209 people living in the town of Framingham,
Mass.
• These participants hadn't developed any known
symptoms of cardiovascular disease and hadn't had a
stroke or heart attack.
13. • They agreed to be followed over a period of time to
help researchers learn what factors lead to both
conditions.
• The study was landmark in several ways. It showed
that there was no one cause for getting a heart attack,
and combining information about several risk factors
could estimate the risk of someone getting the disease.
14. • Thanks to the Framingham Study, (which is still going
on today), we now know the major risk factors that
lead to cardiovascular disease.
• To reach these conclusions, researchers simply
followed the numbers -- the Biostatistics numbers to
be exact.
15. Clinical medicine
• Documentation of medical history of diseases.
• Planning and conduct of clinical studies.
• Evaluating the merits of different procedures.
• In providing methods for definition of ‘normal’ and
‘abnormal’.
16. Preventive medicine
• To provide the magnitude of any health problem in
the community.
• To find out the basic factors underlying the ill-health.
• To evaluate the health programs which were
introduced in the community (success/failure).
• To introduce and promote health legislation.
17. In Community Medicine and Public health:
1. To test the usefulness of sera and vaccines in the
field: the percentage of attacks or deaths among vaccinated
subjects is compared with that among non-vaccinated
subjects.
2. In epidemiological studies – the role of
causative factors is statistically tested.
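The comparison of attack rates described in point 1 can be sketched in a few lines of Python. The counts below are hypothetical, chosen only to illustrate the arithmetic, and `attack_rate` is an illustrative helper name. A formal test of significance would still be needed before drawing any conclusion.

```python
def attack_rate(cases, total):
    """Return the percentage of subjects who developed the disease."""
    return 100 * cases / total

# Hypothetical field-trial counts (not real data).
vaccinated_rate = attack_rate(cases=10, total=500)
unvaccinated_rate = attack_rate(cases=40, total=500)

print(f"Attack rate among vaccinated:     {vaccinated_rate:.1f}%")
print(f"Attack rate among non-vaccinated: {unvaccinated_rate:.1f}%")
```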
18. USES OF STATISTICS IN DENTAL SCIENCE:
1. To find the statistical difference between means of
two groups. Ex: Mean plaque scores of two groups.
2. To assess the state of oral health in the community
and to determine the availability and utilization of
dental care facilities.
3. To indicate the basic factors underlying the state of
oral health by diagnosing the community and find
solutions to such problems.
19. Uses of statistics in dental science:
4. To determine success or failure of specific oral
health care programs or to evaluate the program
action.
5. To promote oral health legislation and in creating
administrative standards for oral health care delivery.
20. Some important terms
VARIABLE:
A general term for any feature of the unit
which is observed or measured is a variable.
It is a characteristic that takes on different
values in different persons, places or things.
It is denoted by X, and an ordered series of
values by X1, X2, X3, …, Xn. The suffix n is the
number of values in the series.
22. CONSTANT:
These are quantities that do not vary.
Eg: π ≈ 3.142
e ≈ 2.718
They do not require statistical study.
In biostatistics, mean, standard deviation,
standard error, correlation coefficient and
proportion of a particular population are
considered as constant.
23. OBSERVATION:
An event and its measurements.
Eg: blood pressure – event
120 mmHg – measurement
OBSERVATIONAL UNIT:
The source that gives observations such as
object, person, etc.
In medical stats the term individuals or
subjects is used more often.
24. DATA:
A set of values recorded on one or more
observational units.
PARAMETER:
It is a summary value or constant of a
variable that describes the population such
as mean, variance, correlation coefficient,
proportion etc.
26. STATISTIC:
It is a summary value that describes the
sample such as its mean, standard deviation,
standard error, correlation coefficient,
proportion etc.
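This value is calculated from the sample and
is often applied to the population, but it may or may
not be a valid estimate of the population value.
Parameter and statistic are often used loosely as
synonyms, but a parameter describes the population
and a statistic describes the sample.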
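The difference between a parameter and a statistic can be illustrated with a small sketch. The population here is simulated (hypothetical systolic blood pressure readings), and the sizes 1,000 and 50 are arbitrary assumptions made for the example.

```python
import random
import statistics

# Hypothetical population: systolic blood pressure (mmHg) of 1,000 people.
rng = random.Random(42)
population = [rng.gauss(120, 15) for _ in range(1000)]

# Parameter: a summary value computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same summary computed from a sample of 50 people.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.1f}")
print(f"Statistic (sample mean):     {sample_mean:.1f}")
```

The two values will usually be close but not identical, which is why a statistic is only an estimate of the corresponding parameter.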
27. PARAMETRIC TEST:
It is one in which population constants as
described above, such as means and
variances, are used and the data are assumed to follow one
of the established distributions such as
normal, binomial, Poisson etc.
28.
NON PARAMETRIC TEST:
Tests such as the χ² test, in which no constant of a
population is used.
The data are not assumed to follow any specific distribution,
so far fewer assumptions are made than in
parametric tests.
Eg. To classify good, better and best you
allocate arbitrary numbers or marks to each
category.
29. population
In statistics population means the totality of
the individual observations about which
inferences are to be made.
Populations can be finite or infinite.
Samples of varied size can be drawn carefully
with appropriate procedures from their
populations which are either finite or
infinite.
30. sample
It is a part of the population.
It is a small collection of observations from
some larger aggregate about which we want
to have information.
Samples drawn should be representative of
the population.
The larger the sample, the better the sample selected
represents the population.
31. sampling
Samples can be drawn from the entire
population through various procedures.
Sampling can be:
• Probability sampling
• Non-probability sampling
32. Probability sampling
• Simple random sampling
• Systematic sampling
• Stratified random sampling
• Cluster sampling
• Multistage sampling
• Multiphase sampling
33. Non probability sampling:
• Heterogeneous sampling
• Homogenous sampling
• Structured sampling
• Haphazard sampling
34. 1. Simple random sampling
UNRESTRICTED RANDOM SAMPLING
Applicable when population is small,
homogenous and readily available.
Used mainly in experimental medicine or
clinical trials to check the efficacy of a
particular drug.
Principle : every unit of the population has an equal chance
of being selected.
35. To ensure randomness of selection 2 methods
are available:
• Lottery method
• Random number procedure
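A minimal sketch of simple random sampling, assuming the population is available as a numbered list. Python's `random.Random.sample` draws without replacement, so each unit has an equal chance of selection, much like drawing slips in a lottery. The function name and the patient numbers are illustrative only.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(units, n)

# Hypothetical sampling frame: patients numbered 1 to 100.
patients = list(range(1, 101))
chosen = simple_random_sample(patients, 10, seed=1)
print(sorted(chosen))
```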
36. 2. Systematic sampling
Simple procedure.
Utilized when a complete list of population
from which sample is to be drawn is
available.
A systematic procedure is followed to choose the
sample by taking every Kth house or patient,
where K is the sampling interval, which is
calculated by the following formula:
K = total population / sample size desired
37. Merits of systematic
sampling
1. Procedure is simple and convenient for use.
2. The time to be devoted and the labor
needed are relatively small.
3. If the population is sufficiently large and
homogenous and if the numbering of the
subjects is available, this method can
provide good results.
An element of randomness is introduced into this kind of sampling by randomly
selecting, from the first K units, the unit with which to start (RANDOM
START).
A sample so chosen is sometimes called an “every Kth systematic sample”.
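Systematic sampling with a random start can be sketched as follows. This assumes a complete numbered list is available and that the population size is at least the desired sample size. The interval K is rounded down to a whole number, and the function name and house numbers are hypothetical.

```python
import random

def systematic_sample(units, sample_size, seed=None):
    """Select every Kth unit after a random start within the first K units."""
    k = len(units) // sample_size  # sampling interval K = N / n, rounded down
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)       # random start: index 0 .. K-1
    return [units[start + i * k] for i in range(sample_size)]

# Hypothetical list of 200 houses; a sample of 20 gives K = 10.
houses = list(range(1, 201))
selected = systematic_sample(houses, 20, seed=3)
print(selected)
```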
38. 3.Stratified random sampling
Followed when the population is not
homogenous.
Population under study is first divided into
homogenous groups called strata and the
sample is drawn from each stratum at
random in proportion to its size.
Gives more representative sample than
simple random sampling in a given large
population.
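Proportional allocation across strata can be sketched as below. The strata names and sizes are hypothetical. Because each stratum's share is rounded to a whole number, the total can differ slightly from the requested sample size when the proportions are not exact.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw a random sample from each stratum in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_stratum = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, n_stratum)
    return sample

# Hypothetical population of 1,000 people divided into three age strata.
strata = {
    "children": list(range(0, 600)),
    "adults": list(range(600, 900)),
    "elderly": list(range(900, 1000)),
}
result = stratified_sample(strata, sample_size=50, seed=2)
print({name: len(units) for name, units in result.items()})
```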
39. Merits of stratified
random sampling
1. Gives greater accuracy.
2. Gives better representation to each stratum
compared to simple random sampling.
40. 4. Cluster sampling
Cluster is a group consisting of units such
as villages, wards, blocks, factories,
workshops etc.
Simple random sampling or systematic
sampling procedure is utilized for
selection of clusters.
After the selection of clusters randomly,
enumeration of individuals in the cluster
is carried out.
41. If the cluster consists of natural groupings
and if they are geographic regions it is
referred to as AREA SAMPLING.
MERITS:
1. Simple and time saving.
DEMERITS:
1. Costlier.
2. Provides figures with higher standard errors
than other procedures.
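A minimal sketch of cluster sampling: whole clusters are selected at random, and then every individual in the selected clusters is enumerated. The ward names and members are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select clusters, then enumerate everyone in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical wards, each listing its residents.
wards = {
    "ward_A": ["a1", "a2", "a3"],
    "ward_B": ["b1", "b2"],
    "ward_C": ["c1", "c2", "c3", "c4"],
    "ward_D": ["d1", "d2", "d3"],
}
selected_wards = cluster_sample(wards, n_clusters=2, seed=5)
print(selected_wards)
```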
42. 5. Multistage sampling
Refers to sampling procedures carried out in
several stages using random sampling
techniques.
Employed in large scale, countrywide or
region-wide surveys.
Stage wise sampling procedures are to be
utilized for selection of households or
subjects.
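As an illustration, a two-stage version of this procedure might look like the sketch below: villages are selected at random in the first stage, and households are selected at random within each chosen village in the second. The village and household names are hypothetical, and real surveys often use more stages.

```python
import random

def two_stage_sample(villages, n_villages, n_households, seed=None):
    """Stage 1: pick villages at random. Stage 2: pick households in each."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(villages), n_villages)
    return {v: rng.sample(villages[v], n_households) for v in stage_one}

# Hypothetical frame: 10 villages with 40 households each.
villages = {
    f"village_{i}": [f"v{i}_house_{j}" for j in range(1, 41)]
    for i in range(1, 11)
}
survey = two_stage_sample(villages, n_villages=3, n_households=5, seed=7)
print(survey)
```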
43. 6. Multiphase sampling
Here part of the information is collected from the
whole sample and part from a sub-sample.
Numbers in the 2nd and 3rd phases become
successively smaller.
MERITS:
1. Less costly.
2. Less laborious.
3. More purposeful.
44. Purposive sampling
Sampling in which the sample is not randomly
selected is called purposive sampling.
Here the chances of any element being
selected are either unknown or guaranteed to
be 0% or 100%.
It provides better descriptive data.
It is used in the early stages of any branch of
knowledge, as the focus is on what the
researchers will seek to explain.
45. Types of purposive
sampling
Heterogeneous
• Selected from things or people which are in some way
alike in a relevant detail.
• Quota sampling
Homogenous
• Extreme – selecting a group of people with a quality
which lies at the top or bottom of the range of such
qualities found in general population.
• Rare – those which contain a quality or qualities found
only rarely.
46. Types of purposive sampling…
Structured
• Strategic informant sampling – selecting
people whom you think can give you the
most information, e.g. community leaders.
• Snowball sampling
Haphazard
• Is merely one which is readily available.
52. 1. observation
Used in studies relating to behavioral
sciences.
Merits:
1. Elimination of subjective bias.
2. Information obtained relates to what is
currently happening.
3. Independent of respondent’s willingness.
53. Demerits:
1. Expensive.
2. Information provided is limited.
3. Unforeseen factors may interfere.
4. Some people are rarely accessible to direct
observation.
54. Types of observation
• Structured
• Unstructured
• Uncontrolled
• Controlled
• Participant observation
• Non-participant observation
• Disguised observation
55. Merits of participant
type of observation
Researcher is able to record the natural
behavior of the group.
Researcher can gather information which
could not easily be obtained if he observes in
a disinterested manner.
Researcher can even verify the truth of
statements.
57. Personal interview
Can be of 2 types:
1. Direct personal investigation
2. Indirect oral investigation
Interviews can be structured or unstructured.
58.
• Focused: focuses attention on the given experience of the
respondent and its effects.
• Clinical: concerned with broad underlying feelings or
motivation, or with the course of an individual's life experience.
• Non-directive: simply encourages the respondent to talk about
the given topic with a bare minimum of questioning.
59. merits
More information and in greater depth
Interviewer can overcome resistance of
respondents.
Greater flexibility
Observation method can as well be applied to
recording verbal answers to various questions.
Personal information can as well be obtained
easily.
60. Samples can be controlled more effectively.
Interviewer can control which persons will answer the questions.
Interviewer may catch the informant off guard and thus may secure more
spontaneous reactions than would be the case if a mailed questionnaire is used.
Language of the interview can be adapted to the ability or educational level of the
person interviewed.
Interviewer can collect supplementary information about the respondent’s
personal characteristics and environment which is of great value in interpreting
results.
61. demerits
• Very expensive.
• Possibility of bias.
• Certain types of respondents, such as important officials,
may not be easily approachable.
• More time consuming, especially when the sample is large
and recalls upon the respondents are necessary.
• Presence of the interviewer on the spot may over-stimulate
the respondent.
• Organisation required for selecting, training and supervising
staff is more complex, with formidable problems.
• Interviewing at times may also introduce systematic errors.
• Effective interview presupposes proper rapport with respondents
that would facilitate free and frank responses.
62. Pre- requisites and basic
tenets of interviewing
1. Interviewers should be carefully selected, trained and briefed.
2. They should be honest, sincere, hardworking, impartial and must
possess the technical competence and necessary practical
experience.
3. Occasional field checks should be made to ensure that interviewers
are neither cheating nor deviating from instructions given to perform
their job efficiently.
4. The approach should be friendly, courteous, conversational and
unbiased.
5. Interviewer should not show disapproval of or surprise at a respondent’s
answer, but must keep the direction of the interview in his or her own hands,
discouraging irrelevant conversation, and must make all possible
effort to keep the respondent on track.
63. Telephone interviews
Merits:
1. More flexible in comparison to mailing
methods.
2. Is faster than other methods.
3. Cheaper than personal interviewing method.
4. Recall is easy, callbacks are simple and
economical.
5. Replies can be recorded without causing
embarrassment to the respondents.
6. Higher rate of response than mailing
method.
64. 7. Interviewer can explain requirements more
easily.
8. At times access can be gained to respondents
who otherwise cannot be contacted for one
reason or another.
9. No field staff is required.
10. Representative and wider distribution of
sample is possible.
65. Demerits of telephone
interviews
1. Little time is given to respondents for
considered answers.
2. Surveys are restricted to respondents who
have telephone facilities.
3. Extensive geographical coverage may get
restricted by cost considerations.
66. 4. It is not suitable for intensive surveys where
comprehensive answers are required to
various questions.
5. Possibility of bias of the interviewer is
relatively more.
6. Questions have to be short and to the point.
67. 3. Collection of data
through questionnaires
Used in big enquiries.
Adopted by private individuals, research
workers, private and public organizations
and even by governments.
A questionnaire consists of a set of questions
printed or typed in a definite order on a form
or set of forms.
The questionnaire is mailed to the
respondents who are expected to read and
understand questions and answer them on
their own.
68. Merits of questionnaire
survey
1. Low cost even when the study population is large and widely spread geographically.
2. Free from bias of the interviewer; answers are in the respondents’ own words.
3. Respondents have adequate time to give well thought out answers.
4. Respondents who are not easily approachable, can also be reached
conveniently.
5. Large samples can be made use of and thus the results can be made
more dependable.
69. Demerits of
questionnaire method
1. Low rate of return of duly filled in questionnaires; bias due to
non-response is often indeterminate.
2. Can be used only when the respondents are educated and
cooperative.
3. The control over questionnaire may be lost once it is sent.
4. There is inbuilt inflexibility because of the difficulty of amending the
approach once questionnaires have been dispatched.
5. Possibility of ambiguous replies or omission of replies altogether to
certain questions.
6. Difficult to know whether willing respondents are truly representative.
7. Slowest of all methods.
70. Aspects of a questionnaire
General form:
• Structured
• Unstructured
Question sequence:
• Should always go from the general to the more specific.
• The answer given to a question is a function not only of that specific
question but of all previous questions as well.
Questions to be avoided:
1. Questions that put too great a strain on the memory or
intellect of the respondent.
2. Questions of a personal character.
3. Questions related to personal wealth etc.
71. Question formulation and wording:
Should be simple.
Should be easily understood.
Should be concrete and should conform to
the respondent’s way of thinking.
• Multiple choice or closed questionnaire
• Open ended
72.
Open ended:
What sports or other physical activities do you
undertake each week on a regular basis?
Closed ended:
For each of the following sports tick if you regularly spend more
than 30 mins each week in that activity?
a. Walking
b. Jogging
c. Cycling
d. Swimming
73.
Aspect                  Open ended questionnaire                 Closed ended questionnaire
Subject recall          Reduced                                  Enhanced
Accuracy of response    Easier to express complex situations     Difficult to investigate complex situations
Coverage                May pick up unanticipated situations     Will miss areas not anticipated
Size of questionnaire   May need fewer lines of text             May need many pages of text
Analysis                More complex                             Simpler
74. Essentials of a good
questionnaire
• Should be short and simple.
• Questions should proceed in logical sequence moving from easy to more difficult
questions.
• Personal and intimate questions should be left to the end.
• Technical and vague expressions capable of different interpretations should be avoided in
a questionnaire.
• Questions may be dichotomous, multiple choice or open ended.
• There should be some control questions in the questionnaire which indicate reliability of
the respondent.
• There should be provision for indications of uncertainty.
• The physical appearance of the questionnaire affects the cooperation the researcher
receives from the recipients.
76. An open ended question is first formulated to elicit
awareness of the issue in question and general
attitudes towards it.
A closed ended question follows to capture
information on specific attitudes to the subject.
An open ended question is placed next to explore
justifications for these attitudes.
This is followed by a closed ended question to tap the
intensity with which the attitudes are held.
77. 4. Collection of data
through schedules
This method requires the selection of
enumerators for filling up schedules or
assisting respondents to fill up schedules.
The enumerators should be trained to
perform their job well and the nature and
scope of investigation should be explained to
them thoroughly.
78. Enumerators should be intelligent and must
possess the capacity of cross examination in
order to find the truth.
They should be honest, hard working, patient
and have perseverance.
79. Difference between
questionnaires and schedule
Aspect                        Schedule            Questionnaire
Cost                          High                Low
Response rate                 Higher              Lower
Completion of questionnaire   High                Low
Complexity of questions       Can be high         Should be minimized
Interviewer bias              May be present      Not relevant
Interviewer variability       May be present      Not relevant
Total study duration          Considerably fast   Slow
81. Published data:
1. Various publications of central, state or local
governments.
2. Various publications of foreign governments or of
international bodies and their subsidiary
organizations.
3. Technical and trade journals.
4. Books, magazines and newspapers.
5. Reports and publications of various
associations
6. Reports prepared by research scholars,
universities.
7. Public records and statistics, historical
documents and other sources of information.
82. Unpublished data:
1. Diaries, letters.
2. Unpublished biographies and autobiographies.
3. May be available with scholars, research
workers, trade associations, labor bureaus and
other public or private individuals or
organizations.
83. Secondary data should possess following
characteristics:
Reliability of data
Suitability of data
Adequacy of data
84. 84
PRESENTATION OF DATA
Objectives
• make the data simple
• concise, meaningful,
• interesting and
• helpful in further analysis.
Two main methods of presenting data:
• Tabulation and
• Diagrams
85. 85
TABULATION
• The first step in presenting data
• Principles of tabulation:
– Table should be numbered
– Title- brief & self explanatory
– Headings of columns and rows- clear & concise
– Data must follow an order;
alphabetical/magnitude/geographical/chronological etc.
– Should not be too large & confusing
– Footnotes for any other relevant information
87. 87
Simple table
Table No.1: Number of students attending PCD lectures
Lecture no. No.of students
I 100
II 95
III 88
IV 75
Note: During the academic year 2007-’08
88. 88
• Data is split into groups/classes
• Class intervals & frequency
• The number of class intervals - between 5 and 20.
• The class intervals - of equal width.
• Clearly defined class limits – to avoid ambiguity.
e.g. 0-4, 5-9, 10-14, Etc.
• Clearly defined headings
• Units of measurement should be specified.
• It is used to tabulate the quantitative data
Frequency distribution table
89. 89
Marks obtained Frequency
0-10 0
11-20 16
21-30 32
31-40 46
41-50 6
Total 100
Table 2. Marks obtained by III BDS students
in PCD in II internal assessment
Note: During the academic year 2007-’08
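A frequency distribution table like Table 2 can be built by counting how many values fall within each class. The sketch below reuses the class limits from Table 2, but the marks are hypothetical, since the raw data behind the table are not given. The helper name `frequency_table` is illustrative.

```python
from collections import Counter

def frequency_table(values, class_limits):
    """Count values per class; each class is an inclusive (low, high) pair."""
    counts = Counter()
    for value in values:
        for low, high in class_limits:
            if low <= value <= high:
                counts[(low, high)] += 1
                break
    return [(f"{low}-{high}", counts[(low, high)]) for low, high in class_limits]

classes = [(0, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
marks = [12, 18, 25, 27, 30, 33, 35, 38, 40, 44]  # hypothetical marks

table = frequency_table(marks, classes)
for label, frequency in table:
    print(f"{label:>6}  {frequency}")
print(f"{'Total':>6}  {sum(f for _, f in table)}")
```

Clearly defined, non-overlapping class limits mean each value is counted in exactly one class.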
90. 90
Diagram
• Extremely useful
• Attractive to the eyes,
• Give a bird's eye view of the entire data,
• Have a lasting impression
• Facilitate comparison of data relating to
different time periods and regions.
91. 91
TYPES OF DIAGRAMS
• Bar Diagram
• Multiple Bar
• Component Bar Diagram
• Proportional Bar Diagram
• Histogram
• Frequency Polygon
• Pie Diagram
• Line diagram
• Cartograms or Spot Map
• Pictogram
92. 92
Basic requirements
• Self explanatory
• Simple and consistent with the data.
• Values of the variables - on horizontal or X-axis and the
frequency - vertical line or Y-axis.
• Not too many lines on the graph; it should not look clumsy.
• The scale of presentation – right hand top corner of the
graph.
• The details of the variables and frequencies should be
presented on the axes.
93. 93
Bar Diagram
• Represents qualitative data.
• Frequency distribution of one variable.
• Width of the bar remains the same
• The length varies according to the frequency in each category.
• Bars - vertical or horizontal.
Limitations
• Represent only one variable
• Cannot be used for comparison
94. 94
Multiple Bar
• Compare qualitative data with respect to a single variable.
• Facilitates comparison.
– Eg: sex-wise or with respect to time or region.
• Each category of the variable has a set of bars of the same
width, one for each section, drawn without any gap
between them; the length of each bar corresponds to the frequency.
95. 95
Component Bar Diagram:
• Represents qualitative data.
• Shows both the number of cases in the major groups and
in the subgroups simultaneously.
• A rectangle is drawn for the cases of each major group.
• Each rectangle is divided according to the number in
the subgroups.
96. 97
PIE DIAGRAM
• The frequency of the group is shown in a circle.
• Degree of angle denotes the frequency.
• Instead of comparing the length of bar , the
areas of segments are compared.
Males
Females
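The angle of each segment follows from the group's share of the total: angle = (frequency / total) × 360°. A minimal sketch, using hypothetical counts for the two groups named in the legend:

```python
def pie_angles(frequencies):
    """Convert group frequencies to segment angles in degrees."""
    total = sum(frequencies.values())
    return {group: 360 * count / total for group, count in frequencies.items()}

# Hypothetical counts (not taken from the slides).
angles = pie_angles({"Males": 60, "Females": 40})
for group, angle in angles.items():
    print(f"{group}: {angle:.0f} degrees")
```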
97. 98
Line diagram:
• To present continuous data
• Useful to study changes of values in the
variable over time
• X-axis: Hours, days, weeks, months or years
• Y-axis: Value of any quantity pertaining to X-
axis
98. 99
Histogram
• Quantitative data of continuous type.
• Bar diagram without gap between the bars.
• Represents a frequency distribution of
continuous data.
99. 100
Frequency Polygon
• Frequency distribution of quantitative data
• a point is marked over the mid-point of the class
interval, corresponding to the frequency.
• points are connected by straight lines.
• The first point and last point are joined to the
midpoint of previous and next class respectively.
• To compare two or more frequency distributions, lines
of different types are drawn on the same graph.
100. 101
Scatter diagram
Fig.--. Height and Weight of 20 students of CODS
[Scatter plot. X-axis: height in feet (3 to 7). Y-axis: weight in kg (0 to 80). Series: Weight.]
101. 102
Spot Map
• show geographical distribution of
frequencies of a characteristic.
#6: Concerned with the presentation , organisation and summarization of data
#7: Are used to generalize the data from sample to a larger group of patients
#84: Data collected and compiled from experimental work, surveys, registers or records are raw data. They are unsorted and are of little help in understanding underlying trends or meaning, so they must be sorted and classified into characteristic groups or classes, for example according to age, sex, social class, number of DMFT, etc.