2. Course Objective
I. Definition and classification of Statistics
II. Reasons to know about biostatistics
III. Stages in statistical investigation
IV. Definition of Some Basic Terms
V. Applications, uses and limitations of Statistics
VI. Types of variables and measurement scales
3. Definition and Classification of Biostatistics
Biostatistics is a growing field with applications in many areas of biology,
including:
• Epidemiology
• Medical sciences
• Health sciences
• Educational research
• Environmental sciences
4. Concern of Biostatistics
Applied Statistics:
The application of statistical methods to solve real problems involving randomly generated data,
and the development of new statistical methodologies motivated by real problems.
Biostatistics:
A branch of applied statistics directed toward applications in the health sciences and biology.
Use of Statistical Tools in Biostatistics:
• The tools of statistics are applied in many fields: business, education, psychology, agriculture,
and economics.
• When the data come from public health, the biological sciences, or medicine, the term biostatistics
describes the specific application of statistical tools and concepts.
5. Classification of Biostatistics
1. Descriptive Statistics
• A statistical method concerned with:
• Collection
• Organization
• Summarization
• Analysis of data from a sample of the population
2. Inferential Statistics
• A statistical method concerned with:
• Drawing conclusions/inferences about a population
• Based on measurements from a random sample of that population
6. Descriptive Statistics
Descriptive Statistics
• Statistical procedures used to summarize, organize, and simplify data.
• The summary should reflect the overall findings faithfully.
Key Points:
• Raw data is made more manageable.
• Raw data is presented in a logical form.
• Patterns can be identified from organized data.
7. Descriptive Statistics (continued)
Common Statistical Summaries in Descriptive Analyses:
• Measures of central tendency
• Measures of dispersion
• Measures of association
• Cross-tabulation, contingency table
• Histogram
• Quantile, Q-Q plot
• Scatter plot
• Box plot
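The first two kinds of summary in the list above can be sketched with Python's standard library. This is a minimal illustration only: the blood pressure readings are made-up values, not data from the course.

```python
import statistics

# Hypothetical diastolic blood pressure readings (mmHg) for ten patients
dbp = [70, 72, 75, 78, 80, 80, 82, 85, 88, 90]

# Measures of central tendency
mean_dbp = statistics.mean(dbp)
median_dbp = statistics.median(dbp)
mode_dbp = statistics.mode(dbp)

# Measures of dispersion
value_range = max(dbp) - min(dbp)
variance_dbp = statistics.variance(dbp)   # sample variance (n - 1 denominator)
sd_dbp = statistics.stdev(dbp)            # sample standard deviation
q1, q2, q3 = statistics.quantiles(dbp, n=4)  # quartiles

print(f"mean={mean_dbp}, median={median_dbp}, mode={mode_dbp}")
print(f"range={value_range}, variance={variance_dbp:.2f}, SD={sd_dbp:.2f}")
print(f"quartiles: Q1={q1}, Q2={q2}, Q3={q3}")
```

Reporting a centre (mean or median) together with a spread (SD or quartiles) gives a more complete picture than either alone.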
8. Inferential Statistics
• This branch of statistics deals with techniques for drawing conclusions about the population.
• Inferential statistics builds upon descriptive statistics.
Key Points:
• Inferences are drawn from sample properties to population properties.
• Used to make generalizations from a sample to a population.
• Includes a variety of procedures to ensure that inferences are sound and rational, though not
always correct.
9. Inferential Statistics (continued)
In short, inferential statistics enable us to make confident decisions in the face of uncertainty.
Examples:
• Antibiotics reduce the duration of viral throat infections by 1–2 days.
• Five percent of women aged 30–49 consult their GP each year with heavy menstrual bleeding.
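A statement such as the second example is usually reported with a measure of its uncertainty. The sketch below computes an approximate 95% confidence interval for a proportion using the normal (Wald) approximation. The counts (50 consultations among 1,000 sampled women) and the function name `proportion_ci` are assumptions made for illustration, not figures from the source.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot leave that range
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical sample: 50 of 1,000 women consulted their GP
p_hat, lower, upper = proportion_ci(50, 1000)
print(f"estimate = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval expresses how far the population proportion may plausibly lie from the sample estimate; a larger sample gives a narrower interval.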
11. Reasons to know about biostatistics:
• Medicine is becoming increasingly quantitative.
• The planning, conduct and interpretation of much of medical research are becoming increasingly
reliant on statistical methodology.
• Statistics pervades the medical literature.
12. Example: Evaluation of penicillin (treatment A) vs penicillin plus
chloramphenicol (treatment B) for treating bacterial pneumonia
in children < 2 years.
What sample size is needed to demonstrate a significant difference between the two groups?
Is treatment A better than treatment B, or vice versa?
If so, how much better?
What is the normal variation in clinical measurements (mild, moderate, severe)?
How reliable and valid are the measurements (clinical and radiological)?
What is the magnitude and effect of laboratory and technical error?
How does one interpret abnormal values?
13. CLINICAL MEDICINE
• Documentation of medical history of diseases.
• Planning and conduct of clinical studies.
• Evaluating the merits of different procedures.
• Providing methods for defining “normal” and “abnormal”.
14. PREVENTIVE MEDICINE
• To provide the magnitude of any health problem in the community.
• To find out the basic factors underlying the ill-health.
• To evaluate the health programs that were introduced in the community (success/failure).
• To introduce and promote health legislation.
15. WHAT DOES STATISTICS COVER?
• Planning
• Design
• Execution (Data collection)
• Data Processing
• Data analysis
• Presentation
• Interpretation
• Publication
16. HOW CAN A “BIOSTATISTICIAN” HELP?
• Design of study
• Sample size & power calculations
• Selection of sample and controls
• Designing a questionnaire
• Data Management
• Choice of descriptive statistics & graphs
• Application of univariate and multivariate statistical analysis techniques
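As an illustration of the sample size and power item above, the sketch below uses a common normal-approximation formula for comparing two proportions with a two-sided test. The cure rates (60% vs 75%), the significance level, the power, and the function name `sample_size_two_proportions` are all assumptions chosen for the example; a real study would settle these with a biostatistician.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 vs p2.

    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)

# Hypothetical cure rates: 60% under treatment A, 75% under treatment B
n_per_group = sample_size_two_proportions(0.60, 0.75)
print(f"subjects needed per group: {n_per_group}")
```

The required sample size grows quickly as the difference to be detected shrinks, which is why the expected effect should be specified before the study begins.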
17. Stages in statistical investigation
There are five stages in any statistical investigation:
1. Collection of data: The process of obtaining measurements or counts.
2. Organization of data: Includes editing, classifying, and tabulating the data collected.
3. Presentation of data: Gives an overall view of what the data actually look like and facilitates further
statistical analysis. It can be done in the form of tables, graphs, or diagrams.
4. Analysis of data: Extracts information relevant for decision making from the data (such as the mean,
median, mode, range, variance, ...).
5. Interpretation of data: Concerned with drawing conclusions from the data collected and
analyzed, and giving meaning to the analysis results. It is a difficult task that requires a high degree
of skill and experience.
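The organization and presentation stages can be sketched as turning raw observations into a frequency table. The pain-severity values below are invented for illustration only.

```python
from collections import Counter

# Stage 1 (collection): hypothetical raw pain-severity records for 12 patients
raw = ["mild", "severe", "moderate", "mild", "mild", "moderate",
       "severe", "mild", "moderate", "mild", "moderate", "mild"]

# Stage 2 (organization): classify and tabulate the observations
counts = Counter(raw)
total = len(raw)
order = ["mild", "moderate", "severe"]  # keep the natural ordering of the categories

# Stage 3 (presentation): print a simple frequency table
print(f"{'Severity':<10}{'Count':>6}{'Percent':>9}")
for level in order:
    percent = 100 * counts[level] / total
    print(f"{level:<10}{counts[level]:>6}{percent:>8.1f}%")
```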
18. Definitions of some basic terms
Population
Census
Sample
Parameter
Statistic, Statistics
Sampling
Sample size
Variable
Data
19. Definitions of some basic terms
• Population: The complete set of possible measurements for which inferences are to be made.
• Census: A complete enumeration of the population. In most real problems a census cannot be
carried out, so we take a sample instead.
• Sample: A sample from a population is the set of measurements that are actually collected in the
course of an investigation.
• Parameter: Characteristic or measure obtained from a population.
• Statistic: A statistic refers to a numerical quantity computed from sample data (e.g. the mean, the
median, the maximum...).
• Data: Refers to a collection of facts, values, observations, or measurements that the variables can
assume.
20. Definitions of some basic terms
• Statistics: A branch of mathematics dealing with data collection, organization, analysis,
interpretation, and presentation.
• Sampling: The process or method of sample selection from the population.
• Sample Size: The number of elements or observations to be included in the sample.
• Variable: An item of interest that can take on different values from one individual to another.
Some examples of variables include:
Diastolic blood pressure
Heart rate
Height
Weight
Stage of bladder cancer in patients
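The relationship between population, sample, parameter, and statistic can be sketched with a simulation. The population here is artificial (heart rates drawn from a normal distribution with assumed mean 72 and SD 8), and simple random sampling is only one of several possible sampling methods.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: heart rates (beats/min) of 5,000 people
population = [random.gauss(72, 8) for _ in range(5000)]
parameter = statistics.mean(population)   # population mean: a parameter

# Sampling: draw a simple random sample without replacement
sample_size = 50
sample = random.sample(population, sample_size)
statistic = statistics.mean(sample)       # sample mean: a statistic

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In practice the parameter is unknown because a census is rarely feasible, and the statistic computed from the sample serves as its estimate.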
21. Applications, Uses and Limitations of Statistics
Applications of Statistics
• In almost all fields of human endeavor.
• In daily life, where almost everyone deals with numerical facts, e.g. about prices.
• In specific processes, e.g. the invention of certain drugs or assessing the extent of environmental pollution.
• In industry, especially in quality control.
22. Uses of Statistics
The main function of statistics is to enlarge our knowledge of complex phenomena. The following are
some uses of statistics:
I. It presents facts in a definite and precise form.
II. Data reduction.
III. Measuring the magnitude of variations in data.
IV. Furnishes a technique of comparison.
V. Estimating unknown population characteristics.
VI. Formulating and testing hypotheses.
VII. Studying the relationship between two or more variables.
VIII. Forecasting future events.
23. Limitations of statistics
As a science, statistics has its own limitations. The following are some of the limitations:
I. Deals with only quantitative information.
II. Deals with only the aggregate of facts and not with individual data items.
III. Statistical results are only approximate and not mathematically exact.
IV. Statistics can be easily misused and therefore should be used by experts.
24. Types of Variables and Measurement Scales
Variable: A variable is a characteristic or attribute that can assume different values in different
persons, places, or things.
Example:
Age, Diastolic blood pressure,
Heart rate,
The height of adult males,
The weights of preschool children,
Gender of Biostatistics students,
Marital status of instructors at the PIMS,
Ethnic group of patients
25. Types of Variables
A. Depending on the characteristic of the measurement, a variable can be:
1. Qualitative (Categorical) variable:
A variable or characteristic that cannot be measured in quantitative form but can only be identified by name or
category, for instance place of birth, ethnic group, type of drug, stage of breast cancer (I, II, III, or IV), or degree of pain
(minimal, moderate, severe, or unbearable).
The categories should be clear-cut, not overlapping, and cover all the possibilities. For example, sex (male or
female), vital status (alive or dead), disease stage (depends on disease), ever smoked (yes or no).
2. Quantitative (Numerical) variable:
It is one that can be measured and expressed numerically.
Example:
survival time
systolic blood pressure
Number of children in a family
height, age, and body mass index.
26. Quantitative (Numerical) variable:
They can be of two types:
1. Discrete Variables:
Have a set of possible values that is either finite or countably infinite.
The values of a discrete variable are usually whole numbers.
Numerical discrete data occur when the observations are integers that correspond with a count of some
sort.
Examples of discrete variables
Number of pregnancies,
The number of bacteria colonies on a plate,
The number of cells within a prescribed area upon microscopic examination,
The number of heart beats within a specified time interval,
A mother’s history of the number of births (parity) and pregnancies (gravidity),
The number of episodes of illness a patient experiences during some time period, etc.
27. Quantitative (Numerical) variable:
2. Continuous Variables:
A continuous variable has a set of possible values that includes all values in an interval of the real line.
There are no gaps between possible values.
Each observation theoretically falls somewhere along a continuum.
Examples of continuous variables:
Body mass index
Height
Blood pressure
Serum cholesterol level
Weight
Age, etc.
Observations are not restricted to certain numerical values; they are often measurements (e.g., height,
weight, age).
Continuous data report a measurement on an individual that can take on any value within
an acceptable range.
28. Types of Variables:
B- On the basis of Scales of measurement:
There are four types of measurement scales:
1. Nominal scales of measurement:
Only “naming” and classifying observations is possible. When numbers are assigned to categories, it is only
for coding purposes, and they do not convey any sense of size. Example:
Sex of a person (M, F)
Eye color (e.g. brown, blue)
Religion (e.g. Muslim, Christian)
Place of residence (urban, rural), etc.
29. On the basis of scale of Measurement
2. Ordinal Scales of Measurement:
Categorization and ranking (ordering) of observations is possible.
We can speak of greater than or less than, so the order of the values conveys meaning, but
it is impossible to express the real difference between measurements in numerical terms. Example:
Socio-economic status (very low, low, medium, high, very high)
Severity (mild, moderate, severe)
Blood pressure (very low, low, high, very high), etc.
30. On the basis of scale of Measurement
3. Interval Scales of Measurement:
It is possible to categorize, rank, and tell the real distance between any two measurements.
Zero is not absolute: it does not mean the absence of the quantity.
Example:
Body temperature measured in degrees Fahrenheit or Celsius.
The difference between two values is meaningful, but their ratio is not.
31. On the basis of scale of Measurement
4. Ratio Scales of Measurement:
The highest level of measurement scale, characterized by the fact that equality of ratios as well as
equality of intervals can be determined.
There is a true zero point, i.e. zero is absolute. Example:
volume
height
weight
length
time until death, etc.
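One way to see the difference between an interval scale and a ratio scale is to change the unit of measurement and check whether ratios survive. A minimal sketch, using made-up temperatures and weights; the helper names `c_to_f` and `kg_to_lb` are illustrative only:

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def kg_to_lb(kg):
    """Convert kilograms to pounds (1 kg is about 2.20462 lb)."""
    return kg * 2.20462

# Interval scale (temperature): zero is arbitrary, so ratios depend on the unit
temp_ratio_c = 40.0 / 20.0
temp_ratio_f = c_to_f(40.0) / c_to_f(20.0)

# Equal intervals remain equal after the unit change
gap_low_f = c_to_f(30.0) - c_to_f(20.0)
gap_high_f = c_to_f(40.0) - c_to_f(30.0)

# Ratio scale (weight): zero is absolute, so ratios survive the unit change
weight_ratio_kg = 40.0 / 20.0
weight_ratio_lb = kg_to_lb(40.0) / kg_to_lb(20.0)

print(f"temperature ratio: {temp_ratio_c:.2f} in Celsius, {temp_ratio_f:.2f} in Fahrenheit")
print(f"temperature gaps in Fahrenheit: {gap_low_f:.1f} and {gap_high_f:.1f}")
print(f"weight ratio: {weight_ratio_kg:.2f} in kg, {weight_ratio_lb:.2f} in lb")
```

Because the temperature ratio changes with the unit, saying that 40 degrees is "twice as hot" as 20 degrees has no fixed meaning, whereas 40 kg is twice 20 kg in any unit.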
33. Types of Variables
C. On the basis of the source of data:
1. Primary Data:
Data generated for the first time, originally for the study in question.
It requires the direct involvement of the researcher. Censuses and sample surveys are sources of primary
data.
2. Secondary Data:
Data obtained from pre-existing, previously collected sources such as newspapers, magazines, DHS, hospital
records, and existing reports like:
Mortality reports
Morbidity reports
Epidemic reports
Reports of laboratory utilization (including laboratory test results)
34. Statistics
Statistics
SCALE      DESCRIPTIVE                       INFERENTIAL
Nominal    Percentage, Mode                  Chi-square, Binomial test
Ordinal    Percentile, Median                Rank-order correlation, ANOVA
Interval   Range, Mean, SD                   Correlation, t-test, ANOVA,
                                             Regression, Factor analysis
Ratio      Geometric mean, Harmonic mean,
           Coefficient of variation