HNDIT 1214
Statistics for IT
1. Introduction To Statistics
What is Statistics
• The science of collecting, organizing, presenting, analyzing, and
interpreting data to assist in making more effective decisions
• Statistical analysis – used to manipulate, summarize, and investigate
data so that useful decision-making information results.
Types of statistics
• Descriptive statistics – Methods of organizing, summarizing, and
presenting data in an informative way
• Inferential statistics – The methods used to determine something
about a population on the basis of a sample
• Population – The entire set of individuals or objects of interest or the
measurements obtained from all individuals or objects of interest
• Sample – A portion, or part, of the population of interest
Inferential Statistics
• Estimation
• e.g., Estimate the population mean
weight using the sample mean
weight
• Hypothesis testing
• e.g., Test the claim that the
population mean weight is 70 kg
Inference is the process of drawing conclusions or making decisions about a
population based on sample results
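The two ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a full test procedure: the sample weights are made-up data, the function names are hypothetical, and only the 70 kg claim comes from the slide. The sketch computes the sample mean as a point estimate and a one-sample t statistic for the claim.

```python
import math
import statistics

# Hypothetical sample of weights in kg (illustrative data, not from the slides).
sample_weights = [68.0, 72.5, 71.0, 69.5, 74.0, 70.5, 73.0, 67.5]


def estimate_mean(sample):
    """Estimation: use the sample mean as a point estimate of the population mean."""
    return statistics.mean(sample)


def t_statistic(sample, claimed_mean):
    """Hypothesis testing: one-sample t statistic for the claimed population mean."""
    n = len(sample)
    # Standard error of the mean, based on the sample standard deviation.
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - claimed_mean) / standard_error


point_estimate = estimate_mean(sample_weights)
t_value = t_statistic(sample_weights, claimed_mean=70.0)

print(f"Estimated population mean weight: {point_estimate:.2f} kg")
print(f"t statistic for the 70 kg claim: {t_value:.3f}")
```

A t value close to zero means the sample is consistent with the claim. Reaching a decision would still require comparing the value with a critical value of the t distribution with n - 1 degrees of freedom, which this sketch does not do.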
Sampling
A sample should have the same characteristics
as the population it is representing.
Sampling can be:
• with replacement: a member of the population may be chosen more
than once (e.g., picking a candy from a bowl and putting it back before the next pick)
• without replacement: a member of the population may be chosen
only once (e.g., a drawn lottery ticket is not returned)
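As a rough illustration of the difference, the sketch below draws from a small made-up population with Python's standard `random` module. The population, the sample size, and the seed are assumptions chosen only for the example.

```python
import random

# Hypothetical population: ten numbered members.
population = list(range(1, 11))
rng = random.Random(42)  # fixed seed so the run is repeatable

# With replacement: the same member may be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: each member can be drawn at most once.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Only the second draw is guaranteed to contain five distinct members; the first may or may not contain repeats.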
Sampling methods
Sampling methods can be:
• random (each member of the population has an equal chance
of being selected)
• nonrandom
The actual process of sampling causes sampling
errors. For example, the sample may not be large
enough or representative of the population. Factors not
related to the sampling process cause nonsampling
errors. A defective counting device can cause a
nonsampling error.
Random sampling methods
• Simple random sample (each sample of the same size has
an equal chance of being selected)
• Stratified sample (divide the population into groups called
strata and then take a sample from each stratum)
• Cluster sample (divide the population into clusters and then
randomly select some of the clusters. All the members from
these clusters are in the cluster sample.)
• Systematic sample (randomly select a starting point and
take every n-th piece of data from a listing of the population)
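The four methods can be sketched with Python's standard `random` module. This is one possible reading of each definition, under stated assumptions: the population of 20 numbered members, the two groups, the function names, and the seed are all hypothetical.

```python
import random


def simple_random_sample(population, size, rng):
    """Every sample of the given size is equally likely."""
    return rng.sample(population, size)


def stratified_sample(strata, per_stratum, rng):
    """Take a random sample from every stratum."""
    return [m for group in strata.values() for m in rng.sample(group, per_stratum)]


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for name in chosen for m in clusters[name]]


def systematic_sample(population, step, rng):
    """Random starting point, then every step-th item of the listing."""
    start = rng.randrange(step)
    return population[start::step]


rng = random.Random(0)  # fixed seed so the run is repeatable
population = list(range(1, 21))  # hypothetical listing of 20 members
groups = {"A": population[:10], "B": population[10:]}  # hypothetical groups

simple = simple_random_sample(population, 5, rng)
stratified = stratified_sample(groups, 3, rng)
cluster = cluster_sample(groups, 1, rng)
systematic = systematic_sample(population, 5, rng)

print("simple:    ", simple)
print("stratified:", stratified)
print("cluster:   ", cluster)
print("systematic:", systematic)
```

The same dictionary of groups is reused for the stratified and the cluster sample to highlight the contrast: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.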
Descriptive Statistics
• Collect data
• e.g., Survey
• Present data
• e.g., Tables and graphs
• Summarize data
• e.g., Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
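The sample mean is the sum of the observations divided by the number of observations. A minimal worked example follows; the data values are made up for illustration.

```python
import statistics

# Hypothetical survey responses (illustrative values only).
observations = [2, 4, 4, 5, 7, 8]

n = len(observations)
sample_mean = sum(observations) / n  # sum of the X_i divided by n

print(f"n = {n}, sample mean = {sample_mean}")
# The standard library gives the same result.
print("statistics.mean agrees:", statistics.mean(observations) == sample_mean)
```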
Statistical data
• The collection of data that are relevant to the problem being studied
is commonly the most difficult, expensive, and time-consuming part
of the entire research project.
• Statistical data are usually obtained by counting or measuring items.
• Primary data are collected specifically for the analysis desired.
• Secondary data have already been compiled and are available for statistical
analysis.
• A variable is an item of interest that can take on many different
numerical values.
• A constant has a fixed numerical value.
Data Collection Methods
• Data can be collected in a variety of ways
• Interviews
• Face-to-face interviews
• Telephone interviews
• Computer Assisted Personal Interviewing (CAPI)
• Questionnaires
• Paper-and-pencil questionnaires
• Web-based questionnaires
• Survey
• Observation
• Exercise: Discuss the advantages and disadvantages of each of the data
collection methods above.
Data
Statistical data are usually obtained by counting or measuring items.
Most data can be put into the following categories:
• Qualitative - data are measurements that each fall into one of several
categories (hair color, ethnic groups, and other attributes of the
population).
• Quantitative - data are observations that are measured on a
numerical scale (distance traveled to college, number of children in a
family, etc.).
Qualitative data
Qualitative data are generally described by words or
letters. They are not as widely used as quantitative data
because many numerical techniques do not apply to the
qualitative data. For example, it does not make sense to
find an average hair color or blood type.
Qualitative data can be separated into two subgroups:
• Dichotomic, if it takes the form of a word with two options
(gender: male or female)
• Polynomic, if it takes the form of a word with more than
two options (education: primary school, secondary school,
and university)
Quantitative data
Quantitative data are always numbers and are the
result of counting or measuring attributes of a population.
Quantitative data can be separated into two
subgroups:
• Discrete, if it is the result of counting (the number of
students of a given ethnic group in a class, the number of
books on a shelf, ...)
• Continuous, if it is the result of measuring (distance
traveled, weight of luggage, ...)
Types of variables
Variables
• Qualitative
• Dichotomic: gender, marital status
• Polynomic: brand of PC, hair color
• Quantitative
• Discrete: children in family, strokes on a golf hole
• Continuous: amount of income tax paid, weight of a student
Numerical scale of measurement:
• Nominal – consists of categories in each of which the number of respective
observations is recorded. The categories are in no logical order and have no
particular relationship. The categories are said to be mutually exclusive
since an individual, object, or measurement can be included in only one of
them.
• Ordinal – contains more information. Consists of distinct categories in which
order is implied. Values in one category are larger or smaller than values in
other categories (e.g., rating: excellent, good, fair, poor).
• Interval – a set of numerical measurements in which the distance
between numbers is of a known, constant size.
• Ratio – consists of numerical measurements where the distance between
numbers is of a known, constant size; in addition, there is a nonarbitrary
zero point.
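One way to see the difference between the four scales is to look at which summary is meaningful on each. The sketch below is illustrative only: all data values are made up, and the temperature example for the interval scale is an assumption that does not appear in the slides.

```python
from collections import Counter

# Nominal: categories without order, so counting (and the mode) is what applies.
hair_colors = ["brown", "black", "brown", "blond", "black", "brown"]
nominal_counts = Counter(hair_colors)
nominal_mode = nominal_counts.most_common(1)[0][0]

# Ordinal: categories with an implied order, so a median category can be
# found by ranking the observations.
rating_order = ["poor", "fair", "good", "excellent"]
ratings = ["good", "fair", "excellent", "good", "poor"]
ranks = sorted(rating_order.index(r) for r in ratings)
ordinal_median = rating_order[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, but the zero point is arbitrary
# (assumed example: temperatures in degrees Celsius).
temps_celsius = [10.0, 20.0]
interval_difference = temps_celsius[1] - temps_celsius[0]

# Ratio: a nonarbitrary zero makes statements like "twice as heavy" meaningful.
weights_kg = [30.0, 60.0]
ratio = weights_kg[1] / weights_kg[0]

print("nominal mode:       ", nominal_mode)
print("ordinal median:     ", ordinal_median)
print("interval difference:", interval_difference)
print("ratio of weights:   ", ratio)
```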
Data presentation
"The question is," said Alice, "whether you can
make words mean so many different things."
"The question is," said Humpty Dumpty, "which is
to be master - that's all." (Lewis Carroll)
Histogram
• Frequently used to graphically
present interval and ratio data
• The adjacent bars indicate that a
numerical range is being
summarized by indicating the
frequencies in arbitrarily chosen
classes
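The frequencies behind a histogram can be sketched as follows. The marks, the class width of 10, and the function name are assumptions made for the example; the bars are drawn as text so that no plotting library is needed.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count how many values fall in each class [start, start + width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        # Values outside the chosen classes are ignored in this sketch.
        if 0 <= index < n_classes:
            counts[index] += 1
    return [(lower + i * width, lower + (i + 1) * width, c) for i, c in enumerate(counts)]


# Hypothetical exam marks (illustrative data only).
marks = [35, 42, 47, 51, 55, 58, 62, 64, 66, 71, 73, 78, 84, 91]
table = frequency_distribution(marks, lower=30, width=10, n_classes=7)

# Text histogram: one adjacent bar per class.
for start, end, count in table:
    print(f"{start:>3}-{end:<3} {'#' * count}")
```

Changing `width` or `lower` changes the shape of the histogram, which is what "arbitrarily chosen classes" refers to.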
Frequency polygon
• Another common method for
graphically presenting interval
and ratio data
• To construct a frequency
polygon, mark the frequencies
on the vertical axis and the
values of the variable being
measured on the horizontal axis,
as with the histogram.
• If the purpose of presenting is
comparison with other
distributions, the frequency
polygon provides a good
summary of the data
Ogive
• A graph of a cumulative
frequency distribution
• An ogive is used when one wants to
determine how many
observations lie above or below
a certain value in a distribution.
• First, a cumulative frequency
distribution is constructed
• Cumulative frequencies are
plotted at the upper class limit of
each category
• An ogive can also be constructed for
a relative frequency distribution.
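The construction steps can be sketched in Python. The frequency distribution below is hypothetical; the sketch builds the cumulative and relative cumulative frequencies and pairs each with its upper class limit, which gives the points an ogive would be drawn through.

```python
from itertools import accumulate

# Hypothetical frequency distribution: (upper class limit, frequency).
classes = [(40, 1), (50, 2), (60, 3), (70, 3), (80, 3), (90, 1), (100, 1)]

upper_limits = [upper for upper, _ in classes]
frequencies = [freq for _, freq in classes]

# Step 1: cumulative frequency distribution (running total of frequencies).
cumulative = list(accumulate(frequencies))

# Optional: relative cumulative frequencies for a relative-frequency ogive.
total = cumulative[-1]
relative_cumulative = [c / total for c in cumulative]

# Step 2: each cumulative frequency is plotted at its upper class limit.
ogive_points = list(zip(upper_limits, cumulative))

for upper, cum, rel in zip(upper_limits, cumulative, relative_cumulative):
    print(f"below {upper:>3}: {cum:>2} observations ({rel:.0%})")
```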
Pie Chart
• The pie chart is an effective way
of displaying the percentage
breakdown of data by category.
• Useful if the relative sizes of the
data components are to be
emphasized
• Pie charts also provide an
effective way of presenting
ratio- or interval-scaled data
after they have been organized
into categories
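The numbers behind a pie chart are the percentage shares of each category. A minimal sketch with made-up survey answers follows; the category labels are placeholders, and each slice angle is simply the category's share of 360 degrees.

```python
from collections import Counter

# Hypothetical survey answers (placeholder category labels).
answers = ["A", "B", "A", "C", "B", "A", "D", "A"]

counts = Counter(answers)
total = sum(counts.values())

# Percentage breakdown by category and the matching slice angle in degrees.
percentages = {category: 100 * count / total for category, count in counts.items()}
angles = {category: 360 * count / total for category, count in counts.items()}

for category in sorted(counts):
    print(f"{category}: {percentages[category]:5.1f}%  ({angles[category]:.0f} degrees)")
```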
Bar chart
• Another common method for
graphically presenting nominal
and ordinal scaled data
• One bar is used to represent the
frequency for each category
• The bars are usually positioned
vertically with their bases located
on the horizontal axis of the graph
• The bars are separated, and this is
why such a graph is frequently
used for nominal and ordinal data
– the separation emphasizes the
plotting of frequencies for distinct
categories
Time Series Graph
• The time series graph is
a graph of data that
have been measured
over time.
• The horizontal axis of
this graph represents
time periods and the
vertical axis shows the
numerical values
corresponding to these
time periods