Probability & Statistics
for Engineers & Scientists
Authors: Walpole, Myers, Myers, Ye
Instructor:
NADEEM KHAN
Lecturer, S & H Dept.,
FAST-NU, Main Campus, Karachi
Nadeem.arif@nu.edu.pk
Text Book

Reference Book
Course Outline:
MT206 – Probability & Statistics (4 Credit Hours)
Week  Topics
01    Intro. to Statistics, Measures of Central Tendency & Dispersion
02    Bar Chart, Histogram, Stem-Leaf Plot, Box Plot, Dot Plot, Frequency Curves, Ogive, Skewness & Kurtosis
03    Introduction to Probability: Sample Space, Tree Diagram, Event, Set Theory, Venn Diagram
04    Counting techniques, Kinds of Events, Additive rules
05    Conditional Probability, Independence, Multiplicative rules, Bayes’ Theorem
06    1st Mid-Term Examination
Course Outline (Contd.)
MT206 – Probability & Statistics (4 Credit Hours)
Week  Topics
07    Random Variables & Probability Distributions: PMF, PDF, CDF, Joint & Marginal Probability Distributions, Mathematical Expectation
08    Discrete Distributions: Binomial & Multinomial, Poisson, Geometric, Hypergeometric, and Discrete Uniform
09    Continuous Distributions: Normal, Exponential, Uniform, Chi-Square
10    Testing of Hypothesis: z-test, t-test
11    Goodness of Fit Test, Chi-Square Test of Independence
12    2nd Mid-Term Examination
Course Outline (Contd.)
MT206 – Probability & Statistics (4 Credit Hours)
Week  Topics
13    Correlation & Regression
14    Non-Linear Regression: Polynomial regression
15    ANOVA
16    Final Examination
• Note: The above course outline & schedule is tentative.
Marks Distribution
S. No.  Particulars    % Marks
01      Assignments    10
02      Quizzes        10
03      1st Mid Term   15
04      2nd Mid Term   15
05      Final Exam     50
        Total          100
Important Instructions
• Be in the classroom on time.
• All students are required to maintain at least 80% attendance. Students who fail to do so become ineligible to take the final exam.
• Turn off your cell phones and any other electronic devices before entering the class.
• Maintain the decorum of the classroom at all times.
• Avoid conversation with your classmates while the lecture is in progress.
• Submit your assignments on time; marks will be deducted for submissions after the deadline.
Important Instructions (Contd.)
• Each assignment should include a title page with your complete name, roll number, subject name, date, etc.
• Assignments should be submitted in a hole-punch clip folder (snap attached).
• Incomplete assignments lead to a reduction in marks.
• Avoid plagiarism.
• For quizzes, bring your own loose pages.
• Violation of any instruction leads to a reduction in marks.
Introduction and Data Collection
PROBABILITY AND STATISTICS
Learning Objectives
In this topic, you will learn:
• How statistics is used in business
• The sources of data used in business
• The types of data used in business
• The basics of Microsoft Excel
• The basics of Minitab
Why Learn Statistics?
So you are able to make better sense of the ubiquitous use of numbers:
• Business memos
• Business research
• Technical reports
• Technical journals
• Newspaper articles
• Magazine articles
What is statistics?
• A branch of mathematics that takes numbers and transforms them into useful information for decision makers
• Methods for processing & analyzing numbers
• Methods for helping reduce the uncertainty inherent in decision making
Why Study Statistics?
Decision Makers Use Statistics To:
• Present and describe business data and information properly
• Draw conclusions about large groups of individuals or items, using information collected from subsets of the individuals or items
• Make reliable forecasts about a business activity
• Improve business processes
Types of Statistics
• Statistics
  • The branch of mathematics that transforms data into useful information for decision makers.
Descriptive Statistics: Collecting, summarizing, and describing data
Inferential Statistics: Drawing conclusions and/or making decisions concerning a population based only on sample data
Descriptive Statistics
• Collect data
  • e.g., Survey
• Present data
  • e.g., Tables and graphs
• Characterize data
  • e.g., Sample mean $= \frac{\sum X_i}{n}$
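The sample mean formula on this slide can be checked with a few lines of Python. This is a minimal sketch; the five weights are made-up illustrative values, not data from the course.

```python
from statistics import mean

# Hypothetical sample of five weights in pounds (illustrative values only).
weights = [118.0, 122.0, 125.0, 115.0, 120.0]

# Sample mean: the sum of the observations divided by the sample size n.
n = len(weights)
sample_mean = sum(weights) / n

print(sample_mean)    # 120.0
print(mean(weights))  # 120.0, the standard library gives the same result
```

The hand-written formula and `statistics.mean` agree, which is a quick way to confirm the formula was read correctly.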
Inferential Statistics
• Estimation
  • e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis testing
  • e.g., Test the claim that the population mean weight is 120 pounds
Drawing conclusions about a large group of individuals based on a subset of the large group.
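Both ideas on this slide can be sketched in a few lines of Python. The eight weights below are hypothetical, and the one-sample t statistic is used here as one common way to test a claim about a mean (the t-test itself is scheduled for week 10); treat this as an illustration under those assumptions rather than the course's prescribed procedure.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of eight weights in pounds (illustrative values only).
sample = [118.0, 122.0, 125.0, 115.0, 120.0, 126.0, 121.0, 117.0]
mu0 = 120.0  # claimed population mean weight from the slide

# Estimation: the sample mean is a point estimate of the population mean.
n = len(sample)
x_bar = mean(sample)

# Hypothesis testing: one-sample t statistic, t = (x_bar - mu0) / (s / sqrt(n)),
# where s is the sample standard deviation (n - 1 in the denominator).
s = stdev(sample)
standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu0) / standard_error

print(f"point estimate = {x_bar}")
print(f"t statistic    = {t_stat:.3f} with {n - 1} degrees of freedom")
```

A t statistic close to zero, as here, means the sample gives little evidence against the claimed mean; the formal decision compares it with a critical value from a t table for n - 1 degrees of freedom.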
Basic Vocabulary of Statistics
VARIABLE
A variable is a characteristic of an item or individual.
DATA
Data are the different values associated with a variable.
OPERATIONAL DEFINITIONS
Data values are meaningless unless their variables have operational
definitions, universally accepted meanings that are clear to all associated
with an analysis.
Basic Vocabulary of Statistics
POPULATION
A population consists of all the items or individuals about which
you want to draw a conclusion.
SAMPLE
A sample is the portion of a population selected for analysis.
PARAMETER
A parameter is a numerical measure that describes a
characteristic of a population.
STATISTIC
A statistic is a numerical measure that describes a characteristic of
a sample.
Population vs. Sample
Population: Measures used to describe the population are called parameters.
Sample: Measures computed from sample data are called statistics.
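The distinction between a parameter and a statistic can be seen by simulating a population and drawing a sample from it. In this sketch the population of 10,000 normally distributed weights, the sample size of 50, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 weights with mean 120 and std. dev. 15.
population = [random.gauss(120, 15) for _ in range(10_000)]

# A sample is a portion of the population selected for analysis.
sample = random.sample(population, 50)

parameter = mean(population)  # describes the whole population
statistic = mean(sample)      # describes only the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In practice the population is rarely fully observed, so the parameter is unknown and the statistic serves as its estimate; the simulation only makes both visible side by side.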
Why Collect Data?
• A marketing research analyst needs to assess the effectiveness of a new television advertisement.
• A pharmaceutical manufacturer needs to determine whether a new drug is more effective than those currently in use.
• An operations manager wants to monitor a manufacturing process to find out whether the quality of the product being manufactured is conforming to company standards.
• An auditor wants to review the financial transactions of a company in order to determine whether the company is in compliance with generally accepted accounting principles.
Sources of Data
• Primary Sources: The data collector is the one using the data for analysis
  • Data from a political survey
  • Data collected from an experiment
  • Observed data
• Secondary Sources: The person performing data analysis is not the data collector
  • Analyzing census data
  • Examining data from print journals or data published on the internet
Sources of data fall into four categories
• Data distributed by an organization or an individual
• A designed experiment
• A survey
• An observational study
Types of Variables
• Categorical (qualitative) variables have values that can only be placed into categories, such as “yes” and “no.”
• Numerical (quantitative) variables have values that represent quantities.
Types of Data
Data
  Categorical
    Examples: Marital Status, Political Party, Eye Color (Defined categories)
  Numerical
    Discrete
      Examples: Number of Children, Defects per hour (Counted items)
    Continuous
      Examples: Weight, Voltage (Measured characteristics)
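The type of data determines which summaries make sense. The sketch below uses made-up values for one variable of each kind named on this slide: categories are counted, while counted and measured quantities can be averaged.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations (illustrative values only).
eye_color = ["brown", "blue", "brown", "green", "brown"]  # defined categories
children = [0, 2, 1, 3, 2]                                # counted items
weight_kg = [61.5, 72.3, 68.0, 80.1, 59.9]                # measured characteristic

# Categories have no arithmetic, so they are summarized by frequencies.
color_counts = Counter(eye_color)
most_common_color = color_counts.most_common(1)[0][0]

# Counted and measured values are quantities, so a mean is meaningful.
mean_children = mean(children)
mean_weight = mean(weight_kg)

print(color_counts)
print(most_common_color)  # brown
print(mean_children)      # 1.6
print(round(mean_weight, 2))
```

Note that the mean of a counted variable need not itself be a possible count: no family has 1.6 children, yet the average is still a useful summary.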
Levels of Measurement
• A nominal scale classifies data into distinct categories in which no ranking is implied.
Categorical Variable           Categories
Personal Computer Ownership    Yes / No
Type of Stocks Owned           Growth / Value / Other
Internet Provider              Microsoft Network / AOL / Other
Levels of Measurement
• An ordinal scale classifies data into distinct categories in which ranking is implied.
Categorical Variable             Ordered Categories
Student class designation        Freshman, Sophomore, Junior, Senior
Product satisfaction             Satisfied, Neutral, Unsatisfied
Faculty rank                     Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings   AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                   A, B, C, D, F
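Because an ordinal scale carries a ranking, comparisons and order-based summaries such as the median are meaningful even though arithmetic on the labels is not. The sketch below uses the product satisfaction categories from the table with a handful of made-up survey responses; the numeric ranks are an encoding chosen for illustration only.

```python
from statistics import median_low

# Ordered categories from lowest to highest satisfaction.
order = ["Unsatisfied", "Neutral", "Satisfied"]
rank = {label: position for position, label in enumerate(order)}

# Hypothetical survey responses (illustrative values only).
responses = ["Satisfied", "Neutral", "Satisfied", "Unsatisfied", "Satisfied"]

# Ranking is implied, so categories can be compared.
print(rank["Neutral"] < rank["Satisfied"])  # True

# The median depends only on order; median_low always returns an actual rank.
ranks = [rank[r] for r in responses]
median_label = order[median_low(ranks)]
print(median_label)  # Satisfied
```

The ranks 0, 1, 2 only record order; the gap between "Unsatisfied" and "Neutral" is not claimed to equal the gap between "Neutral" and "Satisfied", which is why a mean of the ranks is avoided here.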
Levels of Measurement
• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.
• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.
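Temperature is a standard illustration of the difference, although it is not taken from the slides: Celsius is an interval scale (its zero is arbitrary), while Kelvin is a ratio scale (its zero is a true zero). The sketch below shows that differences agree on both scales but ratios do not.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0  # illustrative temperatures in degrees Celsius

# Differences are meaningful on both scales and agree (about 10 degrees).
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are meaningful only where zero is a true zero (Kelvin).
naive_ratio = warm_c / cool_c  # 2.0, but 20 °C is not "twice as hot" as 10 °C
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(diff_c, round(diff_k, 6))
print(naive_ratio, round(true_ratio, 4))
```

The ratio computed in Kelvin is only about 1.035, which is why statements such as "twice as much" should be reserved for ratio-scale measurements like weight or voltage.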
Interval and Ratio Scales
Chapter Summary
In this chapter, we have
• Reviewed why a manager needs to know statistics
• Introduced key definitions:
  • Population vs. Sample
  • Primary vs. Secondary data types
  • Categorical vs. Numerical data
• Examined descriptive vs. inferential statistics
• Reviewed data types and measurement levels
LECTURE 1 STATISTICS for data analytics and machine learning
