This document discusses statistical packages used for data analysis. It describes the Statistical Package for the Social Sciences (SPSS) in detail. SPSS is a widely used statistical software package that can perform complex data analysis, including regression, ANOVA, and multivariate analysis. It has advantages such as easy data import, reliable results, and useful graphs. However, it also has limitations such as difficulty interpreting error logs and incomplete menus. The document also briefly mentions other packages like Microsoft Excel, SAS, Minitab, and Stata.
This document discusses quantitative and qualitative data analysis. It defines key terms like analysis, hypothesis, descriptive statistics, inferential statistics, and parametric and nonparametric tests. It explains the steps of quantitative data analysis which include data preparation, describing the data through summary statistics, drawing inferences through inferential statistics, and interpreting the results. Common parametric tests include t-tests, ANOVA, and correlation. Common nonparametric tests include chi-square, median, Mann-Whitney, and Wilcoxon tests. The document emphasizes accurate presentation of analyzed data through narratives and tables.
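As a concrete illustration of one of the parametric tests mentioned above, the paired t statistic can be computed directly. This is a minimal sketch with made-up pre/post scores (the data are hypothetical, not from the document):

```python
import math
from statistics import mean, stdev

# Hypothetical pre/post scores for the same subjects (illustrative only)
pre  = [12, 15, 11, 14, 13, 16, 12, 15]
post = [14, 17, 13, 15, 15, 18, 13, 17]

# Paired t-test works on the per-subject differences
diffs = [b - a for a, b in zip(pre, post)]
n = len(diffs)

# t statistic: mean difference divided by its standard error
t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
print(round(t, 3))
```

The resulting t value would then be compared against the t distribution with n − 1 degrees of freedom to obtain a p-value.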
This document defines and provides examples of three different types of plots used to represent data distributions: stem and leaf plots, bar graphs, and histograms. It explains that stem and leaf plots organize large data sets by separating values into "stems" and "leaves", bar graphs represent frequencies using bars of equal or varying widths, and histograms use intervals to group continuous data and display frequencies. The document also lists advantages and disadvantages of each type of plot.
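A stem-and-leaf plot of the kind described can be built in a few lines: each value is split into a stem (here, the tens digit) and a leaf (the units digit). The data set below is hypothetical, chosen only to illustrate the layout:

```python
from collections import defaultdict

# Hypothetical data set (illustrative values)
data = [23, 25, 31, 34, 34, 38, 42, 45, 45, 47, 51]

# Stem = tens digit, leaf = units digit
stems = defaultdict(list)
for v in sorted(data):
    stems[v // 10].append(v % 10)

# Print one row per stem, leaves in ascending order
for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")
```

This keeps every raw value visible while still showing the shape of the distribution, which is the main advantage the document attributes to stem-and-leaf plots.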
This document is a presentation on the correlation coefficient by Irshad Narejo. It defines correlation as a technique used to measure the relationship between two or more variables. A correlation coefficient measures the degree to which changes in one variable can predict changes in another, though correlation does not imply causation. Correlation coefficient formulas return a value between -1 and 1 to indicate the strength and direction of relationships between data. Positive correlation means high values in one variable are associated with high values in the other, while negative correlation means high values in one variable are associated with low values in the other. The document discusses Pearson's correlation coefficient formula and provides an example of calculating correlation by hand versus using SPSS.
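The by-hand calculation of Pearson's r mentioned above amounts to dividing the covariance of the two variables by the product of their standard deviations. A minimal sketch with illustrative paired data (not the presentation's own example):

```python
import math

# Hypothetical paired observations (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Numerator: sum of products of deviations from the means
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
# Denominator: product of the square roots of the sums of squared deviations
sx = math.sqrt(sum((a - mx) ** 2 for a in x))
sy = math.sqrt(sum((b - my) ** 2 for b in y))

r = cov / (sx * sy)  # always falls in [-1, 1]
print(round(r, 3))
```

A positive r near 1 indicates a strong positive association; SPSS would report the same value via its bivariate correlation procedure.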
Biostatistics applies statistical tools to biological and medical data. It is used (1) to establish what is considered normal in areas like physiology and anatomy, (2) to determine whether observed differences are due to chance or other factors in areas like comparing drug effectiveness and public health measures, and (3) to identify disease characteristics and evaluate health programs by comparing data from experiment and control groups.
General rules for graphical representation of data include providing a suitable title indicating the subject, including the proper measurement unit, choosing an appropriate scale to accurately represent the data, indexing colors and lines for better understanding, citing data sources, keeping the graph simple and neat for easy understanding, and using the correct size, fonts, and colors to make the graph a clear visual aid. The sample graph shows the birth months of students.
This PowerPoint presentation on the anatomy and physiology of the ear (the sense of hearing) is intended to equip readers with a basic understanding of the organ: how it operates and how it connects to the central nervous system to perceive sound and aid balance.
This document provides an introduction to statistics and biostatistics in healthcare. It defines statistics and biostatistics, outlines the basic steps of statistical work, and describes different types of variables and methods for collecting data. The document also discusses different types of descriptive and inferential statistics, including measures of central tendency, dispersion, frequency, t-tests, ANOVA, regression, and different types of plots/graphs. It explains how statistics is used in healthcare for areas like disease burden assessment, intervention effectiveness, cost considerations, evaluation frameworks, health care utilization, resource allocation, needs assessment, quality improvement, and product development.
This document provides an overview of biostatistics. It defines biostatistics and discusses topics like data collection, presentation through tables and charts, measures of central tendency and dispersion, sampling, tests of significance, and applications in various medical fields. The key areas covered include defining variables and parameters, common statistical terms, sources of data collection, methods of presenting data through tabulation and diagrams, analyzing data through measures like mean, median, mode, range and standard deviation, sampling and related errors, significance tests, and uses of biostatistics in areas like epidemiology and clinical trials.
This document provides an overview of biostatistics. It defines biostatistics and discusses topics like data collection, presentation through tables and charts, measures of central tendency and dispersion, sampling, tests of significance, and applications of biostatistics in various medical fields. The document aims to introduce students to important biostatistical concepts and their use in research, clinical trials, epidemiology and other areas of medicine.
A frequency distribution summarizes data by organizing it into intervals and counting the frequency of observations within each interval. It presents the data distribution in a table or chart. To create one, you first collect data, identify the range of values, create intervals, count frequencies within each interval, and construct a table or chart showing the intervals and frequencies. Frequency distributions are useful for understanding central tendency, dispersion, patterns and making comparisons. They have many applications across fields like descriptive statistics, data analysis, business, economics, manufacturing, healthcare and education.
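The construction steps just described (collect data, identify the range, create intervals, count frequencies, present the result) can be sketched as follows, assuming hypothetical exam scores and fixed 10-point intervals:

```python
from collections import Counter

# Hypothetical exam scores (illustrative data)
scores = [52, 55, 61, 64, 67, 68, 70, 72, 75, 78, 81, 84, 90, 93]

width = 10
# Map each value to the lower bound of its interval, e.g. 52 -> 50
bins = Counter((s // width) * width for s in scores)

# Present the table: one row per interval with its frequency
for lo in sorted(bins):
    print(f"{lo}-{lo + width - 1}: {bins[lo]}")
```

The same counts could equally be drawn as a histogram; the table and the chart are two presentations of one frequency distribution.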
This document provides an introduction to statistics and biostatistics. It discusses what statistics and biostatistics are, their uses, and what they cover. Specifically, it explains that biostatistics applies statistical methods to biological and medical data. It also discusses different types of data, variables, coding data, and strategies for describing data, including tables, diagrams, frequency distributions, and numerical measures. Graphs and charts discussed include bar charts, pie charts, histograms, scatter plots, box plots, and stem-and-leaf plots. The document provides examples and illustrations of these concepts and techniques.
This document provides an overview of statistics presented by five students. It defines statistics as the practice of collecting and analyzing numerical data. Descriptive statistics summarize data through parameters like the mean, while inferential statistics interpret descriptive statistics to draw conclusions. The document discusses examples of statistics, different types of charts and graphs, descriptive versus inferential statistics, and the importance and applications of statistics in fields like business, economics, and social sciences. It also covers topics like sampling methods, characteristics of sampling, probability versus non-probability sampling, and differences between the two.
This document provides information about medical statistics, including what statistics are, how they are used in medicine, and some key statistical concepts. It discusses that statistics is the study of collecting, organizing, summarizing, presenting, and analyzing data. Medical statistics specifically deals with applying these statistical methods to medicine and health sciences areas like epidemiology, public health, and clinical research. It also gives an overview of some common statistical analyses, like descriptive versus inferential statistics, populations and samples, variables and data types, and some statistical notations.
Data analysis in Statistics-2023 guide 2023 (ayesha455941)
- Statistics is the science of collecting, analyzing, interpreting, presenting, and organizing data. It is used across various fields including physics, business, social sciences, and healthcare.
- There are two main branches of statistical analysis: descriptive statistics, which summarizes and describes data, and inferential statistics, which draws conclusions about populations based on samples.
- Key concepts include populations, samples, parameters, statistics, and the differences between descriptive and inferential analysis. Measures of central tendency like the mean, median, and mode are used to describe data, while measures of variation like the range, variance, and standard deviation quantify how spread out the data is.
- Descriptive statistics describe the properties of sample and population data through metrics like mean, median, mode, variance, and standard deviation. Inferential statistics use those properties to test hypotheses and draw conclusions about large groups.
- Descriptive statistics focus on central tendency, variability, and distribution of data. Inferential statistics allow statisticians to draw conclusions about populations based on samples and determine the reliability of those conclusions.
- Statistics rely on variables, which are characteristics or attributes that can be measured and analyzed. Variables can be qualitative like gender or quantitative like mileage, and quantitative variables can be discrete like test scores or continuous like height.
- Descriptive statistics describe the properties of sample and population data through metrics like mean, median, mode, variance, and standard deviation. Inferential statistics use those properties to test hypotheses and draw conclusions about large groups.
- The two major areas of statistics are descriptive statistics, which summarizes data, and inferential statistics, which uses descriptive statistics to make generalizations and predictions.
- Mean, median, and mode describe central tendency, with mean being the average, median being the middle number, and mode being the most frequent value.
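The three measures of central tendency listed above are available directly in Python's standard library; a short sketch with illustrative data:

```python
from statistics import mean, median, mode

# Small illustrative data set (hypothetical values)
data = [4, 7, 7, 9, 10, 12, 15]

m1 = mean(data)    # the average of all values
m2 = median(data)  # the middle value of the sorted data
m3 = mode(data)    # the most frequent value
print(m1, m2, m3)
```

For this data set the mode (7) lies below the median (9), which in turn lies below the mean, a typical pattern for right-skewed data.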
Biostatistics in clinical research involves the application of statistical methods to analyze and interpret data from clinical trials. It plays a crucial role in study design, sample size determination, data analysis, and result interpretation. Biostatisticians ensure that clinical research findings are valid, reliable, and meaningful, contributing to evidence-based medicine. Their expertise helps researchers make informed decisions, assess treatment efficacy, and draw accurate conclusions about the safety and effectiveness of interventions.
Statistics is the collection, organization, analysis, and presentation of data. It has become important for professionals, scientists, and citizens to make sense of large amounts of data. Statistics are used across many disciplines from science to business. There are two main types of statistical methods - descriptive statistics which summarize data through measures like the mean and median, and inferential statistics which make inferences about populations based on samples. Descriptive statistics describe data through measures of central tendency and variability, while inferential statistics allow inferences to be made from samples to populations through techniques like hypothesis testing.
This document provides an overview of key concepts in nursing statistics. It begins by outlining the course objectives, which are to develop statistical literacy, analyze nursing literature, and critically evaluate different statistical methods. It then defines different statistical terms and concepts, including the different types of data, measures of central tendency, variability, and correlation. Examples are provided to illustrate these statistical techniques. The document serves to introduce nursing students to important foundational knowledge in statistics for analyzing nursing research.
2. SCOPE
Part 1 Introduction
• Definitions
• Importance of statistics
• Application of biostatistics
• Statistical notations
• Types of data
• Variables
• Sources of data
• Data presentation
• Data summarization
• Sampling
• Probability
Part 2 Basic statistical data analysis
• t-test
• z-test
• Binomial test
• Chi-square test
• Fisher exact test
• Correlation
• Simple linear regression
3. PART 1: DEFINITIONS
Statistics
• The study and manipulation of data, including ways to gather, review,
analyze, and draw conclusions from data.
• The two major areas of statistics are descriptive and inferential statistics.
• Statistics can be communicated at different levels, ranging from non-numerical
descriptors (nominal level) to numerical values in reference to a zero point (ratio
level).
• Several sampling techniques can be used to compile statistical data,
including simple random, systematic, stratified, or cluster sampling.
• Statistics are present in almost every department of every company and are
an integral part of investing.
4. PART 1: DEFINITIONS
Biostatistics or biometry
• Branch of biological science concerned with the study and methods for
collecting, presenting, analysing and interpreting biological research
data.
• The primary aim of this branch of science is to allow researchers, health
care providers and public health administrators to make decisions
concerning a population using sample data.
• For example, the government wants to know the prevalence of a specific
health problem among residents in a given town. If there are 3 million
residents in the town, it may not be realistic to test each of them individually to
determine whether they have the disease or are susceptible to it.
5. PART 1: DEFINITIONS
Biostatistics or biometry
• The realistic and cost-effective approach is to study a representative subset
of the population and apply their results to the entire group.
• Hence biostatistics makes research possible by providing tools and
techniques for collecting, analysing and interpreting biological and medical
data, allowing stakeholders to draw actionable insights about a population
from sample data.
• Biostatisticians usually get their data from a wide range of sources,
including medical records, peer-reviewed literature, claims records, vital
records, disease registries, surveillance, experiments and surveys.
• The professionals collaborate with scientists, health care providers, public
health administrators and other stakeholders.
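The sample-to-population idea above can be made concrete with a small sketch. All numbers below are hypothetical (a sample of 1,000 residents, of whom 120 test positive); the sketch estimates the town-wide prevalence with a normal-approximation 95% confidence interval:

```python
import math

# Hypothetical figures for illustration only: 1,000 residents sampled
# from the 3-million-resident town, 120 of whom test positive.
n = 1000
positive = 120

p_hat = positive / n                       # sample prevalence
se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of a proportion
ci_low = p_hat - 1.96 * se                 # normal-approximation 95% CI
ci_high = p_hat + 1.96 * se

print(f"Estimated prevalence: {p_hat:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
```

The interval applies to the whole population even though only a tiny fraction was tested, which is the cost saving described above.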
6. PART 1: DEFINITIONS
Biostatistics or biometry sources of data
• Medical records: Medical records can provide researchers with data about
diagnoses, lab tests and procedures common amongst a specific population,
such as people above 50 years working in the police force.
• Claims data: Scientists can get data about doctor's appointments and medical
bills in claims data.
• Vital records: Vital records contain information about births, deaths, causes of
death and divorces.
• Peer-reviewed literature: Researchers can also pull data from the articles and
studies that experts in a particular field published in peer-reviewed journals.
• Surveys: The researchers can collect primary data using surveys designed
specifically for an experiment.
• Disease registries: These systems help to collect, store, analyse, retrieve and
disseminate information regarding people living with specific diseases.
8. PART 1: DESCRIPTIVE STATISTICS
Descriptive statistics
• Mostly focus on the central tendency, variability, and distribution of sample data.
• Central tendency is an estimate of the characteristics of a typical element of
a sample or population. It includes descriptive statistics such as the mean, median,
and mode.
• Variability refers to a set of statistics that show how much difference there is
among the elements of a sample or population along the characteristics
measured. It includes metrics such as range, variance, and standard deviation.
• The distribution refers to the overall “shape” of the data, which can be depicted
on a chart such as a histogram or a dot plot, and includes properties such as the
probability distribution function, skewness, and kurtosis
11. FORMULA FOR SD
SD = √[ Σ(x − mean)² / n ]
(the square root of the sum of the squared deviations from the mean, divided by the number of elements in the data set)
12. SD
1. Calculate the mean
2. Subtract the mean from each element individually
3. Square the differences from subtraction
4. Get the sum of the squared differences
5. Divide the sum of the squared difference by the number of elements
6. Get the square root of the answer after division (quotient)=SD
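The six steps above can be sketched directly in Python, using hypothetical data values. Note that dividing by the number of elements, as in step 5, gives the population SD; a sample SD would divide by n − 1 instead.

```python
import math

data = [4, 8, 6, 5, 3]                    # hypothetical sample values

mean = sum(data) / len(data)              # 1. calculate the mean
diffs = [x - mean for x in data]          # 2. subtract the mean from each element
squared = [d ** 2 for d in diffs]         # 3. square the differences
total = sum(squared)                      # 4. sum the squared differences
variance = total / len(data)              # 5. divide by the number of elements
sd = math.sqrt(variance)                  # 6. square root of the quotient = SD

print(f"mean = {mean}, SD = {sd:.3f}")
```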
13. DESCRIPTIVE STATISTICS
Central tendency
• Mean, median, mode
Variability
• Range, variance, SD
Shape or distribution
• Skewness and kurtosis
• Relative frequencies/proportions
• Graphs, charts and tables
14. PART 1: DESCRIPTIVE STATISTICS
Descriptive statistics
• Can also describe differences between observed
characteristics of the elements of a data set.
• Can help us understand the collective properties of the
elements of a data sample and form the basis for testing
hypotheses and making predictions using inferential statistics
• Useful in summarizing data
• Can be in the form of numbers, tables or graphs
16. PART 1: INFERENTIAL STATISTICS
Inferential statistics
• Is a tool that statisticians use to draw conclusions about the characteristics of a
population, drawn from the characteristics of a sample, and to determine how
certain they can be of the reliability of those conclusions.
• Based on the sample size and distribution, statisticians can calculate the
probability that statistics, which measure the central tendency, variability,
distribution, and relationships between characteristics within a data sample,
provide an accurate picture of the corresponding parameters of the whole
population from which the sample is drawn.
• Are used to make generalizations about large groups, such as estimating
average demand for a product by surveying a sample of consumers’ buying
habits or attempting to predict future events.
19. FACTORS ASSOCIATED WITH MEN’S INVOLVEMENT IN
ANTENATAL CARE VISITS IN ASMARA, ERITREA: COMMUNITY-
BASED SURVEY
The necessity for a pregnant woman to attend ANC was recognized by almost all
(98.7%) of the male partners; however, 26.6% identified a minimum frequency of
ANC visits.
The percentage of partners who visited the ANC service during their last pregnancy was
88.6%. The percentages of male partners who scored at or above the mean level of
knowledge, attitude and involvement in ANC were 57.0%, 57.5%, and 58.7%, respectively.
Religion (p = 0.006, AOR = 1.91, 95% CI 1.20–3.03), level of education (p =
0.027, AOR = 1.96, 95% CI 1.08–3.57), and level of knowledge (p<0.001, AOR =
3.80, 95% CI 2.46–5.87) were significantly associated factors of male involvement in
ANC.
20. METHODS USED
A list of households with pregnant women was prepared for each administrative area
and was used as the sampling frame.
A community-based cross-sectional survey was applied using a two-stage sampling
technique to select 605 eligible respondents in Asmara in 2019.
Data was collected using a pretested structured questionnaire.
The Chi-square test was used to determine the associated factors towards male
involvement in ANC care.
Multivariable logistic regression was employed to determine the factors of male
participation in ANC.
A P-value less than 0.05 was considered statistically significant.
21. USE-CASE INFORMATION NEEDED
Define target population
State the type of statistics you expect (descriptive, inferential, or both)
State the possible sources of data
22. PART 1: INFERENTIAL TESTS
Inferential tests
• Tests concerned with using selected sample data compared with population
data in a variety of ways are called inferential statistical tests.
• There are two main bodies of these tests.
• The first and most frequently used are called parametric statistical tests.
• The second are called nonparametric tests.
• For each parametric test, there may be a comparable nonparametric test,
sometimes even two or three.
• Parametric tests are tests of significance appropriate when the data
represent an interval or ratio scale of measurement
23. PART 1: INFERENTIAL TESTS
Parametric tests
• Tests of significance appropriate when the data represent an interval or ratio
scale of measurement and other specific assumptions have been met, specifically,
that the sample statistics relate to the population parameters, that the variance of
the sample relates to the variance of the population, that the population has
normality, and that the data are statistically independent.
Nonparametric tests
• Statistical tests used when the data represent a nominal or ordinal level scale or
when assumptions required for parametric tests cannot be met, specifically, small
sample sizes, biased samples, an inability to determine the relationship between
sample and population, and unequal variances between the sample and
population. These are a class of tests that do not hold the assumptions of normality.
24. PART 1: DATA TYPES
Data types
• Qualitative: dichotomous, multinomial
• Quantitative: discrete, continuous
26. ILLUSTRATION OF QUALITATIVE AND
QUANTITATIVE DATA
To assess the nutritional status and to determine potential risk factors of malnutrition
in children under 3 years of age in Nghean, Vietnam.
The study was carried out in November 2007; a total of 383 child/mother pairs were
selected using a 2-stage cluster sampling methodology. A structured questionnaire
was administered to mothers in their home settings.
Anthropometric measurements defined underweight (weight for age),
wasting (weight for height) and stunting (height for age) on the basis of reference
data from the National Center for Health Statistics (NCHS) / World Health
Organization (WHO).
27. ILLUSTRATION OF QUALITATIVE AND QUANTITATIVE
DATA
Logistic regression analysis was used to take into account the hierarchical relationships between potential
determinants of malnutrition.
The mean Z-score for weight-for-age was -1.51 (95% CI -1.64, -1.38), for height-for-age was -
1.51 (95% CI -1.65, -1.37) and for weight-for-height was -0.63 (95% CI -0.78, -0.48). Of the
children, 103 (27.7%) were underweight, 135 (36.3%) were stunted and 38 (10.2%) were wasted.
Region of residence, ethnicity, mother’s occupation, household size, mother’s BMI, number of children in
family, weight at birth, time of initiation of breast-feeding and duration of exclusive breast-feeding
were found to be significantly related to malnutrition.
The findings of this study indicate that malnutrition is still an important problem among children
under three years of age in Nghean, Vietnam. Socio-economic and environmental factors and feeding
practices are significant risk factors for malnutrition among children under three.
28. PART 1: COMMON STATISTICAL TERMS
Binomial test
• When a test has two alternative outcomes, either failure or success, and you
know what the possibilities of success are, you may apply a binomial test.
• Use a binomial test to determine if an observed test outcome is different
from its predicted outcome.
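As a sketch of the idea, consider a hypothetical experiment with 20 trials, 14 successes, and a known success probability of 0.5. The exact probability of an outcome at least this extreme in one direction can be computed from the binomial distribution (a minimal one-sided sketch, not a full two-sided test):

```python
from math import comb

# Hypothetical experiment: 14 successes in 20 trials, null probability 0.5.
n, k, p = 20, 14, 0.5

def binom_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# One-sided p-value: probability of k or more successes under the null.
p_upper = sum(binom_pmf(n, i, p) for i in range(k, n + 1))
print(f"P(X >= {k}) = {p_upper:.4f}")
```

A small one-sided p-value (here about 0.058) suggests the observed outcome may differ from the predicted one, which is exactly the question the binomial test addresses.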
Causation
• Causation is a direct relationship between two variables.
• Two variables have a direct relationship if a change in one’s value causes a
change in the other variable.
• In that case, one becomes the cause, and the other is the effect.
29. PART 1: COMMON STATISTICAL TERMS
Confidence interval
• A confidence interval measures the level of uncertainty of a collection of
data.
• This is the range in which you anticipate your values to fall within a specific
degree of confidence if you repeat the same experiment.
Correlation coefficient
• The correlation coefficient describes the level of correlation or dependence
between two variables.
• This value is a number between -1 and +1, and if it falls beyond this limit,
there’s been a mistake in the measurement of a coefficient.
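The correlation coefficient can be computed by hand from covariance and the two standard deviations. A minimal Python sketch with hypothetical paired measurements:

```python
import math

x = [1, 2, 3, 4, 5]        # hypothetical paired measurements
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
# numerator: sum of cross-products of deviations (covariance, unscaled)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
# denominators: root sums of squared deviations for each variable
sx = math.sqrt(sum((a - mx) ** 2 for a in x))
sy = math.sqrt(sum((b - my) ** 2 for b in y))
r = cov / (sx * sy)
print(f"r = {r:.3f}")
```

By construction this value always lies between −1 and +1; anything outside that range signals a computational mistake, as noted above.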
30. PART 1: COMMON STATISTICAL TERMS
Z-score:
• A score expressed in units of standard deviations from
the mean. It is also known as a standard score.
Z-test:
• A test of any of a number of hypotheses in inferential
statistics that has validity if sample sizes are sufficiently
large and the underlying data are normally distributed.
31. PART 1: COMMON STATISTICAL TERMS
Hypothesis tests
• A hypothesis test is a method of testing results. Before conducting research, the researcher creates a
hypothesis or a theory for what they believe the results will prove.
• A study then tests that theory.
Kruskal-Wallis one-way analysis of variance:
• A nonparametric inferential statistic used to compare two or more independent groups for statistical
significance of differences.
Mann-Whitney U-test (U):
• A nonparametric inferential statistic used to determine whether two uncorrelated groups differ
significantly.
McNemar’s test:
• A nonparametric method used on nominal data to determine whether the row and column marginal
frequencies are equal. *NPT
32. PART 1: COMMON STATISTICAL TERMS
Dependent variable
• A dependent variable is a value that depends on another variable to exhibit change.
• When computing in statistical analysis, you can use dependent variables to make conclusions about causes of
events, changes and other translations in statistical research.
Independent variable
• In a statistical experiment, an independent variable is one that you modify, control or manipulate in order to
investigate its effects.
• It's called independent since no other factor in the research affects it.
Multivariate analysis of covariance (MANCOVA):
• An extension of ANOVA that incorporates two or more dependent variables in the same analysis. It is an
extension of MANOVA where artificial dependent variables (DVs) are initially adjusted for differences in one or
more covariates. It computes the multivariate F statistic.
Multivariate analysis of variance (MANOVA):
• It is an ANOVA with several dependent variables.
33. PART 1: COMMON STATISTICAL TERMS
One-way analysis of variance (ANOVA):
• An extension of the independent group t-test where you have more than two groups. It computes the
difference in means both between and within groups and compares variability between groups and
variables. Its parametric test statistic is the F-test.
Pearson correlation coefficient (r):
• This is a measure of the correlation or linear relationship between two variables x and y, giving a value
between +1 and −1 inclusive.
• It is widely used in the sciences as a measure of the strength of linear dependence between two
variables.
Pooled point estimate:
• An approximation of a point, usually a mean or variance, that combines information from two or more
independent samples believed to have the same characteristics.
• It is used to assess the effects of treatment samples versus comparative samples
34. PART 1: COMMON STATISTICAL TERMS
Standard deviation
• The standard deviation is the square root of the variance. It informs you
how far a single result or a group of results deviates from the average.
Standard error of the mean
• A standard error of mean assesses the likelihood of a sample's mean deviating from the
population mean. You can find the standard error of the mean if you divide the standard
deviation by the square root of the sample size.
Range
• The range is the difference between the lowest and highest values in a collection of data.
Quartile and quintile
• Quartile refers to data divided into four equal parts, while quintile refers to data divided into
five equal parts.
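The four quantities defined on this slide can be computed with Python's standard library, using hypothetical measurements (the `statistics` module's `stdev` uses the sample SD, i.e. an n − 1 denominator):

```python
import statistics as st

data = [12, 15, 11, 19, 14, 17, 13, 16]   # hypothetical measurements

sd = st.stdev(data)                  # sample standard deviation
sem = sd / len(data) ** 0.5          # standard error of the mean = SD / sqrt(n)
rng = max(data) - min(data)          # range = highest value - lowest value
q1, q2, q3 = st.quantiles(data, n=4) # quartiles divide the data into four parts

print(f"SEM = {sem:.3f}, range = {rng}, quartiles = {q1}, {q2}, {q3}")
```

Passing `n=5` to `st.quantiles` would give quintile cut points instead.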
35. PART 1: COMMON STATISTICAL TERMS
Pearson correlation coefficient
• Pearson's correlation coefficient is a statistical test that determines the connection between two continuous
variables.
• Because it is based on covariance, it is widely regarded as the best approach to quantify the
relationship among variables of interest.
Median
• The median refers to the middle point of data.
• Typically, if you have a data set with an odd number of items, the median appears directly in the middle of
the numbers.
• When computing the median of a set of data with an even number of items, you can calculate the simple
mean between the two middle-most values to achieve the median.
Mode
• Mode refers to the value in a data set that occurs most often. If none of the values repeat,
there’s no mode in that data set.
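The odd/even median rule and the mode can be illustrated in a few lines of Python with hypothetical values:

```python
import statistics as st

odd = [3, 1, 7, 5, 9]            # odd count: median is the middle sorted value
even = [3, 1, 7, 5]              # even count: mean of the two middle values

print(st.median(odd))            # -> 5
print(st.median(even))           # -> 4.0  (mean of 3 and 5)
print(st.mode([2, 2, 5, 7, 2]))  # -> 2   (the most frequent value)
```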
36. PART 1: COMMON STATISTICAL TERMS
Statistical inference
• Statistical inference occurs when you use sample data to generate an inference or conclusion.
Statistical inference can include regression, confidence intervals or hypothesis tests.
Statistical power
• Statistical power is a metric of a study's probability of detecting statistical significance in a
sample, provided the effect is present in the entire population. A powerful statistical test likely
rejects the null hypothesis when it is false.
Runs test:
• Where measurements are made according to some well-defined ordering, in either time or space.
• A frequent question is whether or not the average value of the measurement differs at
different points in the sequence. This nonparametric test provides a means of answering it.
37. PART 1: COMMON STATISTICAL TERMS
T-score
• A t-score in a t-distribution refers to the number of standard deviations a sample is away
from the average.
Z-score
• A z-score, also known as a standard score, is a measurement of the distance between the
mean and data point of a variable. You can measure it in standard deviation units.
Z-test
• A z-test is a test that determines whether two populations' means differ. To use a z-test, you
need to know the population variances and have a large sample size.
Sign test:
• A test that can be used whenever an experiment is conducted to compare a treatment with a
control on a number of matched pairs, provided the two treatments are assigned to the
members of each pair at random.
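The z-score definition above (distance from the mean in SD units) is a one-line computation. A sketch with hypothetical test scores:

```python
import statistics as st

scores = [70, 75, 80, 85, 90]    # hypothetical test scores
mean = st.mean(scores)
sd = st.pstdev(scores)           # population standard deviation

x = 90                           # the data point of interest
z = (x - mean) / sd              # distance from the mean, in SD units
print(f"z = {z:.2f}")
```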
38. PART 1: COMMON STATISTICAL TERMS
Student t-test
• A Student’s t-test is a hypothesis test on the mean of a small, bell-curve-shaped sample where
you don’t know the population standard deviation. It can involve correlated means, correlation,
independent proportions or independent means.
T-distribution
• When the population standard deviation is unknown and the data originate from a bell-curve
population, the t-distribution describes the standardized deviations of the sample mean from
the population mean.
Standard error of the mean (SEM):
• An estimate of the amount by which an obtained mean may be expected to differ by chance
from the true mean. It is an indication of how well the mean of a sample estimates the mean of a
population
39. PART 1: COMMON STATISTICAL TERMS
Variance (SD²):
• A measure of the dispersion of a set of data points around their mean value.
• It is the mathematical expectation of the average squared deviations from the mean.
Analysis of covariance (ANCOVA):
• A statistical technique for equating groups on one or more variables when testing for
statistical significance using the F-test statistic.
• It adjusts scores on a dependent variable for initial differences on other variables,
such as pretest performance or IQ. *PT
Analysis of variance (ANOVA):
• A statistical technique for determining the statistical significance of differences among
means; it can be used with two or more groups.
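The between-group versus within-group comparison behind a one-way ANOVA can be sketched by hand. A minimal Python sketch with three hypothetical groups of measurements, computing the F statistic from the two sums of squares:

```python
# Hypothetical measurements from three independent groups.
g1 = [6, 8, 4, 5, 3, 4]
g2 = [8, 12, 9, 11, 6, 8]
g3 = [13, 9, 11, 8, 7, 12]
groups = [g1, g2, g3]

all_vals = [v for g in groups for v in g]
grand_mean = sum(all_vals) / len(all_vals)

# Between-groups sum of squares: how far each group mean sits from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-groups sum of squares: spread of values around their own group mean.
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

df_between = len(groups) - 1
df_within = len(all_vals) - len(groups)
F = (ss_between / df_between) / (ss_within / df_within)
print(f"F = {F:.3f}")
```

A large F means the variability between group means dominates the variability within groups, which is what the F-test assesses for significance.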
40. PART 1: COMMON STATISTICAL TERMS
Effect size
• Effect size is a statistical term that quantifies the degree of a relationship between
two given variables. For example, we can learn about the effect of therapy on
anxiety patients.
• The effect size aims to determine whether the therapy is highly successful or mildly
successful.
Measures of variability
• Measures of variability, also referred to as measures of dispersion, denote how
scattered or dispersed a data set is.
• Four main measures of variability are the interquartile range, range, standard
deviation and variance.
41. PART 1: COMMON STATISTICAL TERMS
Median test
• A median test is a nonparametric test of whether two independent groups have the
same median.
• Its null hypothesis is that the two groups share the same median.
Population
• Population refers to the group you’re studying. This might include a certain
demographic or a sample of the group, which is a subset of the population.
Parameter
• A parameter is a quantitative measurement that you use to measure a population.
• It’s the unknown value of a population on which you conduct research to learn more.
42. PART 1: COMMON STATISTICAL TERMS
Post hoc test
• Researchers perform a post hoc test only after they’ve discovered a statistically
relevant finding and need to identify where the differences actually originated.
Probability density
• The probability density is a statistical measurement that measures the likely
outcome of a calculation over a given range.
Random variable
• A random variable is a variable in which the value is unknown.
• It can be discrete or continuous with any value given in a range.
43. PART 1: COMMON STATISTICAL TERMS
Chi-square (χ²):
• A nonparametric test of statistical significance appropriate when the data are in the form of
frequency counts; it compares frequencies actually observed in a study with expected
frequencies to see if they are significantly different.
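The observed-versus-expected comparison reduces to a simple sum. A sketch with hypothetical frequency counts from 60 rolls of a die (expected count 10 per face):

```python
# Hypothetical frequency counts: 60 rolls of a die, one count per face.
observed = [8, 9, 19, 5, 8, 11]
expected = [10, 10, 10, 10, 10, 10]

# chi-square statistic: sum over cells of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi2:.1f}")
```

The resulting statistic is then compared against the chi-square distribution with the appropriate degrees of freedom (here 5) to judge significance.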
Coefficient of determination (r²):
• The square of the correlation coefficient (r), it indicates the strength of a relationship
as the proportion of variance in one variable potentially explained by the other.
Cohen’s d:
• A standardized way of measuring the effect size or difference by comparing two means by
a simple math formula. It can be used to accompany the reporting of a t-test or ANOVA
result and is often used in meta-analysis.
• The conventional benchmark scores for the magnitude of effect sizes are as follows: small, d
= 0.2; medium, d = 0.5; large, d = 0.8
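The "simple math formula" behind Cohen's d divides the difference in means by a pooled standard deviation. A sketch with hypothetical group scores (the pooled-SD form shown assumes equal group sizes):

```python
import statistics as st

treatment = [85, 90, 88, 92, 86]   # hypothetical scores, treatment group
control = [80, 84, 82, 86, 83]     # hypothetical scores, control group

m1, m2 = st.mean(treatment), st.mean(control)
v1, v2 = st.variance(treatment), st.variance(control)  # sample variances (n - 1)

pooled_sd = ((v1 + v2) / 2) ** 0.5  # pooled SD for equal-sized groups
d = (m1 - m2) / pooled_sd
print(f"d = {d:.2f}")
```

The result is then read against the conventional benchmarks above (0.2 small, 0.5 medium, 0.8 large).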
44. PART 1: COMMON STATISTICAL TERMS
Cronbach’s alpha coefficient (α):
• A coefficient of consistency that measures how well a set of variables or items measures a single,
unidimensional, latent construct in a scale or inventory.
• Alpha scores are conventionally interpreted as follows: high, 0.90 and above; medium, 0.70 to
0.89; and low, 0.55 to 0.69
F-test (F):
• A parametric statistical test of the equality of the means of two or more samples. It compares the
means and variances between and within groups over time. It is also called analysis of variance
(ANOVA)
Tukey’s test of significance:
• A single-step multiple comparison procedure and statistical test generally used in conjunction with
an ANOVA to find which means are significantly different from one another.
• Named after John Tukey, it compares all possible pairs of means and is based on a studentized
range distribution q (this distribution is similar to the distribution of t from the t-test).
45. PART 1: COMMON STATISTICAL TERMS
Fisher’s exact test:
• A nonparametric statistical significance test used in the analysis of contingency tables where
sample sizes are small.
• The test is useful for categorical data that result from classifying objects in two different
ways; it is used to examine the significance of the association (contingency) between two
kinds of classifications
Wald-Wolfowitz test:
• A nonparametric statistical test used to test the hypothesis that a series of numbers is
random. It is also known as the runs test for randomness
Wilcoxon signed-rank test (W+):
• A nonparametric statistical hypothesis test for the case of two related samples or repeated
measurements on a single sample. It can be used as an alternative to the paired Student’s t-
test when the population cannot be assumed to be normally distributed.
46. PART 1: COMMON STATISTICAL TERMS
Independent t-test:
• A statistical procedure for comparing measurements of mean scores in two
different groups or samples.
• It is also called the independent samples t-test.
Kendall’s tau:
• A nonparametric statistic used to measure the degree of correspondence
between two rankings and to assess the significance of the correspondence.
Kolmogorov-Smirnov (K-S) test:
• A nonparametric goodness-of-fit test used to decide if a sample comes
from a population with a specific distribution.
• The test is based on the empirical cumulative distribution function (ECDF)
47. PART 1: APPLICATION OF BIOSTATISTICS
1. Clinical Trials
• One of the most impactful applications of biostatistics is in the design and analysis of clinical
trials.
• Biostatisticians ensure the validity and reliability of trial results, which helps researchers assess
the safety and efficacy of new drugs and treatments.
• Using various statistical methods, we analyze patient data to draw conclusions that will help in
making medical decisions.
2. Epidemiology
• In the field of epidemiology, biostatistics aids in studying the distribution and determinants of
diseases within populations.
• Biostatisticians use different statistical models to analyze patterns, identify risk factors, and
assess the impact of interventions.
• This information is crucial for public health planning and for developing disease prevention
strategies.
48. PART 1: APPLICATION OF BIOSTATISTICS
3. Genetics and Genomics
• Biostatistics is indispensable in the analysis of genetic and genomic data.
• Researchers use statistical methods to identify genes associated with specific diseases,
understand the heritability of these genes, and figure out complex genetic interactions.
• This application of biostatistics is instrumental in advancing our understanding of
the genetic basis of various medical conditions.
4. Public Health Policy
• Biostatistics contributes significantly to the formulation and evaluation of public health
policies.
• By analyzing health data, biostatisticians can assess the effectiveness of interventions,
evaluate health disparities, and guide policymakers in making informed decisions to
improve public health outcomes.
49. PART 1: APPLICATION OF BIOSTATISTICS
5. Environmental Health
• Biostatistics is applied in environmental health studies to analyze the impact of environmental factors on
human health. Whether it is assessing the effects of air quality on respiratory diseases or studying the
correlation between water contaminants and health outcomes, biostatistics helps decode the complex
relationships in environmental health research.
6. Bioinformatics
• In the era of big data, biostatistics plays a crucial role in bioinformatics, where vast amounts of
biological data are analyzed to extract meaningful patterns. Biostatisticians develop statistical methods
and algorithms to interpret data from genomics, proteomics, and other ‘omics’ technologies, and the results
are visible in advancements in personalized medicine and drug discovery.
7. Quality Control in Healthcare
• Biostatistics is also employed in quality control processes within healthcare systems. It ensures the accuracy
and reliability of medical tests, monitors healthcare processes, and helps identify areas for improvement.
This application is vital for maintaining high standards of patient care
50. PART 1: APPLICATION OF BIOSTATISTICS
8. Create population-based interventions
• Researchers can use biometric techniques to assess the impact of a
health programme on the target population.
• With biometric techniques, researchers can use insights from data to:
• Measure the performance of public health interventions
• Boost immunisation rates
• Increase the number of patients attending post-surgery appointments
• Improve training and supervision of health care professionals
51. PART 1: APPLICATION OF BIOSTATISTICS
9. Create population-based interventions
• Biometrics can also help researchers, health care providers and public health
administrators to create population-based health interventions based on the results
of biostatistical data analysis and interpretation.
These data insights can be used to:
• Identify populations that require interventions to reduce their exposure to specific
health problems
• Identify areas susceptible to high risk of certain diseases
• Identify the factors influencing the high cases of health disparities within a
population
• Identify members of a population that require the highest level of health care
52. PART 1: APPLICATION OF BIOSTATISTICS
10. Control epidemics
• Biostatistical techniques can also help public health officials, health care practitioners and
epidemiologists to control epidemics.
• Researchers not only use statistical analysis to understand how diseases spread, but they can also
use it to determine the mortality rate amongst specific populations.
• It can also help health care professionals determine the most at-risk members of the population and
create a framework for formulating strategies to stop the spread of such diseases.
11. Identify barriers to care
• Researchers and health care professionals can use biostatistical methods to learn about the barriers
preventing people from getting access to quality care.
• Researchers use surveys to identify the factors that limit access to health care. Medical records,
interviews and claims records can show patient perceptions about health care services, providing
insights to make such services more accessible and acceptable to target populations for higher
efficiency.
53. PART 1: APPLICATION OF BIOSTATISTICS
12. Study demography
• Demography is the statistical study of the human population.
• The field uses statistical techniques to describe births, deaths, income, disease disparity and other
structural changes in human populations.
• Using census data, surveys and statistical models, biostatisticians can analyse the structure, size and
movement of populations, providing insights for government agencies, health care administrators, town
planners and other stakeholders to create and adjust their plans based on the dynamics of the population
13. Derive conclusions about populations from samples
• One major importance of biostatistical methods is that they help researchers derive far-reaching
conclusions about a population from samples.
• Due to several factors, such as finances, size and time constraints, it's not always possible for researchers
to collect data about an entire population when testing assumptions about them.
• Biostatistical methods provide researchers and administrators with the tools they require to select a
sample that's representative of the population, choose the right independent and dependent variables
and derive logical conclusions from the data
54. PART 1: APPLICATION OF BIOSTATISTICS
14. Check drug efficacy
• In the medical and pharmaceutical fields, biostatistical research is used to check the
efficacy and effectiveness of treatments during clinical trials.
• Researchers can also use it to find possible side effects of drugs.
• These methods are ideal for conducting drug treatment trials and performing other
experiments to understand the impact of different medications and medical devices on
the human body
15. Perform genetics studies
• Biostatistics is an important discipline in the study of Mendelian genetics.
• Geneticists use it to study the inheritance patterns of genes.
• They also use it to study the genetic structure of a population.
• Researchers also use biometry to map chromosomes and understand the behaviour of
genes in a population.
55. PART 1: APPLICATION OF BIOSTATISTICS
16. Other applications
• Determining leading causes of death
and burden of disease
• Health status of the population
• Morbidity patterns
56. PART 1: APPLICATIONS OF BIOSTATISTICS
Predictive modelling
• In public health, predictive modeling is a pivotal aspect of biostatistics.
• This statistical process utilizes existing data to forecast future events, uncovering
patterns and trends.
• It is applied in epidemiology for screening individuals prone to specific diseases.
• For instance, in breast cancer, factors like age, race, family history, and more
are analyzed to gauge the risk.
• Predictive modeling plays a crucial role in preventing breast cancer-related deaths
by identifying individuals who may need preventive or treatment measures.
• Beyond cancer and pandemics, this approach extends to various public health
concerns, showcasing its versatility in foreseeing and addressing health challenges.
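As a rough illustration of the idea (not a validated clinical model), a predictive risk score can be sketched as a logistic function of a few factors; the coefficients below are invented purely for demonstration:

```python
import math

def logistic(x):
    """Map a linear risk score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients -- for illustration only, not a validated model
INTERCEPT = -4.0
COEF_AGE = 0.04             # contribution per year of age
COEF_FAMILY_HISTORY = 0.9   # contribution if a first-degree relative was affected

def modeled_risk(age, family_history):
    """Return a modeled probability of disease from two hypothetical factors."""
    score = INTERCEPT + COEF_AGE * age + COEF_FAMILY_HISTORY * family_history
    return logistic(score)

# Older age and a positive family history both raise the modeled risk
print(modeled_risk(40, 0) < modeled_risk(65, 1))  # True
```

Real predictive models are fitted to data and validated before use; this sketch only shows the mechanical shape such a model takes.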
57. PART 1: APPLICATIONS OF BIOSTATISTICS
Decision-making
Biostatistics informs decision-making by:
• Healthcare leaders
• Researchers
• Policymakers
Operational Viability:
• Biostatistics provides the necessary data to assess the operational feasibility of new ideas and initiatives.
• It helps in making informed decisions about acquisitions, tool prototypes, and hiring strategies, setting the parameters for
project scopes and methodologies.
Guarding Against Bias:
• Biostatistical studies undergo rigorous examination to detect and eliminate bias.
• Public health’s commitment to equitability ensures that data collection processes are designed to be fair and objective,
preventing unfair conclusions.
Protecting Data Subjects:
• Biostatistical researchers prioritize the protection of data subjects. Personal information collected for public health
research is anonymized and safeguarded, addressing privacy concerns and mitigating risks associated with unsecured
data.
58. PART 1: DATA CLASSIFICATION
The main objectives of Classification of Data are
as follows:
• Explain similarities and differences of data
• Simplify and condense data’s mass
• Facilitate comparisons
• Study the relationship
• Prepare data for tabular presentation
• Present a mental picture of the data
59. PART 1: DATA CLASSIFICATION
There are different types of data classification,
depending on the characteristics.
• Structured and Unstructured
• Primary or Secondary
• Qualitative and Quantitative.
• Number of variables: Univariate, Bivariate, Multivariate
Classifying data is an important step to ensure proper
analysis
60. PART 1: DATA CLASSIFICATION
Univariate data
• This type of data consists of only one variable.
• The analysis of univariate data is thus the simplest form of
analysis since the information deals with only one quantity that
changes.
• It does not deal with causes or relationships and the main
purpose of the analysis is to describe the data and find
patterns that exist within it.
• An example of univariate data can be height.
61. PART 1: DATA CLASSIFICATION
Bivariate data
This type of data involves two different variables.
The analysis of this type of data deals with causes and relationships and is done to find
out the relationship between the two variables.
An example of bivariate data can be temperature and ice cream sales in the summer season.
Bivariate data analysis involves comparisons, relationships, causes and explanations.
62. PART 1: DATA CLASSIFICATION
Multivariate data
• When the data involves three or more variables, it is categorized under
multivariate.
• An example of this type of data: suppose an advertiser wants to compare
the popularity of four advertisements on a website; their click rates
could be measured for both men and women and relationships between
variables can then be examined.
• It is similar to bivariate analysis but contains more than one dependent variable
• The ways to perform analysis on this data depends on the goals to be
achieved.
• Some of the techniques are regression analysis, path analysis, factor
analysis and multivariate analysis of variance (MANOVA)
63. DATA ANALYSIS
Choice of method
• Size
• Complexity
• Number of variables
• Nature of variables
• Study objectives
• Research questions
• Hypothesis
64. PART 1: DATA ANALYSIS
Descriptive analysis
• Suitable for analyzing and presenting data, such as
mean and median.
Inferential
• To establish functional relationship between variables,
more advanced analytical techniques, such as
correlation and regression
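A minimal sketch of both kinds of analysis in Python, using hypothetical blood pressure and age readings; the Pearson coefficient is computed directly from its textbook definition:

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg)
systolic = [118, 125, 131, 122, 140, 128, 135, 119]

# Descriptive analysis: summarize the data
print(statistics.mean(systolic))    # 127.25
print(statistics.median(systolic))  # 126.5

# Inferential-style analysis: Pearson correlation with a second variable
# (hypothetical ages of the same subjects), computed from its definition
ages = [34, 41, 50, 38, 62, 45, 55, 35]

def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r = pearson_r(ages, systolic)
print(round(r, 3))  # close to +1: strong positive association in this made-up data
```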
65. PART 1: DATA INTERPRETATION
Data interpretation
• Involves inferring conclusions from the results of
data analysis.
• This exercise allows researchers to categorise,
manipulate and summarise their findings to
answer important questions in public health,
biology and medicine.
66. PART 1: PRIMARY AND SECONDARY DATA
Definition
• Primary data are the original data derived from research endeavors
and collected through methods such as direct observation, indirect
observation, interviews, questionnaire
• Secondary data are data derived from primary data, and sources
include published reports, journal articles, newspapers
• Often, the distinction between primary and secondary data may be
less than clear.
• In conducting research, both types of data are collected and created
• It is essential to have a plan for the management of all types of data
and primary materials
67. PART 1: SOURCES OF EPIDEMIOLOGICAL DATA
Epidemiologists use primary and secondary data sources to calculate
rates and conduct studies.
• Primary data is the original data collected for a specific purpose by or for an
investigator. For example, an epidemiologist may collect primary data by interviewing
people who became ill after eating at a restaurant in order to identify which specific
foods were consumed.
• Collecting primary data is expensive and time-consuming, and it usually is
undertaken only when secondary data is not available.
• Secondary data is data collected for another purpose by other individuals or
organizations.
• Examples of sources of secondary data that are commonly used in epidemiological
studies include birth and death certificates, population census records, patient medical
records, disease registries, insurance claim forms and billing records, public
health department case reports, and surveys of individuals and households
68. PART 1: PRIMARY AND SECONDARY DATA
Primary Materials | Primary Data | Secondary Data
Interview schedules | Interview audio recordings | Nvivo interview transcripts
Surveys; experiments | - | -
Purchased laboratory reagents; investigational product | Product analyses | -
Research animals | Tissue samples | Stained slides
Validated questionnaires | Completed paper and pencil questionnaires | SPSS data files containing raw data and calculated variable summary scores
69. PART 1: PRIMARY AND SECONDARY DATA
Basis for comparison | Primary data | Secondary data
Meaning | First-hand data gathered by the researcher himself | Data collected by someone else earlier
Data | Real-time data | Past data
Process | Very involved | Quick and easy
Source | Surveys, observations, experiments, questionnaires, personal interviews, etc. | Government publications, websites, books, journal articles, internal records, etc.
Cost effectiveness | Expensive | Economical
Collection time | Long | Short
Specificity | Always specific to the researcher's needs | May or may not be specific to the researcher's needs
Available in | Crude form | Refined form
Accuracy and reliability | More | Relatively less
71. PART 1: ELEMENTS, OBSERVATIONS, VARIABLES, DATA
Element
• Entities or units on which data are collected such as person, place, or object
Observation
• Set of measurements or observations related to a particular element
Variable
• Character or attribute of interest on a particular element and which takes on different values
Total number of data values
• The number of elements times the number of variables
Data
• Is a specific measurement of a variable – it is the value you record in your data sheet.
72. PART 1: QUANTITATIVE AND QUALITATIVE
VARIABLES
Data is generally divided into two categories:
• Quantitative data represents amounts
• Qualitative or Categorical data represents groupings
• A variable that contains quantitative data is
a quantitative variable;
• A variable that contains categorical data is
a categorical variable.
73. PART 1: QUANTITATIVE AND QUALITATIVE
VARIABLES
Quantitative variables
• The numbers recorded represent real amounts
that can be added, subtracted, divided, etc.
• There are two types of quantitative variables:
• Discrete and continuous.
75. Indicate whether each of the following variables is discrete or continuous:
the time it takes for you to get to school
the number of Canadian couples who were married last year
the number of goals scored by a women’s hockey team
the speed of a bicycle
your age
the number of subjects your school offered last year
the length of time of a telephone call
the annual income of an individual
the distance between your house and school
the number of pages in a dictionary
76. PART 1: QUANTITATIVE VARIABLES
Type of variable | What does the data represent? | Examples
Discrete variables (aka integer variables) | Counts of individual items or values | Number of students in a class; number of different tree species in a forest
Continuous variables (aka ratio variables) | Measurements of continuous or non-finite values | Distance; volume; age
77. PART 1: QUALITATIVE VARIABLES
Qualitative variables
• Categorical variables represent groupings of some kind.
• They are sometimes recorded as numbers, but the numbers represent
categories rather than actual amounts of things.
• There are three types of categorical variables:
• Binary, nominal, and ordinal variables.
• Sometimes a variable can work as more than one type
• An ordinal variable can also be used as a quantitative variable if the scale is
numeric and doesn’t need to be kept as discrete integers.
• For example, star ratings on product reviews are ordinal (1 to 5 stars), but
the average star rating is quantitative.
78. PART 1: QUALITATIVE VARIABLES
Type of variable | What does the data represent? | Examples
Binary variables (aka dichotomous variables) | Yes or no outcomes | Heads/tails in a coin flip; win/lose in a football game
Nominal variables | Groups with no rank or order between them | Species names; colors; brands
Ordinal variables | Groups that are ranked in a specific order | Finishing place in a race; rating scale responses in a survey, such as Likert scales
79. PART 1: INDEPENDENT AND DEPENDENT
VARIABLES
Independent vs dependent variables
• Experiments are usually designed to find out what effect one
variable has on another for instance the effect of salt addition on
plant growth.
• The independent variable (the one you think might be the cause) is
manipulated and then the dependent variable (the one you think
might be the effect) is measured to find out what this effect might
be.
• There are variables that you hold constant (control variables) in
order to focus on your experimental treatment.
80. PART 1: INDEPENDENT AND DEPENDENT VARIABLES
Independent vs dependent vs control variables
Type of variable | Definition | Example (salt tolerance experiment)
Independent variables (aka treatment variables) | Variables you manipulate in order to affect the outcome of an experiment | The amount of salt added to each plant's water
Dependent variables (aka response variables) | Variables that represent the outcome of the experiment | Any measurement of plant health and growth: in this case, plant height and wilting
Control variables | Variables that are held constant throughout the experiment | The temperature and light in the room the plants are kept in, and the volume of water given to each plant
81. OTHER COMMON TYPES OF VARIABLES
Other types
Definition of the independent and dependent variables and determination of whether they are
categorical or quantitative enables choice of the correct statistical test.
Type of variable | Definition | Example (salt tolerance experiment)
Confounding variables | A variable that hides the true effect of another variable in an experiment. This can happen when another variable is closely related to a variable you are interested in, but you haven't controlled it in your experiment. Be careful with these, because confounding variables run a high risk of introducing a variety of research biases to your work, particularly omitted variable bias. | Pot size and soil type might affect plant survival as much as or more than salt additions. In an experiment you would control these potential confounders by holding them constant.
Latent variables | A variable that can't be directly measured, but that you represent via a proxy. | Salt tolerance in plants cannot be measured directly, but can be inferred from measurements of plant health in our salt-addition experiment.
Composite variables | A variable that is made by combining multiple variables in an experiment. These variables are created when you analyze data, not when you measure it. | The three plant health variables could be combined into a single plant-health score to make it easier to present your findings.
83. PART 1: VARIABLES IN RESEARCH
No | Variable | Type | Measurement scale | Categories
12 | BMI | Independent | Interval | -
13 | ≥High school education | Independent | Nominal | Above/Under
14 | Health insurance coverage | Independent | Nominal | Yes/No
15 | Smoking | Independent | Nominal | Yes/No
16 | History of CKD | Independent | Nominal | Yes/No
17 | Family history of diabetes | Background | Nominal | Yes/No
18 | Family history of hypertension | Background | Nominal | Yes/No
19 | Family history of CKD | Background | Nominal | Yes/No
20 | Repeated respiratory tract infection | Background | Nominal | Yes/No
21 | Nephrotoxic medications | Independent | Nominal | Yes/No
22 | Obesity | Independent | Nominal | Yes/No
85. PART 1: DISCUSS THE CATEGORIZATION OF THE
FOLLOWING VARIABLES
Number of all hospital discharges
Acute care hospital discharges per 100
Number of acute care hospital discharges
Inpatient surgical procedures per year per 100 000
Total number of inpatient surgical procedures per
year
Average length of hospital stay
Bed occupancy rate (%)
Outpatient contacts per person per year
Autopsy rate (%) for hospital deaths
Inpatient care discharges per 100
Turnover rate
Outpatient/inpatient ratio
Number of surgeries
Number of deliveries
Number of x-rays/scans
Number of lab tests
Number of beds per capita
Number of
86. PART 1: QUALITATIVE RESEARCH METHODS
Surveys
• Purpose: quickly and/or easily get lots of information from people in a non-threatening way
• Advantages: can complete anonymously; inexpensive to administer; easy to compare and analyze; can administer to many people; can get lots of data; many sample questionnaires already exist
• Challenges: might not get careful feedback; wording can bias client's responses; impersonal; may need sampling expert; doesn't get full story

Interviews
• Purpose: understand someone's impressions or experiences; learn more about answers to questionnaires
• Advantages: get full range and depth of information; develops relationship with client; can be flexible with client
• Challenges: can take time; can be hard to analyze and compare; can be costly; interviewer can bias client's responses

Observation
• Purpose: gather firsthand information about people, events, or programs; view operations of a program as they are actually occurring
• Advantages: can adapt to events as they occur
• Challenges: can be difficult to interpret seen behaviors; can be complex to categorize observations; can influence behaviors of program participants; can be expensive
87. PART 1: QUALITATIVE RESEARCH METHODS
Focus Groups
• Purpose: explore a topic in depth through group discussion
• Advantages: quickly and reliably get common impressions; can be an efficient way to get much range and depth of information in a short time; can convey key information about programs
• Challenges: can be hard to analyze responses; need a good facilitator for safety and closure; difficult to schedule 6-8 people together

Case Studies
• Purpose: understand an experience or conduct comprehensive examination through cross comparison of cases
• Advantages: depicts client's experience in program input, process and results; powerful means to portray program to outsiders
• Challenges: usually time consuming to collect, organize and describe; represents depth of information, rather than breadth
88. FACTORIALS
These provide an easier way of writing large numbers in compact form,
just like in mathematics where we can use bases to write numbers in other forms.
For instance, 10 to base 2 = 1010, and 100 to base 2 = 1100100.
10! = 10×9×8×7×6×5×4×3×2×1 = 3,628,800
89. PART 1: FACTORIALS
The Factorial of a whole number 'n' is defined as the product of that number with
every whole number less than or equal to 'n' till 1.
For example, the factorial of 4 is 4 × 3 × 2 × 1, which is equal to 24. It is
represented using the symbol '!'.
5 factorial, that is, 5! can be written as: 5! = 5 × 4 × 3 × 2 × 1 = 120.
The formulas for n factorial are:
n! = n(n-1)(n-2)…………………….(3)(2)(1)
n! = n × (n - 1)!
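The recursive formula n! = n × (n − 1)! translates directly into code; this sketch checks the small cases from this slide against Python's built-in math.factorial:

```python
import math

def factorial(n):
    """n! = n * (n - 1)!, with 0! defined as 1 (the base case)."""
    return 1 if n == 0 else n * factorial(n - 1)

print(factorial(4))   # 24
print(factorial(5))   # 120
print(factorial(10))  # 3628800, matching math.factorial(10)
```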
92. PERMUTATION AND COMBINATION
Both are about rearranging numbers or objects to change their position.
For instance, a set of numbers like 123 can be rearranged in different
ways, e.g.
123, 132, 321, 312, 213, 231 (6 arrangements)
A set like 12 can be rearranged as 12, 21 (2 arrangements)
A set like 1 can only be arranged as 1 (1 arrangement)
93. A set {1 2 3 4 5 6 7 8 9}
Permutation
Take the arrangement 123,
where n=3 and r=3
nPr = 3P3 = n!/(n-r)! = 3!/(3-3)! = 3!/0! = (3×2×1)/1 = 6
Combination
nCr
where n=3 and r=3
nCr = n!/[(n-r)! × r!] = 3!/[(3-3)! × 3!] = (3×2×1)/[0! × (3×2×1)] = 6/6 = 1
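These counts can be verified by enumerating the arrangements with Python's itertools:

```python
from itertools import combinations, permutations

digits = [1, 2, 3]

# All ordered arrangements of the three digits: 3P3 = 3!/0! = 6
perms = list(permutations(digits, 3))
print(len(perms))  # 6

# All unordered selections of all three digits: 3C3 = 1
combs = list(combinations(digits, 3))
print(len(combs))  # 1
```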
94. PREMIER LEAGUE EXAMPLE
The set of 20 clubs which are unique
They play in pairs meaning 2 at a time
That is r=2
Then n=20
For permutation, the number of unique pairs given that the order matters (i.e. home and
away), therefore the pairs = nPr = 20P2 = 20!/(20-2)! = 20!/18! = 20 × 19 = 380
95. PART 1: USE OF FACTORIAL
Use of Factorial
• One area where factorials are widely used is in permutations &
combinations.
• Permutation is an ordered arrangement of outcomes and it can be
calculated with the formula: n Pr= n! / (n - r)!
• Combination is a grouping of outcomes in which order does not
matter. It can be calculated with the formula: nCr = n! / [ (n - r)! r!]
• In both of these formulas, 'n' is the total number of things available
and 'r' is the number of things that have to be chosen.
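Python 3.8+ ships math.perm and math.comb, which implement exactly these formulas; a quick check against the examples used in this section:

```python
import math

# nPr: ordered arrangements; nCr: unordered selections (Python 3.8+)
print(math.perm(3, 3))   # 6:   3P3
print(math.perm(20, 2))  # 380: 20P2, home/away fixtures among 20 clubs
print(math.comb(20, 2))  # 190: 20C2, unordered pairings of 20 clubs
print(math.comb(30, 5))  # 142506: 30C5
```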
96. PART 1: FACTORIAL
Set {1,2,3} has three elements, meaning n=3
• Permutation, i.e. the order in which the elements of the subset appear matters
and makes a difference: 1,2; 1,3; 2,3; 2,1; 3,1; 3,2, so the number of
subsets of twos is equal to 6
• nPr = n!/(n-r)!, where n is the total number of elements in the mother set
or population and r is the number of elements in the subset.
• For example, if we had 20 elements in the mother set and we are
picking two at a time, then by permutation we proceed as follows:
20P2
• Therefore 20P2 = 20!/(20-2)! = 20!/18! = 380
97. PART 1: FACTORIAL
Set {1,2,3} has three elements, meaning n=3
• Combination, i.e. the order in which the elements of the subset appear does not
matter and makes no difference: 1,2; 1,3; 2,3, so the number of subsets
of twos is equal to 3
• nCr = n!/[(n-r)! r!], where n is the total number of elements in the mother
set or population and r is the number of elements in the subset.
• For example, if we had 20 elements in the mother set and we are
picking two at a time, then by combination we proceed as follows:
20C2
• Therefore 20C2 = 20!/[(20-2)! 2!] = 20!/(18! 2!) = 190
98. PART 1: PERMUTATION-THE ORDER MATTERS
{1, 2, 3} arrange in pairs using permutation i.e. the order should be respected and
matters
1,2; 1,3; 2,3; 2,1; 3,1; 3,2 = 6 pairs
{1,2,3,4}
1,2; 1,3; 1,4; 2,3; 2,4; 3,4; 2,1; 3,1; 4,1; 3,2; 4,2; 4,3 = 12 pairs
In the Premier League we have 20 clubs
nPr = n!/(n-r)! = 20P2 = 20!/(20-2)! = 20!/18! = 380
99. PART 1: COMBINATION – ORDER DOES NOT MATTER
{1,2,3}
1,2; 1,3; 2,3 = 3 pairs
nCr = n!/[(n-r)! r!]
3C2 = 3!/[(3-2)! 2!]
= 3!/[1! × 2!] = 6/2 = 3
For the Premier League with 20 clubs, if the games were one way only, then the number
of games would be = 20!/(18! × 2!) = 190
100. PART 1: COMBINATIONS
nCr = n!/[(n-r)! r!], where r is the number of elements we pick for arrangement
at a time
30C5 = 30!/[(30-5)! 5!]
= 30!/[25! 5!] = 142,506
101. PART 1: USE OF FACTORIALS
Example 1: How many 5-digit numbers can be formed using the digits 1, 2, 5, 7, and
8 in each of which no digit is repeated?
Solution:
The given 5 digits (1, 2, 5, 7 and 8) should be arranged among themselves in order
to get all possible 5-digit numbers.
The number of ways for doing this can be done by calculating the 5 factorial.
5! = 5 × 4 × 3 × 2 × 1 = 120
Answer: Therefore, the required number of 5-digit numbers is 120.
102. PART 1: USE OF FACTORIAL
Example 2: In a group of 10 people, $200, $100, and $50 prizes are to be given. In how many
ways can the prizes be distributed?
Solution:
This is permutation because here the order of distribution of prizes matters. It can be calculated
as 10P3 ways.
10P3 = (10!) / (10 - 3)! = 10! / 7! = (10 × 9 × 8 × 7!) / 7! = 10 × 9 × 8 = 720 ways.
Example 3: Three $50 prizes are to be distributed to a group of 10 people. In how many ways
can the prizes be distributed?
Solution:
This is a combination because here the order of distribution of prizes does not matter (because all
prizes are of the same worth). It can be calculated using 10C3.
10C3 = (10!) / [ 3! (10 - 3)!] = 10! / (3! 7!) = (10 × 9 × 8 × 7!) / [(3 × 2 × 1) 7!] = 120 ways.
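Both worked examples can be checked with Python's math.perm and math.comb (available since Python 3.8):

```python
import math

# Example 2: three distinct prizes ($200, $100, $50) among 10 people, so order matters
print(math.perm(10, 3))  # 720

# Example 3: three identical $50 prizes among 10 people, so order does not matter
print(math.comb(10, 3))  # 120

# Sanity check against the factorial definitions
print(math.perm(10, 3) == math.factorial(10) // math.factorial(7))  # True
print(math.comb(10, 3) == math.perm(10, 3) // math.factorial(3))    # True
```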
103. PERMUTATION AND COMBINATION
Difference between Permutation and Combination
Permutation | Combination
The different ways of arranging a set of objects into a sequential order are termed as Permutation. | One of the several ways of choosing items from a large set of objects, without considering an order, is termed as Combination.
The order is very relevant. | The order is quite irrelevant.
It denotes the arrangement of objects. | It does not denote the arrangement of objects.
Multiple permutations can be derived from a single combination. | From a single permutation, only a single combination can be derived.
They can simply be defined as ordered elements. | They can simply be defined as unordered sets.
104. PART 1: UNIVARIATE AND BIVARIATE DATA
Univariate data
• Data on one variable
• Examples include height, skin colour, ethnicity, service coverage
Bivariate data
• Data where two variables are being compared for correlation or
causation
• Correlation =height and body weight; age and body weight
• Causation such as obesity and heart disease
105. PART 1: UNIVARIATE AND BIVARIATE DATA
Univariate analysis
• Summary statistics
• Central tendency
• Dispersion
• Frequency distribution
• Bar charts
• Histogram
• Pie chart
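A short sketch of these univariate summaries using Python's statistics module, on a hypothetical set of heights:

```python
import statistics

# Hypothetical heights (cm): a single variable, so this is univariate analysis
heights = [160, 165, 165, 170, 172, 175, 180]

# Central tendency
print(statistics.mean(heights))    # about 169.57
print(statistics.median(heights))  # 170
print(statistics.mode(heights))    # 165

# Dispersion
print(statistics.stdev(heights))         # sample standard deviation
print(max(heights) - min(heights))       # range: 20
```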
106. PRACTICE QUESTIONS
1. Explain why a sample statistic (the estimate from the sample) may differ from the
population parameter (the true value) and how you would minimize the difference.
2. A local coffee shop is creating a spreadsheet of their drinks for customers to view
on their website. The spreadsheet includes the calories, sugar content, and
ingredients for each coffee drink. Which of the following would be considered a
variable in this data set?
Answers:
The Calories
The Customers
The Coffee Shop
The Coffee Drink
What are the other variables in the passage?
107. PRACTICE QUESTIONS
1. A political pollster is conducting a survey about voters' affiliation with a major
political party. He selects a random sample of voters who voted in the last
presidential election, and looks into how party affiliation differs based on age,
race, gender and location. How many variables can you identify in this data set?
Answers:
A. 5
B. 6
C. 4
D. 7
108. PART 1: SCALES OF MEASUREMENT
Rationale
• In order to analyze data, the variables have to be defined and categorized using
different scales of measurements.
• There are four scales of measurements- nominal scale, ordinal scale, interval scale,
and ratio scale.
• The scale of measurement of a variable determines the kind of statistical test to be
used.
• Psychologist Stanley Stevens developed the four common scales of measurement:
nominal, ordinal, interval and ratio.
• 1. Nominal scale
• 2. Ordinal scale
• 3. Interval scale
• 4. Ratio scale
109. PART 1: SCALES OF MEASUREMENT
Properties and scales of measurement
• Each scale of measurement has properties that determine how to properly analyse the data.
• The properties evaluated are identity, magnitude, equal intervals and a minimum value of
zero.
Properties of Measurement
• Identity: Identity refers to each value having a unique meaning.
• Magnitude: Magnitude means that the values have an ordered relationship to one another, so
there is a specific order to the variables.
• Equal intervals: Equal intervals mean that data points along the scale are equal, so the
difference between data points one and two will be the same as the difference between data
points five and six.
• A minimum value of zero: A minimum value of zero means the scale has a true zero point.
Degrees, for example, can fall below zero and still have meaning. But if you weigh nothing,
you don’t exist.
110. PART 1: STATISTICAL LEVELS OF
MEASUREMENT
Nominal-level Measurement
• There’s no numerical or quantitative value, and
qualities are not ranked.
• Nominal-level measurements are instead simply
labels or categories assigned to other variables.
• It’s easiest to think of nominal-level measurements
as non-numerical facts about a variable.
111. SCALES OF MEASUREMENT
Nominal scale,
• Also known as categorical variable scale, can be defined as a scale used for
labelling variables into different categories.
• The numbers are used to identify and classify people, objects or events, like
identity number, jersey number of sportspersons, and vehicle registration
number; thus, they have no specific numerical value or meaning.
• In research, the nominal scale is used for analysing categorical variables such
as gender, place of residence, marital status, political party, blood group
and so on.
• The interval between numbers and their order does not matter on the
nominal scale
112. SCALES OF MEASUREMENT
Nominal scale:
• A nominal scale preserves only the equality property; there is no
‘more or less than’ relation in this measurement.
• The nominal scale of measurement defines the identity property
of data.
• This scale has certain characteristics, but doesn’t have any form
of numerical meaning.
• The data can be placed into categories but can’t be multiplied,
divided, added or subtracted from one another.
• It’s also not possible to measure the difference between data
points
113. SCALES OF MEASUREMENT
Nominal scale:
• The statistical analysis that can be performed on a nominal scale is the
frequency distribution and percentage.
• It can be analyzed graphically using a bar chart or a pie chart. If there are two
categorical variables, quantitative analysis techniques such as joint frequency
distribution and cross-tabulation can be used.
• Mode is the only measure of central tendency which can be used in this scale.
• Since numbers do not have a quantitative value, addition, subtraction,
multiplication, division, and measures of dispersion cannot be applied.
• It is also possible to perform contingency correlation. Hypothesis tests can be
carried out on data collected in the nominal form using the Chi-square test. It can
tell whether there is an association between the variables.
• However, it cannot establish a cause and effect relationship or explain the form
of relationship.
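The chi-square statistic mentioned above can be computed from its definition, the sum over cells of (observed − expected)² / expected; the 2×2 smoking-by-disease table below is hypothetical:

```python
# Hypothetical 2x2 contingency table: smoking status vs. disease status
observed = [
    [30, 70],   # smokers:     diseased, healthy
    [20, 180],  # non-smokers: diseased, healthy
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# chi2 = sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

# With 1 degree of freedom, values above 3.841 are significant at the 0.05 level
print(round(chi2, 2))  # 19.2
```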
114. PART 1: STATISTICAL LEVELS OF
MEASUREMENT
Ordinal-level Measurement
• Outcomes can be arranged in an order, but all data
values have the same value or weight.
• Although they’re numerical, ordinal-level measurements
can’t be subtracted against each other in statistics
because only the position of the data point matters.
• Ordinal levels are often incorporated into nonparametric
statistics and compared against the total variable group.
115. SCALES OF MEASUREMENT
Ordinal scale
• is a ranking scale in which numbers are assigned to variables to represent their
rank or relative position in the data set.
• The variables are arranged in a specific order rather than just naming them.
• So they can be named, grouped, and ranked.
• In research, the ordinal scale is used for ranking students in a class (1,2,3), rating
a product satisfaction (very unsatisfied-1, unsatisfied-2, neutral-3, satisfied-4,
very satisfied-5), evaluating the frequency of occurrences (very often-1, often-2,
not often-3, not at all-4), assessing the degree of agreement (totally agree-1,
agree-2, neutral-3, disagree-4, totally disagree-5
• In this scale, the attributes are arranged in ascending or descending order. The
numbers indicate rank or the order of quality or quantity.
116. SCALES OF MEASUREMENT
Ordinal Scale:
• The origin of scale is absent because there is no fixed start or ‘true zero’ in the data.
• Hence, it is impossible to find the magnitude of difference or distance between the variables or their
degree of quality.
• For example, while ranking students in terms of potential for an award, a student labelled ‘1’ is better
than the student labelled ‘2’, ‘2’ is better than ‘3’ and so forth.
• However, this ordinal scaling cannot quantify or indicate how much better the first student is than the
second student, nor whether the difference between the potential of the first and second students is the same
as the difference between the second and third.
• Similarly, very satisfied will always be better than satisfied and unsatisfied will be better than very
unsatisfied.
• The order of variables is of prime importance, and so is the labelling.
• The ordinal scale is the second level of measurement from a statistical point of view.
• These scales are unique up to a monotone transformation. A monotone transformation T is one that assigns
new values such that if f(X) > f(Y) in the ordinal scale, then T(f(X)) > T(f(Y)) in the newly transformed scale
117. SCALES OF MEASUREMENT
Ordinal Scale:
• The ordinal data can be presented using tabular or graphical formats.
• The descriptive analysis such as percentile, quartile, median and mode
can be determined in ordinal scale data. Since the interval between
numbers is insignificant, addition, subtraction, multiplication, division, and
measures of dispersion cannot be applied.
• It is possible to test for order correlation using Spearman's rank
correlation coefficient.
• Non-parametric tests such as Mann-Whitney U test, Friedman’s ANOVA,
Kruskal–Wallis H test can also be used to analyze ordinal scale data
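Spearman's rank correlation is just Pearson's correlation applied to ranks, which is why it suits ordinal data; a pure-Python sketch with hypothetical Likert-style ratings (tied values receive the average of their ranks):

```python
def average_ranks(values):
    """Assign ranks (1 = smallest); tied values get the average of their ranks."""
    sv = sorted(values)
    out = []
    for v in values:
        first = sv.index(v)              # 0-based position of first occurrence
        count = sv.count(v)              # how many times v is tied
        out.append(first + (count + 1) / 2)
    return out

def pearson(x, y):
    """Pearson's r from its definition."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical Likert-style satisfaction ratings from two related questions
q1 = [1, 2, 3, 4, 5, 5, 4]
q2 = [2, 1, 3, 4, 5, 4, 5]
print(spearman(q1, q2) > 0)  # True: the two ratings move together
```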
118. SCALES OF MEASUREMENT
Interval Scale
• can be defined as a quantitative scale in which both the order and the exact difference
between categories are known.
• Thus it measures variables that can be labelled, ordered, and have an equal interval.
• However, the point of beginning or zero point on an interval scale is arbitrarily
established and is not a ‘true zero’ or ‘absolute zero’.
• Thus the value of zero does not indicate the complete absence of the characteristic being
measured.
• In Fahrenheit/Celsius temperature scales, 0°F and 0°C do not indicate an absence of
temperature.
• In fact, negative values of temperature do exist.
• Temperature, calendar years, attitudes, opinions and so on fall under the interval scale.
Likert scale, Net Promoter Score (NPS), Bipolar matrix table, Semantic differential scale
are the widely used interval scale examples
119. PART 1: STATISTICAL LEVELS OF
MEASUREMENT
Interval-level Measurement
• Outcomes can be arranged in order, and differences between data values may now have meaning.
• Two data points are often used to compare the passing of time or changing conditions within a data set.
• There is often no "starting point" for the range of data values; calendar dates or temperatures may not
have a meaningful intrinsic zero value.
120. SCALES OF MEASUREMENT
Interval Scale:
• The major difference between ordinal and interval scale is the existence of
meaningful and equal intervals between variables.
• For example, 40 degrees is higher than 30 degrees, and the difference between
them is a measurable 10 degrees, as is the difference between 90 and 100
degrees.
• However, while ranking students on an ordinal scale, the difference between first
and second student might be 5 marks, and between second and third student is 8
marks.
• Thus, with an interval scale, it is possible to identify whether a given attribute is
higher or lower than another and the extent to which one is higher or lower than
another.
121. SCALES OF MEASUREMENT
Interval Scale:
• The interval scale is the third level of measurement scale. The arbitrary zero point has implications
for data manipulation and analysis.
• It is possible to add or subtract a constant to all of the interval scale values without affecting the form of
the scale but not possible to multiply or divide the values.
• For instance, two persons with scale positions 4 and 5 are as far apart as persons with scale positions 9
and 10, but it cannot be said that a person with a score of 10 feels twice as strongly as one with a score of 5.
• Similarly, 100°F cannot be defined as twice as hot as 50°F because the corresponding temperatures on
the centigrade scale, 37.78°C and 10°C, are not in the ratio 2:1.
• Unlike the ordinal and nominal scale, arithmetic operations such as addition and subtraction can be
performed on an interval scale.
• Any positive linear transformation of the form Y = a + bX will preserve the properties of an interval scale.
• The arithmetic mean, median, and mode can be used to calculate the central tendency in this scale.
• The measures of dispersion, such as range and standard deviation, can also be calculated.
• Apart from those techniques, product-moment correlation, t-test, and regression analysis are extensively
used for analyzing interval data.
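A brief sketch of the two claims above, using the familiar temperature example: converting Celsius to Fahrenheit is a positive linear transformation Y = a + bX that preserves differences but not ratios.

```python
# Celsius -> Fahrenheit is a positive linear transformation
# Y = a + bX with a = 32 and b = 9/5.
def c_to_f(c):
    return 32 + 9 / 5 * c

# Equal differences stay equal under the transformation:
# 30C - 20C and 50C - 40C are both 18 degrees F apart.
assert c_to_f(30) - c_to_f(20) == c_to_f(50) - c_to_f(40)

# But ratios are not preserved: 40C is numerically twice 20C,
# yet 104F / 68F is about 1.53, not 2.
print(c_to_f(20), c_to_f(40))  # 68.0 104.0
print(c_to_f(40) / c_to_f(20)) # ~1.53
```

The assertion holds for any a and b > 0, which is why differences (and their comparisons) are the strongest statements an interval scale supports.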
123. SCALES OF MEASUREMENT
Ratio Scale
• Can be defined as a quantitative scale that bears all the characteristics of an interval scale and
a ‘true zero’ or ‘absolute zero’, which implies the complete absence of the attribute being
measured.
• Thus it measures variables that can be labelled, ordered, have equal intervals and possess the ‘absolute
zero’ property.
• Before deciding to use a ratio scale, the researcher must observe whether the variables possess
all these characteristics.
• The variables such as length, age, weight, income, years of schooling, price etc., are examples of
a ratio scale.
• They do not have negative numbers because of the existence of an absolute zero point of origin.
• For instance, a price of zero means the commodity does not have any price (it is free); and there
cannot be any negative price.
• Thus ratio scale has a meaningful zero.
• It allows unit conversions like metres to feet, kilogram to calories etc.
124. SCALES OF MEASUREMENT
Ratio Scale:
• The ratio scale is the highest level of measurement scale. It is unique up to
a proportionality (similarity) transformation of the form Y = bX.
• The ‘absolute zero’ property allows performing a wide range of
descriptive and inferential statistics on ratio scale variables.
• It is possible to compare both differences in values and the relative
magnitude of values.
• For instance, the difference between 15cm and 20cm is the same as
between 30cm and 35cm, and 30 cm is twice as long as 15 cm.
• Arithmetic operations such as addition, subtraction, multiplication, and
division (ratio) can be performed in ratio scale data
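A short sketch of these properties, using the length example from the slide above:

```python
# Ratio-scale data: lengths in centimetres (a true zero means "no length").
lengths_cm = [15, 20, 30, 35]

# Differences are comparable: 20 - 15 equals 35 - 30.
assert lengths_cm[1] - lengths_cm[0] == lengths_cm[3] - lengths_cm[2]

# Ratios are meaningful too: 30 cm really is twice 15 cm.
assert lengths_cm[2] / lengths_cm[0] == 2.0

# Unit conversion (cm -> inches) is a proportionality transform Y = bX,
# so ratios survive the conversion.
CM_PER_INCH = 2.54
lengths_in = [x / CM_PER_INCH for x in lengths_cm]
assert abs(lengths_in[2] / lengths_in[0] - 2.0) < 1e-12
```

The last assertion is the key contrast with the interval scale: because the conversion has no additive constant (b only, no a), the 2:1 ratio is the same in centimetres and in inches.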
125. SCALES OF MEASUREMENT
Ratio Scale:
• All statistical operations applicable to nominal, ordinal and
interval scale can be performed on ratio scale data as well.
• Besides, measures of central tendency such as geometric
mean and harmonic mean and all measures of dispersion,
including coefficient of variation, can be determined.
• Parametric tests such as independent sample t-test, paired
sample t-test, ANOVA etc., can also be performed.
• The ratio scale provides unique opportunities for statistical
analysis.
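These measures can be computed directly with Python's standard statistics module; the income figures below are made up for illustration:

```python
import statistics

# Hypothetical ratio-scale data: incomes in thousands.
incomes = [20, 30, 40, 50, 60]

mean = statistics.mean(incomes)             # arithmetic mean
gmean = statistics.geometric_mean(incomes)  # requires Python 3.8+
hmean = statistics.harmonic_mean(incomes)

# Coefficient of variation = standard deviation / mean;
# it is meaningful only with a true zero, i.e. on a ratio scale.
cv = statistics.stdev(incomes) / mean

print(mean)             # 40
print(round(gmean, 1))  # ~37.3
print(round(hmean, 1))  # ~34.5
print(round(cv, 3))     # ~0.395
```

Note the ordering harmonic mean ≤ geometric mean ≤ arithmetic mean, which always holds for positive data; the geometric and harmonic means are undefined for data containing zero or negative values, another reason they belong to the ratio scale.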
126. SCALES OF MEASUREMENT
Scale      Properties
Nominal    Categories
Ordinal    Categories, Rank
Interval   Categories, Rank, Equal intervals
Ratio      Categories, Rank, Equal intervals, True or absolute zero
129. SOURCES OF DATA
Three main sources for demographic and social statistics
• Censuses
• Surveys
• Administrative records.
A population census
• The total process of collecting, compiling, evaluating, analysing and publishing or otherwise
disseminating demographic, economic and social data pertaining, at a specified time, to all persons
in a country or in a well-delimited part of a country.
• The census collects data from each individual and each set of living quarters for the whole country
or area.
• It allows estimates to be produced for small geographic areas and for population subgroups.
• It also provides the base population figures needed to calculate vital rates from civil registration
data, and it supplies the sampling frame for sample surveys.
130. SOURCES OF DATA
Population census steps
• Securing the required legislation, political support and funding
• Mapping and listing all households
• Planning and printing questionnaires, instruction manuals and procedures
• Planning for shipping census materials
• Recruiting and training census personnel
• Organizing field operations
• Launching publicity campaigns
• Preparing for data processing
• Planning for tabulation
131. SOURCES OF DATA
Population census data
• Because of the expense and complexity of the census, only the most basic items
are included on the questionnaire for the whole population.
• Choosing these items requires considering the needs of data users; availability
of the information from other data sources; international comparability;
willingness of the respondents to give information; and available resources to
fund the census.
• Many countries carry out a sample enumeration in conjunction with the census.
• This can be a cost-effective way to collect more detailed information on
additional topics from a sample of the population.
• The sample enumeration uses the infrastructure and facilities that are already in
place for the census.
132. SOURCES OF DATA
Surveys
• A continuing program of intercensal household surveys is useful for
collecting detailed data on social, economic and housing characteristics
that are not appropriate for collection in a full-scale census.
• Household-based surveys are the most flexible type of data collection.
• They can examine most subjects in detail and provide timely information
about emerging issues.
• They build the skills and experience of in-house technical
and field staff and maintain resources that have already been
developed, such as maps, sampling frames, field-operations
infrastructure and data-processing capability.
133. SOURCES OF DATA
Surveys
• The many types of household surveys include multi-subject
surveys, specialized surveys, multi-phase surveys
and panel or longitudinal surveys.
• Each type of survey is appropriate for certain kinds of
data-collection needs.
• Household surveys can be costly to undertake, especially
if a country has no ongoing program.
134. SOURCES OF DATA
Administrative records
• Administrative records are statistics compiled from various administrative
processes.
• They include not only the vital events recorded in a civil registration system but
also education statistics from school records; health statistics from hospital
records; employment statistics; and many others.
• The reliability and usefulness of these statistics depend on the completeness of
coverage and the compatibility of concepts, definitions and classifications with
those used in the census.
• Administrative records are often by-products of administrative processes, but
they can also be valuable complementary sources of data for censuses and
surveys.
135. SOURCES OF DATA
Administrative records
• Birth certificates
• Death certificates
• Patient medical records
• Disease registries
• Insurance claim forms
• Billing records
• Public health department case reports