PROBABILITY AND STATISTICS BY ENGR. JORGE P. BAUTISTA
COURSE OUTLINE Introduction  to Statistics Tabular and Graphical representation of Data Measures of Central Tendencies, Locations and Variations Measure of Dispersion and Correlation Probability and Combinatorics Discrete and Continuous Distributions Hypothesis Testing
Text and References
Statistics: A Simplified Approach by Punsalan and Uriarte, 1998, Rex Textbook
Probability and Statistics by Johnson, 2008, Wiley
Counterexamples in Probability and Statistics by Romano and Siegel, 1986, Chapman and Hall
Introduction to Statistics Definition In its plural sense, statistics is a set of numerical data e.g. Vital statistics, monthly sales, exchange rates, etc. In its singular sense, statistics is a branch of science that deals with the collection, presentation, analysis and interpretation of data.
General uses of Statistics Aids in decision making by providing comparison of data, explaining actions that have taken place, justifying a claim or assertion, predicting future outcomes and estimating unknown quantities. Summarizes data for public use.
Examples on the role of Statistics In biological and medical sciences, it helps researchers discover relationships worthy of further attention. Ex. A doctor can use statistics to determine to what extent an increase in blood pressure depends on age. In social sciences, it guides researchers and helps them support theories and models that cannot stand on rationale alone. Ex. Empirical studies use statistics to obtain the socio-economic profile of the middle class in order to form new socio-political theories.
Con’t In business, a company can use statistics to forecast sales, design products, and produce goods more efficiently. Ex. A pharmaceutical company can apply statistical procedures to find out if the new formula is indeed more effective than the one being used. In Engineering, it can be used to test properties of various materials, Ex. A quality controller can use statistics to estimate the average lifetime of the products produced by their current equipment.
Fields of Statistics a. Statistical Methods of Applied Statistics:
Descriptive - comprises those methods concerned with the collection, description, and analysis of a set of data without drawing conclusions or inferences about a larger set.
Inferential - comprises those methods concerned with making predictions or inferences about a larger set of data using only the information gathered from a subset of this larger set.
con’t b. Statistical theory of mathematical statistics- deals with the development and exposition of theories that serve as a basis of statistical methods
Descriptive VS Inferential
DESCRIPTIVE
A bowler wants to find his bowling average for the past 12 months.
A housewife wants to determine the average weekly amount she spent on groceries in the past 3 months.
A politician wants to know the exact number of votes he received in the last election.
INFERENTIAL
A bowler wants to estimate his chance of winning a game based on his current season averages and the averages of his opponents.
A housewife would like to predict, based on last year’s grocery bills, the average weekly amount she will spend on groceries this year.
A politician would like to estimate, based on opinion polls, his chance of winning the upcoming election.
Population as Differentiated from Sample The word population refers to groups or aggregates of people, animals, objects, materials, happenings or things of any form. This means that there are populations of students, teachers, supervisors, principals, laboratory animals, trees, manufactured articles, birds and many others. If your interest is in a few members of the population chosen to represent its characteristics or traits, these members constitute a sample. The measures of the population are called parameters, while those of the sample are called estimates or statistics.
The Variable It refers to a characteristic or property whereby the members of the group or set vary or differ from one another. A constant, by contrast, refers to a property whereby the members of the group do not differ from one another. Variables can be classified according to functional relationship as independent or dependent. If you treat variable y as a function of variable z, then z is your independent variable and y is your dependent variable. This means that the value of y, say academic achievement, depends on the value of z.
Con’t Variables according to continuity of values. 1. Continuous variable – these are variables whose levels can take continuous values. Examples are height, weight, length and width. 2. Discrete variables – these are variables whose values or levels can not take the form of a decimal. An example is the size of a particular family.
Con’t Variables according to scale of measurements: 1. Nominal – this refers to a property of the members of a group defined by an operation which allows making of statements only of equality or difference. For example, individuals can be classified according to their sex or skin color. Color is an example of a nominal variable.
Con’t 2. Ordinal – it is defined by an operation whereby members of a particular group are ranked. In this operation, we can state that one member is greater or less than the others in a criterion, rather than saying only that he/it is equal to or different from the others, as with a nominal variable. 3. Interval – this refers to a property defined by an operation which permits making statements of equality of intervals, rather than just statements of sameness or difference and greater than or less than. An interval variable does not have a “true” zero point, although for convenience a zero point may be assigned.
Con’t 4. Ratio –  is defined by the operation which permits making statements of equality of ratios in addition to statements of sameness or difference, greater than or less than and equality or inequality of differences.  This means that one level or value may be thought of or said as double, triple or five times another and so on.
Assignment no. 1 I. Make a list of at least 5 mathematicians or scientists who contributed to the field of statistics. State their contributions. II. With your knowledge of statistics, give a real-life situation in which statistics is applied. Expand your answer. III. When can a variable be considered independent and dependent? Give an example for your answer.
Con’t IV. Enumerate some uses of statistics. V. Do you think that any science can develop without the testing of hypotheses? Why?
Examples of Scales of Measurement 1.Nominal Level Ex. Sex:  M-Male  F-Female Marital Status: 1-single  2- married  3- widowed  4- separated 2. Ordinal Level Ex. Teaching Ratings: 1-poor  2-fair  3- good  4- excellent
Con’t 3. Interval Level Ex. IQ, temperature 4. Ratio Level Ex. Age, no. of correct answers in exam
Data Collection Methods 1. Survey method – questions are asked to obtain information, either through a self-administered questionnaire or a personal interview. 2. Observation method – makes possible the recording of behavior but only at the time of occurrence (ex. traffic count, reactions to a particular stimulus).
Con’t 3. Experimental method – a method designed for collecting data under controlled conditions. An experiment is an operation where there is actual human interference with the conditions that can affect the variable under study. 4. Use of existing studies – that is census, health statistics, weather reports. 5. Registration method – that is car registration, student registration, hospital admission and ticket sales.
Tabular Representation A frequency distribution is the arrangement of the gathered data by categories together with their corresponding frequencies and class marks or midpoints. Its class frequency is the number of observations belonging to a class interval. Each class interval is a grouping defined by two limits, the lower and the upper limit. The points that separate adjacent class intervals, lying halfway between the upper limit of one class and the lower limit of the next, are called class boundaries.
Frequency of Nominal Data
Male and Female College Students Majoring in Chemistry
SEX      FREQUENCY
MALE     23
FEMALE   107
TOTAL    130
Frequency of Ordinal Data
Ex. Frequency distribution of Employee Perception on the Behavior of their Administrators
Perception             Frequency
Strongly favorable     10
Favorable              11
Slightly favorable     12
Slightly unfavorable   14
Unfavorable            22
Strongly unfavorable   31
Total                  100
Frequency Distribution Table Definition: Raw data – the set of data in its original form. Array – an arrangement of observations according to their magnitude, either in increasing or decreasing order. Advantages: it is easier to detect the smallest and largest values and easier to find the measures of position.
Grouped Frequency of Interval Data Given the following raw scores in Algebra Examination, 56 42 28 56 41 56 55 59 50 55 57 38 62 52 66 65 33 34 37 47 42 68 62 54 68 48 56 39 77 80 62 71 57 52 60 70
1. Compute the range, R = H – L, and the number of classes, K = 1 + 3.322 log n, where n = number of observations.
2. Divide the range by 10 to 15 to determine the acceptable size of the interval. Hint: most frequency distributions use an odd number as the size of the interval. The advantage is that the midpoints of the intervals will be whole numbers.
3. Organize the class intervals. See to it that the lowest interval begins with a number that is a multiple of the interval size.
4. Tally each score in the class interval it belongs to.
5. Count the tallies and summarize them under column (f). Then add the frequencies to get the total number of cases (N).
6. Determine the class boundaries, UCB and LCB (upper and lower class boundary).
7. Compute the midpoint of each class interval and put it in column (M). M = (LS + HS) / 2, where LS and HS are the lowest and highest scores of the class interval.
8. Compute the cumulative frequency distributions for less than and greater than and put them in columns cf< and cf>, where cf = cumulative frequency. (You can now interpret the data.)
9. Compute the relative frequency distribution. This can be obtained by RF% = CF/TF x 100%, where CF = class frequency and TF = total frequency.
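The steps above can be sketched in Python for the algebra scores given earlier. This is only an illustration: the interval size of 5 is an assumed choice (range 52 divided by 10 to 15 suggests a size of about 4 to 5, and 5 is odd), and the dictionary keys are names made up for this sketch.

```python
import math

# Raw algebra scores from the example above.
scores = [56, 42, 28, 56, 41, 56, 55, 59, 50, 55, 57, 38,
          62, 52, 66, 65, 33, 34, 37, 47, 42, 68, 62, 54,
          68, 48, 56, 39, 77, 80, 62, 71, 57, 52, 60, 70]

n = len(scores)
data_range = max(scores) - min(scores)    # R = H - L
k = 1 + 3.322 * math.log10(n)             # suggested number of classes

width = 5                                 # assumed interval size (odd)
start = (min(scores) // width) * width    # lowest limit is a multiple of the size

table = []
cf_less = 0
lower = start
while lower <= max(scores):
    upper = lower + width - 1
    f = sum(1 for s in scores if lower <= s <= upper)
    cf_greater = n - cf_less              # cases in this class or above it
    cf_less += f                          # cases in this class or below it
    table.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),   # LCB, UCB
        "f": f,
        "midpoint": (lower + upper) / 2,
        "cf_less": cf_less,
        "cf_greater": cf_greater,
        "rf_percent": 100 * f / n,
    })
    lower += width

for row in table:
    print(row)
```

With a different interval size the class limits and frequencies change, so the table printed here is one acceptable answer rather than the only one.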
Graphical Representation The data can be graphically presented according to their scale or level of measurements. 1. Pie chart or circle graph. The pie chart at the right is the enrollment from elementary to master’s degree of a certain university. The total population is 4350 students
2. Histogram or bar graph- this graphical representation can be used in nominal, ordinal or interval. For nominal bar graph, the bars are far apart rather than connected since the categories are not continuous. For ordinal and interval data, the bars should be  joined to emphasize the degree of differences
Given the bar graph of how students rate their library. A-strongly favorable, 90 B-favorable, 48 C-slightly favorable, 88 D-slightly unfavorable, 48 E-unfavorable, 15 F-strongly unfavorable, 25
The Histogram of Person’s Age with Frequency of Travel
age     freq   RF
19-20   20     39.2%
21-22   21     41.2%
23-24   4      7.8%
25-26   4      7.8%
27-28   2      3.9%
total   51     100%
Exercises From the previous grouped data on algebra scores, a. Draw its histogram using the frequency on the y axis and the midpoints on the x axis. b. Draw the line graph or frequency polygon using the frequency on the y axis and the midpoints on the x axis. c. Draw the less than and greater than ogives of the data. An ogive is a graph of the cumulative frequencies by class interval. For the greater than ogive, plot CF> on the y axis against LCB on the x axis; for the less than ogive, plot CF< on the y axis against UCB on the x axis.
Con’t d. Plot the relative frequency using the y axis as the relative frequency in percent value while in the x axis the midpoints.
 
 
 
Assignment No. 2 Given the score in a statistics examinations, 38 56 35 70 44 81 44 80 45 72 45 50 51 51 52 66 54 53 56 84 58 56 57 70 56 39 56 59 72 63 89 63 69 65 61 62 64 64 69 60 53 66 66 67 67 68 68 69 66 67 70 59 40 71 73 60 73 73 73 73 73 74 73 73 79 74 74 70 73 46 74 74 74 75 75 76 55 77 78 73 48 81 44 84 77 88 63 85 73
Construct the class interval, frequency table, class midpoint(use a whole number midpoint), less than and greater than cumulative frequency, upper and lower boundary and relative frequency. Plot the histogram, frequency polygon, and ogives
3. Draw the pie chart and bar graph of the plans of computer science students with respect to attending a seminar. Compute for the Relative frequency of each. A-will not attend=45 B-probably will not attend=30 C-probably will attend=40 D-will attend=25
Measures of Centrality and Location
Mean for Ungrouped Data
$X' = \sum X / N$
where X’ = the mean, $\sum X$ = the sum of all scores/data, N = the total number of cases
Mean for Grouped Data
$X' = \sum fM / N$
where X’ = the mean, M = the midpoint, fM = the product of the frequency and each midpoint, N = total number of cases
Ex. Find the mean of 10, 20, 25, 30, 30, 35, 40 and 50.
Ex. Given the grades of 50 students in a statistics class:
Class interval   f
10-14            4
15-19            3
20-24            12
25-29            10
30-34            6
35-39            6
40-44            6
45-49            3
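As a check on the two examples, here is a short Python sketch of both mean formulas. The variable names are chosen for this illustration only; the class midpoint is taken as the average of the two class limits.

```python
# Ungrouped data: mean = sum of scores / number of cases.
scores = [10, 20, 25, 30, 30, 35, 40, 50]
mean_ungrouped = sum(scores) / len(scores)

# Grouped data: each entry is (lower limit, upper limit, frequency).
classes = [(10, 14, 4), (15, 19, 3), (20, 24, 12), (25, 29, 10),
           (30, 34, 6), (35, 39, 6), (40, 44, 6), (45, 49, 3)]

n = sum(f for _, _, f in classes)                            # N
sum_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)     # sum of f times M
mean_grouped = sum_fm / n

print(mean_ungrouped, mean_grouped)
```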
The weighted mean. The weighted arithmetic mean of given groups of data is the average of the means of all groups, with each mean counted according to its weight.
$WX' = \sum Xw / N$
where WX’ = the weighted mean, w = the weight of X, $\sum Xw$ = the sum of the products of each X and its weight, $N = \sum w$ = the sum of the weights
Ex. Find the weighted mean of the four group means below:
Group, i   1    2    3    4
$X_i$      60   50   70   75
$W_i$      10   20   40   50
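A minimal Python sketch of the weighted mean for the four groups above; the list names are illustrative.

```python
# Group means and their weights from the example above.
means = [60, 50, 70, 75]
weights = [10, 20, 40, 50]

sum_xw = sum(x * w for x, w in zip(means, weights))   # sum of X times w
total_weight = sum(weights)                            # N = sum of w
weighted_mean = sum_xw / total_weight

print(round(weighted_mean, 2))
```

Note that the unweighted average of the four means would be 63.75; the weighted mean is pulled upward because the groups with the larger means also carry the larger weights.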
Median for Ungrouped Data The median of ungrouped data is the centermost score in a distribution whose scores have been arranged in order of magnitude.
$Mdn = \left(X_{N/2} + X_{(N+2)/2}\right) / 2$ if N is even
$Mdn = X_{(1+N)/2}$ if N is odd
Ex. Find the median of the following sets of scores:
Score A: 12, 15, 19, 21, 6, 4, 2
Score B: 18, 22, 31, 12, 3, 9, 11, 8
Median for Grouped Data Procedure:
1. Compute the cumulative frequency less than.
2. Find N/2.
3. Locate the class interval in which the middle case falls (the median class), and determine the exact lower limit of this interval.
4. Apply the formula
$Mdn = L + \dfrac{(N/2 - F)\, i}{f_m}$
where L = exact lower limit of the median class, F = the sum of all frequencies preceding the median class, $f_m$ = frequency of the median class, i = size of the class interval, N = total number of cases
Ex. Find the median of the given frequency table.
class interval   f    cf<
25-29            3    3
30-34            5    8
35-39            10   18
40-44            15   33
45-49            15   48
50-54            15   63
55-59            21   84
60-64            8    92
65-69            6    98
70-74            2    100
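The grouped-median procedure can be sketched as follows. The function name and the tuple layout are assumptions of this illustration; the exact lower limit is taken as the lower class limit minus 0.5, which fits integer class limits such as these.

```python
# Each entry is (lower limit, upper limit, frequency).
classes = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15),
           (45, 49, 15), (50, 54, 15), (55, 59, 21), (60, 64, 8),
           (65, 69, 6), (70, 74, 2)]

def grouped_median(classes):
    n = sum(f for _, _, f in classes)
    half = n / 2
    cum = 0                                # F: frequencies before the current class
    for lo, hi, f in classes:
        if cum + f >= half:                # the middle case falls in this class
            exact_lower = lo - 0.5         # L
            size = hi - lo + 1             # i
            return exact_lower + (half - cum) * size / f
        cum += f

median = grouped_median(classes)
print(round(median, 2))
```

Here N/2 = 50, the median class is 50-54 with F = 48 and a class frequency of 15, so the median is 49.5 + (2)(5)/15.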
Mode of Ungrouped Data It is defined as the data value or specific score which has the highest frequency. Find the mode of the following data. Data A : 10, 11, 13, 15, 17, 20 Data B:  2, 3, 4, 4, 5, 7, 8, 10 Data C: 3.5, 4.8, 5.5, 6.2, 6.2, 6.2, 7.3, 7.3, 7.3, 8.8
Mode of Grouped Data For grouped data, the crude mode is the midpoint of the interval containing the largest number of cases (the modal class). A refined value is obtained by interpolating within the modal class:
$Mdo = L + \left[\dfrac{d_1}{d_1 + d_2}\right] i$
where L = exact lower limit of the modal class, $d_1$ = the difference between the frequency of the modal class and the frequency of the interval preceding it, $d_2$ = the difference between the frequency of the modal class and the frequency of the interval after it, i = size of the class interval
Ex. Find the mode of the given frequency table.
class interval   f    cf<
25-29            3    3
30-34            5    8
35-39            10   18
40-44            15   33
45-49            15   48
50-54            15   63
55-59            21   84
60-64            8    92
65-69            6    98
70-74            2    100
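A hedged Python sketch of the grouped-mode formula for this table. The function name is illustrative, and two choices are assumptions of the sketch: a missing neighbouring class is treated as having frequency 0, and if several classes tie for the highest frequency the first one is used.

```python
# Each entry is (lower limit, upper limit, frequency).
classes = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15),
           (45, 49, 15), (50, 54, 15), (55, 59, 21), (60, 64, 8),
           (65, 69, 6), (70, 74, 2)]

def grouped_mode(classes):
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                    # position of the modal class
    lo, hi, f = classes[m]
    before = freqs[m - 1] if m > 0 else 0          # assumed 0 if no class before
    after = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = f - before
    d2 = f - after
    return (lo - 0.5) + d1 / (d1 + d2) * (hi - lo + 1)

mode = grouped_mode(classes)
print(round(mode, 2))
```

The modal class is 55-59 (frequency 21), so d1 = 21 - 15 = 6, d2 = 21 - 8 = 13 and the mode is 54.5 + (6/19)(5).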
Exercises 1. Determine the mean, median and mode of the age of 15 students in a certain class. 15, 18, 17, 16, 19, 18, 23 , 24, 18, 16, 17, 20, 21, 19 2. To qualify for a scholarship, a student should have garnered an average score of 2.25. Determine if a certain student with the following grades is qualified for a scholarship.
Subject   no. of units   grade
A         1              2.0
B         2              3.0
C         3              1.5
D         3              1.25
E         5              2.0
3. Find the mean, median and mode of the given grouped data.
Classes   f
11-22     2
23-34     8
35-46     11
47-58     19
59-70     14
71-82     5
83-94     1
Quartiles refer to the values that divide the distribution into four equal parts.  There are 3 quartiles represented by Q 1  , Q 2  and Q 3 . The value Q 1  refers to the value in the distribution that falls on the first one fourth of the distribution arranged in magnitude.  In the case of Q 2  or the second quartile, this value corresponds to the median. In the case of third quartile or Q 3 , this value corresponds to three fourths of the distribution.
 
For grouped data, the computing formula of the kth quartile, where k = 1, 2, 3, is given by
$Q_k = L + \left[\dfrac{kn/4 - F}{f_m}\right] i$
where L = lower class boundary of the kth quartile class, F = cumulative frequency before the kth quartile class, $f_m$ = frequency of the kth quartile class, i = size of the class interval
Exercises Compute the value of the first and third quartile of the given data.
class interval   f    cf<
25-29            3    3
30-34            5    8
35-39            10   18
40-44            15   33
45-49            15   48
50-54            15   63
55-59            21   84
60-64            8    92
65-69            6    98
70-74            2    100
Decile: If the given data is divided into ten equal parts, then we have nine points of division known as deciles. They are denoted by $D_1, D_2, D_3, D_4, \ldots, D_9$.
$D_k = L + \left[\dfrac{kn/10 - F}{f_m}\right] i$
where k = 1, 2, 3, 4, …, 9
Exercises Compute the value of the third, fifth and seventh decile of the given data.
class interval   f    cf<
25-29            3    3
30-34            5    8
35-39            10   18
40-44            15   33
45-49            15   48
50-54            15   63
55-59            21   84
60-64            8    92
65-69            6    98
70-74            2    100
Percentile: Percentiles are the values that divide a distribution into one hundred equal parts. There are 99 percentiles, represented by $P_1, P_2, P_3, P_4, P_5, \ldots, P_{99}$. When we say the 55th percentile, we are referring to the value at or below which 55/100 of the data fall.
$P_k = L + \left[\dfrac{kn/100 - F}{f_m}\right] i$
where k = 1, 2, 3, 4, 5, …, 99
Exercises Compute the value of the 30th, 55th, 68th and 88th percentile of the given data.
class interval   f    cf<
25-29            3    3
30-34            5    8
35-39            10   18
40-44            15   33
45-49            15   48
50-54            15   63
55-59            21   84
60-64            8    92
65-69            6    98
70-74            2    100
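The quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), so a single helper can cover all three. The sketch below is illustrative: the function name and its `parts` parameter are assumptions made for this example, and the lower class boundary is taken as the lower limit minus 0.5.

```python
# Each entry is (lower limit, upper limit, frequency).
classes = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15),
           (45, 49, 15), (50, 54, 15), (55, 59, 21), (60, 64, 8),
           (65, 69, 6), (70, 74, 2)]

def grouped_quantile(classes, k, parts):
    """kth point of division when the data are split into `parts` equal parts."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts                 # kn/4, kn/10 or kn/100
    cum = 0                                # F: cumulative frequency before the class
    for lo, hi, f in classes:
        if cum + f >= target:
            return (lo - 0.5) + (target - cum) / f * (hi - lo + 1)
        cum += f

q1 = grouped_quantile(classes, 1, 4)
q3 = grouped_quantile(classes, 3, 4)
d3 = grouped_quantile(classes, 3, 10)
d7 = grouped_quantile(classes, 7, 10)
p55 = grouped_quantile(classes, 55, 100)
p88 = grouped_quantile(classes, 88, 100)
print(q1, q3, d3, d7, p55, p88)
```

As a consistency check, the second quartile, the fifth decile and the 50th percentile all come out equal to the median.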
Assignment no. 3 The rate per hour in pesos of 12 employees of a certain company were taken and are shown below. 44.75, 44.75, 38.15, 39.25, 18.00, 15.75, 44.75, 39.25, 18.50, 65.25, 71.25, 77.50 Find the mean, median and mode. If the value 15.75 was incorrectly written as 45.75, what measure of central tendency will be affected? Support your answer.
II. The final grades of a student in six subjects were tabulated below.
Subj       units   final grade
Algebra    3       60
Religion   2       90
English    3       75
Pilipino   3       86
PE         1       98
History    3       70
a. Determine the weighted mean.
b. If the subjects were of equal number of units, what would be his average?
III. The ages of qualified voters in a certain barangay were taken and are shown below.
Class Interval   Frequency
18-23            20
24-29            25
30-35            40
36-41            52
42-47            30
48-53            21
54-59            12
60-65            6
66-71            4
72-77            1
a. Find the mean, median and mode. b. Find the 1st and 3rd quartile. c. Find the 4th and 6th decile. d. Find the 25th and 75th percentile.
Measure of Variation The range is considered to be the simplest measure of variation. It is the difference between the highest and the lowest value in the distribution. R = H – L For grouped data, the range is the difference between the highest upper class boundary and the lowest lower class boundary. Example: Find the range of the given grouped data in slide no. 59.
Semi-interquartile Range This value is obtained by getting one half of the difference between the third and the first quartile. $Q = (Q_3 - Q_1)/2$ Example: Find the semi-interquartile range of the previous example in slide no. 59.
Average Deviation The average deviation refers to the arithmetic mean of the absolute deviations of the values from the mean of the distribution. This measure is sometimes known as the mean absolute deviation. AD =  Σ│ x – x’ │ / n Where x = the individual values x’ = mean of the distribution
Steps in solving for AD
1. Arrange the values in a column according to magnitude.
2. Compute the value of the mean x’.
3. Determine the deviations (x – x’).
4. Convert the deviations in step 3 into positive deviations. Use the absolute value sign.
5. Get the sum of the absolute deviations in step 4.
6. Divide the sum in step 5 by n.
Example: Consider the following values: 16, 13, 9, 6, 15, 7, 11, 12 Find the average deviation.
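The steps for the average deviation translate directly into a few lines of Python. This is a sketch for the eight values in the example; the variable names are illustrative.

```python
values = [16, 13, 9, 6, 15, 7, 11, 12]

n = len(values)
mean = sum(values) / n                                # x'
abs_deviations = [abs(x - mean) for x in values]      # |x - x'|
average_deviation = sum(abs_deviations) / n           # AD

print(mean, average_deviation)
```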
For grouped data: AD =  Σ f│x – x’│ / n Where f = frequency of each class x = midpoint of each class x’ = mean of the distribution n = total number of frequency
Example: Find the average deviation of the given data.
Classes   f
11-22     2
23-34     8
35-46     11
47-58     19
59-70     14
71-82     5
83-94     1
Variance For ungrouped data $s^2 = \sum (x - x')^2 / n$ Example: Find the variance of 16, 13, 9, 6, 15, 7, 11, 12
For grouped data $s^2 = \sum f(x - x')^2 / n$ where f = frequency of each class, x = midpoint of each class interval, x’ = mean of the distribution, n = total number of frequency
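A short Python sketch of the ungrouped variance for the example values, dividing by n as in the formula above (not by n - 1). The standard deviation is included as the square root of the variance.

```python
import math

values = [16, 13, 9, 6, 15, 7, 11, 12]

n = len(values)
mean = sum(values) / n                                   # x'
variance = sum((x - mean) ** 2 for x in values) / n      # s^2 with divisor n
std_dev = math.sqrt(variance)                            # s

print(variance, round(std_dev, 4))
```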
Example: Find the variance of the given data.
Classes   f
11-22     2
23-34     8
35-46     11
47-58     19
59-70     14
71-82     5
83-94     1
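For the grouped example, the same idea applies with class midpoints standing in for the individual values. The sketch below assumes the midpoint is the average of the two class limits and again divides by n; the names are illustrative.

```python
import math

# Each entry is (lower limit, upper limit, frequency).
classes = [(11, 22, 2), (23, 34, 8), (35, 46, 11), (47, 58, 19),
           (59, 70, 14), (71, 82, 5), (83, 94, 1)]

n = sum(f for _, _, f in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]     # x
freqs = [f for _, _, f in classes]

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n                       # x'
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n     # s^2
std_dev = math.sqrt(variance)                                                 # s

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```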
Coefficient of variation If you wish to compare the variability between different sets of scores or data, the coefficient of variation is a very useful measure for interval scale data. CV = s / x’ where s = standard deviation, x’ = the mean
Example: In a particular university, a researcher wishes to compare the variation in the scores of the urban students with that of the rural students in their college entrance test. It is known that the urban students’ mean score is 384 with a standard deviation of 101, while among the rural students the mean is 174 with a standard deviation of 53. Which group shows more variation in scores?
Standard Deviation $s = \sqrt{s^2}$ For ungrouped data $s = \sqrt{\sum (x - x')^2 / n}$ For grouped data $s = \sqrt{\sum f(x - x')^2 / n}$
Find the standard deviation of the previous examples for ungrouped and grouped data. Find the standard deviation of the given data.
Classes   f
11-22     2
23-34     8
35-46     11
47-58     19
59-70     14
71-82     5
83-94     1
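The comparison in this example is a one-line calculation per group. A minimal sketch, with illustrative variable names:

```python
# (mean, standard deviation) for each group, from the example above.
urban_mean, urban_sd = 384, 101
rural_mean, rural_sd = 174, 53

cv_urban = urban_sd / urban_mean     # CV = s / mean
cv_rural = rural_sd / rural_mean

more_variable = "rural" if cv_rural > cv_urban else "urban"
print(round(cv_urban, 4), round(cv_rural, 4), more_variable)
```

Although the urban group has the larger standard deviation, its variation relative to its mean is smaller, which is the point of using the coefficient of variation.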
Find the standard deviation of 16, 13, 9, 6, 15, 7, 11, 12
Measure of variation for nominal data VR = 1 – fm/N where VR = the variation ratio, fm = modal class frequency, N = total number of observations
Example: With the data given by a clinical psychologist on the type of therapy used, compute the variation ratios.
                         no. of patients
Type of therapy          YR 1980   YR 1985
Logotherapy              20        8
Reality Therapy          60        105
Rational Therapy         42        6
Transactional analysis   39        9
Family therapy           52        5
Others                   41        8
Assignment no. 4 I. Compute for the semi-interquartile range, average deviation, variance and standard deviation of test III of assignment no. 3. II. Compute for the semi-interquartile range, average deviation, variance and standard deviation of test I of assignment no. 3.
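A small Python sketch of the variation ratio for the two years in this example. The function name is illustrative.

```python
# Number of patients per type of therapy, in the order listed above.
patients_1980 = [20, 60, 42, 39, 52, 41]
patients_1985 = [8, 105, 6, 9, 5, 8]

def variation_ratio(frequencies):
    # VR = 1 - (modal frequency / total number of observations)
    return 1 - max(frequencies) / sum(frequencies)

vr_1980 = variation_ratio(patients_1980)
vr_1985 = variation_ratio(patients_1985)
print(round(vr_1980, 4), round(vr_1985, 4))
```

The drop in the ratio from 1980 to 1985 reflects how strongly the cases concentrate in the modal category (Reality Therapy) in the later year.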
SIMPLE LINEAR REGRESSION AND MEASURES OF CORRELATION In this topic, you will learn how to predict the value of one dependent variable from the corresponding given value of the independent variable.
The scatter diagram: In solving problems that concern estimation and forecasting, a scatter diagram can be used as a graphical approach. This technique consists of plotting the points corresponding to the paired scores of the independent and dependent variables, commonly represented by X and Y, on the X-Y coordinate system.
Example: The working experience and income of 8 employees are given below.
Employee   Years of experience (X)   Income in thousands (Y)
A          2                         8
B          8                         10
C          4                         11
D          11                        15
E          5                         9
F          13                        17
G          4                         8
H          15                        14
Using the Least Squares Linear Regression Equation: $Y = a + bX$ where
$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$
$a = y' - b x'$
and x’ and y’ are the means of X and Y. Obtain the equation for the given data and estimate the income of an employee whose experience is 20 years.
Standard Error of Estimate
$S_e = \sqrt{\dfrac{\sum Y_i^2 - a\sum Y_i - b\sum X_i Y_i}{n - 2}}$
The standard error of estimate is interpreted like a standard deviation: for a given value of X, the observed values of Y will almost always fall between the upper and lower 3Se limits around the regression line.
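The least squares formulas can be checked on the employee data with a short Python sketch. The variable names are illustrative, and the prediction at 20 years lies beyond the observed range of experience (2 to 15 years), so it should be read as an extrapolation.

```python
# Years of experience (X) and income in thousands (Y) for employees A to H.
x = [2, 8, 4, 11, 5, 13, 4, 15]
y = [8, 10, 11, 15, 9, 17, 8, 14]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # slope
a = sum_y / n - b * sum_x / n                                  # intercept: y' - b x'

predicted_income = a + b * 20      # estimated income (thousands) at 20 years

print(round(a, 4), round(b, 4), round(predicted_income, 2))
```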
Measures of Correlation The degree of relationship between variables is expressed into: Perfect correlation (positive or negative) Some degree of correlation (positive or negative) No correlation
For a perfect correlation, the coefficient is either +1 (positive) or -1 (negative). Some degree of correlation, positive or negative, is represented by +0.01 to +0.99 and -0.01 to -0.99. No correlation is represented by 0.
+0.01 to +0.25   very small positive correlation
+0.26 to +0.50   moderately small positive correlation
+0.51 to +0.75   high positive correlation
+0.76 to +0.99   very high positive correlation
+1.00            perfect positive correlation
----------------------------------------------------------
-0.01 to -0.25   very small negative correlation
-0.26 to -0.50   moderately small negative correlation
-0.51 to -0.75   high negative correlation
-0.76 to -0.99   very high negative correlation
-1.00            perfect negative correlation
Anybody who wants to interpret the results of the coefficient of correlation should be guided by the following reminders:
1. A relationship between two variables does not necessarily mean that one is the cause or the effect of the other. Correlation does not imply a cause-effect relationship.
2. When the computed Pearson r is high, it does not necessarily mean that one factor is strongly dependent on the other. On the other hand, when the computed Pearson r is small, it does not necessarily mean that one factor has no dependence on the other.
3. If there is a reason to believe that the two variables are related and the computed Pearson r is high, the two variables can be regarded as associated. On the other hand, if the correlation between the variables is low, other factors might be responsible for such a small association.
4. Lastly, the correlation coefficient simply informs us that when two variables change, there may be a strong or weak relationship taking place.
The formula for finding the Pearson r is
$r = \dfrac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$
Example: Given two sets of scores, find the Pearson r and interpret the result.
X    Y
18   10
16   14
14   14
13   12
12   10
10   8
10   5
8    6
6    12
3    0
Correlation between Ordinal Data This is the Spearman Rank-Order Correlation Coefficient (Spearman Rho). For cases of 30 or less, Spearman $\rho$ is the most widely used of the rank correlation methods.
$\rho = 1 - \dfrac{6\sum D^2}{n(n^2 - 1)}$
where D = (RX – RY), the difference between the rank of X and the rank of Y
Example:
Individual   Test X   Test Y
1            18       24
2            17       28
3            14       30
4            13       26
5            12       22
6            10       18
7            8        15
8            8        12
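The Pearson r for these ten pairs can be computed directly from the raw-score formula. A minimal sketch with illustrative names:

```python
import math

x = [18, 16, 14, 13, 12, 10, 10, 8, 6, 3]
y = [10, 14, 14, 12, 10, 8, 5, 6, 12, 0]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 4))
```

On the interpretation scale given earlier, a value in the +0.51 to +0.75 band would be read as a high positive correlation.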
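A Python sketch of the Spearman rho for this example. Two conventions here are assumptions of the sketch rather than something stated in the slides: rank 1 goes to the highest score, and tied scores (the two 8s in Test X) share the average of the ranks they occupy.

```python
test_x = [18, 17, 14, 13, 12, 10, 8, 8]
test_y = [24, 28, 30, 26, 22, 18, 15, 12]

def average_ranks(values):
    # Rank 1 = highest value; tied values share the average of their positions.
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1        # first position taken by this value
        count = ordered.count(v)            # how many times it occurs
        ranks.append(first + (count - 1) / 2)
    return ranks

rx = average_ranks(test_x)
ry = average_ranks(test_y)

n = len(test_x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))    # sum of D^2
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rx, ry, round(rho, 4))
```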
Gamma Rank Order An alternative to the rank order correlation is the Goodman’s and Kruskal’s Gamma (G). The value of one variable can be estimated or predicted from the other variable when you have the knowledge of their values. The gamma can also be used when ties are found in the ranking of the data.
$G = \dfrac{N_S - N_1}{N_S + N_1}$
where $N_S$ = the number of pairs ordered in the parallel direction, $N_1$ = the number of pairs ordered in the opposite direction
Given a segment of the Filipino Electorate according to religion and political party
             LAKAS   LP   NP   Total
Catholic     50      25   20
INC          34      72   21
Born Again   22      12   10
Total
Correlation between Nominal Data The Guttman’s Coefficient of predictability is the proportionate reduction in error measure which shows the index of how much an error is reduced in predicting values of one variable from the value of another.
$\lambda_c = \dfrac{\sum FBR - MBC}{N - MBC}$
where FBR = the biggest cell frequency in the ith row, MBC = the biggest of the column totals, N = total observations
$\lambda_r = \dfrac{\sum FBC - MBR}{N - MBR}$
where FBC = the biggest cell frequency in each column, MBR = the biggest of the row totals, N = total number of observations
Compute for the $\lambda_c$ and $\lambda_r$ for the segment of Filipino electorate and political parties.
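One way to count the pairs for gamma is to loop over the cells of the table. The sketch below is illustrative and rests on two assumptions: the table is read with rows Catholic, INC, Born Again and columns LAKAS, LP, NP, and that listed order is treated as the ordering of the categories. A pair counts as parallel when one case is lower on both variables than the other, and as opposite when it is lower on one and higher on the other.

```python
# Rows: Catholic, INC, Born Again. Columns: LAKAS, LP, NP (assumed reading).
table = [[50, 25, 20],
         [34, 72, 21],
         [22, 12, 10]]

def gamma_counts(table):
    rows, cols = len(table), len(table[0])
    parallel = opposite = 0
    for r in range(rows):
        for c in range(cols):
            for r2 in range(r + 1, rows):          # cells in lower rows only
                for c2 in range(cols):
                    if c2 > c:                      # below and to the right
                        parallel += table[r][c] * table[r2][c2]
                    elif c2 < c:                    # below and to the left
                        opposite += table[r][c] * table[r2][c2]
    return parallel, opposite

n_s, n_1 = gamma_counts(table)
gamma = (n_s - n_1) / (n_s + n_1)
print(n_s, n_1, round(gamma, 4))
```

Pairs that are tied on either variable (same row or same column) are left out of both counts, which is why gamma can still be used when ties are present.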
Assignment no. 5 1. Given the average yearly cost and sales of company A for a period of 8 years (one row per year), find the Pearson r and interpret the results.
Cost per P10,000   Sales per P10,000
15                 38
30                 53.3
16                 60
39                 72
20                 40
36                 47.5
45                 82
10                 21.5
2. Given the grades of 10 students in statistics, determine the Spearman rho and interpret the result.
Student   Q1   Q2
A         62   57
B         90   88
C         75   90
D         60   67
E         58   60
F         89   79
G         91   78
H         90   62
I         94   86
J         50   55
3. Compute for the gamma of the table shown and interpret the result.
                        EDUCATIONAL STATUS
Socio-economic status   UPPER   MIDDLE   LOWER   TOTAL
UPPER                   24      19       5
MIDDLE                  12      54       29
LOWER                   9       26       25
TOTAL
4. Compute for the  λ c and  λ r for the problem no. 3.
Counting Techniques Consider the numbers 1, 2, 3 and 4. Suppose you want to determine the total number of 2-digit numbers that can be formed from them. First, let us assume that no digit is to be repeated.
12  13  14
21  23  24
31  32  34
41  42  43
Notice that we were able to use all the possibilities. In this example, we have 12 possible 2-digit numbers.
Now, what if the digits can be repeated?
11  12  13  14
21  22  23  24
31  32  33  34
41  42  43  44
Hence, we have 16 possible outcomes. In general, if a first activity can be done in $n_1$ ways and, after it has been done, a second activity can be done in $n_2$ ways, then the two activities together can be done in $n_1 n_2$ ways.
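The two listings can be generated and counted with Python's standard `itertools` module, which is a convenient way to confirm the multiplication rule on small cases. The variable names are illustrative.

```python
from itertools import permutations, product

digits = "1234"

# No repeated digit: ordered pairs of distinct digits (4 x 3 ways).
no_repeat = ["".join(p) for p in permutations(digits, 2)]

# Repetition allowed: every ordered pair of digits (4 x 4 ways).
with_repeat = ["".join(p) for p in product(digits, repeat=2)]

print(len(no_repeat), no_repeat)
print(len(with_repeat), with_repeat)
```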
Example: How many two digit numbers can be formed from the numbers 1,2,3 and 4 if Repetition is not allowed? Repetition is allowed? 2. How many three digit numbers can be formed from the digits 1,2,3,4 and 5 if any of the digits can be repeated? 3.  The club members are going to elect their officers. If there are 5 candidates for president, 5 candidates for vice president and 3 for secretary, then how many ways can the officers be elected?
4. An office executive plans to buy a laptop for which there are 5 brands available. Each of the brands has 3 models and each model has 5 colors to choose from. In how many ways can the executive choose? 5. Consider the numbers 2, 3, 5 and 7. If repetition is not allowed, how many three-digit numbers can be formed such that a. they are all odd? b. they are all even? c. they are greater than 500?
6. A pizza place offers 3 choices of salad, 20 kinds of pizza and 4 different desserts. How many different 3-course meals can one order? 7. The executive board of a certain company consists of 5 males and 2 females. In how many ways can the president and secretary be chosen if a. the president must be female and the secretary must be male? b. the president and the secretary are of opposite sex? c. the president and the secretary should both be male?
Permutation  The term permutation refers to the arrangement of objects with reference to order. P(n,r) = n! / (n – r)! Evaluate: P(10,6) P(5,5) P(4,3) + P(4,4)
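The three expressions to evaluate can be checked with a direct translation of the formula. The helper name `P` mirrors the notation above; recent Python versions also provide `math.perm` for the same purpose.

```python
from math import factorial

def P(n, r):
    # P(n, r) = n! / (n - r)!
    return factorial(n) // factorial(n - r)

print(P(10, 6))
print(P(5, 5))
print(P(4, 3) + P(4, 4))
```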
Examples: In how many ways can a president, a vice president, a secretary and a treasurer be elected from a class with 40 students? In how many ways can 7 individuals be seated in a row of 7 chairs? In how many ways can 9 individuals be seated in a row of 9 chairs if two individuals wanted to be seated side by side?
4. Suppose 5 different math books and 7 different physics books shall be arranged in a shelf.  In how many ways can such books be arranged if the books of the same subject be placed side by side? Determine the possible permutations of the word MISSISSIPPI. Find the total 8 digit numbers that can be formed using all the digits in the following numerals 55777115
In how many ways can 6 persons be seated around a table with 6 chairs if two individuals wanted to be seated side by side? In a local election, there are 7 people running for 3 positions. In how many ways can this be done?
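For the two examples that involve repeated symbols (MISSISSIPPI and 55777115), the slides do not state a formula. The sketch below uses the standard result that n objects, of which some are alike, have n! / (n1! n2! ...) distinct arrangements, where n1, n2, ... are the counts of each repeated symbol. The function name is illustrative.

```python
from collections import Counter
from math import factorial

def distinct_permutations(symbols):
    # n! divided by the factorial of each symbol's count.
    total = factorial(len(symbols))
    for count in Counter(symbols).values():
        total //= factorial(count)
    return total

print(distinct_permutations("MISSISSIPPI"))
print(distinct_permutations("55777115"))
```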
Combination A combination is a selection of objects without regard to order.
$nCr = C(n,r) = \dfrac{n!}{r!\,(n-r)!}$
Evaluate: ${}_8C_4$; $5({}_5C_4 - {}_5C_2)$; ${}_7C_5 / ({}_7C_6 - {}_7C_2)$
A class consists of 12 boys and 10 girls. In how many ways can the class elect a president, a vice president, a secretary and a treasurer? In how many ways can the class elect 4 members of a certain committee? In how many ways can a student answer 6 out of 10 questions? In how many ways can a student answer 6 out of 10 questions if he is required to answer 2 of the first 5 questions?
In how many ways can 3 balls be drawn from a box containing 8 red and 6 green balls? A box contains 8 red and 6 green balls. In how many ways can 3 balls be drawn such that they are all green? 2 are red and 1 is green? 1 is red and 2 are green?
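The three expressions can be evaluated with a direct translation of the combination formula. The helper name `C` mirrors the notation above; recent Python versions also provide `math.comb`.

```python
from math import factorial

def C(n, r):
    # C(n, r) = n! / (r! (n - r)!)
    return factorial(n) // (factorial(r) * factorial(n - r))

first = C(8, 4)
second = 5 * (C(5, 4) - C(5, 2))
third = C(7, 5) / (C(7, 6) - C(7, 2))

print(first, second, third)
```

The second and third results are negative because the subtracted term is larger in each case, for example C(5, 2) = 10 against C(5, 4) = 5.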
A shipment of 40 computers are unloaded from the van and tested.  6 of them are defective. In how many ways can we select a set of 5 computers and get at least one defective? Five letters a,b,c,d,e are to be chosen. In how many ways could you choose None of them At least two of them At most three of them
Assignment no. 6 1. How many possible outcomes are there if (a) a die is rolled? (b) a pair of dice is rolled? 2. In how many ways can 5 math teachers be assigned to 4 available subjects if each of the 5 teachers has an equal chance of being assigned to any of the 4 subjects?
3. Consider the numbers 1, 2, 3, 5 and 6. How many 3-digit numbers can be formed from these numbers if (a) repetition is not allowed and 0 should not be in the first digit? (b) repetition is allowed and 0 should not be in the first digit? 4. A college has 3 entrance gates and 2 exit gates. In how many ways can a student enter and then leave the building?
5. In how many ways can 9 passengers be seated in a bus if only 5 seats are available? 6. In how many ways can 4 boys and 4 girls be seated in a row of 8 chairs if (a) they can sit anywhere? (b) the boys and girls are to be seated alternately? 7. In how many ways can ten participants in a race be placed first, second and third?
8. Determine the number of distinct permutations of each of the following: (a) STATISTICS (b) ADRENALIN (c) 44044999404 9. A class consists of 12 boys and 10 girls. In how many ways can a committee of five be formed if (a) all members are boys? (b) 2 are boys and 3 are girls?
10. In how many ways can a student answer an exam if, out of the 6 problems, he is required to answer only 4?
Probability and statistics

  • 1. PROBABILITY AND STATISTICS BY ENGR. JORGE P. BAUTISTA
  • 2. COURSE OUTLINE Introduction to Statistics Tabular and Graphical representation of Data Measures of Central Tendencies, Locations and Variations Measure of Dispersion and Correlation Probability and Combinatorics Discrete and Continuous Distributions Hypothesis Testing
  • 3. Text and References Statistics: a simplified approach by Punsalan and Uriarte, 1998, Rex Texbook Probability and Statistics by Johnson, 2008, Wiley Counterexamples in Probability and Statistics by Romano and Siegel, 1986, Chapman and Hall
  • 4. Introduction to Statistics Definition In its plural sense, statistics is a set of numerical data e.g. Vital statistics, monthly sales, exchange rates, etc. In its singular sense, statistics is a branch of science that deals with the collection, presentation, analysis and interpretation of data.
  • 5. General uses of Statistics Aids in decision making by providing comparison of data, explaining action that has taken place, justifying a claim or assertion, predicting future outcomes and estimating unknown quantities. Summarizes data for public use.
  • 6. Examples on the role of Statistics In biological and medical sciences, it helps researchers discover relationships worthy of further attention. Ex. A doctor can use statistics to determine to what extent an increase in blood pressure depends on age. In social sciences, it guides researchers and helps them support theories and models that cannot stand on rationale alone. Ex. Empirical studies use statistics to obtain the socio-economic profile of the middle class to form new socio-political theories.
  • 7. Con’t In business, a company can use statistics to forecast sales, design products, and produce goods more efficiently. Ex. A pharmaceutical company can apply statistical procedures to find out if the new formula is indeed more effective than the one being used. In Engineering, it can be used to test properties of various materials, Ex. A quality controller can use statistics to estimate the average lifetime of the products produced by their current equipment.
  • 8. Fields of Statistics a. Statistical methods or applied statistics: Descriptive: comprises those methods concerned with the collection, description, and analysis of a set of data without drawing conclusions or inferences about a larger set. Inferential: comprises those methods concerned with making predictions or inferences about a larger set of data using only the information gathered from a subset of this larger set.
  • 9. Con’t b. Statistical theory or mathematical statistics: deals with the development and exposition of theories that serve as a basis of statistical methods.
  • 10. Descriptive vs. Inferential DESCRIPTIVE: A bowler wants to find his bowling average for the past 12 months. A housewife wants to determine the average weekly amount she spent on groceries in the past 3 months. A politician wants to know the exact number of votes he received in the last election. INFERENTIAL: A bowler wants to estimate his chance of winning a game based on his current season average and the averages of his opponents. A housewife would like to predict, based on last year’s grocery bills, the average weekly amount she will spend on groceries this year. A politician would like to estimate, based on opinion polls, his chance of winning the upcoming election.
  • 11. Population as Differentiated from Sample The word population refers to groups or aggregates of people, animals, objects, materials, happenings or things of any form. This means that there are populations of students, teachers, supervisors, principals, laboratory animals, trees, manufactured articles, birds and many others. If your interest is in a few members of the population chosen to represent its characteristics or traits, these members constitute a sample. The measures of the population are called parameters, while those of the sample are called estimates or statistics.
  • 12. The Variable It refers to a characteristic or property whereby the members of the group or set vary or differ from one another. A constant, by contrast, refers to a property whereby the members of the group do not differ from one another. Variables can be classified according to functional relationship as independent or dependent. If you treat variable y as a function of variable z, then z is your independent variable and y is your dependent variable. This means that the value of y, say academic achievement, depends on the value of z.
  • 13. Con’t Variables according to continuity of values. 1. Continuous variable – these are variables whose levels can take continuous values. Examples are height, weight, length and width. 2. Discrete variables – these are variables whose values or levels can not take the form of a decimal. An example is the size of a particular family.
  • 14. Con’t Variables according to scale of measurements: 1. Nominal – this refers to a property of the members of a group defined by an operation which allows making of statements only of equality or difference. For example, individuals can be classified according to their sex or skin color. Skin color is an example of a nominal variable.
  • 15. Con’t 2. Ordinal – it is defined by an operation whereby members of a particular group are ranked. In this operation, we can state that one member is greater or less than the others in a criterion, rather than saying only that he/it is equal to or different from the others, as with a nominal variable. 3. Interval – this refers to a property defined by an operation which permits making statements of equality of intervals, rather than just statements of sameness or difference and greater than or less than. An interval variable does not have a “true” zero point, although for convenience a zero point may be assigned.
  • 16. Con’t 4. Ratio – is defined by the operation which permits making statements of equality of ratios in addition to statements of sameness or difference, greater than or less than and equality or inequality of differences. This means that one level or value may be thought of or said as double, triple or five times another and so on.
  • 17. Assignment no. 1 I. Make a list of at least 5 mathematicians or scientists who contributed to the field of statistics. State their contributions. II. With your knowledge of statistics, give a real-life situation in which statistics is applied. Expand your answer. III. When can a variable be considered independent or dependent? Give an example to support your answer.
  • 18. Con’t IV. Enumerate some uses of statistics. Do you think that any science will develop without test of the hypothesis? Why?
  • 19. Examples of Scales of Measurement 1.Nominal Level Ex. Sex: M-Male F-Female Marital Status: 1-single 2- married 3- widowed 4- separated 2. Ordinal Level Ex. Teaching Ratings: 1-poor 2-fair 3- good 4- excellent
  • 20. Con’t 3. Interval Level Ex. IQ, temperature 4. Ratio Level Ex. Age, no. of correct answers in exam
  • 21. Data Collection Methods Survey Method – questions are asked to obtain information, either through self administered questionnaire or personal interview. Observation Method – makes possible the recording of behavior but only at the time of occurrence (ex. Traffic count, reactions to a particular stimulus)
  • 22. Con’t 3. Experimental method – a method designed for collecting data under controlled conditions. An experiment is an operation where there is actual human interference with the conditions that can affect the variable under study. 4. Use of existing studies – that is census, health statistics, weather reports. 5. Registration method – that is car registration, student registration, hospital admission and ticket sales.
  • 23. Tabular Representation A frequency distribution is the arrangement of the gathered data by categories together with their corresponding frequencies and class marks or midpoints. The class frequency is the number of observations belonging to a class interval. A class interval is a grouping defined by its lower and upper limits. The class boundaries are the points midway between the upper limit of one class and the lower limit of the next.
  • 24. Frequency of Nominal Data Male and female college students majoring in Chemistry. Sex (frequency): Male (23); Female (107); Total (130)
  • 25. Frequency of Ordinal Data Ex. Frequency distribution of employee perception of the behavior of their administrators. Perception (frequency): Strongly favorable (10); Favorable (11); Slightly favorable (12); Slightly unfavorable (14); Unfavorable (22); Strongly unfavorable (31); Total (100)
  • 26. Frequency Distribution Table Definition: Raw data – the set of data in its original form. Array – an arrangement of observations according to their magnitude, either in increasing or decreasing order. Advantages: it is easier to detect the smallest and largest values and easy to find the measures of position.
  • 27. Grouped Frequency of Interval Data Given the following raw scores in Algebra Examination, 56 42 28 56 41 56 55 59 50 55 57 38 62 52 66 65 33 34 37 47 42 68 62 54 68 48 56 39 77 80 62 71 57 52 60 70
  • 28. 1. Compute the range, R = H – L, and the number of classes, K = 1 + 3.322 log n, where n = number of observations and log is the base-10 logarithm. 2. Divide the range by 10 to 15 to determine the acceptable size of the interval. Hint: most frequency distributions have an odd number as the size of the interval. The advantage is that the midpoints of the intervals will be whole numbers. 3. Organize the class intervals. See to it that the lowest interval begins with a number that is a multiple of the interval size.
  • 29. 4. Tally each score under the class interval it belongs to. 5. Count the tallies and summarize them under column (f). Then add the frequencies to get the total number of cases (N). 6. Determine the class boundaries, UCB and LCB (upper and lower class boundary). 7. Compute the midpoint for each class interval and put it in column (M). M = (LS + HS) / 2
  • 30. 8. Compute the less than and greater than cumulative frequency distributions and put them in columns cf< and cf> (you can now interpret the data). cf = cumulative frequency 9. Compute the relative frequency distribution. This can be obtained by RF% = CF/TF x 100%, where CF = class frequency and TF = total frequency.
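The nine steps can be sketched in Python for the algebra scores of slide 27. This is only an illustration under stated assumptions: the class size of 9 is a choice made here (an odd size near R/K), not a value given in the slides, and the dictionary keys are arbitrary labels.

```python
import math

scores = [56, 42, 28, 56, 41, 56, 55, 59, 50, 55, 57, 38, 62, 52, 66, 65, 33, 34,
          37, 47, 42, 68, 62, 54, 68, 48, 56, 39, 77, 80, 62, 71, 57, 52, 60, 70]

n = len(scores)
data_range = max(scores) - min(scores)   # step 1: R = H - L
k = 1 + 3.322 * math.log10(n)            # step 1: suggested number of classes

size = 9                                 # assumed odd class size (illustrative choice)
start = (min(scores) // size) * size     # step 3: lowest limit is a multiple of the size

table = []
cf = 0
lower = start
while lower <= max(scores):
    upper = lower + size - 1
    f = sum(lower <= x <= upper for x in scores)     # steps 4-5: tally and count
    cf += f
    table.append({
        "class": (lower, upper),
        "f": f,
        "boundaries": (lower - 0.5, upper + 0.5),    # step 6: LCB and UCB
        "midpoint": (lower + upper) / 2,             # step 7
        "cf<": cf,                                   # step 8
        "cf>": n - cf + f,                           # step 8
        "rf%": round(100 * f / n, 1),                # step 9
    })
    lower += size

for row in table:
    print(row)
```

With this class size the intervals start at 27 and the midpoints come out as whole numbers, which is the advantage the hint in step 2 refers to.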
  • 31. Graphical Representation The data can be graphically presented according to their scale or level of measurements. 1. Pie chart or circle graph. The pie chart at the right is the enrollment from elementary to master’s degree of a certain university. The total population is 4350 students
  • 32. 2. Histogram or bar graph- this graphical representation can be used in nominal, ordinal or interval. For nominal bar graph, the bars are far apart rather than connected since the categories are not continuous. For ordinal and interval data, the bars should be joined to emphasize the degree of differences
  • 33. Given the bar graph of how students rate their library. A-strongly favorable, 90 B-favorable, 48 C-slightly favorable, 88 D-slightly unfavorable, 48 E-unfavorable, 15 F-strongly unfavorable, 25
  • 34. The Histogram of Person’s Age with Frequency of Travel. Age (freq, RF): 19-20 (20, 39.2%); 21-22 (21, 41.2%); 23-24 (4, 7.8%); 25-26 (4, 7.8%); 27-28 (2, 3.9%); Total (51, 100%)
  • 35. Exercises From the previous grouped data on algebra scores: a. Draw its histogram using the frequency on the y axis and the midpoints on the x axis. b. Draw the line graph or frequency polygon using the frequency on the y axis and the midpoints on the x axis. c. Draw the less than and greater than ogives of the data. An ogive is a graph of cumulative frequencies by class intervals. For the greater than ogive, let the y axis be cf> and the x axis be the LCB; for the less than ogive, let the y axis be cf< and the x axis be the UCB.
  • 36. Con’t d. Plot the relative frequency using the y axis as the relative frequency in percent value while in the x axis the midpoints.
  • 40. Assignment No. 2 Given the score in a statistics examinations, 38 56 35 70 44 81 44 80 45 72 45 50 51 51 52 66 54 53 56 84 58 56 57 70 56 39 56 59 72 63 89 63 69 65 61 62 64 64 69 60 53 66 66 67 67 68 68 69 66 67 70 59 40 71 73 60 73 73 73 73 73 74 73 73 79 74 74 70 73 46 74 74 74 75 75 76 55 77 78 73 48 81 44 84 77 88 63 85 73
  • 41. Construct the class interval, frequency table, class midpoint(use a whole number midpoint), less than and greater than cumulative frequency, upper and lower boundary and relative frequency. Plot the histogram, frequency polygon, and ogives
  • 42. 3. Draw the pie chart and bar graph of the plans of computer science students with respect to attending a seminar. Compute for the Relative frequency of each. A-will not attend=45 B-probably will not attend=30 C-probably will attend=40 D-will attend=25
  • 43. Measures of Centrality and Location Mean for Ungrouped Data: $X' = \dfrac{\sum X}{N}$, where X' = the mean, $\sum X$ = the sum of all scores/data, N = the total number of cases. Mean for Grouped Data: $X' = \dfrac{\sum fM}{N}$, where X' = the mean, M = the midpoint, fM = the product of the frequency and each midpoint, N = total number of cases.
  • 44. Ex. Find the mean of 10, 20, 25, 30, 30, 35, 40 and 50. Given the grades of 50 students in a statistics class: Class interval (f): 10-14 (4); 15-19 (3); 20-24 (12); 25-29 (10); 30-34 (6); 35-39 (6); 40-44 (6); 45-49 (3)
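Both mean formulas can be tried on the two examples above. A minimal sketch; the function names and the (lower limit, upper limit, frequency) layout of each class are assumptions made for illustration.

```python
def mean_ungrouped(values):
    """X' = sum of X / N."""
    return sum(values) / len(values)

def mean_grouped(classes):
    """X' = sum of fM / N, with classes given as (lower limit, upper limit, f)."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # fM, M = midpoint
    return total / n

grades = [(10, 14, 4), (15, 19, 3), (20, 24, 12), (25, 29, 10),
          (30, 34, 6), (35, 39, 6), (40, 44, 6), (45, 49, 3)]

print(mean_ungrouped([10, 20, 25, 30, 30, 35, 40, 50]))  # 30.0
print(mean_grouped(grades))                              # 28.8
```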
  • 45. The weighted mean. The weighted arithmetic mean of given groups of data is the average of the means of all groups, each counted according to its weight. $WX' = \dfrac{\sum Xw}{N}$, where WX' = the weighted mean, w = the weight of X, $\sum Xw$ = the sum of the products of each X and its weight, $N = \sum w$ = the sum of the weights.
  • 46. Ex. Find the weighted mean of the four group means below. Group i (X_i, W_i): 1 (60, 10); 2 (50, 20); 3 (70, 40); 4 (75, 50)
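A short sketch of the weighted mean for the four groups above, dividing the sum of the products Xw by the sum of the weights; the function name is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of X*w divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

group_means = [60, 50, 70, 75]
group_weights = [10, 20, 40, 50]

print(round(weighted_mean(group_means, group_weights), 2))  # 67.92
```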
  • 47. Median for Ungrouped Data The median of ungrouped data is the centermost score in a distribution arranged in order of magnitude. $Mdn = \dfrac{X_{N/2} + X_{(N+2)/2}}{2}$ if N is even; $Mdn = X_{(N+1)/2}$ if N is odd. Ex. Find the median of the following sets of scores: Score A: 12, 15, 19, 21, 6, 4, 2 Score B: 18, 22, 31, 12, 3, 9, 11, 8
  • 48. Median for Grouped Data Procedure: 1. Compute the less than cumulative frequency. 2. Find N/2. 3. Locate the class interval in which the N/2-th case falls (the median class), and determine the exact lower limit of this interval. 4. Apply the formula $Mdn = L + \dfrac{(N/2 - F)\,i}{f_m}$, where L = exact lower limit of the median class, F = the sum of all frequencies preceding the median class, $f_m$ = frequency of the median class, i = size of the class interval, N = total number of cases.
  • 49. Ex. Find the median of the given frequency table. Class interval (f, cf<): 25-29 (3, 3); 30-34 (5, 8); 35-39 (10, 18); 40-44 (15, 33); 45-49 (15, 48); 50-54 (15, 63); 55-59 (21, 84); 60-64 (8, 92); 65-69 (6, 98); 70-74 (2, 100)
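The grouped-median procedure can be sketched as follows. It assumes integer class limits and equal class sizes, so that the exact lower limit is the stated lower limit minus 0.5; the function name is an assumption made for illustration.

```python
def grouped_median(classes):
    """classes: (lower limit, upper limit, frequency) tuples in ascending order."""
    n = sum(f for _, _, f in classes)
    size = classes[0][1] - classes[0][0] + 1     # class size i
    cum = 0                                      # F, frequencies before the class
    for lo, hi, f in classes:
        if cum + f >= n / 2:                     # median class found
            exact_lower = lo - 0.5               # L
            return exact_lower + (n / 2 - cum) * size / f
        cum += f

table = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15), (45, 49, 15),
         (50, 54, 15), (55, 59, 21), (60, 64, 8), (65, 69, 6), (70, 74, 2)]

print(round(grouped_median(table), 2))  # 50.17
```

Here N/2 = 50 falls in the class 50-54, so L = 49.5, F = 48, f_m = 15 and i = 5.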
  • 50. Mode of Ungrouped Data It is defined as the data value or specific score which has the highest frequency. Find the mode of the following data. Data A : 10, 11, 13, 15, 17, 20 Data B: 2, 3, 4, 4, 5, 7, 8, 10 Data C: 3.5, 4.8, 5.5, 6.2, 6.2, 6.2, 7.3, 7.3, 7.3, 8.8
  • 51. Mode of Grouped Data For grouped data, the mode may be taken as the midpoint of the interval containing the largest number of cases (the modal class), or refined by interpolation: $Mdo = L + \dfrac{d_1}{d_1 + d_2}\,i$, where L = exact lower limit of the modal class, $d_1$ = the difference between the frequency of the modal class and the frequency of the interval preceding it, $d_2$ = the difference between the frequency of the modal class and the frequency of the interval after it, i = size of the class interval.
  • 52. Ex. Find the mode of the given frequency table. Class interval (f, cf<): 25-29 (3, 3); 30-34 (5, 8); 35-39 (10, 18); 40-44 (15, 33); 45-49 (15, 48); 50-54 (15, 63); 55-59 (21, 84); 60-64 (8, 92); 65-69 (6, 98); 70-74 (2, 100)
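A sketch of the interpolation formula for the grouped mode, again assuming integer class limits and equal class sizes. Treating a missing neighbouring class as frequency 0 and falling back to the midpoint when d1 + d2 = 0 are choices made here, not rules stated in the slides.

```python
def grouped_mode(classes):
    """classes: (lower limit, upper limit, frequency) tuples in ascending order."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))                       # first class with the highest frequency
    lo, hi, fm = classes[m]
    size = hi - lo + 1                                # class size i
    before = freqs[m - 1] if m > 0 else 0             # assumed 0 when there is no class before
    after = freqs[m + 1] if m < len(freqs) - 1 else 0 # assumed 0 when there is no class after
    d1, d2 = fm - before, fm - after
    if d1 + d2 == 0:
        return (lo + hi) / 2                          # fall back to the class midpoint
    return (lo - 0.5) + d1 / (d1 + d2) * size

table = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15), (45, 49, 15),
         (50, 54, 15), (55, 59, 21), (60, 64, 8), (65, 69, 6), (70, 74, 2)]

print(round(grouped_mode(table), 2))  # 56.08
```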
  • 53. Exercises 1. Determine the mean, median and mode of the age of 15 students in a certain class. 15, 18, 17, 16, 19, 18, 23, 24, 18, 16, 17, 20, 21, 19 2. To qualify for a scholarship, a student should have garnered an average score of 2.25. Determine if a certain student with the grades below is qualified for the scholarship.
  • 54. Subject (no. of units, grade): A (1, 2.0); B (2, 3.0); C (3, 1.5); D (3, 1.25); E (5, 2.0)
  • 55. Find the mean, median and mode of the given grouped data. Classes (f): 11-22 (2); 23-34 (8); 35-46 (11); 47-58 (19); 59-70 (14); 71-82 (5); 83-94 (1)
  • 56. Quartiles refer to the values that divide the distribution into four equal parts. There are 3 quartiles represented by Q 1 , Q 2 and Q 3 . The value Q 1 refers to the value in the distribution that falls on the first one fourth of the distribution arranged in magnitude. In the case of Q 2 or the second quartile, this value corresponds to the median. In the case of third quartile or Q 3 , this value corresponds to three fourths of the distribution.
  • 58. For grouped data, the computing formula of the kth quartile, where k = 1, 2, 3, is given by $Q_k = L + \dfrac{kn/4 - F}{f_m}\,i$, where L = lower class boundary of the kth quartile class, F = cumulative frequency before the kth quartile class, $f_m$ = frequency of the kth quartile class, i = size of the class interval, n = total number of cases.
  • 59. Exercises Compute the value of the first and third quartile of the given data. Class interval (f, cf<): 25-29 (3, 3); 30-34 (5, 8); 35-39 (10, 18); 40-44 (15, 33); 45-49 (15, 48); 50-54 (15, 63); 55-59 (21, 84); 60-64 (8, 92); 65-69 (6, 98); 70-74 (2, 100)
  • 60. Decile: If the given data is divided into ten equal parts, then we have nine points of division known as deciles, denoted by $D_1, D_2, \ldots, D_9$. $D_k = L + \dfrac{kn/10 - F}{f_m}\,i$, where k = 1, 2, 3, ..., 9.
  • 61. Exercises Compute the value of the third, fifth and seventh decile of the given data. Class interval (f, cf<): 25-29 (3, 3); 30-34 (5, 8); 35-39 (10, 18); 40-44 (15, 33); 45-49 (15, 48); 50-54 (15, 63); 55-59 (21, 84); 60-64 (8, 92); 65-69 (6, 98); 70-74 (2, 100)
  • 62. Percentiles refer to those values that divide a distribution into one hundred equal parts. There are 99 percentiles, represented by $P_1, P_2, \ldots, P_{99}$. When we say 55th percentile, we are referring to the value at or below which 55/100 of the data fall. $P_k = L + \dfrac{kn/100 - F}{f_m}\,i$, where k = 1, 2, 3, ..., 99.
  • 63. Exercises Compute the value of the 30th, 55th, 68th and 88th percentile of the given data. Class interval (f, cf<): 25-29 (3, 3); 30-34 (5, 8); 35-39 (10, 18); 40-44 (15, 33); 45-49 (15, 48); 50-54 (15, 63); 55-59 (21, 84); 60-64 (8, 92); 65-69 (6, 98); 70-74 (2, 100)
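The quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), so one function can cover all three. A sketch assuming integer class limits and equal class sizes; the name `grouped_fractile` and its parameters are hypothetical.

```python
def grouped_fractile(classes, k, parts):
    """k-th of `parts` division points: parts=4 quartile, 10 decile, 100 percentile."""
    n = sum(f for _, _, f in classes)
    size = classes[0][1] - classes[0][0] + 1   # class size i
    target = k * n / parts                     # kn/4, kn/10 or kn/100
    cum = 0                                    # F, cumulative frequency before the class
    for lo, hi, f in classes:
        if cum + f >= target:
            return (lo - 0.5) + (target - cum) / f * size
        cum += f

table = [(25, 29, 3), (30, 34, 5), (35, 39, 10), (40, 44, 15), (45, 49, 15),
         (50, 54, 15), (55, 59, 21), (60, 64, 8), (65, 69, 6), (70, 74, 2)]

print(round(grouped_fractile(table, 1, 4), 2))     # Q1  -> 41.83
print(round(grouped_fractile(table, 3, 4), 2))     # Q3  -> 57.36
print(round(grouped_fractile(table, 3, 10), 2))    # D3  -> 43.5
print(round(grouped_fractile(table, 88, 100), 2))  # P88 -> 62.0
```

The fifth decile and the 50th percentile computed this way coincide with the grouped median, as the definition of the second quartile suggests.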
  • 64. Assignment no. 3 The rate per hour in pesos of 12 employees of a certain company were taken and are shown below. 44.75, 44.75, 38.15, 39.25, 18.00, 15.75, 44.75, 39.25, 18.50, 65.25, 71.25, 77.50 Find the mean, median and mode. If the value 15.75 was incorrectly written as 45.75, what measure of central tendency will be affected? Support your answer.
  • 65. II. The final grades of a student in six subjects are tabulated below. Subject (units, final grade): Algebra (3, 60); Religion (2, 90); English (3, 75); Pilipino (3, 86); PE (1, 98); History (3, 70). a. Determine the weighted mean. b. If the subjects had an equal number of units, what would be his average?
  • 66. III. The ages of qualified voters in a certain barangay were taken and are shown below. Class interval (frequency): 18-23 (20); 24-29 (25); 30-35 (40); 36-41 (52); 42-47 (30); 48-53 (21); 54-59 (12); 60-65 (6); 66-71 (4); 72-77 (1)
  • 67. Find the mean, median and mode. Find the 1st and 3rd quartiles. Find the 4th and 6th deciles. Find the 25th and 75th percentiles.
  • 68. Measure of Variation The range is considered to be the simplest measure of variation. It is the difference between the highest and the lowest value in the distribution. R = H – L For grouped data, the range is the difference between the highest upper class boundary and the lowest lower class boundary. Example: Find the range of the given grouped data in slide no. 59.
  • 69. Semi-interquartile Range This value is obtained by getting one half of the difference between the third and the first quartile. $Q = (Q_3 - Q_1)/2$ Example: Find the semi-interquartile range of the previous example in slide no. 59.
  • 70. Average Deviation The average deviation refers to the arithmetic mean of the absolute deviations of the values from the mean of the distribution. This measure is sometimes known as the mean absolute deviation. $AD = \dfrac{\sum |x - x'|}{n}$, where x = the individual values, x' = mean of the distribution, n = number of values.
  • 71. Steps in solving for AD 1. Arrange the values in a column according to magnitude. 2. Compute the value of the mean x'. 3. Determine the deviations (x – x'). 4. Convert the deviations in step 3 into positive deviations. Use the absolute value sign. 5. Get the sum of the absolute deviations in step 4. 6. Divide the sum in step 5 by n.
  • 72. Example: Consider the following values: 16, 13, 9, 6, 15, 7, 11, 12 Find the average deviation.
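The steps for the average deviation translate directly into a few lines of Python. A minimal sketch for the values in this example; the function name is illustrative.

```python
def average_deviation(values):
    """AD = sum of |x - x'| / n, where x' is the mean."""
    n = len(values)
    mean = sum(values) / n                          # compute the mean x'
    total = sum(abs(x - mean) for x in values)      # sum of the absolute deviations
    return total / n                                # divide by n

print(average_deviation([16, 13, 9, 6, 15, 7, 11, 12]))  # 2.875
```

Sorting the values first, as the steps suggest, helps hand computation but does not change the result.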
  • 73. For grouped data: $AD = \dfrac{\sum f\,|x - x'|}{n}$, where f = frequency of each class, x = midpoint of each class, x' = mean of the distribution, n = total number of frequency.
  • 74. Example: Find the average deviation of the given data. Classes (f): 11-22 (2); 23-34 (8); 35-46 (11); 47-58 (19); 59-70 (14); 71-82 (5); 83-94 (1)
  • 75. Variance For ungrouped data: $s^2 = \dfrac{\sum (x - x')^2}{n}$ Example: Find the variance of 16, 13, 9, 6, 15, 7, 11, 12
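A sketch of the ungrouped variance exactly as written on the slide, that is, dividing by n rather than by n - 1 (the population form); the function name is illustrative.

```python
def variance_ungrouped(values):
    """s^2 = sum of (x - x')^2 / n, with x' the mean (divides by n, as on the slide)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

print(variance_ungrouped([16, 13, 9, 6, 15, 7, 11, 12]))  # 11.359375
```

Python's standard `statistics.pvariance` uses the same divisor, while `statistics.variance` divides by n - 1 and would give a larger value.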
  • 76. For grouped data: s² = Σ f(x – x’)² / n Where f = frequency of each class, x = midpoint of each class interval, x’ = mean of the distribution, n = total frequency
  • 77. Example: Find the variance of the given data. Classes (f): 11-22 (2); 23-34 (8); 35-46 (11); 47-58 (19); 59-70 (14); 71-82 (5); 83-94 (1)
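The grouped variance, and the standard deviation as its square root, might be computed as follows. Midpoints are again assumed to be the averages of the stated class limits, and the divisor is n as in the slide.

```python
import math

# (lower limit, upper limit, frequency) for each class in the example
classes = [(11, 22, 2), (23, 34, 8), (35, 46, 11), (47, 58, 19),
           (59, 70, 14), (71, 82, 5), (83, 94, 1)]

n = sum(f for _, _, f in classes)
mid_freq = [((lo + hi) / 2, f) for lo, hi, f in classes]

mean = sum(f * x for x, f in mid_freq) / n

# Grouped variance: s^2 = sum(f * (x - x')^2) / n
variance = sum(f * (x - mean) ** 2 for x, f in mid_freq) / n

# Standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(f"variance = {variance:.2f}, standard deviation = {std_dev:.3f}")
```

Under these assumptions the variance is 248.16 and the standard deviation is roughly 15.75.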
  • 78. Coefficient of variation If you wish to compare the variability between different sets of scores or data, the coefficient of variation is a very useful measure for interval scale data. CV = s / x’ Where s = standard deviation, x’ = the mean
  • 79. Example: In a particular university, a researcher wishes to compare the variation in the scores of urban students with that of rural students in their college entrance test. It is known that the urban students’ mean score is 384 with a standard deviation of 101, while among the rural students the mean is 174 with a standard deviation of 53. Which group shows more variation in scores?
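One way to answer the example is to compute the coefficient of variation (standard deviation divided by the mean) for each group and compare. The dictionary layout below is just a convenience for this sketch.

```python
# (mean score, standard deviation) for each group, from the example
groups = {"urban": (384, 101), "rural": (174, 53)}

# CV = s / mean for each group
cv = {name: s / mean for name, (mean, s) in groups.items()}

# The group with the larger CV shows more relative variation
more_variable = max(cv, key=cv.get)

for name, value in cv.items():
    print(f"{name}: CV = {value:.4f}")
print("more relative variation:", more_variable)
```

The rural group has the larger coefficient (about 0.305 against 0.263), so its scores vary more relative to their mean even though its standard deviation is smaller.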
  • 80. Standard Deviation s = √(s²) For ungrouped data: s = √[Σ(x – x’)² / n]. For grouped data: s = √[Σ f(x – x’)² / n]
  • 81. Find the standard deviation of the previous examples for ungrouped and grouped data. Find the standard deviation of the given data. Classes (f): 11-22 (2); 23-34 (8); 35-46 (11); 47-58 (19); 59-70 (14); 71-82 (5); 83-94 (1)
  • 82. Find the standard deviation of 16, 13, 9, 6, 15, 7, 11, 12
  • 83. Measure of variation for nominal data VR = 1 – fm/N Where VR = the variation ratio, fm = modal class frequency, N = total number of observations
  • 84. Example: With the data given by a clinical psychologist on the type of therapy used, compute the variation ratios. Type of therapy (no. of patients in 1980, no. of patients in 1985): Logotherapy (20, 8); Reality therapy (60, 105); Rational therapy (42, 6); Transactional analysis (39, 9); Family therapy (52, 5); Others (41, 8)
  • 85. Assignment no. 4 I. Compute the semi-interquartile range, absolute deviation, variance and standard deviation of test III of assignment no. 3. II. Compute the semi-interquartile range, absolute deviation, variance and standard deviation of test I of assignment no. 3.
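The variation ratio for each year can be sketched as below, reading the flattened table as two columns of patient counts (1980 and 1985) per therapy type. That reading of the table is an assumption made for this illustration.

```python
# Patients per therapy type: (count in 1980, count in 1985)
therapy = {
    "Logotherapy": (20, 8),
    "Reality therapy": (60, 105),
    "Rational therapy": (42, 6),
    "Transactional analysis": (39, 9),
    "Family therapy": (52, 5),
    "Others": (41, 8),
}

def variation_ratio(counts):
    """VR = 1 - fm / N, where fm is the modal frequency and N the total."""
    return 1 - max(counts) / sum(counts)

vr_1980 = variation_ratio([c[0] for c in therapy.values()])
vr_1985 = variation_ratio([c[1] for c in therapy.values()])

print(f"VR 1980 = {vr_1980:.4f}, VR 1985 = {vr_1985:.4f}")
```

The ratio drops from about 0.76 in 1980 to about 0.26 in 1985, which reflects how strongly the 1985 cases concentrate in the modal category (reality therapy).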
  • 86. SIMPLE LINEAR REGRESSION AND MEASURES OF CORRELATION In this topic, you will learn how to predict the value of one dependent variable from the corresponding given value of the independent variable.
  • 87. The scatter diagram: In solving problems that concern estimation and forecasting, a scatter diagram can be used as a graphical approach. This technique consists of plotting the points corresponding to the paired scores of the independent and dependent variables, commonly represented by X and Y, on the X-Y coordinate system.
  • 88. Example: The working experience and income of 8 employees are given below. Employee (years of experience X, income in thousands Y): A (2, 8); B (8, 10); C (4, 11); D (11, 15); E (5, 9); F (13, 17); G (4, 8); H (15, 14)
  • 89. Using the Least Squares Linear Regression Equation: Y = a + bX Where b = [nΣxy – ΣxΣy] / [nΣx² – (Σx)²] and a = y’ – bx’. Obtain the regression equation for the given data and estimate the income of an employee with 20 years of experience.
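A minimal sketch of the least squares computation for the eight employees, using the slide's formulas for b and a directly. The data pairs are read from the example as (years of experience, income in thousands).

```python
# (years of experience X, income in thousands Y) for employees A to H
data = [(2, 8), (8, 10), (4, 11), (11, 15), (5, 9), (13, 17), (4, 8), (15, 14)]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)
sum_x2 = sum(x * x for x, _ in data)

# Slope and intercept from the least squares formulas
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Estimated income (in thousands) for 20 years of experience
estimate_20 = a + b * 20

print(f"Y = {a:.4f} + {b:.4f} X")
print(f"estimate at X = 20: {estimate_20:.2f}")
```

This gives approximately Y = 6.64 + 0.627X and an estimated income of about 19.18 thousand. Note that 20 years lies outside the observed range of 2 to 15 years, so the estimate is an extrapolation.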
  • 90. Standard Error of Estimate Se = √{[ΣY² – aΣY – bΣXY] / (n – 2)} The standard error of estimate is interpreted like a standard deviation of the observed Y values about the regression line. For a given value of X, the observed Y will almost always fall between the upper and lower 3Se limits around the predicted value.
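For the employee data of the earlier example, the standard error of estimate could be computed as sketched here. The formula is taken in its usual computational form, Se = √[(ΣY² – aΣY – bΣXY) / (n – 2)], which is an assumption about what the flattened slide formula intends.

```python
import math

# (years of experience X, income in thousands Y) for employees A to H
data = [(2, 8), (8, 10), (4, 11), (11, 15), (5, 9), (13, 17), (4, 8), (15, 14)]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)
sum_x2 = sum(x * x for x, _ in data)
sum_y2 = sum(y * y for _, y in data)

# Least squares slope and intercept
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Standard error of estimate with n - 2 degrees of freedom
se = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

print(f"Se = {se:.4f}")
```

The result is roughly 1.79 (in thousands).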
  • 91. Measures of Correlation The degree of relationship between variables is classified as: perfect correlation (positive or negative); some degree of correlation (positive or negative); no correlation
  • 92. A perfect correlation is either positive or negative, represented by +1 and -1. Some degree of correlation, positive or negative, is represented by +0.01 to +0.99 and -0.01 to -0.99. No correlation is represented by 0.
  • 93. 0 to +0.25: very small positive correlation; +0.26 to +0.50: moderately small positive correlation; +0.51 to +0.75: high positive correlation; +0.76 to +0.99: very high positive correlation; +1.00: perfect positive correlation. 0 to -0.25: very small negative correlation; -0.26 to -0.50: moderately small negative correlation; -0.51 to -0.75: high negative correlation; -0.76 to -0.99: very high negative correlation; -1.00: perfect negative correlation
  • 94. Anybody who wants to interpret the coefficient of correlation should be guided by the following reminders: 1. A relationship between two variables does not necessarily mean that one is the cause of the other; correlation does not imply a cause-effect relationship. 2. When the computed Pearson r is high, it does not necessarily mean that one factor is strongly dependent on the other. Likewise, when the computed Pearson r is small, it does not necessarily mean that one factor has no dependence on the other. 3. If there is reason to believe that the two variables are related and the computed Pearson r is high, the two variables can be regarded as associated. On the other hand, if the correlation is low, other factors might be responsible for the small association. 4. Lastly, the correlation coefficient simply tells us that when two variables change, there may be a strong or weak relationship between them.
  • 95. The formula for finding the Pearson r is r = [nΣXY – ΣXΣY] / √{[nΣX² – (ΣX)²] [nΣY² – (ΣY)²]}
  • 96. Example: Given two sets of scores, find the Pearson r and interpret the result. (X, Y): (18, 10); (16, 14); (14, 14); (13, 12); (12, 10); (10, 8); (10, 5); (8, 6); (6, 12); (3, 0)
  • 97. Correlation between Ordinal Data This is the Spearman Rank-Order Correlation Coefficient (Spearman rho). For cases of 30 or less, Spearman ρ is the most widely used of the rank correlation methods. ρ = 1 – 6ΣD² / [n(n² – 1)] Where D = RX – RY
  • 98. Example: Individual (Test X, Test Y): 1 (18, 24); 2 (17, 28); 3 (14, 30); 4 (13, 26); 5 (12, 22); 6 (10, 18); 7 (8, 15); 8 (8, 12)
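The Pearson r for the ten score pairs can be sketched with the raw-sum formula, r = [nΣXY – ΣXΣY] / √{[nΣX² – (ΣX)²][nΣY² – (ΣY)²]}. The pairing of the scores follows the order in which they appear in the example.

```python
import math

# Paired scores (X, Y) from the example
pairs = [(18, 10), (16, 14), (14, 14), (13, 12), (12, 10),
         (10, 8), (10, 5), (8, 6), (6, 12), (3, 0)]

n = len(pairs)
sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_xy = sum(x * y for x, y in pairs)
sum_x2 = sum(x * x for x, _ in pairs)
sum_y2 = sum(y * y for _, y in pairs)

# Pearson product-moment correlation from raw sums
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"r = {r:.4f}")
```

The value is about 0.685, which falls in the "high positive correlation" band (+0.51 to +0.75) of the interpretation scale on slide 93.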
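A sketch of the Spearman rho computation for the eight individuals. Two choices here are assumptions not stated in the slides: rank 1 goes to the highest score, and tied scores (the two 8s in Test X) share the average of the ranks they occupy. With ties, the simple 6ΣD² formula is only an approximation.

```python
def average_ranks(scores):
    """Rank scores from highest (rank 1) down; ties get the mean rank."""
    ordered = sorted(scores, reverse=True)
    ranks = []
    for s in scores:
        first = ordered.index(s) + 1     # first position held by this score
        count = ordered.count(s)         # number of tied scores
        ranks.append(first + (count - 1) / 2)
    return ranks

test_x = [18, 17, 14, 13, 12, 10, 8, 8]
test_y = [24, 28, 30, 26, 22, 18, 15, 12]

rx = average_ranks(test_x)
ry = average_ranks(test_y)

n = len(test_x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of D^2, D = RX - RY

rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"sum of D^2 = {sum_d2}, rho = {rho:.4f}")
```

Here ΣD² is 14.5 and rho is about 0.83.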
  • 99. Gamma Rank Order An alternative to the rank order correlation is Goodman and Kruskal’s Gamma (G). The value of one variable can be estimated or predicted from the other variable when you have knowledge of their values. Gamma can also be used when ties are found in the ranking of the data.
  • 100. G = (N_S – N_1) / (N_S + N_1) Where N_S = the number of pairs ordered in the parallel direction, N_1 = the number of pairs ordered in the opposite direction
  • 101. Given a segment of the Filipino electorate according to religion and political party (columns: LAKAS, LP, NP; totals not filled in): Catholic 50, 25, 20; INC 34, 72, 21; Born Again 22, 12, 10
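The gamma formula can be illustrated on this table. The sketch assumes the flattened slide reads as rows Catholic, INC, Born Again and columns LAKAS, LP, NP, with the cell values shown in the code; that reading is a reconstruction, not something the transcript states outright. A pair of observations counts as "parallel" when one is both lower in the row order and further right in the column order than the other, and as "opposite" when it is lower but further left.

```python
# Assumed reading of the table: rows = Catholic, INC, Born Again;
# columns = LAKAS, LP, NP
table = [
    [50, 25, 20],
    [34, 72, 21],
    [22, 12, 10],
]

def count_pairs(table):
    """Return (parallel pairs, opposite pairs) for an ordered two-way table."""
    rows, cols = len(table), len(table[0])
    parallel = opposite = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):          # only rows below row i
                for m in range(cols):
                    if m > j:                     # below and to the right
                        parallel += table[i][j] * table[k][m]
                    elif m < j:                   # below and to the left
                        opposite += table[i][j] * table[k][m]
    return parallel, opposite

n_s, n_1 = count_pairs(table)
gamma = (n_s - n_1) / (n_s + n_1)

print(f"N_S = {n_s}, N_1 = {n_1}, G = {gamma:.4f}")
```

Under this reading G is about 0.10, a very small positive association.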
  • 102. Correlation between Nominal Data Guttman’s coefficient of predictability is a proportionate-reduction-in-error measure: an index of how much the error is reduced in predicting values of one variable from the values of another. λc = (ΣFBR – MBC) / (N – MBC) Where FBR = the biggest cell frequency in each row, MBC = the biggest of the column totals, N = total number of observations
  • 103. λr = (ΣFBC – MBR) / (N – MBR) Where FBC = the biggest cell frequency in each column, MBR = the biggest of the row totals, N = total number of observations. Compute λc and λr for the segment of the Filipino electorate and political parties.
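The two lambda coefficients might be computed as below for the electorate table. As with any use of that flattened table, the layout (rows Catholic, INC, Born Again; columns LAKAS, LP, NP) and the cell values are an assumed reconstruction.

```python
# Assumed reading of the table: rows = Catholic, INC, Born Again;
# columns = LAKAS, LP, NP
table = [
    [50, 25, 20],
    [34, 72, 21],
    [22, 12, 10],
]

columns = list(zip(*table))                 # transpose to get the columns

n = sum(sum(row) for row in table)          # total observations
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in columns]

sum_fbr = sum(max(row) for row in table)    # biggest cell in each row, summed
sum_fbc = sum(max(col) for col in columns)  # biggest cell in each column, summed
mbc = max(col_totals)                       # biggest column total
mbr = max(row_totals)                       # biggest row total

lambda_c = (sum_fbr - mbc) / (n - mbc)
lambda_r = (sum_fbc - mbr) / (n - mbr)

print(f"lambda_c = {lambda_c:.4f}, lambda_r = {lambda_r:.4f}")
```

With this reading, λc is about 0.22 and λr about 0.12.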
  • 104. Assignment no. 5 1. Given the average yearly cost and sales of company A for a period of 8 years, find the Pearson r and interpret the results. (Cost per P10,000, Sales per P10,000) by year: (15, 38); (30, 53.3); (16, 60); (39, 72); (20, 40); (36, 47.5); (45, 82); (10, 21.5)
  • 105. 2. Given the grades of 10 students in statistics, determine the Spearman rho and interpret the result. Student (Q1, Q2): A (62, 57); B (90, 88); C (75, 90); D (60, 67); E (58, 60); F (89, 79); G (91, 78); H (90, 62); I (94, 86); J (50, 55)
  • 106. 3. Compute the gamma for the table shown and interpret the result. Rows: socio-economic status; columns: educational status (Upper, Middle, Lower; totals not filled in). Upper: 24, 19, 5; Middle: 12, 54, 29; Lower: 9, 26, 25
  • 107. 4. Compute λc and λr for problem no. 3.
  • 108. Counting Techniques Consider the numbers 1, 2, 3 and 4. Suppose you want to determine the total number of 2-digit numbers that can be formed from them. First, let us assume that no digit is to be repeated: 12 13 14 21 23 24 31 32 34 41 42 43. Notice that we were able to list all the possibilities. In this example, we have 12 possible 2-digit numbers.
  • 109. Now, what if the digits can be repeated? 11 12 13 14 21 22 23 24 31 32 33 34 41 42 43 44. Hence, we have 16 possible outcomes. In general, if the first activity can be done in n1 ways and, after it has been done, the second activity can be done in n2 ways, then the total number of ways in which the two activities can be done is n1 × n2.
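The two counts can be checked by brute-force enumeration. The sketch below uses `itertools.permutations` for the no-repetition case and `itertools.product` for the case with repetition, then compares each count with the multiplication rule.

```python
from itertools import permutations, product

digits = [1, 2, 3, 4]

# Two-digit numbers with no repeated digit: ordered pairs of distinct digits
no_repeat = [10 * a + b for a, b in permutations(digits, 2)]

# Two-digit numbers where a digit may be used twice
with_repeat = [10 * a + b for a, b in product(digits, repeat=2)]

print(len(no_repeat), no_repeat)       # 12 numbers
print(len(with_repeat), with_repeat)   # 16 numbers

# Multiplication rule: n1 * n2 choices
assert len(no_repeat) == 4 * 3
assert len(with_repeat) == 4 * 4
```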
  • 110. Example: 1. How many two-digit numbers can be formed from the numbers 1, 2, 3 and 4 if (a) repetition is not allowed? (b) repetition is allowed? 2. How many three-digit numbers can be formed from the digits 1, 2, 3, 4 and 5 if any of the digits can be repeated? 3. The club members are going to elect their officers. If there are 5 candidates for president, 5 candidates for vice president and 3 for secretary, in how many ways can the officers be elected?
  • 111. 4. An office executive plans to buy a laptop, for which there are 5 brands available. Each of the brands has 3 models and each model has 5 colors to choose from. In how many ways can the executive choose? 5. Consider the numbers 2, 3, 5 and 7. If repetition is not allowed, how many three-digit numbers can be formed such that (a) they are all odd? (b) they are all even? (c) they are greater than 500?
  • 112. 6. A pizza place offers 3 choices of salad, 20 kinds of pizza and 4 different desserts. How many different 3-course meals can one order? 7. The executive board of a certain company consists of 5 males and 2 females. In how many ways can the president and secretary be chosen if (a) the president must be female and the secretary must be male? (b) the president and the secretary are of opposite sex? (c) the president and the secretary should both be male?
  • 113. Permutation The term permutation refers to the arrangement of objects with reference to order. P(n,r) = n! / (n – r)! Evaluate: (a) P(10,6) (b) P(5,5) (c) P(4,3) + P(4,4)
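The three expressions can be evaluated with a small helper built from the definition P(n, r) = n! / (n – r)!. The helper name `P` is chosen here only to mirror the slide's notation.

```python
from math import factorial

def P(n, r):
    """Number of permutations of n objects taken r at a time."""
    return factorial(n) // factorial(n - r)

a = P(10, 6)
b = P(5, 5)
c = P(4, 3) + P(4, 4)

print(a, b, c)   # 151200 120 48
```

In Python 3.8 and later, `math.perm(n, r)` returns the same values.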
  • 114. Examples: 1. In how many ways can a president, a vice president, a secretary and a treasurer be elected from a class with 40 students? 2. In how many ways can 7 individuals be seated in a row of 7 chairs? 3. In how many ways can 9 individuals be seated in a row of 9 chairs if two individuals want to be seated side by side?
  • 115. 4. Suppose 5 different math books and 7 different physics books are to be arranged on a shelf. In how many ways can the books be arranged if books of the same subject must be placed side by side? 5. Determine the number of possible permutations of the word MISSISSIPPI. 6. Find the total number of 8-digit numbers that can be formed using all the digits of the numeral 55777115.
  • 116. 7. In how many ways can 6 persons be seated around a table with 6 chairs if two individuals want to be seated side by side? 8. In a local election, there are 7 people running for 3 positions. In how many ways can this be done?
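For arrangements with repeated symbols, such as MISSISSIPPI and 55777115, the standard count is n! divided by the factorial of each symbol's multiplicity. That formula does not appear on the slides, so the sketch below should be read as one common way to solve these two items, not necessarily the method the course intends.

```python
from collections import Counter
from math import factorial

def distinct_permutations(symbols):
    """n! / (n1! * n2! * ...), where ni is the count of each repeated symbol."""
    total = factorial(len(symbols))
    for count in Counter(symbols).values():
        total //= factorial(count)
    return total

word_count = distinct_permutations("MISSISSIPPI")   # M:1, I:4, S:4, P:2
digit_count = distinct_permutations("55777115")     # 5:3, 7:3, 1:2

print(word_count, digit_count)   # 34650 560
```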
  • 117. Combination A combination is a selection of objects without regard to order. nCr = C(n,r) = n! / [r!(n – r)!] Evaluate: (a) 8C4 (b) 5(5C4 – 5C2) (c) 7C5 / (7C6 – 7C2)
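The three expressions can be evaluated with `math.comb`, which computes n! / [r!(n – r)!] (available in Python 3.8 and later). The expressions are read here as 8C4, 5(5C4 – 5C2) and 7C5 / (7C6 – 7C2).

```python
from math import comb

a = comb(8, 4)
b = 5 * (comb(5, 4) - comb(5, 2))
c = comb(7, 5) / (comb(7, 6) - comb(7, 2))

print(a, b, c)   # 70 -25 -1.5
```

The second and third results are negative because 5C2 exceeds 5C4 and 7C2 exceeds 7C6.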
  • 118. A class consists of 12 boys and 10 girls. In how many ways can the class elect a president, a vice president, a secretary and a treasurer? In how many ways can the class elect 4 members of a certain committee? In how many ways can a student answer 6 out of 10 questions? In how many ways can a student answer 6 out of 10 questions if he is required to answer 2 of the first 5 questions?
  • 119. In how many ways can 3 balls be drawn from a box containing 8 red and 6 green balls? A box contains 8 red and 6 green balls. In how many ways can 3 balls be drawn such that (a) they are all green? (b) 2 are red and 1 is green? (c) 1 is red and 2 are green?
  • 120. A shipment of 40 computers is unloaded from the van and tested; 6 of them are defective. In how many ways can we select a set of 5 computers and get at least one defective? Five letters a, b, c, d, e are to be chosen. In how many ways could you choose (a) none of them? (b) at least two of them? (c) at most three of them?
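For the "at least one defective" question, one common approach is to count all possible sets of 5 and subtract the sets that contain no defective computer. The sketch below follows that complement approach and cross-checks it by summing over the exact number of defectives; both are offered as one possible solution path.

```python
from math import comb

total_computers = 40
defective = 6
good = total_computers - defective     # 34 non-defective computers
chosen = 5

# Complement rule: all selections minus selections with no defective unit
all_sets = comb(total_computers, chosen)
no_defective = comb(good, chosen)
at_least_one = all_sets - no_defective

# Cross-check: add up selections with exactly k defectives, k = 1..5
direct = sum(comb(defective, k) * comb(good, chosen - k)
             for k in range(1, chosen + 1))

print(all_sets, no_defective, at_least_one)   # 658008 278256 379752
```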
  • 121. Assignment no. 6 1. How many possible outcomes are there if (a) a die is rolled? (b) a pair of dice is rolled? 2. In how many ways can 5 math teachers be assigned to 4 available subjects if each of the 5 teachers has an equal chance of being assigned to any of the 4 subjects?
  • 122. 3. Consider the numbers 1, 2, 3, 5 and 6. How many 3-digit numbers can be formed from these numbers if (a) repetition is not allowed and 0 should not be in the first digit? (b) repetition is allowed and 0 should not be in the first digit? 4. A college has 3 entrance gates and 2 exit gates. In how many ways can a student enter then leave the building?
  • 123. 5. In how many ways can 9 passengers be seated in a bus if there are only 5 seats available? 6. In how many ways can 4 boys and 4 girls be seated in a row of 8 chairs if (a) they can sit anywhere? (b) the boys and girls are to be seated alternately? 7. In how many ways can ten participants in a race place first, second and third?
  • 124. 8. Determine the number of distinct permutations of each of the following: (a) STATISTICS (b) ADRENALIN (c) 44044999404 9. A class consists of 12 boys and 10 girls. In how many ways can a committee of five be formed if (a) all members are boys? (b) 2 are boys and 3 are girls?
  • 125. 10. In how many ways can a student answer an exam if, out of the 6 problems, he is required to answer only 4?