
Inferential Statistics – Types, Methods and Examples


Inferential Statistics

Inferential statistics is a branch of statistics that involves making predictions or inferences about a population based on a sample of data taken from that population. It is used to test hypotheses and to assess how likely it is that patterns observed in a sample hold for the population as a whole.

The basic steps of inferential statistics typically involve the following:

  • Define a Hypothesis: This is often a statement about a parameter of a population, such as the population mean or population proportion.
  • Select a Sample: In order to test the hypothesis, you’ll select a sample from the population. This should be done randomly and should be representative of the larger population in order to avoid bias.
  • Collect Data: Once you have your sample, you’ll need to collect data. This data will be used to calculate statistics that will help you test your hypothesis.
  • Perform Analysis: The collected data is then analyzed using statistical tests such as the t-test, chi-square test, or ANOVA, to name a few. These tests help to determine the likelihood that the results of your analysis occurred by chance.
  • Interpret Results: The analysis can provide a probability, called a p-value, which represents the likelihood that the results occurred by chance. If this probability is below a certain level (commonly 0.05), you may reject the null hypothesis (the statement that there is no effect or relationship) in favor of the alternative hypothesis (the statement that there is an effect or relationship).
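
To make these steps concrete, here is a minimal sketch in Python (assuming the numpy and scipy libraries are available; the sample values are invented purely for illustration):

    import numpy as np
    from scipy import stats

    # Steps 1-3: hypothesis and sample (hypothetical data).
    # H0: the population mean equals 50; H1: it does not.
    sample = np.array([52.1, 48.3, 55.0, 51.2, 49.8, 53.4, 50.9, 54.2])

    # Step 4: analyze the sample with a one-sample t-test.
    t_stat, p_value = stats.ttest_1samp(sample, popmean=50)

    # Step 5: interpret the p-value at the 0.05 significance level.
    if p_value < 0.05:
        print(f"p = {p_value:.3f}: reject the null hypothesis")
    else:
        print(f"p = {p_value:.3f}: fail to reject the null hypothesis")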

Inferential Statistics Types

Inferential statistics can be broadly categorized into two types: parametric and nonparametric. The selection of type depends on the nature of the data and the purpose of the analysis.

Parametric Inferential Statistics

These are statistical methods that assume the data come from a known type of probability distribution and make inferences about the parameters of that distribution. Common parametric methods include:

  • T-tests: Used when comparing the means of two groups to see if they’re significantly different.
  • Analysis of Variance (ANOVA): Used to compare the means of more than two groups.
  • Regression Analysis: Used to predict the value of one variable (dependent) based on the value of another variable (independent).
  • Chi-square test for independence: Used to test if there is a significant association between two categorical variables (strictly speaking a distribution-free test, often classified as nonparametric, but commonly listed alongside these methods).
  • Pearson’s correlation: Used to test if there is a significant linear relationship between two continuous variables.
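
As a rough illustration (not part of the original text), these parametric tests are all available in the scipy library; the data below are simulated purely for demonstration:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    group_a = rng.normal(10.0, 2.0, 30)    # simulated, roughly normal data
    group_b = rng.normal(11.0, 2.0, 30)
    group_c = rng.normal(12.0, 2.0, 30)

    t_stat, p_t = stats.ttest_ind(group_a, group_b)          # t-test: compare two means
    f_stat, p_f = stats.f_oneway(group_a, group_b, group_c)  # ANOVA: compare three means
    reg = stats.linregress(group_a, group_b)                  # simple linear regression
    r, p_r = stats.pearsonr(group_a, group_b)                 # Pearson correlation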

Nonparametric Inferential Statistics

These are methods used when the data does not meet the requirements necessary to use parametric statistics, such as when data is not normally distributed. Common nonparametric methods include:

  • Mann-Whitney U Test: Non-parametric equivalent to the independent samples t-test.
  • Wilcoxon Signed-Rank Test: Non-parametric equivalent to the paired samples t-test.
  • Kruskal-Wallis Test: Non-parametric equivalent to the one-way ANOVA.
  • Spearman’s rank correlation: Non-parametric equivalent to the Pearson correlation.
  • Chi-square test for goodness of fit: Used to test if the observed frequencies for a categorical variable match the expected frequencies.
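
A similar sketch for the nonparametric tests, again assuming scipy and using small made-up samples:

    from scipy import stats

    # Hypothetical ordinal ratings from three groups of respondents
    x = [3, 5, 4, 2, 6, 5, 7, 4]
    y = [6, 7, 5, 8, 6, 9, 7, 8]
    z = [4, 4, 5, 3, 6, 5, 4, 5]

    u_stat, p = stats.mannwhitneyu(x, y)    # independent samples (Mann-Whitney U)
    w_stat, p = stats.wilcoxon(x, y)        # paired samples (Wilcoxon signed-rank)
    h_stat, p = stats.kruskal(x, y, z)      # three or more groups (Kruskal-Wallis)
    rho, p = stats.spearmanr(x, y)          # Spearman rank correlation
    chi2, p = stats.chisquare([18, 22, 20, 40], f_exp=[25, 25, 25, 25])  # goodness of fit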

Inferential Statistics Formulas

Inferential statistics use various formulas and statistical tests to draw conclusions or make predictions about a population based on a sample from that population. Here are a few key formulas commonly used:

Confidence Interval for a Mean:

When you have a sample and want to make an inference about the population mean (µ), you might use a confidence interval.

The formula for a confidence interval around a mean is:

[Sample Mean] ± [Z-score or T-score] * (Standard Deviation / sqrt[n]) where:

  • Sample Mean is the mean of your sample data
  • Z-score or T-score is the value from the Z or T distribution corresponding to the desired confidence level (Z is used when the population standard deviation is known or the sample size is large, otherwise T is used)
  • Standard Deviation is the standard deviation of the sample
  • sqrt[n] is the square root of the sample size
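
For example, a 95% confidence interval for a small hypothetical sample can be computed directly from this formula (using the T-score, since the population standard deviation is unknown; scipy is assumed):

    import numpy as np
    from scipy import stats

    sample = np.array([4.1, 5.0, 4.7, 5.3, 4.9, 5.1, 4.6, 4.8, 5.2, 4.5])  # made-up data
    n = len(sample)
    mean = sample.mean()
    sd = sample.std(ddof=1)                  # sample standard deviation
    t_score = stats.t.ppf(0.975, df=n - 1)   # two-sided 95% confidence level
    margin = t_score * sd / np.sqrt(n)
    print(f"95% CI: {mean:.2f} ± {margin:.2f}")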

Hypothesis Testing:

Hypothesis testing often involves calculating a test statistic, which is then compared to a critical value to decide whether to reject the null hypothesis.

A common test statistic for a test about a mean is the Z-score:

Z = (Sample Mean - Hypothesized Population Mean) / (Standard Deviation / sqrt[n])

where all variables are as defined above.
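
Plugging hypothetical numbers into this formula shows how the Z-score and its p-value are obtained (the values are invented; scipy is used only to look up the normal distribution):

    import math
    from scipy import stats

    sample_mean = 103.0          # hypothetical sample mean
    hypothesized_mean = 100.0    # value stated in the null hypothesis
    sd = 15.0                    # sample standard deviation
    n = 36                       # sample size

    z = (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))
    p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value from the normal distribution
    print(f"Z = {z:.2f}, p = {p_value:.3f}")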

Chi-Square Test:

The Chi-Square Test is used when dealing with categorical data.

The formula is:

χ² = Σ [ (Observed-Expected)² / Expected ]

  • Observed is the actual observed frequency
  • Expected is the frequency we would expect if the null hypothesis were true
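
The same calculation can be written out directly and checked against scipy's built-in test; the observed and expected counts below are invented:

    from scipy import stats

    observed = [48, 35, 17]   # hypothetical observed frequencies
    expected = [40, 40, 20]   # frequencies expected if the null hypothesis were true

    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # the formula above
    chi_sq_check, p_value = stats.chisquare(observed, f_exp=expected)
    print(chi_sq, chi_sq_check, p_value)   # the hand calculation and scipy agree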

T-test:

The t-test is used to compare the means of two groups. The formula for the independent samples t-test is:

t = (mean1 - mean2) / sqrt [ (sd1²/n1) + (sd2²/n2) ] where:

  • mean1 and mean2 are the sample means
  • sd1 and sd2 are the sample standard deviations
  • n1 and n2 are the sample sizes
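
Computing this statistic by hand and comparing it with scipy's unequal-variance (Welch) t-test, which uses the same denominator, gives a quick check; the measurements below are invented:

    import math
    from scipy import stats

    group1 = [23.1, 25.4, 22.8, 26.0, 24.3, 25.1]   # hypothetical measurements
    group2 = [27.2, 26.8, 28.1, 25.9, 27.5, 28.3]

    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    sd1 = math.sqrt(sum((x - mean1) ** 2 for x in group1) / (n1 - 1))
    sd2 = math.sqrt(sum((x - mean2) ** 2 for x in group2) / (n2 - 1))
    t = (mean1 - mean2) / math.sqrt(sd1**2 / n1 + sd2**2 / n2)

    t_check, p_value = stats.ttest_ind(group1, group2, equal_var=False)
    print(t, t_check, p_value)   # the two t values match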

Inferential Statistics Examples

Inferential statistics are used when making predictions or inferences about a population from a sample of data. Here are a few real-world examples:

  • Medical Research: Suppose a pharmaceutical company is developing a new drug and they’re currently in the testing phase. They gather a sample of 1,000 volunteers to participate in a clinical trial. They find that 700 out of these 1,000 volunteers reported a significant reduction in their symptoms after taking the drug. Using inferential statistics, they can infer that the drug would likely be effective for the larger population.
  • Customer Satisfaction: Suppose a restaurant wants to know if its customers are satisfied with their food. They could survey a sample of their customers and ask them to rate their satisfaction on a scale of 1 to 10. If the average rating was 8.5 from a sample of 200 customers, they could use inferential statistics to infer that the overall customer population is likely satisfied with the food.
  • Political Polling: A polling company wants to predict who will win an upcoming presidential election. They poll a sample of 10,000 eligible voters and find that 55% prefer Candidate A, while 45% prefer Candidate B. Using inferential statistics, they infer that Candidate A has a higher likelihood of winning the election.
  • E-commerce Trends: An e-commerce company wants to improve its recommendation engine. They analyze a sample of customers’ purchase history and notice a trend that customers who buy kitchen appliances also frequently buy cookbooks. They use inferential statistics to infer that recommending cookbooks to customers who buy kitchen appliances would likely increase sales.
  • Public Health: A health department wants to assess the impact of a health awareness campaign on smoking rates. They survey a sample of residents before and after the campaign. If they find a significant reduction in smoking rates among the surveyed group, they can use inferential statistics to infer that the campaign likely had an impact on the larger population’s smoking habits.

Applications of Inferential Statistics

Inferential statistics are extensively used in various fields and industries to make decisions or predictions based on data. Here are some applications of inferential statistics:

  • Healthcare: Inferential statistics are used in clinical trials to analyze the effect of a treatment or a drug on a sample population and then infer the likely effect on the general population. This helps in the development and approval of new treatments and drugs.
  • Business: Companies use inferential statistics to understand customer behavior and preferences, market trends, and to make strategic decisions. For example, a business might sample customer satisfaction levels to infer the overall satisfaction of their customer base.
  • Finance: Banks and financial institutions use inferential statistics to evaluate the risk associated with loans and investments. For example, inferential statistics can help in determining the risk of default by a borrower based on the analysis of a sample of previous borrowers with similar credit characteristics.
  • Quality Control: In manufacturing, inferential statistics can be used to maintain quality standards. By analyzing a sample of the products, companies can infer the quality of all products and decide whether the manufacturing process needs adjustments.
  • Social Sciences: In fields like psychology, sociology, and education, researchers use inferential statistics to draw conclusions about populations based on studies conducted on samples. For instance, a psychologist might use a survey of a sample of people to infer the prevalence of a particular psychological trait or disorder in a larger population.
  • Environment Studies: Inferential statistics are also used to study and predict environmental changes and their impact. For instance, researchers might measure pollution levels in a sample of locations to infer overall pollution levels in a wider area.
  • Government Policies: Governments use inferential statistics in policy-making. By analyzing sample data, they can infer the potential impacts of policies on the broader population and thus make informed decisions.

Purpose of Inferential Statistics

The purposes of inferential statistics include:

  • Estimation of Population Parameters: Inferential statistics allows for the estimation of population parameters. This means that it can provide estimates about population characteristics based on sample data. For example, you might want to estimate the average weight of all men in a country by sampling a smaller group of men.
  • Hypothesis Testing: Inferential statistics provides a framework for testing hypotheses. This involves making an assumption (the null hypothesis) and then testing this assumption to see if it should be rejected or not. This process enables researchers to draw conclusions about population parameters based on their sample data.
  • Prediction: Inferential statistics can be used to make predictions about future outcomes. For instance, a researcher might use inferential statistics to predict the outcomes of an election or forecast sales for a company based on past data.
  • Relationships Between Variables: Inferential statistics can also be used to identify relationships between variables, such as correlation or regression analysis. This can provide insights into how different factors are related to each other.
  • Generalization: Inferential statistics allows researchers to generalize their findings from the sample to the larger population. It helps in making broad conclusions, given that the sample is representative of the population.
  • Variability and Uncertainty: Inferential statistics also deal with the idea of uncertainty and variability in estimates and predictions. Through concepts like confidence intervals and margins of error, it provides a measure of how confident we can be in our estimations and predictions.
  • Error Estimation: It provides measures of possible errors (known as margins of error), which allow us to know how much our sample results may differ from the population parameters.

Limitations of Inferential Statistics

Inferential statistics, despite its many benefits, does have some limitations. Here are some of them:

  • Sampling Error: Inferential statistics are often based on the concept of sampling, where a subset of the population is used to infer about the population. There’s always a chance that the sample might not perfectly represent the population, leading to sampling errors.
  • Misleading Conclusions: If assumptions for statistical tests are not met, it could lead to misleading results. This includes assumptions about the distribution of data, homogeneity of variances, independence, etc.
  • False Positives and Negatives: There’s always a chance of a Type I error (rejecting a true null hypothesis, or a false positive) or a Type II error (not rejecting a false null hypothesis, or a false negative).
  • Dependence on Quality of Data: The accuracy and validity of inferential statistics depend heavily on the quality of data collected. If data are biased, inaccurate, or collected using flawed methods, the results won’t be reliable.
  • Limited Predictive Power: While inferential statistics can provide estimates and predictions, these are based on the current data and may not fully account for future changes or variables not included in the model.
  • Complexity: Some inferential statistical methods can be quite complex and require a solid understanding of statistical principles to implement and interpret correctly.
  • Influenced by Outliers: Inferential statistics can be heavily influenced by outliers. If these extreme values aren’t handled properly, they can lead to misleading results.
  • Over-reliance on P-values: There’s a tendency in some fields to overly rely on p-values to determine significance, even though p-values have several limitations and are often misunderstood.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer




Statistical Inference For Everyone

(3 reviews)


Brian Blais, Bryant University

Copyright Year: 2017

Publisher: Brian Blais

Language: English


Conditions of Use

Attribution-ShareAlike


Reviewed by Kenese Io, PhD candidate, Colorado State University on 11/30/20

Comprehensiveness rating: 4

The book illustrates a very pragmatic approach with little theoretical application. I would recommend this text to anyone who is teaching applied stats at an early level.

Content Accuracy rating: 5

The book is accurate, with a number of very helpful examples for new researchers. The examples include code for students to use and draw from as they carry out their own analyses. They also work through commonly used datasets, which is very helpful for some students who may be working on their final projects as undergraduates or on homework assignments as first-year graduate students.

Relevance/Longevity rating: 5

The book is problem or problem set oriented which will allow the book to maintain its longevity. The examples offer analysis of old data but this is very helpful as instructors can assign similar problem sets with new datasets while the students have an excellent tool to rely on.

Clarity rating: 4

The book is generally clear but given that it is problem oriented some of the theoretical background is scarce and leaves a bit to be desired. Nevertheless the examples really allow for an immersive experience.

Consistency rating: 5

The book does a great job of following a clear formula: historical background, brief theoretical walkthrough, then long examples that force you to engage critically with the assignment.

Modularity rating: 5

The book is very easy to assign as the text quickly jumps to examples of matlab code that will draw students to engage with it. I can imagine students constantly flipping between their own code and the text to help simplify analysis or execute their code.

Organization/Structure/Flow rating: 4

The book is organized relatively well. I would have liked to see a few of the later chapters earlier, like the common tests for statistical significance, but it generally moves from broader to narrower perspectives.

Interface rating: 5

The graphs and code examples are laid out well and the text works great in Acrobat Reader.

Grammatical Errors rating: 5

Cultural Relevance rating: 4

The text does not offer any critical analysis here but this is due to maintaining general examples. I think an instructor could easily assign more critical assignments that rely on the intuition laid out in the book.

Reviewed by Jimmy Chen, Assistant Professor, Bucknell University on 1/26/19

Comprehensiveness rating: 5

As far as Statistical Inference goes, the author has done a great job covering the essential topics. The breadth and the depth of the content are well balanced. I believe this book can be a great supplemental material for any statistics or probability course. Students would have no problems studying this book themselves because the author has explained concepts clearly and provided ample examples.

I think the content is fine. Examples, illustrations, and computer code are all very helpful for readers to understand the content.

The relevance of the book is great. Most supporting examples would be easily relatable to most students. Most statistics or probability concepts discussed in the book are timeless. Detailed computer code makes verification easy.

Clarity rating: 5

The author has explained concepts very well. The flow of the text and examples is great and thoughtful, making it very easy to follow.

The consistency of the text is great.

The modularity of the text is great. I could easily adopt the entire book or use only certain sections of the book for my teaching.

Organization/Structure/Flow rating: 5

The topics in the text are presented in a logical, clear fashion.

The layout of the text is clear and easily readable. Images, charts, and tables are clear and concise. Very easy to follow.

The text contains no grammatical errors.

Cultural Relevance rating: 5

The text is not culturally insensitive or offensive in any way.

Reviewed by Adam Molnar, Assistant Professor, Oklahoma State University on 5/21/18

Comprehensiveness rating: 2

This book is not a comprehensive introduction to elementary statistics, or even statistical inference, as the author Brian Blais deliberately chose not to cover all topics of statistical inference. For example, the term matched pairs never appears; neither do Type I or Type II error. The Student's t distribution gets much less attention than in almost every other book; the author offers a rarely used standard-deviation change (page 153) as a way to keep things Gaussian. The author justifies the reduced topic set by calling typical "traditional" approaches flawed in the first pages of text, the Proposal. Instead, Blais tries to develop statistical inference from logic, in a way that might be called Bayesian inference. Other books have taken this approach, more than just Donald Berry's book mentioned on page 32. [For more references, see the ICOTS6 paper by James Albert at https://iase-web.org/documents/papers/icots6/3f1_albe.pdf ] None of those books are open-resource, though; an accurate, comprehensive textbook would have potential. This PDF does not contain that desired textbook, however. As mentioned below under accuracy, clarity, and structure, there are too many missing elements, including the lack of an index. As I read, this PDF felt more like an augmented set of lecture notes than a textbook which stands without instructor support. It's not good enough. (For more on this decision, see the other comments at the end.)

Content Accuracy rating: 2

The only non-troubling number of errors in a textbook is zero, but this book has many more than that. In the version I read from the Minnesota-hosted website, my error list includes not defining quartiles from the left (page 129), using ICR instead of IQR (page 133), misstating the 68-95-99 rule as 65-95-99 (page 134), flipping numbers in the combination of the binomial formula (page 232), repeating Figure C-2 as Figure C-1 (page 230), and titling section 2.6 "Monte Hall" instead of "Monty Hall". Infuriatingly, several of these mistakes are correct elsewhere in the book - Monty Hall in section 5.4, the binomial formula in the main text, and 68-95-99 on page 142. I'm also annoyed that some datasets have poor source citations, such as not indicating Fisher's iris data on page 165 and calling something "student measurements during a physics lab" on page 173.

Relevance/Longevity rating: 4

Because there are so many gaps, including full support for computer presentation, it would be easy to update completed sections as needed, such as when Python becomes less popular.

Clarity rating: 2

Quality of the prose is fine, but many jargon terms are not well defined. Students learning a subject need clear definitions, but they don't appear. In my notes, I see exclusive (page 36), conditioning (page 40), complement (used on page 40 but never appears in the text), posterior (page 54), correlation (page 55), uniform distribution (page 122), and Greek letters for which the reference to a help table appears on page 140, but Greek letters have appeared earlier. Additionally, several important terms receive insufficient or unusual definitions, including labeling summary description of data as inference (page 34), mutually exclusive (page 36) versus independence (page 43), and plus/minus (page 146, as this definition of +/- applies in lab bench science but not social sciences). I appreciate that the author is trying to avoid calculus with "area under the curve" on page 127, but there's not enough written for a non-calculus student to understand how these probabilities are calculated. To really understand posterior computation, a magical computer and a few graphs aren't good enough.

Internal consistency to Bayesian inference is quite strong; many of the examples repeat the steps of Bayes' Recipe. This is not a concern.

Modularity rating: 3

The book needs to be read in linear order, like most statistics books, but that's not necessarily a negative thing. Dr. Blais is trying to take the reader through a structured development of Bayesian inference, which has a single path. There are a few digressions, such as fallacies about probability reasoning, but the book generally maintains a single path from chapters 1 to at least 7. Most sections are less than 10 pages and don't involve lots of self-references. Although I rated reorganization possibility as low, due to the near-impossibility of realigning the argument, I consider it harsh to penalize the book for this.

Organization/Structure/Flow rating: 2

There isn't enough structure for a textbook; this feels more like a set of augmented lecture notes than a book for guided study. I mentioned poor definitions under "Clarity", so let me add other topics here. The most frustrating structural problem for me is the presentation of the fundamental idea of Bayesian inference, posterior proportional to prior * likelihood. The word prior first appears on page 48, but receives no clear definition until a side-note on page 97. The word posterior first appears on page 53. Despite this, the fundamental equation is never written with all three words in the correct places until page 154. That's way, way too late. The three key terms should have been defined around page 50 and drilled throughout all the sections. The computer exercises also have terrible structure. The first section with computer exercises, section 2.9 on page 72, begins with code. The reader has no idea about the language, package, or purpose of these weird words in boxes. The explanation about Python appears as Appendix A, after all the exercises. It would not have taken much to explain Python and the purpose of the computer exercises in Chapter 1 or 2, but it didn't happen. A classroom instructor could explain this in class, but the Open Resource Project doesn't provide an instructor with every book. Like the other things mentioned, the structure around computing is insufficient.

I had no problems navigating through the chapters. Images look fine as well.

Grammar and spelling are good. I only spotted one typographical error, "posterier" on page 131, and very few awkward sentences.

This is a US-centered book, since it refers to the "standard deck" of playing cards on page 36 as the US deck; other places like Germany have different suits. The book also uses "heads" and "tails" for coins, while other countries such as Mexico use different terms. I wouldn't call this a major problem, however; the pictures and diagrams make the coins and cards pretty clear. There aren't many examples involving people, so there's little scope for ethnicities and backgrounds.

On Brian Blais's webpage for the book, referenced only in Appendix A for some reason, he claims that this book is targeted to the typical Statistics 101 college student. It is NOT. Typical college students need much more support than what this book offers - better structure, better scaffolding, more worked examples, support for computing. What percentage of all college students would pick up Python given the contents presented here? My prior estimate would be 5%. Maybe students at Bryant university, where Pre-Calculus is the lowest math course offered, have a higher Python rate, but the bottom 20% of my students at Oklahoma State struggle with order of operations and using the combinations formula. They would need massive support, and Oklahoma State enrolls above-average college students. This book does not have massive support - or much at all. This makes me sad, because I've argued that we should teach hypothesis testing through credible intervals because I think students will understand the logic better than the frequentist philosophical approach. In 2014, I wrote a guest blog post [http://www.culturalcognition.net/blog/2014/9/5/teaching-how-to-teach-bayess-theorem-covariance-recognition.html] on teaching Bayes' Rule. I would value a thorough book that might work for truly typical students, but for the students in my everyone, this won't work.

Table of Contents

  • 1 Introduction to Probability
  • 2 Applications of Probability
  • 3 Random Sequences and Visualization
  • 4 Introduction to Model Comparison
  • 5 Applications of Model Comparison
  • 6 Introduction to Parameter Estimation
  • 7 Priors, Likelihoods, and Posteriors
  • 8 Common Statistical Significance Tests
  • 9 Applications of Parameter Estimation and Inference
  • 10 Multi-parameter Models
  • 11 Introduction to MCMC
  • 12 Concluding Thoughts

  • Bibliography
  • Appendix A: Computational Analysis
  • Appendix B: Notation and Standards
  • Appendix C: Common Distributions and Their Properties
  • Appendix D: Tables


About the Book

This is a new approach to an introductory statistical inference textbook, motivated by probability theory as logic. It is targeted to the typical Statistics 101 college student, and covers the topics typically covered in the first semester of such a course. It is freely available under the Creative Commons License, and includes a software library in Python for making some of the calculations and visualizations easier.

About the Contributors

Brian Blais is a professor of Science and Technology at Bryant University and a research professor at the Institute for Brain and Neural Systems, Brown University.


Introduction to Inferential Statistics: Describing patterns and relationships in datasets

by Liz Roth-Johnson, Ph.D.


Did you know that in statistics, the word “population” doesn’t refer to the people who live in a particular area? Rather, it refers to the complete set of observations that can be made. Since it is impossible to repeat an experiment an infinite number of times or observe every single individual, inferential statistics allow scientists to draw conclusions about a much larger group based on observing a much smaller set of data.

In statistics, a population is a complete set of possible observations that can be made. It is often impractical for scientists to study an entire population, so smaller subsets of the population, known as either subsamples or samples, are often studied instead. It is important that such a subsample is representative of the population from which it comes.

Inferential statistics can help scientists make generalizations about a population based on subsample data. Through the process of estimation, subsample data are used to estimate population parameters like the population mean or variance.

Random sampling helps scientists collect a subsample dataset that is representative of the larger population. This is critical for statistical inference, which often involves using subsample datasets to make inferences about entire populations.

Statistical significance provides a measure of the probability that a result occurred by chance. A statistically significant result is unlikely to have occurred by chance and can therefore be reliably reproduced if statistical tests are repeated. Statistical significance does not tell scientists whether a result is relevant, important, or meaningful.

Imagine you are working in an agricultural sciences lab, where you have been collaborating with a local farmer to develop new varieties of fruits and vegetables. It is your job to analyze the fruits and vegetables that are harvested each year and track any changes that occur from one plant generation to the next. Today you are measuring the sugar content of the latest crop of tomatoes to test how sweet they are. You have 25 tomatoes and find that the mean sugar content is 32 milligrams (mg) of sugar per gram (g) of tomato with a standard deviation of 4 mg/g (see Introduction to Descriptive Statistics for more information about calculating a mean and standard deviation).

Just as you finish these calculations, your collaborator from the farm shows up to ask about the latest results. Specifically, she wants to know the mean sugar content for this year’s entire tomato harvest (Figure 1). You look at your data and wonder what to tell her. How does the mean and standard deviation of the sample relate to the mean and standard deviation of the entire harvest?

Figure 1: Can studying just 25 tomatoes tell you something about the characteristics of the entire tomato harvest? In situations like this, scientists can use inferential statistics to help them make sense of their data.

As you will see in this module, your current predicament is very familiar to scientists. In fact, it is very unlikely that the mean and standard deviation of your 25-tomato sample are exactly the same as the mean and standard deviation of the entire harvest. Fortunately, techniques from a branch of statistics known as “inferential statistics” let you use your smaller subset of measurements to learn something about the sugar content of the entire tomato harvest. These and other inferential statistics techniques are an invaluable tool for scientists as they analyze and interpret their data.

  • What are inferential statistics?

Many statistical techniques have been developed to help scientists make sense of the data they collect. These techniques are typically categorized as either descriptive or inferential. While descriptive statistics (see Introduction to Descriptive Statistics) allow scientists to quickly summarize the major characteristics of a dataset, inferential statistics go a step further by helping scientists uncover patterns or relationships in a dataset, make judgments about data, or apply information about a small dataset to a larger group. They are part of the process of data analysis used by scientists to interpret and make statements about their results (see Data Analysis and Interpretation for more information).

The inferential statistics toolbox available to scientists is quite large and contains many different methods for analyzing and interpreting data. As an introduction to the topic, we will give a brief overview of some of the more common methods of statistical inference used by scientists. Many of these methods involve using smaller subsets of data to make inferences about larger populations. Therefore, we will also discuss ways in which scientists can mitigate systematic errors (sampling bias) by selecting subsamples (often simply referred to as “samples”) that are representative of the larger population. This module describes inferential statistics in a qualitative way.


  • Populations versus subsamples

When we use the word “population” in our everyday speech, we are usually talking about the number of people, plants, or animals that live in a particular area. However, to a scientist or statistician this term can mean something very different. In statistics, a population is defined as the complete set of possible observations. If a physicist conducts an experiment in her lab, the population is the entire set of possible results that could arise if the experiment were repeated an infinite number of times. If a marine biologist is tracking the migration patterns of blue whales in the Northeast Pacific Ocean, the population would be the entire set of migratory journeys taken by every single blue whale that lives in the Northeast Pacific. Note that in this case the statistical population is the entire set of migration events – the variable being observed – and not the blue whales themselves (the biological population).

Based on this definition of a population, you might be wondering how impractical, or even impossible, it could be for a scientist to collect data about an entire population. Just imagine trying to tag thousands of blue whales or repeating an experiment indefinitely! Instead, scientists typically collect data for a smaller subset – a “subsample” – of the population. If the marine biologist tags and tracks only 92 blue whales, this more practical subsample of migration data can then be used to make inferences about the larger population (Figure 2).

Figure 2: Individual migration patterns for 92 tagged blue whales (Balaenoptera musculus) tracked between 1994 and 2007 in the Pacific Northeast, color-coded by deployment location. These migration patterns represent a small subsample of a much larger statistical population: all of the possible blue whale migration patterns that occurred in the Pacific Northeast from 1994 to 2007. Image from Bailey, H., Mate, B.R., Palacios, D.M., et al. 2010. Behavioural estimation of blue whale movements in the Northeast Pacific from state space model analysis of satellite tracks. Endangered Species Research 10, 93–106.

But this raises an important point about statistical inference: By selecting only a subsample of a population, you are not identifying with certainty all possible outcomes. Instead, as the name of the technique implies, you are making inferences about a large number of possible outcomes. As you will see later in this module, addressing the uncertainty associated with these inferences is an important part of inferential statistics.

  • The importance of random sampling

When using a subsample to draw conclusions about a much larger population, it is critical that the subsample reasonably represents the population it comes from. Scientists often use a process called “simple random sampling” to collect representative subsample datasets. Random sampling does not mean the data are collected haphazardly but rather that the probability of each individual in the population being included in the subsample is the same. This process helps scientists ensure that they are not introducing unintentional biases into their sample that might make their subsample less representative of the larger population.

Let’s think about this in the context of our original tomato example. To make inferences about the entire tomato harvest, we need to make sure our 25-tomato subsample is as representative of the entire tomato harvest as possible. To collect a random subsample of the tomato harvest, we could use a computer program, such as a random number generator, to randomly select different locations throughout the tomato field and different days throughout the harvesting season at which to collect subsample tomatoes. This randomization ensures that there is no bias inherent in the subsample selection process. In contrast, a biased sample might only select tomatoes from a single day during the harvesting period or from just one small area of the field.

If the sugar content of tomatoes varies throughout the season or if one area of the field gets more sun and water than another, then these subsamples would hardly be representative of the entire harvest. (For more information about the importance of randomization in science, see Statistics in Science.) You may also notice that the process of random sampling requires that some minimum number of samples be collected to ensure that the subsample accounts for all of the possible conditions that can affect the research. Determining the ideal sample size for an experiment can depend on a number of factors, including the degree of variation within a population and the level of precision required in the analysis. When designing an experiment, scientists consider these factors to choose an appropriate number of samples to collect.
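
As a rough sketch of what this random selection might look like in practice, the following Python snippet draws plots and harvest days from a hypothetical sampling frame (the grid size, number of days, and sample size are all invented for illustration):

    import random

    random.seed(42)  # fixed seed so the illustration is reproducible

    # Hypothetical sampling frame: a 10 x 10 grid of field plots and a 60-day harvest window
    plots = [(row, col) for row in range(10) for col in range(10)]
    days = list(range(1, 61))

    # Simple random sampling: every plot has the same chance of being chosen,
    # and a harvest day is drawn at random for each selected plot.
    sampled_plots = random.sample(plots, k=25)
    sampled_days = [random.choice(days) for _ in sampled_plots]

    for plot, day in zip(sampled_plots, sampled_days):
        print(f"Collect one tomato from plot {plot} on day {day}")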

Another example of simple random sampling comes from a wildlife study about songbirds living on an island off the coast of California (Langin et al. 2009). To understand how the songbirds were being affected by climate change, researchers wanted to know how much food was available to the songbirds throughout the island. They knew that these particular songbirds ate mostly insects off of oak tree leaves – but imagine trying to find and measure the mass of every insect living on every oak tree on an island!

To collect a representative subsample, the researchers randomly selected 12 geographic coordinates throughout the island, collected a single branch from the oak tree closest to each coordinate in a randomly selected direction, and then measured the total insect mass on each branch. They then repeated this procedure every two weeks, randomly selecting locations each time and taking care to collect the same size branch at every location (Figure 3).

Figure 3: Example of random sampling. To collect a representative subsample of insects living on the oak trees on an island, researchers randomly selected 12 geographic coordinates throughout the island. The figure on the left shows what two random samples might have looked like, with each randomly selected location marked by an X. Every time the random sampling procedure is repeated, the distribution of sampling locations changes.

This carefully constructed procedure helped the researchers avoid biasing their subsample. By randomly selecting multiple locations, they ensured that branches from more than one tree would be selected and that one tree would not be favored over the others. Repeating the sampling procedure also helped limit bias. If insects were very abundant during the summer but hard to find in the winter, then sampling only one time or during one season would not be likely to generate a representative snapshot of year-round insect availability. Despite its name, the process of simple random sampling is not so “simple” at all! It requires careful planning to avoid introducing unintended biases.

  • Estimating statistical parameters

Once scientists have collected an appropriately random or unbiased subsample, they can use the subsample to make inferences about the population through a process called “estimation.” In our original example about tomatoes and sugar content, we reported the mean (32 mg/g) and standard deviation (4 mg/g) for the sugar content of 25 tomatoes. These are called “statistics” and are a property of the subsample. We can use these subsample statistics to estimate “parameters” for the population, such as the population mean or population standard deviation.

Notice that we refer to the population mean as a parameter, while the subsample mean is called a statistic. This reflects the fact that any given population has only one true mean, while the subsample mean can change from one subsample to the next. Suppose you measured the sugar content of a different set of 25 tomatoes from the same harvest. This subsample’s mean and standard deviation will probably be slightly different from the first subsample due to variations in sugar content from one tomato to the next. Yet either set of subsample statistics could be used to estimate the population mean for the entire harvest.

To estimate population parameters from subsample statistics, scientists typically use two different types of estimates: point estimates and interval estimates. Often these two types of estimates are used in tandem to report a plausible range of values for a population parameter based on a subsample dataset.

A point estimate of a population parameter is simply the value of a subsample statistic. For our tomatoes, this means that the subsample mean of 32 mg/g could be used as a point estimate of the population mean. In other words, we are estimating that the population mean is also 32 mg/g. Given that the subsample statistic will vary from one subsample to another, point estimates are not commonly used by themselves as they do not account for subsample variability.

An interval estimate of a population parameter is a range of values in which the parameter is thought to lie. Interval estimates are particularly useful in that they reflect the uncertainty related to the estimation (see our Uncertainty, Error, and Confidence module) and can be reported as a range of values surrounding a point estimate. One common tool used in science to generate interval estimates is the confidence interval. Confidence intervals take into consideration both the variability and total number of observations within a subsample to provide a range of plausible values around a point estimate. A confidence interval is calculated at a chosen confidence level, which represents the level of uncertainty associated with the estimation. We could calculate a confidence interval estimate using our 25-tomato subsample, which has a mean of 32 mg/g and a standard deviation of 4 mg/g. When calculated at the 95% confidence level, this interval estimate would be reported as 32 ± 2 mg/g, meaning that the population mean is likely to lie somewhere between 30 mg/g and 34 mg/g.*

*While the standard deviation provides a measure of the spread of all observations in the sample, the confidence interval provides a narrower range indicating where the mean would likely fall if you took another subsample from the population.
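
As a quick check of the interval reported above, the calculation can be reproduced from the subsample statistics alone (this sketch assumes the scipy library and uses the t distribution, since the population standard deviation is unknown):

    import math
    from scipy import stats

    n, mean, sd = 25, 32.0, 4.0               # subsample statistics from the text
    t_score = stats.t.ppf(0.975, df=n - 1)    # 95% confidence level, two-sided
    margin = t_score * sd / math.sqrt(n)
    print(f"95% CI: {mean:.1f} ± {margin:.1f} mg/g")   # about ±1.7, reported as ±2 when rounded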

  • Comparing multiple subsamples

Another technique that scientists often employ is to compare two or more subsamples to determine how likely it is that they have similar population parameters. Let’s say that you want to compare your current tomato harvest to a previous year’s harvest. This year the mean sugar content was 32 ± 2 mg/g, but last year the mean sugar content was only 26 ± 3 mg/g. While these two numbers seem quite different from each other, how can you be confident that the difference wasn’t simply due to random variation in your two subsamples?

In cases like this, scientists turn to a branch of statistical inference known as statistical hypothesis testing. When comparing two subsamples, scientists typically consider two simple hypotheses: Either the two subsamples come from similar populations and are essentially the same (the null hypothesis) or the two subsamples come from different populations and are therefore “significantly” different from one another (the alternative hypothesis). In statistics, the word “significant” is used to designate a level of statistical robustness. A “significant” difference implies that the difference can be reliably detected by the statistical test but says nothing about the scientific importance, relevance, or meaningfulness of the difference.

To determine whether the sugar content of your two tomato harvests is indeed significantly different, you could use a statistical hypothesis test such as Student’s t-test to compare the two subsamples. Conducting a t-test provides a measure of statistical significance that can be used to either reject or accept the null hypothesis. The significance level quantifies the likelihood that a particular result occurred by chance. In science, the significance level used for hypothesis testing is often 0.05. This means that in order for a result to be deemed “statistically significant,” there must be less than a 5% probability that the result was observed by chance. If you conduct a t-test on your two tomato samples and calculate a probability value (also called p-value) less than 0.05, you can reject the null hypothesis and report that the sugar content is significantly different from one year to the next.

What if you now wanted to compare the sugar content of all tomato harvests from the last 20 years? Theoretically, you could conduct pairwise t-tests among all of the different subsamples, but this approach can lead to trouble. With every t-test, there is always a chance, however small, that the null hypothesis is incorrectly rejected and a so-called “false positive” result is produced. Repeating multiple t-tests over and over can introduce unintended error into the analysis by increasing the likelihood of false positives. When comparing three or more samples, scientists instead use methods like analysis of variance (ANOVA), which compare multiple samples all at once to reduce the chance of introducing error into the statistical analysis.
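
For instance, a single one-way ANOVA across several harvests can be run in scipy; the harvest data below are simulated, not real measurements:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # Simulated sugar-content subsamples (mg/g) for three hypothetical harvest years
    harvest_a = rng.normal(26, 4, 25)
    harvest_b = rng.normal(29, 4, 25)
    harvest_c = rng.normal(32, 4, 25)

    f_stat, p_value = stats.f_oneway(harvest_a, harvest_b, harvest_c)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
    # One ANOVA avoids the inflated false-positive risk of running many pairwise t-tests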

  • Finding relationships among variables

As you continue to analyze all of your tomato data, you notice that the tomatoes seem to be sweeter in warmer years. Are you making this up, or could there actually be a relationship between tomato sweetness and the weather? To analyze these kinds of mutual relationships between two or more variables, scientists can use techniques in inferential statistics to measure how much the variables correlate with one another. A strong correlation between two variables means that the variables change, or vary, in similar ways. For example, medical research has shown that people with high-salt diets tend to have higher blood pressure than people with low-salt diets. Thus, blood pressure and salt consumption are said to be correlated.

When scientists analyze the relationships between two or more variables, they must take great care to distinguish between correlation and causation. A strong correlation between two variables can signify that a relationship exists, but it does not provide any information about the nature of that relationship. It can be tempting to look for cause-and-effect relationships in datasets, but correlation among variables does not necessarily mean that changes in one variable caused or influenced changes in the other. While two variables may show a correlation if they are directly related to each other, they could also be correlated if they are both related to a third unknown variable. Moreover, two variables can sometimes appear correlated simply by chance. The total revenue generated by arcades and the number of computer science doctorates awarded in the United States change in very similar ways over time and can therefore be said to correlate (Figure 4). The two variables are highly correlated with each other, but we cannot conclude that changes in one variable are causing changes in the other. Causation must ultimately be determined by the researcher, typically through the discovery of a reasonable mechanism by which one variable can directly affect the other.

Figure 4: Correlation does not imply causation. Although the graph above shows a striking correlation between annual arcade revenue and advanced computer science degrees awarded, we cannot conclude that changes in one variable caused changes in the other. Data sources: US Census Bureau and National Science Foundation.

Although correlation does not imply causation on its own, researchers can still establish cause-and-effect relationships between two variables. In these kinds of relationships, an independent variable (one that is not changed by any other variables being studied) is said to cause an effect on a dependent variable. The dependent variable is named for the fact that it will change in response to an independent variable – its value is literally dependent on the value of the independent variable. The strength of such a relationship can be analyzed using a linear regression, which shows the degree to which the data collected for two variables fall along a straight line. This statistical operation could be used to examine the relationship between tomato sweetness (the dependent variable) and a number of weather-related independent variables that could plausibly affect the growth, and therefore sweetness, of the tomatoes (Figure 5).

Figure 5: Linear regressions measure the relationship between an independent variable and a dependent variable. In the examples of simulated data above, sugar content of the tomatoes (the dependent variable) is weakly related to total rainfall (left graph) and strongly related to average temperature (right graph). Sugar content shows no relationship with the overall harvest yield (middle graph).

When the independent and dependent variable measurements fall close to a straight line, the relationship between the two variables is said to be “strong” and you can be more confident that the two variables are indeed related. When data points appear more scattered, the relationship is weaker and there is more uncertainty associated with the relationship.
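
A linear regression of this kind can be sketched in a few lines of Python; the temperature and sugar values below are simulated for illustration only:

    import numpy as np
    from scipy import stats

    # Simulated data: average growing-season temperature (°C) and sugar content (mg/g)
    temperature = np.array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27])
    sugar = np.array([24, 26, 25, 28, 29, 31, 30, 33, 34, 36])

    result = stats.linregress(temperature, sugar)
    print(f"slope = {result.slope:.2f} mg/g per °C, "
          f"r = {result.rvalue:.2f}, p = {result.pvalue:.4f}")
    # Points falling close to the fitted line (large |r|) indicate a strong relationship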

  • Statistical inference with qualitative data

So far we have only considered examples in which the data being collected and analyzed are quantitative in nature and can be described with numbers. Instead of describing tomato sweetness quantitatively by experimentally measuring the sugar content, what if you asked a panel of taste-testers to rank the sweetness of the tomatoes on a scale from “not at all sweet” to “very sweet”? This would give you a qualitative dataset based on observations rather than numerical measurements (Figure 6).

Figure 6: Inferential statistics can be used to analyze more qualitative data like the simulated taste-test data shown above. Statistical methods that compare the shapes of the distributions of the taste-test responses can help determine whether or not the difference in sweetness between the two tomato harvests is statistically significant.

The statistical methods discussed above would not be appropriate for analyzing this kind of data. If you tried to assign numerical values, one through four, to each of the responses on the taste-test scale, the meaning of the original data would change. For example, we cannot say with certainty that the difference between “3 - sweet” and “4 - very sweet” is really exactly the same as the difference between “1 - not at all sweet” and “2 - somewhat sweet.”

Rather than trying to make qualitative data more quantitative, scientists can use methods in statistical inference that are more appropriate for interpreting qualitative datasets. These methods often test for statistical significance by comparing the overall shape of the distributions of two or more subsamples – for instance, the location and number of peaks in the distribution or overall spread of the data – instead of using more quantitative measures like the mean and standard deviation. This approach is perfect for analyzing your tomato taste-test data. By using a statistical test that compares the shapes of the distributions of the taste-testers’ responses, you can determine whether or not the results are significantly different and thus whether one tomato harvest actually tastes sweeter than the other.
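
One way to compare the two response distributions, sketched here with invented counts and scipy's chi-square test of independence, is to tabulate how many tasters chose each category for each harvest:

    from scipy import stats

    # Hypothetical counts of taste-test responses for two harvests
    # Categories: not at all sweet, somewhat sweet, sweet, very sweet
    harvest_a = [12, 18, 9, 3]
    harvest_b = [4, 10, 17, 11]

    chi2, p_value, dof, expected = stats.chi2_contingency([harvest_a, harvest_b])
    print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
    # A small p-value suggests the two distributions of responses differ significantly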

  • Proceed with caution!

Inferential statistics provides tools that help scientists analyze and interpret their data. The key here is that the scientists – not the statistical tests – are making the judgment calls. The way that the term “significance” is used in statistical inference can be a major source of confusion. In statistics, significance indicates how reliably a result can be observed if a statistical test is repeated over and over. A statistically significant result is not necessarily relevant or important; it is the scientist who determines the importance of the result. (For a broader discussion of statistical significance, see our module Statistics in Science.)

One additional pitfall is the close relationship between statistical significance and subsample size. As subsamples grow larger, it becomes easier to reliably detect even the smallest differences among them. Sometimes well-meaning scientists are so excited to report statistically significant results that they forget to ask whether the magnitude, or size, of the result is actually meaningful.

Statistical inference is a powerful tool, but like any tool it must be used appropriately. Misguided application or interpretation of inferential statistics can lead to distorted or misleading scientific results. On the other hand, proper application of the methods described in this module can help scientists gain important insights about their data and lead to amazing discoveries. So use these methods wisely, and remember: It is ultimately up to scientists to ascribe meaning to their data.


15 Quantitative analysis: Inferential statistics

Inferential statistics are the statistical procedures that are used to reach conclusions about associations between variables. They differ from descriptive statistics in that they are explicitly designed to test hypotheses. Numerous statistical procedures fall into this category—most of which are supported by modern statistical software such as SPSS and SAS. This chapter provides a short primer on only the most basic and frequent procedures. Readers are advised to consult a formal text on statistics or take a course on statistics for more advanced procedures.

Basic concepts

British philosopher Karl Popper said that theories can never be proven, only disproven. As an example, how can we prove that the sun will rise tomorrow? Popper said that just because the sun has risen every single day that we can remember does not necessarily mean that it will rise tomorrow, because inductively derived theories are only conjectures that may or may not be predictive of future phenomena. Instead, he suggested that we may assume a theory that the sun will rise every day without necessarily proving it, and if the sun does not rise on a certain day, the theory is falsified and rejected. Likewise, we can only reject hypotheses based on contrary evidence, but can never truly accept them because the presence of evidence does not mean that we will not observe contrary evidence later. Because we cannot truly accept a hypothesis of interest (alternative hypothesis), we formulate a null hypothesis as the opposite of the alternative hypothesis, and then use empirical evidence to reject the null hypothesis to demonstrate indirect, probabilistic support for our alternative hypothesis.

A second problem with testing hypothesised relationships in social science research is that the dependent variable may be influenced by an infinite number of extraneous variables and it is not feasible to measure and control for all of these extraneous effects. Hence, even if two variables may seem to be related in an observed sample, they may not be truly related in the population, and therefore inferential statistics are never certain or deterministic, but always probabilistic.

General linear model

Most inferential statistical procedures in social science research are derived from a general family of statistical models called the general linear model (GLM). A model is an estimated mathematical equation that can be used to represent a set of data, and linear refers to a straight line. Hence, a GLM is a system of equations that can be used to represent linear patterns of relationships in observed data.

Two-variable linear model
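For reference (this is the standard textbook form of the model, stated here for completeness), the two-variable linear model relates a dependent variable \(y\) to a single independent variable \(x\) through an intercept, a slope, and an error term:

\[ y = \beta_{0} + \beta_{1}x + \varepsilon \]

where \(\beta_{0}\) is the intercept, \(\beta_{1}\) is the slope, and \(\varepsilon\) is the error (residual) term capturing the variation not explained by the straight line.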

Two-group comparison

\[ t = \frac{\overline{X}_{1}-\overline{X}_{2}}{s_{\overline{X}_{1}-\overline{X}_{2}}} \]

where the numerator is the difference in sample means between the treatment group (Group 1) and the control group (Group 2) and the denominator is the standard error of the difference between the two groups, which in turn, can be estimated as:

\[ s_{\overline{X}_{1}-\overline{X}_{2}} = \sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}} }\,.\]
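As a brief illustration (the data below are hypothetical), the t-statistic can be computed directly from these two formulas, or obtained from R's built-in t.test, which uses the same unequal-variance standard error by default:

    # Hypothetical outcome scores for a treatment group and a control group
    group1 <- c(12, 15, 14, 10, 13, 16, 11, 14)
    group2 <- c(10, 9, 12, 11, 8, 10, 13, 9)

    # Standard error of the difference in means (the formula above)
    se_diff <- sqrt(var(group1) / length(group1) + var(group2) / length(group2))
    t_stat <- (mean(group1) - mean(group2)) / se_diff
    t_stat

    # The same comparison via the built-in function (Welch's t-test by default)
    t.test(group1, group2)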

Factorial designs

Factorial designs extend the two-group comparison to experiments in which two or more independent variables (factors) are manipulated simultaneously. The simplest case is the \(2 \times 2\) factorial design, in which each of two factors has two levels, producing four treatment groups.

Other quantitative analysis

There are many other useful inferential statistical techniques—based on variations in the GLM—that are briefly mentioned here. Interested readers are referred to advanced textbooks or statistics courses for more information on these techniques:

Factor analysis is a data reduction technique that is used to statistically aggregate a large number of observed measures (items) into a smaller set of unobserved (latent) variables called factors based on their underlying bivariate correlation patterns. This technique is widely used for assessment of convergent and discriminant validity in multi-item measurement scales in social science research.

Discriminant analysis is a classificatory technique that aims to place a given observation in one of several nominal categories based on a linear combination of predictor variables. The technique is similar to multiple regression, except that the dependent variable is nominal. It is popular in marketing applications, such as for classifying customers or products into categories based on salient attributes as identified from large-scale surveys.

Logistic regression (or logit model) is a GLM in which the outcome variable is binary (0 or 1) and is presumed to follow a logistic distribution, and the goal of the regression analysis is to predict the probability of the successful outcome by fitting data into a logistic curve. An example is predicting the probability of heart attack within a specific period, based on predictors such as age, body mass index, exercise regimen, and so forth. Logistic regression is extremely popular in the medical sciences. Effect size estimation is based on an ‘odds ratio’, representing the odds of an event occurring in one group versus the other. (A brief code sketch appears after this list.)

Probit regression (or probit model) is a GLM in which the outcome variable can vary between 0 and 1 (or can assume discrete values 0 and 1) and is presumed to follow a standard normal distribution, and the goal of the regression is to predict the probability of each outcome. This is a popular technique for predictive analysis in actuarial science, financial services, insurance, and other industries, for applications such as credit scoring based on a person’s credit rating, salary, debt, and other information from their loan application. Probit and logit regression tend to demonstrate similar regression coefficients in comparable applications (binary outcomes); however, the logit model is easier to compute and interpret.

Path analysis is a multivariate GLM technique for analysing directional relationships among a set of variables. It allows for examination of complex nomological models where the dependent variable in one equation is the independent variable in another equation, and is widely used in contemporary social science research.

Time series analysis is a technique for analysing time series data, that is, variables that change continually over time. Examples of applications include forecasting stock market fluctuations and urban crime rates. This technique is popular in econometrics, mathematical finance, and signal processing. Special techniques are used to correct for autocorrelation, or correlation within values of the same variable across time.
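Returning to the logistic regression item above, a minimal sketch in R (the data are simulated and the variable names age, bmi, and attack are illustrative only) fits the model with glm and a binomial family, and converts the coefficients to odds ratios:

    # Simulate hypothetical data: heart-attack indicator with age and BMI as predictors
    set.seed(1)
    n <- 200
    age <- runif(n, 40, 80)
    bmi <- rnorm(n, 27, 4)
    attack <- rbinom(n, 1, plogis(-10 + 0.08 * age + 0.15 * bmi))

    fit <- glm(attack ~ age + bmi, family = binomial(link = "logit"))
    summary(fit)
    exp(coef(fit))    # odds ratios for a one-unit increase in each predictor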

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Statistical Inference and Estimation: Review of Introductory Inference

Sampling distribution & Central Limit Theorem; basic concepts of estimation

Statistical Inference, Model & Estimation

Recall that statistical inference aims at learning characteristics of the population from a sample; the population characteristics are parameters and the sample characteristics are statistics.

A statistical model is a representation of the complex phenomenon that generated the data.

  • It has mathematical formulations that describe relationships between random variables and parameters.
  • It makes assumptions about the random variables, and sometimes about the parameters.
  • A general form: data = model + residuals.
  • The model should explain most of the variation in the data.
  • Residuals represent lack of fit, that is, the portion of the data left unexplained by the model.

Estimation is the process of learning about and determining the population parameter based on the model fitted to the data.

Point estimation, interval estimation, and hypothesis testing are the three main ways of learning about the population parameter from the sample statistic.

An estimator is a particular kind of statistic; it becomes an estimate when the formula is evaluated with the actual observed sample values.

Point estimation gives a single value, calculated from the sample, that estimates the parameter.

Interval estimation (a confidence interval) gives a range of values within which the parameter is expected to fall, with a certain degree of confidence.

Hypothesis tests test for a specific value (or values) of the parameter.

In order to perform these inferential tasks, i.e., to make inferences about the unknown population parameter from the sample statistic, we need to know the likely values of the sample statistic. What would happen if we sampled many times?

We need the sampling distribution of the statistic:

  • It depends on the model assumptions about the population distribution, and/or on the sample size.
  • Standard error refers to the standard deviation of a sampling distribution.

Suppose we are interested in estimating the true average height of the student population at Penn State, and we collect a simple random sample of 54 students.

Central Limit Theorem

Sampling distribution of the sample mean:

If numerous samples of size n are taken, the frequency curve of the sample means (\(\bar{X}\)'s) from those various samples is approximately bell shaped with mean \(\mu\) and standard deviation, i.e. standard error, \(\sigma/\sqrt{n}\); in other words, \(\bar{X} \sim N(\mu , \sigma^2 / n)\). This holds if:

  • X is normally distributed, or
  • X is NOT normal, but n is large (e.g. n > 30) and \(\mu\) is finite.

These statements are for continuous variables.

For categorical data, the CLT holds for the sampling distribution of the sample proportion.

Proportions in Newspapers

As found in CNN in June, 2006:

The parameter of interest in the population is the proportion of U.S. adults who disapprove of how well Bush is handling Iraq, \(p\).

The sample statistic, or point estimator, is \(\hat{p}\), and an estimate based on this sample is \(\hat{p}=0.62\).

Next question ...

If we take another poll, we are likely to get a different sample proportion, e.g. 60%, 59%, 67%, etc.

So, what is the 95% confidence interval? Based on the CLT, the 95% CI is \(\hat{p}\pm 2 \ast \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\).

We often assume p = 1/2 so \(\hat{p}\pm 2 \ast \sqrt{\frac{\frac{1}{2}\ast\frac{1}{2} }{n}}=\hat{p}\pm\frac{1}{\sqrt{n}}=\hat{p}\pm\text{MOE}\).

The margin of error (MOE) is 2 × the standard error, or \(1/\sqrt{n}\) under the \(p = 1/2\) assumption.
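As a quick sketch of these formulas (the poll's sample size was not given above, so n = 1000 is an assumed value used only for illustration):

    p_hat <- 0.62
    n <- 1000                        # assumed sample size for illustration
    moe <- 2 * sqrt(p_hat * (1 - p_hat) / n)
    c(p_hat - moe, p_hat + moe)      # approximate 95% confidence interval
    1 / sqrt(n)                      # the simpler MOE approximation that assumes p = 1/2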

Introduction to Data Science

Chapter 15: Statistical inference

In Chapter 16 we will describe, in some detail, how poll aggregators such as FiveThirtyEight use data to predict election outcomes. To understand how they do this, we first need to learn the basics of Statistical Inference, the part of statistics that helps distinguish patterns arising from signal from those arising from chance. Statistical inference is a broad topic and here we go over the very basics using polls as a motivating example. To describe the concepts, we complement the mathematical formulas with Monte Carlo simulations and R code.

Opinion polling has been conducted since the 19th century. The general goal is to describe the opinions held by a specific population on a given set of topics. In recent times, these polls have been pervasive during presidential elections. Polls are useful when interviewing every member of a particular population is logistically impossible. The general strategy is to interview a smaller group, chosen at random, and then infer the opinions of the entire population from the opinions of the smaller group. Statistical theory is used to justify the process. This theory is referred to as inference and it is the main topic of this chapter.

Perhaps the best known opinion polls are those conducted to determine which candidate is preferred by voters in a given election. Political strategists make extensive use of polls to decide, among other things, how to invest resources. For example, they may want to know in which geographical locations to focus their “get out the vote” efforts.

Elections are a particularly interesting case of opinion polls because the actual opinion of the entire population is revealed on election day. Of course, it costs millions of dollars to run an actual election which makes polling a cost effective strategy for those that want to forecast the results.

Although typically the results of these polls are kept private, similar polls are conducted by news organizations because the results tend to be of interest to the general public, and these are made public. We will eventually be looking at such data.

Real Clear Politics is an example of a news aggregator that organizes and publishes poll results. For example, they present the following poll results reporting estimates of the popular vote for the 2016 presidential election:

Poll Date Sample MoE Clinton Trump Spread
RCP Average 10/31 - 11/7 47.2 44.3 Clinton +2.9
Bloomberg 11/4 - 11/6 799 LV 3.5 46.0 43.0 Clinton +3
Economist 11/4 - 11/7 3669 LV 49.0 45.0 Clinton +4
IBD 11/3 - 11/6 1026 LV 3.1 43.0 42.0 Clinton +1
ABC 11/3 - 11/6 2220 LV 2.5 49.0 46.0 Clinton +3
FOX News 11/3 - 11/6 1295 LV 2.5 48.0 44.0 Clinton +4
Monmouth 11/3 - 11/6 748 LV 3.6 50.0 44.0 Clinton +6
CBS News 11/2 - 11/6 1426 LV 3.0 47.0 43.0 Clinton +4
LA Times 10/31 - 11/6 2935 LV 4.5 43.0 48.0 Trump +5
NBC News 11/3 - 11/5 1282 LV 2.7 48.0 43.0 Clinton +5
NBC News 10/31 - 11/6 30145 LV 1.0 51.0 44.0 Clinton +7
McClatchy 11/1 - 11/3 940 LV 3.2 46.0 44.0 Clinton +2
Reuters 10/31 - 11/4 2244 LV 2.2 44.0 40.0 Clinton +4
Gravis 10/31 - 10/31 5360 RV 1.3 50.0 50.0 Tie

Although in the United States the popular vote does not determine the result of the presidential election, we will use it as an illustrative and simple example of how well polls work. Forecasting the election is a more complex process since it involves combining results from 50 states and DC, and we describe it in Section 16.8.

Let’s make some observations about the table above. First, note that different polls, all taken days before the election, report a different spread: the estimated difference between support for the two candidates. Notice also that the reported spreads hover around what ended up being the actual result: Clinton won the popular vote by 2.1%. We also see a column titled MoE which stands for margin of error.

In this section, we will show how the probability concepts we learned in the previous chapter can be applied to develop the statistical approaches that make polls an effective tool. We will learn the statistical concepts necessary to define estimates and margins of error, and show how we can use these to forecast final results relatively well and also provide an estimate of the precision of our forecast. Once we learn this, we will be able to understand two concepts that are ubiquitous in data science: confidence intervals and p-values. Finally, to understand probabilistic statements about the probability of a candidate winning, we will have to learn about Bayesian modeling. In the final sections, we put it all together to recreate the simplified version of the FiveThirtyEight model and apply it to the 2016 election.

We start by connecting probability theory to the task of using polls to learn about a population.

15.1.1 The sampling model for polls

To help us understand the connection between polls and what we have learned, let’s construct a similar situation to the one pollsters face. To mimic the challenge real pollsters face in terms of competing with other pollsters for media attention, we will use an urn full of beads to represent voters and pretend we are competing for a $25 prize. The challenge is to guess the spread between the proportion of blue and red beads in this urn (in this case, a pickle jar):

[Image: the pickle jar of red and blue beads used as the urn]

Before making a prediction, you can take a sample (with replacement) from the urn. To mimic the fact that running polls is expensive, it costs you $0.10 per each bead you sample. Therefore, if your sample size is 250, and you win, you will break even since you will pay $25 to collect your $25 prize. Your entry into the competition can be an interval. If the interval you submit contains the true proportion, you get half what you paid and pass to the second phase of the competition. In the second phase, the entry with the smallest interval is selected as the winner.

The dslabs package includes a function that shows a random draw from this urn:
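The call is a one-liner (take_poll is a plotting helper in the dslabs package; it draws the requested number of beads and displays the sample):

    library(dslabs)
    take_poll(25)    # draw 25 beads from the urn and plot the result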

[Plot: a random draw of 25 beads from the urn]

Think about how you would construct your interval based on the data shown above.

We have just described a simple sampling model for opinion polls. The beads inside the urn represent the individuals that will vote on election day. Those that will vote for the Republican candidate are represented with red beads and the Democrats with the blue beads. For simplicity, assume there are no other colors. That is, that there are just two parties: Republican and Democratic.

15.2 Populations, samples, parameters, and estimates

We want to predict the proportion of blue beads in the urn. Let’s call this quantity \(p\), which then tells us the proportion of red beads \(1-p\), and the spread \(p - (1-p)\), which simplifies to \(2p - 1\).

In statistical textbooks, the beads in the urn are called the population. The proportion of blue beads in the population \(p\) is called a parameter. The 25 beads we see in the previous plot are called a sample. The task of statistical inference is to predict the parameter \(p\) using the observed data in the sample.

Can we do this with the 25 observations above? It is certainly informative. For example, given that we see 13 red and 12 blue beads, it is unlikely that \(p\) > .9 or \(p\) < .1. But are we ready to predict with certainty that there are more red beads than blue in the jar?

We want to construct an estimate of \(p\) using only the information we observe. An estimate should be thought of as a summary of the observed data that we think is informative about the parameter of interest. It seems intuitive to think that the proportion of blue beads in the sample, \(0.48\), must be at least related to the actual proportion \(p\). But do we simply predict \(p\) to be 0.48? First, remember that the sample proportion is a random variable: if we run the command take_poll(25) four times, we get a different answer each time.

[Plot: four random samples of 25 beads, each with a different proportion of blue]

Note that in the four random samples shown above, the sample proportions range from 0.44 to 0.60. By describing the distribution of this random variable, we will be able to gain insights into how good this estimate is and how we can make it better.

15.2.1 The sample average

Conducting an opinion poll is being modeled as taking a random sample from an urn. We are proposing the use of the proportion of blue beads in our sample as an estimate of the parameter \(p\). Once we have this estimate, we can easily report an estimate for the spread \(2p-1\), but for simplicity we will illustrate the concepts for estimating \(p\). We will use our knowledge of probability to defend our use of the sample proportion and quantify how close we think it is to the population proportion \(p\).

We start by defining the random variable \(X\) as: \(X=1\) if we pick a blue bead at random and \(X=0\) if it is red. This implies that the population is a list of 0s and 1s. If we sample \(N\) beads, then the average of the draws \(X_1, \dots, X_N\) is equivalent to the proportion of blue beads in our sample. This is because adding the \(X\) s is equivalent to counting the blue beads and dividing this count by the total \(N\) is equivalent to computing a proportion. We use the symbol \(\bar{X}\) to represent this average. In general, in statistics textbooks a bar on top of a symbol means the average. The theory we just learned about the sum of draws becomes useful because the average is a sum of draws multiplied by the constant \(1/N\) :

\[\bar{X} = 1/N \times \sum_{i=1}^N X_i\]

For simplicity, let’s assume that the draws are independent: after we see each sampled bead, we return it to the urn. In this case, what do we know about the distribution of the sum of draws? First, we know that the expected value of the sum of draws is \(N\) times the average of the values in the urn. We know that the average of the 0s and 1s in the urn must be \(p\) , the proportion of blue beads.

Here we encounter an important difference with what we did in the Probability chapter: we don’t know what is in the urn. We know there are blue and red beads, but we don’t know how many of each. This is what we want to find out: we are trying to estimate \(p\).

15.2.2 Parameters

Just like we use variables to define unknowns in systems of equations, in statistical inference we define parameters to define unknown parts of our models. In the urn model which we are using to mimic an opinion poll, we do not know the proportion of blue beads in the urn. We define the parameter \(p\) to represent this quantity. \(p\) is the average of the urn because if we take the average of the 1s (blue) and 0s (red), we get the proportion of blue beads. Since our main goal is figuring out what \(p\) is, we are going to estimate this parameter.

The ideas presented here on how we estimate parameters, and provide insights into how good these estimates are, extrapolate to many data science tasks. For example, we may want to determine the difference in health improvement between patients receiving treatment and a control group. We may ask, what are the health effects of smoking on a population? What are the differences across racial groups in fatal shootings by police? What is the rate of change in life expectancy in the US during the last 10 years? All these questions can be framed as a task of estimating a parameter from a sample.

15.2.3 Polling versus forecasting

Before we continue, let’s make an important clarification related to the practical problem of forecasting the election. If a poll is conducted four months before the election, it is estimating the \(p\) for that moment and not for election day. The \(p\) for election night might be different since people’s opinions fluctuate through time. The polls provided the night before the election tend to be the most accurate since opinions don’t change that much in a day. However, forecasters try to build tools that model how opinions vary across time and try to predict the election night results taking into consideration the fact that opinions fluctuate. We will describe some approaches for doing this in a later section.

15.2.4 Properties of our estimate: expected value and standard error

To understand how good our estimate is, we will describe the statistical properties of the random variable defined above: the sample proportion \(\bar{X}\). Remember that \(\bar{X}\) is the sum of independent draws so the rules we covered in the probability chapter apply.

Using what we have learned, the expected value of the sum \(N\bar{X}\) is \(N \times\) the average of the urn, \(p\). So dividing by the non-random constant \(N\) gives us that the expected value of the average \(\bar{X}\) is \(p\). We can write it using our mathematical notation:

\[ \mbox{E}(\bar{X}) = p \]

We can also use what we learned to figure out the standard error: the standard error of the sum is \(\sqrt{N} \times\) the standard deviation of the urn. Can we compute the standard error of the urn? We learned a formula that tells us that it is \((1-0) \sqrt{p (1-p)}\) = \(\sqrt{p (1-p)}\). Because we are dividing the sum by \(N\), we arrive at the following formula for the standard error of the average:

\[ \mbox{SE}(\bar{X}) = \sqrt{p(1-p)/N} \]

This result reveals the power of polls. The expected value of the sample proportion \(\bar{X}\) is the parameter of interest \(p\) and we can make the standard error as small as we want by increasing \(N\). The law of large numbers tells us that with a large enough poll, our estimate converges to \(p\).

If we take a large enough poll to make our standard error about 1%, we will be quite certain about who will win. But how large does the poll have to be for the standard error to be this small?

One problem is that we do not know \(p\), so we can’t compute the standard error. However, for illustrative purposes, let’s assume that \(p=0.51\) and make a plot of the standard error versus the sample size \(N\):

[Plot: standard error \(\sqrt{p(1-p)/N}\) versus sample size \(N\) for \(p = 0.51\)]

From the plot we see that we would need a poll of over 10,000 people to get the standard error that low. We rarely see polls of this size due in part to costs. From the Real Clear Politics table, we learn that the sample sizes in opinion polls range from 500-3,500 people. For a sample size of 1,000 and \(p=0.51\) , the standard error is:
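A one-line check of that number, using the standard error formula above:

    p <- 0.51
    N <- 1000
    sqrt(p * (1 - p) / N)    # approximately 0.0158, i.e. about 1.5 percentage points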

or 1.5 percentage points. So even with large polls, for close elections, \(\bar{X}\) can lead us astray if we don’t realize it is a random variable. Nonetheless, we can actually say more about how close we get to \(p\), and we do that in Section 15.4.

15.3 Exercises

1. Suppose you poll a population in which a proportion \(p\) of voters are Democrats and \(1-p\) are Republicans. Your sample size is \(N=25\) . Consider the random variable \(S\) which is the total number of Democrats in your sample. What is the expected value of this random variable? Hint: it’s a function of \(p\) .

2. What is the standard error of \(S\) ? Hint: it’s a function of \(p\) .

3. Consider the random variable \(S/N\) . This is equivalent to the sample average, which we have been denoting as \(\bar{X}\) . What is the expected value of the \(\bar{X}\) ? Hint: it’s a function of \(p\) .

4. What is the standard error of \(\bar{X}\) ? Hint: it’s a function of \(p\) .

5. Write a line of code that gives you the standard error se for the problem above for several values of \(p\) , specifically for p <- seq(0, 1, length = 100) . Make a plot of se versus p .

6. Copy the code above and put it inside a for-loop to make the plot for \(N=25\) , \(N=100\) , and \(N=1000\) .

7. If we are interested in the difference in proportions, \(p - (1-p)\) , our estimate is \(d = \bar{X} - (1-\bar{X})\) . Use the rules we learned about sums of random variables and scaled random variables to derive the expected value of \(d\) .

8. What is the standard error of \(d\) ?

9. If the actual \(p=.45\), it means the Republicans are winning by a relatively large margin since \(d= -.1\), which is a 10% margin of victory. In this case, what is the standard error of \(2\bar{X}-1\) if we take a sample of \(N=25\)?

10. Given the answer to 9, which of the following best describes your strategy of using a sample size of \(N=25\) ?

  • The expected value of our estimate \(2\bar{X}-1\) is \(d\) , so our prediction will be right on.
  • Our standard error is larger than the difference, so the chances of \(2\bar{X}-1\) being positive and throwing us off were not that small. We should pick a larger sample size.
  • The difference is 10% and the standard error is about 0.2, therefore much smaller than the difference.
  • Because we don’t know \(p\) , we have no way of knowing that making \(N\) larger would actually improve our standard error.

15.4 Central Limit Theorem in practice

The CLT tells us that the distribution function for a sum of draws is approximately normal. We also learned that dividing a normally distributed random variable by a constant results in another normally distributed variable. This implies that the distribution of \(\bar{X}\) is approximately normal.

In summary, we have that \(\bar{X}\) has an approximately normal distribution with expected value \(p\) and standard error \(\sqrt{p(1-p)/N}\) .

Now how does this help us? Suppose we want to know the probability that we are within 1% of \(p\). We are basically asking what is

\[ \mbox{Pr}(| \bar{X} - p| \leq .01) \] which is the same as:

\[ \mbox{Pr}(\bar{X}\leq p + .01) - \mbox{Pr}(\bar{X} \leq p - .01) \]

Can we answer this question? We can use the mathematical trick we learned in the previous chapter. Subtract the expected value and divide by the standard error to get a standard normal random variable, call it \(Z\) , on the left. Since \(p\) is the expected value and \(\mbox{SE}(\bar{X}) = \sqrt{p(1-p)/N}\) is the standard error we get:

\[ \mbox{Pr}\left(Z \leq \frac{ \,.01} {\mbox{SE}(\bar{X})} \right) - \mbox{Pr}\left(Z \leq - \frac{ \,.01} {\mbox{SE}(\bar{X})} \right) \]

One problem we have is that since we don’t know \(p\) , we don’t know \(\mbox{SE}(\bar{X})\) . But it turns out that the CLT still works if we estimate the standard error by using \(\bar{X}\) in place of \(p\) . We say that we plug-in the estimate. Our estimate of the standard error is therefore:

\[ \hat{\mbox{SE}}(\bar{X})=\sqrt{\bar{X}(1-\bar{X})/N} \] In statistics textbooks, we use a little hat to denote estimates. The estimate can be constructed using the observed data and \(N\) .

Now we continue with our calculation, but dividing by \(\hat{\mbox{SE}}(\bar{X})=\sqrt{\bar{X}(1-\bar{X})/N}\) instead. In our first sample we had 12 blue and 13 red, so \(\bar{X} = 0.48\) and our estimate of the standard error is \(\sqrt{0.48(1-0.48)/25} \approx 0.1\).

And now we can answer the question of the probability of being close to \(p\). The answer is \(\mbox{Pr}(|Z| \leq 0.01/0.1) = \mbox{Pr}(|Z| \leq 0.1) \approx 0.08\).

Therefore, there is a small chance that we will be close. A poll of only \(N=25\) people is not really very useful, at least not for a close election.

Earlier we mentioned the margin of error. Now we can define it because it is simply two times the standard error, which we can now estimate. In our case it is \(1.96 \times 0.0999 \approx 0.2\).

Why do we multiply by 1.96? Because if you ask what is the probability that we are within 1.96 standard errors from \(p\) , we get:

\[ \mbox{Pr}\left(Z \leq \, 1.96\,\mbox{SE}(\bar{X}) / \mbox{SE}(\bar{X}) \right) - \mbox{Pr}\left(Z \leq - 1.96\, \mbox{SE}(\bar{X}) / \mbox{SE}(\bar{X}) \right) \] which is:

\[ \mbox{Pr}\left(Z \leq 1.96 \right) - \mbox{Pr}\left(Z \leq - 1.96\right) \]

which we know is about 95%, since pnorm(1.96) - pnorm(-1.96) is approximately 0.95.

Hence, there is a 95% probability that \(\bar{X}\) will be within \(1.96\times \hat{\mbox{SE}}(\bar{X})\), in our case within about 0.2, of \(p\). Note that 95% is somewhat of an arbitrary choice and sometimes other percentages are used, but it is the most commonly used value to define the margin of error. We often round 1.96 up to 2 for simplicity of presentation.

In summary, the CLT tells us that our poll based on a sample size of \(25\) is not very useful. We don’t really learn much when the margin of error is this large. All we can really say is that the popular vote will not be won by a large margin. This is why pollsters tend to use larger sample sizes.

From the table above, we see that typical sample sizes range from 700 to 3500. To see how this gives us a much more practical result, notice that if we had obtained a \(\bar{X}\) = 0.48 with a sample size of 2,000, our standard error \(\hat{\mbox{SE}}(\bar{X})\) would have been 0.011. So our result is an estimate of 48% with a margin of error of 2%. In this case, the result is much more informative and would make us think that there are more red beads than blue. Keep in mind, however, that this is hypothetical. We did not take a poll of 2,000 since we don’t want to ruin the competition.

15.4.1 A Monte Carlo simulation

Suppose we want to use a Monte Carlo simulation to corroborate the tools we have built using probability theory. To create the simulation, we would repeatedly draw samples from the urn and compute the sample proportion for each draw, as in the sketch shown a little further below.

The problem is, of course, we don’t know p. We could construct an urn like the one pictured above and run an analog (without a computer) simulation. It would take a long time, but you could take 10,000 samples, count the beads and keep track of the proportions of blue. We can use the function take_poll(n=1000) instead of drawing from an actual urn, but it would still take time to count the beads and enter the results.

One thing we therefore do to corroborate theoretical results is to pick one or several values of p and run the simulations. Let’s set p=0.45. We can then simulate a poll:
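A minimal sketch (N = 1000 is an arbitrary choice for the size of the simulated poll):

    p <- 0.45
    N <- 1000
    x <- sample(c(0, 1), size = N, replace = TRUE, prob = c(1 - p, p))
    x_hat <- mean(x)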

In this particular sample, our estimate is x_hat. We can use that code to do a Monte Carlo simulation:
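A sketch of that simulation, with B simulated polls using the p and N set just above:

    B <- 10000
    x_hat <- replicate(B, {
      x <- sample(c(0, 1), size = N, replace = TRUE, prob = c(1 - p, p))
      mean(x)    # the sample proportion for one simulated poll
    })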

To review, the theory tells us that \(\bar{X}\) is approximately normally distributed, has expected value \(p=\) 0.45 and standard error \(\sqrt{p(1-p)/N}\) = 0.016. The simulation confirms this:
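For example, a direct check of the two numbers just mentioned:

    mean(x_hat)    # comes out very close to p = 0.45
    sd(x_hat)      # comes out very close to sqrt(p * (1 - p) / N), about 0.016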

A histogram and qq-plot confirm that the normal approximation is accurate as well:

[Plot: histogram and QQ-plot of the simulated \(\bar{X}\) values]

Of course, in real life we would never be able to run such an experiment because we don’t know \(p\) . But we could run it for various values of \(p\) and \(N\) and see that the theory does indeed work well for most values. You can easily do this by re-running the code above after changing p and N .

15.4.2 The spread

The competition is to predict the spread, not the proportion \(p\) . However, because we are assuming there are only two parties, we know that the spread is \(p - (1-p) = 2p - 1\) . As a result, everything we have done can easily be adapted to an estimate of \(2p - 1\) . Once we have our estimate \(\bar{X}\) and \(\hat{\mbox{SE}}(\bar{X})\) , we estimate the spread with \(2\bar{X} - 1\) and, since we are multiplying by 2, the standard error is \(2\hat{\mbox{SE}}(\bar{X})\) . Note that subtracting 1 does not add any variability so it does not affect the standard error.

For our 25 item sample above, our estimate \(\bar{X}\) is 0.48 with margin of error 0.20, and our estimate of the spread is −0.04 with margin of error 0.40. Again, not a very useful sample size. However, the point is that once we have an estimate and standard error for \(p\), we have it for the spread \(2p-1\).

15.4.3 Bias: why not run a very large poll?

For realistic values of \(p\) , say from 0.35 to 0.65, if we run a very large poll with 100,000 people, theory tells us that we would predict the election perfectly since the largest possible margin of error is around 0.3%:

[Plot: largest possible margin of error versus \(p\) for a poll of 100,000 people]

One reason is that running such a poll is very expensive. Another possibly more important reason is that theory has its limitations. Polling is much more complicated than picking beads from an urn. Some people might lie to pollsters and others might not have phones. But perhaps the most important way an actual poll differs from an urn model is that we actually don’t know for sure who is in our population and who is not. How do we know who is going to vote? Are we reaching all possible voters? Hence, even if our margin of error is very small, it might not be exactly right that our expected value is \(p\) . We call this bias. Historically, we observe that polls are indeed biased, although not by that much. The typical bias appears to be about 1-2%. This makes election forecasting a bit more interesting and we will talk about how to model this in a later chapter.

15.5 Exercises

1. Write an urn model function that takes the proportion of Democrats \(p\) and the sample size \(N\) as arguments and returns the sample average if Democrats are 1s and Republicans are 0s. Call the function take_sample .

2. Now assume p <- 0.45 and that your sample size is \(N=100\) . Take a sample 10,000 times and save the vector of mean(X) - p into an object called errors . Hint: use the function you wrote for exercise 1 to write this in one line of code.

3. The vector errors contains, for each simulated sample, the difference between the actual \(p\) and our estimate \(\bar{X}\) . We refer to this difference as the error . Compute the average and make a histogram of the errors generated in the Monte Carlo simulation and select which of the following best describes their distributions:

  • The errors are all about 0.05.
  • The errors are all about -0.05.
  • The errors are symmetrically distributed around 0.
  • The errors range from -1 to 1.

4. The error \(\bar{X}-p\) is a random variable. In practice, the error is not observed because we do not know \(p\) . Here we observe it because we constructed the simulation. What is the average size of the error if we define the size by taking the absolute value \(\mid \bar{X} - p \mid\) ?

5. The standard error is related to the typical size of the error we make when predicting. We say size because we just saw that the errors are centered around 0, so the average error value is 0. For mathematical reasons related to the Central Limit Theorem, we actually use the standard deviation of the errors rather than the average of the absolute values to quantify the typical size. What is this standard deviation of the errors?

6. The theory we just learned tells us what this standard deviation is going to be because it is the standard error of \(\bar{X}\) . What does theory tell us is the standard error of \(\bar{X}\) for a sample size of 100?

7. In practice, we don’t know \(p\), so we construct an estimate of the theoretical prediction by plugging in \(\bar{X}\) for \(p\). Compute this estimate. Set the seed at 1 with set.seed(1) .

8. Note how close the standard error estimates obtained from the Monte Carlo simulation (exercise 5), the theoretical prediction (exercise 6), and the estimate of the theoretical prediction (exercise 7) are. The theory is working and it gives us a practical approach to knowing the typical error we will make if we predict \(p\) with \(\bar{X}\) . Another advantage that the theoretical result provides is that it gives an idea of how large a sample size is required to obtain the precision we need. Earlier we learned that the largest standard errors occur for \(p=0.5\) . Create a plot of the largest standard error for \(N\) ranging from 100 to 5,000. Based on this plot, how large does the sample size have to be to have a standard error of about 1%?

9. For sample size \(N=100\) , the central limit theorem tells us that the distribution of \(\bar{X}\) is:

  • practically equal to \(p\) .
  • approximately normal with expected value \(p\) and standard error \(\sqrt{p(1-p)/N}\) .
  • approximately normal with expected value \(\bar{X}\) and standard error \(\sqrt{\bar{X}(1-\bar{X})/N}\) .
  • not a random variable.

10. Based on the answer from exercise 8, the error \(\bar{X} - p\) is:

  • practically equal to 0.
  • approximately normal with expected value \(0\) and standard error \(\sqrt{p(1-p)/N}\) .

11. To corroborate your answer to exercise 9, make a qq-plot of the errors you generated in exercise 2 to see if they follow a normal distribution.

12. If \(p=0.45\) and \(N=100\) as in exercise 2, use the CLT to estimate the probability that \(\bar{X}>0.5\) . You can assume you know \(p=0.45\) for this calculation.

13. Assume you are in a practical situation and you don’t know \(p\) . Take a sample of size \(N=100\) and obtain a sample average of \(\bar{X} = 0.51\) . What is the CLT approximation for the probability that your error is equal to or larger than 0.01?

15.6 Confidence intervals

Confidence intervals are a very useful concept widely employed by data analysts. A version of these that is commonly seen comes from the ggplot2 geometry geom_smooth. Here is an example using a temperature dataset available in R:

[Plot: geom_smooth curve with its confidence band for the temperature dataset]

In the Machine Learning part we will learn how the curve is formed, but for now consider the shaded area around the curve. This is created using the concept of confidence intervals.

In our earlier competition, you were asked to give an interval. If the interval you submitted includes the \(p\) , you get half the money you spent on your “poll” back and pass to the next stage of the competition. One way to pass to the second round is to report a very large interval. For example, the interval \([0,1]\) is guaranteed to include \(p\) . However, with an interval this big, we have no chance of winning the competition. Similarly, if you are an election forecaster and predict the spread will be between -100% and 100%, you will be ridiculed for stating the obvious. Even a smaller interval, such as saying the spread will be between -10 and 10%, will not be considered serious.

On the other hand, the smaller the interval we report, the smaller our chances are of winning the prize. Likewise, a bold pollster that reports very small intervals and misses the mark most of the time will not be considered a good pollster. We want to be somewhere in between.

We can use the statistical theory we have learned to compute the probability of any given interval including \(p\) . If we are asked to create an interval with, say, a 95% chance of including \(p\) , we can do that as well. These are called 95% confidence intervals.

When a pollster reports an estimate and a margin of error, they are, in a way, reporting a 95% confidence interval. Let’s show how this works mathematically.

We want to know the probability that the interval \([\bar{X} - 2\hat{\mbox{SE}}(\bar{X}), \bar{X} + 2\hat{\mbox{SE}}(\bar{X})]\) contains the true proportion \(p\) . First, consider that the start and end of these intervals are random variables: every time we take a sample, they change. To illustrate this, run the Monte Carlo simulation above twice. We use the same parameters as above:
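A sketch of one such run (reusing p = 0.45 and N = 1000 from the earlier simulation; each run produces a different interval):

    p <- 0.45
    N <- 1000
    x <- sample(c(0, 1), size = N, replace = TRUE, prob = c(1 - p, p))
    x_hat <- mean(x)
    se_hat <- sqrt(x_hat * (1 - x_hat) / N)
    c(x_hat - 1.96 * se_hat, x_hat + 1.96 * se_hat)    # one 95% confidence interval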

If you run that code twice, you will notice that the two intervals you get are different.

Keep sampling and creating intervals and you will see the random variation.

To determine the probability that the interval includes \(p\) , we need to compute this: \[ \mbox{Pr}\left(\bar{X} - 1.96\hat{\mbox{SE}}(\bar{X}) \leq p \leq \bar{X} + 1.96\hat{\mbox{SE}}(\bar{X})\right) \]

By subtracting and dividing the same quantities in all parts of the equation, we get that the above is equivalent to:

\[ \mbox{Pr}\left(-1.96 \leq \frac{\bar{X}- p}{\hat{\mbox{SE}}(\bar{X})} \leq 1.96\right) \]

The term in the middle is an approximately normal random variable with expected value 0 and standard error 1, which we have been denoting with \(Z\) , so we have:

\[ \mbox{Pr}\left(-1.96 \leq Z \leq 1.96\right) \]

which we can quickly compute using pnorm:
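A one-line check:

    pnorm(1.96) - pnorm(-1.96)    # 0.9500042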

proving that we have a 95% probability.

If we want to have a larger probability, say 99%, we need to multiply by whatever z satisfies the following:

\[ \mbox{Pr}\left(-z \leq Z \leq z\right) = 0.99 \]

Setting z <- qnorm(0.995) will achieve this, because by definition pnorm(qnorm(0.995)) is 0.995 and, by symmetry, pnorm(-qnorm(0.995)) is 1 - 0.995. As a consequence, we have that pnorm(z) - pnorm(-z) is 0.995 - 0.005 = 0.99.

We can use this approach for any probability, not just 0.95 and 0.99. In statistics textbooks, these are usually written for any probability as \(1-\alpha\). We can then obtain the \(z\) for the equation above by using z = qnorm(1 - alpha / 2), because \(1 - \alpha/2 - \alpha/2 = 1 - \alpha\).

So, for example, for \(\alpha=0.05\), \(1 - \alpha/2 = 0.975\) and we get the 1.96 we have been using, since qnorm(0.975) is approximately 1.96.

15.6.1 A Monte Carlo simulation

We can run a Monte Carlo simulation to confirm that, in fact, a 95% confidence interval includes \(p\) 95% of the time.
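A sketch of such a simulation (reusing p = 0.45 and N = 1000; a base-R comparison checks whether the true p falls inside each interval):

    p <- 0.45
    N <- 1000
    B <- 10000
    inside <- replicate(B, {
      x <- sample(c(0, 1), size = N, replace = TRUE, prob = c(1 - p, p))
      x_hat <- mean(x)
      se_hat <- sqrt(x_hat * (1 - x_hat) / N)
      # does this 95% interval cover the true p?
      (x_hat - 1.96 * se_hat <= p) & (p <= x_hat + 1.96 * se_hat)
    })
    mean(inside)    # should be close to 0.95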

The following plot shows the first 100 confidence intervals. In this case, we created the simulation so the black line denotes the parameter we are trying to estimate:

[Plot: the first 100 simulated confidence intervals, with the true \(p\) marked by a line]

15.6.2 The correct language

When using the theory we described above, it is important to remember that it is the intervals that are random, not \(p\). In the plot above, we can see the random intervals moving around while \(p\), represented with the vertical line, stays in the same place. The proportion of blue beads in the urn, \(p\), is not random. So the 95% relates to the probability that this random interval falls on top of \(p\). Saying that \(p\) has a 95% chance of being between this and that is technically an incorrect statement because \(p\) is not random.

15.7 Exercises

For these exercises, we will use actual polls from the 2016 election. You can load the data from the dslabs package.

Specifically, we will use all the national polls that ended within one week before the election.

1. For the first poll, you can obtain the sample size and estimated Clinton percentage with:
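A sketch of how this might look (it assumes the polls_us_election_2016 dataset in dslabs, with its enddate, state, samplesize, and rawpoll_clinton columns, and one reasonable reading of "within one week before the election"):

    library(dslabs)
    library(dplyr)
    data(polls_us_election_2016)
    polls <- polls_us_election_2016 %>%
      filter(enddate >= "2016-10-31" & state == "U.S.")    # national polls, final week
    N <- polls$samplesize[1]
    x_hat <- polls$rawpoll_clinton[1] / 100                # estimated Clinton proportion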

Assume there are only two candidates and construct a 95% confidence interval for the election night proportion \(p\) .

2. Now use dplyr to add a confidence interval as two columns, call them lower and upper, to the object poll. Then use select to show the pollster, enddate, x_hat, lower, upper variables. Hint: define temporary columns x_hat and se_hat.

3. The final tally for the popular vote was Clinton 48.2% and Trump 46.1%. Add a column, call it hit, to the previous table stating if the confidence interval included the true proportion \(p=0.482\) or not.

4. For the table you just created, what proportion of confidence intervals included \(p\) ?

5. If these confidence intervals are constructed correctly, and the theory holds up, what proportion should include \(p\) ?

6. A much smaller proportion of the polls than expected produce confidence intervals containing \(p\). If you look closely at the table, you will see that most polls that fail to include \(p\) are underestimating. The reason for this is undecided voters, individuals polled that do not yet know who they will vote for or do not want to say. Because, historically, undecideds divide evenly between the two main candidates on election day, it is more informative to estimate the spread or the difference between the proportion of two candidates \(d\), which in this election was \(0.482 - 0.461 = 0.021\). Assume that there are only two parties and that \(d = 2p - 1\), redefine polls as below and re-do exercise 1, but for the difference.

7. Now repeat exercise 3, but for the difference.

8. Now repeat exercise 4, but for the difference.

9. Although the proportion of confidence intervals goes up substantially, it is still lower than 0.95. In the next chapter, we learn the reason for this. To motivate this, make a plot of the error, the difference between each poll’s estimate and the actual \(d=0.021\) . Stratify by pollster.

10. Redo the plot that you made for exercise 9, but only for pollsters that took five or more polls.

15.8 Power

Pollsters are not successful at providing correct confidence intervals, but rather at predicting who will win. When we took a sample size of 25 beads, the confidence interval for the spread includes 0. If this were a poll and we were forced to make a declaration, we would have to say it was a “toss-up”.

A problem with our poll results is that, given the sample size and the value of \(p\), we would have to accept a higher probability of an incorrect call in order to create an interval that does not include 0.

This does not mean that the election is close. It only means that we have a small sample size. In statistical textbooks this is called lack of power . In the context of polls, power is the probability of detecting spreads different from 0.

By increasing our sample size, we lower our standard error and therefore have a much better chance of detecting the direction of the spread.

15.9 p-values

p-values are ubiquitous in the scientific literature. They are related to confidence intervals so we introduce the concept here.

Let’s consider the blue and red beads. Suppose that rather than wanting an estimate of the spread or the proportion of blue, I am interested only in the question: are there more blue beads or red beads? I want to know if the spread \(2p - 1 > 0\) or, equivalently, if \(p > 0.5\).

Say we take a random sample of \(N=100\) and we observe \(52\) blue beads, which gives us \(\bar{X} = 0.52\). This seems to be pointing to the existence of more blue than red beads since 0.52 is larger than 0.5. However, as data scientists we need to be skeptical. We know there is chance involved in this process and we could get a 52 even when the actual spread is 0. We call the assumption that the spread is 0, or equivalently that \(p = 0.5\), a null hypothesis. The null hypothesis is the skeptic’s hypothesis. We have observed a random variable \(\bar{X} = 0.52\) and the p-value is the answer to the question: how likely is it to see a value this large, when the null hypothesis is true? So we write:

\[\mbox{Pr}(\mid \bar{X} - 0.5 \mid > 0.02 ) \]

assuming the \(p=0.5\) . Under the null hypothesis we know that:

\[ \sqrt{N}\frac{\bar{X} - 0.5}{\sqrt{0.5(1-0.5)}} \]

is standard normal. We therefore can compute the probability above, which is the p-value.

\[\mbox{Pr}\left(\sqrt{N}\frac{\mid \bar{X} - 0.5\mid}{\sqrt{0.5(1-0.5)}} > \sqrt{N} \frac{0.02}{ \sqrt{0.5(1-0.5)}}\right)\]
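A quick sketch of the computation, with N = 100 as stated above:

    N <- 100
    z <- sqrt(N) * 0.02 / 0.5     # the observed value of the standardized statistic
    1 - (pnorm(z) - pnorm(-z))    # two-sided p-value, approximately 0.69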

In this case, there is actually a large chance of seeing 52 or larger under the null hypothesis.

Keep in mind that there is a close connection between p-values and confidence intervals. If a 95% confidence interval of the spread does not include 0, we know that the p-value must be smaller than 0.05.

To learn more about p-values, you can consult any statistics textbook. However, in general, we prefer reporting confidence intervals over p-values since it gives us an idea of the size of the estimate. If we just report the p-value we provide no information about the significance of the finding in the context of the problem.

15.10 Association tests

The statistical tests we have studied up to now leave out a substantial portion of data types. Specifically, we have not discussed inference for binary, categorical, and ordinal data. To give a very specific example, consider the following case study.

A 2014 PNAS paper 53 analyzed success rates from funding agencies in the Netherlands and concluded that their:

results reveal gender bias favoring male applicants over female applicants in the prioritization of their “quality of researcher” (but not “quality of proposal”) evaluations and success rates, as well as in the language use in instructional and evaluation materials.

The main evidence for this conclusion comes down to a comparison of the percentages. Table S1 in the paper includes the information we need: for each gender, the number of applications that were awarded funding and the number that were not. Computing the totals that were successful and the totals that were not, we see that a larger percent of men than women received awards (a sketch of the computation is shown below).
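A minimal sketch using the counts that appear later in this section (290 awarded and 1,345 not awarded for men; 177 awarded and 1,011 not awarded for women):

    totals <- data.frame(gender      = c("men", "women"),
                         awarded     = c(290, 177),
                         not_awarded = c(1345, 1011))
    totals$success_rate <- totals$awarded / (totals$awarded + totals$not_awarded)
    totals    # success rates: about 17.7% for men and 14.9% for women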

But could this be due just to random variability? Here we learn how to perform inference for this type of data.

15.10.1 Lady Tasting Tea

R.A. Fisher 54 was one of the first to formalize hypothesis testing. The “Lady Tasting Tea” is one of the most famous examples.

The story is as follows: an acquaintance of Fisher’s claimed that she could tell if milk was added before or after tea was poured. Fisher was skeptical. He designed an experiment to test this claim. He gave her four pairs of cups of tea: one with milk poured first, the other after. The order was randomized. The null hypothesis here is that she is guessing. Fisher derived the distribution for the number of correct picks on the assumption that the choices were random and independent.

As an example, suppose she picked 3 out of 4 correctly. Do we believe she has a special ability? The basic question we ask is: if the tester is actually guessing, what are the chances that she gets 3 or more correct? Just as we have done before, we can compute a probability under the null hypothesis that she is guessing 4 of each. Under this null hypothesis, we can think of this particular example as picking 4 balls out of an urn with 4 blue (correct answer) and 4 red (incorrect answer) balls. Remember, she knows that there are four before tea and four after.

Under the null hypothesis that she is simply guessing, each ball has the same chance of being picked. We can then use combinations to figure out each probability. The probability of picking 3 is \(\binom{4}{3} \binom{4}{1} / \binom{8}{4} = 16/70\). The probability of picking all 4 correct is \(\binom{4}{4} \binom{4}{0} / \binom{8}{4}= 1/70\). Thus, the chance of observing a 3 or something more extreme, under the null hypothesis, is \(\approx 0.24\). This is the p-value. The procedure that produced this p-value is called Fisher’s exact test and it uses the hypergeometric distribution.
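A quick check of that number with R's hypergeometric functions:

    # Probability of guessing 3 or 4 cups correctly when picking 4 out of an
    # "urn" with 4 correct and 4 incorrect choices
    sum(dhyper(3:4, m = 4, n = 4, k = 4))    # 17/70, about 0.243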

15.10.2 Two-by-two tables

The data from the experiment are usually summarized by a table like this (here filled in for the case in which she correctly identified three of the four cups poured milk-first):

Guessed before Guessed after
Poured before 3 1
Poured after 1 3

These are referred to as a two-by-two table. For each of the four combinations one can get with a pair of binary variables, they show the observed counts for each occurrence.

The function fisher.test performs the inference calculations above:
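A sketch in R using the table above (alternative = "greater" matches the one-sided question of whether she does better than guessing):

    tab <- matrix(c(3, 1, 1, 3), nrow = 2,
                  dimnames = list(poured  = c("before", "after"),
                                  guessed = c("before", "after")))
    fisher.test(tab, alternative = "greater")    # p-value about 0.24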

15.10.3 Chi-square Test

Notice that, in a way, our funding rates example is similar to the Lady Tasting Tea. However, in the Lady Tasting Tea example, the number of blue and red beads is experimentally fixed and the number of answers given for each category is also fixed. This is because Fisher made sure there were four cups with milk poured before tea and four cups with milk poured after and the lady knew this, so the answers would also have to include four befores and four afters. If this is the case, the sum of the rows and the sum of the columns are fixed. This defines constraints on the possible ways we can fill the two by two table and also permits us to use the hypergeometric distribution. In general, this is not the case. Nonetheless, there is another approach, the Chi-squared test, which is described below.

Imagine we have 290, 1,345, 177, and 1,011 applicants: some are men and some are women, and some get funded whereas others don’t. We saw that the success rates for men and women were about 17.7% and 14.9%, respectively.

Would we see this again if we randomly assigned funding at the overall rate, which is about 16.5% (467 awards out of 2,823 applications)?

The Chi-square test answers this question. The first step is to create the two-by-two data table:
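A sketch of that step (the layout, with a label column plus one column per gender, is a choice made here for readability):

    two_by_two <- data.frame(awarded = c("no", "yes"),
                             men     = c(1345, 290),
                             women   = c(1011, 177))
    two_by_two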

The general idea of the Chi-square test is to compare this two-by-two table to what you expect to see, which would be:
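The expected counts under the overall funding rate can be sketched as:

    overall <- (290 + 177) / (290 + 1345 + 177 + 1011)    # about 0.165
    data.frame(awarded = c("no", "yes"),
               men     = round(1635 * c(1 - overall, overall)),
               women   = round(1188 * c(1 - overall, overall)))
    # roughly 270 men and 197 women awarded, versus the observed 290 and 177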

We can see that more men than expected and fewer women than expected received funding. However, under the null hypothesis these observations are random variables. The Chi-square test tells us how likely it is to see a deviation this large or larger. This test uses an asymptotic result, similar to the CLT, related to the sums of independent binary outcomes. The R function chisq.test takes a two-by-two table and returns the results from the test:
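Using the two_by_two table sketched above and dropping its label column:

    chisq.test(two_by_two[, -1])    # p-value approximately 0.051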

We see that the p-value is 0.0509.

15.10.4 The odds ratio

An informative summary statistic associated with two-by-two tables is the odds ratio. Define the two variables as \(X = 1\) if you are a male and 0 otherwise, and \(Y=1\) if you are funded and 0 otherwise. The odds of getting funded if you are a man is defined:

\[\mbox{Pr}(Y=1 \mid X=1) / \mbox{Pr}(Y=0 \mid X=1)\]

and can be computed like this:
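For example, with the counts used above:

    odds_men <- (290 / 1635) / (1345 / 1635)    # simplifies to 290 / 1345
    odds_men                                    # about 0.22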

And the odds of being funded if you are a woman is:

\[\mbox{Pr}(Y=1 \mid X=0) / \mbox{Pr}(Y=0 \mid X=0)\]

The odds ratio is the ratio for these two odds: how many times larger are the odds for men than for women?
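Continuing the sketch:

    odds_women <- (177 / 1188) / (1011 / 1188)    # simplifies to 177 / 1011
    odds_men / odds_women                         # odds ratio, about 1.23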

We often see two-by-two tables written out as

Men Women
Awarded a b
Not Awarded c d

In this case, the odds ratio is \(\frac{a/c}{b/d}\), which is equivalent to \((ad) / (bc)\).

15.10.5 Confidence intervals for the odds ratio

Computing confidence intervals for the odds ratio is not mathematically straightforward. Unlike other statistics, for which we can derive useful approximations of their distributions, the odds ratio is not only a ratio, but a ratio of ratios. Therefore, there is no simple way of using, for example, the CLT.

However, statistical theory tells us that when all four entries of the two-by-two table are large enough, then the log of the odds ratio is approximately normal with standard error

\[ \sqrt{1/a + 1/b + 1/c + 1/d} \]

This implies that a 95% confidence interval for the log odds ratio can be formed by:

\[ \log\left(\frac{ad}{bc}\right) \pm 1.96 \sqrt{1/a + 1/b + 1/c + 1/d} \]

By exponentiating these two numbers we can construct a confidence interval of the odds ratio.

Using R we can compute this confidence interval as follows:
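
Below is a sketch using the cell counts from the funding table (a = 290, b = 177, c = 1,345, d = 1,011); the original code may have differed.

```r
a <- 290; b <- 177; c <- 1345; d <- 1011
log_or <- log((a * d) / (b * c))        # log odds ratio, about 0.21
se     <- sqrt(1/a + 1/b + 1/c + 1/d)   # standard error, about 0.10
lower  <- log_or - qnorm(0.975) * se    # about 0.004
upper  <- log_or + qnorm(0.975) * se    # about 0.41
```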

If we want to convert it back to the odds ratio scale, we can exponentiate:
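
Continuing the sketch:

```r
exp(lower)   # about 1.00
exp(upper)   # about 1.51
```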

Note that 1 is not included in the confidence interval which must mean that the p-value is smaller than 0.05. We can confirm this using:
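
One way to do this is to use the same normal approximation to the log odds ratio (a sketch, not necessarily the book's original code):

```r
2 * (1 - pnorm(abs(log_or) / se))   # two-sided p-value, about 0.045
```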

This is a slightly different p-value than that with the Chi-square test. This is because we are using a different asymptotic approximation to the null distribution. To learn more about inference and asymptotic theory for odds ratio, consult the Generalized Linear Models book by McCullagh and Nelder.

15.10.6 Small count correction

Note that the log odds ratio is not defined if any of the cells of the two-by-two table is 0. This is because if \(a\), \(b\), \(c\), or \(d\) is 0, then \(\log(\frac{ad}{bc})\) is either the log of 0 or has a 0 in the denominator. For this situation, it is common practice to avoid 0s by adding 0.5 to each cell. This is referred to as the Haldane–Anscombe correction and has been shown, both in practice and in theory, to work well.

15.10.7 Large samples, small p-values

As mentioned earlier, reporting only p-values is not an appropriate way to report the results of data analysis. In scientific journals, for example, some studies seem to overemphasize p-values. Some of these studies have large sample sizes and report impressively small p-values. Yet when one looks closely at the results, we realize the odds ratios are quite modest: barely bigger than 1. In this case, the difference may not be practically or scientifically significant.

Note that the relationship between odds ratio and p-value is not one-to-one. It depends on the sample size. So a very small p-value does not necessarily mean a very large odds ratio. Notice what happens to the p-value if we multiply our two-by-two table by 10, which does not change the odds ratio:
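
For example, reusing the funding table from the sketch above:

```r
chisq.test(funding * 10)   # odds ratio unchanged, but the p-value is now far smaller
```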

15.11 Exercises

1. A famous athlete has an impressive career, winning 70% of her 500 career matches. However, this athlete gets criticized because in important events, such as the Olympics, she has a losing record of 8 wins and 9 losses. Perform a Chi-square test to determine if this losing record can be simply due to chance as opposed to not performing well under pressure.

2. Why did we use the Chi-square test instead of Fisher’s exact test in the previous exercise?

  • It actually does not matter, since they give the exact same p-value.
  • Fisher’s exact and the Chi-square are different names for the same test.
  • Because the sum of the rows and columns of the two-by-two table are not fixed so the hypergeometric distribution is not an appropriate assumption for the null hypothesis. For this reason, Fisher’s exact test is rarely applicable with observational data.
  • Because the Chi-square test runs faster.

3. Compute the odds ratio of “losing under pressure” along with a confidence interval.

4. Notice that the p-value is larger than 0.05 but the 95% confidence interval does not include 1. What explains this?

  • We made a mistake in our code.
  • These are based on t-statistics so the connection between p-value and confidence intervals does not apply.
  • Different approximations are used for the p-value and the confidence interval calculation. If we had a larger sample size the match would be better.
  • We should use the Fisher exact test to get confidence intervals.

5. Multiply the two-by-two table by 2 and see if the p-value and confidence interval are a better match.

http://www.realclearpolitics.com ↩︎

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton-5491.html ↩︎

http://www.pnas.org/content/112/40/12349.abstract ↩︎

https://en.wikipedia.org/wiki/Ronald_Fisher ↩︎

Purdue Online Writing Lab (Purdue OWL), College of Liberal Arts

Basic Inferential Statistics: Theory and Application


The heart of statistics is inferential statistics. Descriptive statistics are typically straightforward and easy to interpret. Unlike descriptive statistics, inferential statistics are often complex and may have several different interpretations.

The goal of inferential statistics is to discover some property or general pattern about a large group by studying a smaller group of people in the hopes that the results will generalize to the larger group. For example, we may ask residents of New York City their opinion about their mayor. We would probably poll a few thousand individuals in New York City in an attempt to find out how the city as a whole views their mayor. The following section examines how this is done.

A population is the entire group of people you would like to know something about. In our previous example of New York City, the population is all of the people living in New York City. It should not include people from England, visitors in New York, or even people who know a lot about New York City.

A sample is a subset of the population. Just like you may sample different types of ice cream at the grocery store, a sample of a population should be just a smaller version of the population.

It is extremely important to understand how the sample being studied was drawn from the population. The sample should be as representative of the population as possible. There are several valid ways of creating a sample from a population, but inferential statistics works best when the sample is drawn at random from the population. Given a large enough sample, drawing at random ensures a fair and representative sample of a population.

Comparing two or more groups

Much of statistics, especially in medicine and psychology, is used to compare two or more groups and attempts to figure out if the two groups are different from one another.

Example: Drug X

Let us say that a drug company has developed a pill, which they think shortens the recovery time from the common cold. How would they actually find out if the pill works or not? What they might do is get two groups of people from the same population (say, people from a small town in Indiana who had just caught a cold) and administer the pill to one group, and give the other group a placebo. They could then measure how many days each group took to recover (typically, one would calculate the mean of each group). Let's say that the mean recovery time for the group with the new drug was 5.4 days, and the mean recovery time for the group with the placebo was 5.8 days.

The question becomes, is this difference due to random chance, or does taking the pill actually help you recover from the cold faster? The means of the two groups alone do not help us determine the answer to this question. We need additional information.

Sample Size

If our example study only consisted of two people (one from the drug group and one from the placebo group) there would be so few participants that we would not have much confidence that there is a difference between the two groups. That is to say, there is a high probability that chance explains our results (any number of explanations might account for this, for example, one person might be younger, and thus have a better immune system). However, if our sample consisted of 1,000 people in each group, then the results become much more robust (while it might be easy to say that one person is younger than another, it is hard to say that 1,000 random people are younger than another 1,000 random people). If the sample is drawn at random from the population, then these 'random' variations in participants should be approximately equal in the two groups, given that the two groups are large. This is why inferential statistics works best when there are lots of people involved.

Be wary of statistics that have small sample sizes, unless they are in a peer-reviewed journal. Professional statisticians can interpret results correctly from small sample sizes, and often do, but not everyone is a professional, and novice statisticians often incorrectly interpret results. Also, if your author has an agenda, they may knowingly misinterpret results. If your author does not give a sample size, then he or she is probably not a professional, and you should be wary of the results. Sample sizes are required information in almost all peer-reviewed journals, and therefore, should be included in anything you write as well.

Variability

Even if we have a large enough sample size, we still need more information to reach a conclusion. What we need is some measure of variability. We know that the typical person takes about 5-6 days to recover from a cold, but does everyone recover around 5-6 days, or do some people recover in 1 day, and others recover in 10 days? Understanding the spread of the data will tell us how effective the pill is. If everyone in the placebo group takes exactly 5.8 days to recover, then it is clear that the pill has a positive effect, but if people have a wide variability in their length of recovery (and they probably do) then the picture becomes a little fuzzy. Only when the mean, sample size, and variability have been calculated can a proper conclusion be made. In our case, if the sample size is large, and the variability is small, then we would receive a small p-value (probability-value). Small p-values are good, and this term is prominent enough to warrant further discussion.
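
To make this concrete, here is a hypothetical simulation in R: the group means of 5.4 and 5.8 days come from the example above, while the spread and group sizes are assumptions.

```r
set.seed(1)
drug    <- rnorm(1000, mean = 5.4, sd = 1.2)   # recovery times, drug group
placebo <- rnorm(1000, mean = 5.8, sd = 1.2)   # recovery times, placebo group
t.test(drug, placebo)   # with 1,000 per group, a 0.4-day gap yields a tiny p-value
```

With only a handful of participants per group, or a much wider spread, the same 0.4-day difference would be indistinguishable from chance.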

In classic inferential statistics, we make two hypotheses before we start our study, the null hypothesis, and the alternative hypothesis.

Null Hypothesis: States that the two groups we are studying are the same.

Alternative Hypothesis: States that the two groups we are studying are different.

The goal in classic inferential statistics is to prove the null hypothesis wrong. The logic says that if the two groups aren't the same, then they must be different. A low p-value indicates a low probability that the null hypothesis is correct (thus, providing evidence for the alternative hypothesis).

Remember: It's good to have low p-values.


Statistics By Jim

Making statistics intuitive

Statistical Inference: Definition, Methods & Example

By Jim Frost

What is Statistical Inference?

Statistical inference is the process of using a sample to infer the properties of a population . Statistical procedures use sample data to estimate the characteristics of the whole population from which the sample was drawn.


Unfortunately, populations are usually too large to measure fully. Consequently, researchers must use a manageable subset of that population to learn about it.

By using procedures that can make statistical inferences, you can estimate the properties and processes of a population. More specifically, sample statistics can estimate population parameters . Learn more about the differences between sample statistics and population parameters .

For example, imagine that you are studying a new medication. As a scientist, you’d like to understand the medicine’s effect in the entire population rather than just a small sample. After all, knowing the effect on a handful of people isn’t very helpful for the larger society!

Consequently, you are interested in making a statistical inference about the medicine’s effect in the population.

Read on to see how to do that! I’ll show you the general process for making a statistical inference and then cover an example using real data.

Related posts : Populations vs. Samples and Descriptive vs. Inferential Statistics

How to Make Statistical Inferences

In its simplest form, the process of making a statistical inference requires you to do the following:

  • Draw a sample that adequately represents the population.
  • Measure your variables of interest.
  • Use appropriate statistical methodology to generalize your sample results to the population while accounting for sampling error.

Of course, that’s the simple version. In real-world experiments, you might need to form treatment and control groups , administer treatments, and reduce other sources of variation. In more complex cases, you might need to create a model of a process. There are many details in the process of making a statistical inference! Learn how to incorporate statistical inference into scientific studies .

Statistical inference requires using specialized sampling methods that tend to produce representative samples. If the sample does not look like the larger population you’re studying, you can’t trust any inferences from the sample. Consequently, using an appropriate method to obtain your sample is crucial. The best sampling methods tend to produce samples that look like the target population. Learn more about Sampling Methods and Representative Samples .

After obtaining a representative sample, you’ll need to use a procedure that can make statistical inferences. While you might have a sample that looks similar to the population, it will never be identical to it. Statisticians refer to the differences between a sample and the population as sampling error. Any effect or relationship you see in your sample might actually be sampling error rather than a true finding. Inferential statistics incorporate sampling error into the results. Learn more about Sampling Error .

Common Inferential Methods

The following are four standard procedures that can make statistical inferences.

  • Hypothesis Testing : Uses representative samples to assess two mutually exclusive hypotheses about a population. Statistically significant results suggest that the sample effect or relationship exists in the population after accounting for sampling error.
  • Confidence Intervals : A range of values likely containing the population value. This procedure evaluates the sampling error and adds a margin around the estimate, giving an idea of how wrong it might be.
  • Margin of Error : Comparable to a confidence interval but usually for survey results.
  • Regression Modeling : An estimate of the process that generates the outcomes in the population.

Example Statistical Inference

Let’s look at a real flu vaccine study for an example of making a statistical inference. The scientists for this study want to evaluate whether a flu vaccine effectively reduces flu cases in the general population. However, the general population is much too large to include in their study, so they must use a representative sample to make a statistical inference about the vaccine’s effectiveness.

The Monto et al. study* evaluates the 2007-2008 flu season and follows its participants from January to April. Participants are 18-49 years old. They selected ~1100 participants and randomly assigned them to the vaccine and placebo groups. After tracking them for the flu season, they record the number of flu infections in each group, as shown below.

Group | Flu Infections | Group Size | Infection Rate
Placebo | 35 | 325 | 10.8%
Vaccine | 28 | 813 | 3.4%

Effect (difference in infection rates): 7.4%

Monto Study Findings

From the table above, 10.8% of the unvaccinated got the flu, while only 3.4% of the vaccinated caught it. The apparent effect of the vaccine is 10.8% – 3.4% = 7.4%. While that seems to show a vaccine effect, it might be a fluke due to sampling error. We’re assessing only 1,100 people out of a population of millions. We need to use a hypothesis test and confidence interval (CI) to make a proper statistical inference.

While the details go beyond this introductory post, here are two statistical inferences we can make using a 2-sample proportions test and CI.

  • The p-value of the test is < 0.0005. The evidence strongly favors the hypothesis that the vaccine effectively reduces flu infections in the population after accounting for sampling error.
  • Additionally, the confidence interval for the effect size is 3.7% to 10.9%. Our study found a sample effect of 7.4%, but it is unlikely to equal the population effect exactly due to sampling error. The CI identifies a range that is likely to include the population effect.
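
Here is a sketch of that two-sample proportions test in R, using the counts from the table above (35 flu cases among 325 placebo participants versus 28 among 813 vaccinated participants):

```r
prop.test(x = c(35, 28), n = c(325, 813))
# p-value well below 0.0005; the 95% CI for the difference in infection rates
# is roughly 3.7% to 10.9% (exact bounds depend on the continuity correction)
```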

For more information about this and other flu vaccine studies, read my post about Flu Vaccine Effectiveness .

In conclusion, by using a representative sample and the proper methodology, we made a statistical inference about vaccine effectiveness in an entire population.

Monto AS, Ohmit SE, Petrie JG, Johnson E, Truscon R, Teich E, Rotthoff J, Boulton M, Victor JC. Comparative efficacy of inactivated and live attenuated influenza vaccines. N Engl J Med. 2009;361(13):1260-7.


Introduction to Inferential Statistics

Current as of 2024-01-05

Lecture : MW 12-1:30pm (MCNB 309)

Dr. Marc Trussler

[email protected]

Fox-Fels Hall 32 (3814 Walnut Street)

Office Hours: M 9-11am

TA: Dylan Radley

[email protected]

Fox-Fels Hall 35 (3814 Walnut Street)

Office Hours:

Tuesday 11-12

Tuesday 3-4

Thursday 12-1

Course Description

The first step of many data science sequences is to learn a great deal about how to work with individual data sets: cleaning, tidying, merging, describing and visualizing data. These are crucial skills in data analytics, but describing a data set is not our ultimate goal. The ultimate goal of data science is to make inferences about the world based on the small sample of data that we have.

PSCI 1801 shifts focus to this goal of inference. Using a methodology that emphasizes intuition and simulation over mathematics, this course will cover the key statistical concepts of probability, sampling, distributions, hypothesis testing, and covariance. The goal of the class is for students to ultimately have the knowledge and ability to perform, customize, and explain bivariate and multivariate regression. Students who have not taken PSCI-1800 should have basic familiarity with R, including working with vectors and matrices, basic summary statistics, visualizations, and for() loops.

Expectations and policies

Prerequisite knowledge.

PSCI 1800 (formerly 107) or similar R course. To help us better understand the nature of inferential statistics, we will be running quite a lot of simulations in R . Students entering the class should have a working knowledge of the R programming language, and in particular know how to use square brackets to index vectors and to run for() loops. We will be doing a short refresher on these concepts in the first two weeks of class.

Course Slack Channel

We will use Slack to communicate with the class. You will receive an invitation to join our channel shortly after the start of class. One of the better things to come out of the pandemic is the use of Slack for classroom communications. It is a really good tool that allows us to send quick and informal messages to individual students or groups (or for you to message us). Similarly, it allows you to collaborate with other students in the class, and it is a great place to get simple questions answered. Because we will be making announcements via Slack, it is extremely important that you get this set up.

Format/Attendance

The lectures will be in person. While this is not a discussion-based class, there is an expectation of some amount of participation and feedback. Attendance will not be recorded, though do note you are scored on participation.

The course will require students to have access to a personal computer in order to run the statistics software. If this is not possible, please consult with one of the instructors as soon as possible. Support to cover course costs is available through Student Financial Services ( https://srfs.upenn.edu/sfs ).

Academic integrity

We expect all students to abide by the rules of the University and to follow the Code of Academic Integrity. 1

For Problem Sets : Collaboration on problem sets is permitted. Ultimately, however, the write-up and code that you turn in must be your own creation. Please write the names of any students you worked with at the top of each problem set. 2

For Exams : Collaboration on the take home exams is cheating. Anyone caught collaborating (and I have caught many) will be immediately referred to the University’s disciplinary system.

Re-grading of assignments

All student work will be assessed using fair criteria that are uniform across the class. If, however, you are unsatisfied with the grade you received on a particular assignment (beyond simple clerical errors), you can request a re-grade using the following protocol. First, you may not send any grade complaints or requests for re-grades until at least 24 hours after the graded assignment was returned to you. After that, you must document your specific grievances in writing by submitting a PDF or Word Document to the teaching staff. In this document you should explain exactly which parts of the assignment you believe were mis-graded, and provide documentation for why your answers were correct. We will then re-score the entire assignment (including portions for which you did not have grievances), and the new score will be the one you receive on the assignment (even if it is lower than your original score).

Late policy

Notwithstanding everything below: exceptions to all of these policies will be made for health reasons, extraordinary family circumstances, and religious holidays. The teaching staff are extremely reasonable and lenient, as long as you discuss potential issues with us *before* the deadline.

For problem sets: You are granted 5 "grace days" throughout the semester. Over the course of the semester you can use these when you need to turn problem sets in late. You can only use 3 grace days on any given assignment. You do not have to ask to use these days. This is counted in whole days, so if a problem set is turned in at 5:01pm the day it is due (i.e. 1 minute late) you will have used 1 grace day. If you turn the problem set in at 5:01pm the day after it is due (i.e. 24 hours and 1 minute late) you will have used 2 grace days, etc. Choosing to not complete a problem set (see policy below) does not affect your grace days. Once you are out of grace days, subsequent late problem sets will be graded as incomplete.

The nature of the two exams (timed exams completed during a certain window) does not allow for any extensions.

Assessment and grading

All assignments will be graded anonymously. Please hand assignments in on Canvas with your student number, not your name.

Participation (5%)

This portion of your grade mixes two components:

Traditional participation including: asking and answering questions in lecture and in recitations, asking and answering questions on the course Slack, or attending office hours.

The completion of weekly "check-in" quizzes on Canvas. These will be available each week, will take less than 5 minutes, and will be graded by completion (not correctness).

Problem sets (45%)

Five problem sets (roughly every two weeks)

Scored out of 100.

You are free to do as many of the problem sets as you like. If you do not complete a problem set, the percentage points for that assignment will be transferred to the first exam (for PS1 and PS2), or the second exam (for PS3, PS4, & PS5). For example if you don’t complete PS2, the first exam would then be worth 34% of your final grade. If you don’t complete PS4 & PS5, the second exam would be worth 43% of your final grade.

First Exam 25%

  • This will be an open-book 24 hour take-home test. The test will open on Monday, October 16 at 3:00pm and close on Friday, October 20 at 11:59pm. You can select any 24 hour period to do the test during this window. The latest you can open the test and still have 24 hours to complete it is therefore October 19th at 11:59pm. You may not work with other students on this exam. It will take a similar form as the problem sets.

Second Exam 25%

  • This will be a 3 hour open-book take-home test completed on December 20th. You may not work with other students on this exam. Because of the shortened time frame this exam will be less coding intensive and focus more on theoretic concepts.

We will use R in this class, which you can download for free at https://www.r-project.org/. R is completely open source and has an almost endless set of resources online. Virtually any data science job you could apply to nowadays will require some background in R programming.

While R is the language we will use, RStudio is a free program that makes it considerably easier to work with R. After installing R, you should install RStudio https://www.rstudio.com . Please have both R and RStudio installed by the end of the first week of classes.

If you’re having trouble installing either program, there are more detailed installation instructions on the course Canvas page.

There is one mandatory textbook for this course and two optional:

Data Analysis for Social Science: A Friendly and Practical Introduction. Elena Llaudet & Kosuke Imai. (Mandatory).

  • I have chosen this book because it does a really good job of weaving the basics of statistics together with the use of R. Generally speaking, the assigned readings from this book will be slightly less technical than what is in the class notes. This book is available at the bookstore and from Amazon. There is only one edition, but be sure to get the (way cheaper) paperback version.

Quantitative Social Science: an Introduction. Kosuke Imai.

  • This is the original, graduate level, textbook the Llaudet and Imai textbook is based on. The chapters are largely the same, but this textbook is much more math intensive. I have included below the equivalent readings (labeled QSS) if you want to go into greater detail. These readings are completely optional.

Statistics: Fourth Edition. Freedman, Pisani, Purves. (Optional).

  • This textbook has a slightly more conversational and intuitive approach, but does not incorporate those lessons with R. While having this book is not mandatory I really like the style and common-sense explanations of this book. It’s a great companion to have around.

Class Schedule

Week 1: August 30

The population is the point.

Excerpt from Mlodinow (on Canvas).

Week 2: (No Monday class) - September 6

Llaudet & Imai 1

Week 3: September 11 - September 13

R Review/Start probability

Llaudet & Imai 6.1,6.2,6.7

(QSS 4.11, 6.1)

September 12: course selection period ends

Week 4: September 18 - September 20

Conditional probability and independence

Week 5: September 25 - September 27

Random Variables I: Discrete

Llaudet & Imai 6.4.1

Problem Set 1 Due Wednesday 7pm .

Week 6: October 2 - October 4

Random Variables II: Continuous

Llaudet & Imai 6.4.2-6.4.4

Week 7: October 9 - October 11

Sampling and confidence intervals

Llaudet & Imai 6.5.1,6.5.2

October 9: Drop period ends

Problem Set 2 Due Wednesday 7pm .

Week 8: October 16 - October 18

Wednesday class will be a drop-in review session in our usual classroom.

First Midterm Exam period Monday 3:00pm to Friday 11:59pm .

Week 9: October 23 - October 25

Standard error of the mean/Field Trip

Llaudet & Imai 6.5.3

On October 25th we will take a class field trip to the NBC News Decision Desk.

October 27: Grade type change deadline.

Week 10: October 30 - November 1

Hypothesis Tests and Power

Llaudet & Imai 7.1 7.3 7.4

Problem Set 3 Due Wednesday 7pm .

Week 11: November 6 - November 8

Two continuous variables and covariation

Llaudet & Imai 3.5

November 6: Withdrawal deadline

Week 12: November 13 - November 15

Correlation and bivariate regression

Llaudet & Imai 4.3

Problem Set 4 Due Wednesday 7pm .

Week 13: November 20 – (No Wednesday Class)

Multivariate Regression I

Llaudet & Imai 2.1-2.4

Week 14: November 27 - November 29

Multivariate regression II

Llaudet & Imai 5.1-5.5

Week 15: December 4- December 6

Interaction with regression

Excerpt from Kam and Franzese (Canvas)

Problem Set 5 Due Wednesday 7pm .

Week 16: December 11

Prediction with regression

Llaudet & Imai 4.5-4.6

(QSS 7.3.1,7.3.2)

Inferential Statistics

Inferential statistics is a branch of statistics that makes use of various analytical tools to draw inferences about the population data from sample data. Apart from inferential statistics, descriptive statistics forms another branch of statistics. Inferential statistics helps to draw conclusions about the population, while descriptive statistics summarizes the features of the data set.

There are two main types of inferential statistics - hypothesis testing and regression analysis. The samples chosen in inferential statistics need to be representative of the entire population. In this article, we will learn more about inferential statistics, its types, examples, and see the important formulas.


What is Inferential Statistics?

Inferential statistics helps to develop a good understanding of the population data by analyzing the samples obtained from it. It helps in making generalizations about the population by using various analytical tests and tools. In order to pick out random samples that will represent the population accurately many sampling techniques are used. Some of the important methods are simple random sampling, stratified sampling, cluster sampling, and systematic sampling techniques.

Inferential Statistics Definition

Inferential statistics can be defined as a field of statistics that uses analytical tools for drawing conclusions about a population by examining random samples. The goal of inferential statistics is to make generalizations about a population. In inferential statistics, a statistic is taken from the sample data (e.g., the sample mean) and used to make inferences about the population parameter (e.g., the population mean).

Types of Inferential Statistics

Inferential statistics can be classified into hypothesis testing and regression analysis. Hypothesis testing also includes the use of confidence intervals to test the parameters of a population. Given below are the different types of inferential statistics.


Hypothesis Testing

Hypothesis testing is a type of inferential statistics that is used to test assumptions and draw conclusions about the population from the available sample data. It involves setting up a null hypothesis and an alternative hypothesis followed by conducting a statistical test of significance. A conclusion is drawn based on the value of the test statistic, the critical value , and the confidence intervals . A hypothesis test can be left-tailed, right-tailed, and two-tailed. Given below are certain important hypothesis tests that are used in inferential statistics.

Z Test: A z test is used on data that follows a normal distribution and has a sample size greater than or equal to 30. It is used to test if the means of the sample and population are equal when the population variance is known. The right tailed hypothesis can be set up as follows:

Null Hypothesis: \(H_{0}\) : \(\mu = \mu_{0}\)

Alternate Hypothesis: \(H_{1}\) : \(\mu > \mu_{0}\)

Test Statistic: z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\). \(\overline{x}\) is the sample mean, \(\mu\) is the population mean, \(\sigma\) is the population standard deviation and n is the sample size.

Decision Criteria: If the z statistic > z critical value then reject the null hypothesis.
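
A small sketch of this right-tailed z test with hypothetical numbers (sample mean 52, hypothesized mean 50, known \(\sigma\) = 8, n = 36):

```r
xbar <- 52; mu0 <- 50; sigma <- 8; n <- 36
z <- (xbar - mu0) / (sigma / sqrt(n))   # 1.5
z > qnorm(0.95)                         # FALSE: do not reject H0 at the 5% level
```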

T Test: A t test is used when the data follow a Student's t distribution and the sample size is less than 30. It is used to compare the sample and population means when the population variance is unknown. The test is set up as follows:

Test Statistic: t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\), where s is the sample standard deviation.

Decision Criteria: If the t statistic > t critical value then reject the null hypothesis.
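
A matching sketch for the t test with hypothetical numbers (sample mean 52, hypothesized mean 50, s = 8, n = 16):

```r
xbar <- 52; mu0 <- 50; s <- 8; n <- 16
t_stat <- (xbar - mu0) / (s / sqrt(n))   # 1
t_stat > qt(0.95, df = n - 1)            # FALSE: do not reject H0 at the 5% level
```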

F Test: An f test is used to check if there is a difference between the variances of two samples or populations. The right tailed f hypothesis test can be set up as follows:

Null Hypothesis: \(H_{0}\) : \(\sigma_{1}^{2} = \sigma_{2}^{2}\)

Alternate Hypothesis: \(H_{1}\) : \(\sigma_{1}^{2} > \sigma_{2}^{2}\)

Test Statistic: f = \(\frac{\sigma_{1}^{2}}{\sigma_{2}^{2}}\), where \(\sigma_{1}^{2}\) is the variance of the first population and \(\sigma_{2}^{2}\) is the variance of the second population.

Decision Criteria: If the f test statistic > f test critical value then reject the null hypothesis.
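
A sketch of the right-tailed f test with hypothetical sample variances:

```r
s1_sq <- 120; n1 <- 10   # larger sample variance and its sample size
s2_sq <- 60;  n2 <- 12   # smaller sample variance and its sample size
f_stat <- s1_sq / s2_sq                         # 2
f_stat > qf(0.95, df1 = n1 - 1, df2 = n2 - 1)   # FALSE: do not reject H0
```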

Confidence Interval: A confidence interval helps in estimating the parameters of a population. For example, a 95% confidence interval indicates that if a test is conducted 100 times with new samples under the same conditions then the estimate can be expected to lie within the given interval 95 times. Furthermore, a confidence interval is also useful in calculating the critical value in hypothesis testing.

Apart from these tests, other tests used in inferential statistics are the ANOVA test, Wilcoxon signed-rank test, Mann-Whitney U test, Kruskal-Wallis H test, etc.

Regression Analysis

Regression analysis is used to quantify how one variable will change with respect to another variable. There are many types of regression available, such as simple linear, multiple linear, nominal, logistic, and ordinal regression. The most commonly used regression in inferential statistics is linear regression. Linear regression checks the effect of a unit change in the independent variable on the dependent variable. Some important formulas used in inferential statistics for regression analysis are as follows:

Regression Coefficients :

The straight line equation is given as y = \(\alpha\) + \(\beta x\), where \(\alpha\) and \(\beta\) are regression coefficients.

\(\beta = \frac{\sum_{1}^{n}\left ( x_{i}-\overline{x} \right )\left ( y_{i}-\overline{y} \right )}{\sum_{1}^{n}\left ( x_{i}-\overline{x} \right )^{2}}\)

\(\beta = r_{xy}\frac{\sigma_{y}}{\sigma_{x}}\)

\(\alpha = \overline{y}-\beta \overline{x}\)

Here, \(\overline{x}\) is the mean and \(\sigma_{x}\) is the standard deviation of the x values, while \(\overline{y}\) is the mean and \(\sigma_{y}\) is the standard deviation of the y values; \(r_{xy}\) is the correlation between x and y.
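
A quick sketch with hypothetical data, computing the coefficients from the formulas above and checking them against R's built-in lm():

```r
x <- c(1, 2, 3, 4, 5)               # hypothetical independent variable
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)     # hypothetical dependent variable
beta  <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
alpha <- mean(y) - beta * mean(x)
c(alpha, beta)
coef(lm(y ~ x))                     # same intercept and slope
```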

Inferential Statistics Examples

Inferential statistics is very useful and cost-effective as it can make inferences about the population without collecting the complete data. Some inferential statistics examples are given below:

  • Suppose the mean marks of 100 students in a particular country are known. Using this sample information the mean marks of students in the country can be approximated using inferential statistics.
  • Suppose a coach wants to find out how many cartwheels sophomores at his college can do on average without stopping. A sample of a few students will be asked to perform cartwheels and the average will be calculated. Inferential statistics will use this data to draw a conclusion about how many cartwheels sophomores can perform on average.

Inferential Statistics vs Descriptive Statistics

Descriptive and inferential statistics are used to describe data and make generalizations about the population from samples. The table given below lists the differences between inferential statistics and descriptive statistics.

Inferential statistics are used to make conclusions about the population by using analytical tools on the sample data. Descriptive statistics are used to quantify the characteristics of the data.
Hypothesis testing and regression analysis are the important tools of inferential statistics. Measures of central tendency and measures of dispersion are the important tools of descriptive statistics.
Inferential statistics are used to make inferences about an unknown population. Descriptive statistics are used to describe the characteristics of a known sample or population.
Measures of inferential statistics are the t-test, z test, linear regression, etc. Measures of descriptive statistics are the mean, variance, standard deviation, etc.

Related Articles:

  • Probability and Statistics
  • Data Handling
  • Summary Statistics

Important Notes on Inferential Statistics

  • Inferential statistics makes use of analytical tools to draw statistical conclusions regarding the population data from a sample.
  • Hypothesis testing and regression analysis are the types of inferential statistics.
  • Sampling techniques are used in inferential statistics to determine representative samples of the entire population.
  • Z test, t-test, linear regression are the analytical tools used in inferential statistics.

Examples on Inferential Statistics

Example 1: After a new sales training is given to employees the average sale goes up to $150 (a sample of 25 employees was examined) with a standard deviation of $12. Before the training, the average sale was $100. Check if the training helped at \(\alpha\) = 0.05.

Solution: The t test in inferential statistics is used to solve this problem.

\(\overline{x}\) = 150, \(\mu\) = 100, s = 12, n = 25

\(H_{0}\) : \(\mu = 100\)

\(H_{1}\) : \(\mu > 100\)

t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}} = \frac{150-100}{12/\sqrt{25}} = \frac{50}{2.4} \approx 20.83\)

The degrees of freedom is given by 25 - 1 = 24

Using the t table at \(\alpha\) = 0.05, the critical value is T(0.05, 24) = 1.71

As 20.83 > 1.71 thus, the null hypothesis is rejected and it is concluded that the training helped in increasing the average sales.

Answer: Reject Null Hypothesis.

Example 2: A test was conducted with the variance = 108 and n = 8. Certain changes were made in the test and it was again conducted with variance = 72 and n = 6. At a 0.05 significance level was there any improvement in the test results?

Solution: The f test in inferential statistics will be used

\(H_{0}\) : \(s_{1}^{2} = s_{2}^{2}\)

\(H_{1}\) : \(s_{1}^{2} > s_{2}^{2}\)

\(n_{1}\) = 8, \(n_{2}\) = 6

\(df_{1}\) = 8 - 1 = 7

\(df_{2}\) = 6 - 1 = 5

\(s_{1}^{2}\) = 108, \(s_{2}^{2}\) = 72

The f test formula is given as follows:

F = \(\frac{s_{1}^{2}}{s_{2}^{2}}\) = 108 / 72 = 1.5

Now from the F table the critical value F(0.05, 7, 5) = 4.88


As 1.5 < 4.88, we fail to reject the null hypothesis and conclude that there is not enough evidence to suggest that the test results improved.

Answer: Fail to reject the null hypothesis.

Example 3: After a new sales training is given to employees the average sale goes up to $150 (a sample of 49 employees was examined). Before the training, the average sale was $100 with a standard deviation of $12. Check if the training helped at \(\alpha\) = 0.05.

Solution: This is similar to example 1. However, as the sample size is 49 and the population standard deviation is known, thus, the z test in inferential statistics is used.

\(\overline{x}\) = 150, \(\mu\) = 100, \(\sigma\) = 12, n = 49

z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}} = \frac{150-100}{12/\sqrt{49}} = \frac{50}{12/7} \approx 29.2\)

From the z table at \(\alpha\) = 0.05, the critical value is 1.645.

As 29.2 > 1.645 thus, the null hypothesis is rejected and it is concluded that the training was useful in increasing the average sales.

Answer: Reject the null hypothesis.


FAQs on Inferential Statistics

What is the Meaning of Inferential Statistics?

Inferential statistics is a field of statistics that uses several analytical tools to draw inferences and make generalizations about population data from sample data.

What are the Types of Inferential Statistics?

There are two main types of inferential statistics that use different methods to draw conclusions about the population data. These are regression analysis and hypothesis testing.

What are the Different Sampling Methods Used in Inferential Statistics?

It is necessary to choose the correct sample from the population so as to represent it accurately. Some important sampling strategies used in inferential statistics are simple random sampling, stratified sampling, cluster sampling, and systematic sampling.

What are the Different Types of Hypothesis Tests In Inferential Statistics?

The most frequently used hypothesis tests in inferential statistics are parametric tests such as z test, f test, ANOVA test , t test as well as certain non-parametric tests such as Wilcoxon signed-rank test.

What is Inferential Statistics Used For?

Inferential statistics is used for comparing the parameters of two or more samples and makes generalizations about the larger population based on these samples.

Is Z Score a Part of Inferential Statistics?

Yes, z score is a fundamental part of inferential statistics as it determines whether a sample is representative of its population or not. Furthermore, it is also indirectly used in the z test.

What is the Difference Between Descriptive and Inferential Statistics?

Descriptive statistics is used to describe the features of some known dataset whereas inferential statistics analyzes a sample in order to draw conclusions regarding the population.


Descriptive and Inferential Statistics – Deep Dive into Descriptive and Inferential Statistics

  • September 14, 2023

In statistics understanding the difference between descriptive and inferential statistics is crucial for anyone looking to make sense of data, whether it’s for academic research, business decision-making, or just general curiosity. Let’s dive into these core concepts.


In this Blog post we will learn:

  • What is Descriptive Statistics?
  • What is Inferential Statistics?
  • Difference Between Descriptive and Inferential Statistics: A Quick Glance
  • Types of Descriptive Statistics with Examples
  • Types of Inferential Statistics with Examples

1. What is Descriptive Statistics?

Descriptive statistics offer a way to capture the main features of a dataset in a summarized and comprehensible manner. They don't make predictions or inferences but instead provide a concise overview of what the data shows.

For instance, imagine you’ve conducted a survey in your neighborhood asking how many books people read in a year. Descriptive statistics would provide you with insights like the average number of books read, the range between the highest and lowest figures, or the most common number reported.

2. What is Inferential Statistics?

Inferential statistics , on the other hand, goes a step beyond. Instead of just summarizing or describing data, inferential statistics aims to use the data to make predictions, inferences, or decisions about a broader context than just the sampled data.

Going back to our book-reading survey, inferential statistics might let us predict the average number of books a person in a larger area (say, the entire city) might read in a year, based on the data collected in your neighborhood.

3. Difference Between Descriptive and Inferential Statistics: A Quick Glance

Feature | Descriptive Statistics | Inferential Statistics
Purpose | Summarize and describe data | Make predictions or inferences
Scope | The specific dataset under study | Sample data used to infer about a larger population
Data | Qualitative and quantitative | Mostly quantitative
Typical tools | Mean, median, mode, standard deviation | Hypothesis testing, regression analysis, ANOVA
Question answered | What is happening in my data? | What could be happening beyond my data?
Applicability | Limited to the dataset in question | Applies to a larger population or different scenarios

4. Types of Descriptive Statistics with Examples

Measures of Central Tendency: These provide insights into the central point of a dataset.

  • Mean: For a dataset of ages (23, 25, 26, 29, 30), the mean age is $ \frac{23+25+26+29+30}{5} = 26.6 $ years.
  • Median: The central value in a sorted dataset. For the ages above, the median age is 26 years.
  • Mode: The most frequent value(s). If the dataset is (23, 25, 25, 26, 29, 30), 25 is the mode.

Measures of Spread: These describe the distribution and dispersion of values in a dataset.

  • Range: For our ages dataset, the range is 7 years (from 23 to 30).
  • Variance and Standard Deviation: For our ages dataset, the variance is calculated from the mean and the differences of each value from the mean. The standard deviation is the square root of this variance.

Frequency Distributions: These are often represented graphically, such as with histograms, to show how frequently each value appears in the dataset. Example: A histogram might show how many people in our neighborhood read 0-5 books, 6-10 books, 11-15 books, and so on.
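
These descriptive summaries are one-liners in R; a sketch using the ages dataset from the examples above:

```r
ages <- c(23, 25, 26, 29, 30)
mean(ages)            # 26.6
median(ages)          # 26
diff(range(ages))     # range: 7
var(ages); sd(ages)   # variance and standard deviation
hist(ages)            # a (very small) frequency distribution
```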

5. Types of Inferential Statistics with Examples

1. Hypothesis Testing : A systematic way to test claims or ideas about a group or population.

  • Example : Imagine a company claims that its weight loss pill helps people lose an average of 10 lbs in a month. To test this, a sample of individuals is selected and given the pill. If the sample shows an average weight loss significantly different from 10 lbs, the claim can be challenged.
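
A sketch of this one-sample test in R, with hypothetical measurements of pounds lost in a month:

```r
loss <- c(7.2, 9.5, 8.1, 10.4, 6.8, 9.0, 7.9, 8.6)   # hypothetical sample
t.test(loss, mu = 10)   # tests whether the mean loss differs from the claimed 10 lbs
```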

2. Confidence Intervals : It gives a range of values used to estimate the true population parameter. This interval can give an idea of the uncertainty around a sample estimate.

  • Example : Based on a sample, a researcher might conclude that 40% of a city’s residents favor a new park, with a confidence interval of 5%. This means the researcher is confident that between 35% and 45% of all residents favor the new park.
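
A sketch of such an interval for a proportion, with hypothetical survey numbers chosen so that about 40% of respondents favor the park:

```r
p_hat <- 148 / 370                      # 40% of a hypothetical sample of 370
se    <- sqrt(p_hat * (1 - p_hat) / 370)
p_hat + c(-1, 1) * qnorm(0.975) * se    # roughly 0.35 to 0.45
```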

3. p-value : A p-value is used in hypothesis testing to determine the significance of the results of a study. It's a measure of the evidence against a null hypothesis.

  • Example : If testing the effectiveness of a drug, a p-value of 0.03 might indicate that there’s only a 3% chance that the observed results were due to random chance (often p-values less than 0.05 are considered “statistically significant”).

4. Chi-Square Tests : Used to test relationships between categorical variables.

  • Example : Researchers might want to test if there’s a relationship between gender and the likelihood to vote for a particular candidate. The Chi-Square test can help determine if observed voting patterns are due to chance or a genuine relationship between the variables.

5. ANOVA (Analysis of Variance) : Compares the means of three or more groups to understand if they’re statistically different from each other.


  • Example : A psychologist might want to test three different techniques to reduce anxiety. By applying ANOVA, the psychologist can determine if one technique is superior, or if all techniques produce the same results.
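
A sketch of a one-way ANOVA in R with simulated anxiety scores for three hypothetical techniques:

```r
set.seed(42)
scores <- data.frame(
  technique = rep(c("A", "B", "C"), each = 20),
  anxiety   = c(rnorm(20, 30, 5), rnorm(20, 28, 5), rnorm(20, 24, 5))
)
summary(aov(anxiety ~ technique, data = scores))   # F test across the three groups
```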

6. Regression Analysis : Examines the relationship between two or more variables. It allows for predictions based on this relationship.

  • Example : An economist might explore the relationship between a country’s GDP and its unemployment rate. If a strong relationship is found, the economist can make predictions about unemployment based on future GDP estimates.

7. T-tests : Compares the means of two groups to understand if they’re statistically different from each other.

  • Example : A researcher might want to test if a new teaching method is better than the traditional method. By using a t-test, the researcher can determine if there’s a significant difference in performance between students taught with the new method versus those taught with the traditional method.

8. Z-tests : Used when the dataset is large, and you know the population variance. It’s used to compare a sample mean to a population mean.

  • Example :A large factory might claim that its assembly line produces 500 units per hour on average. An inspector could use a Z-test to see if a different hourly rate in his inspection is significantly different from the claim.

6. Conclusion

While both descriptive and inferential statistics have their unique places in data analysis, understanding when and how to use them is crucial. Descriptive statistics give you the tools to succinctly summarize and describe data, whereas inferential statistics empowers you to draw conclusions and predictions about larger contexts or populations. Both are indispensable tools in the world of data-driven decision-making.




The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organizations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organize and summarize the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

  • Step 1: Write your hypotheses and plan your research design
  • Step 2: Collect data from a sample
  • Step 3: Summarize your data with descriptive statistics
  • Step 4: Test hypotheses or make estimates with inferential statistics
  • Step 5: Interpret your results
  • Other interesting articles

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).
Example: Experimental research design
First, you'll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you'll record participants' scores from a second math test.

In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents' incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalize your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable and type of data:

  • Age: Quantitative (ratio)
  • Gender: Categorical (nominal)
  • Race or ethnicity: Categorical (nominal)
  • Baseline test scores: Quantitative (interval)
  • Final test scores: Quantitative (interval)
  • Parental income: Quantitative (ratio)
  • GPA: Quantitative (interval)

Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.

In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias , like sampling bias , and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more at risk of biases like self-selection bias , they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalizing your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section .

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Example: Sampling (experimental study)
Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study needs to be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units per subgroup is usually recommended.

To use these calculators, you have to understand and input these key components (a brief code sketch follows the list below):

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size : a standardized indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
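
As an illustration of how these inputs combine, here is a minimal sketch of a sample size calculation for comparing two group means, assuming Python with the statsmodels library (neither is required by anything above) and a hypothetical medium effect size of 0.5:

```python
# Minimal sketch: a priori sample size for an independent two-group t test.
# Assumes Python with statsmodels installed; the effect size of 0.5 is a
# hypothetical "medium" value that you would normally take from prior studies.
import math

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,          # expected standardized effect size (Cohen's d)
    alpha=0.05,               # significance level: 5% risk of a Type I error
    power=0.80,               # probability of detecting a true effect
    alternative="two-sided",  # no assumed direction of the difference
)
print("Required sample size per group:", math.ceil(n_per_group))
```

With these conventional settings, the calculation suggests roughly 64 participants per group; smaller expected effects drive the required sample size up quickly.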

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organizing data from each variable in frequency distribution tables .
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualizing the relationship between two variables using a scatter plot .

By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

Mean, median, mode, and standard deviation in a normal distribution

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode : the most popular response or value in the data set.
  • Median : the value in the exact middle of the data set when ordered from low to high.
  • Mean : the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range : the highest value minus the lowest value of the data set.
  • Interquartile range : the range of the middle half of the data set.
  • Standard deviation : the average distance between each value in your data set and the mean.
  • Variance : the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
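
If you want to compute these summary statistics yourself, here is a minimal sketch using only Python’s standard library; the score list is made up purely for illustration:

```python
# Minimal sketch: measures of central tendency and variability using only
# Python's standard library. The scores below are made up for illustration.
import statistics

scores = [62, 68, 71, 64, 75, 68, 80, 59, 73, 70]

print("Mean:", statistics.mean(scores))
print("Median:", statistics.median(scores))
print("Mode:", statistics.mode(scores))

print("Range:", max(scores) - min(scores))
q1, q2, q3 = statistics.quantiles(scores, n=4)           # quartile cut points
print("Interquartile range:", q3 - q1)
print("Standard deviation:", statistics.stdev(scores))   # sample standard deviation
print("Variance:", statistics.variance(scores))          # sample variance
```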

Using a summary table like the one below, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

Example: Descriptive statistics (experimental study)

                     Pretest scores    Posttest scores
Mean                 68.44             75.25
Standard deviation   9.43              9.88
Variance             88.96             97.96
Range                36.25             45.12
Sample size (n)      30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

                     Parental income (USD)    GPA
Mean                 62,100                   3.12
Standard deviation   15,000                   0.45
Variance             225,000,000              0.16
Range                8,000–378,000            2.64–4.00
Sample size (n)      653

A number that describes a sample is called a statistic , while a number describing a population is called a parameter . Using inferential statistics , you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate : a value that represents your best guess of the exact parameter.
  • An interval estimate : a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
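
As a worked illustration, the sketch below builds a 95% confidence interval around the pretest mean from the descriptive statistics example above (mean 68.44, standard deviation 9.43, n = 30), assuming Python with SciPy for the z score; with a small sample like this a t critical value is often used instead, but the z-based interval matches the description above:

```python
# Minimal sketch: a 95% confidence interval for a mean, built from the
# standard error and the z score of the standard normal distribution.
# Summary values come from the pretest example earlier in this guide.
import math

from scipy import stats

mean, sd, n = 68.44, 9.43, 30
standard_error = sd / math.sqrt(n)
z = stats.norm.ppf(0.975)            # about 1.96 for a 95% confidence level

lower = mean - z * standard_error
upper = mean + z * standard_error
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

With these numbers, the interval works out to roughly 65.1 to 71.8, i.e., the point estimate plus or minus about 3.4 points.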

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differs from the null hypothesis of the test.
  • A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in one or more outcome variables (a brief code sketch follows the list below).

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.
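
As a minimal sketch of a simple linear regression, assuming Python with SciPy and entirely made-up values for the predictor and outcome:

```python
# Minimal sketch: simple linear regression with one predictor and one outcome.
# Assumes Python with SciPy; all data points are made up for illustration.
from scipy import stats

weekly_meditation_minutes = [0, 10, 20, 30, 40, 50, 60, 70]   # hypothetical predictor
test_scores = [60, 63, 61, 67, 70, 69, 74, 78]                # hypothetical outcome

result = stats.linregress(weekly_meditation_minutes, test_scores)
print(f"Slope: {result.slope:.2f}")            # change in outcome per unit of predictor
print(f"Intercept: {result.intercept:.2f}")    # predicted outcome when predictor is 0
print(f"R squared: {result.rvalue ** 2:.2f}")  # share of variance explained
print(f"p value: {result.pvalue:.4f}")         # significance of the slope
```

For a multiple linear regression with several predictors, a library such as statsmodels would typically be used instead of linregress.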

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or fewer).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test .
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
  • If you expect a difference between groups in a specific direction, use a one-tailed test .
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .

The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.

Example: Paired t test (experimental study)
You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you (see the sketch after these results):

  • a t value (test statistic) of 3.00
  • a p value of 0.0028
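
As a minimal sketch of how such a test could be run, assuming Python with SciPy (version 1.6 or later for the alternative argument) and short made-up score lists rather than the study’s actual data:

```python
# Minimal sketch: dependent-samples (paired), one-tailed t test.
# Assumes SciPy >= 1.6 for the `alternative` argument; scores are made up
# and will not reproduce the exact t and p values quoted above.
from scipy import stats

pretest = [65, 70, 62, 68, 74, 66, 71, 69]
posttest = [72, 76, 66, 75, 80, 70, 77, 74]

# Alternative hypothesis: posttest scores are greater than pretest scores.
t_stat, p_value = stats.ttest_rel(posttest, pretest, alternative="greater")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```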

Example: Correlation test (correlational study)
Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test (a code sketch follows these results). The t test gives you:

  • a t value of 3.08
  • a p value of 0.001
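
In practice, the correlation coefficient and its significance test are usually computed in one step. The sketch below assumes Python with SciPy and uses made-up income and GPA values; pearsonr reports a two-sided p value by default, which can be halved for a one-tailed test when the observed correlation is in the expected direction:

```python
# Minimal sketch: Pearson's r and its significance test in one step.
# Assumes Python with SciPy; the income and GPA values are made up.
from scipy import stats

parental_income = [30_000, 45_000, 52_000, 61_000, 75_000, 88_000, 99_000, 120_000]
gpa = [2.7, 3.0, 2.9, 3.2, 3.4, 3.3, 3.6, 3.8]

r, p_two_sided = stats.pearsonr(parental_income, gpa)
print(f"r = {r:.2f}, two-sided p = {p_two_sided:.4f}")

# For a one-tailed test of a positive correlation (given r > 0), halve the p value.
print(f"one-tailed p = {p_two_sided / 2:.4f}")
```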

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

Example: Interpret your results (experimental study)
You compare your p value of 0.0028 to your significance threshold of 0.05. Since the p value is below this threshold, you can reject the null hypothesis. This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

Example: Effect size (experimental study)
With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.
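
Returning to the experimental example, here is a minimal sketch of how a Cohen’s d could be computed for two independent groups using a pooled standard deviation, assuming Python and made-up score lists; for a within-subjects design like the meditation study, the mean difference is often divided by the standard deviation of the difference scores instead:

```python
# Minimal sketch: Cohen's d for two independent groups using a pooled
# standard deviation. The score lists are made up for illustration.
import statistics


def cohens_d(group1, group2):
    """Standardized mean difference between two independent groups."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)  # sample variances
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m2 - m1) / pooled_sd


control = [64, 68, 61, 70, 66, 72, 65, 69]
meditation = [71, 75, 69, 78, 73, 80, 72, 76]
print(f"Cohen's d = {cohens_d(control, meditation):.2f}")
```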

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

The Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis, rather than issuing a binary decision about whether to reject the null hypothesis.

Effective Biostatistics Techniques for Solving Public Health Assignments

Zoe Rees


Biostatistics plays a crucial role in public health by providing the tools needed to analyze data and draw meaningful conclusions. This blog guides you through essential biostatistics techniques for solving assignments, particularly those investigating health outcomes such as cholesterol levels in relation to socioeconomic variables and dietary intake. The focus is on the importance of biostatistics, key biostatistical techniques, advanced statistical methods, and practical steps for public health assignments. Whether you are a student or a public health professional, this guide will equip you with the knowledge and skills needed to analyze complex data, make informed decisions, and contribute to public health research, leaving you better prepared to tackle public health challenges and improve health outcomes on a broader scale.

Understanding the Importance of Biostatistics in Public Health

Biostatistics is essential for understanding and addressing public health issues. By applying statistical methods to health data, public health professionals can identify trends, assess risks, and develop effective interventions. The field underpins study design, data analysis, and the interpretation of results, all of which are fundamental for evidence-based decision-making. Biostatistics helps in tracking disease outbreaks, evaluating the effectiveness of public health programs, identifying health disparities among populations, and predicting health trends so that proactive measures can be taken. It is also vital for validating research findings and ensuring the reliability of health information. This section discusses the role and applications of biostatistics in public health, providing a foundation for the techniques covered later.

Biostatistics is essential for public health research as it allows for the systematic collection, analysis, and interpretation of health data. This field supports evidence-based decision-making by providing statistical methods to understand and address health issues. In public health, biostatistics is used to:

  • Design studies and surveys that collect relevant health data.
  • Analyze the relationship between health outcomes and various risk factors.
  • Evaluate the effectiveness of public health interventions and policies.

By integrating biostatistical methods into public health research, professionals can derive insights that lead to improved health outcomes and policies.

When dealing with public health assignments, biostatistics helps in:

  • Identifying trends and patterns in health data.
  • Assessing the impact of socioeconomic and environmental factors on health.
  • Predicting health outcomes and modeling disease spread.
  • Designing and analyzing clinical trials and epidemiological studies.

Understanding these applications will help you approach your assignments with a clear framework and apply the appropriate statistical methods. These applications are crucial for translating data into actionable public health strategies.

Key Biostatistics Techniques for Public Health

Understanding the key biostatistical techniques is essential for analyzing public health data effectively. This section will cover descriptive statistics and inferential statistics, providing a basis for more advanced methods.

Descriptive Statistics

Descriptive statistics summarize and describe the main features of a dataset. They provide a simple overview of the data, which is essential for initial analysis.

Measures of central tendency include the mean, median, and mode. These metrics help identify the typical value in a dataset:

  • Mean: The average of all data points. It is sensitive to extreme values (outliers).
  • Median: The middle value when the data is ordered. It is less affected by outliers.
  • Mode: The most frequently occurring value in the dataset.

These measures help summarize the data, making it easier to understand the distribution of values in the dataset.

Measures of dispersion indicate the spread or variability of the data:

  • Range: The difference between the highest and lowest values.
  • Variance: The average squared deviation from the mean.
  • Standard Deviation (SD): The square root of the variance, indicating the average distance from the mean.

These measures help you understand the distribution and variability of the data, which is crucial for further analysis. Understanding dispersion is essential for identifying potential outliers and the overall spread of the data.

Inferential Statistics

Inferential statistics allow you to make inferences and draw conclusions about a population based on a sample.

Hypothesis testing assesses the evidence against a null hypothesis. Common tests include:

  • T-tests: Compare the means of two groups (e.g., comparing cholesterol levels between different dietary groups).
  • Chi-square tests: Assess the association between categorical variables (e.g., the relationship between race and cholesterol levels).

These tests help determine if observed differences are statistically significant, providing a basis for further research or policy recommendations.

Confidence intervals provide a range of values within which the true population parameter is likely to lie. They give an estimate of the uncertainty around the sample statistic.

Using these inferential techniques, you can determine whether observed differences or associations are statistically significant. Confidence intervals offer a clearer picture of the reliability and precision of your estimates.
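
As a brief illustration of the chi-square test mentioned above, here is a minimal sketch assuming Python with SciPy and an invented contingency table of cholesterol status by group:

```python
# Minimal sketch: chi-square test of association between two categorical
# variables. Assumes Python with SciPy; the counts are invented.
from scipy.stats import chi2_contingency

# Rows: group A, group B. Columns: high cholesterol, normal cholesterol.
observed = [
    [30, 70],
    [45, 55],
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
```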

Advanced Statistical Methods

Advanced statistical methods allow for more sophisticated analysis of complex public health data. This section covers regression analysis and survival analysis, which are essential for modeling relationships and analyzing time-to-event data.

Regression analysis models the relationship between a dependent variable and one or more independent variables. It is widely used in public health to identify risk factors and predict health outcomes.

Linear regression examines the relationship between a continuous dependent variable and one or more continuous or categorical independent variables. For example, you might use linear regression to study the impact of dietary intake on cholesterol levels.

  • Model fitting: Assess the fit of the regression model using R-squared and residual plots.
  • Interpretation: Interpret the coefficients to understand the effect of each independent variable on the dependent variable.

Linear regression helps identify significant predictors and quantify their impact, making it a powerful tool for public health research.

Logistic regression is used when the dependent variable is binary (e.g., presence or absence of high cholesterol). It estimates the probability of an event occurring based on the independent variables.

  • Odds ratios: Interpret the coefficients as odds ratios, indicating the change in odds of the outcome for a one-unit change in the predictor.

Logistic regression is essential for studying binary outcomes and identifying key risk factors in public health.
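
A minimal sketch of fitting such a model, assuming Python with NumPy and statsmodels and data simulated from an invented relationship in which age and dietary fat intake predict a binary high-cholesterol outcome:

```python
# Minimal sketch: logistic regression with odds ratios.
# Assumes Python with NumPy and statsmodels; the data are simulated from an
# invented relationship, purely for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
age = rng.integers(25, 70, size=n)
fat_intake = rng.normal(60, 15, size=n)              # daily fat intake in grams

# Invented data-generating process: higher age and fat intake raise the odds.
log_odds = -8 + 0.08 * age + 0.05 * fat_intake
prob_high_chol = 1 / (1 + np.exp(-log_odds))
high_cholesterol = rng.binomial(1, prob_high_chol)   # binary outcome

X = sm.add_constant(np.column_stack([age, fat_intake]))  # intercept, age, fat
model = sm.Logit(high_cholesterol, X).fit(disp=False)

odds_ratios = np.exp(model.params)   # exponentiated coefficients = odds ratios
print("Odds ratios (intercept, age, fat intake):", np.round(odds_ratios, 3))
```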

Survival analysis examines the time until an event occurs (e.g., time to disease occurrence). It is useful for analyzing data with censored observations.

The Kaplan-Meier estimator estimates the survival function and provides a visual representation of survival probabilities over time.

  • Survival curves: Plot survival curves to compare different groups (e.g., different treatment groups).

Kaplan-Meier curves offer a straightforward way to visualize survival data and compare groups over time.

The Cox proportional hazards model assesses the effect of multiple variables on survival time, allowing for adjustment for confounders.

  • Hazard ratios: Interpret the coefficients as hazard ratios, indicating the change in hazard for a one-unit change in the predictor.

The Cox model is valuable for identifying and quantifying the impact of various factors on survival time, making it indispensable for longitudinal studies.
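
As a minimal sketch of both techniques, assuming Python with pandas and the lifelines library (neither is prescribed by the text above) and a small invented dataset:

```python
# Minimal sketch: Kaplan-Meier estimate and a Cox proportional hazards model.
# Assumes Python with pandas and the lifelines library; the data are invented.
import pandas as pd
from lifelines import CoxPHFitter, KaplanMeierFitter

df = pd.DataFrame({
    "time_months": [5, 12, 20, 8, 30, 16, 25, 9, 40, 14],  # follow-up time
    "event":       [1, 1, 0, 1, 0, 1, 1, 0, 0, 1],         # 1 = event occurred, 0 = censored
    "age":         [60, 55, 48, 67, 45, 59, 63, 50, 42, 58],
    "treated":     [0, 1, 1, 0, 1, 0, 1, 0, 1, 0],
})

# Kaplan-Meier: estimate the survival function over time.
km = KaplanMeierFitter()
km.fit(durations=df["time_months"], event_observed=df["event"])
print(km.survival_function_)

# Cox model: effect of age and treatment on the hazard, reported as hazard ratios.
cox = CoxPHFitter()
cox.fit(df, duration_col="time_months", event_col="event")
cox.print_summary()   # the exp(coef) column gives the hazard ratios
```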

Practical Steps for Public Health Assignments

Applying biostatistical techniques to public health assignments involves several practical steps. This section provides guidance on data preparation, analysis, and reporting to ensure accurate and meaningful results.

Data Preparation

Data preparation is a critical step in any statistical analysis. It involves cleaning and organizing the data to ensure accurate and reliable results.

Data cleaning involves identifying and correcting errors or inconsistencies in the dataset:

  • Missing values: Handle missing data appropriately (e.g., imputation or exclusion).
  • Outliers: Detect and address outliers that may skew the results.
  • Consistency: Ensure consistent coding and formatting of variables.

Proper data cleaning is essential for avoiding biased or inaccurate results, ensuring the integrity of your analysis.

Creating new variables can help simplify the analysis and provide more meaningful insights:

  • Categorical variables: Convert continuous variables into categorical ones if necessary (e.g., age groups).
  • Composite scores: Combine multiple variables into a single score (e.g., a dietary index).

Variable creation allows for more nuanced analysis and can help reveal important patterns in the data.
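
A minimal sketch of these preparation steps, assuming Python with pandas and invented column names and values:

```python
# Minimal sketch: basic data cleaning and variable creation with pandas.
# Column names and values are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "age":            [25, 37, None, 52, 61, 44],
    "cholesterol":    [180, 210, 195, None, 260, 230],
    "fruit_servings": [2, 0, 3, 1, 4, 2],
    "veg_servings":   [3, 1, 2, 2, 5, 3],
})

# Data cleaning: simple median imputation for missing values.
df["age"] = df["age"].fillna(df["age"].median())
df["cholesterol"] = df["cholesterol"].fillna(df["cholesterol"].median())

# Variable creation: a categorical age group and a composite dietary score.
df["age_group"] = pd.cut(df["age"], bins=[0, 35, 50, 120],
                         labels=["35 or under", "36-50", "over 50"])
df["diet_score"] = df["fruit_servings"] + df["veg_servings"]

print(df)
```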

Data Analysis

Once the data is prepared, you can proceed with the analysis using the appropriate statistical methods.

Begin with a descriptive analysis to summarize the data and identify patterns:

  • Tables and charts: Create summary tables and charts to visualize the data.
  • Descriptive statistics: Calculate measures of central tendency and dispersion for key variables.

Descriptive analysis provides an initial understanding of the data and highlights important characteristics and trends.

Conduct inferential analysis to test hypotheses and draw conclusions:

  • Hypothesis tests: Perform the relevant hypothesis tests (e.g., t-tests, chi-square tests).
  • Confidence intervals: Calculate confidence intervals to assess the precision of the estimates.

Inferential analysis allows you to make data-driven decisions and validate your findings statistically.

Reporting Results

Effectively reporting your results is crucial for communicating your findings to a broader audience.

Structure your report to clearly present your analysis and conclusions:

  • Introduction: Provide background information and state the research question.
  • Methods: Describe the data, variables, and statistical methods used.
  • Results: Present the findings with appropriate tables and charts.
  • Discussion: Interpret the results, discuss their implications, and acknowledge limitations.

A well-structured report ensures clarity and comprehensiveness, making it easier for readers to understand and evaluate your work.

Include appendices for additional tables, charts, and detailed results that support the main report.

  • Tables and figures: Ensure all tables and figures are well-labeled and referenced in the report.
  • Supplementary materials: Provide any additional analyses or data that support your conclusions.

Appendices provide detailed support for your main findings, allowing for a deeper understanding of the analysis.

Biostatistics is a powerful tool for public health research and analysis. By mastering descriptive and inferential statistics, regression analysis, and survival analysis, you can effectively tackle public health assignments and draw meaningful conclusions from health data. Remember to follow practical steps for data preparation, analysis, and reporting to ensure your work is accurate, reliable, and impactful. Whether you are investigating cholesterol levels, dietary intake, or other health outcomes, these biostatistics techniques will equip you with the skills needed to succeed in your assignments.
