Closed captioning text:
Okay. So in this video I'm gonna start off talking about the two kinds of statistics, at a broad level anyway. "Statistics" ends up taking on lots of different definitions.
The first kind of statistics I want to talk about is descriptive statistics. So these descriptive statistics are numbers that summarize data. So, for example, the average height of American women is a descriptive statistic and it summarizes the heights of all American women.
The other main kind of statistics is called inferential statistics. These are processes that allow, I'm gonna say, scientists to make decisions about populations based on data from samples. So, for example, is it the case that SUU women are taller or shorter than all American women? If we just have a sample of SUU women we could run some inferential statistics on that data and make a decision about whether or not SUU women are taller or shorter than American women are more generally.
So two critical ideas in statistics are the idea of a population and the idea of a sample. A population is all individuals about whom we want to make a claim. For example, we might want to make a claim about all American women. Oftentimes in research, though, we can't measure everybody in the population. Instead, we measure a sample, which is a subset of a population that is in our study. So, for example, I might do a study and I might have thirty SUU undergraduate women in my sample. Even though I want to know about all women in the whole US, what I end up with is thirty SUU women in my sample. So, when we're doing research we want to know about a population, the people we are making a claim about.
So maybe I'll draw a little circle for each person in the population. So imagine this is all American women that are there. And let's say we are interested in their height, the average height of all American women. And so, the symbol for the mean of a population is mu. So that is a population mean. So we want to know the mean of all of these women. Unfortunately for us we can't get all American women into our lab here at SUU. So what we do is we work with a sample. So this is from the population. It's a subset. We'll just say we end up getting those folks. Ideally, it's a random sample. In a random sample, every individual in this population has an equal chance of being in our sample. In practice that's not usually how it works but ideally that's how it would work. What we want to do is find out the average height of American women. We want to measure all of these women's heights, all of the individual women's scores, and calculate the average of those scores. But what we have is some SUU women and we can calculate their average height. So X with a bar across the top is called X-bar. That is a sample mean. So we want to know the population mean but we don't have access to all the people in the population so we have a subset of that population that came into our lab. We measure their heights and we have the sample's mean. Then what we're gonna do is we're gonna guess the population mean based on our sample mean. And this is an inference we're making. In fact, this is a type of inferential statistic. So when you use a sample mean as your best guess at a population mean that's a point prediction, the point being the population mean in this case.
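To make the population and sample picture concrete, here is a minimal Python sketch, not something shown in the video. It builds a made-up population of heights, draws a random sample of thirty, and compares the sample mean (X-bar) with the population mean (mu). The function name, the population size, and the height distribution (centered at 64 inches with a spread of 2.5 inches) are all assumptions chosen for illustration, not real data.

```python
import random
import statistics


def simulate_point_prediction(population_size=10_000, sample_size=30, seed=0):
    """Return (mu, x_bar, sample) for a hypothetical population of heights."""
    rng = random.Random(seed)

    # Hypothetical population: heights in inches. The center (64.0) and
    # spread (2.5) are made-up values for illustration only.
    population = [rng.gauss(64.0, 2.5) for _ in range(population_size)]

    # mu: the population mean, which we normally cannot measure directly.
    mu = statistics.mean(population)

    # A simple random sample: every individual has an equal chance
    # of ending up in the sample.
    sample = rng.sample(population, sample_size)

    # X-bar: the sample mean, used as our point prediction of mu.
    x_bar = statistics.mean(sample)
    return mu, x_bar, sample


if __name__ == "__main__":
    mu, x_bar, sample = simulate_point_prediction()
    print(f"population mean (mu): {mu:.2f} inches")
    print(f"sample mean (X-bar):  {x_bar:.2f} inches")
```

In this sketch we can compute mu only because the population is simulated. In a real study mu is unknown, and X-bar is all we have to guess it with.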
Alright so there's a couple more little details that are of interest. So these are descriptive statistics. These are numbers that summarize a dataset. So this sample mean summarizes the sample data and this population mean summarizes the population data. So these are our descriptive statistics. And when we make this inference that's a process that we're going through, and this is an inferential statistical process that we're going through here. And notice that the population mean is a Greek letter. That's gonna be commonly the case: most population values like this, the descriptive statistics of a population, are gonna be Greek letters. And most sample statistics are gonna be Roman letters, an X or something like that.
So, for example, say we measure the heights of all the women in our sample, and X-bar, their sample mean, is 5 feet 7 inches. We want to know the population mean, the average height of all American women, and we don't know it. The inferential process here is just to take the sample mean and use that as your guess at the population mean. So the point prediction here for the population mean is the same as whatever the sample mean was. And that is our best guess of the population mean given the information that we have. Ideally we would measure all of the people in the population. In that case it's called a census. But since we couldn't do that, we just had a sample, and our best guess at the population mean is whatever our sample mean was.
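Written out in symbols, the example above looks roughly like this. The hat on mu, meaning "our guess at mu", and the letter n for the number of women in the sample are notational assumptions. The video itself only uses mu and X-bar.

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i
        = 5\ \text{ft}\ 7\ \text{in} = 67\ \text{in},
\qquad
\hat{\mu} = \bar{X} = 67\ \text{in}
```

Here each X_i is one woman's measured height, and the second equation is the point prediction: the guess at the population mean is set equal to the sample mean.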