An important part of being ready for a successful Six Sigma project is being familiar with the analyses that you’ll use to measure improvement in your processes.
One of the more useful analyses in your toolbelt can be the Analysis of Variance, commonly abbreviated ANOVA.
ANOVA covers a range of common analyses. Some analyses have names related to the number of factors, such as one-way ANOVA and two-way ANOVA. When the levels of a factor are selected at random from a large number of possibilities, you might use a random-effects model or a mixed-effects model.
And luckily, Microsoft Excel makes it easy to perform these analyses. So we're going to go through how to use ANOVA in Excel.
While ANOVA has many varieties, the essential purpose of this family of analyses is to determine whether factors have an association with an outcome variable.
Factors are the variables that you will use to categorize your outcome variable into groups. For example, if you want to know whether tapes from three different suppliers have the same peel strength, the suppliers are your factor. All the strength measurements for the same supplier’s tape form a group of measurements.
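To make the idea of a factor concrete, here is a minimal Python sketch of sorting measurements into groups by supplier. The supplier names and strength values are made up for illustration, and group_by_factor is a hypothetical helper, not something from Excel.

```python
from collections import defaultdict

# Hypothetical peel-strength readings as (supplier, strength) pairs.
# The numbers are invented for illustration only.
measurements = [
    ("Supplier A", 10.2), ("Supplier B", 11.5), ("Supplier C", 9.8),
    ("Supplier A", 10.6), ("Supplier B", 11.1), ("Supplier C", 10.1),
    ("Supplier A", 9.9), ("Supplier B", 11.8), ("Supplier C", 9.5),
]

def group_by_factor(records):
    """Collect outcome values into one list per factor level."""
    groups = defaultdict(list)
    for level, value in records:
        groups[level].append(value)
    return dict(groups)

groups = group_by_factor(measurements)
for supplier, strengths in groups.items():
    print(supplier, strengths)
```

Each list that comes out is one group: all the strength measurements for one level of the supplier factor.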
ANOVA is an inferential statistical analysis.
Inferential analysis is the formal way of saying that we want to look at a sample of measurements and make an educated guess about what all of the possible measurements might be like if we could take them.
Let’s return to the tape example. If you could tape 1 million boxes from a batch of tape, those million might represent the entire population that we want to know about. But if we taped those million boxes and measured the peel strength, we would have used up all of the tape. Instead, we’ll measure the strength from a sample of taped boxes and use those measurements to guess what the numbers would look like if we taped a million boxes.
An important point is that we won’t expect all the measurements in a group to be the same.
Consider the tape example again. The differences in strength measurements from the same supplier’s tape give us within-group variation. Another important point is that we won’t expect the average strength of our sample to be the same as the average strength if we taped a million boxes. This variation between the sample average and the overall average we’ll call bias (statisticians more often call it sampling error).
Because of within-group variation and bias, comparisons among groups become harder. Although we know that our sample average is not the same as the real average, there’s no easy way to know whether our guess is too high or too low. If we guess too high for one group and too low for another group, we might easily reach an incorrect conclusion, such as predicting that the supplier with the strongest tape on average has the weakest tape.
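A small simulation can show how often this kind of mix-up happens. The sketch below is only an illustration under assumptions that are not from the tape example itself: two suppliers with made-up real averages of 10.0 and 10.5, normally distributed strengths with a standard deviation of 1.0, and five boxes measured per supplier.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical "real" averages: supplier B truly is stronger than supplier A.
TRUE_MEAN = {"A": 10.0, "B": 10.5}
SPREAD = 1.0      # assumed within-group standard deviation
SAMPLE_SIZE = 5   # boxes measured per supplier

def sample_mean(true_mean):
    """Average peel strength of one small sample from a supplier."""
    sample = [random.gauss(true_mean, SPREAD) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)

def wrong_ranking_rate(trials=10_000):
    """Fraction of trials in which the weaker supplier's sample looks stronger."""
    wrong = 0
    for _ in range(trials):
        if sample_mean(TRUE_MEAN["A"]) > sample_mean(TRUE_MEAN["B"]):
            wrong += 1
    return wrong / trials

rate = wrong_ranking_rate()
print(f"Weaker supplier looked stronger in {rate:.1%} of trials")
```

Under these assumptions the weaker supplier comes out ahead in a noticeable share of trials, which is exactly why we need a rule for deciding when a difference in sample averages is convincing.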
ANOVA gives us mathematical sets of rules, which hold given certain assumptions, to decide when we can have confidence that the real average of one group is different from the real average of one or more other groups. ANOVA sets up these rules by asking how sure we are that the means are the same, a concept that we refer to as the null hypothesis. Remember that the null hypothesis is a useful concept for helping us make comparisons, even though we already know that for real group averages to all be the same would be a remarkable coincidence.
Most of the time, a key result of an ANOVA analysis is a p-value. The p-value has meaning only with respect to the null hypothesis of the ANOVA analysis. For one-way ANOVA, the null hypothesis is that the means for each level of your factor are the same.
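Behind that p-value sits an F statistic, which compares the variation between group averages to the variation within groups. Here is a minimal sketch of the calculation in Python, using made-up strength values. The function name one_way_f is hypothetical; this is the textbook sums-of-squares arithmetic, not Excel's internal code.

```python
import statistics

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each value sits from its own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical peel strengths for three suppliers (invented numbers).
supplier_a = [10.0, 11.0, 12.0]
supplier_b = [12.0, 13.0, 14.0]
supplier_c = [14.0, 15.0, 16.0]

f_stat, df_b, df_w = one_way_f([supplier_a, supplier_b, supplier_c])
print(f"F = {f_stat:.2f} with {df_b} and {df_w} degrees of freedom")
```

The p-value is then the probability of seeing an F at least this large from an F distribution with those degrees of freedom, assuming the null hypothesis is true.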
A rough interpretation would be that the p-value reflects how much confidence you can have that the null hypothesis is a reasonable model. Small p-values make you think that the null hypothesis is not a reasonable model. Large p-values might lead you to act like the null hypothesis is true, even though you know that it’s not really true, just a reasonable model.
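One way to get a feel for what a p-value measures is a permutation test. To be clear, this is a different technique from the F-distribution calculation that Excel performs: it shuffles the measurements among the groups many times and counts how often the shuffled data look at least as separated as the real data. The sketch below uses invented numbers, and the helper names are hypothetical.

```python
import random
import statistics

def f_statistic(groups):
    """Ratio of between-group to within-group mean squares."""
    values = [x for g in groups for x in g]
    grand = statistics.fmean(values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(values) - len(groups)))

def permutation_p_value(groups, shuffles=5000, seed=0):
    """Share of random regroupings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)
        regrouped, start = [], 0
        for n in sizes:
            regrouped.append(pooled[start:start + n])
            start += n
        # Small tolerance so floating-point ties count as "at least as large".
        if f_statistic(regrouped) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (shuffles + 1)

# The same nine invented numbers, grouped two different ways.
well_separated = [[10.0, 11.0, 12.0], [12.0, 13.0, 14.0], [14.0, 15.0, 16.0]]
overlapping = [[10.0, 12.0, 14.0], [11.0, 13.0, 15.0], [12.0, 14.0, 16.0]]

p_separated = permutation_p_value(well_separated)
p_overlapping = permutation_p_value(overlapping)
print(f"well-separated groups: p = {p_separated:.3f}")
print(f"overlapping groups:    p = {p_overlapping:.3f}")
```

When the groups are well separated, few shuffles match the real data, so the p-value is small. When the groups overlap, shuffled data look much like the real data, and the p-value is large.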
To learn more check out this glossary of Lean Six Sigma terms
Let’s work through a practical example in Excel. We’ll begin with one-way ANOVA, which looks at the effect of a single factor.
If you’re analyzing data in Excel, then it’s natural to make use of the tools that Microsoft provides for you. One of the less obvious features in Excel is the Analysis ToolPak. The ToolPak is an Excel add-in from Microsoft that’s included with Excel, but isn’t turned on by default.
Here’s how to turn it on in the Microsoft Windows operating system.
1. Choose File, then Options.
2. In the Excel Options window, choose Add-ins.
3. Next to Manage, select Excel Add-ins and click Go.
4. In the Add-ins window, select Analysis ToolPak and click OK.
A new Data Analysis button will appear on your Data ribbon.
While it can sometimes seem like a simple analysis will have the fewest applications, it’s easy to find practical ways to use one-way ANOVA. Essentially, you can use it anytime you have only one set of groups to compare.
Let’s keep going with our tape example. You’ve invested in an automatic taping machine that applies heat to tape to create strong bonds. You’ve decided that you’re going to measure the strengths of tape samples from different suppliers yourself so that you can see whether there’s any practical difference in the strengths of the bonds using your machine and your boxes.
If you’ve been using Excel for a long time, you’ve gotten used to the idea that the spreadsheet is cell-based. That is, there’s very little difference between putting numbers in the spreadsheet in rows or in columns.
Data in rows:
Data in columns:
Microsoft’s been nice enough to make it so that their one-way ANOVA feature can work either way, but I’ll recommend that you start putting your data in columns. The data arrangement will matter when you want to use some of the other offerings in the Analysis ToolPak or a software package for data analysis, like Minitab Statistical Software.
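If your data already sits in rows, switching it to columns is just a transpose. In Excel you can do that with Paste Special and the Transpose option. Outside Excel, the same idea can be sketched in Python as shown here, using invented values and a hypothetical helper called rows_to_columns. The sketch assumes every row has the same number of measurements.

```python
# Hypothetical layout with one supplier per row, as it might be typed into a sheet.
rows = [
    ["Supplier A", 10.2, 10.6, 9.9],
    ["Supplier B", 11.5, 11.1, 11.8],
    ["Supplier C", 9.8, 10.1, 9.5],
]

def rows_to_columns(table):
    """Transpose the table so each supplier becomes a column headed by its name."""
    # zip stops at the shortest row, so rows are assumed to be of equal length.
    return [list(column) for column in zip(*table)]

columns = rows_to_columns(rows)
for line in columns:
    print("\t".join(str(cell) for cell in line))
```

The first printed line holds the supplier names as column headers, and each line after it holds one measurement per supplier.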
If you’d like to follow along with data that’s already prearranged, you can use the following Excel file: