The post Attribute MSA with JMP appeared first on Deploy OpEx.

Data File: “AttributeMSA.jmp”

- Click Analyze -> Quality & Process -> Variability/Attribute Gauge Chart
- Select “Appraiser A”, “Appraiser B” and “Appraiser C” as “Y, Response”
- Select “Part” as “X, Grouping”
- Select “Reference” as “Standard”
- Select “Attribute” as the “Chart Type”
- Click “OK”

- Click on the red triangle button next to “Attribute Gauge”
- Click “Show Effectiveness Points”
- Click “Connect Effectiveness Points”

Percentage of agreement by appraiser

- Red line: the percentage of agreement with the reference level
- Blue line: the percentage of agreement between and within the appraisers
- When both lines are at 100% level across parts and appraisers, the measurement system is perfect

- % Agreement: Overall agreement percentage, both within and between appraisers. It reflects how precisely the measurement system performs
- In this example, 78% of items inspected have the same measurement across different appraisers and also within each individual appraiser
- Rater Score: the agreement percentage within each individual appraiser

Kappa statistic is a coefficient indicating the agreement percentage above the expected agreement by chance. Kappa ranges from −1 (perfect disagreement) to 1 (perfect agreement). When the observed agreement is less than the chance agreement, Kappa is negative. When the observed agreement is greater than the chance agreement, Kappa is positive. Rule of thumb: If Kappa is greater than 0.7, the measurement system is acceptable. If Kappa is greater than 0.9, the measurement system is excellent.
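To make the chance-correction concrete, here is a minimal sketch (not JMP’s implementation) of Cohen’s kappa for two appraisers, using a hypothetical 2×2 agreement table:

```python
def cohens_kappa(table):
    """Cohen's kappa from a square agreement table where
    table[i][j] counts items rated i by appraiser 1 and j by appraiser 2."""
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_margins = [sum(row) / total for row in table]
    col_margins = [sum(col) / total for col in zip(*table)]
    chance = sum(r * c for r, c in zip(row_margins, col_margins))
    return (observed - chance) / (1 - chance)

# hypothetical counts: both said Pass 45 times, both said Fail 45 times,
# and they disagreed on 10 items
kappa = cohens_kappa([[45, 5], [5, 45]])
print(round(kappa, 2))  # 0.8 -> acceptable by the 0.7 rule of thumb
```

Here the observed agreement is 90%, but because 50% agreement is expected by chance alone, kappa lands at 0.8 rather than 0.9.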

The first table shows the Kappa statistic for the agreement between appraisers. The second table shows the Kappa statistic for the agreement between each individual appraiser and the standard. The bottom table shows the Kappa statistic by category, indicating which response category has the worst agreement.

Model summary: Count of true positives, true negatives, false positives and false negatives. The effectiveness shows the percentage of the agreement between each appraiser and the standard. It reflects the accuracy of the measurement system.

The post Variable Gage R&R with JMP appeared first on Deploy OpEx.

Variable Gage Repeatability & Reproducibility (Gage R&R) is a method used to analyze the variability of a measurement system by partitioning the variation of the measurements using ANOVA (Analysis of Variance). Whenever something is measured repeatedly or by different people or processes, the results of the measurements will vary. Variation comes from two primary sources:

- Differences between the parts being measured
- The measurement system

We can use a Gage R&R to conduct a measurement system analysis to determine what portion of the variability comes from the parts and what portion comes from the measurement system. There are key study results that help us determine the components of variation within our measurement system.

Variable Gage R&R primarily addresses the precision aspect of a measurement system. It is a tool used to understand whether a measurement system can repeat and reproduce measurements and, if not, to help us determine which aspect of the measurement system is broken so that we can fix it.

Gage R&R requires a deliberate study with parts, appraisers and measurements. Measurement data must be collected and analyzed to determine if the measurement system is acceptable. Typically, Variable Gage R&Rs are conducted with 3 appraisers measuring 10 samples 3 times each. Then, the results can be compared to determine where the variability is concentrated. The optimal result is for most of the variability to be due to the parts rather than the measurement system.
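As a rough illustration of how the variability is partitioned (the variance numbers below are made up, not from any real study), the percent contribution of the measurement system is just its share of the total variance:

```python
# hypothetical variance components, as would come from a Gage R&R ANOVA table
var_part = 0.090      # part-to-part variation
var_repeat = 0.004    # repeatability (equipment variation)
var_reprod = 0.003    # reproducibility (appraiser variation)

var_gauge = var_repeat + var_reprod       # total measurement-system variance
var_total = var_part + var_gauge
pct_contribution = 100 * var_gauge / var_total
print(round(pct_contribution, 1))  # about 7.2% of total variance from the gauge
```

In a healthy measurement system the part-to-part component dominates, keeping the gauge’s percent contribution small.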

Measurement System Analysis (MSA) is a systematic method to identify and analyze the variation components of a measurement system. It is a mandatory step in any Six Sigma project to ensure the data are reliable before making any data-based decisions. An MSA is the checkpoint of data quality before we start any further analysis and draw any conclusions from the data. Some good examples of data-based analysis where an MSA should be a prerequisite:

- Correlation analysis
- Regression analysis
- Hypothesis testing
- Analysis of variance
- Design of experiments
- Statistical process control

You will see where and how the analysis techniques listed above are used. It is critical to know that any variation, anomalies, or trends found in your analysis are actually due to the data and not due to the inaccuracies or inadequacies of a measurement system. Therefore, the need for an MSA is vital.

A measurement system is a process used to obtain data and quantify a part, product or process. Data obtained with a measurement device or measurement system are the observed values. Observed values are composed of two elements:

- True Value = Actual value of the measured part
- Measurement Error = Error introduced by the measurement system.

The true value is what we are ultimately trying to determine through the measurement system. It reflects the true measurement of the part or performance of the process.

Measurement error is the variation introduced by the measurement system. It is the bias or inaccuracy of the measurement device or measurement process.
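The decomposition of an observed value into a true value plus measurement error can be sketched with a toy simulation (all numbers below are hypothetical):

```python
import random

random.seed(7)        # reproducible toy example
true_value = 10.00    # hypothetical true dimension of the part
bias = 0.05           # systematic error (inaccuracy) of the gauge
noise_sd = 0.02       # random error of the measurement process

# each observed value = true value + bias + random noise
observed = [true_value + bias + random.gauss(0, noise_sd) for _ in range(1000)]
avg = sum(observed) / len(observed)
# the average observed value drifts toward true_value + bias, not true_value,
# which is exactly why an MSA must quantify the measurement error
```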

The observed value is what the measurement system is telling us. It is the measured value obtained by the measurement system. Observed values are represented in various types of measures, which can be categorized into two primary types: discrete and continuous. Continuous measurements are represented by measures of weight, height, money and other types of measures such as ratio measures. Discrete measures, on the other hand, are categorical, such as Red/Yellow/Green, Yes/No or ratings of 1–10.

The guidelines for acceptable or unacceptable measurement systems can vary depending on an organization's tolerance or appetite for risk. The common guidelines used for interpretation are published by the Automotive Industry Action Group (AIAG). These guidelines are considered standard for interpreting the results of a measurement system analysis using Variable Gage R&R. The table below summarizes the AIAG standards.

Use JMP to Implement a Variable MSA

Data File: “VariableMSA.jmp”

Let’s take a look at an example of a Variable MSA using the data in the Variable MSA tab in your “Sample Data.xlsx” file. In this exercise we will first walk through how to set up your study using JMP, and then we will perform a Variable MSA using 3 operators who each measured 10 parts three times. The part numbers, operators and measurement trials are all generic so that you can apply the concept to your given industry. First we need to set up the study.

Step 1: Initiate the MSA study

- Click: Analyze > Quality & Process > Measurement Systems Analysis
- Select “Measurement” as “Y, Response”
- Select “Operator” as “X, Grouping”
- Select “Part” as “Sample, Part ID”
- Select “Gauge R&R” as the “MSA Method”
- Select “Crossed” as “Model Type”

- Click “OK”

Step 2: Create the variability chart for measurement

- Click on the red triangle button next to “Variability Gauge”
- Click “Connect Cell Means” to link the average measurement for each part together
- Click “Show Group Means” to display the average for each appraiser (solid line)
- Click “Show Grand Mean” to display the average for the entire data set (dotted line)

Step 3: Implement Gauge R&R

- Click on the red triangle button next to “Variability Gauge”
- Click “Gauge Studies” -> “Gauge RR”
- A window named “Enter/Verify Gauge R&R Specifications” opens
- Enter the specified value into “K, Sigma Multiplier” box. In this example, we use 5.15 to assume a 99% spread of the data

- Click “OK”
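The 5.15 multiplier comes from the normal distribution: the central 99% of a normal population lies within about ±2.575 standard deviations, a total width of roughly 5.15 sigma. A quick check:

```python
from statistics import NormalDist

# width of the central 99% of a standard normal distribution
k = NormalDist().inv_cdf(0.995) - NormalDist().inv_cdf(0.005)
print(round(k, 2))  # 5.15
```

A multiplier of 6 (±3 sigma, about 99.73%) is the other common convention; which one your organization uses only changes the reported study-variation percentages, not the underlying variance components.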

Step 4: Create Mean Plots for further analysis

- Click on the red triangle button next to “Variability Gauge”
- Click “Gauge Studies” -> “Gauge R&R Plots” -> “Mean Plots”
- Three plots appear

Model summary: The result of this Gage R&R study leaves room for consideration on one key measure. As noted in previous pages, the target is a percent contribution for R&R below 9% and a study variation below 30%. The % contribution of 7% is below the 9% threshold, and similarly the study variation of 26.1476% is below the 30% threshold, but the latter result is marginal at best and should be heavily scrutinized by the business before concluding that the measurement system does not warrant further improvement.

Visual evaluation of this measurement system is another effective method of evaluation, but it can at times be misleading without the statistics to support it. Diagnosing the mean plots above should help in the consideration of measurement system acceptability; you may benefit from taking a closer look at operator C.

The post Run Chart with JMP appeared first on Deploy OpEx.

A run chart is a chart used to present data in time order. Run charts capture process performance over time. The X axis of a run chart indicates time and the Y axis shows the observed values. A run chart is similar to a scatter plot in that it shows the relationship between X and Y. Run charts differ, however, because they show how the Y variable changes with an X variable of time.

Run charts look similar to control charts except that run charts do not have control limits, and they are much easier to produce than a control chart. A run chart is often used to identify anomalies in the data and discover patterns over time. They help to identify trends, cycles, seasonality and other anomalies.
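A simple numeric companion to eyeballing a run chart is counting runs about the median: a stable process crosses the median often, while a shift or trend produces few, long runs. A minimal sketch (this is an illustration, not a JMP feature):

```python
from statistics import median

def runs_about_median(series):
    """Count runs of consecutive points on the same side of the median.
    Points exactly on the median are ignored, a common convention."""
    m = median(series)
    sides = [1 if x > m else -1 for x in series if x != m]
    return 1 + sum(1 for a, b in zip(sides, sides[1:]) if a != b)

# a stable process alternates often; a shifted process gives few, long runs
print(runs_about_median([5, 7, 4, 8, 5, 9, 4, 8]))  # 8
print(runs_about_median([4, 4, 5, 5, 9, 9, 8, 8]))  # 2
```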

Data File: “RunChart.jmp”

- Click Analyze -> Quality & Process -> Control Chart -> Run Chart
- Select “Measurement” as the “Process”

- Click “OK”

The figure above is a run chart created with JMP. The time series displayed by this chart appears stable. There are no extreme outliers, no visible trending or seasonal patterns. The data points seem to vary randomly over time.

Now, let us take a look at another example which may give us a different perspective. We will create another run chart using the data listed in the column labeled “Cycle”. This column is in the same file used to generate Figure 2.39. Follow the steps used for the first run chart and instead of using “Measurement” use “Cycle” in the Run Chart dialog box.

In this example, the data points are clearly exhibiting a pattern. It could be seasonal or it could be something cyclical. Imagine that the data points are taken monthly and this is a process performing over a period of 2.5 years. Perhaps the data points represent the number of customers buying new homes. The home buying market tends to peak in the summer months and dies down in the winter. Using the same data tab, let’s create a final run chart. This time use the “Trend” data. Again, follow the steps outlined previously to generate a run chart.

In this example the process starts out randomly, but after the seventh data point almost every data point has a lower value than the one before it. This clearly illustrates a downward trend. What might this represent? Perhaps a process winding down? Product sales at the end of a product’s life cycle? Defects decreasing after introducing a process improvement?

Model summary: It should be clear through our review of Histograms, Scatterplots and Run Charts, that there is great value in “visualizing” the data. Graphical displays of data can be very telling and offer excellent information.

The post Scatter Plot with JMP appeared first on Deploy OpEx.

A scatter plot is a diagram to present the relationship between two variables of a data set. A scatterplot consists of a set of data points. On the scatterplot, a single observation is presented by a data point with its horizontal position equal to the value of one variable and its vertical position equal to the value of the other variable. A scatterplot helps us to understand:

- Whether the two variables are related to each other or not
- The strength of their relationship
- The shape of their relationship
- The direction of their relationship
- Whether outliers are present

Data File: “ScatterPlot.jmp”

Model summary: The figure above is JMP’s output of the scatterplot data. You can immediately see the value of graphical displays of data. The information obtainable from this output shows a relationship between weight and MPG: the heavier the weight, the lower the MPG value, and vice versa.

The post Histogram Rendering with JMP appeared first on Deploy OpEx.

A histogram is a graphical tool to present the distribution of the data. The X axis of a histogram represents the possible values of the variable and the Y axis represents the frequency of the value occurring. A histogram consists of adjacent rectangles erected over intervals with heights equal to the frequency density of the interval. The total area of all the rectangles in a histogram is the number of data values.

A histogram can also be normalized. In the case of normalization, the X axis still represents the possible values of the variable, but the Y axis represents the percentage of observations that fall into each interval on the X axis. The total area of all the rectangles in a normalized histogram is 1. Using histograms, we have a better understanding of the shape, location, and spread of the data.
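The normalization step is just dividing each interval’s count by the total number of observations, so the fractions across intervals sum to 1. A minimal sketch with made-up values:

```python
def normalized_histogram(data, edges):
    """Fraction of observations in each half-open interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(data) for c in counts]

# hypothetical heights binned into [5.5, 6.0), [6.0, 6.5), [6.5, 7.0)
fractions = normalized_histogram([5.6, 5.8, 6.1, 6.3, 6.9], [5.5, 6.0, 6.5, 7.0])
print(fractions)  # [0.4, 0.4, 0.2] -- the fractions sum to 1
```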

Data File: “Histogram.jmp”

- Click Analyze -> Distribution
- Select “HtBk” as “Y, Columns”

- Click “OK”
- Click on the red triangle button next to “HtBk”
- Uncheck Histogram Options -> Vertical

Model summary: The output from the previous steps has generated a graphical summary report of the data set HtBk. Among the information provided is a histogram. The image shows the frequency of the data for the numerical categories ranging from 5.5 to approximately 7. You can see the shape of the data roughly follows the bell curve.

The post Box Plot with JMP appeared first on Deploy OpEx.

A box plot is a graphical method to summarize a data set by visualizing the minimum value, 25th percentile, median, 75th percentile, the maximum value, and potential outliers. A percentile is the value below which a certain percentage of data fall. For example, if 75% of the observations have values lower than 685 in a data set, then 685 is the 75th percentile of the data. At the 50th percentile, or median, 50% of the values are lower and 50% are higher than that value.

The figure above describes how to read a box plot. Here are a few explanations that may help. The middle part of the plot, the box, spans the “interquartile range” (IQR): the middle 50% of the data, from the 25th to the 75th percentile. The line near the middle of the box represents the median (the middle value of the data set). The whiskers on either side of the IQR cover the lowest and highest quarters of the data; their ends mark the most extreme values not flagged as outliers, and the individual dots beyond the whiskers represent outliers in the data set.
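The quartiles and outlier fences behind a box plot can be computed directly. A sketch with made-up data, using the common 1.5 × IQR outlier rule:

```python
from statistics import quantiles

data = [2, 4, 4, 5, 6, 7, 9, 12, 15, 30]
q1, q2, q3 = quantiles(data, n=4)  # 25th, 50th (median), 75th percentiles
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]
print(outliers)  # [30] -- the lone point beyond the upper whisker
```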

Data File: “BoxPlot.jmp”

- Click Analyze -> Distribution
- Select “HtBk” as “Y, Columns”

- Click “OK”
- Click on the red triangle button next to “HtBk”
- Uncheck Histogram Options -> Vertical

The post One Sample t Test with JMP appeared first on Deploy OpEx.

In statistics, a t test is a hypothesis test in which the test statistic follows a Student’s t distribution if the null hypothesis is true. We apply a one sample t test when the population standard deviation (σ) is unknown and we use the sample standard deviation (s) instead. A hypothesis test is a statistical method in which a specific hypothesis is formulated about a population, and the decision of whether to reject the hypothesis is made based on sample data. Hypothesis tests help to determine whether a hypothesis about a population or multiple populations is true with a certain confidence level based on sample data. Hypothesis testing is a critical tool in the Six Sigma tool belt. It helps us separate fact from fiction, and special cause from noise, when we are looking to make decisions based on data.

One sample t test is a hypothesis test to study whether there is a statistically significant difference between a population mean and a specified value.

- Null Hypothesis (H_0): μ = μ_0
- Alternative Hypothesis (H_a): μ ≠ μ_0

Where:

- μ is the mean of the population of our interest
- μ_0 is the specified value we want to compare against

- The sample data of the population of interest are unbiased and representative.
- The data of the population are continuous.
- The data of the population are normally distributed.
- The variance of the population of our interest is unknown.
- One sample t-test is more robust than the z-test when the sample size is small (< 30).

To check whether the population of our interest is normally distributed, we need to run a normality test. While there are many normality tests available, such as Anderson–Darling, Shapiro–Wilk, and Jarque–Bera, our examples will default to using the Anderson–Darling test for normality.

- Null Hypothesis (H_0): The data are normally distributed
- Alternative Hypothesis (H_a): The data are not normally distributed

**Test Statistic and Critical Value of One Sample t Test**

To understand what is happening when you run a t-test with your software, the formulas here will walk you through the key calculations and how to determine whether the null hypothesis should be rejected. To determine significance, you calculate the t-statistic and compare it to the critical value, a reference value based on the alpha value and the degrees of freedom (n – 1). The t-statistic is calculated from the sample mean, the sample standard deviation, and the sample size.

The test statistic is calculated with the formula:

t_calc = (Ȳ – μ_0) / (s / √n)

where Ȳ is the sample mean, n is the sample size, and s is the sample standard deviation.

- t_crit is the t-value in a Student’s t distribution with the predetermined significance level α and degrees of freedom (n – 1)
- t_crit values for a two-sided and a one-sided hypothesis test with the same significance level α and degrees of freedom (n – 1) are different
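The t-statistic calculation can be sketched directly; the sample data below are hypothetical, and the critical value still comes from a t-table or statistical software:

```python
from math import sqrt
from statistics import mean, stdev

sample = [6.6, 6.8, 7.1, 6.9, 6.7, 7.0, 6.8, 6.9]  # hypothetical heights (feet)
mu_0 = 7.0
n = len(sample)
t_calc = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
# |t_calc| is about 2.65; t_crit for a two-sided test with alpha = 0.05 and
# 7 degrees of freedom is about 2.365, so this sample would reject the null
```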

**Decision Rules of One Sample t Test**

Based on the sample data, we calculate the test statistic t_calc, which is compared against t_crit to make a decision of whether to reject the null.

- Null Hypothesis (H_0): μ = μ_0
- Alternative Hypothesis (H_a): μ ≠ μ_0

If |t_calc| > t_crit, we reject the null and claim there is a statistically significant difference between the population mean μ and the specified value μ_0.

If |t_calc| < t_crit, we fail to reject the null and claim there is not any statistically significant difference between the population mean μ and the specified value μ_0.

Case study: We want to compare the average height of basketball players against 7 feet.

Data File: “OneSampleT-Test.jmp”

- Null Hypothesis (H_0): μ = 7
- Alternative Hypothesis (H_a): μ ≠ 7

Step 1: Test whether the data are normally distributed

- Click Analyze -> Distribution
- Select “HtBk” as “Y, Columns”

- Click “OK”
- Click on the red triangle button next to “HtBk” in the Distribution page
- Click Continuous Fit -> Normal
- Click on the red triangle button next to “Fitted Normal”
- Select “Goodness of Fit”

- Null Hypothesis (H_0): The data are normally distributed
- Alternative Hypothesis (H_a): The data are not normally distributed

Since the p-value of the normality test is 0.3197 and greater than the alpha level (0.05), we fail to reject the null and claim that the data are normally distributed. If the data are not normally distributed, you need to use hypothesis tests other than the one sample t-test.

Now we can run the one-sample t-test, knowing the data are normally distributed.

Step 2: Run the one-sample t-test

- Click on the red triangle button next to “HtBk”
- Select “Test Mean”
- A new window named “Test Mean” pops up
- Enter the specific value we want to compare against (7 in this example) in the box next to “Specify Hypothesized Mean”

- Click “OK”

- Null Hypothesis (H_0): μ = 7
- Alternative Hypothesis (H_a): μ ≠ 7

Model summary: Since the p-value is smaller than the alpha level (0.05), we reject the null hypothesis and claim that the average height of basketball players is statistically different from 7 feet.

The post Central Limit Theorem with JMP appeared first on Deploy OpEx.

The Central Limit Theorem is one of the fundamental theorems of probability theory. It states a condition under which the mean of a large number of independent and identically distributed random variables, each of which has a finite mean and variance, is approximately normally distributed. Let us assume Y_1, Y_2, …, Y_n is a sequence of n i.i.d. random variables, each of which has finite mean μ and variance σ², where σ² > 0. As n increases, the sample average of the n random variables is approximately normally distributed, with mean equal to μ and variance equal to σ²/n, regardless of the common distribution the Y_i follow, where i = 1, 2, …, n.

A sequence of random variables is independent and identically distributed (i.i.d.) if each random variable is independent of the others and has the same probability distribution as the others. It is one of the basic assumptions of the Central Limit Theorem. Consider the law of large numbers (LLN): it is a theorem that describes the result of performing the same experiment a large number of times. According to the LLN, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed. The following example will explain this further.

Let us assume we have 10 fair dice at hand. Each time we roll all 10 dice together, we record the average of the 10 dice. We repeat rolling the dice 50 times so that we have 50 data points. Upon doing so, we will discover that the probability distribution of the sample average approximates the normal distribution, even though a single roll of a fair die follows a discrete uniform distribution. Knowing that each die has six possible values (1, 2, 3, 4, 5, 6), when we record the average of the 10 dice over time, we would expect the number to start approximating 3.5 (the average of all possible values). The more rolls we perform, the closer the distribution would be to a normal distribution centered on a mean of 3.5.
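The dice example is easy to simulate: the averages of 10 dice cluster tightly around 3.5 even though a single roll is uniform (the trial count here is arbitrary):

```python
import random

random.seed(1)  # reproducible
# average of 10 dice, repeated many times
averages = [sum(random.randint(1, 6) for _ in range(10)) / 10
            for _ in range(5000)]
grand_mean = sum(averages) / len(averages)
# grand_mean is close to 3.5, and the spread of the averages is far
# narrower than the 1-to-6 range of a single die
```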

- Use the sample mean to estimate the population mean if the assumptions of Central Limit Theorem are met
- Use standard error of the mean to measure the standard deviation of the sample mean estimate of a population mean
- Use a larger sample size, if economically feasible, to decrease the variance of the sampling distribution. The larger the sample size, the more precise the estimation of the population parameter
- Use a confidence interval to describe the region in which the population parameter would fall. The sampling distribution approximates the normal distribution, in which 95% of the data stay within two standard deviations of the center, so the population mean would fall within two standard errors of the mean of the sample mean, 95% of the time

The confidence interval is an interval where the true population parameter would fall within a certain confidence level. A 95% confidence interval, the most commonly used confidence level, indicates that the population parameter would fall in that region 95% of the time or we are 95% confident that the population parameter would fall in that region. The confidence interval is used to describe the reliability of a statistical estimate of a population parameter.

The width of a confidence interval depends on the:

- Confidence level—The higher the confidence level, the wider the confidence interval
- Sample size—The smaller the sample size, the wider the confidence interval
- Variability in the data—The more variability, the wider the confidence interval
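These ingredients (sample mean, standard error, and the roughly-two-standard-errors multiplier) combine into a normal-approximation 95% confidence interval; a minimal sketch:

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval_95(sample):
    """Normal-approximation 95% CI for the population mean."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))  # standard error of the mean
    return m - 1.96 * se, m + 1.96 * se

low, high = confidence_interval_95(list(range(1, 101)))
# the interval straddles the sample mean of 50.5; a larger sample size
# or less variability in the data would make the interval narrower
```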

Data File: “CentralLimitTheorem.jmp”

- Click Analyze -> Distribution
- Select “Cycle Time (Minutes)” as the “Y, Columns”

- Click “OK”
- “Upper 95% Mean” and “Lower 95% Mean” at the bottom of the newly generated window are the upper and lower boundaries of 95% confidence interval

In JMP, the confidence level is 95% by default. In order to see the confidence interval of “Cycle Time (Minutes)” at another confidence level, we need to:

- Click on the red triangle button next to “Cycle Time (Minutes)”
- Select Confidence Interval -> the confidence level of interest (e.g. 90%, 95%, 99% etc.)
- The confidence interval at the selected confidence level appears at the bottom of the distribution analysis page

The post Multi Vari Analysis with JMP appeared first on Deploy OpEx.

Multi vari analysis is a graphic-driven method to analyze the effects of categorical inputs on a continuous output. It studies how the variation in the output changes across different inputs and helps us quantitatively determine the major source of variability in the output. Multi-vari charts are used to visualize the source of variation. They work for both crossed and nested hierarchies.

- Y: continuous variable
- X’s: discrete categorical variables. One X may have multiple levels

Hierarchy is a structure of objects in which the relationship of objects can be expressed similar to an organization tree. Each object in the hierarchy is described as above, below, or at the same level as another one. If object A is above object B and they are directly connected to each other in the hierarchy tree, A is B’s parent and B is A’s child. In Multi-vari analysis, we use the hierarchy to present the relationship between categorical factors (inputs). Each object in the hierarchy tree indicates a specific level of a factor (input). There are generally two types of hierarchies: crossed and nested.

Case study: ABC Company produces 10 kinds of units with different weights. Operators measure the weights of the units before sending them out to customers. Multiple factors could have an impact on the weight measurements. The ABC Company wants to have a better understanding of the main source of variability existing in the weight measurement. The ABC Company randomly selects three operators (Joe, John, and Jack) each of whom measures the weights of 10 different units. For each unit, there are three items sampled.

Data File: “Multi-Vari.jmp”

- Click Analyze -> Quality and Process -> Variability/Attribute Gauge Chart
- Select “Measurement” as “Y, Response”
- Select both “Operator” and “Unit” as “X, Grouping”

- Select “Crossed” as “Model Type” since it’s a crossed hierarchy
- Click “OK”
- Click on the red triangle button next to “Variability Gauge”
- Click “Connect Cell Means” to create lines connecting the individual observations
- Click “Show Group Means” to display the average within each group
- Click “Show Grand Mean” to display the average across all the groups
- Click “Variance Components” to display the proportion table of variance components

Connecting the cell means enables you to see the variation within and between units, and within operators. When adding the group means you can see the variation between operators as well. The grand mean on the right of the chart provides a broad reference point.

Based on the Multi-Vari chart, the measurements of the units range from 0.3 to 1.2. Joe’s and John’s mean measurements stay between 0.8 and 0.9. Jack’s mean is slightly lower than both Joe’s and John’s. John has the worst variation when measuring the same kind of unit, since he has the largest difference between the maximum and minimum bars for any unit. By observing the black lines of the three operators, it seems that all three operators’ measurements follow the same pattern. The operator-to-operator variability is not large. The unit-to-unit variability is large, and it could be the main source of variation in measurements.
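A crude way to reach the same conclusion numerically (a toy comparison with made-up readings, not JMP’s variance-components ANOVA) is to compare the spread of the unit means with the spread of the operator means:

```python
from statistics import mean, pvariance

# hypothetical measurements: readings[operator][unit] = repeated readings
readings = {
    "Joe":  {"U1": [0.5, 0.6], "U2": [1.0, 1.1]},
    "John": {"U1": [0.6, 0.7], "U2": [1.1, 1.2]},
    "Jack": {"U1": [0.4, 0.5], "U2": [0.9, 1.0]},
}

unit_means = [mean([x for op in readings.values() for x in op[u]])
              for u in ["U1", "U2"]]
operator_means = [mean([x for unit in op.values() for x in unit])
                  for op in readings.values()]

# the unit means differ far more than the operator means, pointing at
# unit-to-unit variation as the dominant source of variability
print(pvariance(unit_means) > pvariance(operator_means))  # True
```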

**Quantifying Variance Components**

Model summary: JMP can then quantify the amount of variance that each component contributes to the system. The unit, as you can see from the table above, contributes 75% of the variance.

The post Kruskal Wallis with JMP appeared first on Deploy OpEx.

The Kruskal Wallis one-way analysis of variance is a statistical hypothesis test to compare the medians among more than two groups.

- Null Hypothesis (H_0): η_1 = η_2 = … = η_k
- Alternative Hypothesis (H_a): at least one of the medians is different from the others

Where:

- η_i is the median of population i
- k is the number of groups of our interest

It is an extension of the Mann–Whitney test. While the Mann–Whitney test allows us to compare the samples of two populations, the Kruskal Wallis test allows us to compare the samples of more than two populations.

One key difference between this test and the Mann–Whitney test is the robustness of the test when the populations are not identically shaped. If this is the case, there is a different test, called Mood’s median, which is more appropriate.

- The sample data drawn from the populations of interest are unbiased and representative
- The data of the k populations are continuous, or ordinal when the spacing between adjacent values is not constant
- The k populations are independent of each other
- The Kruskal–Wallis test is robust for non-normally distributed populations

The Kruskal Wallis test works very similarly to the Mann–Whitney Test.

**Step 1:** Group the k samples from the k populations (sample i is from population i) into one single data set, then sort the data in ascending order and rank it from 1 to N, where N is the total number of observations across the k groups.

**Step 2:** Add up the ranks of all the observations from sample i and call the sum r_i, where i can be any integer between 1 and k.

**Step 3:** Calculate the test statistic:

T = 12 / (N(N + 1)) × Σ_i (r_i² / n_i) − 3(N + 1)

Where:

- k is the number of groups
- n_i is the sample size of sample i
- N is the total number of all the observations across the k groups
- r_ij is the rank (among all the observations) of observation j from group i, so r_i = Σ_j r_ij

**Step 4:** Make a decision of whether to reject the null hypothesis.

- Null Hypothesis (H_0): η_1 = η_2 = … = η_k
- Alternative Hypothesis (H_a): at least one of the medians is different from the others

The test statistic T follows a chi-square distribution with k − 1 degrees of freedom when the null hypothesis is true. If T is greater than χ²_crit (the critical chi-square statistic), we reject the null and claim there is at least one median statistically different from the other medians. If T is smaller than χ²_crit, we fail to reject the null and claim the medians of the k groups are equal.
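The ranking steps above can be sketched in code (average ranks are used for ties; the critical value still comes from a chi-square table or software):

```python
def kruskal_wallis_T(groups):
    """Kruskal-Wallis test statistic from a list of samples."""
    pooled = sorted(x for g in groups for x in g)
    N = len(pooled)
    # assign each distinct value the average of its rank positions (handles ties)
    positions = {}
    for idx, v in enumerate(pooled, start=1):
        positions.setdefault(v, []).append(idx)
    rank = {v: sum(p) / len(p) for v, p in positions.items()}
    rank_sums = [sum(rank[x] for x in g) for g in groups]
    return (12.0 / (N * (N + 1))
            * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
            - 3 * (N + 1))

T = kruskal_wallis_T([[1, 2, 3], [4, 5, 6]])
# T is about 3.86, just above the chi-square critical value of about 3.84
# for 1 degree of freedom at alpha = 0.05, so this toy data rejects the null
```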

Case study: We are interested in comparing customer satisfaction among three types of customers using a nonparametric (i.e., distribution-free) hypothesis test: Kruskal Wallis one-way ANOVA.

Data File: “Kruskal–Wallis.jmp”

- Null Hypothesis (H_0): η_1 = η_2 = … = η_k
- Alternative Hypothesis (H_a): at least one of the customer types has a different overall satisfaction level from the others

- Click Analyze -> Fit Y by X
- Select “Overall Satisfaction” as “Y, Response”
- Select “Customer Type” as “X, Factor”

- Click “OK”
- Click on the red triangle button next to “One-Way Analysis of Overall Satisfaction by Customer Type”
- Click Nonparametric -> Wilcoxon Test

Model summary: The result of the test is boxed in the output. The p-value is lower than the alpha level (0.05); therefore, we reject the null hypothesis and claim that the overall satisfaction median of at least one customer type is statistically different from the others.
