Non-parametric tests applicability

23/08/2021 By indiafreenotes

Nonparametric tests are methods of statistical analysis that do not require the data to come from a distribution satisfying particular assumptions (most notably, they do not require the data to be normally distributed). For this reason, they are sometimes referred to as distribution-free tests. Nonparametric tests serve as an alternative to parametric tests such as the t-test or ANOVA, which can be employed only if the underlying data satisfy certain criteria and assumptions.

Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions (common examples of parameters are the mean and variance). It relies on methods that are either distribution-free or based on a specified distribution whose parameters are left unspecified. Nonparametric statistics includes both descriptive statistics and statistical inference, and nonparametric tests are often used when the assumptions of parametric tests are violated.
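As a minimal sketch of this substitution (assuming Python with NumPy and SciPy available; the samples and variable names below are hypothetical), the following runs a parametric t-test and its common nonparametric counterpart, the Mann–Whitney U test, on the same two groups:

```python
# Minimal sketch: a parametric test and its nonparametric counterpart
# applied to the same (hypothetical) data. Assumes NumPy and SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two illustrative samples; group_b is drawn from a shifted distribution.
group_a = rng.normal(loc=50, scale=10, size=30)
group_b = rng.normal(loc=55, scale=10, size=30)

# Parametric: independent-samples t-test (assumes normality).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Nonparametric counterpart: Mann-Whitney U (no normality assumption).
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:         statistic={t_stat:.3f}, p={t_p:.4f}")
print(f"Mann-Whitney U: statistic={u_stat:.3f}, p={u_p:.4f}")
```

Both calls return a test statistic and a p-value; only the assumptions they rest on differ.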

Applications and purpose

Non-parametric methods are widely used for studying populations that take on a ranked order (such as movie reviews receiving one to four stars). The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of levels of measurement, non-parametric methods result in ordinal data.

As non-parametric methods make fewer assumptions, their applicability is much wider than that of the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question. Also, because they rely on fewer assumptions, non-parametric methods are more robust.

Another justification for the use of non-parametric methods is simplicity. In certain cases, even when the use of parametric methods is justified, non-parametric methods may be easier to use. Due both to this simplicity and to their greater robustness, non-parametric methods are seen by some statisticians as leaving less room for improper use and misunderstanding.

The wider applicability and increased robustness of non-parametric tests come at a cost: in cases where a parametric test would be appropriate, non-parametric tests have less power. In other words, a larger sample size may be required to draw conclusions with the same degree of confidence.
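One way to see this trade-off is a small simulation: when the data really are normal and a true difference exists, the t-test rejects the null hypothesis somewhat more often than the Mann–Whitney U test at the same sample size. The sketch below is a rough Monte Carlo estimate with hypothetical parameters, not a formal power analysis, and again assumes NumPy and SciPy.

```python
# Rough Monte Carlo sketch of the power cost of a nonparametric test
# when the parametric assumptions actually hold. Assumes NumPy and SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, shift, alpha, trials = 25, 5.0, 0.05, 2000

t_rejections = mw_rejections = 0
for _ in range(trials):
    a = rng.normal(50, 10, n)
    b = rng.normal(50 + shift, 10, n)  # a true difference exists
    if stats.ttest_ind(a, b).pvalue < alpha:
        t_rejections += 1
    if stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha:
        mw_rejections += 1

print(f"Estimated power, t-test:         {t_rejections / trials:.2f}")
print(f"Estimated power, Mann-Whitney U: {mw_rejections / trials:.2f}")
```

Under settings like these the gap is typically modest, but it illustrates the direction of the trade-off: matching the parametric test's confidence generally requires more data.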

Methods

Non-parametric (or distribution-free) inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the probability distributions of the variables being assessed. The most frequently used tests include the following (a few of them are illustrated in the code sketch after this list):

  • Analysis of similarities (ANOSIM): Tests whether there is a statistically significant difference between two or more groups of sampling units

  • Anderson–Darling test: Tests whether a sample is drawn from a given distribution
  • Statistical bootstrap methods: Estimates the accuracy/sampling distribution of a statistic
  • Cochran’s Q: Tests whether k treatments in randomized block designs with 0/1 outcomes have identical effects
  • Cohen’s kappa: Measures inter-rater agreement for categorical items
  • Friedman two-way analysis of variance by ranks: Tests whether k treatments in randomized block designs have identical effects
  • Kaplan–Meier: Estimates the survival function from lifetime data, modeling censoring
  • Kendall’s tau: Measures statistical dependence between two variables
  • Kendall’s W: A measure between 0 and 1 of inter-rater agreement
  • Kolmogorov–Smirnov test: Tests whether a sample is drawn from a given distribution, or whether two samples are drawn from the same distribution
  • Kruskal–Wallis one-way analysis of variance by ranks: Tests whether > 2 independent samples are drawn from the same distribution
  • Kuiper’s test: Tests whether a sample is drawn from a given distribution, sensitive to cyclic variations such as day of the week
  • Logrank test: Compares survival distributions of two right-skewed, censored samples
  • Mann–Whitney U or Wilcoxon rank sum test: Tests whether two samples are drawn from the same distribution, as compared to a given alternative hypothesis.
  • McNemar’s test: Tests whether, in 2 × 2 contingency tables with a dichotomous trait and matched pairs of subjects, row and column marginal frequencies are equal
  • Median test: Tests whether two samples are drawn from distributions with equal medians
  • Pitman’s permutation test: A statistical significance test that yields exact p values by examining all possible rearrangements of labels
  • Rank products: Detects differentially expressed genes in replicated microarray experiments
  • Siegel–Tukey test: Tests for differences in scale between two groups
  • Sign test: Tests whether matched pair samples are drawn from distributions with equal medians
  • Spearman’s rank correlation coefficient: Measures statistical dependence between two variables using a monotonic function
  • Squared ranks test: Tests equality of variances in two or more samples
  • Tukey–Duckworth test: Tests equality of two distributions by using ranks
  • Wald–Wolfowitz runs test: Tests whether the elements of a sequence are mutually independent/random
  • Wilcoxon signed-rank test: Tests whether matched pair samples are drawn from populations with different mean ranks
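To make the list above concrete, the sketch below (assuming Python with SciPy, on hypothetical data) calls a few of these tests as implemented in scipy.stats: Kolmogorov–Smirnov, Kruskal–Wallis, Wilcoxon signed-rank, and Spearman's rank correlation.

```python
# Sketch of a few tests from the list above, as exposed by scipy.stats.
# Data are hypothetical; this only shows the call patterns.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 40)
y = rng.normal(0.3, 1, 40)
z = rng.normal(0.6, 1, 40)

# Kolmogorov-Smirnov: is x drawn from a standard normal distribution?
print(stats.kstest(x, "norm"))

# Kruskal-Wallis: do three independent samples come from the same distribution?
print(stats.kruskal(x, y, z))

# Wilcoxon signed-rank: paired samples (here x vs. y, treated as matched).
print(stats.wilcoxon(x, y))

# Spearman's rank correlation: monotonic dependence between x and y.
print(stats.spearmanr(x, y))
```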

Reasons to Use Nonparametric Tests

The sample size is too small

The sample size is an important consideration in selecting the appropriate statistical method. If the sample size is reasonably large, the applicable parametric test can be used. However, if the sample size is too small, you may not be able to validate the distribution of the data. In that case, a nonparametric test may be the only suitable option.
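For instance, with only a handful of paired observations it is hard to justify a normality assumption, while an exact nonparametric procedure such as the sign test still applies. The sketch below uses hypothetical before/after measurements and assumes SciPy >= 1.7 (for binomtest).

```python
# Sketch: a sign test on a very small paired sample, where validating
# a normal distribution is not realistic. Assumes SciPy >= 1.7 (binomtest).
from scipy import stats

# Hypothetical before/after measurements for 8 subjects.
before = [12.1, 10.4, 13.0, 9.8, 11.5, 12.7, 10.9, 11.1]
after  = [12.9, 10.2, 13.8, 10.5, 12.0, 13.1, 11.4, 11.0]

diffs = [a - b for a, b in zip(after, before)]
n_positive = sum(d > 0 for d in diffs)
n_nonzero = sum(d != 0 for d in diffs)

# Under H0 (equal medians), positive and negative differences are equally likely.
result = stats.binomtest(n_positive, n_nonzero, p=0.5)
print(f"{n_positive} of {n_nonzero} differences positive, p={result.pvalue:.4f}")
```

With so few observations, the exact binomial reference distribution needs no appeal to normality.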

The underlying data do not meet the assumptions about the population sample

Generally, the application of parametric tests requires various assumptions to be satisfied; for example, that the data follow a normal distribution and that the population variance is homogeneous. However, some data samples show skewed distributions.

Skewness makes parametric tests less powerful because the mean is no longer the best measure of central tendency; it is strongly affected by extreme values. Nonparametric tests, by contrast, work well with skewed distributions and with distributions that are better represented by the median.
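The sketch below (hypothetical right-skewed data, assuming NumPy and SciPy) shows how a few extreme values pull the mean away from the bulk of the data while the median stays representative, and then applies the Mann–Whitney U test, which compares the samples through ranks rather than means.

```python
# Sketch: skewed data where the median is the more representative centre,
# analysed with a rank-based test. Assumes NumPy and SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Right-skewed samples (e.g. waiting times); a few large values inflate the mean.
sample_1 = rng.exponential(scale=5.0, size=50)
sample_2 = rng.exponential(scale=8.0, size=50)

print(f"sample_1: mean={sample_1.mean():.2f}, median={np.median(sample_1):.2f}")
print(f"sample_2: mean={sample_2.mean():.2f}, median={np.median(sample_2):.2f}")

# Rank-based comparison, unaffected by the skew-driven extreme values.
print(stats.mannwhitneyu(sample_1, sample_2, alternative="two-sided"))
```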

The analyzed data is ordinal or nominal

Unlike parametric tests, which work only with continuous data, nonparametric tests can also be applied to other data types such as ordinal or nominal data. For such variables, nonparametric tests are often the only appropriate option.
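As a brief illustration (hypothetical ratings and counts, assuming SciPy), Spearman's rank correlation handles ordinal ratings and a chi-square test of independence handles nominal counts; neither requires continuous, normally distributed measurements.

```python
# Sketch: tests for ordinal and nominal data. Assumes SciPy.
from scipy import stats

# Ordinal example: 1-5 satisfaction ratings from two related questionnaires.
ratings_q1 = [1, 2, 2, 3, 4, 4, 5, 5, 3, 2]
ratings_q2 = [2, 1, 3, 3, 4, 5, 5, 4, 3, 3]
print(stats.spearmanr(ratings_q1, ratings_q2))

# Nominal example: a 2x2 contingency table (e.g. preference by group).
table = [[30, 10],
         [20, 25]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square={chi2:.3f}, p={p:.4f}, dof={dof}")
```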