Title: | Fast Statistical Hypothesis Tests on Rows and Columns of Matrices |
---|---|
Description: | Functions to perform fast statistical hypothesis tests on rows/columns of matrices. The main goals are: 1) speed via vectorization, 2) output that is detailed and easy to use, 3) compatibility with tests implemented in R (like those available in the 'stats' package). |
Authors: | Karolis Koncevičius [aut, cre] |
Maintainer: | Karolis Koncevičius <[email protected]> |
License: | GPL-2 |
Version: | 0.2.3 |
Built: | 2024-10-30 03:15:31 UTC |
Source: | https://github.com/karoliskoncevicius/matrixTests |
Performs Anderson-Darling goodness of fit test for normality.
row_andersondarling(x)
col_andersondarling(x)
x |
numeric matrix. |
row_andersondarling(x)
- Anderson-Darling test on rows.
col_andersondarling(x)
- Anderson-Darling test on columns.
Results should be the same as running nortest::ad.test(x) on every row (or column) of x.
a data.frame where each row contains the results of Anderson-Darling
test performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs - number of observations
2. statistic - test statistic
3. pvalue - p-value
Karolis Koncevičius
shapiro.test()
col_andersondarling(iris[,1:4])
row_andersondarling(t(iris[,1:4]))
Performs Bartlett's test of homogeneity of variances on each row/column of the input matrix.
row_bartlett(x, g)
col_bartlett(x, g)
x |
numeric matrix. |
g |
a vector specifying group membership for each observation of x. |
NA values are always omitted. If values are missing for a whole group, that group is discarded. Groups with only one observation are also discarded.
row_bartlett(x, g)
- Bartlett's test on rows.
col_bartlett(x, g)
- Bartlett's test on columns.
Results should be the same as running bartlett.test(x, g) on every row (or column) of x.
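As a rough sketch of what this equivalence means in practice, the loop below calls bartlett.test() once per column of the iris measurements using base R only. The comparison with col_bartlett() runs only if matrixTests happens to be installed. The object names ref and fast are illustrative and not part of the package.

```r
# Reference results: one bartlett.test() call per column.
x <- as.matrix(iris[, 1:4])
g <- iris$Species

ref <- t(sapply(colnames(x), function(nm) {
  res <- bartlett.test(x[, nm], g)
  c(df        = unname(res$parameter),
    statistic = unname(res$statistic),
    pvalue    = res$p.value)
}))
print(ref)

# Optional check against the vectorized version (assumes matrixTests is installed).
if (requireNamespace("matrixTests", quietly = TRUE)) {
  fast <- matrixTests::col_bartlett(x, g)
  print(all.equal(fast$statistic, unname(ref[, "statistic"])))
}
```

The loop is the slow path that the row/column functions are meant to replace; it is useful mainly for spot-checking results.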
a data.frame where each row contains the results of Bartlett's test
performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs.tot - total number of observations
2. obs.groups - number of groups
3. var.pooled - pooled variance estimate
4. df - degrees of freedom
5. statistic - chi-squared statistic
6. pvalue - p-value
Karolis Koncevičius
bartlett.test()
col_bartlett(iris[,1:4], iris$Species)
row_bartlett(t(iris[,1:4]), iris$Species)
Performs a correlation test on each row/column of the input matrix.
row_cor_pearson(x, y, alternative = "two.sided", conf.level = 0.95)
col_cor_pearson(x, y, alternative = "two.sided", conf.level = 0.95)
x |
numeric matrix. |
y |
numeric matrix for the second group of observations. |
alternative |
alternative hypothesis to use for each row/column of x. A single string or a vector with values for each observation. Values must be one of "two.sided" (default), "greater" or "less". |
conf.level |
confidence levels used for the confidence intervals. A single number or a numeric vector with values for each observation. All values must be in the range of [0, 1] or NA. |
Functions to perform various correlation tests for rows/columns of matrices.
Main arguments and results were intentionally matched to the cor.test()
function from default stats package.
row_cor_pearson(x, y)
- test for Pearson correlation on rows.
col_cor_pearson(x, y)
- test for Pearson correlation on columns.
Results should be the same as running cor.test(x, y, method="pearson") on every row (or column) of x and y.
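To illustrate how a test like this can be vectorized, the sketch below computes the Pearson correlation, t statistic and two-sided p-value for all columns at once with base R matrix operations. This is an assumption about the general technique, not the package's actual implementation, and the variable names are illustrative.

```r
X <- as.matrix(iris[iris$Species == "setosa", 1:4])
Y <- as.matrix(iris[iris$Species == "virginica", 1:4])

# Center every column, then compute all correlations without a loop.
Xc <- sweep(X, 2, colMeans(X))
Yc <- sweep(Y, 2, colMeans(Y))
r  <- colSums(Xc * Yc) / sqrt(colSums(Xc^2) * colSums(Yc^2))

# t statistic and two-sided p-value under the null of zero correlation.
df   <- nrow(X) - 2
stat <- r * sqrt(df / (1 - r^2))
pval <- 2 * pt(-abs(stat), df)

print(data.frame(cor = r, df = df, statistic = stat, pvalue = pval))

# Spot check of the first column against the looped reference.
print(cor.test(X[, 1], Y[, 1])$p.value)
```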
a data.frame where each row contains the results of a correlation
test performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs.paired - number of paired observations (present in x and y)
2. cor - estimated correlation coefficient
3. df - degrees of freedom
4. statistic - t statistic
5. pvalue - p-value
6. conf.low - lower bound of the confidence interval
7. conf.high - higher bound of the confidence interval
8. alternative - chosen alternative hypothesis
9. cor.null - correlation of the null hypothesis (=0)
10. conf.level - chosen confidence level
For a marked increase in computation speed, turn off the calculation of the confidence interval by setting conf.level to NA.
Karolis Koncevičius
cor.test()
X <- iris[iris$Species=="setosa",1:4]
Y <- iris[iris$Species=="virginica",1:4]
col_cor_pearson(X, Y)
row_cor_pearson(t(X), t(Y))
Performs a Cosinor test for periodicity on each row/column of the input matrix.
row_cosinor(x, t, period = 24)
col_cosinor(x, t, period = 24)
x |
numeric matrix. |
t |
a vector specifying time variable for each observation of x. |
period |
oscillation period in the units of t. |
row_cosinor
- cosinor test on rows.
col_cosinor
- cosinor test on columns.
a data.frame where each row contains the results of a cosinor test
performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs - total number of observations
2. mesor - "Midline Estimating Statistic Of Rhythm" - the average value around which the variable oscillates
3. amplitude - difference between mesor and the peak of the rhythm
4. acrophase - time when rhythm reaches its peak
5. rsquared - R-squared
6. df.model - model terms degrees of freedom
7. df.residual - residual degrees of freedom
8. statistic - F statistic for the omnibus test against intercept-only model
9. pvalue - p-value
10. period - the period used within the model
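The quantities listed above can be sketched for a single series with a standard cosinor regression in base R. The helper cosinor_fit() is hypothetical and only illustrates the usual parametrization (a linear model on cosine and sine terms); the package may compute or scale some values differently.

```r
# Hypothetical helper: cosinor regression for one series.
cosinor_fit <- function(y, t, period = 24) {
  omega <- 2 * pi / period
  fit <- lm(y ~ cos(omega * t) + sin(omega * t))
  b <- unname(coef(fit))
  ss_res <- sum(residuals(fit)^2)
  ss_tot <- sum((y - mean(y))^2)
  list(
    mesor     = b[1],                                   # midline of the rhythm
    amplitude = sqrt(b[2]^2 + b[3]^2),                  # peak minus mesor
    acrophase = (atan2(b[3], b[2]) / omega) %% period,  # time of the peak
    rsquared  = 1 - ss_res / ss_tot
  )
}

# Noise-free series with mesor 10, amplitude 3 and a peak at t = 6.
t <- 1:24
y <- 10 + 3 * cos(2 * pi * (t - 6) / 24)
res <- cosinor_fit(y, t)
str(res)
```

Because y = M + A*cos(omega*(t - phi)) expands to M + A*cos(omega*phi)*cos(omega*t) + A*sin(omega*phi)*sin(omega*t), the amplitude and acrophase can be recovered from the two fitted coefficients.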
Karolis Koncevičius
wave <- sin(2*pi*1:24/24) + rnorm(24)
row_cosinor(wave, 1:24, 24)
Performs the Fligner-Killeen test of homogeneity of variances (with median centering of the groups) on each row/column of the input matrix.
row_flignerkilleen(x, g)
col_flignerkilleen(x, g)
x |
numeric matrix. |
g |
a vector specifying group membership for each observation of x. |
NA values are always omitted. If values are missing for a whole group, that group is discarded. Groups with only one observation are also discarded.
row_flignerkilleen(x, g)
- Fligner-Killeen test on rows.
col_flignerkilleen(x, g)
- Fligner-Killeen test on columns.
Results should be the same as running fligner.test(x, g) on every row (or column) of x.
a data.frame where each row contains the results of the
Fligner-Killeen test performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs.tot - total number of observations
2. obs.groups - number of groups
3. df - degrees of freedom
4. statistic - chi-squared statistic
5. pvalue - p-value
Karolis Koncevičius
fligner.test()
col_flignerkilleen(iris[,1:4], iris$Species)
row_flignerkilleen(t(iris[,1:4]), iris$Species)
Performs the F test of equality of variances for two normal populations on each row/column of the two input matrices.
row_f_var(x, y, null = 1, alternative = "two.sided", conf.level = 0.95)
col_f_var(x, y, null = 1, alternative = "two.sided", conf.level = 0.95)
x |
numeric matrix. |
y |
numeric matrix for the second group of observations. |
null |
hypothesized 'x' and 'y' variance ratio. A single number or numeric vector with values for each observation. |
alternative |
alternative hypothesis to use for each row/column of x. A single string or a vector with values for each observation. Values must be one of "two.sided" (default), "greater" or "less". |
conf.level |
confidence levels used for the confidence intervals. A single number or a numeric vector with values for each observation. All values must be in the range of [0, 1] or NA. |
NA values are always omitted.
row_f_var(x, y)
- F-test for variance on rows.
col_f_var(x, y)
- F-test for variance on columns.
Results should be the same as running var.test(x, y) on every row (or column) of x and y.
a data.frame where each row contains the results of the F variance
test performed on the corresponding row/column of x and y.
Each row contains the following information (in order):
1. obs.x - number of x observations
2. obs.y - number of y observations
3. obs.tot - total number of observations
4. var.x - variance of x
5. var.y - variance of y
6. var.ratio - x/y variance ratio
7. df.num - numerator degrees of freedom
8. df.denom - denominator degrees of freedom
9. statistic - F statistic
10. pvalue - p-value
11. conf.low - lower bound of the confidence interval
12. conf.high - higher bound of the confidence interval
13. ratio.null - variance ratio of the null hypothesis
14. alternative - chosen alternative hypothesis
15. conf.level - chosen confidence level
For a marked increase in computation speed, turn off the calculation of the confidence interval by setting conf.level to NA.
Karolis Koncevičius
var.test()
X <- iris[iris$Species=="setosa",1:4]
Y <- iris[iris$Species=="virginica",1:4]
col_f_var(X, Y)
Performs Jarque-Bera goodness of fit test for normality.
row_jarquebera(x)
col_jarquebera(x)
x |
numeric matrix. |
row_jarquebera(x)
- Jarque-Bera test on rows.
col_jarquebera(x)
- Jarque-Bera test on columns.
Results should be the same as running moments::jarque.test(x) on every row (or column) of x.
a data.frame where each row contains the results of Jarque-Bera
test performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs - number of observations
2. skewness - skewness
3. kurtosis - kurtosis
4. df - degrees of freedom
5. statistic - chi-squared statistic
6. pvalue - p-value
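The columns above can be reproduced for a single vector with the textbook Jarque-Bera formula, JB = n/6 * (S^2 + (K - 3)^2 / 4), compared against a chi-squared distribution with 2 degrees of freedom. The helper jarque_bera() below is a hypothetical base R sketch of that formula using population (biased) moments; it is not taken from the package source.

```r
# Hypothetical helper: Jarque-Bera test for one numeric vector.
jarque_bera <- function(v) {
  v <- v[!is.na(v)]
  n <- length(v)
  d <- v - mean(v)
  m2 <- mean(d^2)                       # second central moment
  skewness <- mean(d^3) / m2^1.5
  kurtosis <- mean(d^4) / m2^2
  statistic <- n / 6 * (skewness^2 + (kurtosis - 3)^2 / 4)
  c(obs = n, skewness = skewness, kurtosis = kurtosis, df = 2,
    statistic = statistic,
    pvalue = pchisq(statistic, df = 2, lower.tail = FALSE))
}

# One row of results per column of the iris measurements.
jb <- t(apply(as.matrix(iris[, 1:4]), 2, jarque_bera))
print(jb)

# Small symmetric example: skewness is 0 and kurtosis is 1.7.
small <- jarque_bera(c(-2, -1, 0, 1, 2))
print(small)
```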
Karolis Koncevičius
shapiro.test()
col_jarquebera(iris[,1:4])
row_jarquebera(t(iris[,1:4]))
Performs a Kolmogorov-Smirnov test on each row/column of the input matrix.
row_kolmogorovsmirnov_twosample(x, y, alternative = "two.sided", exact = NA)
col_kolmogorovsmirnov_twosample(x, y, alternative = "two.sided", exact = NA)
x |
numeric matrix. |
y |
numeric matrix for the second group of observations. |
alternative |
alternative hypothesis to use for each row/column of x. A single string or a vector with values for each observation. Values must be one of "two.sided" (default), "greater" or "less". |
exact |
logical or NA (default) indicator whether an exact p-value should be computed (see Details). A single value or a logical vector with values for each observation. |
Function to perform the two sample Kolmogorov-Smirnov test on rows/columns of matrices. Main arguments and results were intentionally matched to the ks.test() function from the default stats package.
Results should be the same as running ks.test(x, y) on every row (or column) of x and y.
By default, if the 'exact' argument is set to NA, exact p-values are computed if the product of the 'x' and 'y' sample sizes is less than 10000. Otherwise, asymptotic distributions are used.
The alternative hypothesis setting specifies the null and alternative hypotheses. The possible values are 'two.sided', 'less', and 'greater'.
'two.sided' sets the null hypothesis that the distribution of 'x' is equal to the distribution of 'y'.
'less' sets the null hypothesis that the distribution of 'x' is not less than the distribution of 'y'.
'greater' sets the null hypothesis that the distribution of 'x' is not greater than the distribution of 'y'.
See help(ks.test) for more details.
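The effect of the 'exact' setting can be seen on a single pair of samples with base R's ks.test(), which these functions are documented to match. The sketch below uses simulated data with illustrative variable names; the sample sizes are chosen so that their product (30 * 40 = 1200) falls below the 10000 threshold.

```r
set.seed(1)
x <- rnorm(30)
y <- rnorm(40, mean = 0.5)

# Default behaviour: exact p-value, because 30 * 40 < 10000 and there are no ties.
default_res <- ks.test(x, y)
exact_res   <- ks.test(x, y, exact = TRUE)

# Forcing the asymptotic distribution changes the p-value but not the statistic.
asymp_res   <- ks.test(x, y, exact = FALSE)

print(c(default = default_res$p.value,
        exact   = exact_res$p.value,
        asymp   = asymp_res$p.value))
```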
a data.frame where each row contains the results of a Kolmogorov-Smirnov test
performed on the corresponding row/column of x and y.
Each row contains the following information (in order):
1. obs.x - number of x observations
2. obs.y - number of y observations
3. obs.tot - total number of observations
4. statistic - Kolmogorov-Smirnov test statistic
5. pvalue - p-value
6. alternative - chosen alternative hypothesis
7. exact - indicates if exact p-value was computed
Karolis Koncevičius
ks.test()
X <- iris[iris$Species=="setosa", 1:4]
Y <- iris[iris$Species=="virginica", 1:4]
col_kolmogorovsmirnov_twosample(X, Y)
# same column using different alternative hypotheses
col_kolmogorovsmirnov_twosample(X[,c(1,1,1)], Y[,c(1,1,1)], alternative=c("t", "g", "l"))
Performs a Kruskal-Wallis rank sum test on each row/column of the input matrix.
row_kruskalwallis(x, g)
col_kruskalwallis(x, g)
x |
numeric matrix. |
g |
a vector specifying group membership for each observation of x. |
row_kruskalwallis(x, g)
- Kruskal-Wallis test on rows.
col_kruskalwallis(x, g)
- Kruskal-Wallis test on columns.
Results should be the same as running kruskal.test(x, g) on every row (or column) of x.
a data.frame where each row contains the results of a Kruskal-Wallis
test performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs.tot - total number of observations
2. obs.groups - number of groups
3. df - degrees of freedom
4. statistic - chi-squared statistic
5. pvalue - p-value
Karolis Koncevičius
kruskal.test()
col_kruskalwallis(iris[,1:4], iris$Species)
row_kruskalwallis(t(iris[,1:4]), iris$Species)
Performs Levene's test and the Brown-Forsythe test for equality of variances between groups on each row/column of the input matrix.
row_levene(x, g)
col_levene(x, g)
row_brownforsythe(x, g)
col_brownforsythe(x, g)
x |
numeric matrix. |
g |
a vector specifying group membership for each observation of x. |
NA values are always omitted. If values are missing for a whole group, that group is discarded.
row_levene(x, g)
- Levene's test on rows.
col_levene(x, g)
- Levene's test on columns.
row_brownforsythe(x, g)
- Brown-Forsythe test on rows.
col_brownforsythe(x, g)
- Brown-Forsythe test on columns.
a data.frame where each row contains the results of Levene's test
performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs.tot - total number of observations
2. obs.groups - number of groups
3. df.between - between group (treatment) degrees of freedom
4. df.within - within group (residual) degrees of freedom
5. statistic - F statistic
6. pvalue - p-value
The difference between Levene's test and the Brown-Forsythe test is that the Brown-Forsythe test uses the median instead of the mean in computing the spread within each group. Many software implementations use the name "Levene's test" for both variants.
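The distinction between the two variants can be sketched in base R as a one-way ANOVA on absolute deviations from each group's center. The helper spread_test() is hypothetical; it follows the common textbook construction and makes no claim about how the package handles edge cases such as discarded groups.

```r
# Hypothetical helper: ANOVA on absolute deviations from the group center.
# center = mean gives Levene's test, center = median gives Brown-Forsythe.
spread_test <- function(v, g, center = mean) {
  keep <- !is.na(v) & !is.na(g)
  v <- v[keep]
  g <- factor(g[keep])
  dev <- abs(v - ave(v, g, FUN = center))
  tab <- anova(lm(dev ~ g))
  c(df.between = tab$Df[1],
    df.within  = tab$Df[2],
    statistic  = tab$`F value`[1],
    pvalue     = tab$`Pr(>F)`[1])
}

levene <- spread_test(iris$Sepal.Length, iris$Species, center = mean)
bf     <- spread_test(iris$Sepal.Length, iris$Species, center = median)
print(rbind(levene = levene, brownforsythe = bf))
```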
Karolis Koncevičius
col_levene(iris[,1:4], iris$Species)
row_brownforsythe(t(iris[,1:4]), iris$Species)
Performs an analysis of variance test on each row/column of the input matrix.
row_oneway_equalvar(x, g)
col_oneway_equalvar(x, g)
row_oneway_welch(x, g)
col_oneway_welch(x, g)
x |
numeric matrix. |
g |
a vector specifying group membership for each observation of x. |
Functions to perform one-way ANOVA for rows/columns of matrices.
row_oneway_equalvar(x, g)
- oneway ANOVA on rows.
col_oneway_equalvar(x, g)
- oneway ANOVA on columns.
Results should be the same as running aov(x ~ g) on every row (or column) of x.
row_oneway_welch(x, g)
- oneway ANOVA with Welch correction on rows.
col_oneway_welch(x, g)
- oneway ANOVA with Welch correction on columns.
Results should be the same as running oneway.test(x ~ g, var.equal=FALSE) on every row (or column) of x.
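For reference, the Welch variant can be reproduced column by column with base R's oneway.test(), which takes a formula. The sketch below is the looped equivalent that the vectorized functions are documented to match; the object name welch is illustrative.

```r
x <- as.matrix(iris[, 1:4])
g <- iris$Species

# One Welch-corrected one-way ANOVA per column.
welch <- t(sapply(colnames(x), function(nm) {
  res <- oneway.test(x[, nm] ~ g, var.equal = FALSE)
  c(df.between = unname(res$parameter[1]),
    df.within  = unname(res$parameter[2]),
    statistic  = unname(res$statistic),
    pvalue     = res$p.value)
}))
print(welch)
```

Setting var.equal = TRUE in the same loop gives the classical equal-variance F test.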
a data.frame where each row contains the results of a oneway ANOVA
test performed on the corresponding row/column of x.
The columns will vary depending on the type of test performed.
They will contain a subset of the following information:
1. obs.tot - total number of observations
2. obs.groups - number of groups
3. sumsq.between - between group (treatment) sum of squares
4. sumsq.within - within group (residual) sum of squares
5. meansq.between - between group mean squares
6. meansq.within - within group mean squares
7. df.between - between group (treatment) degrees of freedom
8. df.within - within group (residual) degrees of freedom
9. statistic - F statistic
10. pvalue - p-value
Karolis Koncevičius
aov(), oneway.test()
col_oneway_welch(iris[,1:4], iris$Species)
row_oneway_equalvar(t(iris[,1:4]), iris$Species)
Performs a t-test on each row/column of the input matrix.
row_t_equalvar(x, y, null = 0, alternative = "two.sided", conf.level = 0.95)
col_t_equalvar(x, y, null = 0, alternative = "two.sided", conf.level = 0.95)
row_t_welch(x, y, null = 0, alternative = "two.sided", conf.level = 0.95)
col_t_welch(x, y, null = 0, alternative = "two.sided", conf.level = 0.95)
row_t_onesample(x, null = 0, alternative = "two.sided", conf.level = 0.95)
col_t_onesample(x, null = 0, alternative = "two.sided", conf.level = 0.95)
row_t_paired(x, y, null = 0, alternative = "two.sided", conf.level = 0.95)
col_t_paired(x, y, null = 0, alternative = "two.sided", conf.level = 0.95)
x |
numeric matrix. |
y |
numeric matrix for the second group of observations. |
null |
true values of the means for the null hypothesis. A single number or numeric vector with values for each observation. |
alternative |
alternative hypothesis to use for each row/column of x. A single string or a vector with values for each observation. Values must be one of "two.sided" (default), "greater" or "less". |
conf.level |
confidence levels used for the confidence intervals. A single number or a numeric vector with values for each observation. All values must be in the range of [0, 1] or NA. |
Functions to perform one sample and two sample t-tests for rows/columns of matrices.
Main arguments and results were intentionally matched to the t.test()
function from default stats package. Other arguments were split into separate
functions:
row_t_onesample(x)
- one sample t-test on rows.
col_t_onesample(x)
- one sample t-test on columns.
Results should be the same as running t.test(x) on every row (or column) of x.
row_t_equalvar(x, y)
- two sample equal variance t-test on rows.
col_t_equalvar(x, y)
- two sample equal variance t-test on columns.
Results should be the same as running t.test(x, y, var.equal=TRUE) on every row (or column) of x and y.
row_t_welch(x, y)
- two sample t-test with Welch correction on rows.
col_t_welch(x, y)
- two sample t-test with Welch correction on columns.
Results should be the same as running t.test(x, y) on every row (or column) of x and y.
row_t_paired(x, y)
- two sample paired t-test on rows.
col_t_paired(x, y)
- two sample paired t-test on columns.
Results should be the same as running t.test(x, y, paired=TRUE) on every row (or column) of x and y.
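As an illustration of how such tests can avoid a per-column loop, the sketch below computes the Welch two-sample t statistic, degrees of freedom and two-sided p-value for all columns at once in base R. It is a minimal sketch of the general approach under the default null of 0, not the package's code, and the variable names are illustrative.

```r
X <- as.matrix(iris[iris$Species == "setosa", 1:4])
Y <- as.matrix(iris[iris$Species == "virginica", 1:4])
nx <- nrow(X)
ny <- nrow(Y)

# Per-column means and variances.
mx <- colMeans(X)
my <- colMeans(Y)
vx <- apply(X, 2, var)
vy <- apply(Y, 2, var)

# Welch standard error, statistic, Satterthwaite degrees of freedom, p-value.
se         <- sqrt(vx / nx + vy / ny)
stat_welch <- (mx - my) / se
df_welch   <- (vx / nx + vy / ny)^2 /
  ((vx / nx)^2 / (nx - 1) + (vy / ny)^2 / (ny - 1))
p_welch    <- 2 * pt(-abs(stat_welch), df_welch)

print(data.frame(statistic = stat_welch, df = df_welch, pvalue = p_welch))
```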
a data.frame where each row contains the results of a t.test
performed on the corresponding row/column of x.
The columns will vary depending on the type of test performed.
They will contain a subset of the following information:
1. obs.x - number of x observations
2. obs.y - number of y observations
3. obs.tot - total number of observations
4. obs.paired - number of paired observations (present in x and y)
5. mean.x - mean estimate of x
6. mean.y - mean estimate of y
7. mean.diff - mean estimate of x-y difference
8. var.x - variance estimate of x
9. var.y - variance estimate of y
10. var.diff - variance estimate of x-y difference
11. var.pooled - pooled variance estimate of x and y
12. stderr - standard error
13. df - degrees of freedom
14. statistic - t statistic
15. pvalue - p-value
16. conf.low - lower bound of the confidence interval
17. conf.high - higher bound of the confidence interval
18. mean.null - mean of the null hypothesis
19. alternative - chosen alternative hypothesis
20. conf.level - chosen confidence level
For a marked increase in computation speed, turn off the calculation of the confidence interval by setting conf.level to NA.
Karolis Koncevičius
t.test()
X <- iris[iris$Species=="setosa",1:4]
Y <- iris[iris$Species=="virginica",1:4]
col_t_welch(X, Y)
# same column using different confidence levels
col_t_equalvar(X[,c(1,1,1)], Y[,c(1,1,1)], conf.level=c(0.9, 0.95, 0.99))
Performs van der Waerden test on each row/column of the input matrix.
row_waerden(x, g)
col_waerden(x, g)
x |
numeric matrix. |
g |
a vector specifying group membership for each observation of x. |
row_waerden(x, g)
- van der Waerden test on rows.
col_waerden(x, g)
- van der Waerden test on columns.
a data.frame where each row contains the results of the van der Waerden
test performed on the corresponding row/column of x.
Each row contains the following information (in order):
1. obs.tot - total number of observations
2. obs.groups - number of groups
3. df - degrees of freedom
4. statistic - van der Waerden chi-squared statistic
5. pvalue - p-value
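Since base R has no van der Waerden test, the sketch below shows the usual normal-scores construction for a single vector: ranks are converted to normal quantiles and the group means of those scores are compared with a chi-squared statistic. The helper waerden_test() is hypothetical, and its handling of ties (mid-ranks) is an assumption that may differ from the package.

```r
# Hypothetical helper: van der Waerden normal-scores test for one vector.
waerden_test <- function(v, g) {
  keep <- !is.na(v) & !is.na(g)
  v <- v[keep]
  g <- factor(g[keep])
  n <- length(v)
  scores <- qnorm(rank(v) / (n + 1))       # normal scores from ranks
  s2 <- sum(scores^2) / (n - 1)            # variance of the scores
  grp_mean <- tapply(scores, g, mean)
  grp_n <- tapply(scores, g, length)
  statistic <- sum(grp_n * grp_mean^2) / s2
  df <- nlevels(g) - 1
  c(obs.tot = n, obs.groups = nlevels(g), df = df,
    statistic = statistic,
    pvalue = pchisq(statistic, df, lower.tail = FALSE))
}

vdw <- t(apply(as.matrix(iris[, 1:4]), 2, waerden_test, g = iris$Species))
print(vdw)
```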
Karolis Koncevičius
vanWaerdenTest, row_oneway_equalvar, row_kruskalwallis
col_waerden(iris[,1:4], iris$Species)
row_waerden(t(iris[,1:4]), iris$Species)
Performs a Wilcoxon test on each row/column of the input matrix.
row_wilcoxon_twosample(x, y, null = 0, alternative = "two.sided", exact = NA, correct = TRUE)
col_wilcoxon_twosample(x, y, null = 0, alternative = "two.sided", exact = NA, correct = TRUE)
row_wilcoxon_onesample(x, null = 0, alternative = "two.sided", exact = NA, correct = TRUE)
col_wilcoxon_onesample(x, null = 0, alternative = "two.sided", exact = NA, correct = TRUE)
row_wilcoxon_paired(x, y, null = 0, alternative = "two.sided", exact = NA, correct = TRUE)
col_wilcoxon_paired(x, y, null = 0, alternative = "two.sided", exact = NA, correct = TRUE)
x |
numeric matrix. |
y |
numeric matrix for the second group of observations. |
null |
true values of the location shift for the null hypothesis. A single number or numeric vector with values for each observation. |
alternative |
alternative hypothesis to use for each row/column of x. A single string or a vector with values for each observation. Values must be one of "two.sided" (default), "greater" or "less". |
exact |
logical or NA (default) indicator whether an exact p-value should be computed (see Details). A single value or a logical vector with values for each observation. |
correct |
logical indicator whether continuity correction should be applied in the cases where p-values are obtained using normal approximation. A single value or logical vector with values for each observation. |
Functions to perform one sample and two sample Wilcoxon tests on rows/columns of matrices.
Main arguments and results were intentionally matched to the wilcox.test()
function from default stats package. Other arguments were split into separate
functions:
row_wilcoxon_onesample(x)
- one sample Wilcoxon test on rows.
col_wilcoxon_onesample(x)
- one sample Wilcoxon test on columns.
Results should be the same as running wilcox.test(x) on every row (or column) of x.
row_wilcoxon_twosample(x, y)
- two sample Wilcoxon test on rows.
col_wilcoxon_twosample(x, y)
- two sample Wilcoxon test on columns.
Results should be the same as running wilcox.test(x, y) on every row (or column) of x and y.
row_wilcoxon_paired(x, y)
- two sample paired Wilcoxon test on rows.
col_wilcoxon_paired(x, y)
- two sample paired Wilcoxon test on columns.
Results should be the same as running wilcox.test(x, y, paired=TRUE) on every row (or column) of x and y.
By default, if the 'exact' argument is set to NA, exact p-values are computed only if both 'x' and 'y' contain fewer than 50 values and there are no ties. Single sample and paired tests have the additional requirement of not having zero values (values equal to the null hypothesis location argument 'null'). Otherwise, a normal approximation is used. Be wary of using 'exact=TRUE' on large sample sizes, as computations can take a very long time.
'correct' argument controls the continuity correction of p-values but only when exact p-values cannot be computed and normal approximation is used. For cases where exact p-values are returned 'correct' is switched to FALSE.
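The interaction between 'exact' and 'correct' can be seen on one pair of small samples with base R's wilcox.test(), which these functions are documented to match. The data below are made up for illustration and contain no ties.

```r
x <- c(1.2, 3.4, 2.2, 5.1, 4.8, 6.3)
y <- c(0.7, 2.9, 1.5, 3.9, 2.5)

# Exact p-value: the continuity correction plays no role here.
res_exact <- wilcox.test(x, y, exact = TRUE)

# Normal approximation, with and without the continuity correction.
res_corr   <- wilcox.test(x, y, exact = FALSE, correct = TRUE)
res_uncorr <- wilcox.test(x, y, exact = FALSE, correct = FALSE)

print(c(statistic     = unname(res_exact$statistic),
        p_exact       = res_exact$p.value,
        p_corrected   = res_corr$p.value,
        p_uncorrected = res_uncorr$p.value))
```

The test statistic is the same in all three calls; only the way the p-value is obtained changes.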
a data.frame where each row contains the results of a Wilcoxon test
performed on the corresponding row/column of x.
The columns will vary depending on the type of test performed.
They will contain a subset of the following information:
1. obs.x - number of x observations
2. obs.y - number of y observations
3. obs.tot - total number of observations
4. obs.paired - number of paired observations (present in x and y)
5. statistic - Wilcoxon test statistic
6. pvalue - p-value
7. location.null - location shift of the null hypothesis
8. alternative - chosen alternative hypothesis
9. exact - indicates if exact p-value was computed
10. correct - indicates if continuity correction was performed
Confidence interval and pseudo-median calculations are not implemented.
Karolis Koncevičius
wilcox.test()
X <- iris[iris$Species=="setosa",1:4]
Y <- iris[iris$Species=="virginica",1:4]
col_wilcoxon_twosample(X, Y)
# same column using different alternative hypotheses
col_wilcoxon_twosample(X[,c(1,1,1)], Y[,c(1,1,1)], alternative=c("t", "g", "l"))