RV

class hyppo.independence.RV
Rank Value (RV) test statistic and p-value.
RV is the multivariate generalization of the squared Pearson correlation coefficient [1]. The RV coefficient is closely related to principal component analysis (PCA), canonical correlation analysis (CCA), multivariate regression, and statistical classification [1]. The statistic can be derived as follows [1] [2]:
Let \(x\) and \(y\) be \((n, p)\) samples of random variables \(X\) and \(Y\). After centering \(x\) and \(y\), the sample covariance matrix is \(\hat{\Sigma}_{xy} = x^T y\), with \(\hat{\Sigma}_{yx} = \hat{\Sigma}_{xy}^T\); the variance matrices \(\hat{\Sigma}_{xx}\) and \(\hat{\Sigma}_{yy}\) are defined similarly. Then, the RV test statistic is found by calculating
\[\mathrm{RV}_n (x, y) = \frac{\mathrm{tr} \left( \hat{\Sigma}_{xy} \hat{\Sigma}_{yx} \right)} {\sqrt{\mathrm{tr} \left( \hat{\Sigma}_{xx}^2 \right) \mathrm{tr} \left( \hat{\Sigma}_{yy}^2 \right)}}\]
where \(\mathrm{tr} (\cdot)\) is the trace operator.
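The calculation can be sketched in a few lines of NumPy. This is an illustrative sketch only, not hyppo's own implementation: the function name `rv_statistic` is hypothetical, and the denominator is assumed to use the square-root normalization of the standard RV coefficient, which is what makes the statistic equal 1 when `y` is identical to `x`.

```python
import numpy as np


def rv_statistic(x, y):
    """Illustrative RV coefficient (hypothetical helper, not hyppo's code)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Treat 1-D input as n samples of a single dimension.
    if x.ndim == 1:
        x = x[:, np.newaxis]
    if y.ndim == 1:
        y = y[:, np.newaxis]

    # Center each column.
    cx = x - x.mean(axis=0)
    cy = y - y.mean(axis=0)

    # Cross-product matrices; any common scaling factor cancels in the ratio.
    s_xy = cx.T @ cy
    s_xx = cx.T @ cx
    s_yy = cy.T @ cy

    numerator = np.trace(s_xy @ s_xy.T)
    # Assumption: square-root normalization, as in the standard RV coefficient.
    denominator = np.sqrt(np.trace(s_xx @ s_xx) * np.trace(s_yy @ s_yy))
    return numerator / denominator


x = np.arange(7)
print(round(rv_statistic(x, x), 6))  # identical inputs give 1.0
```

Because the numerator and denominator are built from the same cross-product matrices, the result does not depend on whether the covariance estimates are divided by \(n\) or \(n - 1\).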
The p-value returned is calculated with a permutation test, using hyppo.tools.perm_test.
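The idea behind a permutation test can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the code of hyppo.tools.perm_test: the names `squared_corr` and `permutation_pvalue` are hypothetical, the toy statistic is a squared Pearson correlation for 1-D data, and the `+1` correction in the p-value is one common convention.

```python
import numpy as np


def squared_corr(x, y):
    """Toy dependence statistic: squared Pearson correlation of 1-D data."""
    return np.corrcoef(x, y)[0, 1] ** 2


def permutation_pvalue(x, y, statistic, reps=1000, seed=0):
    """Estimate a p-value by shuffling y to break its pairing with x."""
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    # Null distribution: the statistic recomputed on randomly re-paired data.
    null = np.array([statistic(x, rng.permutation(y)) for _ in range(reps)])
    # The +1 terms count the observed statistic itself, so the estimate
    # is never exactly zero (an assumed convention for this sketch).
    pvalue = (1 + np.sum(null >= observed)) / (1 + reps)
    return observed, pvalue


rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = x + 0.1 * rng.normal(size=50)  # strongly dependent on x
stat, pvalue = permutation_pvalue(x, y, squared_corr, reps=200)
print(f"statistic={stat:.3f}, p-value={pvalue:.4f}")
```

With this convention the smallest attainable p-value is 1 / (reps + 1), so a larger `reps` gives a finer resolution at the cost of more computation.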
Methods Summary

RV.statistic(x, y)  Helper function that calculates the RV test statistic.
RV.test(x, y, reps=1000, workers=1)  Calculates the RV test statistic and p-value.

RV.statistic(x, y)
Helper function that calculates the RV test statistic.

RV.test(x, y, reps=1000, workers=1)
Calculates the RV test statistic and p-value.
Parameters
x, y (ndarray)  Input data matrices. x and y must have the same number of samples and dimensions. That is, the shapes must be (n, p), where n is the number of samples and p is the number of dimensions.
reps (int, default: 1000)  The number of replications used to estimate the null distribution in the permutation test that computes the p-value.
workers (int, default: 1)  The number of cores to parallelize the p-value computation over. Supply -1 to use all cores available to the Process.
Returns
stat (float)  The computed RV statistic.
pvalue (float)  The computed RV p-value.
Examples
>>> import numpy as np
>>> from hyppo.independence import RV
>>> x = np.arange(7)
>>> y = x
>>> stat, pvalue = RV().test(x, y)
>>> '%.1f, %.2f' % (stat, pvalue)
'1.0, 0.00'