Hotelling

class hyppo.ksample.Hotelling
Hotelling \(T^2\) test statistic and p-value.
Hotelling \(T^2\) is a two-sample multivariate analysis of variance (MANOVA) and a generalization of Student's t-test to arbitrary dimension [2]. The test statistic is formulated as below [1]:
Consider input samples \(u_i \stackrel{iid}{\sim} F_U\) for \(i \in \{ 1, \ldots, n \}\) and \(v_i \stackrel{iid}{\sim} F_V\) for \(i \in \{ 1, \ldots, m \}\). Let \(\bar{u}\) refer to the columnwise means of \(u\); that is, \(\bar{u} = (1/n) \sum_{i=1}^{n} u_i\), and let \(\bar{v}\) be the same for \(v\). Calculate the sample covariance matrices \(\hat{\Sigma}_{uv} = u^T v\) and the sample variance matrices \(\hat{\Sigma}_{uu} = u^T u\) and \(\hat{\Sigma}_{vv} = v^T v\). Denote the pooled covariance matrix \(\hat{\Sigma}\) as
\[\hat{\Sigma} = \frac{(n - 1) \hat{\Sigma}_{uu} + (m - 1) \hat{\Sigma}_{vv}}{n + m - 2}\]
Then,
\[\text{Hotelling}_{n, m} (u, v) = \frac{n m}{n + m} (\bar{u} - \bar{v})^T \hat{\Sigma}^{-1} (\bar{u} - \bar{v})\]
Since it is a multivariate generalization of Student's t-test, it shares some of the same assumptions. That is, the validity of MANOVA depends on the assumption that the random variables are normally distributed within each group, each with the same covariance matrix. The distributions of input data are generally not known and cannot always be reasonably modeled as Gaussian [3] [4], and equal covariance across groups is also generally not true of real data.
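The formulas above can be sketched directly in NumPy. This is a minimal illustration and is not claimed to be hyppo's implementation: the function name `hotelling_t2` is hypothetical, and the sketch assumes the sample covariance matrices are the usual mean-centered estimates (as returned by `np.cov`) rather than the raw products written above.

```python
import numpy as np


def hotelling_t2(u, v):
    """Two-sample Hotelling T^2 for u of shape (n, p) and v of shape (m, p)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    n, p = u.shape
    m, _ = v.shape

    # Difference of the columnwise means.
    diff = u.mean(axis=0) - v.mean(axis=0)

    # Mean-centered sample covariance matrices (an assumption of this sketch).
    # np.atleast_2d keeps the p = 1 case as a 1x1 matrix.
    cov_u = np.atleast_2d(np.cov(u, rowvar=False))
    cov_v = np.atleast_2d(np.cov(v, rowvar=False))

    # Pooled covariance matrix.
    pooled = ((n - 1) * cov_u + (m - 1) * cov_v) / (n + m - 2)

    # Solve a linear system instead of forming the inverse explicitly.
    return (n * m) / (n + m) * (diff @ np.linalg.solve(pooled, diff))


rng = np.random.default_rng(0)
u = rng.normal(size=(30, 3))
v = rng.normal(loc=0.5, size=(40, 3))
t2 = hotelling_t2(u, v)
```

In one dimension, the same function reduces to the square of the pooled-variance Student's t statistic, which is one way to sanity-check the sketch.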
Methods Summary

statistic(x, y)  Calculates the Hotelling \(T^2\) test statistic.

test(x, y)  Calculates the Hotelling \(T^2\) test statistic and p-value.

Hotelling.statistic(x, y)
Calculates the Hotelling \(T^2\) test statistic.

Hotelling.test(x, y)
Calculates the Hotelling \(T^2\) test statistic and p-value.
Parameters
x, y (ndarray): Input data matrices. x and y must have the same number of dimensions. That is, the shapes must be (n, p) and (m, p), where n and m are the numbers of samples and p is the number of dimensions.
Returns
stat, pvalue: The computed Hotelling \(T^2\) test statistic and p-value.
Examples
>>> import numpy as np
>>> from hyppo.ksample import Hotelling
>>> x = np.arange(7)
>>> y = x
>>> stat, pvalue = Hotelling().test(x, y)
>>> '%.3f, %.1f' % (stat, pvalue)
'0.000, 1.0'
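How a p-value can be obtained from the statistic may be easier to see with a small sketch. It relies on the standard result that, under the normality and equal-covariance assumptions, \(\frac{n + m - p - 1}{p (n + m - 2)} T^2\) follows an F distribution with \(p\) and \(n + m - p - 1\) degrees of freedom. The helper name `hotelling_pvalue` is hypothetical, and the sketch is not claimed to match hyppo's internals.

```python
from scipy.stats import f


def hotelling_pvalue(t2, n, m, p):
    """p-value of a two-sample Hotelling T^2 statistic via the F distribution."""
    dfn = p
    dfd = n + m - p - 1
    # Rescale T^2 so that it follows F(dfn, dfd) under the null hypothesis.
    f_stat = t2 * dfd / (p * (n + m - 2))
    # Survival function: probability of an F value at least this large.
    return float(f.sf(f_stat, dfn, dfd))


# Identical samples give T^2 = 0, so the p-value is 1,
# consistent with the doctest example above (n = m = 7, p = 1).
pvalue = hotelling_pvalue(0.0, n=7, m=7, p=1)
```

Larger values of the statistic map to smaller p-values, so the rescaling only changes the reference distribution, not the ordering of results.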