# Power

In this tutorial, we explore:

- The concept of power in hypothesis testing
- Power computation in `mgcpy`
- Comparison of methods

## Theory

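Consider \(n\) paired observations \((x_i, y_i)\), \(i = 1, \dots, n\), drawn i.i.d. from a joint distribution \(F_{XY}\) with marginals \(F_X\) and \(F_Y\).

We wish to test:

\[
H_0: F_{XY} = F_X F_Y \quad \text{versus} \quad H_A: F_{XY} \neq F_X F_Y
\]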

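For a testing procedure \(T\), we define \(\alpha_n\) to be the probability of Type I error. That is,

\[
\alpha_n(T) = P(T \text{ rejects } H_0 \mid H_0 \text{ is true})
\]

Similarly, we define \(\beta_n\) to be the probability of Type II error:

\[
\beta_n(T) = P(T \text{ fails to reject } H_0 \mid H_A \text{ is true})
\]

Finally, the **power** is defined as:

\[
1 - \beta_n(T) = P(T \text{ rejects } H_0 \mid H_A \text{ is true})
\]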

or the probability of correctly rejecting the null when the alternative is true. A common desideratum for a testing procedure is to have as much power as possible, subject to \(\alpha_n(T) \leq \alpha\), where \(\alpha\) is some specified “significance level”. When many alternatives are possible, power is a property not only of the test but also of the particular distribution under the alternative. Implicitly, it depends on the sample size \(n\) as well.
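To make these definitions concrete, the sketch below estimates the Type I error and the power of a simple test by simulation. It does not use `mgcpy`: the test is a two-sided test of zero Pearson correlation based on the Fisher z approximation, and the helper names (`pearson_r`, `rejects_independence`, `rejection_rate`) as well as the data model \(y = \text{slope} \cdot x + e\) are assumptions made only for illustration.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rejects_independence(x, y, alpha=0.05):
    """Two-sided test of zero correlation via the Fisher z approximation."""
    n = len(x)
    z = math.atanh(pearson_r(x, y)) * math.sqrt(n - 3)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def rejection_rate(slope, n=50, reps=2000, seed=0):
    """Fraction of simulated datasets on which the test rejects H_0.

    Data are y = slope * x + e with x and e independent standard normals,
    so slope = 0 satisfies H_0 and any other slope is an alternative.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [slope * a + rng.gauss(0, 1) for a in x]
        rejections += rejects_independence(x, y)
    return rejections / reps


alpha_hat = rejection_rate(slope=0.0)  # estimates alpha_n(T)
power_hat = rejection_rate(slope=0.5)  # estimates 1 - beta_n(T)
print(f"estimated Type I error: {alpha_hat:.3f}")
print(f"estimated power:        {power_hat:.3f}")
```

Under these assumptions the first estimate should land near the nominal level of 0.05, while the second reflects one particular alternative at one particular sample size; changing `slope` or `n` changes the power but not the target Type I error.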

## Power in `mgcpy`


```
from mgcpy.independence_tests.dcorr import DCorr
from mgcpy.independence_tests.rv_corr import RVCorr
from mgcpy.benchmarks.power import power
from mgcpy.benchmarks.simulations import linear_sim
```

`mgcpy` comes with 20 built-in simulation functions that model various types of dependence between random variables (linear, spiral, sinusoidal, etc.). The `power` function takes an `Independence_Test` object and a simulation function (one that accepts the arguments `num_samples`, `num_dimensions`, and `noise`) to simulate data. Using these, it estimates the power of the test under the alternative posed by the simulation.
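The interface described above can be illustrated with a small, self-contained sketch. This is not `mgcpy`'s implementation: `estimate_power`, `toy_linear_sim`, and `abs_corr` are hypothetical names, the extra `rng` argument is added here only for reproducibility, and calibrating the critical value from permuted samples is one common approach that is assumed for illustration.

```python
import math
import random


def toy_linear_sim(num_samples, num_dimensions, noise, rng):
    """Hypothetical simulation: y is the sum of x's coordinates plus noise."""
    x = [[rng.uniform(-1, 1) for _ in range(num_dimensions)]
         for _ in range(num_samples)]
    y = [sum(row) + noise * rng.gauss(0, 1) for row in x]
    return x, y


def abs_corr(x, y):
    """Toy test statistic: |Pearson correlation| of x's row sums with y."""
    s = [sum(row) for row in x]
    n = len(s)
    ms, my = sum(s) / n, sum(y) / n
    sxy = sum((a - ms) * (b - my) for a, b in zip(s, y))
    sxx = sum((a - ms) ** 2 for a in s)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy / math.sqrt(sxx * syy))


def estimate_power(statistic, simulate, num_samples=100, num_dimensions=1,
                   noise=0.0, reps=200, alpha=0.05, seed=0):
    """Monte Carlo power estimate with a permutation-calibrated cutoff."""
    rng = random.Random(seed)
    alt_stats, null_stats = [], []
    for _ in range(reps):
        x, y = simulate(num_samples, num_dimensions, noise, rng)
        alt_stats.append(statistic(x, y))
        # Shuffling y breaks any dependence on x, giving a draw under H_0.
        y_perm = y[:]
        rng.shuffle(y_perm)
        null_stats.append(statistic(x, y_perm))
    null_stats.sort()
    # Empirical (1 - alpha) quantile of the null statistics.
    cutoff = null_stats[math.ceil(reps * (1 - alpha)) - 1]
    return sum(s > cutoff for s in alt_stats) / reps


print(estimate_power(abs_corr, toy_linear_sim, noise=0.0))
print(estimate_power(abs_corr, toy_linear_sim, noise=3.0))
```

The design mirrors the description in the text: the estimator only needs a statistic and a simulation function, so swapping in a different dependence structure or a different statistic leaves `estimate_power` unchanged.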

We first estimate the power of `DCorr` and Pearson's correlation on linearly related data. Without any noise, we expect this relationship to be perfectly discernible, i.e. a power of 1. For the following simulations we have sample size `n = 100` and number of dimensions `d = 1`.


```
dcorr = DCorr()
pearson = RVCorr(which_test = 'pearson')
p = power(pearson, linear_sim)
q = power(dcorr, linear_sim)
print("The power of Pearson's correlation against a linear alternative is: %f" % p)
print("The power of DCorr against a linear alternative is: %f" % q)
```

```
The power of Pearson's correlation against a linear alternative is: 1.000000
The power of DCorr against a linear alternative is: 1.000000
```

By adding noise, we see a decrease in power of both tests.


```
p = power(pearson, linear_sim, noise = 3.0)
q = power(dcorr, linear_sim, noise = 3.0)
print("The power of Pearson's correlation against a linear alternative is: %f" % p)
print("The power of DCorr against a linear alternative is: %f" % q)
```

```
The power of Pearson's correlation against a linear alternative is: 0.507000
The power of DCorr against a linear alternative is: 0.439000
```

When we change the simulation to a highly nonlinear dependence, such as a spiral, Pearson’s correlation has markedly lower power than `DCorr`. In turn, `MGC` has higher power in this nonlinear setting than even `DCorr`.


```
from mgcpy.independence_tests.mgc import MGC
from mgcpy.benchmarks.simulations import spiral_sim
mgc = MGC()
p = power(pearson, spiral_sim)
q = power(dcorr, spiral_sim)
r = power(mgc, spiral_sim)
print("The power of Pearson's correlation against a spiral alternative is: %f" % p)
print("The power of DCorr against a spiral alternative is: %f" % q)
print("The power of MGC against a spiral alternative is: %f" % r)
```

```
The power of Pearson's correlation against a spiral alternative is: 0.130000
The power of DCorr against a spiral alternative is: 0.304000
The power of MGC against a spiral alternative is: 1.000000
```
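One way to see why a linear measure loses power against a nonlinear alternative is to compare the statistics themselves on a single sample. The sketch below uses a parabola rather than a spiral because it is easier to reason about, and it computes the biased sample distance correlation directly in plain Python. It is an illustration under those assumptions, not `mgcpy`'s `DCorr` implementation, and the function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so the row means double as column means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Biased (V-statistic) sample distance correlation of two 1-D samples."""
    a = centered_distances(x)
    b = centered_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    # max() guards against a tiny negative value from floating-point error.
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))


rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(300)]
y = [a ** 2 for a in x]  # y is a deterministic but non-monotone function of x

r = pearson_r(x, y)
dcor = distance_correlation(x, y)
print(f"Pearson correlation:  {r:.3f}")
print(f"distance correlation: {dcor:.3f}")
```

Because the parabola is symmetric about zero, the positive and negative halves cancel in the Pearson correlation, whereas distance correlation is built from pairwise distances and is not subject to that cancellation.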

Finally, we consider a high-dimensional square-shaped dependence at a low sample size to examine how `MGC` performs in such a setting.


```
from mgcpy.benchmarks.simulations import square_sim
d = 20
n = 30
p = power(pearson, square_sim, num_samples = n, noise = 1, num_dimensions = d)
q = power(dcorr, square_sim, num_samples = n, noise = 1, num_dimensions = d)
r = power(mgc, square_sim, num_samples = n, noise = 1, num_dimensions = d)
print("The power of Pearson's correlation against a square alternative at n = %d and d = %d is: %f" % (n, d, p))
print("The power of DCorr correlation against a square alternative at n = %d and d = %d is: %f" % (n, d, q))
print("The power of MGC correlation against a square alternative at n = %d and d = %d is: %f" % (n, d, r))
```

```
The power of Pearson's correlation against a square alternative at n = 30 and d = 20 is: 0.056000
The power of DCorr correlation against a square alternative at n = 30 and d = 20 is: 0.040000
The power of MGC correlation against a square alternative at n = 30 and d = 20 is: 0.059000
```