# mgcpy.independence_tests.mgc_utils package

## mgcpy.independence_tests.mgc_utils.local_correlation module

MGC’s Local Correlation Module

`mgcpy.independence_tests.mgc_utils.local_correlation.local_correlations(ndarray matrix_A, ndarray matrix_B, distance_metric='euclidean', base_global_correlation='mgc')`

Computes all the local correlation coefficients in `O(n^2 log n)` time.

Parameters
• matrix_A (2D numpy.array) – interpreted as either:

• a `[n*n]` distance matrix: a square matrix for `n` samples with zeros on the diagonal, OR

• a `[n*d]` data matrix: `n` samples in `d` dimensions

• matrix_B (2D numpy.array) – interpreted as either:

• a `[n*n]` distance matrix: a square matrix for `n` samples with zeros on the diagonal, OR

• a `[n*d]` data matrix: `n` samples in `d` dimensions

• distance_metric (string) – the metric used to compute the distance matrices. Defaults to ‘euclidean’.

• base_global_correlation (string) – the global correlation to build upon: one of ‘mgc’, ‘dcor’, ‘mantel’, or ‘rank’. Defaults to ‘mgc’.

Returns

A `dict` with the following keys:

• local_correlation_matrix

a 2D matrix of all local correlations within `[-1,1]`

• local_variance_A

all local variances of A

• local_variance_B

all local variances of B

Return type

dictionary

Example:

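```
>>> import numpy as np
>>> from mgcpy.independence_tests.mgc_utils.local_correlation import local_correlations
>>>
>>> # Two data matrices with n = 3 samples (rows) in d = 3 dimensions (columns)
>>> X = np.array([[2, 1, 100], [4, 2, 10], [8, 3, 10]])
>>> Y = np.array([[30, 20, 10], [5, 10, 20], [8, 16, 32]])
>>> result = local_correlations(X, Y)
>>> # The returned dict holds the local correlation map and the local variances
>>> local_corr = result["local_correlation_matrix"]
```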
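The `[n*n]` distance-matrix form of an input can be illustrated with a minimal NumPy sketch. The helper `euclidean_distance_matrix` is hypothetical and is not part of mgcpy; whether the library builds its distance matrices this way, and how it tells a square data matrix (such as the 3-by-3 `X` above) from a distance matrix, is not stated here.

```python
import numpy as np


def euclidean_distance_matrix(data):
    """Pairwise Euclidean distances between the rows of an [n*d] data matrix.

    Hypothetical helper for illustration only; not an mgcpy function.
    """
    # Broadcasting yields an [n, n, d] array of row-by-row differences.
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


X = np.array([[2, 1, 100], [4, 2, 10], [8, 3, 10]], dtype=float)
D = euclidean_distance_matrix(X)  # [3*3], symmetric, zeros on the diagonal
```

A matrix built this way satisfies the description of the distance-matrix input: square, one row and column per sample, and zero on the diagonal.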
`mgcpy.independence_tests.mgc_utils.local_correlation.local_covariance(ndarray distance_matrix_A, ndarray distance_matrix_B, ndarray ranked_distance_matrix_A, ndarray ranked_distance_matrix_B)`

Computes all local covariances simultaneously in `O(n^2)`.

Parameters
• distance_matrix_A (2D numpy.array) – first distance matrix (centered or appropriately transformed), `[n*n]`

• distance_matrix_B (2D numpy.array) – second distance matrix (centered or appropriately transformed), `[n*n]`

• ranked_distance_matrix_A (2D numpy.array) – column-wise ranked matrix of `A`, `[n*n]`

• ranked_distance_matrix_B (2D numpy.array) – column-wise ranked matrix of `B`, `[n*n]`

Returns

matrix of all local covariances, `[n*n]`

Return type

2D numpy.array
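To give a feel for how every scale can be handled at once in `O(n^2)`, here is a rough sketch of a rank-binning and cumulative-sum approach. It is an assumption about the general technique, not the library's implementation: the helpers `dense_rank_columns` and `all_scale_sums` are hypothetical, the matrices are left uncentered, and the mean-correction terms of a true covariance are omitted.

```python
import numpy as np


def dense_rank_columns(D):
    """Rank each column independently (0-based); tied values share a rank."""
    ranks = np.empty(D.shape, dtype=int)
    for j in range(D.shape[1]):
        # The inverse indices of the sorted unique values are dense ranks.
        _, inverse = np.unique(D[:, j], return_inverse=True)
        ranks[:, j] = inverse.reshape(-1)
    return ranks


def all_scale_sums(A, B, RA, RB):
    """S[k, l] = sum of A[i, j] * B[i, j] where RA[i, j] <= k and RB[i, j] <= l."""
    S = np.zeros((RA.max() + 1, RB.max() + 1))
    # One pass over the n*n entries: add each product to its (rank_A, rank_B) bin.
    np.add.at(S, (RA.ravel(), RB.ravel()), (A * B).ravel())
    # Cumulative sums along both axes turn per-bin totals into "<= k, <= l" totals.
    return S.cumsum(axis=0).cumsum(axis=1)


# Toy inputs: distance matrices of two small random samples (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
y = rng.normal(size=(5, 1))
A = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
B = np.linalg.norm(y[:, None] - y[None, :], axis=-1)
RA, RB = dense_rank_columns(A), dense_rank_columns(B)
S = all_scale_sums(A, B, RA, RB)
```

The binning pass touches each of the `n*n` entries once and the cumulative sums run over at most `n*n` bins, so no scale pair `(k, l)` is recomputed from scratch.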

## mgcpy.independence_tests.mgc_utils.threshold_smooth module

MGC’s Sample Statistic Module

`mgcpy.independence_tests.mgc_utils.threshold_smooth.threshold_local_correlations(local_correlation_matrix, sample_size)`

Finds a connected region of significance in the local correlation map by thresholding.

Parameters
• local_correlation_matrix – all local correlations within `[-1,1]`

• sample_size (integer) – the sample size of the original data (which may differ from `m` or `n`, the dimensions of `local_correlation_matrix`, when the data contain repeated values).

Returns

a binary matrix of size `[m*n]`, with 1’s indicating the significant region.

Return type

2D numpy.array
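The idea of keeping one connected region above a threshold can be sketched as follows. This is only an illustration under stated assumptions: the function `largest_connected_region` is hypothetical, the `threshold` is passed in directly (how the library derives it from `sample_size` is not described here), and 4-connectivity with "largest region wins" is an assumed convention.

```python
from collections import deque

import numpy as np


def largest_connected_region(corr, threshold):
    """Binary mask of the largest 4-connected region where corr > threshold."""
    above = corr > threshold
    seen = np.zeros(corr.shape, dtype=bool)
    m, n = corr.shape
    best = []
    for si in range(m):
        for sj in range(n):
            if not above[si, sj] or seen[si, sj]:
                continue
            # Breadth-first flood fill from an unvisited cell above the threshold.
            region, queue = [], deque([(si, sj)])
            seen[si, sj] = True
            while queue:
                i, j = queue.popleft()
                region.append((i, j))
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if 0 <= ni < m and 0 <= nj < n and above[ni, nj] and not seen[ni, nj]:
                        seen[ni, nj] = True
                        queue.append((ni, nj))
            if len(region) > len(best):
                best = region
    mask = np.zeros(corr.shape, dtype=int)
    for i, j in best:
        mask[i, j] = 1
    return mask


toy_map = np.array([[0.9, 0.1, 0.8],
                    [0.7, 0.0, 0.6],
                    [0.1, 0.2, 0.9]])
toy_region = largest_connected_region(toy_map, threshold=0.5)
```

In this toy map two separate regions exceed the threshold; only the larger one (the right-hand column) is kept in the mask.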

`mgcpy.independence_tests.mgc_utils.threshold_smooth.smooth_significant_local_correlations(significant_connected_region, local_correlation_matrix)`

Finds the smoothed maximum within the significant region `R`:

• If the area of `R` is too small, returns the last local correlation.

• Otherwise, returns the maximum within `significant_connected_region`.

Parameters
• significant_connected_region (2D numpy.array) – a binary matrix of size `[m*n]`, with 1’s indicating the significant region.

• local_correlation_matrix – all local correlations within `[-1,1]`

Returns

A `dict` with the following keys:

• mgc_statistic

the sample MGC statistic within `[-1, 1]`

• optimal_scale

the estimated optimal scale as an `[x, y]` pair.

Return type

dictionary
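The two-branch rule described for this function can be sketched roughly as below. The function `smoothed_maximum` and its `min_area` parameter are hypothetical: the cutoff for "too small" is not given here, and the use of 0-based NumPy indices for `optimal_scale` is an assumption that may differ from the library's convention.

```python
import numpy as np


def smoothed_maximum(region, corr, min_area):
    """Pick the maximum inside `region`, or the last entry if the region is small.

    `min_area` is a hypothetical cutoff; scales are 0-based indices (assumption).
    """
    m, n = corr.shape
    # Fallback branch: the last local correlation in the map.
    statistic, scale = corr[m - 1, n - 1], [m - 1, n - 1]
    if region.sum() >= max(min_area, 1):
        # Hide everything outside the region, then take the largest value left.
        masked = np.where(region == 1, corr, -np.inf)
        i, j = np.unravel_index(np.argmax(masked), masked.shape)
        statistic, scale = corr[i, j], [int(i), int(j)]
    return {"mgc_statistic": float(statistic), "optimal_scale": scale}


corr_map = np.array([[0.1, 0.2, 0.3],
                     [0.2, 0.8, 0.4],
                     [0.3, 0.4, 0.5]])
sig_region = np.array([[0, 0, 0],
                       [0, 1, 1],
                       [0, 0, 1]])
large_enough = smoothed_maximum(sig_region, corr_map, min_area=2)
too_small = smoothed_maximum(sig_region, corr_map, min_area=5)
```

With a region of area 3, a cutoff of 2 selects the in-region maximum, while a cutoff of 5 falls back to the last entry of the map.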