# mgcpy.hypothesis_tests package

## mgcpy.hypothesis_tests.transforms module

`mgcpy.hypothesis_tests.transforms.k_sample_transform(x, y, is_y_categorical=False)`

Transform to represent a k-sample test as an independence test.

Parameters
• x (2D numpy.array) – interpreted as either:

  • a `[n*n]` distance matrix: a square matrix with zeros on the diagonal, for n samples, OR

  • a `[n*p]` data matrix: n samples in p dimensions

• y (2D numpy.array) – interpreted as one of:

  • a `[n*n]` distance matrix: a square matrix with zeros on the diagonal, for n samples, OR

  • a `[n*p]` data matrix: n samples in p dimensions, OR

  • a `[n*1]` label matrix holding categorical data for `x`, if `is_y_categorical` is set to True

• is_y_categorical (boolean) – if True, `y` holds categorical data and is treated as a label array for `x`; otherwise it is a plain data matrix

Returns

• u – a concatenated data matrix of dimensions `[2*n, p]`

• v – a label matrix for `u`, indicating the category to which each data entry in `u` belongs

Return type

list
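
As a rough illustration of the idea behind the transform, the sketch below stacks two data matrices and builds a matching label column. It uses only NumPy; the function name `k_sample_transform_sketch` is hypothetical, the 0/1 label encoding is an assumption, and distance-matrix inputs are not handled. It is not mgcpy's implementation.

```python
import numpy as np


def k_sample_transform_sketch(x, y, is_y_categorical=False):
    """Hypothetical stand-in for illustration; not mgcpy's code."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    if is_y_categorical:
        # y already holds one label per row of x, so x is the data matrix
        return x, y.reshape(-1, 1)
    # Stack both samples row-wise, then label each row by its sample of origin
    u = np.concatenate([x, y.astype(float)], axis=0)
    v = np.concatenate([np.zeros(x.shape[0]), np.ones(y.shape[0])]).reshape(-1, 1)
    return u, v


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
y = rng.normal(loc=1.0, size=(5, 3))
u, v = k_sample_transform_sketch(x, y)
print(u.shape, v.shape)
```

Once the samples are stacked this way, asking whether `x` and `y` come from the same distribution becomes asking whether the rows of `u` are independent of the labels in `v`.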

`mgcpy.hypothesis_tests.transforms.paired_two_sample_transform(x, y)`

Transform to represent a paired two-sample test as an independence test.

Steps:

• combine x and y to get the joint_distribution

• sample n pairs from the joint_distribution

• compute the Euclidean distance between the sampled n pairs, which is `randomly_sampled_pairs_distance`

• compute the Euclidean distance between the actual x and y, which is `actual_pairs_distance`

• compute the two sample transformed matrices of `randomly_sampled_pairs_distance` and `actual_pairs_distance`

Parameters
• x (2D numpy.array) – interpreted as either:

  • a `[n*n]` distance matrix: a square matrix with zeros on the diagonal, for n samples, OR

  • a `[n*p]` data matrix: n samples in p dimensions

• y (2D numpy.array) – interpreted as either:

  • a `[n*n]` distance matrix: a square matrix with zeros on the diagonal, for n samples, OR

  • a `[n*p]` data matrix: n samples in p dimensions

Returns

• u – a data matrix of dimensions `[2*n, p]`

• v – a label matrix for `u`, indicating the category to which each data entry in `u` belongs

Return type

list
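
The steps listed for `paired_two_sample_transform` can be read roughly as in the sketch below. This is one possible interpretation using only NumPy, not mgcpy's code: the function name, the `seed` argument, sampling pairs with replacement, and the 0/1 labels are all assumptions. In this sketch each pair is reduced to a single distance, so `u` has one column, whereas the documented shape is `[2*n, p]`; the library may therefore differ in detail.

```python
import numpy as np


def paired_two_sample_transform_sketch(x, y, seed=None):
    """Hypothetical reading of the listed steps; not mgcpy's code."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]

    # Step 1: pool x and y into one joint sample
    joint = np.concatenate([x, y], axis=0)

    # Step 2: draw n random pairs of rows from the pooled sample
    first = rng.integers(0, joint.shape[0], size=n)
    second = rng.integers(0, joint.shape[0], size=n)

    # Step 3: Euclidean distance within each randomly sampled pair
    randomly_sampled_pairs_distance = np.linalg.norm(joint[first] - joint[second], axis=1)

    # Step 4: Euclidean distance within each actual (x_i, y_i) pair
    actual_pairs_distance = np.linalg.norm(x - y, axis=1)

    # Step 5: two-sample transform of the two sets of distances
    u = np.concatenate([randomly_sampled_pairs_distance, actual_pairs_distance]).reshape(-1, 1)
    v = np.concatenate([np.zeros(n), np.ones(n)]).reshape(-1, 1)
    return u, v


rng = np.random.default_rng(1)
x = rng.normal(size=(8, 2))
y = x + rng.normal(scale=0.1, size=(8, 2))
u, v = paired_two_sample_transform_sketch(x, y, seed=0)
print(u.shape, v.shape)
```

The intuition is that if `x` and `y` are strongly paired, distances within actual pairs tend to be smaller than distances within random pairs, so the distances in `u` depend on the labels in `v`.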

`mgcpy.hypothesis_tests.transforms.paired_two_sample_test_dcorr(x, y, which_test='biased', compute_distance_matrix=None, is_fast=False)`

Compute the DCorr test statistic for a paired two-sample test.

Parameters
• x (2D numpy.array) – interpreted as either:

  • a `[n*n]` distance matrix: a square matrix with zeros on the diagonal, for n samples, OR

  • a `[n*p]` data matrix: n samples in p dimensions

• y (2D numpy.array) – interpreted as either:

  • a `[n*n]` distance matrix: a square matrix with zeros on the diagonal, for n samples, OR

  • a `[n*p]` data matrix: n samples in p dimensions

Returns

the paired two-sample DCorr test statistic

Return type

float
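
To give a sense of what a biased distance correlation statistic measures, the sketch below computes the standard biased sample distance correlation (in its squared form) between a stacked data matrix and its labels. It uses only NumPy and is not mgcpy's implementation: the helper names are hypothetical, the transform step is simplified to plain stacking, and how the library combines the transform with DCorr (or handles `compute_distance_matrix` and `is_fast`) is not shown here.

```python
import numpy as np


def pairwise_euclidean(a):
    """Return the [n*n] matrix of Euclidean distances between rows of a."""
    diff = a[:, None, :] - a[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def double_center(d):
    """Subtract row and column means, then add back the grand mean."""
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def biased_dcorr_sketch(u, v):
    """Biased sample distance correlation, squared form; hypothetical helper."""
    a = double_center(pairwise_euclidean(np.asarray(u, dtype=float)))
    b = double_center(pairwise_euclidean(np.asarray(v, dtype=float)))
    dcov = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(dcov / denom)


rng = np.random.default_rng(0)
n = 20
x = rng.normal(size=(n, 2))
y = x + rng.normal(scale=0.1, size=(n, 2))

# Assumed pipeline: stack the samples, label them, then measure dependence
u = np.concatenate([x, y], axis=0)
v = np.concatenate([np.zeros(n), np.ones(n)]).reshape(-1, 1)
statistic = biased_dcorr_sketch(u, v)
print(statistic)
```

The biased statistic lies between 0 and 1, with values near 0 suggesting that the data carry little information about which sample a row came from.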