check_common_diff

Function

crisgi_obj.check_common_diff(
    top_n,
    target_group,
    layer="log1p",
    method="prod",
    test_type="TER",
    interactions=None,
    unit_header="subject",
    out_dir=None,
)

Computes, for each observation, the overlap between the top N differential features (e.g., genes or interactions) and a reference set of features. This function is useful for evaluating how consistent the differential features are across groups or conditions in the CRISGI analysis workflow.

Parameters

Name	Type	Description
`top_n`	`int`	Number of top features to consider for overlap analysis.
`target_group`	`str`	The group or condition by which to stratify the analysis.
`layer`	`str`	Data layer to use for entropy calculation (default: `'log1p'`).
`method`	`str`	Method for entropy calculation (default: `'prod'`).
`test_type`	`str`	Statistical test type to use (default: `'TER'`).
`interactions`	`list` or `None`	List of features to compare for overlap. If `None`, uses default from `edata.uns`.
`unit_header`	`str`	Header indicating the unit of analysis (default: `'subject'`).
`out_dir`	`str` or `None`	Output directory to save results. If `None`, saves to current directory.

Return type

None

Returns

This function does not return a value. It updates the obs attribute of the edata object with two new columns:

- top_{top_n}_overlap: Number of overlapping features for each observation.
- top_{top_n}_overlap_ratio: Ratio of overlapping features to top_n.

It also saves a CSV file with these statistics to the specified output directory.
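The two statistics described above can be illustrated with a minimal sketch. It is not the CRISGI implementation: the helper `overlap_stats`, its arguments, and the toy feature names are hypothetical. It assumes each observation already has a ranked list of features, and that the overlap is counted as the number of reference features found among the top `top_n` entries.

```python
import pandas as pd


def overlap_stats(ranked_by_obs, reference, top_n):
    """Hypothetical helper: count reference features among each observation's top_n."""
    reference = set(reference)
    rows = {}
    for obs_id, ranked in ranked_by_obs.items():
        top = list(ranked)[:top_n]  # keep only the top_n ranked features
        n_overlap = len(reference.intersection(top))
        rows[obs_id] = {
            f"top_{top_n}_overlap": n_overlap,
            f"top_{top_n}_overlap_ratio": n_overlap / top_n,
        }
    # One row per observation, mirroring the two columns added to obs
    return pd.DataFrame.from_dict(rows, orient="index")


# Toy input (assumed): ranked features for two observations
ranked = {
    "s1": ["GeneA", "GeneB", "GeneX", "GeneY"],
    "s2": ["GeneX", "GeneY", "GeneZ", "GeneC"],
}
stats = overlap_stats(ranked, reference=["GeneA", "GeneB", "GeneC"], top_n=4)
print(stats)
```

Note that the ratio is taken relative to `top_n`, not to the size of the reference set, which matches the column description above.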

Attributes Set

edata.obs['top_{top_n}_overlap']
edata.obs['top_{top_n}_overlap_ratio']

Example

# Assume crisgi is an instance of the CRISGI class
crisgi.check_common_diff(
    top_n=20,
    target_group='cell_type',
    layer='log1p',
    method='prod',
    test_type='TER',
    interactions=['GeneA', 'GeneB', 'GeneC'],
    unit_header='subject',
    out_dir='results'
)

# After running, check the overlap statistics:
import pandas as pd
overlap_stats = pd.read_csv('./results/top_20_overlap.csv')
print(overlap_stats.head())

This example computes the overlap of the top 20 differential features per cell type, using the specified interactions, and saves the results in the results directory.
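Once the overlap columns exist, a per-group summary is a natural next step. The sketch below uses a small hand-made DataFrame as a stand-in for `edata.obs`; the values and the `cell_type` labels are invented for illustration, and only the column names follow the naming pattern described above.

```python
import pandas as pd

# Stand-in for edata.obs after the call (values are made up)
obs = pd.DataFrame(
    {
        "cell_type": ["T", "T", "B", "B"],
        "top_20_overlap": [10, 14, 4, 6],
    }
)
obs["top_20_overlap_ratio"] = obs["top_20_overlap"] / 20

# Mean, minimum and maximum overlap ratio within each group
summary = obs.groupby("cell_type")["top_20_overlap_ratio"].agg(["mean", "min", "max"])
print(summary)
```

A large gap between groups in the mean ratio would suggest that the reference features are recovered much more consistently in one group than in another.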