check_common_diff

Function

crisgi_obj.check_common_diff(
    top_n,
    target_group,
    layer="log1p",
    method="prod",
    test_type="TER",
    interactions=None,
    unit_header="subject",
    out_dir=None,
)

Identifies and analyzes the overlap between the top N differential features (e.g., genes or interactions) and a reference set within the dataset. This function is useful for evaluating the consistency of differential features across groups or conditions in the CRISGI analysis workflow.

Parameters

| Name | Type | Description |
|------|------|-------------|
| top_n | int | Number of top features to consider for overlap analysis. |
| target_group | str | The group or condition by which to stratify the analysis. |
| layer | str | Data layer to use for entropy calculation (default: 'log1p'). |
| method | str | Method for entropy calculation (default: 'prod'). |
| test_type | str | Statistical test type to use (default: 'TER'). |
| interactions | list or None | List of features to compare for overlap. If None, uses default from edata.uns. |
| unit_header | str | Header indicating the unit of analysis (default: 'subject'). |
| out_dir | str or None | Output directory to save results. If None, saves to current directory. |

Return type

None

Returns

This function does not return a value. It updates the obs attribute of the edata object with two new columns:

  • top_{top_n}_overlap: Number of overlapping features for each observation.
  • top_{top_n}_overlap_ratio: Ratio of overlapping features to top_n.

It also saves a CSV file with these statistics to the specified output directory.

Attributes Set

  • edata.obs['top_{top_n}_overlap']
  • edata.obs['top_{top_n}_overlap_ratio']
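
To make the two statistics concrete, the following is a minimal sketch of how an overlap count and overlap ratio could be derived for each observation. It is not CRISGI's implementation: the helper name overlap_stats, the ranked feature lists, and the observation names are all hypothetical, and the sketch assumes each observation already has a ranked list of features.

```python
def overlap_stats(ranked_per_obs, reference, top_n):
    """Hypothetical helper: count how many of each observation's
    top-N ranked features also appear in the reference set."""
    reference = set(reference)
    stats = {}
    for obs_name, ranked in ranked_per_obs.items():
        top = set(ranked[:top_n])           # keep only the top N features
        n_overlap = len(top & reference)    # features shared with the reference
        stats[obs_name] = {
            f"top_{top_n}_overlap": n_overlap,
            f"top_{top_n}_overlap_ratio": n_overlap / top_n,
        }
    return stats


# Made-up ranked feature lists for two observations (illustration only).
ranked = {
    "cell_1": ["GeneA", "GeneB", "GeneX", "GeneY"],
    "cell_2": ["GeneX", "GeneC", "GeneZ", "GeneA"],
}
stats = overlap_stats(ranked, reference=["GeneA", "GeneB", "GeneC"], top_n=3)
for name, row in stats.items():
    print(name, row)
```

Here the ratio is the overlap count divided by top_n, matching the column descriptions above; how CRISGI ranks features internally is not covered by this sketch.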

Example

# Assume crisgi is an instance of the CRISGI class
crisgi.check_common_diff(
    top_n=20,
    target_group='cell_type',
    layer='log1p',
    method='prod',
    test_type='TER',
    interactions=['GeneA', 'GeneB', 'GeneC'],
    unit_header='subject',
    out_dir='results'
)

# After running, check the overlap statistics:
import pandas as pd
overlap_stats = pd.read_csv('./results/top_20_overlap.csv')
print(overlap_stats.head())

This example computes the overlap of the top 20 differential features per cell type, using the specified interactions, and saves the results in the results directory.
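
Because the overlap columns are stored per observation, it may be useful to summarize them by group afterwards. The sketch below uses a small stand-in DataFrame in place of edata.obs; the values are invented for illustration, and the column names simply follow the top_{top_n}_overlap pattern described above with top_n=20.

```python
import pandas as pd

# Stand-in for edata.obs after the call above (values are made up).
obs = pd.DataFrame(
    {
        "cell_type": ["T cell", "T cell", "B cell", "B cell"],
        "top_20_overlap": [10, 14, 4, 6],
        "top_20_overlap_ratio": [0.50, 0.70, 0.20, 0.30],
    }
)

# Mean overlap ratio per group: a quick view of how consistently the
# top features agree with the reference set within each cell type.
per_group = obs.groupby("cell_type")["top_20_overlap_ratio"].mean()
print(per_group)
```

With a real CRISGI object, the same groupby could presumably be applied to the obs table directly, assuming it behaves like a pandas DataFrame.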