check_common_diff
Function
crisgi_obj.check_common_diff(
top_n,
target_group,
layer="log1p",
method="prod",
test_type="TER",
interactions=None,
unit_header="subject",
out_dir=None,
)
Identifies and analyzes the overlap between the top N differential features (e.g., genes or interactions) and a reference set within the dataset. This function is useful for evaluating the consistency of differential features across groups or conditions in the CRISGI analysis workflow.
Parameters
Name | Type | Description |
---|---|---|
top_n |
int |
Number of top features to consider for overlap analysis. |
target_group |
str |
The group or condition by which to stratify the analysis. |
layer |
str |
Data layer to use for entropy calculation (default: 'log1p' ). |
method |
str |
Method for entropy calculation (default: 'prod' ). |
test_type |
str |
Statistical test type to use (default: 'TER' ). |
interactions |
list or None |
List of features to compare for overlap. If None , uses default from edata.uns . |
unit_header |
str |
Header indicating the unit of analysis (default: 'subject' ). |
out_dir |
str or None |
Output directory to save results. If None , saves to current directory. |
Return type
None
Returns
This function does not return a value. It updates the obs
attribute of the edata
object with two new columns:
- top_{top_n}_overlap
: Number of overlapping features for each observation.
- top_{top_n}_overlap_ratio
: Ratio of overlapping features to top_n
.
It also saves a CSV file with these statistics to the specified output directory.
Attributes Set
edata.obs['top_{top_n}_overlap']
edata.obs['top_{top_n}_overlap_ratio'}
Example
# Assume crisgi is an instance of the CRISGI class
crisgi.check_common_diff(
top_n=20,
target_group='cell_type',
layer='log1p',
method='prod',
test_type='TER',
interactions=['GeneA', 'GeneB', 'GeneC'],
unit_header='subject',
out_dir='results'
)
# After running, check the overlap statistics:
import pandas as pd
overlap_stats = pd.read_csv('./results/top_20_overlap.csv')
print(overlap_stats.head())
This example computes the overlap of the top 20 differential features per cell type, using the specified interactions, and saves the results in the results
directory.