pycallingcards.tools.pair_peak_gene_sc

pycallingcards.tools.pair_peak_gene_sc(adata_cc, adata, peak_annotation=None, pvalue_adj_cutoff_cc=0.01, pvalue_adj_cutoff_rna=0.01, pvalue_cutoff_cc=None, pvalue_cutoff_rna=None, lfc_cutoff=3, score_cutoff=3, distance_cutoff=None, group_cc='binomtest', group_adata='rank_genes_groups', group_name='cluster')

Pair related peaks and genes. For each cluster, find the significantly bound peaks, then check whether the genes annotated to those peaks are also significantly expressed.
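The pairing idea can be illustrated with a minimal, library-free sketch. Everything in it is an assumption made for illustration: the record layout, the field names (`pvalue_adj`, `logfoldchange`, `score`), the toy values, and the helper `pair_peaks_and_genes` are hypothetical, and the comparison directions (for example, whether thresholds are inclusive or applied to absolute values) may differ from what the library actually does.

```python
# Toy per-cluster test results; all names and values are hypothetical.
peaks = [
    {"peak": "chr1_100_600", "cluster": "A", "pvalue_adj": 0.001, "logfoldchange": 4.2},
    {"peak": "chr2_50_900", "cluster": "A", "pvalue_adj": 0.200, "logfoldchange": 5.0},
    {"peak": "chr3_10_400", "cluster": "B", "pvalue_adj": 0.002, "logfoldchange": 3.5},
]
genes = [
    {"gene": "GeneX", "cluster": "A", "pvalue_adj": 0.0005, "score": 6.1},
    {"gene": "GeneY", "cluster": "A", "pvalue_adj": 0.0010, "score": 1.2},
    {"gene": "GeneZ", "cluster": "B", "pvalue_adj": 0.0300, "score": 4.0},
]
# Hypothetical annotation: each peak maps to the genes annotated to it.
annotation = {
    "chr1_100_600": ["GeneX", "GeneY"],
    "chr2_50_900": ["GeneX"],
    "chr3_10_400": ["GeneZ"],
}


def pair_peaks_and_genes(peaks, genes, annotation,
                         pvalue_adj_cutoff_cc=0.01, pvalue_adj_cutoff_rna=0.01,
                         lfc_cutoff=3, score_cutoff=3):
    """Return (cluster, peak, gene) tuples where both sides pass their cutoffs."""
    # Genes that are significant in a given cluster, keyed by (cluster, gene).
    significant_genes = {
        (g["cluster"], g["gene"])
        for g in genes
        if g["pvalue_adj"] <= pvalue_adj_cutoff_rna and g["score"] >= score_cutoff
    }
    pairs = []
    for p in peaks:
        # Skip peaks that fail the binding-side cutoffs.
        if p["pvalue_adj"] > pvalue_adj_cutoff_cc or p["logfoldchange"] < lfc_cutoff:
            continue
        # Keep only annotated genes that are significant in the same cluster.
        for gene in annotation.get(p["peak"], []):
            if (p["cluster"], gene) in significant_genes:
                pairs.append((p["cluster"], p["peak"], gene))
    return pairs


pairs = pair_peaks_and_genes(peaks, genes, annotation)
print(pairs)
```

In this toy input only one peak and one of its annotated genes pass both sets of cutoffs in the same cluster, so a single pair is reported.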

Parameters:
  • adata_cc (AnnData) – AnnData object for the calling cards data.

  • adata (AnnData) – AnnData object for the RNA data.

  • peak_annotation (Optional[DataFrame] (default: None)) – Peak annotation obtained from cc.pp.annotation and cc.pp.combine_annotation.

  • pvalue_adj_cutoff_cc (Optional[float] (default: 0.01)) – The cutoff for the adjusted p-values of adata_cc.

  • pvalue_adj_cutoff_rna (Optional[float] (default: 0.01)) – The cutoff for the adjusted p-values of adata.

  • pvalue_cutoff_cc (Optional[float] (default: None)) – The cutoff for the p-values of adata_cc.

  • pvalue_cutoff_rna (Optional[float] (default: None)) – The cutoff for the p-values of adata.

  • lfc_cutoff (float (default: 3)) – The cutoff for the log fold change of adata_cc.

  • score_cutoff (float (default: 3)) – The cutoff for the score value of adata.

  • group_cc (str (default: 'binomtest')) – The name of target result in adata_cc.uns.

  • group_adata (str (default: 'rank_genes_groups')) – The name of target result in adata.uns.

  • group_name (str (default: 'cluster')) – The name of the cluster column in adata_cc.obs.
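Since group_cc, group_adata, and group_name refer to entries in adata_cc.uns, adata.uns, and adata_cc.obs, it can help to confirm they are present before calling the function. The helper below is a hypothetical sketch, not part of pycallingcards; it relies only on `in` membership tests, which work on dict-like `.uns` mappings and on the column labels of a pandas DataFrame such as `.obs`. Plain `SimpleNamespace` objects stand in for AnnData here so the snippet runs on its own.

```python
from types import SimpleNamespace


def missing_inputs(adata_cc, adata, group_cc="binomtest",
                   group_adata="rank_genes_groups", group_name="cluster"):
    """Hypothetical pre-check: list entries the pairing step expects but cannot find."""
    problems = []
    if group_cc not in adata_cc.uns:
        problems.append(f"adata_cc.uns has no {group_cc!r} entry")
    if group_adata not in adata.uns:
        problems.append(f"adata.uns has no {group_adata!r} entry")
    if group_name not in adata_cc.obs:
        problems.append(f"adata_cc.obs has no {group_name!r} column")
    return problems


# Stand-ins for AnnData objects (assumption: only .uns and .obs are inspected).
adata_cc = SimpleNamespace(uns={"binomtest": {}}, obs={"cluster": []})
adata = SimpleNamespace(uns={}, obs={})

problems = missing_inputs(adata_cc, adata)
print(problems)
```

With these stand-ins the RNA object lacks a 'rank_genes_groups' entry, so the helper reports that one problem; after running sc.tl.rank_genes_groups on real data the list would be expected to be empty.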

Return type:

DataFrame

Returns:

pd.DataFrame with paired genes and peaks for different groups.

Example:

>>> import pycallingcards as cc
>>> import scanpy as sc
>>> # Load the calling cards AnnData, the RNA AnnData, and the qbed data.
>>> adata_cc = sc.read("Mouse-Cortex_cc.h5ad")
>>> adata = cc.datasets.mousecortex_data(data="RNA")
>>> qbed_data = cc.datasets.mousecortex_data(data="qbed")
>>> # Call peaks, then build the combined peak annotation.
>>> peak_data = cc.pp.callpeaks(qbed_data, method="CCcaller", reference="mm10", maxbetween=2000,
...                             pvalue_cutoff=0.01, lam_win_size=1000000, pseudocounts=1, record=True)
>>> peak_annotation = cc.pp.annotation(peak_data, reference="mm10")
>>> peak_annotation = cc.pp.combine_annotation(peak_data, peak_annotation)
>>> # Rank genes per cluster in the RNA data, then pair peaks with genes.
>>> sc.tl.rank_genes_groups(adata, 'cluster')
>>> cc.tl.pair_peak_gene_sc(adata_cc, adata, peak_annotation)