pycallingcards.tools.pair_peak_gene_bulk

pycallingcards.tools.pair_peak_gene_bulk(adata_cc, deresult, peak_annotation=None, pvalue_adj_cutoff_cc=0.01, pvalue_adj_cutoff_rna=0.01, pvalue_cutoff_cc=None, pvalue_cutoff_rna=None, lfc_cutoff_cc=3, lfc_cutoff_rna=3, distance_cutoff=None, group_cc='fisher_exact', name_cc='logfoldchanges', name_bulk=['pvalue', 'padj', 'log2FoldChange'])

Pair related peaks and genes. Find the significant binding peaks for each group, then check whether the genes annotated to those peaks are also significant in the bulk RNA differential expression results.

Parameters:
  • adata_cc (AnnData) – AnnData object containing the calling cards data.

  • deresult (Union[str, DataFrame]) – Results from DESeq2, given either as a pandas DataFrame or as the path to a CSV file.

  • peak_annotation (Optional[DataFrame] (default: None)) – Peak annotation obtained from cc.pp.annotation and cc.pp.combine_annotation.

  • pvalue_adj_cutoff_cc (Optional[float] (default: 0.01)) – The cutoff for the adjusted p-values of adata_cc.

  • pvalue_adj_cutoff_rna (Optional[float] (default: 0.01)) – The cutoff for the adjusted p-values of the RNA results in deresult.

  • pvalue_cutoff_cc (Optional[float] (default: None)) – The cutoff for the p-values of adata_cc.

  • pvalue_cutoff_rna (Optional[float] (default: None)) – The cutoff for the p-values of the RNA results in deresult.

  • lfc_cutoff_cc (float (default: 3)) – The cutoff for the log fold change stored under name_cc in adata_cc.

  • lfc_cutoff_rna (float (default: 3)) – The cutoff for the log fold change of the RNA results.

  • group_cc (str (default: 'fisher_exact')) – The name of the target result in adata_cc.uns.

  • name_cc (str (default: 'logfoldchanges')) – The name of the target result in adata_cc.uns[group_cc].
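
To make the role of the cutoffs concrete, the following is a minimal pandas sketch of how an adjusted p-value cutoff and a log fold change cutoff could jointly select genes from a DESeq2-style table. It is not the library's implementation: the gene names and values are made up, and both the comparison direction (at or below the cutoff, at or above the cutoff) and the use of the absolute log fold change are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical DESeq2-style table; gene names and values are made up.
de = pd.DataFrame(
    {
        "pvalue": [1e-6, 0.20, 1e-4, 1e-8],
        "padj": [1e-4, 0.50, 0.03, 1e-6],
        "log2FoldChange": [4.2, 0.3, -3.5, -5.1],
    },
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
)

# Default values taken from the function signature.
pvalue_adj_cutoff_rna = 0.01
lfc_cutoff_rna = 3

# Assumed rule (not confirmed by the documentation): keep genes whose
# adjusted p-value is at or below the cutoff and whose absolute
# log2 fold change is at or above the cutoff.
significant = de[
    (de["padj"] <= pvalue_adj_cutoff_rna)
    & (de["log2FoldChange"].abs() >= lfc_cutoff_rna)
]
significant_genes = significant.index.tolist()
print(significant_genes)
```

Under these assumptions, GeneC is dropped because its adjusted p-value exceeds 0.01 even though its fold change is large, and GeneB fails both criteria.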

Return type:

DataFrame

Returns:

pd.DataFrame with paired genes and peaks for different groups.

Example:

>>> import pycallingcards as cc
>>> adata_cc = cc.datasets.mouse_brd4_data(data="CC")
>>> cc.tl.pair_peak_gene_bulk(adata_cc,"https://github.com/The-Mitra-Lab/pycallingcards_data/releases/download/data/deseq_MF.csv")
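
Since deresult may also be passed as a DataFrame, it can help to confirm that the table contains the columns named in name_bulk before calling the function. The sketch below uses only pandas. The CSV content is made up, and the layout (gene names in the first column, used as the index) is an assumption rather than a documented requirement.

```python
import io

import pandas as pd

# Default value of name_bulk, taken from the function signature.
name_bulk = ["pvalue", "padj", "log2FoldChange"]

# Made-up CSV content standing in for a DESeq2 export.
csv_text = """gene,baseMean,log2FoldChange,pvalue,padj
GeneA,120.5,4.2,1e-6,1e-4
GeneB,80.1,0.3,0.2,0.5
"""

# Assumption: gene names sit in the first column and become the index.
deresult = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Collect any expected column that the table does not provide.
missing = [col for col in name_bulk if col not in deresult.columns]
print(missing)
```

An empty list here means every column named in name_bulk is present. If the DESeq2 export uses different column names, name_bulk can be set to match them.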