pycallingcards.preprocessing.combine_peaks
- pycallingcards.preprocessing.combine_peaks(peak_data, index, expdata=None, background=None, method='CCcaller', reference='hg38', test_method='poisson', lam_win_size=100000, pvalue_cutoff=None, pvalue_cutoffbg=None, pvalue_cutoffTTAA=None, pseudocounts=0.2, return_whole=False)
Combine two adjacent peaks.
This function combines the peak at the given index with the peak that immediately follows it.
- Parameters:
peak_data (DataFrame) – pd.DataFrame of peak data. Pass the original output of the call_peaks function.
index (int) – The index of the first peak to combine. Peak index and peak index+1 are combined.
expdata (Optional[DataFrame] (default: None)) – pd.DataFrame with the first three columns as chromosome, start and end.
background (Optional[DataFrame] (default: None)) – pd.DataFrame with the first three columns as chromosome, start and end.
method (Optional[Literal['CCcaller', 'MACCs', 'Blockify']] (default: 'CCcaller')) – ‘CCcaller’ is a method that considers the maximum distance between insertions in the data. ‘MACCs’ uses an idea adapted from [Zhang et al., 2008]. ‘Blockify’ uses the method from [Moudgil et al., 2020].
reference (Optional[Literal['hg38', 'mm10', 'sacCer3']] (default: 'hg38')) – Currently supported: ‘hg38’ for human data, ‘mm10’ for mouse data and ‘sacCer3’ for yeast data.
pvalue_cutoff (Optional[float] (default: None)) – The P-value cutoff used when there is no background data. If None, no filtering is applied.
pvalue_cutoffbg (Optional[float] (default: None)) – The P-value cutoff for background data when background exists. If None, no filtering is applied.
pvalue_cutoffTTAA (Optional[float] (default: None)) – The P-value cutoff for reference data when background exists. pvalue_cutoffTTAA is recommended to be lower than pvalue_cutoffbg. If None, no filtering is applied.
pseudocounts (float (default: 0.2)) – Number of pseudocounts added for the hypothesis test.
return_whole (bool (default: False)) – If False, return only the combined peak. If True, return the whole peak dataframe.
- Examples:
>>> import pycallingcards as cc
>>> qbed_data = cc.datasets.mousecortex_data(data="qbed")
>>> peak_data = cc.pp.call_peaks(qbed_data, method = "CCcaller", reference = "mm10",
...                              maxbetween = 2000, pvalue_cutoff = 0.01, pseudocounts = 1, record = True)
>>> peak_data = cc.pp.combine_peaks(peak_data, 1, qbed_data, method = "CCcaller", reference = "mm10",
...                                 pvalue_cutoff = 0.01, pseudocounts = 1, return_whole = True)
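
To make the roles of index and return_whole concrete, here is a minimal pandas-only sketch of merging the coordinates of two neighbouring peaks. It is not the pycallingcards implementation: the helper name merge_adjacent_peaks and the column names Chr, Start and End are assumptions made for illustration, and the statistics that combine_peaks recomputes for the merged peak (insertion counts, P-values) are left out entirely.

```python
import pandas as pd

COORDS = ["Chr", "Start", "End"]  # assumed column names, for illustration only


def merge_adjacent_peaks(peaks: pd.DataFrame, index: int,
                         return_whole: bool = False) -> pd.DataFrame:
    """Hypothetical helper: merge rows `index` and `index + 1` into one interval."""
    first, second = peaks.iloc[index], peaks.iloc[index + 1]
    if first["Chr"] != second["Chr"]:
        raise ValueError("peaks on different chromosomes cannot be merged")

    # The merged interval spans from the leftmost start to the rightmost end.
    merged = pd.DataFrame({
        "Chr": [first["Chr"]],
        "Start": [min(first["Start"], second["Start"])],
        "End": [max(first["End"], second["End"])],
    })
    if not return_whole:
        return merged

    # Put the merged row in place of the two original rows.
    return pd.concat(
        [peaks.iloc[:index][COORDS], merged, peaks.iloc[index + 2:][COORDS]],
        ignore_index=True,
    )


peaks = pd.DataFrame({
    "Chr": ["chr1", "chr1", "chr1"],
    "Start": [100, 500, 900],
    "End": [200, 700, 1200],
})
only_merged = merge_adjacent_peaks(peaks, 1)                   # one row: the combined peak
whole_table = merge_adjacent_peaks(peaks, 1, return_whole=True)  # two rows: peak 0 and the combined peak
```

With return_whole left at False only the combined interval comes back, which is convenient for inspecting a merge before committing to it. Passing return_whole=True gives a table that can replace the original peak set.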