pycallingcards.preprocessing.filter_peaks

pycallingcards.preprocessing.filter_peaks(data, min_counts=None, min_cells=None, max_counts=None, max_cells=None, inplace=True, copy=False)

Filter peaks based on the number of cells or counts. Depending on which threshold is given, a peak is kept if it has at least min_counts counts, is expressed in at least min_cells cells, has at most max_counts counts, or is expressed in at most max_cells cells. Provide only one of the optional parameters min_counts, min_cells, max_counts, max_cells per call.

Parameters:
  • data (AnnData) – An annotated data matrix of shape n_obs * n_vars. Rows correspond to cells and columns to peaks.

  • min_counts (Optional[int] (default: None)) – Minimum number of counts required for a peak to pass filtering.

  • min_cells (Optional[int] (default: None)) – Minimum number of cells in which a peak must be expressed to pass filtering.

  • max_counts (Optional[int] (default: None)) – Maximum number of counts a peak may have to pass filtering.

  • max_cells (Optional[int] (default: None)) – Maximum number of cells in which a peak may be expressed to pass filtering.

  • inplace (bool (default: True)) – Whether to perform the computation in place or return the result.

  • copy (bool (default: False)) – Whether to modify a copy of the input object.

Return type:
  Union[AnnData, None, Tuple[ndarray, ndarray]]

Returns:
  Returns the following arrays, or directly subsets and annotates the data matrix.

  • peak_subset – Boolean index mask that does the filtering. True means that the peak is kept. False means that the peak is removed.

  • number_per_peak – Depending on which threshold was used (counts or cells), the array stores n_counts or n_cells per peak, respectively.

Example:

>>> import pycallingcards as cc
>>> adata_cc = cc.datasets.mousecortex_data(data="CC")
>>> cc.pp.filter_peaks(adata_cc, min_counts=1)
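
The threshold semantics described above can be illustrated without the library. The following is a minimal sketch in plain Python, not the pycallingcards implementation: the helper name sketch_filter_peaks is hypothetical, the input is a nested list rather than an AnnData object, and it assumes that a peak counts as "expressed" in a cell when its value there is greater than zero.

```python
def sketch_filter_peaks(matrix, min_counts=None, min_cells=None,
                        max_counts=None, max_cells=None):
    """Hypothetical stand-in for the filtering logic.

    matrix: list of rows (cells), each a list of per-peak counts.
    Returns (peak_subset, number_per_peak) as plain Python lists.
    """
    thresholds = [min_counts, min_cells, max_counts, max_cells]
    if sum(t is not None for t in thresholds) != 1:
        raise ValueError(
            "Provide exactly one of min_counts, min_cells, max_counts, max_cells."
        )

    # Transpose so that each entry holds the values of one peak across cells.
    columns = list(zip(*matrix))

    if min_counts is not None or max_counts is not None:
        # Count-based threshold: total counts per peak.
        number_per_peak = [sum(col) for col in columns]
    else:
        # Cell-based threshold: cells with a nonzero value (assumption).
        number_per_peak = [sum(1 for v in col if v > 0) for col in columns]

    if min_counts is not None or min_cells is not None:
        limit = min_counts if min_counts is not None else min_cells
        peak_subset = [n >= limit for n in number_per_peak]
    else:
        limit = max_counts if max_counts is not None else max_cells
        peak_subset = [n <= limit for n in number_per_peak]

    return peak_subset, number_per_peak


# Three cells (rows) by three peaks (columns).
matrix = [
    [0, 2, 5],
    [0, 1, 0],
    [0, 0, 3],
]
mask, counts = sketch_filter_peaks(matrix, min_counts=3)
print(mask)    # [False, True, True]
print(counts)  # [0, 3, 8]
```

Here the first peak has no counts and is dropped, while the other two meet the min_counts threshold. Passing min_cells, max_counts, or max_cells instead switches which quantity is computed per peak and which direction the comparison takes.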