smurf.make_preparation

smurf.make_preparation(cells_final, so, adatas_final, adata, weight_to_celltype, maximum_cells=10000)

Prepares data for optimization by organizing cells and spots, calculating weights, and grouping cells for computational efficiency.

This function takes the final cell-to-spot assignments and the spatial data and builds the inputs required by the optimization step. It organizes cells and spots, computes cell-type-specific weights, and splits cells into groups capped by maximum_cells, which keeps memory use and run time manageable on datasets with many cells.

Parameters:
  • cells_final (dict) – A dictionary mapping cell IDs to their final set of spots after expansion.

  • so (spatial_object) – A spatial object containing spatial mappings, spot data, and other necessary attributes.

  • adatas_final (anndata.AnnData) – An AnnData object containing the final single-cell gene expression data after processing.

  • adata (anndata.AnnData) – An AnnData object containing spatial gene expression data.

  • weight_to_celltype (numpy.ndarray) – A NumPy array in which each row holds the weight vector for one cell type; these weights are used in the optimization.

  • maximum_cells (int, optional) – The maximum number of cells to include in a single optimization group. Capping the group size limits the computational load of each optimization run. Defaults to 10000.
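
To illustrate what a cap such as maximum_cells means in practice, here is a minimal sketch that splits a list of cell IDs into consecutive groups of bounded size. The helper chunk_cells is hypothetical and is not part of SMURF; the grouping that make_preparation actually performs is not described on this page and may differ (for instance, it may take shared spots into account).

```python
def chunk_cells(cell_ids, maximum_cells=10000):
    """Hypothetical helper (not SMURF's implementation).

    Splits cell IDs into consecutive groups holding at most
    `maximum_cells` entries each, keyed by a running group index.
    """
    if maximum_cells < 1:
        raise ValueError("maximum_cells must be a positive integer")
    cell_ids = list(cell_ids)
    return {
        group_id: cell_ids[start:start + maximum_cells]
        for group_id, start in enumerate(range(0, len(cell_ids), maximum_cells))
    }


# 25 toy cell IDs with a cap of 10 cells per group
groups = chunk_cells(range(25), maximum_cells=10)
for group_id, members in groups.items():
    print(group_id, len(members))
```

A smaller cap yields more, lighter groups; a larger cap yields fewer groups that each cost more to optimize.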

Returns:

A tuple of 11 dictionaries:

  • pct_toml_dic (dict): Dictionary containing spot IDs and their associated proportions and cell types.

  • spots_X_dic (dict): Dictionary of spot expression matrices for each group.

  • celltypes_dic (dict): Dictionary of cell-type-specific weight matrices for each group.

  • cells_X_plus_dic (dict): Dictionary of cell expression matrices for each group.

  • nonzero_indices_dic (dict): Dictionary of non-zero indices indicating cell presence in spots for each group.

  • nonzero_indices_toml (dict): Dictionary of updated non-zero indices with new IDs for optimization.

  • cells_before_ml (dict): Dictionary of cells and their assigned spots before machine learning adjustments.

  • cells_before_ml_x (dict): Dictionary of cell expression data aggregated before machine learning.

  • groups_combined (dict): Dictionary of cell groups formed to limit computational load.

  • spots_id_dic (dict): Dictionary of spot IDs for each group.

  • spots_id_dic_prop (dict): Dictionary of spot proportions for each group.
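
Because the return value is a long tuple, it can be easier to work with by name. The sketch below wraps it in a namedtuple; it assumes the tuple elements come back in the order listed above. PreparationResult and wrap_preparation are hypothetical names introduced for illustration and are not part of SMURF. The real call is shown only as a comment, and a placeholder tuple of empty dictionaries stands in for its result so the snippet runs without SMURF installed.

```python
from collections import namedtuple

# Field names follow the order of the "Returns" list above (assumed order).
PreparationResult = namedtuple(
    "PreparationResult",
    [
        "pct_toml_dic",
        "spots_X_dic",
        "celltypes_dic",
        "cells_X_plus_dic",
        "nonzero_indices_dic",
        "nonzero_indices_toml",
        "cells_before_ml",
        "cells_before_ml_x",
        "groups_combined",
        "spots_id_dic",
        "spots_id_dic_prop",
    ],
)


def wrap_preparation(result):
    """Hypothetical helper: give the 11-element tuple named fields."""
    return PreparationResult(*result)


# With SMURF installed and the inputs prepared, the call would look like:
# prep = wrap_preparation(
#     smurf.make_preparation(cells_final, so, adatas_final, adata,
#                            weight_to_celltype, maximum_cells=10000)
# )

# Placeholder standing in for the real return value (11 empty dicts).
placeholder = tuple({} for _ in PreparationResult._fields)
prep = wrap_preparation(placeholder)
print(prep._fields)
```

Named access such as prep.groups_combined or prep.spots_X_dic avoids positional mistakes when only a few of the outputs are needed.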