{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "a7b23dec-b8d3-438f-84d7-3beb08e782cd", "metadata": {}, "source": [ "# Tutorial: K562 and HCT116 Sp1 single-cell calling cards data" ] }, { "attachments": {}, "cell_type": "markdown", "id": "23957f8d-186a-4d95-a223-dc67ddeba63b", "metadata": {}, "source": [ "In this tutorial, we analyze the binding of the transcription factor Sp1 in the K562 and HCT116 cell lines. The data were generated with single-cell calling cards (scCC). They come from [Moudgil et al., Cell (2020)](https://doi.org/10.1016/j.cell.2020.06.037) and can be downloaded from [GEO](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE148448).\n", "\n", "We will cover how to call TF peaks using a background file, annotate these peaks, compare them with ChIP-seq reference data, and perform a differential peak analysis." ] }, { "cell_type": "code", "execution_count": 1, "id": "9c51fe44-bebe-4dde-8007-5cd516f40f51", "metadata": {}, "outputs": [], "source": [ "import pycallingcards as cc\n", "import numpy as np\n", "import pandas as pd\n", "import scanpy as sc\n", "from matplotlib import pyplot as plt\n", "plt.rcParams['figure.dpi'] = 150" ] }, { "attachments": {}, "cell_type": "markdown", "id": "669744b0-c0b2-4e07-a421-b128fc03f2b2", "metadata": {}, "source": [ "We start by reading the qbed file. Each row represents one Sp1-directed insertion, and the columns give the chromosome, start and end coordinates, read count, strand, and cell barcode of that insertion. For example, the first row describes an insertion on chromosome 1 at a TTAA site spanning genomic coordinates 30116 to 30120. It is supported by 5 reads and maps to the negative strand of the genome. Its cell barcode is CCCAATCCATCGGTTA-1. Note that the cell barcode allows us to connect this insertion to scRNA-seq data collected from the same cell.\n", "\n", "Use ```cc.rd.read_qbed(filename)``` to read your own qbed data." 
] }, { "cell_type": "code", "execution_count": 2, "id": "f6ed38c2-e23d-495f-a163-03e69c976921", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Chr | Start | End | Reads | Direction | Barcodes |
|---|---|---|---|---|---|---|
| 0 | chr1 | 30116 | 30120 | 5 | - | CCCAATCCATCGGTTA-1 |
| 1 | chr1 | 34568 | 34572 | 3 | - | CCTTCGAAGGGCTTCC-1 |
| 2 | chr1 | 36736 | 36740 | 29 | + | ACGAGCCGTATAGGTA-1 |
| 3 | chr1 | 42447 | 42451 | 3 | - | CTCTACGTCGGAGCAA-1 |
| 4 | chr1 | 89697 | 89701 | 119 | - | AGCTCTCGTTTGTTTC-1 |
| ... | ... | ... | ... | ... | ... | ... |
| 77205 | chrY | 25518788 | 25518792 | 2 | + | TGGGCGTTCGAACGGA-1 |
| 77206 | chrY | 56987633 | 56987637 | 13 | + | CAGTCCTAGGCACATG-1 |
| 77207 | chrY | 57080855 | 57080859 | 17 | + | CGGAGCTCATCGACGC-1 |
| 77208 | chrY | 57080855 | 57080859 | 7 | + | GTAACGTAGTTACGGG-1 |
| 77209 | chrY | 57080855 | 57080859 | 9 | + | TCAGCAAGTTGAACTC-1 |

77210 rows × 6 columns

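
To make the six-column layout above concrete, here is a minimal sketch that rebuilds the first three rows with plain pandas and summarizes them. It does not use pycallingcards. The tab-separated, header-less text layout and the variable names are assumptions made for illustration only.

```python
import io

import pandas as pd

# First three rows of the table above, written as tab-separated text.
# Assumption: a qbed file is a header-less, tab-separated text file.
qbed_text = (
    "chr1\t30116\t30120\t5\t-\tCCCAATCCATCGGTTA-1\n"
    "chr1\t34568\t34572\t3\t-\tCCTTCGAAGGGCTTCC-1\n"
    "chr1\t36736\t36740\t29\t+\tACGAGCCGTATAGGTA-1\n"
)

# Column names as displayed in the table above.
columns = ["Chr", "Start", "End", "Reads", "Direction", "Barcodes"]
qbed = pd.read_csv(io.StringIO(qbed_text), sep="\t", header=None, names=columns)

# Each insertion spans one 4-bp TTAA site, so End - Start should be 4.
site_width = qbed["End"] - qbed["Start"]

# Number of insertions on each strand, and total supporting reads.
insertions_per_strand = qbed.groupby("Direction").size()
total_reads = int(qbed["Reads"].sum())

print(qbed)
print(insertions_per_strand)
print(total_reads)
```

The same column names appear in every qbed table in this tutorial, so the same kind of summary applies to the background data as well.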
|  | Chr | Start | End | Reads | Direction | Barcodes |
|---|---|---|---|---|---|---|
| 0 | chr1 | 89697 | 89701 | 14 | + | TCTGAGACAATGGTCT-1 |
| 1 | chr1 | 89697 | 89701 | 8 | + | CAGCGACCAAATACAG-1 |
| 2 | chr1 | 203932 | 203936 | 99 | + | TTCTCCTTCTACTTAC-1 |
| 3 | chr1 | 204063 | 204067 | 5 | - | TGTTCCGGTGTAAGTA-1 |
| 4 | chr1 | 204063 | 204067 | 7 | - | CAAGATCTCGACCAGC-1 |
| ... | ... | ... | ... | ... | ... | ... |
| 37769 | chrY | 18037315 | 18037319 | 9 | - | GCAGTTAAGATCTGAA-1 |
| 37770 | chrY | 24036504 | 24036508 | 168 | + | GCAGTTAAGATCTGAA-1 |
| 37771 | chrY | 24036504 | 24036508 | 508 | + | CATATGGCAGCCAGAA-1 |
| 37772 | chrY | 25633622 | 25633626 | 13 | - | GCAGTTAAGATCTGAA-1 |
| 37773 | chrY | 25633622 | 25633626 | 32 | - | CATATGGCAGCCAGAA-1 |

37774 rows × 6 columns

|  | Chr | Start | End | Center | Experiment Insertions | Background insertions | Reference Insertions | pvalue Reference | pvalue Background | Fraction Experiment | TPH Experiment | Fraction background | TPH background | TPH background subtracted | pvalue_adj Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chr1 | 906689 | 907160 | 906957.0 | 5 | 0 | 3 | 3.099334e-09 | 1.546531e-04 | 0.000065 | 6475.845098 | 0.000000 | 0.000000 | 6475.845098 | 3.371990e-06 |
| 1 | chr1 | 999921 | 1000324 | 1000121.0 | 20 | 0 | 1 | 0.000000e+00 | 0.000000e+00 | 0.000259 | 25903.380391 | 0.000000 | 0.000000 | 25903.380391 | 0.000000e+00 |
| 2 | chr1 | 1156947 | 1157863 | 1157660.0 | 11 | 0 | 2 | 0.000000e+00 | 1.274899e-09 | 0.000142 | 14246.859215 | 0.000000 | 0.000000 | 14246.859215 | 0.000000e+00 |
| 3 | chr1 | 1692740 | 1693542 | 1693339.0 | 6 | 0 | 3 | 5.135270e-11 | 1.546531e-04 | 0.000078 | 7771.014117 | 0.000000 | 0.000000 | 7771.014117 | 7.604315e-08 |
| 4 | chr1 | 1744492 | 1746808 | 1746605.0 | 11 | 0 | 7 | 0.000000e+00 | 1.274899e-09 | 0.000142 | 14246.859215 | 0.000000 | 0.000000 | 14246.859215 | 0.000000e+00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3003 | chrY | 1245503 | 1247177 | 1245703.0 | 5 | 0 | 7 | 1.652897e-09 | 4.678840e-03 | 0.000065 | 6475.845098 | 0.000000 | 0.000000 | 6475.845098 | 2.056405e-06 |
| 3004 | chrY | 1280372 | 1281989 | 1281786.0 | 5 | 0 | 3 | 1.426995e-09 | 4.678840e-03 | 0.000065 | 6475.845098 | 0.000000 | 0.000000 | 6475.845098 | 1.828173e-06 |
| 3005 | chrY | 1586317 | 1587733 | 1587530.0 | 7 | 0 | 8 | 3.370637e-13 | 1.546531e-04 | 0.000091 | 9066.183137 | 0.000000 | 0.000000 | 9066.183137 | 6.766326e-10 |
| 3006 | chrY | 2391936 | 2392440 | 2392237.0 | 6 | 1 | 2 | 1.985434e-11 | 9.958372e-02 | 0.000078 | 7771.014117 | 0.000026 | 2647.323556 | 5123.690561 | 3.329699e-08 |
| 3007 | chrY | 2608793 | 2610054 | 2608993.0 | 8 | 0 | 2 | 2.775558e-15 | 1.546531e-04 | 0.000104 | 10361.352156 | 0.000000 | 0.000000 | 10361.352156 | 6.730626e-12 |

3008 rows × 15 columns

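
The normalized columns in this peak table can be checked by hand. The numbers shown are consistent with `TPH` being the insertion count divided by the total number of insertions in the corresponding qbed table (77210 rows for the experiment, 37774 rows for the background), scaled by 10^8. This relationship is inferred from the displayed values, not taken from the pycallingcards documentation, and the helper `tph` below is a hypothetical name used only for this sketch.

```python
def tph(insertions: int, total_insertions: int) -> float:
    """Insertions scaled to a library of 10**8 insertions (inferred definition)."""
    return insertions / total_insertions * 1e8


# Totals taken from the row counts of the two qbed tables shown earlier.
total_experiment = 77210
total_background = 37774

# Peak 3006: 6 experiment insertions and 1 background insertion.
tph_experiment = tph(6, total_experiment)
tph_background = tph(1, total_background)
tph_subtracted = tph_experiment - tph_background

print(round(tph_experiment, 6))
print(round(tph_background, 6))
print(round(tph_subtracted, 6))
```

Under this reading, `Fraction Experiment` and `Fraction background` are the same ratios before scaling, and `TPH background subtracted` is the difference of the two scaled values.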
|  | Chr | Start | End | Reads | Direction | Barcodes |
|---|---|---|---|---|---|---|
| 0 | chr1 | 16529 | 16533 | 163 | - | GCTCCTAAGTACGTTC-1 |
| 1 | chr1 | 29884 | 29888 | 10 | + | CTCACACCAGACGCTC-1 |
| 2 | chr1 | 29884 | 29888 | 155 | + | TGGCCAGCACCCATTC-1 |
| 3 | chr1 | 29884 | 29888 | 285 | + | GTGGGTCCACGGCCAT-1 |
| 4 | chr1 | 29884 | 29888 | 7 | + | CGTCTACTCAACACGT-1 |
| ... | ... | ... | ... | ... | ... | ... |
| 327460 | chrY | 57061562 | 57061566 | 6 | + | CTCATTATCATCATTC-1 |
| 327461 | chrY | 57061562 | 57061566 | 67 | + | TGCGTGGCATTAGGCT-1 |
| 327462 | chrY | 57145084 | 57145088 | 2 | - | ACATACGTCGCGCCAA-1 |
| 327463 | chrY | 57148630 | 57148634 | 2 | - | TATGCCCGTACAGTTC-1 |
| 327464 | chrY | 57183913 | 57183917 | 228 | - | AAACCTGGTCCTGCTT-1 |

327465 rows × 6 columns

|  | Chr | Start | End | Reads | Direction | Barcodes |
|---|---|---|---|---|---|---|
| 0 | chr1 | 30238 | 30242 | 3 | + | TTTACTGCATAAAGGT-1 |
| 1 | chr1 | 30355 | 30359 | 2 | - | ATCACGAAGAGTAATC-1 |
| 2 | chr1 | 30355 | 30359 | 70 | + | TTGAACGCAAATCCGT-1 |
| 3 | chr1 | 31101 | 31105 | 2 | + | CCTCAGTCATCAGTAC-1 |
| 4 | chr1 | 32116 | 32120 | 5 | + | CTAGTGAAGACAAAGG-1 |
| ... | ... | ... | ... | ... | ... | ... |
| 107380 | chrY | 57080210 | 57080214 | 9 | - | AAGGAGCCAGTATAAG-1 |
| 107381 | chrY | 57087785 | 57087789 | 24 | - | CGAGCCAGTCTCTCTG-1 |
| 107382 | chrY | 57144853 | 57144857 | 5 | + | GAAGCAGTCCCATTTA-1 |
| 107383 | chrY | 57183772 | 57183776 | 2 | - | TCTTTCCTCTTGCCGT-1 |
| 107384 | chrY | 57204853 | 57204857 | 369 | - | ATAACGCAGTTTGCGT-1 |

107385 rows × 6 columns

|  | Chr | Start | End | Center | Experiment Insertions | Background insertions | Reference Insertions | pvalue Reference | pvalue Background | Fraction Experiment | TPH Experiment | Fraction background | TPH background | TPH background subtracted | pvalue_adj Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chr1 | 29684 | 30087 | 29884.0 | 6 | 0 | 1 | 8.878753e-11 | 1.546531e-04 | 0.000018 | 1832.256882 | 0.000000 | 0.000000 | 1832.256882 | 3.323285e-08 |
| 1 | chr1 | 36239 | 38107 | 37578.0 | 24 | 2 | 15 | 0.000000e+00 | 1.486029e-03 | 0.000073 | 7329.027530 | 0.000019 | 1862.457513 | 5466.570017 | 0.000000e+00 |
| 2 | chr1 | 198893 | 201208 | 200869.0 | 28 | 2 | 11 | 0.000000e+00 | 6.927041e-05 | 0.000086 | 8550.532118 | 0.000019 | 1862.457513 | 6688.074605 | 0.000000e+00 |
| 3 | chr1 | 203351 | 207161 | 205004.0 | 92 | 13 | 22 | 0.000000e+00 | 4.337485e-05 | 0.000281 | 28094.605530 | 0.000121 | 12105.973832 | 15988.631698 | 0.000000e+00 |
| 4 | chr1 | 265549 | 266336 | 265749.0 | 5 | 0 | 3 | 3.731359e-08 | 4.678840e-03 | 0.000015 | 1526.880735 | 0.000000 | 0.000000 | 1526.880735 | 1.056034e-05 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9404 | chrY | 15158250 | 15158653 | 15158450.0 | 11 | 0 | 1 | 0.000000e+00 | 1.546531e-04 | 0.000034 | 3359.137618 | 0.000000 | 0.000000 | 3359.137618 | 0.000000e+00 |
| 9405 | chrY | 16985442 | 16985845 | 16985642.0 | 5 | 0 | 2 | 1.806731e-09 | 4.678840e-03 | 0.000015 | 1526.880735 | 0.000000 | 0.000000 | 1526.880735 | 6.060202e-07 |
| 9406 | chrY | 19753311 | 19753714 | 19753511.0 | 33 | 0 | 1 | 0.000000e+00 | 2.269296e-13 | 0.000101 | 10077.412853 | 0.000000 | 0.000000 | 10077.412853 | 0.000000e+00 |
| 9407 | chrY | 21011133 | 21011828 | 21011333.0 | 5 | 0 | 4 | 2.510448e-09 | 4.678840e-03 | 0.000015 | 1526.880735 | 0.000000 | 0.000000 | 1526.880735 | 8.296735e-07 |
| 9408 | chrY | 56952574 | 56957328 | 56953707.0 | 40 | 1 | 37 | 0.000000e+00 | 2.427052e-06 | 0.000122 | 12215.045883 | 0.000009 | 931.228756 | 11283.817126 | 0.000000e+00 |

9409 rows × 15 columns

|  | Chr | Start | End |
|---|---|---|---|
| 0 | chr1 | 29684 | 30087 |
| 1 | chr1 | 36239 | 38107 |
| 2 | chr1 | 198893 | 201208 |
| 3 | chr1 | 203351 | 207161 |
| 4 | chr1 | 265549 | 266336 |
| ... | ... | ... | ... |
| 10445 | chrY | 15158250 | 15158653 |
| 10446 | chrY | 16985442 | 16985845 |
| 10447 | chrY | 19753311 | 19753714 |
| 10448 | chrY | 21011133 | 21011828 |
| 10449 | chrY | 56952574 | 56957328 |

10450 rows × 3 columns

|  | Chr | Start | End | Nearest Refseq1 | Gene Name1 | Direction1 | Distance1 | Nearest Refseq2 | Gene Name2 | Direction2 | Distance2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | chr1 | 29684 | 30087 | NR_036051 | MIR1302-2 | + | 279 | NR_024540 | WASH7P | - | -315 |
| 1 | chr1 | 36239 | 38107 | NR_026818 | FAM138A | - | -159 | NR_036051 | MIR1302-2 | + | -5737 |
| 2 | chr1 | 198893 | 201208 | NR_026823 | FAM138D | - | 3921 | NR_107063 | MIR6859-3 | - | -10936 |
| 3 | chr1 | 203351 | 207161 | NR_026823 | FAM138D | - | 0 | NR_107063 | MIR6859-3 | - | -15394 |
| 4 | chr1 | 265549 | 266336 | NR_026823 | FAM138D | - | -58953 | NR_107063 | MIR6859-3 | - | -77592 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10445 | chrY | 15158250 | 15158653 | NM_001206850 | NLGN4Y | + | -314283 | NR_046504 | NLGN4Y-AS1 | - | -354218 |
| 10446 | chrY | 16985442 | 16985845 | NR_028083 | FAM41AY1 | + | 515113 | NR_002160 | FAM224B | - | 589316 |
| 10447 | chrY | 19753311 | 19753714 | NM_001146706 | KDM5D | - | -8373 | NR_045128 | TXLNGY | + | -146142 |
| 10448 | chrY | 21011133 | 21011828 | NM_001039567 | RPS4Y2 | + | -230102 | NM_001282471 | PRORY | - | 371146 |
| 10449 | chrY | 56952574 | 56957328 | NM_005840 | SPRY3 | + | 0 | NM_001145149 | VAMP7 | + | 110472 |

10450 rows × 11 columns

|  | Chr | Start | End | Reads | Direction | Barcodes |
|---|---|---|---|---|---|---|
| 0 | chr1 | 16529 | 16533 | 163 | - | GCTCCTAAGTACGTTC-1 |
| 1 | chr1 | 29884 | 29888 | 10 | + | CTCACACCAGACGCTC-1 |
| 2 | chr1 | 29884 | 29888 | 155 | + | TGGCCAGCACCCATTC-1 |
| 3 | chr1 | 29884 | 29888 | 285 | + | GTGGGTCCACGGCCAT-1 |
| 4 | chr1 | 29884 | 29888 | 7 | + | CGTCTACTCAACACGT-1 |
| ... | ... | ... | ... | ... | ... | ... |
| 77205 | chrY | 25518788 | 25518792 | 2 | + | TGGGCGTTCGAACGGA-1 |
| 77206 | chrY | 56987633 | 56987637 | 13 | + | CAGTCCTAGGCACATG-1 |
| 77207 | chrY | 57080855 | 57080859 | 17 | + | CGGAGCTCATCGACGC-1 |
| 77208 | chrY | 57080855 | 57080859 | 7 | + | GTAACGTAGTTACGGG-1 |
| 77209 | chrY | 57080855 | 57080859 | 9 | + | TCAGCAAGTTGAACTC-1 |

404675 rows × 6 columns

|  | Index | cluster |
|---|---|---|
| 0 | AAACCTGAGAAAGTGG-1 | HCT116 |
| 1 | AAACCTGAGACCGGAT-1 | K562 |
| 2 | AAACCTGAGACTAGAT-1 | HCT116 |
| 3 | AAACCTGAGAGCTTCT-1 | HCT116 |
| 4 | AAACCTGAGAGTACCG-1 | HCT116 |
| ... | ... | ... |
| 52206 | TTTGTCATCTCCGGTT-1 | K562 |
| 52207 | TTTGTCATCTCGATGA-1 | K562 |
| 52208 | TTTGTCATCTCTAAGG-1 | K562 |
| 52209 | TTTGTCATCTGGAGCC-1 | HCT116 |
| 52210 | TTTGTCATCTTGGGTA-1 | HCT116 |

51079 rows × 2 columns

|  | Chr | Start | End | Reads | Direction | Barcodes |
|---|---|---|---|---|---|---|
| 0 | chr1 | 30238 | 30242 | 3 | + | TTTACTGCATAAAGGT-1 |
| 1 | chr1 | 30355 | 30359 | 2 | - | ATCACGAAGAGTAATC-1 |
| 2 | chr1 | 30355 | 30359 | 70 | + | TTGAACGCAAATCCGT-1 |
| 3 | chr1 | 31101 | 31105 | 2 | + | CCTCAGTCATCAGTAC-1 |
| 4 | chr1 | 32116 | 32120 | 5 | + | CTAGTGAAGACAAAGG-1 |
| ... | ... | ... | ... | ... | ... | ... |
| 37769 | chrY | 18037315 | 18037319 | 9 | - | GCAGTTAAGATCTGAA-1 |
| 37770 | chrY | 24036504 | 24036508 | 168 | + | GCAGTTAAGATCTGAA-1 |
| 37771 | chrY | 24036504 | 24036508 | 508 | + | CATATGGCAGCCAGAA-1 |
| 37772 | chrY | 25633622 | 25633626 | 13 | - | GCAGTTAAGATCTGAA-1 |
| 37773 | chrY | 25633622 | 25633626 | 32 | - | CATATGGCAGCCAGAA-1 |

145159 rows × 6 columns