scDblFinder error when using aggregateFeatures and knownDoublets #82

@dottercp

Dear developers,

I'm working with multiplexed (CMO) scATAC-seq data (one 10X run contains 6 samples), where overlapping hashtags identify known doublets. When running scDblFinder on this data, I wanted to pass these doublets as knownDoublets and aggregate features, as recommended in the vignette. However, this combination of parameters fails with an error (see the traceback further down). After some debugging, I suspect the cause is that the dataset is split into known doublets (sce.dbl) and the remaining cells (sce) before aggregation, so the row names of the two subsets no longer match.
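To illustrate the suspected mismatch, here is a minimal base-R sketch using a toy matrix. It is not scDblFinder's actual code: the count matrix, the `known` flags, and the `groups` vector are hypothetical, and `rowsum()` merely stands in for the feature aggregation step. It only shows how aggregating one subset after the split would leave the other subset without the new row names.

```r
# Toy count matrix: 6 features (rows) x 5 cells (columns). All values are made up.
set.seed(1)
counts <- matrix(rpois(30, 5), nrow = 6,
                 dimnames = list(paste0("peak", 1:6), paste0("cell", 1:5)))
known <- c(FALSE, FALSE, TRUE, FALSE, TRUE)  # hypothetical known-doublet flags

# Split first, as the diagnosis above assumes happens internally.
dbl  <- counts[, known, drop = FALSE]
rest <- counts[, !known, drop = FALSE]

# Aggregate only the non-doublet subset: rows are summed into meta-features.
groups   <- rep(c("agg1", "agg2"), each = 3)  # hypothetical feature grouping
rest_agg <- rowsum(rest, groups)

# The doublet subset still carries the original peak names,
# so subsetting it by the aggregated names fails.
sel_features <- rownames(rest_agg)
res <- tryCatch(dbl[sel_features, , drop = FALSE],
                error = function(e) conditionMessage(e))

# Applying the same aggregation to both subsets keeps the row names aligned.
dbl_agg <- rowsum(dbl, groups)
```

In this sketch, `res` ends up holding an error message rather than a matrix, whereas `dbl_agg` and `rest_agg` share the same row names. If the diagnosis is right, aggregating before the split (or aggregating both subsets identically) would avoid the error.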

MRE -- Minimal example to reproduce the bug

scDblFinder(
  sce = sce,
  dims = 50,
  aggregateFeatures = TRUE,
  knownDoublets = (sce$ident == doublet_sample), 
  knownUse = "discard"
)

Traceback

6: stop(sprintf(fmt, msg))
5: SummarizedExperiment:::.SummarizedExperiment.charbound(subset, 
       names, fmt)
4: .convert_subset_index(i, rownames(x))
3: sce.dbl[sel_features, ]
2: sce.dbl[sel_features, ]
1: scDblFinder::scDblFinder(sce = sce, dims = 50, aggregateFeatures = TRUE, 
       knownDoublets = sce$ident == doublet_sample, knownUse = knownUse)

Session info

R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 11 (bullseye)

Matrix products: default

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] qs_0.25.5                          furrr_0.3.1                        future_1.32.0                     
 [4] intrinsicDimension_1.2.0           yaImpute_1.0-33                    glmGamPoi_1.12.1                  
 [7] Palo_1.1                           here_1.0.1                         ComplexHeatmap_2.16.0             
[10] pheatmap_1.0.12                    ggpp_0.5.2                         BSgenome.Mmusculus.UCSC.mm10_1.4.3
[13] BSgenome_1.67.4                    rtracklayer_1.59.1                 Biostrings_2.67.2                 
[16] XVector_0.39.0                     tarchetypes_0.7.6                  scuttle_1.10.1                    
[19] Signac_1.10.0                      scDblFinder_1.14.0                 SingleCellExperiment_1.22.0       
[22] SummarizedExperiment_1.29.1        Biobase_2.59.0                     GenomicRanges_1.51.4              
[25] GenomeInfoDb_1.35.17               IRanges_2.33.1                     S4Vectors_0.38.1                  
[28] BiocGenerics_0.45.3                MatrixGenerics_1.12.2              matrixStats_1.0.0                 
[31] targets_1.1.3                      SeuratObject_4.1.3                 Seurat_4.3.0                      
[34] lubridate_1.9.2                    forcats_1.0.0                      stringr_1.5.0                     
[37] dplyr_1.1.2                        purrr_1.0.1                        readr_2.1.4                       
[40] tidyr_1.3.0                        tibble_3.2.1                       ggplot2_3.4.2                     
[43] tidyverse_2.0.0                   
