Clear Sky Science

Cell neighborhood topology directs rare cell population identification

Why tiny cell groups matter

Our bodies are built from vast crowds of cells, but some of the most important players are rare cell types that make up only a tiny fraction of the whole. These scarce cells can drive cancer spread, shape brain disease, or coordinate immune responses, yet they are notoriously hard to spot in modern single-cell and spatial maps of tissues. This study introduces RareQ, a new computational approach designed to reliably find these hidden cell groups, even in enormous and complex datasets.

Figure 1. How network patterns reveal tiny but important cell groups hidden in massive tissue datasets

Looking for patterns in cell neighborhoods

Traditional methods usually group cells by looking at what genes they express and then carving the data into clusters. This works well for common cell types but often swallows up rare ones into larger groups or treats them as noise. RareQ takes a different route. Instead of focusing only on gene levels, it examines how each cell sits inside a network of its closest neighbors. If a cell and its nearby neighbors are very tightly connected to each other but only weakly linked to the rest of the network, RareQ marks this neighborhood as special. This idea is captured in a measure called Q, which reflects how “cliquish” a cell’s local neighborhood is.
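To make the idea of a "cliquish" neighborhood concrete, here is a minimal Python sketch. This summary does not give the published definition of Q, so the function name `local_cohesion`, the toy graph, and the formula itself (the share of connections from a cell's neighborhood that stay inside that neighborhood) are assumptions chosen for illustration, not RareQ's actual measure.

```python
def local_cohesion(neighbors, cell):
    """Illustrative stand-in for a Q-like score (an assumption, not RareQ's formula).

    Counts every connection leaving a member of the cell's closed
    neighborhood (the cell plus its direct neighbors) and returns the
    fraction that lands back inside that same neighborhood.
    """
    hood = {cell} | neighbors[cell]
    inside = 0
    total = 0
    for member in hood:
        for other in neighbors[member]:
            total += 1
            if other in hood:
                inside += 1
    return inside / total if total else 0.0


# Hypothetical toy graph: cells 0, 1, 2 form a tight triangle,
# and cell 2 also links to a loose chain 3-4-5-6.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)

scores = {cell: local_cohesion(neighbors, cell) for cell in sorted(neighbors)}
for cell, value in scores.items():
    print(cell, round(value, 3))
```

In this toy graph, a cell inside the triangle (cell 0) scores higher than a cell in the middle of the chain (cell 4), which matches the intuition that tightly knit, weakly attached neighborhoods stand out.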

Turning cliques into meaningful cell groups

Using the Q score, RareQ builds cell groups step by step. Every cell starts in its own tiny cluster, then gradually adopts the label of a higher-Q neighbor in its local neighborhood. In simple terms, tightly knit cell cliques act as anchors that pull in nearby similar cells. After this label spreading, the method checks which resulting clusters are strongly connected internally and only weakly tied to others. Clusters with high average Q are treated as candidate rare populations, while weaker clusters are merged to form the major cell types. This process allows RareQ to separate both large and small cell communities without prior knowledge of how many types are present.
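The label-spreading step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `propagate_labels`, the tie-breaking rule, the toy graph, and the score values are all hypothetical, and the later step of merging weaker clusters is left out.

```python
def propagate_labels(neighbors, score, max_rounds=100):
    """Simplified label spreading (an assumption, not RareQ's exact procedure).

    Every cell starts with its own label. In each round a cell takes the
    label of the highest-scoring member of its closed neighborhood (itself
    included). Rounds repeat until no label changes.
    """
    label = {cell: cell for cell in neighbors}
    for _ in range(max_rounds):
        changed = False
        for cell in sorted(neighbors):
            hood = neighbors[cell] | {cell}
            # Ties are broken in favor of the smaller cell id (arbitrary choice).
            anchor = max(hood, key=lambda c: (score[c], -c))
            if label[anchor] != label[cell]:
                label[cell] = label[anchor]
                changed = True
        if not changed:
            break
    return label


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
neighbors = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
# Made-up Q-like scores, for illustration only.
score = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.6, 4: 0.85, 5: 0.75}

labels = propagate_labels(neighbors, score)

# Average score per resulting cluster, the quantity the text uses to
# flag candidate rare populations.
members = {}
for cell, lab in labels.items():
    members.setdefault(lab, []).append(cell)
mean_score = {lab: sum(score[c] for c in cells) / len(cells)
              for lab, cells in members.items()}
print(labels)
print(mean_score)
```

Here the two high-scoring cells (0 and 4) act as anchors, and each triangle ends up as one cluster. Any cutoff on the average score for calling a cluster "rare" would be a further assumption, so none is applied.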

Testing RareQ across many tissues

The authors benchmarked RareQ on hundreds of simulated and real datasets, including blood cells, airway lining cells, brain tissue, tumors, and spatial maps. Across these tests, RareQ detected rare cell types more accurately than seven leading methods, often finding many more rare populations with higher precision and recall. It also ran faster and used less memory, successfully analyzing datasets with over a million cells where some competing tools failed. In multi-modal data that combine gene activity with chromatin accessibility or surface proteins, RareQ could be applied either to each layer separately or to integrated views, often revealing rare cell types that were invisible to methods relying on a single data type or on computationally heavy deep learning models.

Figure 2. How tightly connected cell neighborhoods are used stepwise to isolate rare cell populations from common ones

Revealing hidden players in health and disease

By applying RareQ to specific biological problems, the study shows how these hidden cell groups can reshape our understanding of tissues. In the airway, RareQ recovered known rare epithelial cells and uncovered new subsets linked to cell division, cilia formation, and antiviral responses. In B cell lymphoma and kidney cancer, it pinpointed rare immune cell and tumor-associated groups, including dendritic cells with high checkpoint activity that may influence response to immunotherapy. In Alzheimer’s disease brain samples, RareQ highlighted rare microglia and astrocyte states enriched in patients, with gene patterns tied to inflammation, debris clearance, and handling of amyloid proteins. In high-resolution spatial datasets from mouse brain, it detected small, anatomically precise populations, such as distinct neuron subregions in the hippocampus and specialized ciliated epithelial cells in the choroid plexus, which other methods had missed or blurred together.

What this means for future cell maps

The key message is that RareQ offers a more sensitive and efficient way to find the rare cell types that often act as control knobs for disease and tissue function. By focusing on how cells are wired to their neighbors rather than only on raw gene levels, RareQ can pull out small but coherent cell groups from massive data collections. This makes future cell atlases more complete and opens the door to studying elusive cell states that could become targets for diagnostics or therapy.

Citation: Fa, B., Huang, C., Ma, Y. et al. Cell neighborhood topology directs rare cell population identification. Nat Commun 17, 4618 (2026). https://doi.org/10.1038/s41467-026-71180-x

Keywords: rare cells, single-cell analysis, spatial transcriptomics, cell networks, computational biology