Clear Sky Science
Community detection framework based on 3D shape descriptors for tree species classification in point cloud data
Why Sorting Trees from the Sky Matters
Forests cool cities, stabilise hillsides, store carbon, and shelter wildlife, but not all trees play the same role. Knowing which species grow where usually means time‑consuming fieldwork. This paper introduces a way to recognise tree species automatically from 3D measurements collected by laser scanners mounted on aircraft, drones, or ground‑based platforms. Instead of relying on large, hand‑labelled training sets for machine‑learning models, the authors show how trees can sort themselves into meaningful groups based purely on their three‑dimensional shape.
Looking at Forests in 3D
Modern laser scanning, often called LiDAR, captures forests as dense “point clouds”: millions of tiny dots that trace the outlines of trunks and crowns. These data can reveal tree height, crown width, and overall form. The challenge is that trees of the same species can look quite different depending on age, competition, wind, and light, while different species may sometimes look confusingly alike. Most current methods attack this problem with supervised machine learning, which demands extensive, carefully prepared training data tailored to each region and sensor, and can struggle with rare or unusual trees.

Letting Trees Form Their Own Groups
The authors propose a different strategy: instead of directly predicting species from raw data, they first let the trees “form communities” based on how similar their shapes are. Each individual tree is represented by a compact description of its 3D crown. To build this description, they analyse how the points spread along three main directions, then convert the cloud into a grid of small 3D blocks. By rotating the tree around its vertical axis and comparing the original and rotated versions slice by slice, they capture both how symmetric the crown is and how wide and dense it is at different heights. These measurements are smoothed into a handful of curve coefficients, producing a short feature vector that is stable even when the point cloud is noisy, sparse, or rotated.
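The slice-by-slice comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 m cell size, the 180-degree rotation angle, and the use of an overlap ratio as the symmetry score are all assumptions made for illustration, and the final curve-fitting step that compresses the profile into a few coefficients is left out.

```python
import math
from collections import defaultdict

def voxelise(points, cell):
    """Map (x, y, z) points to a set of integer voxel indices."""
    return {(math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
            for x, y, z in points}

def rotate_about_vertical(points, angle_deg):
    """Rotate points around the vertical axis through their horizontal centroid."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    a = math.radians(angle_deg)
    ca, sa = math.cos(a), math.sin(a)
    return [(cx + (x - cx) * ca - (y - cy) * sa,
             cy + (x - cx) * sa + (y - cy) * ca,
             z) for x, y, z in points]

def slice_profile(points, cell=0.5, angle_deg=180.0):
    """Return {slice_index: (symmetry, occupancy)} for each height slice.

    symmetry  = overlap ratio between original and rotated occupied cells
    occupancy = number of occupied cells in the original slice
    (both are illustrative stand-ins for the paper's measurements)
    """
    original = defaultdict(set)
    for i, j, k in voxelise(points, cell):
        original[k].add((i, j))
    rotated = defaultdict(set)
    for i, j, k in voxelise(rotate_about_vertical(points, angle_deg), cell):
        rotated[k].add((i, j))
    profile = {}
    for k in sorted(original):
        a, b = original[k], rotated[k]
        profile[k] = (len(a & b) / len(a | b), len(a))
    return profile
```

Calling slice_profile on one tree's points gives a per-height profile: a crown that looks the same after rotation scores close to 1 in every slice, while a lopsided crown scores lower. Fitting a smooth curve to these values would then yield the short feature vector the article refers to.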
From Shape Fingerprints to Tree Communities
Once every tree has a shape fingerprint, the method compares all trees with one another. Pairs that have very similar features are linked with strong connections; dissimilar pairs receive weak ones. This creates a network in which each tree is a node and the strength of the links reflects how alike the crowns are. A community detection algorithm—originally developed to find tightly knit groups in social networks—searches this graph for clusters of trees that are more strongly connected to each other than to the rest. Each such community tends to contain mostly trees of the same species, though it may also isolate unusual individuals or small groups with atypical forms.
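As a rough illustration of this step, the sketch below builds a weighted similarity graph from feature vectors and groups the nodes with a simple label-propagation scheme. This summary does not name the community detection algorithm the authors used, so label propagation is a swapped-in stand-in here; the Gaussian similarity, the sigma and threshold parameters, and all function names are likewise assumptions.

```python
import math

def similarity_graph(features, sigma=1.0, threshold=0.1):
    """Build a weighted adjacency dict from feature vectors.

    Link weight = exp(-d^2 / (2 * sigma^2)), where d is the Euclidean
    distance; links weaker than `threshold` are dropped.
    """
    n = len(features)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            if w >= threshold:
                graph[i][j] = w
                graph[j][i] = w
    return graph

def label_propagation(graph, max_rounds=100):
    """Group nodes by weighted label propagation (illustrative stand-in).

    Every node starts in its own community, then repeatedly adopts the
    label carrying the largest total link weight among its neighbours.
    A node keeps its current label when that label is already among the
    strongest; other ties go to the smallest label.
    """
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            totals = {}
            for nb, w in graph[node].items():
                totals[labels[nb]] = totals.get(labels[nb], 0.0) + w
            if not totals:
                continue  # isolated node stays in its own community
            best_weight = max(totals.values())
            if totals.get(labels[node], 0.0) < best_weight:
                labels[node] = min(lab for lab, t in totals.items()
                                   if t == best_weight)
                changed = True
        if not changed:
            break
    return labels
```

Trees that end up with the same label form one community. An isolated node, meaning a tree whose links all fall below the threshold, keeps its own label, which loosely mirrors how the framework can single out unusual individuals.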

Testing the Method in Synthetic and Real Forests
To see how well this works, the authors applied their framework to two public datasets. The first is a synthetic collection of 100 trees from 10 species, generated by a growth simulator. Here, the method perfectly recovered the true species groups: every simulated species formed its own tight community. The second dataset consists of nearly 700 real trees from seven species scanned in German and U.S. forests. In this noisier, more varied setting, communities still broadly matched species, but some species with similar shapes merged into mixed groups and some species split into several shape‑based communities. Importantly, the framework remained robust when the point clouds were thinned, rotated, or processed with slightly different grid sizes, and it outperformed standard clustering techniques like k‑means and hierarchical clustering on the same features.
Helping People Label Less and Learn More
The final step is to turn communities into species labels. Instead of hand‑labelling hundreds of individual trees, a user only needs to identify a few trees in each community. The majority label is then assigned to the rest of the group. On the real dataset, this semi‑automatic approach reached around 60% overall accuracy, comparable to some deep‑learning methods that require far more training data and tuning. When the same shape features were given to a standard support‑vector machine classifier with sufficient training examples, accuracy rose to about 80%, showing that the features themselves capture species‑relevant information effectively.
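The majority-vote labelling can be sketched in a few lines. The function name, the dictionary layout, and the species names are hypothetical, chosen only to illustrate the idea; communities in which no tree has been identified are left unlabelled.

```python
from collections import Counter

def assign_species(communities, known_labels):
    """Spread a few hand-assigned labels across each community.

    communities:  {tree_id: community_id} for every tree
    known_labels: {tree_id: species} for the few trees a user identified
    Returns {tree_id: species or None}; hand-labelled trees keep their label,
    the rest receive the majority label of their community.
    """
    votes = {}
    for tree, species in known_labels.items():
        votes.setdefault(communities[tree], Counter())[species] += 1
    majority = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return {tree: known_labels.get(tree, majority.get(c))
            for tree, c in communities.items()}

# Illustrative data: two labelled communities and one with no labels at all.
communities = {"t1": 0, "t2": 0, "t3": 0, "t4": 0, "t5": 1, "t6": 1, "t7": 2}
known = {"t1": "oak", "t2": "oak", "t3": "beech", "t5": "pine"}
result = assign_species(communities, known)
```

In this toy case the unlabelled tree t4 inherits the majority label of community 0, t6 inherits the only label seen in community 1, and t7 stays unlabelled until someone identifies a tree in community 2.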
What This Means for Forest Monitoring
For a non‑specialist, the key idea is that tree species can be inferred from 3D shape alone by letting trees “find their own neighbours” in a network of similarities. This community‑based approach does not replace advanced machine learning, but it can sharply reduce the manual labour needed to prepare training sets and highlight outliers such as dead or highly unusual trees. As more LiDAR data become available worldwide, such interpretable, training‑light methods could speed up the creation of detailed tree inventories, supporting better forest management, climate modelling, and conservation planning.
Citation: Kohek, Š., Žalik, B., Mongus, D. et al. Community detection framework based on 3D shape descriptors for tree species classification in point cloud data. Sci Rep 16, 12091 (2026). https://doi.org/10.1038/s41598-026-42392-4
Keywords: LiDAR forests, tree species classification, 3D point clouds, community detection, remote sensing ecology