Clear Sky Science
Unsupervised multimodal deep learning for galaxy morphology taxonomy: integrating ConvNeXt embeddings and morphological parameters for scalable survey science
Teaching Computers to Read the Shapes of Galaxies
Modern sky surveys are photographing billions of galaxies, far more than any team of astronomers—or citizen scientists—could ever classify by eye. Yet the shapes of galaxies, from smooth ellipses to grand spirals and chaotic mergers, hold vital clues to how the universe builds its structures. This paper introduces a new way for computers to sort galaxies automatically, without being told in advance what to look for, opening the door to exploring cosmic structure at truly massive scale.

Why Galaxy Shapes Matter
Galaxies are not just pretty pictures; their appearance encodes their life stories. Smooth, round systems tend to be older and quieter, while galaxies with prominent spiral arms or distorted shapes often signal ongoing star birth or recent collisions. For a century, astronomers have organized these forms into families (such as ellipticals, spirals, and irregulars) to connect visible structure with underlying physics. But as projects like the Sloan Digital Sky Survey and upcoming efforts like the Rubin Observatory’s Legacy Survey of Space and Time image the sky in unprecedented depth, labeling every galaxy by hand is no longer feasible.
From Human Labels to Unsupervised Discovery
Most recent advances in automatic galaxy classification rely on supervised deep learning: computers learn from thousands of examples that humans have already labeled. This works well, but depends on painstakingly created training sets and is limited to the categories people define ahead of time. The authors instead pursue an unsupervised route, asking the algorithm to discover natural groupings in the data on its own. To do this, they use powerful image-analysis networks originally trained on everyday photographs, then adapt them to galaxy images to extract rich visual fingerprints, all without needing any galaxy to carry a preassigned label.
Blending Pictures with Physical Measurements
Galaxy images contain immense detail, but astronomers also use simple numeric descriptors of structure, such as how centrally concentrated the light is, how lopsided the galaxy appears, how clumpy its star-forming regions are, and how unevenly light is spread across its pixels. The team combines both worlds: deep visual features from two modern neural networks and five classic structural measures. Because the image-based description runs to thousands of numbers while the physical measures are just a handful, simply stacking the two together would let the images drown out the structural measures. To avoid this, they build a special "multimodal autoencoder," a type of neural network that compresses all the information into a compact internal code. This 64-number code forces the system to balance what it learns from the images with what is known from basic galaxy physics.
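To make the balancing idea concrete, here is a minimal sketch in Python. It is not the authors' network. It swaps the nonlinear multimodal autoencoder for a linear stand-in: each modality is standardized and rescaled so that both carry equal total variance, and a truncated SVD (the best linear encoder and decoder pair in the least-squares sense) produces a 64-number code. The function names, the 1536-wide embedding, and the random input data are all assumptions made for illustration.

```python
import numpy as np

def zscore(block):
    """Standardize each column to zero mean and (near) unit variance."""
    return (block - block.mean(axis=0)) / (block.std(axis=0) + 1e-8)

def balance_and_compress(image_feats, morph_params, code_dim=64):
    """Equalize the total variance of both modalities, then compress linearly."""
    # Dividing by sqrt(width) gives each block a total variance of about 1,
    # so thousands of image features cannot swamp five structural measures.
    img = zscore(image_feats) / np.sqrt(image_feats.shape[1])
    mor = zscore(morph_params) / np.sqrt(morph_params.shape[1])
    fused = np.hstack([img, mor])

    # Truncated SVD as a linear stand-in for the autoencoder (assumption).
    _, _, vt = np.linalg.svd(fused, full_matrices=False)
    encoder = vt[:code_dim].T      # shape: (n_features, code_dim)
    code = fused @ encoder         # compact code, one row per galaxy
    recon = code @ encoder.T       # linear reconstruction of the fused input
    return code, recon, fused

# Hypothetical inputs: 300 galaxies, 1536 image features, 5 structural measures.
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(300, 1536))
morph_params = rng.normal(size=(300, 5))

code, recon, fused = balance_and_compress(image_feats, morph_params)
img_var = fused[:, :1536].var(axis=0).sum()
mor_var = fused[:, 1536:].var(axis=0).sum()
print(code.shape, round(float(img_var), 3), round(float(mor_var), 3))
```

A real autoencoder would learn a nonlinear encoder and could weight the two reconstruction losses separately, but the core design choice is the same: neither modality should dominate the shared code simply because it has more numbers.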
Letting the Data Fall into Natural Families
Once each of the 4,950 carefully cleaned Sloan survey galaxies is reduced to this balanced, 64-dimensional code, the authors apply a probabilistic clustering technique that treats the galaxy population as a smooth mixture of overlapping groups. Instead of forcing sharp boundaries, it assigns each galaxy a degree of membership in several clusters and flags only the most extreme 2 percent as genuine oddities or artifacts. The resulting main clusters line up well with familiar families: smooth, compact systems resembling early-type galaxies; diffuse, clumpy disks akin to late-type spirals; interacting and disturbed systems; and intermediate, transitional disks. Internal tests show that this combined-image-and-physics representation produces cleaner, more coherent groups than using images or structural numbers alone.
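The soft-membership idea can be sketched in a few lines. The example below assumes the probabilistic technique behaves like a Gaussian mixture model, and it assumes the "most extreme 2 percent" are the galaxies with the lowest likelihood under the fitted mixture; the article confirms neither detail. The diagonal-covariance EM fit, the function names, and the synthetic 64-dimensional codes are illustrative assumptions.

```python
import numpy as np

def e_step(X, weights, means, variances):
    """Return soft memberships and the per-sample mixture log-likelihood."""
    diff = X[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (
        (diff ** 2 / variances[None, :, :]).sum(axis=2)
        + np.log(2.0 * np.pi * variances).sum(axis=1)[None, :]
    )
    log_joint = log_pdf + np.log(weights)[None, :]
    peak = log_joint.max(axis=1, keepdims=True)   # log-sum-exp for stability
    log_like = peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))
    memberships = np.exp(log_joint - log_like[:, None])
    return memberships, log_like

def fit_diag_gmm(X, n_components, n_iter=50, seed=0):
    """Fit a diagonal-covariance Gaussian mixture with plain EM."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, size=n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        resp, _ = e_step(X, weights, means, variances)
        mass = resp.sum(axis=0) + 1e-12           # soft count per component
        weights = mass / mass.sum()
        means = (resp.T @ X) / mass[:, None]
        second_moment = (resp.T @ X ** 2) / mass[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances

# Synthetic stand-in for the 64-number codes: four overlapping groups.
rng = np.random.default_rng(1)
centers = rng.normal(scale=3.0, size=(4, 64))
codes = np.vstack([c + rng.normal(size=(250, 64)) for c in centers])

weights, means, variances = fit_diag_gmm(codes, n_components=4)
memberships, log_like = e_step(codes, weights, means, variances)

# Flag the 2 percent of galaxies the mixture explains worst (assumed criterion).
threshold = np.quantile(log_like, 0.02)
is_outlier = log_like < threshold
hard_label = memberships.argmax(axis=1)
print(memberships.shape, int(is_outlier.sum()))
```

Each row of `memberships` sums to one, so a transitional galaxy can belong partly to two families instead of being forced into a single box.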

Checking Against Classic Rules and Scaling Up
To see whether the computer’s unsupervised groupings make physical sense, the authors compare them against long-used rule-of-thumb boundaries based on simple structure diagrams. Even though the algorithm never saw any human-made labels, about half of its classifications align with these traditional categories, and the rest reveal subtler variations that the older, two-parameter rules blur together. Just as importantly, the whole pipeline runs quickly: each galaxy can be processed in just a few tens of milliseconds on modern hardware, a pace suitable for petabyte-scale surveys that will soon catalog billions of galaxies.
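One simple way to put a number on this kind of agreement is a majority-vote comparison: map each unsupervised cluster to the rule-based class most common among its members, then count how many galaxies match that class. The sketch below illustrates the idea only. The article does not say how the authors computed their figure, and the function name, cluster ids, and class names are hypothetical.

```python
import numpy as np

def majority_agreement(cluster_ids, rule_labels):
    """Map each cluster to its most common rule-based class and score the match."""
    cluster_ids = np.asarray(cluster_ids)
    rule_labels = np.asarray(rule_labels)
    mapping = {}
    matched = 0
    for cid in np.unique(cluster_ids):
        members = rule_labels[cluster_ids == cid]
        names, counts = np.unique(members, return_counts=True)
        mapping[int(cid)] = str(names[counts.argmax()])
        matched += int(counts.max())
    return mapping, matched / len(cluster_ids)

# Toy example: eight galaxies, three clusters, hypothetical rule-based classes.
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2]
rule_labels = ["early", "early", "late", "late", "late", "merger", "merger", "merger"]

mapping, agreement = majority_agreement(cluster_ids, rule_labels)
print(mapping)    # {0: 'early', 1: 'late', 2: 'merger'}
print(agreement)  # 0.75
```

Galaxies that disagree with their cluster's majority class are not necessarily errors. They are the cases where the learned representation and the two-parameter rule draw the boundary differently.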
A New Map of the Galaxy Zoo
In everyday terms, this work shows how to teach a computer to "see" and group galaxies in a way that respects both what astronomers already know and what the data may still be hiding. By blending visual patterns with simple physical measurements and by allowing for gradual transitions rather than rigid boxes, the method builds a flexible, scalable galaxy taxonomy. This approach should help scientists sift through the upcoming flood of sky images, spot rare or unusual systems, and refine our picture of how galaxies form, interact, and transform over cosmic time.
Citation: Selim, I.M., Farahat, A.S., Basmsm, L.H. et al. Unsupervised multimodal deep learning for galaxy morphology taxonomy: integrating ConvNeXt embeddings and morphological parameters for scalable survey science. Sci Rep 16, 12183 (2026). https://doi.org/10.1038/s41598-026-45369-5
Keywords: galaxy morphology, unsupervised learning, deep learning, astronomical surveys, clustering