Clear Sky Science
Enhanced skin cancer classification for minority classes using Conditional GAN pipeline and CNN-ViT ensemble
Why smarter skin checks matter
Skin cancer is one of the most common cancers worldwide, and catching it early can mean the difference between a simple outpatient procedure and a life‑threatening disease. Dermatologists rely heavily on visual clues in magnified skin images, but even experts struggle when some cancer types are rare and images are cluttered with distracting details like hair. This study introduces an artificial intelligence (AI) system designed to handle those tricky, underrepresented cases more fairly, aiming to support more reliable skin‑cancer screening for everyone.
Uneven data and hard‑to‑spot cancers
Dermatology images pose several hurdles for computers and clinicians alike. Many photos show hairs, shadows, and other artifacts that obscure the lesion itself. Some skin cancers are very common, while others are rare, so the main public datasets contain thousands of examples of benign moles but only a few hundred images of less frequent yet clinically important cancers. Standard deep‑learning models trained on such imbalanced collections tend to become very good at recognizing the majority classes while quietly failing on the rare ones—precisely the cases where automated help would be most valuable. In addition, conventional image networks often focus on small patches and may miss global clues such as overall asymmetry or irregular borders that matter for diagnosis.
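To make the imbalance concrete, the short sketch below uses hypothetical class counts (illustrative numbers, not the figures from the study's dataset). It computes each class's share of the data, the ratio between the largest and smallest class, and how many extra images each class would need to match the largest one. Matching the largest class is only one possible balancing target and is assumed here for illustration.

```python
# Hypothetical class counts for illustration only; not taken from the study.
class_counts = {
    "benign_mole": 6000,
    "melanoma": 1000,
    "rare_cancer_a": 300,
    "rare_cancer_b": 100,
}

def imbalance_report(counts):
    """Return per-class share, largest/smallest ratio, and shortfall to the largest class."""
    total = sum(counts.values())
    largest = max(counts.values())
    smallest = min(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    shortfall = {name: largest - n for name, n in counts.items()}
    return shares, largest / smallest, shortfall

shares, ratio, shortfall = imbalance_report(class_counts)
for name in class_counts:
    print(f"{name:15s} share={shares[name]:.1%} extra_needed={shortfall[name]}")
print(f"largest/smallest ratio: {ratio:.0f}x")
```

With these made-up counts the rarest class would need 5,900 additional images to reach parity, which gives a sense of why simple flips and rotations of a hundred originals are unlikely to be enough.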

Making more of the rare cases
To tackle the imbalance, the authors first build an intelligent image generator that learns from real patient data and then fabricates new, realistic‑looking examples of the rare lesion types. Instead of simply flipping or rotating existing pictures, their system, a conditional generative adversarial network, uses "attention maps" derived from a standard classifier to see which parts of each lesion influenced earlier decisions the most. These attention maps highlight medically meaningful zones. The generator then creates new images that preserve the critical structures inside those zones while adding controlled variation in color, texture, and shape. In effect, the AI learns to imagine many plausible versions of each rare cancer, enriching the training pool without copying images outright.
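One way to picture "preserve the attended zones, vary the rest" is a reconstruction penalty weighted by the attention map, so that changes inside highlighted regions cost more than changes elsewhere. The NumPy sketch below is an assumption made for illustration, not the loss the authors report using; the function name `attention_weighted_l1` and the toy arrays are hypothetical.

```python
import numpy as np

def attention_weighted_l1(real, generated, attention, eps=1e-8):
    """Mean absolute pixel difference, weighted by a normalized attention map.

    real, generated: float arrays of shape (H, W, C) with values in [0, 1].
    attention: float array of shape (H, W); larger means more important.
    """
    # Normalize the map so the weights sum to (almost exactly) 1 over all pixels.
    weights = attention / (attention.sum() + eps)
    # Average the absolute difference over colour channels, per pixel.
    per_pixel = np.abs(real - generated).mean(axis=-1)
    return float((weights * per_pixel).sum())

# Toy 4x4 RGB "lesion" with attention concentrated on the central 2x2 block.
rng = np.random.default_rng(0)
real = rng.random((4, 4, 3))
attention = np.zeros((4, 4))
attention[1:3, 1:3] = 1.0

# Variation outside the attended zone: alter only the top row of pixels.
outside = real.copy()
outside[0, :, :] = 1.0 - outside[0, :, :]

# Variation inside the attended zone: alter the central pixels.
inside = real.copy()
inside[1:3, 1:3, :] = 1.0 - inside[1:3, 1:3, :]

loss_outside = attention_weighted_l1(real, outside, attention)
loss_inside = attention_weighted_l1(real, inside, attention)
print(f"outside={loss_outside:.4f} inside={loss_inside:.4f}")
```

Under this weighting, edits confined to low-attention pixels are free while edits to the attended core are penalized, which is the behaviour the paragraph above describes in words.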
Two ways of seeing: local details and global patterns
Once the dataset is rebalanced with these targeted synthetic images, a second AI module takes over to perform the actual diagnosis. Here, the researchers combine two complementary types of models. A convolutional neural network (ResNet50V2) excels at capturing fine‑grained local cues—tiny streaks of pigment, subtle texture, and edge sharpness. Alongside it, a vision transformer model (DeiT) treats each image as a grid of patches and learns how distant regions relate to one another, picking up on whole‑lesion properties like symmetry, spread, and border shape. Instead of waiting until the end to merge their opinions, the team fuses their internal representations through an attention‑based fusion module that lets global context emphasize, or down‑weight, specific local features on the fly.
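The idea of global context turning local features up or down can be sketched as a simple gate: a vector summarizing the whole lesion produces one weight between 0 and 1 per local feature channel. This is a deliberately simplified stand-in, assumed for illustration; the study's actual fusion module is not specified at this level of detail here, and the names `gated_fusion`, `W`, and `b` are hypothetical, with random numbers in place of learned weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(local_feats, global_feats, W, b):
    """Scale each local (CNN-style) feature by a gate computed from the global vector.

    local_feats: shape (C,), global_feats: shape (D,), W: shape (C, D), b: shape (C,).
    """
    # One gate per local channel, each strictly between 0 and 1.
    gate = sigmoid(W @ global_feats + b)
    # Re-weighted local features, followed by the global vector itself.
    fused = np.concatenate([gate * local_feats, global_feats])
    return fused, gate

rng = np.random.default_rng(0)
C, D = 8, 4
local_feats = rng.normal(size=C)    # stand-in for CNN features
global_feats = rng.normal(size=D)   # stand-in for transformer features
W = rng.normal(size=(C, D))         # random stand-in for learned weights
b = np.zeros(C)

fused, gate = gated_fusion(local_feats, global_feats, W, b)
print(fused.shape, gate.round(2))
```

A gate near 1 passes a local feature through almost unchanged, while a gate near 0 suppresses it, so the same pigment streak can count for more or less depending on what the rest of the lesion looks like.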
Putting the system to the test
The pipeline is evaluated on HAM10000, a widely used collection of more than ten thousand dermoscopic images spanning seven lesion types, from harmless pigment spots to melanoma and several less common skin cancers. After hair removal and careful train‑test splitting, the new synthetic images are mixed with the cleaned originals to form a balanced training set. The combined model’s performance is assessed using per‑class precision, recall, and F1‑scores, as well as receiver operating characteristic (ROC) and precision‑recall curves. Crucially, the authors focus on whether rare classes such as dermatofibroma, vascular lesions, basal cell carcinoma, and actinic keratosis are recognized as reliably as the abundant types, rather than relying on a single overall accuracy number that could hide weaknesses.
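The difference between overall accuracy and per-class scores can be shown with a toy example. The labels and predictions below are invented for illustration and do not come from the study; the helper `per_class_metrics` is a hypothetical name that applies the standard one-vs-rest definitions of precision, recall, and F1.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics

# Toy test set: 90 common lesions and 10 rare ones.
# The imagined model gets every common case right but misses 8 of the 10 rare cases.
y_true = ["common"] * 90 + ["rare"] * 10
y_pred = ["common"] * 90 + ["common"] * 8 + ["rare"] * 2

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
metrics = per_class_metrics(y_true, y_pred, ["common", "rare"])
macro_f1 = sum(f1 for _, _, f1 in metrics.values()) / len(metrics)

print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")
for label, (p, r, f1) in metrics.items():
    print(f"{label:7s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case overall accuracy is 92% even though only 2 of the 10 rare lesions are caught (recall of 20%), which is exactly the kind of weakness a single accuracy figure can hide.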

More balanced answers from AI
The resulting system delivers high and, more importantly, well‑balanced accuracy across all seven classes. It achieves near‑perfect performance for several minority cancers and competitive scores for the remaining types, while also showing low statistical uncertainty when tested through extensive bootstrap resampling. This suggests that the gains are not a fluke of oversampling but stem from the synergy between attention‑guided image generation and the dual‑view classifier. For a layperson, the key message is that smarter AI design—not just bigger models—can help ensure that automated skin‑cancer tools do not overlook rare but dangerous conditions. While further testing on broader, multi‑center datasets is still needed before clinical deployment, the work points toward AI assistants that treat every skin lesion, common or rare, with equal scrutiny.
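Bootstrap resampling estimates how much a score would wobble if the test set had been drawn slightly differently. The sketch below is a generic percentile bootstrap on invented data, assumed for illustration; the study's exact resampling settings are not given in this summary, and `bootstrap_ci` and its parameters are hypothetical names.

```python
import random

def bootstrap_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy.

    correct: list of 0/1 flags, one per test image (1 = predicted correctly).
    """
    rng = random.Random(seed)
    n = len(correct)
    stats = []
    for _ in range(n_resamples):
        # Draw n test images with replacement and recompute accuracy.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Invented outcome: 180 of 200 test images classified correctly.
correct = [1] * 180 + [0] * 20
point = sum(correct) / len(correct)
lo, hi = bootstrap_ci(correct)
print(f"accuracy={point:.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

A narrow interval indicates that the score does not depend heavily on which particular images happened to land in the test set; the same procedure can be applied to per-class recall or F1 instead of accuracy.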
Citation: Hussain, S.R., Saritha, S., Chevuri, A. et al. Enhanced skin cancer classification for minority classes using Conditional GAN pipeline and CNN-ViT ensemble. Sci Rep 16, 13114 (2026). https://doi.org/10.1038/s41598-026-43339-5
Keywords: skin cancer detection, medical image AI, class imbalance, synthetic medical images, vision transformers