Clear Sky Science

Application of LSTM-CNN in skiing action recognition under artificial intelligence technology

2026-03-02

Smarter Coaching on the Slopes

Skiers, coaches, and fans increasingly turn to video to understand what happens in a split-second carve or jump. Yet in the real world, snow spray, trees, changing light, and crowded slopes make it hard for computers to reliably recognize what a skier is doing. The study presents a new artificial-intelligence system that can automatically read skiing technique from ordinary videos with high accuracy, even in messy outdoor conditions. Such technology could one day power real-time coaching tools, safer training, and more insightful performance analysis for winter sports.

Why Teaching Computers to See Skiing Is Hard

Skiing is a demanding sport to analyze because the movements are fast, three-dimensional, and often partly hidden by bulky clothing or the skier’s own body. At the same time, outdoor scenes are full of distractions: trees, snow mounds, strong reflections, and variable weather. Previous video-based systems either focused too much on static appearance in single frames or failed to properly track how movements unfold over time. As a result, they tended to confuse similar actions, struggled in poor visibility, and were not robust when new athletes or new slope conditions appeared.

A Two-Eyed View of Skiing Motion

The authors design a model that watches ski videos in two complementary ways at once. One “eye” looks at regular color frames, capturing what the skier and surroundings look like. The other “eye” focuses on motion by tracing how pixels shift from one frame to the next, a technique known as optical flow. From this motion field, the system builds a saliency map that highlights the truly active regions—the skis, legs, and torso—while downplaying static background like trees and snowbanks. Both streams pass through a 3D convolutional network that learns patterns across space and short time spans, distilling each video segment into compact signatures of appearance and movement.
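To make the motion "eye" more concrete, here is a minimal sketch of how a saliency map could be derived from an optical-flow field. It is an illustration under simple assumptions, not the authors' implementation: the function name `motion_saliency`, the toy flow values, and the choice to scale by the strongest motion in the frame are all hypothetical, and the flow itself is taken as given rather than estimated from video.

```python
import math


def motion_saliency(flow):
    """Turn a dense flow field into a saliency map with values in [0, 1].

    `flow` is a 2D grid (list of rows) of (dx, dy) pixel displacements
    between two consecutive frames. Estimating the flow is outside the
    scope of this sketch.
    """
    # Per-pixel motion magnitude: how far each pixel moved.
    magnitude = [[math.hypot(dx, dy) for dx, dy in row] for row in flow]
    peak = max(max(row) for row in magnitude)
    if peak == 0.0:
        # No motion anywhere, so nothing is salient.
        return [[0.0 for _ in row] for row in magnitude]
    # Scale by the strongest motion so static background ends up near 0.
    return [[m / peak for m in row] for row in magnitude]


# Hypothetical 2x3 flow field: the right-hand column (the "skier") moves,
# the rest of the scene is still or nearly still.
toy_flow = [
    [(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (0.3, 0.4), (6.0, 8.0)],
]
saliency = motion_saliency(toy_flow)
# saliency is approximately [[0.0, 0.0, 0.5], [0.0, 0.05, 1.0]]
```

In this toy example the fastest-moving pixel gets a saliency of 1, while the motionless background stays at 0, which mirrors the idea of highlighting the skier and downplaying trees and snowbanks.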

Blending What It Sees and How It Moves

Instead of simply stacking or averaging the two information streams, the model learns how much weight to give to each for every clip it analyzes. For some maneuvers, such as a plow brake where skis form a distinctive shape, appearance cues matter more. For smooth parallel turns, the rhythm and direction of motion are more telling. A learnable fusion module automatically adjusts these contributions, normalizing the two feature sets and combining them through trained weights that always sum to one. This adaptive blend allows the system to focus on whichever visual evidence is most informative for the current action, making recognition more accurate and reliable across diverse skiing styles and scenes.
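The blending step can be sketched in a few lines. The article only states that the two feature sets are normalized and combined with trained weights that sum to one; the use of L2 normalization and a softmax over two scalar scores below is an assumption chosen because it is a common way to satisfy that constraint, and the names `l2_normalize`, `fusion_weights`, and `fuse` are hypothetical. In the real model the scores would be learned, and may depend on the clip, rather than being passed in by hand.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fusion_weights(score_appearance, score_motion):
    """Softmax over two scores: both weights are positive and sum to one."""
    top = max(score_appearance, score_motion)  # subtract max for stability
    e_a = math.exp(score_appearance - top)
    e_m = math.exp(score_motion - top)
    total = e_a + e_m
    return e_a / total, e_m / total


def fuse(appearance, motion, score_appearance, score_motion):
    """Weighted blend of normalized appearance and motion features."""
    a = l2_normalize(appearance)
    m = l2_normalize(motion)
    w_a, w_m = fusion_weights(score_appearance, score_motion)
    return [w_a * x + w_m * y for x, y in zip(a, m)]


# Equal scores give an even 50/50 blend of the two streams.
fused = fuse([3.0, 4.0], [0.0, 2.0], 0.0, 0.0)
# fused is approximately [0.3, 0.9]

# A higher appearance score shifts the blend toward appearance cues,
# as one might expect for a plow brake.
w_appearance, w_motion = fusion_weights(2.0, 0.0)
```

Because the weights come out of a softmax, raising one score automatically lowers the other stream's share, so the model cannot lean heavily on both streams at once.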

Reading the Full Story of Each Turn

Recognizing an action in skiing is not just about a single pose; it is about how a sequence unfolds from start to finish. To capture this, the fused features are fed into a bidirectional recurrent network that looks both forward and backward in time. Instead of relying only on past frames, the model also uses hints from upcoming frames to understand what the skier is doing. This helps it distinguish between actions that may look similar in a snapshot but differ in timing and coordination. Tests on the SkiTB dataset, a large collection of real-world skiing videos, show that the new system beats several established methods, reaching around 93% on both precision and F1-score. Accuracy stays above 85% even when the system is evaluated under different weather conditions, on unseen athletes, and on videos with artificial noise added.
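The idea of reading a clip in both directions can be illustrated with a toy example. This sketch deliberately swaps the model's recurrent layer for a plain scalar tanh cell, which is far simpler than an LSTM, and it reuses the same hypothetical weights `w_in` and `w_rec` for both directions, whereas a real bidirectional network would learn separate parameters for each. The function names and input values are illustrative assumptions.

```python
import math


def run_direction(sequence, w_in, w_rec):
    """Run a scalar tanh recurrent cell over `sequence`, one state per step."""
    state = 0.0
    states = []
    for x in sequence:
        # The new state mixes the current input with the previous state.
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states


def bidirectional_states(sequence, w_in=1.0, w_rec=0.5):
    """Pair each time step's forward state with its backward state."""
    forward = run_direction(sequence, w_in, w_rec)
    # Process the reversed sequence, then flip the result back so that
    # index t refers to the same time step in both directions.
    backward = run_direction(sequence[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))


# Hypothetical clip: nothing happens for two steps, then a burst of motion.
clip = [0.0, 0.0, 1.0]
states = bidirectional_states(clip)
forward_first, backward_first = states[0]
```

At the first time step the forward state is still zero because it has seen no motion yet, while the backward state is already positive because it has "looked ahead" to the burst at the end of the clip. That is the kind of hint from upcoming frames the paragraph above describes.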

What This Means for Skiers and Sports Tech

By combining focused motion perception, adaptive blending of visual cues, and a time-aware reading of movement, the proposed model can reliably tell whether a skier is turning, braking, or jumping, even in cluttered and changing environments. For non-specialists, the key takeaway is that the system does not just inspect frames one by one; it learns where to look, what matters most, and how a full action cycle plays out. This approach could form the backbone of intelligent training assistants that provide objective feedback, help prevent injuries by spotting risky patterns, and support richer broadcasting analysis. While the authors note that extreme weather and very brief aerial tricks remain challenging, their framework offers a robust foundation for future smart coaching tools in skiing and potentially many other outdoor sports.

Citation: Zhang, W., Xu, L. & Wang, L. Application of LSTM-CNN in skiing action recognition under artificial intelligence technology. Sci Rep 16, 11547 (2026). https://doi.org/10.1038/s41598-026-42324-2

Keywords: skiing action recognition, sports video analysis, deep learning, optical flow, athlete performance