Clear Sky Science
Bridging mathematical modeling and AI for 3D coordinate recognition of moving objects without external reference and attitude measurement
Why tracking moving objects in 3D matters
From drones in city airspace to wildlife in remote skies, many modern tasks depend on knowing where fast-moving objects are in three dimensions. Today this usually calls for costly satellite receivers or carefully calibrated instruments. This study introduces a way to track a flying object in 3D using only a few ordinary cameras, an AI detector, and classical geometry, opening the door to cheaper and more flexible monitoring systems.

Seeing motion instead of measuring hardware
Traditional 3D positioning tools fall into two camps. Active systems, such as satellite navigation or onboard sensors, require the object being tracked to carry equipment, which is not possible for unknown or uncooperative targets. Passive systems, like laser scanners or radar, do not touch the target but rely on expensive gear or reference markers to know where the sensors are pointed. The authors take a different route. Rather than measuring the exact tilt and spin of each camera ahead of time, they notice that the path of a moving object, recorded as a sequence of positions over time, can itself serve as a natural reference. If several cameras watch the same object as it flies, the shared shape of that path links their views together.

Turning 2D camera views into a shared 3D path
The team builds a two-stage framework that blends artificial intelligence with classical geometry. First, an AI detector from the You Only Look Once family (YOLOv12) scans each video frame and marks the drone with a bounding box, from which its pixel coordinates are taken. Instead of treating each frame separately, the authors extend the model into a time-aware version called YOLO Time Series. By looking at how the drone moves from frame to frame and using its typical speed, this version fills in missed sightings and filters out impostors such as birds or insects. The resulting long, cleaned-up trails of 2D points from three cameras become the raw material for reconstructing the 3D path.
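
The gap-filling and impostor-filtering idea can be illustrated with a minimal sketch. It is not the authors' YOLO Time Series model: it assumes detections already exist as per-frame pixel coordinates, and the function name `clean_track` and the threshold `max_step_px` (largest plausible per-frame displacement in pixels) are hypothetical names introduced here.

```python
import numpy as np

def clean_track(detections, max_step_px):
    """Reject implausible jumps and interpolate missed frames.

    detections: one entry per frame, either an (x, y) pixel tuple or None
    when the detector found nothing. max_step_px is an assumed upper bound
    on how far the target can move between consecutive frames.
    """
    n = len(detections)
    track = np.full((n, 2), np.nan)
    last_idx = None
    for i, det in enumerate(detections):
        if det is None:
            continue
        point = np.asarray(det, dtype=float)
        if last_idx is not None:
            gap = i - last_idx  # frames elapsed since the last accepted point
            if np.linalg.norm(point - track[last_idx]) > max_step_px * gap:
                continue  # too fast to be the same object: treat as an impostor
        track[i] = point
        last_idx = i

    valid = ~np.isnan(track[:, 0])
    if not valid.any():
        raise ValueError("no usable detections")
    frames = np.arange(n)
    for axis in range(2):
        # Linear interpolation over rejected or missing frames.
        track[:, axis] = np.interp(frames, frames[valid], track[valid, axis])
    return track

# Frame 2 is a missed sighting and frame 4 is a far-away false detection.
detections = [(100, 200), (104, 203), None, (112, 209), (400, 50), (120, 215)]
track = clean_track(detections, max_step_px=10.0)
print(track)
```

This sketch trusts the first detection it sees, so a false positive in the first frame would derail it. A learned time-series model would presumably handle that case more robustly.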

Letting mathematics recover hidden camera poses
In the second stage, the authors apply a compact mathematical tool known as singular value decomposition (SVD) to relate the different camera views. During an initial batch period, hundreds of frames are gathered. The shared 2D trails from a pair of cameras reveal how those cameras are rotated and shifted relative to each other, even though their directions were never measured. With this relative layout in hand, the system uses simple geometric rules to triangulate the drone’s 3D position at each moment in the coordinate system of one reference camera. Knowing only where the cameras sit on the ground in a global reference frame, the method then links this local 3D path to a world-scale map, so the drone’s motion can be expressed in real distances and heights.
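
To make the role of SVD concrete, here is a hedged sketch using two textbook techniques: the eight-point estimate of the essential matrix (which encodes the relative rotation and translation of two cameras) and linear triangulation. The source does not state that the authors use exactly these formulations, so treat them as stand-ins. The camera pose, the random 3D points, and all function names are made up for illustration, and image coordinates are assumed to be already normalized (camera intrinsics removed).

```python
import numpy as np

def estimate_essential(x1, x2):
    """Eight-point estimate of E such that [x2, 1]^T E [x1, 1] = 0.

    x1, x2: (N, 2) matched normalized image points from two cameras, N >= 8.
    """
    u1, v1 = x1[:, 0], x1[:, 1]
    u2, v2 = x2[:, 0], x2[:, 1]
    A = np.column_stack([u2 * u1, u2 * v1, u2,
                         v2 * u1, v2 * v1, v2,
                         u1, v1, np.ones(len(x1))])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)          # null vector of A, as a 3x3 matrix
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt  # enforce two equal singular values

def triangulate(P1, P2, x1, x2):
    """Linear triangulation of one point from two 3x4 projection matrices."""
    A = np.stack([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def skew(t):
    """Cross-product matrix of a 3-vector."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Synthetic check: camera 1 at the origin, camera 2 at an illustrative pose.
rng = np.random.default_rng(0)
X = rng.uniform([-2, -2, 4], [2, 2, 8], size=(50, 3))  # stand-in trajectory points
a = 0.3
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.0, 0.1])
X2 = X @ R.T + t
x1 = X[:, :2] / X[:, 2:3]
x2 = X2[:, :2] / X2[:, 2:3]

E_est = estimate_essential(x1, x2)
E_true = skew(t) @ R                  # ground truth, equal to E_est up to scale and sign
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, t[:, None]])
X0 = triangulate(P1, P2, x1[0], x2[0])
print(X0, X[0])
```

An essential matrix fixes the translation between the cameras only up to an unknown scale. That is consistent with the source's use of known camera ground positions to express the path in real distances.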

Testing in virtual space and on a real drone
To check the limits of the idea, the researchers first run detailed simulations of a drone flying a spiral path above three fixed cameras. In these idealized trials, their framework recovers the 3D coordinates with errors of only a few millimeters, and further tests show how mistakes in camera placement or pixel detection gradually degrade accuracy. Even when such imperfections are introduced, the errors remain modest for typical camera spacing and image quality. The team then carries out a field test at a sports stadium, tracking a real drone within a 100 by 100 by 30 meter volume using three off-the-shelf smartphones. Comparing their reconstructed path with the track logged by the drone’s onboard satellite receiver, they report an average error of about five meters and a high match between the shapes of the two paths, even under rainy, low-light conditions.
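
The difference between average position error and shape agreement can be shown with a small sketch. The source does not specify which metrics the authors use, so both are assumptions here: mean Euclidean distance stands in for position error, per-axis Pearson correlation stands in for shape match, and the spiral and offset values are invented for illustration.

```python
import numpy as np

def mean_position_error(estimated, reference):
    """Mean Euclidean distance between matched 3D points, in the input units."""
    return float(np.mean(np.linalg.norm(estimated - reference, axis=1)))

def axis_correlations(estimated, reference):
    """Pearson correlation per coordinate axis, used here as a crude shape measure."""
    return np.array([np.corrcoef(estimated[:, k], reference[:, k])[0, 1]
                     for k in range(3)])

# Illustrative spiral path in meters (not the paper's data).
s = np.linspace(0.0, 4.0 * np.pi, 200)
reference = np.column_stack([20.0 * np.cos(s), 20.0 * np.sin(s), 5.0 + 2.0 * s])

# A reconstruction that is shifted by a constant offset but otherwise identical.
estimated = reference + np.array([3.0, 0.0, 4.0])

print(mean_position_error(estimated, reference))  # close to 5 m
print(axis_correlations(estimated, reference))    # close to 1 on every axis
```

In this toy case a path can be several meters off on average while still matching the reference shape almost perfectly. This does not imply that the reported field-test error is a constant offset.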

What this means for everyday 3D tracking
In plain terms, this work shows that you can turn a handful of inexpensive cameras into a real-time 3D locator for moving objects without bolting precision instruments onto either the cameras or the target. By letting the motion of the object tie the views together, and by combining learning-based detection with lean mathematical formulas, the framework delivers fast and reasonably accurate 3D positions using limited hardware. While demonstrated on a single drone, the same principles could extend to other flying objects or even ground-based targets, offering new ways to observe the changing Earth with simpler tools.

Citation: Yi, J., Shang, Kk. & Small, M. Bridging mathematical modeling and AI for 3D coordinate recognition of moving objects without external reference and attitude measurement. Commun Eng 5, 89 (2026). https://doi.org/10.1038/s44172-026-00648-x
Keywords: 3D tracking, drone monitoring, computer vision, multi-camera system, geodetic positioning