Clear Sky Science · en

A Three-Year Multimodal Holistic Dataset For Horticultural Tomato Cultivation

Why Tomatoes Need More Than Sunshine

Tomatoes in grocery aisles may look simple, but getting them there requires a careful balance of light, warmth, water, and plant care. As greenhouses add cameras and sensors, farmers and scientists can watch plants grow in detail that was impossible before. This article introduces a rich three-year collection of tomato data that brings together images, climate readings, soil tests, and harvest records, giving researchers a new way to understand how growing conditions shape the food on our plates.

Figure 1. How a wired greenhouse turns tomato growth into one connected stream of data

A Smart Greenhouse as a Living Lab

The study was carried out in a large greenhouse in Northeast China that functioned like a living laboratory. Two common tomato types, a taste-focused salad tomato and a sweet cherry tomato, were grown in long rows of raised troughs. The greenhouse used automated systems to control air flow, heating, and shade, creating a stable but still changing indoor climate. Within this space, 14 plots received different fertilizer recipes, separated by brick barriers to prevent mixing. This setup allowed the team to watch how plant growth and yield responded to distinct nutrient plans under the same roof.

Watching Plants with Eyes and Sensors

To follow the plants through their life cycle, the team installed a network of high-definition cameras and environmental sensors. Each camera looked straight down at a fixed group of 12 plants, capturing color images four to five times per day at set hours. At the same time, sensors recorded air temperature, humidity, light, carbon dioxide, and the moisture and temperature inside the growing substrate every 30 minutes. A handheld canopy device was used weekly to gauge how well leaves were capturing light and how much nitrogen they held. Together, these tools created a detailed picture of both what the plants looked like and what they were experiencing.

Measuring Health from Roots to Fruit

Automated readings alone cannot tell the whole story, so the researchers also made regular manual checks. Each week they measured plant height, stem thickness, leaf size and number, and several leaf-based health indicators. Experts walked the rows to note pests, disease signs, flower and fruit formation, and overall vigor using simple rating scales. Before planting, they tested the substrate in each treatment plot for nutrients, particle sizes, and water-holding ability, revealing that all plots were rich in phosphorus and potassium but differed in how tightly they held water. At harvest, they weighed yields for each plant and measured traits such as soluble sugar and vitamin C, linking growing conditions to the fruit qualities that consumers care about.

Figure 2. How cameras and sensors track tomato growth step by step and feed into analysis models

Turning Many Streams into One Story

Because not all measurements were taken at the same rate, the team used mathematical smoothing to align weekly manual data with the continuous sensor records on a daily timeline. Each plant kept a stable identification code tied to its position in the images, so visual, climate, and trait data could be matched even when leaves began to overlap. Files in the public repository are carefully organized by year, growing cycle, and data type, with clear field descriptions and mapping tables. The authors also supply scripts so users can reproduce every processing step and combine the image, sensor, and trait layers without starting from scratch.
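The alignment idea described above can be illustrated with a minimal sketch. It is not the authors' script: the article only says "mathematical smoothing", so plain linear interpolation is used here as a stand-in assumption, and the function names, variable names, dates, and values are all hypothetical. The sketch averages 30-minute sensor readings into one value per day, spreads weekly manual measurements across the days between them, and keeps the days that both series share.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from statistics import mean


def daily_means(readings):
    """Average sub-daily (timestamp, value) sensor readings into one value per day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


def interpolate_daily(weekly):
    """Linearly interpolate sparse (date, value) measurements onto every day between them.

    Linear interpolation is an assumption; the article does not name the method used.
    """
    points = sorted(weekly)
    daily = {}
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        span = (d1 - d0).days
        for offset in range(span):
            daily[d0 + timedelta(days=offset)] = v0 + (v1 - v0) * offset / span
    last_day, last_value = points[-1]
    daily[last_day] = last_value
    return daily


def align(sensor_daily, trait_daily):
    """Pair sensor and trait values for the days present in both series."""
    shared = sorted(set(sensor_daily) & set(trait_daily))
    return [(day, sensor_daily[day], trait_daily[day]) for day in shared]


# Hypothetical example values, not taken from the dataset.
start = datetime(2024, 3, 1)
air_temp = [
    (start + timedelta(minutes=30 * i), 20.0 + (i % 48) * 0.1)
    for i in range(48 * 8)  # eight days of 30-minute readings
]
plant_height_cm = [(date(2024, 3, 1), 30.0), (date(2024, 3, 8), 37.0)]

rows = align(daily_means(air_temp), interpolate_daily(plant_height_cm))
for day, temp, height in rows:
    print(day, round(temp, 2), round(height, 1))
```

In a real workflow the same daily table would also carry each plant's identification code, so that image, climate, and trait records for one plant can be joined on a shared key.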

What This Means for Future Tomato Growing

In the end, the Horti-M3-Tomato dataset does not offer a single new growing trick so much as it provides a powerful shared foundation. Anyone studying plant growth, testing new computer vision tools, or building models to predict yield or stress can now work with three seasons of tightly linked photos, climate logs, soil data, and harvest results from the same greenhouse. For a layperson, this means that future insights into tastier, more reliable tomatoes in controlled environments will rest on a transparent and richly documented record of how these plants actually grew, day by day, leaf by leaf.

Citation: Gong, Y., He, Y., Zhang, X. et al. A Three-Year Multimodal Holistic Dataset For Horticultural Tomato Cultivation. Sci Data 13, 726 (2026). https://doi.org/10.1038/s41597-026-07074-w

Keywords: greenhouse tomatoes, multimodal dataset, plant phenotyping, precision agriculture, environmental sensors