Clear Sky Science
An efficient logarithmic estimator in stratified random sampling using single auxiliary variable
Why smarter sampling matters
Whenever governments, scientists, or companies run surveys, they rarely measure every person or object. Instead, they take samples and use statistics to estimate overall averages, such as average crop yield, rainfall, or school enrollment. Small improvements in how these averages are estimated can save money and reduce fieldwork while still delivering more reliable numbers. This paper introduces a new way to squeeze more accuracy from the same survey data by applying a logarithmic transformation to an easily measured companion variable.

Breaking the population into meaningful groups
Many large surveys divide the population into groups, or strata, before sampling. For example, farms may be grouped by region, schools by district, or weather stations by climate zone. Within each group, a small sample is taken, and these pieces are combined to estimate the overall average. This approach, called stratified sampling, works especially well when each group is fairly uniform inside but quite different from the others. The authors focus on this common design and ask: given that we already sample in groups, can we use extra information inside each group to sharpen our estimates even more?
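The combining step described above can be sketched in a few lines of Python. This is a minimal illustration of the standard stratified mean, in which each group's sample mean is weighted by that group's share of the population. The function name, the stratum labels, and all numbers are invented for the example and do not come from the paper.

```python
from statistics import mean


def stratified_mean(samples, stratum_sizes):
    """Combine per-stratum sample means into one overall estimate.

    samples: dict mapping stratum name -> list of sampled values
    stratum_sizes: dict mapping stratum name -> number of units in that stratum
    """
    total = sum(stratum_sizes.values())
    return sum(
        (stratum_sizes[h] / total) * mean(values)
        for h, values in samples.items()
    )


# Hypothetical data: two regions with very different typical values.
samples = {"north": [10.0, 12.0], "south": [30.0, 34.0, 32.0]}
stratum_sizes = {"north": 100, "south": 300}

estimate = stratified_mean(samples, stratum_sizes)
print(estimate)  # 0.25 * 11 + 0.75 * 32 = 26.75
```

Because the weights come from the known group sizes rather than from the sample, a group that happens to be over- or under-sampled does not distort the overall average.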
Using a helpful companion variable
In many real surveys, it is easier to measure one variable than another. For instance, it may be easier to count the number of trees in an orchard than to measure its total apple harvest, or to record how many schools exist in a district rather than tally every enrolled student. When such an easily measured quantity is strongly related to the main quantity of interest, statisticians call it an auxiliary variable. Existing methods, such as ratio and regression estimators, already use this companion variable to improve estimates of the main average. However, these traditional tools often assume fairly simple, almost straight-line relationships and may not work as well when the data are more uneven or behave in a nonlinear way.
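As a point of reference for the traditional tools mentioned above, here is a minimal sketch of one textbook variant, the combined ratio estimator for stratified samples. It rescales the stratified mean of the main variable by how far the sampled auxiliary mean sits from its known population mean. The function name and all numbers are hypothetical, and the paper may compare against other variants.

```python
def combined_ratio_estimate(y_means, x_means, x_pop_means, weights):
    """Combined ratio estimator: y_st * (X_bar / x_st).

    y_means, x_means: per-stratum sample means of the main and auxiliary variables
    x_pop_means: per-stratum population means of the auxiliary variable (assumed known)
    weights: per-stratum population shares, summing to 1
    """
    y_st = sum(w * y for w, y in zip(weights, y_means))
    x_st = sum(w * x for w, x in zip(weights, x_means))
    x_pop = sum(w * x for w, x in zip(weights, x_pop_means))
    return y_st * x_pop / x_st


# Hypothetical orchard-style example: y = harvest, x = tree count.
weights = [0.25, 0.75]
y_means = [11.0, 32.0]
x_means = [5.0, 16.0]        # sampled tree counts run a little high overall
x_pop_means = [5.5, 15.0]    # known from a full count of trees

ratio_estimate = combined_ratio_estimate(y_means, x_means, x_pop_means, weights)
print(ratio_estimate)
```

In this example the sample overstates the auxiliary average (13.25 against a known 12.625), so the estimator pulls the harvest estimate down from 26.75 in proportion. That proportional correction is the straight-line assumption the paragraph above refers to.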
A new twist: the logarithmic estimator
The central contribution of this study is a new estimator that blends stratified sampling with a logarithmic transformation of the auxiliary variable. Instead of working directly with the raw auxiliary averages in each group, the method transforms them using natural logarithms before combining the information. This transformation can tame large differences between groups and better capture curved or uneven relationships between the main and auxiliary variables. The authors derive mathematical expressions that describe how biased the new estimator might be and how variable it is, and they identify conditions under which it should outperform several well-known alternatives.
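This summary does not give the estimator's exact formula, so the sketch below uses a generic log-type form as a stand-in: the stratified mean multiplied by 1 + alpha * ln(x_st / X_bar). Both the form and the tuning constant alpha are assumptions chosen for illustration, not the expression derived in the paper.

```python
import math


def log_type_estimate(y_st, x_st, x_pop_mean, alpha):
    """Illustrative log-type adjustment of a stratified mean.

    NOTE: this functional form is an assumption for illustration only;
    it is not the estimator proposed in the paper.

    y_st: stratified sample mean of the main variable
    x_st: stratified sample mean of the auxiliary variable
    x_pop_mean: known population mean of the auxiliary variable
    alpha: hypothetical tuning constant controlling the size of the correction
    """
    if x_st <= 0 or x_pop_mean <= 0:
        raise ValueError("auxiliary means must be positive to take a logarithm")
    return y_st * (1.0 + alpha * math.log(x_st / x_pop_mean))


y_st, x_st, x_pop_mean = 26.75, 13.25, 12.625  # hypothetical values

adjusted = log_type_estimate(y_st, x_st, x_pop_mean, alpha=-1.0)
unchanged = log_type_estimate(y_st, x_pop_mean, x_pop_mean, alpha=-1.0)
print(adjusted, unchanged)
```

Two properties of this stand-in are worth noting. When the sampled auxiliary mean equals the population mean, the logarithm is zero and the estimate is left unchanged. When the two differ, the logarithm grows more slowly than the ratio itself, which is one way a log transformation can damp the effect of large discrepancies.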

Testing with real and simulated data
To see how the new estimator behaves in practice, the authors apply it to three real datasets: apple yields linked to tree counts, school enrollment linked to the number of schools, and wet days linked to sunshine hours. In each case, the population is divided into strata such as regions or climate zones. They also run computer simulations on artificial populations where the relationship between the main and auxiliary variables is strong and controlled. Across different sample sizes and population structures, the new estimator repeatedly shows lower error and higher percentage relative efficiency, a measure typically computed as the mean squared error of a competing estimator divided by that of the new one, multiplied by 100. In other words, its estimates are, on average, closer to the true population mean than those of competing methods using the same data.
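A simulation comparison of this kind can be sketched as follows. The code builds a small artificial population with two strata, draws many stratified samples, and computes the percentage relative efficiency of one estimator against the plain stratified mean. Since the paper's formula is not reproduced here, a combined ratio estimator stands in as the candidate. The population design, sample sizes, and function names are all assumptions made for the example.

```python
import random
from statistics import mean


def percent_relative_efficiency(mse_baseline, mse_candidate):
    """PRE = 100 * MSE(baseline) / MSE(candidate); above 100 favors the candidate."""
    return 100.0 * mse_baseline / mse_candidate


def simulate_pre(n_reps=2000, n_per_stratum=10, seed=0):
    rng = random.Random(seed)

    # Artificial population: two strata, y roughly proportional to x.
    strata = []
    for low, high, size in [(10, 20, 200), (40, 60, 300)]:
        xs = [rng.uniform(low, high) for _ in range(size)]
        ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
        strata.append((xs, ys))

    total = sum(len(xs) for xs, _ in strata)
    weights = [len(xs) / total for xs, _ in strata]
    true_mean = sum(w * mean(ys) for w, (_, ys) in zip(weights, strata))
    x_pop_mean = sum(w * mean(xs) for w, (xs, _) in zip(weights, strata))

    sq_err_plain, sq_err_ratio = [], []
    for _ in range(n_reps):
        y_st = 0.0
        x_st = 0.0
        for w, (xs, ys) in zip(weights, strata):
            # Simple random sample without replacement inside each stratum.
            idx = rng.sample(range(len(xs)), n_per_stratum)
            y_st += w * mean([ys[i] for i in idx])
            x_st += w * mean([xs[i] for i in idx])
        ratio_est = y_st * x_pop_mean / x_st  # stand-in candidate estimator
        sq_err_plain.append((y_st - true_mean) ** 2)
        sq_err_ratio.append((ratio_est - true_mean) ** 2)

    return percent_relative_efficiency(mean(sq_err_plain), mean(sq_err_ratio))


pre = simulate_pre()
print(f"PRE of ratio estimator vs. plain stratified mean: {pre:.0f}")
```

Swapping a different estimator into the line that computes `ratio_est` would let the same loop compare any candidate against the same baseline, on the same samples.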
What this means for real-world surveys
For non-specialists, the key message is that this logarithmic estimator offers a way to get more accurate averages from surveys without collecting additional data. When there is a strong link between a hard-to-measure quantity and an easier one, and when the population is naturally divided into groups, the reported results suggest this method can substantially reduce the error of the final estimates. That makes it attractive for applications ranging from agriculture and environmental monitoring to education statistics and industrial quality control, where better numbers support better decisions.
Citation: Shakoor, F., Asif, M., Atif, M. et al. An efficient logarithmic estimator in stratified random sampling using single auxiliary variable. Sci Rep 16, 11126 (2026). https://doi.org/10.1038/s41598-026-41448-9
Keywords: stratified sampling, survey accuracy, auxiliary data, statistical estimation, logarithmic methods