Clear Sky Science

PrivEdge: a hybrid split–federated learning framework for real-time electricity theft detection on edge nodes

2026-03-21

Keeping the Lights Fair and Honest

Electricity theft may sound like a shadowy edge case, but it silently drains up to 100 billion dollars a year from power companies worldwide and can account for a large share of the electricity flowing through some grids. That lost revenue ultimately shows up in higher bills, weaker investment in infrastructure, and less reliable power for honest customers. At the same time, the detailed data from modern smart meters, which could help catch thieves, raises thorny questions about consumer privacy. This paper introduces PrivEdge, a new way to spot suspicious usage patterns in real time by pushing intelligence out to small devices near the meter, while keeping most personal data close to home.

The Problem with Watching Every Watt

Traditional systems for catching electricity theft rely on gathering massive amounts of raw usage data from millions of meters and analyzing everything in a central data center. That approach works, but it consumes costly bandwidth, reacts slowly, and creates a tempting trove of detailed household data that may conflict with strict privacy rules. Newer methods based on decentralized learning try to keep data on the customer’s side while sharing only model updates. However, many of these still demand too much computing power on small devices, cope poorly with customers whose usage patterns differ widely, or have only been tested in idealized lab settings rather than under messy, real-world conditions.

A Smarter Gatekeeper at the Meter

PrivEdge takes a different route by splitting the detection job between a low-cost gateway device—implemented here on a Raspberry Pi 4 attached to each smart meter—and a central server. On the gateway, lightweight software cleans up missing readings, rescales the data, compresses it into a smaller set of features, and uses a compact time‑aware neural network to turn recent consumption into a short numerical “fingerprint.” Only this compact fingerprint, not the original fine‑grained trace of when you boiled water or turned on the air conditioner, is sent onward. This greatly reduces how much must be transmitted and helps shield the everyday life patterns hidden in the raw data.
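
The gateway-side steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, linear interpolation and min-max scaling are assumed choices for "cleaning" and "rescaling", and simple block averaging stands in for whatever feature compression the paper actually uses. The compact neural encoder that produces the final fingerprint is omitted.

```python
from typing import List, Optional


def fill_missing(readings: List[Optional[float]]) -> List[float]:
    """Linearly interpolate gaps; copy the nearest known value at the edges."""
    known = [i for i, v in enumerate(readings) if v is not None]
    if not known:
        raise ValueError("no valid readings in window")
    filled = []
    for i, v in enumerate(readings):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(readings[right]))
        elif right is None:
            filled.append(float(readings[left]))
        else:
            frac = (i - left) / (right - left)
            filled.append(readings[left] + frac * (readings[right] - readings[left]))
    return filled


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale a window to the range [0, 1]; a flat window maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def block_average(values: List[float], block: int) -> List[float]:
    """Assumed stand-in for feature compression: average consecutive blocks."""
    chunks = [values[i:i + block] for i in range(0, len(values), block)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def gateway_features(readings: List[Optional[float]], block: int = 2) -> List[float]:
    """Clean, rescale, and compress one window of meter readings."""
    return block_average(min_max_scale(fill_missing(readings)), block)


# Hypothetical window of eight readings with three dropped samples (None).
window = [2.0, None, 4.0, 6.0, None, None, 0.0, 2.0]
features = gateway_features(window, block=2)
print(features)
```

In this sketch eight raw readings shrink to four features before anything leaves the device, which is the property the paragraph above emphasizes: the server sees a compressed summary, not the original trace.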

Learning Together Without Sharing Secrets

On the server side, those fingerprints flow into a deeper part of the neural network and a collection of classic machine‑learning models such as decision trees and support‑vector classifiers. Their outputs are combined by a simple meta‑model that learns how to weigh each one, forming an ensemble that is more accurate and resilient than any single detector. Multiple gateways participate in a coordinated training process: instead of uploading raw data, they periodically send model updates that the server averages and sends back, allowing the whole system to learn from many regions at once. Along the way, the authors layer on practical privacy shields, including secure aggregation of updates and carefully tuned noise injection into the shared signals, as well as optional heavy‑duty encryption for the most sensitive deployments.
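
The coordinated training round can be illustrated with a small sketch of weighted federated averaging with clipped, noised updates. It rests on several assumptions: model updates are treated as flat lists of numbers, each gateway clips its update and adds Gaussian noise before sharing it, and the values of `max_norm` and `noise_std` are placeholders rather than the paper's tuned settings. Secure aggregation and encryption are not modelled here.

```python
import random
from typing import List, Sequence


def clip_update(update: Sequence[float], max_norm: float) -> List[float]:
    """Scale the update down so its L2 norm does not exceed max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize(update: Sequence[float], max_norm: float,
              noise_std: float, rng: random.Random) -> List[float]:
    """Gateway side (assumed): clip, then add Gaussian noise to each coordinate."""
    clipped = clip_update(update, max_norm)
    return [x + rng.gauss(0.0, noise_std) for x in clipped]


def federated_average(updates: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Server side: average the updates, weighted by each gateway's sample count."""
    total = sum(sample_counts)
    dim = len(updates[0])
    return [
        sum(u[j] * n for u, n in zip(updates, sample_counts)) / total
        for j in range(dim)
    ]


# Two hypothetical gateways holding 1 and 3 local samples respectively.
rng = random.Random(0)
raw_updates = [[3.0, 4.0], [0.0, 0.0]]
sample_counts = [1, 3]

shared = [privatize(u, max_norm=1.0, noise_std=0.01, rng=rng) for u in raw_updates]
global_update = federated_average(shared, sample_counts)
print(global_update)
```

Clipping bounds how much any single gateway can move the shared model, and the noise scale then sets the trade-off between privacy and accuracy; the server would send `global_update` back to every gateway to start the next round.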

Built for the Real Grid, Not Just the Lab

To find out whether this design holds up outside theory, the researchers tested PrivEdge on a widely used real‑world dataset from China’s State Grid, containing years of labeled normal and fraudulent consumption from tens of thousands of customers. They compared it with leading centralized, federated, split, and hybrid approaches, all under the same preprocessing and hardware conditions. PrivEdge achieved about 98% accuracy and F1‑score, outperforming all competitors while sending only compact intermediate information instead of full data streams. Long, 24‑hour hardware‑in‑the‑loop runs on Raspberry Pi gateways showed low and stable CPU usage, modest power draw, and millisecond‑level response times, even when simulating network delays, packet loss, and multiple meters feeding a single gateway.
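
Reporting F1‑score alongside accuracy matters because theft cases are usually a small minority of customers. The sketch below computes both metrics from confusion-matrix counts. The counts are invented purely for illustration and are not taken from the paper or the State Grid dataset.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all customers classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the theft class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical population: 1000 customers, 50 of them stealing electricity.
# A useless detector that never flags anyone still looks accurate.
lazy_accuracy = accuracy(tp=0, fp=0, fn=50, tn=950)
lazy_f1 = f1_score(tp=0, fp=0, fn=50)

# A detector that catches 45 of the 50 thieves with 5 false alarms.
good_accuracy = accuracy(tp=45, fp=5, fn=5, tn=945)
good_f1 = f1_score(tp=45, fp=5, fn=5)

print(lazy_accuracy, lazy_f1)
print(good_accuracy, good_f1)
```

The detector that flags nobody reaches 95% accuracy in this made-up population while its F1‑score is zero, which is why a result that is high on both metrics says more than accuracy alone.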

Guarding Privacy While Catching Cheats

Because any shared signal can, in principle, leak information, the authors went further and staged realistic privacy and security attacks against their own system. In "black‑box" tests where an attacker sees only the final theft scores—not the inner workings—attempts to infer who was in the training data or to reconstruct detailed usage patterns performed little better than random guessing. When they simulated clients that deliberately tried to poison the shared model with false updates, robust aggregation methods on the server largely neutralized the impact. Altogether, the study suggests that PrivEdge can act as a practical, privacy‑conscious watchdog: it helps utilities catch a wide range of subtle and blatant theft behaviors in real time, using inexpensive edge hardware, without turning smart meters into all‑seeing surveillance devices.
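
To see why robust aggregation blunts poisoning, compare a plain average with a coordinate-wise median when one client submits an extreme update. The summary above does not name the specific robust methods the authors used, so the median here is only one common example of the idea, and the update values are invented for illustration.

```python
from statistics import mean, median
from typing import Callable, List, Sequence


def aggregate(updates: Sequence[Sequence[float]],
              reducer: Callable[[Sequence[float]], float]) -> List[float]:
    """Combine client updates coordinate by coordinate with the given reducer."""
    return [reducer(coordinate) for coordinate in zip(*updates)]


# Four hypothetical honest gateways with similar updates.
honest = [[0.10, -0.20], [0.12, -0.18], [0.09, -0.22], [0.11, -0.19]]

# One malicious client sends a wildly different update.
poisoned = honest + [[50.0, 50.0]]

mean_result = aggregate(poisoned, mean)      # dragged far from the honest values
median_result = aggregate(poisoned, median)  # stays among the honest values

print(mean_result)
print(median_result)
```

A single outlier shifts the mean by roughly a fifth of its magnitude in this five-client round, while the median stays inside the range of the honest updates as long as honest clients form the majority.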

Citation: Ramadan, A., Shouman, M.A., Attiya, G. et al. PrivEdge: a hybrid split–federated learning framework for real-time electricity theft detection on edge nodes. Sci Rep 16, 9685 (2026). https://doi.org/10.1038/s41598-026-39064-8

Keywords: electricity theft, smart grids, edge AI, federated learning, privacy-preserving analytics