What I've Done:
- Optimized heterogeneous AI inference on Intel CPU/iGPU by leveraging OpenVINO/VNNI acceleration; achieved 2.7× faster inference and 2.2× E2E speedup, establishing a new performance standard for a novel CNN architecture
- Designed high-reliability anomaly detection by augmenting vision models via multi-modal input channels; demonstrated mission-critical system reliability (99% precision, 2% false alarm rate) across 200+ production trials
- Standardized the edge deployment pipeline by containerizing and delivering 10+ supporting modules (APIs/debug/telemetry) into the C++ inference engine, enabling high-throughput API services and management for decentralized clusters of 6-8 devices
Motivation: Anomaly Detection and Alarm in Retail
Self-checkout systems have become a common feature in retail as businesses strive to enhance efficiency and convenience.
In fact, over 40% of U.S. grocery stores use self-checkout as of 2025. However, as shoplifting at self-checkout continues to rise,
it’s clear that stronger measures are needed to reduce product loss.
These losses can be prevented (at least partially) using security systems
that combine AI video analytics with weigh scales and cameras to monitor transactions and detect suspicious behavior, such as item swapping or unscanned items.
Our objective is to build an effective loss prevention solution for self-checkout systems.
By combining AI video recognition with signals from barcode scanners, we can detect suspicious behavior and alert staff to potential theft.
Just as importantly, we want the system to be both trustworthy and computationally efficient
so that it can be deployed easily and perform well across a variety of retail scenarios.
Issues with Vision-Based Anomaly Detection
Initially, our system used a purely vision-based approach: we installed a single top-down camera on existing kiosks to capture the entire transaction process.
We trained and evaluated several deep learning models that process consecutive video frames.
With this setup, the system recognized some basic behaviors such as "Skipping" and "Hiding Behind,"
but it failed to reliably detect "Two-At-A-Time" and "Cover-Up," where the actions were occluded.
These observations showed that (1) a naive vision-only approach was not sufficient, which led us to explore multi-modal inputs to improve system robustness.
We also noticed that (2) the inference latency (115 ms per frame) was too high for real-time applications, which is discussed in the following sections.
Designing Systems with Multi-Modal Inputs
I realized that self-checkout systems provide another useful source of information: the barcode.
I then worked with colleagues to integrate the existing barcode scanners (which are embedded in the kiosks) into our algorithms.
By combining the barcode signals with the vision model outputs, we implemented a rule-based approach that covers various scenarios (e.g., passing two items with only one barcode scanned -> "Two-At-A-Time" alert).
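
To make the rule-based idea concrete, here is a minimal sketch of how vision and barcode signals might be fused. It is not the production logic: the `Observation` structure, the `fuse` function, and the "Skipping" rule are hypothetical and only illustrate the pattern. The "Two-At-A-Time" rule follows the example given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Signals gathered for one pass over the scan area (hypothetical structure)."""
    items_seen: int        # items the vision model counted passing the scanner
    barcodes_scanned: int  # barcode events reported by the kiosk scanner


def fuse(obs: Observation) -> Optional[str]:
    """Return an alert label, or None when the two signals agree."""
    if obs.items_seen <= 0:
        return None  # nothing passed the scan area, so nothing to flag
    if obs.barcodes_scanned == 0:
        # Assumed rule (not from the source): items passed but none was scanned.
        return "Skipping"
    if obs.items_seen >= 2 and obs.barcodes_scanned < obs.items_seen:
        # Rule from the text: two items passed with only one barcode scanned.
        return "Two-At-A-Time"
    return None


if __name__ == "__main__":
    print(fuse(Observation(items_seen=2, barcodes_scanned=1)))
    print(fuse(Observation(items_seen=1, barcodes_scanned=1)))
```

Keeping the rules as small, explicit checks makes each alert easy to explain to store staff, which matters when a low false alarm rate is the priority.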

Scenarios with Different Multi-Modal (Vision + Barcode) Combinations
With this approach, the system achieved 99% precision and a 2% false alarm rate across 200+ production trials. It is worth mentioning that the 56% detection rate (the probability that an anomalous behavior is detected) is not optimal, but it is still acceptable in this case: retailers care more about a low false alarm rate than about a high recall for the alarm system.
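
For reference, the three metrics can be computed from trial outcomes as sketched below. The definitions are the standard ones and are an assumption here, since the exact formulas used in the trials are not stated. In particular, the false alarm rate is taken as false alarms divided by all normal transactions. The counts in the example are hypothetical and are not the production results.

```python
def alarm_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute alarm-quality metrics from confusion-matrix counts.

    tp: anomalies that raised an alarm     fp: normal transactions that raised an alarm
    fn: anomalies that were missed         tn: normal transactions with no alarm
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0      # also called recall
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0    # assumed definition
    return {
        "precision": precision,
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
    }


# Hypothetical counts for illustration only.
m = alarm_metrics(tp=50, fp=1, fn=50, tn=99)
# detection_rate is 0.5 and false_alarm_rate is 0.01 for these counts:
# half of the anomalies are missed, yet almost every alarm raised is genuine.
```

The example shows why a system can have a modest detection rate and still be trusted by staff: precision and false alarm rate depend on how often alarms are wrong, not on how many anomalies are missed.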
High-Throughput AI Inference for Edge Deployment

System Performance in Different Settings
Then We Have An AI Inference Engine!
Combining the above components, we built a high-throughput AI inference engine for edge deployment. To make these modules easy to deploy and manage, we containerized them and delivered them as a service.
An inference engine is deployed on one machine and serves a cluster of devices (usually 6-8 kiosks in a retail store).
Staff can monitor the status of the engine and the devices from the web portal and, more importantly, they are notified when an alarm is triggered!

System Overview