LAVT project page STATUS: under review

Nonlinear Performance Degradation of Vision-Based Teleoperation under Network Latency

A full-stack testbed for controlled latency injection and systematic route-based evaluation of vision-based teleoperation and remote autonomy.

Aws Khalil · Jaerock Kwon

Bio-Inspired Machine Intelligence (BIMI) Lab, University of Michigan – Dearborn

Research Overview

Problem. Vision-based teleoperation and remote autonomy can appear robust at low delay, yet exhibit sharp, nonlinear collapse when network latency increases. Measuring this transition reliably requires a full-stack system with controlled latency injection, accurate time synchronization, and standardized route-based evaluation.

Contributions (threefold).

We present LAVT, a research-oriented ROS 2–based teleoperation testbed designed to enable controlled studies of network latency effects on vision-based closed-loop driving, implemented in CARLA simulation and integrated with a full-scale drive-by-wire research vehicle.
We conduct a systematic simulation-based evaluation of how network-induced latency impacts stability and tracking performance in vision-based teleoperated lane keeping.
We provide empirical insights into degradation patterns and failure modes specific to perception-driven control under delayed visual feedback, without introducing mitigation or predictive strategies.

Method: LAVT (System + Protocol)

High-level LAVT architecture (vehicle/server side ↔ client/remote side) with synchronized clocks and independent latency injection for video and control.

LAVT is a distributed ROS 2 framework that combines (i) video streaming via GStreamer, (ii) time synchronization via Chrony, (iii) reproducible delay injection using Linux NetEm for both video and control channels, and (iv) a client-side autonomy/teleoperation interface supporting repeatable route executions in CARLA.
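
As a rough illustration of the delay-injection step, the sketch below builds the `tc` command line that attaches a NetEm delay to a network interface. The helper name `netem_delay_cmd`, the interface names, and the delay values are assumptions for illustration and are not taken from the LAVT code base. The sketch only constructs the command and does not run it (running `tc` requires root privileges). It also assumes that video leaves through the server's interface and control commands leave through the client's interface, so an egress delay on each host affects one channel only.

```python
def netem_delay_cmd(interface, delay_ms, jitter_ms=0.0):
    """Build a `tc` command that sets a NetEm egress delay on `interface`.

    `interface`, `delay_ms`, and `jitter_ms` are hypothetical parameters
    chosen for this sketch; they are not LAVT configuration names.
    """
    if delay_ms < 0 or jitter_ms < 0:
        raise ValueError("delay and jitter must be non-negative")
    # `replace` creates the qdisc if it is missing and updates it otherwise.
    cmd = ["tc", "qdisc", "replace", "dev", interface, "root", "netem",
           "delay", f"{delay_ms:g}ms"]
    if jitter_ms > 0:
        # An optional second value after the delay is the jitter.
        cmd.append(f"{jitter_ms:g}ms")
    return cmd


# Hypothetical example: different delays on the two channels.
video_cmd = netem_delay_cmd("eth0", 150)         # on the server host
control_cmd = netem_delay_cmd("wlan0", 50, 5)    # on the client host
print(" ".join(video_cmd))
print(" ".join(control_cmd))
```

The resulting lists can be passed to `subprocess.run` on the respective host. Keeping command construction separate from execution makes the injected conditions easy to log and replay.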

Client-side autonomy module used in the experiments. The controller processes delayed camera frames to estimate lane geometry and generates steering and speed commands in closed loop.

Real drive-by-wire research vehicle integrated with LAVT. All experiments reported in this paper were conducted in CARLA for safety and reproducibility, while the same system stack is deployable on this vehicle without modification.

SERVER: CARLA Simulator / Real Vehicle

NETWORK: Zenoh / GStreamer / Wi-Fi 5

CLIENT: Teleoperation / Remote Autonomy

LAVT is designed to operate on both simulated and real vehicle platforms. The system has been integrated with a full-scale drive-by-wire (DBW) research vehicle equipped with steering, throttle, and brake-by-wire control modules. While the complete teleoperation and streaming stack runs on this vehicle, all experiments in this study were conducted in the CARLA simulator to ensure safety, repeatability, and isolation of latency effects.

Experiment Design

We evaluate on CARLA's Town04 map using three routes (A–C), each with a corresponding “key” subset that isolates the steering-intensive segments where latency-induced instability emerges first.

Route geometries in Town04. Route A: short straight → 90° right turn → long straight. Route B: 90° left turn → short straight → sustained left-hand curvature. Route C: short straight → sustained right-hand curvature. Key subsets (black boxes) isolate steering-intensive segments.

One-way latency distributions for video and control channels across conditions (L0–L5).

VIDEO PATH: injected delay via NetEm

CONTROL PATH: injected delay via NetEm

CLOCK SYNC: Chrony alignment

LAVT supports independent latency control on the video stream and control commands, enabling structured experiments that isolate performance sensitivity to network delay.
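
Once the sender and receiver clocks are aligned, a one-way latency sample is simply the receive time minus the send timestamp carried with the message. The sketch below shows that arithmetic and a small summary. The function names and the example timestamps are assumptions for illustration, and it assumes both timestamp lists are in seconds on the same synchronized clock.

```python
import statistics


def one_way_latencies_ms(send_ts, recv_ts):
    """Per-message one-way latency in milliseconds.

    Assumes `send_ts` and `recv_ts` are matched lists of timestamps in
    seconds, taken on clocks that are already synchronized.
    """
    if len(send_ts) != len(recv_ts):
        raise ValueError("timestamp lists must have the same length")
    return [(r - s) * 1000.0 for s, r in zip(send_ts, recv_ts)]


def summarize_ms(latencies_ms):
    """Mean, median, and maximum of a list of latency samples."""
    return {
        "mean": statistics.fmean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "max": max(latencies_ms),
    }


# Made-up timestamps, only to show the call pattern.
sent = [0.0, 1.0, 2.0]
received = [0.10, 1.15, 2.20]
print(summarize_ms(one_way_latencies_ms(sent, received)))
```

Any residual clock offset between the two hosts adds directly to every sample, which is why the measurement is only as accurate as the synchronization.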

Results

Latency-Induced Performance Degradation. Degradation trends across all routes and experimental runs. The top plot reports the route-balanced 95th-percentile cross-track error, computed only over runs that completed the route without collision; the bottom plot shows the corresponding route completion rate across all runs.
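
The two plotted quantities can be sketched as follows. The page does not spell out how "route-balanced" is computed, so this example assumes one plausible reading: take the 95th percentile of cross-track error per route over successful runs, then average the per-route values with equal weight. The record layout (`route`, `completed`, `cte`) and the sample numbers are hypothetical.

```python
from collections import defaultdict


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of empty data")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def route_balanced_p95(runs):
    """Mean over routes of the per-route P95 cross-track error (metres).

    Assumption: only runs flagged `completed` (finished without collision)
    contribute, and routes with no such run are skipped.
    """
    per_route = defaultdict(list)
    for run in runs:
        if run["completed"]:
            per_route[run["route"]].extend(run["cte"])
    p95s = [percentile(samples, 95) for samples in per_route.values()]
    if not p95s:
        raise ValueError("no successful runs")
    return sum(p95s) / len(p95s)


def completion_rate(runs):
    """Percentage of all runs that completed the route."""
    return 100.0 * sum(1 for r in runs if r["completed"]) / len(runs)


# Made-up records, only to show the data layout.
runs = [
    {"route": "A", "completed": True, "cte": [0.0, 1.0, 2.0, 3.0, 4.0]},
    {"route": "A", "completed": False, "cte": [10.0]},
    {"route": "B", "completed": True, "cte": [2.0, 2.0, 2.0]},
]
print(route_balanced_p95(runs), completion_rate(runs))
```

Because the error metric excludes failed runs, it should be read together with the completion rate: a condition in which most runs fail reports error only for the few that survived.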

At low delay (L0–L1), the system maintains stable lane keeping with small tracking error and near-perfect completion rates. As perception latency increases to approximately 150–225 ms (conditions L2–L3), the controller begins to operate on increasingly stale visual observations, introducing phase lag between perception and actuation. This produces oscillatory steering corrections and growing lateral deviation, leading to a sharp drop in route completion. Beyond this transition region, the system exhibits nonlinear degradation: completion rates collapse rapidly while tracking error among surviving runs increases substantially. Additional control-channel delay (L4–L5) further accelerates this instability by delaying corrective commands, reducing the system's ability to recover from lane deviations.
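
The link between stale observations and oscillation can be seen in a deliberately simple toy model. The sketch below is not the LAVT controller or the CARLA vehicle model: it assumes a scalar lateral error that a proportional controller reduces at each step, using an error measurement that is `delay_steps` old. The gain, delay, and step count are illustrative values.

```python
from collections import deque


def simulate_delayed_feedback(gain, delay_steps, steps=400, initial_error=1.0):
    """Toy loop e[k+1] = e[k] - gain * e[k - delay_steps].

    Returns the error trace, including the initial value. All parameters
    are hypothetical and chosen only for illustration.
    """
    # Buffer of the last delay_steps + 1 errors; the oldest entry is the
    # stale observation that the controller acts on.
    history = deque([initial_error] * (delay_steps + 1),
                    maxlen=delay_steps + 1)
    error = initial_error
    trace = [error]
    for _ in range(steps):
        observed = history[0]            # measurement from delay_steps ago
        error = error - gain * observed  # correction based on stale data
        history.append(error)            # oldest entry drops out
        trace.append(error)
    return trace


no_delay = simulate_delayed_feedback(gain=0.3, delay_steps=0)
long_delay = simulate_delayed_feedback(gain=0.3, delay_steps=10)
print(abs(no_delay[-1]), max(abs(e) for e in long_delay))
```

With these illustrative values, the zero-delay trace decays toward zero while the ten-step-delay trace oscillates with growing amplitude, even though the gain is unchanged. The same correction that is stabilizing on fresh data becomes destabilizing once it is applied to sufficiently old data, which is consistent with the abrupt transition described above.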

Trajectory overlays across latency conditions

Trajectory overlays for key-route subsets under increasing latency. Each trace is a run; red markers indicate collision terminations.

Overall Performance (Aggregate Metrics)

Condition	Completion (%)	Collision Rate	Lane Invasion (mean)	P95 Cross-Track (m)
L0	100.0	0.20	0.40	2.30
L1	93.3	0.23	3.97	2.73
L2	50.0	0.67	15.10	3.91
L3	36.7	0.83	22.13	9.25
L4	30.0	0.77	18.80	4.64
L5	10.0	0.97	16.83	13.15

The table summarizes aggregate driving performance across all routes for each latency condition (completion rate, collision frequency, lane-invasion events, and P95 cross-track error).
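
To make the nonlinearity in the completion column concrete, the short sketch below computes the drop in completion rate between consecutive conditions, using the values from the table. The helper name is an assumption for illustration.

```python
# Completion (%) per condition, copied from the table above.
completion = {"L0": 100.0, "L1": 93.3, "L2": 50.0,
              "L3": 36.7, "L4": 30.0, "L5": 10.0}


def consecutive_drops(table):
    """Percentage-point drop between each pair of adjacent conditions."""
    names = list(table)
    return {f"{a}->{b}": round(table[a] - table[b], 1)
            for a, b in zip(names, names[1:])}


drops = consecutive_drops(completion)
largest = max(drops, key=drops.get)
print(drops)
print("largest drop:", largest, drops[largest])
```

The drops are far from uniform: the largest single step is from L1 to L2 (43.3 percentage points), compared with 6.7 points from L0 to L1.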

Documentation

Recommended order: SETUP · QUICKSTART · EXPERIMENTS · TROUBLESHOOTING

Citation

Use the following BibTeX entry to cite this work:


@article{khalil2026nonlinear,
  title={Nonlinear Performance Degradation of Vision-Based Teleoperation under Network Latency},
  author={Khalil, Aws and Kwon, Jaerock},
  journal={arXiv preprint arXiv:2603.06850},
  year={2026}
}