WaveMesh // Operator Console

WaveMesh Network Platform Demonstrator

A real-time autonomous drone fleet management platform that coordinates multi-drone operations over a 60 GHz millimetre-wave mesh network. Nine microservices collaborate via NATS JetStream messaging, Redis state caching, and RaimaDB persistent storage to deliver adaptive beamforming, autonomous mesh re-routing, and live operator control.

9 microservices · NATS JetStream · 60 GHz mmWave · FastAPI · Redis · Docker
Data Flow: Drone / Sim → NATS JetStream → Microservices → Redis / RaimaDB → Operator UI
Dashboard Controls
Auth · Authentication

Generates a JWT bearer token required by all protected endpoints. Enter any username — the platform uses role-based tokens (operator role) signed with HS256. The token is stored in the browser session and automatically attached to every subsequent request. Tokens expire after 60 minutes.
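A minimal stdlib sketch of how such a token could be minted and checked. The secret, claim names, and helper functions are illustrative, not the platform's actual implementation; a production service would use a vetted JWT library.

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-secret"  # hypothetical; the real signing key is server-side only

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_token(username: str, ttl_s: int = 3600) -> str:
    """Build an HS256-signed JWT with an operator role and 60-minute expiry."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({
        "sub": username,
        "role": "operator",
        "exp": int(time.time()) + ttl_s,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify(token: str) -> dict:
    """Check the signature and expiry; return the claims if valid."""
    header, payload, sig = token.split(".")
    expected = b64url(hmac.new(SECRET, f"{header}.{payload}".encode(),
                               hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    pad = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(pad))
    if claims["exp"] < time.time():
        raise ValueError("expired")
    return claims
```

Stateless verification like this is exactly why revocation requires the Redis session store described under In Development: without server-side state, a token stays valid until `exp`.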

Fleet · Register Fleet

Registers drones with the Auth IAM service, generating NKey credentials for each drone's NATS connection. Pre-filled with three drones (drone-001, drone-002, drone-003). Real drone firmware would use these credentials to authenticate against the NATS server before publishing telemetry.

Mission · Dispatch Missions

Sends a mission command to the Fleet Orchestration service via the gateway. Each mission specifies a drone ID, mission ID, and a set of GPS waypoints. The Fleet Orchestration service records the mission in RaimaDB and would relay commands to the target drone over its NATS subject.
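A sketch of the mission payload and the basic sanity checks the gateway might apply. Field names (`drone_id`, `mission_id`, `waypoints`, `alt_m`) are illustrative, not the gateway's actual schema.

```python
def make_mission(drone_id: str, mission_id: str, waypoints: list) -> dict:
    """Assemble a mission command from (lat, lon, alt) waypoint tuples."""
    if not waypoints:
        raise ValueError("a mission needs at least one waypoint")
    for lat, lon, alt in waypoints:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180 and alt >= 0):
            raise ValueError(f"bad waypoint: {(lat, lon, alt)}")
    return {
        "drone_id": drone_id,
        "mission_id": mission_id,
        "waypoints": [{"lat": a, "lon": b, "alt_m": c} for a, b, c in waypoints],
    }
```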

Emergency · Emergency Stop

Broadcasts an immediate halt command to the entire fleet. All in-progress missions are aborted. This is an authenticated, logged action — the operator identity is recorded. In a live deployment this triggers a fleet-wide safe-landing sequence.

Simulate · Inject Telemetry

Publishes synthetic GPS and battery telemetry to NATS on behalf of the three drones, bypassing physical hardware. The Telemetry Ingest service picks up these messages, validates them, and writes position and battery state to Redis with a 30-second TTL. Fleet Status auto-refreshes to show the results.

Simulate · Beamforming Stimulus

Simulates two drones repositioning and publishing RSSI measurements to each other. The Beamforming Control service monitors RSSI values and — when signal strength drops below −75 dBm — computes new antenna steering angles (azimuth/elevation) using Haversine geometry and publishes a beamform command back to the drone. The Beamform Commands panel shows the resulting steering instructions.
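The geometry behind this can be sketched as follows: great-circle ground distance via the Haversine formula, azimuth as the initial great-circle bearing, and elevation from the altitude difference over that ground distance. This is a simplified stand-in for the service's control law; function names and the command shape are illustrative.

```python
import math

EARTH_R = 6_371_000.0  # mean Earth radius, metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle ground distance between two GPS fixes, in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_R * math.asin(math.sqrt(a))

def steering_angles(lat1, lon1, alt1, lat2, lon2, alt2):
    """Azimuth (initial bearing) and elevation from drone 1 toward drone 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    x = math.sin(dl) * math.cos(p2)
    y = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    azimuth = math.degrees(math.atan2(x, y)) % 360
    ground = haversine_m(lat1, lon1, lat2, lon2)
    elevation = math.degrees(math.atan2(alt2 - alt1, ground)) if ground else 90.0
    return azimuth, elevation

RSSI_THRESHOLD_DBM = -75.0

def maybe_steer(rssi_dbm, pos_a, pos_b):
    """Return a beamform command only when the link has degraded."""
    if rssi_dbm >= RSSI_THRESHOLD_DBM:
        return None
    az, el = steering_angles(*pos_a, *pos_b)
    return {"azimuth_deg": round(az, 1), "elevation_deg": round(el, 1)}
```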

Simulate · Mesh Topology Stimulus

Injects link-up or link-down events into the mesh. The Mesh Routing service maintains a live adjacency graph and recomputes BFS shortest paths whenever a link changes state. Link Up builds a connection between two drones; Link Down removes it; Build Chain connects all three drones in sequence. The SVG graph and BFS route table update within seconds of each event.
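The adjacency-graph-plus-BFS pattern described above can be sketched in a few lines. This is an illustrative model, not the Mesh Routing service's code; the event and route representations are assumptions.

```python
from collections import deque

def bfs_paths(adj: dict) -> dict:
    """Shortest-hop path between every ordered pair of reachable nodes."""
    paths = {}
    for src in adj:
        parent = {src: None}
        q = deque([src])
        while q:
            node = q.popleft()
            for peer in adj.get(node, ()):
                if peer not in parent:
                    parent[peer] = node
                    q.append(peer)
        for dst in parent:
            if dst == src:
                continue
            hop, route = dst, []
            while hop is not None:
                route.append(hop)
                hop = parent[hop]
            paths[(src, dst)] = route[::-1]
    return paths

def link_event(adj, a, b, up: bool):
    """Apply a link-up / link-down event, then recompute all routes."""
    if up:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    else:
        adj.get(a, set()).discard(b)
        adj.get(b, set()).discard(a)
    return bfs_paths(adj)
```

Recomputing BFS from every node on each event is fine at demo fleet sizes; a larger mesh would recompute incrementally or on a timer.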

Panels · Fleet Status

Polls /api/v1/fleet/status every 5 seconds. The gateway scans all drone:*:pos keys in Redis and returns current GPS coordinates and battery level for each active drone. Drones vanish from the table if their Redis keys expire (30 s TTL), simulating loss of telemetry link.

Panels · Mesh Topology Graph

Auto-polls /api/v1/mesh/topology every 3 seconds and renders the current adjacency state as an SVG graph. Nodes are arranged in a circle; edges are colour-coded by RSSI (green above −65 dBm, yellow between −65 and −75 dBm, red below −75 dBm). The BFS route table alongside shows the shortest inter-drone path for every pair.
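The colour banding reduces to a small threshold function, sketched here for reference (the thresholds come from the panel description above; the function itself is illustrative):

```python
def rssi_colour(rssi_dbm: float) -> str:
    """Edge colour bands used by the topology graph (thresholds in dBm)."""
    if rssi_dbm > -65:
        return "green"
    if rssi_dbm >= -75:
        return "yellow"
    return "red"
```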

Panels · Call Trace Sidebar

Visualises the microservice call chain for each action you trigger. When you click a button, the relevant nodes light up in sequence — showing which services are involved, in what order, and what transport (HTTP or NATS) connects them. Useful for understanding the internal event flow without reading logs.

Key Concepts
NATS JetStream Streams

Five durable streams partition message traffic: TELEMETRY (GPS, battery, RSSI), COMMANDS (mission & beamform), TOPOLOGY (link events & adjacency), SIM (simulation ticks), ALERTS (threshold breaches). Services subscribe to specific subjects; JetStream guarantees at-least-once delivery with configurable retention.
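The partitioning can be pictured as a subject-prefix routing table. The prefixes below are assumptions extrapolated from subjects mentioned elsewhere in this document (`drone.telemetry.rssi`, `drone.cmd.beamform`, `mesh.topology.updated`, `alerts.*`); the real mapping lives in the platform's NATS stream configuration.

```python
# Hypothetical subject-prefix → stream mapping.
STREAM_SUBJECTS = {
    "TELEMETRY": ("drone.telemetry.",),
    "COMMANDS":  ("drone.cmd.",),
    "TOPOLOGY":  ("mesh.",),
    "SIM":       ("sim.",),
    "ALERTS":    ("alerts.",),
}

def stream_for(subject: str) -> str:
    """Return the durable stream that captures a published subject."""
    for stream, prefixes in STREAM_SUBJECTS.items():
        if subject.startswith(prefixes):
            return stream
    raise KeyError(f"no stream captures subject {subject!r}")
```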

Redis TTL State

All live drone state (position, battery, PHY layer metrics, beamform parameters) is stored in Redis with a short TTL. This creates a self-healing presence model: a drone that stops transmitting automatically disappears from Fleet Status without any explicit deregistration step.
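The self-healing presence model can be demonstrated without Redis at all; this in-memory sketch mirrors the behaviour (write with TTL, expire silently), with the class and its methods being illustrative stand-ins for `SET ... EX 30` and key scans.

```python
class TTLPresence:
    """In-memory model of the Redis pattern: keys expire, drones disappear."""

    def __init__(self, ttl_s: float = 30.0):
        self.ttl_s = ttl_s
        self._seen = {}  # drone_id -> timestamp of last telemetry write

    def telemetry(self, drone_id: str, now: float):
        """Each telemetry message refreshes the drone's TTL."""
        self._seen[drone_id] = now

    def active(self, now: float):
        """Drones whose last write is within the TTL window."""
        return sorted(d for d, t in self._seen.items() if now - t < self.ttl_s)
```

No deregistration call exists anywhere: a drone that stops writing simply ages out.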

Adaptive Beamforming

The Beamforming Control service runs a continuous loop (10 Hz). When a drone's RSSI to a peer falls below −75 dBm, it calculates the azimuth and elevation angles between the two GPS positions using the Haversine formula and publishes a steering command. This keeps the directional 60 GHz antenna locked on the strongest signal path.

Mesh Re-routing

Mesh Routing maintains an in-memory adjacency graph updated by LinkEvent messages. On each change it runs BFS from every node to compute shortest-hop paths across the fleet, then publishes the new topology to mesh.topology.updated. The gateway and UI consume this to keep the graph current.


In Development

The platform is at MVP status (v0.1.0). All nine microservices are running and the core telemetry, beamforming, and mesh-routing pipelines are functional. The items below are specified in the system requirements but not yet implemented — they represent the gap between the current prototype and a production-ready deployment.

12 open items · 4 categories
Security & Authentication
Security · Real NKey Ed25519 credentials

Auth/IAM currently generates UUID-based placeholder credentials instead of real Ed25519 keypairs. Production drones require genuine NKey keypairs generated via the NATS nkeys library so that the NATS server can cryptographically verify drone identity.

Security · NATS server-level NKey auth enforcement

The NATS server starts with only JetStream enabled — no authentication configuration. The subject-level permission policies defined in Auth/IAM are generated but never loaded into the NATS server. Any client can currently connect and publish to any subject without credentials.

Security · JWT session Redis storage

Session keys need to be written to Redis on token issuance (1 hr TTL) so that tokens can be explicitly revoked — on logout, credential rotation, or operator suspension. Currently the Auth/IAM service validates JWTs statelessly via signature verification only, so there is no way to invalidate a token before it expires naturally.

Security · Credential rotation scheduling

Credential rotation logic is implemented and persists to RaimaDB, but there is no scheduled rotation loop, no operator endpoint to trigger rotation, and no mechanism to notify affected drones to re-authenticate. The requirement calls for periodic automated rotation with drone notification.

Security · TLS everywhere

TLS needs to be enabled across the full stack — NATS, Redis, and all HTTP service-to-service communication. Currently all traffic is plaintext. The gateway is intended to perform TLS termination, but no certificates or TLS configuration exist anywhere in the stack.

Security · Gateway API rate limiting

Redis-backed rate limiting needs to be added to the gateway to cap request rates per operator JWT and return HTTP 429 on breach. Currently any authenticated client can make unlimited requests with no throttling in place.

Service Integration
Integration · Gateway → Fleet Orchestration routing

The mission dispatch endpoint returns immediately without contacting Fleet Orchestration. Missions are never published to the drone command JetStream subject and never stored in RaimaDB. Fleet Orchestration runs but receives no mission commands from the gateway.

Integration · Fleet Orchestration ACK retry loop

Pending acknowledgements are tracked when a mission is dispatched and cleared when an ACK arrives, but no background task checks for timed-out entries to trigger a re-send. The requirement specifies that unacknowledged commands must be automatically retried — this retry loop is never started.

Integration · Analytics JetStream stream replay

The Analytics service needs a durable JetStream consumer on the TELEMETRY and TOPOLOGY streams to drive post-run KPI computation from recorded data. Currently KPIs are computed from in-memory data passed directly to service methods — no stream-replay consumer exists and the scenario comparison endpoint is never fed from live stream history.

Integration · Memcached drone_registry & config cache

Two Memcached keys remain to be implemented: a drone registry (active drone ID list, invalidated on registration and deregistration) and per-service config blobs (invalidated on config push). Only the route table written by Mesh Routing is currently stored in Memcached.

Simulation Fidelity
Simulation · Stage 1–4 simulation tool integration

The Simulation Bridge is planned as a four-stage pipeline — Gazebo/AirSim for flight dynamics, NYUSIM for mmWave channel modelling, srsRAN for PHY/MAC layer emulation, and ns-3 for mesh routing simulation. Currently all four stages are replaced by a single Python physics model with linear RSSI decay and random drift, which is sufficient for integration testing but does not accurately model 60 GHz propagation or real flight behaviour.
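The current stand-in model is roughly this simple. The sketch below illustrates the "linear RSSI decay plus random drift" shape; all parameter values are invented for illustration and do not come from the bridge's actual configuration.

```python
import random

def rssi_model(distance_m: float,
               rssi_at_1m: float = -40.0,
               decay_db_per_m: float = 0.08,
               drift_db: float = 1.5) -> float:
    """Toy stand-in for the bridge's physics model: RSSI falls linearly
    with range, perturbed by uniform random drift. Real 60 GHz propagation
    (oxygen absorption, blockage, multipath) is far less forgiving."""
    return rssi_at_1m - decay_db_per_m * distance_m + random.uniform(-drift_db, drift_db)
```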

Infrastructure
Infrastructure · NATS leaf nodes

Drone-side edge NATS servers (leaf nodes) are planned to connect to the central cluster, enabling local-first message routing, lower latency, and continued operation during uplink loss. Currently a single central NATS instance handles all traffic — drones have no local broker and lose all messaging capability if the uplink drops.

AI-Augmented mmWave Drone Mesh Networks
A Platform Architecture for Intelligent Autonomous Operations
Working paper — draft for review

This paper presents a framework for integrating machine learning and artificial intelligence capabilities into a millimeter-wave (mmWave) mesh networking platform for autonomous drone operations. We examine five principal domains of AI integration: onboard perception and navigation, predictive beamforming control, multivariate anomaly detection, adaptive mission planning, and large-model operator interfaces. For each domain we characterise the problem formulation, survey applicable methods, describe the integration architecture, and identify open research questions. A central contribution of this work is the demonstration that a subject-oriented message bus architecture (NATS JetStream) provides a uniform integration surface for heterogeneous AI components without requiring modification to core platform services. We further describe how the platform's virtual testbench — comprising coupled flight dynamics, channel emulation, PHY/MAC, and mesh routing simulators — constitutes a complete training data pipeline for learned components across all five domains.

1. Introduction
1.1 Motivation

Unmanned aerial vehicle (UAV) swarms operating over millimeter-wave mesh networks present a class of engineering problems that classical control and signal processing approaches address incompletely. The coupling between physical drone dynamics, radio propagation characteristics, network topology, and mission objectives creates a high-dimensional operational state space that admits learned representations more naturally than hand-crafted models. This paper examines where and how machine learning methods can be introduced into such a platform, with emphasis on architectural compatibility and the practical path from simulation-trained models to deployed systems.

1.2 Platform overview

The reference platform consists of a mmWave mesh network operating at 60 GHz with phased-array beamforming at each node, a microservice architecture communicating via NATS JetStream, a layered virtual testbench for hardware-absent development, and Redis/RaimaDB for state management and persistence. The platform is described in full in the companion technical specification. This paper concerns only the AI augmentation surface.

1.3 Scope and organisation

We restrict our treatment to AI capabilities that (a) can be integrated without modifying core platform services, (b) can be trained or validated using the existing simulation testbench, and (c) address documented limitations of classical approaches in the target operational environment. Section 2 addresses onboard navigation. Section 3 addresses beamforming control. Section 4 addresses anomaly detection. Section 5 addresses mission planning and fleet coordination. Section 6 addresses large-model operator interfaces. Section 7 addresses the simulation testbench as a training data pipeline. Section 8 discusses the uniform integration architecture. Section 9 identifies open problems.

2. Learned Navigation and Perception
2.1 Problem statement

GPS-denied autonomous navigation in unstructured environments requires estimation of ego-motion and environmental structure from onboard sensors. Classical approaches — geometric visual odometry, iterative closest point (ICP) terrain matching — exhibit well-characterised failure modes in low-texture environments, under adverse weather, and in the presence of sensor noise distributions not anticipated during design. The research question is whether learned methods improve robustness in precisely these failure conditions.

2.2 Learned visual odometry

We review the class of end-to-end learned odometry models including DROID-SLAM and TartanVO, which replace hand-crafted feature detectors with learned representations trained on large corpora of camera motion sequences. We characterise the generalisation properties of these models to novel environments and the inference cost at relevant frame rates on Jetson Orin class hardware.

2.3 Radar point cloud processing

mmWave radar returns are sparse relative to LiDAR and exhibit multipath and clutter artefacts that degrade classical ICP performance. We survey learned denoising and classification approaches applicable to IWR-series mmWave radar outputs, and evaluate their contribution to terrain matching accuracy across surface types represented in the simulation environment model.

2.4 Monocular depth estimation

In weight-constrained single-camera configurations, dense depth estimation from monocular imagery — using models in the Depth Anything family — supplements sparse radar returns. We examine the fusion of monocular depth with radar point clouds in an extended Kalman filter formulation and characterise the accuracy improvement over either modality alone.

2.5 Reinforcement learning for obstacle avoidance

We describe a deep reinforcement learning formulation for reactive obstacle avoidance using the fused sensor state as input. The simulation testbench provides the training environment; domain randomisation over obstacle geometry and density provides the policy robustness necessary for sim-to-real transfer.

2.6 Open problems

Drift accumulation without external reference; performance degradation in completely featureless environments; compute budget constraints at small airframe scales.

3. Predictive Beamforming Control
3.1 Problem statement

Classical beamforming controllers are reactive — they adjust beam pointing in response to observed RSSI degradation. At high drone velocities and during rapid manoeuvres, the latency between signal degradation and corrective beam adjustment produces link interruptions that impact throughput and mesh stability. The hypothesis is that a predictive model — conditioning beam angle commands on anticipated future positions — can reduce link interruption frequency and duration.

3.2 Sequence modelling for beam prediction

We formulate beam angle prediction as a sequence-to-sequence problem: given a window of historical drone trajectories and RSSI measurements, predict the optimal beam angle at time t+k. We survey LSTM and transformer-based architectures for this task and characterise the prediction horizon over which learned models outperform classical gradient ascent controllers.
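For concreteness, the weakest classical baseline in this comparison is plain linear extrapolation of the beam angle; learned sequence models must at minimum beat something of this character. The sketch below is illustrative only (no wrap-around handling of the angular difference) and is not one of the surveyed architectures.

```python
def extrapolate_azimuth(history: list, k: int = 1) -> float:
    """Naive baseline: linearly extrapolate the beam azimuth k steps ahead
    from the last two observations. Ignores RSSI and assumes the angular
    rate between the last two samples persists."""
    if len(history) < 2:
        return history[-1]
    rate = history[-1] - history[-2]   # degrees per step
    return (history[-1] + k * rate) % 360
```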

3.3 Training data generation

The DeepMIMO framework provides a structured methodology for generating beam-channel correspondence datasets from ray-traced channel models. We describe the pipeline from Gazebo trajectory simulation through NYUSIM channel computation to DeepMIMO dataset construction, and the resulting training corpus characteristics.

3.4 Differentiable channel simulation

NVIDIA Sionna enables gradient computation through the channel simulation model, permitting end-to-end training of beamforming policies by backpropagation through the channel. We examine whether this approach produces policies with superior generalisation relative to those trained on pre-generated datasets.

3.5 Integration with the beamforming control service

The learned predictor replaces the PID/gradient ascent control law within the existing Beamforming Control microservice. The NATS interface — subscribing to drone.telemetry.rssi and publishing to drone.cmd.beamform — is unchanged. We characterise the performance improvement in terms of link interruption frequency and duration across a set of standard manoeuvre profiles executed in the simulation testbench.

3.6 Open problems

Generalisation across antenna configurations not represented in training; beam prediction under simultaneous multi-drone topology change; calibration of simulated channel models to real 60 GHz hardware.

4. Multivariate Anomaly Detection
4.1 Problem statement

Threshold-based anomaly detection — the current baseline — monitors individual telemetry channels independently. Failures that manifest as correlated degradation across multiple channels below individual thresholds are not detected until a single channel crosses its limit, at which point the failure may be advanced. We examine whether multivariate learned detectors provide earlier and more accurate fault identification.

4.2 Autoencoder-based detection

An autoencoder trained on nominal flight telemetry learns a compact representation of healthy multivariate sensor state. Reconstruction error at inference time provides an anomaly score sensitive to deviations not captured by any single variable. We describe the training procedure using JetStream-replayed nominal flight data and characterise detection latency and false positive rate on a set of injected fault scenarios.
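The scoring logic can be illustrated with a deliberately trivial "model": reconstruct every sample as the per-channel nominal mean and score by RMS reconstruction error. A real autoencoder learns a far richer reconstruction, but the anomaly-score mechanism is the same; everything here is a pedagogical stand-in.

```python
import math

def fit_nominal(windows):
    """'Train' the trivial stand-in: per-channel mean of nominal telemetry."""
    n, dims = len(windows), len(windows[0])
    return [sum(w[i] for w in windows) / n for i in range(dims)]

def anomaly_score(sample, nominal_mean):
    """Reconstruction error: RMS distance between the sample and its
    'reconstruction' (here, the nominal mean profile)."""
    return math.sqrt(
        sum((s - m) ** 2 for s, m in zip(sample, nominal_mean)) / len(sample)
    )
```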

4.3 Isolation forest for operational deployment

Isolation forest provides a complementary detection approach with lower inference cost and more interpretable feature importance scores than deep models. We compare detection performance across fault types and discuss the conditions under which each method is preferable.

4.4 Fault taxonomy

We define a fault taxonomy derived from the simulation testbench's fault injection capability: gradual RF degradation, sudden link failure, mechanical vibration anomaly, GPS spoofing (where GPS is present), beamforming misalignment, and multi-link simultaneous failure indicative of environmental interference. We characterise the detection performance of each method across this taxonomy.

4.5 Integration with the alert service

Anomaly scores are published to the existing alerts.* subject hierarchy. The Alert Service is extended to consume model-generated scores alongside threshold-based events. No changes are required to downstream subscribers.

4.6 Open problems

Distribution shift between simulation-generated training data and real hardware telemetry; anomaly detection under non-stationary operational conditions; root cause attribution from anomaly scores.

5. Adaptive Mission Planning and Fleet Coordination
5.1 Problem statement

Classical path planners optimise a single objective (typically distance or time) subject to geometric constraints. Operational objectives for drone fleets include link quality maintenance, battery consumption, terrain avoidance, and task completion — a multi-objective problem in a dynamically changing environment. Additionally, optimal assignment of sub-tasks to individual drones in a fleet is a combinatorial optimisation problem whose complexity scales with fleet size.

5.2 Learned cost functions for path planning

We describe the augmentation of classical planners (A*, RRT*) with learned cost functions trained on historical flight data incorporating link quality observations, battery consumption profiles, and terrain traversal difficulty. The learned cost function transforms the geometric planning problem into one that reflects real operational constraints.

5.3 Deep reinforcement learning for adaptive replanning

When a link failure, drone fault, or environmental change invalidates the current mission plan, replanning must occur rapidly. We formulate replanning as a Markov decision process and describe a DRL policy trained in the simulation testbench across a library of failure scenarios. We characterise replanning latency and mission completion rate relative to scripted fallback behaviour.

5.4 Graph neural networks for multi-drone task allocation

The variable-size fleet and dynamic task set make fixed-architecture neural networks unsuitable for task allocation. Graph neural network architectures process fleet and task state as a graph and produce allocation policies that generalise across fleet sizes not seen during training. We describe the formulation and evaluate performance on benchmark allocation problems.

5.5 Open problems

Safe exploration during online policy adaptation; formal verification of learned planners against operational constraints; handling of adversarial environments not represented in the training distribution.

6. Large Language Model Integration for Operator Interfaces
6.1 Problem statement

The operator interface to a drone platform requires translation between natural human intent and structured machine commands (mission compilation) and between structured machine state and human-interpretable explanations (post-flight debrief). These translation problems are well-suited to large language models but require grounding in platform-specific structured data to avoid hallucination.

6.2 Natural language mission compilation

We describe a system in which an operator specifies mission objectives in natural language and an LLM with access to platform schema and constraint specifications produces a structured mission plan in the format consumed by the Fleet Orchestration service. We examine prompt engineering approaches, structured output enforcement, and constraint validation as complementary techniques for ensuring plan correctness.

6.3 Retrieval-augmented post-flight analysis

Post-flight debriefing requires synthesising information across large volumes of telemetry data stored in RaimaDB. We describe a retrieval-augmented generation (RAG) architecture in which an LLM answers operator queries by retrieving relevant telemetry segments and generating grounded natural language explanations. We characterise the accuracy and latency of this approach on a set of representative post-flight queries.

6.4 Anomaly explanation

When the anomaly detection system (Section 4) generates an alert, an LLM can synthesise a contextual explanation from the surrounding telemetry. We examine the relationship between explanation quality and the structured context provided to the model, and define an evaluation framework for explanation accuracy.

6.5 Open problems

Hallucination in safety-critical mission planning contexts; evaluation methodology for natural language explanations of technical events; latency of LLM inference relative to operator response time requirements.

7. The Simulation Testbench as a Training Data Pipeline
7.1 Overview

The platform's virtual testbench — comprising Gazebo flight dynamics, NYUSIM channel emulation, srsRAN PHY/MAC simulation, and ns-3 mesh routing simulation — constitutes an end-to-end training data pipeline for learned components across all five domains described above. This section characterises the testbench as a data generation system rather than a validation system.

7.2 Dataset characteristics by domain

We describe the structure and volume of training data producible by the testbench for each learned component: trajectory-channel correspondence for beamforming (Section 3); labelled nominal and anomalous telemetry for fault detection (Section 4); environment-cost correspondence for path planning (Section 5); and sensor-state sequences for navigation policy training (Section 2).

7.3 Domain randomisation

The testbench's fault injection and scenario parameterisation capabilities support systematic domain randomisation — variation of environmental parameters across training episodes to improve model robustness. We describe a randomisation schedule covering weather conditions, terrain types, drone configurations, and failure modes, and characterise its effect on sim-to-real transfer performance.

7.4 Accelerated training runs

The simulation clock can be advanced at up to 100× real time, enabling training data generation at rates that would be impractical with real hardware. We characterise the fidelity tradeoffs introduced by clock acceleration and identify the minimum fidelity requirements for each model class.

7.5 Neural surrogate models for simulation acceleration

Full NYUSIM + srsRAN channel simulation at 100× clock rate is computationally intensive. We examine learned surrogate models that approximate the channel and PHY pipeline at lower cost, enabling higher training throughput. NVIDIA Sionna's differentiable channel model is evaluated as one such surrogate.

8. Uniform Integration Architecture
8.1 The NATS subject interface as an AI integration surface

A central observation of this work is that the platform's message bus architecture provides a uniform integration surface for AI components that requires no modification to existing services. Any AI component can be introduced as a microservice that subscribes to existing NATS subjects, applies a model, and publishes results on new or existing subjects. We formalise this interface and demonstrate its application across all five domains.

8.2 Training pipeline integration

We describe the end-to-end pipeline from simulation data generation through model training to deployment as a NATS-connected microservice, and identify the tooling required at each stage.

8.3 Model versioning and rollback

Because AI components are isolated services behind a NATS interface, model updates can be deployed and rolled back without coordination with other services. We describe a versioning scheme compatible with the JetStream durable consumer model.

9. Open Problems and Future Work
  • Formal safety verification of learned components in safety-critical flight operations
  • Online adaptation of deployed models to distribution shift without full retraining
  • Privacy-preserving federated learning across drone fleets operated by independent parties
  • Adversarial robustness of learned beamforming and navigation components under active interference
  • Evaluation methodology for natural language operator interfaces in time-pressured operational contexts
  • Multi-modal foundation models combining radar, camera, and RF channel state as a unified perceptual backbone
References

To be completed. Key works include:

  • DROID-SLAM — Teed & Deng, 2021
  • TartanVO — Wang et al., 2021
  • DeepMIMO — Alkhateeb, 2019
  • Sionna — Hoydis et al., 2022
  • ORB-SLAM3 — Campos et al., 2021
  • Depth Anything — Yang et al., 2024
  • Isolation Forest — Liu et al., 2008

Data Flow — services and ports

  • Gateway API :8000
  • Auth / IAM :8001
  • Fleet Orchestration :8002
  • Mesh Routing :8003
  • Telemetry Ingest :8004
  • Beamforming Control :8005
  • Alert Service :8006
  • Simulation Bridge :8007
  • Analytics :8008
  • NATS JetStream :4222 · 5 streams
  • Redis :6379 · hot cache
  • Memcached :11211 · registry
  • RaimaDB · SQLite · time-series