Burak Demirel

AI-Native Link Adaptation

Production AI system for real-time 5G link adaptation, deployed under sub-30μs baseband constraints.

PyTorch, GNNs, Policy Distillation, Domain Randomization, Distributed RL, ONNX

+20% throughput gain

+10% spectral efficiency

<30μs baseband inference

Tier‑1 operator deployment

Problem

Traditional link adaptation in radio access networks relies on heuristic control loops and coarse feedback signals, which struggle under rapidly changing channel conditions and heterogeneous traffic patterns.
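For context, one widely used heuristic of this kind is outer-loop link adaptation (OLLA), which nudges an SINR offset up after each ACK and down after each NACK so that the block error rate settles near a target. The sketch below illustrates that generic technique only; it is not the baseline used in this project, and the step size, the 10% BLER target, and the `MCS_THRESHOLDS_DB` table are hypothetical values.

```python
from bisect import bisect_right

# Hypothetical SINR thresholds (dB): MCS index i is usable at or above MCS_THRESHOLDS_DB[i].
MCS_THRESHOLDS_DB = [-5.0, -2.0, 1.0, 4.0, 7.0, 10.0, 13.0, 16.0]


class OuterLoopLinkAdaptation:
    """Classic ACK/NACK-driven offset correction (illustrative only)."""

    def __init__(self, bler_target=0.1, step_down_db=0.5):
        self.step_down_db = step_down_db
        # With step_up = step_down * target / (1 - target), the offset has zero
        # expected drift when the observed BLER equals the target.
        self.step_up_db = step_down_db * bler_target / (1.0 - bler_target)
        self.offset_db = 0.0

    def update(self, ack):
        """Raise the offset slightly on ACK, lower it sharply on NACK."""
        if ack:
            self.offset_db += self.step_up_db
        else:
            self.offset_db -= self.step_down_db

    def select_mcs(self, reported_sinr_db):
        """Pick the highest MCS whose threshold the corrected SINR meets."""
        effective = reported_sinr_db + self.offset_db
        return max(bisect_right(MCS_THRESHOLDS_DB, effective) - 1, 0)
```

The loop reacts one transport block at a time and relies on a fixed threshold table, which is the kind of coarse feedback that struggles when the channel changes quickly.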

While reinforcement learning methods demonstrate strong performance in simulation, they rarely transfer to production due to sub-30μs inference latency requirements, non-stationary environments, tight baseband integration, and strict reliability guarantees.

The core challenge is not learning a better policy, but deploying it reliably in real-time network infrastructure.

Constraint vs. Solution

Constraint

RL policies can perform well in simulation, but production RAN systems require deterministic, ultra-low-latency, reliable behavior under non-stationary radio conditions.

Solution

Train expressive policies in high-fidelity simulation, distill them into compact inference-ready models, and validate them inside the live link-adaptation control loop.

Contribution

Designed and deployed an AI-native link adaptation system replacing heuristic control with learned policies under real-time baseband constraints. The system bridges RL research and production through a unified pipeline for training, compression, and deployment, which enables continuous adaptation in live 5G networks.

System Architecture

Simulation training
High-capacity RL policy
Policy distillation
Compact inference model
Baseband deployment
Live network validation

RL policies are trained in high-fidelity radio simulators, distilled into compact low-latency models, and integrated into the baseband link-adaptation loop for live validation under non-stationary network conditions.

Design Decisions

Policy Distillation

Compress high-capacity RL policies into compact models that meet strict latency budgets.
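As a rough illustration of the distillation objective, the sketch below computes a temperature-softened KL divergence between a teacher policy's action logits and a student's. It assumes a discrete action space (for example, MCS indices) and a standard soft-target loss; the page does not state which loss was actually used, and the function names and default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened action distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it over states visited in simulation pulls the compact model toward the high-capacity policy's behavior.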

Simulation-Driven Training

Train policies in system-level simulators that approximate real-world radio dynamics.
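Domain randomization is listed among the project's techniques. One minimal way to picture it is to draw fresh simulator settings for every training episode, as sketched below. The `ChannelScenario` fields, the sampling ranges, and `sample_scenario` are hypothetical placeholders, not the parameters of the actual simulator.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelScenario:
    """Hypothetical per-episode simulator settings."""
    mean_snr_db: float
    doppler_hz: float
    delay_spread_ns: float
    feedback_delay_slots: int


# Hypothetical sampling ranges; real ranges would come from field measurements.
RANGES = {
    "mean_snr_db": (-5.0, 30.0),
    "doppler_hz": (5.0, 500.0),
    "delay_spread_ns": (30.0, 1000.0),
}


def sample_scenario(rng):
    """Draw one randomized scenario so each episode sees different dynamics."""
    return ChannelScenario(
        mean_snr_db=rng.uniform(*RANGES["mean_snr_db"]),
        doppler_hz=rng.uniform(*RANGES["doppler_hz"]),
        delay_spread_ns=rng.uniform(*RANGES["delay_spread_ns"]),
        feedback_delay_slots=rng.randint(1, 8),
    )


rng = random.Random(0)  # seeded so experiments are reproducible
scenarios = [sample_scenario(rng) for _ in range(4)]
```

Training across many such draws discourages the policy from overfitting to any single simulated channel.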

Latency-Constrained Design

Prioritize deterministic execution and predictable runtime over model complexity.
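A back-of-the-envelope check shows how a latency budget constrains model size. The sketch below counts multiply-accumulate operations for a dense network and compares the estimated compute time with a share of the 30μs budget. The layer shapes, the 2,000 MACs/μs throughput, and the 50% headroom are hypothetical numbers chosen for illustration, not measurements from the deployed system.

```python
def mlp_mac_count(layer_sizes):
    """Multiply-accumulate operations for one forward pass of a dense MLP."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def fits_budget(layer_sizes, macs_per_us, budget_us=30.0, headroom=0.5):
    """Compare estimated compute time with a fraction of the latency budget.

    `headroom` reserves part of the budget for I/O and scheduling jitter.
    """
    estimated_us = mlp_mac_count(layer_sizes) / macs_per_us
    return estimated_us <= budget_us * headroom, estimated_us


# Hypothetical teacher and student shapes on a hypothetical 2,000 MACs/us target.
teacher = [64, 512, 512, 512, 28]
student = [64, 64, 32, 28]
teacher_ok, teacher_us = fits_budget(teacher, macs_per_us=2000.0)
student_ok, student_us = fits_budget(student, macs_per_us=2000.0)
```

Under these assumed numbers the teacher needs roughly 286μs and misses the budget, while the student needs about 3.5μs and fits with room to spare.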

Closed-Loop Integration

Embed inference directly into the link-adaptation control loop.

Robustness to Non-Stationarity

Handle distribution shifts in dynamic network environments.

Stability-Aware Optimization

Favor reliable production behavior over aggressive peak-performance optimization.
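One simple way to express this preference in code is a guard around the learned policy's output, sketched below. It rate-limits how far the selected MCS index may move per decision and falls back to a conventional choice when the output is missing or out of range. The function name, the step limit, and the index range are hypothetical; the page does not describe the safeguards actually used.

```python
def guarded_mcs(policy_mcs, previous_mcs, fallback_mcs, max_step=2, mcs_range=(0, 27)):
    """Clamp the learned policy's choice; fall back if it is missing or invalid.

    All limits are hypothetical: `max_step` bounds the change per decision,
    `mcs_range` is the valid index interval, and `fallback_mcs` stands in for
    the output of a conventional heuristic.
    """
    low, high = mcs_range
    if policy_mcs is None or not (low <= policy_mcs <= high):
        return fallback_mcs
    # Rate-limit the change so one bad inference cannot swing the link abruptly.
    lower = max(low, previous_mcs - max_step)
    upper = min(high, previous_mcs + max_step)
    return min(max(policy_mcs, lower), upper)
```

For example, `guarded_mcs(20, previous_mcs=10, fallback_mcs=9)` returns 12, trading some peak throughput for bounded, predictable behavior.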

Deployment Path

1. Train policies in high-fidelity radio simulation

2. Distill high-capacity policies into compact models

3. Optimize for deterministic sub-30μs inference

4. Integrate into the baseband link-adaptation loop

5. Validate under live non-stationary network conditions

Results

+20% throughput: measured in live 5G networks

+10% spectral efficiency: improved radio resource utilization

<30μs latency: inference on baseband hardware

Tier‑1 operators: deployed in production-facing environments

Impact

This system demonstrates that reinforcement learning can operate reliably in real-world communication infrastructure. By replacing static heuristics with adaptive policies, it improves efficiency, responsiveness, and robustness in live networks and enables AI-native radio systems.

Lessons Learned

Deployment is the bottleneck

In production, reliability under real-world constraints matters more than raw model performance.

Simulation–reality gap dominates

Strong simulation results do not guarantee production success.

Latency reshapes model design

Sub-30μs constraints require aggressive compression and simplification.

Stability > peak performance

Production favors predictable behavior over aggressive optimization.

System integration defines success

ML performance depends as much on infrastructure as on algorithms.

My Role

  • Architected the research-to-production ML workflow.
  • Designed the policy distillation and model-refinement path.
  • Built deployment-oriented training and evaluation pipelines.
  • Connected simulation-based RL research to production-facing RAN validation.
  • Worked across ML, baseband constraints, and system integration.

References & Coverage