This blog is directed at data science professionals and AI/ML modelers, responsible for developing Full Stack Data Science / AI/ML Pipelines capable of performing high-speed model inference at the far-Edge.
Introduction
One of the flagship use cases of 5G Wireless (5G) is Augmented Reality (AR), a new, ultra-reliable, low-latency communication paradigm that requires a total round-trip delay of less than 5 milliseconds (ms) for optimal performance and to mitigate motion-sickness issues. AR applications provide an interactive experience in which the real world is enhanced by computer-generated artifacts. For example, AR scenes are created in real time based on context information from the AR-user’s physical environment, including behavior, location, surroundings, and motion-state information. As such, AR requires significant data and processing power to function properly.
Current Technical Challenges
Mobile AR devices, like headsets and glasses, are compact and battery-operated, limiting their processing power and on-board computing resources. For this reason, mobile AR applications offload the majority of their data processing and storage to external computing devices with sufficient resources. These devices may be collocated at the AR-user’s location or, more likely, cloud-hosted. The further the computation is from the AR-user, the higher the latency. This is problematic for AR because information must be continually refreshed to capture new movement in the AR scene.
AR’s continuous image-refresh transmissions consume considerable network bandwidth, which not only hinders core network performance but also increases AR implementation costs due to the continually increasing bandwidth requirements. Additionally, AR’s complex algorithms further compound processing delays and exacerbate battery drain issues.
Next, if cloud computing and storage resources are leveraged, all AR data is sent to the cloud for analysis, with results then returned to the AR-user’s headset. Cloud implementations therefore further increase AR application latency, generating long transmission delays to the AR-user. When AR transmission time exceeds 20 ms, physiological issues such as dizziness and nausea are often experienced by the AR-user.
The Problem to Solve
Due to these technical hurdles, designing an ultra-reliable AR framework that meets end-to-end latency constraints, ultra-high availability, and the necessary Quality of Experience (QoE) is challenging. Moreover, despite AR’s impressive immersion capabilities, widespread commercial adoption of Mobile AR technology will remain hamstrung until the technical challenges are resolved.
The best available approach to solving AR’s current technical challenges consists of three steps:
1. Mobile-Edge Compute
The first step to reducing AR transmission delay and relieving cloud-server network congestion is to offload AR processing to purpose-built, local, Edge-based servers. These highly performant Mobile-Edge servers move AR processing from the remote cloud to near the AR-user, which reduces transmission latency.
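To see why proximity matters, consider propagation delay alone: signals travel through fiber at roughly two-thirds the speed of light, so distance translates directly into round-trip time before any processing happens. A back-of-the-envelope sketch (the distances are illustrative assumptions, not measurements):

```python
# Back-of-the-envelope propagation-delay estimate.
# Signals in optical fiber travel at ~2/3 the speed of light (~200,000 km/s),
# i.e. about 200 km per millisecond.
FIBER_SPEED_KM_PER_MS = 200.0

def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay in ms for a given one-way distance in km."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

# Illustrative one-way distances (assumptions):
print(round_trip_ms(1500))  # distant cloud region: 15.0 ms -- alone exceeds the 5 ms AR budget
print(round_trip_ms(10))    # nearby Mobile-Edge server: 0.1 ms -- leaves budget for processing
```

Propagation is only one component; serialization, queuing, and compute delays add to it, which is why a distant cloud cannot meet the 5 ms budget even in the best case.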
2. Software-Defined Networking
The second step separates the AR data and control schemes through software-defined networking (SDN). SDN provides network flexibility and allows for improved network performance through software optimization. This decentralized AR framework design reduces latency while increasing system reliability and availability.
3. Precision-Based AI Pipelines
The third and final step is the implementation of precision-based AI pipelines. As established, the real-time AR context dataset is huge because new AR data accumulates over time. For this reason, traditional model-optimization techniques have required a large memory footprint and have therefore been ill-suited to Edge computing, where resources are insufficient to support large AI model-optimization tasks.
Leveraging new Mobile-Edge Computing AR devices, reduced computation can be achieved using a two-pronged Precision-based AI Pipeline process: model optimization and model splitting. Here, model optimization is achieved by quantizing weights, pruning parameters, and caching intermediate data between adjacent layers, or through specially structured filters in Convolutional Neural Networks (CNNs). In turn, model splitting decomposes a large computational task into parts, which are then offloaded to multiple Mobile-Edge Computing devices that work together in a federated construct to complete execution. Precision-based AI Pipelines for Edge-based inferencing can reduce AR transmission delay, improve energy consumption, and sustain QoE performance within the required 5 ms.
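As an illustration of the weight-quantization prong, the sketch below maps 32-bit float weights onto 8-bit integers and back, cutting the memory footprint fourfold with bounded rounding error. This is a minimal, framework-free example using NumPy; a production pipeline would use a toolkit such as TensorFlow Lite or ONNX Runtime rather than hand-rolled code:

```python
import numpy as np

def quantize(weights: np.ndarray):
    """Affine (asymmetric) quantization of float32 weights to uint8."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 or 1.0  # guard against constant tensors
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize(q: np.ndarray, scale: float, w_min: float) -> np.ndarray:
    """Recover approximate float32 weights from the uint8 representation."""
    return q.astype(np.float32) * scale + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale, zero_point = quantize(w)

# 4x smaller payload; per-weight error stays under half a quantization step.
assert q.nbytes == w.nbytes // 4
assert np.max(np.abs(dequantize(q, scale, zero_point) - w)) <= scale / 2 + 1e-6
```

The same idea extends to activations, and model splitting then partitions the (now smaller) layers across cooperating Edge devices.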
AI Pipeline Options for Rapid AR Video Frames Analysis
Real-Time Messaging Protocol (RTMP), Real-Time Streaming Protocol (RTSP), and network sockets are the primary methods for sending video frames. Middleware in the form of an NGINX server for RTMP, or Kafka for RTSP, is often used to route packets to the AR-user. In turn, network sockets directly connect AR devices to each other without the use of middleware.
1. Real-Time Messaging Protocol using NGINX
Depending on the network bandwidth and server location, RTMP transmission latency can average from one second to many seconds, making the protocol ideal for spectator viewing, but not reactive videos. RTSP exhibits similar behavior unless a local 5G WiFi connection is leveraged. NGINX web servers are commonly used to host RTMP-based video streams, which are then linked and viewed through different applications. This solution requires minimal configuration; however, availability and reliability are dependent on network bandwidth.
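For context, publishing a stream to an NGINX-hosted RTMP endpoint is commonly done with ffmpeg. The sketch below only assembles the command rather than running it; the server URL, application name, and input source are placeholder assumptions:

```python
# Assemble (but do not execute) an ffmpeg command that publishes a local
# video source to an NGINX RTMP endpoint. The URL is a hypothetical example.
import shlex

RTMP_URL = "rtmp://edge-server.example/live/ar-stream"  # assumed endpoint

cmd = [
    "ffmpeg",
    "-re",                   # read input at native frame rate (simulates a live source)
    "-i", "input.mp4",       # placeholder source; a camera device in practice
    "-c:v", "libx264",       # H.264 video, expected by most RTMP consumers
    "-preset", "veryfast",
    "-tune", "zerolatency",  # minimize encoder-side buffering
    "-f", "flv",             # RTMP carries FLV-muxed streams
    RTMP_URL,
]
print(shlex.join(cmd))
```

Even with `-tune zerolatency`, the protocol's chunking and player-side buffering keep end-to-end delay in the seconds range, which is why RTMP suits spectators rather than the AR-user's own reactive view.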
2. Real-Time Streaming Protocol using Kafka
For AR frameworks leveraging public or private 5G WiFi connections and Mobile-Edge Compute systems to offload the processing of user-AR data, bandwidth and processing are less of a concern. As such, provided a reasonable Frames per Second (FPS) rate is sustained, Kafka can be used as the RTSP server. Kafka can both simplify the streaming video architecture and reduce transmission latency. The challenge with Kafka in this AR framework design is that it currently requires an inefficient REST API for AR device connectivity. REST induces latency; as such, unless a custom Kafka connector is developed to route AR video frames, this design remains hampered by REST overhead.
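If such a custom connector were built to bypass REST, each video frame would need to be packed as a self-describing binary Kafka message. A minimal sketch of one possible wire format, using only the standard library (the header layout is an assumption; the resulting bytes would be handed to a producer such as kafka-python's `KafkaProducer.send`):

```python
import struct

# Hypothetical wire format: an 8-byte header (uint32 frame id, uint32 payload
# length, network byte order) followed by the JPEG-encoded frame bytes.
HEADER = struct.Struct("!II")

def pack_frame(frame_id: int, jpeg_bytes: bytes) -> bytes:
    """Serialize a frame into a single binary Kafka message value."""
    return HEADER.pack(frame_id, len(jpeg_bytes)) + jpeg_bytes

def unpack_frame(message: bytes):
    """Recover (frame_id, jpeg_bytes) from a message produced by pack_frame."""
    frame_id, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    return frame_id, payload

# Round-trip check with dummy bytes standing in for a JPEG frame.
msg = pack_frame(42, b"\xff\xd8 fake jpeg \xff\xd9")
fid, payload = unpack_frame(msg)
assert fid == 42 and payload.endswith(b"\xff\xd9")
```

Producing these binary messages directly over Kafka's native protocol avoids the per-frame HTTP round trip that the REST proxy imposes.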
3. Network Sockets
For AR server configurations, particularly cloud-hosted implementations, point-to-point network sockets likely afford the best performance with the lowest transmission latency. Once the network connection is established, no additional overhead is incurred and no middleware is required in the data pipeline. That said, socket communication is not without its disadvantages: it produces a more complex design into which additional layers of security must be built.
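At its simplest, the point-to-point approach reduces to length-prefixed frames over a connected socket. The sketch below demonstrates the framing with an in-process `socketpair` standing in for a real AR-device-to-server connection (an assumption for the demo; production code would add TLS and authentication, the security layers noted above):

```python
import socket
import struct

LEN = struct.Struct("!I")  # 4-byte big-endian length prefix per frame

def send_frame(sock: socket.socket, frame: bytes) -> None:
    """Send one frame, prefixed with its length so the peer knows where it ends."""
    sock.sendall(LEN.pack(len(frame)) + frame)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return partial data."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf

def recv_frame(sock: socket.socket) -> bytes:
    """Read one length-prefixed frame."""
    (length,) = LEN.unpack(recv_exact(sock, LEN.size))
    return recv_exact(sock, length)

# In-process demo: one end plays the AR device, the other the Edge server.
device, server = socket.socketpair()
send_frame(device, b"frame-0001-bytes")
assert recv_frame(server) == b"frame-0001-bytes"
device.close(); server.close()
```

The framing logic is trivial, but everything middleware normally provides (reconnection, backpressure, fan-out, access control) now has to be designed in by hand, which is the complexity trade-off described above.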
Conclusion and Future Steps
This blog presented viable options for developing a unified AR framework for rapid AI model inference at the Mobile-Edge. First, to reduce transmission latency to an acceptable physiological range, specialized Mobile-Edge compute hardware is necessary to offload AR computation while keeping the AR processing as close to the AR-user as possible. Next, SDN allows for the separation of video and control data, thereby reducing bandwidth congestion and further increasing AR system performance. Third, implementing new Mobile-Edge Precision-AI model concepts, like optimization and splitting, accelerates the video-analytics Deep Learning process within the computational bounds of the new Mobile-Edge compute hardware. Finally, three options exist for AI pipeline video-stream ingest, each with numerous pros and cons. Further research is required to determine the ingest option that affords the highest AR performance with the simplest, most secure AR design. These steps are necessary to propel industry adoption of AR/VR into the mainstream.
References
Chen, D., Xie, L. J., Kim, B., Wang, L., Hong, C. S., Wang, L.-C., & Han, Z. (2020). Federated Learning Based Mobile Edge Computing for Augmented Reality Applications. 2020 International Conference on Computing, Networking and Communications (ICNC), 767–773. https://doi.org/10.1109/ICNC47757.2020.9049708
Konstantoudakis, K., Christaki, K., Tsiakmakis, D., Sainidis, D., Albanis, G., Dimou, A., & Daras, P. (2022). Drone Control in AR: An Intuitive System for Single-Handed Gesture Control, Drone Tracking, and Contextualized Camera Feed Visualization in Augmented Reality. Drones, 6(43), 43. https://doi.org/10.3390/drones6020043
Ranaweera, P., Jurcut, A., & Liyanage, M. (2022). MEC-enabled 5G Use Cases: A Survey on Security Vulnerabilities and Countermeasures. ACM Computing Surveys, 54(9), 1–37. https://doi.org/10.1145/3474552
Seo, Y., Lee, J., Hwang, J., Niyato, D., Park, H., & Choi, J. K. (2021). A Novel Joint Mobile Cache and Power Management Scheme for Energy-Efficient Mobile Augmented Reality Service in Mobile Edge Computing. IEEE Wireless Communications Letters, 10(5), 1061–1065. https://doi.org/10.1109/LWC.2021.3057114
Wang, Y., Yu, T., & Sakaguchi, K. (2021). Context-Based MEC Platform for Augmented-Reality Services in 5G Networks. 2021 IEEE 94th Vehicular Technology Conference (VTC2021-Fall), 1–5. https://doi.org/10.1109/VTC2021-Fall52928.2021.9625304