The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for DPDK Summit APAC to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.
This schedule is displayed in Indochina Time (ICT, UTC+7).
IMPORTANT NOTE: Timing of sessions and room locations are subject to change.
UACCE (Unified/User-space-access-intended Accelerator Framework) aims to provide Shared Virtual Addressing (SVA) between accelerators and processes. The UACCE bus was integrated in DPDK 24.03; it enables accelerator devices such as compression, crypto, DMA, and Ethernet devices to be registered and integrated seamlessly within DPDK applications. This talk introduces UACCE's design philosophy and usage, and explores whether DPDK might evolve to support the SVA path to accelerate more areas.
ZTE has introduced a new ASIC-based enhanced data processor product to the DPDK community. This product series incorporates a programmable packet processing pipeline and offers extensive hardware offloading capabilities, including networking, storage, data encryption/decryption, and RDMA support. It also provides driver interfaces for leveraging its hardware capabilities within DPDK/SPDK and other open source communities, facilitating rapid development of services and applications. It applies to use cases such as virtualization, AI computing, and security.
This presentation introduces Libtpa, yet another open source DPDK-based userspace TCP stack implementation. What distinguishes Libtpa from many other userspace TCP stacks is its ability to coexist natively with the Linux kernel networking stack. Libtpa is highly efficient, capable of boosting Redis performance by up to 5 times. Moreover, Libtpa is fairly stable, backed by more than 200 unit tests. In addition, Libtpa offers a rich set of debug tools, among which sock tracing is particularly handy for debugging.
The exponential growth in telecommunications users has led to significantly increased bandwidth requirements. As a result, telecom operators are seeking solutions to develop software-based, scalable packet processing products. This proposal explores the utilisation of DPDK capabilities to build high-speed, line-rate telecom packet processing systems. These systems extract metadata from control and user plane packets and achieve control-user correlation using DPDK hash algorithms. The proposal provides packet processing libraries that extract protocol packets from different interfaces such as N11, N4, N3, S1-U, and S11. Packets from these interfaces require stripping, adjustment, and modification to extract the necessary data, which is achieved using the DPDK MBUF packet processing library APIs. The proposal can scale to multiple cores per port using the efficient RSS load-balancing techniques offered by the NIC through the DPDK Flow modules. The modular and robust DPDK library APIs allow such packet processing systems to be built in a very short delivery time with high efficiency and quality.
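As a rough illustration of control-user correlation through a hash table, the sketch below uses a Python dict in place of a DPDK hash table. It assumes that correlation is keyed on a tunnel identifier learned from the control plane, which the abstract does not state; `Correlator`, `Session`, and the subscriber label are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    subscriber: str            # hypothetical subscriber label
    user_plane_bytes: int = 0  # bytes seen on the user plane so far


class Correlator:
    """Hash table mapping a tunnel ID to the session that owns it."""

    def __init__(self):
        self.by_tunnel = {}  # stand-in for a DPDK hash table

    def on_control(self, tunnel_id, subscriber):
        # A control plane packet announces the tunnel for a subscriber.
        self.by_tunnel[tunnel_id] = Session(subscriber)

    def on_user(self, tunnel_id, length):
        # A user plane packet is attributed to its session by hash lookup.
        session = self.by_tunnel.get(tunnel_id)
        if session is None:
            return None  # no control plane context seen yet
        session.user_plane_bytes += length
        return session.subscriber


correlator = Correlator()
correlator.on_control(0x1001, "subscriber-A")
known = correlator.on_user(0x1001, 1400)
unknown = correlator.on_user(0x2002, 64)
```

Here `known` resolves to the subscriber announced on the control plane, while `unknown` stays uncorrelated because no control message has introduced that tunnel.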
I am Ilan, and I work at Aviz Networks. I am currently involved in building high-performance, scalable, software-based packet processing engines for various telco use cases using DPDK.
Over the past few DPDK releases, rte_security has gained various protocol offloads and useful features that remain little explored and deserve to be highlighted:
- MACsec: provides point-to-point security on Ethernet (Layer 2) links.
- TLS/DTLS record protocol: a Layer 4 security protocol.
- Rx flow inject: an alternate datapath for security processing in cases where inline protocol processing cannot be performed for non-protocol reasons such as outer fragmentation.
- IPv4/IPv6 reassembly on inline inbound IPsec processing.
New algorithms added in cryptodev:
- ShangMi algorithms: SM2/SM3/SM4
- Secure Hash Algorithm and KECCAK (SHAKE)
- Updates on asymmetric crypto
This session will explain the offloads and their corresponding use cases, along with the future roadmap for the cryptodev and security libraries.
I work as a Principal Engineer at Marvell, as a member of its Dataplane and Accelerators team. I mainly contribute to the DPDK project, where I serve as maintainer of the dpdk-next-crypto tree. I have made significant contributions to rte_security, IPsec, PDCP, MACsec, and various Crypto...
I lead the crypto & security protocols team at Marvell. With close to 7 years of contributions to DPDK, I've been involved in enhancing support for network security protocols in DPDK. I introduced hardware acceleration for protocols such as IPsec & TLS via rte_security and introduced...
In switchdev mode, the DPDK application manages the proxy (PF) port and the VF/SF ports by using representors. When a packet has no matching rule in HW, it is considered a miss packet and is sent to the port representor that matches the origin port of the packet. With HW advancements supporting hundreds of VMs and frameworks like Kubernetes that can scale to thousands of pods, the number of ports has increased significantly. The model where each VF/SF port is managed by a corresponding software representor can no longer handle such demand effectively. It consumes a lot of memory, and the software polls all ports, wasting cycles on empty queues and causing cache misses. It also takes a long time to initialize and set up. RTE_ETH_DEV_CAPA_RXQ_SHARE was introduced earlier to mitigate this issue, but users still need to configure and set up the queues per port. The new model allows a user to configure the Rx/Tx queue on a proxy port and manage all packets through the proxy port's Rx/Tx queue and a single switch representor for handling miss traffic. This approach solves the scaling issue, allowing thousands of ports to be managed effectively while using CPU cycles much more efficiently.
This session shares the challenges and pitfalls of running DPDK applications targeted at low latency (virtual RAN and packet core) in a container environment.
Solutions Architect and Performance Tuning expert for High speed packet processing applications and accelerate Network transformation, performance tuning and Power Efficient Compute for Network and Storage workloads on x86 COTS, Accelerators for Crypto/Compression/Baseband, Storage...
Senior Member of Technical Staff SW Developer, AMD
Diversified experience in Network Application Acceleration with a focus on analysis, performance tuning, and architecting solutions using technologies such as DPDK (x86), Ezchip (NP4, NP5), Tilera (8036, 8072) on Linux for user and kernel. Keen interest in building solutions using...
Currently, there are quite a few challenges in migrating various open-source applications to DPDK to achieve higher throughput. This talk discusses suggestions for enhancing the DPDK libraries to make such migrations easier.
I have been working on various datapath applications using VPP, DPDK, and the Linux kernel, delivering solutions to customers in the networking, network security, and endpoint security spaces.
I am a member of the DPDK tech board. I also maintain the next-event and next-net-mrvl DPDK trees, and I am responsible for maintaining the Trace, Graph, and Eventdev subsystems in DPDK. I work for Marvell.
PDCP is a protocol that plays a crucial role in the UMTS, LTE and 5G air interfaces. In the radio protocol stack, PDCP sits above the RLC (Radio Link Control) layer and below the RRC or user plane upper layers (like IP at the UE). PDCP handles the transfer of both user plane and control plane data, and secures the transmitted data through ciphering and integrity protection. It has additional features such as in-order delivery, duplicate discard, window-based anti-replay protection, and timer-based SDU discard. The PDCP protocol relies on features that are implemented in DPDK by various libraries, such as:
1. rte_reorder: in-order delivery, duplicate discard
2. rte_cryptodev & rte_security: ciphering, integrity protection
3. rte_timer & rte_event: timers for SDU discard, reordering
Lib PDCP provides a standard abstraction for PDCP datapath processing. The library implements all protocol-specific handling, with portions of the protocol offloaded to rte_cryptodev based on its capability. This session will introduce the protocol, explain the changes made in other libraries to adapt them to the PDCP use case, and give a general walkthrough of the PDCP library from an application usage perspective.
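The in-order delivery and duplicate discard behaviour mentioned in this abstract can be illustrated with a minimal sketch. It is a simplified software model and not the Lib PDCP or rte_reorder API: `ReorderBuffer` and its methods are hypothetical names, and windowing, ciphering, and timers are left out.

```python
class ReorderBuffer:
    """Deliver SDUs in COUNT order and drop duplicates (simplified)."""

    def __init__(self):
        self.next_count = 0  # lowest COUNT not yet delivered
        self.pending = {}    # out-of-order SDUs waiting for a gap to fill

    def receive(self, count, sdu):
        # Discard anything already delivered or already buffered.
        if count < self.next_count or count in self.pending:
            return []
        self.pending[count] = sdu
        delivered = []
        # Release every SDU that is now contiguous with the delivery point.
        while self.next_count in self.pending:
            delivered.append(self.pending.pop(self.next_count))
            self.next_count += 1
        return delivered


buf = ReorderBuffer()
first = buf.receive(1, "b")      # arrives early, held back
second = buf.receive(0, "a")     # fills the gap, releases both
duplicate = buf.receive(1, "b")  # already delivered, discarded
third = buf.receive(2, "c")      # in order, delivered at once
```

An SDU that arrives ahead of a gap is buffered and released only once the missing COUNT arrives, which is the behaviour a timer-based discard would bound in a fuller implementation.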
A customer required Rx scheduling of incoming packets after metering and policing based on RFC 2698. The use case requires a single thread (lcore) to dequeue packets, and packets must be received in priority order (highest priority first). NXP provides all three features in DPAA2 hardware, and we extended the usage of the eventdev framework to integrate the hardware Rx scheduler with the hardware-based metering and policing module.
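For readers unfamiliar with RFC 2698, the sketch below models its two-rate three-color marker in color-blind mode, followed by a single-consumer queue that returns the highest priority packet first. It is a software illustration only and does not represent the DPAA2 hardware or the eventdev API; `TrTCM`, `PriorityRx`, and the convention that a lower number means higher priority are assumptions made for this example.

```python
import heapq
from itertools import count


class TrTCM:
    """Color-blind two-rate three-color marker, after RFC 2698."""

    def __init__(self, cir, cbs, pir, pbs):
        self.cir, self.cbs = cir, cbs  # committed rate (bytes/s), burst (bytes)
        self.pir, self.pbs = pir, pbs  # peak rate (bytes/s), burst (bytes)
        self.tc = float(cbs)           # both token buckets start full
        self.tp = float(pbs)
        self.last = 0.0

    def mark(self, size, now):
        # Refill both buckets for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.last = now
        self.tc = min(self.cbs, self.tc + self.cir * elapsed)
        self.tp = min(self.pbs, self.tp + self.pir * elapsed)
        if self.tp < size:
            return "red"      # exceeds the peak rate
        self.tp -= size
        if self.tc < size:
            return "yellow"   # within peak, exceeds committed rate
        self.tc -= size
        return "green"


class PriorityRx:
    """Single-consumer queue; lower number means higher priority (assumed)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # keeps FIFO order within one priority level

    def enqueue(self, priority, pkt):
        heapq.heappush(self._heap, (priority, next(self._seq), pkt))

    def dequeue(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


meter = TrTCM(cir=1000, cbs=1500, pir=2000, pbs=3000)
colors = [meter.mark(1000, now=0.0) for _ in range(4)]

rx = PriorityRx()
for prio, name in [(2, "bulk"), (0, "urgent"), (1, "normal")]:
    rx.enqueue(prio, name)
order = [rx.dequeue() for _ in range(3)]
```

With these parameters, a back-to-back burst of four 1000-byte packets drains the committed bucket first and the peak bucket afterwards, so the colors degrade from green through yellow to red.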
Software Engineer (4+ years) with experience in various platform software development projects in DPDK. Most of my work has been on the i.MX and DPAA2 platforms.
I have been working at Freescale/NXP for 10 years as a networking application developer, with expertise in datapath performance optimization. I have working experience with DPDK, ODP, VPP, odp-vpp (a Linaro project), etc.
The current GRO library in DPDK is suboptimal. For every packet, to verify the presence of the current 5-tuple in the GRO table, a lookup is performed on each flow. Implementing a hash-based solution for the 5-tuple would be an efficient optimization. Furthermore, it would be advantageous if applications could configure the timeout for a specific flow or 5-tuple. The existing timer mode in GRO applies a static timeout for any application, which is not ideal. The timeout should be adjustable based on the latency sensitivity of the application. If the GRO layer processes flows from different applications, a single timeout setting could lead to suboptimal performance for others. Providing an infrastructure within GRO for applications to set timeouts for tuples would be beneficial. This would allow for the judicious use of GRO resources, optimizing for different types of traffic, such as mice versus elephant flows. If packets from multiple applications are processed by the GRO layer, the varying latency tolerances could be accommodated by setting appropriate GRO timeouts.
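The two ideas proposed in this abstract, a hash lookup keyed on the 5-tuple and a timeout that can be set per flow, can be sketched as follows. This is not the DPDK GRO API: `GroTable`, `set_timeout`, and `flush_expired` are hypothetical names, a Python dict stands in for the hash table, and TCP sequence checks are omitted.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


class GroTable:
    """Flows are found by hashing the 5-tuple instead of scanning a table."""

    def __init__(self, default_timeout):
        self.default_timeout = default_timeout
        self.timeouts = {}  # per-flow timeout overrides set by applications
        self.flows = {}     # 5-tuple -> [first_seen, timeout, payloads]

    def set_timeout(self, key, timeout):
        self.timeouts[key] = timeout

    def add(self, key, payload, now):
        entry = self.flows.get(key)  # average O(1) hash lookup
        if entry is None:
            timeout = self.timeouts.get(key, self.default_timeout)
            entry = self.flows[key] = [now, timeout, []]
        entry[2].append(payload)

    def flush_expired(self, now):
        # Merge and return only the flows whose own timeout has elapsed.
        expired = [k for k, e in self.flows.items() if now - e[0] >= e[1]]
        return {k: b"".join(self.flows.pop(k)[2]) for k in expired}


mice = FiveTuple("10.0.0.1", "10.0.0.2", 1111, 80, 6)
elephant = FiveTuple("10.0.0.3", "10.0.0.2", 2222, 80, 6)

table = GroTable(default_timeout=0.010)
table.set_timeout(mice, 0.001)  # latency-sensitive flow flushes sooner
table.add(mice, b"ab", now=0.0)
table.add(elephant, b"cd", now=0.0)
table.add(elephant, b"ef", now=0.001)

early = table.flush_expired(now=0.002)
late = table.flush_expired(now=0.020)
```

The latency-sensitive flow is flushed after its short timeout while the bulk flow keeps accumulating segments until the default timeout expires.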
I am Param, and I have been working for close to 10 years. I work at Microsoft as a Senior Software Engineer. For the past 5+ years I have been actively using DPDK. In DPDK, I have contributed bug fixes to the GRO layer and added GRO support for IPv6. My contributions...
Exact and wildcard matches are the two traditional match methods in DPDK rte_flow. These methods involve either matching specific values in packet headers (e.g., IP addresses, ports) or ignoring specific fields while matching others. However, neither method lets a user match packets whose TCP port is above a specific value. In this session, we will introduce the newly added RTE_FLOW_ITEM_TYPE_COMPARE item, which addresses the limitations of the traditional match methods by providing advanced and flexible matching capabilities. It matches on the result of a comparison operation: equal to, less than, less than or equal to, greater than, greater than or equal to, or not equal to. This enables range and conditional matching, allowing for more precise traffic management. For example, it can match packets with field values exceeding a certain threshold.
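The semantics of comparison-based matching can be illustrated with a small model. It shows the six comparison results listed in the abstract applied to a packet field and is not the rte_flow API; `compare_match`, the operation labels, and the `tcp_dst_port` field name are hypothetical.

```python
import operator

# The six comparison results named in the abstract (labels are illustrative).
COMPARE_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}


def compare_match(packet, field, op, value):
    """Return True when the packet field satisfies the comparison."""
    return COMPARE_OPS[op](packet[field], value)


packets = [
    {"tcp_dst_port": 80},
    {"tcp_dst_port": 1024},
    {"tcp_dst_port": 8080},
]

# Match packets whose TCP destination port is above 1023.
above_1023 = [
    p["tcp_dst_port"]
    for p in packets
    if compare_match(p, "tcp_dst_port", "gt", 1023)
]
```

An exact match can express only the "eq" case and a wildcard only "ignore this field", which is why a threshold such as the one above needs a dedicated comparison item.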
When a DPDK application must be upgraded, traffic downtime should be kept as short as possible. During the migration, the old application may stay alive while the new one starts and is configured. To optimize the switch to the new application, the old application may need to be aware that the new application is being prepared. This is achieved with a new API that lets the user set the new application's state to standby and later to active.
As 400G wire speed becomes more widely adopted, the need to monitor traffic at 400G is growing daily. DPDK can be a great help to achieve this speed as it provides excellent throughput results. However, monitoring packets at 400G can be challenging, so hardware acceleration also comes in handy. FPGA chips are a suitable option because they can be programmed to perform hardware offload and are powerful enough to support the processing pipeline even at 400G. Developers at CESNET have designed the first 400G FPGA-based SmartNIC that supports DPDK. It contains the Intel Agilex 7 FPGA, which provides enough resources to implement a processing pipeline. This pipeline is able to mark and filter packets and much more at 400 Gbps. It can be configured via RTE flow to help accelerate monitoring as well as many other tasks.
He graduated from the Brno University of Technology three years ago. His bachelor's and master's theses focused on developing RTE flow support for OvS acceleration on COMBO cards (PMD nfb) and Intel PAC N3000 cards (PMD ipn3ke). He used to develop DPDK-based DDoS mitigation application...