mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2025-11-24 04:40:49 +00:00
The CQE compression feature improves performance by reducing the PCI bandwidth bottleneck on CQE writes. Enhanced CQE compression was introduced in ConnectX-6. It aims to reduce the CPU utilization of SW-side packet decompression by eliminating the need to rewrite the ownership bit, which is likely to cost a cache miss; the ownership bit is replaced by a validity byte handled solely by HW. Another advantage of the enhanced feature is that session packets are available to SW as soon as a single CQE slot is filled, instead of waiting for the session to close. This improves packet latency from NIC to host.

Performance:
Following are the tested scenarios and results comparing basic and enhanced CQE compression.

Setup: IXIA 100GbE connected directly to port 0 and port 1 of a ConnectX-6 Dx 100GbE dual port.

Case #1: RX only, single flow goes to a single queue: IRQ rate reduced by ~30%, CPU utilization improved by 2%.

Case #2: IP forwarding from port 1 to port 0, single flow goes to a single queue: average latency improved from 60us to 21us, frame loss improved from 0.5% to 0.0%.

Case #3: IP forwarding from port 1 to port 0, max throughput, IXIA sends 100%, 8192 UDP flows, goes to 24 queues: enhanced is equal to or slightly better than basic.

Testing the basic compression feature with this patch shows no performance degradation of the basic compression feature.

Signed-off-by: Ofer Levi <oferle@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

| | | |
|---|---|---|
| .. | ||
| cq.h | ||
| device.h | ||
| doorbell.h | ||
| driver.h | ||
| eq.h | ||
| eswitch.h | ||
| fs_helpers.h | ||
| fs.h | ||
| mlx5_ifc_fpga.h | ||
| mlx5_ifc_vdpa.h | ||
| mlx5_ifc.h | ||
| mpfs.h | ||
| port.h | ||
| qp.h | ||
| rsc_dump.h | ||
| transobj.h | ||
| vport.h | ||
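
The commit message above contrasts an ownership bit that SW has to rewrite with a validity byte that only HW writes. The following is a minimal toy model of that idea, not the driver's code: every name (`toy_cqe`, `toy_hw_write`, `toy_basic_ready`, and so on), the 4-slot ring, and the encoding of the validity byte as "wrap count plus one" are assumptions made for illustration and are not taken from these headers or from the hardware specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 4-slot completion ring (sizes chosen for illustration). */
#define TOY_LOG_SIZE 2u
#define TOY_SIZE     (1u << TOY_LOG_SIZE)
#define TOY_MASK     (TOY_SIZE - 1u)

/* Toy CQE: only the two fields that matter for this sketch. */
struct toy_cqe {
    uint8_t owner;    /* basic scheme: 1-bit ownership flag */
    uint8_t validity; /* enhanced scheme: byte written only by "HW" */
};

/* Number of completed passes over the ring for a free-running index. */
static uint32_t toy_wrap_count(uint32_t idx)
{
    return idx >> TOY_LOG_SIZE;
}

/* Initial state: every slot looks HW-owned and was never validated. */
static void toy_ring_init(struct toy_cqe *ring)
{
    for (uint32_t i = 0; i < TOY_SIZE; i++) {
        ring[i].owner = 1;
        ring[i].validity = 0;
    }
}

/* "HW" writes a completion at producer index pi. */
static void toy_hw_write(struct toy_cqe *ring, uint32_t pi)
{
    struct toy_cqe *cqe = &ring[pi & TOY_MASK];

    cqe->owner = toy_wrap_count(pi) & 1u;
    /* Assumed encoding: wrap count + 1, so 0 means "never written". */
    cqe->validity = (uint8_t)(toy_wrap_count(pi) + 1u);
}

/* Basic scheme: SW stamps the ownership bit on a slot HW did not write. */
static void toy_sw_fix_owner(struct toy_cqe *ring, uint32_t ci)
{
    ring[ci & TOY_MASK].owner = toy_wrap_count(ci) & 1u;
}

/* Basic scheme: slot is ready when the bit matches the wrap parity. */
static bool toy_basic_ready(const struct toy_cqe *ring, uint32_t ci)
{
    return (ring[ci & TOY_MASK].owner & 1u) == (toy_wrap_count(ci) & 1u);
}

/* Enhanced scheme: slot is ready when the byte matches this pass. */
static bool toy_enhanced_ready(const struct toy_cqe *ring, uint32_t ci)
{
    return ring[ci & TOY_MASK].validity ==
           (uint8_t)(toy_wrap_count(ci) + 1u);
}
```

In this model, a slot that HW skips during one pass keeps a stale ownership bit, and that bit matches the expected parity on the next pass unless SW rewrites it with `toy_sw_fix_owner`. The multi-valued validity byte does not alias in the same way, so the SW write (and the cache miss it may cost) is not needed.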