Because mlnx_qos did not work on the host,
I found:
https://enterprise-support.nvidia.com/s/article/HowTo-Configure-Rate-Limit-per-VF-for-ConnectX-4-ConnectX-5-ConnectX-6
From this I could see that rate limiting per VF is available using:
ip link set <PF_IF_NAME> vf <VF_IDX> max_tx_rate <MAX_RATE_IN_MBIT/S> min_tx_rate <MIN_RATE_IN_MBIT/S>
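For example, with a hypothetical PF name (enp59s0f0) and VF index 0, capping that VF at 40Gbps would look like this (rates are given in Mbit/s):

# Hypothetical PF name and VF index; 40000 Mbit/s = 40Gbps, no minimum guarantee
ip link set enp59s0f0 vf 0 max_tx_rate 40000 min_tx_rate 0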
Thus, I decided to:
1. Test whether the rate limit actually works (see the quick check after this list)
2. Test whether the rate limit gives better latency results (latency-sensitive vs. message-rate-sensitive)
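For the first point, a quick sanity check (same hypothetical PF name as above) is to read the setting back from the PF, since ip link show prints the configured per-VF rates:

# The vf 0 line should now list the configured max_tx_rate / min_tx_rate
ip link show enp59s0f0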
As tested multiple times before, with 13 cores dedicated to TX, 100G was reached with 64-byte packets.
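For reference, the kind of Pktgen-DPDK run meant here would look roughly like the following; the core list, channel count, VF PCI address, and port mapping are assumptions, not the exact setup used:

# 1 main core plus 13 worker cores mapped to port 0; VF PCI address is hypothetical
pktgen -l 0-13 -n 4 -a 0000:3b:00.2 -- -P -m "[1-13].0"
# In the Pktgen CLI: 64-byte frames at 100% of the requested rate on port 0
set 0 size 64
set 0 rate 100
start 0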
Rate limit 90Gbps:
Rate limit 40Gbps:
With the iperf test, I could see that throughput sat right below the rate limit.
I believe this is normal, since iperf cannot drive the link to its maximum performance anyway.
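The iperf runs would look roughly like this (iperf3 assumed; the receiver address and stream count are placeholders), using several parallel streams since a single TCP stream cannot come close to 100G:

# Receiver on the other host
iperf3 -s
# Sender: 8 parallel TCP streams for 30 seconds toward a hypothetical receiver address
iperf3 -c 192.168.100.2 -P 8 -t 30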
Rate limit 90Gbps:
Rate limit 40Gbps:
Now this is strange, as Pktgen reports higher throughput than the rate limit.
I first thought that Pktgen counts generated packets and is therefore not bound by the rate limit, but with the 40Gbps limit it still shows a clear drop in rate.
So some rate limiting is clearly taking effect.
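One way to check what actually goes out on the wire, independent of what Pktgen reports on the TX side, is to sample byte counters on the receiving interface, assuming the receive side is a regular kernel netdev rather than a port bound to DPDK (interface name hypothetical):

# Rough RX rate in Gbit/s over a 1-second window on the receiver
RX1=$(cat /sys/class/net/enp59s0f1/statistics/rx_bytes); sleep 1
RX2=$(cat /sys/class/net/enp59s0f1/statistics/rx_bytes)
echo "$(( (RX2 - RX1) * 8 / 1000000000 )) Gbit/s"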
72 percent packet loss
74 percent packet loss
37 percent packet loss
23.7 percent packet loss
25.6 percent packet loss
DPDK Pktgen TX: still >= 92G
DPDK Pktgen TX: about 78G
25.6 percent packet loss
7.9 percent packet loss
Note: the receive side is different here; with 128-byte packets, the baseline showed no loss.
35.3 percent packet loss
DPDK Pktgen TX: about 80G
DPDK Pktgen TX: about 70G