slow sctp performance on arm64 #1815

Open
nshopik opened this issue Dec 21, 2024 · 2 comments

@nshopik

nshopik commented Dec 21, 2024

Context

  • Version of iperf3: 3.12

  • Hardware: Arm Ampere, 2 cores, 4 GB memory

  • Operating system (and distribution, if any): Linux 6.1.0-27-arm64 #1 SMP Debian 6.1.115-1 (2024-11-01) aarch64

Bug Report

  • Expected Behavior

SCTP throughput somewhat similar to what we see on x86.

  • Actual Behavior
    sctp test
iperf 3.12
Linux goro 6.1.0-27-arm64 #1 SMP Debian 6.1.115-1 (2024-11-01) aarch64
Control connection MSS 32768
Time: Sat, 21 Dec 2024 20:44:37 GMT
Connecting to host 127.0.0.1, port 5201
      Cookie: rp56igphb6uclgiy73ijfa2ymnn6a465xotj
[  5] local 127.0.0.1 port 32984 connected to 127.0.0.1 port 5201
Starting Test: protocol: SCTP, 1 streams, 65536 byte blocks, omitting 0 seconds, 10 second test, tos 0
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  6.62 MBytes  55.6 Mbits/sec
[  5]   1.00-2.00   sec   448 KBytes  3.67 Mbits/sec
[  5]   2.00-3.00   sec   640 KBytes  5.24 Mbits/sec
[  5]   3.00-4.00   sec   576 KBytes  4.72 Mbits/sec
[  5]   4.00-5.00   sec   768 KBytes  6.29 Mbits/sec
[  5]   5.00-6.00   sec   384 KBytes  3.15 Mbits/sec
[  5]   6.00-7.00   sec   576 KBytes  4.72 Mbits/sec
[  5]   7.00-8.00   sec   448 KBytes  3.67 Mbits/sec
[  5]   8.00-9.00   sec  2.00 MBytes  16.8 Mbits/sec
[  5]   9.00-10.00  sec  1.56 MBytes  13.1 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  13.9 MBytes  11.7 Mbits/sec                  sender
[  5]   0.00-10.00  sec  13.9 MBytes  11.6 Mbits/sec                  receiver
CPU Utilization: local/sender 0.2% (0.0%u/0.2%s), remote/receiver 0.1% (0.1%u/0.1%s)

tcp test

Linux goro 6.1.0-27-arm64 #1 SMP Debian 6.1.115-1 (2024-11-01) aarch64
Control connection MSS 32768
Time: Sat, 21 Dec 2024 20:44:51 GMT
Connecting to host 127.0.0.1, port 5201
      Cookie: ri4uc63tpz74m2zoipzcjv3d323jjh4lcill
      TCP MSS: 32768 (default)
[  5] local 127.0.0.1 port 40698 connected to 127.0.0.1 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  4.18 GBytes  35.9 Gbits/sec    0   2.37 MBytes
[  5]   1.00-2.00   sec  4.29 GBytes  36.9 Gbits/sec    0   2.50 MBytes
[  5]   2.00-3.00   sec  4.27 GBytes  36.7 Gbits/sec    0   2.62 MBytes
[  5]   3.00-4.00   sec  4.21 GBytes  36.2 Gbits/sec    0   2.87 MBytes
[  5]   4.00-5.00   sec  4.30 GBytes  37.0 Gbits/sec    0   2.87 MBytes
[  5]   5.00-6.00   sec  4.31 GBytes  37.0 Gbits/sec    0   2.87 MBytes
[  5]   6.00-7.00   sec  4.15 GBytes  35.7 Gbits/sec    0   4.37 MBytes
[  5]   7.00-8.00   sec  4.26 GBytes  36.6 Gbits/sec    0   4.37 MBytes
[  5]   8.00-9.00   sec  4.27 GBytes  36.6 Gbits/sec    0   4.37 MBytes
[  5]   9.00-10.00  sec  4.29 GBytes  36.8 Gbits/sec    0   4.37 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  42.5 GBytes  36.5 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  42.5 GBytes  36.5 Gbits/sec                  receiver
CPU Utilization: local/sender 99.3% (0.8%u/98.5%s), remote/receiver 81.2% (2.2%u/79.0%s)
snd_tcp_congestion cubic
rcv_tcp_congestion cubic
  • Steps to Reproduce

The issue seems to be reproducible only on arm64 builds. I also tried a Raspberry Pi 2 with the most recent build from source.

  • Possible Solution

tcpdump shows a random 200 ms delay between packets, after a full block (65536 bytes) has been received and before the next SACK. perf top doesn't show any kernel bottleneck.
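
As a rough sanity check on the 200 ms observation, the sketch below works out the throughput ceiling that would apply if every 65536-byte block waited a full 200 ms for its SACK, and converts the per-second transfers from the SCTP log into block counts. The "one stall per block" model and the function names are assumptions made for illustration, not something confirmed by the capture.

```python
BLOCK_BYTES = 65536        # iperf3 SCTP block size, from the log above
ASSUMED_STALL_S = 0.200    # assumption: one 200 ms SACK delay per block


def ceiling_mbits(block_bytes: int, stall_s: float) -> float:
    """Throughput upper bound (Mbits/sec) if each block waits stall_s for its SACK."""
    return block_bytes * 8 / stall_s / 1e6


def blocks_in_interval(transfer_kbytes: float, block_bytes: int = BLOCK_BYTES) -> float:
    """Blocks implied by an iperf3 interval transfer (iperf3 uses 1 KByte = 1024 bytes)."""
    return transfer_kbytes * 1024 / block_bytes


if __name__ == "__main__":
    print(f"ceiling: {ceiling_mbits(BLOCK_BYTES, ASSUMED_STALL_S):.2f} Mbits/sec")
    # Transfers for the 1.00-8.00 s intervals of the SCTP test
    for kbytes in (448, 640, 576, 768, 384, 576, 448):
        n = blocks_in_interval(kbytes)
        print(f"{kbytes} KBytes -> {n:.0f} blocks, ~{1000 / n:.0f} ms per block")
```

Under this model the ceiling is about 2.6 Mbits/sec, and each of those intervals carries a whole number of 64 KiB blocks (6 to 12 per second). That would be consistent with a stall hitting most blocks but not always lasting the full 200 ms, though this is only an interpretation of the log.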

@davidBar-On
Contributor

I am not familiar with SCTP, but it seems that SCTP includes a "delayed ack" mechanism and that the 200 ms delay comes from a related system configuration setting. For example, see this and SCTP RFC 4960 section 6.2 (or the same section in its update, RFC 9260).
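
One way to check whether the configured delayed-SACK timer matches the 200 ms seen in tcpdump is to read it on both the arm64 and x86 hosts. On Linux the setting is exposed as the `net.sctp.sack_timeout` sysctl (in milliseconds) once the sctp module is loaded. The following is a minimal sketch; the helper name `read_sctp_sysctl` is made up for this example.

```python
from pathlib import Path
from typing import Optional


def read_sctp_sysctl(name: str, base: str = "/proc/sys/net/sctp") -> Optional[int]:
    """Return an integer SCTP sysctl value, or None if it cannot be read.

    The file is absent when the sctp kernel module is not loaded.
    """
    path = Path(base) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    timeout_ms = read_sctp_sysctl("sack_timeout")
    if timeout_ms is None:
        print("net.sctp.sack_timeout not available (is the sctp module loaded?)")
    else:
        print(f"net.sctp.sack_timeout = {timeout_ms} ms")
```

If both architectures report the same value, the difference is more likely in how often the delayed SACK is triggered than in the timer itself.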

@nshopik
Author

nshopik commented Jan 9, 2025

Yeah, I need to investigate a bit more. I never looked at an x86 traffic dump to see if there is a difference; it could be related to arch settings.
