On 2021/4/21 17:25, Yunsheng Lin wrote:
On 2021/4/21 16:44, Michal Kubecek wrote:
I'll try running some tests also on other architectures, including arm64 and s390x (to catch potential endianness issues).
I tried debugging nperf on arm64, with the below patch:

diff --git a/client/main.c b/client/main.c
index 429634d..de1a3ef 100644
--- a/client/main.c
+++ b/client/main.c
@@ -63,7 +63,10 @@ static int client_init(void)
 	ret = client_set_usr1_handler();
 	if (ret < 0)
 		return ret;
-	return ignore_signal(SIGPIPE);
+	//return ignore_signal(SIGPIPE);
+	signal(SIGPIPE, SIG_IGN);
+
+	return 0;
 }
 
 static int ctrl_send_start(struct client_config *config)
diff --git a/client/worker.c b/client/worker.c
index ac026893..d269311 100644
--- a/client/worker.c
+++ b/client/worker.c
@@ -7,7 +7,7 @@
 #include "worker.h"
 #include "main.h"
 
-#define WORKER_STACK_SIZE	16384
+#define WORKER_STACK_SIZE	131072
 
 struct client_worker_data *workers_data;
 union sockaddr_any test_addr;
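For reference, below is a minimal standalone sketch of what the two hunks above work around, assuming nperf's workers are pthreads created with an explicit stack size (the names in the sketch are hypothetical, not nperf code). Notably, glibc on aarch64 defines PTHREAD_STACK_MIN as 131072, so a 16384-byte stack request is too small there, which would explain the WORKER_STACK_SIZE bump to exactly that value:

#include <limits.h>	/* PTHREAD_STACK_MIN */
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static void *worker(void *arg)
{
	(void)arg;	/* a real worker would run its send()/recv() loop here */
	return NULL;
}

int main(void)
{
	struct sigaction sa = { .sa_handler = SIG_IGN };
	pthread_attr_t attr;
	pthread_t tid;
	size_t stack_size = 16384;	/* the old WORKER_STACK_SIZE */
	int err;

	/* Same effect as the signal(SIGPIPE, SIG_IGN) hunk: with SIGPIPE
	 * ignored, writing to a connection closed by the peer fails with
	 * EPIPE instead of killing the whole process. */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGPIPE, &sa, NULL)) {
		perror("sigaction");
		return 1;
	}

	/* On aarch64 glibc, PTHREAD_STACK_MIN is 131072 (128 KiB), so a
	 * 16 KiB stack must be rounded up or pthread_attr_setstacksize()
	 * fails with EINVAL. */
	if (stack_size < PTHREAD_STACK_MIN)
		stack_size = PTHREAD_STACK_MIN;

	pthread_attr_init(&attr);
	err = pthread_attr_setstacksize(&attr, stack_size);
	if (!err)
		err = pthread_create(&tid, &attr, worker, NULL);
	if (err) {
		/* pthread functions return an error number, not errno */
		fprintf(stderr, "thread setup: %s\n", strerror(err));
		return 1;
	}
	pthread_join(tid, NULL);
	pthread_attr_destroy(&attr);
	return 0;
}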
It gives the below error output:
../nperf/nperf -H 127.0.0.1 -l 3 -i 1 --exact -t TCP_STREAM -M 1
server: 127.0.0.1, port 12543
iterations: 1, threads: 1, test length: 3
test: TCP_STREAM, message size: 1048576
run test begin
send begin
send done: -32
failed to receive server stats
*** Iteration 1 failed, quitting. ***
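As a side note on the -32 above: assuming nperf reports a negated errno value here, errno 32 on Linux is EPIPE ("Broken pipe"), i.e. the client's send() ran into a connection the server side had already closed, which also fits the SIGPIPE workaround in the patch. This can be double-checked with a trivial snippet:

#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	int ret = -32;	/* the value printed by "send done: -32" above */

	/* On Linux, errno 32 is EPIPE, so this prints:
	 *   send done: -32 (Broken pipe)
	 */
	printf("send done: %d (%s)\n", ret, strerror(-ret));
	return 0;
}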
Tcpdump shows the below output:

09:55:12.253341 IP localhost.53080 > localhost.12543: Flags [S], seq 3954442980, win 65495, options [mss 65495,sackOK,TS val 3268837738 ecr 0,nop,wscale 7], length 0
09:55:12.253363 IP localhost.12543 > localhost.53080: Flags [S.], seq 4240541653, ack 3954442981, win 65483, options [mss 65495,sackOK,TS val 3268837738 ecr 3268837738,nop,wscale 7], length 0
09:55:12.253379 IP localhost.53080 > localhost.12543: Flags [.], ack 1, win 512, options [nop,nop,TS val 3268837738 ecr 3268837738], length 0
09:55:12.253412 IP localhost.53080 > localhost.12543: Flags [P.], seq 1:29, ack 1, win 512, options [nop,nop,TS val 3268837738 ecr 3268837738], length 28
09:55:12.253863 IP localhost.12543 > localhost.53080: Flags [P.], seq 1:17, ack 29, win 512, options [nop,nop,TS val 3268837739 ecr 3268837738], length 16
09:55:12.253891 IP localhost.53080 > localhost.12543: Flags [.], ack 17, win 512, options [nop,nop,TS val 3268837739 ecr 3268837739], length 0
09:55:12.254265 IP localhost.12543 > localhost.53080: Flags [F.], seq 17, ack 29, win 512, options [nop,nop,TS val 3268837739 ecr 3268837739], length 0
09:55:12.301992 IP localhost.53080 > localhost.12543: Flags [.], ack 18, win 512, options [nop,nop,TS val 3268837787 ecr 3268837739], length 0
09:55:15.254389 IP localhost.53080 > localhost.12543: Flags [F.], seq 29, ack 18, win 512, options [nop,nop,TS val 3268840739 ecr 3268837739], length 0
09:55:15.254426 IP localhost.12543 > localhost.53080: Flags [.], ack 30, win 512, options [nop,nop,TS val 3268840739 ecr 3268840739], length 0
Any idea what went wrong here?
Also, would you mind running netperf to see if there is a similar issue on your system?
Michal
On 2021/4/23 17:42, Yunsheng Lin wrote:
On 2021/4/21 17:25, Yunsheng Lin wrote:
On 2021/4/21 16:44, Michal Kubecek wrote:
I'll try running some tests also on other architectures, including arm64 and s390x (to catch potential endianness issues).
I tried debugging nperf on arm64, with the below patch:
Any idea what went wrong here?
Also, would you mind running netperf to see if there is a similar issue on your system?
Hi Michal,
I was able to reproduce the fluctuation for the one-thread TCP_STREAM test, so I am assuming it may be more related to the test environment or an nperf issue.
I plan to send v5, with the netdev-queue-stopped handling, after the Golden Week holiday in China. If there is any issue with the patchset, please let me know. Thanks.
Michal
On 2021/4/30 11:11, Yunsheng Lin wrote:
On 2021/4/23 17:42, Yunsheng Lin wrote:
On 2021/4/21 17:25, Yunsheng Lin wrote:
On 2021/4/21 16:44, Michal Kubecek wrote:
I'll try running some tests also on other architectures, including arm64 and s390x (to catch potential endianness issues).
I tried debugging nperf on arm64, with the below patch:
Any idea what went wrong here?
Also, would you mind running netperf to see if there is a similar issue on your system?
Hi Michal,
I was able to reproduce the fluctuation for the one-thread TCP_STREAM test,
I was *not* able. Sorry for the typo.
So I am assuming it may be more related to the test environment or an nperf issue.
I plan to send v5, with the netdev-queue-stopped handling, after the Golden Week holiday in China. If there is any issue with the patchset, please let me know. Thanks.
Michal
On Fri, Apr 30, 2021 at 11:15:01AM +0800, Yunsheng Lin wrote:
On 2021/4/30 11:11, Yunsheng Lin wrote:
On 2021/4/23 17:42, Yunsheng Lin wrote:
On 2021/4/21 17:25, Yunsheng Lin wrote:
On 2021/4/21 16:44, Michal Kubecek wrote:
I'll try running some tests also on other architectures, including arm64 and s390x (to catch potential endianness issues).
I tried debugging nperf on arm64, with the below patch:
Any idea what went wrong here?
Also, would you mind running netperf to see if there is a similar issue on your system?
Hi Michal,
I was able to reproduce the fluctuation for the one-thread TCP_STREAM test,
I was *not* able. Sorry for the typo.
I was able to reproduce the same problem with netperf:
marfak:~ # for i in {1..60}; do netperf -H 172.17.1.1 -l 30 -t TCP_STREAM -- -m 1048576; done
131072  16384   1048576    30.00    9413.36
131072  16384   1048576    30.01    7473.68   <---
131072  16384   1048576    30.00    9413.97
131072  16384   1048576    30.00    9413.76
131072  16384   1048576    30.01    9024.25
131072  16384   1048576    30.01    8364.78
131072  16384   1048576    30.00    9413.22
131072  16384   1048576    30.00    9414.29
131072  16384   1048576    30.00    9414.32
131072  16384   1048576    30.00    9412.58
131072  16384   1048576    30.00    9412.79
131072  16384   1048576    30.00    9413.18
131072  16384   1048576    30.01    8771.57   <---
131072  16384   1048576    30.00    9414.01
131072  16384   1048576    30.00    9413.93
131072  16384   1048576    30.00    9413.97
131072  16384   1048576    30.00    9414.05
131072  16384   1048576    30.00    9412.92
131072  16384   1048576    30.00    9413.40
131072  16384   1048576    30.00    9414.41
131072  16384   1048576    30.00    9413.25
131072  16384   1048576    30.00    9413.38
131072  16384   1048576    30.00    9412.28
131072  16384   1048576    30.00    9413.50
131072  16384   1048576    30.00    9414.12
131072  16384   1048576    30.00    9414.27
131072  16384   1048576    30.00    9412.96
131072  16384   1048576    30.00    9413.71
131072  16384   1048576    30.01    9205.98
131072  16384   1048576    30.00    9413.69
131072  16384   1048576    30.00    9413.60
131072  16384   1048576    30.01    8297.03   <---
131072  16384   1048576    30.00    9414.09
131072  16384   1048576    30.00    9414.38
131072  16384   1048576    30.00    9413.62
131072  16384   1048576    30.00    9411.09
131072  16384   1048576    30.00    9414.37
131072  16384   1048576    30.00    9414.37
131072  16384   1048576    30.00    9412.52
131072  16384   1048576    30.00    9414.06
131072  16384   1048576    30.00    9413.66
131072  16384   1048576    30.00    9411.63
131072  16384   1048576    30.00    9414.17
131072  16384   1048576    30.00    9414.07
131072  16384   1048576    30.00    9414.09
131072  16384   1048576    30.00    9414.37
131072  16384   1048576    30.00    9390.00
131072  16384   1048576    30.00    9413.72
131072  16384   1048576    30.01    9260.97
131072  16384   1048576    30.01    9334.91
131072  16384   1048576    30.00    9413.57
131072  16384   1048576    30.00    9412.01
131072  16384   1048576    30.00    9414.36
131072  16384   1048576    30.00    9412.47
131072  16384   1048576    30.00    9413.73
131072  16384   1048576    30.00    9413.48
131072  16384   1048576    30.00    9413.36
131072  16384   1048576    30.01    9327.42
131072  16384   1048576    30.01    9240.33
131072  16384   1048576    30.00    9413.97
(showing only the interesting lines)
But after some more testing, I was also able to see similar results with an unpatched mainline kernel:
131072  16384   1048576    30.00    9413.28
131072  16384   1048576    30.01    9007.17
131072  16384   1048576    30.01    9153.22
131072  16384   1048576    30.00    9414.28
131072  16384   1048576    30.01    9244.68
131072  16384   1048576    30.01    9230.49
131072  16384   1048576    30.00    8723.24   <---
131072  16384   1048576    30.01    8289.21   <---
131072  16384   1048576    30.01    9258.33
131072  16384   1048576    30.00    9251.47
131072  16384   1048576    30.00    9414.23
131072  16384   1048576    30.01    9276.87
131072  16384   1048576    30.01    9255.61
131072  16384   1048576    30.00    9072.78
131072  16384   1048576    30.00    9412.09
131072  16384   1048576    30.01    9393.00
131072  16384   1048576    30.00    9413.39
131072  16384   1048576    30.01    9404.01
131072  16384   1048576    30.01    8412.83   <---
131072  16384   1048576    30.01    9368.23
131072  16384   1048576    30.01    9259.11
131072  16384   1048576    30.01    9121.65
131072  16384   1048576    30.01    9169.87
131072  16384   1048576    30.01    9154.03
131072  16384   1048576    30.01    9336.34
131072  16384   1048576    30.00    9187.73
131072  16384   1048576    30.00    9412.54
131072  16384   1048576    30.01    6836.37   <---
131072  16384   1048576    30.01    9388.09
131072  16384   1048576    30.01    8755.78   <---
131072  16384   1048576    30.01    9167.63
131072  16384   1048576    30.00    9410.80
131072  16384   1048576    30.01    9392.71
131072  16384   1048576    30.01    9238.50
131072  16384   1048576    30.01    9382.78
131072  16384   1048576    30.01    9328.23
131072  16384   1048576    30.01    9396.04
131072  16384   1048576    30.01    9286.10
131072  16384   1048576    30.00    9412.44
131072  16384   1048576    30.01    7952.34   <---
131072  16384   1048576    30.01    9309.95
131072  16384   1048576    30.00    9133.38
131072  16384   1048576    30.01    8672.75
131072  16384   1048576    30.00    9414.28
131072  16384   1048576    30.00    9411.34
131072  16384   1048576    30.00    9414.27
131072  16384   1048576    30.01    9313.60
131072  16384   1048576    30.01    9315.10
131072  16384   1048576    30.00    9413.23
131072  16384   1048576    30.01    9285.77
131072  16384   1048576    30.00    9414.28
131072  16384   1048576    30.00    9406.39
131072  16384   1048576    30.01    9343.74
131072  16384   1048576    30.01    9179.17
131072  16384   1048576    30.01    9081.18
131072  16384   1048576    30.00    9412.85
131072  16384   1048576    30.00    9413.66
131072  16384   1048576    30.01    9346.16
131072  16384   1048576    30.00    9410.01
131072  16384   1048576    30.00    9411.22
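To compare the two runs more objectively than by eyeballing the dips, the throughput column can be fed through a small helper like the sketch below (a hypothetical filter, not part of netperf; it assumes one throughput value per input line, e.g. extracted with awk '{print $5}'):

#include <stdio.h>

/* Read one throughput value per line from stdin and report the mean,
 * the minimum, and the worst dip relative to the mean. */
int main(void)
{
	double v, sum = 0.0, min = 0.0;
	long n = 0;

	while (scanf("%lf", &v) == 1) {
		sum += v;
		if (n == 0 || v < min)
			min = v;
		n++;
	}
	if (n == 0)
		return 1;
	printf("samples: %ld, mean: %.2f, min: %.2f (%.1f%% below mean)\n",
	       n, sum / n, min, 100.0 * (1.0 - min * n / sum));
	return 0;
}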
It's not clear why I haven't seen these before, but the problem is unlikely to be related to your patch set.
Michal