[RFC PATCH v0 0/2] skbuff: Fix applications not being woken for errors

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC PATCH v0 0/2] skbuff: Fix applications not being woken for errors
@ 2018-03-12 23:10 Vinicius Costa Gomes
  2018-03-12 23:10 ` [RFC PATCH v0 1/2] selftests/txtimestamp: Add more configurable parameters Vinicius Costa Gomes
  2018-03-12 23:10 ` [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report() Vinicius Costa Gomes
  0 siblings, 2 replies; 6+ messages in thread
From: Vinicius Costa Gomes @ 2018-03-12 23:10 UTC (permalink / raw)
  To: netdev; +Cc: Vinicius Costa Gomes, randy.e.witt, davem

Hi,

This is actually a "bug report"-RFC instead of the more usual "new
feature"-RFC.

We are developing an application that uses TX hardware timestamping to
make some measurements, and during development Randy Witt initially
reported that the application poll() never unblocked when TX hardware
timestamping was enabled.

After some investigation, it turned out the problem wasn't only
exclusive to hardware timestamping, and could be reproduced with
software timestamping.

Applying patch (1), and running txtimestamp like this, for example:

$ ./txtimestamp -u -4 192.168.1.71 -c 1000 -D -l 1000 -F

('-u' to use UDP only, '-4' for ipv4 only, '-c 1000' to send 1000
packets for each test, '-D' to remove the delay between packets, '-l
1000' to set the payload to 1000 bytes, '-F' for configuring poll() to
wait forever)

will cause the application to become stuck in the poll() call in most
of the times. (Note: I couldn't reproduce the issue running against an
address that is routed through loopback.)

Another interesting fact is that if the POLLIN event is added to the
poll() .events, poll() no longer becomes stuck, and more interestingly
the returned event in .revents is only POLLERR.

After a few debugging sessions, we got to 'sock_queue_err_skb()' and
how it notifies applications of the error just enqueued. Changing it
to use 'sk->sk_error_report()', fixes the issue for hardware and
software timestamping. That is patch (2).

The "solution" proposed in patch (2) looks like too big a hammer, if
it's not, then it seems that this problem existed since a long time
ago (pre git) and was uncommon for folks to reach the necessary
conditions to trigger it (my hypothesis is that only triggers when the
error is reported from a different task context than the application).

Am I missing something here?

Cheers,
--

Vinicius Costa Gomes (2):
  selftests/txtimestamp: Add more configurable parameters
  skbuff: Notify errors with sk_error_report()

 net/core/skbuff.c                                   |  2 +-
 .../selftests/networking/timestamping/txtimestamp.c | 21 ++++++++++++++++++---
 2 files changed, 19 insertions(+), 4 deletions(-)

--
2.16.2

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [RFC PATCH v0 1/2] selftests/txtimestamp: Add more configurable parameters
  2018-03-12 23:10 [RFC PATCH v0 0/2] skbuff: Fix applications not being woken for errors Vinicius Costa Gomes
@ 2018-03-12 23:10 ` Vinicius Costa Gomes
  2018-03-12 23:10 ` [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report() Vinicius Costa Gomes
  1 sibling, 0 replies; 6+ messages in thread
From: Vinicius Costa Gomes @ 2018-03-12 23:10 UTC (permalink / raw)
  To: netdev; +Cc: Vinicius Costa Gomes, randy.e.witt, davem

Add a way to configure if poll() should wait forever for an event, the
number of packets that should be sent for each and if there should be
any delay between packets.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 .../selftests/networking/timestamping/txtimestamp.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/networking/timestamping/txtimestamp.c b/tools/testing/selftests/networking/timestamping/txtimestamp.c
index 5df07047ca86..5190b1dd78b1 100644
--- a/tools/testing/selftests/networking/timestamping/txtimestamp.c
+++ b/tools/testing/selftests/networking/timestamping/txtimestamp.c
@@ -68,9 +68,11 @@ static int cfg_num_pkts = 4;
 static int do_ipv4 = 1;
 static int do_ipv6 = 1;
 static int cfg_payload_len = 10;
+static int cfg_poll_timeout = 100;
 static bool cfg_show_payload;
 static bool cfg_do_pktinfo;
 static bool cfg_loop_nodata;
+static bool cfg_no_delay;
 static uint16_t dest_port = 9000;
 
 static struct sockaddr_in daddr;
@@ -171,7 +173,7 @@ static void __poll(int fd)
 
 	memset(&pollfd, 0, sizeof(pollfd));
 	pollfd.fd = fd;
-	ret = poll(&pollfd, 1, 100);
+	ret = poll(&pollfd, 1, cfg_poll_timeout);
 	if (ret != 1)
 		error(1, errno, "poll");
 }
@@ -371,7 +373,8 @@ static void do_test(int family, unsigned int opt)
 			error(1, errno, "send");
 
 		/* wait for all errors to be queued, else ACKs arrive OOO */
-		usleep(50 * 1000);
+		if (!cfg_no_delay)
+			usleep(50 * 1000);
 
 		__poll(fd);
 
@@ -397,6 +400,9 @@ static void __attribute__((noreturn)) usage(const char *filepath)
 			"  -n:   set no-payload option\n"
 			"  -r:   use raw\n"
 			"  -R:   use raw (IP_HDRINCL)\n"
+			"  -D:   no delay between packets\n"
+			"  -F:   poll() waits forever for an event\n"
+			"  -c N: number of packets for each test\n"
 			"  -p N: connect to port N\n"
 			"  -u:   use udp\n"
 			"  -x:   show payload (up to 70 bytes)\n",
@@ -409,7 +415,7 @@ static void parse_opt(int argc, char **argv)
 	int proto_count = 0;
 	char c;
 
-	while ((c = getopt(argc, argv, "46hIl:np:rRux")) != -1) {
+	while ((c = getopt(argc, argv, "46hIl:np:rRuxc:DF")) != -1) {
 		switch (c) {
 		case '4':
 			do_ipv6 = 0;
@@ -447,6 +453,15 @@ static void parse_opt(int argc, char **argv)
 		case 'x':
 			cfg_show_payload = true;
 			break;
+		case 'c':
+			cfg_num_pkts = strtoul(optarg, NULL, 10);
+			break;
+		case 'D':
+			cfg_no_delay = true;
+			break;
+		case 'F':
+			cfg_poll_timeout = -1;
+			break;
 		case 'h':
 		default:
 			usage(argv[0]);
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report()
  2018-03-12 23:10 [RFC PATCH v0 0/2] skbuff: Fix applications not being woken for errors Vinicius Costa Gomes
  2018-03-12 23:10 ` [RFC PATCH v0 1/2] selftests/txtimestamp: Add more configurable parameters Vinicius Costa Gomes
@ 2018-03-12 23:10 ` Vinicius Costa Gomes
  2018-03-13  0:23   ` Eric Dumazet
  1 sibling, 1 reply; 6+ messages in thread
From: Vinicius Costa Gomes @ 2018-03-12 23:10 UTC (permalink / raw)
  To: netdev; +Cc: Vinicius Costa Gomes, randy.e.witt, davem

When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the correct application is not notified.

Reported-by: Randy E. Witt <randy.e.witt@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 715c13495ba6..6def3534f509 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4181,7 +4181,7 @@ int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb)
 
 	skb_queue_tail(&sk->sk_error_queue, skb);
 	if (!sock_flag(sk, SOCK_DEAD))
-		sk->sk_data_ready(sk);
+		sk->sk_error_report(sk);
 	return 0;
 }
 EXPORT_SYMBOL(sock_queue_err_skb);
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report()
  2018-03-12 23:10 ` [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report() Vinicius Costa Gomes
@ 2018-03-13  0:23   ` Eric Dumazet
  2018-03-13 17:20     ` Vinicius Costa Gomes
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2018-03-13  0:23 UTC (permalink / raw)
  To: Vinicius Costa Gomes, netdev; +Cc: randy.e.witt, davem



On 03/12/2018 04:10 PM, Vinicius Costa Gomes wrote:
> When errors are enqueued to the error queue via sock_queue_err_skb()
> function, it is possible that the correct application is not notified.

Your patch makes sense, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report()
  2018-03-13  0:23   ` Eric Dumazet
@ 2018-03-13 17:20     ` Vinicius Costa Gomes
  2018-03-13 17:27       ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Vinicius Costa Gomes @ 2018-03-13 17:20 UTC (permalink / raw)
  To: Eric Dumazet, netdev; +Cc: randy.e.witt, davem

Hi

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On 03/12/2018 04:10 PM, Vinicius Costa Gomes wrote:
>> When errors are enqueued to the error queue via sock_queue_err_skb()
>> function, it is possible that the correct application is not notified.
>
> Your patch makes sense, thanks.

Cool. Will send a non-RFC version then.

Would the changes to txtimestamp selftest be helpful?


Cheers,
--
Vinicius

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report()
  2018-03-13 17:20     ` Vinicius Costa Gomes
@ 2018-03-13 17:27       ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2018-03-13 17:27 UTC (permalink / raw)
  To: Vinicius Costa Gomes, Eric Dumazet, netdev; +Cc: randy.e.witt, davem



On 03/13/2018 10:20 AM, Vinicius Costa Gomes wrote:
> Hi

> Cool. Will send a non-RFC version then.
> 
> Would the changes to txtimestamp selftest be helpful?

Yes, since it could have spotted the bug earlier.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-03-13 17:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-03-12 23:10 [RFC PATCH v0 0/2] skbuff: Fix applications not being woken for errors Vinicius Costa Gomes
2018-03-12 23:10 ` [RFC PATCH v0 1/2] selftests/txtimestamp: Add more configurable parameters Vinicius Costa Gomes
2018-03-12 23:10 ` [RFC PATCH v0 2/2] skbuff: Notify errors with sk_error_report() Vinicius Costa Gomes
2018-03-13  0:23   ` Eric Dumazet
2018-03-13 17:20     ` Vinicius Costa Gomes
2018-03-13 17:27       ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).