* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Eric Dumazet @ 2012-05-17 22:14 UTC (permalink / raw)
To: Willy Tarreau; +Cc: netdev
In-Reply-To: <20120517211414.GP14498@1wt.eu>
On Thu, 2012-05-17 at 23:14 +0200, Willy Tarreau wrote:
> ---
>
> From 39c3f73176118a274ec9dea9c22c83d97a7fbfe0 Mon Sep 17 00:00:00 2001
> From: Willy Tarreau <w@1wt.eu>
> Date: Thu, 17 May 2012 22:43:20 +0200
> Subject: [PATCH] tcp: do_tcp_sendpages() must try to push data out on oom conditions
>
> Since recent changes on TCP splicing (starting with commits 2f533844
> and 35f9c09f), I started seeing massive stalls when forwarding traffic
> between two sockets using splice() when pipe buffers were larger than
> socket buffers.
>
> Latest changes (net: netdev_alloc_skb() use build_skb()) made the
> problem even more apparent.
>
> The reason seems to be that if do_tcp_sendpages() fails on out of memory
> condition without being able to send at least one byte, tcp_push() is not
> called and the buffers cannot be flushed.
>
> After applying the attached patch, I cannot reproduce the stalls at all
> and the data rate it perfectly stable and steady under any condition
> which previously caused the problem to be permanent.
>
> The issue seems to have been there since before the kernel migrated to
> git, which makes me think that the stalls I occasionally experienced
> with tux during stress-tests years ago were probably related to the
> same issue.
>
> This issue was first encountered on 3.0.31 and 3.2.17, so please backport
> to -stable.
>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> Cc: <stable@vger.kernel.org>
> ---
> net/ipv4/tcp.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 63ddaee..231fe53 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -917,8 +917,7 @@ new_segment:
> wait_for_sndbuf:
> set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> wait_for_memory:
> - if (copied)
> - tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
> + tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
>
> if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
> goto do_error;
Acked-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Willy Tarreau @ 2012-05-17 22:16 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1337292119.3403.60.camel@edumazet-glaptop>
On Fri, May 18, 2012 at 12:01:59AM +0200, Eric Dumazet wrote:
> On Thu, 2012-05-17 at 23:50 +0200, Eric Dumazet wrote:
> > On Thu, 2012-05-17 at 23:40 +0200, Eric Dumazet wrote:
> >
> > > I dont understand why we should tcp_push() if we sent 0 bytes in this
> > > splice() call.
> > >
> > > The push() should have be done already by prior splice() call, dont you
> > > think ?
> > >
> > > out:
> > > if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
> > > tcp_push(sk, flags, mss_now, tp->nonagle);
> > >
> >
> > I think I now understand
> >
> > One splice() syscall actually calls do_tcp_sendpages() several times
> > (N).
> >
> > Problem is N-1 calls are done with MSG_SENDPAGE_NOTLAST set
> >
> > And last one with MSG_SENDPAGE_NOTLAST unset
> >
> > So maybe we should replace this test by :
> >
> > if (!(flags & MSG_SENDPAGE_NOTLAST))
> > tcp_push(...);
> >
> >
>
> Its more tricky than that.
>
> If we return 0 from do_tcp_sendpages(), __splice_from_pipe() wont call
> us again, but we need to flush(). (This bug is indeed very old)
>
> So maybe use :
>
> if ((copied && !(flags & MSG_SENDPAGE_NOTLAST)) ||
> !copied)
> tcp_push(...);
That's what I wanted to do at first but was still worried about the
situations where we fail upon first call due to lack of memory. And
indeed that did not fix the issue either :-(
At least now I've checked that we fail here in do_tcp_sendpages() :
if (!sk_stream_memory_free(sk)) {
printk(KERN_DEBUG "%s:%d copied=%d\n", __FUNCTION__, __LINE__, copied);
goto wait_for_sndbuf;
}
Well, finally I just put the same test as yours above for the call to
tcp_push() upon out of memory and it fixed it too. I have simplified
the expression, since ((A && !B) || !A) == !(A & B).
With the updated patch here it works for me. I don't know if it's
better.
Willy
-----
>From 1f26263bd4f8827bbca4cef77a99fac128cf8ccc Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w@1wt.eu>
Date: Thu, 17 May 2012 22:43:20 +0200
Subject: [PATCH] tcp: do_tcp_sendpages() must try to push data out on oom conditions
Since recent changes on TCP splicing (starting with commits 2f533844
and 35f9c09f), I started seeing massive stalls when forwarding traffic
between two sockets using splice() when pipe buffers were larger than
socket buffers.
Latest changes (net: netdev_alloc_skb() use build_skb()) made the
problem even more apparent.
The reason seems to be that if do_tcp_sendpages() fails on out of memory
condition without being able to send at least one byte, tcp_push() is not
called and the buffers cannot be flushed.
After applying the attached patch, I cannot reproduce the stalls at all
and the data rate it perfectly stable and steady under any condition
which previously caused the problem to be permanent.
The issue seems to have been there since before the kernel migrated to
git, which makes me think that the stalls I occasionally experienced
with tux during stress-tests years ago were probably related to the
same issue.
This issue was first encountered on 3.0.31 and 3.2.17, so please backport
to -stable.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Cc: <stable@vger.kernel.org>
---
net/ipv4/tcp.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 63ddaee..cfe47c1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -917,7 +917,7 @@ new_segment:
wait_for_sndbuf:
set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
wait_for_memory:
- if (copied)
+ if (!(copied && (flags & MSG_SENDPAGE_NOTLAST)))
tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
@@ -927,7 +927,7 @@ wait_for_memory:
}
out:
- if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
+ if (!(copied && (flags & MSG_SENDPAGE_NOTLAST)))
tcp_push(sk, flags, mss_now, tp->nonagle);
return copied;
--
1.7.2.1.45.g54fbc
^ permalink raw reply related
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Eric Dumazet @ 2012-05-17 22:22 UTC (permalink / raw)
To: Willy Tarreau; +Cc: netdev
In-Reply-To: <20120517221641.GS14498@1wt.eu>
On Fri, 2012-05-18 at 00:16 +0200, Willy Tarreau wrote:
> On Fri, May 18, 2012 at 12:01:59AM +0200, Eric Dumazet wrote:
> > On Thu, 2012-05-17 at 23:50 +0200, Eric Dumazet wrote:
> > > On Thu, 2012-05-17 at 23:40 +0200, Eric Dumazet wrote:
> > >
> > > > I dont understand why we should tcp_push() if we sent 0 bytes in this
> > > > splice() call.
> > > >
> > > > The push() should have be done already by prior splice() call, dont you
> > > > think ?
> > > >
> > > > out:
> > > > if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
> > > > tcp_push(sk, flags, mss_now, tp->nonagle);
> > > >
> > >
> > > I think I now understand
> > >
> > > One splice() syscall actually calls do_tcp_sendpages() several times
> > > (N).
> > >
> > > Problem is N-1 calls are done with MSG_SENDPAGE_NOTLAST set
> > >
> > > And last one with MSG_SENDPAGE_NOTLAST unset
> > >
> > > So maybe we should replace this test by :
> > >
> > > if (!(flags & MSG_SENDPAGE_NOTLAST))
> > > tcp_push(...);
> > >
> > >
> >
> > Its more tricky than that.
> >
> > If we return 0 from do_tcp_sendpages(), __splice_from_pipe() wont call
> > us again, but we need to flush(). (This bug is indeed very old)
> >
> > So maybe use :
> >
> > if ((copied && !(flags & MSG_SENDPAGE_NOTLAST)) ||
> > !copied)
> > tcp_push(...);
>
> That's what I wanted to do at first but was still worried about the
> situations where we fail upon first call due to lack of memory. And
> indeed that did not fix the issue either :-(
>
> At least now I've checked that we fail here in do_tcp_sendpages() :
>
> if (!sk_stream_memory_free(sk)) {
> printk(KERN_DEBUG "%s:%d copied=%d\n", __FUNCTION__, __LINE__, copied);
> goto wait_for_sndbuf;
> }
>
> Well, finally I just put the same test as yours above for the call to
> tcp_push() upon out of memory and it fixed it too. I have simplified
> the expression, since ((A && !B) || !A) == !(A & B).
>
> With the updated patch here it works for me. I don't know if it's
> better.
>
> Willy
>
> -----
>
> From 1f26263bd4f8827bbca4cef77a99fac128cf8ccc Mon Sep 17 00:00:00 2001
> From: Willy Tarreau <w@1wt.eu>
> Date: Thu, 17 May 2012 22:43:20 +0200
> Subject: [PATCH] tcp: do_tcp_sendpages() must try to push data out on oom conditions
>
> Since recent changes on TCP splicing (starting with commits 2f533844
> and 35f9c09f), I started seeing massive stalls when forwarding traffic
> between two sockets using splice() when pipe buffers were larger than
> socket buffers.
>
> Latest changes (net: netdev_alloc_skb() use build_skb()) made the
> problem even more apparent.
>
> The reason seems to be that if do_tcp_sendpages() fails on out of memory
> condition without being able to send at least one byte, tcp_push() is not
> called and the buffers cannot be flushed.
>
> After applying the attached patch, I cannot reproduce the stalls at all
> and the data rate it perfectly stable and steady under any condition
> which previously caused the problem to be permanent.
>
> The issue seems to have been there since before the kernel migrated to
> git, which makes me think that the stalls I occasionally experienced
> with tux during stress-tests years ago were probably related to the
> same issue.
>
> This issue was first encountered on 3.0.31 and 3.2.17, so please backport
> to -stable.
>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> Cc: <stable@vger.kernel.org>
> ---
> net/ipv4/tcp.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 63ddaee..cfe47c1 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -917,7 +917,7 @@ new_segment:
> wait_for_sndbuf:
> set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> wait_for_memory:
> - if (copied)
> + if (!(copied && (flags & MSG_SENDPAGE_NOTLAST)))
> tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
>
> if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
> @@ -927,7 +927,7 @@ wait_for_memory:
> }
>
> out:
> - if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
> + if (!(copied && (flags & MSG_SENDPAGE_NOTLAST)))
> tcp_push(sk, flags, mss_now, tp->nonagle);
> return copied;
>
Hmm... I believe I prefer your prior patch ( the one I Acked)
^ permalink raw reply
* Re: [PATCH 2/5] drivers/net: delete all code/drivers depending on CONFIG_MCA
From: David Miller @ 2012-05-17 22:23 UTC (permalink / raw)
To: paul.gortmaker; +Cc: linux-kernel, JBottomley, netdev
In-Reply-To: <20120517212039.GC27789@windriver.com>
From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Thu, 17 May 2012 17:20:39 -0400
> I've layered the above commit (with 3c527.h deleted) onto today's
> net-next, did a quick re-test of x86_64 allmodconfig and put it
> on a branch as per below (vs sending a 5000+ line patch).
...
> git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git delete-mca-net-drivers
Pulled, thanks Paul.
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Willy Tarreau @ 2012-05-17 22:24 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1337293351.3403.67.camel@edumazet-glaptop>
On Fri, May 18, 2012 at 12:22:31AM +0200, Eric Dumazet wrote:
> On Fri, 2012-05-18 at 00:16 +0200, Willy Tarreau wrote:
> > On Fri, May 18, 2012 at 12:01:59AM +0200, Eric Dumazet wrote:
> > > On Thu, 2012-05-17 at 23:50 +0200, Eric Dumazet wrote:
> > > > On Thu, 2012-05-17 at 23:40 +0200, Eric Dumazet wrote:
> > > >
> > > > > I dont understand why we should tcp_push() if we sent 0 bytes in this
> > > > > splice() call.
> > > > >
> > > > > The push() should have be done already by prior splice() call, dont you
> > > > > think ?
> > > > >
> > > > > out:
> > > > > if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
> > > > > tcp_push(sk, flags, mss_now, tp->nonagle);
> > > > >
> > > >
> > > > I think I now understand
> > > >
> > > > One splice() syscall actually calls do_tcp_sendpages() several times
> > > > (N).
> > > >
> > > > Problem is N-1 calls are done with MSG_SENDPAGE_NOTLAST set
> > > >
> > > > And last one with MSG_SENDPAGE_NOTLAST unset
> > > >
> > > > So maybe we should replace this test by :
> > > >
> > > > if (!(flags & MSG_SENDPAGE_NOTLAST))
> > > > tcp_push(...);
> > > >
> > > >
> > >
> > > Its more tricky than that.
> > >
> > > If we return 0 from do_tcp_sendpages(), __splice_from_pipe() wont call
> > > us again, but we need to flush(). (This bug is indeed very old)
> > >
> > > So maybe use :
> > >
> > > if ((copied && !(flags & MSG_SENDPAGE_NOTLAST)) ||
> > > !copied)
> > > tcp_push(...);
> >
> > That's what I wanted to do at first but was still worried about the
> > situations where we fail upon first call due to lack of memory. And
> > indeed that did not fix the issue either :-(
> >
> > At least now I've checked that we fail here in do_tcp_sendpages() :
> >
> > if (!sk_stream_memory_free(sk)) {
> > printk(KERN_DEBUG "%s:%d copied=%d\n", __FUNCTION__, __LINE__, copied);
> > goto wait_for_sndbuf;
> > }
> >
> > Well, finally I just put the same test as yours above for the call to
> > tcp_push() upon out of memory and it fixed it too. I have simplified
> > the expression, since ((A && !B) || !A) == !(A & B).
> >
> > With the updated patch here it works for me. I don't know if it's
> > better.
> >
> > Willy
> >
> > -----
> >
> > From 1f26263bd4f8827bbca4cef77a99fac128cf8ccc Mon Sep 17 00:00:00 2001
> > From: Willy Tarreau <w@1wt.eu>
> > Date: Thu, 17 May 2012 22:43:20 +0200
> > Subject: [PATCH] tcp: do_tcp_sendpages() must try to push data out on oom conditions
> >
> > Since recent changes on TCP splicing (starting with commits 2f533844
> > and 35f9c09f), I started seeing massive stalls when forwarding traffic
> > between two sockets using splice() when pipe buffers were larger than
> > socket buffers.
> >
> > Latest changes (net: netdev_alloc_skb() use build_skb()) made the
> > problem even more apparent.
> >
> > The reason seems to be that if do_tcp_sendpages() fails on out of memory
> > condition without being able to send at least one byte, tcp_push() is not
> > called and the buffers cannot be flushed.
> >
> > After applying the attached patch, I cannot reproduce the stalls at all
> > and the data rate it perfectly stable and steady under any condition
> > which previously caused the problem to be permanent.
> >
> > The issue seems to have been there since before the kernel migrated to
> > git, which makes me think that the stalls I occasionally experienced
> > with tux during stress-tests years ago were probably related to the
> > same issue.
> >
> > This issue was first encountered on 3.0.31 and 3.2.17, so please backport
> > to -stable.
> >
> > Signed-off-by: Willy Tarreau <w@1wt.eu>
> > Cc: <stable@vger.kernel.org>
> > ---
> > net/ipv4/tcp.c | 4 ++--
> > 1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 63ddaee..cfe47c1 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -917,7 +917,7 @@ new_segment:
> > wait_for_sndbuf:
> > set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> > wait_for_memory:
> > - if (copied)
> > + if (!(copied && (flags & MSG_SENDPAGE_NOTLAST)))
> > tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
> >
> > if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
> > @@ -927,7 +927,7 @@ wait_for_memory:
> > }
> >
> > out:
> > - if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
> > + if (!(copied && (flags & MSG_SENDPAGE_NOTLAST)))
> > tcp_push(sk, flags, mss_now, tp->nonagle);
> > return copied;
> >
>
>
> Hmm... I believe I prefer your prior patch ( the one I Acked)
I too, less changes are better :-)
Thanks,
Willy
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: David Miller @ 2012-05-17 22:25 UTC (permalink / raw)
To: w; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517222431.GT14498@1wt.eu>
From: Willy Tarreau <w@1wt.eu>
Date: Fri, 18 May 2012 00:24:31 +0200
> On Fri, May 18, 2012 at 12:22:31AM +0200, Eric Dumazet wrote:
>> Hmm... I believe I prefer your prior patch ( the one I Acked)
>
> I too, less changes are better :-)
And I think that's the one I'm going to apply after I audit this
code a little bit myself.
Thanks!
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Joe Perches @ 2012-05-17 22:27 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Willy Tarreau, netdev
In-Reply-To: <1337293351.3403.67.camel@edumazet-glaptop>
On Fri, 2012-05-18 at 00:22 +0200, Eric Dumazet wrote:
> On Fri, 2012-05-18 at 00:16 +0200, Willy Tarreau wrote:
> > I have simplified the expression,
> > since ((A && !B) || !A) == !(A & B).
Perhaps it's clearer for a reader the first way and
any decent compiler (even gcc) could probably do that
optimization.
^ permalink raw reply
* Re: [RFC PATCH net-next] hp100: delete VG/AnyLAN hp100
From: Francois Romieu @ 2012-05-17 22:28 UTC (permalink / raw)
To: Joe Perches
Cc: Paul Gortmaker, linux-kernel, JBottomley, David S. Miller, netdev,
Joel Soete
In-Reply-To: <1337292547.8872.20.camel@joe2Laptop>
Joe Perches <joe@perches.com> :
> On Thu, 2012-05-17 at 17:20 -0400, Paul Gortmaker wrote:
> > [Re: [PATCH 2/5] drivers/net: delete all code/drivers depending on CONFIG_MCA
>
> If we're removing really old and unused stuff,
> how about the VG/AnyLAN driver too?
Joel Soete appeared to actively use a PCI one with recent kernels back
in 2007.
--
Ueimor
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Willy Tarreau @ 2012-05-17 22:29 UTC (permalink / raw)
To: David Miller; +Cc: Eric Dumazet, netdev
In-Reply-To: <1337292894.3403.66.camel@edumazet-glaptop>
David,
After several tests, we finally agreed on this one. I can't make the
transfers fail anymore. And they're damn fast thanks to Eric's recent
work!
Best regards,
Willy
---
>From 39c3f73176118a274ec9dea9c22c83d97a7fbfe0 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w@1wt.eu>
Date: Thu, 17 May 2012 22:43:20 +0200
Subject: [PATCH] tcp: do_tcp_sendpages() must try to push data out on oom conditions
Since recent changes on TCP splicing (starting with commits 2f533844
and 35f9c09f), I started seeing massive stalls when forwarding traffic
between two sockets using splice() when pipe buffers were larger than
socket buffers.
Latest changes (net: netdev_alloc_skb() use build_skb()) made the
problem even more apparent.
The reason seems to be that if do_tcp_sendpages() fails on out of memory
condition without being able to send at least one byte, tcp_push() is not
called and the buffers cannot be flushed.
After applying the attached patch, I cannot reproduce the stalls at all
and the data rate it perfectly stable and steady under any condition
which previously caused the problem to be permanent.
The issue seems to have been there since before the kernel migrated to
git, which makes me think that the stalls I occasionally experienced
with tux during stress-tests years ago were probably related to the
same issue.
This issue was first encountered on 3.0.31 and 3.2.17, so please backport
to -stable.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Cc: <stable@vger.kernel.org>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 63ddaee..231fe53 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -917,8 +917,7 @@ new_segment:
wait_for_sndbuf:
set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
wait_for_memory:
- if (copied)
- tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
+ tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
goto do_error;
^ permalink raw reply related
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Willy Tarreau @ 2012-05-17 22:30 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517.182545.91312407467980757.davem@davemloft.net>
On Thu, May 17, 2012 at 06:25:45PM -0400, David Miller wrote:
> From: Willy Tarreau <w@1wt.eu>
> Date: Fri, 18 May 2012 00:24:31 +0200
>
> > On Fri, May 18, 2012 at 12:22:31AM +0200, Eric Dumazet wrote:
> >> Hmm... I believe I prefer your prior patch ( the one I Acked)
> >
> > I too, less changes are better :-)
>
> And I think that's the one I'm going to apply after I audit this
> code a little bit myself.
>
> Thanks!
Sorry for the double mail, I didn't know if you were following the thread
when I forwarded it to you.
Cheers,
Willy
^ permalink raw reply
* Re: [net] e1000: Prevent reset task killing itself.
From: David Miller @ 2012-05-17 22:33 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: tushar.n.dave, netdev, gospo, sassmann, stable
In-Reply-To: <1337252690-29687-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 17 May 2012 04:04:50 -0700
> From: Tushar Dave <tushar.n.dave@intel.com>
>
> Killing reset task while adapter is resetting causes deadlock.
> Only kill reset task if adapter is not resetting.
> Ref bug #43132 on bugzilla.kernel.org
>
> CC: stable@vger.kernel.org
> Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Applied, thanks.
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: David Miller @ 2012-05-17 22:35 UTC (permalink / raw)
To: w; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517223028.GV14498@1wt.eu>
From: Willy Tarreau <w@1wt.eu>
Date: Fri, 18 May 2012 00:30:28 +0200
> On Thu, May 17, 2012 at 06:25:45PM -0400, David Miller wrote:
>> From: Willy Tarreau <w@1wt.eu>
>> Date: Fri, 18 May 2012 00:24:31 +0200
>>
>> > On Fri, May 18, 2012 at 12:22:31AM +0200, Eric Dumazet wrote:
>> >> Hmm... I believe I prefer your prior patch ( the one I Acked)
>> >
>> > I too, less changes are better :-)
>>
>> And I think that's the one I'm going to apply after I audit this
>> code a little bit myself.
>>
>> Thanks!
>
> Sorry for the double mail, I didn't know if you were following the thread
> when I forwarded it to you.
No problem.
BTW, this issue goes back all the way to the original implementation
of do_tcp_sendpages().
^ permalink raw reply
* [GIT] Networking
From: David Miller @ 2012-05-17 22:44 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
1) Thanks to Willy Tarreau and Eric Dumazet, we've unlocked a bug that's
been present in do_tcp_sendpages() since that function was written in
2002.
When we block to wait for memory we have to unconditionally try and
push out pending TCP data, otherwise we can block for an
unreasonably long amount of time.
2) Fix deadlock in e1000, fixes kernel bugzilla 43132
From Tushar Dave.
Please pull, thanks a lot!
The following changes since commit 1be5f0b7575e090fd100a98b303860879b5800de:
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma (2012-05-17 09:57:13 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net master
for you to fetch changes up to 8ce6909f77ba1b7bcdea65cc2388fd1742b6d669:
e1000: Prevent reset task killing itself. (2012-05-17 18:32:41 -0400)
----------------------------------------------------------------
Tushar Dave (1):
e1000: Prevent reset task killing itself.
Willy Tarreau (1):
tcp: do_tcp_sendpages() must try to push data out on oom conditions
drivers/net/ethernet/intel/e1000/e1000_main.c | 6 +++++-
net/ipv4/tcp.c | 3 +--
2 files changed, 6 insertions(+), 3 deletions(-)
^ permalink raw reply
* Re: [PATCH net-next] lapb: Neaten debugging
From: David Miller @ 2012-05-17 22:45 UTC (permalink / raw)
To: joe; +Cc: linux-x25, netdev, linux-kernel
In-Reply-To: <1337286349.17726.58.camel@joe2Laptop>
From: Joe Perches <joe@perches.com>
Date: Thu, 17 May 2012 13:25:49 -0700
> Enable dynamic debugging and remove a bunch of #ifdef/#endifs.
>
> Add a lapb_dbg(level, fmt, ...) macro and replace the
> printk(KERN_DEBUG uses.
> Add pr_fmt and remove embedded prefixes.
>
> Signed-off-by: Joe Perches <joe@perches.com>
Applied, thanks Joe.
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Willy Tarreau @ 2012-05-17 22:49 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517.183520.1377234710702090939.davem@davemloft.net>
On Thu, May 17, 2012 at 06:35:20PM -0400, David Miller wrote:
> BTW, this issue goes back all the way to the original implementation
> of do_tcp_sendpages().
I'm not surprized, I remember about tux on 2.4 sometimes leaving frozen
connections behind it when pushed to extreme speeds back in the days where
it was audacious to pull 2 Gbps out of a Pentium3.
I like it when we find very old issues when improving the code, it generally
means we're stressing it a bit more and making more efficient use of it.
Cheers,
Willy
^ permalink raw reply
* Re: [RFC PATCH net-next] hp100: delete VG/AnyLAN hp100
From: Joe Perches @ 2012-05-17 22:50 UTC (permalink / raw)
To: Francois Romieu, Joel Soete
Cc: Paul Gortmaker, linux-kernel, JBottomley, David S. Miller, netdev,
Joel Soete
In-Reply-To: <20120517222843.GA30680@electric-eye.fr.zoreil.com>
On Fri, 2012-05-18 at 00:28 +0200, Francois Romieu wrote:
> Joe Perches <joe@perches.com> :
> > On Thu, 2012-05-17 at 17:20 -0400, Paul Gortmaker wrote:
> > > [Re: [PATCH 2/5] drivers/net: delete all code/drivers depending on CONFIG_MCA
> >
> > If we're removing really old and unused stuff,
> > how about the VG/AnyLAN driver too?
>
> Joel Soete appeared to actively use a PCI one with recent kernels back
> in 2007.
I saw people using TR adapters with Linux as of 2008/9.
That doesn't mean it's common or make it desirable to
keep TR in the current kernel tree though.
Is VG/AnyLAN different?
^ permalink raw reply
* [PATCH] e1000: Reset rx ring index on receive overrun
From: Samuel Thibault @ 2012-05-17 23:01 UTC (permalink / raw)
To: Jeff Kirsher, Jesse Brandeburg, Bruce Allan, Carolyn Wyborny,
Don Skidmore, Greg Rose, Peter P Waskiewicz Jr, Alex Duyck,
John Ronciak, David S. Miller, Jiri Pirko, Dean Nelson,
e1000-devel, netdev
Cc: linux-kernel
At high traffic rate, the rx ring may get completely filled before we
manage to consume it. After it is filled, the kernel and device indexes
are not synchronized any more, so we have to reset them, otherwise the
kernel will be stuck waiting for the wrong slot to be filled.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
---
This is just a patch suggestion, I'm not an expert in network drivers, I
leave to actual driver authors to bake a better version.
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 37caa88..77c8dbc 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3759,6 +3759,21 @@ static irqreturn_t e1000_intr(int irq, void *data)
if (unlikely(test_bit(__E1000_DOWN, &adapter->flags)))
return IRQ_HANDLED;
+ if (unlikely(icr & E1000_ICR_RXO)) {
+ /* Receive Overrun */
+ u32 rctl;
+ int i;
+ rctl = er32(RCTL);
+ ew32(RCTL, rctl & ~E1000_RCTL_EN);
+ for (i = 0; i < adapter->num_rx_queues; i++) {
+ memset(adapter->rx_ring[i].desc, 0, adapter->rx_ring[i].size);
+ adapter->rx_ring[i].next_to_clean = 0;
+ }
+ ew32(RDH, 0);
+ ew32(RCTL, rctl);
+ adapter->netdev->stats.rx_fifo_errors++;
+ }
+
if (unlikely(icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
hw->get_link_status = 1;
/* guard against interrupt when we're going down */
^ permalink raw reply related
* Re: [PATCH V3] CS89x0 : Use ioread16/iowrite16 on all platforms
From: David Miller @ 2012-05-17 23:03 UTC (permalink / raw)
To: jaccon.bastiaansen
Cc: arnd, netdev, s.hauer, romieu, joe, gfm, festevam,
linux-arm-kernel
In-Reply-To: <1337274702-19536-1-git-send-email-jaccon.bastiaansen@gmail.com>
From: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Date: Thu, 17 May 2012 19:11:42 +0200
> The use of the inw/outw functions by the cs89x0 platform driver
> results in NULL pointer references on ARM platforms and
> platforms that do not provide ISA-style programmed I/O accessors.
>
> Using inw/outw also accesses the wrong address space on platforms
> that have a PCI I/O space that is not identity-mapped into the
> physical address space.
>
> Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
I could nit-pick more on this patch, but it's such a significant
improvement I'm just going to apply this V3 as-is to net-next.
Thanks.
^ permalink raw reply
* Re: [PATCH] e1000: Reset rx ring index on receive overrun
From: Dave, Tushar N @ 2012-05-17 23:22 UTC (permalink / raw)
To: Samuel Thibault, Kirsher, Jeffrey T, Brandeburg, Jesse,
Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
Rose, Gregory V, Waskiewicz Jr, Peter P, Duyck, Alexander H,
Ronciak, John, David S. Miller, Jiri Pirko, Dean Nelson,
e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
In-Reply-To: <20120517230140.GZ683@type.famille.thibault.fr>
I am interested in to see if you have actual test case and more importantly test data that shows that kernel and device indexes are not synchronized any more.
-Tushar
>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>On Behalf Of Samuel Thibault
>Sent: Thursday, May 17, 2012 4:02 PM
>To: Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Wyborny,
>Carolyn; Skidmore, Donald C; Rose, Gregory V; Waskiewicz Jr, Peter P;
>Duyck, Alexander H; Ronciak, John; David S. Miller; Jiri Pirko; Dean
>Nelson; e1000-devel@lists.sourceforge.net; netdev@vger.kernel.org
>Cc: linux-kernel@vger.kernel.org
>Subject: [PATCH] e1000: Reset rx ring index on receive overrun
>
>At high traffic rate, the rx ring may get completely filled before we
>manage to consume it. After it is filled, the kernel and device indexes
>are not synchronized any more, so we have to reset them, otherwise the
>kernel will be stuck waiting for the wrong slot to be filled.
>
>Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
>
>---
>This is just a patch suggestion, I'm not an expert in network drivers, I
>leave to actual driver authors to bake a better version.
>
>diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c
>b/drivers/net/ethernet/intel/e1000/e1000_main.c
>index 37caa88..77c8dbc 100644
>--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>@@ -3759,6 +3759,21 @@ static irqreturn_t e1000_intr(int irq, void *data)
> if (unlikely(test_bit(__E1000_DOWN, &adapter->flags)))
> return IRQ_HANDLED;
>
>+ if (unlikely(icr & E1000_ICR_RXO)) {
>+ /* Receive Overrun */
>+ u32 rctl;
>+ int i;
>+ rctl = er32(RCTL);
>+ ew32(RCTL, rctl & ~E1000_RCTL_EN);
>+ for (i = 0; i < adapter->num_rx_queues; i++) {
>+ memset(adapter->rx_ring[i].desc, 0, adapter-
>>rx_ring[i].size);
>+ adapter->rx_ring[i].next_to_clean = 0;
>+ }
>+ ew32(RDH, 0);
>+ ew32(RCTL, rctl);
>+ adapter->netdev->stats.rx_fifo_errors++;
>+ }
>+
> if (unlikely(icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))) {
> hw->get_link_status = 1;
> /* guard against interrupt when we're going down */
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in the
>body of a message to majordomo@vger.kernel.org More majordomo info at
>http://vger.kernel.org/majordomo-info.html
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH] e1000: Reset rx ring index on receive overrun
From: Samuel Thibault @ 2012-05-17 23:28 UTC (permalink / raw)
To: Dave, Tushar N
Cc: Jiri Pirko, e1000-devel@lists.sourceforge.net, Dean Nelson,
Allan, Bruce W, Brandeburg, Jesse, linux-kernel@vger.kernel.org,
Ronciak, John, netdev@vger.kernel.org, David S. Miller
In-Reply-To: <061C8A8601E8EE4CA8D8FD6990CEA891188439E0@ORSMSX102.amr.corp.intel.com>
Dave, Tushar N, le Thu 17 May 2012 23:22:31 +0000, a écrit :
> I am interested in to see if you have actual test case and more importantly test data that shows that kernel and device indexes are not synchronized any more.
Well, it's not with an actual physical device, but with the kvm
emulation. Printing indexes from the clean_rx handler, I'm getting:
(status linux index/total size/device index)
status 7 2/256/19
status 7 3/256/19
...
status 7 18/256/19
status 7 19/256/19
status 7 20/256/19
status 7 21/256/19
...
since the status is still 7, linux continues on.
...
status 7 254/256/19
status 7 255/256/19
status 7 0/256/19
status 7 1/256/19
status 0 2/256/19
here it stops, and on next interrupts,
status 0 2/256/20
status 0 2/256/21
etc. and no progress is made any more, until
status 0 2/256/253
status 0 2/256/254
and it gets stuck there:
status 0 2/256/254
status 0 2/256/254
status 0 2/256/254
Samuel
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH] e1000: Reset rx ring index on receive overrun
From: Samuel Thibault @ 2012-05-17 23:31 UTC (permalink / raw)
To: Dave, Tushar N, Kirsher, Jeffrey T, Brandeburg, Jesse,
Allan, Bruce W, Wyborny, Carolyn, Skidmore, Donald C,
Rose, Gregory V, Waskiewicz Jr, Peter P, Duyck, Alexander H,
Ronciak, John, David S. Miller, Jiri Pirko, Dean Nelson,
e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20120517232821.GJ683@type.famille.thibault.fr>
Samuel Thibault, le Fri 18 May 2012 01:28:21 +0200, a écrit :
> Dave, Tushar N, le Thu 17 May 2012 23:22:31 +0000, a écrit :
> > I am interested in to see if you have actual test case and more importantly test data that shows that kernel and device indexes are not synchronized any more.
>
> Well, it's not with an actual physical device, but with the kvm
> emulation.
BTW, it also happens easily when request_irq takes some time to
complete: since we enable E1000_TCTL_EN before that, the card can have
time to fill the ring before irqs are processed.
Samuel
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH] e1000: Reset rx ring index on receive overrun
From: Brandeburg, Jesse @ 2012-05-18 0:04 UTC (permalink / raw)
To: Samuel Thibault
Cc: Jiri Pirko, e1000-devel@lists.sourceforge.net, Dean Nelson,
Allan, Bruce W, linux-kernel@vger.kernel.org, David S. Miller,
Ronciak, John, netdev@vger.kernel.org
In-Reply-To: <20120517233124.GK683@type.famille.thibault.fr>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 736 bytes --]
On Thu, 17 May 2012, Samuel Thibault wrote:
> Samuel Thibault, le Fri 18 May 2012 01:28:21 +0200, a écrit :
> > Dave, Tushar N, le Thu 17 May 2012 23:22:31 +0000, a écrit :
> > > I am interested in to see if you have actual test case and more
> > > importantly test data that shows that kernel and device indexes are
> > > not synchronized any more.
> >
> > Well, it's not with an actual physical device, but with the kvm
> > emulation.
>
> BTW, it also happens easily when request_irq takes some time to
> complete: since we enable E1000_TCTL_EN before that, the card can have
> time to fill the ring before irqs are processed.
I think there may well be a bug in the implementation in kvm. The
hardware doesn't have this bug.
[-- Attachment #2: Type: text/plain, Size: 395 bytes --]
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
[-- Attachment #3: Type: text/plain, Size: 257 bytes --]
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH v3] xfrm: take iphdr size into account for esp payload size calculation
From: David Miller @ 2012-05-18 0:05 UTC (permalink / raw)
To: bpoirier
Cc: netdev, kuznet, jmorris, yoshfuji, kaber, linux-kernel,
steffen.klassert
In-Reply-To: <1337196925-4919-1-git-send-email-bpoirier@suse.de>
From: Benjamin Poirier <bpoirier@suse.de>
Date: Wed, 16 May 2012 15:35:25 -0400
> Corrects the function that determines the esp payload size.
> The calculations done in esp4_get_mtu lead to overlength frames in transport
> mode for certain mtu values and suboptimal frames for others.
>
> According to what is done, mainly in esp_output(), net_header_len aka
> sizeof(struct iphdr) must be taken into account before doing the alignment
> calculation.
>
> Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
This looks great.
Could you please fix net/ipv6/esp6.c too, it seems to have the same
exact bug.
Once you respin this patch with both ipv4 and ipv6 fixed, I'll apply
it.
Thank you.
^ permalink raw reply
* Re: [PATCH] bonding: only send arp monitor packets if no other traffic
From: David Miller @ 2012-05-18 0:08 UTC (permalink / raw)
To: chris.friesen; +Cc: fubar, netdev
In-Reply-To: <4FB3F67C.6000401@genband.com>
From: Chris Friesen <chris.friesen@genband.com>
Date: Wed, 16 May 2012 12:48:28 -0600
> + if (IS_UP(slave->dev)) {
> + if (time_after_eq(jiffies, dev_trans_start(slave->dev) + delta_in_ticks) ||
> + time_after_eq(jiffies, slave->dev->last_rx + delta_in_ticks)) {
> + bond_arp_send_all(bond, slave);
> + }
These lines are very long, and instead of just reformatting them I'd suggest
that you simply make the test a static helper function.
Also, you should not use braces to enclose a single line of code.
Thanks.
^ permalink raw reply
* Re: [PATCH] e1000: Reset rx ring index on receive overrun
From: Samuel Thibault @ 2012-05-18 0:12 UTC (permalink / raw)
To: Brandeburg, Jesse
Cc: Jiri Pirko, e1000-devel@lists.sourceforge.net, Dean Nelson,
Allan, Bruce W, linux-kernel@vger.kernel.org, David S. Miller,
Ronciak, John, netdev@vger.kernel.org
In-Reply-To: <alpine.WNT.2.00.1205171702590.7900@jbrandeb-mobl2.amr.corp.intel.com>
Brandeburg, Jesse, le Thu 17 May 2012 17:04:04 -0700, a écrit :
> On Thu, 17 May 2012, Samuel Thibault wrote:
>
> > Samuel Thibault, le Fri 18 May 2012 01:28:21 +0200, a écrit :
> > > Dave, Tushar N, le Thu 17 May 2012 23:22:31 +0000, a écrit :
> > > > I am interested in to see if you have actual test case and more
> > > > importantly test data that shows that kernel and device indexes are
> > > > not synchronized any more.
> > >
> > > Well, it's not with an actual physical device, but with the kvm
> > > emulation.
> >
> > BTW, it also happens easily when request_irq takes some time to
> > complete: since we enable E1000_TCTL_EN before that, the card can have
> > time to fill the ring before irqs are processed.
>
> I think there may well be a bug in the implementation in kvm. The
> hardware doesn't have this bug.
How does it avoid filling the ring? What is the purpose of the RXO flag
if it does avoid them?
Samuel
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox