From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: netdev@vger.kernel.org, olaf.kirch@oracle.com
Subject: Re: [PATCH][RFC] network splice receive v2
Date: Fri, 15 Jun 2007 12:06:18 +0400 [thread overview]
Message-ID: <20070615080618.GA22234@2ka.mipt.ru> (raw)
In-Reply-To: <20070614115902.GG6149@kernel.dk>
On Thu, Jun 14, 2007 at 01:59:02PM +0200, Jens Axboe (jens.axboe@oracle.com) wrote:
> On Thu, Jun 14 2007, Evgeniy Polyakov wrote:
> > On Wed, Jun 13, 2007 at 12:01:04PM +0400, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> > > I will rebase my tree, likely something was not merged correctly.
> >
> > Ok, I've just rebased a tree from recent git and pulled from brick -
> > things seems to be settled. I've ran several tests with different
> > filesizes and all files were received correctly without kernel crashes.
> > There is skb leak indeed, and it looks like it is in the
> > __splice_from_pipe() for the last page.
>
> Uh, the leak, right - I had forgotten about that, was working on getting
> vmsplice into shape the last two days. Interesting that you mention the
> last page, I'll dig in now! Any more info on this (how did you notice
> the leak originates from there)?
I first observed leak via slabinfo data (not a leak, but number of skbs
did not dropped after quite huge files were transferred), then added
number of allocated and freed objects in skbuff.c, they did not match
for big files, so I started to check splice source and found that
sometimes ->release callback is not called, but code breaks out of the
loop. I've put some printks in __splice_from_pipe() and found following
case, when skb is leaked:
when the same cloned skb was shared multiple times (no more than 2 though),
only one copy was freed.
Further analysis description requires some splice background (obvious
for you, but that clears it for me):
pipe_buffer only contains 16 pages.
There is a code, which copies pages (pointers) from spd to pipe_buffer
(splice_to_pipe()).
skb allocations happens in chunks of different size (i.e. with different
number of skbs/pages per call), so it is possible that number of
allocated skbs will be less than pipe_buffer size (16), and then the
rest of the pages will be put into different (or the same) pipe_buffer later.
Sometimes two skbs from above example happens to be on the boundary of
the pipe buffer, so only one of them is being copied into pipe_buffer,
which is then transferred over the pipe.
So, we have a case, when spd has (it had more, but this two are special)
2 pages (actually the same page, but two references to it), but pipe_buffer
has a place only for one. In that case second page from spd will be missed.
So, things turned down to be not in the __splice_from_pipe(), but
splice_to_pipe(). Attached patch fixes a leak for me.
It was tested with different data files and all were received correctly.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
diff --git a/fs/splice.c b/fs/splice.c
index bc481f1..365bfd9 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -211,8 +211,6 @@ ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
break;
if (pipe->nrbufs < PIPE_BUFFERS)
continue;
-
- break;
}
if (spd->flags & SPLICE_F_NONBLOCK) {
--
Evgeniy Polyakov
next prev parent reply other threads:[~2007-06-15 8:06 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-06-11 11:59 [PATCH][RFC] network splice receive v2 Jens Axboe
2007-06-12 18:17 ` Evgeniy Polyakov
2007-06-12 18:17 ` Jens Axboe
2007-06-12 18:18 ` Jens Axboe
2007-06-13 8:01 ` Evgeniy Polyakov
2007-06-13 19:50 ` Jens Axboe
2007-06-14 11:57 ` Evgeniy Polyakov
2007-06-14 11:59 ` Jens Axboe
2007-06-15 8:06 ` Evgeniy Polyakov [this message]
2007-06-15 8:43 ` Jens Axboe
2007-06-15 9:14 ` Evgeniy Polyakov
2007-06-15 9:34 ` Jens Axboe
2007-06-15 9:44 ` Evgeniy Polyakov
2007-06-15 13:31 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070615080618.GA22234@2ka.mipt.ru \
--to=johnpol@2ka.mipt.ru \
--cc=jens.axboe@oracle.com \
--cc=netdev@vger.kernel.org \
--cc=olaf.kirch@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).