netdev.vger.kernel.org archive mirror
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Eric Barton <eeb@bartonsoftware.com>
Cc: "'David Miller'" <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: PATCH zero-copy send completion callback
Date: Tue, 17 Oct 2006 17:13:46 +0400
Message-ID: <20061017131345.GA20225@2ka.mipt.ru>
In-Reply-To: <021a01c6f1ea$c45e1820$0281a8c0@ebpc>

On Tue, Oct 17, 2006 at 01:50:04PM +0100, Eric Barton (eeb@bartonsoftware.com) wrote:
> Evgeniy,
> 
> > You can use existing skb destructor and appropriate reference
> > counter is already there. In your own destructor you need to
> > call the old one of course, and its type can be determined from
> > the analysis of the headers and skb itself (there are not so
> > many destructor types actually).  If that level of
> > abstraction is not enough, it is possible to change
> > skb_release_data()/__kfree_skb() so that it would be possible
> > in skb->destructor() to determine if attached pages will be
> > freed or not.
> 
> Yes absolutely.  My first thought was to use the skbuf destructor
> but I was paranoid I might screw up the destructor stacking.
> Maybe I should have been braver?

It depends on the quality of the result...
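
The destructor stacking discussed above can be modelled in userspace. This is
a minimal sketch, not kernel code: struct mock_skb, struct zc_chain, the priv
field and every function name below are hypothetical stand-ins. It also saves
the previous destructor pointer explicitly, which is an assumption made for
brevity; the suggestion in the thread is to infer the old destructor's type
from the headers and the skb itself.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for struct sk_buff; only the fields the sketch needs. */
struct mock_skb {
    void (*destructor)(struct mock_skb *skb);
    void *priv;              /* hypothetical slot for the chaining state */
    int   owner_released;    /* set by the original destructor */
    int   completion_fired;  /* set by the send-completion hook */
};

/* Hypothetical per-skb state: remembers the destructor that was displaced. */
struct zc_chain {
    void (*prev)(struct mock_skb *skb);
};

/* Plays the role of an already installed destructor. */
static void mock_sock_wfree(struct mock_skb *skb)
{
    skb->owner_released = 1;
}

/* New destructor: signal completion, then call the one it replaced. */
static void zc_destructor(struct mock_skb *skb)
{
    struct zc_chain *chain = skb->priv;

    assert(chain != NULL);
    skb->completion_fired = 1;
    if (chain->prev)
        chain->prev(skb);
}

/* Install zc_destructor in front of whatever destructor is already set. */
static void zc_install(struct mock_skb *skb, struct zc_chain *chain)
{
    chain->prev = skb->destructor;
    skb->priv = chain;
    skb->destructor = zc_destructor;
}

/* Stand-in for the point in the free path where the destructor runs. */
static void mock_kfree_skb(struct mock_skb *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
}
```

Calling zc_install() on a mock_skb whose destructor is mock_sock_wfree and
then mock_kfree_skb() sets both completion_fired and owner_released, which is
the property the stacking has to preserve.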

> Since the callback descriptor needs to track the pages in
> skb_shinfo() rather than the skbuf itself, it seemed "natural"
> to make skb_release_data() the trigger.
> 
> > Existing sendfile() implementation is synchronous, it does not
> > require async callback. 
> 
> Is it not true that you cannot know when it is safe to overwrite
> pages sent in this way?

There are tricks all over the place in sendfile. The first one is the
sendpage() implementation, which copies the data if the hardware does not
support checksumming and scatter-gather, and simultaneous writing is
"protected" in the higher layer (check do_generic_mapping_read()). We do
not care about 'later' writing, i.e. while the skb was in some queue on
the local machine, since the new data will be transferred in that case.
Truncation is also protected by the fact that the page's reference counter
is increased, so the same page cannot be freed and reused.

It was a design decision not to care about page overwrites (and thus no
page locking): either smart hardware transfers the new data, or we copy
and send the old data.
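
The copy-or-reference behaviour described above can be illustrated with a
small userspace model. It is a sketch under stated assumptions: the MOCK_F_*
flags, struct mock_page, struct mock_frag and both functions are hypothetical
and only mimic the decision, not the real sendpage() code path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MOCK_F_SG    0x1u  /* hypothetical flag: device does scatter-gather */
#define MOCK_F_CSUM  0x2u  /* hypothetical flag: device checksums in hardware */

struct mock_page {
    int  refcount;
    char data[64];
};

struct mock_frag {
    struct mock_page *page;  /* non-NULL when the page is referenced */
    char *copy;              /* non-NULL when the data was copied */
    size_t len;
};

/* Queue len bytes of page: reference it if the device can cope, else copy. */
static int mock_sendpage(struct mock_frag *frag, struct mock_page *page,
                         size_t len, unsigned int dev_features)
{
    const unsigned int need = MOCK_F_SG | MOCK_F_CSUM;

    assert(len > 0 && len <= sizeof(page->data));
    frag->page = NULL;
    frag->copy = NULL;
    frag->len = len;

    if ((dev_features & need) == need) {
        /* Zero-copy: the extra reference keeps the page from being freed
         * and reused, but its contents may still be overwritten. */
        page->refcount++;
        frag->page = page;
        return 0;
    }

    /* Fallback: snapshot the data, so later writes are not transmitted. */
    frag->copy = malloc(len);
    if (!frag->copy)
        return -1;
    memcpy(frag->copy, page->data, len);
    return 0;
}

/* What the wire would see when the fragment is finally transmitted. */
static const char *mock_frag_data(const struct mock_frag *frag)
{
    return frag->page ? frag->page->data : frag->copy;
}
```

If the page is overwritten after both calls, the referenced fragment yields
the new data and the copied fragment yields the old data, matching the two
outcomes named in the paragraph above.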

> > skbs are allocated from own cache, and the smaller it is, the better.
> 
> As I mentioned in another reply, skbs are indeed allocated from
> their own cache, but skb_shinfo() is allocated contiguously with
> the packet header using kmalloc.

Yes, skb itself is not touched.

You probably saw a lot of discussions about problems with e1000
hardware, memory fragmentation and jumbo frames.
Since skb_shared_info is appended to the actual data, it frequently forces
next-order allocations, so one of the solutions is to put skb_shared_info
into a separate allocation in some cases. Although those discussions are
sleeping right now, the problem still exists, and if your current needs can
be handled within the existing interfaces, that should be tried first.
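
The next-order effect can be shown with plain arithmetic. The sketch below
assumes 4 KiB pages and power-of-two page allocations; the helper names and
any buffer or skb_shared_info sizes passed to them are illustrative
assumptions, not measured kernel values.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Smallest order such that (MOCK_PAGE_SIZE << order) >= bytes. */
static unsigned int alloc_order(size_t bytes)
{
    unsigned int order = 0;
    size_t span = MOCK_PAGE_SIZE;

    assert(bytes > 0);
    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Order needed when the shared info trails the packet data in one buffer. */
static unsigned int order_with_shinfo(size_t data_len, size_t shinfo_len)
{
    return alloc_order(data_len + shinfo_len);
}
```

With an assumed 16384-byte receive buffer, alloc_order(16384) is 2, while
order_with_shinfo(16384, 256) is 3: a few hundred trailing bytes double the
allocation. Placing the shared info in a separate allocation would keep the
data buffer at order 2.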

> -- 
> 
>                 Cheers,
>                         Eric
> 

-- 
	Evgeniy Polyakov
