From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shirley Ma
Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel
Date: Wed, 29 Sep 2010 07:56:26 -0700
Message-ID: <1285772186.31343.23.camel@localhost.localdomain>
References: <20100914182707.GB15549@redhat.com>
 <1284490143.13351.82.camel@localhost.localdomain>
 <20100914190156.GA16037@redhat.com>
 <1284492983.13351.95.camel@localhost.localdomain>
 <20100915051241.GA25340@redhat.com>
 <1284531675.24603.259.camel@localhost.localdomain>
 <20100915101000.GB28016@redhat.com>
 <1284562354.2573.12.camel@localhost.localdomain>
 <1285730669.31343.7.camel@localhost.localdomain>
 <20100929081645.GA21195@redhat.com>
 <20100929082820.GC21195@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Arnd Bergmann , Avi Kivity , "Xin, Xiaohui" , David Miller ,
 netdev@vger.kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
To: "Michael S. Tsirkin"
Return-path:
Received: from e35.co.us.ibm.com ([32.97.110.153]:33270 "EHLO e35.co.us.ibm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750984Ab0I2O4i
 (ORCPT ); Wed, 29 Sep 2010 10:56:38 -0400
In-Reply-To: <20100929082820.GC21195@redhat.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Wed, 2010-09-29 at 10:28 +0200, Michael S. Tsirkin wrote:
> > > 1. Adding completion field in struct virtqueue;
> > > 2. when it is a zero copy packet, put vhost thread wait for completion
> > >    to update vhost_add_used_and_signal;
> > > 3. passing vq from vhost to macvtap as skb destruct_arg;
> > > 4. when skb is freed for the last reference, signal vq completion
> > >
> > > The test results show same performance as the original patch. How do you
> > > think? If it sounds good to you. I will resubmit this reversion patch.
> > > The patch still keeps as simple as it was before. :)
> > >
> > > Thanks
> > > Shirley
> >
> > If you look at dev_hard_start_xmit you will see a call
> > to skb_orphan_try which often calls the skb destructor.
> > So I suspect this is almost equivalent to your original patch,
> > and has the same correctness issue.
>
> So you could try doing skb_tx(skb)->prevent_sk_orphan = 1
> just to see what will happen. Might be interesting - just
> make sure the device doesn't orphan the skb first thing.
> I suspect lack of parallelism will result in bad throughput
> esp for small messages.
>
> Note this still won't make it correct (this has module unloading
> issue, and devices might still orphan skb, clone it, or hang on to
> paged data in some other way) but at least closer.

For message sizes smaller than 128, it still does a copy. I tested some
small message sizes and didn't see any regression. I will run more
tests focusing on message sizes between 128 and 4K.

I don't need prevent_sk_orphan: in skb_release_data, on the last
reference, I just need the ZEROCOPY flag from that sock to signal a
completion.

Thanks
Shirley
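
To make the intended flow easier to discuss, here is a rough user-space
model of the four steps quoted above together with the 128 copy
threshold. It is only a sketch and not the patch itself: struct
vq_model, struct buf_model and every function name below are
hypothetical stand-ins, the threshold is assumed to be in bytes, and
the real code would use the kernel's skb reference counting and
completion primitives instead of plain counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to be bytes; the mail only says "smaller than 128". */
#define ZC_COPY_THRESHOLD 128

/* Hypothetical stand-in for the completion state added to the virtqueue. */
struct vq_model {
    int pending;    /* zero-copy buffers the lower device still references */
    int completed;  /* buffers that may now be reported as used to the guest */
};

/* Hypothetical stand-in for an skb that carries guest data. */
struct buf_model {
    int refs;             /* references held on the data */
    bool zerocopy;        /* models the ZEROCOPY flag */
    struct vq_model *vq;  /* models the vq passed as destructor argument */
};

/* Messages below the threshold are copied, larger ones are mapped. */
static bool use_zerocopy(size_t len)
{
    return len >= ZC_COPY_THRESHOLD;
}

/* Queue one guest buffer of len bytes for transmit. */
static void buf_init(struct buf_model *b, struct vq_model *vq, size_t len)
{
    b->refs = 1;
    b->zerocopy = use_zerocopy(len);
    b->vq = vq;
    if (b->zerocopy)
        vq->pending++;    /* guest pages stay pinned until the last put */
    else
        vq->completed++;  /* data was copied, guest buffer is reusable now */
}

/* Models a clone or any other extra reference taken by a device. */
static void buf_get(struct buf_model *b)
{
    b->refs++;
}

/* Drop one reference; only the last one signals the completion. */
static void buf_put(struct buf_model *b)
{
    assert(b->refs > 0);
    if (--b->refs > 0)
        return;
    if (b->zerocopy) {
        b->vq->pending--;
        b->vq->completed++;
    }
}
```

In this model a cloned buffer keeps the completion pending until every
holder has dropped its reference, which is the property the
orphan/clone discussion above is concerned with.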