From: "Michael S. Tsirkin"
Subject: Re: [PATCH rfc] packet: zerocopy packet_snd
Date: Thu, 27 Nov 2014 09:27:17 +0200
Message-ID: <20141127072717.GA8644@redhat.com>
References: <1416602694-7540-1-git-send-email-willemb@google.com>
 <20141126182445.GA15744@redhat.com>
 <20141126212038.GA12118@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
To: Willem de Bruijn
Cc: Network Development, David Miller, Eric Dumazet, Daniel Borkmann
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On Wed, Nov 26, 2014 at 06:05:16PM -0500, Willem de Bruijn wrote:
> On Wed, Nov 26, 2014 at 4:20 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 26, 2014 at 02:59:34PM -0500, Willem de Bruijn wrote:
> >> > The main problem with zero copy ATM is with queueing disciplines
> >> > which might keep the socket around essentially forever.
> >> > The case was described here:
> >> > https://lkml.org/lkml/2014/1/17/105
> >> > and of course this will make it more serious now that
> >> > more applications will be able to do this, so
> >> > chances that an administrator enables this
> >> > are higher.
> >>
> >> The denial of service issue raised there, that a single queue can
> >> block an entire virtio-net device, is less problematic in the case of
> >> packet sockets. A socket can run out of sk_wmem_alloc, but a prudent
> >> application can increase the limit or use separate sockets for
> >> separate flows.
> >
> > Sounds like this interface is very hard to use correctly.
>
> Actually, this socket alloc issue is the same for zerocopy and
> non-zerocopy. Packets can be held in deep queues at which point
> the packet socket is blocked. This is accepted behavior.
>
> From the above thread:
>
> "It's ok for non-zerocopy packet to be blocked since VM1 thought the
> packets has been sent instead of pending in the virtqueue. So VM1 can
> still send packet to other destination."
>
> This is very specific to virtio and vhost-net. I don't think that that
> concern applies to a packet interface.

Well, you are obviously building the interface with some use-case in
mind.  Let's try to make it work for multiple use-cases.

So at some level, you are right: the issue is not running out of wmem.
But I think I'm right too - this is hard to use correctly.

I think the difference is that with your patch, the application can't
reuse the memory until the packet has been transmitted, otherwise junk
goes out on the network.  Even closing the socket won't help.
Is this true?  I see this as a problem.

I'm trying to figure out how one would use this interface; one obvious
use would be to tunnel raw packets out directly from VM memory.
For this application, a zero copy packet that never completes is a
problem: at a minimum, you want to be able to remove the device, which
translates to a requirement that closing the socket effectively stops
using userspace memory.
If you also want to be able to run e.g. a watchdog, the ability to
specify a deadline seems beneficial as well.

> Another issue, though, is that the method currently really only helps
> TSO because all other paths cause a deep copy. There are more use
> cases once it can send up to 64KB MTU over loopback or send out
> GSO datagrams without triggering skb_copy_ubufs. I have not looked
> into how (or if) that can be achieved yet.

I think this was done intentionally at some point; try looking at the
git history to find out the reasons.

> >
> >> > One possible solution is some kind of timer orphaning frags
> >> > for skbs that have been around for too long.
> >>
> >> Perhaps this can be approximated without an explicit timer by calling
> >> skb_copy_ubufs on enqueue whenever qlen exceeds a threshold value?
> >
> > Not sure. I'll have to see that patch to judge.
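
To be a bit more concrete about what I'd want to see: a rough sketch of
how I imagine such a check could look, nothing more - the helper name
and threshold are made up here, and where exactly to hook it (qdisc
enqueue or somewhere near dev_queue_xmit) is precisely the part the
patch would have to get right:

	/*
	 * Sketch only: orphan zerocopy frags on enqueue once the queue
	 * gets deep, instead of relying on a timer.  ZC_QLEN_THRESH and
	 * the helper name are invented for illustration.
	 */
	#include <linux/skbuff.h>
	#include <net/sch_generic.h>

	#define ZC_QLEN_THRESH	64	/* arbitrary */

	static int zerocopy_orphan_on_enqueue(struct sk_buff *skb,
					      struct Qdisc *q)
	{
		/* Only zerocopy skbs pin userspace pages. */
		if (!(skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY))
			return 0;

		/*
		 * Queue already deep: give up on zerocopy for this skb.
		 * skb_copy_ubufs() copies the userspace frags into
		 * kernel pages and triggers the completion callback,
		 * so the sender can reuse its buffer.
		 */
		if (qdisc_qlen(q) > ZC_QLEN_THRESH)
			return skb_copy_ubufs(skb, GFP_ATOMIC);

		return 0;
	}

What to do when the GFP_ATOMIC allocation in skb_copy_ubufs fails
(drop the packet, or keep it as zerocopy anyway) is another thing the
patch would need to decide.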