Date: Wed, 26 Nov 2025 10:25:44 -0500
From: "Michael S. Tsirkin"
To: Simon Schippers
Cc: willemdebruijn.kernel@gmail.com, jasowang@redhat.com,
 andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
 kuba@kernel.org, pabeni@redhat.com, eperezma@redhat.com, jon@nutanix.com,
 tim.gebauer@tu-dortmund.de, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 virtualization@lists.linux.dev
Subject: Re: [PATCH net-next v6 3/8] tun/tap: add synchronized ring
 produce/consume with queue management
Message-ID: <20251126100007-mutt-send-email-mst@kernel.org>
References: <20251120152914.1127975-1-simon.schippers@tu-dortmund.de>
 <20251120152914.1127975-4-simon.schippers@tu-dortmund.de>
 <20251125100655-mutt-send-email-mst@kernel.org>
 <4db234bd-ebd7-4325-9157-e74eccb58616@tu-dortmund.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4db234bd-ebd7-4325-9157-e74eccb58616@tu-dortmund.de>

On Wed, Nov 26, 2025 at 10:23:50AM +0100, Simon Schippers wrote:
> On 11/25/25 17:54, Michael S. Tsirkin wrote:
> > On Thu, Nov 20, 2025 at 04:29:08PM +0100, Simon Schippers wrote:
> >> Implement new ring buffer produce and consume functions for tun and tap
> >> drivers that provide lockless producer-consumer synchronization and
> >> netdev queue management to prevent ptr_ring tail drop and permanent
> >> starvation.
> >>
> >> - tun_ring_produce(): Produces packets to the ptr_ring with proper memory
> >>   barriers and proactively stops the netdev queue when the ring is about
> >>   to become full.
> >>
> >> - __tun_ring_consume() / __tap_ring_consume(): Internal consume functions
> >>   that check if the netdev queue was stopped due to a full ring, and wake
> >>   it when space becomes available. Uses memory barriers to ensure proper
> >>   ordering between producer and consumer.
> >>
> >> - tun_ring_consume() / tap_ring_consume(): Wrapper functions that acquire
> >>   the consumer lock before calling the internal consume functions.
> >>
> >> Key features:
> >> - Proactive queue stopping using __ptr_ring_full_next() to stop the queue
> >>   before it becomes completely full.
> >> - Not stopping the queue when the ptr_ring is full already, because if
> >>   the consumer empties all entries in the meantime, stopping the queue
> >>   would cause permanent starvation.
> >
> > what is permanent starvation? this comment seems to answer this
> > question:
> >
> >
> >         /* Do not stop the netdev queue if the ptr_ring is full already.
> >          * The consumer could empty out the ptr_ring in the meantime
> >          * without noticing the stopped netdev queue, resulting in a
> >          * stopped netdev queue and an empty ptr_ring. In this case the
> >          * netdev queue would stay stopped forever.
> >          */
> >
> >
> > why having a single entry in
> > the ring we never use helpful to address this?
> >
> >
> > In fact, all your patch does to solve it, is check
> > netif_tx_queue_stopped on every consumed packet.
> >
> >
> > I already proposed:
> >
> > static inline int __ptr_ring_peek_producer(struct ptr_ring *r)
> > {
> >         if (unlikely(!r->size) || r->queue[r->producer])
> >                 return -ENOSPC;
> >         return 0;
> > }
> >
> > And with that, why isn't avoiding the race as simple as
> > just rechecking after stopping the queue?
>
> I think you are right and that is quite similar to what veth [1] does.
> However, there are two differences:
>
> - Your approach avoids returning NETDEV_TX_BUSY by already stopping
>   when the ring becomes full (and not when the ring is full already)
> - ...and the recheck of the producer wakes on !full instead of empty.
>
> I like both aspects better than the veth implementation.

Right.

Though frankly, someone should just fix NETDEV_TX_BUSY already
at least with the most popular qdiscs.
It is a common situation and it is just annoying that every driver has
to come up with its own scheme.

> Just one thing: like the veth implementation, we probably need a
> smp_mb__after_atomic() after netif_tx_stop_queue() as they also discussed
> in their v6 [2].

yea makes sense.

>
> On the consumer side, I would then just do:
>
> __ptr_ring_consume();
> if (unlikely(__ptr_ring_consume_created_space()))
>         netif_tx_wake_queue(txq);
>
> Right?
>
> And for the batched consume method, I would just call this in a loop.

Well tun does not use batched consume does it?

> Thank you!
>
> [1] Link: https://lore.kernel.org/netdev/174559288731.827981.8748257839971869213.stgit@firesoul/T/#m2582fcc48901e2e845b20b89e0e7196951484e5f
> [2] Link: https://lore.kernel.org/all/174549933665.608169.392044991754158047.stgit@firesoul/T/#m63f2deb86ffbd9ff3a27e1232077a3775606c14d
>
> >
> >         __ptr_ring_produce();
> >         if (__ptr_ring_peek_producer())
> >                 netif_tx_stop_queue
> >                 smp_mb__after_atomic(); // Right here
>
> >                 if (!__ptr_ring_peek_producer())
> >                         netif_tx_wake_queue(txq);
> >
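
To make the stop/recheck/wake scheme discussed above concrete, here is a
minimal user-space sketch in C11. It is only a model under stated
assumptions, not the tun/tap patch: struct model_ring, model_produce(),
model_consume() and the "stopped" flag are hypothetical stand-ins for
ptr_ring, the netdev TX queue state, netif_tx_stop_queue() and
netif_tx_wake_queue(). The sequentially consistent fence stands in for
smp_mb__after_atomic(), and the consumer frees each slot immediately,
which a real ptr_ring consumer may not do.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4 /* arbitrary size chosen for the illustration */

/* Hypothetical stand-in for a ptr_ring plus one netdev TX queue. */
struct model_ring {
	_Atomic(void *) queue[RING_SIZE]; /* NULL means the slot is free */
	size_t producer;                  /* next slot the producer fills */
	size_t consumer;                  /* next slot the consumer drains */
	atomic_bool stopped;              /* models the "queue stopped" bit */
};

/* Same idea as the quoted __ptr_ring_peek_producer(): is the next slot taken? */
static int model_peek_producer(struct model_ring *r)
{
	return atomic_load(&r->queue[r->producer]) ? -ENOSPC : 0;
}

/* Producer side: produce, stop if the ring just became full, then recheck. */
static int model_produce(struct model_ring *r, void *ptr)
{
	if (model_peek_producer(r))
		return -ENOSPC;

	atomic_store(&r->queue[r->producer], ptr);
	r->producer = (r->producer + 1) % RING_SIZE;

	if (model_peek_producer(r)) {
		atomic_store(&r->stopped, true);           /* stop the queue */
		atomic_thread_fence(memory_order_seq_cst); /* models the barrier */
		/* Recheck: the consumer may have freed a slot before seeing "stopped". */
		if (!model_peek_producer(r))
			atomic_store(&r->stopped, false);  /* wake the queue */
	}
	return 0;
}

/* Consumer side: take one entry and wake the queue if it was stopped. */
static void *model_consume(struct model_ring *r)
{
	void *ptr = atomic_load(&r->queue[r->consumer]);

	if (!ptr)
		return NULL; /* ring is empty */

	atomic_store(&r->queue[r->consumer], NULL);
	r->consumer = (r->consumer + 1) % RING_SIZE;

	if (atomic_load(&r->stopped))
		atomic_store(&r->stopped, false); /* space was created: wake */
	return ptr;
}
```

In this model all RING_SIZE slots get used: the queue is stopped only once
the ring is actually full, and either the producer's recheck or the next
consume clears the stopped state, so the "stopped queue with an empty
ring" outcome from the quoted comment should not persist.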