Date: Fri, 5 Jun 2026 20:11:42 -0400
From: "Michael S.
 Tsirkin"
To: Si-Wei Liu
Cc: Eugenio Perez Martin, yangjiale, Jason Wang, Xuan Zhuo, virtualization@lists.linux.dev, linux-kernel@vger.kernel.org, Andrew.Boyer@amd.com
Subject: Re: [PATCH] VIRTIO: Update the desc 'flag' fied last in packed ring.
Message-ID: <20260605200908-mutt-send-email-mst@kernel.org>
References: <20260602043123.10207-1-yangjiale133@163.com> <6035a8f3-e225-45b0-9f48-55de953bff15@oracle.com> <20260605134252-mutt-send-email-mst@kernel.org> <5e1a10bd-2783-42ba-b443-853f12159756@oracle.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <5e1a10bd-2783-42ba-b443-853f12159756@oracle.com>

On Fri, Jun 05, 2026 at 11:50:36AM -0700, Si-Wei Liu wrote:
>
>
> On 6/5/2026 10:43 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 05, 2026 at 09:03:36AM -0700, Si-Wei Liu wrote:
> > >
> > > On 6/1/2026 11:04 PM, Eugenio Perez Martin wrote:
> > > > On Tue, Jun 2, 2026 at 6:34 AM yangjiale wrote:
> > > > > When a descriptor list spans across cache lines,
> > > > > updating the flag first can lead to a scenario where the device side
> > > > > perceives the flag as valid, yet the corresponding address and length
> > > > > fields remain unupdated—resulting in invalid values.
> > > > > Therefore, the flag field must be updated last.
> > > > >
> > > > > Signed-off-by: yangjiale
> > > > > ---
> > > > >  drivers/virtio/virtio_ring.c | 8 ++++----
> > > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index fbca7ce1c6bf..036b4f90d30f 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -1688,6 +1688,10 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > >  					&addr, &len, premapped, attr))
> > > > >  				goto unmap_release;
> > > > >
> > > > > +			desc[i].addr = cpu_to_le64(addr);
> > > > > +			desc[i].len = cpu_to_le32(len);
> > > > > +			desc[i].id = cpu_to_le16(id);
> > > > > +
> > > > >  			flags = cpu_to_le16(vq->packed.avail_used_flags |
> > > > >  				    (++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
> > > > >  				    (n < out_sgs ? 0 : VRING_DESC_F_WRITE));
> > > > > @@ -1696,10 +1700,6 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > >  			else
> > > > >  				desc[i].flags = flags;
> > > > >
> > > > > -			desc[i].addr = cpu_to_le64(addr);
> > > > > -			desc[i].len = cpu_to_le32(len);
> > > > > -			desc[i].id = cpu_to_le16(id);
> > > > > -
> > > > >  			if (unlikely(vq->use_map_api)) {
> > > > >  				vq->packed.desc_extra[curr].addr = premapped ?
> > > > >  					DMA_MAPPING_ERROR : addr;
> > > > These flags are updated before the flags of the head descriptor at the
> > > > end of the function, at "vq->packed.vring.desc[head].flags =
> > > > head_flags", so the device should not see these. Because of that, the
> > > > relative order between the rest of the fields of the same descriptor
> > > > or other descriptors' fields, except for the head descriptor's flags,
> > > > should not matter. There is a write memory barrier just before
> > > > updating the head's flags.
> > > The above analysis is absolutely correct.
> > > Though one hardware vendor told me
> > > that this driver implementation kinda stops them from reading ahead of
> > > descriptors already posted beyond the available index., ending up with
> > > suboptimal performance that is hard to make up by other means. Would it be a
> > > bad idea to go with this change and add write barrier in a gentle way for a
> > > small flit in the batch, e.g. commit to memory after every cache line size
> > > worth of descriptors are posted? Would the memory barrier have negative
> > > performance overhead to other backend implementation variants than real
> > > hardware PCI device?
> > >
> > > -Siwei
> > this would need a new feature bit, won't it?
> Probably. This is to capture the device's expectation and behavior right?
> the driver change itself is not spec violating...

yes, device can't rely on this without a feature bit.

> >
> > > > Also, I don't get why the cache line matters here. Can you expand? Am
> > > > I missing something?
> > me too.
> >
> Just to avoid extra delay due to excessive coherency messages and frequent
> cache thrashing, device read over pci bus contends with host write/update on
> the descriptors in a same cache line..
>
> -Siwei

this should be infrequent, the whole idea is that there's parallelism:
device reads descriptors from X while host writes other ones to Y.
btw i can't say whether it's ok for device to just issue 2 reads,
or does it have to receive read result and only then issue the
second read.

--
MST
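
For anyone following the thread, the publish order Eugenio describes (fill every
descriptor of the chain, barrier, then write the head's flags last) can be
sketched in portable C11. This is only an illustration under stated assumptions,
not the virtio_ring.c code: the sketch_* names are hypothetical, a release store
stands in for the kernel's write barrier plus plain store, endianness conversion
is omitted, and the chain is assumed not to wrap around the end of the ring.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as laid out for packed-ring descriptors. */
#define DESC_F_NEXT  (1u << 0)
#define DESC_F_WRITE (1u << 1)
#define DESC_F_AVAIL (1u << 7)
#define DESC_F_USED  (1u << 15)

/* Hypothetical stand-in for a packed descriptor (no endian handling). */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	_Atomic uint16_t flags;
};

/* Hypothetical scatter-gather element used only by this sketch. */
struct sketch_seg {
	uint64_t addr;
	uint32_t len;
	int device_writes;
};

/*
 * Post a chain of n descriptors starting at 'head'.
 * Non-head flags may be written in any order relative to addr/len/id,
 * because the device cannot reach them until the head becomes available.
 * Assumption: the chain does not wrap, so head + n <= ring_size.
 */
int sketch_add_chain(struct sketch_desc *ring, size_t ring_size, size_t head,
		     const struct sketch_seg *segs, size_t n,
		     uint16_t id, uint16_t avail_used_flags)
{
	uint16_t head_flags = 0;

	if (n == 0 || head + n > ring_size)
		return -1;

	for (size_t k = 0; k < n; k++) {
		struct sketch_desc *d = &ring[head + k];
		uint16_t flags = avail_used_flags |
				 (k + 1 == n ? 0 : DESC_F_NEXT) |
				 (segs[k].device_writes ? DESC_F_WRITE : 0);

		d->addr = segs[k].addr;
		d->len = segs[k].len;
		d->id = id;

		if (k == 0)
			head_flags = flags;	/* held back until the end */
		else
			atomic_store_explicit(&d->flags, flags,
					      memory_order_relaxed);
	}

	/* Release store: every write above is visible before the head flags. */
	atomic_store_explicit(&ring[head].flags, head_flags,
			      memory_order_release);
	return 0;
}

/* Device-side check: AVAIL matches the wrap counter and USED does not. */
int sketch_desc_is_avail(struct sketch_desc *d, int wrap_counter)
{
	uint16_t f = atomic_load_explicit(&d->flags, memory_order_acquire);
	int avail = !!(f & DESC_F_AVAIL);
	int used = !!(f & DESC_F_USED);

	return avail == wrap_counter && used != wrap_counter;
}
```

With this ordering a device that polls only the head sees either nothing or a
fully written chain. A device that reads ahead past the head gets no such
guarantee for the later descriptors, which is the case that would need a
feature bit as discussed above.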