linux-arm-kernel.lists.infradead.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Gavin Shan <gshan@redhat.com>
Cc: virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
	jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
	yihyu@redhat.com, shan.gavin@gmail.com,
	Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-arm-kernel@lists.infradead.org, mochs@nvidia.com
Subject: Re: [PATCH] virtio_ring: Fix the stale index in available ring
Date: Mon, 18 Mar 2024 03:50:41 -0400
Message-ID: <20240318034710-mutt-send-email-mst@kernel.org>
In-Reply-To: <589d980f-2e4d-47b4-9dc7-8c64dbe271ce@redhat.com>

On Mon, Mar 18, 2024 at 09:41:45AM +1000, Gavin Shan wrote:
> On 3/18/24 02:50, Michael S. Tsirkin wrote:
> > On Fri, Mar 15, 2024 at 09:24:36PM +1000, Gavin Shan wrote:
> > > 
> > > On 3/15/24 21:05, Michael S. Tsirkin wrote:
> > > > On Fri, Mar 15, 2024 at 08:45:10PM +1000, Gavin Shan wrote:
> > > > > > Yes, I guess smp_wmb() ('dmb') is buggy on NVidia's grace-hopper platform. I tried
> > > > > to reproduce it with my own driver where one thread writes to the shared buffer
> > > > > and another thread reads from the buffer. I don't hit the out-of-order issue so
> > > > > far.
> > > > 
> > > > Make sure the 2 areas you are accessing are in different cache lines.
> > > > 
> > > 
> > > Yes, I already put those 2 areas to separate cache lines.
> > > 
> > > > 
> > > > > My driver may be not correct somewhere and I will update if I can reproduce
> > > > > the issue with my driver in the future.
> > > > 
> > > > Then maybe your change is just making virtio slower and masks the bug
> > > > that is actually elsewhere?
> > > > 
> > > > You don't really need a driver. Here's a simple test: without barriers
> > > > assertion will fail. With barriers it will not.
> > > > > (Warning: didn't bother testing too much, could be buggy.)
> > > > 
> > > > ---
> > > > 
> > > > #include <pthread.h>
> > > > #include <stdio.h>
> > > > #include <stdlib.h>
> > > > #include <assert.h>
> > > > 
> > > > #define FIRST values[0]
> > > > #define SECOND values[64]
> > > > 
> > > > volatile int values[100] = {};
> > > > 
> > > > void* writer_thread(void* arg) {
> > > > 	while (1) {
> > > > 	FIRST++;
> > > > 	// NEED smp_wmb here
> > >          __asm__ volatile("dmb ishst" : : : "memory");
> > > > 	SECOND++;
> > > > 	}
> > > > }
> > > > 
> > > > void* reader_thread(void* arg) {
> > > >       while (1) {
> > > > 	int first = FIRST;
> > > > 	// NEED smp_rmb here
> > >          __asm__ volatile("dmb ishld" : : : "memory");
> > > > 	int second = SECOND;
> > > > 	assert(first - second == 1 || first - second == 0);
> > > >       }
> > > > }
> > > > 
> > > > int main() {
> > > >       pthread_t writer, reader;
> > > > 
> > > >       pthread_create(&writer, NULL, writer_thread, NULL);
> > > >       pthread_create(&reader, NULL, reader_thread, NULL);
> > > > 
> > > >       pthread_join(writer, NULL);
> > > >       pthread_join(reader, NULL);
> > > > 
> > > >       return 0;
> > > > }
> > > > 
> > > 
> > > Had a quick test on NVidia's grace-hopper and Ampere's CPUs. I hit
> > > the assert on both of them. After replacing 'dmb' with 'dsb', I can
> > > hit assert on both of them too. I need to look at the code closely.
> > > 
> > > [root@virt-mtcollins-02 test]# ./a
> > > a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
> > > Aborted (core dumped)
> > > 
> > > [root@nvidia-grace-hopper-05 test]# ./a
> > > a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
> > > Aborted (core dumped)
> > > 
> > > Thanks,
> > > Gavin
> > 
> > 
> > Actually this test is broken. Ordering is not the issue; it's a simple race.
> > The following works on x86 though (although x86 does not need the
> > barriers).
> > 
> > 
> > #include <pthread.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <assert.h>
> > 
> > #if 0
> > #define x86_rmb()  asm volatile("lfence":::"memory")
> > #define x86_mb()  asm volatile("mfence":::"memory")
> > #define x86_smb()  asm volatile("sfence":::"memory")
> > #else
> > #define x86_rmb()  asm volatile("":::"memory")
> > #define x86_mb()  asm volatile("":::"memory")
> > #define x86_smb()  asm volatile("":::"memory")
> > #endif
> > 
> > #define FIRST values[0]
> > #define SECOND values[640]
> > #define FLAG values[1280]
> > 
> > volatile unsigned values[2000] = {};
> > 
> > void* writer_thread(void* arg) {
> > 	while (1) {
> > 	/* Now synchronize with reader */
> > 	while(FLAG);
> > 	FIRST++;
> > 	x86_smb();
> > 	SECOND++;
> > 	x86_smb();
> > 	FLAG = 1;
> > 	}
> > }
> > 
> > void* reader_thread(void* arg) {
> >      while (1) {
> > 	/* Now synchronize with writer */
> > 	while(!FLAG);
> > 	x86_rmb();
> > 	unsigned first = FIRST;
> > 	x86_rmb();
> > 	unsigned second = SECOND;
> > 	assert(first - second == 1 || first - second == 0);
> > 	FLAG = 0;
> > 
> > 	if (!(first % 1000000))
> > 		printf("%u\n", first);	/* first is unsigned, so use %u */
> >     }
> > }
> > 
> > int main() {
> >      pthread_t writer, reader;
> > 
> >      pthread_create(&writer, NULL, writer_thread, NULL);
> >      pthread_create(&reader, NULL, reader_thread, NULL);
> > 
> >      pthread_join(writer, NULL);
> >      pthread_join(reader, NULL);
> > 
> >      return 0;
> > }
> > 
> 
> I tried it on host and VM of NVidia's grace-hopper. Without the barriers, I
> can hit assert. With the barriers, it's working fine without hitting the
> assert.
> 
> I also had some code to mimic virtio vring last weekend, and it's just
> working well. Back to our original issue, __smp_wmb() is issued by guest
> while __smp_rmb() is executed on host. The VM and host are running at
> different exception levels: EL2 vs EL1. I'm not sure it's the cause. I
> need to modify my code so that __smp_wmb() and __smp_rmb() can be executed
> from guest and host.

It is conceivable that barriers somehow work differently on
grace-hopper. We need to find out more though.
Can anyone from Nvidia chime in?
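
For comparison, here is a minimal, portable sketch of the same flag
handshake written with C11 acquire/release atomics instead of the
x86-only macros above. It is an illustration, not the code that was run
in this thread: the names run_handshake, first_val and second_val, the
64-byte alignment and the bounded iteration count are all assumptions
made for the sketch. Because the reader only samples the two counters
after the writer has published both, it expects them to be equal.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. Each variable gets its own line. */
static _Alignas(64) unsigned first_val;
static _Alignas(64) unsigned second_val;
static _Alignas(64) atomic_uint flag;

static unsigned rounds;      /* set before the threads are created */
static unsigned mismatches;  /* written by the reader, read after join */

static void *writer_thread(void *arg)
{
	(void)arg;
	for (unsigned i = 0; i < rounds; i++) {
		/* Wait until the reader has consumed the previous update. */
		while (atomic_load_explicit(&flag, memory_order_acquire))
			sched_yield();
		first_val++;
		second_val++;
		/* Release: both increments are visible before flag reads 1. */
		atomic_store_explicit(&flag, 1, memory_order_release);
	}
	return NULL;
}

static void *reader_thread(void *arg)
{
	(void)arg;
	for (unsigned i = 0; i < rounds; i++) {
		/* Acquire: pairs with the writer's release store. */
		while (!atomic_load_explicit(&flag, memory_order_acquire))
			sched_yield();
		unsigned first = first_val;
		unsigned second = second_val;
		if (first != second)
			mismatches++;
		/* Hand the buffer back to the writer. */
		atomic_store_explicit(&flag, 0, memory_order_release);
	}
	return NULL;
}

/* Hypothetical helper: runs the handshake and returns the mismatch count. */
unsigned run_handshake(unsigned iterations)
{
	pthread_t writer, reader;
	int rc;

	first_val = 0;
	second_val = 0;
	mismatches = 0;
	rounds = iterations;
	atomic_store(&flag, 0);

	rc = pthread_create(&writer, NULL, writer_thread, NULL);
	assert(rc == 0);
	rc = pthread_create(&reader, NULL, reader_thread, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(writer, NULL);
	pthread_join(reader, NULL);
	return mismatches;
}
```

Built with something like "cc -O2 -pthread", a call such as
run_handshake(100000) should return 0 if acquire/release ordering is
honoured. Replacing both memory orders with memory_order_relaxed is the
rough equivalent of compiling the barriers out in the test above.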

-- 
MST


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
