From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 23 Mar 2020 18:43:09 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20200323184309.GE3017@work-vm>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
Subject: Re: [Virtio-fs] xfstest generic/503 hangs
List-Id: Development discussions about virtio-fs
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Max Reitz
Cc: virtio-fs-list

* Max Reitz (mreitz@redhat.com) wrote:
> Hi,
>
> I have this bug report here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1813885
>
> And I’m afraid I’m not really making progress on debugging it, so I was
> wondering whether any of you might have some insights.
>
> The problem is that the generic/503 xfstest hangs on virtio-fs. Now, I
> don’t know how the reporter got that test to run in the first place,
> because for me, it requires fcollapse and fzero, which as far as I can
> tell are currently not supported for virtio-fs.
>
> So I first had to disable those requirements, and then let the helper
> program (src/t_mmap_collision.c) not test those operations.
>
> Then, the test hangs. What I could find out so far is that the hang
> occurs in src/t_mmap_collision.c’s truncate_down_fn() (run through
> run_test(&truncate_down_fn), namely in one of the pread()s. I can also
> see that some of the pread()s before fail with EFAULT.
>
> A bit more context: t_mmap_collision.c opens a test file twice (I think
> the idea is that you open it once on an FS with DAX, and once without,
> but AFAIU it should work either way). For the relevant test, it mmap()s
> the DAX FD, truncates it, then fallocates it to increase the size again.
> Then it reads from the non-DAX FD.

Can you just confirm where the DAX is happening here? As I read that
bz entry it's using the qemu which doesn't have DAX code yet.

Dave

> It does all of that in two threads simultaneously for a second.
>
> The EFAULT seems to come from the guest kernel. I don’t see virtiofsd
> returning an error anywhere. I don’t know where it comes from exactly,
> only that when I replace all occurrences of “EFAULT” by e.g. “EBADSLT”
> in mm/, the test crashes instead of hanging, so I take that to mean that
> the error comes from something in mm/ (which I suppose isn’t too
> unexpected).
>
> The test passes if running the test function in a single thread instead
> of two, or if you use a separate TEST_DEV and SCRATCH_DEV – but in the
> latter case, you really have two separate files, so the test becomes
> rather moot (AFAIU).
>
> The fact that truncate_down_fn() uses fallocate() seems irrelevant.
> When you replace it by ftruncate() (i.e. the dax_fd is just first
> truncated to 0, and then truncated back to @file_size), the test fails
> in the same way. So maybe there is some interaction between the
> ftruncate() and a concurrent pread()? But where does the EFAULT come from?
>
> Does anyone have any spontaneous ideas? :/
>
>
> In any case, thanks already for reading this,
>
> Max
>
>
> (I suppose my plan now is that instead of debugging the kernel further,
> I should come up with a simpler reproducer, to see whether the problem
> is really just a concurrent ftruncate() + pread() on two FDs that point
> to the same file.)
>
> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
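
For illustration, the simpler reproducer proposed in the quoted mail (a concurrent ftruncate() + pread() on two FDs that point to the same file, run in two threads for a fixed time) might look roughly like the sketch below. This is not code from src/t_mmap_collision.c: the names run_repro(), worker_fn(), struct repro_ctx and struct worker are hypothetical, there is no mmap() or DAX involved, and whether this reduced pattern is enough to trigger the EFAULT or the hang on virtio-fs is exactly the open question.

```c
/* Hypothetical reduced reproducer: two threads, each truncating a file
 * down and back up through one FD and then reading it through a second
 * FD on the same file. Build with: cc -pthread repro.c */
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

struct repro_ctx {
    int trunc_fd;     /* FD used for ftruncate() (the "dax_fd" role) */
    int read_fd;      /* second FD on the same file, used for pread() */
    size_t file_size; /* size the file is truncated back to */
    int seconds;      /* how long each thread keeps looping */
};

struct worker {
    const struct repro_ctx *ctx;
    long efaults;      /* pread() calls that failed with EFAULT */
    long other_errors; /* any other failure */
};

static void *worker_fn(void *arg)
{
    struct worker *w = arg;
    const struct repro_ctx *c = w->ctx;
    char *buf = malloc(c->file_size);
    time_t end = time(NULL) + c->seconds;

    if (!buf) {
        w->other_errors++;
        return NULL;
    }
    while (time(NULL) < end) {
        /* Truncate to 0, then back to file_size, as in the ftruncate()
         * variant of the test described in the mail. */
        if (ftruncate(c->trunc_fd, 0) < 0 ||
            ftruncate(c->trunc_fd, (off_t)c->file_size) < 0) {
            w->other_errors++;
            continue;
        }
        /* Read through the other FD; short reads are not errors. */
        if (pread(c->read_fd, buf, c->file_size, 0) < 0) {
            if (errno == EFAULT)
                w->efaults++;
            else
                w->other_errors++;
        }
    }
    free(buf);
    return NULL;
}

/* Returns the number of EFAULTs seen by both threads, or -1 if the
 * file could not be set up or the threads could not be started. */
long run_repro(const char *path, size_t file_size, int seconds)
{
    struct repro_ctx ctx = { .file_size = file_size, .seconds = seconds };
    struct worker w[2] = { { .ctx = &ctx }, { .ctx = &ctx } };
    pthread_t t[2];
    int started = 0;
    long result = -1;

    ctx.trunc_fd = open(path, O_RDWR | O_CREAT, 0644);
    ctx.read_fd = open(path, O_RDONLY);
    if (ctx.trunc_fd < 0 || ctx.read_fd < 0)
        goto out;
    if (ftruncate(ctx.trunc_fd, (off_t)file_size) < 0)
        goto out;

    for (; started < 2; started++)
        if (pthread_create(&t[started], NULL, worker_fn, &w[started]) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    if (started == 2)
        result = w[0].efaults + w[1].efaults;
out:
    if (ctx.trunc_fd >= 0)
        close(ctx.trunc_fd);
    if (ctx.read_fd >= 0)
        close(ctx.read_fd);
    return result;
}
```

Calling run_repro() from a small main() with a path on the virtio-fs mount, a file size and a duration of one second would mirror the "two threads for a second" behaviour described above. A non-zero return value would point at the ftruncate()/pread() interaction, while a hang would show up as the call never returning.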