From: Mike Snitzer <snitzer@redhat.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: dm-devel@redhat.com, linux-next@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: x86 VM Boot hang with latest linux-next
Date: Mon, 4 Mar 2019 23:07:15 -0500
Message-ID: <20190305040714.GB21739@redhat.com>
In-Reply-To: <CAKgT0Ue9+wSL60Bj0drkgVQwem=mh_eTrFMda+inX4=qBgi3dA@mail.gmail.com>

On Sun, Mar 03 2019 at 12:06pm -0500,
Alexander Duyck <alexander.duyck@gmail.com> wrote:
> On Sat, Mar 2, 2019 at 7:48 PM Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > On Sat, Mar 02 2019 at 6:34pm -0500,
> > Alexander Duyck <alexander.duyck@gmail.com> wrote:
> >
> > > So I have been seeing an issue with an intermittent boot hang on my
> > > x86 KVM VM with the latest linux-next and have bisected it down to the
> > > following commit:
> > > 1efa3bb79d3de8ca1b7f6770313a1fc0bebe25c7 is the first bad commit
> > > commit 1efa3bb79d3de8ca1b7f6770313a1fc0bebe25c7
> > > Author: Mike Snitzer <snitzer@redhat.com>
> > > Date: Fri Feb 22 11:23:01 2019 -0500
> > >
> > > dm: must allocate dm_noclone for stacked noclone devices
> > >
> > > Otherwise various lvm2 testsuite tests fail because the lower layers of
> > > the stacked noclone device aren't updated to allocate a new 'struct
> > > dm_clone' that reflects the upper layer bio that was issued to it.
> > >
> > > Fixes: 97a89458020b38 ("dm: improve noclone bio support")
> > > Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > >
> > > What I am seeing is in about 3 out of 4 boots the startup just hangs
> > > at the filesystem check stage with the following message:
> > > [ OK ] Reached target Local File Systems (Pre).
> > > Starting File System Check on /dev/…127-ad57-426f-bb45-363950544c0c...
> > > [ **] (1 of 2) A start job is running for…n on device 252:2 (19s / no limit)
> > >
> > > I did some googling and it looks like a similar issue has been
> > > reported for s390. Based on the request for data there I have the
> > > following info:
> > > [root@localhost ~]# dmsetup ls --tree
> > > fedora-swap (253:1)
> > > └─ (252:2)
> > > fedora-root (253:0)
> > > └─ (252:2)
> > >
> > > [root@localhost ~]# dmsetup table
> > > fedora-swap: 0 4194304 linear 252:2 2048
> > > fedora-root: 0 31457280 linear 252:2 4196352
> >
> > Thanks, which version of Fedora are you running?
>
> The VM is running Fedora 27 with a kernel built off of latest
> linux-next as of March 1st.
>
> > Your case is more straightforward in that you're clearly using bio-based
> > DM linear (which was updated to leverage "noclone" support); whereas the
> > s390 case is using request-based DM which isn't impacted by the commit
> > in question at all.
> >
> > I'll attempt to reproduce first thing Monday.
> >
> > Mike
>
> Thanks. The behavior of it has me wondering if we are looking at
> something like an uninitialized data issue or something like that
> since as I mentioned I don't see this occur on every boot, just on
> most of them. So every now and then I can boot up the VM without any
> issues, but most of the time it will boot and then get stuck waiting
> on jobs that take forever.

I just copied you on another related thread, but for the benefit of
anyone on LKML, please see the following for a fix that works for me:
https://www.redhat.com/archives/dm-devel/2019-March/msg00027.html
Thread overview: 6+ messages
2019-03-02 23:34 x86 VM Boot hang with latest linux-next Alexander Duyck
2019-03-03 3:48 ` Mike Snitzer
2019-03-03 17:06 ` Alexander Duyck
2019-03-04 23:02 ` Mike Snitzer
2019-03-05 4:07 ` Mike Snitzer [this message]
2019-03-05 4:07 ` Mike Snitzer