From: Mike Snitzer <snitzer@redhat.com>
To: "Lukáš Czerner" <lczerner@redhat.com>
Cc: amwang@redhat.com, Zdenek Kabelac <zkabelac@redhat.com>,
Hugh Dickins <hughd@google.com>,
linux-kernel@vger.kernel.org, Joe Thornber <ejt@redhat.com>,
LVM general discussion and development <linux-lvm@redhat.com>,
Alasdair G Kergon <agk@redhat.com>
Subject: Re: [linux-lvm] Regression with FALLOC_FL_PUNCH_HOLE in 3.5-rc kernel
Date: Mon, 2 Jul 2012 09:41:04 -0400
Message-ID: <20120702134104.GC785@redhat.com>
In-Reply-To: <alpine.LFD.2.00.1207021216430.24050@dhcp-1-248.brq.redhat.com>
On Mon, Jul 02 2012 at 6:35am -0400,
Lukáš Czerner <lczerner@redhat.com> wrote:
> >
> > So you're testing a rather old kernel, so you might be missing some
> > fixes there. Could you rerun the test with a recent kernel?
> >
> > Also it appears that the bug here happens because dm requested a
> > destination page which is within the kernel space. It seems that
> > this has been initiated by the write request from the mirror target.
> > So I do not immediately see how punch hole (discard) is involved at
> > all. Perhaps you were lucky enough to hit a different bug?
> >
> > Looking at git log, this commit has been brought to my attention:
> >
> > 0c535e0d6f463365c29623350dbd91642363c39b dm io: fix discard support
> >
> > which seems related to this crash.
> >
> > Please retest with a recent kernel.
Ah, you beat me to recommending that fix ;)
> So from the original backtrace for the problem Zdenek is seeing on 3.5.0-rc4
> (https://lkml.org/lkml/2012/6/30/98) I think that this is a
> problem in the device mapper itself. I do not think it has anything
> to do with tmpfs or mm. Zdenek's bisect clearly shows that the
> problem appears when discard support for the loop device is added,
> so it is most likely related to the dm discard support.
What about using scsi_debug with the dm-mirror target?
Never say never: DM-mirror and/or dm-io code could still have an issue,
but the commit referenced above did fix discard with the mirror target
back in 3.3.
> Anyway, the backtrace points to a NULL pointer dereference in
> dm_rh_region_context(), which is a simple function:
>
> void *dm_rh_region_context(struct dm_region *reg)
> {
> 	return reg->rh->context;
> }
>
> so either reg or reg->rh is NULL. Now the only place this is used is
> from recovery_complete() in dm-raid1.c. So this is somewhat related
> to raid recovery. I am not familiar with the dm code, but can
> someone from the dm team look at this?
I'll coordinate with Zdenek.
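
For anyone following along, here is a rough user-space sketch of the
pointer chain in the function quoted above, showing how one might tell
the two failure cases (reg is NULL versus reg->rh is NULL) apart while
debugging. The struct definitions are hypothetical stand-ins that model
only the fields involved, not the real kernel layouts, and
region_context_unchecked(), region_context_checked() and enum
region_fault are illustrative names that do not exist in the dm code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * needed to show the pointer chain are modelled here. */
struct dm_region_hash {
	void *context;
};

struct dm_region {
	struct dm_region_hash *rh;
};

/* Same shape as the quoted function: faults if reg or reg->rh is NULL. */
static void *region_context_unchecked(struct dm_region *reg)
{
	return reg->rh->context;
}

/* Illustrative result codes naming which link of the chain was NULL. */
enum region_fault { REGION_OK, REGION_NULL_REG, REGION_NULL_RH };

/* Debugging variant: reports the broken link instead of dereferencing it.
 * On any failure *ctx is set to NULL. */
static enum region_fault region_context_checked(struct dm_region *reg,
						void **ctx)
{
	*ctx = NULL;
	if (!reg)
		return REGION_NULL_REG;
	if (!reg->rh)
		return REGION_NULL_RH;
	*ctx = reg->rh->context;
	return REGION_OK;
}
```

In the kernel itself the equivalent would more likely be a temporary
WARN_ON() or BUG_ON() on each pointer in the caller; the sketch only
illustrates that the two cases are distinguishable.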
> But just to be sure to rule out the punch hole thing, Zdenek, can you
> run your tests on a "real" discard-capable device? Or at least on
> a device which does not convert discard requests into punch hole?
> You can use scsi_debug to create such a device:
>
> modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1 lbpu=1
Great minds think alike ;)