From: Karel Zak <kzak@redhat.com>
To: Theodore Tso <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, Eric Sandeen <sandeen@redhat.com>,
mbroz@redhat.com, agk@redhat.com
Subject: Re: [PATCH] blkid: optimize dm_device_is_leaf() usage
Date: Tue, 26 Aug 2008 22:47:37 +0200
Message-ID: <20080826204737.GM6029@nb.net.home>
In-Reply-To: <20080826144721.GD8720@mit.edu>
On Tue, Aug 26, 2008 at 10:47:21AM -0400, Theodore Tso wrote:
> On Tue, Aug 26, 2008 at 03:51:02PM +0200, Karel Zak wrote:
> > Well, I see few problems:
> >
> > * /proc/partitions containing internal dm device names (e.g. dm-0).
> > The libdevmapper provides translation from internal to the "real"
> > names (e.g /dev/mapper/foo). I guess (hope:-) /sys provides the
> > real names too.
>
> You're right. So searching for /dev/mapper/dm-n doesn't make any
> sense; adding /dev/mapper to the dirlist doesn't help, and in fact is
> a waste of time. However, the patch actually *did* work, and the
> reason why it does is because we are also searching /dev/mapper by
> device number, and so we are finding the device name that way.
OK.
> I don't think you mean multipath support in terms of where there are
> multiple paths to the same physical device ala fiber channel, but
Ignore this point. You are right. The physical devices are slaves to
the final DM device (when dm-multipath is on). BTW, I look forward to
seeing multiple paths vs. udev (e.g. /dev/disk/by-*) :-)
> What we don't solve is the problem where one devicemapper device is
> used to build another device mapper device. This could happen in a
> number of circumstances. You might have some weird circumstance where
> /dev/mapper/part1 and /dev/mapper/part2 are glued together to make
> /dev/mapper/whole-filesystem. Why you might do this instead of simply
> using something like lvextend is beyond me, but that is something
> legitimate that can be done with the low-level device mapper primitives.
There is a worse scenario (thanks to Milan Broz from the DM camp):
  dmsetup create x --table "0 100 linear /dev/sdb 0"
  dmsetup create y --table "0 100 linear /dev/mapper/x 0"
  dmsetup create z --table "0 100 linear /dev/mapper/y 0"
  # dmsetup ls --tree
  z (254:3)
   `-y (254:2)
      `-x (254:1)
         `- (8:16)
This means all these devices are exactly the same, but
  mount LABEL=foo
has to mount /dev/mapper/z (from the top of the tree). The sdb, x and
y devices should be invisible to mount(8).
> But, #1, there are times when picking a leaf dm device over a non-leaf
> dm device is not the right thing to do (which would be the case when
> you make a live snapshot of a filesystem), and #2, your patch only
> checks non-leaf dm devices for non-dm devices, probably because of #1.
>
> So with both of our patches, we have the problem where we could pick
> the wrong dm device if the user builds one dm device on top of another
I don't think so. The dm_probe_all() function never returns any
DM device which is a slave to any other device. It means it always
returns the device from the top of the hierarchy.
All devices from dm_probe_all() have a higher priority than other
stuff from /proc/partitions (for example dm-N devs).
So back to your example...
/dev/mapper/part1 + /dev/mapper/part2 = /dev/mapper/whole-filesystem
the /dev/mapper/part1 and /dev/mapper/part2 will be visible to the
library (e.g. blkid.tab), but with *lower priority* than
/dev/mapper/whole-filesystem.
In your non-libdevmapper implementation you need to check
/sys/block/dm-N/holders/ and prefer devices without holders.
I think we can ignore this minor problem for now. I'll try to find a
better solution for dependency resolution without libdevmapper. My
wish is to avoid libdevmapper in libfsprobe.
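For illustration, the /sys/block/dm-N/holders/ check mentioned above could
look roughly like the sketch below. It is only an assumption of how a
non-libdevmapper implementation might do it: dm_holder_count() and
dm_is_top_level() are hypothetical names, not blkid functions, and the
sysfs root is passed in (normally "/sys/block") so the code can be pointed
at a test tree.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the entries in <sysfs_block>/<name>/holders, skipping "." and "..".
 * Returns the number of holders, or -1 if the directory cannot be opened.
 * Hypothetical helper; sysfs_block is normally "/sys/block".
 */
int dm_holder_count(const char *sysfs_block, const char *name)
{
	char path[4096];
	struct dirent *d;
	DIR *dir;
	int len, n = 0;

	len = snprintf(path, sizeof(path), "%s/%s/holders", sysfs_block, name);
	if (len < 0 || (size_t) len >= sizeof(path))
		return -1;

	dir = opendir(path);
	if (!dir)
		return -1;

	while ((d = readdir(dir)) != NULL) {
		if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
			continue;
		n++;
	}
	closedir(dir);
	return n;
}

/*
 * A device sits at the top of the stack when nothing holds it.
 * Returns 1 for "no holders", 0 otherwise (including lookup errors).
 */
int dm_is_top_level(const char *sysfs_block, const char *name)
{
	return dm_holder_count(sysfs_block, name) == 0;
}
```

A caller would then prefer devices for which dm_is_top_level("/sys/block",
"dm-N") returns 1. Treating an unreadable holders directory as "not top
level" is a design choice of this sketch, nothing more.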
> > > + if (dev) {
> > > + if (pri)
> > > + dev->bid_pri = pri;
> > > + else if (!strncmp(dev->bid_name, "/dev/mapper/", 11))
what about "if (major(devno) == DMMAJOR)" rather than strcmp()?
> > > + dev->bid_pri = BLKID_PRI_DM;
> >
> > the same problem
>
> This does work, because we do find the /dev/mapper name via a
> brute-force search of /dev looking for a matching devno when we call
> blkid_devno_to_devname(). What I *can* do is do a special search of
> /dev/mapper first, but instead of looking for /dev/mapper/<ptname>, to
> do a readdir search of /dev/mapper looking for the matching devno.
Not elegant, but... good enough :-)
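A minimal sketch of such a readdir search, assuming nothing about the real
blkid code: find_devname_by_devno() is a hypothetical helper that walks a
directory such as /dev/mapper and compares the st_rdev of each block device
node with the wanted devno.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Scan dirname (e.g. "/dev/mapper") for a block device node whose device
 * number equals devno. On success the full path is stored in buf and 0 is
 * returned. On failure buf is set to "" and -1 is returned.
 * Hypothetical helper, not part of libblkid.
 */
int find_devname_by_devno(const char *dirname, dev_t devno,
			  char *buf, size_t bufsz)
{
	struct dirent *d;
	struct stat st;
	DIR *dir;
	int rc = -1;

	if (bufsz > 0)
		buf[0] = '\0';

	dir = opendir(dirname);
	if (!dir)
		return -1;

	while ((d = readdir(dir)) != NULL) {
		int len = snprintf(buf, bufsz, "%s/%s", dirname, d->d_name);

		if (len < 0 || (size_t) len >= bufsz)
			continue;	/* name does not fit, skip it */

		/* stat() follows symlinks; entries may be links to dm-N */
		if (stat(buf, &st) == 0 && S_ISBLK(st.st_mode) &&
		    st.st_rdev == devno) {
			rc = 0;
			break;
		}
	}
	closedir(dir);

	if (rc != 0 && bufsz > 0)
		buf[0] = '\0';
	return rc;
}
```

Searching /dev/mapper first and falling back to a wider scan of /dev only
on failure would keep the common case cheap.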
It would be nice to have /sys/block/dm-N/name where you can translate
the internal dm-N name to the real device name. Alasdair? Milan? :-)
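As for the major(devno) check suggested in the patch comments above: the
device-mapper major number is assigned dynamically, so a DMMAJOR-style
value would have to be looked up at run time. One possible sketch, assuming
the usual /proc/devices layout with a "Block devices:" section and a
"device-mapper" entry; parse_dm_major() is a hypothetical name:

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse a /proc/devices style stream and return the block major
 * registered as "device-mapper", or -1 if it is not listed.
 * Entries before the "Block devices:" header are ignored.
 */
int parse_dm_major(FILE *f)
{
	char line[256], name[64];
	int in_block = 0, maj;

	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "Block devices:", 14)) {
			in_block = 1;
			continue;
		}
		if (!in_block)
			continue;
		if (sscanf(line, "%d %63s", &maj, name) == 2 &&
		    !strcmp(name, "device-mapper"))
			return maj;
	}
	return -1;
}
```

With the result cached, the comparison becomes major(devno) == dm_major,
where major() comes from <sys/sysmacros.h> on glibc.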
Karel
--
Karel Zak <kzak@redhat.com>