* DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] [not found] ` <20201222131528.GA29822@lst.de> @ 2020-12-22 14:53 ` Mike Snitzer 2020-12-22 17:24 ` Hannes Reinecke 0 siblings, 1 reply; 5+ messages in thread From: Mike Snitzer @ 2020-12-22 14:53 UTC (permalink / raw) To: Christoph Hellwig, linux-block, dm-devel Cc: Stephen Rothwell, Alasdair G Kergon, Hannes Reinecke, Jens Axboe, Linux Kernel Mailing List, Linux Next Mailing List [added linux-block and dm-devel, if someone replies to this email to continue "proper discussion" _please_ at least drop sfr and linux-next from Cc] On Tue, Dec 22 2020 at 8:15am -0500, Christoph Hellwig <hch@lst.de> wrote: > Mike, Hannes, > > I think this patch is rather harmful. Why does device mapper even > mix file system path with a dev_t and all the other weird forms > parsed by name_to_dev_t, which was supposed to be be for the early > init code where no file system is available. OK, I'll need to revisit (unless someone beats me to it) because this could've easily been a blind-spot for me when the dm-init code went in. Any dm-init specific enabling interface shouldn't be used by more traditional DM interfaces. So Hannes' change might be treating symptom rather than the core problem (which would be better treated by factoring out dm-init requirements for a name_to_dev_t()-like interface?). DM has supported passing maj:min and blockdev names on DM table lines forever... so we'll need to be very specific about where/why things regressed. > Can we please kick off a proper discussion for this on the linux-block > list? Sure, done. But I won't drive that discussion in the near-term. I need to take some time off for a few weeks. In the meantime I'll drop Hannes' patch for 5.11; I'm open to an alternative fix that I'd pickup during 5.11-rcX. Thanks, Mike ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] 2020-12-22 14:53 ` DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] Mike Snitzer @ 2020-12-22 17:24 ` Hannes Reinecke 2020-12-22 21:06 ` [dm-devel] " Alasdair G Kergon 2020-12-23 8:05 ` Christoph Hellwig 0 siblings, 2 replies; 5+ messages in thread From: Hannes Reinecke @ 2020-12-22 17:24 UTC (permalink / raw) To: Mike Snitzer, Christoph Hellwig, linux-block, dm-devel Cc: Stephen Rothwell, Alasdair G Kergon, Jens Axboe, Linux Kernel Mailing List, Linux Next Mailing List On 12/22/20 3:53 PM, Mike Snitzer wrote: > [added linux-block and dm-devel, if someone replies to this email to > continue "proper discussion" _please_ at least drop sfr and linux-next > from Cc] > > On Tue, Dec 22 2020 at 8:15am -0500, > Christoph Hellwig <hch@lst.de> wrote: > >> Mike, Hannes, >> >> I think this patch is rather harmful. Why does device mapper even >> mix file system path with a dev_t and all the other weird forms >> parsed by name_to_dev_t, which was supposed to be be for the early >> init code where no file system is available. > > OK, I'll need to revisit (unless someone beats me to it) because this > could've easily been a blind-spot for me when the dm-init code went in. > Any dm-init specific enabling interface shouldn't be used by more > traditional DM interfaces. So Hannes' change might be treating symptom > rather than the core problem (which would be better treated by factoring > out dm-init requirements for a name_to_dev_t()-like interface?). > > DM has supported passing maj:min and blockdev names on DM table lines > forever... so we'll need to be very specific about where/why things > regressed. > Ok. The problem from my perspective is that device-mapper needs to a) ensure that the arbitrary string passed in with the table definition refers to a valid block device and b) the block device can be opened with O_EXCL, so that device-mapper can then use it. Originally (ie prior to commit 644bda6f3460) dm_get_device() just converted the string to a 'dev_t' representation, and then the block device itself was checked and opened in dm_get_table_device(). 'lookup_bdev' was just being used to convert the path if the string was not in the canonical major:minor format, as then it was assumed that it referred to a block device node, and then lookup_bdev kinda makes sense. However, lookup_bdev() now always recurses into the filesystem, causing multipath to stall in an all-paths-down scenario. So, the real issue is the table definiton; as it also accepts a device to be specified by the block device _node_ name, we need to have a way of converting that into a dev_t. If lookup_bdev() is the wrong interface for that, by all means, please, do tell me. I'd be happy to draft up a patch. Alternatively, if Mike says that only major:minor is the valid format for a table definition we can kill that code completely. But clearly _I_ cannot make the call here. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dm-devel] DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] 2020-12-22 17:24 ` Hannes Reinecke @ 2020-12-22 21:06 ` Alasdair G Kergon 2020-12-23 8:06 ` Christoph Hellwig 2020-12-23 8:05 ` Christoph Hellwig 1 sibling, 1 reply; 5+ messages in thread From: Alasdair G Kergon @ 2020-12-22 21:06 UTC (permalink / raw) To: Hannes Reinecke Cc: Mike Snitzer, Christoph Hellwig, linux-block, dm-devel, Jens Axboe, Linux Kernel Mailing List, Alasdair G Kergon On Tue, Dec 22, 2020 at 06:24:09PM +0100, Hannes Reinecke wrote: > However, lookup_bdev() now always recurses into the filesystem, causing > multipath to stall in an all-paths-down scenario. I have not read the background about whatever the new problem is - I'm jumping in cold seeing this message - but from the very beginning of device-mapper we have strongly recommended that userspace supplies the block device in the form MAJOR:MINOR and all our own tools do that. We guarantee not to deadlock in these places when this is done. We also accept the device in the form of a path name as we know there are times when this is safe and convenient, but then we offer no guarantees - we place the responsibility upon userspace only to do this when it knows it is safe to do so i.e. no race and no deadlock. Alasdair ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [dm-devel] DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] 2020-12-22 21:06 ` [dm-devel] " Alasdair G Kergon @ 2020-12-23 8:06 ` Christoph Hellwig 0 siblings, 0 replies; 5+ messages in thread From: Christoph Hellwig @ 2020-12-23 8:06 UTC (permalink / raw) To: Hannes Reinecke, Mike Snitzer, Christoph Hellwig, linux-block, dm-devel, Jens Axboe, Linux Kernel Mailing List, Alasdair G Kergon On Tue, Dec 22, 2020 at 09:06:04PM +0000, Alasdair G Kergon wrote: > I have not read the background about whatever the new problem is - I'm > jumping in cold seeing this message - but from the very beginning of > device-mapper we have strongly recommended that userspace supplies the > block device in the form MAJOR:MINOR and all our own tools do that. We > guarantee not to deadlock in these places when this is done. > > We also accept the device in the form of a path name as we know there > are times when this is safe and convenient, but then we offer no > guarantees - we place the responsibility upon userspace only to do this > when it knows it is safe to do so i.e. no race and no deadlock. 644bda6f346038bce7ad3ed48f7044c10dde6d47 changes that by accepting all kinds of weirdo formats :( ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] 2020-12-22 17:24 ` Hannes Reinecke 2020-12-22 21:06 ` [dm-devel] " Alasdair G Kergon @ 2020-12-23 8:05 ` Christoph Hellwig 1 sibling, 0 replies; 5+ messages in thread From: Christoph Hellwig @ 2020-12-23 8:05 UTC (permalink / raw) To: Hannes Reinecke Cc: Mike Snitzer, Christoph Hellwig, linux-block, dm-devel, Stephen Rothwell, Alasdair G Kergon, Jens Axboe, Linux Kernel Mailing List, Linux Next Mailing List On Tue, Dec 22, 2020 at 06:24:09PM +0100, Hannes Reinecke wrote: > Ok. The problem from my perspective is that device-mapper needs to > a) ensure that the arbitrary string passed in with the table definition > refers to a valid block device > and > b) the block device can be opened with O_EXCL, so that device-mapper can > then use it. > > Originally (ie prior to commit 644bda6f3460) dm_get_device() just converted > the string to a 'dev_t' representation, and then the block device itself > was checked and opened in dm_get_table_device(). > 'lookup_bdev' was just being used to convert the path if the string was not > in the canonical major:minor format, as then it was assumed that it > referred to a block device node, and then lookup_bdev kinda makes sense. Yes, 644bda6f3460 is the cause of all evil, as it uses an API in a place where it should not be used. It and the prep patch (e6e20a7a5f3f49bfee518d5c6849107398d83912) which did the grave mistake of making name_to_dev_t available outside of the early init code really need to be reverted. > However, lookup_bdev() now always recurses into the filesystem, causing > multipath to stall in an all-paths-down scenario. lookup_bdev always did a file system lookup, and always only accepted a valid name in the file system. > Alternatively, if Mike says that only major:minor is the valid format for a > table definition we can kill that code completely. But clearly _I_ cannot > make the call here. Before 644bda6f3460 the table definitions only accepted a valid name in the file system. Which is the proper interface. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-12-23 8:07 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20201222095056.7a5ac0a0@canb.auug.org.au>
[not found] ` <20201222131528.GA29822@lst.de>
2020-12-22 14:53 ` DM's filesystem lookup in dm_get_dev_t() [was: Re: linux-next: manual merge of the device-mapper tree with Linus' tree] Mike Snitzer
2020-12-22 17:24 ` Hannes Reinecke
2020-12-22 21:06 ` [dm-devel] " Alasdair G Kergon
2020-12-23 8:06 ` Christoph Hellwig
2020-12-23 8:05 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).