* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. [not found] <20180522005336.GA30152@yyp.\x02> @ 2018-06-19 13:11 ` Mike Snitzer 2018-06-19 14:43 ` Joe Thornber 0 siblings, 1 reply; 7+ messages in thread From: Mike Snitzer @ 2018-06-19 13:11 UTC (permalink / raw) To: Monty Pavel; +Cc: dm-devel On Mon, May 21 2018 at 8:53pm -0400, Monty Pavel <monty_pavel@sina.com> wrote: > > If dm_bufio_write_dirty_buffers func is called by __commit_transaction > func and power loss happens during executing it, coincidencely > superblock wrote correctly but some metadata blocks didn't. The reason > is we write all metadata in async mode. We can guarantee that we send > superblock after other blocks but we cannot guarantee that superblock > write completely early than other blocks. > So, We need to commit other metadata blocks before change superblock. > > Signed-off-by: Monty Pavel <monty_pavel@sina.com> > --- > drivers/md/dm-thin-metadata.c | 8 ++++++++ > 1 files changed, 8 insertions(+), 0 deletions(-) > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c > index 36ef284..897d7d6 100644 > --- a/drivers/md/dm-thin-metadata.c > +++ b/drivers/md/dm-thin-metadata.c > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd) > if (r) > return r; > > + r = dm_tm_commit(pmd->tm, sblock); > + if (r) > + return r; > + > + r = superblock_lock(pmd, &sblock); > + if (r) > + return r; > + > disk_super = dm_block_data(sblock); > disk_super->time = cpu_to_le32(pmd->time); > disk_super->data_mapping_root = cpu_to_le64(pmd->root); > -- > 1.7.1 Have you actually found this patch to be effective? It should be unnecessary. But I must admit that in looking at the related code I couldn't convince myself it was. But then Joe pointed me to this comment block from dm-transaction-manager.h: /* * We use a 2-phase commit here. * * i) Make all changes for the transaction *except* for the superblock. * Then call dm_tm_pre_commit() to flush them to disk. * * ii) Lock your superblock. Update. Then call dm_tm_commit() which will * unlock the superblock and flush it. No other blocks should be updated * during this period. Care should be taken to never unlock a partially * updated superblock; perform any operations that could fail *before* you * take the superblock lock. */ int dm_tm_pre_commit(struct dm_transaction_manager *tm); int dm_tm_commit(struct dm_transaction_manager *tm, struct dm_block *superblock); So given __commit_transaction() is using dm_tm_pre_commit() prior to the dm_tm_commit() to flush the superblock -- it would seem that there isn't any conceptual potential for corruption. If you've found the dm_tm_pre_commit() to be lacking (whereby not all metadata getting flushed to disk before the superblock) then please explain your findings. Thanks, Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. 2018-06-19 13:11 ` dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode Mike Snitzer @ 2018-06-19 14:43 ` Joe Thornber 2018-06-19 15:00 ` Mike Snitzer 0 siblings, 1 reply; 7+ messages in thread From: Joe Thornber @ 2018-06-19 14:43 UTC (permalink / raw) To: Mike Snitzer; +Cc: dm-devel, Monty Pavel On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote: > On Mon, May 21 2018 at 8:53pm -0400, > Monty Pavel <monty_pavel@sina.com> wrote: > > > > > If dm_bufio_write_dirty_buffers func is called by __commit_transaction > > func and power loss happens during executing it, coincidencely > > superblock wrote correctly but some metadata blocks didn't. The reason > > is we write all metadata in async mode. We can guarantee that we send > > superblock after other blocks but we cannot guarantee that superblock > > write completely early than other blocks. > > So, We need to commit other metadata blocks before change superblock. > > > > Signed-off-by: Monty Pavel <monty_pavel@sina.com> > > --- > > drivers/md/dm-thin-metadata.c | 8 ++++++++ > > 1 files changed, 8 insertions(+), 0 deletions(-) > > > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c > > index 36ef284..897d7d6 100644 > > --- a/drivers/md/dm-thin-metadata.c > > +++ b/drivers/md/dm-thin-metadata.c > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd) > > if (r) > > return r; > > > > + r = dm_tm_commit(pmd->tm, sblock); > > + if (r) > > + return r; > > + > > + r = superblock_lock(pmd, &sblock); > > + if (r) > > + return r; > > + > > disk_super = dm_block_data(sblock); > > disk_super->time = cpu_to_le32(pmd->time); > > disk_super->data_mapping_root = cpu_to_le64(pmd->root); I don't believe you've tested this; sblock is passed to dm_tm_commit() uninitialised, and you didn't even bother to remove the later (and correct) call to dm_tm_commit(). See dm-transaction-manager.h for an explanation of how the 2-phase commit works. What is the issue that started you looking in this area? - Joe ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. 2018-06-19 14:43 ` Joe Thornber @ 2018-06-19 15:00 ` Mike Snitzer [not found] ` <20180620170357.GA5838@yyp.\x02> 2018-06-20 17:03 ` monty 0 siblings, 2 replies; 7+ messages in thread From: Mike Snitzer @ 2018-06-19 15:00 UTC (permalink / raw) To: Monty Pavel, dm-devel; +Cc: ejt On Tue, Jun 19 2018 at 10:43am -0400, Joe Thornber <thornber@redhat.com> wrote: > On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote: > > On Mon, May 21 2018 at 8:53pm -0400, > > Monty Pavel <monty_pavel@sina.com> wrote: > > > > > > > > If dm_bufio_write_dirty_buffers func is called by __commit_transaction > > > func and power loss happens during executing it, coincidencely > > > superblock wrote correctly but some metadata blocks didn't. The reason > > > is we write all metadata in async mode. We can guarantee that we send > > > superblock after other blocks but we cannot guarantee that superblock > > > write completely early than other blocks. > > > So, We need to commit other metadata blocks before change superblock. > > > > > > Signed-off-by: Monty Pavel <monty_pavel@sina.com> > > > --- > > > drivers/md/dm-thin-metadata.c | 8 ++++++++ > > > 1 files changed, 8 insertions(+), 0 deletions(-) > > > > > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c > > > index 36ef284..897d7d6 100644 > > > --- a/drivers/md/dm-thin-metadata.c > > > +++ b/drivers/md/dm-thin-metadata.c > > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd) > > > if (r) > > > return r; > > > > > > + r = dm_tm_commit(pmd->tm, sblock); > > > + if (r) > > > + return r; > > > + > > > + r = superblock_lock(pmd, &sblock); > > > + if (r) > > > + return r; > > > + > > > disk_super = dm_block_data(sblock); > > > disk_super->time = cpu_to_le32(pmd->time); > > > disk_super->data_mapping_root = cpu_to_le64(pmd->root); > > I don't believe you've tested this; sblock is passed to dm_tm_commit() > uninitialised, and you didn't even bother to remove the later (and correct) > call to dm_tm_commit(). I pointed out to Joe that the patch, in isolation, is decieving. It _looks_ like sblock may be uninitialized, etc. But once the patch is applied and you look at the entirety of __commit_transaction() it is clear that you're reusing the existing superblock_lock() to safely accomplish your additional call to dm_tm_commit(). > What is the issue that started you looking in this area? Right, as my previous reply asked: please clarify if you _know_ your patch fixes an actual problem you've experienced. The more details the better. Thanks, Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20180620170357.GA5838@yyp.\x02>]
* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. [not found] ` <20180620170357.GA5838@yyp.\x02> @ 2018-06-20 14:51 ` Mike Snitzer 2018-06-21 16:54 ` monty 2018-06-20 15:00 ` Joe Thornber 1 sibling, 1 reply; 7+ messages in thread From: Mike Snitzer @ 2018-06-20 14:51 UTC (permalink / raw) To: monty; +Cc: dm-devel, thornber On Wed, Jun 20 2018 at 1:03pm -0400, monty <monty_pavel@sina.com> wrote: > > On Tue, Jun 19, 2018 at 11:00:32AM -0400, Mike Snitzer wrote: > > > > On Tue, Jun 19 2018 at 10:43am -0400, > > Joe Thornber <thornber@redhat.com> wrote: > > > > > On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote: > > > > On Mon, May 21 2018 at 8:53pm -0400, > > > > Monty Pavel <monty_pavel@sina.com> wrote: > > > > > > > > > > > > > > If dm_bufio_write_dirty_buffers func is called by __commit_transaction > > > > > func and power loss happens during executing it, coincidencely > > > > > superblock wrote correctly but some metadata blocks didn't. The reason > > > > > is we write all metadata in async mode. We can guarantee that we send > > > > > superblock after other blocks but we cannot guarantee that superblock > > > > > write completely early than other blocks. > > > > > So, We need to commit other metadata blocks before change superblock. > > > > > > > > > > Signed-off-by: Monty Pavel <monty_pavel@sina.com> > > > > > --- > > > > > drivers/md/dm-thin-metadata.c | 8 ++++++++ > > > > > 1 files changed, 8 insertions(+), 0 deletions(-) > > > > > > > > > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c > > > > > index 36ef284..897d7d6 100644 > > > > > --- a/drivers/md/dm-thin-metadata.c > > > > > +++ b/drivers/md/dm-thin-metadata.c > > > > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd) > > > > > if (r) > > > > > return r; > > > > > > > > > > + r = dm_tm_commit(pmd->tm, sblock); > > > > > + if (r) > > > > > + return r; > > > > > + > > > > > + r = superblock_lock(pmd, &sblock); > > > > > + if (r) > > > > > + return r; > > > > > + > > > > > disk_super = dm_block_data(sblock); > > > > > disk_super->time = cpu_to_le32(pmd->time); > > > > > disk_super->data_mapping_root = cpu_to_le64(pmd->root); > > > > > > I don't believe you've tested this; sblock is passed to dm_tm_commit() > > > uninitialised, and you didn't even bother to remove the later (and correct) > > > call to dm_tm_commit(). > > > > I pointed out to Joe that the patch, in isolation, is decieving. It > > _looks_ like sblock may be uninitialized, etc. But once the patch is > > applied and you look at the entirety of __commit_transaction() it is > > clear that you're reusing the existing superblock_lock() to safely > > accomplish your additional call to dm_tm_commit(). > > > > > What is the issue that started you looking in this area? > > > > Right, as my previous reply asked: please clarify if you _know_ your > > patch fixes an actual problem you've experienced. The more details the > > better. > > > > Thanks, > > Mike > > > Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction > many times and didn't find any problem of 2-phase commit. I use > md-raid1(PCIe nvme and md-raid5) in write-behind mode to store dm-thin > metadata. > Test case: > 1. I do copy-diff test on thin device and then reboot my machine. > 2. Rebuild our device stack and exec "vgchang -ay". > The thin-pool can not be established(details_root become a bitmap node > and metadata's bitmap_root become a btree_node). But are you saying your double commit in __commit_transaction() serves as a workaround for the corruption you're seeing? Is it just a case where raid5's writebehind mode is _not_ safe for your storage config? By "reboot" do you mean a clean shutdown? Or a forced powerfail scenario? Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. 2018-06-20 14:51 ` Mike Snitzer @ 2018-06-21 16:54 ` monty 0 siblings, 0 replies; 7+ messages in thread From: monty @ 2018-06-21 16:54 UTC (permalink / raw) To: Mike Snitzer; +Cc: dm-devel, thornber On Wed, Jun 20, 2018 at 10:51:17AM -0400, Mike Snitzer wrote: > > On Wed, Jun 20 2018 at 1:03pm -0400, > monty <monty_pavel@sina.com> wrote: > > > > > On Tue, Jun 19, 2018 at 11:00:32AM -0400, Mike Snitzer wrote: > > > > > > On Tue, Jun 19 2018 at 10:43am -0400, > > > Joe Thornber <thornber@redhat.com> wrote: > > > > > > > On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote: > > > > > On Mon, May 21 2018 at 8:53pm -0400, > > > > > Monty Pavel <monty_pavel@sina.com> wrote: > > > > > > > > > > > > > > > > > If dm_bufio_write_dirty_buffers func is called by __commit_transaction > > > > > > func and power loss happens during executing it, coincidencely > > > > > > superblock wrote correctly but some metadata blocks didn't. The reason > > > > > > is we write all metadata in async mode. We can guarantee that we send > > > > > > superblock after other blocks but we cannot guarantee that superblock > > > > > > write completely early than other blocks. > > > > > > So, We need to commit other metadata blocks before change superblock. > > > > > > > > > > > > Signed-off-by: Monty Pavel <monty_pavel@sina.com> > > > > > > --- > > > > > > drivers/md/dm-thin-metadata.c | 8 ++++++++ > > > > > > 1 files changed, 8 insertions(+), 0 deletions(-) > > > > > > > > > > > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c > > > > > > index 36ef284..897d7d6 100644 > > > > > > --- a/drivers/md/dm-thin-metadata.c > > > > > > +++ b/drivers/md/dm-thin-metadata.c > > > > > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd) > > > > > > if (r) > > > > > > return r; > > > > > > > > > > > > + r = dm_tm_commit(pmd->tm, sblock); > > > > > > + if (r) > > > > > > + return r; > > > > > > + > > > > > > + r = superblock_lock(pmd, &sblock); > > > > > > + if (r) > > > > > > + return r; > > > > > > + > > > > > > disk_super = dm_block_data(sblock); > > > > > > disk_super->time = cpu_to_le32(pmd->time); > > > > > > disk_super->data_mapping_root = cpu_to_le64(pmd->root); > > > > > > > > I don't believe you've tested this; sblock is passed to dm_tm_commit() > > > > uninitialised, and you didn't even bother to remove the later (and correct) > > > > call to dm_tm_commit(). > > > > > > I pointed out to Joe that the patch, in isolation, is decieving. It > > > _looks_ like sblock may be uninitialized, etc. But once the patch is > > > applied and you look at the entirety of __commit_transaction() it is > > > clear that you're reusing the existing superblock_lock() to safely > > > accomplish your additional call to dm_tm_commit(). > > > > > > > What is the issue that started you looking in this area? > > > > > > Right, as my previous reply asked: please clarify if you _know_ your > > > patch fixes an actual problem you've experienced. The more details the > > > better. > > > > > > Thanks, > > > Mike > > > > > Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction > > many times and didn't find any problem of 2-phase commit. I use > > md-raid1(PCIe nvme and md-raid5) in write-behind mode to store dm-thin > > metadata. > > Test case: > > 1. I do copy-diff test on thin device and then reboot my machine. > > 2. Rebuild our device stack and exec "vgchang -ay". > > The thin-pool can not be established(details_root become a bitmap node > > and metadata's bitmap_root become a btree_node). > > But are you saying your double commit in __commit_transaction() serves > as a workaround for the corruption you're seeing? > > Is it just a case where raid5's writebehind mode is _not_ safe for your > storage config? By "reboot" do you mean a clean shutdown? Or a forced > powerfail scenario? > > Mike > Reboot my machine means exec reboot command. My patch seems unnecessary, because 2-phase commit have ensure that committing superblock after other metadata have written to metadata device. This problem is hard to recreate, it didn't happen after applying the patch above. I will check my device stack if there is any possible that superblock write complete early than other metadata block. Monty ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. [not found] ` <20180620170357.GA5838@yyp.\x02> 2018-06-20 14:51 ` Mike Snitzer @ 2018-06-20 15:00 ` Joe Thornber 1 sibling, 0 replies; 7+ messages in thread From: Joe Thornber @ 2018-06-20 15:00 UTC (permalink / raw) To: monty; +Cc: dm-devel, Mike Snitzer On Wed, Jun 20, 2018 at 01:03:57PM -0400, monty wrote: > Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction > many times and didn't find any problem of 2-phase commit. I use > md-raid1(PCIe nvme and md-raid5) in write-behind mode to store dm-thin > metadata. > Test case: > 1. I do copy-diff test on thin device and then reboot my machine. > 2. Rebuild our device stack and exec "vgchang -ay". > The thin-pool can not be established(details_root become a bitmap node > and metadata's bitmap_root become a btree_node). As you simplify your setup does the problem go away? eg, turn off write-behind, use just the nvme dev etc. The only effect of your change is to call flush twice rather than once. - Joe ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode. 2018-06-19 15:00 ` Mike Snitzer [not found] ` <20180620170357.GA5838@yyp.\x02> @ 2018-06-20 17:03 ` monty 1 sibling, 0 replies; 7+ messages in thread From: monty @ 2018-06-20 17:03 UTC (permalink / raw) To: Mike Snitzer; +Cc: dm-devel, thornber On Tue, Jun 19, 2018 at 11:00:32AM -0400, Mike Snitzer wrote: > > On Tue, Jun 19 2018 at 10:43am -0400, > Joe Thornber <thornber@redhat.com> wrote: > > > On Tue, Jun 19, 2018 at 09:11:06AM -0400, Mike Snitzer wrote: > > > On Mon, May 21 2018 at 8:53pm -0400, > > > Monty Pavel <monty_pavel@sina.com> wrote: > > > > > > > > > > > If dm_bufio_write_dirty_buffers func is called by __commit_transaction > > > > func and power loss happens during executing it, coincidencely > > > > superblock wrote correctly but some metadata blocks didn't. The reason > > > > is we write all metadata in async mode. We can guarantee that we send > > > > superblock after other blocks but we cannot guarantee that superblock > > > > write completely early than other blocks. > > > > So, We need to commit other metadata blocks before change superblock. > > > > > > > > Signed-off-by: Monty Pavel <monty_pavel@sina.com> > > > > --- > > > > drivers/md/dm-thin-metadata.c | 8 ++++++++ > > > > 1 files changed, 8 insertions(+), 0 deletions(-) > > > > > > > > diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c > > > > index 36ef284..897d7d6 100644 > > > > --- a/drivers/md/dm-thin-metadata.c > > > > +++ b/drivers/md/dm-thin-metadata.c > > > > @@ -813,6 +813,14 @@ static int __commit_transaction(struct dm_pool_metadata *pmd) > > > > if (r) > > > > return r; > > > > > > > > + r = dm_tm_commit(pmd->tm, sblock); > > > > + if (r) > > > > + return r; > > > > + > > > > + r = superblock_lock(pmd, &sblock); > > > > + if (r) > > > > + return r; > > > > + > > > > disk_super = dm_block_data(sblock); > > > > disk_super->time = cpu_to_le32(pmd->time); > > > > disk_super->data_mapping_root = cpu_to_le64(pmd->root); > > > > I don't believe you've tested this; sblock is passed to dm_tm_commit() > > uninitialised, and you didn't even bother to remove the later (and correct) > > call to dm_tm_commit(). > > I pointed out to Joe that the patch, in isolation, is decieving. It > _looks_ like sblock may be uninitialized, etc. But once the patch is > applied and you look at the entirety of __commit_transaction() it is > clear that you're reusing the existing superblock_lock() to safely > accomplish your additional call to dm_tm_commit(). > > > What is the issue that started you looking in this area? > > Right, as my previous reply asked: please clarify if you _know_ your > patch fixes an actual problem you've experienced. The more details the > better. > > Thanks, > Mike > Hi, Mike and Joe. Thanks for your reply. I read __commit_transaction many times and didn't find any problem of 2-phase commit. I use md-raid1(PCIe nvme and md-raid5) in write-behind mode to store dm-thin metadata. Test case: 1. I do copy-diff test on thin device and then reboot my machine. 2. Rebuild our device stack and exec "vgchang -ay". The thin-pool can not be established(details_root become a bitmap node and metadata's bitmap_root become a btree_node). Thanks, Monty ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2018-06-21 16:54 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20180522005336.GA30152@yyp.\x02>
2018-06-19 13:11 ` dm thin: superblock may write succeed before other metadata blocks because of wirting metadata in async mode Mike Snitzer
2018-06-19 14:43 ` Joe Thornber
2018-06-19 15:00 ` Mike Snitzer
[not found] ` <20180620170357.GA5838@yyp.\x02>
2018-06-20 14:51 ` Mike Snitzer
2018-06-21 16:54 ` monty
2018-06-20 15:00 ` Joe Thornber
2018-06-20 17:03 ` monty
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.