* [PATCH] DM RAID: Fix for ineffective "sync" directive
@ 2012-08-23 15:31 Jonathan Brassow
2012-09-05 1:44 ` NeilBrown
2012-09-14 18:48 ` [PATCH - v2] " Jonathan Brassow
0 siblings, 2 replies; 6+ messages in thread
From: Jonathan Brassow @ 2012-08-23 15:31 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, jbrassow
Neil,
There are two ways I see to fix the problem of the "sync" table argument
not causing the array to re-sync. We could set mddev->events to
UINT_MAX and force an mddev/bitmap event mismatch - causing the bitmap
code to set BITMAP_STALE itself. Otherwise, we can do as I have below
and set BITMAP_STALE ourselves. The real difference is whether this is
done before or after md_run(). The first solution can be done before
md_run(); the second cannot because the bitmap is allocated during
md_run(). Either solution works and correctly sets the bitmap to stale
before bitmap_load() is called. So, rather than having to think about
issues that come with setting events to UINT_MAX, I chose to set the
bitmap stale after md_run().
brassow
DM RAID: Fix for "sync" directive ineffectiveness
There are two table arguments that can be given to a DM RAID target that
control whether the array is forced to (re)synchronize or skip initialization:
"sync" and "nosync". When "sync" is given, we set mddev->recovery_cp to 0
in order to cause the device to resynchronize. This is insufficient if there
is a bitmap in use, because the array will simply look at the bitmap and see
that there is no recovery necessary.
Once md_run() is called and creates the bitmap, we set the bitmap as stale.
When the bitmap is loaded during the resume phase, a full resync will
result.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Index: linux-upstream/drivers/md/dm-raid.c
===================================================================
--- linux-upstream.orig/drivers/md/dm-raid.c
+++ linux-upstream/drivers/md/dm-raid.c
@@ -1123,6 +1123,10 @@ static int raid_ctr(struct dm_target *ti
ret = -EINVAL;
goto size_mismatch;
}
+
+ if (rs->print_flags & DMPF_SYNC)
+ set_bit(BITMAP_STALE, &(rs->md.bitmap->flags));
+
rs->callbacks.congested_fn = raid_is_congested;
dm_table_add_target_callbacks(ti->table, &rs->callbacks);
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] DM RAID: Fix for ineffective "sync" directive
2012-08-23 15:31 [PATCH] DM RAID: Fix for ineffective "sync" directive Jonathan Brassow
@ 2012-09-05 1:44 ` NeilBrown
2012-09-05 14:32 ` Brassow Jonathan
2012-09-14 18:48 ` [PATCH - v2] " Jonathan Brassow
1 sibling, 1 reply; 6+ messages in thread
From: NeilBrown @ 2012-09-05 1:44 UTC (permalink / raw)
To: Jonathan Brassow; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 3482 bytes --]
On Thu, 23 Aug 2012 10:31:54 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:
> Neil,
>
> There are two ways I see to fix the problem of the "sync" table argument
> not causing the array to re-sync. We could set mddev->events to
> UINT_MAX and force an mddev/bitmap event mismatch - causing the bitmap
> code to set BITMAP_STALE itself. Otherwise, we can do as I have below
> and set BITMAP_STALE ourselves. The real difference is whether this is
> done before or after md_run(). The first solution can be done before
> md_run(); the second cannot because the bitmap is allocated during
> md_run(). Either solution works and correctly sets the bitmap to stale
> before bitmap_load() is called. So, rather than having to think about
> issues that come with setting events to UINT_MAX, I chose to set the
> bitmap stale after md_run().
I'm not at all sure this is the right way to look at the issue.
The only reason that we have 'sync' and 'nosync' table args is because
originally we weren't storing any metadata on disk. The 'in-sync?' state of
the array is important metadata, so it needed to be given to md via a table
option.
If you have a bitmap, then you are storing this information in metadata, so
including it as a table option is redundant and - as you noticed - ignored.
If you want to assert that the data in the bitmap is incorrect, then I think
the best approach is to change it so that it is correct.
i.e. if you want to assert that an array is not in-sync and should be fully
'synced', then the right thing to do is to write lots of '1's to the bitmap
before trying to assemble the array.
However I'm not violently against the patch - though I would be against
fiddling with event_count. It does seem unpleasantly asymmetric though.
The 'sync' option says 'assume all bits in the bitmap are set', but the
'nosync' option doesn't say 'assume all bits in the bitmap are clear'.
Is simply writing to the bitmap when you want to force some behaviour an
option, or do you really really want "sync" to mean "please ignore the
bitmap"? (or maybe an "ignorebitmap" option is what we want??).
NeilBrown
>
> brassow
>
> DM RAID: Fix for "sync" directive ineffectiveness
>
> There are two table arguments that can be given to a DM RAID target that
> control whether the array is forced to (re)synchronize or skip initialization:
> "sync" and "nosync". When "sync" is given, we set mddev->recovery_cp to 0
> in order to cause the device to resynchronize. This is insufficient if there
> is a bitmap in use, because the array will simply look at the bitmap and see
> that there is no recovery necessary.
>
> Once md_run() is called and creates the bitmap, we set the bitmap as stale.
> When the bitmap is loaded during the resume phase, a full resync will
> result.
>
> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
>
> Index: linux-upstream/drivers/md/dm-raid.c
> ===================================================================
> --- linux-upstream.orig/drivers/md/dm-raid.c
> +++ linux-upstream/drivers/md/dm-raid.c
> @@ -1123,6 +1123,10 @@ static int raid_ctr(struct dm_target *ti
> ret = -EINVAL;
> goto size_mismatch;
> }
> +
> + if (rs->print_flags & DMPF_SYNC)
> + set_bit(BITMAP_STALE, &(rs->md.bitmap->flags));
> +
> rs->callbacks.congested_fn = raid_is_congested;
> dm_table_add_target_callbacks(ti->table, &rs->callbacks);
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] DM RAID: Fix for ineffective "sync" directive
2012-09-05 1:44 ` NeilBrown
@ 2012-09-05 14:32 ` Brassow Jonathan
2012-09-10 7:11 ` NeilBrown
0 siblings, 1 reply; 6+ messages in thread
From: Brassow Jonathan @ 2012-09-05 14:32 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid, Jonathan Brassow
On Sep 4, 2012, at 8:44 PM, NeilBrown wrote:
> On Thu, 23 Aug 2012 10:31:54 -0500 Jonathan Brassow <jbrassow@redhat.com>
> wrote:
>
>> Neil,
>>
>> There are two ways I see to fix the problem of the "sync" table argument
>> not causing the array to re-sync. We could set mddev->events to
>> UINT_MAX and force an mddev/bitmap event mismatch - causing the bitmap
>> code to set BITMAP_STALE itself. Otherwise, we can do as I have below
>> and set BITMAP_STALE ourselves. The real difference is whether this is
>> done before or after md_run(). The first solution can be done before
>> md_run(); the second cannot because the bitmap is allocated during
>> md_run(). Either solution works and correctly sets the bitmap to stale
>> before bitmap_load() is called. So, rather than having to think about
>> issues that come with setting events to UINT_MAX, I chose to set the
>> bitmap stale after md_run().
>
> I'm not at all sure this is the right way to look at the issue.
>
> The only reason that we have 'sync' and 'nosync' table args is because
> originally we weren't storing any metadata on disk. The 'in-sync?' state of
> the array is important metadata, so it needed to be given to md via a table
> option.
>
> If you have a bitmap, then you are storing this information in metadata, so
> including it as a table option is redundant and - as you noticed - ignored.
>
> If you want to assert that the data in the bitmap is incorrect, then I think
> the best approach is to change it so that it is correct.
> i.e. if you want to assert that an array is not in-sync and should be fully
> 'synced', then the right thing to do is to write lots of '1's to the bitmap
> before trying to assemble the array.
>
> However I'm not violently against the patch - though I would be against
> fiddling with event_count. It does seem unpleasantly asymmetric though.
> The 'sync' option says 'assume all bits in the bitmap are set', but the
> 'nosync' option doesn't say 'assume all bits in the bitmap are clear'.
>
> Is simply writing to the bitmap when you want to force some behaviour an
> option, or do you really really want "sync" to mean "please ignore the
> bitmap"? (or maybe an "ignorebitmap" option is what we want??).
For me, the 'sync' and 'nosync' args are not just a vestige of when we didn't have metadata; but also a form of continuity with dm-raid1. The 'sync' option has - for device-mapper - always been a way to reset the bitmap and force a resync.
I don't particularly care for forcing userspace to understand how the kernel is handling the metadata (i.e. forcing it to know where and what to write WRT the bitmap). However, another way of tackling this problem is simply to wipe the metadata areas. I do not think there is any information that must be carried over from the superblock if our desire is to simply resync. (Although, there are exceptions to /when/ this action should occur - like not when a reshape is happening.)
Wiping the metadata areas involves a number of additional steps for LVM, but this is not the kernel's problem per se. Wiping the metadata areas is also the way LVM resyncs dm-raid1 arrays right now, which is a bit strange to me. The only (admittedly weak) arguments for adding this patch is 1) "sync" should work if it is an allowed argument and 2) it seems a better way of clearing the bitmap than wiping all the metadata while allowing for more error checking.
brassow
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] DM RAID: Fix for ineffective "sync" directive
2012-09-05 14:32 ` Brassow Jonathan
@ 2012-09-10 7:11 ` NeilBrown
0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2012-09-10 7:11 UTC (permalink / raw)
To: Brassow Jonathan; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 3340 bytes --]
On Wed, 5 Sep 2012 09:32:16 -0500 Brassow Jonathan <jbrassow@redhat.com>
wrote:
>
> On Sep 4, 2012, at 8:44 PM, NeilBrown wrote:
>
> > On Thu, 23 Aug 2012 10:31:54 -0500 Jonathan Brassow <jbrassow@redhat.com>
> > wrote:
> >
> >> Neil,
> >>
> >> There are two ways I see to fix the problem of the "sync" table argument
> >> not causing the array to re-sync. We could set mddev->events to
> >> UINT_MAX and force an mddev/bitmap event mismatch - causing the bitmap
> >> code to set BITMAP_STALE itself. Otherwise, we can do as I have below
> >> and set BITMAP_STALE ourselves. The real difference is whether this is
> >> done before or after md_run(). The first solution can be done before
> >> md_run(); the second cannot because the bitmap is allocated during
> >> md_run(). Either solution works and correctly sets the bitmap to stale
> >> before bitmap_load() is called. So, rather than having to think about
> >> issues that come with setting events to UINT_MAX, I chose to set the
> >> bitmap stale after md_run().
> >
> > I'm not at all sure this is the right way to look at the issue.
> >
> > The only reason that we have 'sync' and 'nosync' table args is because
> > originally we weren't storing any metadata on disk. The 'in-sync?' state of
> > the array is important metadata, so it needed to be given to md via a table
> > option.
> >
> > If you have a bitmap, then you are storing this information in metadata, so
> > including it as a table option is redundant and - as you noticed - ignored.
> >
> > If you want to assert that the data in the bitmap is incorrect, then I think
> > the best approach is to change it so that it is correct.
> > i.e. if you want to assert that an array is not in-sync and should be fully
> > 'synced', then the right thing to do is to write lots of '1's to the bitmap
> > before trying to assemble the array.
> >
> > However I'm not violently against the patch - though I would be against
> > fiddling with event_count. It does seem unpleasantly asymmetric though.
> > The 'sync' option says 'assume all bits in the bitmap are set', but the
> > 'nosync' option doesn't say 'assume all bits in the bitmap are clear'.
> >
> > Is simply writing to the bitmap when you want to force some behaviour an
> > option, or do you really really want "sync" to mean "please ignore the
> > bitmap"? (or maybe an "ignorebitmap" option is what we want??).
>
> For me, the 'sync' and 'nosync' args are not just a vestige of when we didn't have metadata; but also a form of continuity with dm-raid1. The 'sync' option has - for device-mapper - always been a way to reset the bitmap and force a resync.
With dm-raid1, it is dm-log.c which interprets the 'sync' and 'nosync' flags.
The meaning seems to be:
- 'sync' - ignore the log, force the array to be synced (i.e. set all
regions to 'dirty')
- 'nosync' - ignore the log, don't sync the array (i.e. set all regions
to 'clean')
- neither of those - use the log to determine whether to sync or not.
If you want to make the flags to dm-raid mean exactly the same as that (i.e.
if either sync or nosync is present, we ignore the metadata) then I would be
happy with that. I don't think that is quite what your patch does though -
it only does half of it.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH - v2] DM RAID: Fix for ineffective "sync" directive
2012-08-23 15:31 [PATCH] DM RAID: Fix for ineffective "sync" directive Jonathan Brassow
2012-09-05 1:44 ` NeilBrown
@ 2012-09-14 18:48 ` Jonathan Brassow
2012-09-20 2:35 ` NeilBrown
1 sibling, 1 reply; 6+ messages in thread
From: Jonathan Brassow @ 2012-09-14 18:48 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, jbrassow
Neil,
I've taken your suggestion to not muck around with the BITMAP_STALE, and
instead simply clear the superblocks as a whole when the "sync"
directive is given.
brassow
DM RAID: Fix for "sync" directive ineffectiveness
There are two table arguments that can be given to a DM RAID target that
control whether the array is forced to (re)synchronize or skip initialization:
"sync" and "nosync". When "sync" is given, we set mddev->recovery_cp to 0
in order to cause the device to resynchronize. This is insufficient if there
is a bitmap in use, because the array will simply look at the bitmap and see
that there is no recovery necessary.
The fix is to skip over the loading of the superblocks when "sync" is given,
causing new superblocks to be written that will force the array to go through
initialization (i.e. synchronization).
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Index: linux-upstream/drivers/md/dm-raid.c
===================================================================
--- linux-upstream.orig/drivers/md/dm-raid.c
+++ linux-upstream/drivers/md/dm-raid.c
@@ -960,6 +960,19 @@ static int analyse_superblocks(struct dm
freshest = NULL;
rdev_for_each_safe(rdev, tmp, mddev) {
+ /*
+ * Skipping super_load due to DMPF_SYNC will cause
+ * the array to undergo initialization again as
+ * though it were new. This is the intended effect
+ * of the "sync" directive.
+ *
+ * When reshaping capability is added, we must ensure
+ * that the "sync" directive is disallowed during the
+ * reshape.
+ */
+ if (rs->print_flags & DMPF_SYNC)
+ continue;
+
if (!rdev->meta_bdev)
continue;
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH - v2] DM RAID: Fix for ineffective "sync" directive
2012-09-14 18:48 ` [PATCH - v2] " Jonathan Brassow
@ 2012-09-20 2:35 ` NeilBrown
0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2012-09-20 2:35 UTC (permalink / raw)
To: Jonathan Brassow; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1916 bytes --]
On Fri, 14 Sep 2012 13:48:03 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:
> Neil,
>
> I've taken your suggestion to not muck around with the BITMAP_STALE, and
> instead simply clear the superblocks as a whole when the "sync"
> directive is given.
>
> brassow
>
> DM RAID: Fix for "sync" directive ineffectiveness
>
> There are two table arguments that can be given to a DM RAID target that
> control whether the array is forced to (re)synchronize or skip initialization:
> "sync" and "nosync". When "sync" is given, we set mddev->recovery_cp to 0
> in order to cause the device to resynchronize. This is insufficient if there
> is a bitmap in use, because the array will simply look at the bitmap and see
> that there is no recovery necessary.
>
> The fix is to skip over the loading of the superblocks when "sync" is given,
> causing new superblocks to be written that will force the array to go through
> initialization (i.e. synchronization).
>
> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
>
> Index: linux-upstream/drivers/md/dm-raid.c
> ===================================================================
> --- linux-upstream.orig/drivers/md/dm-raid.c
> +++ linux-upstream/drivers/md/dm-raid.c
> @@ -960,6 +960,19 @@ static int analyse_superblocks(struct dm
>
> freshest = NULL;
> rdev_for_each_safe(rdev, tmp, mddev) {
> + /*
> + * Skipping super_load due to DMPF_SYNC will cause
> + * the array to undergo initialization again as
> + * though it were new. This is the intended effect
> + * of the "sync" directive.
> + *
> + * When reshaping capability is added, we must ensure
> + * that the "sync" directive is disallowed during the
> + * reshape.
> + */
> + if (rs->print_flags & DMPF_SYNC)
> + continue;
> +
> if (!rdev->meta_bdev)
> continue;
>
>
OK, applied.
Thanks.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-09-20 2:35 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-23 15:31 [PATCH] DM RAID: Fix for ineffective "sync" directive Jonathan Brassow
2012-09-05 1:44 ` NeilBrown
2012-09-05 14:32 ` Brassow Jonathan
2012-09-10 7:11 ` NeilBrown
2012-09-14 18:48 ` [PATCH - v2] " Jonathan Brassow
2012-09-20 2:35 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).