* [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush()
2016-11-08 23:21 [md PATCH 0/3] Three unrelated md patches NeilBrown
2016-11-08 23:21 ` [md PATCH 1/3] md/raid1: fix: IO can block resync indefinitely NeilBrown
@ 2016-11-08 23:21 ` NeilBrown
2016-11-09 20:51 ` Shaohua Li
2016-11-08 23:21 ` [md PATCH 3/3] md: define mddev flags, recovery flags and r1bio state bits using enums NeilBrown
2 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2016-11-08 23:21 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-raid
bitmap_flush() finishes with bitmap_update_sb(), and that finishes
with write_page(..., 1), so write_page() will wait for all writes
to complete. So there is no point calling md_super_wait()
immediately afterwards.
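(Illustration only: a stand-alone user-space sketch of the call chain described above, not the kernel code; the real functions take struct mddev/struct bitmap arguments and do real I/O.)

#include <stdio.h>

static void write_page(int page, int wait)
{
        /* model: submitting with wait != 0 blocks until the I/O completes */
        printf("write_page(page=%d, wait=%d)%s\n",
               page, wait, wait ? " -- waits for completion" : "");
}

static void bitmap_update_sb(void)
{
        write_page(0, 1);       /* synchronous superblock write */
}

static void bitmap_flush(void)
{
        /* ... write out any dirty bitmap pages ... */
        bitmap_update_sb();     /* final step already waits for all writes */
}

int main(void)
{
        bitmap_flush();
        /* an md_super_wait() here would have nothing left to wait for */
        return 0;
}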
Signed-off-by: NeilBrown <neilb@suse.com>
---
drivers/md/md.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index f389d8abe137..1f1c7f007b68 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5472,7 +5472,6 @@ static void __md_stop_writes(struct mddev *mddev)
del_timer_sync(&mddev->safemode_timer);
bitmap_flush(mddev);
- md_super_wait(mddev);
if (mddev->ro == 0 &&
((!mddev->in_sync && !mddev_is_clustered(mddev)) ||
* [md PATCH 0/3] Three unrelated md patches.
@ 2016-11-08 23:21 NeilBrown
2016-11-08 23:21 ` [md PATCH 1/3] md/raid1: fix: IO can block resync indefinitely NeilBrown
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: NeilBrown @ 2016-11-08 23:21 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-raid
The first is a minor bug fix; the other two are just code improvements.
I have some patches to add "failfast" functionality, and I want to get
the small clean-ups out of the way first.
Thanks,
NeilBrown
---
NeilBrown (3):
md/raid1: fix: IO can block resync indefinitely
md: remove md_super_wait() call after bitmap_flush()
md: define mddev flags, recovery flags and r1bio state bits using enums
drivers/md/md.c | 1 -
drivers/md/md.h | 76 +++++++++++++++++++++++++---------------------------
drivers/md/raid1.c | 2 +
drivers/md/raid1.h | 18 +++++++-----
4 files changed, 47 insertions(+), 50 deletions(-)
--
Signature
* [md PATCH 1/3] md/raid1: fix: IO can block resync indefinitely
2016-11-08 23:21 [md PATCH 0/3] Three unrelated md patches NeilBrown
@ 2016-11-08 23:21 ` NeilBrown
2016-11-09 20:50 ` Shaohua Li
2016-11-08 23:21 ` [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush() NeilBrown
2016-11-08 23:21 ` [md PATCH 3/3] md: define mddev flags, recovery flags and r1bio state bits using enums NeilBrown
2 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2016-11-08 23:21 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-raid
While performing a resync/recovery, raid1 divides the
array space into three regions:
- before the resync
- at or shortly after the resync point
- much further ahead of the resync point.
Write requests to the first or third do not need to wait. Write
requests to the middle region do need to wait if resync requests are
pending.
If there are any active write requests in the middle region, resync
will wait for them.
Due to an accounting error, there is a small range of addresses,
between conf->next_resync and conf->start_next_window, where write
requests will *not* be blocked, but *will* be counted in the middle
region. This can effectively block resync indefinitely if filesystem
writes happen repeatedly to this region.
As ->next_window_requests is incremented when the sector is before
conf->start_next_window + NEXT_NORMALIO_DISTANCE
the same boundary should be used for determining when write requests
should wait.
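(Toy model, illustration only and not the raid1 code; the boundary values are invented. The invariant the fix restores is that any write counted against the middle region must also be one that waits; otherwise writes landing in the gap keep the count non-zero without ever being throttled, and resync never sees it drop to zero.)

#include <stdbool.h>
#include <stdio.h>

static const unsigned long count_boundary = 200;  /* boundary used for accounting */
static const unsigned long wait_boundary  = 150;  /* buggy: smaller boundary used for waiting */

static bool counted(unsigned long sector) { return sector < count_boundary; }
static bool waits(unsigned long sector)   { return sector < wait_boundary; }

int main(void)
{
        unsigned long sector = 175;     /* repeated filesystem writes land here */

        if (counted(sector) && !waits(sector))
                printf("sector %lu is counted but never throttled; "
                       "resync, which waits for the count to reach zero, "
                       "can be blocked indefinitely\n", sector);

        /* the fix: make both tests use the same boundary, closing the gap */
        return 0;
}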
Signed-off-by: NeilBrown <neilb@suse.com>
---
drivers/md/raid1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index aac2a05cf8d1..9ac61cd85e5c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -834,7 +834,7 @@ static bool need_to_wait_for_sync(struct r1conf *conf, struct bio *bio)
else if (conf->barrier && bio_data_dir(bio) == WRITE) {
if ((conf->mddev->curr_resync_completed
>= bio_end_sector(bio)) ||
- (conf->next_resync + NEXT_NORMALIO_DISTANCE
+ (conf->start_next_window + NEXT_NORMALIO_DISTANCE
<= bio->bi_iter.bi_sector))
wait = false;
else
* [md PATCH 3/3] md: define mddev flags, recovery flags and r1bio state bits using enums
2016-11-08 23:21 [md PATCH 0/3] Three unrelated md patches NeilBrown
2016-11-08 23:21 ` [md PATCH 1/3] md/raid1: fix: IO can block resync indefinitely NeilBrown
2016-11-08 23:21 ` [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush() NeilBrown
@ 2016-11-08 23:21 ` NeilBrown
2016-11-09 20:52 ` Shaohua Li
2 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2016-11-08 23:21 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-raid
This is less error prone than using individual #defines.
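(Stand-alone illustration, not part of the patch, of why hand-numbered #defines are fragile: the existing r1bio bit numbers contain a gap (R1BIO_Returned is 6; 5 is skipped), and the old MD_UPDATE_SB_FLAGS is written with literal mask values (1 | 2 | 4) that silently depend on MD_CHANGE_DEVS/CLEAN/PENDING staying at bits 0, 1 and 2. An enum numbers the flags automatically and BIT(flag) keeps the mask tied to the names.)

#include <stdio.h>

#define BIT(nr) (1UL << (nr))

/* old style: hand-maintained bit numbers; gaps and duplicates go unnoticed */
#define OLD_Uptodate   0
#define OLD_IsSync     1
#define OLD_Degraded   2
#define OLD_BehindIO   3
#define OLD_ReadError  4
#define OLD_Returned   6                /* 5 skipped by hand */

/* new style: the compiler numbers consecutively, BIT() follows the names */
enum example_state {
        EX_Uptodate,
        EX_IsSync,
        EX_Degraded,
        EX_BehindIO,
        EX_ReadError,
        EX_Returned,                    /* automatically 5 */
};
#define EX_UPDATE_MASK (BIT(EX_Uptodate) | BIT(EX_IsSync) | BIT(EX_Degraded))

int main(void)
{
        printf("old Returned=%d, enum Returned=%d, mask=%#lx\n",
               OLD_Returned, EX_Returned, EX_UPDATE_MASK);
        return 0;
}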
Signed-off-by: NeilBrown <neilb@suse.com>
---
drivers/md/md.h | 76 +++++++++++++++++++++++++---------------------------
drivers/md/raid1.h | 18 +++++++-----
2 files changed, 46 insertions(+), 48 deletions(-)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 21bd94fad96a..af6b33c30d2d 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -192,6 +192,25 @@ extern int rdev_clear_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
int is_new);
struct md_cluster_info;
+enum mddev_flags {
+ MD_CHANGE_DEVS, /* Some device status has changed */
+ MD_CHANGE_CLEAN, /* transition to or from 'clean' */
+ MD_CHANGE_PENDING, /* switch from 'clean' to 'active' in progress */
+ MD_ARRAY_FIRST_USE, /* First use of array, needs initialization */
+ MD_CLOSING, /* If set, we are closing the array, do not open
+ * it then */
+ MD_JOURNAL_CLEAN, /* A raid with journal is already clean */
+ MD_HAS_JOURNAL, /* The raid array has journal feature set */
+ MD_RELOAD_SB, /* Reload the superblock because another node
+ * updated it.
+ */
+ MD_CLUSTER_RESYNC_LOCKED, /* cluster raid only, which means node
+ * already took resync lock, need to
+ * release the lock */
+};
+#define MD_UPDATE_SB_FLAGS (BIT(MD_CHANGE_DEVS) | \
+ BIT(MD_CHANGE_CLEAN) | \
+ BIT(MD_CHANGE_PENDING)) /* If these are set, md_update_sb needed */
struct mddev {
void *private;
struct md_personality *pers;
@@ -199,21 +218,6 @@ struct mddev {
int md_minor;
struct list_head disks;
unsigned long flags;
-#define MD_CHANGE_DEVS 0 /* Some device status has changed */
-#define MD_CHANGE_CLEAN 1 /* transition to or from 'clean' */
-#define MD_CHANGE_PENDING 2 /* switch from 'clean' to 'active' in progress */
-#define MD_UPDATE_SB_FLAGS (1 | 2 | 4) /* If these are set, md_update_sb needed */
-#define MD_ARRAY_FIRST_USE 3 /* First use of array, needs initialization */
-#define MD_CLOSING 4 /* If set, we are closing the array, do not open
- * it then */
-#define MD_JOURNAL_CLEAN 5 /* A raid with journal is already clean */
-#define MD_HAS_JOURNAL 6 /* The raid array has journal feature set */
-#define MD_RELOAD_SB 7 /* Reload the superblock because another node
- * updated it.
- */
-#define MD_CLUSTER_RESYNC_LOCKED 8 /* cluster raid only, which means node
- * already took resync lock, need to
- * release the lock */
int suspended;
atomic_t active_io;
@@ -307,31 +311,6 @@ struct mddev {
int parallel_resync;
int ok_start_degraded;
- /* recovery/resync flags
- * NEEDED: we might need to start a resync/recover
- * RUNNING: a thread is running, or about to be started
- * SYNC: actually doing a resync, not a recovery
- * RECOVER: doing recovery, or need to try it.
- * INTR: resync needs to be aborted for some reason
- * DONE: thread is done and is waiting to be reaped
- * REQUEST: user-space has requested a sync (used with SYNC)
- * CHECK: user-space request for check-only, no repair
- * RESHAPE: A reshape is happening
- * ERROR: sync-action interrupted because io-error
- *
- * If neither SYNC or RESHAPE are set, then it is a recovery.
- */
-#define MD_RECOVERY_RUNNING 0
-#define MD_RECOVERY_SYNC 1
-#define MD_RECOVERY_RECOVER 2
-#define MD_RECOVERY_INTR 3
-#define MD_RECOVERY_DONE 4
-#define MD_RECOVERY_NEEDED 5
-#define MD_RECOVERY_REQUESTED 6
-#define MD_RECOVERY_CHECK 7
-#define MD_RECOVERY_RESHAPE 8
-#define MD_RECOVERY_FROZEN 9
-#define MD_RECOVERY_ERROR 10
unsigned long recovery;
/* If a RAID personality determines that recovery (of a particular
@@ -445,6 +424,23 @@ struct mddev {
unsigned int good_device_nr; /* good device num within cluster raid */
};
+enum recovery_flags {
+ /*
+ * If neither SYNC or RESHAPE are set, then it is a recovery.
+ */
+ MD_RECOVERY_RUNNING, /* a thread is running, or about to be started */
+ MD_RECOVERY_SYNC, /* actually doing a resync, not a recovery */
+ MD_RECOVERY_RECOVER, /* doing recovery, or need to try it. */
+ MD_RECOVERY_INTR, /* resync needs to be aborted for some reason */
+ MD_RECOVERY_DONE, /* thread is done and is waiting to be reaped */
+ MD_RECOVERY_NEEDED, /* we might need to start a resync/recover */
+ MD_RECOVERY_REQUESTED, /* user-space has requested a sync (used with SYNC) */
+ MD_RECOVERY_CHECK, /* user-space request for check-only, no repair */
+ MD_RECOVERY_RESHAPE, /* A reshape is happening */
+ MD_RECOVERY_FROZEN, /* User request to abort, and not restart, any action */
+ MD_RECOVERY_ERROR, /* sync-action interrupted because io-error */
+};
+
static inline int __must_check mddev_lock(struct mddev *mddev)
{
return mutex_lock_interruptible(&mddev->reconfig_mutex);
diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index 61c39b390cd8..5ec19449779d 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -161,14 +161,15 @@ struct r1bio {
};
/* bits for r1bio.state */
-#define R1BIO_Uptodate 0
-#define R1BIO_IsSync 1
-#define R1BIO_Degraded 2
-#define R1BIO_BehindIO 3
+enum r1bio_state {
+ R1BIO_Uptodate,
+ R1BIO_IsSync,
+ R1BIO_Degraded,
+ R1BIO_BehindIO,
/* Set ReadError on bios that experience a readerror so that
* raid1d knows what to do with them.
*/
-#define R1BIO_ReadError 4
+ R1BIO_ReadError,
/* For write-behind requests, we call bi_end_io when
* the last non-write-behind device completes, providing
* any write was successful. Otherwise we call when
@@ -176,10 +177,11 @@ struct r1bio {
* with failure when last write completes (and all failed).
* Record that bi_end_io was called with this flag...
*/
-#define R1BIO_Returned 6
+ R1BIO_Returned,
/* If a write for this request means we can clear some
* known-bad-block records, we set this flag
*/
-#define R1BIO_MadeGood 7
-#define R1BIO_WriteError 8
+ R1BIO_MadeGood,
+ R1BIO_WriteError,
+};
#endif
* Re: [md PATCH 1/3] md/raid1: fix: IO can block resync indefinitely
2016-11-08 23:21 ` [md PATCH 1/3] md/raid1: fix: IO can block resync indefinitely NeilBrown
@ 2016-11-09 20:50 ` Shaohua Li
0 siblings, 0 replies; 9+ messages in thread
From: Shaohua Li @ 2016-11-09 20:50 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
On Wed, Nov 09, 2016 at 10:21:32AM +1100, Neil Brown wrote:
> While performing a resync/recovery, raid1 divides the
> array space into three regions:
> - before the resync
> - at or shortly after the resync point
> - much further ahead of the resync point.
>
> Write requests to the first or third do not need to wait. Write
> requests to the middle region do need to wait if resync requests are
> pending.
>
> If there are any active write requests in the middle region, resync
> will wait for them.
>
> Due to an accounting error, there is a small range of addresses,
> between conf->next_resync and conf->start_next_window, where write
> requests will *not* be blocked, but *will* be counted in the middle
> region. This can effectively block resync indefinitely if filesystem
> writes happen repeatedly to this region.
Good catch, thanks Neil!
> As ->next_window_requests is incremented when the sector is before
I changed 'before' to 'after' when applying this patch
Thanks,
Shaohua
* Re: [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush()
2016-11-08 23:21 ` [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush() NeilBrown
@ 2016-11-09 20:51 ` Shaohua Li
2016-11-10 0:57 ` NeilBrown
0 siblings, 1 reply; 9+ messages in thread
From: Shaohua Li @ 2016-11-09 20:51 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
On Wed, Nov 09, 2016 at 10:21:32AM +1100, Neil Brown wrote:
> bitmap_flush() finishes with bitmap_update_sb(), and that finishes
> with write_page(..., 1), so write_page() will wait for all writes
> to complete. So there is no point calling md_super_wait()
> immediately afterwards.
>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
> drivers/md/md.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index f389d8abe137..1f1c7f007b68 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5472,7 +5472,6 @@ static void __md_stop_writes(struct mddev *mddev)
> del_timer_sync(&mddev->safemode_timer);
>
> bitmap_flush(mddev);
> - md_super_wait(mddev);
bitmap_flush() could be null if there is no bitmap, is this safe?
Thanks,
Shaohua
* Re: [md PATCH 3/3] md: define mddev flags, recovery flags and r1bio state bits using enums
2016-11-08 23:21 ` [md PATCH 3/3] md: define mddev flags, recovery flags and r1bio state bits using enums NeilBrown
@ 2016-11-09 20:52 ` Shaohua Li
0 siblings, 0 replies; 9+ messages in thread
From: Shaohua Li @ 2016-11-09 20:52 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
On Wed, Nov 09, 2016 at 10:21:33AM +1100, Neil Brown wrote:
> This is less error prone than using individual #defines.
applied, thanks!
* Re: [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush()
2016-11-09 20:51 ` Shaohua Li
@ 2016-11-10 0:57 ` NeilBrown
2016-11-10 1:13 ` Shaohua Li
0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2016-11-10 0:57 UTC (permalink / raw)
To: Shaohua Li; +Cc: linux-raid
On Thu, Nov 10 2016, Shaohua Li wrote:
> On Wed, Nov 09, 2016 at 10:21:32AM +1100, Neil Brown wrote:
>> bitmap_flush() finishes with bitmap_update_sb(), and that finishes
>> with write_page(..., 1), so write_page() will wait for all writes
>> to complete. So there is no point calling md_super_wait()
>> immediately afterwards.
>>
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>> drivers/md/md.c | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index f389d8abe137..1f1c7f007b68 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -5472,7 +5472,6 @@ static void __md_stop_writes(struct mddev *mddev)
>> del_timer_sync(&mddev->safemode_timer);
>>
>> bitmap_flush(mddev);
>> - md_super_wait(mddev);
>
> bitmap_flush() could be null if there is no bitmap, is this safe?
Good question.
If there is no bitmap, then all metadata updates (both superblock
and bad-block-list) are synchronous in md_update_sb(), which is always
called under ->reconfig_mutex and so cannot race with this code.
So yes, it is safe. That md_super_wait() was only ever intended to wait
for things that bitmap_flush() might have flushed, so it should have
been inside that function.
Thanks,
NeilBrown
* Re: [md PATCH 2/3] md: remove md_super_wait() call after bitmap_flush()
2016-11-10 0:57 ` NeilBrown
@ 2016-11-10 1:13 ` Shaohua Li
0 siblings, 0 replies; 9+ messages in thread
From: Shaohua Li @ 2016-11-10 1:13 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
On Thu, Nov 10, 2016 at 11:57:35AM +1100, Neil Brown wrote:
> On Thu, Nov 10 2016, Shaohua Li wrote:
>
> > On Wed, Nov 09, 2016 at 10:21:32AM +1100, Neil Brown wrote:
> >> bitmap_flush() finishes with bitmap_update_sb(), and that finishes
> >> with write_page(..., 1), so write_page() will wait for all writes
> >> to complete. So there is no point calling md_super_wait()
> >> immediately afterwards.
> >>
> >> Signed-off-by: NeilBrown <neilb@suse.com>
> >> ---
> >> drivers/md/md.c | 1 -
> >> 1 file changed, 1 deletion(-)
> >>
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index f389d8abe137..1f1c7f007b68 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -5472,7 +5472,6 @@ static void __md_stop_writes(struct mddev *mddev)
> >> del_timer_sync(&mddev->safemode_timer);
> >>
> >> bitmap_flush(mddev);
> >> - md_super_wait(mddev);
> >
> > bitmap_flush() could be null if there is no bitmap, is this safe?
>
> Good question.
> If there is no bitmap, then all metadata updates (both superblock
> and bad-block-list) are synchronous in md_update_sb(), which is always
> called under ->reconfig_mutex and so cannot race with this code.
>
> So yes, it is safe. That md_super_wait() was only ever intended to wait
> for things that bitmap_flush() might have flushed, so it should have
> been inside that function.
Ah, yes, md_super_wait follows all md_super_write. It should be safe.
Applied, thanks!