[PATCH] btrfs: add framework to handle device flush error as a volume

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] btrfs: add framework to handle device flush error as a volume
@ 2017-04-25  8:58 Anand Jain
  2017-04-27  2:53 ` Anand Jain
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Anand Jain @ 2017-04-25  8:58 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba

This adds comments to the flush error handling part of
the code, and hopes to maintain the same logic with a
framework which can be used to handle the errors at the
volume level.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/disk-io.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/btrfs/volumes.h |  1 +
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index eb1ee7b6f532..dafcb6bb2d5d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3527,6 +3527,10 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
 	if (wait) {
 		bio = device->flush_bio;
 		if (!bio)
+			/*
+			 * This means the alloc has failed with ENOMEM, however
+			 * here we return 0, as its not a device error.
+			 */
 			return 0;
 
 		wait_for_completion(&device->flush_wait);
@@ -3566,6 +3570,33 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
 	return 0;
 }
 
+static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
+{
+	int submit_flush_error = 0;
+	int dev_flush_error = 0;
+	struct btrfs_device *dev;
+
+	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
+		if (!dev->bdev) {
+			submit_flush_error++;
+			dev_flush_error++;
+			continue;
+		}
+		if (dev->last_flush_error == ENOMEM)
+			submit_flush_error++;
+		if (dev->last_flush_error && dev->last_flush_error != ENOMEM)
+			dev_flush_error++;
+	}
+
+	if (submit_flush_error >
+		fsdevs->fs_info->num_tolerated_disk_barrier_failures ||
+		dev_flush_error >
+		fsdevs->fs_info->num_tolerated_disk_barrier_failures)
+		return -EIO;
+
+	return 0;
+}
+
 /*
  * send an empty flush down to each device in parallel,
  * then wait for them
@@ -3593,6 +3624,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 		ret = write_dev_flush(dev, 0);
 		if (ret)
 			errors_send++;
+		dev->last_flush_error = ret;
 	}
 
 	/* wait for all the barriers */
@@ -3607,12 +3639,30 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 			continue;
 
 		ret = write_dev_flush(dev, 1);
-		if (ret)
+		if (ret) {
+			dev->last_flush_error = ret;
 			errors_wait++;
+		}
+	}
+
+	/*
+	 * Try hard in case of flush. Lets say, in RAID1 we have
+	 * the following situation
+	 *  dev1: EIO dev2: ENOMEM
+	 * this is not a fatal error as we hope to recover from
+	 * ENOMEM in the next attempt to flush.
+	 * But the following is considered as fatal
+	 *  dev1: ENOMEM dev2: ENOMEM
+	 *  dev1: bdev == NULL dev2: ENOMEM
+	 */
+	if (errors_send || errors_wait) {
+		/*
+		 * At some point we need the status of all disks
+		 * to arrive at the volume status. So error checking
+		 * is being pushed to a separate loop.
+		 */
+		return check_barrier_error(info->fs_devices);
 	}
-	if (errors_send > info->num_tolerated_disk_barrier_failures ||
-	    errors_wait > info->num_tolerated_disk_barrier_failures)
-		return -EIO;
 	return 0;
 }
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 59be81206dd7..9c09dcd96e5d 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -74,6 +74,7 @@ struct btrfs_device {
 	int missing;
 	int can_discard;
 	int is_tgtdev_for_dev_replace;
+	int last_flush_error;
 
 #ifdef __BTRFS_NEED_DEVICE_DATA_ORDERED
 	seqcount_t data_seqcount;
-- 
2.10.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: add framework to handle device flush error as a volume
  2017-04-25  8:58 [PATCH] btrfs: add framework to handle device flush error as a volume Anand Jain
@ 2017-04-27  2:53 ` Anand Jain
  2017-05-05 14:46 ` David Sterba
  2017-05-05 23:17 ` [PATCH v2] " Anand Jain
  2 siblings, 0 replies; 7+ messages in thread
From: Anand Jain @ 2017-04-27  2:53 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs


Hi David,

  This patch adds comments explaining the original design, and also adds 
a cleanup so that a volume level error-handles for the flush failures 
can be implemented. I wonder if you have any comments.?

  (Also just a note, a bio prealloc (for flush) will make below ENOMEM 
to go away, however the prealloc itself has to optimized based on the 
barrier option. So in the long run I don't think below ENOMEM would remain.)

Thanks, Anand


On 04/25/2017 04:58 PM, Anand Jain wrote:
> This adds comments to the flush error handling part of
> the code, and hopes to maintain the same logic with a
> framework which can be used to handle the errors at the
> volume level.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/disk-io.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++----
>  fs/btrfs/volumes.h |  1 +
>  2 files changed, 55 insertions(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index eb1ee7b6f532..dafcb6bb2d5d 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3527,6 +3527,10 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
>  	if (wait) {
>  		bio = device->flush_bio;
>  		if (!bio)
> +			/*
> +			 * This means the alloc has failed with ENOMEM, however
> +			 * here we return 0, as its not a device error.
> +			 */
>  			return 0;
>
>  		wait_for_completion(&device->flush_wait);
> @@ -3566,6 +3570,33 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
>  	return 0;
>  }
>
> +static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
> +{
> +	int submit_flush_error = 0;
> +	int dev_flush_error = 0;
> +	struct btrfs_device *dev;
> +
> +	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
> +		if (!dev->bdev) {
> +			submit_flush_error++;
> +			dev_flush_error++;
> +			continue;
> +		}
> +		if (dev->last_flush_error == ENOMEM)
> +			submit_flush_error++;
> +		if (dev->last_flush_error && dev->last_flush_error != ENOMEM)
> +			dev_flush_error++;
> +	}
> +
> +	if (submit_flush_error >
> +		fsdevs->fs_info->num_tolerated_disk_barrier_failures ||
> +		dev_flush_error >
> +		fsdevs->fs_info->num_tolerated_disk_barrier_failures)
> +		return -EIO;
> +
> +	return 0;
> +}
> +
>  /*
>   * send an empty flush down to each device in parallel,
>   * then wait for them
> @@ -3593,6 +3624,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
>  		ret = write_dev_flush(dev, 0);
>  		if (ret)
>  			errors_send++;
> +		dev->last_flush_error = ret;
>  	}
>
>  	/* wait for all the barriers */
> @@ -3607,12 +3639,30 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
>  			continue;
>
>  		ret = write_dev_flush(dev, 1);
> -		if (ret)
> +		if (ret) {
> +			dev->last_flush_error = ret;
>  			errors_wait++;
> +		}
> +	}
> +
> +	/*
> +	 * Try hard in case of flush. Lets say, in RAID1 we have
> +	 * the following situation
> +	 *  dev1: EIO dev2: ENOMEM
> +	 * this is not a fatal error as we hope to recover from
> +	 * ENOMEM in the next attempt to flush.
> +	 * But the following is considered as fatal
> +	 *  dev1: ENOMEM dev2: ENOMEM
> +	 *  dev1: bdev == NULL dev2: ENOMEM
> +	 */
> +	if (errors_send || errors_wait) {
> +		/*
> +		 * At some point we need the status of all disks
> +		 * to arrive at the volume status. So error checking
> +		 * is being pushed to a separate loop.
> +		 */
> +		return check_barrier_error(info->fs_devices);
>  	}
> -	if (errors_send > info->num_tolerated_disk_barrier_failures ||
> -	    errors_wait > info->num_tolerated_disk_barrier_failures)
> -		return -EIO;
>  	return 0;
>  }
>
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 59be81206dd7..9c09dcd96e5d 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -74,6 +74,7 @@ struct btrfs_device {
>  	int missing;
>  	int can_discard;
>  	int is_tgtdev_for_dev_replace;
> +	int last_flush_error;
>
>  #ifdef __BTRFS_NEED_DEVICE_DATA_ORDERED
>  	seqcount_t data_seqcount;
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: add framework to handle device flush error as a volume
  2017-04-25  8:58 [PATCH] btrfs: add framework to handle device flush error as a volume Anand Jain
  2017-04-27  2:53 ` Anand Jain
@ 2017-05-05 14:46 ` David Sterba
  2017-05-05 23:14   ` Anand Jain
  2017-05-05 23:17 ` [PATCH v2] " Anand Jain
  2 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2017-05-05 14:46 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba

On Tue, Apr 25, 2017 at 04:58:31PM +0800, Anand Jain wrote:
> This adds comments to the flush error handling part of
> the code, and hopes to maintain the same logic with a
> framework which can be used to handle the errors at the
> volume level.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/disk-io.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++----
>  fs/btrfs/volumes.h |  1 +
>  2 files changed, 55 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index eb1ee7b6f532..dafcb6bb2d5d 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3527,6 +3527,10 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
>  	if (wait) {
>  		bio = device->flush_bio;
>  		if (!bio)
> +			/*
> +			 * This means the alloc has failed with ENOMEM, however
> +			 * here we return 0, as its not a device error.
> +			 */
>  			return 0;
>  
>  		wait_for_completion(&device->flush_wait);
> @@ -3566,6 +3570,33 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
>  	return 0;
>  }
>  
> +static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
> +{
> +	int submit_flush_error = 0;
> +	int dev_flush_error = 0;
> +	struct btrfs_device *dev;
> +
> +	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
> +		if (!dev->bdev) {
> +			submit_flush_error++;
> +			dev_flush_error++;
> +			continue;
> +		}
> +		if (dev->last_flush_error == ENOMEM)

That's -ENOMEM

> +			submit_flush_error++;
> +		if (dev->last_flush_error && dev->last_flush_error != ENOMEM)

also here.

> +			dev_flush_error++;
> +	}
> +
> +	if (submit_flush_error >
> +		fsdevs->fs_info->num_tolerated_disk_barrier_failures ||
> +		dev_flush_error >
> +		fsdevs->fs_info->num_tolerated_disk_barrier_failures)
> +		return -EIO;

Can you please reformat this so it's clear what's the condition and
what's the statement?

> +
> +	return 0;
> +}
> +
>  /*
>   * send an empty flush down to each device in parallel,
>   * then wait for them
> @@ -3593,6 +3624,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
>  		ret = write_dev_flush(dev, 0);
>  		if (ret)
>  			errors_send++;
> +		dev->last_flush_error = ret;

Here the error is set unconditionally, so it always tracks the return
code, not only the error ...

>  	}
>  
>  	/* wait for all the barriers */
> @@ -3607,12 +3639,30 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
>  			continue;
>  
>  		ret = write_dev_flush(dev, 1);
> -		if (ret)
> +		if (ret) {
> +			dev->last_flush_error = ret;

... while this tracks only the errors. Unless I'm missing something,
both should be set in a consistent way.

>  			errors_wait++;
> +		}
> +	}
> +
> +	/*
> +	 * Try hard in case of flush. Lets say, in RAID1 we have
> +	 * the following situation
> +	 *  dev1: EIO dev2: ENOMEM
> +	 * this is not a fatal error as we hope to recover from
> +	 * ENOMEM in the next attempt to flush.

This could still be problematic under some very rare conditions, but I
don't deem it important at the moment as the memory allocation will be
gone. Then the comment reflects the current state, which is fine.

> +	 * But the following is considered as fatal
> +	 *  dev1: ENOMEM dev2: ENOMEM
> +	 *  dev1: bdev == NULL dev2: ENOMEM
> +	 */
> +	if (errors_send || errors_wait) {
> +		/*
> +		 * At some point we need the status of all disks
> +		 * to arrive at the volume status. So error checking
> +		 * is being pushed to a separate loop.
> +		 */
> +		return check_barrier_error(info->fs_devices);
>  	}
> -	if (errors_send > info->num_tolerated_disk_barrier_failures ||
> -	    errors_wait > info->num_tolerated_disk_barrier_failures)
> -		return -EIO;
>  	return 0;
>  }
>  
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 59be81206dd7..9c09dcd96e5d 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -74,6 +74,7 @@ struct btrfs_device {
>  	int missing;
>  	int can_discard;
>  	int is_tgtdev_for_dev_replace;
> +	int last_flush_error;
>  
>  #ifdef __BTRFS_NEED_DEVICE_DATA_ORDERED
>  	seqcount_t data_seqcount;
> -- 
> 2.10.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: add framework to handle device flush error as a volume
  2017-05-05 14:46 ` David Sterba
@ 2017-05-05 23:14   ` Anand Jain
  0 siblings, 0 replies; 7+ messages in thread
From: Anand Jain @ 2017-05-05 23:14 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs






>> @@ -3566,6 +3570,33 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
>>  	return 0;
>>  }
>>
>> +static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
>> +{
>> +	int submit_flush_error = 0;
>> +	int dev_flush_error = 0;
>> +	struct btrfs_device *dev;
>> +
>> +	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
>> +		if (!dev->bdev) {
>> +			submit_flush_error++;
>> +			dev_flush_error++;
>> +			continue;
>> +		}
>> +		if (dev->last_flush_error == ENOMEM)
>
> That's -ENOMEM

  My bad, will fix it.

>> +			submit_flush_error++;
>> +		if (dev->last_flush_error && dev->last_flush_error != ENOMEM)
>
> also here.

  yes.

>> +			dev_flush_error++;
>> +	}
>> +
>> +	if (submit_flush_error >
>> +		fsdevs->fs_info->num_tolerated_disk_barrier_failures ||
>> +		dev_flush_error >
>> +		fsdevs->fs_info->num_tolerated_disk_barrier_failures)
>> +		return -EIO;
>
> Can you please reformat this so it's clear what's the condition and
> what's the statement?

  included in v2.

>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * send an empty flush down to each device in parallel,
>>   * then wait for them
>> @@ -3593,6 +3624,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
>>  		ret = write_dev_flush(dev, 0);
>>  		if (ret)
>>  			errors_send++;
>> +		dev->last_flush_error = ret;
>
> Here the error is set unconditionally, so it always tracks the return
> code, not only the error ...

  Right. So it captures send part of the error, that is when we call
  write_dev_flush(... 0); which returns only -ENOMEM or 0;

>>  	}
>>
>>  	/* wait for all the barriers */
>> @@ -3607,12 +3639,30 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
>>  			continue;
>>
>>  		ret = write_dev_flush(dev, 1);
>> -		if (ret)
>> +		if (ret) {
>> +			dev->last_flush_error = ret;
>
> ... while this tracks only the errors. Unless I'm missing something,
> both should be set in a consistent way.

   Actually this is correct, that way I preserve the -ENOMEM; which
   occurred in the send part of the code. And during wait.. we always
   return 0; for -ENOMEM that occurred during send.

------------------------------
@@ -3625,6 +3625,10 @@ static int write_dev_flush(struct btrfs_device 
*device, int wait)
         if (wait) {
                 bio = device->flush_bio;
                 if (!bio)
+                /*
+                 * This means the alloc has failed with ENOMEM, however
+                 * here we return 0, as its not a device error.
+                 */
                         return 0;
------------------------------



>>  			errors_wait++;
>> +		}
>> +	}
>> +
>> +	/*
>> +	 * Try hard in case of flush. Lets say, in RAID1 we have
>> +	 * the following situation
>> +	 *  dev1: EIO dev2: ENOMEM
>> +	 * this is not a fatal error as we hope to recover from
>> +	 * ENOMEM in the next attempt to flush.
>
> This could still be problematic under some very rare conditions, but I
> don't deem it important at the moment as the memory allocation will be
> gone. Then the comment reflects the current state, which is fine.

  Thanks for mentioning that; though I suspected that way as well;
  I had to put extra effort not to change original logic unless
  there is an identified issue with it. So I instead, I just documented
  the logic in the comments.

  I am sending V2 with the above changes.

Thanks, Anand

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] btrfs: add framework to handle device flush error as a volume
@ 2017-05-05 23:17 ` Anand Jain
  2017-05-09 17:12   ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: Anand Jain @ 2017-05-05 23:17 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba

This adds comments to the flush error handling part of
the code, and hopes to maintain the same logic with a
framework which can be used to handle the errors at the
volume level.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v2:
 fix -ENOMEM at two places
 add code readability changes in check_barrier_error()

 fs/btrfs/disk-io.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/btrfs/volumes.h |  1 +
 2 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e064913fa62f..2f0d0688e0a6 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3625,6 +3625,10 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
 	if (wait) {
 		bio = device->flush_bio;
 		if (!bio)
+			/*
+			 * This means the alloc has failed with ENOMEM, however
+			 * here we return 0, as its not a device error.
+			 */
 			return 0;
 
 		wait_for_completion(&device->flush_wait);
@@ -3664,6 +3668,32 @@ static int write_dev_flush(struct btrfs_device *device, int wait)
 	return 0;
 }
 
+static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
+{
+	int submit_flush_error = 0;
+	int dev_flush_error = 0;
+	struct btrfs_device *dev;
+	int tolerance;
+
+	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
+		if (!dev->bdev) {
+			submit_flush_error++;
+			dev_flush_error++;
+			continue;
+		}
+		if (dev->last_flush_error == -ENOMEM)
+			submit_flush_error++;
+		if (dev->last_flush_error && dev->last_flush_error != -ENOMEM)
+			dev_flush_error++;
+	}
+
+	tolerance = fsdevs->fs_info->num_tolerated_disk_barrier_failures;
+	if (submit_flush_error > tolerance || dev_flush_error > tolerance)
+		return -EIO;
+
+	return 0;
+}
+
 /*
  * send an empty flush down to each device in parallel,
  * then wait for them
@@ -3691,6 +3721,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 		ret = write_dev_flush(dev, 0);
 		if (ret)
 			errors_send++;
+		dev->last_flush_error = ret;
 	}
 
 	/* wait for all the barriers */
@@ -3705,12 +3736,30 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 			continue;
 
 		ret = write_dev_flush(dev, 1);
-		if (ret)
+		if (ret) {
+			dev->last_flush_error = ret;
 			errors_wait++;
+		}
+	}
+
+	/*
+	 * Try hard in case of flush. Lets say, in RAID1 we have
+	 * the following situation
+	 *  dev1: EIO dev2: ENOMEM
+	 * this is not a fatal error as we hope to recover from
+	 * ENOMEM in the next attempt to flush.
+	 * But the following is considered as fatal
+	 *  dev1: ENOMEM dev2: ENOMEM
+	 *  dev1: bdev == NULL dev2: ENOMEM
+	 */
+	if (errors_send || errors_wait) {
+		/*
+		 * At some point we need the status of all disks
+		 * to arrive at the volume status. So error checking
+		 * is being pushed to a separate loop.
+		 */
+		return check_barrier_error(info->fs_devices);
 	}
-	if (errors_send > info->num_tolerated_disk_barrier_failures ||
-	    errors_wait > info->num_tolerated_disk_barrier_failures)
-		return -EIO;
 	return 0;
 }
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index cfb817384cb5..9bb248b3fa0e 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -87,6 +87,7 @@ struct btrfs_device {
 	int offline;
 	int can_discard;
 	int is_tgtdev_for_dev_replace;
+	int last_flush_error;
 
 #ifdef __BTRFS_NEED_DEVICE_DATA_ORDERED
 	seqcount_t data_seqcount;
-- 
2.10.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] btrfs: add framework to handle device flush error as a volume
  2017-05-05 23:17 ` [PATCH v2] " Anand Jain
@ 2017-05-09 17:12   ` David Sterba
  2017-05-16  9:52     ` Anand Jain
  0 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2017-05-09 17:12 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba

On Sat, May 06, 2017 at 07:17:54AM +0800, Anand Jain wrote:
> This adds comments to the flush error handling part of
> the code, and hopes to maintain the same logic with a
> framework which can be used to handle the errors at the
> volume level.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>

Reviewed-by: David Sterba <dsterba@suse.com>

So the next step is to preallocate the flush_bio and remove the ENOMEM
handling.

I think we can skip the error handling on the submission path, even if
submit_bio can fail, the error code will be stored to the bio and this
will be checked in the waiting side.

Then write_dev_flush can be split into 2 functions by the parameter
'wait', as the paths are completely independent.

> +static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
> +{
> +	int submit_flush_error = 0;
> +	int dev_flush_error = 0;
> +	struct btrfs_device *dev;
> +	int tolerance;
> +
> +	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
> +		if (!dev->bdev) {
> +			submit_flush_error++;
> +			dev_flush_error++;
> +			continue;
> +		}
> +		if (dev->last_flush_error == -ENOMEM)
> +			submit_flush_error++;
> +		if (dev->last_flush_error && dev->last_flush_error != -ENOMEM)
> +			dev_flush_error++;
> +	}
> +
> +	tolerance = fsdevs->fs_info->num_tolerated_disk_barrier_failures;
> +	if (submit_flush_error > tolerance || dev_flush_error > tolerance)
> +		return -EIO;

Actually, after reading submit_bio and the comment above, do we need to
care about the submission errors? As submit_bio could return -EIO
immeditaelly, but does not help us anyway. We'll wait and check again
later.  It's the bio completion what counts.

This should simplify the code.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] btrfs: add framework to handle device flush error as a volume
  2017-05-09 17:12   ` David Sterba
@ 2017-05-16  9:52     ` Anand Jain
  0 siblings, 0 replies; 7+ messages in thread
From: Anand Jain @ 2017-05-16  9:52 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On 05/10/2017 01:12 AM, David Sterba wrote:
> On Sat, May 06, 2017 at 07:17:54AM +0800, Anand Jain wrote:
>> This adds comments to the flush error handling part of
>> the code, and hopes to maintain the same logic with a
>> framework which can be used to handle the errors at the
>> volume level.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>
> Reviewed-by: David Sterba <dsterba@suse.com>

  Thanks.

> So the next step is to preallocate the flush_bio and remove the ENOMEM
> handling.

  one or the other way ENOMEM is going to go away. Thinking to fix
  blkdev_issue_flush(), which IMO is the right way, its RFC is sent out.


> submit_bio can fail, the error code will be stored to the bio and this
> will be checked in the waiting side.
>
> Then write_dev_flush can be split into 2 functions by the parameter
> 'wait', as the paths are completely independent.

  The new KPI blkdev_issue_flush_no_wait() makes it much simpler,
  I have sent out the btrfs part of the RFC patch as well. Thanks.

>> +static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
>> +{
>> +	int submit_flush_error = 0;
>> +	int dev_flush_error = 0;
>> +	struct btrfs_device *dev;
>> +	int tolerance;
>> +
>> +	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
>> +		if (!dev->bdev) {
>> +			submit_flush_error++;
>> +			dev_flush_error++;
>> +			continue;
>> +		}
>> +		if (dev->last_flush_error == -ENOMEM)
>> +			submit_flush_error++;
>> +		if (dev->last_flush_error && dev->last_flush_error != -ENOMEM)
>> +			dev_flush_error++;
>> +	}
>> +
>> +	tolerance = fsdevs->fs_info->num_tolerated_disk_barrier_failures;
>> +	if (submit_flush_error > tolerance || dev_flush_error > tolerance)
>> +		return -EIO;
>
> Actually, after reading submit_bio and the comment above, do we need to
> care about the submission errors? As submit_bio could return -EIO
> immeditaelly, but does not help us anyway. We'll wait and check again
> later.  It's the bio completion what counts.
>
> This should simplify the code.

  Yeah. we don;t have to worry about the send part of the error, when
  all the error that may encounter is all related to the device itself.

  There are RFC patches for the device state: offline and failed, now
  depending on whether the flush returned EIO or transport error ENXIO
  I intend to choice the device state such as of failed or offline
  respectively.

Thanks, Anand

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-05-16  9:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-25  8:58 [PATCH] btrfs: add framework to handle device flush error as a volume Anand Jain
2017-04-27  2:53 ` Anand Jain
2017-05-05 14:46 ` David Sterba
2017-05-05 23:14   ` Anand Jain
2017-05-05 23:17 ` [PATCH v2] " Anand Jain
2017-05-09 17:12   ` David Sterba
2017-05-16  9:52     ` Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).