public inbox for linux-kernel@vger.kernel.org
* [PATCH v3] drbd: fix throttling on newly created DM backing devices
@ 2014-09-05 18:41 Imre Palik
  2014-09-07  9:58 ` Lars
  0 siblings, 1 reply; 5+ messages in thread
From: Imre Palik @ 2014-09-05 18:41 UTC (permalink / raw)
  To: drbd-dev
  Cc: Philipp Reisner, Lars Ellenberg, linux-kernel, Palik, Imre,
	Matt Wilson

From: "Palik, Imre" <imrep@amazon.de>

If the drbd backing device is a new device mapper device (e.g., a
dm-linear mapping of an existing block device that contains data), its
disk statistics counters are initially 0 even though the device
contains useful data. This causes throttling until something accesses
the drbd device or the backing device.

This patch disables throttling as long as resync is the only source of
disk activity on a freshly created device.
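
To illustrate the failure mode described above, here is a minimal
userspace sketch of the pre-patch condition. It is a simplified model,
not the kernel code: struct model_dev and old_check_enters() are
hypothetical names, and the rate computation that runs inside the real
branch is left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the device state; not struct drbd_device. */
struct model_dev {
	int rs_last_events;	/* stays 0 while no non-resync I/O was seen */
	int ap_actlog_cnt;	/* pending application activity-log updates */
};

/* Returns true when the pre-patch condition would enter the rate check. */
static bool old_check_enters(const struct model_dev *d, int curr_events)
{
	return d->ap_actlog_cnt
	    || !d->rs_last_events	/* always true on a fresh dm device */
	    || curr_events - d->rs_last_events > 64;
}
```

For a fresh device with rs_last_events == 0 and curr_events == 0,
old_check_enters() returns true on every call, because the
!rs_last_events test keeps firing while only resync I/O hits the disk.
Once rs_last_events is nonzero and curr_events has not moved by more
than 64 sectors, it returns false.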

Reported-by: Mikhail Sugakov <msugakov@amazon.de>
Cc: Matt Wilson <msw@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
---
 drivers/block/drbd/drbd_int.h      |    4 ++--
 drivers/block/drbd/drbd_receiver.c |   10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 1a00001..298b1dc 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -960,8 +960,8 @@ struct drbd_device {
 	atomic_t rs_sect_in; /* for incoming resync data rate, SyncTarget */
 	atomic_t rs_sect_ev; /* for submitted resync data rate, both */
 	int rs_last_sect_ev; /* counter to compare with */
-	int rs_last_events;  /* counter of read or write "events" (unit sectors)
-			      * on the lower level device when we last looked. */
+	unsigned int rs_last_events;  /* counter of read or write "events" (unit sectors)
+				       * on the lower level device when we last looked. */
 	int c_sync_rate; /* current resync rate after syncer throttle magic */
 	struct fifo_buffer *rs_plan_s; /* correction values of resync planer (RCU, connection->conn_update) */
 	int rs_in_flight; /* resync sectors in flight (to proxy, in proxy and from proxy) */
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 9342b8d..147c917 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -2467,7 +2467,7 @@ bool drbd_rs_c_min_rate_throttle(struct drbd_device *device)
 	struct gendisk *disk = device->ldev->backing_bdev->bd_contains->bd_disk;
 	unsigned long db, dt, dbdt;
 	unsigned int c_min_rate;
-	int curr_events;
+	unsigned int curr_events;
 
 	rcu_read_lock();
 	c_min_rate = rcu_dereference(device->ldev->disk_conf)->c_min_rate;
@@ -2477,12 +2477,12 @@ bool drbd_rs_c_min_rate_throttle(struct drbd_device *device)
 	if (c_min_rate == 0)
 		return false;
 
-	curr_events = (int)part_stat_read(&disk->part0, sectors[0]) +
-		      (int)part_stat_read(&disk->part0, sectors[1]) -
-			atomic_read(&device->rs_sect_ev);
+	curr_events = (unsigned int)part_stat_read(&disk->part0, sectors[0]) +
+		      (unsigned int)part_stat_read(&disk->part0, sectors[1]) -
+		(unsigned int)atomic_read(&device->rs_sect_ev);
 
 	if (atomic_read(&device->ap_actlog_cnt)
-	    || !device->rs_last_events || curr_events - device->rs_last_events > 64) {
+		|| curr_events - device->rs_last_events > 64) {
 		unsigned long rs_left;
 		int i;
 
-- 
1.7.9.5



* Re: [PATCH v3] drbd: fix throttling on newly created DM backing devices
  2014-09-05 18:41 [PATCH v3] drbd: fix throttling on newly created DM backing devices Imre Palik
@ 2014-09-07  9:58 ` Lars
  2014-09-08 13:05   ` Imre Palik
  0 siblings, 1 reply; 5+ messages in thread
From: Lars @ 2014-09-07  9:58 UTC (permalink / raw)
  To: Imre Palik
  Cc: drbd-dev, Philipp Reisner, linux-kernel, Palik, Imre, Matt Wilson

On Fri, Sep 05, 2014 at 08:41:18PM +0200, Imre Palik wrote:
> From: "Palik, Imre" <imrep@amazon.de>
> 
> If the drbd backing device is a new device mapper device (e.g., a
> dm-linear mapping of an existing block device that contains data), the
> counters are initially 0 even though the device contains useful
> data. This causes throttling until something accesses the drbd device
> or the backing device.

What was wrong with my previous proposal?

How does changing the signedness help with
rs_last_events not being properly initialized?

Are you sure you have also considered all wrap-around cases?

Maybe you are too focused on your particular corner case
(disk_stats starting with 0).
Maybe I'm just thick right now, so please explain.

	Lars



* Re: [PATCH v3] drbd: fix throttling on newly created DM backing devices
  2014-09-07  9:58 ` Lars
@ 2014-09-08 13:05   ` Imre Palik
  2014-09-08 13:38     ` Lars
  0 siblings, 1 reply; 5+ messages in thread
From: Imre Palik @ 2014-09-08 13:05 UTC (permalink / raw)
  To: Lars; +Cc: drbd-dev, Philipp Reisner, linux-kernel, Palik, Imre, Matt Wilson

On 09/07/14 11:58, Lars wrote:
> On Fri, Sep 05, 2014 at 08:41:18PM +0200, Imre Palik wrote:
>> From: "Palik, Imre" <imrep@amazon.de>
>>
>> If the drbd backing device is a new device mapper device (e.g., a
>> dm-linear mapping of an existing block device that contains data), the
>> counters are initially 0 even though the device contains useful
>> data. This causes throttling until something accesses the drbd device
>> or the backing device.
>
> What was wrong with my previous proposal?

Sorry, I hadn't realised you added a proposal to your reply.  It seems I
really needed that extra sleep during the weekend ...

Your proposal is good.  Of course, I like my last one slightly better.
But as they say, beauty is in the eye of the beholder :-)

> How does changing the signedness help with
> rs_last_events not being properly initialized?

It only helps with reasoning.  I find modular arithmetic much easier to
reason about than signed integer overflow.  As it happens, 0 is a good
initialisation value in the case of unsigned arithmetic.

> Are you sure you have also considered all wrap-around cases?
>
> Maybe you are too focused on your particular corner case
> (disk_stats starting with 0).
> Maybe I'm just thick right now, so please explain.

The idea is that 0 is the smallest possible value for an unsigned, and
curr_events is monotonically increasing (mod 2^32).  This means that
initially either curr_events > 64, in which case we enter the if block
and do the initialisation, or it will exceed 64 no later than the point
at which we would ideally want to start throttling (after no more than
64 sectors of activity).

Basically, while you initialise rs_last_events to an ideal value with 
some calculation, I choose a safe static value.  I am content with both 
approaches.  I think, as a subsystem maintainer, you should choose the 
one you like better.  If you choose yours, then you can add
Reviewed-by: Imre Palik <imrep@amazon.de>

Imre



* Re: [PATCH v3] drbd: fix throttling on newly created DM backing devices
  2014-09-08 13:05   ` Imre Palik
@ 2014-09-08 13:38     ` Lars
  2014-09-09 19:03       ` Imre Palik
  0 siblings, 1 reply; 5+ messages in thread
From: Lars @ 2014-09-08 13:38 UTC (permalink / raw)
  To: Imre Palik
  Cc: drbd-dev, Philipp Reisner, linux-kernel, Palik, Imre, Matt Wilson

On Mon, Sep 08, 2014 at 03:05:28PM +0200, Imre Palik wrote:
> On 09/07/14 11:58, Lars wrote:
> >On Fri, Sep 05, 2014 at 08:41:18PM +0200, Imre Palik wrote:
> >>From: "Palik, Imre" <imrep@amazon.de>
> >>
> >>If the drbd backing device is a new device mapper device (e.g., a
> >>dm-linear mapping of an existing block device that contains data), the
> >>counters are initially 0 even though the device contains useful
> >>data. This causes throttling until something accesses the drbd device
> >>or the backing device.
> >
> >What was wrong with my previous proposal?
> 
> Sorry, I haven't realised you added a proposal to your reply.  It
> seems, I really needed that extra sleep during the weekend ...
> 
> Your proposal is good.  Of course, I like my last one a slightly
> better.  But as they say, beauty is in the eye of the beholder :-)
> 
> >How does changing the signedness help with
> >rs_last_events not being properly initialized?
> 
> It only helps with reasoning.  I reason with modular arithmetic way
> easier than with signed integer overflows.  Accidentally, 0 is a
> good initialisation value in case of unsigned arithmetic.
> 
> >Are you sure you have also considered all wrap-around cases?
> >
> >Maybe you are too focused on your particular corner case
> >(disk_stats starting with 0).
> >Maybe I'm just thick right now, so please explain.
> 
> The idea is that 0 is the smallest possible value for an unsigned,
> and curr_events is monotonically increasing (mod 2^32) .

The problem is: it is not :-(

It's a difference between stats that are increased by the
block core at (usually) completion time, and an atomic_t
that is increased by DRBD just before (or just after) submission.

Depending very much on stress in the IO subsystem,
and overall timing of events, a later call may see a smaller
"curr_events" (because rs_sect_ev has already increased,
but the disk stats have not yet noticed).

With unsigned, that may wrap around to UINT_MAX, which we don't want.
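
A small userspace sketch of the difference, assuming a later sample
that is 8 sectors smaller than the previous one.  Both helper names are
hypothetical, and the values are kept small so the signed subtraction
cannot overflow.

```c
#include <stdbool.h>

/* Signed comparison, as in the code before the proposed patch. */
static bool signed_delta_exceeds(int curr_events, int last_events)
{
	/* A slightly smaller curr_events gives a small negative delta. */
	return curr_events - last_events > 64;
}

/* Unsigned comparison, as in the proposed patch. */
static bool unsigned_delta_exceeds(unsigned int curr_events,
				   unsigned int last_events)
{
	/* The same backwards step wraps to a value close to UINT_MAX. */
	return curr_events - last_events > 64;
}
```

With last_events = 1000 and curr_events = 992, the signed version sees
a delta of -8 and does not trigger, while the unsigned version sees
UINT_MAX - 7 and does.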

> This
> means, initially either curr_events > 64, that is, we enter the
> loop, and do the initialisation, or it will be bigger than 64 at
> most when we want to start throttle in an ideal world (after no more
> than 64 sectors of activity).
> 
> Basically, while you initialise rs_last_events to an ideal value
> with some calculation, I choose a safe static value.  I am content
> with both approaches.  I think, as a subsystem maintainer, you
> should choose the one you like better.  If you choose yours, then
> you can add
> Reviewed-by: Imre Palik <imrep@amazon.de>

Thanks,

	Lars



* Re: [PATCH v3] drbd: fix throttling on newly created DM backing devices
  2014-09-08 13:38     ` Lars
@ 2014-09-09 19:03       ` Imre Palik
  0 siblings, 0 replies; 5+ messages in thread
From: Imre Palik @ 2014-09-09 19:03 UTC (permalink / raw)
  To: drbd-dev, Philipp Reisner, linux-kernel, Palik, Imre, Matt Wilson

On 09/08/14 15:38, Lars wrote:
> On Mon, Sep 08, 2014 at 03:05:28PM +0200, Imre Palik wrote:
>> On 09/07/14 11:58, Lars wrote:
>>> On Fri, Sep 05, 2014 at 08:41:18PM +0200, Imre Palik wrote:
>>>> From: "Palik, Imre" <imrep@amazon.de>
>>>>
>>>> If the drbd backing device is a new device mapper device (e.g., a
>>>> dm-linear mapping of an existing block device that contains data), the
>>>> counters are initially 0 even though the device contains useful
>>>> data. This causes throttling until something accesses the drbd device
>>>> or the backing device.
>>>
>>> What was wrong with my previous proposal?
>>
>> Sorry, I haven't realised you added a proposal to your reply.  It
>> seems, I really needed that extra sleep during the weekend ...
>>
>> Your proposal is good.  Of course, I like my last one a slightly
>> better.  But as they say, beauty is in the eye of the beholder :-)
>>
>>> How does changing the signedness help with
>>> rs_last_events not being properly initialized?
>>
>> It only helps with reasoning.  I reason with modular arithmetic way
>> easier than with signed integer overflows.  Accidentally, 0 is a
>> good initialisation value in case of unsigned arithmetic.
>>
>>> Are you sure you have also considered all wrap-around cases?
>>>
>>> Maybe you are too focused on your particular corner case
>>> (disk_stats starting with 0).
>>> Maybe I'm just thick right now, so please explain.
>>
>> The idea is that 0 is the smallest possible value for an unsigned,
>> and curr_events is monotonically increasing (mod 2^32) .
>
> The problem is: it is not :-(
>
> It's a difference between stats that are increased by the
> block core at (usually) completion time, and an atomic_t
> that is increased by DRBD just before (or just after) submission.
>
> Depending very much on stress in the IO subsystem,
> and overall timing of events, a later call may see a smaller
> "curr_events" (because rs_sect_ev has already increased,
> but the disk stats have not yet noticed).
>
> With unsigned, that may wrap around to UINT_MAX, which we don't want.

I see.  You hide the jitter behind the signedness.  Thanks for the 
explanation.

>> This
>> means, initially either curr_events > 64, that is, we enter the
>> loop, and do the initialisation, or it will be bigger than 64 at
>> most when we want to start throttle in an ideal world (after no more
>> than 64 sectors of activity).
>>
>> Basically, while you initialise rs_last_events to an ideal value
>> with some calculation, I choose a safe static value.  I am content
>> with both approaches.  I think, as a subsystem maintainer, you
>> should choose the one you like better.  If you choose yours, then
>> you can add
>> Reviewed-by: Imre Palik <imrep@amazon.de>
>
> Thanks,
>
> 	Lars
>

