From: Imre Palik <imrep.amz@gmail.com>
To: drbd-dev@lists.linbit.com,
	Philipp Reisner <philipp.reisner@linbit.com>,
	linux-kernel@vger.kernel.org, "Palik, Imre" <imrep@amazon.de>,
	Matt Wilson <msw@amazon.com>
Subject: Re: [Drbd-dev] [PATCH v3] drbd: fix throttling on newly created DM backing devices
Date: Tue, 09 Sep 2014 21:03:42 +0200
Message-ID: <540F4F0E.8050306@gmail.com>
In-Reply-To: <20140908133816.GF9353@soda.linbit>

On 09/08/14 15:38, Lars wrote:
> On Mon, Sep 08, 2014 at 03:05:28PM +0200, Imre Palik wrote:
>> On 09/07/14 11:58, Lars wrote:
>>> On Fri, Sep 05, 2014 at 08:41:18PM +0200, Imre Palik wrote:
>>>> From: "Palik, Imre" <imrep@amazon.de>
>>>>
>>>> If the drbd backing device is a new device mapper device (e.g., a
>>>> dm-linear mapping of an existing block device that contains data), the
>>>> counters are initially 0 even though the device contains useful
>>>> data. This causes throttling until something accesses the drbd device
>>>> or the backing device.
>>>
>>> What was wrong with my previous proposal?
>>
>> Sorry, I hadn't realised you had added a proposal to your reply.  It
>> seems I really needed that extra sleep during the weekend ...
>>
>> Your proposal is good.  Of course, I like my last one slightly
>> better.  But as they say, beauty is in the eye of the beholder :-)
>>
>>> How does changing the signedness help with
>>> rs_last_events not being properly initialized?
>>
>> It only helps with reasoning.  I find modular arithmetic much
>> easier to reason about than signed integer overflow.  Conveniently,
>> 0 is a good initialisation value for unsigned arithmetic.
>>
>>> Are you sure you have also considered all wrap-around cases?
>>>
>>> Maybe you are too focused on your particular corner case
>>> (disk_stats starting with 0).
>>> Maybe I'm just thick right now, so please explain.
>>
>> The idea is that 0 is the smallest possible value for an unsigned,
>> and curr_events is monotonically increasing (mod 2^32).
>
> The problem is: it is not :-(
>
> It's a difference between stats that are increased by the
> block core at (usually) completion time, and an atomic_t
> that is increased by DRBD just before (or just after) submission.
>
> Depending very much on stress in the IO subsystem,
> and overall timing of events, a later call may see a smaller
> "curr_events" (because rs_last_sect_ev has already increased,
> but the disk stats have not yet noticed).
>
> With unsigned, that may wrap around to UINT_MAX, which we don't want.

I see.  You hide the jitter behind the signedness.  Thanks for the 
explanation.
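
To make the point above concrete, here is a minimal sketch of the two
comparisons.  It is not the DRBD code: the function names and the
ACTIVITY_THRESHOLD macro are hypothetical, and only the value 64 is taken
from the "64 sectors" mentioned in this thread.  It assumes a 32-bit
unsigned int and sample values small enough that the signed subtraction
cannot overflow.

```c
#include <limits.h>

/* Hypothetical name; 64 is the sector threshold mentioned in the thread. */
#define ACTIVITY_THRESHOLD 64

/*
 * Unsigned variant: the difference is computed modulo UINT_MAX + 1.
 * If curr is slightly smaller than last (jitter), the result wraps to a
 * value close to UINT_MAX and is reported as significant activity.
 */
static int significant_unsigned(unsigned int curr, unsigned int last)
{
	return curr - last > ACTIVITY_THRESHOLD;
}

/*
 * Signed variant: a small backwards step yields a small negative
 * difference, which stays below the threshold and is ignored.
 * Assumes curr - last does not overflow int for the values passed in.
 */
static int significant_signed(int curr, int last)
{
	return curr - last > ACTIVITY_THRESHOLD;
}
```

With last = 1000 and curr = 992 (the counter appears to go back by 8
sectors), the unsigned variant reports significant activity while the
signed variant does not.  That is the jitter the signed type absorbs.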

>> This
>> means that initially either curr_events > 64, in which case we enter
>> the loop and do the initialisation, or it will exceed 64 no later
>> than the point where we would ideally want to start throttling
>> (after no more than 64 sectors of activity).
>>
>> Basically, while you initialise rs_last_events to an ideal value
>> with some calculation, I choose a safe static value.  I am content
>> with both approaches.  I think, as a subsystem maintainer, you
>> should choose the one you like better.  If you choose yours, then
>> you can add
>> Reviewed-by: Imre Palik <imrep@amazon.de>
>
> Thanks,
>
> 	Lars
>

