From: "Jim Schutt" <jaschut@sandia.gov>
To: Gregory Farnum <gregory.farnum@dreamhost.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: [RFC PATCH 0/6] Understanding delays due to throttling under very heavy write load
Date: Thu, 2 Feb 2012 08:38:52 -0700
Message-ID: <4F2AAE0C.6030609@sandia.gov>
In-Reply-To: <CAF3hT9DV46n0TwWOVC0LsCdd921uus3kzQfPLuMNEATjpYzT3g@mail.gmail.com>

(resent because I forgot the list on my original reply)

On 02/01/2012 03:33 PM, Gregory Farnum wrote:
> On Wed, Feb 1, 2012 at 7:54 AM, Jim Schutt<jaschut@sandia.gov>  wrote:
>> Hi,
>>
>> FWIW, I've been trying to understand op delays under very heavy write
>> load, and have been working a little with the policy throttler in hopes of
>> using throttling delays to help track down which ops were backing up.
>> Without much success, unfortunately.
>>
>> When I saw the wip-osd-op-tracking branch, I wondered if any of this
>> stuff might be helpful.  Here it is, just in case.
>
> In general these patches are dumping information to the logs, and part
> of the wip-osd-op-tracking branch is actually keeping track of most of
> the message queueing wait times as part of the message itself
> (although not the information about number of waiters and sleep/wake
> seqs). I'm inclined to prefer that approach to log dumping.

I agree - I've just been using log dumping because I can extract
any relationships I can write a perl script to find :)  So far,
not too helpful.

> Are there any patches you recommend for merging? I'm a little curious
> about the ordered wakeup one — do you have data about when that's a
> problem?

I've been trying to push the client:osd ratio, and in my testbed
I can run up to 166 linux clients. Right now I'm running them
against 48 OSDs.  The clients are 1 Gb/s ethernet, and the OSDs
have a 10 Gb/s ethernet for clients and another for the cluster.

During sustained write loads I see a factor of 10 oscillation
in aggregate throughput, and during that time I see clients
stuck in the policy throttler for hundreds of seconds, and I
see a number of waiters equal to
   number of clients - (throttler limit) / (msg size)
If I do a histogram of throttler wait times, I see a handful of
messages that wait an extra couple of hundred seconds
without the ordered wakeup.
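
To make the waiter-count relation above concrete, here is a minimal
sketch. Only the 166-client figure comes from this message; the
throttler limit and message size are hypothetical values picked for
illustration, and the function name is made up. It also assumes every
client has exactly one message of the same size outstanding against
the throttler.

```python
def expected_waiters(num_clients, throttle_limit_bytes, msg_size_bytes):
    """Estimate how many clients are blocked in the throttler.

    Implements: waiters = clients - (throttler limit) / (msg size),
    clamped at zero. Assumes one equal-sized message per client.
    """
    # Number of messages the throttler can admit at once.
    admitted = throttle_limit_bytes // msg_size_bytes
    return max(0, num_clients - admitted)


MiB = 2 ** 20
# Hypothetical numbers: a 400 MiB limit and 4 MiB messages admit 100
# messages, which leaves 66 of 166 clients waiting.
print(expected_waiters(166, 400 * MiB, 4 * MiB))
```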

I'm not sure what this will look like if my throughput
variations can be fixed.  But, for our HPC loads I expect
we'll often see periods where offered load is much higher
than the aggregate bandwidth of any system we can afford to
build, so ordered wakeup may be useful in such cases for
client fairness.

So I'd recommend the ordered wakeup patch if you don't
see any downsides.
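
For readers unfamiliar with the idea, ordered wakeup can be sketched
roughly as below. This is an illustrative Python model, not the Ceph
common/Throttle code and not the patch itself; the class name and
methods are hypothetical. Each blocked caller gets its own condition
variable, and only the oldest waiter is ever signalled, so a late
arrival cannot jump ahead of a request that has been waiting longer.

```python
import threading
from collections import deque


class FifoThrottle:
    """Byte-budget throttle that admits blocked callers in arrival order."""

    def __init__(self, limit):
        self._limit = limit
        self._used = 0
        self._lock = threading.Lock()
        # One Condition per blocked caller, oldest first.
        self._waiters = deque()

    def get(self, count):
        # Sketch only: a count larger than the limit would block forever.
        with self._lock:
            # Fast path: nobody is queued and the request fits.
            if not self._waiters and self._used + count <= self._limit:
                self._used += count
                return
            cond = threading.Condition(self._lock)
            self._waiters.append(cond)
            # Proceed only when at the head of the queue AND there is room.
            while self._waiters[0] is not cond or \
                    self._used + count > self._limit:
                cond.wait()
            self._waiters.popleft()
            self._used += count
            # Budget may remain, so let the next-oldest waiter re-check.
            if self._waiters:
                self._waiters[0].notify()

    def put(self, count):
        with self._lock:
            self._used -= count
            # Wake only the oldest waiter; it passes the wakeup along.
            if self._waiters:
                self._waiters[0].notify()
```

Without the queue, every waiter would share one condition variable and
whichever thread happened to win the race after a put() would proceed,
which is how a few unlucky messages can end up waiting far longer than
the rest.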

Sorry for the noise on the others - mostly I just wanted
to share the sort of things I've been looking at.  I'll
be learning to use your new stuff soon...

-- Jim

> -Greg
>
>


