All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Phillips <phillips@phunq.net>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [1/1] Block device throttling [Re: Distributed storage.]
Date: Sun, 12 Aug 2007 16:16:10 -0700	[thread overview]
Message-ID: <200708121616.10670.phillips@phunq.net> (raw)
In-Reply-To: <20070808101708.GA23815@2ka.mipt.ru>

Hi Evgeniy,

Sorry for not getting back to you right away, I was on the road with 
limited email access.  Incidentally, the reason my mails to you keep 
bouncing is, your MTA is picky about my mailer's IP reversing to a real 
hostname.  I will take care of that pretty soon, but for now my direct 
mail to you is going to bounce and you will only see the lkml copy.

On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote:
> This throttling mechanism allows to limit maximum amount of queued
> bios per physical device. By default it is turned off and old block
> layer behaviour with unlimited number of bios is used. When turned on
> (queue limit is set to something different than -1U via
> blk_set_queue_limit()), generic_make_request() will sleep until there
> is room in the queue. number of bios is increased in
> generic_make_request() and reduced either in bio_endio(), when bio is
> completely processed (bi_size is zero), and recharged from original
> queue when new device is assigned to bio via blk_set_bdev(). All
> oerations are not atomic, since we do not care about precise number
> of bios, but a fact, that we are close or close enough to the limit.
>
> Tested on distributed storage device - with limit of 2 bios it works
> slow :)

it seems to me you need:

-               if (q) {
+               if (q && q->bio_limit != -1) {

This patch is short and simple, and will throttle more accurately than 
the current simplistic per-request allocation limit.  However, it fails 
to throttle device mapper devices.  This is because no request is 
allocated by the device mapper queue method, instead the mapping call 
goes straight through to the mapping function.  If the mapping function 
allocates memory (typically the case) then this resource usage evades 
throttling and deadlock becomes a risk.

There are three obvious fixes:

   1) Implement bio throttling in each virtual block device
   2) Implement bio throttling generically in device mapper
   3) Implement bio throttling for all block devices

Number 1 is the approach we currently use in ddsnap, but it is ugly and 
repetitious.  Number 2 is a possibility, but I favor number 3 because 
it is a system-wide solution to a system-wide problem, does not need to 
be repeated for every block device that lacks a queue, heads in the 
direction of code subtraction, and allows system-wide reserve 
accounting. 

Your patch is close to the truth, but it needs to throttle at the top 
(virtual) end of each block device stack instead of the bottom 
(physical) end.  It does head in the direction of eliminating your own 
deadlock risk indeed, however there are block devices it does not 
cover.

Regards,

Daniel

  parent reply	other threads:[~2007-08-12 23:16 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-31 17:13 Distributed storage Evgeniy Polyakov
2007-08-02 21:08 ` Daniel Phillips
2007-08-03 10:26   ` Evgeniy Polyakov
2007-08-03 10:57     ` Evgeniy Polyakov
2007-08-03 12:27       ` Peter Zijlstra
2007-08-03 13:49         ` Evgeniy Polyakov
2007-08-03 14:53           ` Peter Zijlstra
2007-08-03 19:48             ` Daniel Phillips
2007-08-03 19:41           ` Daniel Phillips
2007-08-04  1:19     ` Daniel Phillips
2007-08-04 16:37       ` Evgeniy Polyakov
2007-08-05  8:04         ` Daniel Phillips
2007-08-05 15:08           ` Evgeniy Polyakov
2007-08-05 21:23             ` Daniel Phillips
2007-08-06  8:25               ` Evgeniy Polyakov
2007-08-07 12:05               ` Jens Axboe
2007-08-07 18:24                 ` Daniel Phillips
2007-08-07 20:55                   ` Jens Axboe
2007-08-08  9:54                     ` Block device throttling [Re: Distributed storage.] Evgeniy Polyakov
2007-08-08 10:17                       ` [1/1] " Evgeniy Polyakov
2007-08-08 13:28                         ` Evgeniy Polyakov
2007-08-12 23:16                         ` Daniel Phillips [this message]
2007-08-13  8:18                           ` Evgeniy Polyakov
2007-08-27 21:57                         ` Daniel Phillips
2007-08-28  9:35                           ` Evgeniy Polyakov
2007-08-28 17:27                             ` Daniel Phillips
2007-08-28 17:54                               ` Evgeniy Polyakov
2007-08-28 21:08                                 ` Daniel Phillips
2007-08-29  8:53                                   ` Evgeniy Polyakov
2007-08-30 23:20                                     ` Daniel Phillips
2007-08-31 17:33                                       ` Evgeniy Polyakov
2007-08-31 21:41                                       ` Alasdair G Kergon
2007-09-02  4:42                                         ` Daniel Phillips
2007-08-13  5:22                       ` Daniel Phillips
2007-08-13  5:36                       ` Daniel Phillips
2007-08-13  6:44                         ` Daniel Phillips
2007-08-13  8:14                           ` Evgeniy Polyakov
2007-08-13 11:04                             ` Daniel Phillips
2007-08-13 12:04                               ` Evgeniy Polyakov
2007-08-13 12:18                                 ` Daniel Phillips
2007-08-13 12:24                                   ` Evgeniy Polyakov
2007-08-13  8:23                         ` Evgeniy Polyakov
2007-08-13 11:18                           ` Daniel Phillips
2007-08-13 12:18                             ` Evgeniy Polyakov
2007-08-13 13:04                               ` Daniel Phillips
2007-08-14  8:46                                 ` Evgeniy Polyakov
2007-08-14 11:13                                   ` Daniel Phillips
2007-08-14 11:30                                     ` Evgeniy Polyakov
2007-08-14 11:35                                       ` Daniel Phillips
2007-08-14 11:50                                         ` Evgeniy Polyakov
2007-08-14 12:32                                           ` Daniel Phillips
2007-08-14 12:46                                             ` Evgeniy Polyakov
2007-08-14 12:54                                               ` Daniel Phillips
2007-08-12 23:36                     ` Distributed storage Daniel Phillips
2007-08-13  7:28                       ` Jens Axboe
2007-08-13  7:45                         ` Jens Axboe
2007-08-13  9:08                           ` Daniel Phillips
2007-08-13  9:13                             ` Jens Axboe
2007-08-13  9:55                               ` Daniel Phillips
2007-08-13 10:06                                 ` Jens Axboe
2007-08-13 10:15                                   ` Daniel Phillips
2007-08-13 10:22                                     ` Jens Axboe
2007-08-13 10:32                                       ` Daniel Phillips
2007-08-13  9:18                             ` Evgeniy Polyakov
2007-08-13 10:12                               ` Daniel Phillips
2007-08-13 11:03                                 ` Evgeniy Polyakov
2007-08-13 11:45                                   ` Daniel Phillips
2007-08-13  8:59                         ` Daniel Phillips
2007-08-13  9:12                           ` Jens Axboe
2007-08-13 23:27                             ` Daniel Phillips
2007-08-03  4:09 ` Mike Snitzer
2007-08-03 10:42   ` Evgeniy Polyakov
2007-08-04  0:49   ` Daniel Phillips
2007-08-03  5:04 ` Manu Abraham
2007-08-03 10:44   ` Evgeniy Polyakov
2007-08-04  2:51   ` Dave Dillow
2007-08-04  3:44     ` Manu Abraham
2007-08-04 17:03   ` Evgeniy Polyakov
2007-08-28 17:19   ` Evgeniy Polyakov
2007-08-04  0:41 ` Daniel Phillips
2007-08-04 16:44   ` Evgeniy Polyakov
2007-08-05  8:06     ` Daniel Phillips
2007-08-05 15:01       ` Evgeniy Polyakov
2007-08-05 21:35         ` Daniel Phillips
2007-08-06  8:28           ` Evgeniy Polyakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200708121616.10670.phillips@phunq.net \
    --to=phillips@phunq.net \
    --cc=jens.axboe@oracle.com \
    --cc=johnpol@2ka.mipt.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.