All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Phillips <phillips@phunq.net>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: Block device throttling [Re: Distributed storage.]
Date: Sun, 12 Aug 2007 23:44:00 -0700	[thread overview]
Message-ID: <200708122344.00343.phillips@phunq.net> (raw)
In-Reply-To: <200708122236.24096.phillips@phunq.net>

On Sunday 12 August 2007 22:36, I wrote:
> Note!  There are two more issues I forgot to mention earlier.

Oops, and there is also:

3) The bio throttle, which is supposed to prevent deadlock, can itself 
deadlock.  Let me see if I can remember how it goes.

  * generic_make_request puts a bio in flight
  * the bio gets past the throttle and initiates network IO
  * net calls sk_alloc->alloc_pages->shrink_caches
  * shrink_caches submits a bio recursively to our block device
  * this bio blocks on the throttle
  * net may never get the memory it needs, and we are wedged

I need to review a backtrace to get this precisely right, however you 
can see the danger.  In ddsnap we kludge around this problem by not 
throttling any bio submitted in PF_MEMALLOC mode, which effectively 
increases our reserve requirement by the amount of IO that mm will 
submit to a given block device before deciding the device is congested 
and should be left alone.  This works, but is sloppy and disgusting.

The right thing to do is to make sure than the mm knows about our 
throttle accounting in backing_dev_info so it will not push IO to our 
device when it knows that the IO will just block on congestion.  
Instead, shrink_caches will find some other less congested block device 
or give up, causing alloc_pages to draw from the memalloc reserve to 
satisfy the sk_alloc request.

The mm already uses backing_dev_info this way, we just need to set the 
right bits in the backing_dev_info state flags.  I think Peter posted a 
patch set that included this feature at some point.

Regards,

Daniel

  reply	other threads:[~2007-08-13 12:21 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-31 17:13 Distributed storage Evgeniy Polyakov
2007-08-02 21:08 ` Daniel Phillips
2007-08-03 10:26   ` Evgeniy Polyakov
2007-08-03 10:57     ` Evgeniy Polyakov
2007-08-03 12:27       ` Peter Zijlstra
2007-08-03 13:49         ` Evgeniy Polyakov
2007-08-03 14:53           ` Peter Zijlstra
2007-08-03 19:48             ` Daniel Phillips
2007-08-03 19:41           ` Daniel Phillips
2007-08-04  1:19     ` Daniel Phillips
2007-08-04 16:37       ` Evgeniy Polyakov
2007-08-05  8:04         ` Daniel Phillips
2007-08-05 15:08           ` Evgeniy Polyakov
2007-08-05 21:23             ` Daniel Phillips
2007-08-06  8:25               ` Evgeniy Polyakov
2007-08-07 12:05               ` Jens Axboe
2007-08-07 18:24                 ` Daniel Phillips
2007-08-07 20:55                   ` Jens Axboe
2007-08-08  9:54                     ` Block device throttling [Re: Distributed storage.] Evgeniy Polyakov
2007-08-08 10:17                       ` [1/1] " Evgeniy Polyakov
2007-08-08 13:28                         ` Evgeniy Polyakov
2007-08-12 23:16                         ` Daniel Phillips
2007-08-13  8:18                           ` Evgeniy Polyakov
2007-08-27 21:57                         ` Daniel Phillips
2007-08-28  9:35                           ` Evgeniy Polyakov
2007-08-28 17:27                             ` Daniel Phillips
2007-08-28 17:54                               ` Evgeniy Polyakov
2007-08-28 21:08                                 ` Daniel Phillips
2007-08-29  8:53                                   ` Evgeniy Polyakov
2007-08-30 23:20                                     ` Daniel Phillips
2007-08-31 17:33                                       ` Evgeniy Polyakov
2007-08-31 21:41                                       ` Alasdair G Kergon
2007-09-02  4:42                                         ` Daniel Phillips
2007-08-13  5:22                       ` Daniel Phillips
2007-08-13  5:36                       ` Daniel Phillips
2007-08-13  6:44                         ` Daniel Phillips [this message]
2007-08-13  8:14                           ` Evgeniy Polyakov
2007-08-13 11:04                             ` Daniel Phillips
2007-08-13 12:04                               ` Evgeniy Polyakov
2007-08-13 12:18                                 ` Daniel Phillips
2007-08-13 12:24                                   ` Evgeniy Polyakov
2007-08-13  8:23                         ` Evgeniy Polyakov
2007-08-13 11:18                           ` Daniel Phillips
2007-08-13 12:18                             ` Evgeniy Polyakov
2007-08-13 13:04                               ` Daniel Phillips
2007-08-14  8:46                                 ` Evgeniy Polyakov
2007-08-14 11:13                                   ` Daniel Phillips
2007-08-14 11:30                                     ` Evgeniy Polyakov
2007-08-14 11:35                                       ` Daniel Phillips
2007-08-14 11:50                                         ` Evgeniy Polyakov
2007-08-14 12:32                                           ` Daniel Phillips
2007-08-14 12:46                                             ` Evgeniy Polyakov
2007-08-14 12:54                                               ` Daniel Phillips
2007-08-12 23:36                     ` Distributed storage Daniel Phillips
2007-08-13  7:28                       ` Jens Axboe
2007-08-13  7:45                         ` Jens Axboe
2007-08-13  9:08                           ` Daniel Phillips
2007-08-13  9:13                             ` Jens Axboe
2007-08-13  9:55                               ` Daniel Phillips
2007-08-13 10:06                                 ` Jens Axboe
2007-08-13 10:15                                   ` Daniel Phillips
2007-08-13 10:22                                     ` Jens Axboe
2007-08-13 10:32                                       ` Daniel Phillips
2007-08-13  9:18                             ` Evgeniy Polyakov
2007-08-13 10:12                               ` Daniel Phillips
2007-08-13 11:03                                 ` Evgeniy Polyakov
2007-08-13 11:45                                   ` Daniel Phillips
2007-08-13  8:59                         ` Daniel Phillips
2007-08-13  9:12                           ` Jens Axboe
2007-08-13 23:27                             ` Daniel Phillips
2007-08-03  4:09 ` Mike Snitzer
2007-08-03 10:42   ` Evgeniy Polyakov
2007-08-04  0:49   ` Daniel Phillips
2007-08-03  5:04 ` Manu Abraham
2007-08-03 10:44   ` Evgeniy Polyakov
2007-08-04  2:51   ` Dave Dillow
2007-08-04  3:44     ` Manu Abraham
2007-08-04 17:03   ` Evgeniy Polyakov
2007-08-28 17:19   ` Evgeniy Polyakov
2007-08-04  0:41 ` Daniel Phillips
2007-08-04 16:44   ` Evgeniy Polyakov
2007-08-05  8:06     ` Daniel Phillips
2007-08-05 15:01       ` Evgeniy Polyakov
2007-08-05 21:35         ` Daniel Phillips
2007-08-06  8:28           ` Evgeniy Polyakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200708122344.00343.phillips@phunq.net \
    --to=phillips@phunq.net \
    --cc=jens.axboe@oracle.com \
    --cc=johnpol@2ka.mipt.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.