From: Daniel Phillips <phillips@phunq.net>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [1/1] Block device throttling [Re: Distributed storage.]
Date: Thu, 30 Aug 2007 16:20:35 -0700 [thread overview]
Message-ID: <200708301620.37965.phillips@phunq.net> (raw)
In-Reply-To: <20070829085331.GA16607@2ka.mipt.ru>
On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote:
> Then, if of course you will want, which I doubt, you can reread
> previous mails and find that it was pointed to that race and
> possibilities to solve it way too long ago.
What still bothers me about your response is that, while you know the
race exists and do not disagree with my example, you don't seem to see
that that race can eventually lock up the block device by repeatedly
losing throttle counts which are never recovered. What prevents that?
> > --- 2.6.22.clean/block/ll_rw_blk.c 2007-07-08 16:32:17.000000000
> > -0700 +++ 2.6.22/block/ll_rw_blk.c 2007-08-24 12:07:16.000000000
> > -0700 @@ -3237,6 +3237,15 @@ end_io:
> > */
> > void generic_make_request(struct bio *bio)
> > {
> > + struct request_queue *q = bdev_get_queue(bio->bi_bdev);
> > +
> > + if (q && q->metric) {
> > + int need = bio->bi_reserved = q->metric(bio);
> > + bio->queue = q;
>
> In case you have stacked device, this entry will be rewritten and you
> will lost all your account data.
It is a weakness all right. Well,
- if (q && q->metric) {
+ if (q && q->metric && !bio->queue) {
which fixes that problem. Maybe there is a better fix possible. Thanks
for the catch!
The original conception was that this block throttling would apply only
to the highest level submission of the bio, the one that crosses the
boundary between filesystem (or direct block device application) and
block layer. Resubmitting a bio or submitting a dependent bio from
inside a block driver does not need to be throttled because all
resources required to guarantee completion must have been obtained
_before_ the bio was allowed to proceed into the block layer.
The other principle we are trying to satisfy is that the throttling
should not be released until bio->endio, which I am not completely sure
about with the patch as modified above. Your earlier idea of having
the throttle protection only cover the actual bio submission is
interesting and may be effective in some cases, in fact it may cover
the specific case of ddsnap. But we don't have to look any further
than ddraid (distributed raid) to find a case it doesn't cover - the
additional memory allocated to hold parity data has to be reserved
until parity data is deallocated, long after the submission completes.
So while you manage to avoid some logistical difficulties, it also looks
like you didn't solve the general problem.
Hopefully I will be able to report on whether my patch actually works
soon, when I get back from vacation. The mechanism in ddsnap this is
supposed to replace is effective, it is just ugly and tricky to verify.
Regards,
Daniel
next prev parent reply other threads:[~2007-08-30 23:21 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-31 17:13 Distributed storage Evgeniy Polyakov
2007-08-02 21:08 ` Daniel Phillips
2007-08-03 10:26 ` Evgeniy Polyakov
2007-08-03 10:57 ` Evgeniy Polyakov
2007-08-03 12:27 ` Peter Zijlstra
2007-08-03 13:49 ` Evgeniy Polyakov
2007-08-03 14:53 ` Peter Zijlstra
2007-08-03 19:48 ` Daniel Phillips
2007-08-03 19:41 ` Daniel Phillips
2007-08-04 1:19 ` Daniel Phillips
2007-08-04 16:37 ` Evgeniy Polyakov
2007-08-05 8:04 ` Daniel Phillips
2007-08-05 15:08 ` Evgeniy Polyakov
2007-08-05 21:23 ` Daniel Phillips
2007-08-06 8:25 ` Evgeniy Polyakov
2007-08-07 12:05 ` Jens Axboe
2007-08-07 18:24 ` Daniel Phillips
2007-08-07 20:55 ` Jens Axboe
2007-08-08 9:54 ` Block device throttling [Re: Distributed storage.] Evgeniy Polyakov
2007-08-08 10:17 ` [1/1] " Evgeniy Polyakov
2007-08-08 13:28 ` Evgeniy Polyakov
2007-08-12 23:16 ` Daniel Phillips
2007-08-13 8:18 ` Evgeniy Polyakov
2007-08-27 21:57 ` Daniel Phillips
2007-08-28 9:35 ` Evgeniy Polyakov
2007-08-28 17:27 ` Daniel Phillips
2007-08-28 17:54 ` Evgeniy Polyakov
2007-08-28 21:08 ` Daniel Phillips
2007-08-29 8:53 ` Evgeniy Polyakov
2007-08-30 23:20 ` Daniel Phillips [this message]
2007-08-31 17:33 ` Evgeniy Polyakov
2007-08-31 21:41 ` Alasdair G Kergon
2007-09-02 4:42 ` Daniel Phillips
2007-08-13 5:22 ` Daniel Phillips
2007-08-13 5:36 ` Daniel Phillips
2007-08-13 6:44 ` Daniel Phillips
2007-08-13 8:14 ` Evgeniy Polyakov
2007-08-13 11:04 ` Daniel Phillips
2007-08-13 12:04 ` Evgeniy Polyakov
2007-08-13 12:18 ` Daniel Phillips
2007-08-13 12:24 ` Evgeniy Polyakov
2007-08-13 8:23 ` Evgeniy Polyakov
2007-08-13 11:18 ` Daniel Phillips
2007-08-13 12:18 ` Evgeniy Polyakov
2007-08-13 13:04 ` Daniel Phillips
2007-08-14 8:46 ` Evgeniy Polyakov
2007-08-14 11:13 ` Daniel Phillips
2007-08-14 11:30 ` Evgeniy Polyakov
2007-08-14 11:35 ` Daniel Phillips
2007-08-14 11:50 ` Evgeniy Polyakov
2007-08-14 12:32 ` Daniel Phillips
2007-08-14 12:46 ` Evgeniy Polyakov
2007-08-14 12:54 ` Daniel Phillips
2007-08-12 23:36 ` Distributed storage Daniel Phillips
2007-08-13 7:28 ` Jens Axboe
2007-08-13 7:45 ` Jens Axboe
2007-08-13 9:08 ` Daniel Phillips
2007-08-13 9:13 ` Jens Axboe
2007-08-13 9:55 ` Daniel Phillips
2007-08-13 10:06 ` Jens Axboe
2007-08-13 10:15 ` Daniel Phillips
2007-08-13 10:22 ` Jens Axboe
2007-08-13 10:32 ` Daniel Phillips
2007-08-13 9:18 ` Evgeniy Polyakov
2007-08-13 10:12 ` Daniel Phillips
2007-08-13 11:03 ` Evgeniy Polyakov
2007-08-13 11:45 ` Daniel Phillips
2007-08-13 8:59 ` Daniel Phillips
2007-08-13 9:12 ` Jens Axboe
2007-08-13 23:27 ` Daniel Phillips
2007-08-03 4:09 ` Mike Snitzer
2007-08-03 10:42 ` Evgeniy Polyakov
2007-08-04 0:49 ` Daniel Phillips
2007-08-03 5:04 ` Manu Abraham
2007-08-03 10:44 ` Evgeniy Polyakov
2007-08-04 2:51 ` Dave Dillow
2007-08-04 3:44 ` Manu Abraham
2007-08-04 17:03 ` Evgeniy Polyakov
2007-08-28 17:19 ` Evgeniy Polyakov
2007-08-04 0:41 ` Daniel Phillips
2007-08-04 16:44 ` Evgeniy Polyakov
2007-08-05 8:06 ` Daniel Phillips
2007-08-05 15:01 ` Evgeniy Polyakov
2007-08-05 21:35 ` Daniel Phillips
2007-08-06 8:28 ` Evgeniy Polyakov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200708301620.37965.phillips@phunq.net \
--to=phillips@phunq.net \
--cc=jens.axboe@oracle.com \
--cc=johnpol@2ka.mipt.ru \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.