From: Tejun Heo <tj@kernel.org>
To: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] [RFC] make hd_struct->in_flight atomic to avoid diskstat corruption
Date: Thu, 16 Apr 2009 23:40:55 +0900
Message-ID: <49E74377.6050209@kernel.org>
In-Reply-To: <200904161445.03955.knikanth@suse.de>
Hello, Nikanth, Jens.
Nikanth Karthikesan wrote:
>> Hmm. Did you observe this behaviour?
>
> Sorry, not on current kernels. But on a very old 2.6.5 kernel.
>
> Reading Documentation/iostats.txt and the changelog of commit
> e71bf0d0ee89e51b92776391c5634938236977d5 made me assume that this could be a
> problem even today.
The only problem we can run into there is if a request doesn't get
attributed to a partition on issue but does get attributed to one on
completion. That seems possible if a new partition is added while IO
on the whole device, falling into the new partition's area, is already
in progress. At first glance, the admin could trigger that by trying
really hard. I think we can get around the problem by doing
part->in_flight = max(min(new_val, part0->in_flight), 0) in
dec_in_flight(). This is a pretty extreme corner case, though.
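To make the suggestion concrete, here is a minimal userspace sketch of
the clamp, read as bounding the partition count to the range
[0, part0->in_flight]. struct part_sketch, clamp_in_flight() and
dec_in_flight_sketch() are hypothetical names made up for illustration,
not the kernel's hd_struct or its helpers, and the real code would run
with the queue lock held.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a partition's stats. */
struct part_sketch {
	long in_flight;
};

/* Bound new_val to [0, whole_dev_in_flight]. */
static long clamp_in_flight(long new_val, long whole_dev_in_flight)
{
	/* Assumes the whole-device count itself never goes negative. */
	assert(whole_dev_in_flight >= 0);

	if (new_val > whole_dev_in_flight)
		new_val = whole_dev_in_flight;
	if (new_val < 0)
		new_val = 0;
	return new_val;
}

/*
 * Completion path for a request attributed to 'part'. part0 is the
 * whole device. In the kernel the caller would hold the queue lock.
 */
static void dec_in_flight_sketch(struct part_sketch *part0,
				 struct part_sketch *part)
{
	part0->in_flight--;
	if (part != part0)
		part->in_flight = clamp_in_flight(part->in_flight - 1,
						  part0->in_flight);
}
```

With one whole-device request in flight and a partition that appeared
after issue (part0 at 1, partition at 0), the completion leaves both
counts at 0 instead of driving the partition count to -1.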
>> A quick glance at the code reveals
>> that the callers of part_inc_in_flight() and part_dec_in_flight() in the
>> block layer are always done under the queue lock. Ditto
>> part_round_stats(), which calls part_round_stats_single() and also needs
>> protection for in_flight.
>>
>> That basically just leaves the code reading this out and reporting, and
>> driver calls to part_round_stats(). I'd suggest looking there instead,
>> we're not going to make ->in_flight an atomic just because of some
>> silliness there that could be fixed.
>
> Isn't this also true for the stats protected by the
> part_stat_lock()? Only places where we are only reading seems to be
> called without the queue lock.
part_stat_lock() doesn't protect against simultaneous access. I don't
think we have any place where in_flight is updated without the queue
lock, and since the counters are no wider than a ulong, reading them
shouldn't be a problem.
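As a rough illustration of that split between locked writers and an
unlocked reader, the sketch below uses hypothetical names
(counter_sketch, inc_locked(), dec_locked(), read_in_flight()) and an
unsigned long counter chosen for the example. It assumes an aligned,
word-sized load is not torn on the target architecture. The volatile
cast only forces a single load, loosely in the spirit of the kernel's
READ_ONCE(), and provides no ordering or locking by itself.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-partition counter. */
struct counter_sketch {
	unsigned long in_flight;	/* no wider than a machine word */
};

/* Writers: in the kernel these paths would hold the queue lock. */
static void inc_locked(struct counter_sketch *c)
{
	c->in_flight++;
}

static void dec_locked(struct counter_sketch *c)
{
	assert(c->in_flight > 0);	/* underflow would be a bug */
	c->in_flight--;
}

/*
 * Reader: a single word-sized load with no lock taken. The value may
 * be slightly stale, but under the assumption above it is never a
 * half-updated mix of old and new bits.
 */
static unsigned long read_in_flight(const struct counter_sketch *c)
{
	return *(const volatile unsigned long *)&c->in_flight;
}
```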
I don't think the bug you saw in the 2.6.5 kernel applies to the
upstream kernel. The negative in_flight value was seen in the diskstats
of the whole device, which can't be affected by a partition coming up
while IOs are in progress.
Thanks.
--
tejun
Thread overview (7+ messages):
2009-04-16 7:24 [PATCH] [RFC] make hd_struct->in_flight atomic to avoid diskstat corruption Nikanth Karthikesan
2009-04-16 7:35 ` Jens Axboe
2009-04-16 9:15 ` Nikanth Karthikesan
2009-04-16 14:40 ` Tejun Heo [this message]
2009-04-16 16:32 ` Jens Axboe
2009-04-19 8:51 ` Tejun Heo
2009-04-21 7:31 ` Jens Axboe