From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261948AbULVSS3 (ORCPT ); Wed, 22 Dec 2004 13:18:29 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261954AbULVSS3 (ORCPT ); Wed, 22 Dec 2004 13:18:29 -0500 Received: from parcelfarce.linux.theplanet.co.uk ([195.92.249.252]:18070 "EHLO www.linux.org.uk") by vger.kernel.org with ESMTP id S261948AbULVSSW (ORCPT ); Wed, 22 Dec 2004 13:18:22 -0500 Date: Wed, 22 Dec 2004 13:58:16 -0200 From: Marcelo Tosatti To: "M. Edward Borasky" Cc: Linux Kernel Mailing List , Jens Axboe Subject: Re: Negative "ios_in_flight" in the 2.4 kernel Message-ID: <20041222155816.GF3088@logos.cnet> References: <1103691937.23157.14.camel@DreamGate> <20041222111642.GD12463@suse.de> <1103728782.26340.34.camel@DreamGate> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103728782.26340.34.camel@DreamGate> User-Agent: Mutt/1.5.5.1i Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Dec 22, 2004 at 07:19:42AM -0800, M. Edward Borasky wrote: > On Wed, 2004-12-22 at 12:16 +0100, Jens Axboe wrote: > > > > > Question: wouldn't a simple refusal to decrement ios_in_flight in > > > "down_ios" if it's zero fix this, or am I missing something? > > > > That would paper over the real bug, but it will work for you. > What is the "real bug", then? What will "work for me" is accurate disk > usage tick counts. The intent of these statistics is something known as > Operational Analysis of Queueing Networks. > > The "requirement" is that the operations on each device be accurately > counted, and the "wall clock" time spent *waiting* for requests and the > time spent *servicing* requests be accurately accumulated for each > device. The sector count is a bonus. > > >From these raw counters, one can, and iostat does, compute throughput, > utilization, average service time, average wait time and average queue > length. An excellent and highly readable reference for the math involved > can be found at > > http://www.cs.washington.edu/homes/lazowska/qsp/Images/Chap_03.pdf > > That is the intent behind these counters, and what will "work for me" is > a kernel that captures the raw counters correctly. If forcing > ios_in_flight to be non-negative is done at the expense of losing or > gaining ticks in the wait or service time accumulators, then it will not > work for me. Well something is deaccounting uncorrectly (doh), probably the disk/partition accounting logic is doing wrong in some condition, Jens? void req_merged_io(struct request *req) { struct hd_struct *hd1, *hd2; locate_hd_struct(req, &hd1, &hd2); if (hd1) down_ios(hd1); if (hd2) down_ios(hd2); } void req_finished_io(struct request *req) { struct hd_struct *hd1, *hd2; locate_hd_struct(req, &hd1, &hd2); if (hd1) account_io_end(hd1, req); if (hd2) account_io_end(hd2, req); } We could eliminate that possibility if you ran your tests with a single non-partitioned disk, but thats just a guess. Jens has more of a clue than I certainly.