From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: [PATCH] Btrfs: fix submit_worker congestion
Date: Tue, 29 Nov 2011 16:47:07 -0500
Message-ID: <20111129214707.GT24338@shiny>
References: <1322599256-15621-1-git-send-email-sensille@gmx.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-btrfs@vger.kernel.org
To: Arne Jansen
Return-path:
In-Reply-To: <1322599256-15621-1-git-send-email-sensille@gmx.net>
List-ID:

On Tue, Nov 29, 2011 at 09:40:56PM +0100, Arne Jansen wrote:
> Write bios are submitted from the submit_worker. The worker pumps down
> bios into the block layer until it signals a congestion. At least this
> is the theory. In practice submit_bio just blocks before any signalling
> happens. As the bios are queued per device, this can lead to a situation
> where only one device is served until all bios are submitted, and only
> then the next device is served. This is obviously suboptimal.
> This patch just throws out the congestion detection and reschedules the
> worker every 8 requests. This way, all devices can be kept busy.
> This is only a temporary fix until the block layer provides a non-blocking
> submit_bio. Then the whole submit_worker mechanism can be killed.

The problem with the every 8 requests logic is that we've still got a
pretty good chance of getting stuck behind get_request_wait.

The way the elevator batching works is that it should give us a batch of
requests, and once that batch is done we wait. If we jump around every 8
requests, we've turned this:

[ dev A bio 1-8, dev A bio 8-16, dev A bio 16-32, dev B bio 1-8, dev B ... ]

into:

[ dev A bio 1-8, dev B bio 1-8, dev A bio 8-16, dev B bio 8-16 ]

They look like the same IO, but if we wait for a request when we do
(dev B bio 1-8) then our dev A bio 1-8 batch is likely to dispatch
without all the other dev A bios we had queued.

As you said in IRC, we'd be better off with one thread per device or (my
preference) with a real non-blocking submit_bio.
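
To make the two orderings concrete, here is a rough userspace sketch in
Python. It is only a toy model of the submission order, not kernel code
and not the patch itself: the function names (drain_per_device,
round_robin, runs) and the 16-bios-per-device workload are made up for
illustration, and nothing here models get_request_wait or the elevator.

```python
from collections import deque


def drain_per_device(queues):
    """Submit every queued bio for one device before moving to the next."""
    order = []
    for dev, bios in queues.items():
        order.extend((dev, bio) for bio in bios)
    return order


def round_robin(queues, batch=8):
    """Submit up to `batch` bios for a device, then switch to the next one."""
    pending = {dev: deque(bios) for dev, bios in queues.items()}
    order = []
    while any(pending.values()):
        for dev, q in pending.items():
            for _ in range(min(batch, len(q))):
                order.append((dev, q.popleft()))
    return order


def runs(order):
    """Collapse a submission order into (device, run_length) pairs."""
    out = []
    for dev, _ in order:
        if out and out[-1][0] == dev:
            out[-1] = (dev, out[-1][1] + 1)
        else:
            out.append((dev, 1))
    return out


# Hypothetical workload: 16 bios queued for each of two devices.
queues = {"A": list(range(1, 17)), "B": list(range(1, 17))}

print(runs(drain_per_device(queues)))  # one long run per device
print(runs(round_robin(queues)))       # runs of 8, alternating devices
```

Both orderings submit the same set of bios. The only difference is how
long each contiguous run per device is, which is the part that matters
if the submitter blocks between runs.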

What kind of results did you get with your test from bumping the
nr_requests?

-chris