From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751539Ab2GSNdu (ORCPT ); Thu, 19 Jul 2012 09:33:50 -0400
Received: from mx1.redhat.com ([209.132.183.28]:55697 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750852Ab2GSNdr (ORCPT ); Thu, 19 Jul 2012 09:33:47 -0400
From: Jeff Moyer
To: Mikulas Patocka
Cc: Jan Kara , Alexander Viro , Jens Axboe , "Alasdair G. Kergon" ,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	dm-devel@redhat.com, lwoodman@redhat.com, Andrea Arcangeli ,
	kosaki.motohiro@jp.fujitsu.com
Subject: Re: Crash when IO is being submitted and block size is changed
References: <20120628111541.GB17515@quack.suse.cz>
X-PGP-KeyID: 1F78E1B4
X-PGP-CertKey: F6FE 280D 8293 F72C 65FD 5A58 1FF8 A7CA 1F78 E1B4
X-PCLoadLetter: What the f**k does that mean?
Date: Thu, 19 Jul 2012 09:33:11 -0400
In-Reply-To: (Mikulas Patocka's message of "Wed, 18 Jul 2012 22:27:13 -0400 (EDT)")
Message-ID:
User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Mikulas Patocka writes:

> On Tue, 17 Jul 2012, Jeff Moyer wrote:
>
>> > This is the patch that fixes this crash: it takes a rw-semaphore around
>> > all direct-IO path.
>> >
>> > (note that if someone is concerned about performance, the rw-semaphore
>> > could be made per-cpu --- take it for read on the current CPU and take it
>> > for write on all CPUs).
>>
>> Here we go again. :-) I believe we had at one point tried taking a rw
>> semaphore around GUP inside of the direct I/O code path to fix the fork
>> vs. GUP race (that still exists today). When testing that, the overhead
>> of the semaphore was *way* too high to be considered an acceptable
>> solution. I've CC'd Larry Woodman, Andrea, and Kosaki Motohiro who all
>> worked on that particular bug. Hopefully they can give better
>> quantification of the slowdown than my poor memory.
>>
>> Cheers,
>> Jeff
>
> Both down_read and up_read together take 82 ticks on Core2, 69 ticks on
> AMD K10, 62 ticks on UltraSparc2 if the target is in L1 cache. So, if
> percpu rw_semaphores were used, it would slow down only by this amount.

Sorry, I'm not familiar with per-cpu rw semaphores. Where are they
implemented?

> I hope that Linux developers are not so obsessed with performance that
> they want a fast crashing kernel rather than a slow reliable kernel.
> Note that anything that changes a device block size (for example
> mounting a filesystem with non-default block size) may trigger a crash
> if lvm or udev reads the device simultaneously; the crash really
> happened in business environment).

I wasn't suggesting that we leave the problem unfixed (though I can see
how you might have gotten that idea, sorry for not being more clear). I
was merely suggesting that we should try to fix the problem in a way
that does not kill performance.

Cheers,
Jeff