From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Prantis, Kelsey"
Subject: Re: "No such device" error when mounting immediately after formatting
Date: Mon, 7 Oct 2013 19:41:37 +0000
Message-ID:
References: <20130912075818.GB1965@stefanha-thinkpad.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
Cc: "kvm@vger.kernel.org", "Murrell, Brian"
To: Stefan Hajnoczi
Return-path:
Received: from mga02.intel.com ([134.134.136.20]:39352 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751475Ab3JGTlk convert rfc822-to-8bit (ORCPT ); Mon, 7 Oct 2013 15:41:40 -0400
In-Reply-To: <20130912075818.GB1965@stefanha-thinkpad.redhat.com>
Content-Language: en-US
Content-ID: <871C075E8946B543A358674CF2C21DE8@intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

>I wonder if the output of "udevadm monitor" during the mkfs and mount
>steps shows devices appearing/disappearing? That might explain a race
>condition.

Sorry for the long delay in responding, but the output of "udevadm monitor" gave me a new lead that led to solving the problem (discussed below).

>At which point in the boot process does your script run?

Our script does not run as part of the boot process. It simply formats and mounts the devices for writing, repeatedly, well after boot has finished.

The Solution:
------------------------
The key piece of information we were missing before is that the formatting and mounting were happening in parallel for multiple devices attached to the node. The "udevadm monitor" output showed that a module had to be loaded in response to the mount command, and I wondered whether there could be a race between two parallel mount commands that both ask for the same module to be loaded.
It turns out that this is a known kernel bug, fixed in kernel 3.7.0, and it has nothing to do with kvm:

- Original ticket here: https://bugzilla.redhat.com/show_bug.cgi?id=771285
- Patch submitted here: http://thread.gmane.org/gmane.linux.kernel/1358707/focus=1358709

I've filed a ticket with Red Hat requesting that the fix be backported to the RHEL6 kernel: https://bugzilla.redhat.com/show_bug.cgi?id=1009704

Until then, the simplest workaround I found is to explicitly load the module (e.g. "modprobe ext4") before beginning the formatting and mounting process.

Sorry to bug you guys with an unrelated issue, but hopefully this explanation can help anyone else who stumbles into the problem.

Regards,
Kelsey
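
For anyone who wants to try the workaround, here is a minimal sketch of the idea: load the filesystem module once, up front, and only then start the parallel format-and-mount jobs. It is not our actual script. The function names, the RUN and FSTYPE variables, the /mnt/<device name> mount-point layout, and the example device paths are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: load the filesystem module once, then format and mount
# several devices in parallel. Names and paths here are hypothetical.

FSTYPE="${FSTYPE:-ext4}"   # filesystem (and module) to use
RUN="${RUN:-}"             # set RUN=echo to print commands instead of running them

# Format one device and mount it on the given mount point.
format_and_mount() {
    dev="$1"
    mnt="$2"
    $RUN mkfs -t "$FSTYPE" "$dev" &&
    $RUN mkdir -p "$mnt" &&
    $RUN mount -t "$FSTYPE" "$dev" "$mnt"
}

# Load the module first, then process every device in the background.
prepare_all() {
    # Loading the module here means no two mounts race to load it later.
    $RUN modprobe "$FSTYPE" || return 1

    pids=""
    for dev in "$@"; do
        format_and_mount "$dev" "/mnt/$(basename "$dev")" &
        pids="$pids $!"
    done

    # Wait for every background job and report failure if any of them failed.
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Example (as root, on devices you are willing to erase):
#   prepare_all /dev/vdb /dev/vdc
```

Running it with RUN=echo first prints the commands without touching any device, which is a cheap way to check the ordering: the modprobe line always comes before any mkfs or mount line.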