Message-ID: <53FE6690.80608@sgi.com>
Date: Wed, 27 Aug 2014 16:15:28 -0700
From: Mike Travis
To: Andrew Morton
CC: mingo@redhat.com, tglx@linutronix.de, hpa@zytor.com, msalter@redhat.com,
 dyoung@redhat.com, riel@redhat.com, peterz@infradead.org, mgorman@suse.de,
 linux-kernel@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 0/2] x86: Speed up ioremap operations
References: <20140827225927.364537333@asylum.americas.sgi.com>
 <20140827160610.4ef142d28fd7f276efd38a51@linux-foundation.org>
In-Reply-To: <20140827160610.4ef142d28fd7f276efd38a51@linux-foundation.org>

On 8/27/2014 4:06 PM, Andrew Morton wrote:
> On Wed, 27 Aug 2014 17:59:27 -0500 Mike Travis wrote:
>
>>
>> We have a large university system in the UK that is experiencing
>> very long delays modprobing the driver for a specific I/O device.
>> The delay is from 8-10 minutes per device and there are 31 devices
>> in the system.  This 4 to 5 hour delay in starting up those I/O
>> devices is very much a burden on the customer.
>
> That's nuts.

Exactly!  The customer was (as expected) not terribly pleased... :)

>
>> There are two causes for requiring a restart/reload of the drivers.
>> First is periodic preventive maintenance (PM) and the second is if
>> any of the devices experience a fatal error.  Both of these trigger
>> this excessively long delay in bringing the system back up to full
>> capability.
>>
>> The problem was tracked down to a very slow IOREMAP operation and
>> the excessively long ioresource lookup to ensure that the user is
>> not attempting to ioremap RAM.  These patches provide a speed up
>> to that function.
>
> With what result?
>

Early measurements on our in-house lab system (with far fewer CPUs
and much less memory) show about a 60-75% increase.  They have 31
devices, 3000+ CPUs, and 10+TB of memory.  We have 20 devices, 480
CPUs, and ~2TB of memory.  I expect their ioresource list to be about
5-10 times longer.  [But their system is in production, so we have to
wait for the next scheduled PM interval before a live test can be
done.]
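
As a quick sanity check on the figures quoted above, here is a minimal sketch of the arithmetic behind the "4 to 5 hour" total. It assumes the 31 devices are brought up one after another at 8-10 minutes each. The function name `total_delay_hours` is made up for this illustration and is not part of the patches.

```python
def total_delay_hours(minutes_per_device: float, devices: int) -> float:
    """Total start-up delay in hours.

    Assumption: devices are probed serially, so the per-device
    delays simply add up.
    """
    return minutes_per_device * devices / 60.0


# Figures from the thread: 8-10 minutes per device, 31 devices.
low = total_delay_hours(8, 31)
high = total_delay_hours(10, 31)
print(f"{low:.1f} to {high:.1f} hours")  # prints "4.1 to 5.2 hours"
```

This is consistent with the roughly 4 to 5 hours reported for the customer system.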