Message-ID: <49D45364.2010703@kernel.org>
Date: Thu, 02 Apr 2009 14:55:48 +0900
From: Tejun Heo
To: David Miller
CC: linux-kernel@vger.kernel.org
Subject: Re: More problems in setup_pcpu_remap()
References: <20090401.213112.96144152.davem@davemloft.net> <49D44251.4000606@kernel.org> <20090401.215233.134215150.davem@davemloft.net>
In-Reply-To: <20090401.215233.134215150.davem@davemloft.net>

Hello, David.

David Miller wrote:
>> I guess we'll have to put a cap on how high possible cpus can be for
>> remap allocator.  e.g. if single chunk size is over 20% of the whole
>> vmalloc area, don't use remap.  Does anyone have a good random %
>> number in mind?
>
> I would suggest instead to rethink what this code is doing.

Actually, I've been looking at the numbers and I'm not sure the
concern is valid.  On x86_32, the practical maximum number of
processors is around 16, so a single chunk ends up at 32M.  That
isn't nice, and it would probably be a good idea to introduce a
parameter to select which allocator to use, but it's still far from
consuming the whole VM area.
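
As a rough check of these figures, here is a minimal sketch of the arithmetic.  It assumes one 2MB unit per possible CPU (the unit size debated in this thread), so a single chunk is simply the number of possible CPUs times 2MB.  The function names are made up for illustration and are not kernel interfaces; 2^45 bytes is the x86_64 vmalloc area size quoted in this message.

```python
MIB = 1 << 20  # bytes in one mebibyte


def chunk_bytes(possible_cpus, unit_bytes=2 * MIB):
    """Size of one chunk, assuming one unit per possible CPU."""
    return possible_cpus * unit_bytes


def vmalloc_share(possible_cpus, vmalloc_bytes, unit_bytes=2 * MIB):
    """Fraction of the vmalloc area occupied by a single chunk."""
    return chunk_bytes(possible_cpus, unit_bytes) / vmalloc_bytes


# x86_32 with about 16 possible CPUs: chunk size in MiB
print(chunk_bytes(16) // MIB)                  # 32

# x86_64 with 4096 possible CPUs and a 2^45-byte vmalloc area
print(f"{vmalloc_share(4096, 1 << 45):.4%}")   # 0.0244%
```

Under the same assumptions, a second chunk from a dynamic allocation would double the x86_64 share to roughly 0.05%.
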
On x86_64, the vmalloc area is obscenely large at 2^45 bytes, i.e. 32
terabytes.  Even with 4096 processors, a single chunk is a measly
0.02% of it.  If it's a problem for other archs or extreme x86_32
configurations, we can add some safety measures, but in general I
don't think it is a problem.

> It would make more sense to carve up 2MB chunks into some-power-of-2
> pieces and use that as the unit size.
>
> You could retain the NUMA goals of this function, as well as the
> ability to be using 2MB pages in the TLBs.

Can you please elaborate a bit?

> And consider that if the dynamic allocation part of this code triggers
> even once, you'll end up eating twice as much VMALLOC space.
>
> Using 2MB per cpu is just ridiculous, and really not even necessary.

The focus at the moment is on using a large page for the first chunk
to reduce pressure on the TLB, not necessarily on actually using 2MB
for each unit.

Thanks.

-- 
tejun