From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751709AbcA2A2B (ORCPT ); Thu, 28 Jan 2016 19:28:01 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:40813 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751193AbcA2A16 (ORCPT ); Thu, 28 Jan 2016 19:27:58 -0500
Subject: Re: Regression: 4.5-rc1 (bisect: hugetlb: make mm and fs code explicitly non-modular vs CONFIG_TIMER_STATS)
To: Paul Gortmaker
References: <56A9DC76.2030502@de.ibm.com> <059e01d159af$e4317390$ac945ab0$@alibaba-inc.com> <56A9E404.7000409@de.ibm.com> <20160128143723.GN8889@windriver.com> <56AA2E44.4070709@oracle.com> <56AA939D.6080104@oracle.com> <20160128225955.GR8889@windriver.com>
Cc: Christian Borntraeger , Hillf Danton , "'Andrew Morton'" , "'Nadia Yvette Chambers'" , "'Alexander Viro'" , "'Naoya Horiguchi'" , "'David Rientjes'" , "'Davidlohr Bueso'" , "'Linux Kernel Mailing List'"
From: Mike Kravetz
Message-ID: <56AAB1F9.7090400@oracle.com>
Date: Thu, 28 Jan 2016 16:27:37 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160128225955.GR8889@windriver.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Source-IP: userv0021.oracle.com [156.151.31.71]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/28/2016 02:59 PM, Paul Gortmaker wrote:
> [Re: Regression: 4.5-rc1 (bisect: hugetlb: make mm and fs code explicitly non-modular vs CONFIG_TIMER_STATS)] On 28/01/2016 (Thu 14:18) Mike Kravetz wrote:
>
>> On 01/28/2016 07:05 AM, Mike Kravetz wrote:
>>> On 01/28/2016 06:37 AM, Paul Gortmaker wrote:
>>>> [Re: Regression: 4.5-rc1 (bisect: hugetlb: make mm and fs code explicitly non-modular vs CONFIG_TIMER_STATS)] On 28/01/2016 (Thu 10:48) Christian Borntraeger wrote:
>>>>
>>>>> On 01/28/2016 10:40 AM, Hillf Danton wrote:
>>>>>>>
>>>>>>> Paul,
>>>>>>>
>>>>>>> the commit 3e89e1c5ea842 ("hugetlb: make mm and fs code explicitly non-modular")
>>>>>>> triggers the warning/oops below, if CONFIG_TIMER_STATS is set.
>>>>>>>
>>>>>>> Looking at the patch the only "real" change is the init_call,
>>>>>>> and indeed
>>>>>>> --- a/mm/hugetlb.c
>>>>>>> +++ b/mm/hugetlb.c
>>>>>>> @@ -2653,7 +2653,7 @@ static int __init hugetlb_init(void)
>>>>>>>                 mutex_init(&hugetlb_fault_mutex_table[i]);
>>>>>>>         return 0;
>>>>>>>  }
>>>>>>> -subsys_initcall(hugetlb_init);
>>>>>>> +device_initcall(hugetlb_init);
>>>>>>>
>>>>>>>  /* Should be called on processing a hugepagesz=... option */
>>>>>>>  void __init hugetlb_add_hstate(unsigned int order)
>>>>>>>
>>>>>>> makes the problem go away.
>>>>>>
>>>>>> Helps more if a patch is delivered.
>>>>>
>>>>> The problem is that the original change was intentional. So I do not know
>>>>> what the right fix is.
>>>>
>>>> Thanks for the report; let me see if I can work out what TIMER_STATS
>>>> is doing to cause this sometime today.
>>>>
>>>
>>> Hmmm? CONFIG_TIMER_STATS is set in my config and I am not seeing the
>>> issue. Not sure, but it looks like Christian is building/running on
>>> s390. This 'might' be a contributing factor.
>>
>> I do not see how CONFIG_TIMER_STATS contributes to this issue. However,
>
> I looked at all the TIMER_STATS ifdef blocks and was also thinking the
> same thing. If it did toggle the problem then it was a red herring.
> My test config had this set and I retested x86-64 today with it set.
>
>> on s390 numa nodes are initialized at device_initcall in the appropriately
>> named routine numa_init_late(). hugetlb_init must be done after numa
>> initialization. So, I suggest we just move the hugetlb initialization
>> back to device_initcall. What do you think Paul? Patch below.
>
> We could, but that ignores the fact that the original priorities worked
> by chance and not by design, as my commit log indicates.
> Instead, I'd like to know why S390 does core NUMA operations as late as
> device_initcall. Setting up NUMA nodes should be arch_initcall or
> subsys_initcall, or earlier --- it should not be device_initcall as if
> it was some leaf node UART driver or ethernet driver. There is no
> endpoint "device" in NUMA in this context.

This is in linux-next after 4.5-rc1

commit 2d0f76a6ca1f2cdcffca7ce130f67ec61caa0999
Author: Michael Holzheu
Date:   Wed Jan 20 19:22:16 2016 +0100

    s390/numa: move numa_init_late() from device to arch_initcall

    Commit 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular")
    moves hugetlb_init() from module_init to subsys_initcall.

    The hugetlb_init()->hugetlb_register_node() code accesses "node->dev.kobj"
    which is initialized in numa_init_late().

    Since numa_init_late() is a device_initcall which is called *after*
    subsys_initcall the above mentioned patch breaks NUMA on s390.

    So fix this and move numa_init_late() to arch_initcall.

    Fixes: 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular")
    Reviewed-by: Heiko Carstens
    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

-- 
Mike Kravetz
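
For anyone following the ordering argument in the commit message above, here is a rough userspace sketch of it. It is not kernel code: boot(), boot_with_numa_at(), node_kobj_ready and hugetlb_ok are made-up names for illustration, and the two init functions are empty stand-ins that only record whether the node was set up before hugetlb_init() ran. The only thing taken from the kernel is the relative order of the initcall levels (arch before subsys before device).

```c
#include <assert.h>
#include <stddef.h>

/* Relative order of three of the kernel's initcall levels. */
enum level { ARCH = 3, SUBSYS = 4, DEVICE = 6 };

/* Hypothetical stand-in for "node->dev.kobj has been initialized". */
static int node_kobj_ready;
/* Hypothetical flag: did hugetlb_init() see an initialized node? */
static int hugetlb_ok;

/* Stand-ins for the real functions; they only model the dependency. */
static void numa_init_late(void) { node_kobj_ready = 1; }
static void hugetlb_init(void)   { hugetlb_ok = node_kobj_ready; }

struct initcall {
	enum level level;
	void (*fn)(void);
};

/* Run every registered call, lowest level first, as boot would. */
static int boot(const struct initcall *calls, int n)
{
	assert(calls != NULL && n > 0);
	node_kobj_ready = 0;
	hugetlb_ok = 0;
	for (int lvl = 0; lvl <= 7; lvl++)
		for (int i = 0; i < n; i++)
			if ((int)calls[i].level == lvl)
				calls[i].fn();
	return hugetlb_ok;
}

/* hugetlb_init() stays at subsys level; only the NUMA level varies. */
int boot_with_numa_at(enum level numa_level)
{
	struct initcall calls[] = {
		{ SUBSYS, hugetlb_init },
		{ numa_level, numa_init_late },
	};
	return boot(calls, 2);
}
```

With these stand-ins, boot_with_numa_at(DEVICE) returns 0 (hugetlb_init() runs before the node is ready, the s390 breakage), while boot_with_numa_at(ARCH) returns 1, which is the ordering the s390 fix establishes.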