From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751721AbdJES0L (ORCPT ); Thu, 5 Oct 2017 14:26:11 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:39448 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751494AbdJES0K (ORCPT ); Thu, 5 Oct 2017 14:26:10 -0400 References: <20171005000430.8080-1-bauerman@linux.vnet.ibm.com> From: Thiago Jung Bauermann To: Thomas Gleixner Cc: Daniel Black , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] powerpc: Drop lockdep_assert_cpus_held call from arch_update_cpu_topology In-reply-to: Date: Thu, 05 Oct 2017 15:26:01 -0300 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 x-cbid: 17100518-0056-0000-0000-000003D46A65 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007847; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000235; SDB=6.00926908; UDB=6.00466352; IPR=6.00707160; BA=6.00005623; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017409; XFM=3.00000015; UTC=2017-10-05 18:26:07 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17100518-0057-0000-0000-0000080B7B0F Message-Id: <87o9pl5u4m.fsf@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-10-05_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1710050254 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello Thomas, Thanks for your comments. Thomas Gleixner writes: > On Wed, 4 Oct 2017, Thiago Jung Bauermann wrote: > >> It turns out that not all paths calling arch_update_cpu_topology hold >> cpu_hotplug_lock, but that's ok because those paths aren't supposed to race >> with any concurrent hotplug events. >> >> Callers of arch_update_cpu_topology are expected to know what they are >> doing when they call the function without holding the lock, so remove the >> lockdep warning. > > "Expected to know what they are doing" is not really a good approach as > it's way too simple to screw things up. I agree. > You might consider to have two functions where one does the check and the > other does not, but I leave that to the PPC maintainers. It doesn't look like powerpc uses arch_update_cpu_topology differently than other arches. Are you saying that all callers of the function should be holding cpu_hotplug_lock? I came to the conclusion that this isn't the case because of two things: 1. This comment in sched_init_smp: /* * There's no userspace yet to cause hotplug operations; hence all the * CPU masks are stable and all blatant races in the below code cannot * happen. */ mutex_lock(&sched_domains_mutex); sched_init_domains(cpu_active_mask); cpumask_andnot(non_isolated_cpus, cpu_possible_mask, cpu_isolated_map); if (cpumask_empty(non_isolated_cpus)) cpumask_set_cpu(smp_processor_id(), non_isolated_cpus); mutex_unlock(&sched_domains_mutex); This is relevant for the following call trace: [c000003ff6107b50] [c00000000010ac70] lockdep_assert_cpus_held+0x50/0x70 (unreliable) [c000003ff6107b70] [c0000000000802c0] arch_update_cpu_topology+0x20/0x40 [c000003ff6107b90] [c000000000182ec4] sched_init_domains+0x74/0x100 [c000003ff6107bd0] [c000000000ed5c78] sched_init_smp+0x58/0x168 [c000003ff6107d00] [c000000000eb460c] kernel_init_freeable+0x1fc/0x3ac [c000003ff6107dc0] [c00000000000dc2c] kernel_init+0x2c/0x150 [c000003ff6107e30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70 2. Your comment on patch "lockup_detector: Cleanup hotplug locking mess"¹: "All watchdog thread related functions are delegated to the smpboot thread infrastructure, which handles serialization against CPU hotplug correctly." Though TBH I'm still grasping the smpboot thread infrastructure and I'm not sure how it handles serialization against CPU hotplug. This is relevant for the following call trace: [c000001ff257ba00] [c0000000000f8618] lockdep_assert_cpus_held+0x48/0x80 (unreliable) [c000001ff257ba20] [c00000000007a848] arch_update_cpu_topology+0x18/0x30 [c000001ff257ba40] [c00000000016bd0c] partition_sched_domains+0x8c/0x4e0 [c000001ff257baf0] [c00000000020a914] cpuset_update_active_cpus+0x24/0x60 [c000001ff257bb10] [c0000000001449bc] sched_cpu_deactivate+0xfc/0x1a0 [c000001ff257bc20] [c0000000000f5a8c] cpuhp_invoke_callback+0x19c/0xe00 [c000001ff257bcb0] [c0000000000f6868] cpuhp_down_callbacks+0x78/0xf0 [c000001ff257bd00] [c0000000000f6c1c] cpuhp_thread_fun+0x1fc/0x210 [c000001ff257bd50] [c000000000132f4c] smpboot_thread_fn+0x2fc/0x370 [c000001ff257bdc0] [c00000000012ca24] kthread+0x1b4/0x1c0 [c000001ff257be30] [c00000000000bcec] ret_from_kernel_thread+0x5c/0x70 >>From what you said though it looks like the lockdep assertion shouldn't have been triggered in either case. I'll keep digging. -- Thiago Jung Bauermann IBM Linux Technology Center ¹ https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-September/163477.html