From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-186.mta1.migadu.com (out-186.mta1.migadu.com [95.215.58.186]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF563233D92 for ; Fri, 8 Nov 2024 21:57:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.186 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731103038; cv=none; b=aQxJ1PvqIAE2cNH5cDK/R/HTO5oYhAr9u2u60j31fk1sVQAuURy6qMId0UV7QKsKm/2sAEb8KAgf6H9F2CILOojIBGiC+0CNvDNtcKfq7g1Jqp408ykJ4kqAn8KkJLHYe+hVW9bel9EjTzJTjSZCtV8ySu1D32QdjzRyN8tSc+A= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731103038; c=relaxed/simple; bh=uk6zwBUdCoWKzaxIMUe1UZubqlKY4Q1KCB1VREdP3j0=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=oVz5SC32OXFt1BAyTx6xNgWAXqvVXAhGfne4FLOqyxJCsd/WiqEC6m/eP+j1kCahlcIlWQebYb7hE2TxgxvRCDNfASk0NKkyl6Sdd9dBa65Bt6UwTZlPZCe61iekz6fAYErHvrrN0yTk5rLZfOuQ0KN+obje6h9szIbAEQkbRqg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=iS+Fj0E+; arc=none smtp.client-ip=95.215.58.186 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="iS+Fj0E+" Message-ID: <0f763686-2b79-40ee-b8c2-0fcfe78d7520@linux.dev> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1731103033; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GO7O9WBWpFbJyYcBW2Al7NQap8WX4yTGXeve1S0eneI=; b=iS+Fj0E+GNmAnSa096JFp6Zt9Sk+23o91W4Hj42xFwuQNztP+uV9gosdXDcH5CgErHCgdA ZCU05zq7BtwS0rB4oymcku7BLzYhNZ1UEWoZqUzRMctXvnl3eXMmJlSWjVlTmDa9jfE4qr YnlouZoRmk7emg4uqUuxVFikra7ddQs= Date: Fri, 8 Nov 2024 13:57:04 -0800 Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [PATCH bpf-next v1 0/2] Handle possible NULL trusted raw_tp arguments Content-Language: en-GB To: Alexei Starovoitov , Eduard Zingerman Cc: Puranjay Mohan , Kumar Kartikeya Dwivedi , bpf , kkd@meta.com, Alexei Starovoitov , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Steven Rostedt , Jiri Olsa , Juri Lelli , Kernel Team References: <20241101000017.3424165-1-memxor@gmail.com> <57dfdda6a89819b65be8960c3c6953bb9b8ceed3.camel@gmail.com> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yonghong Song In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT On 11/8/24 12:13 PM, Alexei Starovoitov wrote: > On Thu, Nov 7, 2024 at 9:08 PM Eduard Zingerman wrote: >> On Fri, 2024-11-01 at 17:32 -0700, Eduard Zingerman wrote: >>> On Fri, 2024-11-01 at 17:29 -0700, Alexei Starovoitov wrote: >>> >>> [...] >>> >>>> Hmm. >>>> Puranjay touched it last with extra logic. >>>> >>>> And before that David Vernet tried to address flakiness >>>> in commit 4a54de65964d. >>>> Yonghong also noticed lockups in paravirt >>>> and added workaround 7015843afc. >>>> >>>> Your additional timeout/workaround makes sense to me, >>>> but would be good to bisect whether Puranjay's change caused it. >>> I'll debug what's going on some time later today or on Sat. >> I finally had time to investigate this a bit. >> First, here is how to trigger lockup: >> >> t1=send_signal/send_signal_perf_thread_remote; \ >> t2=send_signal/send_signal_nmi_thread_remote; \ >> for i in $(seq 1 100); do ./test_progs -t $t1,$t2; done >> >> Must be both tests for whatever reason. >> The failing test is 'send_signal_nmi_thread_remote'. >> >> The test is organized as parent and child processes communicating >> various events to each other. The intended sequence of events: >> - child: >> - install SIGUSR1 handler >> - notify parent >> - wait for parent >> - parent: >> - open PERF_COUNT_SW_CPU_CLOCK event >> - attach BPF program to the event >> - notify child >> - enter busy loop for 10^8 iterations >> - wait for child >> - BPF program: >> - send SIGUSR1 to child >> - child: >> - poll for SIGUSR1 in a busy loop >> - notify parent >> - parent: >> - check value communicated by child, >> terminate test. >> >> The lockup happens because on every other test run perf event is not >> triggered, child does not receive SIGUSR1 and thus both parent and >> child are stuck. >> >> For 'send_signal_nmi_thread_remote' perf event is defined as: >> >> struct perf_event_attr attr = { >> .sample_period = 1, >> .type = PERF_TYPE_HARDWARE, >> .config = PERF_COUNT_HW_CPU_CYCLES, >> }; >> >> And is opened for parent process pid. >> >> Apparently, the perf event is not always triggered between lines >> send_signal.c:165-180. And at line 180 parent enters system call, >> so cpu cycles stop ticking for 'parent', thus if perf event >> had not been triggered already it won't be triggered at all >> (as far as I understand). >> >> Applying same fix as Yonghong did in 7015843afc is sufficient to >> reliably trigger perf event: >> >> --- a/tools/testing/selftests/bpf/prog_tests/send_signal.c >> +++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c >> @@ -223,7 +223,8 @@ static void test_send_signal_perf(bool signal_thread, bool remote) >> static void test_send_signal_nmi(bool signal_thread, bool remote) >> { >> struct perf_event_attr attr = { >> - .sample_period = 1, >> + .freq = 1, >> + .sample_freq = 1000, >> .type = PERF_TYPE_HARDWARE, >> .config = PERF_COUNT_HW_CPU_CYCLES, >> }; >> >> But I don't understand why. >> As far as I can figure from kernel source code, >> sample_period is measured in nanoseconds (is it?), > I believe sample_period is a number of samples. > 1 means that perf suppose to generate event very often. > It means nanoseconds only for SW cpu_cycles. > > let's apply above workaround and move on. Pls send a patch. sample_period = 1 intends to make sure we can get at one sample since many samples will be generated. But too many samples may cause *rare* issues in internal kernel logic in certain *corner* cases. Agree with Alexei, let us just use a reasonable sample_freq to fix the issue. Thanks!