Message-ID: <70cdb532-4477-459c-8762-638ceedae043@linux.dev>
Date: Fri, 15 Aug 2025 13:17:39 -0700
X-Mailing-List: bpf@vger.kernel.org
Subject: Re: [PATCH bpf] bpf/selftests: fix test_tcpnotify_user
To: Stanislav Fomichev , Matt Bobrowski
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, eddyz87@gmail.com, mykolal@fb.com
From: Martin KaFai Lau

On 8/15/25 8:44 AM, Stanislav Fomichev wrote:
> On 08/15, Matt Bobrowski wrote:
>> Based on a bisect, it appears that commit 7ee988770326 ("timers:
>> Implement the hierarchical pull model") has somehow inadvertently
>> broken BPF selftest test_tcpnotify_user. The error that is being
>> generated by this test is as follows:
>>
>> FAILED: Wrong stats Expected 10 calls, got 8
>>
>> It looks like the change allows timer functions to be run on CPUs
>> different from the one they are armed on. The test had pinned itself
>> to CPU 0, and in the past the retransmit attempts also occurred on CPU
>> 0. The test had set the max_entries attribute for
>> BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling
>> bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was
>> likely to be in range. With the change to allow timers to run on other
>> CPUs, the current CPU tasked with performing the retransmit might be
>> bumped and in turn fall out of range, as the event will be filtered
>> out via __bpf_perf_event_output() using:
>>
>>     if (unlikely(index >= array->map.max_entries))
>>         return -E2BIG;
>
> [..]
>
>> A possible change would be to explicitly set the max_entries attribute
>> for perf_event_map in test_tcpnotify_kern.c to a value that's at least
>> as large as the number of CPUs. As it turns out however, if the field
>> is left unset, then the BPF selftest library will determine the number
>> of CPUs available on the underlying system and update the max_entries
>> attribute accordingly.
>
> nit: the max_entries is set by libbpf in map_set_def_max_entries. 'BPF
> selftest library' seems a bit vague. But not a reason for respin.

Fixed the commit message. Thanks. Applied.

>
>> A further problem with the test is that it has a thread that continues
>> running up until the program exits. The main thread cleans up some
>> LIBBPF data structures, while the other thread continues to use them,
>> which inevitably will trigger a SIGSEGV. This can be dealt with by
>> telling the thread to run for as long as necessary and doing a
>> pthread_join on it before exiting the program.

Some of the "goto err" paths seem to have a similar problem, but that is
ok-ish as long as iptables runs fine. I didn't look into why the test
needs to start a thread at all, so I left it as is.

The CI is not running this test, and the test is rotting overall. It
should be moved to test_progs, probably as a subtest of one of the
existing sockops tests in test_progs.
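
For readers following the max_entries discussion, here is a minimal
userspace sketch of the bounds check quoted above. It is not kernel or
selftest code: sim_perf_event_output() and sim_count_delivered() are
hypothetical names, and the CPU placement in the usage note is made up
purely to show how an out-of-range CPU index drops events.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check quoted from
 * __bpf_perf_event_output(): with BPF_F_CURRENT_CPU the map index is
 * the CPU the program runs on, and an index at or beyond max_entries
 * is rejected with -E2BIG.
 */
int sim_perf_event_output(uint32_t cpu, uint32_t max_entries)
{
	if (cpu >= max_entries)
		return -E2BIG;
	return 0;
}

/*
 * Count how many of n events would pass the check, where cpus[i] is the
 * (assumed) CPU on which event i is raised.
 */
size_t sim_count_delivered(const uint32_t *cpus, size_t n,
			   uint32_t max_entries)
{
	size_t delivered = 0;

	assert(cpus != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (sim_perf_event_output(cpus[i], max_entries) == 0)
			delivered++;
	}
	return delivered;
}
```

With max_entries set to 2 and two of ten events raised on, say, CPU 5,
sim_count_delivered() returns 8. With max_entries at least as large as
the number of CPUs, all ten pass, which is why leaving the field unset
(so that libbpf sizes the map) avoids the drop.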