From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 22 Apr 2025 14:20:51 -0700
X-Mailing-List: netdev@vger.kernel.org
Subject: Re: [PATCH net-next] xdp: create locked/unlocked instances of xdp redirect target setters
To: Joshua Washington
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Mina Almasry,
 Willem de Bruijn, Harshitha Ramamurthy, Jeroen de Borst, Andrew Lunn,
 "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
 John Fastabend, Simon Horman, Praveen Kaligineedi, Shailend Chand,
 Stanislav Fomichev, Martin KaFai Lau, Joe Damato, open list
References: <20250422011643.3509287-1-joshwash@google.com>
From: Martin KaFai Lau
In-Reply-To: <20250422011643.3509287-1-joshwash@google.com>

On 4/21/25 6:16 PM, Joshua Washington wrote:
> Commit 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with
> netdev->lock") introduces the netdev lock to xdp_set_features_flag().
> The change includes a _locked version of the method, as it is possible
> for a driver to have already acquired the netdev lock before calling
> this helper.
> However, the same applies to
> xdp_features_(set|clear)_redirect_flags(), which ends up calling the
> unlocked version of xdp_set_features_flags() leading to deadlocks in
> GVE, which grabs the netdev lock as part of its suspend, reset, and
> shutdown processes:
>
> [ 833.265543] WARNING: possible recursive locking detected
> [ 833.270949] 6.15.0-rc1 #6 Tainted: G E
> [ 833.276271] --------------------------------------------
> [ 833.281681] systemd-shutdow/1 is trying to acquire lock:
> [ 833.287090] ffff949d2b148c68 (&dev->lock){+.+.}-{4:4}, at: xdp_set_features_flag+0x29/0x90
> [ 833.295470]
> [ 833.295470] but task is already holding lock:
> [ 833.301400] ffff949d2b148c68 (&dev->lock){+.+.}-{4:4}, at: gve_shutdown+0x44/0x90 [gve]
> [ 833.309508]
> [ 833.309508] other info that might help us debug this:
> [ 833.316130]  Possible unsafe locking scenario:
> [ 833.316130]
> [ 833.322142]        CPU0
> [ 833.324681]        ----
> [ 833.327220]   lock(&dev->lock);
> [ 833.330455]   lock(&dev->lock);
> [ 833.333689]
> [ 833.333689]  *** DEADLOCK ***
> [ 833.333689]
> [ 833.339701]  May be due to missing lock nesting notation
> [ 833.339701]
> [ 833.346582] 5 locks held by systemd-shutdow/1:
> [ 833.351205]  #0: ffffffffa9c89130 (system_transition_mutex){+.+.}-{4:4}, at: __se_sys_reboot+0xe6/0x210
> [ 833.360695]  #1: ffff93b399e5c1b8 (&dev->mutex){....}-{4:4}, at: device_shutdown+0xb4/0x1f0
> [ 833.369144]  #2: ffff949d19a471b8 (&dev->mutex){....}-{4:4}, at: device_shutdown+0xc2/0x1f0
> [ 833.377603]  #3: ffffffffa9eca050 (rtnl_mutex){+.+.}-{4:4}, at: gve_shutdown+0x33/0x90 [gve]
> [ 833.386138]  #4: ffff949d2b148c68 (&dev->lock){+.+.}-{4:4}, at: gve_shutdown+0x44/0x90 [gve]
>
> Introduce xdp_features_(set|clear)_redirect_target_locked() versions
> which assume that the netdev lock has already been acquired before
> setting the XDP feature flag and update GVE to use the locked version.
>
> Cc: bpf@vger.kernel.org
> Fixes: 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock")
> Tested-by: Mina Almasry
> Reviewed-by: Willem de Bruijn
> Reviewed-by: Harshitha Ramamurthy
> Signed-off-by: Joshua Washington

Acked-by: Martin KaFai Lau
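
For anyone reading this thread later, the locked/unlocked split described in the commit message can be sketched in userspace roughly as follows. This is only an illustration, not the kernel code: struct demo_netdev, the DEMO_* bits and every demo_* function are hypothetical names, and a pthread error-checking mutex stands in for the netdev lock so that a recursive acquisition returns EDEADLK instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's redirect-target feature bits. */
#define DEMO_XDP_ACT_NDO_XMIT    (1u << 0)
#define DEMO_XDP_ACT_NDO_XMIT_SG (1u << 1)

/* Hypothetical userspace model of a net_device and its instance lock. */
struct demo_netdev {
	pthread_mutex_t lock;
	unsigned int xdp_features;
};

/* Use an error-checking mutex: relocking by the owner returns EDEADLK. */
static int demo_netdev_init(struct demo_netdev *dev)
{
	pthread_mutexattr_t attr;
	int err = pthread_mutexattr_init(&attr);

	if (err)
		return err;
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&dev->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	dev->xdp_features = 0;
	return err;
}

/* _locked variants: the caller must already hold dev->lock. */
static void demo_set_redirect_target_locked(struct demo_netdev *dev,
					    bool support_sg)
{
	unsigned int val = dev->xdp_features | DEMO_XDP_ACT_NDO_XMIT;

	if (support_sg)
		val |= DEMO_XDP_ACT_NDO_XMIT_SG;
	dev->xdp_features = val;
}

static void demo_clear_redirect_target_locked(struct demo_netdev *dev)
{
	dev->xdp_features &= ~(DEMO_XDP_ACT_NDO_XMIT | DEMO_XDP_ACT_NDO_XMIT_SG);
}

/* Unlocked variants: take dev->lock, then defer to the _locked variant. */
static int demo_set_redirect_target(struct demo_netdev *dev, bool support_sg)
{
	int err = pthread_mutex_lock(&dev->lock);

	if (err)
		return err;
	demo_set_redirect_target_locked(dev, support_sg);
	pthread_mutex_unlock(&dev->lock);
	return 0;
}

static int demo_clear_redirect_target(struct demo_netdev *dev)
{
	int err = pthread_mutex_lock(&dev->lock);

	if (err)
		return err;
	demo_clear_redirect_target_locked(dev);
	pthread_mutex_unlock(&dev->lock);
	return 0;
}

/* Models a shutdown path that holds dev->lock while clearing the flags. */
static int demo_shutdown(struct demo_netdev *dev, bool use_locked_variant)
{
	int err = pthread_mutex_lock(&dev->lock);

	if (err)
		return err;
	if (use_locked_variant)
		demo_clear_redirect_target_locked(dev);
	else
		err = demo_clear_redirect_target(dev); /* relocks: EDEADLK */
	pthread_mutex_unlock(&dev->lock);
	return err;
}
```

In this model, demo_shutdown(dev, false) mirrors the path in the lockdep report: the unlocked helper tries to take a lock the caller already holds and fails without touching the flags. demo_shutdown(dev, true) mirrors the fix, where the caller that already owns the lock uses the _locked helper.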