Date: Wed, 18 Dec 2024 09:08:58 -0800
From: Joe Damato
To: Alex Lazar
Cc: "aleksander.lobakin@intel.com", "almasrymina@google.com",
 "amritha.nambiar@intel.com", "bigeasy@linutronix.de",
 "bjorn@rivosinc.com", "corbet@lwn.net", Dan Jurgens,
 "davem@davemloft.net", "donald.hunter@gmail.com", "dsahern@kernel.org",
 "edumazet@google.com", "hawk@kernel.org", "jiri@resnulli.us",
 "johannes.berg@intel.com", "kuba@kernel.org", "leitao@debian.org",
 "leon@kernel.org", "linux-doc@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "linux-rdma@vger.kernel.org",
 "lorenzo@kernel.org", "michael.chan@broadcom.com",
 "mkarsten@uwaterloo.ca", "netdev@vger.kernel.org", "pabeni@redhat.com",
 Saeed Mahameed, "sdf@fomichev.me", "skhawaja@google.com",
 "sridhar.samudrala@intel.com", Tariq Toukan,
 "willemdebruijn.kernel@gmail.com", "xuanzhuo@linux.alibaba.com",
 Gal Pressman, Nimrod Oren, Dror Tennenbaum, Dragos Tatulea
Subject: Re: [net-next v6 0/9] Add support for per-NAPI config via netlink
X-Mailing-List: netdev@vger.kernel.org

On Wed, Dec 18, 2024 at 11:22:33AM +0000, Alex Lazar wrote:
> Hi Joe and all,
>
> I am part of the NVIDIA Eth drivers team, and we are experiencing a problem,
> bisected to this change: commit 86e25f40aa1e ("net: napi: Add napi_config")
>
> The issue occurs when sending packets from one machine to another.
> On the receiver side, we have XSK (XDPsock) that receives the packet and sends it
> back to the sender.
> At some point, one packet (packet A) gets "stuck," and if we send a new packet
> (packet B), it "pushes" the previous one. Packet A is then processed by the NAPI
> poll, and packet B gets stuck, and so on.
>
> Your change involves moving napi_hash_del() and napi_hash_add() from
> netif_napi_del() and netif_napi_add_weight() to napi_enable() and napi_disable().
> If I move them back to netif_napi_del() and netif_napi_add_weight(),
> the issue is resolved (I moved the entire if/else block, not just the napi_hash_del/add).
>
> This issue occurs with both the new and old APIs (netif_napi_add/_config).
> Moving the napi_hash_add() and napi_hash_del() functions resolves it for both.
> I am debugging this, no breakthrough so far.
>
> I would appreciate if you could look into this.
> We can provide more details per request.

I appreciate your report, but there is not a lot in your message to help
debug the issue.

Can you please:

1.) Verify that the kernel tree you are testing on has commit
    cecc1555a8c2 ("net: Make napi_hash_lock irq safe") included? If it
    does not, can you pull in that commit and re-run your test and
    report back if that fixes your problem?

2.)
    If (1) does not fix your problem, can you please reply with at least
    the following information:

      - Specify what device this is happening on (in case I have access
        to one)
      - Which driver is affected
      - Which upstream kernel SHA you are building your test kernel from
      - The reproducer program(s) with clear instructions on how exactly
        to run it/them in order to reproduce the issue

Thanks,
Joe
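
For step (1) above, one possible way to check whether a tree already contains commit cecc1555a8c2 is an ancestry query with `git merge-base --is-ancestor`. The sketch below is only an illustration and not part of the original thread: the helper names `ancestry_from_exit_code` and `tree_has_commit` are hypothetical, and the check assumes the fix was merged with its upstream SHA. A cherry-picked or backported copy has a different SHA and would not be detected this way.

```python
import subprocess


def ancestry_from_exit_code(returncode: int) -> bool:
    """Interpret the exit status of `git merge-base --is-ancestor`.

    git exits with 0 when the first commit is an ancestor of the second,
    1 when it is not, and another status on errors (for example an
    unknown SHA or a path that is not a repository).
    """
    if returncode == 0:
        return True
    if returncode == 1:
        return False
    raise RuntimeError(f"git could not answer (exit status {returncode})")


def tree_has_commit(repo_path: str, commit: str, ref: str = "HEAD") -> bool:
    """Return True if `commit` is reachable from `ref` in the repo at `repo_path`.

    Hypothetical helper: it only wraps the git command line, so git must
    be installed and `repo_path` must point at a kernel checkout.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "merge-base", "--is-ancestor", commit, ref],
        capture_output=True,
        text=True,
    )
    return ancestry_from_exit_code(result.returncode)


# Example call (the path is an assumption, adjust it to your checkout):
#   tree_has_commit("/path/to/linux", "cecc1555a8c2")
```

If the call returns False, the commit can be pulled in before re-running the test. If it raises, the SHA is probably unknown to that checkout and the upstream tree needs to be fetched first.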