From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by mail19.linbit.com (LINBIT Mail Daemon) with ESMTP id BBAE2420372;
 Fri, 9 Dec 2022 13:44:01 +0100 (CET)
Received: by mail-wm1-f71.google.com with SMTP id ay19-20020a05600c1e1300b003cf758f1617so3885986wmb.5;
 Fri, 09 Dec 2022 04:37:13 -0800 (PST)
From: Paolo Abeni
To: Benjamin Coddington, netdev@vger.kernel.org
Date: Fri, 09 Dec 2022 13:37:08 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Cc: Latchesar Ionkov, samba-technical@lists.samba.org, Dominique Martinet,
 Valentina Manea, linux-nvme@lists.infradead.org, Philipp Reisner,
 David Howells, Joseph Qi, Eric Dumazet, linux-nfs@vger.kernel.org,
 Marc Dionne, Shuah Khan, Christoph Hellwig, Mike Christie,
 drbd-dev@lists.linbit.com, linux-cifs@vger.kernel.org, Sagi Grimberg,
 linux-scsi@vger.kernel.org, Mark Fasheh, linux-afs@lists.infradead.org,
 cluster-devel@redhat.com, Christine Caulfield, Jakub Kicinski,
 Ilya Dryomov, open-iscsi@googlegroups.com, Keith Busch, Anna Schumaker,
 Eric Van Hensbergen, "James E.J. Bottomley", Josef Bacik, David Teigland,
 linux-block@vger.kernel.org, nbd@other.debian.org, Greg Kroah-Hartman,
 Joel Becker, v9fs-developer@lists.sourceforge.net,
 ceph-devel@vger.kernel.org, Xiubo Li, Trond Myklebust, Jens Axboe,
 Chris Leech, "Martin K. Petersen", linux-usb@vger.kernel.org, Jeff Layton,
 linux-kernel@vger.kernel.org, "David S. Miller", Steve French,
 Chuck Lever, Lee Duncan, Lars Ellenberg, ocfs2-devel@oss.oracle.com
Subject: Re: [Drbd-dev] [PATCH v1 2/3] Treewide: Stop corrupting socket's task_frag
List-Id: "*Coordination* of development, patches, contributions -- *Questions* \(even to developers\) go to drbd-user, please."

On Mon, 2022-11-21 at 08:35 -0500, Benjamin Coddington wrote:
> Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
> GFP_NOIO flag on sk_allocation which the networking system uses to decide
> when it is safe to use current->task_frag. The results of this are
> unexpected corruption in task_frag when SUNRPC is involved in memory
> reclaim.
>
> The corruption can be seen in crashes, but the root cause is often
> difficult to ascertain as a crashing machine's stack trace will have no
> evidence of being near NFS or SUNRPC code. I believe this problem to
> be much more pervasive than reports to the community may indicate.
>
> Fix this by having kernel users of sockets that may corrupt task_frag due
> to reclaim set sk_use_task_frag = false. Preemptively correcting this
> situation for users that still set sk_allocation allows them to convert to
> memalloc_nofs_save/restore without the same unexpected corruptions that are
> sure to follow, unlikely to show up in testing, and difficult to bisect.
>
> CC: Philipp Reisner
> CC: Lars Ellenberg
> CC: "Christoph Böhmwalder"
> CC: Jens Axboe
> CC: Josef Bacik
> CC: Keith Busch
> CC: Christoph Hellwig
> CC: Sagi Grimberg
> CC: Lee Duncan
> CC: Chris Leech
> CC: Mike Christie
> CC: "James E.J. Bottomley"
> CC: "Martin K. Petersen"
> CC: Valentina Manea
> CC: Shuah Khan
> CC: Greg Kroah-Hartman
> CC: David Howells
> CC: Marc Dionne
> CC: Steve French
> CC: Christine Caulfield
> CC: David Teigland
> CC: Mark Fasheh
> CC: Joel Becker
> CC: Joseph Qi
> CC: Eric Van Hensbergen
> CC: Latchesar Ionkov
> CC: Dominique Martinet
> CC: "David S. Miller"
> CC: Eric Dumazet
> CC: Jakub Kicinski
> CC: Paolo Abeni
> CC: Ilya Dryomov
> CC: Xiubo Li
> CC: Chuck Lever
> CC: Jeff Layton
> CC: Trond Myklebust
> CC: Anna Schumaker
> CC: drbd-dev@lists.linbit.com
> CC: linux-block@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> CC: nbd@other.debian.org
> CC: linux-nvme@lists.infradead.org
> CC: open-iscsi@googlegroups.com
> CC: linux-scsi@vger.kernel.org
> CC: linux-usb@vger.kernel.org
> CC: linux-afs@lists.infradead.org
> CC: linux-cifs@vger.kernel.org
> CC: samba-technical@lists.samba.org
> CC: cluster-devel@redhat.com
> CC: ocfs2-devel@oss.oracle.com
> CC: v9fs-developer@lists.sourceforge.net
> CC: netdev@vger.kernel.org
> CC: ceph-devel@vger.kernel.org
> CC: linux-nfs@vger.kernel.org
>
> Suggested-by: Guillaume Nault
> Signed-off-by: Benjamin Coddington

I think this is the most feasible way out of the existing issue, and I
think this patchset should go via the networking tree, targeting Linux 6.2.

If anyone disagrees with the above, please speak up!

Thanks,

Paolo