From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: [PATCH net-next] rds: avoid lock hierarchy violation between m_rs_lock and rs_recv_lock
Date: Sat, 11 Aug 2018 11:22:25 -0700 (PDT)
Message-ID: <20180811.112225.71238093640629717.davem@davemloft.net>
References: <1533761833-106379-1-git-send-email-sowmini.varadhan@oracle.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, rds-devel@oss.oracle.com, santosh.shilimkar@oracle.com
To: sowmini.varadhan@oracle.com
Return-path:
Received: from shards.monkeyblade.net ([23.128.96.9]:47194 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727619AbeHKXZx (ORCPT ); Sat, 11 Aug 2018 19:25:53 -0400
In-Reply-To: <1533761833-106379-1-git-send-email-sowmini.varadhan@oracle.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: Sowmini Varadhan
Date: Wed, 8 Aug 2018 13:57:13 -0700

> The following deadlock, reported by syzbot, can occur if CPU0 is in
> rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue()
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&(&rm->m_rs_lock)->rlock);
>                                lock(&rs->rs_recv_lock);
>                                lock(&(&rm->m_rs_lock)->rlock);
>   lock(&rs->rs_recv_lock);
>
> The deadlock should be avoided by moving the messages from the
> rs_recv_queue into a tmp_list in rds_clear_recv_queue() under
> the rs_recv_lock, and then dropping the refcnt on the messages
> in the tmp_list (potentially resulting in rds_message_purge())
> after dropping the rs_recv_lock.
>
> The same lock hierarchy violation also exists in rds_still_queued()
> and should be avoided in a similar manner
>
> Signed-off-by: Sowmini Varadhan
> Reported-by: syzbot+52140d69ac6dc6b927a9@syzkaller.appspotmail.com

I'm putting this in deferred state for now.

Sowmini, once you and Santosh agree on what exactly to do, please
resubmit.

Thank you.
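
For readers following the thread, the tmp_list idea in the quoted description can be sketched in userspace C with pthread mutexes. This is not the kernel code and not the submitted patch: struct msg, struct sock_model, msg_new(), queue_msg(), msg_put(), clear_recv_queue() and purged_count are hypothetical stand-ins, with recv_lock playing the role of rs_recv_lock and m_lock the role of m_rs_lock. The only point it illustrates is that the queue is detached while recv_lock is held, and references are dropped (possibly taking m_lock) only after recv_lock has been released, so the two locks are never nested in that order.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a message; not the kernel's struct. */
struct msg {
	pthread_mutex_t m_lock;	/* plays the role of m_rs_lock */
	int refcnt;
	struct msg *next;
};

/* Hypothetical stand-in for a socket that owns a receive queue. */
struct sock_model {
	pthread_mutex_t recv_lock;	/* plays the role of rs_recv_lock */
	struct msg *recv_queue;
};

/* Messages freed so far; demonstration only, not thread-safe. */
static int purged_count;

/* Allocate a message holding a single reference. */
static struct msg *msg_new(void)
{
	struct msg *m = malloc(sizeof(*m));

	if (!m)
		abort();
	pthread_mutex_init(&m->m_lock, NULL);
	m->refcnt = 1;
	m->next = NULL;
	return m;
}

/* Push a message onto the receive queue under recv_lock. */
static void queue_msg(struct sock_model *s, struct msg *m)
{
	pthread_mutex_lock(&s->recv_lock);
	m->next = s->recv_queue;
	s->recv_queue = m;
	pthread_mutex_unlock(&s->recv_lock);
}

/* Drop one reference. This takes m_lock, as a purge path would. */
static void msg_put(struct msg *m)
{
	int last;

	pthread_mutex_lock(&m->m_lock);
	last = (--m->refcnt == 0);
	pthread_mutex_unlock(&m->m_lock);

	if (last) {
		pthread_mutex_destroy(&m->m_lock);
		free(m);
		purged_count++;
	}
}

/* Detach the queue under recv_lock, then drop refs with no lock held. */
static void clear_recv_queue(struct sock_model *s)
{
	struct msg *tmp_list, *next;

	pthread_mutex_lock(&s->recv_lock);
	tmp_list = s->recv_queue;	/* move everything to a private list */
	s->recv_queue = NULL;
	pthread_mutex_unlock(&s->recv_lock);

	/* recv_lock is no longer held, so taking m_lock here cannot nest. */
	for (; tmp_list; tmp_list = next) {
		next = tmp_list->next;
		msg_put(tmp_list);
	}
}
```

A caller would initialise a struct sock_model with PTHREAD_MUTEX_INITIALIZER and an empty queue, add messages with queue_msg(), and then call clear_recv_queue(). Whether the actual fix takes this shape is exactly what the thread leaves open.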