From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sowmini Varadhan
Subject: Re: [PATCH net-next] rds: avoid lock hierarchy violation between m_rs_lock and rs_recv_lock
Date: Wed, 8 Aug 2018 18:18:11 -0400
Message-ID: <20180808221811.GA16895@oracle.com>
References: <1533761833-106379-1-git-send-email-sowmini.varadhan@oracle.com> <392e9286-e98c-1dbe-d598-9afca1818cf6@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org, davem@davemloft.net, rds-devel@oss.oracle.com
To: Santosh Shilimkar
Return-path:
Received: from aserp2130.oracle.com ([141.146.126.79]:57072 "EHLO
	aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1727337AbeHIAj7 (ORCPT );
	Wed, 8 Aug 2018 20:39:59 -0400
Content-Disposition: inline
In-Reply-To: <392e9286-e98c-1dbe-d598-9afca1818cf6@oracle.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On (08/08/18 14:51), Santosh Shilimkar wrote:
> This bug doesn't make sense since two different transports are using
> same socket (Loop and rds_tcp) and running together.
> For same transport, such race can't happen with MSG_ON_SOCK flag.
> CPU1-> rds_loop_inc_free
> CPU0 -> rds_tcp_cork
...
> The test is just reporting a lock hierarchy violation

As far as I can tell, this wasn't an actual deadlock that happened
because, as you point out, a socket has either the rds_tcp transport
or the rds_loop transport, so this particular pair of stack traces
would not happen with the code as it exists today.

But there is a valid lock hierarchy violation here, and imho it's
a good idea to get that cleaned up.

It also avoids needlessly holding the rs_recv_lock when doing
an rds_inc_put.

--Sowmini
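
To illustrate the cleanup argued for above, here is a minimal single-threaded sketch in userspace C. It is not the actual patch. Every name in it (model_lock, model_sock, model_inc, model_inc_put, model_clear_recv_queue, demo_clear_queue) is hypothetical. It assumes that the final put of an incoming message can end up taking the message lock, as the rds_loop_inc_free trace suggests. The model detaches the queue while holding the receive lock and does the puts only after dropping it, so the message lock is never taken under the receive lock.

```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded lock model: it only records whether a lock is held. */
struct model_lock { int held; };

static void lock_acquire(struct model_lock *l) { assert(!l->held); l->held = 1; }
static void lock_release(struct model_lock *l) { assert(l->held); l->held = 0; }

struct model_sock;

/* Hypothetical stand-in for an incoming message with a refcount. */
struct model_inc {
    struct model_inc *next;
    struct model_sock *sk;
    struct model_lock msg_lock;   /* stands in for m_rs_lock */
    int refcount;
    int freed;
};

/* Hypothetical stand-in for the socket and its receive queue. */
struct model_sock {
    struct model_lock recv_lock;  /* stands in for rs_recv_lock */
    struct model_inc *recv_queue;
};

/* Drop one reference; the final put takes msg_lock (an assumption). */
static void model_inc_put(struct model_inc *inc)
{
    assert(inc->refcount > 0);
    if (--inc->refcount > 0)
        return;
    /* Hierarchy check: msg_lock must not be taken under recv_lock. */
    assert(!inc->sk->recv_lock.held);
    lock_acquire(&inc->msg_lock);
    inc->freed = 1;
    lock_release(&inc->msg_lock);
}

/* Detach the whole queue under recv_lock, then put with no lock held. */
static int model_clear_recv_queue(struct model_sock *sk)
{
    struct model_inc *to_drop, *inc;
    int dropped = 0;

    lock_acquire(&sk->recv_lock);
    to_drop = sk->recv_queue;
    sk->recv_queue = NULL;
    lock_release(&sk->recv_lock);

    while ((inc = to_drop) != NULL) {
        to_drop = inc->next;
        inc->next = NULL;
        model_inc_put(inc);
        dropped++;
    }
    return dropped;
}

/* Queue three messages, clear the queue, and count how many were freed. */
int demo_clear_queue(void)
{
    struct model_sock sk = { { 0 }, NULL };
    struct model_inc incs[3];
    int i, freed = 0;

    for (i = 0; i < 3; i++) {
        incs[i].next = sk.recv_queue;
        incs[i].sk = &sk;
        incs[i].msg_lock.held = 0;
        incs[i].refcount = 1;
        incs[i].freed = 0;
        sk.recv_queue = &incs[i];
    }
    if (model_clear_recv_queue(&sk) != 3)
        return -1;
    for (i = 0; i < 3; i++)
        freed += incs[i].freed;
    return freed;
}
```

If model_inc_put were instead called inside the recv_lock critical section, the hierarchy assert would fire. That corresponds to the ordering problem being reported even when no real deadlock is reachable.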