From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Cohen
Subject: Possible process deadlock in RMPP flow
Date: Wed, 23 Sep 2009 18:04:54 +0300
Message-ID: <20090923150454.GA26150@mtls03>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: ewg-bounces-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
Errors-To: ewg-bounces-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org
To: Sean Hefty
Cc: Linux RDMA list, ewg, general-list
List-Id: linux-rdma@vger.kernel.org

Hi Sean,

One of our customers is experiencing problems when running ibnetdiscover.
The problem happens from time to time. Here is the call stack that he gets:

ibnetdiscover D ffffffff80149b8d 0 26968 26544 (L-TLB)
 ffff8102c900bd88 0000000000000046 ffff81037e8e0000 ffff81037e8e02e8
 ffff8102c900bd78 000000000000000a ffff8102c5b50820 ffff81038a929820
 0000011837bf6105 0000000000000ede ffff8102c5b50a08 0000000100000000
Call Trace:
 [] wait_for_completion+0x79/0xa2
 [] default_wake_function+0x0/0xe
 [] :ib_mad:ib_cancel_rmpp_recvs+0x87/0xde
 [] :ib_mad:ib_unregister_mad_agent+0x30d/0x424
 [] :ib_umad:ib_umad_close+0x9d/0xd6
 [] __fput+0xae/0x198
 [] filp_close+0x5c/0x64
 [] put_files_struct+0x63/0xae
 [] do_exit+0x31c/0x911
 [] cpuset_exit+0x0/0x6c
 [] system_call+0x7e/0x83

The dump suggests that the process is waiting on the call to
flush_workqueue() in ib_cancel_rmpp_recvs(). The package they use is
OFED 1.4.2. Do you have any ideas or suggestions on how to sort this out?

Thanks.