From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from zeniv.linux.org.uk ([195.92.253.2]:46741 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753250AbcBOXEj (ORCPT );
	Mon, 15 Feb 2016 18:04:39 -0500
Date: Mon, 15 Feb 2016 23:04:35 +0000
From: Al Viro
To: Martin Brandenburg
Cc: Mike Marshall , Linus Torvalds , linux-fsdevel , Stephen Rothwell
Subject: Re: Orangefs ABI documentation
Message-ID: <20160215230434.GZ17997@ZenIV.linux.org.uk>
References: <20160213174738.GR17997@ZenIV.linux.org.uk>
	<20160214025615.GU17997@ZenIV.linux.org.uk>
	<20160214234312.GX17997@ZenIV.linux.org.uk>
	<20160215184554.GY17997@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Mon, Feb 15, 2016 at 05:32:54PM -0500, Martin Brandenburg wrote:

> Something that used a slot, such as reader, would call
> service_operation while holding a bufmap. Then the client-core would
> crash, and the kernel would get run_down waiting on the slots to be
> given up. But the slots are not given up until someone wakes all the
> processes waiting in service_operation up, which happens after all the
> slots are given up. Then client-core hangs until someone sends a
> deadly signal to all the processes waiting in service_operation or
> presumably the timeout expires.
>
> This splits finalize and run_down so that orangefs_devreq_release can
> mark the slot map as killed, then purge waiting ops, then wait for all
> the slots to be released. Meanwhile, processes which were waiting will
> get into orangefs_bufmap_get which will see that the slot map is
> shutting down and wait for the client-core to come back.

D'oh.  Yes, that was exactly the point of separating mark_dead and
run_down - the latter should've been done after purging all requests.
Fixes folded, branch force-pushed.
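
For illustration only, the ordering discussed above (mark the slot map
dead, purge the waiting ops, and only then run down) can be modelled
with a small single-threaded sketch.  This is not the orangefs code:
struct slot_map, struct waiter and every helper below are hypothetical
stand-ins, and the real run_down sleeps until the slots are released,
whereas this model only reports whether that wait could ever finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical stand-in for the slot map; not the kernel structure. */
struct slot_map {
	int in_use;	/* slots currently held */
	bool dead;	/* set once the map is marked as shutting down */
};

/* Hypothetical stand-in for a process sleeping in service_operation. */
struct waiter {
	bool holds_slot;
	bool sleeping;
};

/* Take a slot; refused once the map is dead or all slots are taken. */
static bool slot_get(struct slot_map *m)
{
	if (m->dead || m->in_use == NSLOTS)
		return false;
	m->in_use++;
	return true;
}

static void slot_put(struct slot_map *m)
{
	assert(m->in_use > 0);
	m->in_use--;
}

static void mark_dead(struct slot_map *m)
{
	m->dead = true;
}

/* Wake every sleeping waiter; a woken waiter gives its slot back. */
static void purge_waiting_ops(struct slot_map *m, struct waiter *w, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!w[i].sleeping)
			continue;
		w[i].sleeping = false;
		if (w[i].holds_slot) {
			slot_put(m);
			w[i].holds_slot = false;
		}
	}
}

/* Model of run_down: true if waiting for all slots would complete. */
static bool run_down_would_finish(const struct slot_map *m)
{
	return m->in_use == 0;
}

/* Ordering from the thread: mark dead, purge, then run down. */
static bool release_fixed(struct slot_map *m, struct waiter *w, size_t n)
{
	mark_dead(m);
	purge_waiting_ops(m, w, n);
	return run_down_would_finish(m);
}

/* Earlier ordering: run down before anyone wakes the waiters. */
static bool release_broken(struct slot_map *m, struct waiter *w, size_t n)
{
	mark_dead(m);
	if (!run_down_would_finish(m))
		return false;	/* the real code would hang here */
	purge_waiting_ops(m, w, n);
	return true;
}
```

With one waiter asleep while holding a slot, release_broken() reports
that the wait cannot finish, while release_fixed() frees the slot
before waiting; in both cases slot_get() refuses new users once the
map is dead.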