From mboxrd@z Thu Jan 1 00:00:00 1970
From: Edward Shishkin
Subject: Re: reiser4: FITRIM ioctl -- how to grab the space?
Date: Sat, 16 Aug 2014 21:54:16 +0200
Message-ID: <53EFB6E8.6080305@gmail.com>
References: <3405506.BC0S4TX54B@intelfx-laptop>
 <1651199.WS1bpHXlZS@intelfx-laptop>
 <53EF4B61.4080906@gmail.com>
 <3328274.dmtPODE4QV@intelfx-laptop>
In-Reply-To: <3328274.dmtPODE4QV@intelfx-laptop>
To: Ivan Shapovalov
Cc: reiserfs-devel@vger.kernel.org

On 08/16/2014 07:02 PM, Ivan Shapovalov wrote:
> On Saturday 16 August 2014 at 14:15:29, Edward Shishkin wrote:
>> On 08/16/2014 01:17 PM, Ivan Shapovalov wrote:
>>> On Saturday 16 August 2014 at 10:09:44, Edward Shishkin wrote:
>>>> On 08/16/2014 02:44 AM, Ivan Shapovalov wrote:
>>>>> On Monday 11 August 2014 at 13:39:12, Ivan Shapovalov wrote:
>>>>>> [...]
>>>>>>>> I've meant "grabbing all space and then allocating all space" -- so there won't
>>>>>>>> be multiple grabs or multiple atoms.
>>>>>>>>
>>>>>>>> Then all processes grabbing space with BA_CAN_COMMIT will wait for the discard
>>>>>>>> atom to commit.
>>>>>>> It seems such waiting will screw up the system. No?
>>>>>> I was afraid of such situations, but how would that happen? The discard atom's
>>>>>> commit will always be able to proceed as it doesn't grab space at all.
>>>>>>
>>>>>>>> (Actually, there is a small race window between grabbing space
>>>>>>>> and creating an atom...)
>>>>>>> Which one?
>>>>>> BA_CAN_COMMIT machinery does wait only for atoms, not for contexts. If
>>>>>> process X happens to grab space between us grabbing space and creating an atom,
>>>>>> it will get -ENOSPC even with BA_CAN_COMMIT.
>>>> I still don't see any "races" here. How is atom creation related to grabbing
>>>> space? Are we talking about races in the existing code? If so, please show
>>>> the racing paths.
>>> Well, this is not a race per se - it does not involve locking. But it is
>>> a race-like behavior.
>>>
>>> taskA                                   taskB
>>> --------------------------------------------------------------------------
>>> grab very much space
>> Ok, assume A wants X blocks.
>>
>>>                                         grab some space with BA_CAN_COMMIT
>> Assume B wants Y blocks.
>>
>>> create an atom using the grabbed space
>>
>> Please, specify which code is executing at this point.
>>
>> Anyway, we don't need any reservation to _create_ an atom.
>> Reservation is expended when allocating blocks on the low level
>> (bitmaps). Reservation (grabbing space) is needed to avoid hard
>> ENOSPC (=no free bits in bitmaps) in situations where we cannot
>> fail (e.g. flush, commit, etc.)
> Let's take reiser4_sync_file_common().
>
> The grabbing is
>	reserve = estimate_update_common(dentry->d_inode);
>	if (reiser4_grab_space(reserve, BA_CAN_COMMIT)) {
>
> The creation of atom is (somewhere deep in the call stack) at
>	write_sd_by_inode_common(dentry->d_inode);
>
> Clearly, syncing file won't increase the real space occupied by data on disk.
> However, because there is WA + journaling, such transaction still needs some
> space to complete. This is "X blocks".
>
> Suppose there is a second sync scheduled between grabbing and creation of atom
> of the first sync.
> In the same vein it needs Y blocks, and Y is such that
> Y < free-space < X+Y.
>
> In this case, the second sync will fail despite BA_CAN_COMMIT flag given to
> reiser4_grab_space(): at time of its execution, the first sync did not yet
> create its atom, so there is nothing to commit to reclaim those X blocks.
>
> However, if the second sync gets ordered after write_sd_by_inode_common() of
> the first sync, BA_CAN_COMMIT machinery will eventually execute
> txnmgr_force_commit_all() which will wait for the first sync to complete and
> reclaim those X blocks.
>
> So, the second transaction's result depends on scheduling. It is a race-like
> behavior.

It's OK. The second process fails in the situation of disk
space pressure (free-space < X+Y). We don't rely on success here.

I was suspicious because of the problem of "phantom" ENOSPC,
which appears once in a while: a small write returns ENOSPC,
whereas there is a lot of free space on disk.
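
For illustration only, the ordering dependence described above can be
reproduced with a toy userspace model. This is a sketch, not reiser4 code:
every name in it (toy_fs, toy_grab, toy_create_atom, toy_force_commit_all,
toy_run) is hypothetical, and it assumes that a forced commit returns all
blocks attached to an atom to the free pool, while blocks grabbed by a
context that has no atom yet cannot be reclaimed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in reiser4. */
struct toy_fs {
	long free_blocks;   /* blocks nobody has grabbed */
	long ctx_reserved;  /* grabbed by a context that has no atom yet */
	long atom_reserved; /* grabbed and already attached to an open atom */
};

/* Stand-in for a forced commit: only space attached to an atom comes back. */
static void toy_force_commit_all(struct toy_fs *fs)
{
	fs->free_blocks += fs->atom_reserved;
	fs->atom_reserved = 0;
}

/* Stand-in for a grab: with can_commit set, force commits once before failing. */
static int toy_grab(struct toy_fs *fs, long n, bool can_commit)
{
	assert(n >= 0);
	if (fs->free_blocks < n && can_commit)
		toy_force_commit_all(fs);
	if (fs->free_blocks < n)
		return -ENOSPC;
	fs->free_blocks -= n;
	fs->ctx_reserved += n;
	return 0;
}

/* Attach everything the context has grabbed so far to an atom. */
static void toy_create_atom(struct toy_fs *fs)
{
	fs->atom_reserved += fs->ctx_reserved;
	fs->ctx_reserved = 0;
}

/*
 * Task A grabs x blocks, task B grabs y blocks with can_commit set.
 * b_after_atom selects whether B runs before or after A creates its atom.
 * Returns the result of B's grab (or of A's grab, if that one fails).
 */
int toy_run(long free_blocks, long x, long y, bool b_after_atom)
{
	struct toy_fs fs = { .free_blocks = free_blocks };
	int ret;

	ret = toy_grab(&fs, x, true);
	if (ret)
		return ret;
	if (b_after_atom)
		toy_create_atom(&fs);
	return toy_grab(&fs, y, true);
}
```

With free-space = 100, X = 80 and Y = 50 (so Y < free-space < X+Y),
toy_run(100, 80, 50, false) returns -ENOSPC, because A's blocks are not yet
attached to anything that can be committed, whereas toy_run(100, 80, 50, true)
returns 0, because the forced commit gives A's 80 blocks back first.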