reiserfs-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Edward Shishkin <edward.shishkin@gmail.com>
To: Ivan Shapovalov <intelfx100@gmail.com>, reiserfs-devel@vger.kernel.org
Subject: Re: reiser4: FITRIM ioctl -- how to grab the space?
Date: Sat, 16 Aug 2014 10:09:44 +0200	[thread overview]
Message-ID: <53EF11C8.20209@gmail.com> (raw)
In-Reply-To: <2026408.Yl74NqGZKK@intelfx-laptop>


On 08/16/2014 02:44 AM, Ivan Shapovalov wrote:
> On Monday 11 August 2014 at 13:39:12, Ivan Shapovalov wrote:	
>> [...]
>>>> I've meant "grabbing all space and then allocating all space" -- so there won't
>>>> be multiple grabs or multiple atoms.
>>>>
>>>> Then all processes grabbing space with BA_CAN_COMMIT will wait for the discard
>>>> atom to commit.
>>>
>>> It seems such waiting will screw up the system. No?
>> I was afraid of such situations, but how would that happen? The discard atom's
>> commit will always be able to proceed as it doesn't grab space at all.
>>
>>>>    (Actually, there is a small race window between grabbing space
>>>> and creating an atom...)
>>>
>>> Which one?
>> BA_CAN_COMMIT machinery does wait only for atoms, not for contexts. If
>> process X happens to grab space between us grabbing space and creating an atom,
>> it will get -ENOSPC even with BA_CAN_COMMIT.


I still don't see any "races" here. How atom creation is related to grabbing
space? Are we talking about races in the existing code? f so, please show
the racing paths..


>>
>>>> The only problem is to wait for (sbinfo->block_count == sbinfo->blocks_used +
>>>> sbinfo->blocks_free) condition, i. e. until no blocks are reserved in any form,
>>>> and then to grab all space atomically wrt. reaching this condition.
>>>>
>>>> Again, if this is not feasible, I'll go with the multiple atoms approach. I
>>>> just want to make sure.
>>>>
> ...so, I've almost given up implementing this :)


great!


>
> In kernel there is a read-write semaphore implementation called rwsem.
> I've added a per-superblock instance of rwsem with following semantics:
>
> - when count of grabbed+special (not free or used) blocks is increased by any
>    means, the semaphore is taken for reading before taking spinlock and
>    modifying counters
>
> - if the counters already were non-zero, the semaphore has been already taken
>    for reading (reader count > 1) and it is released once while under spinlock
>    (so that reader count always stays at 1)
>
> - when count of grabbed+special blocks is decreased and drops to zero, the
>    semaphore is released once (so reader count drops to 0 unless there is a race
>    with increasing the count)
>
> - on second try of BA_CAN_COMMIT grabbing (if there was not enough space),
>    the semaphore is taken for writing instead of for reading, ensuring
>    that every block is either permanently used or free. The write lock is
>    converted to read lock after grabbing required space.
>
> This "almost" works. The main problem is that Linux rwsem implementation
> is write-biased: that is, if there are writers waiting, readers count can't
> increase. That is, a process must not take a semaphore for reading in second
> time if it is responsible for releasing the "first time" reader.
>
> The comment in original rwsem implementation by Andrew Morton states following:
> "It is NOT legal for one task to down_read() an rwsem multiple times."
>
>    reader1     writer1
> ------------------------
> down_read()
>              down_write()
>              up_write()
> down_read()
> up_read()
> up_read()
>
> This is a deadlock: reader1's down_read() blocks on writer1's up_write(),
> while writer1's down_write() blocks on reader1's second up_read().
>
> A force grab (or a grab preceded by grab_space_enable(), or a used2something)
> deadlocks 100% in presence of waiting writers, and so does the corresponding
> transaction commit.
>
> So I need to find a way to take rwsem in a read-biased mode... Any advice is
> accepted, including "give up with adding of yet another lock and go with
> multiple transactions" :)


IMHO this is too complicated.

Why don't you want to grab, say, 20M per iteration?
It should work without any problems, just maintain a
counter of blocks allocated in the iteration..

Thanks,
Edward.

  reply	other threads:[~2014-08-16  8:09 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-31 20:47 reiser4: FITRIM ioctl -- how to grab the space? Ivan Shapovalov
2014-07-31 22:03 ` Edward Shishkin
2014-07-31 22:16   ` Ivan Shapovalov
     [not found]     ` <53DACEE9.8000802@gmail.com>
2014-08-10 18:52       ` Ivan Shapovalov
2014-08-10 19:48         ` Edward Shishkin
2014-08-10 20:37           ` Ivan Shapovalov
2014-08-10 23:29             ` Edward Shishkin
2014-08-11  9:39               ` Ivan Shapovalov
2014-08-16  0:44                 ` Ivan Shapovalov
2014-08-16  8:09                   ` Edward Shishkin [this message]
2014-08-16  8:23                     ` Edward Shishkin
2014-08-16 11:27                       ` Ivan Shapovalov
2014-08-16 13:35                         ` Edward Shishkin
2014-08-16 17:05                           ` Ivan Shapovalov
2014-08-16 20:13                             ` Edward Shishkin
2014-08-16 11:17                     ` Ivan Shapovalov
2014-08-16 12:15                       ` Edward Shishkin
2014-08-16 17:02                         ` Ivan Shapovalov
2014-08-16 19:54                           ` Edward Shishkin
2014-08-02 16:40   ` Edward Shishkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53EF11C8.20209@gmail.com \
    --to=edward.shishkin@gmail.com \
    --cc=intelfx100@gmail.com \
    --cc=reiserfs-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).