From: Edward Shishkin <edward.shishkin@gmail.com>
To: Ivan Shapovalov <intelfx100@gmail.com>, reiserfs-devel@vger.kernel.org
Subject: Re: reiser4: FITRIM ioctl -- how to grab the space?
Date: Sat, 16 Aug 2014 10:23:08 +0200 [thread overview]
Message-ID: <53EF14EC.1070101@gmail.com> (raw)
In-Reply-To: <53EF11C8.20209@gmail.com>
On 08/16/2014 10:09 AM, Edward Shishkin wrote:
>
> On 08/16/2014 02:44 AM, Ivan Shapovalov wrote:
>> On Monday 11 August 2014 at 13:39:12, Ivan Shapovalov wrote:
>>> [...]
>>>>> I've meant "grabbing all space and then allocating all space" --
>>>>> so there won't
>>>>> be multiple grabs or multiple atoms.
>>>>>
>>>>> Then all processes grabbing space with BA_CAN_COMMIT will wait for
>>>>> the discard
>>>>> atom to commit.
>>>>
>>>> It seems such waiting will screw up the system. No?
>>> I was afraid of such situations, but how would that happen? The
>>> discard atom's
>>> commit will always be able to proceed as it doesn't grab space at all.
>>>
>>>>> (Actually, there is a small race window between grabbing space
>>>>> and creating an atom...)
>>>>
>>>> Which one?
>>> BA_CAN_COMMIT machinery does wait only for atoms, not for contexts. If
>>> process X happens to grab space between us grabbing space and
>>> creating an atom,
>>> it will get -ENOSPC even with BA_CAN_COMMIT.
>
>
> I still don't see any "races" here. How atom creation is related to
> grabbing
> space? Are we talking about races in the existing code? f so, please show
> the racing paths..
>
>
>>>
>>>>> The only problem is to wait for (sbinfo->block_count ==
>>>>> sbinfo->blocks_used +
>>>>> sbinfo->blocks_free) condition, i. e. until no blocks are reserved
>>>>> in any form,
>>>>> and then to grab all space atomically wrt. reaching this condition.
>>>>>
>>>>> Again, if this is not feasible, I'll go with the multiple atoms
>>>>> approach. I
>>>>> just want to make sure.
>>>>>
>> ...so, I've almost given up implementing this :)
>
>
> great!
>
>
>>
>> In kernel there is a read-write semaphore implementation called rwsem.
>> I've added a per-superblock instance of rwsem with following semantics:
>>
>> - when count of grabbed+special (not free or used) blocks is
>> increased by any
>> means, the semaphore is taken for reading before taking spinlock and
>> modifying counters
>>
>> - if the counters already were non-zero, the semaphore has been
>> already taken
>> for reading (reader count > 1) and it is released once while under
>> spinlock
>> (so that reader count always stays at 1)
>>
>> - when count of grabbed+special blocks is decreased and drops to
>> zero, the
>> semaphore is released once (so reader count drops to 0 unless
>> there is a race
>> with increasing the count)
>>
>> - on second try of BA_CAN_COMMIT grabbing (if there was not enough
>> space),
>> the semaphore is taken for writing instead of for reading, ensuring
>> that every block is either permanently used or free. The write
>> lock is
>> converted to read lock after grabbing required space.
>>
>> This "almost" works. The main problem is that Linux rwsem implementation
>> is write-biased: that is, if there are writers waiting, readers count
>> can't
>> increase. That is, a process must not take a semaphore for reading in
>> second
>> time if it is responsible for releasing the "first time" reader.
>>
>> The comment in original rwsem implementation by Andrew Morton states
>> following:
>> "It is NOT legal for one task to down_read() an rwsem multiple times."
>>
>> reader1 writer1
>> ------------------------
>> down_read()
>> down_write()
>> up_write()
>> down_read()
>> up_read()
>> up_read()
>>
>> This is a deadlock: reader1's down_read() blocks on writer1's
>> up_write(),
>> while writer1's down_write() blocks on reader1's second up_read().
>>
>> A force grab (or a grab preceded by grab_space_enable(), or a
>> used2something)
>> deadlocks 100% in presence of waiting writers, and so does the
>> corresponding
>> transaction commit.
>>
>> So I need to find a way to take rwsem in a read-biased mode... Any
>> advice is
>> accepted, including "give up with adding of yet another lock and go with
>> multiple transactions" :)
>
>
> IMHO this is too complicated.
>
> Why don't you want to grab, say, 20M per iteration?
> It should work without any problems, just maintain a
> counter of blocks allocated in the iteration..
add the counter to the struct reiser4_context and
set it to zero at the beginning of every iteration.
use get_current_context() to access the counter.
next prev parent reply other threads:[~2014-08-16 8:23 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-31 20:47 reiser4: FITRIM ioctl -- how to grab the space? Ivan Shapovalov
2014-07-31 22:03 ` Edward Shishkin
2014-07-31 22:16 ` Ivan Shapovalov
[not found] ` <53DACEE9.8000802@gmail.com>
2014-08-10 18:52 ` Ivan Shapovalov
2014-08-10 19:48 ` Edward Shishkin
2014-08-10 20:37 ` Ivan Shapovalov
2014-08-10 23:29 ` Edward Shishkin
2014-08-11 9:39 ` Ivan Shapovalov
2014-08-16 0:44 ` Ivan Shapovalov
2014-08-16 8:09 ` Edward Shishkin
2014-08-16 8:23 ` Edward Shishkin [this message]
2014-08-16 11:27 ` Ivan Shapovalov
2014-08-16 13:35 ` Edward Shishkin
2014-08-16 17:05 ` Ivan Shapovalov
2014-08-16 20:13 ` Edward Shishkin
2014-08-16 11:17 ` Ivan Shapovalov
2014-08-16 12:15 ` Edward Shishkin
2014-08-16 17:02 ` Ivan Shapovalov
2014-08-16 19:54 ` Edward Shishkin
2014-08-02 16:40 ` Edward Shishkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53EF14EC.1070101@gmail.com \
--to=edward.shishkin@gmail.com \
--cc=intelfx100@gmail.com \
--cc=reiserfs-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.