From: Eric Dumazet <dada1@cosmosbay.com>
To: ego@in.ibm.com
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>,
linux-kernel@vger.kernel.org, paulmck@us.ibm.com
Subject: Re: New filesystem for Linux
Date: Sat, 04 Nov 2006 19:27:48 +0100
Message-ID: <454CDBA4.4040503@cosmosbay.com>
In-Reply-To: <20061104173716.GA618@in.ibm.com>
Gautham R Shenoy wrote:
> On Thu, Nov 02, 2006 at 10:52:47PM +0100, Mikulas Patocka wrote:
>> Hi
>
> Hi Mikulas
>> As my PhD thesis, I am designing and writing a filesystem, and it's now in
>> a state that it can be released. You can download it from
>> http://artax.karlin.mff.cuni.cz/~mikulas/spadfs/
>>
>> It has some new features, such as keeping inode information directly in
>> the directory (until you create a hardlink) so that ls -la doesn't seek
>> much, a new method to keep data consistent in case of crashes (instead of
>> journaling), and free space organized in lists of free runs and converted
>> to a bitmap only in case of extreme fragmentation.
>>
>> It is not very widely tested, so if you want, test it.
>>
>> I have these questions:
>>
>> * There is a rw semaphore that is locked for read for nearly all
>> operations and locked for write only rarely. However locking for read
>> causes cache line pingpong on SMP systems. Do you have an idea how to make
>> it better?
>>
>> It could be improved by making a semaphore for each CPU and locking for
>> read only the CPU's semaphore and for write all semaphores. Or is there a
>> better method?
>
> I am currently experimenting with a light-weight reader writer semaphore
> with the objective of doing away with what you call reader-side cache line
> "ping pong". It achieves this by using a per-cpu refcount.
>
> A drawback of this approach, as Eric Dumazet mentioned elsewhere in this
> thread, would be that each instance of the rw_semaphore would require
> (NR_CPUS * sizeof(int)) bytes worth of memory in order to keep track of
> the per-cpu refcount, which can prove to be pretty costly if this
> rw_semaphore is for something like inode->i_alloc_sem.
We might use a hybrid approach: use a per-cpu counter if NR_CPUS <= 8

#define refcount_addr(zone, cpu) (zone)[(cpu)]

For larger setups, have a fixed limit of 8 counters, and use a modulo

#define refcount_addr(zone, cpu) (zone)[(cpu) & 7]
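
To make the slot selection concrete, here is a minimal userspace sketch of
the hybrid scheme, not kernel code. SKETCH_NR_CPUS, struct hybrid_refcount
and the reader_*() helpers are hypothetical names for illustration. C11
atomics are used because two cpus can map to the same slot once the modulo
is in play. The slots are packed together for brevity, so this shows only
the indexing, not the cache line placement.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical build-time cpu limit; stands in for the kernel's NR_CPUS. */
#define SKETCH_NR_CPUS   32u
/* Fixed cap on the number of counters per object, as proposed in the mail. */
#define SKETCH_MAX_SLOTS 8u

#if SKETCH_NR_CPUS <= SKETCH_MAX_SLOTS
#define SKETCH_NR_SLOTS    SKETCH_NR_CPUS
#define refcount_slot(cpu) (cpu)                            /* one slot per cpu */
#else
#define SKETCH_NR_SLOTS    SKETCH_MAX_SLOTS
#define refcount_slot(cpu) ((cpu) & (SKETCH_MAX_SLOTS - 1)) /* cpu modulo 8 */
#endif

struct hybrid_refcount {
    atomic_int slot[SKETCH_NR_SLOTS];
};

static void hybrid_refcount_init(struct hybrid_refcount *rc)
{
    for (unsigned int i = 0; i < SKETCH_NR_SLOTS; i++)
        atomic_init(&rc->slot[i], 0);
}

/* Reader side: touch only the slot selected by the current cpu number. */
static void reader_enter(struct hybrid_refcount *rc, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    atomic_fetch_add(&rc->slot[refcount_slot(cpu)], 1);
}

static void reader_exit(struct hybrid_refcount *rc, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    atomic_fetch_sub(&rc->slot[refcount_slot(cpu)], 1);
}

/*
 * Writer side: sum every slot. The sum is not an atomic snapshot, so a
 * real writer would need additional synchronization to block new readers
 * before trusting a result of zero.
 */
static long readers_total(struct hybrid_refcount *rc)
{
    long sum = 0;

    for (unsigned int i = 0; i < SKETCH_NR_SLOTS; i++)
        sum += atomic_load(&rc->slot[i]);
    return sum;
}
```

With these assumed values, cpus 3 and 11 share slot 3, so the memory cost
per object stays at 8 counters however large the cpu limit gets.
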

In order not to use too much memory, we could use a kind of vmalloc() space,
using one page per cpu, so that addr(cpu) = base + (cpu)*PAGE_SIZE;
(vmalloc space allows a NUMA allocation if possible)

So instead of storing a table of 8 pointers in an object, we store only the
address for cpu0.
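
The address arithmetic could look roughly like the sketch below. It is a
userspace approximation under stated assumptions: calloc() stands in for the
vmalloc()-like area, and SKETCH_PAGE_SIZE, SKETCH_NR_CPUS, percpu_area_alloc()
and percpu_counter() are hypothetical names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_NR_CPUS   4u     /* assumed number of cpus */

/*
 * Reserve one zeroed page per cpu in a single contiguous area, so the
 * distance between the copies for cpu N and cpu N+1 is always one page.
 */
static void *percpu_area_alloc(void)
{
    return calloc(SKETCH_NR_CPUS, SKETCH_PAGE_SIZE);
}

/*
 * Given only the address of cpu0's copy of a counter, locate the copy that
 * belongs to 'cpu': addr(cpu) = base + cpu * PAGE_SIZE.
 */
static int *percpu_counter(int *cpu0_addr, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    return (int *)((char *)cpu0_addr + (size_t)cpu * SKETCH_PAGE_SIZE);
}
```

An object would then keep a single int * pointing into cpu0's page, and
every cpu derives its own counter from it, each copy sitting on a different
page and therefore on a different cache line.
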
>
> So the question I am interested in is, how many *live* instances of this
> rw_semaphore are you expecting to have at any given time?
> If this number is a constant (and/or not very big!), the light-weight
> reader writer semaphore might be useful.
>
> Regards
> Gautham.