* Re: removing lock_kernel() from fs/fat
2007-10-12 17:18 removing lock_kernel() from fs/fat Little, Chris
@ 2007-10-12 19:28 ` Matthew Wilcox
2007-10-15 16:34 ` Little, Chris
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Matthew Wilcox @ 2007-10-12 19:28 UTC (permalink / raw)
To: kernel-janitors
On Fri, Oct 12, 2007 at 12:18:23PM -0500, Little, Chris wrote:
> 1. In file.c, line 304:
> lock_kernel();
> fat_free(inode, nr_clusters);
> unlock_kernel();
>
> Why wrap fat_free with lock_kernel and not lock in the function
> itself?
Probably no good reason. You should be able to push it down into the
function.
> 2. In inode.c, starting at lines 439 and 570, the two functions
> fat_clear_inode() and fat_write_inode() lock_kernel and then spin lock.
> Why the spin lock if it already has the bkl?
Because you have to protect against other CPUs reading this inode, some
of which may have the BKL and others may have the inode_hash_lock.
> In fact I'm almost positive I'm missing something, but
> I'm here to learn. So here it is:
>
> --- linux-2.6.23/fs/fat/dir.c 2007-10-09 15:31:38.000000000 -0500
> +++ linux-2.6.23-jcl/fs/fat/dir.c 2007-10-12 11:02:17.000000000
> -0500
> @@ -459,8 +459,9 @@
> int chi, chl, i, i2, j, last, last_u, dotoffset = 0;
> loff_t cpos;
> int ret = 0;
> + rwlock_t mr_rwlock = RW_LOCK_UNLOCKED;
>
> - lock_kernel();
> + read_lock(&mr_rwlock);
>
> cpos = filp->f_pos;
> /* Fake . and .. for the root directory. */
> @@ -642,7 +643,7 @@
> if (unicode)
> free_page((unsigned long)unicode);
> out:
> - unlock_kernel();
> + read_unlock(&mr_rwlock);
> return ret;
> }
>
Basically, you should never put a spinlock on the stack. You can't
exclude any other caller because they'll have their own copy of this
spinlock. If it were static, that would be one thing, but I think you
intend for it to be shared with the function below ...
Another mistake you've made is using an rwlock. This is a classic
beginners mistake, as the problem with it is highly non-obvious. The
trouble with rwlocks is that they're unfair to writers. A storm of
readers can prevent a writer from ever getting the lock. It's normally
a good thing to use a plain spinlock and only convert to an rwlock if
you're sure it's a good idea.
A further mistake is picking a filesystem as a good candidate for BKL
removal. Take a look at Documentation/filesystems/Locking. You'll see
there's a lot of places where the VFS takes the BKL on behalf of the
filesystem. So taking the BKL inside a filesystem implicitly locks out
a lot of places that might call into the filesystem.
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 5+ messages in thread* RE: removing lock_kernel() from fs/fat
2007-10-12 17:18 removing lock_kernel() from fs/fat Little, Chris
2007-10-12 19:28 ` Matthew Wilcox
@ 2007-10-15 16:34 ` Little, Chris
2007-10-15 16:39 ` Little, Chris
2007-10-15 18:23 ` Matthew Wilcox
3 siblings, 0 replies; 5+ messages in thread
From: Little, Chris @ 2007-10-15 16:34 UTC (permalink / raw)
To: kernel-janitors
First, thanks for the reply. Except for the details, it is much what I
expected, but this is a good thing. If nothing else other than realizing
how far I have to go. :)
> -----Original Message-----
> From: Matthew Wilcox [mailto:matthew@wil.cx]
> Sent: Friday, October 12, 2007 2:28 PM
> To: Little, Chris
> Cc: kernel-janitors@vger.kernel.org
> Subject: Re: removing lock_kernel() from fs/fat
>
> On Fri, Oct 12, 2007 at 12:18:23PM -0500, Little, Chris wrote:
> > 1. In file.c, line 304:
> > lock_kernel();
> > fat_free(inode, nr_clusters);
> > unlock_kernel();
> >
> > Why wrap fat_free with lock_kernel and not lock in the function
> > itself?
>
> Probably no good reason. You should be able to push it down
> into the function.
I don't know that it matter either way except that it seems logical to
me to have it part of fat_free() IF locking is required for fat_free().
However, fat_free() is inlined and only called once. Here. Would that
be worth it?
>
> > 2. In inode.c, starting at lines 439 and 570, the two functions
> > fat_clear_inode() and fat_write_inode() lock_kernel and
> then spin lock.
> > Why the spin lock if it already has the bkl?
>
> Because you have to protect against other CPUs reading this
> inode, some of which may have the BKL and others may have the
> inode_hash_lock.
>
> > In fact I'm almost positive I'm missing something, but I'm here to
> > learn. So here it is:
> >
> > --- linux-2.6.23/fs/fat/dir.c 2007-10-09 15:31:38.000000000 -0500
> > +++ linux-2.6.23-jcl/fs/fat/dir.c 2007-10-12 11:02:17.000000000
> > -0500
> > @@ -459,8 +459,9 @@
> > int chi, chl, i, i2, j, last, last_u, dotoffset = 0;
> > loff_t cpos;
> > int ret = 0;
> > + rwlock_t mr_rwlock = RW_LOCK_UNLOCKED;
> >
> > - lock_kernel();
> > + read_lock(&mr_rwlock);
> >
> > cpos = filp->f_pos;
> > /* Fake . and .. for the root directory. */ @@
> -642,7 +643,7
> > @@
> > if (unicode)
> > free_page((unsigned long)unicode);
> > out:
> > - unlock_kernel();
> > + read_unlock(&mr_rwlock);
> > return ret;
> > }
> >
>
> Basically, you should never put a spinlock on the stack. You
> can't exclude any other caller because they'll have their own
> copy of this spinlock. If it were static, that would be one
> thing, but I think you intend for it to be shared with the
> function below ...
>
> Another mistake you've made is using an rwlock. This is a
> classic beginners mistake, as the problem with it is highly
> non-obvious. The trouble with rwlocks is that they're unfair
> to writers. A storm of readers can prevent a writer from
> ever getting the lock. It's normally a good thing to use a
> plain spinlock and only convert to an rwlock if you're sure
> it's a good idea.
>
> A further mistake is picking a filesystem as a good candidate
> for BKL removal. Take a look at
> Documentation/filesystems/Locking. You'll see there's a lot
> of places where the VFS takes the BKL on behalf of the
> filesystem. So taking the BKL inside a filesystem implicitly
> locks out a lot of places that might call into the filesystem.
>
> --
> Intel are signing my paycheques ... these opinions are still
> mine "Bill, look, we understand that you're interested in
> selling us this operating system, but compare it to ours. We
> can't possibly take such a retrograde step."
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* RE: removing lock_kernel() from fs/fat
2007-10-12 17:18 removing lock_kernel() from fs/fat Little, Chris
2007-10-12 19:28 ` Matthew Wilcox
2007-10-15 16:34 ` Little, Chris
@ 2007-10-15 16:39 ` Little, Chris
2007-10-15 18:23 ` Matthew Wilcox
3 siblings, 0 replies; 5+ messages in thread
From: Little, Chris @ 2007-10-15 16:39 UTC (permalink / raw)
To: kernel-janitors
Crap. Hit send before I was done.
> -----Original Message-----
> From: Matthew Wilcox [mailto:matthew@wil.cx]
> Sent: Friday, October 12, 2007 2:28 PM
> To: Little, Chris
> Cc: kernel-janitors@vger.kernel.org
> Subject: Re: removing lock_kernel() from fs/fat
>
> > Why the spin lock if it already has the bkl?
>
> Because you have to protect against other CPUs reading this
> inode, some of which may have the BKL and others may have the
> inode_hash_lock.
That makes sense.
> Another mistake you've made is using an rwlock. This is a
> classic beginners mistake, as the problem with it is highly
> non-obvious. The trouble with rwlocks is that they're unfair
> to writers. A storm of readers can prevent a writer from
> ever getting the lock. It's normally a good thing to use a
> plain spinlock and only convert to an rwlock if you're sure
> it's a good idea.
>
> A further mistake is picking a filesystem as a good candidate
> for BKL removal. Take a look at
> Documentation/filesystems/Locking. You'll see there's a lot
> of places where the VFS takes the BKL on behalf of the
> filesystem. So taking the BKL inside a filesystem implicitly
> locks out a lot of places that might call into the filesystem.
Question: Are there appropriate times to use rwlock, then? Doesn't it
imply that there are writers?
Many thanks for your comments. Things to file away. Pardon me if I
keep throwing stuff up. One day I might just get good at this.
:)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: removing lock_kernel() from fs/fat
2007-10-12 17:18 removing lock_kernel() from fs/fat Little, Chris
` (2 preceding siblings ...)
2007-10-15 16:39 ` Little, Chris
@ 2007-10-15 18:23 ` Matthew Wilcox
3 siblings, 0 replies; 5+ messages in thread
From: Matthew Wilcox @ 2007-10-15 18:23 UTC (permalink / raw)
To: kernel-janitors
On Mon, Oct 15, 2007 at 11:39:42AM -0500, Little, Chris wrote:
> > Another mistake you've made is using an rwlock. This is a
> > classic beginners mistake, as the problem with it is highly
> > non-obvious. The trouble with rwlocks is that they're unfair
> > to writers. A storm of readers can prevent a writer from
> > ever getting the lock. It's normally a good thing to use a
> > plain spinlock and only convert to an rwlock if you're sure
> > it's a good idea.
>
> Question: Are there appropriate times to use rwlock, then? Doesn't it
> imply that there are writers?
There definitely are appropriate times to use rwlock. This might even
be one of them. But it's a non-obvious risk that needs to be born in
mind, and it's rarely a win.
Here's a good example:
include/linux/netdevice.h:extern rwlock_t dev_base_lock; /* Device list lock */
So when we're adding a new network device or removing an old one
(infrequent operation, can afford to wait for a long time), we take it
for write. All kinds of places in the network layer require the list
of network devices don't change while they look things up, so they all
take it for read, and there's no need to exclude them against each other.
It's possible that a situation like this obtains in fat. I haven't
looked into it. But anyone who wants to change a lock type really needs
to provide a good argument for why it should be changed. Back when we
had a BKL brigade, they posted a patch to fs/locks.c which I was able
to show would actually increase contention.
> Many thanks for your comments. Things to file away. Pardon me if I
> keep throwing stuff up. One day I might just get good at this.
It takes a while. Keep reading code, it's the best way to learn.
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 5+ messages in thread