* 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-02-27 5:27 Linux v2.6.16-rc5 Linus Torvalds
@ 2006-03-03 23:01 ` Adrian Bunk
2006-03-03 23:22 ` Linus Torvalds
0 siblings, 1 reply; 12+ messages in thread
From: Adrian Bunk @ 2006-03-03 23:01 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton, ".geert", zippel,
linux-m68k
Cc: Linux Kernel Mailing List
The m68k defconfig does no longer compile in 2.6.16-rc:
<-- snip -->
...
CC fs/file_table.o
fs/file_table.c: In function `fget':
fs/file_table.c:170: warning: implicit declaration of function `cmpxchg'
...
LD .tmp_vmlinux1
fs/built-in.o(.text+0x275a): In function `fget':
: undefined reference to `cmpxchg'
fs/built-in.o(.text+0x27da): In function `fget_light':
: undefined reference to `cmpxchg'
make: *** [.tmp_vmlinux1] Error 1
<-- snip -->
It seems the problem is that in the CONFIG_RMW_INSNS=n case, there's no
cmpxchg #define in include/asm-m68k/system.h required for the
atomic_add_unless #define in include/asm-m68k/atomic.h.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-03 23:01 ` 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken Adrian Bunk
@ 2006-03-03 23:22 ` Linus Torvalds
2006-03-03 23:43 ` Adrian Bunk
2006-03-03 23:59 ` Andrew Morton
0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2006-03-03 23:22 UTC (permalink / raw)
To: Adrian Bunk
Cc: Andrew Morton, ".geert", zippel, linux-m68k,
Linux Kernel Mailing List, Nick Piggin
On Sat, 4 Mar 2006, Adrian Bunk wrote:
>
> It seems the problem is that in the CONFIG_RMW_INSNS=n case, there's no
> cmpxchg #define in include/asm-m68k/system.h required for the
> atomic_add_unless #define in include/asm-m68k/atomic.h.
Hmm. It seems like it never has been there.. Do you know what brought this
on? Was it Nick's RCU changes from "rcuref_dec_and_test()" to
"atomic_dec_and_test()" and friends?
Judging by your error messages, I _think_ it's the "atomic_inc_not_zero()"
that gets expanded to a cmpxchg() that simply doesn't exist on m68k and
never has.
I guess we've never depended on cmpxchg before. Or am I missing something?
Linus
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-03 23:22 ` Linus Torvalds
@ 2006-03-03 23:43 ` Adrian Bunk
2006-03-03 23:59 ` Andrew Morton
1 sibling, 0 replies; 12+ messages in thread
From: Adrian Bunk @ 2006-03-03 23:43 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, geert, zippel, linux-m68k,
Linux Kernel Mailing List, Nick Piggin
On Fri, Mar 03, 2006 at 03:22:42PM -0800, Linus Torvalds wrote:
>
>
> On Sat, 4 Mar 2006, Adrian Bunk wrote:
> >
> > It seems the problem is that in the CONFIG_RMW_INSNS=n case, there's no
> > cmpxchg #define in include/asm-m68k/system.h required for the
> > atomic_add_unless #define in include/asm-m68k/atomic.h.
>
> Hmm. It seems like it never has been there.. Do you know what brought this
> on? Was it Nick's RCU changes from "rcuref_dec_and_test()" to
> "atomic_dec_and_test()" and friends?
It was Nick's commit 8426e1f6af0fd7f44d040af7263750c5a52f3cc3 that added
atomic_inc_not_zero(), and Nick's patch that changed fs/file_table.c
from rcuref_dec_and_test() to atomic_dec_and_test() exposed this
problem.
> Judging by your error messages, I _think_ it's the "atomic_inc_not_zero()"
> that gets expanded to a cmpxchg() that simply doesn't exist on m68k and
> never has.
Exactly, that's what I wanted to say in my report.
> I guess we've never depended on cmpxchg before. Or am I missing something?
It seems this is the case.
And as far as I can see, m68k is the only architecture where cmpxchg
isn't always available.
> Linus
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-03 23:22 ` Linus Torvalds
2006-03-03 23:43 ` Adrian Bunk
@ 2006-03-03 23:59 ` Andrew Morton
2006-03-04 14:01 ` Roman Zippel
1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2006-03-03 23:59 UTC (permalink / raw)
To: Linus Torvalds
Cc: bunk, ".geert", zippel, linux-m68k, linux-kernel,
nickpiggin
Linus Torvalds <torvalds@osdl.org> wrote:
>
>
>
> On Sat, 4 Mar 2006, Adrian Bunk wrote:
> >
> > It seems the problem is that in the CONFIG_RMW_INSNS=n case, there's no
> > cmpxchg #define in include/asm-m68k/system.h required for the
> > atomic_add_unless #define in include/asm-m68k/atomic.h.
>
> Hmm. It seems like it never has been there.. Do you know what brought this
> on? Was it Nick's RCU changes from "rcuref_dec_and_test()" to
> "atomic_dec_and_test()" and friends?
>
> Judging by your error messages, I _think_ it's the "atomic_inc_not_zero()"
> that gets expanded to a cmpxchg() that simply doesn't exist on m68k and
> never has.
>
> I guess we've never depended on cmpxchg before. Or am I missing something?
>
Yes, we now require cmpxchg of all architectures.
It's pretty simple to fix - just use local_irq_save(). We can steal the code
from include/asm-m68knommu/system.h.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-03 23:59 ` Andrew Morton
@ 2006-03-04 14:01 ` Roman Zippel
2006-03-04 14:12 ` Nick Piggin
0 siblings, 1 reply; 12+ messages in thread
From: Roman Zippel @ 2006-03-04 14:01 UTC (permalink / raw)
To: Andrew Morton
Cc: Linus Torvalds, bunk, ".geert", linux-m68k, linux-kernel,
nickpiggin
Hi,
On Fri, 3 Mar 2006, Andrew Morton wrote:
> Yes, we now require cmpxchg of all architectures.
Actually I'd prefer if we used atomic_cmpxchg() instead.
The cmpxchg() emulation was never added for a good reason - to keep code
from assuming it can be used it for userspace synchronisation. Using an
atomic_t here would probably get at least some attention.
bye, Roman
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-04 14:01 ` Roman Zippel
@ 2006-03-04 14:12 ` Nick Piggin
2006-03-04 20:28 ` Andrew Morton
0 siblings, 1 reply; 12+ messages in thread
From: Nick Piggin @ 2006-03-04 14:12 UTC (permalink / raw)
To: Roman Zippel
Cc: Andrew Morton, Linus Torvalds, bunk, Geert Uytterhoeven,
linux-m68k, linux-kernel
Roman Zippel wrote:
> Hi,
>
> On Fri, 3 Mar 2006, Andrew Morton wrote:
>
>
>>Yes, we now require cmpxchg of all architectures.
>
>
> Actually I'd prefer if we used atomic_cmpxchg() instead.
> The cmpxchg() emulation was never added for a good reason - to keep code
> from assuming it can be used it for userspace synchronisation. Using an
> atomic_t here would probably get at least some attention.
>
Yes, I guess that's what Andrew meant. The reason we can require
atomic_cmpxchg of all architectures is because it is only guaranteed
to work on atomic_t.
Glad to hear it won't be a problem for you though.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-04 14:12 ` Nick Piggin
@ 2006-03-04 20:28 ` Andrew Morton
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2006-03-04 20:28 UTC (permalink / raw)
To: Nick Piggin; +Cc: zippel, torvalds, bunk, geert, linux-m68k, linux-kernel
Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> Roman Zippel wrote:
> > Hi,
> >
> > On Fri, 3 Mar 2006, Andrew Morton wrote:
> >
> >
> >>Yes, we now require cmpxchg of all architectures.
> >
> >
> > Actually I'd prefer if we used atomic_cmpxchg() instead.
> > The cmpxchg() emulation was never added for a good reason - to keep code
> > from assuming it can be used it for userspace synchronisation. Using an
> > atomic_t here would probably get at least some attention.
> >
>
> Yes, I guess that's what Andrew meant. The reason we can require
> atomic_cmpxchg of all architectures is because it is only guaranteed
> to work on atomic_t.
>
> Glad to hear it won't be a problem for you though.
>
Could someone with an m68k compiler please send the patch?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
@ 2006-03-06 4:44 Suzanne Wood
2006-03-06 15:35 ` Dipankar Sarma
0 siblings, 1 reply; 12+ messages in thread
From: Suzanne Wood @ 2006-03-06 4:44 UTC (permalink / raw)
To: bunk; +Cc: linux-kernel, nickpiggin, paulmck
> From: Adrian Bunk Fri Mar 03 2006 - 18:40:57 EST
>
> On Fri, Mar 03, 2006 at 03:22:42PM -0800, Linus Torvalds wrote:
>>
>> On Sat, 4 Mar 2006, Adrian Bunk wrote:
>> >
>> > It seems the problem is that in the CONFIG_RMW_INSNS=n
>> > case, there's no
>> > cmpxchg #define in include/asm-m68k/system.h required for the
>> > atomic_add_unless #define in include/asm-m68k/atomic.h.
>>
>> Hmm. It seems like it never has been there.. Do you know what brought this
>> on? Was it Nick's RCU changes from "rcuref_dec_and_test()" to
>> "atomic_dec_and_test()" and friends?
>
> It was Nick's commit 8426e1f6af0fd7f44d040af7263750c5a52f3cc3 that added
> atomic_inc_not_zero(), and Nick's patch that changed fs/file_table.c
> from rcuref_dec_and_test() to atomic_dec_and_test() exposed this
> problem.
Do kernel coders value the marking of the rcu read-side critical
section for consistency? In fs/file_table.c, fcheck_files()
is called by fget_light() without rcu_read_lock() in one case,
but with the apparently necessary rcu_read_lock() in place
otherwise. The struct file pointer that fcheck_files() returns
is rcu_dereference(fdt->fd[fd]) or NULL. Does the _commented_guarantee
of the current task holding the refcnt assure there's no need to
check for NULL or to mark the rcu readside section around the first
call to fcheck_files()?
This is the code sample:
/*
* Lightweight file lookup - no refcnt increment if fd table isn't shared.
* You can use this only if it is guranteed that the current task already
* holds a refcnt to that file. That check has to be done at fget() only
* and a flag is returned to be passed to the corresponding fput_light().
* There must not be a cloning between an fget_light/fput_light pair.
*/
struct file fastcall *fget_light(unsigned int fd, int *fput_needed)
{
struct file *file;
struct files_struct *files = current->files;
*fput_needed = 0;
if (likely((atomic_read(&files->count) == 1))) {
file = fcheck_files(files, fd);
} else {
rcu_read_lock();
file = fcheck_files(files, fd);
if (file) {
if (atomic_inc_not_zero(&file->f_count))
*fput_needed = 1;
else
/* Didn't get the reference, someone's freed */
file = NULL;
}
rcu_read_unlock();
}
return file;
}
The attached patch would superficially address the rcu
discrepancy, but another underlying question is about the
desired extent of the rcu read-side critical section in that
fget_light() returns the pointer to the file struct that was
flagged for rcu protection by rcu_dereference() in
fcheck_files(). In this application, does it make sense to
push the rcu_read_lock() into fcheck_files() or add it there
or to extend it to the calling function?
Up the call tree, we note that fcheck() uses fcheck_files(),
but the only call to fcheck() nested in rcu_read_lock() is
in the disparaged irixioctl.c.
Are the other calls to fcheck() under circumstances that give
reason for the rcu_read_lock elision, like
spin_lock(&files->file_lock) in fs/fcntl.c, or being in the
context of applying locks in fs/locks.c, or calls from assembly
code in arch/sparc/kernel/sunos_ioctl.c & solaris/socksys.c.
If there is reason to pursue the insertion of the
rcu_read_lock/unlock() pairs in these circumstances, any
suggestions would be appreciated in order to dispel the question
altogether or to try to submit a more extensive patch.
Thank you.
Suzanne
-
file_table.c | 2 ++
1 files changed, 2 insertions(+)
---------------------------------
--- /testbed2/linux-2.6.16-rc5/fs/file_table.c 2006-02-26 21:09:35.000000000 -0800
+++ /testbed1/linux-2.6.16-rc5/fs/file_table.c 2006-03-05 14:36:46.000000000 -0800
@@ -194,7 +194,9 @@ struct file fastcall *fget_light(unsigne
*fput_needed = 0;
if (likely((atomic_read(&files->count) == 1))) {
+ rcu_read_lock();
file = fcheck_files(files, fd);
+ rcu_read_unlock();
} else {
rcu_read_lock();
file = fcheck_files(files, fd);
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-06 4:44 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken Suzanne Wood
@ 2006-03-06 15:35 ` Dipankar Sarma
0 siblings, 0 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-03-06 15:35 UTC (permalink / raw)
To: Suzanne Wood; +Cc: bunk, linux-kernel, nickpiggin, paulmck
On Sun, Mar 05, 2006 at 08:44:15PM -0800, Suzanne Wood wrote:
> > From: Adrian Bunk Fri Mar 03 2006 - 18:40:57 EST
> >
> Do kernel coders value the marking of the rcu read-side critical
> section for consistency? In fs/file_table.c, fcheck_files()
Generally speaking, yes.
> is called by fget_light() without rcu_read_lock() in one case,
> but with the apparently necessary rcu_read_lock() in place
> otherwise. The struct file pointer that fcheck_files() returns
> is rcu_dereference(fdt->fd[fd]) or NULL. Does the _commented_guarantee
> of the current task holding the refcnt assure there's no need to
> check for NULL or to mark the rcu readside section around the first
> call to fcheck_files()?
>
> This is the code sample:
> /*
> * Lightweight file lookup - no refcnt increment if fd table isn't shared.
> * You can use this only if it is guranteed that the current task already
> * holds a refcnt to that file. That check has to be done at fget() only
> * and a flag is returned to be passed to the corresponding fput_light().
> * There must not be a cloning between an fget_light/fput_light pair.
> */
> struct file fastcall *fget_light(unsigned int fd, int *fput_needed)
> {
> struct file *file;
> struct files_struct *files = current->files;
>
> *fput_needed = 0;
> if (likely((atomic_read(&files->count) == 1))) {
> file = fcheck_files(files, fd);
> } else {
This means that the fd table is not shared between threads. So,
there can't be any race and no need to protect using
rcu_read_lock()/rcu_read_unlock().
>
> The attached patch would superficially address the rcu
> discrepancy, but another underlying question is about the
> desired extent of the rcu read-side critical section in that
> fget_light() returns the pointer to the file struct that was
> flagged for rcu protection by rcu_dereference() in
> fcheck_files(). In this application, does it make sense to
> push the rcu_read_lock() into fcheck_files() or add it there
> or to extend it to the calling function?
I think a comment there explaining why rcu_read_lock/unlock
pair is not there should be sufficient. While the are NOP
for non-PREEMPT kernels, they do have a cost otherwise.
Avoiding them if we can is a good idea, IMO.
> Up the call tree, we note that fcheck() uses fcheck_files(),
> but the only call to fcheck() nested in rcu_read_lock() is
> in the disparaged irixioctl.c.
>
> Are the other calls to fcheck() under circumstances that give
> reason for the rcu_read_lock elision, like
> spin_lock(&files->file_lock) in fs/fcntl.c, or being in the
> context of applying locks in fs/locks.c, or calls from assembly
> code in arch/sparc/kernel/sunos_ioctl.c & solaris/socksys.c.
> If there is reason to pursue the insertion of the
> rcu_read_lock/unlock() pairs in these circumstances, any
> suggestions would be appreciated in order to dispel the question
> altogether or to try to submit a more extensive patch.
It depends on whether the fdtable is shared or not and if
shared whether we are already holding the ->files_lock or
not. The key is that if it is lock-free and if the fdtable
is shared, they rcu_read_lock()/unlock() pair must be
there, otherwise it is a bug.
Thanks
Dipankar
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
@ 2006-03-06 16:13 Suzanne Wood
2006-03-06 16:48 ` Dipankar Sarma
0 siblings, 1 reply; 12+ messages in thread
From: Suzanne Wood @ 2006-03-06 16:13 UTC (permalink / raw)
To: dipankar; +Cc: bunk, linux-kernel, nickpiggin, paulmck
Thank you very much.
> From dipankar@in.ibm.com Mon Mar 6 07:36:50 2006
> On Sun, Mar 05, 2006 at 08:44:15PM -0800, Suzanne Wood wrote:
> > > From: Adrian Bunk Fri Mar 03 2006 - 18:40:57 EST
> > >
> > Do kernel coders value the marking of the rcu read-side critical
> > section for consistency? In fs/file_table.c, fcheck_files()
> Generally speaking, yes.
> > is called by fget_light() without rcu_read_lock() in one case,
> > but with the apparently necessary rcu_read_lock() in place
> > otherwise. The struct file pointer that fcheck_files() returns
> > is rcu_dereference(fdt->fd[fd]) or NULL. Does the _commented_guarantee
> > of the current task holding the refcnt assure there's no need to
> > check for NULL or to mark the rcu readside section around the first
> > call to fcheck_files()?
> >
> > This is the code sample:
> > /*
> > * Lightweight file lookup - no refcnt increment if fd table isn't shared.
> > * You can use this only if it is guranteed that the current task already
> > * holds a refcnt to that file. That check has to be done at fget() only
> > * and a flag is returned to be passed to the corresponding fput_light().
> > * There must not be a cloning between an fget_light/fput_light pair.
> > */
> > struct file fastcall *fget_light(unsigned int fd, int *fput_needed)
> > {
> > struct file *file;
> > struct files_struct *files = current->files;
> >
> > *fput_needed = 0;
> > if (likely((atomic_read(&files->count) == 1))) {
> > file = fcheck_files(files, fd);
> > } else {
> This means that the fd table is not shared between threads. So,
> there can't be any race and no need to protect using
> rcu_read_lock()/rcu_read_unlock().
Then why call fcheck_files() with the rcu_dereference() which would flag
an automated check for the need to mark a read-side critical section?
Would it make sense to introduce the function that doesn't? The goal of
keeping the kernel small is balanced with clarity. The inconsistency of
how fcheck_files() is used within a single function (fget_light()) was
my opening question.
> > The attached patch would superficially address the rcu
> > discrepancy, but another underlying question is about the
> > desired extent of the rcu read-side critical section in that
> > fget_light() returns the pointer to the file struct that was
> > flagged for rcu protection by rcu_dereference() in
> > fcheck_files(). In this application, does it make sense to
> > push the rcu_read_lock() into fcheck_files() or add it there
> > or to extend it to the calling function?
> I think a comment there explaining why rcu_read_lock/unlock
> pair is not there should be sufficient. While the are NOP
> for non-PREEMPT kernels, they do have a cost otherwise.
> Avoiding them if we can is a good idea, IMO.
> > Up the call tree, we note that fcheck() uses fcheck_files(),
> > but the only call to fcheck() nested in rcu_read_lock() is
> > in the disparaged irixioctl.c.
> >
> > Are the other calls to fcheck() under circumstances that give
> > reason for the rcu_read_lock elision, like
> > spin_lock(&files->file_lock) in fs/fcntl.c, or being in the
> > context of applying locks in fs/locks.c, or calls from assembly
> > code in arch/sparc/kernel/sunos_ioctl.c & solaris/socksys.c.
> > If there is reason to pursue the insertion of the
> > rcu_read_lock/unlock() pairs in these circumstances, any
> > suggestions would be appreciated in order to dispel the question
> > altogether or to try to submit a more extensive patch.
> It depends on whether the fdtable is shared or not and if
> shared whether we are already holding the ->files_lock or
> not. The key is that if it is lock-free and if the fdtable
> is shared, they rcu_read_lock()/unlock() pair must be
> there, otherwise it is a bug.
> Thanks
> Dipankar
Many thanks again.
Suzanne
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
2006-03-06 16:13 Suzanne Wood
@ 2006-03-06 16:48 ` Dipankar Sarma
0 siblings, 0 replies; 12+ messages in thread
From: Dipankar Sarma @ 2006-03-06 16:48 UTC (permalink / raw)
To: Suzanne Wood; +Cc: bunk, linux-kernel, nickpiggin, paulmck
On Mon, Mar 06, 2006 at 08:13:41AM -0800, Suzanne Wood wrote:
> Thank you very much.
> > > struct file fastcall *fget_light(unsigned int fd, int *fput_needed)
> > > {
> > > struct file *file;
> > > struct files_struct *files = current->files;
> > >
> > > *fput_needed = 0;
> > > if (likely((atomic_read(&files->count) == 1))) {
> > > file = fcheck_files(files, fd);
> > > } else {
>
> > This means that the fd table is not shared between threads. So,
> > there can't be any race and no need to protect using
> > rcu_read_lock()/rcu_read_unlock().
>
> Then why call fcheck_files() with the rcu_dereference() which would flag
> an automated check for the need to mark a read-side critical section?
> Would it make sense to introduce the function that doesn't? The goal of
> keeping the kernel small is balanced with clarity. The inconsistency of
> how fcheck_files() is used within a single function (fget_light()) was
> my opening question.
Because rcu_dereference() hurts only alpha and we don't care about
alpha :-)
Just kidding!
Good point about automated checkers. However, this isn't an
uncommon thing in multi-threaded programs - can't the checker
rules be written to take into account sharing and non-sharing of
the object in question ?
Thanks
Dipankar
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken
@ 2006-03-07 1:14 Suzanne Wood
0 siblings, 0 replies; 12+ messages in thread
From: Suzanne Wood @ 2006-03-07 1:14 UTC (permalink / raw)
To: dipankar; +Cc: linux-kernel, paulmck, walpole
Hello and thank you again.
> From dipankar@in.ibm.com Mon Mar 6 08:49:55 2006
> On Mon, Mar 06, 2006 at 08:13:41AM -0800, Suzanne Wood wrote:
> > Thank you very much.
> > > > struct file fastcall *fget_light(unsigned int fd, int *fput_needed)
> > > > {
> > > > struct file *file;
> > > > struct files_struct *files = current->files;
> > > >
> > > > *fput_needed = 0;
> > > > if (likely((atomic_read(&files->count) == 1))) {
> > > > file = fcheck_files(files, fd);
> > > > } else {
> >
> > > This means that the fd table is not shared between threads. So,
> > > there can't be any race and no need to protect using
> > > rcu_read_lock()/rcu_read_unlock().
> >
> > Then why call fcheck_files() with the rcu_dereference() which would flag
> > an automated check for the need to mark a read-side critical section?
> > Would it make sense to introduce the function that doesn't? The goal of
> > keeping the kernel small is balanced with clarity. The inconsistency of
> > how fcheck_files() is used within a single function (fget_light()) was
> > my opening question.
> Because rcu_dereference() hurts only alpha and we don't care about
> alpha :-)
> Just kidding!
> Good point about automated checkers. However, this isn't an
> uncommon thing in multi-threaded programs - can't the checker
> rules be written to take into account sharing and non-sharing of
> the object in question ?
Henzinger, et al., UC Berkeley, describe race checking on a
language for networked embedded systems NES-C using the atomic
keyword to delimit sections. (The rcu_read_lock() would be
similar in disallowing interrupts.) When flow-based analysis
returned false positives, the programmer could annotate the
code with "norace" and in practice all shared accesses were
put in atomic sections even if there were no actual race
conditions. To limit the number of atomic sections, the
UCB group modeled multiple threads, triggered hardware
interrupts and interleaved tasks and checked for safe access
and did manual corrections to the unsafe code.
In fget_light(), the rcu_dereference() is apparently never
intended in the "if true" of the conditional where
(likely((atomic_read(&files->count) == 1), so only one file
descriptor is open for the current task at that instant. (A
child process could share that descriptor, but an unrelated
process would have its own file descriptor.)
But fget_light() does return the file pointer which _some_ of
the time does require rcu-protection, so hypothetically, a
checker flags it as unsafe if no rcu_read_lock() is in place
in a caller at some level and checking can proceed to other
locking.
The core premises have been that a path through the code
that contains rcu_dereference() or rcu_assign_pointer() is
matched to the assign/deref counterpart with the same struct
object type and the rcu_dereference() is nested in a read-side
critical section delimited by rcu_read_lock() and
rcu_read_unlock() used to determine the extent of the duration
of the struct at the address.
The pointer to the file struct that fcheck_files() returns is
rcu_dereference(fdt->fd[fd]) and open.c has fd_install() call
rcu_assign_pointer(fdt->fd[fd], file). In file_table.c,
file_free() calls call_rcu(&f->f_u.fu_rcuhead, file_free_rcu),
where the fu_rcuhead is a field of the file struct and
file_free_rcu() calls kmem_cache_free().
Thank you very much for your insights into the reasoning.
Suzanne
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2006-03-07 1:15 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-06 4:44 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken Suzanne Wood
2006-03-06 15:35 ` Dipankar Sarma
-- strict thread matches above, loose matches on Subject: below --
2006-03-07 1:14 Suzanne Wood
2006-03-06 16:13 Suzanne Wood
2006-03-06 16:48 ` Dipankar Sarma
2006-02-27 5:27 Linux v2.6.16-rc5 Linus Torvalds
2006-03-03 23:01 ` 2.6.16-rc regression: m68k CONFIG_RMW_INSNS=n compile broken Adrian Bunk
2006-03-03 23:22 ` Linus Torvalds
2006-03-03 23:43 ` Adrian Bunk
2006-03-03 23:59 ` Andrew Morton
2006-03-04 14:01 ` Roman Zippel
2006-03-04 14:12 ` Nick Piggin
2006-03-04 20:28 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox