* [2/10 PATCH] inline inline-generic_writepages.patch
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
@ 2008-06-24 5:56 ` Mikulas Patocka
2008-06-24 5:57 ` [3/10 PATCH] inline wake_up_bit Mikulas Patocka
` (7 subsequent siblings)
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 5:56 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Make generic_writepages inlinable in current file.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/mm/page-writeback.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/mm/page-writeback.c 2008-06-24 07:28:14.000000000 +0200
+++ linux-2.6.26-rc7-devel/mm/page-writeback.c 2008-06-24 07:37:15.000000000 +0200
@@ -981,8 +981,8 @@
* This is a library function, which implements the writepages()
* address_space_operation.
*/
-int generic_writepages(struct address_space *mapping,
- struct writeback_control *wbc)
+__always_inline int generic_writepages(struct address_space *mapping,
+ struct writeback_control *wbc)
{
/* deal with chardevs and other special file */
if (!mapping->a_ops->writepage)
^ permalink raw reply [flat|nested] 31+ messages in thread* [3/10 PATCH] inline wake_up_bit
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
2008-06-24 5:56 ` [2/10 PATCH] inline inline-generic_writepages.patch Mikulas Patocka
@ 2008-06-24 5:57 ` Mikulas Patocka
2008-06-25 14:17 ` Denys Vlasenko
2008-06-24 5:57 ` [4/10 PATCH] inline __wake_up_bit Mikulas Patocka
` (6 subsequent siblings)
8 siblings, 1 reply; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 5:57 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Inline wake_up_bit. The function just pases arguments around.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/include/linux/wait.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/wait.h 2008-06-24 07:28:13.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/wait.h 2008-06-24 07:37:20.000000000 +0200
@@ -147,11 +147,32 @@
void __wake_up_bit(wait_queue_head_t *, void *, int);
int __wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
-void wake_up_bit(void *, int);
int out_of_line_wait_on_bit(void *, int, int (*)(void *), unsigned);
int out_of_line_wait_on_bit_lock(void *, int, int (*)(void *), unsigned);
wait_queue_head_t *bit_waitqueue(void *, int);
+/**
+ * wake_up_bit - wake up a waiter on a bit
+ * @word: the word being waited on, a kernel virtual address
+ * @bit: the bit of the word being waited on
+ *
+ * There is a standard hashed waitqueue table for generic use. This
+ * is the part of the hashtable's accessor API that wakes up waiters
+ * on a bit. For instance, if one were to have waiters on a bitflag,
+ * one would call wake_up_bit() after clearing the bit.
+ *
+ * In order for this to function properly, as it uses waitqueue_active()
+ * internally, some kind of memory barrier must be done prior to calling
+ * this. Typically, this will be smp_mb__after_clear_bit(), but in some
+ * cases where bitflags are manipulated non-atomically under a lock, one
+ * may need to use a less regular barrier, such fs/inode.c's smp_mb(),
+ * because spin_unlock() does not guarantee a memory barrier.
+ */
+static __always_inline void wake_up_bit(void *word, int bit)
+{
+ __wake_up_bit(bit_waitqueue(word, bit), word, bit);
+}
+
#define wake_up(x) __wake_up(x, TASK_NORMAL, 1, NULL)
#define wake_up_nr(x, nr) __wake_up(x, TASK_NORMAL, nr, NULL)
#define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL)
Index: linux-2.6.26-rc7-devel/kernel/wait.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/wait.c 2008-06-24 07:28:14.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/wait.c 2008-06-24 07:37:20.000000000 +0200
@@ -219,29 +219,6 @@
}
EXPORT_SYMBOL(__wake_up_bit);
-/**
- * wake_up_bit - wake up a waiter on a bit
- * @word: the word being waited on, a kernel virtual address
- * @bit: the bit of the word being waited on
- *
- * There is a standard hashed waitqueue table for generic use. This
- * is the part of the hashtable's accessor API that wakes up waiters
- * on a bit. For instance, if one were to have waiters on a bitflag,
- * one would call wake_up_bit() after clearing the bit.
- *
- * In order for this to function properly, as it uses waitqueue_active()
- * internally, some kind of memory barrier must be done prior to calling
- * this. Typically, this will be smp_mb__after_clear_bit(), but in some
- * cases where bitflags are manipulated non-atomically under a lock, one
- * may need to use a less regular barrier, such fs/inode.c's smp_mb(),
- * because spin_unlock() does not guarantee a memory barrier.
- */
-void wake_up_bit(void *word, int bit)
-{
- __wake_up_bit(bit_waitqueue(word, bit), word, bit);
-}
-EXPORT_SYMBOL(wake_up_bit);
-
wait_queue_head_t *bit_waitqueue(void *word, int bit)
{
const int shift = BITS_PER_LONG == 32 ? 5 : 6;
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [3/10 PATCH] inline wake_up_bit
2008-06-24 5:57 ` [3/10 PATCH] inline wake_up_bit Mikulas Patocka
@ 2008-06-25 14:17 ` Denys Vlasenko
2008-06-25 14:36 ` Mikulas Patocka
2008-06-25 22:30 ` David Miller
0 siblings, 2 replies; 31+ messages in thread
From: Denys Vlasenko @ 2008-06-25 14:17 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: linux-kernel, sparclinux, davem
On Tuesday 24 June 2008 07:57, Mikulas Patocka wrote:
> Inline wake_up_bit. The function just pases arguments around.
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
>
> int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
> -void wake_up_bit(void *, int);
> int out_of_line_wait_on_bit(void *, int, int (*)(void *), unsigned);
> +static __always_inline void wake_up_bit(void *word, int bit)
> +{
> + __wake_up_bit(bit_waitqueue(word, bit), word, bit);
> +}
So now every call to wake_up_bit(word, bit) now is converted to:
__wake_up_bit(bit_waitqueue(word, bit), word, bit);
which is in turn converted to (looking into your next patch):
{
wait_queue_head_t *wq = bit_waitqueue(word, bit);
struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
if (waitqueue_active(wq))
__wake_up(wq, TASK_NORMAL, 1, &key);
}
which is in turn converted to (looking into your other patch):
{
wait_queue_head_t *wq = bit_waitqueue(word, bit);
struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
if (waitqueue_active(wq))
{
unsigned long flags;
spin_lock_irqsave(&qw->lock, flags);
__wake_up_common(wq, TASK_NORMAL, 1, 0, &key);
spin_unlock_irqrestore(&wq->lock, flags);
}
}
And you know what? This is likely not the end yet! It's possible
spin_lock_irqXXX, __wake_up_common, waitqueue_active or bit_waitqueue
are inlines - I didn't check.
--
vda
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 14:17 ` Denys Vlasenko
@ 2008-06-25 14:36 ` Mikulas Patocka
2008-06-25 15:24 ` Denys Vlasenko
2008-06-25 22:30 ` David Miller
1 sibling, 1 reply; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-25 14:36 UTC (permalink / raw)
To: Denys Vlasenko; +Cc: linux-kernel, sparclinux, davem
On Wed, 25 Jun 2008, Denys Vlasenko wrote:
> On Tuesday 24 June 2008 07:57, Mikulas Patocka wrote:
>> Inline wake_up_bit. The function just pases arguments around.
>>
>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
>>
>> int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
>> -void wake_up_bit(void *, int);
>> int out_of_line_wait_on_bit(void *, int, int (*)(void *), unsigned);
>
>> +static __always_inline void wake_up_bit(void *word, int bit)
>> +{
>> + __wake_up_bit(bit_waitqueue(word, bit), word, bit);
>> +}
>
> So now every call to wake_up_bit(word, bit) now is converted to:
>
> __wake_up_bit(bit_waitqueue(word, bit), word, bit);
>
> which is in turn converted to (looking into your next patch):
>
> {
> wait_queue_head_t *wq = bit_waitqueue(word, bit);
> struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
> if (waitqueue_active(wq))
> __wake_up(wq, TASK_NORMAL, 1, &key);
> }
>
> which is in turn converted to (looking into your other patch):
>
> {
> wait_queue_head_t *wq = bit_waitqueue(word, bit);
> struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
> if (waitqueue_active(wq))
> {
> unsigned long flags;
> spin_lock_irqsave(&qw->lock, flags);
> __wake_up_common(wq, TASK_NORMAL, 1, 0, &key);
> spin_unlock_irqrestore(&wq->lock, flags);
> }
> }
>
> And you know what? This is likely not the end yet! It's possible
> spin_lock_irqXXX, __wake_up_common, waitqueue_active or bit_waitqueue
> are inlines - I didn't check.
> --
> vda
Yes, that's 0.2% code size increase (or none increase, if drop
inline-__wake_up_bit.patch and apply only the other patches). To me it
seems crazy, how this code was refactored again and again over time, up to
8 levels of functions (including passing a pointer to a method). In 2.0.x
kernel series, it was just a single call to wake up a queue.
Mikulas
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 14:36 ` Mikulas Patocka
@ 2008-06-25 15:24 ` Denys Vlasenko
2008-06-25 16:01 ` Mikulas Patocka
2008-06-25 22:23 ` David Miller
0 siblings, 2 replies; 31+ messages in thread
From: Denys Vlasenko @ 2008-06-25 15:24 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: linux-kernel, sparclinux, davem
On Wednesday 25 June 2008 16:36, Mikulas Patocka wrote:
> > On Tuesday 24 June 2008 07:57, Mikulas Patocka wrote:
> >> Inline wake_up_bit. The function just pases arguments around.
> >>
> >> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> >>
> >> int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
> >> -void wake_up_bit(void *, int);
> >> int out_of_line_wait_on_bit(void *, int, int (*)(void *), unsigned);
> >
> >> +static __always_inline void wake_up_bit(void *word, int bit)
> >> +{
> >> + __wake_up_bit(bit_waitqueue(word, bit), word, bit);
> >> +}
> >
> > So now every call to wake_up_bit(word, bit) now is converted to:
> >
> > __wake_up_bit(bit_waitqueue(word, bit), word, bit);
> >
> > which is in turn converted to (looking into your next patch):
> >
> > {
> > wait_queue_head_t *wq = bit_waitqueue(word, bit);
> > struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
> > if (waitqueue_active(wq))
> > __wake_up(wq, TASK_NORMAL, 1, &key);
> > }
> >
> > which is in turn converted to (looking into your other patch):
> >
> > {
> > wait_queue_head_t *wq = bit_waitqueue(word, bit);
> > struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
> > if (waitqueue_active(wq))
> > {
> > unsigned long flags;
> > spin_lock_irqsave(&qw->lock, flags);
> > __wake_up_common(wq, TASK_NORMAL, 1, 0, &key);
> > spin_unlock_irqrestore(&wq->lock, flags);
> > }
> > }
> >
> > And you know what? This is likely not the end yet! It's possible
> > spin_lock_irqXXX, __wake_up_common, waitqueue_active or bit_waitqueue
> > are inlines - I didn't check.
> > --
> > vda
>
> Yes, that's 0.2% code size increase
...In just 17 callsites in entire kernel.
> (or none increase, if drop
> inline-__wake_up_bit.patch and apply only the other patches).
Now this is a better approach - to actually see how many
callsites are there, and inlining only where makes sense.
But in practice it's hard to do and also is changing all the time
during development. What is optimal today won't be optimal in
2.6.45 :)
Ingo's suggestion to talk to gcc people to remedy
insane call convention sounds as a more workable solution.
BTW, i386 uses regparm call convention, is similar trick
possible for sparc64?
> To me it
> seems crazy, how this code was refactored again and again over time, up to
> 8 levels of functions (including passing a pointer to a method). In 2.0.x
> kernel series, it was just a single call to wake up a queue.
Yes, probably... If you can simplify it, everyone will be glad.
--
vda
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 15:24 ` Denys Vlasenko
@ 2008-06-25 16:01 ` Mikulas Patocka
2008-06-25 20:37 ` Denys Vlasenko
2008-06-25 22:23 ` David Miller
1 sibling, 1 reply; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-25 16:01 UTC (permalink / raw)
To: Denys Vlasenko; +Cc: linux-kernel, sparclinux, davem
>>> And you know what? This is likely not the end yet! It's possible
>>> spin_lock_irqXXX, __wake_up_common, waitqueue_active or bit_waitqueue
>>> are inlines - I didn't check.
>>> --
>>> vda
>>
>> Yes, that's 0.2% code size increase
>
> ...In just 17 callsites in entire kernel.
>
>> (or none increase, if drop
>> inline-__wake_up_bit.patch and apply only the other patches).
>
> Now this is a better approach - to actually see how many
> callsites are there, and inlining only where makes sense.
> But in practice it's hard to do and also is changing all the time
> during development. What is optimal today won't be optimal in
> 2.6.45 :)
>
> Ingo's suggestion to talk to gcc people to remedy
> insane call convention sounds as a more workable solution.
>
> BTW, i386 uses regparm call convention, is similar trick
> possible for sparc64?
Sparc64 has register windows: it passes arguments in registers, but it
must allocate space for that registers. If the call stack is too deep (8
levels), the CPU runs out of registers and starts spilling the registers
of the function 8-levels-deep to the stack.
The stack usage could be reduced to 176 bytes with little work from gcc
developers and to 128 bytes with more work (ABI change). If you wanted to
go below 128 bytes, you could use one register to indicate number of used
registers and modify the spill/fill handlers to load only that number of
registers and reduce the stack usage even more --- that would be a big
code change in both gcc and linux.
Mikulas
>> To me it
>> seems crazy, how this code was refactored again and again over time, up to
>> 8 levels of functions (including passing a pointer to a method). In 2.0.x
>> kernel series, it was just a single call to wake up a queue.
>
> Yes, probably... If you can simplify it, everyone will be glad.
> --
> vda
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 16:01 ` Mikulas Patocka
@ 2008-06-25 20:37 ` Denys Vlasenko
2008-06-26 0:28 ` David Miller
0 siblings, 1 reply; 31+ messages in thread
From: Denys Vlasenko @ 2008-06-25 20:37 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: linux-kernel, sparclinux, davem
On Wednesday 25 June 2008 18:01, Mikulas Patocka wrote:
> > Ingo's suggestion to talk to gcc people to remedy
> > insane call convention sounds as a more workable solution.
> >
> > BTW, i386 uses regparm call convention, is similar trick
> > possible for sparc64?
>
> Sparc64 has register windows: it passes arguments in registers, but it
> must allocate space for that registers. If the call stack is too deep (8
> levels), the CPU runs out of registers and starts spilling the registers
> of the function 8-levels-deep to the stack.
>
> The stack usage could be reduced to 176 bytes with little work from gcc
> developers and to 128 bytes with more work (ABI change). If you wanted to
Wow, it's nearly x2 reduction.
ABI change in not a problem for kernel, since it is a "freestanding
application". Exactly like i386 switched to regparm, which is a different ABI.
> go below 128 bytes, you could use one register to indicate number of used
> registers and modify the spill/fill handlers to load only that number of
> registers and reduce the stack usage even more --- that would be a big
> code change in both gcc and linux.
--
vda
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 20:37 ` Denys Vlasenko
@ 2008-06-26 0:28 ` David Miller
2008-06-26 3:35 ` Denys Vlasenko
2008-06-26 18:22 ` Pavel Machek
0 siblings, 2 replies; 31+ messages in thread
From: David Miller @ 2008-06-26 0:28 UTC (permalink / raw)
To: vda.linux; +Cc: mpatocka, linux-kernel, sparclinux
From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Wed, 25 Jun 2008 22:37:58 +0200
> On Wednesday 25 June 2008 18:01, Mikulas Patocka wrote:
> > > Ingo's suggestion to talk to gcc people to remedy
> > > insane call convention sounds as a more workable solution.
> > >
> > > BTW, i386 uses regparm call convention, is similar trick
> > > possible for sparc64?
> >
> > Sparc64 has register windows: it passes arguments in registers, but it
> > must allocate space for that registers. If the call stack is too deep (8
> > levels), the CPU runs out of registers and starts spilling the registers
> > of the function 8-levels-deep to the stack.
> >
> > The stack usage could be reduced to 176 bytes with little work from gcc
> > developers and to 128 bytes with more work (ABI change). If you wanted to
>
> Wow, it's nearly x2 reduction.
>
> ABI change in not a problem for kernel, since it is a "freestanding
> application". Exactly like i386 switched to regparm, which is a different ABI.
Except that nobody has written this code and therefore being about to
use this unimplemented compiler facility to get correctness is not
tenable.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-26 0:28 ` David Miller
@ 2008-06-26 3:35 ` Denys Vlasenko
2008-06-26 4:18 ` David Miller
2008-06-26 18:22 ` Pavel Machek
1 sibling, 1 reply; 31+ messages in thread
From: Denys Vlasenko @ 2008-06-26 3:35 UTC (permalink / raw)
To: David Miller; +Cc: mpatocka, linux-kernel, sparclinux
On Thursday 26 June 2008 02:28, David Miller wrote:
> From: Denys Vlasenko <vda.linux@googlemail.com>
> Date: Wed, 25 Jun 2008 22:37:58 +0200
> > > Sparc64 has register windows: it passes arguments in registers, but it
> > > must allocate space for that registers. If the call stack is too deep (8
> > > levels), the CPU runs out of registers and starts spilling the registers
> > > of the function 8-levels-deep to the stack.
> > >
> > > The stack usage could be reduced to 176 bytes with little work from gcc
> > > developers and to 128 bytes with more work (ABI change). If you wanted to
> >
> > Wow, it's nearly x2 reduction.
> >
> > ABI change in not a problem for kernel, since it is a "freestanding
> > application". Exactly like i386 switched to regparm, which is a different ABI.
>
> Except that nobody has written this code and therefore being about to
> use this unimplemented compiler facility to get correctness is not
> tenable.
Inlining everything is even less tenable. Why architectures which do not
require 128+ bytes of stack for every function call should suffer?
I am all for fixing code where there are extra useless levels of calls,
but in this example I pointed out that patch adds inlines too liberally.
Do you agree that blowing up every wake_up_bit() into half a dozen
or more C lines is not what we want?
--
vda
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-26 3:35 ` Denys Vlasenko
@ 2008-06-26 4:18 ` David Miller
0 siblings, 0 replies; 31+ messages in thread
From: David Miller @ 2008-06-26 4:18 UTC (permalink / raw)
To: vda.linux; +Cc: mpatocka, linux-kernel, sparclinux
From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Thu, 26 Jun 2008 05:35:59 +0200
> On Thursday 26 June 2008 02:28, David Miller wrote:
> > From: Denys Vlasenko <vda.linux@googlemail.com>
> > Date: Wed, 25 Jun 2008 22:37:58 +0200
> > > > Sparc64 has register windows: it passes arguments in registers, but it
> > > > must allocate space for that registers. If the call stack is too deep (8
> > > > levels), the CPU runs out of registers and starts spilling the registers
> > > > of the function 8-levels-deep to the stack.
> > > >
> > > > The stack usage could be reduced to 176 bytes with little work from gcc
> > > > developers and to 128 bytes with more work (ABI change). If you wanted to
> > >
> > > Wow, it's nearly x2 reduction.
> > >
> > > ABI change in not a problem for kernel, since it is a "freestanding
> > > application". Exactly like i386 switched to regparm, which is a different ABI.
> >
> > Except that nobody has written this code and therefore being about to
> > use this unimplemented compiler facility to get correctness is not
> > tenable.
>
> Inlining everything is even less tenable.
I never suggested this. Although I do support inlining the
cases which merely adjust the ordering of arguments being
passed to function calls, because such inlines are essentially
of zero cost and of gain to all platforms.
> I am all for fixing code where there are extra useless levels of calls,
> but in this example I pointed out that patch adds inlines too liberally.
> Do you agree that blowing up every wake_up_bit() into half a dozen
> or more C lines is not what we want?
I stated my suggested alternative to this in another posting, so of
course I do not support it as-is.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-26 0:28 ` David Miller
2008-06-26 3:35 ` Denys Vlasenko
@ 2008-06-26 18:22 ` Pavel Machek
1 sibling, 0 replies; 31+ messages in thread
From: Pavel Machek @ 2008-06-26 18:22 UTC (permalink / raw)
To: David Miller; +Cc: vda.linux, mpatocka, linux-kernel, sparclinux
On Wed 2008-06-25 17:28:38, David Miller wrote:
> From: Denys Vlasenko <vda.linux@googlemail.com>
> Date: Wed, 25 Jun 2008 22:37:58 +0200
>
> > On Wednesday 25 June 2008 18:01, Mikulas Patocka wrote:
> > > > Ingo's suggestion to talk to gcc people to remedy
> > > > insane call convention sounds as a more workable solution.
> > > >
> > > > BTW, i386 uses regparm call convention, is similar trick
> > > > possible for sparc64?
> > >
> > > Sparc64 has register windows: it passes arguments in registers, but it
> > > must allocate space for that registers. If the call stack is too deep (8
> > > levels), the CPU runs out of registers and starts spilling the registers
> > > of the function 8-levels-deep to the stack.
> > >
> > > The stack usage could be reduced to 176 bytes with little work from gcc
> > > developers and to 128 bytes with more work (ABI change). If you wanted to
> >
> > Wow, it's nearly x2 reduction.
> >
> > ABI change in not a problem for kernel, since it is a "freestanding
> > application". Exactly like i386 switched to regparm, which is a different ABI.
>
> Except that nobody has written this code and therefore being about to
> use this unimplemented compiler facility to get correctness is not
> tenable.
Switch to 32K stack on sparc64, then?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 15:24 ` Denys Vlasenko
2008-06-25 16:01 ` Mikulas Patocka
@ 2008-06-25 22:23 ` David Miller
1 sibling, 0 replies; 31+ messages in thread
From: David Miller @ 2008-06-25 22:23 UTC (permalink / raw)
To: vda.linux; +Cc: mpatocka, linux-kernel, sparclinux
From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Wed, 25 Jun 2008 17:24:57 +0200
> talk to gcc people to remedy insane call convention sounds as a more
> workable solution.
It's not a tenable solution to this problem. We have to make this
work properly with compiler tools currently available to users, rather
than versions of gcc that don't even exist yet.
And what's more, the amount we can get back from the current stack
size allocation is, at best, 16 bytes. The problem isn't going away
no matter what we do to the compiler.
> BTW, i386 uses regparm call convention, is similar trick
> possible for sparc64?
sparc64 already passes arguments in registers.
The stack frame is (predominantly, size wise) to accomodate the save
area for the cpu's register windows in non-leaf functions, and that
isn't going anywhere.
Back to the patch set, several of these non-inlined cases really
are extremely suboptimal. If it's just moving args around, it
should be inline. This would help even x86.
Also, suggesting that IRQSTACKS are mandatory for correct operation is
a pretty unreasonable thing to say. It's currently still optional on
the platforms where it is implemented, so I wonder if this means that
correct operation is optional on these platforms? :-)
I have a sparc64 IRQSTACKS implementation that I'm working on, but
it's not something I can stick into 2.6.26 by any stretch of the
imagination.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [3/10 PATCH] inline wake_up_bit
2008-06-25 14:17 ` Denys Vlasenko
2008-06-25 14:36 ` Mikulas Patocka
@ 2008-06-25 22:30 ` David Miller
1 sibling, 0 replies; 31+ messages in thread
From: David Miller @ 2008-06-25 22:30 UTC (permalink / raw)
To: vda.linux; +Cc: mpatocka, linux-kernel, sparclinux
From: Denys Vlasenko <vda.linux@googlemail.com>
Date: Wed, 25 Jun 2008 16:17:40 +0200
> And you know what? This is likely not the end yet! It's possible
> spin_lock_irqXXX, __wake_up_common, waitqueue_active or bit_waitqueue
> are inlines - I didn't check.
The thing to do in these kinds of scenerios is to provide top-level
routes in kernel/sched.c where all the inlining occurs, rather
than doing it in header files and thus expanding this stuff in
multiple places.
^ permalink raw reply [flat|nested] 31+ messages in thread
* [4/10 PATCH] inline __wake_up_bit
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
2008-06-24 5:56 ` [2/10 PATCH] inline inline-generic_writepages.patch Mikulas Patocka
2008-06-24 5:57 ` [3/10 PATCH] inline wake_up_bit Mikulas Patocka
@ 2008-06-24 5:57 ` Mikulas Patocka
2008-06-24 5:58 ` [5/10 PATCH] inline __wake_up Mikulas Patocka
` (5 subsequent siblings)
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 5:57 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Make __wake_up_bit inlines. This requires to move task state definitions from
sched.h to wait.h
This patch has the worst size-increase impact, increasing total kernel size
by 0.2%.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/include/linux/wait.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/wait.h 2008-06-24 07:37:20.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/wait.h 2008-06-24 07:37:25.000000000 +0200
@@ -25,6 +25,42 @@
#include <asm/system.h>
#include <asm/current.h>
+/*
+ * Task state bitmask. NOTE! These bits are also
+ * encoded in fs/proc/array.c: get_task_state().
+ *
+ * We have two separate sets of flags: task->state
+ * is about runnability, while task->exit_state are
+ * about the task exiting. Confusing, but this way
+ * modifying one set can't modify the other one by
+ * mistake.
+ */
+#define TASK_RUNNING 0
+#define TASK_INTERRUPTIBLE 1
+#define TASK_UNINTERRUPTIBLE 2
+#define __TASK_STOPPED 4
+#define __TASK_TRACED 8
+/* in tsk->exit_state */
+#define EXIT_ZOMBIE 16
+#define EXIT_DEAD 32
+/* in tsk->state again */
+#define TASK_DEAD 64
+#define TASK_WAKEKILL 128
+
+/* Convenience macros for the sake of set_task_state */
+#define TASK_KILLABLE (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
+#define TASK_STOPPED (TASK_WAKEKILL | __TASK_STOPPED)
+#define TASK_TRACED (TASK_WAKEKILL | __TASK_TRACED)
+
+/* Convenience macros for the sake of wake_up */
+#define TASK_NORMAL (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)
+#define TASK_ALL (TASK_NORMAL | __TASK_STOPPED | __TASK_TRACED)
+
+/* get_task_state() */
+#define TASK_REPORT (TASK_RUNNING | TASK_INTERRUPTIBLE | \
+ TASK_UNINTERRUPTIBLE | __TASK_STOPPED | \
+ __TASK_TRACED)
+
typedef struct __wait_queue wait_queue_t;
typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int sync, void *key);
int default_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
@@ -144,13 +180,19 @@
void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
extern void __wake_up_locked(wait_queue_head_t *q, unsigned int mode);
extern void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr);
-void __wake_up_bit(wait_queue_head_t *, void *, int);
int __wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
int out_of_line_wait_on_bit(void *, int, int (*)(void *), unsigned);
int out_of_line_wait_on_bit_lock(void *, int, int (*)(void *), unsigned);
wait_queue_head_t *bit_waitqueue(void *, int);
+static __always_inline void __wake_up_bit(wait_queue_head_t *wq, void *word, int bit)
+{
+ struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
+ if (waitqueue_active(wq))
+ __wake_up(wq, TASK_NORMAL, 1, &key);
+}
+
/**
* wake_up_bit - wake up a waiter on a bit
* @word: the word being waited on, a kernel virtual address
Index: linux-2.6.26-rc7-devel/kernel/wait.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/wait.c 2008-06-24 07:37:20.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/wait.c 2008-06-24 07:37:25.000000000 +0200
@@ -211,14 +211,6 @@
}
EXPORT_SYMBOL(out_of_line_wait_on_bit_lock);
-void __wake_up_bit(wait_queue_head_t *wq, void *word, int bit)
-{
- struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
- if (waitqueue_active(wq))
- __wake_up(wq, TASK_NORMAL, 1, &key);
-}
-EXPORT_SYMBOL(__wake_up_bit);
-
wait_queue_head_t *bit_waitqueue(void *word, int bit)
{
const int shift = BITS_PER_LONG == 32 ? 5 : 6;
Index: linux-2.6.26-rc7-devel/include/linux/sched.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/sched.h 2008-06-24 07:28:12.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/sched.h 2008-06-24 07:37:26.000000000 +0200
@@ -160,42 +160,6 @@
extern unsigned long long time_sync_thresh;
-/*
- * Task state bitmask. NOTE! These bits are also
- * encoded in fs/proc/array.c: get_task_state().
- *
- * We have two separate sets of flags: task->state
- * is about runnability, while task->exit_state are
- * about the task exiting. Confusing, but this way
- * modifying one set can't modify the other one by
- * mistake.
- */
-#define TASK_RUNNING 0
-#define TASK_INTERRUPTIBLE 1
-#define TASK_UNINTERRUPTIBLE 2
-#define __TASK_STOPPED 4
-#define __TASK_TRACED 8
-/* in tsk->exit_state */
-#define EXIT_ZOMBIE 16
-#define EXIT_DEAD 32
-/* in tsk->state again */
-#define TASK_DEAD 64
-#define TASK_WAKEKILL 128
-
-/* Convenience macros for the sake of set_task_state */
-#define TASK_KILLABLE (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
-#define TASK_STOPPED (TASK_WAKEKILL | __TASK_STOPPED)
-#define TASK_TRACED (TASK_WAKEKILL | __TASK_TRACED)
-
-/* Convenience macros for the sake of wake_up */
-#define TASK_NORMAL (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)
-#define TASK_ALL (TASK_NORMAL | __TASK_STOPPED | __TASK_TRACED)
-
-/* get_task_state() */
-#define TASK_REPORT (TASK_RUNNING | TASK_INTERRUPTIBLE | \
- TASK_UNINTERRUPTIBLE | __TASK_STOPPED | \
- __TASK_TRACED)
-
#define task_is_traced(task) ((task->state & __TASK_TRACED) != 0)
#define task_is_stopped(task) ((task->state & __TASK_STOPPED) != 0)
#define task_is_stopped_or_traced(task) \
@@ -2026,6 +1990,19 @@
return signal_pending(p) && __fatal_signal_pending(p);
}
+static __always_inline int signal_pending_state(long state, struct task_struct *p)
+{
+ if (!(state & (TASK_INTERRUPTIBLE | TASK_WAKEKILL)))
+ return 0;
+ if (!signal_pending(p))
+ return 0;
+
+ if (state & (__TASK_STOPPED | __TASK_TRACED))
+ return 0;
+
+ return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
+}
+
static inline int need_resched(void)
{
return unlikely(test_thread_flag(TIF_NEED_RESCHED));
^ permalink raw reply [flat|nested] 31+ messages in thread* [5/10 PATCH] inline __wake_up
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
` (2 preceding siblings ...)
2008-06-24 5:57 ` [4/10 PATCH] inline __wake_up_bit Mikulas Patocka
@ 2008-06-24 5:58 ` Mikulas Patocka
2008-06-24 5:59 ` [6/10 PATCH] inline default_wake_function Mikulas Patocka
` (4 subsequent siblings)
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 5:58 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Inline __wake_up and __wake_up_locked.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/include/linux/wait.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/wait.h 2008-06-24 07:37:25.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/wait.h 2008-06-24 07:37:31.000000000 +0200
@@ -177,8 +177,7 @@
list_del(&old->task_list);
}
-void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
-extern void __wake_up_locked(wait_queue_head_t *q, unsigned int mode);
+void __wake_up_common(wait_queue_head_t *q, unsigned int mode, int nr_exclusive, int sync, void *key);
extern void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr);
int __wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, int (*)(void *), unsigned);
@@ -186,6 +185,31 @@
int out_of_line_wait_on_bit_lock(void *, int, int (*)(void *), unsigned);
wait_queue_head_t *bit_waitqueue(void *, int);
+/**
+ * __wake_up - wake up threads blocked on a waitqueue.
+ * @q: the waitqueue
+ * @mode: which threads
+ * @nr_exclusive: how many wake-one or wake-many threads to wake up
+ * @key: is directly passed to the wakeup function
+ */
+static __always_inline void __wake_up(wait_queue_head_t *q, unsigned int mode,
+ int nr_exclusive, void *key)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&q->lock, flags);
+ __wake_up_common(q, mode, nr_exclusive, 0, key);
+ spin_unlock_irqrestore(&q->lock, flags);
+}
+
+/*
+ * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
+ */
+static __always_inline void __wake_up_locked(wait_queue_head_t *q, unsigned int mode)
+{
+ __wake_up_common(q, mode, 1, 0, NULL);
+}
+
static __always_inline void __wake_up_bit(wait_queue_head_t *wq, void *word, int bit)
{
struct wait_bit_key key = __WAIT_BIT_KEY_INITIALIZER(word, bit);
Index: linux-2.6.26-rc7-devel/kernel/sched.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/sched.c 2008-06-24 07:28:11.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/sched.c 2008-06-24 07:37:31.000000000 +0200
@@ -4284,8 +4284,8 @@
* started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
* zero in this (rare) case, and we handle it by continuing to scan the queue.
*/
-static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
- int nr_exclusive, int sync, void *key)
+void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
+ int nr_exclusive, int sync, void *key)
{
wait_queue_t *curr, *next;
@@ -4297,32 +4297,7 @@
break;
}
}
-
-/**
- * __wake_up - wake up threads blocked on a waitqueue.
- * @q: the waitqueue
- * @mode: which threads
- * @nr_exclusive: how many wake-one or wake-many threads to wake up
- * @key: is directly passed to the wakeup function
- */
-void __wake_up(wait_queue_head_t *q, unsigned int mode,
- int nr_exclusive, void *key)
-{
- unsigned long flags;
-
- spin_lock_irqsave(&q->lock, flags);
- __wake_up_common(q, mode, nr_exclusive, 0, key);
- spin_unlock_irqrestore(&q->lock, flags);
-}
-EXPORT_SYMBOL(__wake_up);
-
-/*
- * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
- */
-void __wake_up_locked(wait_queue_head_t *q, unsigned int mode)
-{
- __wake_up_common(q, mode, 1, 0, NULL);
-}
+EXPORT_SYMBOL(__wake_up_common);
/**
* __wake_up_sync - wake up threads blocked on a waitqueue.
^ permalink raw reply [flat|nested] 31+ messages in thread* [6/10 PATCH] inline default_wake_function
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
` (3 preceding siblings ...)
2008-06-24 5:58 ` [5/10 PATCH] inline __wake_up Mikulas Patocka
@ 2008-06-24 5:59 ` Mikulas Patocka
2008-06-24 5:59 ` [6/10 PATCH] inline autoremove_wake_function Mikulas Patocka
` (3 subsequent siblings)
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 5:59 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Make autoremove_wake_function -> default_wake_function call inlined.
default_wake_function cannot be put as static inline into headers, because
reference to it is taken at some places.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/kernel/sched.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/sched.c 2008-06-24 07:37:31.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/sched.c 2008-06-24 07:37:36.000000000 +0200
@@ -4268,13 +4268,23 @@
#endif /* CONFIG_PREEMPT */
-int default_wake_function(wait_queue_t *curr, unsigned mode, int sync,
- void *key)
+__always_inline int default_wake_function(wait_queue_t *curr, unsigned mode,
+ int sync, void *key)
{
return try_to_wake_up(curr->private, mode, sync);
}
EXPORT_SYMBOL(default_wake_function);
+int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
+{
+ int ret = default_wake_function(wait, mode, sync, key);
+
+ if (ret)
+ list_del_init(&wait->task_list);
+ return ret;
+}
+EXPORT_SYMBOL(autoremove_wake_function);
+
/*
* The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
* wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
Index: linux-2.6.26-rc7-devel/kernel/wait.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/wait.c 2008-06-24 07:37:25.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/wait.c 2008-06-24 07:37:36.000000000 +0200
@@ -127,16 +127,6 @@
}
EXPORT_SYMBOL(finish_wait);
-int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
-{
- int ret = default_wake_function(wait, mode, sync, key);
-
- if (ret)
- list_del_init(&wait->task_list);
- return ret;
-}
-EXPORT_SYMBOL(autoremove_wake_function);
-
int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *arg)
{
struct wait_bit_key *key = arg;
Index: linux-2.6.26-rc7-devel/include/linux/wait.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/wait.h 2008-06-24 07:37:31.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/wait.h 2008-06-24 07:37:36.000000000 +0200
@@ -63,7 +63,7 @@
typedef struct __wait_queue wait_queue_t;
typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int sync, void *key);
-int default_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
+__always_inline int default_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
struct __wait_queue {
unsigned int flags;
^ permalink raw reply [flat|nested] 31+ messages in thread* [6/10 PATCH] inline autoremove_wake_function
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
` (4 preceding siblings ...)
2008-06-24 5:59 ` [6/10 PATCH] inline default_wake_function Mikulas Patocka
@ 2008-06-24 5:59 ` Mikulas Patocka
2008-06-24 6:01 ` [8/10 PATCH] inline filemap_fdatawrite Mikulas Patocka
` (2 subsequent siblings)
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 5:59 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Make wake_bit_function -> autoremove_wake_function call inlined.
autoremove_wake_function cannot be put as static inline into headers, because
reference to it is taken at some places. So use "inline" keyword that will make
it both exported and inlinable in current file.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/kernel/sched.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/sched.c 2008-06-24 07:37:36.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/sched.c 2008-06-24 07:37:42.000000000 +0200
@@ -4275,7 +4275,7 @@
}
EXPORT_SYMBOL(default_wake_function);
-int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
+__always_inline int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
{
int ret = default_wake_function(wait, mode, sync, key);
@@ -4285,6 +4285,21 @@
}
EXPORT_SYMBOL(autoremove_wake_function);
+int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *arg)
+{
+ struct wait_bit_key *key = arg;
+ struct wait_bit_queue *wait_bit
+ = container_of(wait, struct wait_bit_queue, wait);
+
+ if (wait_bit->key.flags != key->flags ||
+ wait_bit->key.bit_nr != key->bit_nr ||
+ test_bit(key->bit_nr, key->flags))
+ return 0;
+ else
+ return autoremove_wake_function(wait, mode, sync, key);
+}
+EXPORT_SYMBOL(wake_bit_function);
+
/*
* The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
* wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
Index: linux-2.6.26-rc7-devel/kernel/wait.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/kernel/wait.c 2008-06-24 07:37:36.000000000 +0200
+++ linux-2.6.26-rc7-devel/kernel/wait.c 2008-06-24 07:37:42.000000000 +0200
@@ -127,21 +127,6 @@
}
EXPORT_SYMBOL(finish_wait);
-int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *arg)
-{
- struct wait_bit_key *key = arg;
- struct wait_bit_queue *wait_bit
- = container_of(wait, struct wait_bit_queue, wait);
-
- if (wait_bit->key.flags != key->flags ||
- wait_bit->key.bit_nr != key->bit_nr ||
- test_bit(key->bit_nr, key->flags))
- return 0;
- else
- return autoremove_wake_function(wait, mode, sync, key);
-}
-EXPORT_SYMBOL(wake_bit_function);
-
/*
* To allow interruptible waiting and asynchronous (i.e. nonblocking)
* waiting, the actions of __wait_on_bit() and __wait_on_bit_lock() are
Index: linux-2.6.26-rc7-devel/include/linux/wait.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/wait.h 2008-06-24 07:37:36.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/wait.h 2008-06-24 07:37:42.000000000 +0200
@@ -527,7 +527,7 @@
void prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state);
void prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state);
void finish_wait(wait_queue_head_t *q, wait_queue_t *wait);
-int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
+__always_inline int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
int wake_bit_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
#define DEFINE_WAIT(name) \
^ permalink raw reply [flat|nested] 31+ messages in thread* [8/10 PATCH] inline filemap_fdatawrite
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
` (5 preceding siblings ...)
2008-06-24 5:59 ` [6/10 PATCH] inline autoremove_wake_function Mikulas Patocka
@ 2008-06-24 6:01 ` Mikulas Patocka
2008-06-24 6:01 ` [9/10 PATCH] inline dm-kcopyd-inline-wake.patch Mikulas Patocka
2008-06-24 6:03 ` [10/10 PATCH] inline dispatch_job Mikulas Patocka
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 6:01 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem
Make filemap_fdatawrite and filemap_flush calls inlined.
This needs to move WB_SYNC_* definitions from writeback.h to fs.h.
writeback.h requires fs.h, so it creates no problem.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/include/linux/fs.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/fs.h 2008-06-24 07:28:06.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/fs.h 2008-06-24 07:37:48.000000000 +0200
@@ -1731,8 +1731,6 @@
extern int invalidate_inode_pages2_range(struct address_space *mapping,
pgoff_t start, pgoff_t end);
extern int write_inode_now(struct inode *, int);
-extern int filemap_fdatawrite(struct address_space *);
-extern int filemap_flush(struct address_space *);
extern int filemap_fdatawait(struct address_space *);
extern int filemap_write_and_wait(struct address_space *mapping);
extern int filemap_write_and_wait_range(struct address_space *mapping,
@@ -1742,6 +1740,38 @@
extern int __filemap_fdatawrite_range(struct address_space *mapping,
loff_t start, loff_t end, int sync_mode);
+/*
+ * fs/fs-writeback.c
+ */
+enum writeback_sync_modes {
+ WB_SYNC_NONE, /* Don't wait on anything */
+ WB_SYNC_ALL, /* Wait on every mapping */
+ WB_SYNC_HOLD, /* Hold the inode on sb_dirty for sys_sync() */
+};
+
+static __always_inline int __filemap_fdatawrite(struct address_space *mapping,
+ int sync_mode)
+{
+ return __filemap_fdatawrite_range(mapping, 0, LLONG_MAX, sync_mode);
+}
+
+static __always_inline int filemap_fdatawrite(struct address_space *mapping)
+{
+ return __filemap_fdatawrite(mapping, WB_SYNC_ALL);
+}
+
+/**
+ * filemap_flush - mostly a non-blocking flush
+ * @mapping: target address_space
+ *
+ * This is a mostly non-blocking flush. Not suitable for data-integrity
+ * purposes - I/O may not be started against all dirty pages.
+ */
+static __always_inline int filemap_flush(struct address_space *mapping)
+{
+ return __filemap_fdatawrite(mapping, WB_SYNC_NONE);
+}
+
extern long do_fsync(struct file *file, int datasync);
extern void sync_supers(void);
extern void sync_filesystems(int wait);
@@ -2000,7 +2030,10 @@
extern int simple_pin_fs(struct file_system_type *, struct vfsmount **mount, int *count);
extern void simple_release_fs(struct vfsmount **mount, int *count);
-extern ssize_t simple_read_from_buffer(void __user *, size_t, loff_t *, const void *, size_t);
+extern ssize_t simple_read_from_buffer(void __user *to, size_t count,
+ loff_t *ppos, const void *from, size_t available);
+extern ssize_t memory_read_from_buffer(void *to, size_t count, loff_t *ppos,
+ const void *from, size_t available);
#ifdef CONFIG_MIGRATION
extern int buffer_migrate_page(struct address_space *,
Index: linux-2.6.26-rc7-devel/include/linux/writeback.h
===================================================================
--- linux-2.6.26-rc7-devel.orig/include/linux/writeback.h 2008-06-24 07:28:07.000000000 +0200
+++ linux-2.6.26-rc7-devel/include/linux/writeback.h 2008-06-24 07:37:49.000000000 +0200
@@ -25,15 +25,6 @@
#define current_is_pdflush() task_is_pdflush(current)
/*
- * fs/fs-writeback.c
- */
-enum writeback_sync_modes {
- WB_SYNC_NONE, /* Don't wait on anything */
- WB_SYNC_ALL, /* Wait on every mapping */
- WB_SYNC_HOLD, /* Hold the inode on sb_dirty for sys_sync() */
-};
-
-/*
* A control structure which tells the writeback code what to do. These are
* always on the stack, and hence need no locking. They are always initialised
* in a manner such that unspecified fields are set to zero.
Index: linux-2.6.26-rc7-devel/mm/filemap.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/mm/filemap.c 2008-06-24 07:28:07.000000000 +0200
+++ linux-2.6.26-rc7-devel/mm/filemap.c 2008-06-24 07:37:49.000000000 +0200
@@ -223,39 +223,15 @@
ret = do_writepages(mapping, &wbc);
return ret;
}
+EXPORT_SYMBOL(__filemap_fdatawrite_range);
-static inline int __filemap_fdatawrite(struct address_space *mapping,
- int sync_mode)
-{
- return __filemap_fdatawrite_range(mapping, 0, LLONG_MAX, sync_mode);
-}
-
-int filemap_fdatawrite(struct address_space *mapping)
-{
- return __filemap_fdatawrite(mapping, WB_SYNC_ALL);
-}
-EXPORT_SYMBOL(filemap_fdatawrite);
-
-static int filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
+static __always_inline int filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
loff_t end)
{
return __filemap_fdatawrite_range(mapping, start, end, WB_SYNC_ALL);
}
/**
- * filemap_flush - mostly a non-blocking flush
- * @mapping: target address_space
- *
- * This is a mostly non-blocking flush. Not suitable for data-integrity
- * purposes - I/O may not be started against all dirty pages.
- */
-int filemap_flush(struct address_space *mapping)
-{
- return __filemap_fdatawrite(mapping, WB_SYNC_NONE);
-}
-EXPORT_SYMBOL(filemap_flush);
-
-/**
* wait_on_page_writeback_range - wait for writeback to complete
* @mapping: target address_space
* @start: beginning page index
^ permalink raw reply [flat|nested] 31+ messages in thread* [9/10 PATCH] inline dm-kcopyd-inline-wake.patch
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
` (6 preceding siblings ...)
2008-06-24 6:01 ` [8/10 PATCH] inline filemap_fdatawrite Mikulas Patocka
@ 2008-06-24 6:01 ` Mikulas Patocka
2008-06-24 6:03 ` [10/10 PATCH] inline dispatch_job Mikulas Patocka
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 6:01 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem, Alasdair G Kergon
Inline wake. Just passes arguments around.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/drivers/md/dm-kcopyd.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/drivers/md/dm-kcopyd.c 2008-06-24 07:28:06.000000000 +0200
+++ linux-2.6.26-rc7-devel/drivers/md/dm-kcopyd.c 2008-06-24 07:37:54.000000000 +0200
@@ -61,7 +61,7 @@
struct list_head pages_jobs;
};
-static void wake(struct dm_kcopyd_client *kc)
+static __always_inline void wake(struct dm_kcopyd_client *kc)
{
queue_work(kc->kcopyd_wq, &kc->kcopyd_work);
}
^ permalink raw reply [flat|nested] 31+ messages in thread* [10/10 PATCH] inline dispatch_job
2008-06-24 5:55 ` [1/10 PATCH] inline __queue_work Mikulas Patocka
` (7 preceding siblings ...)
2008-06-24 6:01 ` [9/10 PATCH] inline dm-kcopyd-inline-wake.patch Mikulas Patocka
@ 2008-06-24 6:03 ` Mikulas Patocka
8 siblings, 0 replies; 31+ messages in thread
From: Mikulas Patocka @ 2008-06-24 6:03 UTC (permalink / raw)
To: linux-kernel, sparclinux; +Cc: davem, Alasdair G Kergon
Inline dispatch_job. It is called at two places, but it is quite small.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Index: linux-2.6.26-rc7-devel/drivers/md/dm-kcopyd.c
===================================================================
--- linux-2.6.26-rc7-devel.orig/drivers/md/dm-kcopyd.c 2008-06-24 07:37:54.000000000 +0200
+++ linux-2.6.26-rc7-devel/drivers/md/dm-kcopyd.c 2008-06-24 07:37:59.000000000 +0200
@@ -433,7 +433,7 @@
* to do the copy, otherwise the io has to be split up into many
* jobs.
*/
-static void dispatch_job(struct kcopyd_job *job)
+static __always_inline void dispatch_job(struct kcopyd_job *job)
{
struct dm_kcopyd_client *kc = job->kc;
atomic_inc(&kc->nr_jobs);
^ permalink raw reply [flat|nested] 31+ messages in thread