public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* __lock_page calls run_task_queue(&tq_disk) unecessarily?
@ 2001-02-19 23:02 Marcelo Tosatti
  2001-02-20  1:05 ` Marcelo Tosatti
  0 siblings, 1 reply; 5+ messages in thread
From: Marcelo Tosatti @ 2001-02-19 23:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: lkml


Hi Linus,

Take a look at __lock_page:

static void __lock_page(struct page *page)
{
        struct task_struct *tsk = current;
        DECLARE_WAITQUEUE(wait, tsk);

        add_wait_queue_exclusive(&page->wait, &wait);
        for (;;) {
                sync_page(page);
                set_task_state(tsk, TASK_UNINTERRUPTIBLE);
                if (PageLocked(page)) {
                        run_task_queue(&tq_disk);
                        schedule();
                        continue;
                }
                if (!TryLockPage(page))
                        break;
        }
        tsk->state = TASK_RUNNING;
        remove_wait_queue(&page->wait, &wait);
}


Af a process sleeps in __lock_page, sync_page() will be called even if the
page is already unlocked. (block_sync_page(), the sync_page routine for
generic block based filesystem calls run_task_queue(&tq_disk)).

I don't see any problem if we remove the run_task_queue(&tq_disk) and put
sync_page(page) there instead, removing the other sync_page(page) at the
beginning of the loop.

Comments?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: __lock_page calls run_task_queue(&tq_disk) unecessarily?
  2001-02-19 23:02 __lock_page calls run_task_queue(&tq_disk) unecessarily? Marcelo Tosatti
@ 2001-02-20  1:05 ` Marcelo Tosatti
  2001-02-20 16:00   ` Andrea Arcangeli
  0 siblings, 1 reply; 5+ messages in thread
From: Marcelo Tosatti @ 2001-02-20  1:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: lkml


Btw ___wait_on_page() does something similar.

Here goes the patch for both __lock_page() and ___wait_on_page().


--- linux/mm/filemap.c.orig     Mon Feb 19 23:51:02 2001
+++ linux/mm/filemap.c  Mon Feb 19 23:51:33 2001
@@ -611,11 +611,11 @@
 
        add_wait_queue(&page->wait, &wait);
        do {
-               sync_page(page);
                set_task_state(tsk, TASK_UNINTERRUPTIBLE);
                if (!PageLocked(page))
                        break;
-               run_task_queue(&tq_disk);
+
+               sync_page(page);
                schedule();
        } while (PageLocked(page));
        tsk->state = TASK_RUNNING;
@@ -633,10 +633,9 @@
 
        add_wait_queue_exclusive(&page->wait, &wait);
        for (;;) {
-               sync_page(page);
                set_task_state(tsk, TASK_UNINTERRUPTIBLE);
                if (PageLocked(page)) {
-                       run_task_queue(&tq_disk);
+                       sync_page(page);
                        schedule();
                        continue;
                }


On Mon, 19 Feb 2001, Marcelo Tosatti wrote:

> 
> Hi Linus,
> 
> Take a look at __lock_page:
> 
> static void __lock_page(struct page *page)
> {
>         struct task_struct *tsk = current;
>         DECLARE_WAITQUEUE(wait, tsk);
> 
>         add_wait_queue_exclusive(&page->wait, &wait);
>         for (;;) {
>                 sync_page(page);
>                 set_task_state(tsk, TASK_UNINTERRUPTIBLE);
>                 if (PageLocked(page)) {
>                         run_task_queue(&tq_disk);
>                         schedule();
>                         continue;
>                 }
>                 if (!TryLockPage(page))
>                         break;
>         }
>         tsk->state = TASK_RUNNING;
>         remove_wait_queue(&page->wait, &wait);
> }
> 
> 
> Af a process sleeps in __lock_page, sync_page() will be called even if the
> page is already unlocked. (block_sync_page(), the sync_page routine for
> generic block based filesystem calls run_task_queue(&tq_disk)).
> 
> I don't see any problem if we remove the run_task_queue(&tq_disk) and put
> sync_page(page) there instead, removing the other sync_page(page) at the
> beginning of the loop.
> 
> Comments?
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: __lock_page calls run_task_queue(&tq_disk) unecessarily?
  2001-02-20  1:05 ` Marcelo Tosatti
@ 2001-02-20 16:00   ` Andrea Arcangeli
  2001-02-20 17:11     ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Arcangeli @ 2001-02-20 16:00 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Linus Torvalds, lkml

On Mon, Feb 19, 2001 at 11:05:23PM -0200, Marcelo Tosatti wrote:
> --- linux/mm/filemap.c.orig     Mon Feb 19 23:51:02 2001
> +++ linux/mm/filemap.c  Mon Feb 19 23:51:33 2001
> @@ -611,11 +611,11 @@
>  
>         add_wait_queue(&page->wait, &wait);
>         do {
> -               sync_page(page);
>                 set_task_state(tsk, TASK_UNINTERRUPTIBLE);
>                 if (!PageLocked(page))
>                         break;
> -               run_task_queue(&tq_disk);
> +
> +               sync_page(page);
>                 schedule();
>         } while (PageLocked(page));
>         tsk->state = TASK_RUNNING;
> @@ -633,10 +633,9 @@
>  
>         add_wait_queue_exclusive(&page->wait, &wait);
>         for (;;) {
> -               sync_page(page);
>                 set_task_state(tsk, TASK_UNINTERRUPTIBLE);
>                 if (PageLocked(page)) {
> -                       run_task_queue(&tq_disk);
> +                       sync_page(page);
>                         schedule();
>                         continue;
			  ^^^^^^^^
>                 }

Looks perfect. I'd also remove the `continue' from __lock_page, it's wake-one
so it should get the wakeup only when it's time to lock the page down.

Andrea

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: __lock_page calls run_task_queue(&tq_disk) unecessarily?
  2001-02-20 16:00   ` Andrea Arcangeli
@ 2001-02-20 17:11     ` Linus Torvalds
  2001-02-20 18:10       ` Andrea Arcangeli
  0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2001-02-20 17:11 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Marcelo Tosatti, lkml



On Tue, 20 Feb 2001, Andrea Arcangeli wrote:
> 
> Looks perfect. I'd also remove the `continue' from __lock_page, it's wake-one
> so it should get the wakeup only when it's time to lock the page down.

NO!

Even if it is wake-one, others may have claimed it before. There can be
new users coming in and doing a "trylock()" etc.

NEVER *EVER* think that "exclusive wait-queue" implies some sort of
critical region protection. An exlcusive wait-queue is _not_ a lock. It's
only an optimization heuristic.

		Linus


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: __lock_page calls run_task_queue(&tq_disk) unecessarily?
  2001-02-20 17:11     ` Linus Torvalds
@ 2001-02-20 18:10       ` Andrea Arcangeli
  0 siblings, 0 replies; 5+ messages in thread
From: Andrea Arcangeli @ 2001-02-20 18:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Marcelo Tosatti, lkml

On Tue, Feb 20, 2001 at 09:11:04AM -0800, Linus Torvalds wrote:
> Even if it is wake-one, others may have claimed it before. There can be
> new users coming in and doing a "trylock()" etc.
> 
> NEVER *EVER* think that "exclusive wait-queue" implies some sort of
> critical region protection. An exlcusive wait-queue is _not_ a lock. It's
> only an optimization heuristic.

The reason of not executing the trylock in the slow path of the lock_page is to
avoid writing to the shared ram on all the CPUs at the same time for no good
reason.

We can write just now to the same cacheline from all the CPUs at the same time
if all the cpus runs lock_page at the same time on the same page, but there's
not an high probability for such thing to happen and we would be slower to try
to read first.

The reason of the `continue' is only one: if the wakeup would be wake-all we
would end executing NR_sleppers TryLockPage() and we would get an high probability
that only one of those trylocks will succeed. So we could assume that all the
other trylocks was going to be wasted and so we used `continue' to try not to
bang the cacheline too much for probably no good reason.

But since it's a wake-one, only one task will try to acquire the cacheline after
the wakeup as it just did once with the TryLockPage in lock_page(). It will
just try again like restarting from lock_page().

I don't see how the probability of TryLockPage to succeed after the wakeup
could decrease compared to the first TryLockPage. Since the probability doesn't
decrease I don't see any point for the `continue'. Why should the probability
of succeeding in TryLockPage drecrease after a wakeup compared to the
TryLockPage in lock_page()? If you have an explanation I will certainly agree
to left the `continue' there.

Andrea

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-02-20 18:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-02-19 23:02 __lock_page calls run_task_queue(&tq_disk) unecessarily? Marcelo Tosatti
2001-02-20  1:05 ` Marcelo Tosatti
2001-02-20 16:00   ` Andrea Arcangeli
2001-02-20 17:11     ` Linus Torvalds
2001-02-20 18:10       ` Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox