From: Andrew Morton <andrewm@uow.edu.au>
To: kumon@flab.fujitsu.co.jp
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9)
Date: Wed, 01 Nov 2000 02:36:33 +1100 [thread overview]
Message-ID: <39FEE701.CAC21AE5@uow.edu.au> (raw)
In-Reply-To: <39F957BC.4289FF10@uow.edu.au>, <39F92187.A7621A09@timpanogas.org> <Pine.GSO.4.21.0010270257550.18660-100000@weyl.math.psu.edu> <20001027094613.A18382@gruyere.muc.suse.de> <39F957BC.4289FF10@uow.edu.au> <200010271257.VAA24374@asami.proc.flab.fujitsu.co.jp>
Kouichi,
how many Apache processes are you running? MaxSpareServers?
This patch is a moderate rewrite of __wake_up_common. I'd be
interested in seeing how much difference it makes to the
performance of Apache when the file-locking serialisation is
disabled.
- It implements last-in/first-out semantics for waking
TASK_EXCLUSIVE tasks.
- It fixes what was surely a bug wherein __wake_up_common
scans the entire wait queue even when it has found the
task which it wants to run.
On a dual-CPU box it dramatically increases the max connection
rate when there are a large number of waiters:
#waiters conn/sec (t10-p5+patch) conn/sec (t10-p5)
30 5525 4990
100 5700 4100
1000 5600 1500
This will be due entirely to the queue scanning fix - my
test app has a negligible cache footprint.
It's stable, but it's a work-in-progress.
- __wake_up_common does a fair amount of SMP-specific
stuff when compiled for UP which needs sorting out
- it seems there's somebody in the networking code who
changes a task state incorrectly when it's on a wait
queue. This used to be OK, but it's not OK now that
I'm relying upon the wait queue being in the state
which it should be.
Thanks.
--- linux-2.4.0-test10-pre5/kernel/sched.c Sun Oct 15 01:27:46 2000
+++ linux-akpm/kernel/sched.c Wed Nov 1 01:54:44 2000
@@ -697,6 +697,53 @@
return;
}
+#if WAITQUEUE_DEBUG
+/*
+ * Check that the wait queue is in the correct order:
+ * !TASK_EXCLUSIVE at the head
+ * TASK_EXCLUSIVE at the tail
+ * The list is locked.
+ */
+
+static void check_wq_sanity(wait_queue_head_t *q)
+{
+ struct list_head *probe, *head;
+
+ head = &q->task_list;
+ probe = head->next;
+ while (probe != head) {
+ wait_queue_t *curr = list_entry(probe, wait_queue_t, task_list);
+ if (curr->task->state & TASK_EXCLUSIVE)
+ break;
+ probe = probe->next;
+ }
+ while (probe != head) {
+ wait_queue_t *curr = list_entry(probe, wait_queue_t, task_list);
+ if (!(curr->task->state & TASK_EXCLUSIVE)) {
+ printk("check_wq_sanity: mangled wait queue\n");
+#ifdef CONFIG_X86
+ show_stack(0);
+#endif
+ break;
+ }
+ probe = probe->next;
+ }
+}
+#endif /* WAITQUEUE_DEBUG */
+
+/*
+ * Wake up some tasks which are on *q.
+ *
+ * All tasks which are !TASK_EXCLUSIVE are woken.
+ * Only one TASK_EXCLUSIVE task is woken.
+ * The !TASK_EXCLUSIVE tasks start at q->task_list.next
+ * The TASK_EXCLUSIVE tasks start at q->task_list.prev
+ *
+ * When waking a TASK_EXCLUSIVE task we search backward,
+ * so we find the most-recently-added task (it will have the
+ * hottest cache)
+ */
+
static inline void __wake_up_common (wait_queue_head_t *q, unsigned int mode,
const int sync)
{
@@ -714,6 +761,7 @@
wq_write_lock_irqsave(&q->lock, flags);
#if WAITQUEUE_DEBUG
+ check_wq_sanity(q);
CHECK_MAGIC_WQHEAD(q);
#endif
@@ -722,6 +770,11 @@
if (!head->next || !head->prev)
WQ_BUG();
#endif
+
+ /*
+ * if (mode & TASK_EXCLUSIVE) Wake all the !TASK_EXCLUSIVE tasks
+ * if !(mode & TASK_EXCLUSIVE) Wake all the tasks
+ */
tmp = head->next;
while (tmp != head) {
unsigned int state;
@@ -734,40 +787,69 @@
#endif
p = curr->task;
state = p->state;
+ if (state & mode & TASK_EXCLUSIVE)
+ break; /* Finished all !TASK_EXCLUSIVEs */
if (state & (mode & ~TASK_EXCLUSIVE)) {
#if WAITQUEUE_DEBUG
curr->__waker = (long)__builtin_return_address(0);
#endif
- /*
- * If waking up from an interrupt context then
- * prefer processes which are affine to this
- * CPU.
- */
- if (irq && (state & mode & TASK_EXCLUSIVE)) {
+ if (sync)
+ wake_up_process_synchronous(p);
+ else
+ wake_up_process(p);
+ }
+ }
+
+ /*
+ * Now wake one TASK_EXCLUSIVE task.
+ */
+ if (mode & TASK_EXCLUSIVE) {
+ tmp = head->prev;
+ while (tmp != head) {
+ unsigned int state;
+ wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
+
+ tmp = tmp->prev;
+#if WAITQUEUE_DEBUG
+ CHECK_MAGIC(curr->__magic);
+#endif
+ p = curr->task;
+ state = p->state;
+ if (!(state & TASK_EXCLUSIVE))
+ break; /* Exhausted all TASK_EXCLUSIVEs */
+
+ if (state & mode) {
+ /*
+ * If waking up from an interrupt context then
+ * prefer processes which are affine to this
+ * CPU.
+ */
if (!best_exclusive)
best_exclusive = p;
- else if ((p->processor == best_cpu) &&
- (best_exclusive->processor != best_cpu))
+ if (irq) {
+ if (p->processor == best_cpu) {
best_exclusive = p;
- } else {
- if (sync)
- wake_up_process_synchronous(p);
- else
- wake_up_process(p);
- if (state & mode & TASK_EXCLUSIVE)
+ break;
+ }
+ } else {
break;
+ }
}
}
- }
- if (best_exclusive)
- best_exclusive->state = TASK_RUNNING;
- wq_write_unlock_irqrestore(&q->lock, flags);
-
- if (best_exclusive) {
- if (sync)
- wake_up_process_synchronous(best_exclusive);
- else
- wake_up_process(best_exclusive);
+
+ if (best_exclusive)
+ best_exclusive->state = TASK_RUNNING;
+
+ wq_write_unlock_irqrestore(&q->lock, flags);
+
+ if (best_exclusive) {
+ if (sync)
+ wake_up_process_synchronous(best_exclusive);
+ else
+ wake_up_process(best_exclusive);
+ }
+ } else {
+ wq_write_unlock_irqrestore(&q->lock, flags);
}
out:
return;
--- linux-2.4.0-test10-pre5/kernel/fork.c Sat Sep 9 16:19:30 2000
+++ linux-akpm/kernel/fork.c Wed Nov 1 02:08:00 2000
@@ -39,6 +39,14 @@
unsigned long flags;
wq_write_lock_irqsave(&q->lock, flags);
+#if WAITQUEUE_DEBUG
+ if (wait->task.state & TASK_EXCLUSIVE) {
+ printk("add_wait_queue: exclusive task added at head!\n");
+#ifdef CONFIG_X86
+ show_stack(0);
+#endif
+ }
+#endif
__add_wait_queue(q, wait);
wq_write_unlock_irqrestore(&q->lock, flags);
}
@@ -48,6 +56,14 @@
unsigned long flags;
wq_write_lock_irqsave(&q->lock, flags);
+#if WAITQUEUE_DEBUG
+ if (!(wait->task.state & TASK_EXCLUSIVE)) {
+ printk("add_wait_queue: non-exclusive task added at tail!\n");
+#ifdef CONFIG_X86
+ show_stack(0);
+#endif
+ }
+#endif
__add_wait_queue_tail(q, wait);
wq_write_unlock_irqrestore(&q->lock, flags);
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
next prev parent reply other threads:[~2000-10-31 15:37 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <200010250736.QAA12373@asami.proc.flab.fujitsu.co.jp>
[not found] ` <Pine.LNX.4.21.0010251242050.943-100000@duckman.distro.conectiva>
[not found] ` <200010260138.KAA17028@asami.proc.flab.fujitsu.co.jp>
[not found] ` <200010261405.XAA19135@asami.proc.flab.fujitsu.co.jp>
2000-10-27 6:24 ` Negative scalability by removal of lock_kernel()? (Was: Strange performance behavior of 2.4.0-test9) kumon
2000-10-27 6:32 ` Negative scalability by removal of lock_kernel()?(Was: " Jeff V. Merkey
2000-10-27 7:13 ` Alexander Viro
2000-10-27 7:46 ` Andi Kleen
2000-10-27 10:23 ` Andrew Morton
2000-10-27 10:25 ` Andi Kleen
2000-10-27 12:57 ` [PATCH] " kumon
2000-10-28 15:46 ` Andrew Morton
2000-10-28 15:58 ` Andi Kleen
2000-10-28 16:05 ` Jeff Garzik
2000-10-28 16:20 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Alan Cox
2000-10-29 19:45 ` dean gaudet
2000-10-30 6:29 ` Andi Kleen
2000-10-30 15:28 ` Andrea Arcangeli
2000-10-30 16:36 ` Rik van Riel
2000-10-30 18:02 ` Andrea Arcangeli
2000-10-28 16:46 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9) Andrew Morton
2000-10-30 9:27 ` kumon
2000-10-30 15:00 ` Andrew Morton
2000-10-30 23:24 ` dean gaudet
2000-11-04 5:08 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange " Andrew Morton
2000-11-04 6:23 ` Linus Torvalds
2000-11-04 10:54 ` [PATCH] Re: Negative scalability by removal of Alan Cox
2000-11-04 17:22 ` Linus Torvalds
2000-11-05 16:22 ` Andrea Arcangeli
2000-11-05 20:21 ` dean gaudet
2000-11-05 22:43 ` Alan Cox
2000-11-04 20:03 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9) dean gaudet
2000-11-04 20:42 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange Alan Cox
2000-11-04 20:11 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9) dean gaudet
2000-11-04 20:43 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange Alan Cox
2000-11-05 4:52 ` dean gaudet
2000-10-31 15:36 ` Andrew Morton [this message]
2000-11-01 1:02 ` [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9) kumon
2000-11-02 11:09 ` kumon
2000-11-02 12:50 ` kumon
2000-11-04 5:07 ` Andrew Morton
2000-10-27 8:17 ` Jeff V. Merkey
2000-10-27 10:11 ` kumon
2000-11-04 5:55 ` Preemptive scheduling of woken-up processes kumon
2000-11-05 4:19 [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9) Dave Wagner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=39FEE701.CAC21AE5@uow.edu.au \
--to=andrewm@uow.edu.au \
--cc=kumon@flab.fujitsu.co.jp \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.