public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* OOM killer firing on 2.6.18 and later during LTP runs
@ 2006-11-25 21:03 Martin J. Bligh
  2006-11-25 21:28 ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Martin J. Bligh @ 2006-11-25 21:03 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Andy Whitcroft, Andrew Morton

On 2.6.18-rc7 and later during LTP:
http://test.kernel.org/abat/48393/debug/console.log

oom-killer: gfp_mask=0x201d2, order=0

Call Trace:
  [<ffffffff802638cb>] out_of_memory+0x33/0x220
  [<ffffffff80265374>] __alloc_pages+0x23a/0x2c3
  [<ffffffff802667d2>] __do_page_cache_readahead+0x99/0x212
  [<ffffffff80260799>] sync_page+0x0/0x45
  [<ffffffff804b304c>] io_schedule+0x28/0x33
  [<ffffffff804b32b8>] __wait_on_bit_lock+0x5b/0x66
  [<ffffffff8043d849>] dm_any_congested+0x3b/0x42
  [<ffffffff80262e50>] filemap_nopage+0x14b/0x353
  [<ffffffff8026cf9a>] __handle_mm_fault+0x387/0x93f
  [<ffffffff804b6366>] do_page_fault+0x44b/0x7ba
  [<ffffffff80245a4e>] autoremove_wake_function+0x0/0x2e
oom-killer: gfp_mask=0x280d2, order=0

Call Trace:
  [<ffffffff802638cb>] out_of_memory+0x33/0x220
  [<ffffffff80265374>] __alloc_pages+0x23a/0x2c3
  [<ffffffff8026cde3>] __handle_mm_fault+0x1d0/0x93f
  [<ffffffff804b6366>] do_page_fault+0x44b/0x7ba
  [<ffffffff804b2854>] thread_return+0x0/0xe0
  [<ffffffff8020a405>] error_exit+0x0/0x84

--------------------------------------------------

Unfortunately this doesn't happen every run, only intermittently,
and we don't have much data from before then, so it's hard to tell
how long it's been going on.

Still happening on latest kernels.
http://test.kernel.org/abat/62445/debug/console.log

automount invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
lamb-payload invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

Call Trace:
  [<ffffffff80264dca>] out_of_memory+0x70/0x262
  [<ffffffff802459f6>] autoremove_wake_function+0x0/0x2e
  [<ffffffff802668bf>] __alloc_pages+0x238/0x2c1
  [<ffffffff80268070>] __do_page_cache_readahead+0xab/0x234
  [<ffffffff8026205c>] sync_page+0x0/0x45
  [<ffffffff804bf888>] io_schedule+0x28/0x33
  [<ffffffff804bfaeb>] __wait_on_bit_lock+0x5b/0x66
  [<ffffffff80446fc9>] dm_any_congested+0x3b/0x42
  [<ffffffff80264158>] filemap_nopage+0x148/0x34e
  [<ffffffff8026e49a>] __handle_mm_fault+0x1f8/0x9b0
  [<ffffffff804c2d0f>] do_page_fault+0x441/0x7b5
  [<ffffffff804c0d61>] _spin_unlock_irq+0x9/0xc
  [<ffffffff804bf121>] thread_return+0x64/0x100
  [<ffffffff804c119d>] error_exit+0x0/0x84

It does at least seem to be mostly the same stack, and this machine
appears to be using dm, which most of the others aren't.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-25 21:03 OOM killer firing on 2.6.18 and later during LTP runs Martin J. Bligh
@ 2006-11-25 21:28 ` Andrew Morton
  2006-11-25 21:35   ` Martin J. Bligh
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Andrew Morton @ 2006-11-25 21:28 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Linux Kernel Mailing List, Andy Whitcroft

On Sat, 25 Nov 2006 13:03:45 -0800
"Martin J. Bligh" <mbligh@mbligh.org> wrote:

> On 2.6.18-rc7 and later during LTP:
> http://test.kernel.org/abat/48393/debug/console.log

The traces are a bit confusing, but I don't actually see anything wrong
there.  The machine has used up all swap, has used up all memory and has
correctly gone and killed things.  After that, there's free memory again.

> oom-killer: gfp_mask=0x201d2, order=0
> 
> Call Trace:
>   [<ffffffff802638cb>] out_of_memory+0x33/0x220
>   [<ffffffff80265374>] __alloc_pages+0x23a/0x2c3
>   [<ffffffff802667d2>] __do_page_cache_readahead+0x99/0x212
>   [<ffffffff80260799>] sync_page+0x0/0x45
>   [<ffffffff804b304c>] io_schedule+0x28/0x33
>   [<ffffffff804b32b8>] __wait_on_bit_lock+0x5b/0x66
>   [<ffffffff8043d849>] dm_any_congested+0x3b/0x42
>   [<ffffffff80262e50>] filemap_nopage+0x14b/0x353
>   [<ffffffff8026cf9a>] __handle_mm_fault+0x387/0x93f
>   [<ffffffff804b6366>] do_page_fault+0x44b/0x7ba
>   [<ffffffff80245a4e>] autoremove_wake_function+0x0/0x2e
> oom-killer: gfp_mask=0x280d2, order=0
> 
> Call Trace:
>   [<ffffffff802638cb>] out_of_memory+0x33/0x220
>   [<ffffffff80265374>] __alloc_pages+0x23a/0x2c3
>   [<ffffffff8026cde3>] __handle_mm_fault+0x1d0/0x93f
>   [<ffffffff804b6366>] do_page_fault+0x44b/0x7ba
>   [<ffffffff804b2854>] thread_return+0x0/0xe0
>   [<ffffffff8020a405>] error_exit+0x0/0x84
> 
> --------------------------------------------------
> 
> Unfortunately this doesn't happen every run, only intermittently,
> and we don't have much data from before then, so it's hard to tell
> how long it's been going on.
> 
> Still happening on latest kernels.
> http://test.kernel.org/abat/62445/debug/console.log

The same appears to have happened there too.  Although it does seem to have
killed a lot more than it should have.

Has something changed in the configuration of that machine?  New LTP
version?  Less swapspace?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-25 21:28 ` Andrew Morton
@ 2006-11-25 21:35   ` Martin J. Bligh
  2006-11-25 22:08     ` Andrew Morton
  2006-11-26  3:00   ` Dave Jones
  2006-11-26 11:38   ` Andy Whitcroft
  2 siblings, 1 reply; 9+ messages in thread
From: Martin J. Bligh @ 2006-11-25 21:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List, Andy Whitcroft

> The traces are a bit confusing, but I don't actually see anything wrong
> there.  The machine has used up all swap, has used up all memory and has
> correctly gone and killed things.  After that, there's free memory again.

Yeah, it's just a bit odd that it's always in the IO path. Makes me
suspect there's actually a bunch of pagecache in the box as well, but
maybe it's just coincidence, and the rest of the box really is full
of anon mem. I thought we dumped the alt-sysrq-m type stuff on an OOM
kill, but it seems not. Maybe that's just not in mainline.

>> Unfortunately this doesn't happen every run, only intermittently,
>> and we don't have much data from before then, so it's hard to tell
>> how long it's been going on.
>>
>> Still happening on latest kernels.
>> http://test.kernel.org/abat/62445/debug/console.log
> 
> The same appears to have happened there too.  Although it does seem to have
> killed a lot more than it should have.
> 
> Has something changed in the configuration of that machine?  New LTP
> version?  Less swapspace?

Difficult to tell; it's a fairly new box on the grid, so it seems to
have been doing that intermittently forever.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-25 21:35   ` Martin J. Bligh
@ 2006-11-25 22:08     ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2006-11-25 22:08 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Linux Kernel Mailing List, Andy Whitcroft

On Sat, 25 Nov 2006 13:35:40 -0800
"Martin J. Bligh" <mbligh@mbligh.org> wrote:

> > The traces are a bit confusing, but I don't actually see anything wrong
> > there.  The machine has used up all swap, has used up all memory and has
> > correctly gone and killed things.  After that, there's free memory again.
> 
> Yeah, it's just a bit odd that it's always in the IO path.

It's not.  It's in the main pagecache allocation path for reads.

> Makes me
> suspect there's actually a bunch of pagecache in the box as well,

show_free_areas() doesn't appear to dump the information which is needed to
work out how much of that memory is pagecache and how much is swapcache.  I
assume it's basically all swapcache.

> but
> maybe it's just coincidence, and the rest of the box really is full
> of anon mem. I thought we dumped the alt-sysrq-m type stuff on an OOM
> kill, but it seems not. Maybe that's just not in mainline.

We do.  It's sitting there in your logs.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-25 21:28 ` Andrew Morton
  2006-11-25 21:35   ` Martin J. Bligh
@ 2006-11-26  3:00   ` Dave Jones
  2006-11-26  7:11     ` Andrew Morton
  2006-11-26 11:38   ` Andy Whitcroft
  2 siblings, 1 reply; 9+ messages in thread
From: Dave Jones @ 2006-11-26  3:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Linux Kernel Mailing List, Andy Whitcroft,
	Larry Woodman

On Sat, Nov 25, 2006 at 01:28:28PM -0800, Andrew Morton wrote:
 > On Sat, 25 Nov 2006 13:03:45 -0800
 > "Martin J. Bligh" <mbligh@mbligh.org> wrote:
 > 
 > > On 2.6.18-rc7 and later during LTP:
 > > http://test.kernel.org/abat/48393/debug/console.log
 > 
 > The traces are a bit confusing, but I don't actually see anything wrong
 > there.  The machine has used up all swap, has used up all memory and has
 > correctly gone and killed things.  After that, there's free memory again.

We covered this a month or two back.  For RHEL5, we've ended up
reintroducing the oom killer prevention logic that we had up until
circa 2.6.10.   It seemed that there exist circumstances where
given a little more time, some memory hogging apps will run to completion
allowing other allocators to succeed instead of being killed.

For reference, here's the patch that Larry Woodman came up with
for RHEL5.  The 'rhts' test suite mentioned below was actually
failing when it got to LTP, IIRC, which matches Martin's experience.

		Dave


Dave, this patch includes the upstream OOM kill changes so that RHEL5 is
in sync with the latest 2.6.19 kernel, as well as the out_of_memory()
change so that it must be called more than 10 times within a 5-second
window before it actually kills a process.  I think this gives us the
best of everything: we have all the upstream code plus one small change
that gets us to pass the RHTS test suite.

--- linux-2.6.18.noarch/mm/oom_kill.c.larry
+++ linux-2.6.18.noarch/mm/oom_kill.c
@@ -58,6 +58,12 @@ unsigned long badness(struct task_struct
 	}
 
 	/*
+	 * swapoff can easily use up all memory, so kill those first.
+	 */
+	if (p->flags & PF_SWAPOFF)
+		return ULONG_MAX;
+
+	/*
 	 * The memory size of the process is the basis for the badness.
 	 */
 	points = mm->total_vm;
@@ -127,6 +133,14 @@ unsigned long badness(struct task_struct
 		points /= 4;
 
 	/*
+	 * If p's nodes don't overlap ours, it may still help to kill p
+	 * because p may have allocated or otherwise mapped memory on
+	 * this node before. However it will be less likely.
+	 */
+	if (!cpuset_excl_nodes_overlap(p))
+		points /= 8;
+
+	/*
 	 * Adjust the score by oomkilladj.
 	 */
 	if (p->oomkilladj) {
@@ -191,25 +205,38 @@ static struct task_struct *select_bad_pr
 		unsigned long points;
 		int releasing;
 
+		/* skip kernel threads */
+		if (!p->mm)
+			continue;
+
 		/* skip the init task with pid == 1 */
 		if (p->pid == 1)
 			continue;
-		if (p->oomkilladj == OOM_DISABLE)
-			continue;
-		/* If p's nodes don't overlap ours, it won't help to kill p. */
-		if (!cpuset_excl_nodes_overlap(p))
-			continue;
-
 		/*
 		 * This is in the process of releasing memory so wait for it
 		 * to finish before killing some other task by mistake.
+		 *
+		 * However, if p is the current task, we allow the 'kill' to
+		 * go ahead if it is exiting: this will simply set TIF_MEMDIE,
+		 * which will allow it to gain access to memory reserves in
+		 * the process of exiting and releasing its resources.
+		 * Otherwise we could get an OOM deadlock.
 		 */
 		releasing = test_tsk_thread_flag(p, TIF_MEMDIE) ||
 						p->flags & PF_EXITING;
-		if (releasing && !(p->flags & PF_DEAD))
+		if (releasing) {
+			/* PF_DEAD tasks have already released their mm */
+			if (p->flags & PF_DEAD)
+				continue;
+			if (p->flags & PF_EXITING && p == current) {
+				chosen = p;
+				*ppoints = ULONG_MAX;
+				break;
+			}
 			return ERR_PTR(-1UL);
-		if (p->flags & PF_SWAPOFF)
-			return p;
+		}
+		if (p->oomkilladj == OOM_DISABLE)
+			continue;
 
 		points = badness(p, uptime.tv_sec);
 		if (points > *ppoints || !chosen) {
@@ -241,7 +268,8 @@ static void __oom_kill_task(struct task_
 		return;
 	}
 	task_unlock(p);
-	printk(KERN_ERR "%s: Killed process %d (%s).\n",
+	if (message) 
+		printk(KERN_ERR "%s: Killed process %d (%s).\n",
 				message, p->pid, p->comm);
 
 	/*
@@ -293,8 +321,15 @@ static int oom_kill_process(struct task_
 	struct task_struct *c;
 	struct list_head *tsk;
 
-	printk(KERN_ERR "Out of Memory: Kill process %d (%s) score %li and "
-		"children.\n", p->pid, p->comm, points);
+	/*
+	 * If the task is already exiting, don't alarm the sysadmin or kill
+	 * its children or threads, just set TIF_MEMDIE so it can die quickly
+	 */
+	if (p->flags & PF_EXITING) {
+		__oom_kill_task(p, NULL);
+		return 0;
+	}
+
 	/* Try to kill a child first */
 	list_for_each(tsk, &p->children) {
 		c = list_entry(tsk, struct task_struct, sibling);
@@ -306,6 +341,69 @@ static int oom_kill_process(struct task_
 	return oom_kill_task(p, message);
 }
 
+int should_oom_kill(void)
+{
+	static spinlock_t oom_lock = SPIN_LOCK_UNLOCKED;
+	static unsigned long first, last, count, lastkill;
+	unsigned long now, since;
+	int ret = 0;
+
+	spin_lock(&oom_lock);
+	now = jiffies;
+	since = now - last;
+	last = now;
+
+	/*
+	 * If it's been a long time since last failure,
+	 * we're not oom.
+	 */
+	if (since > 5*HZ)
+		goto reset;
+
+	/*
+	 * If we haven't tried for at least one second,
+	 * we're not really oom.
+	 */
+	since = now - first;
+	if (since < HZ)
+		goto out_unlock;
+
+	/*
+	 * If we have gotten only a few failures,
+	 * we're not really oom.
+	 */
+	if (++count < 10)
+		goto out_unlock;
+
+	/*
+	 * If we just killed a process, wait a while
+	 * to give that task a chance to exit. This
+	 * avoids killing multiple processes needlessly.
+	 */
+	since = now - lastkill;
+	if (since < HZ*5)
+		goto out_unlock;
+
+	/*
+	 * Ok, really out of memory. Kill something.
+	 */
+	lastkill = now;
+	ret = 1;
+
+reset:
+/*
+ * We dropped the lock above, so check to be sure the variable
+ * first only ever increases to prevent false OOM's.
+ */
+	if (time_after(now, first))
+		first = now;
+	count = 0;
+
+out_unlock:
+	spin_unlock(&oom_lock);
+	return ret;
+}
+
 /**
  * out_of_memory - kill the "best" process when we run out of memory
  *
@@ -320,12 +418,16 @@ void out_of_memory(struct zonelist *zone
 	unsigned long points = 0;
 
 	if (printk_ratelimit()) {
-		printk("oom-killer: gfp_mask=0x%x, order=%d\n",
-			gfp_mask, order);
+		printk(KERN_WARNING "%s invoked oom-killer: "
+		"gfp_mask=0x%x, order=%d, oomkilladj=%d\n",
+		current->comm, gfp_mask, order, current->oomkilladj);
 		dump_stack();
 		show_mem();
 	}
 
+	if (!should_oom_kill())
+		return;
+
 	cpuset_lock();
 	read_lock(&tasklist_lock);
 
--- linux-2.6.18.noarch/mm/vmscan.c.larry
+++ linux-2.6.18.noarch/mm/vmscan.c
@@ -62,6 +62,8 @@ struct scan_control {
 	int swap_cluster_max;
 
 	int swappiness;
+
+	int all_unreclaimable;
 };
 
 /*
@@ -695,6 +697,11 @@ done:
 	return nr_reclaimed;
 }
 
+static inline int zone_is_near_oom(struct zone *zone)
+{
+	return zone->pages_scanned >= (zone->nr_active + zone->nr_inactive)*3;
+}
+
 /*
  * This moves pages from the active list to the inactive list.
  *
@@ -730,6 +737,9 @@ static void shrink_active_list(unsigned 
 		long distress;
 		long swap_tendency;
 
+		if (zone_is_near_oom(zone))
+			goto force_reclaim_mapped;
+
 		/*
 		 * `distress' is a measure of how much trouble we're having
 		 * reclaiming pages.  0 -> no problems.  100 -> great trouble.
@@ -765,6 +775,7 @@ static void shrink_active_list(unsigned 
 		 * memory onto the inactive list.
 		 */
 		if (swap_tendency >= 100)
+force_reclaim_mapped:
 			reclaim_mapped = 1;
 	}
 
@@ -925,6 +936,7 @@ static unsigned long shrink_zones(int pr
 	unsigned long nr_reclaimed = 0;
 	int i;
 
+	sc->all_unreclaimable = 1;
 	for (i = 0; zones[i] != NULL; i++) {
 		struct zone *zone = zones[i];
 
@@ -941,6 +953,8 @@ static unsigned long shrink_zones(int pr
 		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 			continue;	/* Let kswapd poll it */
 
+		sc->all_unreclaimable = 0;
+
 		nr_reclaimed += shrink_zone(priority, zone, sc);
 	}
 	return nr_reclaimed;
@@ -1021,6 +1035,10 @@ unsigned long try_to_free_pages(struct z
 		if (sc.nr_scanned && priority < DEF_PRIORITY - 2)
 			blk_congestion_wait(WRITE, HZ/10);
 	}
+	/* top priority shrink_caches still had more to do? don't OOM, then */
+	if (!sc.all_unreclaimable || nr_reclaimed)
+		ret = 1;
+
 out:
 	for (i = 0; zones[i] != 0; i++) {
 		struct zone *zone = zones[i];
@@ -1153,7 +1171,7 @@ scan:
 			if (zone->all_unreclaimable)
 				continue;
 			if (nr_slab == 0 && zone->pages_scanned >=
-				    (zone->nr_active + zone->nr_inactive) * 4)
+				    (zone->nr_active + zone->nr_inactive) * 6)
 				zone->all_unreclaimable = 1;
 			/*
 			 * If we've done a decent amount of scanning and

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-26  3:00   ` Dave Jones
@ 2006-11-26  7:11     ` Andrew Morton
  2006-11-26  7:25       ` Dave Jones
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2006-11-26  7:11 UTC (permalink / raw)
  To: Dave Jones
  Cc: Martin J. Bligh, Linux Kernel Mailing List, Andy Whitcroft,
	Larry Woodman

On Sat, 25 Nov 2006 22:00:45 -0500
Dave Jones <davej@redhat.com> wrote:

> On Sat, Nov 25, 2006 at 01:28:28PM -0800, Andrew Morton wrote:
>  > On Sat, 25 Nov 2006 13:03:45 -0800
>  > "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>  > 
>  > > On 2.6.18-rc7 and later during LTP:
>  > > http://test.kernel.org/abat/48393/debug/console.log
>  > 
>  > The traces are a bit confusing, but I don't actually see anything wrong
>  > there.  The machine has used up all swap, has used up all memory and has
>  > correctly gone and killed things.  After that, there's free memory again.
> 
> We covered this a month or two back.  For RHEL5, we've ended up
> reintroducing the oom killer prevention logic that we had up until
> circa 2.6.10.   It seemed that there exist circumstances where
> given a little more time, some memory hogging apps will run to completion
> allowing other allocators to succeed instead of being killed.

I _think_ what you're describing here is a false-positive oom-killing?  But
Martin appears to be hitting a genuine oom.

But it does appear that some changes are needed, because lots of things got
oom-killed.

I think.  Maybe not - there's no timestamping in those logs and it is of
course possible that we're seeing unrelated ooms which happened a long time
apart.

> For reference, here's the patch that Larry Woodman came up with
> for RHEL5.

gulp.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-26  7:11     ` Andrew Morton
@ 2006-11-26  7:25       ` Dave Jones
  2006-11-26  7:30         ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Jones @ 2006-11-26  7:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Linux Kernel Mailing List, Andy Whitcroft,
	Larry Woodman

On Sat, Nov 25, 2006 at 11:11:53PM -0800, Andrew Morton wrote:
 > On Sat, 25 Nov 2006 22:00:45 -0500
 > Dave Jones <davej@redhat.com> wrote:
 > 
 > > On Sat, Nov 25, 2006 at 01:28:28PM -0800, Andrew Morton wrote:
 > >  > On Sat, 25 Nov 2006 13:03:45 -0800
 > >  > "Martin J. Bligh" <mbligh@mbligh.org> wrote:
 > >  > 
 > >  > > On 2.6.18-rc7 and later during LTP:
 > >  > > http://test.kernel.org/abat/48393/debug/console.log
 > >  > 
 > >  > The traces are a bit confusing, but I don't actually see anything wrong
 > >  > there.  The machine has used up all swap, has used up all memory and has
 > >  > correctly gone and killed things.  After that, there's free memory again.
 > > 
 > > We covered this a month or two back.  For RHEL5, we've ended up
 > > reintroducing the oom killer prevention logic that we had up until
 > > circa 2.6.10.   It seemed that there exist circumstances where
 > > given a little more time, some memory hogging apps will run to completion
 > > allowing other allocators to succeed instead of being killed.
 > 
 > I _think_ what you're describing here is a false-positive oom-killing?  But
 > Martin appears to be hitting a genuine oom.
 
What we saw during the RHEL5 testing was that yes, the machine _was_ OOM
*temporarily*, but if instead of killing the task trying to allocate, we
postponed the killing a few times, it would give other tasks the opportunity
to complete writeout, or free up memory some other way, allowing the
allocating process to succeed shortly afterwards.

 > But it does appear that some changes are needed, because lots of things got
 > oom-killed.
 >
 > I think.  Maybe not - there's no timestamping in those logs and it is of
 > course possible that we're seeing unrelated ooms which happened a long time
 > apart.

Maybe, but it does sound spookily familiar.
The last time Larry's patch got floated to lkml it was met with
"Ah!, but we have new oom killer changes in -git which might solve this".
We tried them. They didn't.

		Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-26  7:25       ` Dave Jones
@ 2006-11-26  7:30         ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2006-11-26  7:30 UTC (permalink / raw)
  To: Dave Jones
  Cc: Martin J. Bligh, Linux Kernel Mailing List, Andy Whitcroft,
	Larry Woodman

On Sun, 26 Nov 2006 02:25:38 -0500
Dave Jones <davej@redhat.com> wrote:

> On Sat, Nov 25, 2006 at 11:11:53PM -0800, Andrew Morton wrote:
>  > On Sat, 25 Nov 2006 22:00:45 -0500
>  > Dave Jones <davej@redhat.com> wrote:
>  > 
>  > > On Sat, Nov 25, 2006 at 01:28:28PM -0800, Andrew Morton wrote:
>  > >  > On Sat, 25 Nov 2006 13:03:45 -0800
>  > >  > "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>  > >  > 
>  > >  > > On 2.6.18-rc7 and later during LTP:
>  > >  > > http://test.kernel.org/abat/48393/debug/console.log
>  > >  > 
>  > >  > The traces are a bit confusing, but I don't actually see anything wrong
>  > >  > there.  The machine has used up all swap, has used up all memory and has
>  > >  > correctly gone and killed things.  After that, there's free memory again.
>  > > 
>  > > We covered this a month or two back.  For RHEL5, we've ended up
>  > > reintroducing the oom killer prevention logic that we had up until
>  > > circa 2.6.10.   It seemed that there exist circumstances where
>  > > given a little more time, some memory hogging apps will run to completion
>  > > allowing other allocators to succeed instead of being killed.
>  > 
>  > I _think_ what you're describing here is a false-positive oom-killing?  But
>  > Martin appears to be hitting a genuine oom.
>  
> What we saw during the RHEL5 testing was that yes, the machine _was_ OOM
> *temporarily*, but if instead of killing the task trying to allocate, we
> postponed the killing a few times, it would give other tasks the opportunity
> to complete writeout, or free up memory some other way, allowing the
> allocating process to succeed shortly afterwards.

That would be a false positive then.

In Martin's case he's 100% out of swapspace and has only a few tens of
pages left mapped into pagetables, so I assume that all memory is unmapped
swapcache (but that cannot be confirmed from the info which we have).  But
it looks like a real oom.

That's not to say that we don't have oom-killer problems.

>  > But it does appear that some changes are needed, because lots of things got
>  > oom-killed.
>  >
>  > I think.  Maybe not - there's no timestamping in those logs and it is of
>  > course possible that we're seeing unrelated ooms which happened a long time
>  > apart.
> 
> Maybe, but it does sound spookily familiar.
> The last time Larry's patch got floated to lkml it was met with
> "Ah!, but we have new oom killer changes in -git which might solve this".
> We tried them. They didn't.

What's the testcase?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: OOM killer firing on 2.6.18 and later during LTP runs
  2006-11-25 21:28 ` Andrew Morton
  2006-11-25 21:35   ` Martin J. Bligh
  2006-11-26  3:00   ` Dave Jones
@ 2006-11-26 11:38   ` Andy Whitcroft
  2 siblings, 0 replies; 9+ messages in thread
From: Andy Whitcroft @ 2006-11-26 11:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Martin J. Bligh, Linux Kernel Mailing List

Andrew Morton wrote:
> On Sat, 25 Nov 2006 13:03:45 -0800
> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
> 
>> On 2.6.18-rc7 and later during LTP:
>> http://test.kernel.org/abat/48393/debug/console.log
> 
> The traces are a bit confusing, but I don't actually see anything wrong
> there.  The machine has used up all swap, has used up all memory and has
> correctly gone and killed things.  After that, there's free memory again.
> 
>> oom-killer: gfp_mask=0x201d2, order=0
>>
>> Call Trace:
>>   [<ffffffff802638cb>] out_of_memory+0x33/0x220
>>   [<ffffffff80265374>] __alloc_pages+0x23a/0x2c3
>>   [<ffffffff802667d2>] __do_page_cache_readahead+0x99/0x212
>>   [<ffffffff80260799>] sync_page+0x0/0x45
>>   [<ffffffff804b304c>] io_schedule+0x28/0x33
>>   [<ffffffff804b32b8>] __wait_on_bit_lock+0x5b/0x66
>>   [<ffffffff8043d849>] dm_any_congested+0x3b/0x42
>>   [<ffffffff80262e50>] filemap_nopage+0x14b/0x353
>>   [<ffffffff8026cf9a>] __handle_mm_fault+0x387/0x93f
>>   [<ffffffff804b6366>] do_page_fault+0x44b/0x7ba
>>   [<ffffffff80245a4e>] autoremove_wake_function+0x0/0x2e
>> oom-killer: gfp_mask=0x280d2, order=0
>>
>> Call Trace:
>>   [<ffffffff802638cb>] out_of_memory+0x33/0x220
>>   [<ffffffff80265374>] __alloc_pages+0x23a/0x2c3
>>   [<ffffffff8026cde3>] __handle_mm_fault+0x1d0/0x93f
>>   [<ffffffff804b6366>] do_page_fault+0x44b/0x7ba
>>   [<ffffffff804b2854>] thread_return+0x0/0xe0
>>   [<ffffffff8020a405>] error_exit+0x0/0x84
>>
>> --------------------------------------------------
>>
>> Unfortunately this doesn't happen every run, only intermittently,
>> and we don't have much data from before then, so it's hard to tell
>> how long it's been going on.
>>
>> Still happening on latest kernels.
>> http://test.kernel.org/abat/62445/debug/console.log
> 
> The same appears to have happened there too.  Although it does seem to have
> killed a lot more than it should have.
> 
> Has something changed in the configuration of that machine?  New LTP
> version?  Less swapspace?

As far as I know, neither LTP nor the machine configuration has
changed.  This is one of the very few machines we run which uses
LVM/dm etc.; perhaps that is a factor.

/dev/mapper/VolGroup00-LogVol01         partition       2031608 156     -1

We do know that the LTP tests add a bunch of swap and then rip it away
again.  It's possible that something bad happens while that is occurring.
It would certainly change the level of desperation rather dramatically.

Perhaps it would make sense to try out the patch from Red Hat.  Sadly it's
not reliably reproducible, so it's hard to know how we would tell whether
it worked.

Sigh.

-apw


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-11-26 11:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-11-25 21:03 OOM killer firing on 2.6.18 and later during LTP runs Martin J. Bligh
2006-11-25 21:28 ` Andrew Morton
2006-11-25 21:35   ` Martin J. Bligh
2006-11-25 22:08     ` Andrew Morton
2006-11-26  3:00   ` Dave Jones
2006-11-26  7:11     ` Andrew Morton
2006-11-26  7:25       ` Dave Jones
2006-11-26  7:30         ` Andrew Morton
2006-11-26 11:38   ` Andy Whitcroft

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox