Re: [Bug #14141] order 2 page allocation failures in iwlagn

From: Mel Gorman <mel-wPRd99KPJ+uzQB+pC5nmwQ@public.gmane.org>
To: Frans Pop <elendil-EIBgga6/0yRmR6Xm/wNWPw@public.gmane.org>
Cc: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	KOSAKI Motohiro
	<kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
	"Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org>,
	Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Kernel Testers List
	<kernel-testers-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Pekka Enberg <penberg-bbCR+/B0CizivPeTLB3BmA@public.gmane.org>,
	Reinette Chatre
	<reinette.chatre-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Bartlomiej Zolnierkiewicz
	<bzolnier-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Karol Lewandowski
	<karol.k.lewandowski-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Mohamed Abbas
	<mohamed.abbas-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Jens Axboe <jens.axboe-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	"John W. Linville"
	<linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org
Subject: Re: [Bug #14141] order 2 page allocation failures in iwlagn
Date: Mon, 19 Oct 2009 15:01:52 +0100
Message-ID: <20091019140151.GC9036@csn.ul.ie>
In-Reply-To: <200910190133.33183.elendil-EIBgga6/0yRmR6Xm/wNWPw@public.gmane.org>

On Mon, Oct 19, 2009 at 01:33:29AM +0200, Frans Pop wrote:
> Another long mail, sorry.
> 
> On Wednesday 14 October 2009, Frans Pop wrote:
> > > There still has not been a mm-change identified that makes
> > > fragmentation significantly worse.
> >
> > My bisection shows a very clear point, even if not an individual commit,
> > in the 'akpm' merge where SKB errors suddenly become *much* more
> > frequent and easy to trigger.
> > I'm sorry to say this, but the fact that nothing has been identified yet
> > is IMO the result of a lack of effort, not because there is no such
> > change.
> 
> I was wrong. It turns out that I was creating the variations in the test 
> results around the akpm merge myself by tiny changes in the way I ran the 
> tests. It took another round of about 30 compilations and tests purely in 
> this range to show that, but those same tests also made me aware of other 
> patterns I should look at.
> 

Once again, thanks for persisting with this for so long. That many tests
and searching is a miserable undertaking.

> Until a few days ago I was concentrating on "do I see SKB allocation errors 
> or not". Since then I've also been looking more consciously at when they 
> happen, at disk access patterns and at desktop freeze patterns.
> 
> I think I did mention before that this whole issue is rather subtle :-/

Indeed

> So, my apologies for fingering the wrong area for so long, but it looked 
> solid given the info available at the time.
> 
> On Thursday 15 October 2009, Mel Gorman wrote:
> > Outside the range of commits suspected of causing problems was the
> > following. It's extremely low probability
> >
> > Commit 8aa7e84 Fix congestion_wait() sync/async vs read/write confusion
> >         This patch alters the call to congestion_wait() in the page
> >         allocator. Frankly, I don't get the change but it might be worth
> >         checking if replacing BLK_RW_ASYNC with WRITE on top of 2.6.31
> >         makes any difference
> 
> This is the real culprit. Mel: thanks very much for looking beyond the 
> area I identified. Your overview of mm changes was exactly what I needed 
> and really helped a lot during my later tests.
> 

I'm surprised this made such a big difference, which is why I described
it as "extremely low probability". It implies that the real problem isn't
fragmentation per se but the timing of when pages get consumed.

Maybe what has really changed is how long direct reclaimers wait before trying
to allocate again. After the commit, if direct reclaimers are waiting longer
between direct reclaim attempts, it might mean that the GFP_KERNEL reclaimers
of high-order pages are doing less work than before and hurting parallel
GFP_ATOMIC users. Jens, does this sound plausible?

> This commit definitely causes most of the problems; confirmed by reverting 
> it on top of 2.6.31 (also requires reverting 373c0a7e, which is a later 
> build fix).
> 
> The rest of this mail gives details on my tests and how I reached the above 
> conclusion.
> 
> TEST BASELINE (2.6.30)
> ======================
> I mentioned in an earlier mail that I run three instances of gitk for my 
> tests. Loading gitk seems to consist of 3 phases:
> 1) general initial scan of the repository (branches?)
> 2) reading commits: commit counter increases
> 3) reading references (including bisection good/bad points) and
>    uncommitted changes
> 
> Below are times and comments per phase when the test is run with 2.6.30. As my 
> test starts after a clean boot, buffers are mostly empty.
> 
> 1st instance: 'gitk v2.6.29..master' (preparation)
> 1) ~20 seconds; user interface is mostly blank
> 2) ~5 seconds to read 35.000 commits; user interface is updated and counter
>    increases steadily as they are read
> 3) ~10 seconds; "branch"/"follows"/"precedes" info and tags are filled
>    in; fairly heavy disk activity
> 
> 2nd instance: 'gitk master' (preparation)
> 1) 0 seconds (because data is already buffered)
> 2) ~25 seconds to read 167500 commits; counter increases steadily
> 3) 1-2 seconds (because data is already buffered)
> 
> 3rd instance: 'gitk master' (the actual test)
> 1) 0 seconds because data is already buffered
> 2) ~55 seconds due to swapping overhead; minor music skip around commit
>    110.000; counter slower after 90.000, some short halts, but generally
>    increases steadily; moderate disk activity
> 3) ~55-60 seconds; because buffers have been emptied data must be read
>    again, with swapping; very heavy disk activity; fairly long music
>    skip (15-20 seconds), but no SKB allocation errors
> 
> So, the loading of the 3rd instance takes 1.5 minutes longer than the 
> second because of the swapping. And phase 3) is most affected by it.
> 
> AFTER WIRELESS CHANGE
> =====================
> After commit 4752c93c30 ("iwlcore: Allow skb allocation from tasklet") I 
> start getting the SKB errors. They can be triggered reliably if the whole 
> test is repeated 1 or 2 times, but generally not the first time the test 
> is run.

It's up to the wireless driver maintainer what to do here, but it seems
like that patch needs to be reverted and thought about some more before
trying again.

> 
> Or so I thought for a long time.
> It turns out that I will get SKB errors during the first run if I'm
> "sloppy" in the test execution. For example if I wait too long before 
> switching from the last gitk instance to konsole where I have 
> a 'tail -f /var/log/kern.log' running.

So the timing of when the high-order atomic allocations start kicking
in is critical.

> Another factor is the state of the repository: do I have master checked 
> out, or an older branch, or am I in the middle of a bisection. This 
> influences how data is read from the disk and thus the test results.
> A last factor may be the size of the kernel I'm using: my test/bisect 
> kernel is significantly smaller than my regular kernel.
> 
> If the test is run completely cleanly, I will not get SKB errors during the 
> first run. Also, this change does not affect the timings of the test at 
> all: the total load time of the 3rd instance is still ~1:55 and music 
> skips happen in roughly the same places. The pattern of disk activity also 
> remains unchanged.
> 
> If I do *not* run the test cleanly, any SKB errors during the first test 
> run will always be during phase 3), never during phase 2). This is what I 
> saw during tests in the 'akpm' range, and explains the inconsistent 
> results there.
> 
> After discovering this I've made a copy of the git repo so that I always 
> test using the exact same state and tightened my test procedure.
> 
> AFTER congestion_wait CHANGE
> ============================
> If I test commit 9f2d8be, which is just before the congestion_wait() 
> change, I still get the same pattern as described above. But when I test 
> with 8aa7e84 ("Fix congestion_wait() sync/async vs read/write confusion"), 
> things change dramatically when the 3rd gitk instance is started.
> 

So, assuming this is a timing problem, this commit affects the timing of
when pages are consumed by processes doing direct reclaim.

> During the 2nd phase I see the first SKB allocation errors with a music 
> skip between reading commits 95.000 and 110.000.
> About commit 115.000 there is a very long pause during which the counter 
> does not increase, music stops and the desktop freezes completely. The 
> first 30 seconds of that freeze there is only very low disk activity (which 
> seems strange);

I'm just going to have to depend on Jens here. Jens, congestion_wait() is
called with BLK_RW_ASYNC after the commit. Reclaim usually writes pages
asynchronously, but lumpy reclaim actually waits for pages to be written out
synchronously, so it's not always async.

Either way, reclaim is usually worried about writing pages but it would appear
after this change that a lot of read activity can also stall a process in
direct reclaim. What might be happening in Frans's particular case is that the
tasklet that allocates high-order pages for the RX buffers is getting stalled
by congestion caused by other processes doing reads from the filesystem.
While it makes sense from a congestion point of view to halt the IO, the
reclaim operations from direct reclaimers are getting delayed for long enough
to cause problems for GFP_ATOMIC.

Does this sound plausible to you? If so, what's the best way of
addressing this? Changing congestion_wait back to WRITE (assuming that
works for Frans)? Changing it to SYNC (again, assuming it actually
works) or a revert?
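
For anyone following along, the three candidates can be sketched in
user-space C. This is a minimal illustration, not kernel code: the constant
values are assumptions based on the 2.6.31-era headers (READ/WRITE and
BLK_RW_ASYNC/BLK_RW_SYNC), and congestion_queue_index(),
congestion_queue_name() and print_candidates() are hypothetical helpers that
only model "the argument selects one of two wait queues".

```c
#include <assert.h>
#include <stdio.h>

/* Assumed values, as recalled from 2.6.31-era kernel headers. */
#define READ  0
#define WRITE 1

enum {
	BLK_RW_ASYNC = 0,
	BLK_RW_SYNC  = 1,
};

/* Hypothetical model: the argument is used directly as the queue index. */
int congestion_queue_index(int arg)
{
	assert(arg == 0 || arg == 1);
	return arg;
}

/* Human-readable name of the queue selected by the argument. */
const char *congestion_queue_name(int arg)
{
	static const char *const names[2] = { "async", "sync" };

	return names[congestion_queue_index(arg)];
}

/* Print which queue each candidate from the thread would select. */
void print_candidates(void)
{
	printf("WRITE        -> index %d (%s)\n",
	       congestion_queue_index(WRITE), congestion_queue_name(WRITE));
	printf("BLK_RW_ASYNC -> index %d (%s)\n",
	       congestion_queue_index(BLK_RW_ASYNC),
	       congestion_queue_name(BLK_RW_ASYNC));
	printf("BLK_RW_SYNC  -> index %d (%s)\n",
	       congestion_queue_index(BLK_RW_SYNC),
	       congestion_queue_name(BLK_RW_SYNC));
}
```

Under these assumptions WRITE and BLK_RW_SYNC select the same queue, so
"back to WRITE" and "SYNC" would behave alike, while BLK_RW_ASYNC selects
the other one. Calling print_candidates() from a main() shows the mapping.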

> the next 25 seconds there suddenly is very high disk  
> activity during which things gradually unfreeze and more SKB errors are 
> displayed. After that the commit counter runs up fairly steadily again.
> 
> Phase 2) ends at ~1:45. Phase 3) (with more SKB errors) ends at ~2:05.
> 
> So this change almost doubles the time needed for phase 2) and causes SKB 
> allocation errors to occur during that phase. Also, before this commit the 
> desktop freezes are much shorter and less severe. With this change the 
> desktop is completely unusable for almost a minute during phase 2), with 
> even the mouse pointer frozen solid.
> Note that phase 3) becomes shorter, but that the total time needed to load 
> the 3rd instance increases by about 10-15 seconds.
> 
> Note: -rc2 and -rc3 had broken NFS, so I had to cherry-pick 3 NFS commits 
> from -rc4 on top of the commits I wanted to test.
> 
> WITH congestion_wait CHANGE REVERTED
> ====================================
> I've done quite a few tests of 2.6.31 with 373c0a7e and 8aa7e847 reverted 
> to confirm that's really the culprit. I've done this for .31-rc3, .31-rc4,
> .31-rc5, .31 and .31.1.
> 
> In all cases the huge freeze in phase 2) is gone and the general behavior 
> and timings are again as they were after the wireless change. During most 
> tests I did not get any SKB allocation errors during phase 2) or phase 3).
> 
> However with .31-rc5, .31 and .31.1 I have had some tests where I would see 
> a few SKB allocation errors during phase 3) (which is somewhat likely), 
> but also during phase 2). At this point I'm unsure whether this is just 
> noise, or maybe a minor influence from some change merged after .31-rc4.
> Looking through the commits there are several mm/page allocation changes.
> 

It could still be kswapd not being woken up often enough after direct
reclaimers. I took a look through the commits but none of the mm or
allocator changes struck me as likely candidates for making
fragmentation worse or altering the timing.

> For now I suggest ignoring this though as the impact (if any) is very minor 
> and it is not reproducible reliably enough.
> 
> Next I'll retest Mel's patches and also test Reinette's patches.
> 

Of the two patches, only the kswapd one should have any significance. As
David pointed out, the second patch is essentially a no-op as it should
not have been possible to enter direct reclaim with ALLOC_NO_WATERMARKS
set.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

WARNING: multiple messages have this Message-ID (diff)

From: Mel Gorman <mel@csn.ul.ie>
To: Frans Pop <elendil@planet.nl>
Cc: David Rientjes <rientjes@google.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>,
	Karol Lewandowski <karol.k.lewandowski@gmail.com>,
	Mohamed Abbas <mohamed.abbas@intel.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	linux-mm@kvack.org
Subject: Re: [Bug #14141] order 2 page allocation failures in iwlagn
Date: Mon, 19 Oct 2009 15:01:52 +0100	[thread overview]
Message-ID: <20091019140151.GC9036@csn.ul.ie> (raw)
In-Reply-To: <200910190133.33183.elendil@planet.nl>

On Mon, Oct 19, 2009 at 01:33:29AM +0200, Frans Pop wrote:
> Another long mail, sorry.
> 
> On Wednesday 14 October 2009, Frans Pop wrote:
> > > There still has not been a mm-change identified that makes
> > > fragmentation significantly worse.
> >
> > My bisection shows a very clear point, even if not an individual commit,
> > in the 'akpm' merge where SKB errors suddenly become *much* more
> > frequent and easy to trigger.
> > I'm sorry to say this, but the fact that nothing has been identified yet
> > is IMO the result of a lack of effort, not because there is no such
> > change.
> 
> I was wrong. It turns out that I was creating the variations in the test 
> results around the akpm merge myself by tiny changes in the way I ran the 
> tests. It took another round of about 30 compilations and tests purely in 
> this range to show that, but those same tests also made me aware of other 
> patterns I should look at.
> 

Once again, thanks for persisting with this for so long. That many tests
and searching is a miserable undertaking.

> Until a few days ago I was concentrating on "do I see SKB allocation errors 
> or not". Since then I've also been looking more consciously at when they 
> happen, at disk access patterns and at desktop freeze patterns.
> 
> I think I did mention before that this whole issue is rather subtle :-/

Indeed

> So, my apologies for finguering the wrong area for so long, but it looked 
> solid given the info available at the time.
> 
> On Thursday 15 October 2009, Mel Gorman wrote:
> > Outside the range of commits suspected of causing problems was the
> > following. It's extremely low probability
> >
> > Commit 8aa7e84 Fix congestion_wait() sync/async vs read/write confusion
> >         This patch alters the call to congestion_wait() in the page
> >         allocator. Frankly, I don't get the change but it might worth
> >         checking if replacing BLK_RW_ASYNC with WRITE on top of 2.6.31
> >         makes any difference
> 
> This is the real culprit. Mel: thanks very much for looking beyond the 
> area I identified. Your overview of mm changes was exactly what I needed 
> and really helped a lot during my later tests.
> 

I'm surprised this made such a big difference which is why I described
it as "extremely low probability". It implies that the real problem isn't
fragmentation per-se but the timing of when pages get consumed.

Maybe what has really changed is how long direct reclaimers wait before trying
to allocate again. After the commit, if direct reclaimers are waiting longer
between direct reclaim attempts, it might mean that the GFP_KERNEL reclaimers
of high-order pages are doing less work before and hurting parallel GFP_ATOMIC
users. Jens, does this sound plausible?

> This commit definitely causes most of the problems; confirmed by reverting 
> it on top of 2.6.31 (also requires reverting 373c0a7e, which is a later 
> build fix).
> 
> The rest of this mail gives details on my tests and how I reached the above 
> conclusion.
> 
> TEST BASELINE (2.6.30)
> ======================
> I mentioned in an earlier mail that I run three instances of gitk for my 
> tests. Loading gitk seems to consist of 3 phases:
> 1) general initial scan of the repository (branches?)
> 2) reading commits: commit counter increases
> 3) reading references (including bisection good/bad points) and
>    uncommitted changes
> 
> Below times and comments per stage when the test is run with 2.6.30. As my 
> test starts after a clean boot, buffers are mostly empty.
> 
> 1st instance: 'gitk v2.6.29..master' (preparation)
> 1) ~20 seconds; user interface is mostly blank
> 2) ~5 seconds to read 35.000 commits; user interface is updated and counter
>    increases steadily as they are read
> 3) ~10 seconds; "branch"/"follows"/"precedes" info and tags are filled
>    in; fairly heavy disk activity
> 
> 2st instance: 'gitk master' (preparation)
> 1) 0 seconds (because data is already buffered)
> 2) ~25 seconds to read 167500 commits; counter increases steadily
> 3) 1-2 seconds (because data is already buffered)
> 
> 3st instance: 'gitk master' (the actual test)
> 1) 0 seconds because data is already buffered
> 2) ~55 seconds due to swapping overhead; minor music skip around commit
>    110.000; counter slower after 90.000, some short halts, but generally
>    increases steadily; moderate disk activity
> 3) ~55-60 seconds; because buffers have been emptied data must by read
>    again, with swapping; very heavy disk activity; fairly long music
>    skip (15-20 seconds), but no SKB allocation errors
> 
> So, the loading of the 3rd instance takes 1.5 minutes longer than the 
> second because of the swapping. And phase 3) is most affected by it.
> 
> AFTER WIRELESS CHANGE
> =====================
> After commit 4752c93c30 ("iwlcore: Allow skb allocation from tasklet") I 
> start getting the SKB errors. They can be triggered reliably if the whole 
> test is repeated 1 or 2 times, but generally not the first time the test 
> is run.

It's up to the wireless driver maintainer what to do here, but it seems
like that patch needs to be reverted and thought about some more before
trying again.

> 
> Or so I thought for a long time.
> It turns out that I will get SKB errors during the first run if I'm
> "sloppy" in the test execution. For example if I wait too long before 
> switching from the last gitk instance to konsole where I have 
> a 'tail -f /var/log/kern.log' running.

So the timing is critical of when the high-order atomic allocations
start kicking in.

> Another factor is the state of the repository: do I have master checked 
> out, or an older branch, or am I in the middle of a bisection. This 
> influences how data is read from the disk and thus the test results.
> A last factor may be the size of the kernel I'm using: my test/bisect 
> kernel is significantly smaller than my regular kernel.
> 
> If the test is run completely cleanly, I will not get SKB errors during the 
> first run. Also, this change does not affect the timings of the test at 
> all: the total load time of the 3rd instance is still ~1:55 and music 
> skips happen in roughly the same places. The pattern of disk activity also 
> remains unchanged.
> 
> If I do *not* run the test cleanly, any SKB errors during the first test 
> run will always be during phase 3), never during phase 2). This is what I 
> saw during tests in the 'akpm' range, and explains the inconsistent 
> results there.
> 
> After discovering this I've made a copy of the git repo so that I always 
> test using the exact same state and tightened my test procedure.
> 
> AFTER congestion_wait CHANGE
> ============================
> If I test commit 9f2d8be, which is just before the congestion_wait() 
> change, I still get the same pattern as described above. But when I test 
> with 8aa7e84 ("Fix congestion_wait() sync/async vs read/write confusion"), 
> things change dramatically when the 3rd gitk instance is started.
> 

So, assuming this is a timing problem, this commit affects the timing of
when pages are consumed by processes doing direct reclaim.

> During the 2nd phase I see the first SKB allocation errors with a music 
> skip between reading commits 95.000 and 110.000.
> About commit 115.000 there is a very long pause during which the counter 
> does not increase, music stops and the desktop freezes completely. The 
> first 30 seconds of that freeze there is only very low disk activity (which 
> seems strange);

I'm just going to have to depend on Jens here. Jens, the congestion_wait() is
on BLK_RW_ASYNC after the commit. Reclaim usually writes pages asynchronously
but lumpy reclaim actually waits of pages to write out synchronously so
it's not always async.

Either way, reclaim is usually worried about writing pages but it would appear
after this change that a lot of read activity can also stall a process in
direct reclaim. What might be happening in Frans's particular case is that the
tasklet that allocates high-order pages for the RX buffers is getting stalled
by congestion caused by other processes doing reads from the filesystem.
While it makes sense from a congestion point of view to halt the IO, the
reclaim operations from direct reclaimers is getting delayed for long enough
to cause problems for GFP_ATOMIC.

Does this sound plausible to you? If so, what's the best way of
addressing this? Changing congestion_wait back to WRITE (assuming that
works for Frans)? Changing it to SYNC (again, assuming it actually
works) or a revert?

> the next 25 seconds there suddenly is very high disk  
> activity during which things gradually unfreeze and more SKB errors are 
> displayed. After that the commit counter runs up fairly steadily again.
> 
> Phase 2) ends at ~1:45. Phase 3) (with more SKB errors) ends at ~2:05.
> 
> So this change almost doubles the time needed for phase 2) and causes SKB 
> allocation errors to occur during that phase. Also, before this commit the 
> desktop freezes are much shorter and less severe. With this change the 
> desktop is completely unusable for almost a minute during phase 2), with 
> even the mouse pointer frozen solid.
> Note that phase 3) becomes shorter, but that the total time needed to load 
> the 3rd instance increases by about 10-15 seconds.
> 
> Note: -rc2 and -rc3 had broken NFS, so I had to cherry-pick 3 NFS commits 
> from -rc4 on top of the commits I wanted to test.
> 
> WITH congestion_wait CHANGE REVERTED
> ====================================
> I've done quite a few tests of 2.6.31 with 373c0a7e and 8aa7e847 reverted 
> to confirm that's really the culprit. I've done this for .31-rc3, .31-rc4,
> .31-rc5, .31 and .31.1.
> 
> In all cases the huge freeze in phase 2) is gone and the general behavior 
> and timings are again as it was after the wireless change. During most 
> tests I did not get any SKB allocation errors during phase 2) or phase 3).
> 
> However with .31-rc5, .31 and .31.1 I have had some tests where I would see 
> a few SKB allocation errors during phase 3) (which is somewhat likely), 
> but also during phase 2). At this point I'm unsure whether this is just 
> noise, or maybe a minor influence from some change merged after .31-rc4.
> Looking through the commits there are several mm/page allocation changes.
> 

It could still be kswapd not being woken up often enough after direct
reclaimers. I took a look through the commits but none of the mm or
allocator changes struck me as likely candidates for making
fragmentation worse or altering the timing.

> For now I suggest ignoring this though as the impact (if any) is very minor 
> and it is not reproducible reliably enough.
> 
> Next I'll retest Mel's patches and also test Reinette's patches.
> 

Of the two patches, only the kswapd one should have any significance. As
David pointed out, the second patch is essentially a no-op as it should
not have been possible to enter direct reclaim with ALLOC_NO_WATERMARKS
set.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

WARNING: multiple messages have this Message-ID (diff)

From: Mel Gorman <mel@csn.ul.ie>
To: Frans Pop <elendil@planet.nl>
Cc: David Rientjes <rientjes@google.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>,
	Karol Lewandowski <karol.k.lewandowski@gmail.com>,
	Mohamed Abbas <mohamed.abbas@intel.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	linux-mm@kvack.org
Subject: Re: [Bug #14141] order 2 page allocation failures in iwlagn
Date: Mon, 19 Oct 2009 15:01:52 +0100	[thread overview]
Message-ID: <20091019140151.GC9036@csn.ul.ie> (raw)
In-Reply-To: <200910190133.33183.elendil@planet.nl>

On Mon, Oct 19, 2009 at 01:33:29AM +0200, Frans Pop wrote:
> Another long mail, sorry.
> 
> On Wednesday 14 October 2009, Frans Pop wrote:
> > > There still has not been a mm-change identified that makes
> > > fragmentation significantly worse.
> >
> > My bisection shows a very clear point, even if not an individual commit,
> > in the 'akpm' merge where SKB errors suddenly become *much* more
> > frequent and easy to trigger.
> > I'm sorry to say this, but the fact that nothing has been identified yet
> > is IMO the result of a lack of effort, not because there is no such
> > change.
> 
> I was wrong. It turns out that I was creating the variations in the test 
> results around the akpm merge myself by tiny changes in the way I ran the 
> tests. It took another round of about 30 compilations and tests purely in 
> this range to show that, but those same tests also made me aware of other 
> patterns I should look at.
> 

Once again, thanks for persisting with this for so long. That many tests
and searching is a miserable undertaking.

> Until a few days ago I was concentrating on "do I see SKB allocation errors 
> or not". Since then I've also been looking more consciously at when they 
> happen, at disk access patterns and at desktop freeze patterns.
> 
> I think I did mention before that this whole issue is rather subtle :-/

Indeed

> So, my apologies for finguering the wrong area for so long, but it looked 
> solid given the info available at the time.
> 
> On Thursday 15 October 2009, Mel Gorman wrote:
> > Outside the range of commits suspected of causing problems was the
> > following. It's extremely low probability
> >
> > Commit 8aa7e84 Fix congestion_wait() sync/async vs read/write confusion
> >         This patch alters the call to congestion_wait() in the page
> >         allocator. Frankly, I don't get the change but it might worth
> >         checking if replacing BLK_RW_ASYNC with WRITE on top of 2.6.31
> >         makes any difference
> 
> This is the real culprit. Mel: thanks very much for looking beyond the 
> area I identified. Your overview of mm changes was exactly what I needed 
> and really helped a lot during my later tests.
> 

I'm surprised this made such a big difference which is why I described
it as "extremely low probability". It implies that the real problem isn't
fragmentation per-se but the timing of when pages get consumed.

Maybe what has really changed is how long direct reclaimers wait before trying
to allocate again. After the commit, if direct reclaimers are waiting longer
between direct reclaim attempts, it might mean that the GFP_KERNEL reclaimers
of high-order pages are doing less work before and hurting parallel GFP_ATOMIC
users. Jens, does this sound plausible?

> This commit definitely causes most of the problems; confirmed by reverting 
> it on top of 2.6.31 (also requires reverting 373c0a7e, which is a later 
> build fix).
> 
> The rest of this mail gives details on my tests and how I reached the above 
> conclusion.
> 
> TEST BASELINE (2.6.30)
> ======================
> I mentioned in an earlier mail that I run three instances of gitk for my 
> tests. Loading gitk seems to consist of 3 phases:
> 1) general initial scan of the repository (branches?)
> 2) reading commits: commit counter increases
> 3) reading references (including bisection good/bad points) and
>    uncommitted changes
> 
> Below times and comments per stage when the test is run with 2.6.30. As my 
> test starts after a clean boot, buffers are mostly empty.
> 
> 1st instance: 'gitk v2.6.29..master' (preparation)
> 1) ~20 seconds; user interface is mostly blank
> 2) ~5 seconds to read 35.000 commits; user interface is updated and counter
>    increases steadily as they are read
> 3) ~10 seconds; "branch"/"follows"/"precedes" info and tags are filled
>    in; fairly heavy disk activity
> 
> 2st instance: 'gitk master' (preparation)
> 1) 0 seconds (because data is already buffered)
> 2) ~25 seconds to read 167500 commits; counter increases steadily
> 3) 1-2 seconds (because data is already buffered)
> 
> 3st instance: 'gitk master' (the actual test)
> 1) 0 seconds because data is already buffered
> 2) ~55 seconds due to swapping overhead; minor music skip around commit
>    110.000; counter slower after 90.000, some short halts, but generally
>    increases steadily; moderate disk activity
> 3) ~55-60 seconds; because buffers have been emptied data must by read
>    again, with swapping; very heavy disk activity; fairly long music
>    skip (15-20 seconds), but no SKB allocation errors
> 
> So, the loading of the 3rd instance takes 1.5 minutes longer than the 
> second because of the swapping. And phase 3) is most affected by it.
> 
> AFTER WIRELESS CHANGE
> =====================
> After commit 4752c93c30 ("iwlcore: Allow skb allocation from tasklet") I 
> start getting the SKB errors. They can be triggered reliably if the whole 
> test is repeated 1 or 2 times, but generally not the first time the test 
> is run.

It's up to the wireless driver maintainer what to do here, but it seems
like that patch needs to be reverted and thought about some more before
trying again.

> 
> Or so I thought for a long time.
> It turns out that I will get SKB errors during the first run if I'm
> "sloppy" in the test execution. For example if I wait too long before 
> switching from the last gitk instance to konsole where I have 
> a 'tail -f /var/log/kern.log' running.

So the timing is critical of when the high-order atomic allocations
start kicking in.

> Another factor is the state of the repository: do I have master checked 
> out, or an older branch, or am I in the middle of a bisection. This 
> influences how data is read from the disk and thus the test results.
> A last factor may be the size of the kernel I'm using: my test/bisect 
> kernel is significantly smaller than my regular kernel.
> 
> If the test is run completely cleanly, I will not get SKB errors during the 
> first run. Also, this change does not affect the timings of the test at 
> all: the total load time of the 3rd instance is still ~1:55 and music 
> skips happen in roughly the same places. The pattern of disk activity also 
> remains unchanged.
> 
> If I do *not* run the test cleanly, any SKB errors during the first test 
> run will always be during phase 3), never during phase 2). This is what I 
> saw during tests in the 'akpm' range, and explains the inconsistent 
> results there.
> 
> After discovering this I've made a copy of the git repo so that I always 
> test using the exact same state and tightened my test procedure.
> 
> AFTER congestion_wait CHANGE
> ============================
> If I test commit 9f2d8be, which is just before the congestion_wait() 
> change, I still get the same pattern as described above. But when I test 
> with 8aa7e84 ("Fix congestion_wait() sync/async vs read/write confusion"), 
> things change dramatically when the 3rd gitk instance is started.
> 

So, assuming this is a timing problem, this commit affects the timing of
when pages are consumed by processes doing direct reclaim.

> During the 2nd phase I see the first SKB allocation errors with a music 
> skip between reading commits 95.000 and 110.000.
> About commit 115.000 there is a very long pause during which the counter 
> does not increase, music stops and the desktop freezes completely. The 
> first 30 seconds of that freeze there is only very low disk activity (which 
> seems strange);

I'm just going to have to depend on Jens here. Jens, the congestion_wait() is
on BLK_RW_ASYNC after the commit. Reclaim usually writes pages asynchronously
but lumpy reclaim actually waits for pages to be written out synchronously so
it's not always async.

Either way, reclaim is usually worried about writing pages but it would appear
after this change that a lot of read activity can also stall a process in
direct reclaim. What might be happening in Frans's particular case is that the
tasklet that allocates high-order pages for the RX buffers is getting stalled
by congestion caused by other processes doing reads from the filesystem.
While it makes sense from a congestion point of view to halt the IO, the
reclaim operations from direct reclaimers are getting delayed for long enough
to cause problems for GFP_ATOMIC allocations.

Does this sound plausible to you? If so, what's the best way of
addressing this? Changing congestion_wait back to WRITE (assuming that
works for Frans)? Changing it to SYNC (again, assuming it actually
works) or a revert?
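
To make the options above concrete, here is a small Python model of which
congestion wait queue each variant of the call would sleep on. It is only a
sketch and not kernel code: the numeric values of READ/WRITE and
BLK_RW_ASYNC/BLK_RW_SYNC are assumed from the 2.6.31-era headers, and
congestion_queue() and candidates() are hypothetical helpers invented for
the illustration.

```python
# Illustrative model only; not kernel code.
# Assumed constant values (2.6.31-era headers): READ/WRITE from fs.h,
# BLK_RW_ASYNC/BLK_RW_SYNC from backing-dev.h.
READ, WRITE = 0, 1
BLK_RW_ASYNC, BLK_RW_SYNC = 0, 1

# The congestion code keeps two wait queues, indexed by the argument
# passed to congestion_wait().
QUEUE_NAMES = {BLK_RW_ASYNC: "async", BLK_RW_SYNC: "sync"}


def congestion_queue(arg):
    """Return the name of the wait queue selected by this argument."""
    return QUEUE_NAMES[arg]


def candidates():
    """Map each call variant discussed in the thread to its wait queue."""
    return {
        "before 8aa7e84: congestion_wait(WRITE)": congestion_queue(WRITE),
        "after 8aa7e84: congestion_wait(BLK_RW_ASYNC)": congestion_queue(BLK_RW_ASYNC),
        "option: congestion_wait(BLK_RW_SYNC)": congestion_queue(BLK_RW_SYNC),
    }


if __name__ == "__main__":
    for call, queue in candidates().items():
        print(f"{call:48s} -> {queue} queue")
```

If the assumed values are right, going back to WRITE and switching to SYNC
select the same queue, so those two options would be expected to behave
alike; only testing on Frans's setup can confirm that.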

> the next 25 seconds there suddenly is very high disk  
> activity during which things gradually unfreeze and more SKB errors are 
> displayed. After that the commit counter runs up fairly steadily again.
> 
> Phase 2) ends at ~1:45. Phase 3) (with more SKB errors) ends at ~2:05.
> 
> So this change almost doubles the time needed for phase 2) and causes SKB 
> allocation errors to occur during that phase. Also, before this commit the 
> desktop freezes are much shorter and less severe. With this change the 
> desktop is completely unusable for almost a minute during phase 2), with 
> even the mouse pointer frozen solid.
> Note that phase 3) becomes shorter, but that the total time needed to load 
> the 3rd instance increases by about 10-15 seconds.
> 
> Note: -rc2 and -rc3 had broken NFS, so I had to cherry-pick 3 NFS commits 
> from -rc4 on top of the commits I wanted to test.
> 
> WITH congestion_wait CHANGE REVERTED
> ====================================
> I've done quite a few tests of 2.6.31 with 373c0a7e and 8aa7e847 reverted 
> to confirm that's really the culprit. I've done this for .31-rc3, .31-rc4,
> .31-rc5, .31 and .31.1.
> 
> In all cases the huge freeze in phase 2) is gone and the general behavior 
> and timings are again as it was after the wireless change. During most 
> tests I did not get any SKB allocation errors during phase 2) or phase 3).
> 
> However with .31-rc5, .31 and .31.1 I have had some tests where I would see 
> a few SKB allocation errors during phase 3) (which is somewhat likely), 
> but also during phase 2). At this point I'm unsure whether this is just 
> noise, or maybe a minor influence from some change merged after .31-rc4.
> Looking through the commits there are several mm/page allocation changes.
> 

It could still be kswapd not being woken up often enough after direct
reclaimers. I took a look through the commits but none of the mm or
allocator changes struck me as likely candidates for making
fragmentation worse or altering the timing.

> For now I suggest ignoring this though as the impact (if any) is very minor 
> and it is not reproducible reliably enough.
> 
> Next I'll retest Mel's patches and also test Reinette's patches.
> 

Of the two patches, only the kswapd one should have any significance. As
David pointed out, the second patch is essentially a no-op as it should
not have been possible to enter direct reclaim with ALLOC_NO_WATERMARKS
set.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

