From: Andrew Morton <akpm@osdl.org>
To: Helge Hafting <helgehaf@aitel.hist.no>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.6.17-mm one process gets stuck in infinite loop in the kernel.
Date: Fri, 30 Jun 2006 16:55:32 -0700
Message-ID: <20060630165532.5eadf286.akpm@osdl.org>
In-Reply-To: <20060630215405.GA9744@aitel.hist.no>
Helge Hafting <helgehaf@aitel.hist.no> wrote:
>
> On Thu, Jun 29, 2006 at 10:41:17AM -0700, Andrew Morton wrote:
> > On Thu, 29 Jun 2006 13:25:20 +0200
> > Helge Hafting <helge.hafting@aitel.hist.no> wrote:
> >
> > > I have seen this with mm2, mm3 and mm4.
> > > Suddenly, the load meter jumps.
> > > Using ps & top, I see one process using 100% cpu.
> > > This is always a process that was exiting; this tends to happen
> > > when I close applications, or do Debian upgrades, which
> > > run lots of short-lived processes.
> > >
> > > I believe it is running in the kernel, ps lists it with stat "RN"
> > > and it cannot be killed, not even with kill -9 from root.
> > >
> > > Something wrong with process termination?
> > >
> >
> > Please generate a kernel profile when it happens so we can see
> > where it got stuck.
> >
> > <boot with profile=1>
> > <wait for it to happen>
> > readprofile -r
> > sleep 10
> > readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
>
> It was easier to reproduce on my home machine, running mm2.
> I followed the recipe above, except typing manually means
> the wait was more than 10s.
>
> Output from the pipe above:
> ffffffff801f9050 do_get_write_access 21 0,0170
> ffffffff80111c20 __do_softirq 28 0,1591
> ffffffff8012e0e0 vm_stat_account 28 0,2917
> ffffffff80194890 search_exception_tables 49 1,5312
> ffffffff801f1380 ext3_journal_start_sb 82 1,0250
> ffffffff801fd670 __log_space_left 89 2,7812
> ffffffff8010bf50 __wake_up_bit 125 2,6042
> ffffffff8010c2f0 put_page 141 2,9375
> ffffffff8010dd30 mark_page_accessed 173 2,1625
> ffffffff801a3340 __filemap_copy_from_user_iovec_inatomic 195 1,7411
> ffffffff801643f0 cond_resched 200 1,5625
> ffffffff8015f241 error_exit 243 1,8409
> ffffffff801a5dd0 balance_dirty_pages_ratelimited_nr 258 0,5375
> ffffffff801f9aa0 journal_start 321 1,0559
> ffffffff8011ac00 page_waitqueue 381 3,9688
> ffffffff80139ea0 generic_commit_write 425 4,4271
> ffffffff80117500 __block_commit_write 490 2,3558
> ffffffff8010b320 __down_read_trylock 502 10,4583
> ffffffff8013d100 block_prepare_write 577 12,0208
> ffffffff80121800 __up_read 583 3,3125
> ffffffff80117bc0 unlock_page 692 14,4167
> ffffffff801fd710 journal_blocks_per_page 834 26,0625
> ffffffff8012e0c0 __wake_up 894 27,9375
> ffffffff80109d70 kmem_cache_alloc 1126 17,5938
> ffffffff801e78b0 walk_page_buffers 1265 7,1875
> ffffffff801ea920 ext3_ordered_commit_write 1352 5,2812
> ffffffff801074c0 kmem_cache_free 1580 7,5962
> ffffffff801f11e0 __ext3_journal_stop 1696 17,6667
> ffffffff801f96b0 start_this_handle 1871 1,8562
> ffffffff801f8e50 journal_stop 1906 3,7227
> ffffffff801eacf0 ext3_prepare_write 1991 5,9256
> ffffffff80113440 find_lock_page 2053 14,2569
> ffffffff8010f690 generic_file_buffered_write 2139 1,2612
> ffffffff8010deb0 __block_prepare_write 2284 1,9291
> ffffffff801e85c0 ext3_writepage_trans_blocks 2799 19,4375
> ffffffff80166356 bad_gs 3352 0,4129
> ffffffff80179070 search_extable 3868 34,5357
> ffffffff8010b400 find_vma 4201 37,5089
> ffffffff8010a180 do_page_fault 10753 5,1302
> 0000000000000000 total 51658 0,0124

Oh.  This is probably the generic_file_buffered_write() hang, due to
zero-length iovec segments.

If so, the below should fix it up.

The presence of do_page_fault() in that trace is interesting.  At a guess,
I'd say that userspace is passing in a bad iovec.iov_base as well as
iovec.iov_len=0, and the kernel's copy_from_user() implementation is
needlessly dereferencing the pointer, getting a fault, then seeing that it
didn't need to copy anything anyway.  hmm.

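For illustration only, a minimal userspace sketch of the kind of call
guessed at above: a writev() whose vector starts with a zero-length
segment.  The helper name write_with_empty_segment() and the temporary
path are hypothetical, NULL is just one example of an iov_base that is
never meant to be dereferenced, and nothing in the report confirms which
application actually issued such a call.  On a kernel without this bug
the call is expected to write only the payload bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "hello" through a two-segment vector whose
 * first segment has iov_len == 0 and an unusable iov_base.
 * Returns writev()'s result, or -1 if the temporary file cannot be created.
 */
static ssize_t write_with_empty_segment(void)
{
	char path[] = "/tmp/zero-iovec-XXXXXX";	/* assumed scratch location */
	char payload[] = "hello";
	struct iovec vec[2] = {
		{ .iov_base = NULL,    .iov_len = 0 },	/* zero-length segment */
		{ .iov_base = payload, .iov_len = sizeof(payload) - 1 },
	};
	ssize_t ret;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);			/* the file goes away once fd is closed */

	ret = writev(fd, vec, 2);
	/* writev() can never report more bytes than were requested */
	assert(ret <= (ssize_t)vec[1].iov_len);

	close(fd);
	return ret;
}
```

Whether this exact pattern triggers the hang on an affected -mm kernel
is an assumption; it only shows what a zero-length segment looks like
from the caller's side.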
diff -puN mm/filemap.c~generic_file_buffered_write-handle-zero-length-iovec-segments-stable mm/filemap.c
--- a/mm/filemap.c~generic_file_buffered_write-handle-zero-length-iovec-segments-stable
+++ a/mm/filemap.c
@@ -2125,6 +2125,12 @@ generic_file_buffered_write(struct kiocb
 			break;
 		}
 
+		if (unlikely(bytes == 0)) {
+			status = 0;
+			copied = 0;
+			goto zero_length_segment;
+		}
+
 		status = a_ops->prepare_write(file, page, offset, offset+bytes);
 		if (unlikely(status)) {
 			loff_t isize = i_size_read(inode);
@@ -2154,7 +2160,8 @@ generic_file_buffered_write(struct kiocb
 			page_cache_release(page);
 			continue;
 		}
-		if (likely(copied > 0)) {
+zero_length_segment:
+		if (likely(copied >= 0)) {
 			if (!status)
 				status = copied;
 
diff -puN mm/filemap.h~generic_file_buffered_write-handle-zero-length-iovec-segments-stable mm/filemap.h
--- a/mm/filemap.h~generic_file_buffered_write-handle-zero-length-iovec-segments-stable
+++ a/mm/filemap.h
@@ -88,7 +88,7 @@ filemap_set_next_iovec(const struct iove
 	const struct iovec *iov = *iovp;
 	size_t base = *basep;
 
-	while (bytes) {
+	do {
 		int copy = min(bytes, iov->iov_len - base);
 
 		bytes -= copy;
@@ -97,7 +97,7 @@ filemap_set_next_iovec(const struct iove
 			iov++;
 			base = 0;
 		}
-	}
+	} while (bytes);
 	*iovp = iov;
 	*basep = base;
 }
_
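
To make the effect of the mm/filemap.h hunk concrete, here is a small
userspace model of the two loop shapes.  It is a sketch, not kernel
code: the names set_next_iovec_old(), set_next_iovec_new() and
segments_skipped() are made up for this example, min() is open-coded,
and size_t is used for the copy length.  It only shows that with
bytes == 0 the while() form never steps past a zero-length segment,
whereas the do/while form does.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

typedef void (*advance_fn)(const struct iovec **, size_t *, size_t);

/* Model of the pre-patch loop: a no-op when bytes == 0. */
static void set_next_iovec_old(const struct iovec **iovp, size_t *basep,
			       size_t bytes)
{
	const struct iovec *iov = *iovp;
	size_t base = *basep;

	while (bytes) {
		size_t copy = iov->iov_len - base;

		if (copy > bytes)
			copy = bytes;
		bytes -= copy;
		base += copy;
		if (iov->iov_len == base) {
			iov++;
			base = 0;
		}
	}
	*iovp = iov;
	*basep = base;
}

/* Model of the patched loop: the body runs once even when bytes == 0. */
static void set_next_iovec_new(const struct iovec **iovp, size_t *basep,
			       size_t bytes)
{
	const struct iovec *iov = *iovp;
	size_t base = *basep;

	do {
		size_t copy = iov->iov_len - base;

		if (copy > bytes)
			copy = bytes;
		bytes -= copy;
		base += copy;
		if (iov->iov_len == base) {
			iov++;
			base = 0;
		}
	} while (bytes);
	*iovp = iov;
	*basep = base;
}

/*
 * Hypothetical helper: start at a zero-length segment, advance by
 * `bytes` using `fn`, and report how many segments were stepped past.
 */
static size_t segments_skipped(advance_fn fn, size_t bytes)
{
	static char data[4];
	const struct iovec vec[2] = {
		{ .iov_base = data, .iov_len = 0 },		/* empty */
		{ .iov_base = data, .iov_len = sizeof(data) },
	};
	const struct iovec *cur = vec;
	size_t base = 0;

	assert(fn != NULL);
	fn(&cur, &base, bytes);
	return (size_t)(cur - vec);
}
```

In this model segments_skipped(set_next_iovec_old, 0) leaves the cursor
on the empty segment, which is consistent with the caller retrying the
same zero-byte copy forever, while segments_skipped(set_next_iovec_new, 0)
moves on to the next segment.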