From: Andrew Morton <akpm@osdl.org>
To: Helge Hafting <helgehaf@aitel.hist.no>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.6.17-mm one process gets stuck in infinite loop in the kernel.
Date: Fri, 30 Jun 2006 16:55:32 -0700
Message-ID: <20060630165532.5eadf286.akpm@osdl.org>
In-Reply-To: <20060630215405.GA9744@aitel.hist.no>

Helge Hafting <helgehaf@aitel.hist.no> wrote:
>
> On Thu, Jun 29, 2006 at 10:41:17AM -0700, Andrew Morton wrote:
> > On Thu, 29 Jun 2006 13:25:20 +0200
> > Helge Hafting <helge.hafting@aitel.hist.no> wrote:
> > 
> > > I have seen this with mm2, mm3 and mm4.
> > > Suddenly, the load meter jumps.
> > > Using ps & top, I see one process using 100% cpu.
> > > This is always a process that was exiting; this tends to happen
> > > when I close applications, or when doing Debian upgrades, which
> > > run lots of short-lived processes.
> > > 
> > > I believe it is running in the kernel, ps lists it with stat "RN"
> > > and it cannot be killed, not even with kill -9 from root.
> > > 
> > > Something wrong with process termination?
> > > 
> > 
> > Please generate a kernel profile when it happens so we can see
> > where it got stuck.
> > 
> > <boot with profile=1>
> > <wait for it to happen>
> > readprofile -r
> > sleep 10
> > readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
> 
> It was easier to reproduce on my home machine, running mm2.
> I followed the recipe above, except typing manually means
> the wait was more than 10s.
> 
> Output from the pipe above:
> ffffffff801f9050 do_get_write_access                          21   0,0170
> ffffffff80111c20 __do_softirq                                 28   0,1591
> ffffffff8012e0e0 vm_stat_account                              28   0,2917
> ffffffff80194890 search_exception_tables                      49   1,5312
> ffffffff801f1380 ext3_journal_start_sb                        82   1,0250
> ffffffff801fd670 __log_space_left                             89   2,7812
> ffffffff8010bf50 __wake_up_bit                               125   2,6042
> ffffffff8010c2f0 put_page                                    141   2,9375
> ffffffff8010dd30 mark_page_accessed                          173   2,1625
> ffffffff801a3340 __filemap_copy_from_user_iovec_inatomic     195   1,7411
> ffffffff801643f0 cond_resched                                200   1,5625
> ffffffff8015f241 error_exit                                  243   1,8409
> ffffffff801a5dd0 balance_dirty_pages_ratelimited_nr          258   0,5375
> ffffffff801f9aa0 journal_start                               321   1,0559
> ffffffff8011ac00 page_waitqueue                              381   3,9688
> ffffffff80139ea0 generic_commit_write                        425   4,4271
> ffffffff80117500 __block_commit_write                        490   2,3558
> ffffffff8010b320 __down_read_trylock                         502  10,4583
> ffffffff8013d100 block_prepare_write                         577  12,0208
> ffffffff80121800 __up_read                                   583   3,3125
> ffffffff80117bc0 unlock_page                                 692  14,4167
> ffffffff801fd710 journal_blocks_per_page                     834  26,0625
> ffffffff8012e0c0 __wake_up                                   894  27,9375
> ffffffff80109d70 kmem_cache_alloc                           1126  17,5938
> ffffffff801e78b0 walk_page_buffers                          1265   7,1875
> ffffffff801ea920 ext3_ordered_commit_write                  1352   5,2812
> ffffffff801074c0 kmem_cache_free                            1580   7,5962
> ffffffff801f11e0 __ext3_journal_stop                        1696  17,6667
> ffffffff801f96b0 start_this_handle                          1871   1,8562
> ffffffff801f8e50 journal_stop                               1906   3,7227
> ffffffff801eacf0 ext3_prepare_write                         1991   5,9256
> ffffffff80113440 find_lock_page                             2053  14,2569
> ffffffff8010f690 generic_file_buffered_write                2139   1,2612
> ffffffff8010deb0 __block_prepare_write                      2284   1,9291
> ffffffff801e85c0 ext3_writepage_trans_blocks                2799  19,4375
> ffffffff80166356 bad_gs                                     3352   0,4129
> ffffffff80179070 search_extable                             3868  34,5357
> ffffffff8010b400 find_vma                                   4201  37,5089
> ffffffff8010a180 do_page_fault                             10753   5,1302
> 0000000000000000 total                                     51658   0,0124


Oh.  This is probably the generic_file_buffered_write() hang, due to
zero-length iovec segments.

If so, the below should fix it up.

The presence of do_page_fault() in that trace is interesting.  At a guess,
I'd say that userspace is passing in a bad iovec.iov_base as well as
iovec.iov_len=0, and the kernel's copy_from_user() implementation is
needlessly dereferencing the pointer, getting a fault, then seeing that it
didn't need to copy anything anyway.  hmm.
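
For illustration only (this is not from the original report): a minimal
userspace sketch of the kind of call suspected here, a writev() whose
iovec array contains a zero-length segment.  The helper name
bytes_written_with_empty_segment() is hypothetical, and the sketch uses a
valid iov_base for the empty segment rather than a bad pointer.  On a
kernel that handles empty segments correctly the call is expected to
write all six bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "abc", an empty segment, then "def" to a
 * temporary file with a single writev() call and return its result.
 */
ssize_t bytes_written_with_empty_segment(void)
{
	char head[] = "abc";
	char tail[] = "def";
	struct iovec vec[3] = {
		{ .iov_base = head, .iov_len = 3 },
		{ .iov_base = head, .iov_len = 0 },	/* zero-length segment */
		{ .iov_base = tail, .iov_len = 3 },
	};
	FILE *f = tmpfile();
	ssize_t ret;

	assert(f != NULL);	/* sketch only: no real error handling */
	ret = writev(fileno(f), vec, 3);
	fclose(f);
	return ret;
}
```

The empty segment contributes nothing to the byte count, so the only
question is whether the kernel steps over it or keeps retrying it.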


diff -puN mm/filemap.c~generic_file_buffered_write-handle-zero-length-iovec-segments-stable mm/filemap.c
--- a/mm/filemap.c~generic_file_buffered_write-handle-zero-length-iovec-segments-stable
+++ a/mm/filemap.c
@@ -2125,6 +2125,12 @@ generic_file_buffered_write(struct kiocb
 			break;
 		}
 
+		if (unlikely(bytes == 0)) {
+			status = 0;
+			copied = 0;
+			goto zero_length_segment;
+		}
+
 		status = a_ops->prepare_write(file, page, offset, offset+bytes);
 		if (unlikely(status)) {
 			loff_t isize = i_size_read(inode);
@@ -2154,7 +2160,8 @@ generic_file_buffered_write(struct kiocb
 			page_cache_release(page);
 			continue;
 		}
-		if (likely(copied > 0)) {
+zero_length_segment:
+		if (likely(copied >= 0)) {
 			if (!status)
 				status = copied;
 
diff -puN mm/filemap.h~generic_file_buffered_write-handle-zero-length-iovec-segments-stable mm/filemap.h
--- a/mm/filemap.h~generic_file_buffered_write-handle-zero-length-iovec-segments-stable
+++ a/mm/filemap.h
@@ -88,7 +88,7 @@ filemap_set_next_iovec(const struct iove
 	const struct iovec *iov = *iovp;
 	size_t base = *basep;
 
-	while (bytes) {
+	do {
 		int copy = min(bytes, iov->iov_len - base);
 
 		bytes -= copy;
@@ -97,7 +97,7 @@ filemap_set_next_iovec(const struct iove
 			iov++;
 			base = 0;
 		}
-	}
+	} while (bytes);
 	*iovp = iov;
 	*basep = base;
 }
_
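
As an illustration of the mm/filemap.h hunk (a userspace sketch, not
kernel code): the two functions below mirror the loop in
filemap_set_next_iovec() before and after the change.  The names
advance_old(), advance_new() and skipped_by_zero_advance() are
hypothetical, and struct iovec comes from <sys/uio.h> rather than the
kernel headers.  The point is only that a `while (bytes)` loop does
nothing for bytes == 0, so a zero-length segment at the cursor is never
stepped over, while the do/while form runs the body once and moves on.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Pre-patch shape: a no-op when bytes == 0. */
static void advance_old(const struct iovec **iovp, size_t *basep, size_t bytes)
{
	const struct iovec *iov = *iovp;
	size_t base = *basep;

	while (bytes) {
		size_t left = iov->iov_len - base;
		size_t copy = bytes < left ? bytes : left;

		bytes -= copy;
		base += copy;
		if (iov->iov_len == base) {
			iov++;
			base = 0;
		}
	}
	*iovp = iov;
	*basep = base;
}

/* Post-patch shape: the body runs once even when bytes == 0. */
static void advance_new(const struct iovec **iovp, size_t *basep, size_t bytes)
{
	const struct iovec *iov = *iovp;
	size_t base = *basep;

	do {
		size_t left = iov->iov_len - base;
		size_t copy = bytes < left ? bytes : left;

		bytes -= copy;
		base += copy;
		if (iov->iov_len == base) {
			iov++;
			base = 0;
		}
	} while (bytes);
	*iovp = iov;
	*basep = base;
}

/* How many segments does a 0-byte advance step over, starting on an
 * empty segment?  patched selects the do/while variant. */
size_t skipped_by_zero_advance(int patched)
{
	char data[4] = "abc";
	struct iovec vec[2] = {
		{ .iov_base = NULL, .iov_len = 0 },	/* zero-length segment */
		{ .iov_base = data, .iov_len = sizeof(data) },
	};
	const struct iovec *cur = vec;
	size_t base = 0;

	if (patched)
		advance_new(&cur, &base, 0);
	else
		advance_old(&cur, &base, 0);
	assert(base == 0);
	return (size_t)(cur - vec);
}
```

skipped_by_zero_advance(0) yields 0: the cursor stays on the empty
segment, so a caller that loops until everything is written makes no
progress.  skipped_by_zero_advance(1) yields 1.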


