From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755256AbaI3EgD (ORCPT );
	Tue, 30 Sep 2014 00:36:03 -0400
Received: from zeniv.linux.org.uk ([195.92.253.2]:54452 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750841AbaI3EgB (ORCPT );
	Tue, 30 Sep 2014 00:36:01 -0400
Date: Tue, 30 Sep 2014 05:35:56 +0100
From: Al Viro
To: Linus Torvalds
Cc: Dave Jones , Linux Kernel
Subject: Re: pipe/page fault oddness.
Message-ID: <20140930043556.GS7996@ZenIV.linux.org.uk>
References: <20140930033327.GA14558@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 29, 2014 at 09:27:09PM -0700, Linus Torvalds wrote:
> On Mon, Sep 29, 2014 at 8:33 PM, Dave Jones wrote:
> >
> > Looking at the dump, there's only one running trinity child,
> > with all the others blocking on it.
> >
> > trinity-c49 R running task 12856 19464 7633 0x00000004
> > ffff8800a09bf960 0000000000000002 ffff8800a09bf9f8 ffff880219650000
> > 00000000001d4080 0000000000000000 ffff8800a09bffd8 00000000001d4080
> > ffff88023f755bc0 ffff880219650000 ffff8800a09bffd8 ffff88010b017e00
> > Call Trace:
> > [] handle_mm_fault+0x3a7/0xcd0
> > [] __do_page_fault+0x1a4/0x600
> > [] do_page_fault+0x1e/0x70
> > [] page_fault+0x22/0x30
> > [] ? copy_page_to_iter+0x3b3/0x500
> > [] pipe_read+0xdf/0x330
> >
> > Running the function tracer on that pid shows it spinning forever..
> > http://codemonkey.org.uk/junk/pipe-trace.txt
> >
> > Kernel bug (missing EFAULT check somewhere perhaps?), or is this a
> > case where the fuzzer asked the kernel to do something stupid, and it obliged ?
>
> Hmm. It looks like copy_page_to_iter_iovec() is broken and keeps not
> making any progress while just faulting.
>
> I don't see how that could happen, though. All the loops there are
> conditional on the user copies *not* failing (ie "!left"), and they
> seem to properly update "iov".
>
> Mind sending a disassembly of your "copy_page_to_iter" function, in
> particular around that whole "0x3b3/0x500" area which is where the
> page fault seems to happen?
>
> Adding Al to the cc, since this code is from his commit 6e58e79db8a1
> ("introduce copy_page_to_iter, kill loop over iovec in
> generic_file_aio_read()") but I don't see anything obviously wrong
> there.
>
> Al? Do you see something I don't? Dave's function trace does seem to
> say that it doesn't even get back to pipe_read(), though, so the loop
> really must be inside copy_page_to_iter().

I'll take a look tomorrow morning after I get some sleep - 19 hours of uptime,
on top of 5 hours of sleep, on top of ~20 hours of uptime ;-/