From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755532AbcITRLp (ORCPT ); Tue, 20 Sep 2016 13:11:45 -0400 Received: from mx5-phx2.redhat.com ([209.132.183.37]:49173 "EHLO mx5-phx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753964AbcITRLn (ORCPT ); Tue, 20 Sep 2016 13:11:43 -0400 Date: Tue, 20 Sep 2016 13:11:41 -0400 (EDT) From: Jan Stancek To: Al Viro Cc: linux-kernel@vger.kernel.org Message-ID: <570490469.234828.1474391501934.JavaMail.zimbra@redhat.com> In-Reply-To: <20160920150657.GN2356@ZenIV.linux.org.uk> References: <57E131E6.1090507@redhat.com> <20160920150657.GN2356@ZenIV.linux.org.uk> Subject: Re: [bug] pwritev02 hang on s390x with 4.8.0-rc7 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [10.34.26.57] X-Mailer: Zimbra 8.0.6_GA_5922 (ZimbraWebClient - FF48 (Win)/8.0.6_GA_5922) Thread-Topic: pwritev02 hang on s390x with 4.8.0-rc7 Thread-Index: v1PxNYDZNRSgH+OfVUrv8nf9l0OCqw== Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org ----- Original Message ----- > From: "Al Viro" > To: "Jan Stancek" > Cc: linux-kernel@vger.kernel.org > Sent: Tuesday, 20 September, 2016 5:06:57 PM > Subject: Re: [bug] pwritev02 hang on s390x with 4.8.0-rc7 > > On Tue, Sep 20, 2016 at 02:56:06PM +0200, Jan Stancek wrote: > > Hi, > > > > I'm hitting a regression with LTP's pwritev02 [1] on s390x with 4.8.0-rc7. > > Specifically the EFAULT case, which is passing an iovec with invalid base > > address: > > #define CHUNK 64 > > static struct iovec wr_iovec3[] = { > > {(char *)-1, CHUNK}, > > }; > > hangs with 100% cpu usage and not very helpful stack trace: > > # cat /proc/28544/stack > > [<0000000000001000>] 0x1000 > > [] 0xffffffffffffffff > > > > The problem starts with d4690f1e1cda "fix iov_iter_fault_in_readable()". > > > > Before this commit fault_in_pages_readable() called __get_user() on start > > address which presumably failed with -EFAULT immediately. > > > > With this commit applied fault_in_multipages_readable() appears to return 0 > > for the case when "start" is invalid but "end" overflows. Then according to > > my traces we keep spinning inside while loop in iomap_write_actor(). > > Cute. Let me see if I understand what's going on there: we have a wraparound > that would've been caught by most of access_ok(), but not on an architectures > where access_ok() is a no-op; in that case the loop is skipped and we > just check the last address, which passes and we get a false positive. > Bug is real and it's definitely -stable fodder. > > I'm not sure that the fix you propose is right, though. Note that ERR_PTR() > is not a valid address on any architecture, so any wraparound automatically > means -EFAULT and we can simply check unlikely(uaddr > end) and bugger off > if it holds. while() turns into do-while(), of course, and the same is > needed for the read side. > > Could you check if the following works for you? This fixes pwritev02 hang. I ran all syscalls tests from LTP, and I see a change in behaviour of couple other tests (writev01, writev03 and writev04 [1]) in 4.8.0-rc7. These call writev() with partially invalid iovecs, and now fail with EFAULT, while with previous -rc6 kernel they returned number of bytes written before they encountered invalid iovec record. This should be reproducible also on x86. Regards, Jan [1] https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/syscalls/writev > > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h > index 66a1260..7e3d537 100644 > --- a/include/linux/pagemap.h > +++ b/include/linux/pagemap.h > @@ -571,56 +571,56 @@ static inline int fault_in_pages_readable(const char > __user *uaddr, int size) > */ > static inline int fault_in_multipages_writeable(char __user *uaddr, int > size) > { > - int ret = 0; > char __user *end = uaddr + size - 1; > > if (unlikely(size == 0)) > - return ret; > + return 0; > > + if (unlikely(uaddr > end)) > + return -EFAULT; > /* > * Writing zeroes into userspace here is OK, because we know that if > * the zero gets there, we'll be overwriting it. > */ > - while (uaddr <= end) { > - ret = __put_user(0, uaddr); > - if (ret != 0) > - return ret; > + do { > + if (unlikely(__put_user(0, uaddr) != 0)) > + return -EFAULT; > uaddr += PAGE_SIZE; > - } > + } while (uaddr <= end); > > /* Check whether the range spilled into the next page. */ > if (((unsigned long)uaddr & PAGE_MASK) == > ((unsigned long)end & PAGE_MASK)) > - ret = __put_user(0, end); > + return __put_user(0, end); > > - return ret; > + return 0; > } > > static inline int fault_in_multipages_readable(const char __user *uaddr, > int size) > { > volatile char c; > - int ret = 0; > const char __user *end = uaddr + size - 1; > > if (unlikely(size == 0)) > - return ret; > + return 0; > > - while (uaddr <= end) { > - ret = __get_user(c, uaddr); > - if (ret != 0) > - return ret; > + if (unlikely(uaddr > end)) > + return -EFAULT; > + > + do { > + if (unlikely(__get_user(c, uaddr) != 0)) > + return -EFAULT; > uaddr += PAGE_SIZE; > - } > + } while (uaddr <= end); > > /* Check whether the range spilled into the next page. */ > if (((unsigned long)uaddr & PAGE_MASK) == > ((unsigned long)end & PAGE_MASK)) { > - ret = __get_user(c, end); > - (void)c; > + return __get_user(c, end); > } > > - return ret; > + return 0; > } > > int add_to_page_cache_locked(struct page *page, struct address_space > *mapping, >