Date: Fri, 13 Jan 2017 10:20:19 +0000
From: Al Viro
To: "Alan J. Wylie"
Cc: Linus Torvalds, Thorsten Leemhuis, linux-kernel
Subject: Re: 4.9.0 regression in pipe-backed iov_iter with systemd-nspawn
Message-ID: <20170113102019.GK1555@ZenIV.linux.org.uk>
References: <22647.59020.331664.632444@wylie.me.uk>
 <22648.1838.747474.51727@wylie.me.uk>
 <22648.32903.752857.203733@wylie.me.uk>
 <20170113093359.GJ1555@ZenIV.linux.org.uk>
 <22648.41914.351371.678606@wylie.me.uk>
In-Reply-To: <22648.41914.351371.678606@wylie.me.uk>

On Fri, Jan 13, 2017 at 09:54:02AM +0000, Alan J. Wylie wrote:
> root 1669 0.0 0.0 76144 5412 ? S 09:51 0:00 \_ /usr/sbin/postdrop -r
>
> Another hang.
>
> # dmesg | tail
> [ 22.352442] r8169 0000:03:00.0: loading /lib/firmware/4.9.3-dirty/rtl_nic/rtl8168e-3.fw failed with error -2
> [ 22.408814] r8169 0000:03:00.0: direct-loading rtl_nic/rtl8168e-3.fw
> [ 22.408821] fw_set_page_data: fw-rtl_nic/rtl8168e-3.fw buf=ffff92b7b1cb8c80 data=ffffad1641179000 size=3872
> [ 22.536043] r8169 0000:03:00.0 enp3s0: link down
> [ 22.536079] r8169 0000:03:00.0 enp3s0: link down
> [ 24.873801] r8169 0000:03:00.0 enp3s0: link up
> [ 24.874766] br0: port 1(enp3s0) entered blocking state
> [ 24.876622] br0: port 1(enp3s0) entered forwarding state
> [ 24.878560] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
> [ 219.683974] nr: 0->16, cur: 5->5, buffers: 16->16

OK, so it is iov_iter_advance() failing to free the shit allocated, either
due to some breakage in pipe_advance() or buggered 'copied'...  Let's see
which one; could you apply the following and run your reproducer?  The only
difference from the previous is that it collects and prints a bit more,
so it should be just as reproducible...

diff --git a/fs/splice.c b/fs/splice.c
index 873d83104e79..11477609e7f7 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -393,6 +393,10 @@ static ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 	size_t offset, dummy, copied = 0;
 	ssize_t res;
 	int i;
+	unsigned nrbufs = pipe->nrbufs,
+		 curbuf = pipe->curbuf,
+		 buffers = pipe->buffers;
+	int idx, count, offs;
 
 	if (pipe->nrbufs == pipe->buffers)
 		return -EAGAIN;
@@ -444,7 +448,22 @@ static ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 	for (i = 0; i < nr_pages; i++)
 		put_page(pages[i]);
 	kvfree(pages);
+	count = to.count;
+	idx = to.idx;
+	offs = to.iov_offset;
 	iov_iter_advance(&to, copied);	/* truncates and discards */
+	if (res == -EAGAIN && (
+		pipe->nrbufs != nrbufs ||
+		pipe->curbuf != curbuf ||
+		pipe->buffers != buffers)
+	   ) {
+		printk(KERN_ERR "nr: %d->%d, cur: %d->%d, buffers: %d->%d\n",
+			nrbufs, pipe->nrbufs,
+			curbuf, pipe->curbuf,
+			buffers, pipe->buffers);
+		printk(KERN_ERR "copied: %zd, count:%d, idx:%d, offs:%d\n",
+			copied, count, idx, offs);
+	}
 	return res;
 }
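
For anyone reading along outside the kernel tree: the debugging idea in the
patch above is to snapshot the pipe state on entry, then compare it after the
call that is supposed to undo everything and complain if it differs. Below is
a rough userspace sketch of that pattern only. struct toy_pipe,
toy_snapshot_take() and toy_report_if_changed() are made-up names for
illustration, not kernel interfaces, and nothing here models what
iov_iter_advance() or pipe_advance() actually do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the three pipe fields the patch watches. */
struct toy_pipe {
	unsigned nrbufs;	/* number of occupied slots */
	unsigned curbuf;	/* index of the first occupied slot */
	unsigned buffers;	/* total number of slots */
};

/* Copy of those fields, taken before the operation under suspicion. */
struct toy_snapshot {
	unsigned nrbufs, curbuf, buffers;
};

struct toy_snapshot toy_snapshot_take(const struct toy_pipe *p)
{
	struct toy_snapshot s;

	assert(p != NULL);
	s.nrbufs = p->nrbufs;
	s.curbuf = p->curbuf;
	s.buffers = p->buffers;
	return s;
}

/*
 * Compare the current state against the snapshot.  Returns 0 if nothing
 * changed; otherwise prints a before->after report in the same layout as
 * the patch's first printk and returns 1.
 */
int toy_report_if_changed(const struct toy_snapshot *before,
			  const struct toy_pipe *p)
{
	assert(before != NULL && p != NULL);
	if (p->nrbufs == before->nrbufs &&
	    p->curbuf == before->curbuf &&
	    p->buffers == before->buffers)
		return 0;
	fprintf(stderr, "nr: %u->%u, cur: %u->%u, buffers: %u->%u\n",
		before->nrbufs, p->nrbufs,
		before->curbuf, p->curbuf,
		before->buffers, p->buffers);
	return 1;
}
```

Taking the snapshot of a toy_pipe with nrbufs 0, curbuf 5, buffers 16 and
then setting nrbufs to 16 before calling toy_report_if_changed() gives a
report of the same shape as the last dmesg line quoted earlier in the thread.
The patch itself also records to.count, to.idx and to.iov_offset just before
the advance, since those are gone afterwards; the sketch leaves that part out.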