From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751470AbdAMV7X (ORCPT );
	Fri, 13 Jan 2017 16:59:23 -0500
Received: from zeniv.linux.org.uk ([195.92.253.2]:58066 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751267AbdAMV7W (ORCPT );
	Fri, 13 Jan 2017 16:59:22 -0500
Date: Fri, 13 Jan 2017 21:59:19 +0000
From: Al Viro
To: Linus Torvalds
Cc: "Alan J. Wylie", Thorsten Leemhuis, linux-kernel
Subject: Re: 4.9.0 regression in pipe-backed iov_iter with systemd-nspawn
Message-ID: <20170113215919.GU1555@ZenIV.linux.org.uk>
References: <20170113093359.GJ1555@ZenIV.linux.org.uk>
	<22648.41914.351371.678606@wylie.me.uk>
	<20170113102019.GK1555@ZenIV.linux.org.uk>
	<20170113111842.GL1555@ZenIV.linux.org.uk>
	<20170113200826.GP1555@ZenIV.linux.org.uk>
	<20170113201121.GQ1555@ZenIV.linux.org.uk>
	<20170113204731.GS1555@ZenIV.linux.org.uk>
	<20170113215504.GT1555@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170113215504.GT1555@ZenIV.linux.org.uk>
User-Agent: Mutt/1.7.1 (2016-10-04)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 13, 2017 at 09:55:04PM +0000, Al Viro wrote:
> On Fri, Jan 13, 2017 at 08:47:31PM +0000, Al Viro wrote:
> > On Fri, Jan 13, 2017 at 12:32:37PM -0800, Linus Torvalds wrote:
> >
> > > Ugh. I still think your patch is butt-ugly, and the index comparisons
> > > make me nervous, but..
> >
> > No arguments here - 6am on 20-odd hours of uptime is _not_ a good time
> > for writing, especially since the data structure needs better documentation
> > and probably a couple of inlined helpers.  I'll try to massage the damn
> > thing into more readable form.
>
> FWIW, I think it will be more readable if we separate the "advance" and
> "truncate" parts like this (warning: not even build-tested).  Comments?

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 25f572303801..dae1ac940d5f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -730,43 +730,60 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
 }
 EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
 
+static inline void pipe_truncate(struct iov_iter *i)
+{
+	struct pipe_inode_info *pipe = i->pipe;
+	if (pipe->nrbufs) {
+		size_t off = i->iov_offset;
+		int idx = i->idx;
+		int n;
+
+		n = (pipe->curbuf + pipe->nrbufs - idx) & (pipe->buffers - 1);
+		if (off) {
+			pipe->bufs[idx].len = off - pipe->bufs[idx].offset;
+			/* free all after idx; n can't be 0 */
+			idx = next_idx(idx, pipe);
+			n--;
+		} else {
+			/* free all _starting_ at idx.
+			 * n is 0 when we have nothing to do
+			 * *or* when we are truncating full pipe to empty.
+			 */
+			if (pipe->nrbufs == pipe->buffers && !n)
+				n = pipe->buffers;
+		}
+		while (n--) {
+			pipe_buf_release(pipe, &pipe->bufs[idx]);
+			idx = next_idx(idx, pipe);
+			pipe->nrbufs--;
+		}
+	}
+}
+
 static void pipe_advance(struct iov_iter *i, size_t size)
 {
 	struct pipe_inode_info *pipe = i->pipe;
-	struct pipe_buffer *buf;
-	int idx = i->idx;
-	size_t off = i->iov_offset, orig_sz;
-
 	if (unlikely(i->count < size))
 		size = i->count;
-	orig_sz = size;
-
 	if (size) {
+		struct pipe_buffer *buf;
+		size_t off = i->iov_offset, left = size;
+		int idx = i->idx;
 		if (off) /* make it relative to the beginning of buffer */
-			size += off - pipe->bufs[idx].offset;
+			left += off - pipe->bufs[idx].offset;
 		while (1) {
 			buf = &pipe->bufs[idx];
-			if (size <= buf->len)
+			if (left <= buf->len)
 				break;
-			size -= buf->len;
+			left -= buf->len;
 			idx = next_idx(idx, pipe);
 		}
-		buf->len = size;
 		i->idx = idx;
-		off = i->iov_offset = buf->offset + size;
-	}
-	if (off)
-		idx = next_idx(idx, pipe);
-	if (pipe->nrbufs) {
-		int unused = (pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1);
-		/* [curbuf,unused) is in use.  Free [idx,unused) */
-		while (idx != unused) {
-			pipe_buf_release(pipe, &pipe->bufs[idx]);
-			idx = next_idx(idx, pipe);
-			pipe->nrbufs--;
-		}
+		i->iov_offset = buf->offset + left;
 	}
-	i->count -= orig_sz;
+	i->count -= size;
+	/* ... and discard everything past that point */
+	pipe_truncate(i);
 }
 
 void iov_iter_advance(struct iov_iter *i, size_t size)
@@ -826,6 +843,7 @@ void iov_iter_pipe(struct iov_iter *i, int direction,
 			size_t count)
 {
 	BUG_ON(direction != ITER_PIPE);
+	WARN_ON(pipe->nrbufs == pipe->buffers);
 	i->type = direction;
 	i->pipe = pipe;
 	i->idx = (pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1);
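
For anyone who wants to poke at the index arithmetic outside the kernel, here is
a rough userspace sketch of the count that pipe_truncate() computes in the
"off == 0" case. It is not part of the patch: struct ring, ring_next() and
ring_used_from() are hypothetical stand-ins for the relevant fields of
struct pipe_inode_info and for next_idx(), and it assumes, as the masking in the
patch does, that the number of slots is a power of two.

```c
#include <assert.h>

/* Hypothetical stand-in for the ring fields of struct pipe_inode_info. */
struct ring {
	unsigned int curbuf;	/* index of the first in-use slot */
	unsigned int nrbufs;	/* number of in-use slots */
	unsigned int buffers;	/* total slots; assumed to be a power of two */
};

/* Wrap-around step, modelled on next_idx() in lib/iov_iter.c. */
static unsigned int ring_next(unsigned int idx, const struct ring *r)
{
	return (idx + 1) & (r->buffers - 1);
}

/*
 * Count the in-use slots from idx up to the end of the used region.
 * idx must be either an in-use slot or the first unused one.
 *
 * The masked difference alone cannot tell "idx is the first unused
 * slot" (nothing to free) from "ring is full and idx == curbuf"
 * (everything to free): both give 0.  Like the patch, resolve that
 * by treating 0 on a full ring as "all of them".
 */
static unsigned int ring_used_from(const struct ring *r, unsigned int idx)
{
	unsigned int n;

	/* the masking below only works for power-of-two sizes */
	assert(r->buffers && (r->buffers & (r->buffers - 1)) == 0);

	n = (r->curbuf + r->nrbufs - idx) & (r->buffers - 1);
	if (n == 0 && r->nrbufs == r->buffers)
		n = r->buffers;
	return n;
}
```

With 16 slots, curbuf = 14 and nrbufs = 4, the used region is slots 14, 15, 0, 1;
ring_used_from() gives 2 for idx = 0 and 0 for idx = 2 (the first unused slot).
The full-ring special case is the same ambiguity the new WARN_ON() in
iov_iter_pipe() is there to flag.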