From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757443AbbFPXFb (ORCPT );
	Tue, 16 Jun 2015 19:05:31 -0400
Received: from mx1.redhat.com ([209.132.183.28]:60792 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751555AbbFPXFX (ORCPT );
	Tue, 16 Jun 2015 19:05:23 -0400
Date: Wed, 17 Jun 2015 01:04:14 +0200
From: Oleg Nesterov
To: Al Viro, Andrew Morton, Benjamin LaHaise, Jeff Moyer
Cc: linux-aio@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 0/3] aio: ctx->dead cleanups
Message-ID: <20150616230414.GA15776@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Al, please help.

We are trying to backport some aio fixes and I am absolutely confused by
your b2edffdd912b "fix mremap() vs. ioctx_kill() race".

Firstly, I simply can't understand what exactly it tries to fix. OK,
aio_free_ring() can race with kill and we can remap the soon-to-be-killed
ctx. So what? kill_ioctx() will use the correct (already re-mapped)
ctx->mmap_base after it drops mm->ioctx_lock.

So it seems to me we only need this change to ensure that move_vma() can
not succeed if ctx was already removed from ->ioctx_table, or, if we race
with ioctx_alloc(), was not yet added to ->ioctx_table. IOW, we need to
ensure that move_vma()->aio_ring_remap() can not race with
vm_munmap(ctx->mmap_base) in kill_ioctx() or ioctx_alloc().

And this race doesn't look really bad: the kernel can't crash, the
application can only fool itself.

But I guess I missed something, and I'd like to know what I have missed.
Could you explain?

Also. The change in move_vma() looks "obviously wrong". Don't we need
something like the patch at the end to ensure we do not "leak" new_vma,
or am I totally confused?
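To spell out the interleaving I have in mind (my reading of the window,
so please correct me if this is not the race the commit is about): if
aio_ring_remap() runs after kill_ioctx() has already removed ctx from
->ioctx_table, it finds nothing to update and ctx->mmap_base keeps the
old address:

	mremap() of the aio ring		io_destroy()

	move_vma()
	  move_page_tables()
						kill_ioctx()
						  takes mm->ioctx_lock
						  removes ctx from ->ioctx_table
						  drops mm->ioctx_lock
	  ->mremap() == aio_ring_remap()
	    ctx not found in ->ioctx_table,
	    ctx->mmap_base is not updated
						  vm_munmap(ctx->mmap_base)
						  unmaps the old range; the ring
						  lives on at the new address

which is why I say the application can only fool itself.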
But to me the main problem is atomic_read(&ctx->dead) in aio_ring_remap().
I mean, it complicates the backporting, and it looks unnecessary and
confusing. See the 1st patch.

Please review. I do not know how to test this.

Oleg.

--- x/mm/mremap.c
+++ x/mm/mremap.c
@@ -275,6 +275,8 @@ static unsigned long move_vma(struct vm_
 	moved_len = move_page_tables(vma, old_addr, new_vma, new_addr, old_len,
 				     need_rmap_locks);
 	if (moved_len < old_len) {
+		err = -ENOMEM;
+xxx:
 		/*
 		 * On error, move entries back from new area to old,
 		 * which will succeed since page tables still there,
@@ -285,14 +287,11 @@ static unsigned long move_vma(struct vm_
 		vma = new_vma;
 		old_len = new_len;
 		old_addr = new_addr;
-		new_addr = -ENOMEM;
+		new_addr = err;
 	} else if (vma->vm_file && vma->vm_file->f_op->mremap) {
 		err = vma->vm_file->f_op->mremap(vma->vm_file, new_vma);
-		if (err < 0) {
-			move_page_tables(new_vma, new_addr, vma, old_addr,
-					 moved_len, true);
-			return err;
-		}
+		if (err < 0)
+			goto xxx;
 	}

 	/* Conceal VM_ACCOUNT so old reservation is not undone */