From: Jens Axboe
Subject: Re: Revert "aio: block exit_aio() until all context requests are completed"
Date: Sat, 16 May 2015 09:16:12 -0600
Message-ID: <55575F3C.6020107@kernel.dk>
In-Reply-To: <55561038.5080602@de.ibm.com>
References: <1431675417-30464-1-git-send-email-borntraeger@de.ibm.com> <5555A33B.20006@de.ibm.com> <55561038.5080602@de.ibm.com>
To: Christian Borntraeger, Jeff Moyer
Cc: Gu Zheng, Benjamin LaHaise, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org, stable@vger.kernel.org

On 05/15/2015 09:26 AM, Christian Borntraeger wrote:
> On 15.05.2015 at 15:42, Jeff Moyer wrote:
>> Christian Borntraeger writes:
>>
>>> I see a significant latency (can be minutes with 2000 disks and HZ=100)
>>> when exiting a QEMU process that has lots of disk devices via aio. The
>>> process sits idle doing nothing as zombie in exit_aio waiting for the
>>> completion.
>>>
>>> Turns out that
>>> commit 6098b45b32 ("aio: block exit_aio() until all context requests are
>>> completed") caused the delay.
>>>
>>> Patch description was:
>>>
>>> It seems that exit_aio() also needs to wait for all iocbs to complete (like
>>> io_destroy), but we missed the wait step in current implemention, so fix
>>> it in the same way as we did in io_destroy.
>>>
>>> Now: io_destroy requires to block until everything is cleaned up from its
>>> interface description in the manpage:
>>> DESCRIPTION
>>> The io_destroy() system call will attempt to cancel all outstanding
>>> asynchronous I/O operations against ctx_id, will block on the completion
>>> of all operations that could not be canceled, and will destroy the ctx_id.
>>>
>>> Does process exit require the same full blocking? We might be able to
>>> cleanup the process and let the aio data structures be freed lazily.
>>> Opinions or better ideas?
>>
>> This has already been fixed:
>>
>> commit dc48e56d761610da4ea1088d1bea0a030b8e3e43
>> Author: Jens Axboe
>> Date: Wed Apr 15 11:17:23 2015 -0600
>>
>> aio: fix serial draining in exit_aio()
>>
>> Cheers,
>> Jeff
>>
> Cool thanks. As the original patch had cc stable, shouldn't the fix also be backported?

I'll email stable.

-- 
Jens Axboe