* [LST/MM TOPIC] really non-blocking in aio stack
From: Zheng Liu
Date: 2012-02-13 15:35 UTC
To: lsf-pc
Cc: linux-fsdevel

Hi all,

Native AIO is currently used by many critical applications, such as InnoDB in
MySQL and the Nginx web server.  But it is not as asynchronous as users
expect: __getblk() can block on metadata allocation, and get_request() can
sleep when the request queue is congested, so users are hit by unexpected
delays.  We want to improve this so that submission is at least genuinely
non-blocking.

Although EIOCBRETRY is already defined, it seems to me that it is not used as
well as it could be.  We could return EIOCBRETRY whenever the underlying work
would block, and the generic AIO code could be tuned to either complete the
work asynchronously or report the status back to the user so that the user
knows the request has not finished yet.

Maybe we can take this chance to discuss it, since the current behaviour
really hurts some very important applications.  I am sorry that I forgot to
send this topic to the mailing list before the deadline; I hope it isn't too
late.

Regards,
Zheng
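For context, the sketch below shows roughly how such applications drive native
AIO from userspace via libaio (the file name, alignment and sizes are
arbitrary); the point is that io_submit(), the call that is supposed to return
immediately, is exactly where the unexpected blocking happens.

/* Minimal native AIO write via libaio; build with: gcc demo.c -laio
 * The file name, buffer alignment and sizes here are arbitrary. */
#define _GNU_SOURCE             /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    io_context_t ctx;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;
    int fd, ret;

    memset(&ctx, 0, sizeof(ctx));
    ret = io_setup(1, &ctx);
    if (ret < 0) {
        fprintf(stderr, "io_setup: %s\n", strerror(-ret));
        return 1;
    }

    fd = open("testfile", O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    if (posix_memalign(&buf, 4096, 4096))
        return 1;
    memset(buf, 'a', 4096);

    io_prep_pwrite(&cb, fd, buf, 4096, 0);

    /*
     * Supposed to queue the I/O and return immediately, but this is
     * exactly the call that can block inside the kernel, e.g. in
     * __getblk() on metadata allocation or in get_request() on a
     * congested request queue.
     */
    ret = io_submit(ctx, 1, cbs);
    if (ret != 1) {
        fprintf(stderr, "io_submit: %s\n", strerror(-ret));
        return 1;
    }

    /* Reap the completion; blocking here is expected and fine. */
    ret = io_getevents(ctx, 1, 1, &ev, NULL);
    if (ret != 1) {
        fprintf(stderr, "io_getevents: %s\n", strerror(-ret));
        return 1;
    }

    io_destroy(ctx);
    return 0;
}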
* Re: [LST/MM TOPIC] really non-blocking in aio stack
From: Zach Brown
Date: 2012-02-13 16:34 UTC
To: linux-fsdevel, gnehzuil.liu
Cc: Jeff Moyer

(dropping lsf-pc from the follow-on discussion for fsdevel)

> Although EIOCBRETRY is already defined, it seems to me that it is not
> used as well as it could be.

EIOCBRETRY is a disaster because the operations are retried in the context
of the kaio threads.  To use it safely you have to ensure that nothing the
operation will do after returning -EIOCBRETRY will reference current->.

Realize that this can include convoluted paths through shared code that
might have *no idea* that they're used by some other path after EIOCBRETRY
and so have to be supernaturally careful with current-> references.  It's a
maintenance nightmare.

The fs/aio.c retry code has the aio thread magically assume the mm context
of the submitting thread when it calls the retry handlers
(aio_kick_handler()).  So, great, that's one current field that happens to
be sharable.  How about the others?  current->journal_info?
current->io_context?  People sometimes ask about EIOCBRETRY and vfs ops and
never mention current->link_count.

As one of the people who has sunk serious time into fs/aio.c (cc:ing my
erstwhile partner in crime), I strongly discourage investing more resources
into the fs/aio.c design.  If it were me, I'd be putting resources into
async infrastructure that reuses the existing synchronous system call
paths.

Async calls should have no idea that they're async: no duplication of the
syscall ABI in submission argument structs, no magical fget before calling
operation handlers, no iocbs being sprinkled down through kernel call
stacks, no magical return codes.

Yeah, this ends up implying heavy use of kernel threads and playing scary
games with the task_structs of the submitter and the async processing
thread.  At least the scary code would be in one place.

The current alternative, requiring fragile async implementations of
individual system calls, has a compelling history of failure: fs/aio.c has
been around for a decade and has not seen significant use outside of its
initially supported operations.

I should really get the ogg of my LCA presentation (more of a jet-lagged
rant :)) on this posted somewhere.

- z
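As a purely userspace analogue of that idea (a sketch, not the kernel
mechanism being proposed): the worker thread below runs an ordinary
synchronous pread() that neither knows nor cares that it is being driven
asynchronously, and all of the asynchrony lives in the submission and
completion plumbing around it.  The file name and sizes are arbitrary.

/* Userspace analogue of "async calls should have no idea they're async":
 * the worker just runs the ordinary synchronous syscall.
 * Build with: gcc analogue.c -lpthread.  Illustrative only. */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct async_read {
    int fd;
    off_t off;
    size_t len;
    char *buf;
    ssize_t ret;                /* filled in by the worker */
};

static void *worker(void *arg)
{
    struct async_read *req = arg;

    /* Plain synchronous pread(); it neither knows nor cares that it is
     * being run on behalf of another thread. */
    req->ret = pread(req->fd, req->buf, req->len, req->off);
    return NULL;
}

int main(void)
{
    struct async_read req = { .off = 0, .len = 4096 };
    pthread_t tid;

    req.buf = malloc(req.len);
    req.fd = open("testfile", O_RDONLY);
    if (!req.buf || req.fd < 0) {
        perror("setup");
        return 1;
    }

    /* "Submit": hand the request to a worker thread. */
    pthread_create(&tid, NULL, worker, &req);

    /* ... the submitter is free to do other work here ... */

    /* "Reap": wait for the completion and look at the result. */
    pthread_join(tid, NULL);
    printf("pread returned %zd\n", req.ret);

    free(req.buf);
    close(req.fd);
    return 0;
}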
* Re: [LST/MM TOPIC] really non-blocking in aio stack
From: Zheng Liu
Date: 2012-02-15 6:13 UTC
To: Zach Brown
Cc: linux-fsdevel, Jeff Moyer

On Mon, Feb 13, 2012 at 11:34:33AM -0500, Zach Brown wrote:
> EIOCBRETRY is a disaster because the operations are retried in the context
> of the kaio threads.  To use it safely you have to ensure that nothing the
> operation will do after returning -EIOCBRETRY will reference current->.
>
> [...]
>
> The current alternative, requiring fragile async implementations of
> individual system calls, has a compelling history of failure: fs/aio.c has
> been around for a decade and has not seen significant use outside of its
> initially supported operations.

Hi Zach,

As I am a newcomer to this problem, any suggestions are welcome; that is
also why I raised it as a topic for this year's LSF summit.

We already provide these semantics, and some very important applications
try to use them, yet the blocking has annoyed them for a long time.  For
example, InnoDB in MySQL uses io_submit() in a background thread to improve
write performance, and with it MySQL's performance improves by 10%.  Nginx
uses io_submit() to read and write files.  So, IMHO, we need to think about
how to improve the kernel side rather than asking applications to
restructure their code, especially since MySQL and Nginx are so widely
deployed.

My employer has allocated some resources for me to work on this, so let us
discuss a roadmap; I volunteer to do the work.

My very first attempt is trivial: just let the user decide.  If the
submission would block, return that fact to the caller and let the caller
decide whether to endure the delay or hand the request to another thread.

> I should really get the ogg of my LCA presentation (more of a jet-lagged
> rant :)) on this posted somewhere.

Never mind, Google helped me find it:
http://mirror.linux.org.au/pub/linux.conf.au/2009/Thursday/131.ogg

Thanks for the advice.

Regards,
Zheng
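To make that first attempt concrete, here is a rough caller-side fragment
written under one loud assumption: that io_submit() (or some flagged variant
of it) could fail a would-block submission with an EAGAIN-style error instead
of sleeping.  No current kernel behaves this way; the error code, the
submit_or_punt() helper and the fallback structure are all illustrative.  The
fallback simply performs the same write synchronously in a helper thread.

/* Caller-side sketch of the proposed "let the user decide" semantics.
 * ASSUMPTION: a future io_submit() variant fails with EAGAIN when the
 * submission would block; current kernels do not do this.  This is a
 * fragment, meant to be compiled into a larger program. */
#include <libaio.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

struct fallback_write {
    int fd;
    const void *buf;
    size_t len;
    long long off;
};

static void *sync_write_worker(void *arg)
{
    struct fallback_write *w = arg;

    /* Endure the blocking in a helper thread instead of the
     * submission path. */
    if (pwrite(w->fd, w->buf, w->len, w->off) < 0)
        perror("pwrite");
    return NULL;
}

/* Returns 0 if the request was queued asynchronously, 1 if it was
 * punted to a helper thread, -1 on any other error. */
int submit_or_punt(io_context_t ctx, struct iocb *cb,
                   struct fallback_write *w, pthread_t *tid)
{
    int ret = io_submit(ctx, 1, &cb);

    if (ret == 1)
        return 0;               /* queued; reap via io_getevents() */

    if (ret == -EAGAIN) {       /* hypothetical "would block" reply */
        pthread_create(tid, NULL, sync_write_worker, w);
        return 1;
    }

    fprintf(stderr, "io_submit: %s\n", strerror(-ret));
    return -1;
}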