From mboxrd@z Thu Jan 1 00:00:00 1970
From: Coly Li
Subject: Re: [PATCH][RFC] A readahead complete notify approach to implement buffer aio
Date: Fri, 04 Nov 2011 02:09:19 +0800
Message-ID: <4EB2D8CF.2080108@coly.li>
References: <1320138024-10837-1-git-send-email-gaoyang.zyh@taobao.com>
Reply-To: i@coly.li
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: linux-kernel@vger.kernel.org, bcrl@kvack.org, viro@zeniv.linux.org.uk, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org, jaxboe@fusionio.com, Zhu Yanhai
To: Zhu Yanhai
Return-path:
In-Reply-To: <1320138024-10837-1-git-send-email-gaoyang.zyh@taobao.com>
Sender: owner-linux-aio@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

On 2011-11-01 17:00, Zhu Yanhai wrote:
> The current libaio/aio has to use Direct-IO, otherwise it falls back to sync IO.
> However, the aio core is already asynchronous by nature. This patch adds a completion
> notify mechanism to implement buffered aio. The main idea is to do a readahead()-like
> pass in io_submit(), count the non-uptodate pages associated with each iocb, then drop
> one ref per page in the bio completion path just before unlock_page(), and finally hook
> the iocb onto the aio ring buffer when the ref count drops to zero. In io_getevents(),
> we call vfs_read() as a safety net, since there is still a small possibility that the
> pages that were brought in have been reclaimed between io_submit() and io_getevents().
>
> I have tested this patch for a while. For small random io requests, its
> performance is more or less the same as the traditional aio; for big io requests,
> the overhead of one extra memory copy shows up.
>
> I think so far it has at least the following obvious drawbacks:
>
> * mpage_readpage() is a really narrow interface. I have no way to pass down
> the new control struct baiocb, so I just put it into struct task_struct and
> refer to it via current() as a workaround.
>
> * The do_baio_read() routine is very similar to do_generic_file_read(), but
> the latter is really hard to modify. I think we may push this code down into the
> readahead path to reduce code duplication.
>
> Hopefully the explanations are clear enough and don't muddy the water any worse.
> I figure the code does need some better comments, and any suggestions are welcome.
>
> Signed-off-by: Zhu Yanhai
>
> ---
>  fs/aio.c                    |  319 ++++++++++++++++++++++++++++++++++++++++++-
>  fs/buffer.c                 |   26 ++++-
>  fs/mpage.c                  |   28 ++++-
>  include/linux/aio.h         |    9 ++
>  include/linux/aio_abi.h     |    1 +
>  include/linux/blk_types.h   |    2 +
>  include/linux/buffer_head.h |    3 +
>  include/linux/page-flags.h  |    2 +
>  include/linux/sched.h       |    1 +
>  9 files changed, 386 insertions(+), 5 deletions(-)
>

Hmm, I don't see how this is used from user space. Could you post some demo code in
user space, so people can understand how to use and test your patch?

BTW, if there are any performance numbers, they would be interesting, too.

Thanks.

-- 
Coly Li
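
For readers following the thread, the completion accounting described in the quoted text (count the non-uptodate pages at submit time, drop one ref per page as its read completes, queue the event when the count reaches zero) can be modeled in plain user-space C roughly as below. This is only a toy sketch, not the patch's code: struct toy_baiocb, struct toy_ring and the toy_* functions are hypothetical names, and completing immediately when every page is already cached is an assumption of this sketch, not something the description states.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

/* Hypothetical stand-in for the patch's per-request control struct. */
struct toy_baiocb {
    int id;             /* request identifier */
    int pending_pages;  /* pages whose read I/O has not completed yet */
};

/* Hypothetical stand-in for the aio completion ring. */
struct toy_ring {
    int ids[TOY_RING_SIZE];
    size_t count;
};

/* Append a finished request to the ring. */
static void toy_ring_push(struct toy_ring *ring, int id)
{
    assert(ring->count < TOY_RING_SIZE);
    ring->ids[ring->count++] = id;
}

/*
 * Submit path: every page that is not up to date needs read I/O and
 * holds one reference on the request.
 * Assumption: a fully cached request completes right away.
 */
static void toy_submit(struct toy_baiocb *req, const int *uptodate,
                       size_t npages, struct toy_ring *ring)
{
    req->pending_pages = 0;
    for (size_t i = 0; i < npages; i++) {
        if (!uptodate[i])
            req->pending_pages++;
    }
    if (req->pending_pages == 0)
        toy_ring_push(ring, req->id);
}

/*
 * Per-page read completion: drop one reference, and queue the request
 * on the ring once the last page has arrived.
 */
static void toy_page_done(struct toy_baiocb *req, struct toy_ring *ring)
{
    assert(req->pending_pages > 0);
    if (--req->pending_pages == 0)
        toy_ring_push(ring, req->id);
}
```

In this model, a request covering four pages of which two are already cached stays off the ring after toy_submit() and after the first toy_page_done(), and appears on it only after the second. The vfs_read() fallback in io_getevents() is not modeled here.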