From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60359) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aYTVM-0005UT-6z for qemu-devel@nongnu.org; Wed, 24 Feb 2016 02:05:01 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aYTVH-0005f2-7Z for qemu-devel@nongnu.org; Wed, 24 Feb 2016 02:05:00 -0500 Received: from mail-pa0-x244.google.com ([2607:f8b0:400e:c03::244]:36099) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aYTVG-0005eh-R0 for qemu-devel@nongnu.org; Wed, 24 Feb 2016 02:04:55 -0500 Received: by mail-pa0-x244.google.com with SMTP id a7so472140pax.3 for ; Tue, 23 Feb 2016 23:04:54 -0800 (PST) References: <56C02E4F.6030303@gmail.com> <87y4an8l93.fsf@linux.vnet.ibm.com> <56C41DE8.2030904@gmail.com> <87h9h7z9a0.fsf@linux.vnet.ibm.com> <56C6D8A0.2040502@gmail.com> From: Jevon Qiao Message-ID: <56CD5610.9010909@gmail.com> Date: Wed, 24 Feb 2016 15:04:48 +0800 MIME-Version: 1.0 In-Reply-To: <56C6D8A0.2040502@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 2/2] hw/9pfs: fix alignment issue when host filesystem block size is larger than client msize List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Aneesh Kumar K.V" , qemu-devel@nongnu.org Cc: gkurz@linux.vnet.ibm.com, mst@redhat.com [Removing ceph-devel alias] Hi Aneesh, Any further comment on my reply below? Thanks, Jevon On 19/2/16 16:56, Jevon Qiao wrote: > Hi Aneesh, >> I am not sure I understand the details correctly. iounit is the size >> that we use in client_read to determine the size in which >> we should request I/O from the client. But we still can't do I/O in size >> larger than s->msize. If you look at the client side (kernel 9p fs), you >> will find >> >> rsize = fid->iounit; >> if (!rsize || rsize > clnt->msize-P9_IOHDRSZ) >> rsize = clnt->msize - P9_IOHDRSZ; > Yes, I know this. >> if your iounit calculation ends up zero, that should be handled >> correctly by >> >> if (!iounit) { >> iounit = s->msize - P9_IOHDRSZ; >> } >> return iounit; >> >> >> So what is the issue here. ? > This will result in an alignment issue while mapping the I/O requested by > client into pages in the function of p9_nr_pages(). > > int p9_nr_pages(char *data, int len) > { > unsigned long start_page, end_page; > start_page = (unsigned long)data >> PAGE_SHIFT; > end_page = ((unsigned long)data + len + PAGE_SIZE - 1) >> > PAGE_SHIFT; > return end_page - start_page; > } > > Please see the following experiment I did without the fix. > > 1) Start qemu with cephfs, > > $ qemu-system-x86_64 /root/CentOS---6.6-64bit---2015-03-06-a.qcow2 > -smp 4 -m 4096 -fsdev > cephfs,security_model=passthrough,id=fsdev0,path=/ -device > virtio-9p-pci,id=fs0,fsdev=fsdev0,mount_tag=cephfs --enable-kvm > -nographic -net nic -net tap,ifname=tap0,script=no,downscript=no > > > 2) Mount the fs in the guest. > > [root@localhost ~]# mount -t 9p -o trans=virtio,version=9p2000.L > cephfs /mnt > [root@localhost ~]# ls -lah /mnt/8kfile > -rw-r--r-- 1 root root 8.0K 2016-02-19 09:37 /mnt/8kfile > > In this case, I used the default msize which is 8192(in Byte). Since > cephfs > is using 4M as the f_bsize, the iounit will be 8168 as P9_IOHDRSZ is > equal to 24. > > 3) Run the following systemtap script to trace the paging result, > > [root@localhost ~]# cat p9_read.stp > probe kernel.function("p9_virtio_zc_request").call > { > printf("p9_virtio_zc_request: inlen size is %d\n", int_arg(5)); > } > > probe kernel.function("p9_nr_pages").call > { > printf("p9_nr_pages: start_page = %ld\n", int_arg(1) >> 12); > printf("p9_nr_pages: end_age = %ld\n", (int_arg(1) + 8168 + > 4096 -1) >> 12); > } > > 4) The output I got when I copied out the file /mnt/8kfile to /tmp/ > directory, > > p9_virtio_zc_request: inlen size is 8168 > p9_nr_pages: start_page = 34293757815 > p9_nr_pages: end_age = 34293757818 > > Per the text in red(start_page = 34293757815, end_page = 34293757818), > it turns out 8k data will be mapped into three pages. This could hurt the > performance. > > Actually, I enabled the cephfs debug functionality added by me to see > how the data is distributed in this case, the result is as follows, > > CEPHFS_DEBUG: cephfs_preadv iov_len=4096 > CEPHFS_DEBUG: cephfs_preadv iov_len=4072 > CEPHFS_DEBUG: cephfs_preadv iov_len=24 > > This patch aims to fix this. And the result turns out it works quite > well, all the > data is well aligned. > > p9_virtio_zc_request: inlen size is 4096 > p9_nr_pages: start_page = 34203171814 > p9_nr_pages: end_age = 34203171815 > p9_virtio_zc_request: inlen size is 4096 > p9_nr_pages: start_page = 34203171815 > p9_nr_pages: end_age = 34203171816 > > CEPHFS_DEBUG: cephfs_preadv iov_len=4096 > CEPHFS_DEBUG: cephfs_preadv iov_len=4096 > > Thanks, > Jevon >> -aneesh >> > >