From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58566) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1abJUq-0007tw-1L for qemu-devel@nongnu.org; Wed, 02 Mar 2016 22:00:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1abJUm-0002yu-RA for qemu-devel@nongnu.org; Wed, 02 Mar 2016 22:00:11 -0500 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:35134) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1abJUm-0002y7-Cj for qemu-devel@nongnu.org; Wed, 02 Mar 2016 22:00:08 -0500 Received: by mail-pf0-x243.google.com with SMTP id w128so536259pfb.2 for ; Wed, 02 Mar 2016 19:00:08 -0800 (PST) References: <56C02E4F.6030303@gmail.com> <87y4an8l93.fsf@linux.vnet.ibm.com> <56C41DE8.2030904@gmail.com> <87h9h7z9a0.fsf@linux.vnet.ibm.com> <56C6D8A0.2040502@gmail.com> <56CD5610.9010909@gmail.com> From: Jevon Qiao Message-ID: <56D7A8B1.3030202@gmail.com> Date: Thu, 3 Mar 2016 11:00:01 +0800 MIME-Version: 1.0 In-Reply-To: <56CD5610.9010909@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 2/2] hw/9pfs: fix alignment issue when host filesystem block size is larger than client msize List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Aneesh Kumar K.V" , qemu-devel@nongnu.org Cc: gkurz@linux.vnet.ibm.com, mst@redhat.com Any further question/comment on this patch? Thanks, Jevon On 24/2/16 15:04, Jevon Qiao wrote: > [Removing ceph-devel alias] > > Hi Aneesh, > > Any further comment on my reply below? > > Thanks, > Jevon > On 19/2/16 16:56, Jevon Qiao wrote: >> Hi Aneesh, >>> I am not sure I understand the details correctly. iounit is the size >>> that we use in client_read to determine the size in which >>> we should request I/O from the client. But we still can't do I/O in >>> size >>> larger than s->msize. If you look at the client side (kernel 9p fs), >>> you >>> will find >>> >>> rsize = fid->iounit; >>> if (!rsize || rsize > clnt->msize-P9_IOHDRSZ) >>> rsize = clnt->msize - P9_IOHDRSZ; >> Yes, I know this. >>> if your iounit calculation ends up zero, that should be handled >>> correctly by >>> >>> if (!iounit) { >>> iounit = s->msize - P9_IOHDRSZ; >>> } >>> return iounit; >>> >>> >>> So what is the issue here. ? >> This will result in an alignment issue while mapping the I/O >> requested by >> client into pages in the function of p9_nr_pages(). >> >> int p9_nr_pages(char *data, int len) >> { >> unsigned long start_page, end_page; >> start_page = (unsigned long)data >> PAGE_SHIFT; >> end_page = ((unsigned long)data + len + PAGE_SIZE - 1) >> >> PAGE_SHIFT; >> return end_page - start_page; >> } >> >> Please see the following experiment I did without the fix. >> >> 1) Start qemu with cephfs, >> >> $ qemu-system-x86_64 /root/CentOS---6.6-64bit---2015-03-06-a.qcow2 >> -smp 4 -m 4096 -fsdev >> cephfs,security_model=passthrough,id=fsdev0,path=/ -device >> virtio-9p-pci,id=fs0,fsdev=fsdev0,mount_tag=cephfs --enable-kvm >> -nographic -net nic -net tap,ifname=tap0,script=no,downscript=no >> >> >> 2) Mount the fs in the guest. >> >> [root@localhost ~]# mount -t 9p -o trans=virtio,version=9p2000.L >> cephfs /mnt >> [root@localhost ~]# ls -lah /mnt/8kfile >> -rw-r--r-- 1 root root 8.0K 2016-02-19 09:37 /mnt/8kfile >> >> In this case, I used the default msize which is 8192(in Byte). Since >> cephfs >> is using 4M as the f_bsize, the iounit will be 8168 as P9_IOHDRSZ is >> equal to 24. >> >> 3) Run the following systemtap script to trace the paging result, >> >> [root@localhost ~]# cat p9_read.stp >> probe kernel.function("p9_virtio_zc_request").call >> { >> printf("p9_virtio_zc_request: inlen size is %d\n", int_arg(5)); >> } >> >> probe kernel.function("p9_nr_pages").call >> { >> printf("p9_nr_pages: start_page = %ld\n", int_arg(1) >> 12); >> printf("p9_nr_pages: end_age = %ld\n", (int_arg(1) + 8168 + >> 4096 -1) >> 12); >> } >> >> 4) The output I got when I copied out the file /mnt/8kfile to /tmp/ >> directory, >> >> p9_virtio_zc_request: inlen size is 8168 >> p9_nr_pages: start_page = 34293757815 >> p9_nr_pages: end_age = 34293757818 >> >> Per the text in red(start_page = 34293757815, end_page = 34293757818), >> it turns out 8k data will be mapped into three pages. This could hurt >> the >> performance. >> >> Actually, I enabled the cephfs debug functionality added by me to see >> how the data is distributed in this case, the result is as follows, >> >> CEPHFS_DEBUG: cephfs_preadv iov_len=4096 >> CEPHFS_DEBUG: cephfs_preadv iov_len=4072 >> CEPHFS_DEBUG: cephfs_preadv iov_len=24 >> >> This patch aims to fix this. And the result turns out it works quite >> well, all the >> data is well aligned. >> >> p9_virtio_zc_request: inlen size is 4096 >> p9_nr_pages: start_page = 34203171814 >> p9_nr_pages: end_age = 34203171815 >> p9_virtio_zc_request: inlen size is 4096 >> p9_nr_pages: start_page = 34203171815 >> p9_nr_pages: end_age = 34203171816 >> >> CEPHFS_DEBUG: cephfs_preadv iov_len=4096 >> CEPHFS_DEBUG: cephfs_preadv iov_len=4096 >> >> Thanks, >> Jevon >>> -aneesh >>> >> >> >