From: "Venkateswararao Jujjuri (JV)" <jvrao@linux.vnet.ibm.com>
To: Eric Van Hensbergen <ericvh@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, v9fs-developer@lists.sourceforge.net
Subject: Re: [V9fs-developer] :[RFC] [PATCH 6/7] [net/9p] Read and Write side zerocopy changes for 9P2000.L protocol.
Date: Wed, 09 Feb 2011 13:12:27 -0800
Message-ID: <4D53033B.4080300@linux.vnet.ibm.com>
In-Reply-To: <4D5302A3.7000000@linux.vnet.ibm.com>
On 2/9/2011 1:09 PM, Venkateswararao Jujjuri (JV) wrote:
> WRITE
dd if=/dev/zero of=/pmnt/file1 bs=4096 count=1MB (variable bs = IO SIZE)
>
> IO SIZE (bytes)  TOTAL SIZE      No ZC          ZC
>        1             1MB       22.4 kB/s     19.8 kB/s
>       32            32MB        711 kB/s      633 kB/s
>       64            64MB       1.4 MB/s      1.3 MB/s
>      128           128MB       2.8 MB/s      2.6 MB/s
>      256           256MB       5.6 MB/s      5.1 MB/s
>      512           512MB      10.4 MB/s     10.2 MB/s
>     1024             1GB      19.7 MB/s     20.4 MB/s
>     2048             2GB      40.1 MB/s     43.7 MB/s
>     4096             4GB      71.4 MB/s     73.1 MB/s
>
>
>
> READ
dd of=/dev/null if=/pmnt/file1 bs=4096 count=1MB (variable bs = IO SIZE)
> IO SIZE (bytes)  TOTAL SIZE      No ZC          ZC
>        1             1MB       26.6 kB/s     23.1 kB/s
>       32            32MB        783 kB/s      734 kB/s
>       64            64MB       1.7 MB/s      1.5 MB/s
>      128           128MB       3.4 MB/s      3.0 MB/s
>      256           256MB       4.2 MB/s      5.9 MB/s
>      512           512MB       6.9 MB/s     11.6 MB/s
>     1024             1GB      23.3 MB/s     23.4 MB/s
>     2048             2GB      42.5 MB/s     45.4 MB/s
>     4096             4GB      67.4 MB/s     73.9 MB/s
>
> As you can see, the difference is marginal, but ZC improves as the IO size
> increases.
> In the past we have seen tremendous improvements with different msizes,
> mostly because zero copy makes it possible to ship bigger chunks of data
> per request.
> It could also be my setup/box; even on the host I am getting the same or
> similar numbers.
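
For context, the chunk shipped per RPC is capped by msize. A sketch of the
clamp as it appears in p9_client_read()/p9_client_write() in the client code
of this era (shown for illustration, not part of this patch):

	/* Each request carries at most msize - P9_IOHDRSZ bytes of
	 * payload, so a larger msize means fewer round trips for the
	 * same total I/O; zero copy is what makes shipping such large
	 * chunks practical, since nothing is staged through a bounce
	 * buffer. */
	rsize = fid->iounit;
	if (!rsize || rsize > clnt->msize - P9_IOHDRSZ)
		rsize = clnt->msize - P9_IOHDRSZ;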
>
> - JV
>
>
>
>
> On 2/8/2011 1:16 PM, Eric Van Hensbergen wrote:
>> Oh, and for reference: while it was a different environment, my request is
>> based on a little bit more than idle fancy. Check out the graph on
>> page 6 in: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.8182&rep=rep1&type=pdf
>>
>> -eric
>>
>>
>> On Tue, Feb 8, 2011 at 3:09 PM, Eric Van Hensbergen <ericvh@gmail.com> wrote:
>>> One thing I wonder is whether we always want to zero-copy the payload.
>>> In the extreme, do we want to take the overhead of pinning an extra page
>>> if we are only reading or writing a single byte? memcpy is expensive for
>>> large packets, but may actually be more efficient for small ones.
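
The pinning cost per request is easy to bound. A small illustrative helper
(not from any of the posted patches; assumes len > 0):

	/* Pages spanned by a user buffer: even a 1-byte I/O pins one
	 * page, and an unaligned buffer can straddle two or more. */
	static int p9_nr_pages(unsigned long addr, size_t len)
	{
		unsigned long first = addr >> PAGE_SHIFT;
		unsigned long last = (addr + len - 1) >> PAGE_SHIFT;

		return last - first + 1;
	}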
>>>
>>> Have we done any performance measurements of this code with various
>>> payload sizes versus non-zero-copy? Of course that may not really
>>> show the impact of pinning the extra pages....
>>>
>>> In any case, if such a tradeoff did exist, we might choose not to do
>>> zero copy for requests smaller than some size -- and that might
>>> alleviate some of the problems with the legacy protocols (such as the
>>> Rerror issue) -- killing two birds with one stone. Either way, given
>>> the implementations it should be really easy to shut me up with
>>> comparison data of zc and non-zc for 1, 64, 128, 256, 512, 1024, 2048,
>>> and 4096 byte payloads (without caches enabled, of course).
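
A cutoff like the one suggested above could be as small as the sketch below.
The helper name and the 512-byte threshold are made up for illustration and
would need the requested comparison data to justify any particular value:

	/* Hypothetical: fall back to the copying path ('D'/'U' format
	 * specifiers) for payloads below a crossover size, and take the
	 * separate-payload ('E'/'F') zero-copy path only above it. */
	#define P9_ZC_THRESHOLD	512

	static int p9_use_zc(struct p9_client *clnt, int count)
	{
		if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) !=
						P9_TRANS_PREF_PAYLOAD_SEP)
			return 0;	/* transport has no zero-copy path */
		return count >= P9_ZC_THRESHOLD;
	}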
>>>
>>> -eric
>>>
>>>
>>> On Mon, Feb 7, 2011 at 12:57 AM, Venkateswararao Jujjuri (JV)
>>> <jvrao@linux.vnet.ibm.com> wrote:
>>>> On 2/6/2011 11:21 PM, Venkateswararao Jujjuri (JV) wrote:
>>>>> Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
>>>>> ---
>>>>> net/9p/client.c | 45 +++++++++++++++++++++++++++++++--------------
>>>>> net/9p/protocol.c | 38 ++++++++++++++++++++++++++++++++++++++
>>>>> 2 files changed, 69 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/net/9p/client.c b/net/9p/client.c
>>>>> index a848bca..f939edf 100644
>>>>> --- a/net/9p/client.c
>>>>> +++ b/net/9p/client.c
>>>>> @@ -1270,7 +1270,14 @@ p9_client_read(struct p9_fid *fid, char *data, char __user *udata, u64 offset,
>>>>> if (count < rsize)
>>>>> rsize = count;
>>>>>
>>>>> - req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize);
>>>>> + if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
>>>>> + P9_TRANS_PREF_PAYLOAD_SEP) {
>>>>> + req = p9_client_rpc(clnt, P9_TREAD, "dqE", fid->fid, offset,
>>>>> + rsize, data ? data : udata);
>>>>> + } else {
>>>>> + req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset,
>>>>> + rsize);
>>>>> + }
>>>>> if (IS_ERR(req)) {
>>>>> err = PTR_ERR(req);
>>>>> goto error;
>>>>> @@ -1284,13 +1291,15 @@ p9_client_read(struct p9_fid *fid, char *data, char __user *udata, u64 offset,
>>>>>
>>>>> P9_DPRINTK(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
>>>>>
>>>>> - if (data) {
>>>>> - memmove(data, dataptr, count);
>>>>> - } else {
>>>>> - err = copy_to_user(udata, dataptr, count);
>>>>> - if (err) {
>>>>> - err = -EFAULT;
>>>>> - goto free_and_error;
>>>>> + if (!req->tc->pbuf_size) {
>>>>> + if (data) {
>>>>> + memmove(data, dataptr, count);
>>>>> + } else {
>>>>> + err = copy_to_user(udata, dataptr, count);
>>>>> + if (err) {
>>>>> + err = -EFAULT;
>>>>> + goto free_and_error;
>>>>> + }
>>>>> }
>>>>> }
>>>>> p9_free_req(clnt, req);
>>>>> @@ -1323,12 +1332,20 @@ p9_client_write(struct p9_fid *fid, char *data, const char __user *udata,
>>>>>
>>>>> if (count < rsize)
>>>>> rsize = count;
>>>>> - if (data)
>>>>> - req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid, offset,
>>>>> - rsize, data);
>>>>> - else
>>>>> - req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid, offset,
>>>>> - rsize, udata);
>>>>> +
>>>>> + if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
>>>>> + P9_TRANS_PREF_PAYLOAD_SEP) {
>>>>> + req = p9_client_rpc(clnt, P9_TWRITE, "dqF", fid->fid, offset,
>>>>> + rsize, data ? data : udata);
>>>>> + } else {
>>>>> + if (data)
>>>>> + req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid,
>>>>> + offset, rsize, data);
>>>>> + else
>>>>> + req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid,
>>>>> + offset, rsize, udata);
>>>>> + }
>>>>> +
>>>>> if (IS_ERR(req)) {
>>>>> err = PTR_ERR(req);
>>>>> goto error;
>>>>> diff --git a/net/9p/protocol.c b/net/9p/protocol.c
>>>>> index dfc358f..ea778dd 100644
>>>>> --- a/net/9p/protocol.c
>>>>> +++ b/net/9p/protocol.c
>>>>> @@ -114,6 +114,24 @@ pdu_write_u(struct p9_fcall *pdu, const char __user *udata, size_t size)
>>>>> return size - len;
>>>>> }
>>>>>
>>>>> +static size_t
>>>>> +pdu_write_uw(struct p9_fcall *pdu, const char *udata, size_t size)
>>>>> +{
>>>>> + size_t len = min(pdu->capacity - pdu->size, size);
>>>>> + pdu->pbuf = udata;
>>>>> + pdu->pbuf_size = len;
>>>>> + return size - len;
>>>>> +}
>>>>> +
>>>>> +static size_t
>>>>> +pdu_write_ur(struct p9_fcall *pdu, const char *udata, size_t size)
>>>>> +{
>>>>> + size_t len = min(pdu->capacity - pdu->size, size);
>>>>> + pdu->pbuf = udata;
>>>>> + pdu->pbuf_size = len;
>>>>> + return size - len;
>>>>> +}
>>>>> +
>>>>> /*
>>>>> b - int8_t
>>>>> w - int16_t
>>>>> @@ -445,6 +463,26 @@ p9pdu_vwritef(struct p9_fcall *pdu, int proto_version, const char *fmt,
>>>>> errcode = -EFAULT;
>>>>> }
>>>>> break;
>>>>> + case 'E':{
>>>>> + int32_t count = va_arg(ap, int32_t);
>>>>> + const char *udata = va_arg(ap, const void *);
>>>>> + errcode = p9pdu_writef(pdu, proto_version, "d",
>>>>> + count);
>>>>> + if (!errcode && pdu_write_ur(pdu, udata,
>>>>> + count))
>>>>> + errcode = -EFAULT;
>>>>> + }
>>>>> + break;
>>>>> + case 'F':{
>>>>> + int32_t count = va_arg(ap, int32_t);
>>>>> + const char *udata = va_arg(ap, const void *);
>>>>> + errcode = p9pdu_writef(pdu, proto_version, "d",
>>>>> + count);
>>>>> + if (!errcode && pdu_write_uw(pdu, udata,
>>>>> + count))
>>>>> + errcode = -EFAULT;
>>>>> + }
>>>>> + break;
>>>>> case 'U':{
>>>>> int32_t count = va_arg(ap, int32_t);
>>>>> const char __user *udata =
>>>>
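
Note that the 'E'/'F' cases above only record the buffer pointer and length
in pdu->pbuf/pdu->pbuf_size; the actual page pinning happens in the
transport (patch 4/7, not quoted here). A minimal sketch of how a transport
might consume that record; the helper name is hypothetical, only
get_user_pages_fast() is a real kernel API:

	/* Pin the user pages backing pdu->pbuf so the transport can add
	 * them to its scatterlist; the caller sizes 'pages' from
	 * pdu->pbuf_size.  'write' is 1 for TREAD, since the server's
	 * reply is written into the buffer, and 0 for TWRITE. */
	static int p9_payload_gup(struct p9_fcall *pdu, struct page **pages,
				  int nr_pages, int write)
	{
		unsigned long addr = (unsigned long)pdu->pbuf;

		return get_user_pages_fast(addr, nr_pages, write, pages);
	}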