From: Nicolas Williams <Nicolas.Williams@oracle.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Query to understand the Lustre request/reply message
Date: Wed, 13 Oct 2010 02:43:46 -0500
Message-ID: <20101013074345.GH1635@oracle.com>
In-Reply-To: <49BEA69F-6931-473E-AA86-4A676A71607A@clusterstor.com>
On Wed, Oct 13, 2010 at 10:27:55AM +0300, Alexey Lyashkov wrote:
> eh.. Nicolas,
>
> The format differs between messages that need to be reconstructed
> after a resend and messages that don't.
>
> As a quick example, take the OPEN request (via the MDS_REINT command):
> that type of message needs an extra buffer to store the LOV EA, which
> is sent to the MDS in the replay case (with an additional flag in the
> header). (The client copies that data from the MDS reply after ptlrpc
> finishes processing the request.) That is why I talk about the
> "reconstruct/replay case".
Sure, but this buffer needs to be declared a priori. If you won't know
whether you'll need a buffer until later, that's OK: you declare it
anyways and you set its size to zero if you don't need it.
You can't change a capsule's format to add buffers; you can only set the
size of unnecessary buffers to zero. This is because the header of a
ptlrpc message (not the ptlrpc_body, mind you) has a count of buffers
followed by a variable-length (64-bit aligned) array of that many 32-bit
buffer lengths (I'm going from memory here), so adding buffers can push
a reply over the maximum size the client expects, leading to the reply
being dropped.
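To make the size argument concrete, here is a minimal sketch in C. It is
a toy model, not the real ptlrpc layout: the header is assumed to hold
only a 32-bit count plus the 32-bit lengths, each buffer is assumed to
be padded to 64 bits, and the names round_up8(), toy_header_size() and
toy_msg_size() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round up to the next multiple of 8 bytes (64-bit alignment). */
static size_t round_up8(size_t n)
{
    return (n + 7) & ~(size_t)7;
}

/*
 * Hypothetical, simplified header: a 32-bit buffer count followed by one
 * 32-bit length per buffer, the whole thing padded to 64 bits. The real
 * ptlrpc header carries more fields than this.
 */
static size_t toy_header_size(uint32_t bufcount)
{
    return round_up8(sizeof(uint32_t) + (size_t)bufcount * sizeof(uint32_t));
}

/*
 * Total message size in this model: the header plus every buffer, each
 * padded to 64 bits (the per-buffer padding is an assumption).
 */
static size_t toy_msg_size(uint32_t bufcount, const uint32_t *buflens)
{
    size_t total = toy_header_size(bufcount);
    uint32_t i;

    assert(bufcount == 0 || buflens != NULL);
    for (i = 0; i < bufcount; i++)
        total += round_up8(buflens[i]);
    return total;
}
```

In this model a declared buffer whose length is zero costs only its
32-bit slot in the header, while an extra non-empty buffer grows both
the header and the payload. That growth is what can take a reply past
the size the client preallocated.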
You can change a capsule's format to change the definition of a field
from one without a swabber to one with a swabber.
You'll see in many cases that the presence of a field (meaning, whether
it's checked for or whether it has a non-zero size) is dependent on a
flag in the mdt or ost body, as you mention. Replays are not the only
interesting case here. Capabilities are another.
Some of these flags could be removed and replaced instead with checks of
buffer size (0 -> flag not set, >0 -> flag set).
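A rough sketch of what such a size-based check could look like, again
as a toy model in C. struct toy_msg and toy_field_present() are
hypothetical names, not part of the Lustre API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a received message: just its buffer lengths. */
struct toy_msg {
    uint32_t bufcount;        /* number of declared buffers */
    const uint32_t *buflens;  /* one 32-bit length per declared buffer */
};

/*
 * A declared buffer counts as present only if its length is non-zero,
 * so no separate flag in the body is needed to signal it.
 */
static bool toy_field_present(const struct toy_msg *msg, uint32_t index)
{
    assert(msg != NULL);
    return index < msg->bufcount && msg->buflens[index] > 0;
}
```

With buffer lengths {184, 0, 56}, buffers 0 and 2 would be treated as
present and buffer 1 as absent, which is the same information a flag
would have carried.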
> Also, the format is different if you want to use MDS_REINT + sub
> commands or something similar to MDS_SET_INFO. For MDS_SET_INFO you
> use a single format for all messages (just a simple key <> value
> buffer), but for MDS_REINT you need two formats: one for the generic
> MDS_REINT code (get the opcode from the command, get locks, and
> possibly others) and a separate format for each opcode, such as open,
> unlink, setxattr, setattr. All of them have a different number of
> buffers (fields).
The SET_INFO RPCs are kinda gross. I should know, since I finished the
conversion of ost_handler.c to the new API. You can see that I used
req_capsule_extend() to handle some SET_INFO cases. No, I didn't cover
this detail, nor others, because I figured Vilobh needed a starting
point, and that's all I was going to provide tonight.
Nico