* [Lustre-devel] o2iblnd bug ?
@ 2010-07-01 16:18 Nic Henke
2010-07-01 21:27 ` Liang Zhen
From: Nic Henke @ 2010-07-01 16:18 UTC (permalink / raw)
To: lustre-devel
There looks to be a bug in the o2iblnd (and maybe other LNDs...) in
kiblnd_tx_done.
When tx_lntmsg[1] has a reply allocated (lnet_create_reply_msg) for a
GET_REQ, we are committed to calling lnet_finalize() on it no matter the
status of the RDMA. However, kiblnd_tx_done will call lnet_finalize() with
the same 'error' status on both the request (lntmsg[0]) and the allocated
reply (lntmsg[1]). This could lead to the upper layer receiving a REPLY
event for a message it has already nuked due to the EIO on the original
request.
The ptllnd and qswlnd seem to handle this properly: they complete the
request with rc=0, then complete the reply with rc=-EIO.
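To make the difference concrete, here is a minimal sketch of the two completion orders described above. It is not the o2iblnd, ptllnd or qswlnd source: every sketch_* name is hypothetical, sketch_finalize() only stands in for lnet_finalize(), and the split-status path follows the description in this message rather than the actual LND code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the LNet types; not the real definitions. */
struct sketch_msg {
    int finalized;                 /* set once the message is completed */
    int status;                    /* status passed at completion */
};

struct sketch_tx {
    struct sketch_msg *lntmsg[2];  /* [0] = GET request, [1] = reply or NULL */
    int status;                    /* result of the RDMA */
};

/* Stand-in for lnet_finalize(): only records the completion status. */
static void sketch_finalize(struct sketch_msg *msg, int status)
{
    assert(msg != NULL);
    msg->finalized = 1;
    msg->status = status;
}

/* Pattern described for kiblnd_tx_done: one status for both messages. */
static void sketch_tx_done_same_status(struct sketch_tx *tx)
{
    for (int i = 0; i < 2; i++) {
        if (tx->lntmsg[i] == NULL)
            continue;
        sketch_finalize(tx->lntmsg[i], tx->status);
        tx->lntmsg[i] = NULL;
    }
}

/* Pattern described for ptllnd/qswlnd: once a reply exists, the request
 * completes with rc=0 and only the reply carries the RDMA status. */
static void sketch_tx_done_split_status(struct sketch_tx *tx)
{
    if (tx->lntmsg[0] != NULL) {
        int rc = (tx->lntmsg[1] != NULL) ? 0 : tx->status;

        sketch_finalize(tx->lntmsg[0], rc);
        tx->lntmsg[0] = NULL;
    }
    if (tx->lntmsg[1] != NULL) {
        sketch_finalize(tx->lntmsg[1], tx->status);
        tx->lntmsg[1] = NULL;
    }
}
```

With tx->status set to -EIO and both messages present, the first function reports -EIO twice, while the second reports 0 for the request and -EIO for the reply.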
So, is this really a bug or just an inconsequential difference?
This looks to be present in HEAD, as well as b1_8 and friends.
Cheers,
Nic
* [Lustre-devel] o2iblnd bug ?
2010-07-01 16:18 [Lustre-devel] o2iblnd bug ? Nic Henke
@ 2010-07-01 21:27 ` Liang Zhen
From: Liang Zhen @ 2010-07-01 21:27 UTC (permalink / raw)
To: lustre-devel
Nic Henke wrote:
> There looks to be a bug in the o2iblnd (and maybe other LNDs...) in
> kiblnd_tx_done.
>
> When tx_lntmsg[1] has a reply allocated (lnet_create_reply_msg) for a
> GET_REQ, we are committed to lnet_finalize that no matter the status of
> the RDMA. However, kiblnd_tx_done will call lnet_finalize() with the
> 'error' status on both the request (lntmsg[0]) and the allocated reply.
> This could lead to the upper layer receiving a REPLY event for a message
> it has already nuked due to the EIO on the original request.
>
>
Nic,
I think lnet_create_reply_msg has already taken an extra reference on the
MD (lnet_create_reply_msg()->lnet_commit_md()), so the upper layer message
shouldn't be nuked before the last event (unlinked).
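A simplified model of that reference counting, as a sketch only: the sketch_* names are hypothetical, and the real conditions under which LNet unlinks an MD are more involved than a bare counter reaching zero. The idea is that each committed message holds one reference, and only the event that drops the last reference is marked as unlinked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of an MD; not the real LNet structure. */
struct sketch_md {
    int  refcount;   /* messages currently committed to this MD */
    int  events;     /* events delivered to the upper layer so far */
    bool unlinked;   /* true once the final event has been delivered */
};

/* Stand-in for lnet_commit_md(): a committed message takes a reference. */
static void sketch_commit_md(struct sketch_md *md)
{
    assert(!md->unlinked);
    md->refcount++;
}

/* Stand-in for finalizing one message on the MD: drop its reference and
 * deliver an event. Returns true only for the last (unlinked) event. */
static bool sketch_finalize_on_md(struct sketch_md *md)
{
    assert(md->refcount > 0);
    md->refcount--;
    md->events++;
    if (md->refcount == 0)
        md->unlinked = true;
    return md->unlinked;
}
```

In this model, with the GET request and the reply both committed, the event for the failed request is not the unlinked one, so the upper layer still expects the event for the reply.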
Liang
> In the ptllnd and qswlnd, they seem to handle this properly. They will
> complete the request with rc=0, then complete the reply with rc=-EIO.
>
> So - is this really a bug or just inconsequential differences ?
>
> This looks to be present in HEAD, as well as b1_8 and friends.
>
> Cheers,
> Nic