From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: Linux NFS Mailing list <linux-nfs@vger.kernel.org>,
Steve Dickson <steved@redhat.com>
Subject: Re: pNFS: invalid IP:port selection when talks to DS
Date: Mon, 20 Mar 2017 22:09:32 +0100 (CET)
Message-ID: <1657271697.3093215.1490044172396.JavaMail.zimbra@desy.de>
In-Reply-To: <362211751.3088036.1490043081358.JavaMail.zimbra@desy.de>
Hi Olga,
you did not have the answer, but you gave me an important hint!
I believe all our DSes on a single host generate the same server
owner during EXCHANGE_ID. I guess this could be the reason why the
client decides to talk to another DS.
Tigran.
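
To illustrate the suspicion, here is a minimal sketch of how a client that
deduplicates servers by their EXCHANGE_ID server owner could end up on the
wrong port. This is not the Linux client code: the names ServerOwner,
same_server and pick_endpoint and the owner values are made up for the
example, and the matching rule (same so_major_id and same server scope) is a
simplification of the trunking rules.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, TCP port)


@dataclass(frozen=True)
class ServerOwner:
    """Subset of the EXCHANGE_ID reply used here (hypothetical model)."""
    major_id: bytes  # so_major_id
    minor_id: int    # so_minor_id
    scope: bytes     # eir_server_scope


def same_server(a: ServerOwner, b: ServerOwner) -> bool:
    # Simplified rule: equal major id and scope means "same server".
    return a.major_id == b.major_id and a.scope == b.scope


def pick_endpoint(known: Dict[Endpoint, ServerOwner],
                  addr: Endpoint, owner: ServerOwner) -> Endpoint:
    """Return the endpoint the client would actually use for addr."""
    for known_addr, known_owner in known.items():
        if same_server(known_owner, owner):
            # An existing connection is considered equivalent and is reused.
            return known_addr
    known[addr] = owner
    return addr


known: Dict[Endpoint, ServerOwner] = {}
# Assumption: every DS on the host reports the same owner (made-up values).
shared = ServerOwner(major_id=b"ds-host", minor_id=0, scope=b"ds-scope")

first = pick_endpoint(known, ("131.169.51.50", 24006), shared)
second = pick_endpoint(known, ("131.169.51.50", 24005), shared)
print(first, second)  # both resolve to port 24006
```

If each DS reported its own so_major_id, the second call would return the
address on port 24005 as expected.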
----- Original Message -----
> From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
> To: "Olga Kornievskaia" <aglo@umich.edu>
> Cc: "Linux NFS Mailing list" <linux-nfs@vger.kernel.org>, "Steve Dickson" <steved@redhat.com>
> Sent: Monday, March 20, 2017 9:51:21 PM
> Subject: Re: pNFS: invalid IP:port selection when talks to DS
> Hi Olga,
>
> ----- Original Message -----
>> From: "Olga Kornievskaia" <aglo@umich.edu>
>> To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
>> Cc: "Linux NFS Mailing list" <linux-nfs@vger.kernel.org>, "Steve Dickson" <steved@redhat.com>
>> Sent: Monday, March 20, 2017 9:14:34 PM
>> Subject: Re: pNFS: invalid IP:port selection when talks to DS
>
>> Hi Tigran,
>>
>> While I don't have an answer to your question, I'd like to point out
>> that 4.9 is when Andy's session trunking patches went in.
>>
>> I'm curious: did this client, which is now talking to the DS on port
>> 24006 instead of 24005, earlier also talk correctly (legally) to the
>> DS on port 24006?
>
> Yes, earlier during testing it had legal access to DS on port 24006.
>
> Tigran.
>
>>
>> On Mon, Mar 20, 2017 at 11:52 AM, Mkrtchyan, Tigran
>> <tigran.mkrtchyan@desy.de> wrote:
>>>
>>>
>>> Dear (p)NFS-ors,
>>>
>>> we observe a VERY unpleasant situation with pNFS in production.
>>> Our hosts run multiple DSes on different ports, usually 24001-24009.
>>> With CentOS7 (3.10.0-514.6.2.el7.x86_64) we see that the client uses
>>> the wrong port number when it talks to a data server:
>>>
>>> If the client uses different DSes on the same host, then at some point
>>> it starts to send data to the wrong port number:
>>>
>>> Client <=> MDS:
>>>
>>>
>>> 1 0.000000000 131.169.251.53 → 131.169.51.35 NFS V4 Call OPEN DH: 0x7cbc716b/MIL-68-onebatch-80C-30s-00057.tif.metadata
>>> 2 0.001469799 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 1) OPEN StateID: 0xec18
>>> 3 0.001578128 131.169.251.53 → 131.169.51.35 NFS V4 Call SETATTR FH: 0x6ccf3dfa
>>> 4 0.002657187 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 3) SETATTR
>>> 5 0.003243819 131.169.251.53 → 131.169.51.35 NFS V4 Call LAYOUTGET
>>> 6 0.014603386 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 5) LAYOUTGET
>>> 7 0.014899121 131.169.251.53 → 131.169.51.35 NFS V4 Call GETDEVINFO
>>> 8 0.015014216 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 7) GETDEVINFO
>>> Opcode: GETDEVINFO (47)
>>> Status: NFS4_OK (0)
>>> layout type: LAYOUT4_NFSV4_1_FILES (1)
>>> device index: 0
>>> r_netid: tcp
>>> length: 3
>>> contents: tcp
>>> fill bytes: opaque data
>>> r_addr: 131.169.51.50.93.197
>>> length: 20
>>> contents: 131.169.51.50.93.197
>>> r_netid: tcp
>>> length: 3
>>> contents: tcp
>>> fill bytes: opaque data
>>> r_addr: 131.169.51.50.93.197
>>> length: 20
>>> contents: 131.169.51.50.93.197
>>> notification bitmap: 6
>>> notification bitmap: 0
>>> [Main Opcode: GETDEVINFO (47)]
>>>
>>> 9 0.105442455 131.169.251.53 → 131.169.51.35 NFS V4 Call TEST_STATEID
>>> 10 0.105521354 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 9) TEST_STATEID
>>>
>>>
>>>
>>> NOTICE, that 131.169.51.50.93.197 corresponds to port 24005.
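
For reference, the port follows from the last two fields of the universal
address: 93 * 256 + 197 = 24005. A small sketch of that decoding for the IPv4
"h1.h2.h3.h4.p1.p2" form (the helper name parse_uaddr is made up for this
example):

```python
from typing import Tuple


def parse_uaddr(uaddr: str) -> Tuple[str, int]:
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated fields: %r" % uaddr)
    fields = [int(p) for p in parts]
    if any(f < 0 or f > 255 for f in fields):
        raise ValueError("every field must be in 0..255: %r" % uaddr)
    host = ".".join(parts[:4])
    # p1 is the high-order byte of the port, p2 the low-order byte.
    port = fields[4] * 256 + fields[5]
    return host, port


print(parse_uaddr("131.169.51.50.93.197"))  # ('131.169.51.50', 24005)
```

By the same rule, port 24006 would be advertised as 131.169.51.50.93.198.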
>>>
>>> client <=> DS
>>>
>>> $ tshark -r ds-write.pcap -n -z conv,tcp
>>> 1 0.000000 131.169.251.53 → 131.169.51.50 NFS V4 Call WRITE StateID: 0xff01 Offset: 0 Len: 3968
>>> 2 0.000090 131.169.51.50 → 131.169.251.53 NFS V4 Reply (Call In 1) WRITE Status: NFS4ERR_BAD_STATEID
>>> ================================================================================
>>> TCP Conversations
>>> Filter:<No Filter>
>>>                                             |       <-      | |       ->      | |     Total     |   Relative   | Duration |
>>>                                             | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |    Start     |          |
>>> 131.169.51.50:24006 <-> 131.169.251.53:847        1   4240        1    168        2   4408    0.000000000     0.0001
>>> ================================================================================
>>>
>>> NOTICE, that it talks to DS on port 24006!
>>>
>>> Is there a known fix which is missing in CentOS7? I can't reproduce it
>>> with a 4.9 kernel (or it's harder to reproduce).
>>>
>>>
>>> The packages are attached.
>>>
>>> Tigran.
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html