linux-nfs.vger.kernel.org archive mirror
From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: Linux NFS Mailing list <linux-nfs@vger.kernel.org>
Cc: Steve Dickson <steved@redhat.com>
Subject: pNFS: invalid IP:port selection when talks to DS
Date: Mon, 20 Mar 2017 16:52:40 +0100 (CET)
Message-ID: <45574919.3034342.1490025160438.JavaMail.zimbra@desy.de>

Dear (p)NFS-ors,

we observe a VERY unpleasant situation with pNFS in production.
Our hosts run multiple DSes on different ports, usually 24001-24009.
With CentOS7 (3.10.0-514.6.2.el7.x86_64) we see that the client picks
the wrong port number when it talks to a data server.

If the client uses different DSes on the same host, then at some point it
starts to send data to the wrong port number:

Client <=> MDS:


    1 0.000000000 131.169.251.53 → 131.169.51.35 NFS V4 Call OPEN DH: 0x7cbc716b/MIL-68-onebatch-80C-30s-00057.tif.metadata
    2 0.001469799 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 1) OPEN StateID: 0xec18
    3 0.001578128 131.169.251.53 → 131.169.51.35 NFS V4 Call SETATTR FH: 0x6ccf3dfa
    4 0.002657187 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 3) SETATTR
    5 0.003243819 131.169.251.53 → 131.169.51.35 NFS V4 Call LAYOUTGET
    6 0.014603386 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 5) LAYOUTGET
    7 0.014899121 131.169.251.53 → 131.169.51.35 NFS V4 Call GETDEVINFO
    8 0.015014216 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 7) GETDEVINFO
        Opcode: GETDEVINFO (47)
            Status: NFS4_OK (0)
            layout type: LAYOUT4_NFSV4_1_FILES (1)
            device index: 0
            r_netid: tcp
                length: 3
                contents: tcp
                fill bytes: opaque data
            r_addr: 131.169.51.50.93.197
                length: 20
                contents: 131.169.51.50.93.197
            r_netid: tcp
                length: 3
                contents: tcp
                fill bytes: opaque data
            r_addr: 131.169.51.50.93.197
                length: 20
                contents: 131.169.51.50.93.197
            notification bitmap: 6
            notification bitmap: 0
    [Main Opcode: GETDEVINFO (47)]

    9 0.105442455 131.169.251.53 → 131.169.51.35 NFS V4 Call TEST_STATEID
   10 0.105521354 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 9) TEST_STATEID



NOTE that 131.169.51.50.93.197 corresponds to port 24005 (93 * 256 + 197).
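
For anyone who wants to double-check the port arithmetic, here is a minimal
sketch that decodes an IPv4 universal address of the form h1.h2.h3.h4.p1.p2,
as it appears in r_addr. The function name uaddr_to_endpoint is made up for
this illustration; it is not part of any NFS tool or of the kernel.

```python
from typing import Tuple


def uaddr_to_endpoint(uaddr: str) -> Tuple[str, int]:
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (ip, port).

    Hypothetical helper for illustration only.
    """
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(parts)}")
    fields = [int(p) for p in parts]
    if any(not 0 <= f <= 255 for f in fields):
        raise ValueError("each field must be in the range 0..255")
    ip = ".".join(parts[:4])
    # The last two fields are the high and low bytes of the port number.
    port = fields[4] * 256 + fields[5]
    return ip, port


print(uaddr_to_endpoint("131.169.51.50.93.197"))
```

By the same rule, port 24006 would have been advertised as
131.169.51.50.93.198.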

client <=> DS

$ tshark -r ds-write.pcap  -n -z conv,tcp
    1   0.000000 131.169.251.53 → 131.169.51.50 NFS V4 Call WRITE StateID: 0xff01 Offset: 0 Len: 3968
    2   0.000090 131.169.51.50 → 131.169.251.53 NFS V4 Reply (Call In 1) WRITE Status: NFS4ERR_BAD_STATEID
================================================================================
TCP Conversations
Filter:<No Filter>
                                                           |       <-      | |       ->      | |     Total     |    Relative    |   Duration   |
                                                           | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |      Start     |              |
131.169.51.50:24006        <-> 131.169.251.53:847               1      4240       1       168       2      4408     0.000000000         0.0001
================================================================================

NOTE that the client talks to the DS on port 24006, not 24005!
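
To spot this mismatch across a larger capture, one could compare each line of
the tshark conversation table against the endpoint advertised in GETDEVINFO.
The following is only a sketch: the helper names and the regular expression
are assumptions made for this illustration, and the pattern is written
against the IPv4 line format shown above.

```python
import re

# Matches the leading "ip:port <-> ip:port" part of a conv,tcp table row.
CONV_RE = re.compile(
    r"^(\d+\.\d+\.\d+\.\d+):(\d+)\s+<->\s+(\d+\.\d+\.\d+\.\d+):(\d+)"
)


def conversation_endpoints(line):
    """Return the two (ip, port) endpoints of a table row, or None."""
    m = CONV_RE.match(line.strip())
    if m is None:
        return None
    return (m.group(1), int(m.group(2))), (m.group(3), int(m.group(4)))


def uses_advertised_ds(line, advertised):
    """True if one side of the conversation is the advertised DS endpoint."""
    endpoints = conversation_endpoints(line)
    return endpoints is not None and advertised in endpoints


# Endpoint taken from the GETDEVINFO reply in this message.
advertised = ("131.169.51.50", 24005)
row = "131.169.51.50:24006        <-> 131.169.251.53:847    1  4240  1  168"
print(uses_advertised_ds(row, advertised))
```

For the row from this capture the check reports a mismatch, since the
conversation goes to port 24006 while the device info advertised 24005.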

Is there a known fix that is missing in CentOS7? I can't reproduce it with
a 4.9 kernel (or it's harder to reproduce).


The packet captures are attached.

Tigran.


[-- Attachment #2: ds-write.pcapng --]
[-- Type: application/x-pcapng, Size: 4520 bytes --]

[-- Attachment #3: mds.pcapng --]
[-- Type: application/x-pcapng, Size: 3580 bytes --]

