linux-nfs.vger.kernel.org archive mirror
From: Gregory Magoon <gmagoon@MIT.EDU>
To: "Loewe, Bill" <bloewe@panasas.com>,
	"Harrosh, Boaz" <bharrosh@panasas.com>,
	Trond Myklebust <Trond.Myklebust@netapp.com>,
	"J. Bruce Fields" <bfields@citi.umich.edu>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: RE: NFSv4 vs NFSv3 with MPICH2-1.4
Date: Tue, 09 Aug 2011 19:15:00 -0400
Message-ID: <20110809191500.2rm2rkclk408k0o0@webmail.mit.edu>
In-Reply-To: <20110802121913.y5qpfnkftw00o00s@webmail.mit.edu>

Just a quick follow-up: I was wondering whether anyone has had a chance to take
a look at the tcpdump capture I sent to a few of you last week.

If anyone else on the list wants to take a look, please let me know, and I will
send you the link privately.

Thanks,
Greg

Quoting Gregory Magoon <gmagoon@MIT.EDU>:

> Thanks all for the feedback, and sorry for the delay: one of our HDDs failed
> on Saturday, so I had to take care of that.
>
> Because I don't want to interrupt a working system, it will not be convenient
> for me to try the "no delegations" option that has been suggested.
>
> I was, however, able to get hold of a temporarily free node (temporarily
> returned to the NFSv4 configuration) to capture the TCP traffic. I have sent a
> short (< 1 sec) snapshot captured during (I believe) the allred3 mpich2 test,
> and have privately sent you a link to the file. Hopefully the issue will be
> obvious from this (e.g. you will immediately see that I am doing something I
> shouldn't be doing). If a longer snapshot, started before the tests begin,
> would be useful, I can get that too.
>
> I had posted on the mpich mailing list before I came here (
> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2011-July/010432.html ) and
> unfortunately they weren't able to provide any insights.
>
> Thanks again,
> Greg
>
>
> Quoting "Loewe, Bill" <bloewe@panasas.com>:
>
>> Hi Greg,
>>
>> IOR is independent of MPICH2, but does require MPI for process 
>> coordination.  By default, IOR will use the "-a POSIX" option for 
>> standard POSIX I/O -- open(), write(), close(), etc.
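
For illustration only, the POSIX path described above comes down to plain open()/write()/close() system calls, which the kernel VFS hands to the NFS client when the file lives on an NFS mount. The following is a minimal Python sketch of that call sequence, not IOR's actual code; the function name and the file name are hypothetical.

```python
import os
import tempfile


def posix_write_read(path: str, payload: bytes) -> bytes:
    """Write payload with open()/write()/close(), then read it back."""
    # Write side: the same system calls a POSIX-mode benchmark would issue.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            written = os.write(fd, view)  # write() may be partial
            view = view[written:]
        os.fsync(fd)  # push the data to the server/disk before closing
    finally:
        os.close(fd)

    # Read side: read until end of file.
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)


if __name__ == "__main__":
    # Hypothetical file name; point the directory at an NFS mount to
    # exercise the kernel NFS client instead of a local file system.
    with tempfile.TemporaryDirectory() as d:
        print(posix_write_read(os.path.join(d, "ior_demo.dat"), b"hello"))
```

Running it against a directory on the NFS mount while tcpdump is capturing would show this traffic going through the kernel NFS client.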
>>
>> In addition, IOR can use the MPI-IO library calls (MPI_File_open(), 
>> etc.) to perform I/O.
>>
>> In the MPICH2 build process, "make testing" exercises this MPI-IO 
>> (ROMIO) interface, which uses an ADIO (Abstract-Device Interface for 
>> I/O) layer.  ADIO can interface to different file systems (e.g. NFS, 
>> PanFS, PVFS2, Lustre).
>>
>> The errors you're encountering in "make testing" for MPICH2 do not 
>> appear to come from tests of the I/O, however, but seem to be an 
>> issue with the launcher for the tests in general.  I agree with Boaz 
>> that it may make sense to follow up with the MPICH developers on 
>> this.  Under their main page 
>> (http://www.mcs.anl.gov/research/projects/mpich2/) they have a 
>> support pulldown with an FAQ and a mailing list.  They may be able 
>> to help resolve this for you.
>>
>> Thanks,
>>
>> --Bill.
>>
>> -----Original Message-----
>> From: Harrosh, Boaz
>> Sent: Friday, July 29, 2011 8:20 PM
>> To: Gregory Magoon
>> Cc: Trond Myklebust; linux-nfs@vger.kernel.org; J. Bruce Fields; Loewe, Bill
>> Subject: Re: NFSv4 vs NFSv3 with MPICH2-1.4
>>
>> On 07/28/2011 04:15 PM, Gregory Magoon wrote:
>>> Unfortunately, I'm not familiar enough with MPICH2 to have an idea about
>>> significant changes between version 1.3 and 1.4, but other evidence suggests
>>> that the version is not the issue and that I would have the same problem with
>>> v1.3.
>>>
>>> I'm using the MPICH2 test suite invoked by "make testing" (see below for
>>> initial output).
>>>
>>> I'm using the nfs-kernel-server and nfs-common Ubuntu packages (natty
>>> release).
>>>
>>
>> You have not answered the most important question:
>>>> Also are you using the builtin nfs-client driver or the POSIX interface?
>>
>> Which I'll assume means you don't know, so I'll try to elaborate. Just for
>> background: I've never used "make testing" before; all I have used is IOR and
>> mdtest.
>>
>> Now if you print the usage string for IOR you get this option:
>>
>> 	-a S  api --  API for I/O [POSIX|MPIIO|HDF5|NCMPI]
>>
>> I'm not familiar with the code, but as I understand it only "-a POSIX" will
>> actually use the regular kernel VFS interface for reading and writing files.
>> The other options have different drivers for different protocols. I do not
>> know first hand, but I once heard at a conference that "-a MPIIO" has a
>> special NFS driver that uses better NFS semantics and avoids the POSIX
>> semantics, which are bad for big-cluster performance. All this is speculation
>> and rumor on my part, and you will need to consult with the mpich guys.
>>
>> Now I can imagine that "make testing" would try all possible combinations of
>> "-a S". So you'll need to dig out which test is failing and whether it is
>> really using the kernel NFS driver at that point. (I bet if you do a tcpdump
>> like Bruce said, the guys here will be able to see whether this is Linux NFS
>> or not.)
>>
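
One way to check whether a test directory really sits on a kernel NFS mount, and with which protocol version, is to look it up in /proc/mounts. The sketch below is only an illustration under stated assumptions: it takes the mounts table as text, assumes the usual "device mountpoint fstype options ..." layout, and treats the fstype values "nfs" and "nfs4" as kernel NFS mounts. The function name and the sample table are hypothetical.

```python
import os


def nfs_mount_for(path, mounts_text):
    """Return (mount_point, fstype, version) if path is on an NFS mount, else None.

    mounts_text is expected in /proc/mounts layout. Mount points containing
    spaces (escaped as \\040 in /proc/mounts) are not handled in this sketch.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Longest mount point wins; on a tie, the later entry shadows.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype, options)
    if best is None or best[1] not in ("nfs", "nfs4"):
        return None
    version = None
    for opt in best[2].split(","):
        if opt.startswith("vers=") or opt.startswith("nfsvers="):
            version = opt.split("=", 1)[1]
    return (best[0], best[1], version)


if __name__ == "__main__":
    # Hypothetical mounts table, for illustration only.
    sample = (
        "/dev/sda1 / ext4 rw 0 0\n"
        "server:/export /home nfs4 rw,vers=4,addr=10.0.0.1 0 0\n"
    )
    print(nfs_mount_for("/home/user/Molpro/src/mpich2-1.4/test", sample))
```

On a live node the text would come from reading /proc/mounts; the exact option strings shown there depend on the kernel version.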
>> I've CC'd Bill Loewe, who might know much more than me about this subject.
>> And please do speak with the MPICH people (but keep us in the loop; it is
>> interesting to know).
>>
>> Thanks
>> Boaz
>>
>>> Thanks,
>>> Greg
>>>
>>> user@node01:~/Molpro/src/mpich2-1.4$ make testing
>>> (cd test && make testing)
>>> make[1]: Entering directory `/home/user/Molpro/src/mpich2-1.4/test'
>>> (NOXMLCLOSE=YES && export NOXMLCLOSE && cd mpi && make testing)
>>> make[2]: Entering directory `/home/user/Molpro/src/mpich2-1.4/test/mpi'
>>> ./runtests -srcdir=. -tests=testlist \
>>>                     -mpiexec=/home/user/Molpro/src/mpich2-install/bin/mpiexec \
>>>                     -xmlfile=summary.xml
>>> Looking in ./testlist
>>> Processing directory attr
>>> Looking in ./attr/testlist
>>> Processing directory coll
>>> Looking in ./coll/testlist
>>> Unexpected output in allred: [mpiexec@node01] APPLICATION TIMED OUT
>>> Unexpected output in allred: [proxy:0:0@node01] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
>>> Unexpected output in allred: [proxy:0:0@node01] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
>>> Unexpected output in allred: [proxy:0:0@node01] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
>>> Unexpected output in allred: [mpiexec@node01] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
>>> Unexpected output in allred: [mpiexec@node01] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
>>> Unexpected output in allred: [mpiexec@node01] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:189): launcher returned error waiting for completion
>>> Unexpected output in allred: [mpiexec@node01] main (./ui/mpich/mpiexec.c:397): process manager error waiting for completion
>>> Program allred exited without No Errors
>>>
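
To dig out which tests are failing in a long run, the runtests output can be scanned for the two message shapes seen in the log above ("Unexpected output in NAME:" and "Program NAME exited without No Errors"). This is a rough sketch that assumes those are the only formats of interest; the function name is hypothetical and the patterns are inferred from this log, not from the MPICH2 sources.

```python
import re

# Patterns inferred from the log excerpt in this thread.
UNEXPECTED = re.compile(r"Unexpected output in (\S+?):")
FAILED = re.compile(r"Program (\S+) exited without No Errors")


def summarize(log_text):
    """Return failing test names and per-test counts of unexpected-output lines."""
    unexpected = {}
    failed = []
    for line in log_text.splitlines():
        m = UNEXPECTED.search(line)
        if m:
            name = m.group(1)
            unexpected[name] = unexpected.get(name, 0) + 1
        m = FAILED.search(line)
        if m and m.group(1) not in failed:
            failed.append(m.group(1))
    return {"failed": failed, "unexpected": unexpected}


if __name__ == "__main__":
    log = (
        "Unexpected output in allred: [mpiexec@node01] APPLICATION TIMED OUT\n"
        "Program allred exited without No Errors\n"
    )
    print(summarize(log))
```

Because the patterns use search rather than match, the function also works on text that still carries mail quote prefixes such as ">>> ".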
>>>>
>>>> Hi Gregory
>>>>
>>>> We are using MPICH2-1.3.1 with the IOR mpich test as well as the mdtest
>>>> test, and have had no issues so far with NFSv4, NFSv4.1, and pNFS. In fact,
>>>> this is our standard performance test.
>>>>
>>>> What tests are you using?
>>>> Do you know of any major changes between MPICH2-1.3.1 and MPICH2-1.4?
>>>> Also are you using the builtin nfs-client driver or the POSIX interface?
>>>>
>>>> Boaz
>>>>
>>>
>>>
>>
>>
>
>




Thread overview: 13+ messages
2011-07-28 19:23 NFSv4 vs NFSv3 with MPICH2-1.4 Gregory Magoon
2011-07-28 20:58 ` Trond Myklebust
2011-07-28 21:24   ` Gregory Magoon
2011-07-28 21:47     ` Trond Myklebust
2011-07-28 22:01       ` Gregory Magoon
2011-07-28 22:31         ` Boaz Harrosh
2011-07-28 23:15           ` Gregory Magoon
2011-07-30  3:19             ` Boaz Harrosh
2011-07-30  3:56               ` Loewe, Bill
2011-08-02 16:19                 ` Gregory Magoon
2011-08-09 23:15                   ` Gregory Magoon [this message]
2011-08-10  0:54                     ` Boaz Harrosh
2011-07-29 14:51       ` J. Bruce Fields
