linux-nfs.vger.kernel.org archive mirror
From: Namjae Jeon <linkinjeon@gmail.com>
To: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Cc: bfields@fieldses.org, neilb@suse.de, linux-nfs@vger.kernel.org,
	Amit Sahrawat <a.sahrawat@samsung.com>,
	Nam-Jae Jeon <namjae.jeon@samsung.com>
Subject: Re: sunrpc/cache.c: races while updating cache entries
Date: Mon, 13 May 2013 13:08:45 +0900
Message-ID: <CAKYAXd_9a=xukJDpV=ug3npyaoa4mrpW8ijf_6DiKPDjiOYe7g@mail.gmail.com>
In-Reply-To: <CAKYAXd9dWGA1Eaq5mi-eRbY0RRhkmWDR7CeDoeW18dBcKcGv+Q@mail.gmail.com>

Hi.

Sorry for the interruption.
I fixed my issue using this patch (nfsd4: fix hang on fast-booting nfs
servers). It was a different issue from the one in the subject of this thread.

Thanks.

2013/5/10, Namjae Jeon <linkinjeon@gmail.com>:
> Hi. Bodo.
>
> We are facing issues with the SUNRPC cache.
> In our case we have two targets connected back-to-back.
> NFS server kernel version: 2.6.35
>
> At times, when the client tries to connect to the server, it gets stuck for
> a very long time and keeps trying to mount.
>
> When we examined the logs, we saw that the client was not
> getting a response to its FSINFO request.
>
> Further debugging showed that the request was being dropped at
> the server, so it was never served.
>
> In the code we reached this point:
> svcauth_unix_set_client()->
>   gi = unix_gid_find(cred->cr_uid, rqstp);
>         switch (PTR_ERR(gi)) {
>         case -EAGAIN:
>                 return SVC_DROP;
>
> This path is related to SUNRPC cache management.
>
> When we remove this unix_gid_find() path from our code, there is no problem.
>
> When we looked for possibly related problems matching our
> scenario, we found that you have faced a similar issue with a race in the
> cache.
> Can you please suggest what the problem could be, so that we can check
> further?
>
> Or, if you have encountered a similar situation,
> can you please suggest possible patches for 2.6.35 that we
> can try in our environment?
>
> We would be very grateful.
>
> Thanks
>
>
> 2013/4/20, Bodo Stroesser <bstroesser@ts.fujitsu.com>:
>> On 05 Apr 2013 23:09:00 +0100 J. Bruce Fields <bfields@fieldses.org> wrote:
>>> On Fri, Apr 05, 2013 at 05:33:49PM +0200, Bodo Stroesser wrote:
>>> > On 05 Apr 2013 14:40:00 +0100 J. Bruce Fields <bfields@fieldses.org> wrote:
>>> > > On Thu, Apr 04, 2013 at 07:59:35PM +0200, Bodo Stroesser wrote:
>>> > > > There is no reason for apologies. The thread meanwhile seems to be a bit
>>> > > > confusing :-)
>>> > > >
>>> > > > Current state is:
>>> > > >
>>> > > > - Neil Brown has created two series of patches. One for SLES11-SP1 and a
>>> > > >   second one for -SP2
>>> > > >
>>> > > > - AFAICS, the series for -SP2 will match with mainline also.
>>> > > >
>>> > > > - Today I found and fixed the (hopefully) last problem in the -SP1 series.
>>> > > >   My test using this patchset will run until Monday.
>>> > > >
>>> > > > - Provided the test on SP1 succeeds, probably on Tuesday I'll start to test
>>> > > >   the patches for SP2 (and mainline). If it runs fine, we'll have a tested
>>> > > >   patchset not later than Mon 15th.
>>> > >
>>> > > OK, great, as long as it hasn't just been forgotten!
>>> > >
>>> > > I'd also be curious to understand why we aren't getting a lot of
>>> > > complaints about this from elsewhere....  Is there something unique
>>> > > about your setup?  Do the bugs that remain upstream take a long time to
>>> > > reproduce?
>>> > >
>>> > > --b.
>>> > >
>>> >
>>> > It's no secret, what we are doing. So let me try to explain:
>>>
>>> Thanks for the detailed explanation!  I'll look forward to the patches.
>>>
>>> --b.
>>>
>>
>> Let me give an intermediate result:
>>
>> The test of the -SP1 patch series succeeded.
>>
>> We started the test of the -SP2 (and mainline) series on Tue, 9th, but had no
>> success.
>> We did _not_ find a problem with the patches, but under -SP2 our test scenario
>> has less than 40% of the throughput we saw under -SP1. With that low
>> performance, we had a 4 day run without any dropped RPC request. But we don't
>> know the error rate without the patches under these conditions. So we can't
>> give an o.k. for the patches yet.
>>
>> Currently we try to find the reason for the different behavior of SP1 and SP2
>>
>> Bodo
>>
>
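
For readers following the code path quoted above (unix_gid_find() yielding
-EAGAIN and svcauth_unix_set_client() returning SVC_DROP), here is a minimal
user-space sketch of why a client can keep retrying indefinitely. It is not
kernel code. Every name prefixed with toy_ is hypothetical, the one-entry
cache is a simplification, and the assumption that -EAGAIN means "the cache
entry is still waiting for a reply from user space" is an interpretation of
the excerpt rather than something stated in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the verdicts seen in the quoted excerpt. */
enum toy_verdict { TOY_SVC_OK, TOY_SVC_DROP };

/* Toy one-entry cache; 'valid' stays false until a simulated reply arrives. */
struct toy_gid_entry {
    unsigned int uid;
    bool valid;
    bool negative; /* the reply said "no group list for this uid" */
};

/* 0 on a hit, -ENOENT for a negative entry, -EAGAIN if no reply has arrived. */
static int toy_gid_lookup(const struct toy_gid_entry *e, unsigned int uid)
{
    if (e->uid != uid || !e->valid)
        return -EAGAIN;
    return e->negative ? -ENOENT : 0;
}

/* Mirrors the shape of the quoted switch: only -EAGAIN drops the request. */
static enum toy_verdict toy_set_client(const struct toy_gid_entry *e,
                                       unsigned int uid)
{
    switch (toy_gid_lookup(e, uid)) {
    case -EAGAIN:
        return TOY_SVC_DROP; /* no answer yet: the client must retransmit */
    default:
        return TOY_SVC_OK;   /* hit or negative entry: request proceeds */
    }
}

/*
 * Simulate client retransmissions. The cache entry becomes valid once more
 * than 'reply_after' attempts have been made; a negative 'reply_after'
 * means the reply never arrives. Returns the attempt number that was
 * served, or -1 if every attempt was dropped.
 */
static int toy_attempts_until_served(struct toy_gid_entry *e, unsigned int uid,
                                     int max_attempts, int reply_after)
{
    assert(e != NULL);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (reply_after >= 0 && attempt > reply_after) {
            e->uid = uid;
            e->valid = true;
        }
        if (toy_set_client(e, uid) == TOY_SVC_OK)
            return attempt;
    }
    return -1;
}
```

In this sketch a request is served as soon as the entry becomes valid, but if
the reply never arrives every retransmission is dropped, which matches the
symptom of a mount that keeps retrying.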
