linux-nfs.vger.kernel.org archive mirror
* blacklisted DS with pnfs
From: Tigran Mkrtchyan @ 2012-01-25  9:56 UTC
  To: linux-nfs

Hi,

We have observed that in some situations (probably network glitches)
the pNFS client blacklists one of the data servers:

NFS: data server 83a95099 connection error -12. Deviceid [22000000000]
marked out of use.

As a result, this client can no longer use that data server.

Is there a way to make the client forget about the data server?
Some magic in /proc?

This is SL6.2 (RHEL 6.2):
# uname -a
Linux p3-wgs13 2.6.32-220.2.1.el6.x86_64 #1 SMP Thu Dec 22 11:15:52 CST 2011 x86_64 x86_64 x86_64 GNU/Linux

Regards,
   Tigran.


* Re: blacklisted DS with pnfs
From: Boaz Harrosh @ 2012-01-25 12:44 UTC
  To: tigran.mkrtchyan; +Cc: linux-nfs

Look in the source code. I think there is a RECALL the server
can issue to trash the whole device cache, or just one of the devices.

What happens is that the device is marked with an error but stays in
the cache, so it is not re-fetched.

Wait, let me look ...

I found it! The server sends a NOTIFY_DEVICEID4_CHANGE. The
client will remove the deviceid from its cache and unmount if needed.
The next layout with that deviceid will re-establish the connection and
put a new, clean entry in the device cache.
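
A minimal Python sketch of the behaviour described above, under stated
assumptions: DeviceCache, DeviceEntry and fetch are hypothetical names made up
for illustration and are not the kernel's data structures. Only the two
notification constants and their values are taken from RFC 5661
(CB_NOTIFY_DEVICEID). The sketch drops the cache entry for either notification
type, which matches the description above but is not a statement about any
particular kernel version.

```python
from dataclasses import dataclass, field

# Notification types as numbered in RFC 5661 (CB_NOTIFY_DEVICEID).
NOTIFY_DEVICEID4_CHANGE = 1
NOTIFY_DEVICEID4_DELETE = 2


@dataclass
class DeviceEntry:
    """Hypothetical cache entry for one data-server device."""
    address: str
    unavailable: bool = False  # set after a connection error


@dataclass
class DeviceCache:
    """Toy model of a client-side deviceid cache (illustration only)."""
    entries: dict = field(default_factory=dict)
    fetches: int = 0  # counts simulated device-info fetches

    def lookup(self, deviceid, fetch):
        # A cached entry is returned as-is, even when it is marked
        # unavailable. That is why a failed device is never re-fetched.
        if deviceid not in self.entries:
            self.entries[deviceid] = DeviceEntry(fetch(deviceid))
            self.fetches += 1
        return self.entries[deviceid]

    def mark_unavailable(self, deviceid):
        self.entries[deviceid].unavailable = True

    def handle_notify(self, notify_type, deviceid):
        # Assumption: both notification types simply drop the entry.
        if notify_type in (NOTIFY_DEVICEID4_CHANGE, NOTIFY_DEVICEID4_DELETE):
            self.entries.pop(deviceid, None)


def fetch(deviceid):
    """Stand-in for asking the metadata server about a device."""
    return "ds-for-" + deviceid


cache = DeviceCache()
cache.lookup("22000000000", fetch)
cache.mark_unavailable("22000000000")        # connection error
stuck = cache.lookup("22000000000", fetch)   # still the bad entry
cache.handle_notify(NOTIFY_DEVICEID4_CHANGE, "22000000000")
fresh = cache.lookup("22000000000", fetch)   # re-fetched, clean entry
print(stuck.unavailable, fresh.unavailable, cache.fetches)
```

The second lookup returns the entry that was marked bad, and only the
notification causes a second fetch.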

[If you decide to enhance pynfs to send a NOTIFY_DEVICEID4_CHANGE as an admin
 tool, that would be interesting.]

Cheers
Boaz


* Re: blacklisted DS with pnfs
From: Boaz Harrosh @ 2012-01-25 12:46 UTC
  To: tigran.mkrtchyan; +Cc: linux-nfs

If you want to see for yourself, look at:
 callback_proc.c::nfs4_callback_devicenotify()

Boaz



* Re: blacklisted DS with pnfs
From: Tigran Mkrtchyan @ 2012-01-25 13:17 UTC
  To: Boaz Harrosh; +Cc: linux-nfs

Thanks Boaz, I will check.

I believe the RHEL6 kernel does not support device notification.
For now, we have just generated a new device ID.
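
A rough Python sketch of why handing out a new device ID can work around the
problem, assuming the client's blacklist is keyed by device ID. MetadataServer,
renew_device_id and the address ds1.example.org are hypothetical and only
illustrate the idea; they do not describe any real server implementation.

```python
import itertools


class MetadataServer:
    """Toy server that can re-issue a data server under a new device ID."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.device_for_ds = {}  # data-server address -> current device ID

    def register(self, ds_address):
        self.device_for_ds[ds_address] = next(self._next_id)
        return self.device_for_ds[ds_address]

    def renew_device_id(self, ds_address):
        # Same data server, new ID: the client has no cache entry
        # (good or bad) for the new ID, so it has to fetch it fresh.
        return self.register(ds_address)

    def layout_for(self, ds_address):
        # Layouts always refer to the data server's current device ID.
        return self.device_for_ds[ds_address]


mds = MetadataServer()
old_id = mds.register("ds1.example.org")
client_blacklist = {old_id}  # the client marked the old ID out of use

new_id = mds.renew_device_id("ds1.example.org")
usable = mds.layout_for("ds1.example.org") not in client_blacklist
print(old_id, new_id, usable)
```

New layouts carry the new ID, so the stale blacklisted entry on the client is
simply never consulted again.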

Tigran.



* Re: blacklisted DS with pnfs
From: Boaz Harrosh @ 2012-01-25 13:29 UTC
  To: tigran.mkrtchyan; +Cc: linux-nfs

On 01/25/2012 03:17 PM, Tigran Mkrtchyan wrote:
> Thanks Boaz, I will check.
> 
> I believe the RHEL6 kernel does not support device notification.
> For now, we have just generated a new device ID.
> 

OK. That code is pretty old. How much of the pNFS code was ported?
Which pNFS kernel version was the port based on?

Generating new IDs is smart too.

Thanks
Boaz


