[lustre-devel] Possible minor bug


* [lustre-devel] Possible minor bug
@ 2015-05-27 20:03 Patrick Farrell
  0 siblings, 0 replies; 3+ messages in thread
From: Patrick Farrell @ 2015-05-27 20:03 UTC (permalink / raw)
  To: lustre-devel

While doing some other work, I noticed something I believe is a potential problem in the server side quota code.

Specifically, in qmt_glimpse_lock:

While the resource lock (a spin lock) is held, it does

OBD_ALLOC_PTR(work);

Since allocations can sleep, doesn't this allocation need to be atomic?

So, following the current Lustre convention, it should be:
LIBCFS_ALLOC_ATOMIC(work, sizeof(struct ldlm_glimpse_work));

I have seen no actual bugs from this, but I hit a hang while modifying the equivalent code in ofd_intent_policy for lock ahead, and I think the same hang is theoretically possible here.  My understanding is that, in general, doing allocations while holding a spin lock is not recommended.

I'm hoping for other input before I go further - Am I right that this is something which needs fixing?  If so, I'll open an LU and submit a patch.

Thanks,
- Patrick

* [lustre-devel] Possible minor bug
@ 2015-05-27 20:55 Dilger, Andreas
  2015-05-27 21:15 ` Patrick Farrell
  0 siblings, 1 reply; 3+ messages in thread
From: Dilger, Andreas @ 2015-05-27 20:55 UTC (permalink / raw)
  To: lustre-devel

On 2015/05/27, 2:03 PM, "Patrick Farrell" <paf at cray.com> wrote:

> While doing some other work, I noticed something I believe is a potential problem in the server side quota code.
>
> Specifically, in qmt_glimpse_lock:
>
> While the resource lock (a spin lock) is held, it does
>
> OBD_ALLOC_PTR(work);
>
> Since allocations can sleep, doesn't this allocation need to be atomic?
>
> So, following the current Lustre convention, it should be:
> LIBCFS_ALLOC_ATOMIC(work, sizeof(struct ldlm_glimpse_work));

You could also use OBD_ALLOC_GFP(work, sizeof(*work), GFP_ATOMIC), which is equivalent, but at least keeps the same paradigm of other allocations in this code.

> I have seen no actual bugs from this, but I hit a hang while modifying the equivalent code in ofd_intent_policy for lock ahead, and I think the same hang is theoretically possible here.  My understanding is that, in general, doing allocations while holding a spin lock is not recommended.

Not only not recommended, but not allowed for GFP_NOFS allocations.  Probably if you had CONFIG_DEBUG_SPINLOCK or similar enabled, you would get a warning here due to __might_sleep() in the allocation path.
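The rule described above can be illustrated with a small userspace model. This is only a sketch: every name in it (model_spin_lock, model_alloc, MODEL_GFP_MAY_SLEEP, MODEL_GFP_ATOMIC, struct model_work) is a hypothetical stand-in invented for the illustration, not a kernel or Lustre API, and the "warning" merely imitates the kind of check a debug kernel performs when a sleeping allocation is attempted in atomic context.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel concepts; NOT the real kernel API. */
enum model_gfp {
	MODEL_GFP_MAY_SLEEP,	/* models a flag set that allows sleeping */
	MODEL_GFP_ATOMIC	/* models a flag set that never sleeps */
};

static int atomic_depth;	/* > 0 while a modelled spin lock is held */
static int sleep_warnings;	/* count of "may sleep in atomic context" reports */

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);	/* unlock must pair with a lock */
	atomic_depth--;
}

/* Allocate zeroed memory; report a sleeping request made in atomic context. */
static void *model_alloc(size_t size, enum model_gfp flags)
{
	if (flags == MODEL_GFP_MAY_SLEEP && atomic_depth > 0) {
		sleep_warnings++;
		fprintf(stderr, "warning: sleeping allocation in atomic context\n");
	}
	return calloc(1, size);
}

/* Placeholder for the work item allocated under the lock. */
struct model_work {
	int placeholder;
};

/* Allocate one work item under the modelled lock; return new warnings. */
static int alloc_under_lock(enum model_gfp flags)
{
	int before = sleep_warnings;
	struct model_work *work;

	model_spin_lock();
	work = model_alloc(sizeof(*work), flags);
	model_spin_unlock();

	free(work);
	return sleep_warnings - before;
}
```

In this model, alloc_under_lock(MODEL_GFP_MAY_SLEEP) reports one warning and alloc_under_lock(MODEL_GFP_ATOMIC) reports none, which mirrors why the allocation made under the resource lock needs a non-sleeping flag.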

> I'm hoping for other input before I go further - Am I right that this is something which needs fixing?  If so, I'll open an LU and submit a patch.

Yes, please do.

Cheers, Andreas


* [lustre-devel] Possible minor bug
  2015-05-27 20:55 [lustre-devel] Possible minor bug Dilger, Andreas
@ 2015-05-27 21:15 ` Patrick Farrell
  0 siblings, 0 replies; 3+ messages in thread
From: Patrick Farrell @ 2015-05-27 21:15 UTC (permalink / raw)
  To: lustre-devel

Ok, thanks.

Here's the LU:
https://jira.hpdd.intel.com/browse/LU-6656
