* [lustre-devel] Possible minor bug
From: Patrick Farrell @ 2015-05-27 20:03 UTC (permalink / raw)
To: lustre-devel
While doing some other work, I noticed something I believe is a potential problem in the server side quota code.
Specifically, in qmt_glimpse_lock:
While the resource lock (a spin lock) is held, it does
OBD_ALLOC_PTR(work);
Since allocations can sleep, doesn't this allocation need to be atomic?
So, following the current Lustre convention, it should be:
LIBCFS_ALLOC_ATOMIC(work, sizeof(struct ldlm_glimpse_work));
I have seen no actual bugs from this, but I hit a hang while modifying the equivalent code in ofd_intent_policy for lock ahead, and I think the same hang is theoretically possible here. My understanding is that, in general, doing allocations while holding a spin lock is not recommended.
I'm hoping for other input before I go further - Am I right that this is something which needs fixing? If so, I'll open an LU and submit a patch.
Thanks,
- Patrick
* [lustre-devel] Possible minor bug
From: Dilger, Andreas @ 2015-05-27 20:55 UTC (permalink / raw)
To: lustre-devel
On 2015/05/27, 2:03 PM, "Patrick Farrell" <paf at cray.com> wrote:
> While doing some other work, I noticed something I believe is a potential problem in the server side quota code.
> Specifically, in qmt_glimpse_lock:
> While the resource lock (a spin lock) is held, it does
> OBD_ALLOC_PTR(work);
> Since allocations can sleep, doesn't this allocation need to be atomic?
> So, following the current Lustre convention, it should be:
> LIBCFS_ALLOC_ATOMIC(work, sizeof(struct ldlm_glimpse_work));
You could also use OBD_ALLOC_GFP(work, sizeof(*work), GFP_ATOMIC), which is equivalent, but at least keeps the same paradigm of other allocations in this code.
> I have seen no actual bugs from this, but I hit a hang while modifying the equivalent code in ofd_intent_policy for lock ahead, and I think the same hang is theoretically possible here. My understanding is that, in general, doing allocations while holding a spin lock is not recommended.
Not only not recommended, but not allowed for GFP_NOFS allocations. Probably if you had CONFIG_DEBUG_SPINLOCK or similar enabled, you would get a warning here due to __might_sleep() in the allocation path.
> I'm hoping for other input before I go further - Am I right that this is something which needs fixing? If so, I'll open an LU and submit a patch.
Yes, please do.
Cheers, Andreas
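For readers following along, here is a rough user-space sketch of the problem discussed in this message. It is not Lustre or kernel code: every name prefixed with sketch_ is hypothetical, the "spin lock" is just a counter, and the allocator refuses a may-sleep request under the lock only to make the mistake observable. As far as I understand it, the real kernel would not refuse; it could sleep, and would only warn with the relevant debug option enabled.

```c
/* User-space illustration only. None of these names are Lustre or
 * Linux kernel APIs; they are stand-ins invented for this sketch. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for "may sleep" and "atomic" allocation flags. */
enum sketch_gfp { SKETCH_GFP_MAY_SLEEP, SKETCH_GFP_ATOMIC };

/* Number of sketch spin locks currently held by the caller. */
static int sketch_atomic_depth;
/* Number of may-sleep allocations attempted while a lock was held. */
static int sketch_violations;

static void sketch_spin_lock(void)
{
	sketch_atomic_depth++;
}

static void sketch_spin_unlock(void)
{
	assert(sketch_atomic_depth > 0);
	sketch_atomic_depth--;
}

/* Mimics the idea behind a might-sleep check: a may-sleep allocation
 * under a spin lock is counted as a violation. Refusing the request
 * is an assumption of this sketch, chosen to make the bug visible. */
static void *sketch_alloc(size_t size, enum sketch_gfp flags)
{
	if (flags == SKETCH_GFP_MAY_SLEEP && sketch_atomic_depth > 0) {
		sketch_violations++;
		return NULL;
	}
	return calloc(1, size);
}

/* Placeholder for the work item; the real structure is not shown here. */
struct sketch_work {
	int placeholder;
};

/* Shape of the reported pattern: may-sleep allocation under the lock. */
static bool sketch_glimpse_buggy(void)
{
	struct sketch_work *work;

	sketch_spin_lock();
	work = sketch_alloc(sizeof(*work), SKETCH_GFP_MAY_SLEEP);
	sketch_spin_unlock();

	if (work == NULL)
		return false;
	free(work);
	return true;
}

/* Shape of the suggested fix: atomic allocation under the lock. */
static bool sketch_glimpse_fixed(void)
{
	struct sketch_work *work;

	sketch_spin_lock();
	work = sketch_alloc(sizeof(*work), SKETCH_GFP_ATOMIC);
	sketch_spin_unlock();

	if (work == NULL)
		return false;
	free(work);
	return true;
}
```

Calling sketch_glimpse_buggy() records one violation and fails, while sketch_glimpse_fixed() succeeds without recording any. An atomic allocation can still fail under memory pressure, so real callers would need to handle a NULL result either way.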
* [lustre-devel] Possible minor bug
From: Patrick Farrell @ 2015-05-27 21:15 UTC (permalink / raw)
To: lustre-devel
Ok, thanks.
Here's the LU:
https://jira.hpdd.intel.com/browse/LU-6656