* [bug report] shmem: quota support
@ 2023-08-02 6:53 ` Dan Carpenter
2023-08-02 14:22 ` Carlos Maiolino
0 siblings, 1 reply; 5+ messages in thread
From: Dan Carpenter @ 2023-08-02 6:53 UTC (permalink / raw)
To: cem; +Cc: linux-fsdevel
Hello Carlos Maiolino,
The patch 9a9f8f590f6d: "shmem: quota support" from Jul 25, 2023
(linux-next), leads to the following Smatch static checker warning:
fs/quota/dquot.c:1271 flush_warnings()
warn: sleeping in atomic context
fs/quota/dquot.c
1261 static void flush_warnings(struct dquot_warn *warn)
1262 {
1263 int i;
1264
1265 for (i = 0; i < MAXQUOTAS; i++) {
1266 if (warn[i].w_type == QUOTA_NL_NOWARN)
1267 continue;
1268 #ifdef CONFIG_PRINT_QUOTA_WARNING
1269 print_warning(&warn[i]);
1270 #endif
--> 1271 quota_send_warning(warn[i].w_dq_id,
1272 warn[i].w_sb->s_dev, warn[i].w_type);
The quota_send_warning() function does GFP_NOFS allocations, which don't
touch the fs but can still sleep. GFP_ATOMIC or GFP_NOWAIT don't sleep.
1273 }
1274 }
The call trees that Smatch is worried about are listed. The "disables
preempt" functions take the spin_lock_irq(&info->lock) before calling
shmem_recalc_inode().
shmem_charge() <- disables preempt
shmem_uncharge() <- disables preempt
shmem_undo_range() <- disables preempt
shmem_getattr() <- disables preempt
shmem_writepage() <- disables preempt
shmem_set_folio_swapin_error() <- disables preempt
shmem_swapin_folio() <- disables preempt
shmem_get_folio_gfp() <- disables preempt
-> shmem_recalc_inode()
-> shmem_inode_unacct_blocks()
-> dquot_free_block_nodirty()
-> dquot_free_space_nodirty()
-> __dquot_free_space()
-> flush_warnings()
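Condensed, the pattern being flagged is a potentially sleeping allocation
made while a spinlock is held (an illustrative kernel-style sketch, not
verbatim source):

```c
/* Illustrative sketch of the flagged path, not verbatim kernel source. */
spin_lock_irq(&info->lock);		/* atomic context begins */
shmem_recalc_inode(inode);
/*  -> shmem_inode_unacct_blocks()
 *   -> dquot_free_block_nodirty() -> ... -> flush_warnings()
 *    -> quota_send_warning() -> genlmsg_new(..., GFP_NOFS)  <-- may sleep
 */
spin_unlock_irq(&info->lock);
```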
regards,
dan carpenter
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [bug report] shmem: quota support
2023-08-02 6:53 ` [bug report] shmem: quota support Dan Carpenter
@ 2023-08-02 14:22 ` Carlos Maiolino
2023-08-03 0:00 ` Hugh Dickins
0 siblings, 1 reply; 5+ messages in thread
From: Carlos Maiolino @ 2023-08-02 14:22 UTC (permalink / raw)
To: Dan Carpenter; +Cc: linux-fsdevel, jack, brauner, hughd
On Wed, Aug 02, 2023 at 09:53:54AM +0300, Dan Carpenter wrote:
> Hello Carlos Maiolino,
>
> The patch 9a9f8f590f6d: "shmem: quota support" from Jul 25, 2023
> (linux-next), leads to the following Smatch static checker warning:
>
> fs/quota/dquot.c:1271 flush_warnings()
> warn: sleeping in atomic context
>
Thanks for the report Dan!
> fs/quota/dquot.c
> 1261 static void flush_warnings(struct dquot_warn *warn)
> 1262 {
> 1263 int i;
> 1264
> 1265 for (i = 0; i < MAXQUOTAS; i++) {
> 1266 if (warn[i].w_type == QUOTA_NL_NOWARN)
> 1267 continue;
> 1268 #ifdef CONFIG_PRINT_QUOTA_WARNING
> 1269 print_warning(&warn[i]);
> 1270 #endif
> --> 1271 quota_send_warning(warn[i].w_dq_id,
> 1272 warn[i].w_sb->s_dev, warn[i].w_type);
>
> The quota_send_warning() function does GFP_NOFS allocations, which don't
> touch the fs but can still sleep. GFP_ATOMIC or GFP_NOWAIT don't sleep.
>
Hmm, tbh I think the simplest way to fix it is indeed to change GFP_NOFS to
GFP_NOWAIT when calling genlmsg_new(). quota_send_warning() already abstains
from passing any error back to its caller, and I don't think moving from
GFP_NOFS -> GFP_NOWAIT will have much impact here, as the amount of memory
required is not that big and the allocation wouldn't fail unless free memory
is really short. I might be wrong though.
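A sketch of that suggestion in fs/quota/netlink.c; the surrounding code is
paraphrased, and msg_size is computed earlier in the real function:

```c
/* fs/quota/netlink.c, paraphrased sketch of the suggested change. */
skb = genlmsg_new(msg_size, GFP_NOWAIT);	/* was GFP_NOFS; never sleeps */
if (!skb)
	return;	/* quota_send_warning() returns void: warning is just dropped */
/* ... build the netlink message ... */
genlmsg_multicast(&quota_genl_family, skb, 0, 0, GFP_NOWAIT);	/* was GFP_NOFS */
```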
If not that, another option would be to swap tmpfs spinlocks for mutexes, but I
would rather avoid that.
CC'ing other folks for more suggestions.
Carlos
> 1273 }
> 1274 }
>
> The call trees that Smatch is worried about are listed. The "disables
> preempt" functions take the spin_lock_irq(&info->lock) before calling
> shmem_recalc_inode().
>
> shmem_charge() <- disables preempt
> shmem_uncharge() <- disables preempt
> shmem_undo_range() <- disables preempt
> shmem_getattr() <- disables preempt
> shmem_writepage() <- disables preempt
> shmem_set_folio_swapin_error() <- disables preempt
> shmem_swapin_folio() <- disables preempt
> shmem_get_folio_gfp() <- disables preempt
> -> shmem_recalc_inode()
> -> shmem_inode_unacct_blocks()
> -> dquot_free_block_nodirty()
> -> dquot_free_space_nodirty()
> -> __dquot_free_space()
> -> flush_warnings()
Hm, I see, we added the dquot_free_block_nodirty() call to
shmem_inode_unacct_blocks() here, which leads to this possible sleep inside
spin_lock_irq().
>
> regards,
> dan carpenter
* Re: [bug report] shmem: quota support
2023-08-02 14:22 ` Carlos Maiolino
@ 2023-08-03 0:00 ` Hugh Dickins
2023-08-03 11:10 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Hugh Dickins @ 2023-08-03 0:00 UTC (permalink / raw)
To: Carlos Maiolino
Cc: Dan Carpenter, Andrew Morton, linux-fsdevel, jack, brauner, hughd
On Wed, 2 Aug 2023, Carlos Maiolino wrote:
> On Wed, Aug 02, 2023 at 09:53:54AM +0300, Dan Carpenter wrote:
> > Hello Carlos Maiolino,
> >
> > The patch 9a9f8f590f6d: "shmem: quota support" from Jul 25, 2023
> > (linux-next), leads to the following Smatch static checker warning:
> >
> > fs/quota/dquot.c:1271 flush_warnings()
> > warn: sleeping in atomic context
> >
>
> Thanks for the report Dan!
>
> > fs/quota/dquot.c
> > 1261 static void flush_warnings(struct dquot_warn *warn)
> > 1262 {
> > 1263 int i;
> > 1264
> > 1265 for (i = 0; i < MAXQUOTAS; i++) {
> > 1266 if (warn[i].w_type == QUOTA_NL_NOWARN)
> > 1267 continue;
> > 1268 #ifdef CONFIG_PRINT_QUOTA_WARNING
> > 1269 print_warning(&warn[i]);
> > 1270 #endif
> > --> 1271 quota_send_warning(warn[i].w_dq_id,
> > 1272 warn[i].w_sb->s_dev, warn[i].w_type);
> >
> > The quota_send_warning() function does GFP_NOFS allocations, which don't
> > touch the fs but can still sleep. GFP_ATOMIC or GFP_NOWAIT don't sleep.
> >
>
> Hmm, tbh I think the simplest way to fix it is indeed to change GFP_NOFS to
> GFP_NOWAIT when calling genlmsg_new(). quota_send_warning() already abstains
> from passing any error back to its caller, and I don't think moving from
> GFP_NOFS -> GFP_NOWAIT will have much impact here, as the amount of memory
> required is not that big and the allocation wouldn't fail unless free memory
> is really short. I might be wrong though.
>
> If not that, another option would be to swap tmpfs spinlocks for mutexes, but I
> would rather avoid that.
>
> CC'ing other folks for more suggestions.
This is certainly a problem, for both dquot_alloc and dquot_free paths.
Thank you, Dan, for catching it.
GFP_NOWAIT is an invitation to flakiness: I don't think it's right to
regress existing quota users by changing GFP_NOFS to GFP_NOWAIT in all
cases there; but it does seem a sensible stopgap for the new experimental
user tmpfs.
I think the thing to do, for now, is to add a flag (DQUOT_SPACE_WARN_NOWAIT?)
which gets passed down to the __dquot_alloc and __dquot_free for tmpfs,
and those choose GFP_NOFS or GFP_NOWAIT accordingly, and pass that gfp_t
on down to flush_warnings() to quota_send_warning() to genlmsg_new() and
genlmsg_multicast(). Carlos, if you agree, please try that.
I have no experience with netlink whatsoever: I hope that will be enough
to stop it from blocking.
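Roughly, that plumbing might take the following shape (the new flag's name
and value are hypothetical, proposed above rather than existing kernel API):

```c
/* include/linux/quotaops.h -- DQUOT_SPACE_WARN exists today; the NOWAIT
 * flag is the hypothetical addition being proposed here. */
#define DQUOT_SPACE_WARN	0x1
#define DQUOT_SPACE_WARN_NOWAIT	0x8	/* proposed: warn without sleeping */

/* fs/quota/dquot.c -- __dquot_alloc_space()/__dquot_free_space() would
 * pick the gfp mode from the flag and thread it down the warning path: */
gfp_t gfp = (flags & DQUOT_SPACE_WARN_NOWAIT) ? GFP_NOWAIT : GFP_NOFS;

flush_warnings(warn, gfp);	/* -> quota_send_warning(..., gfp)
				 *     -> genlmsg_new(..., gfp)
				 *     -> genlmsg_multicast(..., gfp) */
```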
I did toy with the idea of passing back the dquot_warn, and letting the
caller do the flush_warnings() at a more suitable moment; and that might
work out, but I suspect that the rearrangement involved would be better
directed to just rearranging where mm/shmem.c makes its dquot_alloc and
dquot_free calls.
And that's something I shall probably want to do, but not rush into.
There's an existing problem, for which I do have the pre-quotas fix,
of concurrent faulters in a size-limited tmpfs getting failures when
they try to allocate the last block (worse when huge page). Respecting
all the different layers of limiting is awkward, now quotas add another.
Ordinarily, with this blocking issue coming up now, I'd have asked to
back the tmpfs quotas series out of the next tree, and you rework where
the dquot_alloc and dquot_free are done, then bring the series back in
the next cycle. But with it being managed from vfs.git for this cycle,
and strong preference to return to mm.git next cycle, let's try for a
workaround now, then maybe I can do the rearrangement in mm/shmem.c
next cycle - it is one of the things I was hoping to do then (but
"hope" is not much of a guarantee).
info->mutex instead of info->lock: no thanks.
Hugh
>
> Carlos
>
> > 1273 }
> > 1274 }
> >
> > The call trees that Smatch is worried about are listed. The "disables
> > preempt" functions take the spin_lock_irq(&info->lock) before calling
> > shmem_recalc_inode().
> >
> > shmem_charge() <- disables preempt
> > shmem_uncharge() <- disables preempt
> > shmem_undo_range() <- disables preempt
> > shmem_getattr() <- disables preempt
> > shmem_writepage() <- disables preempt
> > shmem_set_folio_swapin_error() <- disables preempt
> > shmem_swapin_folio() <- disables preempt
> > shmem_get_folio_gfp() <- disables preempt
> > -> shmem_recalc_inode()
> > -> shmem_inode_unacct_blocks()
> > -> dquot_free_block_nodirty()
> > -> dquot_free_space_nodirty()
> > -> __dquot_free_space()
> > -> flush_warnings()
>
> Hm, I see, we added the dquot_free_block_nodirty() call to
> shmem_inode_unacct_blocks() here, which leads to this possible sleep inside
> spin_lock_irq().
> >
> > regards,
> > dan carpenter
* Re: [bug report] shmem: quota support
2023-08-03 0:00 ` Hugh Dickins
@ 2023-08-03 11:10 ` Jan Kara
2023-08-03 16:53 ` Hugh Dickins
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2023-08-03 11:10 UTC (permalink / raw)
To: Hugh Dickins
Cc: Carlos Maiolino, Dan Carpenter, Andrew Morton, linux-fsdevel,
jack, brauner
On Wed 02-08-23 17:00:49, Hugh Dickins wrote:
> On Wed, 2 Aug 2023, Carlos Maiolino wrote:
> > On Wed, Aug 02, 2023 at 09:53:54AM +0300, Dan Carpenter wrote:
> > > Hello Carlos Maiolino,
> > >
> > > The patch 9a9f8f590f6d: "shmem: quota support" from Jul 25, 2023
> > > (linux-next), leads to the following Smatch static checker warning:
> > >
> > > fs/quota/dquot.c:1271 flush_warnings()
> > > warn: sleeping in atomic context
> > >
> >
> > Thanks for the report Dan!
> >
> > > fs/quota/dquot.c
> > > 1261 static void flush_warnings(struct dquot_warn *warn)
> > > 1262 {
> > > 1263 int i;
> > > 1264
> > > 1265 for (i = 0; i < MAXQUOTAS; i++) {
> > > 1266 if (warn[i].w_type == QUOTA_NL_NOWARN)
> > > 1267 continue;
> > > 1268 #ifdef CONFIG_PRINT_QUOTA_WARNING
> > > 1269 print_warning(&warn[i]);
> > > 1270 #endif
> > > --> 1271 quota_send_warning(warn[i].w_dq_id,
> > > 1272 warn[i].w_sb->s_dev, warn[i].w_type);
> > >
> > > The quota_send_warning() function does GFP_NOFS allocations, which don't
> > > touch the fs but can still sleep. GFP_ATOMIC or GFP_NOWAIT don't sleep.
> > >
> >
> > Hmm, tbh I think the simplest way to fix it is indeed to change GFP_NOFS to
> > GFP_NOWAIT when calling genlmsg_new(). quota_send_warning() already abstains
> > from passing any error back to its caller, and I don't think moving from
> > GFP_NOFS -> GFP_NOWAIT will have much impact here, as the amount of memory
> > required is not that big and the allocation wouldn't fail unless free memory
> > is really short. I might be wrong though.
> >
> > If not that, another option would be to swap tmpfs spinlocks for mutexes, but I
> > would rather avoid that.
> >
> > CC'ing other folks for more suggestions.
>
> This is certainly a problem, for both dquot_alloc and dquot_free paths.
> Thank you, Dan, for catching it.
>
> GFP_NOWAIT is an invitation to flakiness: I don't think it's right to
> regress existing quota users by changing GFP_NOFS to GFP_NOWAIT in all
> cases there; but it does seem a sensible stopgap for the new experimental
> user tmpfs.
So passing a gfp argument to quota_send_warning() and propagating the
blocking info through __dquot_alloc_space() and __dquot_free_space() flags
would be OK for me, *but* if CONFIG_PRINT_QUOTA_WARNING is set (which is
deprecated but still exists), we end up calling tty_write_message() out of
flush_warnings(), and that can definitely block.
So if we are looking for an unintrusive stopgap solution, maybe tmpfs can
just tell the quota code not to issue warnings at all, by using
__dquot_alloc_space() without the DQUOT_SPACE_WARN flag and adding support
for this flag to __dquot_free_space()? The feature is not used much AFAIK
anyway. And once we move the dquot calls into places where they can sleep,
we can re-enable the warning support.
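A sketch of that stopgap; the flag handling mirrors fs/quota/dquot.c, but
the exact tmpfs call sites are paraphrased:

```c
/* alloc side: tmpfs would call __dquot_alloc_space() with flags that omit
 * DQUOT_SPACE_WARN, so flush_warnings() finds nothing to emit.
 * (dquot_alloc_block_nodirty() passes DQUOT_SPACE_WARN today.) */
ret = __dquot_alloc_space(inode, pages << PAGE_SHIFT, 0);

/* free side: __dquot_free_space() currently queues warnings regardless of
 * flags, so it would first need to learn to honour the flag, e.g.: */
if ((flags & DQUOT_SPACE_WARN) && wtype != QUOTA_NL_NOWARN)
	prepare_warning(&warn[cnt], dquot, wtype);
```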
> I think the thing to do, for now, is to add a flag (DQUOT_SPACE_WARN_NOWAIT?)
> which gets passed down to the __dquot_alloc and __dquot_free for tmpfs,
> and those choose GFP_NOFS or GFP_NOWAIT accordingly, and pass that gfp_t
> on down to flush_warnings() to quota_send_warning() to genlmsg_new() and
> genlmsg_multicast(). Carlos, if you agree, please try that.
>
> I have no experience with netlink whatsoever: I hope that will be enough
> to stop it from blocking.
Yes, if you pass non-blocking gfp mode to netlink code, it takes care not
to block when allocating and sending the message.
> I did toy with the idea of passing back the dquot_warn, and letting the
> caller do the flush_warnings() at a more suitable moment; and that might
> work out, but I suspect that the rearrangement involved would be better
> > directed to just rearranging where mm/shmem.c makes its dquot_alloc and
> dquot_free calls.
Yeah, frankly I think this is the best fix. AFAIU the problem is only with
shmem_recalc_inode() getting called under info->lock, which looks manageable
as far as I can see from the call sites, and relatively easy wrt quotas, as
freeing of quota space cannot fail. At least all shmem_inode_acct_blocks()
calls seem to be in places where they can sleep.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [bug report] shmem: quota support
2023-08-03 11:10 ` Jan Kara
@ 2023-08-03 16:53 ` Hugh Dickins
0 siblings, 0 replies; 5+ messages in thread
From: Hugh Dickins @ 2023-08-03 16:53 UTC (permalink / raw)
To: Jan Kara
Cc: Hugh Dickins, Carlos Maiolino, Dan Carpenter, Andrew Morton,
linux-fsdevel, brauner
Thanks for very helpful reply, Jan: it changes everything, at the bottom.
On Thu, 3 Aug 2023, Jan Kara wrote:
> On Wed 02-08-23 17:00:49, Hugh Dickins wrote:
> >
> > This is certainly a problem, for both dquot_alloc and dquot_free paths.
> > Thank you, Dan, for catching it.
> >
> > GFP_NOWAIT is an invitation to flakiness: I don't think it's right to
> > regress existing quota users by changing GFP_NOFS to GFP_NOWAIT in all
> > cases there; but it does seem a sensible stopgap for the new experimental
> > user tmpfs.
>
> So passing a gfp argument to quota_send_warning() and propagating the
> blocking info through __dquot_alloc_space() and __dquot_free_space() flags
> would be OK for me, *but* if CONFIG_PRINT_QUOTA_WARNING is set (which is
> deprecated but still exists), we end up calling tty_write_message() out of
> flush_warnings(), and that can definitely block.
Oh yes :(
>
> So if we are looking for an unintrusive stopgap solution, maybe tmpfs can
> just tell the quota code not to issue warnings at all, by using
> __dquot_alloc_space() without the DQUOT_SPACE_WARN flag and adding support
> for this flag to __dquot_free_space()? The feature is not used much AFAIK
> anyway. And once we move the dquot calls into places where they can sleep,
> we can re-enable the warning support.
If the warning feature is not used very much at all (I did not realize
that), then certainly this would be a better way to go for now, than
the inadequate and extra DQUOT_SPACE_WARN_NOWAIT I was suggesting.
>
> > I think the thing to do, for now, is to add a flag (DQUOT_SPACE_WARN_NOWAIT?)
> > which gets passed down to the __dquot_alloc and __dquot_free for tmpfs,
> > and those choose GFP_NOFS or GFP_NOWAIT accordingly, and pass that gfp_t
> > on down to flush_warnings() to quota_send_warning() to genlmsg_new() and
> > genlmsg_multicast(). Carlos, if you agree, please try that.
Carlos, sorry, please don't waste your time on DQUOT_SPACE_WARN_NOWAIT
or no-DQUOT_SPACE_WARN.
> >
> > I have no experience with netlink whatsoever: I hope that will be enough
> > to stop it from blocking.
>
> Yes, if you pass non-blocking gfp mode to netlink code, it takes care not
> to block when allocating and sending the message.
Useful info, thanks.
>
> > I did toy with the idea of passing back the dquot_warn, and letting the
> > caller do the flush_warnings() at a more suitable moment; and that might
> > work out, but I suspect that the rearrangement involved would be better
> > directed to just rearranging where mm/shmem.c makes its dquot_alloc and
> > dquot_free calls.
>
> Yeah, frankly I think this is the best fix. AFAIU the problem is only with
> shmem_recalc_inode() getting called under info->lock, which looks manageable
> as far as I can see from the call sites, and relatively easy wrt quotas, as
> freeing of quota space cannot fail. At least all shmem_inode_acct_blocks()
> calls seem to be in places where they can sleep.
Ah, I believe you're right, and that's great: I was living in the past,
when shmem_charge() was still calling shmem_inode_acct_block() under
info->lock.
I agree, the only problem appears to be that shmem_inode_unacct_blocks()
call which I had to place inside shmem_recalc_inode(): under info->lock,
so I was just perpetuating the problems - extending them even.
So the fix should not require any rearrangement of where the dquot_alloc
and dquot_free are done: I may want to do so later, when updating to fix
the failures of concurrent allocation of last block, but there's no need
to get into any such rearrangement as part of this fix.
We just want shmem_recalc_inode() to take the info->lock itself, do its
adjustments and balancing, release the lock and dquot_free the excess.
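In outline, the reworked helper might look like this (a sketch of the
intended shape, not the final patch; the accounting fields are simplified):

```c
/* mm/shmem.c -- sketch: shmem_recalc_inode() takes info->lock itself,
 * does its balancing, and defers the possibly-sleeping dquot free until
 * after the lock is dropped.  Callers then stop wrapping the call in
 * spin_lock_irq(&info->lock). */
static void shmem_recalc_inode(struct inode *inode)
{
	struct shmem_inode_info *info = SHMEM_I(inode);
	long freed;

	spin_lock(&info->lock);
	freed = info->alloced - info->swapped -
		READ_ONCE(inode->i_mapping->nrpages);
	if (freed > 0)
		info->alloced -= freed;
	spin_unlock(&info->lock);

	if (freed > 0)		/* lock dropped: sleeping is fine here */
		shmem_inode_unacct_blocks(inode, freed);
}
```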
I had a quick look through that, most places look straightforward to
update, but there are a couple where I need to think a bit first.
So no patch in this mail, but I'll get back to it in a few hours.
Hugh