From: Tycho Andersen <tycho@tycho.pizza>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: cgroups@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Christian Brauner <brauner@kernel.org>, Tejun Heo <tj@kernel.org>,
Zefan Li <lizefan.x@bytedance.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Haitao Huang <haitao.huang@linux.intel.com>,
Kamalesh Babulal <kamalesh.babulal@oracle.com>,
Tycho Andersen <tandersen@netflix.com>
Subject: Re: [RFC 4/6] misc cgroup: introduce an fd counter
Date: Wed, 8 Nov 2023 14:01:40 -0700 [thread overview]
Message-ID: <ZUv3NCZjRn5zfytj@tycho.pizza> (raw)
In-Reply-To: <20231108165749.GY1957730@ZenIV>
Hi Al,
Thanks for looking. Somehow I also missed CCing you, whoops,
On Wed, Nov 08, 2023 at 04:57:49PM +0000, Al Viro wrote:
> On Tue, Nov 07, 2023 at 05:26:45PM -0700, Tycho Andersen wrote:
>
> > + if (!charge_current_fds(newf, count_open_files(new_fdt)))
> > + return newf;
>
> Are you sure that on configs that are not cgroup-infested compiler
> will figure out that count_open_files() would have no side effects
> and doesn't need to be evaluated?
>
> Incidentally, since you are adding your charge/uncharge stuff on each
> allocation/freeing, why not simply maintain an accurate counter, cgroup or
> no cgroup? IDGI... Make it an inlined helper right there in fs/file.c,
> doing increment/decrement and, conditional upon config, calling
> the cgroup side of things. No need to look at fdt, etc. outside
> of fs/file.c either - the counter can be picked right from the
> files_struct...
Thanks, I can re-work it to look like this.
> > static void __put_unused_fd(struct files_struct *files, unsigned int fd)
> > {
> > struct fdtable *fdt = files_fdtable(files);
> > + if (test_bit(fd, fdt->open_fds))
> > + uncharge_current_fds(files, 1);
>
> Umm... Just where do we call it without the bit in ->open_fds set?
> Any such caller would be a serious bug; suppose you are trying to
> call __put_unused_fd(files, N) while N is not in open_fds. Just before
> your call another thread grabs a descriptor and picks N. Resulting
> state won't be pretty, especially if right *after* your call the
> third thread also asks for a descriptor - and also gets N.
>
> Sure, you have an exclusion on ->file_lock, but AFAICS all callers
> are under it and in all callers except for put_unused_fd() we
> have just observed a non-NULL file reference in ->fd[N]; that
> would *definitely* be a hard constraint violation if it ever
> happened with N not in ->open_fds at that moment.
>
> So the only possibility would be a broken caller of put_unused_fd(),
> and any such would be a serious bug.
>
> Details, please - have you ever observed that?
No, I just kept it from the original series. I agree that it should be
safe to drop.
> BTW, what about the locking hierarchy? In the current tree ->files_lock
> nests inside of everything; what happens with your patches in place?
If I understand correctly you're asking about ->files_lock nesting
inside of task_lock()? I tried to make the cgroup side in this patch
do the same thing in the same order. Or am I misunderstanding?
I did test this with some production container traffic and didn't see
anything too strange, but no doubt there are complicated edge cases
here.
Thanks,
Tycho
next prev parent reply other threads:[~2023-11-08 21:01 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-08 0:26 [RFC 0/6] tracking fd counts per cgroup Tycho Andersen
2023-11-08 0:26 ` [RFC 1/6] fs: count_open_files() -> count_possible_open_files() Tycho Andersen
2023-11-08 0:26 ` [RFC 2/6] fs: introduce count_open_files() Tycho Andersen
2023-11-08 0:26 ` [RFC 3/6] misc: introduce misc_cg_charge() Tycho Andersen
2023-11-08 0:26 ` [RFC 4/6] misc cgroup: introduce an fd counter Tycho Andersen
2023-11-08 16:57 ` Al Viro
2023-11-08 21:01 ` Tycho Andersen [this message]
2023-11-09 9:53 ` Christian Brauner
2023-11-09 14:58 ` Tycho Andersen
2023-11-08 0:26 ` [RFC 5/6] selftests/cgroup: add a flags arg to clone_into_cgroup() Tycho Andersen
2023-11-08 0:26 ` [RFC 6/6] selftests/cgroup: add a test for misc cgroup Tycho Andersen
2023-11-09 18:44 ` [RFC 0/6] tracking fd counts per cgroup Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZUv3NCZjRn5zfytj@tycho.pizza \
--to=tycho@tycho.pizza \
--cc=brauner@kernel.org \
--cc=cgroups@vger.kernel.org \
--cc=haitao.huang@linux.intel.com \
--cc=hannes@cmpxchg.org \
--cc=kamalesh.babulal@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=lizefan.x@bytedance.com \
--cc=tandersen@netflix.com \
--cc=tj@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).