* Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode
[not found] <20231004165007.43d79161@gandalf.local.home>
@ 2023-11-17 14:23 ` Heiko Carstens
2023-11-17 14:38 ` Heiko Carstens
0 siblings, 1 reply; 6+ messages in thread
From: Heiko Carstens @ 2023-11-17 14:23 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, Linux Trace Kernel, Masami Hiramatsu, Mark Rutland,
Andrew Morton, Ajay Kaher, chinglinyu, lkp, namit, oe-lkp,
amakhalov, er.ajay.kaher, srivatsa, tkundu, vsirnapalli,
linux-s390
Hi Steven,
On Wed, Oct 04, 2023 at 04:50:07PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
>
> Instead of having a descriptor for every file represented in the eventfs
> directory, only have the directory itself represented. Change the API to
> send in a list of entries that represent all the files in the directory
> (but not other directories). The entry list contains a name and a callback
> function that will be used to create the files when they are accessed.
...
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Ajay Kaher <akaher@vmware.com>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changes since v4: https://lore.kernel.org/linux-trace-kernel/20231003184059.4924468e@gandalf.local.home/
>
> - Get the ei->dentry within the eventfs_mutex to keep consistency during the lookup.
>
> fs/tracefs/event_inode.c | 847 ++++++++++++++++++-----------------
> fs/tracefs/inode.c | 2 +-
> fs/tracefs/internal.h | 37 +-
> include/linux/trace_events.h | 2 +-
> include/linux/tracefs.h | 29 +-
> kernel/trace/trace.c | 7 +-
> kernel/trace/trace.h | 4 +-
> kernel/trace/trace_events.c | 313 +++++++++----
> 8 files changed, 705 insertions(+), 536 deletions(-)
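For readers following along, the entry-list idea from the quoted changelog (a name plus a callback that is only invoked when the file is accessed) might be modelled in userspace roughly as below. This is only a sketch: `struct dir_entry`, `entry_callback`, `lookup_entry()` and the two demo callbacks are hypothetical names made up for illustration, not the interface the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: fills in the mode for a file on access. */
typedef int (*entry_callback)(const char *name, unsigned int *mode);

/* Hypothetical per-directory entry: a file name and how to create it. */
struct dir_entry {
	const char *name;
	entry_callback callback;
};

int enable_cb(const char *name, unsigned int *mode)
{
	(void)name;
	*mode = 0644;
	return 1;	/* 1: the file should be created */
}

int format_cb(const char *name, unsigned int *mode)
{
	(void)name;
	*mode = 0444;
	return 1;
}

/* One list describes all files of a directory; no per-file descriptor. */
const struct dir_entry demo_entries[] = {
	{ "enable", enable_cb },
	{ "format", format_cb },
};

/*
 * Find 'name' in the list and run its callback only now, at access time.
 * Returns the callback's result, or 0 if no entry has that name.
 */
int lookup_entry(const struct dir_entry *entries, size_t nr,
		 const char *name, unsigned int *mode)
{
	for (size_t i = 0; i < nr; i++) {
		if (strcmp(entries[i].name, name) == 0)
			return entries[i].callback(name, mode);
	}
	return 0;
}
```

Calling `lookup_entry(demo_entries, 2, "enable", &mode)` runs `enable_cb()` and sets the mode, while an unknown name runs no callback at all.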
I think this patch occasionally causes crashes when running the ftrace
selftests. In particular, I suspect there is a bug in the error handling of
this function (see below for the call trace):
> +static struct dentry *
> +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> + struct dentry *parent, const char *name, umode_t mode, void *data,
> + const struct file_operations *fops, bool lookup)
> +{
> + struct dentry *dentry;
> + bool invalidate = false;
> +
> + mutex_lock(&eventfs_mutex);
> + /* If the e_dentry already has a dentry, use it */
> + if (*e_dentry) {
> + /* lookup does not need to up the ref count */
> + if (!lookup)
> + dget(*e_dentry);
> + mutex_unlock(&eventfs_mutex);
> + return *e_dentry;
> + }
> + mutex_unlock(&eventfs_mutex);
> +
> + /* The lookup already has the parent->d_inode locked */
> + if (!lookup)
> + inode_lock(parent->d_inode);
> +
> + dentry = create_file(name, mode, parent, data, fops);
> +
> + if (!lookup)
> + inode_unlock(parent->d_inode);
> +
> + mutex_lock(&eventfs_mutex);
> +
> + if (IS_ERR_OR_NULL(dentry)) {
> + /*
> + * When the mutex was released, something else could have
> + * created the dentry for this e_dentry. In which case
> + * use that one.
> + *
> + * Note, with the mutex held, the e_dentry cannot have content
> + * and the ei->is_freed be true at the same time.
> + */
> + WARN_ON_ONCE(ei->is_freed);
> + dentry = *e_dentry;
> + /* The lookup does not need to up the dentry refcount */
> + if (dentry && !lookup)
> + dget(dentry);
> + mutex_unlock(&eventfs_mutex);
> + return dentry;
> + }
> +
> + if (!*e_dentry && !ei->is_freed) {
> + *e_dentry = dentry;
> + dentry->d_fsdata = ei;
> + } else {
> + /*
> + * Should never happen unless we get here due to being freed.
> + * Otherwise it means two dentries exist with the same name.
> + */
> + WARN_ON_ONCE(!ei->is_freed);
> + invalidate = true;
> + }
> + mutex_unlock(&eventfs_mutex);
> +
> + if (invalidate)
> + d_invalidate(dentry);
> +
> + if (lookup || invalidate)
> + dput(dentry);
> +
> + return invalidate ? NULL : dentry;
> +}
We sometimes see crashes like this:
specification exception: 0006 ilc:2 [#1] SMP
CPU: 6 PID: 38815 Comm: ls Not tainted 6.7.0-20231116.rc1.git1.a7e756a5bb26.300.vr.fc38.s390x #1
Hardware name: IBM 3906 M04 704 (z/VM 7.1.0)
Krnl PSW : 0704c00180000000 000001682304bb00 (d_invalidate+0x30/0x110)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: ffffffffffffffff 000000e200000000 0000000000000047 000000e200000007
0000000000000000 ffffff7c197bf000 000000e2f13b0b20 000000e25bfae180
000000e2f2536000 ffffffffffffffef 0000000000000000 ffffffffffffffef
000003ff95cacf98 000000e2f29323f0 000000e827c1fa18 000000e827c1f9d0
Krnl Code: 000001682304baf4: a7180000 lhi %r1,0
000001682304baf8: 583003ac l %r3,940
#000001682304bafc: ba13b058 cs %r1,%r3,88(%r11)
>000001682304bb00: ec16006b007e cij %r1,0,6,000001682304bbd6
000001682304bb06: e310b0100002 ltg %r1,16(%r11)
000001682304bb0c: a784004e brc 8,000001682304bba8
000001682304bb10: b904002b lgr %r2,%r11
000001682304bb14: c0e5ffffe67e brasl %r14,0000016823048810
Call Trace:
[<000001682304bb00>] d_invalidate+0x30/0x110
[<000001682329147a>] create_dir_dentry+0xe2/0x200
[<000001682329190a>] dcache_dir_open_wrapper+0x102/0x3e8
[<000001682301fb8a>] do_dentry_open+0x24a/0x568
[<0000016823038836>] do_open+0x2de/0x448
[<000001682303cb58>] path_openat+0x110/0x2b0
[<000001682303d688>] do_filp_open+0x90/0x130
[<0000016823022960>] do_sys_openat2+0xa8/0xd8
[<0000016823022b50>] do_sys_open+0x58/0x90
[<00000168239c9edc>] __do_syscall+0x1d4/0x200
[<00000168239db1f8>] system_call+0x70/0x98
Last Breaking-Event-Address:
[<0000016823291474>] create_dir_dentry+0xdc/0x200
Kernel panic - not syncing: Fatal exception: panic_on_oops
Note that the compare and swap instruction within d_invalidate() generates
a specification exception because it operates on an invalid address
(0xffffffffffffffef), which happens to be -EEXIST. So my assumption is that
create_dir_dentry() has incorrect error handling and passes -EEXIST instead
of a valid dentry pointer to d_invalidate().
But I leave it up to you to figure this out :)
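To illustrate why 0xffffffffffffffef reads as -EEXIST: error codes stored in pointers are just small negative numbers cast to a pointer type. The following is a minimal userspace sketch of that encoding. `err_ptr()` and `is_err()` are stand-ins written for this example (modelled on the kernel's ERR_PTR()/IS_ERR() convention), and it assumes EEXIST is 17, as it is on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Largest errno value that is treated as an encoded error (assumption
 * taken from the kernel convention). */
#define MAX_ERRNO 4095

/* Stand-in for ERR_PTR(): store a negative errno in a pointer. */
void *err_ptr(intptr_t error)
{
	return (void *)error;
}

/* Stand-in for IS_ERR(): true if the pointer lies in the top 4095
 * addresses, i.e. it encodes a negative errno. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Print the bit pattern that -EEXIST has when stored in a pointer. */
void show_eexist_pointer(void)
{
	void *p = err_ptr(-EEXIST);

	printf("-EEXIST as pointer: 0x%" PRIxPTR " (is_err: %d)\n",
	       (uintptr_t)p, is_err(p));
}
```

On a 64-bit system with EEXIST == 17 the printed value is 0xffffffffffffffef, which is the address the compare and swap instruction faulted on.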
* Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode
2023-11-17 14:23 ` [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode Heiko Carstens
@ 2023-11-17 14:38 ` Heiko Carstens
2023-11-23 11:25 ` Heiko Carstens
0 siblings, 1 reply; 6+ messages in thread
From: Heiko Carstens @ 2023-11-17 14:38 UTC (permalink / raw)
To: Heiko Carstens
Cc: Steven Rostedt, LKML, Linux Trace Kernel, Masami Hiramatsu,
Mark Rutland, Andrew Morton, Ajay Kaher, chinglinyu, lkp, namit,
oe-lkp, amakhalov, er.ajay.kaher, srivatsa, tkundu, vsirnapalli,
linux-s390
On Fri, Nov 17, 2023 at 03:23:35PM +0100, Heiko Carstens wrote:
> I think this patch causes from time to time crashes when running ftrace
> selftests. In particular I guess there is a bug wrt error handling in this
> function (see below for call trace):
>
> > +static struct dentry *
> > +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> > + struct dentry *parent, const char *name, umode_t mode, void *data,
> > + const struct file_operations *fops, bool lookup)
> > +{
...
> Note that the compare and swap instruction within d_invalidate() generates
> a specification exception because it operates on an invalid address
> (0xffffffffffffffef), which happens to be -EEXIST. So my assumption is that
> create_dir_dentry() has incorrect error handling and passes -EEXIST instead
> of a valid dentry pointer to d_invalidate().
>
> But I leave it up to you to figure this out :)
Ok, wrong function quoted of course. But the rest of my statement
should be correct.
* Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode
2023-11-17 14:38 ` Heiko Carstens
@ 2023-11-23 11:25 ` Heiko Carstens
2023-11-23 12:34 ` Ajay Kaher
2023-11-23 15:23 ` Steven Rostedt
0 siblings, 2 replies; 6+ messages in thread
From: Heiko Carstens @ 2023-11-23 11:25 UTC (permalink / raw)
To: Heiko Carstens
Cc: Steven Rostedt, LKML, Linux Trace Kernel, Masami Hiramatsu,
Mark Rutland, Andrew Morton, Ajay Kaher, chinglinyu, lkp, namit,
oe-lkp, amakhalov, er.ajay.kaher, srivatsa, tkundu, vsirnapalli,
linux-s390
On Fri, Nov 17, 2023 at 03:38:29PM +0100, Heiko Carstens wrote:
> On Fri, Nov 17, 2023 at 03:23:35PM +0100, Heiko Carstens wrote:
> > I think this patch causes from time to time crashes when running ftrace
> > selftests. In particular I guess there is a bug wrt error handling in this
> > function (see below for call trace):
> >
> > > +static struct dentry *
> > > +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> > > + struct dentry *parent, const char *name, umode_t mode, void *data,
> > > + const struct file_operations *fops, bool lookup)
> > > +{
> ...
> > Note that the compare and swap instruction within d_invalidate() generates
> > a specification exception because it operates on an invalid address
> > (0xffffffffffffffef), which happens to be -EEXIST. So my assumption is that
> > create_dir_dentry() has incorrect error handling and passes -EEXIST instead
> > of a valid dentry pointer to d_invalidate().
> >
> > But I leave it up to you to figure this out :)
>
> Ok, wrong function quoted of course. But the rest of my statement
> should be correct.
So, if it helps (this still happens with Linus' master branch):
create_dir_dentry() is called with a "struct eventfs_inode *ei" (second
parameter), which points to a data structure where "is_freed" is 1. Then it
looks like create_dir() returned "-EEXIST". And looking at the code this
combination then must lead to d_invalidate() incorrectly being called with
"-EEXIST" as dentry pointer.
Now, I have no idea how the code should work, but it is quite obvious that
something is broken :)
Here is the dump of the struct eventfs_inode that was passed to
create_file_dentry() when the crash happened:
crash> struct eventfs_inode 00000000eada7680
struct eventfs_inode {
list = {
next = 0x10f802da0,
prev = 0x122
},
entries = 0x12c031328 <event_entries>,
name = 0x12b90bbac <__tpstrtab_xfs_alloc_vextent_exact_bno> "xfs_alloc_vextent_exact_bno",
children = {
next = 0xeada76a0,
prev = 0xeada76a0
},
dentry = 0x0,
d_parent = 0x107c75d40,
d_children = 0xeada5700,
entry_attrs = 0x0,
attr = {
mode = 0,
uid = {
val = 0
},
gid = {
val = 0
}
},
data = 0xeada6660,
{
llist = {
next = 0xeada7668
},
rcu = {
next = 0xeada7668,
func = 0x12ad2a5b8 <free_rcu_ei>
}
},
is_freed = 1,
nr_entries = 6
}
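The combination described above (is_freed set, creation returning -EEXIST) can be played through in a small userspace model. This is a sketch under stated assumptions, not the kernel code: `struct fake_ei`, `struct fake_dentry` and both model functions are hypothetical, and `model_unguarded()` only assumes that the error check is skipped when the ei is already freed. `model_guarded()` follows the order of checks in the create_file_dentry() quoted earlier, where the IS_ERR_OR_NULL() test comes first and is unconditional.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Hypothetical stand-ins for struct dentry and struct eventfs_inode. */
struct fake_dentry { int refcount; };
struct fake_ei { struct fake_dentry *dentry; bool is_freed; };

enum outcome {
	USE_EXISTING,		/* fall back to whatever ei->dentry holds */
	STORED,			/* new dentry recorded in the ei */
	INVALIDATE_VALID,	/* invalidate path reached with a real dentry */
	INVALIDATE_BAD_POINTER,	/* invalidate path reached with NULL/-Exxx */
};

/* Userspace equivalent of IS_ERR_OR_NULL(). */
bool is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Assumed buggy shape: the error check is bypassed for a freed ei. */
enum outcome model_unguarded(struct fake_ei *ei, struct fake_dentry *created)
{
	if (is_err_or_null(created) && !ei->is_freed)
		return USE_EXISTING;
	if (!ei->dentry && !ei->is_freed) {
		ei->dentry = created;
		return STORED;
	}
	return is_err_or_null(created) ? INVALIDATE_BAD_POINTER
				       : INVALIDATE_VALID;
}

/* Same flow, but the error check does not depend on is_freed. */
enum outcome model_guarded(struct fake_ei *ei, struct fake_dentry *created)
{
	if (is_err_or_null(created))
		return USE_EXISTING;
	if (!ei->dentry && !ei->is_freed) {
		ei->dentry = created;
		return STORED;
	}
	return INVALIDATE_VALID;
}
```

With `is_freed = true`, `dentry = NULL` and a created pointer of `(struct fake_dentry *)(intptr_t)-EEXIST`, the unguarded model ends in INVALIDATE_BAD_POINTER, while the guarded one returns early with USE_EXISTING.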
* Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode
2023-11-23 11:25 ` Heiko Carstens
@ 2023-11-23 12:34 ` Ajay Kaher
2023-11-23 15:23 ` Steven Rostedt
1 sibling, 0 replies; 6+ messages in thread
From: Ajay Kaher @ 2023-11-23 12:34 UTC (permalink / raw)
To: Heiko Carstens
Cc: Steven Rostedt, LKML, Linux Trace Kernel, Masami Hiramatsu,
Mark Rutland, Andrew Morton, chinglinyu@google.com, lkp@intel.com,
Nadav Amit, oe-lkp@lists.linux.dev, Alexey Makhalov,
er.ajay.kaher@gmail.com, srivatsa@csail.mit.edu, Tapas Kundu,
Vasavi Sirnapalli, linux-s390@vger.kernel.org
> On 23-Nov-2023, at 4:55 PM, Heiko Carstens <hca@linux.ibm.com> wrote:
>
> On Fri, Nov 17, 2023 at 03:38:29PM +0100, Heiko Carstens wrote:
>> On Fri, Nov 17, 2023 at 03:23:35PM +0100, Heiko Carstens wrote:
>>> I think this patch causes from time to time crashes when running ftrace
>>> selftests. In particular I guess there is a bug wrt error handling in this
>>> function (see below for call trace):
>>>
>>>> +static struct dentry *
>>>> +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
>>>> + struct dentry *parent, const char *name, umode_t mode, void *data,
>>>> + const struct file_operations *fops, bool lookup)
>>>> +{
>> ...
>>> Note that the compare and swap instruction within d_invalidate() generates
>>> a specification exception because it operates on an invalid address
>>> (0xffffffffffffffef), which happens to be -EEXIST. So my assumption is that
>>> create_dir_dentry() has incorrect error handling and passes -EEXIST instead
>>> of a valid dentry pointer to d_invalidate().
>>>
>>> But I leave it up to you to figure this out :)
>>
>> Ok, wrong function quoted of course. But the rest of my statement
>> should be correct.
>
> So, if it helps (this still happens with Linus' master branch):
>
> create_dir_dentry() is called with a "struct eventfs_inode *ei" (second
> parameter), which points to a data structure where "is_freed" is 1. Then it
> looks like create_dir() returned "-EEXIST". And looking at the code this
> combination then must lead to d_invalidate() incorrectly being called with
> "-EEXIST" as dentry pointer.
>
> Now, I have no idea how the code should work, but it is quite obvious that
> something is broken :)
>
> Here the dump of the struct eventfs_inode that was passed to
> create_file_dentry() when the crash happened:
>
> crash> struct eventfs_inode 00000000eada7680
> struct eventfs_inode {
> list = {
> next = 0x10f802da0,
> prev = 0x122
> },
> entries = 0x12c031328 <event_entries>,
> name = 0x12b90bbac <__tpstrtab_xfs_alloc_vextent_exact_bno> "xfs_alloc_vextent_exact_bno",
> children = {
> next = 0xeada76a0,
> prev = 0xeada76a0
> },
> dentry = 0x0,
> d_parent = 0x107c75d40,
> d_children = 0xeada5700,
> entry_attrs = 0x0,
> attr = {
> mode = 0,
> uid = {
> val = 0
> },
> gid = {
> val = 0
> }
> },
> data = 0xeada6660,
> {
> llist = {
> next = 0xeada7668
> },
> rcu = {
> next = 0xeada7668,
> func = 0x12ad2a5b8 <free_rcu_ei>
> }
> },
> is_freed = 1,
> nr_entries = 6
> }
Heiko, your analysis looks good to me. It seems the -EEXIST comes from:
https://elixir.bootlin.com/linux/v6.7-rc2/source/fs/tracefs/inode.c#L533
Steve, in my view the error handling should be the same for
create_dir_dentry() and create_file_dentry(), or am I missing something?
-Ajay
* Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode
2023-11-23 11:25 ` Heiko Carstens
2023-11-23 12:34 ` Ajay Kaher
@ 2023-11-23 15:23 ` Steven Rostedt
2023-11-23 16:06 ` Heiko Carstens
1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2023-11-23 15:23 UTC (permalink / raw)
To: Heiko Carstens
Cc: LKML, Linux Trace Kernel, Masami Hiramatsu, Mark Rutland,
Andrew Morton, Ajay Kaher, chinglinyu, lkp, namit, oe-lkp,
amakhalov, er.ajay.kaher, srivatsa, tkundu, vsirnapalli,
linux-s390
On Thu, 23 Nov 2023 12:25:48 +0100
Heiko Carstens <hca@linux.ibm.com> wrote:
> So, if it helps (this still happens with Linus' master branch):
>
> create_dir_dentry() is called with a "struct eventfs_inode *ei" (second
> parameter), which points to a data structure where "is_freed" is 1. Then it
> looks like create_dir() returned "-EEXIST". And looking at the code this
> combination then must lead to d_invalidate() incorrectly being called with
> "-EEXIST" as dentry pointer.
I haven't looked too much at the error codes, let me do that on Monday
(it's currently Turkey weekend here in the US).
But could you test this branch:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git trace/core
I have a bunch of fixes in that branch that may fix your issue. I just
finished testing it and plan on pushing it to Linus before the next rc
release.
Thanks!
-- Steve
>
> Now, I have no idea how the code should work, but it is quite obvious that
> something is broken :)
>
> Here the dump of the struct eventfs_inode that was passed to
> create_file_dentry() when the crash happened:
* Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode
2023-11-23 15:23 ` Steven Rostedt
@ 2023-11-23 16:06 ` Heiko Carstens
0 siblings, 0 replies; 6+ messages in thread
From: Heiko Carstens @ 2023-11-23 16:06 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, Linux Trace Kernel, Masami Hiramatsu, Mark Rutland,
Andrew Morton, Ajay Kaher, chinglinyu, lkp, namit, oe-lkp,
amakhalov, er.ajay.kaher, srivatsa, tkundu, vsirnapalli,
linux-s390
On Thu, Nov 23, 2023 at 10:23:49AM -0500, Steven Rostedt wrote:
> On Thu, 23 Nov 2023 12:25:48 +0100
> Heiko Carstens <hca@linux.ibm.com> wrote:
>
> > So, if it helps (this still happens with Linus' master branch):
> >
> > create_dir_dentry() is called with a "struct eventfs_inode *ei" (second
> > parameter), which points to a data structure where "is_freed" is 1. Then it
> > looks like create_dir() returned "-EEXIST". And looking at the code this
> > combination then must lead to d_invalidate() incorrectly being called with
> > "-EEXIST" as dentry pointer.
>
> I haven't looked too much at the error codes, let me do that on Monday
> (it's currently Turkey weekend here in the US).
>
> But could you test this branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git trace/core
>
> I have a bunch of fixes in that branch that may fix your issue. I just
> finished testing it and plan on pushing it to Linus before the next rc
> release.
This is not that easy to reproduce; however, your branch contains commit
71cade82f2b5 ("eventfs: Do not invalidate dentry in create_file/dir_dentry()")
which removes the d_invalidate() call.
The crash I reported cannot happen anymore with that commit. I'll consider
this fixed, and report again if this (or something else) still causes
problems.
Thanks!