From: Christoph Hellwig <hch@infradead.org>
To: Brian Foster <bfoster@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/2] xfs: replace filestream item xfs_inode reference with xfs_mount
Date: Thu, 5 Apr 2018 23:59:44 -0700 [thread overview]
Message-ID: <20180406065944.GA12740@infradead.org> (raw)
In-Reply-To: <20180405181300.GB417@bfoster.bfoster>
On Thu, Apr 05, 2018 at 02:13:01PM -0400, Brian Foster wrote:
> On Thu, Apr 05, 2018 at 10:18:42AM -0700, Christoph Hellwig wrote:
> > On Thu, Apr 05, 2018 at 08:11:47AM -0400, Brian Foster wrote:
> > > The obvious solution is to grab an inode reference for
> > > xfs_fstrm_item. The filestream mechanism only actually uses the
> > > inode pointer as a means to access the xfs_mount, however. Rather
> > > than add unnecessary complexity, simplify the implementation to
> > > store an xfs_mount pointer instead of the inode. This also requires
> > > updates to the tracepoint class to provide the associated data via
> > > parameters rather than the inode and a minor hack to peek at the MRU
> > > key to establish the inode number at free time.
> >
> > How about not replacing it all and just providing the context from
> > the mru cache? Something like the untested patch below:
> >
>
> I actually considered something like this briefly but it initially
> looked more involved than what you've come up with here. I think that I
> just didn't consider storing the mp directly in the xfs_mru_cache so it
> would be available to the reaper context. With that, this looks nicer to
> me. I'll give it a whirl.. thanks!
FYI, here is a version with a proper changelog:
---
>From ef8248a6cf39f2fbf69b2590c4cc66c9a2ba57fd Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 5 Apr 2018 19:17:51 +0200
Subject: xfs: remove filestream item xfs_inode reference
The filestreams allocator stores an xfs_fstrm_item structure in the MRU to
cache inode number to agno mappings for a particular length of time. Each
xfs_fstrm_item contains the internal MRU structure, an inode pointer and
agno value.
The inode pointer stored in the xfs_fstrm_item is not referenced, however,
which means the inode itself can be removed and reclaimed before the MRU
item is freed. If this occurs, xfs_fstrm_free_func() can access freed or
unrelated memory through xfs_fstrm_item->ip and crash.
The obvious solution is to grab an inode reference for xfs_fstrm_item.
The filestream mechanism only actually uses the inode pointer as a means
to access the xfs_mount, however. Rather than add unnecessary
complexity, simplify the implementation to store an xfs_mount pointer in
struct xfs_mru_cache, and pass it to the free callback. This also
requires updates to the tracepoint class to provide the associated data
via parameters rather than the inode and a minor hack to peek at the MRU
key to establish the inode number at free time.
Based on debugging work and an earlier patch from Brian Foster, who
also wrote most of this changelog.
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_filestream.c | 19 +++++++++----------
fs/xfs/xfs_mru_cache.c | 8 +++++---
fs/xfs/xfs_mru_cache.h | 8 ++++----
fs/xfs/xfs_trace.h | 14 +++++++-------
4 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c
index 043ca3808ea2..5131a6e25fc9 100644
--- a/fs/xfs/xfs_filestream.c
+++ b/fs/xfs/xfs_filestream.c
@@ -34,7 +34,6 @@
struct xfs_fstrm_item {
struct xfs_mru_cache_elem mru;
- struct xfs_inode *ip;
xfs_agnumber_t ag; /* AG in use for this directory */
};
@@ -122,14 +121,15 @@ xfs_filestream_put_ag(
static void
xfs_fstrm_free_func(
+ void *data,
struct xfs_mru_cache_elem *mru)
{
+ struct xfs_mount *mp = data;
struct xfs_fstrm_item *item =
container_of(mru, struct xfs_fstrm_item, mru);
- xfs_filestream_put_ag(item->ip->i_mount, item->ag);
-
- trace_xfs_filestream_free(item->ip, item->ag);
+ xfs_filestream_put_ag(mp, item->ag);
+ trace_xfs_filestream_free(mp, mru->key, item->ag);
kmem_free(item);
}
@@ -165,7 +165,7 @@ xfs_filestream_pick_ag(
trylock = XFS_ALLOC_FLAG_TRYLOCK;
for (nscan = 0; 1; nscan++) {
- trace_xfs_filestream_scan(ip, ag);
+ trace_xfs_filestream_scan(mp, ip->i_ino, ag);
pag = xfs_perag_get(mp, ag);
@@ -265,7 +265,6 @@ xfs_filestream_pick_ag(
goto out_put_ag;
item->ag = *agp;
- item->ip = ip;
err = xfs_mru_cache_insert(mp->m_filestream, ip->i_ino, &item->mru);
if (err) {
@@ -333,7 +332,7 @@ xfs_filestream_lookup_ag(
ag = container_of(mru, struct xfs_fstrm_item, mru)->ag;
xfs_mru_cache_done(mp->m_filestream);
- trace_xfs_filestream_lookup(ip, ag);
+ trace_xfs_filestream_lookup(mp, ip->i_ino, ag);
goto out;
}
@@ -399,7 +398,7 @@ xfs_filestream_new_ag(
* Only free the item here so we skip over the old AG earlier.
*/
if (mru)
- xfs_fstrm_free_func(mru);
+ xfs_fstrm_free_func(mp, mru);
IRELE(pip);
exit:
@@ -426,8 +425,8 @@ xfs_filestream_mount(
* timer tunable to within about 10 percent. This requires at least 10
* groups.
*/
- return xfs_mru_cache_create(&mp->m_filestream, xfs_fstrm_centisecs * 10,
- 10, xfs_fstrm_free_func);
+ return xfs_mru_cache_create(&mp->m_filestream, mp,
+ xfs_fstrm_centisecs * 10, 10, xfs_fstrm_free_func);
}
void
diff --git a/fs/xfs/xfs_mru_cache.c b/fs/xfs/xfs_mru_cache.c
index f8a674d7f092..70eea7ae2876 100644
--- a/fs/xfs/xfs_mru_cache.c
+++ b/fs/xfs/xfs_mru_cache.c
@@ -112,6 +112,7 @@ struct xfs_mru_cache {
xfs_mru_cache_free_func_t free_func; /* Function pointer for freeing. */
struct delayed_work work; /* Workqueue data for reaping. */
unsigned int queued; /* work has been queued */
+ void *data;
};
static struct workqueue_struct *xfs_mru_reap_wq;
@@ -259,7 +260,7 @@ _xfs_mru_cache_clear_reap_list(
list_for_each_entry_safe(elem, next, &tmp, list_node) {
list_del_init(&elem->list_node);
- mru->free_func(elem);
+ mru->free_func(mru->data, elem);
}
spin_lock(&mru->lock);
@@ -326,6 +327,7 @@ xfs_mru_cache_uninit(void)
int
xfs_mru_cache_create(
struct xfs_mru_cache **mrup,
+ void *data,
unsigned int lifetime_ms,
unsigned int grp_count,
xfs_mru_cache_free_func_t free_func)
@@ -369,7 +371,7 @@ xfs_mru_cache_create(
mru->grp_time = grp_time;
mru->free_func = free_func;
-
+ mru->data = data;
*mrup = mru;
exit:
@@ -492,7 +494,7 @@ xfs_mru_cache_delete(
elem = xfs_mru_cache_remove(mru, key);
if (elem)
- mru->free_func(elem);
+ mru->free_func(mru->data, elem);
}
/*
diff --git a/fs/xfs/xfs_mru_cache.h b/fs/xfs/xfs_mru_cache.h
index fb5245ba5ff7..b3f3fbdfcc47 100644
--- a/fs/xfs/xfs_mru_cache.h
+++ b/fs/xfs/xfs_mru_cache.h
@@ -26,13 +26,13 @@ struct xfs_mru_cache_elem {
};
/* Function pointer type for callback to free a client's data pointer. */
-typedef void (*xfs_mru_cache_free_func_t)(struct xfs_mru_cache_elem *elem);
+typedef void (*xfs_mru_cache_free_func_t)(void *, struct xfs_mru_cache_elem *);
int xfs_mru_cache_init(void);
void xfs_mru_cache_uninit(void);
-int xfs_mru_cache_create(struct xfs_mru_cache **mrup, unsigned int lifetime_ms,
- unsigned int grp_count,
- xfs_mru_cache_free_func_t free_func);
+int xfs_mru_cache_create(struct xfs_mru_cache **mrup, void *data,
+ unsigned int lifetime_ms, unsigned int grp_count,
+ xfs_mru_cache_free_func_t free_func);
void xfs_mru_cache_destroy(struct xfs_mru_cache *mru);
int xfs_mru_cache_insert(struct xfs_mru_cache *mru, unsigned long key,
struct xfs_mru_cache_elem *elem);
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 945de08af7ba..19f4e30265c2 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -506,8 +506,8 @@ DEFINE_BUF_ITEM_EVENT(xfs_trans_bhold_release);
DEFINE_BUF_ITEM_EVENT(xfs_trans_binval);
DECLARE_EVENT_CLASS(xfs_filestream_class,
- TP_PROTO(struct xfs_inode *ip, xfs_agnumber_t agno),
- TP_ARGS(ip, agno),
+ TP_PROTO(struct xfs_mount *mp, xfs_ino_t ino, xfs_agnumber_t agno),
+ TP_ARGS(mp, ino, agno),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(xfs_ino_t, ino)
@@ -515,10 +515,10 @@ DECLARE_EVENT_CLASS(xfs_filestream_class,
__field(int, streams)
),
TP_fast_assign(
- __entry->dev = VFS_I(ip)->i_sb->s_dev;
- __entry->ino = ip->i_ino;
+ __entry->dev = mp->m_super->s_dev;
+ __entry->ino = ino;
__entry->agno = agno;
- __entry->streams = xfs_filestream_peek_ag(ip->i_mount, agno);
+ __entry->streams = xfs_filestream_peek_ag(mp, agno);
),
TP_printk("dev %d:%d ino 0x%llx agno %u streams %d",
MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -528,8 +528,8 @@ DECLARE_EVENT_CLASS(xfs_filestream_class,
)
#define DEFINE_FILESTREAM_EVENT(name) \
DEFINE_EVENT(xfs_filestream_class, name, \
- TP_PROTO(struct xfs_inode *ip, xfs_agnumber_t agno), \
- TP_ARGS(ip, agno))
+ TP_PROTO(struct xfs_mount *mp, xfs_ino_t ino, xfs_agnumber_t agno), \
+ TP_ARGS(mp, ino, agno))
DEFINE_FILESTREAM_EVENT(xfs_filestream_free);
DEFINE_FILESTREAM_EVENT(xfs_filestream_lookup);
DEFINE_FILESTREAM_EVENT(xfs_filestream_scan);
--
2.16.3
next prev parent reply other threads:[~2018-04-06 6:59 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-05 12:11 [PATCH 1/2] xfs: reset xfs_inode struct on reclaim in debug mode Brian Foster
2018-04-05 12:11 ` [PATCH 2/2] xfs: replace filestream item xfs_inode reference with xfs_mount Brian Foster
2018-04-05 17:18 ` Christoph Hellwig
2018-04-05 18:13 ` Brian Foster
2018-04-06 6:59 ` Christoph Hellwig [this message]
2018-04-06 13:47 ` Brian Foster
2018-04-05 16:51 ` [PATCH 1/2] xfs: reset xfs_inode struct on reclaim in debug mode Darrick J. Wong
2018-04-05 18:12 ` Brian Foster
2018-04-06 16:16 ` Darrick J. Wong
2018-04-05 21:14 ` Dave Chinner
2018-04-06 13:28 ` Brian Foster
2018-04-06 16:13 ` Darrick J. Wong
2018-04-06 17:01 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180406065944.GA12740@infradead.org \
--to=hch@infradead.org \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).