* Re: [Patch] ext3_journal_stop inode access [not found] <1048185825.2491.386.camel@sisko.scot.redhat.com> @ 2003-03-20 21:15 ` Andrew Morton 2003-03-20 21:36 ` Stephen C. Tweedie 0 siblings, 1 reply; 4+ messages in thread From: Andrew Morton @ 2003-03-20 21:15 UTC (permalink / raw) To: Stephen C. Tweedie; +Cc: ext3-users, linux-kernel, linux-fsdevel "Stephen C. Tweedie" <sct@redhat.com> wrote: > > Hi Andrew, > > The patch below addresses the problem we were talking about earlier > where ext3_writepage ends up accessing the inode after the page lock has > been dropped (and hence at a point where it is possible for the inode to > have been reclaimed.) Tested minimally (it builds and boots.) Burton has confirmed that removing the inode->i_sb->s_dirt = 1; line makes the oopses go away, so this will fix it. > It makes ext3_journal_stop take an sb, not an inode, as its final > parameter. argh. I wrote and tested a patch too. That patch puts the superblock pointer into the new journal->j_private and removes the second arg to ext3_journal_start altogether. I went that way just to save a little text. Because ext3_journal_start/stop need to be uninlined - that saves 5.5 kbytes of text. Which do you think is best? If you're planning on patching 2.4 and if you want to do that by passing the superblock pointer in, then we should go that way in 2.5 too, keep things in sync. > It also sets sb->s_need_sync_fs, not sb->s_dirt, as setting > s_dirt was only ever a workaround for the lack of a proper sync-fs > mechanism. > > Btw, we clear s_need_sync_fs in sync_filesystems(). Don't we also need > to do the same in fsync_super()? The intent of s_need_sync_fs is to avoid livelock in sync_filesystems(). Imagine that two filesytems are being continually dirtied. It would be very, very easy for a sync_filesystems() caller to never terminate. This is a repeated problem with lists of dirty objects which need cleaning, in which only the head-of-list is stable outside the lock. So sync_filesystems() will tag all the filesystems on the first pass and then only sync tagged filesytems on the second pass. No livelock. (This still means that new sync() callers will cause older sync() callers to perform a second round of syncing. I guess I should slap a mutex around the whole thing to prevent that). So s_need_sync_fs is "private to sync_fileystems". I should have commented that, sorry. sync_filesystems() will even call call ->sync_fs against non-s_dirt filesystems, which seems a little odd. The reason for this is that ext3_write_super() will clear s_dirt and _not_ wait on the writeout. So if sync_filesystems() happened to be called against a filesystem shortly after its ->write_super() had been called, sync_filesystems() would incorrectly assume that the filesystem was fully synced. I shall comment that, too. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Patch] ext3_journal_stop inode access 2003-03-20 21:15 ` [Patch] ext3_journal_stop inode access Andrew Morton @ 2003-03-20 21:36 ` Stephen C. Tweedie 2003-03-21 0:12 ` Andrew Morton 0 siblings, 1 reply; 4+ messages in thread From: Stephen C. Tweedie @ 2003-03-20 21:36 UTC (permalink / raw) To: Andrew Morton Cc: ext3 users list, linux-kernel, linux-fsdevel, Stephen Tweedie Hi, On Thu, 2003-03-20 at 21:15, Andrew Morton wrote: > Burton has confirmed that removing the > > inode->i_sb->s_dirt = 1; > > line makes the oopses go away, so this will fix it. Good. > > It makes ext3_journal_stop take an sb, not an inode, as its final > > parameter. > > argh. I wrote and tested a patch too. That patch puts the superblock > pointer into the new journal->j_private and removes the second arg to > ext3_journal_start altogether. Well, there's still the if (err) __ext3_std_error(inode->i_sb, where, err); case in ext3_journal_stop() to worry about, so we still need it; and I'd much rather not hack this via j_private, when what we're doing at this point is most definitely a fs-specific, not journal-related, operation. > I went that way just to save a little text. Because ext3_journal_start/stop > need to be uninlined - that saves 5.5 kbytes of text. Agreed. > Which do you think is best? If you're planning on patching 2.4 and if you > want to do that by passing the superblock pointer in, then we should go that > way in 2.5 too, keep things in sync. I was wondering why we've never seen this on 2.4, even with slab poisoning enabled. But I think the vulnerability exists on 2.4 too, so yes, we ought to keep the two in sync. > > It also sets sb->s_need_sync_fs, not sb->s_dirt, as setting > > s_dirt was only ever a workaround for the lack of a proper sync-fs > > mechanism. > > > > Btw, we clear s_need_sync_fs in sync_filesystems(). Don't we also need > > to do the same in fsync_super()? > > The intent of s_need_sync_fs is to avoid livelock in sync_filesystems(). ... Well, the intent of the s_dirt was to force a call to ext3_write_super when the fs was dirty, back before the days when we had a sync_fs() method at all. Now that we have the latter, it sounds like we should actually just drop the line which sets s_dirt in ext3_journal_stop entirely, because sync will always call the new sync_fs which will do the commit that we need. We still have the error handling path in ext3_journal_stop so we can't avoid having to find the sb, so _some_ rejigging is still needed. Cheers, Stephen ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Patch] ext3_journal_stop inode access 2003-03-20 21:36 ` Stephen C. Tweedie @ 2003-03-21 0:12 ` Andrew Morton 2003-03-20 22:18 ` Stephen C. Tweedie 0 siblings, 1 reply; 4+ messages in thread From: Andrew Morton @ 2003-03-21 0:12 UTC (permalink / raw) To: Stephen C. Tweedie; +Cc: ext3-users, linux-kernel, linux-fsdevel "Stephen C. Tweedie" <sct@redhat.com> wrote: > > Well, there's still the > > if (err) > __ext3_std_error(inode->i_sb, where, err); > > case in ext3_journal_stop() to worry about We already have that. int __ext3_journal_stop(const char *where, handle_t *handle) { struct super_block *sb; int err; int rc; sb = handle->h_transaction->t_journal->j_private; err = handle->h_err; rc = journal_stop(handle); sb->s_dirt = 1; if (!err) err = rc; if (err) __ext3_std_error(sb, where, err); return err; } > , so we still need it; and I'd > much rather not hack this via j_private, when what we're doing at this > point is most definitely a fs-specific, not journal-related, operation. I don't think it's a hack at all. ext3 owns the journal and there is plenty of precedent for putting owner-private things into owned objects. But I'm not particularly fussed either way - it will only be 100-200 bytes of code saved. > I was wondering why we've never seen this on 2.4, even with slab > poisoning enabled. But I think the vulnerability exists on 2.4 too, so > yes, we ought to keep the two in sync. It could be due to differences in inode reclaim. If 2.4 sees an inode with attached pages it will skip it. 2.5 will instead detach the pages and free the inode. And writepage() doesn't get used much in 2.4. Plus this bug was hit on a preemptible kernel where the timing windows are much wider. > Well, the intent of the s_dirt was to force a call to ext3_write_super > when the fs was dirty, back before the days when we had a sync_fs() > method at all. Now that we have the latter, it sounds like we should > actually just drop the line which sets s_dirt in ext3_journal_stop > entirely, because sync will always call the new sync_fs which will do > the commit that we need. OK. > We still have the error handling path in ext3_journal_stop so we can't > avoid having to find the sb, so _some_ rejigging is still needed. That is available from the handle. (And via ext3_journal_current_handle()->j_private, even). The journal and the superblock have a definite one-to-one relationship - I think the backpointer makes sense. But whatever - I'll let you flip that coin. ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Patch] ext3_journal_stop inode access 2003-03-21 0:12 ` Andrew Morton @ 2003-03-20 22:18 ` Stephen C. Tweedie 0 siblings, 0 replies; 4+ messages in thread From: Stephen C. Tweedie @ 2003-03-20 22:18 UTC (permalink / raw) To: Andrew Morton Cc: ext3 users list, linux-kernel, linux-fsdevel, Stephen Tweedie Hi,On Fri, 2003-03-21 at 00:12, Andrew Morton wrote: > > Well, there's still the > > if (err) > > __ext3_std_error(inode->i_sb, where, err); > > case in ext3_journal_stop() to worry about > > We already have that. Only if we fix the underlying problem --- I was only pointing out that even if we drop the setting of s_dirt entirely, which was what we were trying to fix, we can't avoid having to find the sb. > But I'm not particularly fussed either way - it will only be 100-200 bytes of > code saved. Yep, but there are probably other places we can find where we can avoid passing the sb around too if we have the back-pointer. I guess it makes sense to go ahead with that. > The journal and the superblock have a definite one-to-one relationship - I think the > backpointer makes sense. But whatever - I'll let you flip that coin. OK, go for it and I'll merge for 2.4. Cheers, Stephen ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2003-03-21 0:12 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1048185825.2491.386.camel@sisko.scot.redhat.com>
2003-03-20 21:15 ` [Patch] ext3_journal_stop inode access Andrew Morton
2003-03-20 21:36 ` Stephen C. Tweedie
2003-03-21 0:12 ` Andrew Morton
2003-03-20 22:18 ` Stephen C. Tweedie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).