* Re: [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer [not found] ` <20180220023038.19883-6-tytso@mit.edu> @ 2018-03-06 17:10 ` Jan Kara 2018-03-22 15:26 ` Theodore Y. Ts'o 0 siblings, 1 reply; 9+ messages in thread From: Jan Kara @ 2018-03-06 17:10 UTC (permalink / raw) To: Theodore Ts'o; +Cc: Ext4 Developers List, stable On Mon 19-02-18 21:30:36, Theodore Ts'o wrote: > Previously the jbd2 layer assumed that a file system check would be > required after a journal abort. In the case of the deliberate file > system shutdown, this should not be necessary. Allow the jbd2 layer > to distinguish between these two cases by using the ESHUTDOWN errno. > > Also add proper locking to __journal_abort_soft(). > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > Cc: stable@vger.kernel.org > --- Maybe a cleaner api would be to have a separate JBD2 function like jbd2_journal_abort_clean() that would just abort the journal without setting an error to it. > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 3fbf48ec2188..efa0c72a0b9f 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -2105,12 +2108,22 @@ void __jbd2_journal_abort_hard(journal_t *journal) > * but don't do any other IO. */ > static void __journal_abort_soft (journal_t *journal, int errno) > { > - if (journal->j_flags & JBD2_ABORT) > - return; > + int old_errno; > > - if (!journal->j_errno) > + write_lock(&journal->j_state_lock); > + old_errno = journal->j_errno; > + if (!journal->j_errno || errno == -ESHUTDOWN) > journal->j_errno = errno; > > + if (journal->j_flags & JBD2_ABORT) { > + write_unlock(&journal->j_state_lock); > + if (!old_errno && old_errno != -ESHUTDOWN && > + errno == -ESHUTDOWN) > + jbd2_journal_update_sb_errno(journal); > + return; > + } > + write_unlock(&journal->j_state_lock); > + Is it really correct that once the filesystem gets shutdown you clear the previous error from the journal? Because if we hit some real fs corruption, the journal gets aborted, and then someone calls ext4_shutdown(), we'd clear that error which looks like a bug to me because that shutdown hardly fixes the fs corruption... Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer 2018-03-06 17:10 ` [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer Jan Kara @ 2018-03-22 15:26 ` Theodore Y. Ts'o 2018-05-17 15:18 ` Jan Kara 0 siblings, 1 reply; 9+ messages in thread From: Theodore Y. Ts'o @ 2018-03-22 15:26 UTC (permalink / raw) To: Jan Kara; +Cc: Ext4 Developers List, stable On Tue, Mar 06, 2018 at 06:10:41PM +0100, Jan Kara wrote: > Is it really correct that once the filesystem gets shutdown you clear the > previous error from the journal? Because if we hit some real fs corruption, > the journal gets aborted, and then someone calls ext4_shutdown(), we'd > clear that error which looks like a bug to me because that shutdown hardly > fixes the fs corruption... That's not what the code does. If journal->j_errno is set, then we won't clear it, for precisely what concern you've articulated. - Ted ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer 2018-03-22 15:26 ` Theodore Y. Ts'o @ 2018-05-17 15:18 ` Jan Kara 0 siblings, 0 replies; 9+ messages in thread From: Jan Kara @ 2018-05-17 15:18 UTC (permalink / raw) To: Theodore Y. Ts'o; +Cc: Jan Kara, Ext4 Developers List, stable On Thu 22-03-18 11:26:45, Theodore Y. Ts'o wrote: > On Tue, Mar 06, 2018 at 06:10:41PM +0100, Jan Kara wrote: > > Is it really correct that once the filesystem gets shutdown you clear the > > previous error from the journal? Because if we hit some real fs corruption, > > the journal gets aborted, and then someone calls ext4_shutdown(), we'd > > clear that error which looks like a bug to me because that shutdown hardly > > fixes the fs corruption... > > That's not what the code does. If journal->j_errno is set, then we > won't clear it, for precisely what concern you've articulated. Sorry for not following up on this earlier but now I've returned to this and I still cannot wrap my head around checks in __journal_abort_soft(). There's: if (!journal->j_errno || errno == -ESHUTDOWN) journal->j_errno = errno; Due to this ESHUTDOWN will override anything in journal->j_errno. Why is that? And then: if (journal->j_flags & JBD2_ABORT) { write_unlock(&journal->j_state_lock); if (!old_errno && old_errno != -ESHUTDOWN && errno == -ESHUTDOWN) jbd2_journal_update_sb_errno(journal); And here can the test ever be true? Probably only if we hard-aborted the journal. The test !old_errno && old_errno != -ESHUTDOWN definitely looks weird and is equivalent to old_errno == 0. Furthermore the errno == -ESHUTDOWN part looks strange as well as in that case we'd only clear the sb->s_errno field... So what was really the intention here? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <20180220023038.19883-7-tytso@mit.edu>]
* Re: [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail [not found] ` <20180220023038.19883-7-tytso@mit.edu> @ 2018-03-07 8:55 ` Jan Kara 0 siblings, 0 replies; 9+ messages in thread From: Jan Kara @ 2018-03-07 8:55 UTC (permalink / raw) To: Theodore Ts'o; +Cc: Ext4 Developers List, stable On Mon 19-02-18 21:30:37, Theodore Ts'o wrote: > This updates the jbd2 superblock unnecessarily, and on an abort we > shouldn't truncate the log. > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > Cc: stable@vger.kernel.org Looks good. You can add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/jbd2/journal.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index efa0c72a0b9f..dfb057900e79 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -974,7 +974,7 @@ int __jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block) > } > > /* > - * This is a variaon of __jbd2_update_log_tail which checks for validity of > + * This is a variation of __jbd2_update_log_tail which checks for validity of > * provided log tail and locks j_checkpoint_mutex. So it is safe against races > * with other threads updating log tail. > */ > @@ -1417,6 +1417,9 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid, > journal_superblock_t *sb = journal->j_superblock; > int ret; > > + if (is_journal_aborted(journal)) > + return -EIO; > + > BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex)); > jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n", > tail_block, tail_tid); > -- > 2.16.1.72.g5be1f00a9a > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <20180220023038.19883-8-tytso@mit.edu>]
* Re: [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps [not found] ` <20180220023038.19883-8-tytso@mit.edu> @ 2018-03-07 9:06 ` Jan Kara 2018-03-22 15:29 ` Theodore Y. Ts'o 0 siblings, 1 reply; 9+ messages in thread From: Jan Kara @ 2018-03-07 9:06 UTC (permalink / raw) To: Theodore Ts'o; +Cc: Ext4 Developers List, stable On Mon 19-02-18 21:30:38, Theodore Ts'o wrote: > When reading the inode or block allocation bitmap, if the bitmap needs > to be initialized, do not update the checksum in the block group > descriptor. That's because we're not set up to journal those changes. > Instead, just set the verified bit on the bitmap block, so that it's > not necessary to validate the checksum. > > When a block or inode allocation actually happens, at that point the > checksum will be calculated, and update of the bg descriptor block > will be properly journalled. > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > Cc: stable@vger.kernel.org > --- > fs/ext4/balloc.c | 3 +-- > fs/ext4/ialloc.c | 47 +++-------------------------------------------- > 2 files changed, 4 insertions(+), 46 deletions(-) > > diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c > index f9b3e0a83526..f82c4966f4ce 100644 > --- a/fs/ext4/balloc.c > +++ b/fs/ext4/balloc.c > @@ -243,8 +243,6 @@ static int ext4_init_block_bitmap(struct super_block *sb, > */ > ext4_mark_bitmap_end(num_clusters_in_group(sb, block_group), > sb->s_blocksize * 8, bh->b_data); > - ext4_block_bitmap_csum_set(sb, block_group, gdp, bh); > - ext4_group_desc_csum_set(sb, block_group, gdp); > return 0; > } Probably you should remove the bad checksum handling in ext4_init_block_bitmap() the same way as you did in ext4_init_inode_bitmap()? Otherwise the patch looks good. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps 2018-03-07 9:06 ` [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps Jan Kara @ 2018-03-22 15:29 ` Theodore Y. Ts'o 0 siblings, 0 replies; 9+ messages in thread From: Theodore Y. Ts'o @ 2018-03-22 15:29 UTC (permalink / raw) To: Jan Kara; +Cc: Ext4 Developers List, stable On Wed, Mar 07, 2018 at 10:06:07AM +0100, Jan Kara wrote: > > Probably you should remove the bad checksum handling in > ext4_init_block_bitmap() the same way as you did in > ext4_init_inode_bitmap()? Otherwise the patch looks good. There isn't an ext4_init_inode_bitmap() --- the functionality is inlined ext4_read_inode_bitmap(), and it's already doing it that way (it doesn't set the checksum; it just sets the verified bit). - Ted ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <20180220023038.19883-5-tytso@mit.edu>]
* Re: [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl [not found] ` <20180220023038.19883-5-tytso@mit.edu> @ 2018-03-07 9:07 ` Jan Kara 0 siblings, 0 replies; 9+ messages in thread From: Jan Kara @ 2018-03-07 9:07 UTC (permalink / raw) To: Theodore Ts'o; +Cc: Ext4 Developers List, stable On Mon 19-02-18 21:30:35, Theodore Ts'o wrote: > The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to > avoid some races (that are now fixed), but in fact it introduced its > own race. > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > Cc: stable@vger.kernel.org Looks good. You can add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/ext4/ioctl.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c > index 4d1b1575f8ac..16d3d1325f5b 100644 > --- a/fs/ext4/ioctl.c > +++ b/fs/ext4/ioctl.c > @@ -498,10 +498,8 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg) > break; > case EXT4_GOING_FLAGS_NOLOGFLUSH: > set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags); > - if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) { > - msleep(100); > + if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) > jbd2_journal_abort(sbi->s_journal, 0); > - } > break; > default: > return -EINVAL; > -- > 2.16.1.72.g5be1f00a9a > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <20180220023038.19883-4-tytso@mit.edu>]
* Re: [PATCH 3/7] ext4: shutdown should not prevent get_write_access [not found] ` <20180220023038.19883-4-tytso@mit.edu> @ 2018-03-07 9:42 ` Jan Kara 2018-03-22 15:38 ` Theodore Y. Ts'o 0 siblings, 1 reply; 9+ messages in thread From: Jan Kara @ 2018-03-07 9:42 UTC (permalink / raw) To: Theodore Ts'o; +Cc: Ext4 Developers List, stable On Mon 19-02-18 21:30:34, Theodore Ts'o wrote: > The ext4 forced shutdown flag needs to prevent new handles from being > started, but it needs to allow existing handles to complete. So the > forced shutdown flag should not force ext4_journal_get_write_access to > fail. > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > Cc: stable@vger.kernel.org OK, if you want the semantics of ext4 shutdown to be that running handles should be allowed to complete, I see where you are going with this patch. However there are more problems with this semantics than just __ext4_journal_get_write_access(). Just for example ext4_reserve_inode_write() will bail in case the fs got shutdown and thus inode changes won't be properly added to the running handle. Also places that rely on nested transactions being possible will not work because ext4_journal_start_sb() will refuse to get refcount of a running handle (ext4_journal_check_start() fails) in case fs got shutdown. And I may have missed other cases. The above problems are a reason why I though the semantics of ext4 shutdown was terminate the fs *now* - effectively a software equivalent of power off. That is much easier to implement since we just have to make sure no running handle makes it to the journal... Since I've said Google is using ext4 shutdown - is there any reason why you need the "running handles are allowed to finish" semantics? After all it seems it's just a race whether some handle makes it before the cut off or not... Honza > --- > fs/ext4/ext4_jbd2.c | 7 ------- > 1 file changed, 7 deletions(-) > > diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c > index 2d593201cf7a..7c70b08d104c 100644 > --- a/fs/ext4/ext4_jbd2.c > +++ b/fs/ext4/ext4_jbd2.c > @@ -166,13 +166,6 @@ int __ext4_journal_get_write_access(const char *where, unsigned int line, > might_sleep(); > > if (ext4_handle_valid(handle)) { > - struct super_block *sb; > - > - sb = handle->h_transaction->t_journal->j_private; > - if (unlikely(ext4_forced_shutdown(EXT4_SB(sb)))) { > - jbd2_journal_abort_handle(handle); > - return -EIO; > - } > err = jbd2_journal_get_write_access(handle, bh); > if (err) > ext4_journal_abort_handle(where, line, __func__, bh, > -- > 2.16.1.72.g5be1f00a9a > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/7] ext4: shutdown should not prevent get_write_access 2018-03-07 9:42 ` [PATCH 3/7] ext4: shutdown should not prevent get_write_access Jan Kara @ 2018-03-22 15:38 ` Theodore Y. Ts'o 0 siblings, 0 replies; 9+ messages in thread From: Theodore Y. Ts'o @ 2018-03-22 15:38 UTC (permalink / raw) To: Jan Kara; +Cc: Ext4 Developers List, stable On Wed, Mar 07, 2018 at 10:42:01AM +0100, Jan Kara wrote: > On Mon 19-02-18 21:30:34, Theodore Ts'o wrote: > > The ext4 forced shutdown flag needs to prevent new handles from being > > started, but it needs to allow existing handles to complete. So the > > forced shutdown flag should not force ext4_journal_get_write_access to > > fail. > > > > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > > Cc: stable@vger.kernel.org > > OK, if you want the semantics of ext4 shutdown to be that running > handles should be allowed to complete, I see where you are going with this > patch. However there are more problems with this semantics than just > __ext4_journal_get_write_access(). Just for example > ext4_reserve_inode_write() will bail in case the fs got shutdown and thus > inode changes won't be properly added to the running handle. Also places > that rely on nested transactions being possible will not work because > ext4_journal_start_sb() will refuse to get refcount of a running handle > (ext4_journal_check_start() fails) in case fs got shutdown. And I may have > missed other cases. We have three cases in ext4_shutdown: EXT4_GOING_FLAGS_DEFUALT: Freezes the block device, sets the EXT4_FLAGS_SHUTDOWN flag, and then unfreezes the block device: EXT4_GOING_FLAGS_LOGFLUSH: Sets the EXT4_FLAGS_SHUTDOWN flag, forces a commit, and then aborts the journal. EXT4_GOING_FLAGS_NOLOGFLUSH Sets the EXT4_FLAGS_SHUTDOWN flag, aborts the journal. > The above problems are a reason why I though the semantics of ext4 shutdown > was terminate the fs *now* - effectively a software equivalent of power > off. That is much easier to implement since we just have to make sure no > running handle makes it to the journal... Since I've said Google is using > ext4 shutdown - is there any reason why you need the "running handles are > allowed to finish" semantics? After all it seems it's just a race whether > some handle makes it before the cut off or not... Good points. I'll have to look at what happens if we just drop the EXT4_FLAGS_SHUTDOWN flag altogether. - Ted ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-05-17 15:18 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20180220023038.19883-1-tytso@mit.edu> [not found] ` <20180220023038.19883-6-tytso@mit.edu> 2018-03-06 17:10 ` [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer Jan Kara 2018-03-22 15:26 ` Theodore Y. Ts'o 2018-05-17 15:18 ` Jan Kara [not found] ` <20180220023038.19883-7-tytso@mit.edu> 2018-03-07 8:55 ` [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail Jan Kara [not found] ` <20180220023038.19883-8-tytso@mit.edu> 2018-03-07 9:06 ` [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps Jan Kara 2018-03-22 15:29 ` Theodore Y. Ts'o [not found] ` <20180220023038.19883-5-tytso@mit.edu> 2018-03-07 9:07 ` [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl Jan Kara [not found] ` <20180220023038.19883-4-tytso@mit.edu> 2018-03-07 9:42 ` [PATCH 3/7] ext4: shutdown should not prevent get_write_access Jan Kara 2018-03-22 15:38 ` Theodore Y. Ts'o
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).