linux-fsdevel.vger.kernel.org archive mirror
* bdi has dirty inode after umount of ext4 fs in 3.4.83
@ 2014-03-21 15:25 Benjamin LaHaise
  2014-03-23 13:14 ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin LaHaise @ 2014-03-21 15:25 UTC
  To: Alexander Viro; +Cc: linux-fsdevel, linux-ext4

Hello Al and folks,

After adding some debugging code in an application to check for dirty 
buffers on a bdi after umount, I'm seeing instances where b_dirty has 
exactly 1 dirty inode listed on a 3.4.83 kernel after umount() of a 
filesystem.  Roughly what the application does is to umount an ext3 
filesystem (using the ext4 codebase), perform an fsync() of the block 
device, then check the bdi stats in /sys/kernel/debug/252:4/stats (this 
is a dm partition on top of a dm multipath device for an FC LUN).  I've 
found that if I add a sync() call instead of the fsync(), the b_dirty 
count usually drops to 0, but not always.  I've added some debugging 
code to the bdi stats dump, and the inode on the b_dirty list shows up as:

	inode=ffff88081beaada0, i_ino=0, i_nlink=1 i_sb=ffff88083c03e400
	i_state=0x00000004 i_data.nrpages=4 i_count=3
	i_sb->s_dev=0x00000002

The fact that the inode number is 0 looks very odd.
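
For anyone wanting to reproduce the check from user space, the last step
(reading the b_dirty count back out of the bdi stats dump after the umount and
fsync()) might look roughly like the sketch below. It assumes the stats text
has already been read into a NUL-terminated buffer and that the count sits on
a line starting with "b_dirty:"; the function name parse_b_dirty is made up
for illustration, and the label should be adjusted to whatever the kernel
under test actually prints. On mainline kernels the per-bdi stats files
normally sit under the bdi/ subdirectory of debugfs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the number on the "b_dirty:" line of a bdi stats dump, or -1 if
 * no such line (or no number on it) is found.  The "b_dirty:" label is an
 * assumption about the stats layout, not something guaranteed by an ABI.
 */
static long parse_b_dirty(const char *stats_text)
{
	static const char key[] = "b_dirty:";
	const char *line = stats_text;

	assert(stats_text != NULL);
	while (line && *line) {
		if (strncmp(line, key, sizeof(key) - 1) == 0) {
			const char *val = line + sizeof(key) - 1;
			char *end;
			long n = strtol(val, &end, 10);

			/* end == val means strtol() found no digits */
			return end == val ? -1 : n;
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}
```

A caller would read the stats file into a buffer after the umount and pass it
to parse_b_dirty(); a non-zero result is the situation described above.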

Testing the application on top of a newer kernel is a bit of a challenge 
as other parts of the system have yet to be forward ported from the 3.4 
kernel, but I'll try to come up with a test case that shows the issue.  
In the meantime, is anyone aware of any umount()/sync related issues that 
might be affecting ext4 in 3.4.83?  Thanks in advance for any ideas on 
how to track this down.  Cheers,

		-ben
-- 
"Thought is the essence of where you are now."

* Re: bdi has dirty inode after umount of ext4 fs in 3.4.83
  2014-03-21 15:25 bdi has dirty inode after umount of ext4 fs in 3.4.83 Benjamin LaHaise
@ 2014-03-23 13:14 ` Jan Kara
  2014-03-25 22:09   ` Benjamin LaHaise
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2014-03-23 13:14 UTC
  To: Benjamin LaHaise; +Cc: Alexander Viro, linux-fsdevel, linux-ext4

On Fri 21-03-14 11:25:41, Benjamin LaHaise wrote:
  Hello,

> After adding some debugging code in an application to check for dirty 
> buffers on a bdi after umount, I'm seeing instances where b_dirty has 
> exactly 1 dirty inode listed on a 3.4.83 kernel after umount() of a 
> filesystem.  Roughly what the application does is to umount an ext3 
> filesystem (using the ext4 codebase), perform an fsync() of the block 
> device, then check the bdi stats in /sys/kernel/debug/252:4/stats (this 
> is a dm partition on top of a dm multipath device for an FC LUN).  I've 
> found that if I add a sync() call instead of the fsync(), the b_dirty 
> count usually drops to 0, but not always.  I've added some debugging 
> code to the bdi stats dump, and the inode on the b_dirty list shows up as:
> 
> 	inode=ffff88081beaada0, i_ino=0, i_nlink=1 i_sb=ffff88083c03e400
> 	i_state=0x00000004 i_data.nrpages=4 i_count=3
> 	i_sb->s_dev=0x00000002
> 
> The fact that the inode number is 0 looks very odd.
  So the dirty inode is almost certainly a block device inode. Another clue
is that fsync(2) actually doesn't clean inode dirty state (especially not
for block device inodes since that inode is a special one and fs usually
doesn't get to inspecting it). sync(2) does in general clear inode dirty
state because that's handled by flusher thread. However if ->sync_fs()
dirties the block device inode, subsequent sync_blockdev() call only writes
the data but doesn't clean the inode state. So even with sync(2) it can
happen the block device inode remains dirty.
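
The fields in the quoted dump can be decoded by hand, which is one way to see
why a block device inode is the likely candidate. The sketch below assumes the
low i_state bit values from include/linux/fs.h in 3.x kernels and the
in-kernel dev_t layout (12-bit major above a 20-bit minor); the KDEV_* macros
and describe_dirty_bits() are names invented for this example. Under those
assumptions i_state=0x00000004 is I_DIRTY_PAGES alone, and s_dev=0x00000002
is device 0:2. Major 0 is the range used for anonymous superblocks, which
would be consistent with the inode living on the internal bdev
pseudo-filesystem rather than on the ext4 superblock that was unmounted.

```c
#include <stdio.h>

/* Low i_state bits, assumed to match include/linux/fs.h in 3.x kernels. */
#define I_DIRTY_SYNC		(1u << 0)
#define I_DIRTY_DATASYNC	(1u << 1)
#define I_DIRTY_PAGES		(1u << 2)

/* In-kernel dev_t layout: major in the bits above a 20-bit minor. */
#define KDEV_MINORBITS	20
#define KDEV_MAJOR(dev)	((unsigned int)((dev) >> KDEV_MINORBITS))
#define KDEV_MINOR(dev)	((unsigned int)((dev) & ((1u << KDEV_MINORBITS) - 1)))

/* Write the names of the dirty bits set in state into buf, '|'-separated. */
static void describe_dirty_bits(unsigned int state, char *buf, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} bits[] = {
		{ I_DIRTY_SYNC,     "I_DIRTY_SYNC" },
		{ I_DIRTY_DATASYNC, "I_DIRTY_DATASYNC" },
		{ I_DIRTY_PAGES,    "I_DIRTY_PAGES" },
	};
	size_t used = 0;
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
		int n;

		if (!(state & bits[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of space: keep what fits */
		used += (size_t)n;
	}
}
```

Calling describe_dirty_bits(0x00000004, buf, sizeof(buf)) fills buf with the
single name I_DIRTY_PAGES, and KDEV_MAJOR(0x2) / KDEV_MINOR(0x2) give 0 and 2.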

In general inode dirty state isn't reliable. I_DIRTY_PAGES can be set when
inode is in fact clean. You have to use mapping_tagged(inode->i_mapping,
PAGECACHE_TAG_DIRTY) to determine whether the inode has actually any dirty
data.
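
To make the distinction concrete, here is a deliberately simplified user-space
model, not kernel code. The toy_* names are invented, a plain counter stands
in for the PAGECACHE_TAG_DIRTY page tags, and the two writeout functions only
mimic the behaviour described above: a data-only writeout in the style of
sync_blockdev() cleans the pages but leaves the inode flag alone, while the
flusher path clears the flag as well.

```c
#include <stdbool.h>

#define TOY_I_DIRTY_PAGES	(1u << 2)

struct toy_inode {
	unsigned int i_state;		/* lazily maintained dirty flag */
	unsigned int dirty_pages;	/* stands in for the dirty page tags */
};

/* Dirtying a page tags the page and marks the inode dirty. */
static void toy_dirty_page(struct toy_inode *inode)
{
	inode->dirty_pages++;
	inode->i_state |= TOY_I_DIRTY_PAGES;
}

/* Data-only writeout: pages are written, the inode flag is not touched. */
static void toy_write_pages_only(struct toy_inode *inode)
{
	inode->dirty_pages = 0;
}

/* Flusher-style writeback: clears the inode flag and writes the pages. */
static void toy_flusher_writeback(struct toy_inode *inode)
{
	inode->i_state &= ~TOY_I_DIRTY_PAGES;
	inode->dirty_pages = 0;
}

/* What a b_dirty-style check sees: only the flag. */
static bool toy_flag_says_dirty(const struct toy_inode *inode)
{
	return (inode->i_state & TOY_I_DIRTY_PAGES) != 0;
}

/* What a mapping_tagged()-style check sees: the actual dirty pages. */
static bool toy_has_dirty_data(const struct toy_inode *inode)
{
	return inode->dirty_pages != 0;
}
```

After toy_dirty_page() followed by toy_write_pages_only(), the flag check
still reports the inode as dirty while the page check reports it clean, which
is the state seen on the b_dirty list here.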

> Testing the application on top of a newer kernel is a bit of a challenge 
> as other parts of the system have yet to be forward ported from the 3.4 
> kernel, but I'll try to come up with a test case that shows the issue.  
> In the meantime, is anyone aware of any umount()/sync related issues that 
> might be affecting ext4 in 3.4.83?  Thanks in advance for any ideas on 
> how to track this down.  Cheers,
  Newer kernels don't bring anything substantially new to the picture...

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: bdi has dirty inode after umount of ext4 fs in 3.4.83
  2014-03-23 13:14 ` Jan Kara
@ 2014-03-25 22:09   ` Benjamin LaHaise
  2014-03-26  5:32     ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin LaHaise @ 2014-03-25 22:09 UTC
  To: Jan Kara; +Cc: Alexander Viro, linux-fsdevel, linux-ext4

On Sun, Mar 23, 2014 at 02:14:16PM +0100, Jan Kara wrote:
>   So the dirty inode is almost certainly a block device inode. Another clue
> is that fsync(2) actually doesn't clean inode dirty state (especially not
> for block device inodes since that inode is a special one and fs usually
> doesn't get to inspecting it). sync(2) does in general clear inode dirty
> state because that's handled by flusher thread. However if ->sync_fs()
> dirties the block device inode, subsequent sync_blockdev() call only writes
> the data but doesn't clean the inode state. So even with sync(2) it can
> happen the block device inode remains dirty.

> In general inode dirty state isn't reliable. I_DIRTY_PAGES can be set when
> inode is in fact clean. You have to use mapping_tagged(inode->i_mapping,
> PAGECACHE_TAG_DIRTY) to determine whether the inode has actually any dirty
> data.

That is indeed the case.  I checked the contents of the inode, and none of 
the buffers attached to that inode were dirty.

Is there any desire to fix this?  Seeing an inode on the b_dirty list that 
isn't really an inode that contains any data doesn't make a whole lot of 
sense.

		-ben
-- 
"Thought is the essence of where you are now."

* Re: bdi has dirty inode after umount of ext4 fs in 3.4.83
  2014-03-25 22:09   ` Benjamin LaHaise
@ 2014-03-26  5:32     ` Jan Kara
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Kara @ 2014-03-26  5:32 UTC
  To: Benjamin LaHaise; +Cc: Jan Kara, Alexander Viro, linux-fsdevel, linux-ext4

On Tue 25-03-14 18:09:38, Benjamin LaHaise wrote:
> On Sun, Mar 23, 2014 at 02:14:16PM +0100, Jan Kara wrote:
> >   So the dirty inode is almost certainly a block device inode. Another clue
> > is that fsync(2) actually doesn't clean inode dirty state (especially not
> > for block device inodes since that inode is a special one and fs usually
> > doesn't get to inspecting it). sync(2) does in general clear inode dirty
> > state because that's handled by flusher thread. However if ->sync_fs()
> > dirties the block device inode, subsequent sync_blockdev() call only writes
> > the data but doesn't clean the inode state. So even with sync(2) it can
> > happen the block device inode remains dirty.
> 
> > In general inode dirty state isn't reliable. I_DIRTY_PAGES can be set when
> > inode is in fact clean. You have to use mapping_tagged(inode->i_mapping,
> > PAGECACHE_TAG_DIRTY) to determine whether the inode has actually any dirty
> > data.
> 
> That is indeed the case.  I checked the contents of the inode, and none of 
> the buffers attached to that inode were dirty.
> 
> Is there any desire to fix this?  Seeing an inode on the b_dirty list that 
> isn't really an inode that contains any data doesn't make a whole lot of 
> sense.
  It doesn't make a lot of sense but this kind of lazy b_dirty management
allows us to avoid synchronization issues with flusher working in the inode
at the same time. So I'm not convinced we want to fix that...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
