linux-fsdevel.vger.kernel.org archive mirror
* inotify on mmap writes
@ 2023-03-22  1:50 Amol Dixit
  2023-03-22  7:15 ` Amir Goldstein
  2023-03-22 14:50 ` Matthew Wilcox
  0 siblings, 2 replies; 8+ messages in thread
From: Amol Dixit @ 2023-03-22  1:50 UTC (permalink / raw)
  To: linux-fsdevel

Hello,
Apologies if this has been discussed or clarified in the past.

The lack of file modification notification events (inotify, fanotify)
for mmap() regions is a big hole for anybody watching file changes from
userspace. I can imagine at least 2 reasons why that support may be
lacking, perhaps there are more:

1. mmap() writeback is async (unless msync/fsync triggered), driven by
file IO and page cache writeback mechanisms, unlike write system calls
that get funneled via the vfs layer, which is a convenient common place
to issue notifications. Now mm code would have to find a common ground
with filesystem/vfs, which is messy.

2. writepages, being an address-space op, is treated by each file
system independently. If mm did not want to get involved, onus would
be on each filesystem to make their .writepages handlers notification
aware. This is probably also considered not worth the trouble.
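
To make the distinction in point 1 concrete, here is a minimal userspace sketch, not kernel code. It modifies a file through a shared mapping, so the store is a plain memory write with no write() system call, and then forces writeback with flush(), which is Python's wrapper for msync(). The helper names write_through_mapping and demo are made up for this illustration.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, payload):
    """Modify a file via a shared mapping, then read it back with read()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the mapping is MAP_SHARED by default on Unix.
        with mmap.mmap(f.fileno(), 0) as m:
            m[: len(payload)] = payload  # plain memory store, no write() syscall
            m.flush()                    # msync(): request writeback now
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Create a 16-byte scratch file, overwrite its start through a mapping."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0" * 16)  # a mapping needs a non-empty file
        os.close(fd)
        return write_through_mapping(path, b"mmap")
    finally:
        os.unlink(path)
```

The file content changes even though the only write() call is the one that creates the initial 16 bytes, which is the path that the notification hooks in the vfs layer never see.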

So my question is, notwithstanding minor hurdles (like lost events,
hardlinks etc.), would the community like to extend inotify support
for mmap'ed writes to files? Under config options, would a fix on a
per filesystem basis be an acceptable solution (I can start with say
ext4 writepages linking back to inode/dentry and firing a
notification)?

Eventually we will have larger support across the board and
inotify/fanotify can be a reliable tracking mechanism for
modifications to files.

Thank you,
Amol


* Re: inotify on mmap writes
  2023-03-22  1:50 inotify on mmap writes Amol Dixit
@ 2023-03-22  7:15 ` Amir Goldstein
  2023-03-22 19:43   ` Amol Dixit
  2023-03-22 14:50 ` Matthew Wilcox
  1 sibling, 1 reply; 8+ messages in thread
From: Amir Goldstein @ 2023-03-22  7:15 UTC (permalink / raw)
  To: Amol Dixit; +Cc: linux-fsdevel

On Wed, Mar 22, 2023 at 4:13 AM Amol Dixit <amoldd@gmail.com> wrote:
>
> Hello,
> Apologies if this has been discussed or clarified in the past.
>
> The lack of file modification notification events (inotify, fanotify)
> for mmap() regions is a big hole to anybody watching file changes from
> userspace. I can imagine at least 2 reasons why that support may be
> lacking, perhaps there are more:
>
> 1. mmap() writeback is async (unless msync/fsync triggered) driven by
> file IO and page cache writeback mechanisms, unlike write system calls
> that get funneled via the vfs layer, which is a convenient common place
> to issue notifications. Now mm code would have to find a common ground
> with filesystem/vfs, which is messy.
>
> 2. writepages, being an address-space op is treated by each file
> system independently. If mm did not want to get involved, onus would
> be on each filesystem to make their .writepages handlers notification
> aware. This is probably also considered not worth the trouble.
>
> So my question is, notwithstanding minor hurdles (like lost events,
> hardlinks etc.), would the community like to extend inotify support
> for mmap'ed writes to files? Under configs options, would a fix on a
> per filesystem basis be an acceptable solution (I can start with say
> ext4 writepages linking back to inode/dentry and firing a
> notification)?
>
> Eventually we will have larger support across the board and
> inotify/fanotify can be a reliable tracking mechanism for
> modifications to files.
>

What is the use case?
Would it be sufficient if you had an OPEN_WRITE event?
Or if the OPEN event carried the O_ flags as extra info?
I have a patch for the above and I personally find this information
missing from OPEN events.

Are you trying to monitor mmap() calls? Writes to an mmaped area?
Because writepages() will get you neither of these.

Please specify the use case and we will work out what can be done
from there.

Thanks,
Amir.


* Re: inotify on mmap writes
  2023-03-22  1:50 inotify on mmap writes Amol Dixit
  2023-03-22  7:15 ` Amir Goldstein
@ 2023-03-22 14:50 ` Matthew Wilcox
  2023-03-22 19:13   ` Amol Dixit
  1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2023-03-22 14:50 UTC (permalink / raw)
  To: Amol Dixit; +Cc: linux-fsdevel

On Tue, Mar 21, 2023 at 06:50:14PM -0700, Amol Dixit wrote:
> The lack of file modification notification events (inotify, fanotify)
> for mmap() regions is a big hole to anybody watching file changes from
> userspace. I can imagine at least 2 reasons why that support may be
> lacking, perhaps there are more:
>
> 1. mmap() writeback is async (unless msync/fsync triggered) driven by
> file IO and page cache writeback mechanisms, unlike write system calls
> that get funneled via the vfs layer, which is a convenient common place
> to issue notifications. Now mm code would have to find a common ground
> with filesystem/vfs, which is messy.
> 
> 2. writepages, being an address-space op is treated by each file
> system independently. If mm did not want to get involved, onus would
> be on each filesystem to make their .writepages handlers notification
> aware. This is probably also considered not worth the trouble.
> 
> So my question is, notwithstanding minor hurdles (like lost events,
> hardlinks etc.), would the community like to extend inotify support
> for mmap'ed writes to files? Under configs options, would a fix on a
> per filesystem basis be an acceptable solution (I can start with say
> ext4 writepages linking back to inode/dentry and firing a
> notification)?

I don't understand why you think writepages is the right place for
monitoring.  That tells you when data is leaving the page cache, not
when the file is modified.  If you want to know when the file is
modified, you need to hook into the page_mkwrite path.


* Re: inotify on mmap writes
  2023-03-22 14:50 ` Matthew Wilcox
@ 2023-03-22 19:13   ` Amol Dixit
  0 siblings, 0 replies; 8+ messages in thread
From: Amol Dixit @ 2023-03-22 19:13 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel

Thanks for the correction. It has been a while for me and I thought
writepages may be the common denominator (vague memory of reads
blocking on DIO writeback with writepages caused me to believe it may
be the path for mmap)... most certainly it is ext4_page_mkwrite that
does get_block and ends up with ll_rw_block(), so I stand
corrected.

Back to my point of tracking mmap writes to any region of a file. The
use case is to build on top of inotify to know dirty files and take
action as the app would wish - so this is more about completion of the
API to track all writes. Just saw patches in the works for splice() go
by, also in the same vein.

Ideally I would want to take the inotify interface further, by
returning offset/length information in the MODIFY event to further
assist user applications built on this interface. For vfs originated
events this would be at byte granularity, while for mmap originated
events this may be at page granularity - in any case this would be
valuable information surfaced up from the depths of the filesystem up
to userspace.

Thanks,
Amol




* Re: inotify on mmap writes
  2023-03-22  7:15 ` Amir Goldstein
@ 2023-03-22 19:43   ` Amol Dixit
  2023-03-22 21:12     ` Amir Goldstein
  0 siblings, 1 reply; 8+ messages in thread
From: Amol Dixit @ 2023-03-22 19:43 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: linux-fsdevel

On Wed, Mar 22, 2023 at 12:16 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 4:13 AM Amol Dixit <amoldd@gmail.com> wrote:
> >
> > Hello,
> > Apologies if this has been discussed or clarified in the past.
> >
> > The lack of file modification notification events (inotify, fanotify)
> > for mmap() regions is a big hole to anybody watching file changes from
> > userspace. I can imagine at least 2 reasons why that support may be
> > lacking, perhaps there are more:
> >
> > 1. mmap() writeback is async (unless msync/fsync triggered) driven by
> > file IO and page cache writeback mechanisms, unlike write system calls
> > that get funneled via the vfs layer, which is a convenient common place
> > to issue notifications. Now mm code would have to find a common ground
> > with filesystem/vfs, which is messy.
> >
> > 2. writepages, being an address-space op is treated by each file
> > system independently. If mm did not want to get involved, onus would
> > be on each filesystem to make their .writepages handlers notification
> > aware. This is probably also considered not worth the trouble.
> >
> > So my question is, notwithstanding minor hurdles (like lost events,
> > hardlinks etc.), would the community like to extend inotify support
> > for mmap'ed writes to files? Under configs options, would a fix on a
> > per filesystem basis be an acceptable solution (I can start with say
> > ext4 writepages linking back to inode/dentry and firing a
> > notification)?
> >
> > Eventually we will have larger support across the board and
> > inotify/fanotify can be a reliable tracking mechanism for
> > modifications to files.
> >
>
> What is the use case?
> Would it be sufficient if you had an OPEN_WRITE event?
> or if OPEN event had the O_ flags as an extra info to the event?
> I have a patch for the above and I personally find this information
> missing from OPEN events.
>
> Are you trying to monitor mmap() calls? write to an mmaped area?
> because writepages() will get you neither of these.

OPEN events are not useful to track file modifications in real time,
although I do see the usefulness of OPEN_WRITE events to track
files that can change.

I am trying to track writes to an mmaped area (as these are not notified
using inotify events). I wanted to ask the community about the
feasibility and usefulness of this. I had some design ideas for
tracking writes (using jbd commit callbacks for instance) in the
kernel, but to make it generic, sprucing up the inotify interface is a
much better approach.

Hope that provides some context.
Thanks,
Amol


* Re: inotify on mmap writes
  2023-03-22 19:43   ` Amol Dixit
@ 2023-03-22 21:12     ` Amir Goldstein
  2023-03-22 22:13       ` Amol Dixit
  0 siblings, 1 reply; 8+ messages in thread
From: Amir Goldstein @ 2023-03-22 21:12 UTC (permalink / raw)
  To: Amol Dixit; +Cc: linux-fsdevel, Jan Kara

On Wed, Mar 22, 2023 at 9:43 PM Amol Dixit <amoldd@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 12:16 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Wed, Mar 22, 2023 at 4:13 AM Amol Dixit <amoldd@gmail.com> wrote:
> > >
> > > Hello,
> > > Apologies if this has been discussed or clarified in the past.
> > >
> > > The lack of file modification notification events (inotify, fanotify)
> > > for mmap() regions is a big hole to anybody watching file changes from
> > > userspace. I can imagine at least 2 reasons why that support may be
> > > lacking, perhaps there are more:
> > >
> > > 1. mmap() writeback is async (unless msync/fsync triggered) driven by
> > > file IO and page cache writeback mechanisms, unlike write system calls
> > > that get funneled via the vfs layer, which is a convenient common place
> > > to issue notifications. Now mm code would have to find a common ground
> > > with filesystem/vfs, which is messy.
> > >
> > > 2. writepages, being an address-space op is treated by each file
> > > system independently. If mm did not want to get involved, onus would
> > > be on each filesystem to make their .writepages handlers notification
> > > aware. This is probably also considered not worth the trouble.
> > >
> > > So my question is, notwithstanding minor hurdles (like lost events,
> > > hardlinks etc.), would the community like to extend inotify support
> > > for mmap'ed writes to files? Under configs options, would a fix on a
> > > per filesystem basis be an acceptable solution (I can start with say
> > > ext4 writepages linking back to inode/dentry and firing a
> > > notification)?
> > >
> > > Eventually we will have larger support across the board and
> > > inotify/fanotify can be a reliable tracking mechanism for
> > > modifications to files.
> > >
> >
> > What is the use case?
> > Would it be sufficient if you had an OPEN_WRITE event?
> > or if OPEN event had the O_ flags as an extra info to the event?
> > I have a patch for the above and I personally find this information
> > missing from OPEN events.
> >
> > Are you trying to monitor mmap() calls? write to an mmaped area?
> > because writepages() will get you neither of these.
>
> OPEN events are not useful to track file modifications in real time,
> although I can do see the usefulness of OPEN_WRITE events to track
> files that can change.
>
> I am trying to track writes to mmaped area (as these are not notified
> using inotify events). I wanted to ask the community of the
> feasibility and usefulness of this. I had some design ideas of
> tracking writes (using jbd commit callbacks for instance) in the
> kernel, but to make it generic sprucing up the inotify interface is a
> much better approach.
>
> Hope that provides some context.

Not enough.

Consider a file mmaped writable by a process that keeps writing
to that mapped memory for a long time.

What do you expect to happen?
How many events?
On first time write to a page? To the memory region?
When dirty memory is written back to disk?

You have mixed a lot of different things in your question.
You need to be more specific about what the purpose
of this monitoring is.

From all of the above, only MODIFY on mmap() call
seems reasonable to me and MODIFY on first write to
an mmaped area is something that we can consider if
there is very good justification.

FYI, the existing MODIFY events are from after the
write system call modified the page cache and there is
no guarantee about when writeback to disk is done.

Thanks,
Amir.


* Re: inotify on mmap writes
  2023-03-22 21:12     ` Amir Goldstein
@ 2023-03-22 22:13       ` Amol Dixit
  2023-03-23  6:35         ` Amir Goldstein
  0 siblings, 1 reply; 8+ messages in thread
From: Amol Dixit @ 2023-03-22 22:13 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: linux-fsdevel, Jan Kara

Thank you Amir for taking the time. I will take another stab at the motivation.

Say I am writing an efficient real time file backup application, and
monitoring changes to certain files. The best rsync can do is to chunk
and checksum and gather delta regions to transfer. What if, through
inotify, the application were alerted of the precise extents written to a
certain file? This would take the form of <logical file offset,
length> tuples in the metadata attached to each MODIFY event. That
should be easily possible (just like we add file names to CREATE
events). For mmaped regions 'length' would be in page granularity
since the kernel wouldn't know precise regions written within a given
page.
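
As a rough illustration of the page-granularity point, the sketch below widens a byte extent to the pages it touches. The name page_align and the 4096-byte default are assumptions for this example only (mmap.PAGESIZE reports the real value on a given system). Nothing here is an existing inotify or fanotify interface.

```python
from typing import Tuple

PAGE_SIZE = 4096  # assumed page size for the example


def page_align(offset: int, length: int, page_size: int = PAGE_SIZE) -> Tuple[int, int]:
    """Widen the byte extent <offset, length> to whole pages."""
    if offset < 0 or length <= 0:
        raise ValueError("extent must have offset >= 0 and length > 0")
    start = (offset // page_size) * page_size                      # round down
    end = ((offset + length + page_size - 1) // page_size) * page_size  # round up
    return start, end - start
```

With these assumptions a 20-byte write at offset 100 would be reported as <0, 4096>, and a 10-byte write that straddles the first page boundary would be reported as <0, 8192>.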

> What do you expect to happen?
Notifications can be collapsed until they are read. So if first IO is
<0, 20> and second IO is <20, 20>, then the event can be collapsed
in-place to read <0, 40>. If they are not contiguous, say second IO is
<30, 10>, then we will have 2 extent entries in the metadata of MODIFY
event - <0, 20> and <30, 10>, and so on.
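
A minimal sketch of the collapsing rule described above, in plain Python rather than kernel code. The function name add_extent is hypothetical. Extents are <offset, length> tuples, and adjacent extents are treated as contiguous, as in the <0, 20> plus <20, 20> example.

```python
def add_extent(extents, offset, length):
    """Return sorted, non-overlapping extents after recording one more write."""
    new_start, new_end = offset, offset + length
    kept = []
    for start, size in sorted(extents):
        end = start + size
        if end < new_start or start > new_end:
            kept.append((start, size))  # disjoint from the new write, keep as is
        else:
            # Overlapping or touching: absorb into the new extent.
            new_start = min(new_start, start)
            new_end = max(new_end, end)
    kept.append((new_start, new_end - new_start))
    return sorted(kept)
```

Starting from [(0, 20)], recording <20, 20> gives [(0, 40)], while recording <30, 10> instead gives [(0, 20), (30, 10)].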

> How many events?
Events are always opportunistic. If there are too many events of the
same kind, a generic "Too many changes" event is enough (CIFS change
notification has something similar) to alert the reader.
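
One way to picture the "Too many changes" idea is a bounded queue that appends a single overflow marker and then drops further events until the reader drains it. inotify already behaves in a similar way when its queue fills up (IN_Q_OVERFLOW). The class and the marker name below are hypothetical and only model the behaviour.

```python
from collections import deque

OVERFLOW = "TOO_MANY_CHANGES"  # hypothetical marker, not a real event name


class BoundedEventQueue:
    """Queue that degrades to one overflow marker instead of growing."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.overflowed = False

    def push(self, event):
        if self.overflowed:
            return  # the reader has already been told to rescan
        if len(self.events) >= self.capacity:
            self.events.append(OVERFLOW)
            self.overflowed = True
            return
        self.events.append(event)

    def drain(self):
        """Hand all queued events to the reader and re-arm the queue."""
        out = list(self.events)
        self.events.clear()
        self.overflowed = False
        return out
```

A reader that sees the marker cannot trust the extent list any more and would fall back to a full comparison of the file.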

> On first time write to a page?
Doesn't help ongoing activity tracking.

> To the memory region?
As much precision as possible for offsets and lengths is nice to have.

> When dirty memory is written back to disk?
Events are more like hints (as you said they do not guarantee
writeback to disk anyway). Applications will do their own integrity
checks on top of these hints.

With precise written extents available in userspace, the backup
application is very happy to just incrementally backup written extents
at byte granularity (or page granularity for mmaped events).

Amol






* Re: inotify on mmap writes
  2023-03-22 22:13       ` Amol Dixit
@ 2023-03-23  6:35         ` Amir Goldstein
  0 siblings, 0 replies; 8+ messages in thread
From: Amir Goldstein @ 2023-03-23  6:35 UTC (permalink / raw)
  To: Amol Dixit; +Cc: linux-fsdevel, Jan Kara



On Thu, Mar 23, 2023 at 12:13 AM Amol Dixit <amoldd@gmail.com> wrote:
>
> Thank you Amir for taking the time. I will take another stab at the motivation.

Please do not "top post" on fsdevel discussions.

>
> Say I am writing an efficient real time file backup application, and
> monitoring changes to certain files. The best rsync can do is to chunk
> and checksum and gather delta regions to transfer. What if, through
> inotify, the application is alerted of precise extents written to a
> certain file. This would take the form of <logical file offset,
> length> tuples in the metadata attached with each MODIFY event. That
> should be easily possible (just like we add file names to CREATE
> events). For mmaped regions 'length' would be in page granularity
> since the kernel wouldn't know precise regions written within a given
> page.
>
> > What do you expect to happen?
> Notifications can be collapsed until they are read. So if first IO is
> <0, 20> and second IO is <20, 20>, then the event can be collapsed
> in-place to read <0, 40>. If they are not contiguous, say second IO is

That can be done.
I already have patches for FAN_EVENT_INFO_TYPE_RANGE.

> <30, 10>, then we will have 2 extent entries in the metadata of MODIFY
> event - <0, 20> and <30, 10>, and so on.
>

That seems like overkill.
An event covering more than a single extent could just drop the
granular range info.
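
A small sketch of that simpler policy, under the assumption that a pending event carries at most one range: contiguous writes widen the range, and the first non-contiguous write drops the range info entirely. The names record_write and WHOLE_FILE are made up for this example, and nothing here reflects the actual layout of the FAN_EVENT_INFO_TYPE_RANGE patches.

```python
WHOLE_FILE = "whole-file"  # hypothetical sentinel: range info was dropped


def record_write(pending, offset, length):
    """Fold one write into a pending event that holds at most one range.

    pending is None (no event queued), an (offset, length) tuple,
    or WHOLE_FILE once the granular info has been dropped.
    """
    if pending is None:
        return (offset, length)
    if pending == WHOLE_FILE:
        return WHOLE_FILE
    start, size = pending
    end = start + size
    new_end = offset + length
    if offset > end or new_end < start:
        return WHOLE_FILE  # a second extent would be needed, so degrade
    lo = min(start, offset)
    return (lo, max(end, new_end) - lo)
```

The reader then either gets one precise range or knows it has to treat the whole file as modified.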

> > How many events?
> Events are always opportunistic. If too many events of the same kind,
> a generic "Too many changes" event is enough (CIFS change notification
> has something similar) to alert the reader.
>
> > On first time write to a page?
> Doesn't help ongoing activity tracking.
>
> > To the memory region?
> Precision as much as possible for offsets and lengths is nice to have.
>
> > When dirty memory is written back to disk?
> Events are more like hints (as you said they do not guarantee
> writeback to disk anyway). Applications will do their own integrity
> checks on top of these hints.
>

Hints, yes, but events do need to guarantee that a change is
not missed, so in the context of mmaped memory writes that
means that after the event is consumed by the application or after
the application reads the file content, PTEs may need to be set up to
trigger a new event on the next write.

Doing that at page level seems like unacceptable overkill
for the use case of backup applications.

Perhaps a more feasible option is to generate an event when
an inode or mapping changes state to "dirty pages"; then the backup
application needs to do:

1. consume pending MODIFY events on file
2. call fdatasync()/msync()/sync_file_range()
3. read content of file to backup

And then we should be able to provide a guarantee
that if there is any write after #2 returned success,
a new MODIFY event will be generated.
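
The three steps and the guarantee can be modelled with a toy state machine. This is only a userspace model of the proposed semantics, not kernel code: TrackedFile, backup_once and the "WRITE" event string are hypothetical names, and sync() stands in for fdatasync()/msync()/sync_file_range().

```python
class TrackedFile:
    """Toy model: an event fires only on the clean -> dirty transition."""

    def __init__(self):
        self.dirty = False
        self.events = []
        self.content = b""

    def write(self, data):
        if not self.dirty:
            self.dirty = True
            self.events.append("WRITE")  # hypothetical event name
        self.content += data

    def sync(self):
        # Writeback finished: the next write re-arms the event.
        self.dirty = False


def backup_once(f):
    """Run the three steps in order and return the snapshot to back up."""
    f.events.clear()   # 1. consume pending events
    f.sync()           # 2. flush dirty state (fdatasync/msync/sync_file_range)
    return f.content   # 3. read the content to back up
```

In this model a write that lands between steps 1 and 2 produces no event, but it is still picked up by the read in step 3, and any write after step 2 finds the file clean and therefore queues a new event.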

We should probably make this a new event (e.g. FAN_WRITE)
because it has different semantics than FAN_MODIFY and it can also
be useful for the non-mmapped write use case.

None of this is going to be simple though, so to answer your
original questions:

> So my question is, notwithstanding minor hurdles (like lost events,
> hardlinks etc.), would the community like to extend inotify support
> for mmap'ed writes to files?

If you are willing to do the work and you can prove that it does not
hurt performance of any existing workload when the new feature
is not in use, I think it would be a nice improvement.

> Under configs options,

No config options please.
If you cannot make it work without hurting performance, no go.

> would a fix on a per filesystem basis be an acceptable solution
> (I can start with say ext4 writepages linking back to inode/dentry
> and firing a notification)?

Solution should be generic in vfs.
It is possible that this will not be supported for all filesystems,
but only on some filesystems that implement some vfs operation
or opt-in with some fs flag, but not a fs specific implementation.

Thanks,
Amir.
