Proposal for map refresh logic


* Proposal for map refresh logic
@ 2004-05-21 14:31 raven
  2004-05-21 16:44 ` Jim Carter
  0 siblings, 1 reply; 15+ messages in thread
From: raven @ 2004-05-21 14:31 UTC (permalink / raw)
  To: autofs mailing list

Hi all,

Here are my thoughts on the automatic map refresh issue.

A few things to keep in mind are:

1. Re-reading a map also removes stale entries unless they are
   mounted. In this case they would not be removed until the
   refresh following their umount.

2. Using the success or failure of the mount operation as a
   hint is unreliable: at best it leads to unwanted map
   refreshes, and at worst to a map reading storm.

3. For file maps we cannot read a single entry so instead
   the lookup of an individual entry amounts to checking the
   map file's modification time. This value will be recorded
   when the map is read.

4. We need to take advantage of hints that the map may have
   changed frequently in order to catch updated maps without
   a large time lag.

5. The HUP signal handling will remain to allow map refresh
   on demand.
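
As a rough illustration of point 3, here is a minimal sketch in Python (used for brevity; autofs itself is C). The class `FileMap` and its method names are hypothetical and not taken from the autofs source. It assumes the only per-entry "lookup" available for a file map is comparing the file's current modification time with the one recorded at the last read.

```python
import os


class FileMap:
    """Tracks a file map's modification time to detect changes (hypothetical)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.mtime_ns = None   # recorded when the map is read

    def read(self) -> list:
        # Record the mtime before reading so that a write racing with the
        # read shows up as a change on the next check.
        self.mtime_ns = os.stat(self.path).st_mtime_ns
        with open(self.path) as f:
            return [line.rstrip("\n") for line in f if line.strip()]

    def changed(self) -> bool:
        """Stand-in for a single-entry lookup: has the file changed since read()?"""
        return os.stat(self.path).st_mtime_ns != self.mtime_ns
```

A caller would invoke `changed()` wherever the scheme below says "look up the individual map entry" and re-read the file only when it returns true.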

There are two distinct cases to consider:

1. A map lookup results in a cache hit.
2. A map lookup results in a cache miss.

For a cache hit we can:

1. Check if the map entry has exceeded a given time to live, say
   the map expire time.
2. If not, continue as normal; otherwise look up the individual map
   entry and compare it to the cached entry.
3. If they are identical, continue as normal; otherwise it's a hint
   that the map has changed, so re-read it.

For a cache miss we can:

1. Look up the individual map entry.
2. If this fails, treat it as a real failure and continue.
3. If it succeeds, re-read the entire map as it must be out of date.

This scheme avoids using the mount result as a hint.
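
The hit and miss cases above can be sketched roughly as follows, in Python for brevity (autofs itself is C). Everything here is illustrative: `MapCache`, `Entry`, `lookup_one` and `read_all` are hypothetical names, refreshing an entry's timestamp after a successful comparison is an assumption, and the sketch ignores the rule that mounted entries survive a re-read.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    value: str        # map entry text as read from the map
    read_at: float    # when this entry was last read or verified


@dataclass
class MapCache:
    # Hypothetical callbacks standing in for a map source (file, NIS, LDAP).
    lookup_one: Callable[[str], Optional[str]]   # fetch one key, None if absent
    read_all: Callable[[], Dict[str, str]]       # enumerate the whole map
    ttl: float                                   # e.g. the map expire time
    entries: Dict[str, Entry] = field(default_factory=dict)

    def reread(self, now: float) -> None:
        self.entries = {k: Entry(v, now) for k, v in self.read_all().items()}

    def lookup(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        cached = self.entries.get(key)
        if cached is not None:                      # cache hit
            if now - cached.read_at <= self.ttl:
                return cached.value                 # within TTL: continue as normal
            if self.lookup_one(key) == cached.value:
                cached.read_at = now                # unchanged: continue as normal
                return cached.value
            self.reread(now)                        # differs: hint that the map changed
        else:                                       # cache miss
            if self.lookup_one(key) is None:
                return None                         # real failure
            self.reread(now)                        # key exists: cache is out of date
        entry = self.entries.get(key)
        return entry.value if entry else None
```

Note that the mount result never appears in this flow; the only triggers for a full re-read are a changed entry after TTL expiry or a key that exists in the map but not in the cache.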

It should give a reasonably frequent refresh interval particularly for 
map entries that change often.

One further consideration remains.

Should a time based (say 6-10 times the expire timeout) map refresh be 
added to the above logic?

I would prefer not, as I'm not sure that it gets us much more than the
above scheme. The main point is that the map will be out of date for
an extended amount of time anyway, and would likely be refreshed before
the timeout if it has changed and is in frequent use.

Comments please.

Ian


* Re: Proposal for map refresh logic
  2004-05-21 14:31 Proposal for map refresh logic raven
@ 2004-05-21 16:44 ` Jim Carter
  2004-05-22  4:37   ` raven
                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jim Carter @ 2004-05-21 16:44 UTC (permalink / raw)
  To: raven; +Cc: autofs mailing list

On Fri, 21 May 2004 raven@themaw.net wrote:

> 1. Re-reading a map also removes stale entries unless they are
>    mounted. In this case they would not be removed until the
>    refresh following their umount.

Good.

> 2. Using the success or failure of the mount operation as a
>    hint is unreliable: at best it leads to unwanted map
>    refreshes, and at worst to a map reading storm.

If the algo below is followed, map rereads will be controlled by the TTL, 
avoiding reread storms.  However, there may be cases where it's hard to 
recognize a changed map. 

> 3. For file maps we cannot read a single entry so instead
>    the lookup of an individual entry amounts to checking the
>    map file's modification time. This value will be recorded
>    when the map is read.

Good.  This is certainly sufficient.  

> 4. We need to take advantage of hints that the map may have
>    changed frequently in order to catch updated maps without
>    a large time lag.

We need to frequently glim the hints... right?  Your point isn't about 
vacillating maps.  For updating the ghost mounts this is true, but for 
actual mounting it's irrelevant if the cache contains outdated info because 
of the TTL rule.  NIS maps have an "order number" which you can get if you 
know what key to ask for, rather than having to read the whole map.  I 
don't know about LDAP.  Suggestion for ghost mounts: only refresh the map 
when a userspace process does opendir on the containing directory, as in 
"ls /net/hostname".  In any other case, outdated or missing ghost mounts 
cannot be seen by userspace, so let sleeping dogs lie.

Question about ghost mounts: if you did "ls /net", you would like to see
all servers from which anything potentially could be mounted, but with a
wildcard map this is impossible, right?  But with explicitly listed
servers, there would be ghost mounts but no actual submount processes.
Also, server submounts will time out if unused, and the ghost mounts from
those servers will no longer be visible.  Am I right, that a submount with
all ghost mounts counts as idle, not in use, and will exit after the
timeout?  This is important.  (But if you subsequently do "ls /net/server"
a new submount will be forked, and it will populate itself with only ghost
mounts, right?)

> 5. The HUP signal handling will remain to allow map refresh
>    on demand.

Good.  This is important for setup and debugging.

> For a cache hit we can:
> 
> 1. Check if the map entry has exceeded a given time to live, say
>    the map expire time.
> 2. If not, continue as normal; otherwise look up the individual map
>    entry and compare it to the cached entry.
> 3. If they are identical, continue as normal; otherwise it's a hint
>    that the map has changed, so re-read it.

Suggestion: set a flag saying "we know that this map is out of date", but 
only re-read the whole thing when userspace expects to see up-to-date ghost 
mounts.  Understood, that to read any one row from a file map you have to 
read the whole thing; the suggestion makes a difference only for NIS and 
LDAP -- where it makes a big difference.

Brainwave: You could also set the TTL of every other cache row from that
map to an expired state, since some of them might have been read more
recently.  That's dumb, because we're trying to re-read the map row by row,
and we don't want to invalidate rows that came from the new version of the
map.  If each row had an associated order number or date, we could
recognize and purge truly obsolete rows.  Don't monkey with the TTL, toss 
them.  Hmm, for an order number you can use the date when the particular 
row was read, and if row X is seen to be wrong then all older rows are 
surely wrong too.  
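
A minimal sketch of that last idea, in Python for brevity (autofs itself is C). The cache layout and the function name `purge_older_than` are hypothetical; the read time of each row doubles as its "order number", and rows read at the same moment as the bad row are treated as suspect too, which is an assumption.

```python
from typing import Dict, Tuple

# key -> (value, read_at); read_at serves as the row's order number.
Cache = Dict[str, Tuple[str, float]]


def purge_older_than(cache: Cache, stale_key: str) -> Cache:
    """Drop stale_key and every row that was read no later than it."""
    cutoff = cache[stale_key][1]
    return {k: v for k, v in cache.items() if v[1] > cutoff}
```

Rows read after the bad row came from a newer view of the map and are kept, so a row-by-row refresh is not undone.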

> For a cache miss we can:
> 
> 1. Look up the individual map entry.
> 2. If this fails, treat it as a real failure and continue.
> 3. If it succeeds, re-read the entire map as it must be out of date.

Same comment applies.

> One further consideration remains.
> 
> Should a time based (say 6-10 times the expire timeout) map refresh be 
> added to the above logic?
> 
> I would prefer not...

I agree with your points on this.  Cached stale data doesn't hurt until you 
actually mount something or do "ls /net/server", and pre-re-reading will be
wasted effort 99% of the time.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)


* Re: Proposal for map refresh logic
  2004-05-21 16:44 ` Jim Carter
@ 2004-05-22  4:37   ` raven
  2004-05-23  3:19     ` Thorild Selen
  2004-05-24  0:24     ` Jim Carter
  2004-06-14 11:04   ` raven
  2004-06-14 14:48   ` raven
  2 siblings, 2 replies; 15+ messages in thread
From: raven @ 2004-05-22  4:37 UTC (permalink / raw)
  To: Jim Carter; +Cc: autofs mailing list

Hi Jim,

Much of what is here has been taken from your original suggestions. You 
probably noticed that.

I will have to digest your recent comments for a while. However, at the
risk of not answering accurately, I will respond to your suggestions
now anyway (as best I can).

On Fri, 21 May 2004, Jim Carter wrote:

> 
> > 4. We need to take advantage of hints that the map may have
> >    changed frequently in order to catch updated maps without
> >    a large time lag.
> 
> We need to frequently glim the hints... right?  Your point isn't about 
> vacillating maps.  For updating the ghost mounts this is true, but for 
> actual mounting it's irrelevant if the cache contains outdated info because 
> of the TTL rule.  NIS maps have an "order number" which you can get if you 
> know what key to ask for, rather than having to read the whole map.  I 
> don't know about LDAP.  Suggestion for ghost mounts: only refresh the map 
> when a userspace process does opendir on the containing directory, as in 
> "ls /net/hostname".  In any other case, outdated or missing ghost mounts 
> cannot be seen by userspace, so let sleeping dogs lie.

OK. I'm bound to misunderstand this but I'll try.

Please understand that wildcard map handling is a separate issue which I 
have also been thinking about over the past months. In fact I have had a 
patch for many months from Mark Faseh (think that's how it's spelt, I 
can't find a post by him) which I'm still not in a position to implement. 
I have however retained his kernel module changes to support it in my 
recent submission for 2.6.

We'll be discussing this at some point, when the bug fixes slow down and I 
have time to work on enhancements.

I'm aware of the NIS order number but don't have an idea of how this can 
be done for LDAP maps. I want the handling to remain as consistent as 
possible between map types. This will be an advantage when I add 
ghosting of nis+ maps.

You would be surprised how often the "opendir" equivalent in kernel space 
is called. Checking for a map re-read then means frequent callbacks from 
kernelspace, essentially from within an existing callback. This is likely 
to be a little nightmarish in terms of implementation complexity and bound 
to be full of races.

I feel fairly stubborn about needing to keep this to a userspace procedure.

> 
> Question about ghost mounts: if you did "ls /net", you would like to see
> all servers from which anything potentially could be mounted, but with a
> wildcard map this is impossible, right?  But with explicitly listed
> servers, there would be ghost mounts but no actual submount processes.
> Also, server submounts will time out if unused, and the ghost mounts from
> those servers will no longer be visible.  Am I right, that a submount with
> all ghost mounts counts as idle, not in use, and will exit after the
> timeout?  This is important.  (But if you subsequently do "ls /net/server"
> a new submount will be forked, and it will populate itself with only ghost
> mounts, right?)

Umm ... little bit unclear but ...

This also relates to another issue on the burner. Lazy mounting of host 
(or multi-mount) maps. This will be dealt with in 4.2.0, I hope. I have a 
reasonably clear plan for this but am still pondering the details.

The behaviour of the mounts you refer to should not change with
this proposal because they don't currently support ghosting.

The enumeration of exports for host mounts is also part of Mark's patch,
some of which is included in autofs now but remains unused. And yes, hosts
and exports will need to be enumerated in some way to provide a ghosted
directory tree. But please let's not pursue this right now. I'm sure that
when this functionality is being implemented you will have many valuable
contributions to refining its behaviour.

The host type multi-mounts you mention are still handled "as a single
entry", and hence do not allow ghosting yet (4.2.0). So no, "ls
/net/server" will not populate itself with ghosted mounts but with
actual mounts, due to the "as a single entry" behaviour.

> 
> > For a cache hit we can:
> > 
> > 1. Check if the map entry has exceeded a given time to live, say
> >    the map expire time.
> > 2. If not, continue as normal; otherwise look up the individual map
> >    entry and compare it to the cached entry.
> > 3. If they are identical, continue as normal; otherwise it's a hint
> >    that the map has changed, so re-read it.
> 
> Suggestion: set a flag saying "we know that this map is out of date", but 
> only re-read the whole thing when userspace expects to see up-to-date ghost 
> mounts.  Understood, that to read any one row from a file map you have to 
> read the whole thing; the suggestion makes a difference only for NIS and 
> LDAP -- where it makes a big difference.

The flag already exists. It's part of each cache entry. It is set to the 
time a map is read. It's used to clean stale entries now.
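
For readers not on top of the code either, the idea of a per-entry timestamp used to clean stale entries can be sketched like this, in Python for brevity. This is not the actual autofs code: `CacheEntry`, `refresh` and the `mounted` flag are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CacheEntry:
    value: str
    age: float            # time of the map read that last saw this entry
    mounted: bool = False


def refresh(cache: Dict[str, CacheEntry], fresh: Dict[str, str], now: float) -> None:
    """Merge a full map read into the cache, then drop stale entries."""
    for key, value in fresh.items():
        entry = cache.get(key)
        if entry is None:
            cache[key] = CacheEntry(value, now)
        else:
            entry.value, entry.age = value, now
    # Entries not stamped by this read are stale. Mounted ones are kept
    # and only go away on a refresh after their umount.
    for key in [k for k, e in cache.items() if e.age < now and not e.mounted]:
        del cache[key]
```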

So you are saying we only set a flag when the tests above indicate a map 
is out of date. But how do we know when user space expects to see an 
up-to-date map other than a changed map entry? So using a TTL will give 
us the opportunity to check for an outdated map entry. Again only for 
ghosted maps.

> 
> Brainwave: You could also set the TTL of every other cache row from that
> map to an expired state, since some of them might have been read more
> recently.  That's dumb, because we're trying to re-read the map row by row,
> and we don't want to invalidate rows that came from the new version of the
> map.  If each row had an associated order number or date, we could
> recognize and purge truly obsolete rows.  Don't monkey with the TTL, toss 
> them.  Hmm, for an order number you can use the date when the particular 
> row was read, and if row X is seen to be wrong then all older rows are 
> surely wrong too.  

Think I already have that behaviour.

> 
> > For a cache miss we can:
> > 
> > 1. Look up the individual map entry.
> > 2. If this fails, treat it as a real failure and continue.
> > 3. If it succeeds, re-read the entire map as it must be out of date.
> 
> Same comment applies.
> 

Ian


* Re: Proposal for map refresh logic
  2004-05-22  4:37   ` raven
@ 2004-05-23  3:19     ` Thorild Selen
  2004-05-24  0:24     ` Jim Carter
  1 sibling, 0 replies; 15+ messages in thread
From: Thorild Selen @ 2004-05-23  3:19 UTC (permalink / raw)
  To: autofs

raven@themaw.net writes:
> I'm aware of the NIS order number but don't have an idea of how this can 
> be done for LDAP maps. I want the handling to remain as consistent as 
> possible between map types. This will be an advantage when I add 
> ghosting of nis+ maps.

With servers that use the modifyTimestamp and createTimestamp
operational attributes (these SHOULD be set when an object is added or
changed, according to RFC 2252), these could be used to detect if any
entry has been changed or added since last time you checked. However,
this would fail to detect any removed entries.
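
To make the timestamp approach concrete, here is a small sketch in Python that only builds the search filter; it does not talk to a server. The function name `changed_since_filter` and the `automount` object class are assumptions for illustration. The timestamp is formatted as LDAP Generalized Time in UTC.

```python
from datetime import datetime, timezone


def changed_since_filter(last_check: datetime, object_class: str = "automount") -> str:
    """Build an LDAP filter matching entries added or modified since last_check."""
    # Generalized Time in UTC, e.g. 20040523031900Z.
    stamp = last_check.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")
    return (f"(&(objectClass={object_class})"
            f"(|(modifyTimestamp>={stamp})(createTimestamp>={stamp})))")
```

A non-empty result for such a search would be a hint to re-read the map. An empty result proves nothing about deletions, which is exactly the gap described above.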

What you really want is probably something like lcup ("LDAP Client
Update Protocol",
<URL:http://www.ietf.org/internet-drafts/draft-ietf-ldup-lcup-06.txt>)
or ldapsync ("The LDAP Content Synchronization Operation",
<URL:http://www.ietf.org/internet-drafts/draft-zeilenga-ldup-sync-05.txt>),
which both address this problem. It appears that recent versions of
OpenLDAP support ldapsync, but neither solution is even close to
de-facto standard status.

For some years to come, any useful and portable implementation would
need to support the case where there are no timestamps and no cool
synchronization operations to help you, I'm afraid.

Thorild Selén
Datorföreningen Update / Update Computer Club, Uppsala, SE


* Re: Proposal for map refresh logic
  2004-05-22  4:37   ` raven
  2004-05-23  3:19     ` Thorild Selen
@ 2004-05-24  0:24     ` Jim Carter
  2004-05-25  1:23       ` Ian Kent
  2004-06-14 14:52       ` raven
  1 sibling, 2 replies; 15+ messages in thread
From: Jim Carter @ 2004-05-24  0:24 UTC (permalink / raw)
  To: raven; +Cc: autofs mailing list

On Sat, 22 May 2004 raven@themaw.net wrote:
> Much of what is here has been taken from your original suggestions. You 
> probably noticed that.

Yes, thank you.

> Please understand that wildcard map handling is a separate issue which I 
> have also been thinking about over the past months. In fact I have had a 
> patch for many months from Mark Faseh (think that's how it's spelt, I 
> can't find a post by him) which I'm still not in a position to implement. 
> I have however retained his kernel module changes to support it in my 
> recent submission for 2.6.

Agreed, wildcard maps make impossible (or at least difficult) some things 
that people ask for.

> I'm aware of the NIS order number but don't have an idea of how this can 
> be done for LDAP maps. I want the handling to remain as consistent as 
> possible between map types. This will be an advantage when I add 
> ghosting of nis+ maps.

Good goal.  The date of reading should be good enough for recognizing
entries that are out of date.

> You would be surprised how often the "opendir" equivalent in kernel space 
> is called. 

I didn't know that.  I wonder if there's some other way to distinguish
"the caller is going to hit just one file in the directory" from "the 
caller is going to read the whole thing".  Only in the latter case is it 
important that the directory have only up-to-date entries.

> Checking for a map re-read then means frequent callbacks from 
> kernelspace, essentially from within an existing callback. 

Forget I mentioned it.  We'll have to find another way or do without.

> > (But if you subsequently do "ls /net/server"
> > a new submount will be forked, and it will populate itself with only ghost
> > mounts, right?)
>
> The host type multi-mounts you mention are still handled "as a single
> entry", and hence do not allow ghosting yet (4.2.0). So no, "ls
> /net/server" will not populate itself with ghosted mounts but with
> actual mounts, due to the "as a single entry" behaviour.

OK - it's extra work for autofs but not enough to justify hard programmer
work to reduce it.  Also, many people have an alias ls -> ls -F, so ls is
under a compulsion to stat every file it lists, to tag directories and
executables.  Autofs has to mount the filesystem anyway so ls can be
really, really sure that a directory is mounted there.

> The enumeration of exports for host mounts is also part of Mark's patch,
> some of which is included in autofs now but remains unused. And yes, hosts
> and exports will need to be enumerated in some way to provide a ghosted
> directory tree. But please let's not pursue this right now.

No problem.  I brought it up because I saw entanglements, not because I 
wanted them resolved quickly, or in a particular way.

> The flag already exists. It's part of each cache entry. It is set to the 
> time a map is read. It's used to clean stale entries now.

Sorry about not being on top of the code.

> So you are saying we only set a flag when the tests above indicate a map 
> is out of date. But how do we know when user space expects to see an 
> up-to-date map other than a changed map entry? So using a TTL will give 
> us the opportunity to check for an outdated map entry. Again only for 
> ghosted maps.

The issue I'm worrying about is, suppose all cache hits until now, that had
timed out and were rechecked from the map, turned out to be correct.  But
suppose there is a map row which we haven't seen yet, either because we
just never read it, or because the map was enumerated some time ago but has
been added to since then?  If the user reads the whole directory you have
to ignore the cache and re-read the NIS map every time (unless you rely on
the NIS order number, not generalizable) (OK to just stat a file map).  
But then, if you can't distinguish "read whole directory" from "hunt for
one file in directory", you end up re-reading the entire map when you don't
want to.

Forgive my meager kernel knowledge, but even if opendir is the same for
both uses, isn't there a separate object method implementing readdir,
versus looking up a file by name?  Readdir should bypass the cache, reading 
the NIS or LDAP map directly, while lookup should and does use the cache.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu  http://www.math.ucla.edu/~jimc (q.v. for PGP key)


* Re: Proposal for map refresh logic
  2004-05-24  0:24     ` Jim Carter
@ 2004-05-25  1:23       ` Ian Kent
  2004-05-25 16:28         ` Mike Waychison
  2004-06-14 14:52       ` raven
  1 sibling, 1 reply; 15+ messages in thread
From: Ian Kent @ 2004-05-25  1:23 UTC (permalink / raw)
  To: Jim Carter; +Cc: autofs mailing list

On Sun, 23 May 2004, Jim Carter wrote:

> 
> > So you are saying we only set a flag when the tests above indicate a map 
> > is out of date. But how do we know when user space expects to see an 
> > up-to-date map other than a changed map entry? So using a TTL will give 
> > us the opportunity to check for an outdated map entry. Again only for 
> > ghosted maps.
> 
> The issue I'm worrying about is, suppose all cache hits until now, that had
> timed out and were rechecked from the map, turned out to be correct.  But
> suppose there is a map row which we haven't seen yet, either because we
> just never read it, or because the map was enumerated some time ago but has
> been added to since then?  If the user reads the whole directory you have
> to ignore the cache and re-read the NIS map every time (unless you rely on
> the NIS order number, not generalizable) (OK to just stat a file map).  
> But then, if you can't distinguish "read whole directory" from "hunt for
> one file in directory", you end up re-reading the entire map when you don't
> want to.

I'm concerned about that issue as well. I'm still not sure how to deal 
with that.

One thing to be aware of is that the kernel module and the daemon don't 
really know much about each other. So the cache is, for the purpose of 
this issue, either not available or is available much too late to be able 
to change the directory listing. Essentially, the directories are created 
in the filesystem by the daemon after reading the map and at some later 
time the kernel uses that alone to list the directory contents. 

I think that, for this, modifying the kernel module is an absolute last 
resort as the place for handling this context information is the daemon.

> 
> Forgive my meager kernel knowledge, but even if opendir is the same for
> both uses, isn't there a separate object method implementing readdir,
> versus looking up a file by name?  Readdir should bypass the cache, reading 
> the NIS or LDAP map directly, while lookup should and does use the cache.

The kernel never sees the cache. The filesystem itself must be kept up to 
date. Creating or removing directories during a readdir operation will 
end in tears without a doubt.

But the real worry is the frequency with which this happens (as I 
mentioned above).

If we could just come up with a workable heuristic for map refresh we 
would be OK.

Ian


* Re: Proposal for map refresh logic
  2004-05-25  1:23       ` Ian Kent
@ 2004-05-25 16:28         ` Mike Waychison
  2004-05-26  1:37           ` Ian Kent
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Waychison @ 2004-05-25 16:28 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs mailing list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Kent wrote:
> On Sun, 23 May 2004, Jim Carter wrote:
>
>
>>>So you are saying we only set a flag when the tests above indicate a map
>>>is out of date. But how do we know when user space expects to see an
>>>up-to-date map other than a changed map entry? So using a TTL will give
>>>us the opportunity to check for an outdated map entry. Again only for
>>>ghosted maps.
>>
>>The issue I'm worrying about is, suppose all cache hits until now, that had
>>timed out and were rechecked from the map, turned out to be correct.  But
>>suppose there is a map row which we haven't seen yet, either because we
>>just never read it, or because the map was enumerated some time ago but has
>>been added to since then?  If the user reads the whole directory you have
>>to ignore the cache and re-read the NIS map every time (unless you rely on
>>the NIS order number, not generalizable) (OK to just stat a file map).
>>But then, if you can't distinguish "read whole directory" from "hunt for
>>one file in directory", you end up re-reading the entire map when you don't
>>want to.
>
>
> I'm concerned about that issue as well. I'm still not sure how to deal
> with that.
>
> One thing to be aware of is that the kernel module and the daemon don't
> really know much about each other. So the cache is, for the purpose of
> this issue, either not available or is available much too late to be able
> to change the directory listing. Essentially, the directories are created
> in the filesystem by the daemon after reading the map and at some later
> time the kernel uses that alone to list the directory contents.
>
> I think that, for this, modifying the kernel module is an absolute last
> resort as the place for handling this context information is the daemon.
>
>
>>Forgive my meager kernel knowledge, but even if opendir is the same for
>>both uses, isn't there a separate object method implementing readdir,
>>versus looking up a file by name?  Readdir should bypass the cache, reading
>>the NIS or LDAP map directly, while lookup should and does use the cache.
>
>
> The kernel never sees the cache.

This is further complicated as the kernel may call the readdir operation
once per directory/map entry.

> The filesystem itself must be kept up to
> date. Creating or removing directories during a readdir operation will
> end in tears without a doubt.

- ->readdir itself is serialized on the parent inode's i_sem, as are all
real_lookups (the call that makes the ->lookup callback).

>
> But the real worry is the frequency with which this happens (as I
> mentioned above).
>
> If we could just come up with a workable heuristic for map refresh we
> would be OK.
>

What is wrong with a cache of map keys in kernelspace? (Other than
modifying the kernel module being a last resort)



- --
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAs3QbdQs4kOxk3/MRAnefAJoDUROWmiwg7dM9i2/QLKavgCbtmACfU/o5
TuXkvVUMVUMClKYbRsH+hA4=
=0ugm
-----END PGP SIGNATURE-----


* Re: Proposal for map refresh logic
  2004-05-25 16:28         ` Mike Waychison
@ 2004-05-26  1:37           ` Ian Kent
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Kent @ 2004-05-26  1:37 UTC (permalink / raw)
  To: Mike Waychison; +Cc: autofs mailing list

On Tue, 25 May 2004, Mike Waychison wrote:

> >
> >
> > The kernel never sees the cache.
> 
> This is further complicated as the kernel may call the readdir operation
> once per directory/map entry.
> 
> > The filesystem itself must be kept up to
> > date. Creating or removing directories during a readdir operation will
> > end in tears without a doubt.
> 
> - ->readdir itself is serialized on the parent inode's i_sem, as are all
> real_lookups (the call that makes the ->lookup callback).

Can be a problem if userspace expects to see the changes as they occur. 
Directory modification seems to cause opendir/readdir calls a bit of a 
problem for large directories. So scandir is needed instead.

> 
> >
> > But the real worry is the frequency with which this happens (as I
> > mentioned above).
> >
> > If we could just come up with a workable heuristic for map refresh we
> > would be OK.
> >
> 
> What is wrong with a cache of map keys in kernelspace? (Other than
> modifying the kernel module being a last resort)

Nothing.

This is, pretty much, the way the autofs v4 internal design is. 
It's probably not a good idea to make wholesale changes within a point 
release as well. 

Of course you are working on a different design for which this does not 
apply.

Mind, I'm likely to use a similar mechanism to fix the direct map and lazy 
mounting of multi-mount maps in v4. So thanks for your fresh ideas.

Ian


* Re: Proposal for map refresh logic
  2004-05-24  0:24     ` Jim Carter
  2004-05-25  1:23       ` Ian Kent
@ 2004-06-14 14:52       ` raven
  1 sibling, 0 replies; 15+ messages in thread
From: raven @ 2004-06-14 14:52 UTC (permalink / raw)
  To: Jim Carter; +Cc: autofs mailing list

On Sun, 23 May 2004, Jim Carter wrote:

> 
> Forgive my meager kernel knowledge, but even if opendir is the same for
> both uses, isn't there a separate object method implementing readdir,
> versus looking up a file by name?  Readdir should bypass the cache, reading 
> the NIS or LDAP map directly, while lookup should and does use the cache.
>

Yes. A directory inode gets assigned the directory methods.

Ian


* Re: Proposal for map refresh logic
  2004-05-21 16:44 ` Jim Carter
  2004-05-22  4:37   ` raven
@ 2004-06-14 11:04   ` raven
  2004-06-14 21:10     ` Jim Carter
  2004-06-14 14:48   ` raven
  2 siblings, 1 reply; 15+ messages in thread
From: raven @ 2004-06-14 11:04 UTC (permalink / raw)
  To: Jim Carter; +Cc: autofs mailing list

On Fri, 21 May 2004, Jim Carter wrote:

> don't know about LDAP.  Suggestion for ghost mounts: only refresh the map 
> when a userspace process does opendir on the containing directory, as in 
> "ls /net/hostname".  In any other case, outdated or missing ghost mounts 
> cannot be seen by userspace, so let sleeping dogs lie.

I wonder how many of those opendir events are actually userspace 
requests not themselves initiated by the daemon?

I'll investigate this a little further!

What were your thoughts about frequent opendir events, many users, or just 
many ls commands?

Perhaps a map needs to be considered up to date for some amount of time 
after a map re-read?

Thoughts?
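
One possible shape for the "considered up to date for some amount of time" idea, sketched in Python for brevity. The class `RefreshGate` and the `hold_off` parameter are hypothetical; nothing like this is claimed to exist in autofs.

```python
from typing import Optional


class RefreshGate:
    """Suppress map re-reads for hold_off seconds after the last one."""

    def __init__(self, hold_off: float) -> None:
        self.hold_off = hold_off
        self.last_read: Optional[float] = None

    def should_reread(self, now: float) -> bool:
        # Always allow the first read; afterwards wait out the hold-off.
        return self.last_read is None or now - self.last_read >= self.hold_off

    def mark_read(self, now: float) -> None:
        self.last_read = now
```

With a gate like this, a burst of opendir events would cost at most one map read per hold-off period, however many users or ls commands caused them.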

Ian


* Re: Proposal for map refresh logic
  2004-06-14 11:04   ` raven
@ 2004-06-14 21:10     ` Jim Carter
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Carter @ 2004-06-14 21:10 UTC (permalink / raw)
  To: raven; +Cc: autofs mailing list

On Mon, 14 Jun 2004 raven@themaw.net wrote:
> On Fri, 21 May 2004, Jim Carter wrote:

> > don't know about LDAP.  Suggestion for ghost mounts: only refresh the map 
> > when a userspace process does opendir on the containing directory, as in 
> > "ls /net/hostname".  In any other case, outdated or missing ghost mounts 
> > cannot be seen by userspace, so let sleeping dogs lie.
> 
> I wonder how many of those opendir events are actually userspace 
> requests not themselves initiated by the daemon?
> 
> What were your thoughts about frequent opendir events, many users, or just 
> many ls commands?

I think it's relatively rare for users to do "ls /whatever" so as to cause a 
NIS automount map to be enumerated, but I'm sensitive to it because when 
there's a dead NFS server that's mounted, the user process has to wait for 
the NFS timeout.  Actually, "ls" isn't the worst culprit: it's /bin/pwd.
During /bin/csh startup there are (I think) three pwd's, at least with our 
.login/.cshrc standard scripts, and when there's a dead server and the 
user's host is later in directory order than the corpse, the user has to 
wait forever to get logged in, leading to an avalanche at the help desk.

Of course there's nothing the automounter can do about this.  Or is there?  
If autofs could recognize the dead server itself, it could do a forced 
unmount, and only one user would be inconvenienced (beyond the ones whose 
homedirs are on the dead server).  

Hmm, why is there a NFS timeout in /bin/pwd?  It stats its own directory to
find out the inode (e.g. /net/myhost), then it enumerates the /net
directory looking for the name of the file (myhost) which has that inode.  
The inode number should be coming back in the readdir output, right?  Or is
this just the inode of the mount point, so that it has to actually stat the
file by name to detect the inode of what's mounted on it?  And statting a
mounted NFS filesystem requires a round-trip to the (dead) server, to get
the directory's mode, owner, etc, even if all you care about is the locally
assigned inode number.
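
A minimal sketch of the search described above, in Python rather than C.
The function name name_in_parent is made up for illustration; this is not
the actual /bin/pwd source.  It assumes the traditional approach of one
lstat per directory entry.  On Linux the d_ino that readdir returns for a
mount point is generally that of the covered directory, not of the mounted
root, which is why the per-entry stat is needed and why a dead NFS server
earlier in directory order can stall the loop.

```python
import os

def name_in_parent(path="."):
    """Return the name under which `path` appears in its parent directory."""
    target = os.stat(path)                   # device and inode of the directory itself
    parent = os.path.join(path, os.pardir)
    for name in os.listdir(parent):          # enumerate the parent, in directory order
        candidate = os.path.join(parent, name)
        try:
            # One stat per entry.  For an entry that is an NFS mount point
            # this needs a round trip to the server, so it can block.
            st = os.lstat(candidate)
        except OSError:
            continue
        if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
            return name                      # found: stop reading the directory
    return None
```

Called on /net/myhost this would return "myhost", after statting whatever
entries of /net happen to precede it.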

> Perhaps a map needs to be considered up to date for some amount of time 
> after a map re-read?

For individual file lookups a TTL is appropriate on cache entries.  If you
use the cache on a readdir a TTL for the whole map is appropriate.  Hmm, on
"ls" the user wants to see ghost entries for the whole map including
entries added since the last full re-read.  But on /bin/pwd, the looked-for
directory entry is known to contain the current working directory, and
therefore has to be already mounted, and in the cache, but we don't care
whether the rest of the map is complete or even if it's accurate.

Here's a heuristic: when they do opendir, clear cache entries for that map 
which are beyond their TTL.  Then they do readdir repeatedly, and you 
return cache entries one by one.  /bin/pwd will close the directory in the 
middle of this phase, having found its target.  "ls" will read the whole 
cache and ask for more.  Now you enumerate the map (caching it, updating 
entries with changed mount options, and removing cache entries no longer in 
the map).  Then resume returning cached map entries, only those (could be 
none) that were not returned yet.
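
The heuristic above can be sketched roughly as follows, in Python rather
than the daemon's C.  Everything named here is hypothetical: MapCache,
the enumerate_map callable (assumed to return the whole map as a dict of
key to mount options) and the injectable clock are illustration only, not
autofs code.  A generator stands in for one opendir/readdir/closedir
sequence, and a per-reader set is one possible way to remember what was
already returned.

```python
import time

class MapCache:
    def __init__(self, enumerate_map, ttl, clock=time.monotonic):
        self.enumerate_map = enumerate_map   # hypothetical: returns {key: options}
        self.ttl = ttl
        self.clock = clock
        self.entries = {}                    # key -> (options, time_cached)

    def readdir(self):
        """Yield directory entries for one opendir/readdir sequence."""
        now = self.clock()
        # opendir: clear cache entries that are beyond their TTL.
        self.entries = {k: v for k, v in self.entries.items()
                        if now - v[1] <= self.ttl}
        returned = set()
        # Phase 1: return cached entries one by one.  A reader like
        # /bin/pwd stops here, so the map is never enumerated for it.
        for key in list(self.entries):
            returned.add(key)
            yield key
        # Phase 2: the reader asked for more, so enumerate the map.  This
        # replaces the cache: changed options are updated and entries no
        # longer in the map are dropped.
        fresh = self.enumerate_map()
        now = self.clock()
        self.entries = {k: (opts, now) for k, opts in fresh.items()}
        # Resume with only those entries not returned in phase 1.
        for key in fresh:
            if key not in returned:
                yield key
```

A reader that closes the generator early costs no map enumeration; a
reader that drains it costs exactly one.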

Issues: 

1.  On a cache miss, you will do opendir/readdir and will want some kludge 
to bypass the saved cache entries, already known to be not what you want.

2.  If a directory is added/deleted to/from the cache in the middle of the 
sequence of readdir operations, it may be missed or returned twice.  
Locking can be a problem.  It's probably sufficient if you only delete 
obsolete cache entries during the map re-read, relying on the TTL so you 
can ignore obsolete entries in other cases.  

3.  If only one user at a time could be doing "ls" or "pwd", you could have
a flag in each entry saying "this was returned", but that's clearly bogus, 
and I'm not sure how to distinguish already-returned entries in
the general case.  The cache being a linked list, is the order sufficiently
stable that you can save a pointer to the last returned item (or list head, 
if nothing was cached), and then resume from there in list order, after the 
NIS map is re-enumerated?  (Suppose the pointed-to item gets deleted?)

4.  If a mounted filesystem has vanished from its map, should you delete 
the cache entry at all?

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu  http://www.math.ucla.edu/~jimc (q.v. for PGP key)


* Re: Proposal for map refresh logic
  2004-05-21 16:44 ` Jim Carter
  2004-05-22  4:37   ` raven
  2004-06-14 11:04   ` raven
@ 2004-06-14 14:48   ` raven
  2004-06-14 20:11     ` Jim Carter
  2 siblings, 1 reply; 15+ messages in thread
From: raven @ 2004-06-14 14:48 UTC (permalink / raw)
  To: Jim Carter; +Cc: autofs mailing list

On Fri, 21 May 2004, Jim Carter wrote:

> 
> > For a cache hit we can:
> > 
> > 1. Check if the map entry has exceeded a given time to live, say
> >    the map expire time.
> > 2. If not continue as normal otherwise lookup the individual map
> >    entry and compare it to the cached entry.
> > 3. If they are identical continue as normal otherwise it's a hint
> >    the map has changed so re-read it.
> 
> Suggestion: set a flag saying "we know that this map is out of date", but 
> only re-read the whole thing when userspace expects to see up-to-date ghost 
> mounts.  Understood, that to read any one row from a file map you have to 
> read the whole thing; the suggestion makes a difference only for NIS and 
> LDAP -- where it makes a big difference.
> 

What were you thinking of as defining "out of date" Jim?

Ian


* Re: Proposal for map refresh logic
  2004-06-14 14:48   ` raven
@ 2004-06-14 20:11     ` Jim Carter
  2004-06-15  1:30       ` Ian Kent
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Carter @ 2004-06-14 20:11 UTC (permalink / raw)
  To: raven; +Cc: autofs mailing list

On Mon, 14 Jun 2004 raven@themaw.net wrote:
> On Fri, 21 May 2004, Jim Carter wrote:

> > > For a cache hit we can:
> > > 
> > > 1. Check if the map entry has exceeded a given time to live, say
> > >    the map expire time.
> > > 2. If not continue as normal otherwise lookup the individual map
> > >    entry and compare it to the cached entry.
> > > 3. If they are identical continue as normal otherwise it's a hint
> > >    the map has changed so re-read it.
> > 
> > Suggestion: set a flag saying "we know that this map is out of date", but 
> > only re-read the whole thing when userspace expects to see up-to-date ghost 
> > mounts.  Understood, that to read any one row from a file map you have to 
> > read the whole thing; the suggestion makes a difference only for NIS and 
> > LDAP -- where it makes a big difference.

> What were you thinking of as defining "out of date" Jim?

Item 3 -- we did the NIS lookup on this one entry because the TTL had
expired, and NIS gave a different answer than the cache, so every cached
entry from that map is probably wrong.  But there's no need to read the
whole map; just clear the cached entries that came from it (except the one
just read, known to be up to date).
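
As a rough sketch of that invalidation, in Python rather than C: the
function lookup, the lookup_one callable (standing in for a single-key
query such as a NIS match) and the cache layout are all assumptions made
for illustration.  Entries for filesystems that are currently mounted
would need special handling that is left out here.

```python
def lookup(cache, map_name, key, lookup_one, now, ttl):
    """cache maps (map_name, key) -> (value, time_cached)."""
    hit = cache.get((map_name, key))
    if hit is not None and now - hit[1] <= ttl:
        return hit[0]                        # fresh cache hit, no map query

    value = lookup_one(map_name, key)        # hypothetical single-entry query

    if hit is not None and value != hit[0]:
        # The map gave a different answer than the cache, so treat every
        # cached entry from this map as suspect and clear it.  The whole
        # map is not re-read here.
        for k in [k for k in cache if k[0] == map_name]:
            del cache[k]

    if value is None:
        cache.pop((map_name, key), None)     # entry has vanished from the map
    else:
        cache[(map_name, key)] = (value, now)  # the one entry known up to date
    return value
```

Entries from other maps are left alone, and the cleared entries are
fetched again one at a time as they are next looked up.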

In subsequent discussion we may or may not have agreed that the only time 
you need to read the whole map is when the caller does readdir, as results 
from "ls /net" or "ls /net/hostname".  In that case you can't trust the 
cache, because entries might have been added to the map and you'd never 
know.  Someone pointed out that readdir may have to be used on an ordinary 
file lookup if you have to dig through the directory to find the file.  
I'm assuming that a successful cache lookup bypasses this readdir.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu  http://www.math.ucla.edu/~jimc (q.v. for PGP key)


* Re: Proposal for map refresh logic
  2004-06-14 20:11     ` Jim Carter
@ 2004-06-15  1:30       ` Ian Kent
  2004-06-17 16:56         ` Jim Carter
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Kent @ 2004-06-15  1:30 UTC (permalink / raw)
  To: Jim Carter; +Cc: autofs mailing list

On Mon, 14 Jun 2004, Jim Carter wrote:

> On Mon, 14 Jun 2004 raven@themaw.net wrote:
> > On Fri, 21 May 2004, Jim Carter wrote:
> 
> > > > For a cache hit we can:
> > > > 
> > > > 1. Check if the map entry has exceeded a given time to live, say
> > > >    the map expire time.
> > > > 2. If not continue as normal otherwise lookup the individual map
> > > >    entry and compare it to the cached entry.
> > > > 3. If they are identical continue as normal otherwise it's a hint
> > > >    the map has changed so re-read it.
> > > 
> > > Suggestion: set a flag saying "we know that this map is out of date", but 
> > > only re-read the whole thing when userspace expects to see up-to-date ghost 
> > > mounts.  Understood, that to read any one row from a file map you have to 
> > > read the whole thing; the suggestion makes a difference only for NIS and 
> > > LDAP -- where it makes a big difference.
> 
> > What were you thinking of as defining "out of date" Jim?
> 
> Item 3 -- we did the NIS lookup on this one entry because the TTL had
> expired, and NIS gave a different answer than the cache, so every cached
> entry from that map is probably wrong.  But there's no need to read the
> whole map; just clear the cached entries that came from it (except the one
> just read, known to be up to date).

One thing I had in mind with this is that map updates are generally small, 
a few records or so.

> 
> In subsequent discussion we may or may not have agreed that the only time 
> you need to read the whole map is when the caller does readdir, as results 
> from "ls /net" or "ls /net/hostname".  In that case you can't trust the 
> cache, because entries might have been added to the map and you'd never 
> know.  Someone pointed out that readdir may have to be used on an ordinary 
> file lookup if you have to dig through the directory to find the file.  
> I'm assuming that a successful cache lookup bypasses this readdir.

Guilty.

We can distinguish between an open on a file and a directory. I'm still 
investigating the actual behaviour of the module as it is now.

Ian


* Re: Proposal for map refresh logic
  2004-06-15  1:30       ` Ian Kent
@ 2004-06-17 16:56         ` Jim Carter
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Carter @ 2004-06-17 16:56 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs mailing list

On Tue, 15 Jun 2004, Ian Kent wrote:

> On Mon, 14 Jun 2004, Jim Carter wrote:
> > Item 3 -- we did the NIS lookup on this one entry because the TTL had
> > expired, and NIS gave a different answer than the cache, so every cached
> > entry from that map is probably wrong.  But there's no need to read the
> > whole map; just clear the cached entries that came from it (except the one
> > just read, known to be up to date).
> 
> One thing I had in mind with this is that map updates are generally small, 
> a few records or so.

Only a few records change at any one time.  But which ones changed?  The
only way the client (automounter) can find out is to read the whole map.  
Even NIS itself uploads the whole map every time to its slave servers,
although NIS+ has an incremental update feature.
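
To illustrate, here is a small Python sketch of what the client is left
with: a full enumeration compared against the cache.  The name diff_map
and the plain key-to-entry dicts are assumptions for illustration, not
autofs data structures.

```python
def diff_map(cached, fresh):
    """Compare the cached map with a freshly enumerated copy.

    Both arguments are dicts of key -> map entry.  Returns three dicts:
    entries added, entries removed, and entries whose value changed
    (as (old, new) pairs).
    """
    added = {k: fresh[k] for k in fresh.keys() - cached.keys()}
    removed = {k: cached[k] for k in cached.keys() - fresh.keys()}
    changed = {k: (cached[k], fresh[k])
               for k in cached.keys() & fresh.keys()
               if cached[k] != fresh[k]}
    return added, removed, changed
```

Even when only one record differs, producing this result requires the
complete fresh copy, which is the cost being discussed.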

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)


end of thread, other threads:[~2004-06-17 16:56 UTC | newest]

Thread overview: 15+ messages
2004-05-21 14:31 Proposal for map refresh logic raven
2004-05-21 16:44 ` Jim Carter
2004-05-22  4:37   ` raven
2004-05-23  3:19     ` Thorild Selen
2004-05-24  0:24     ` Jim Carter
2004-05-25  1:23       ` Ian Kent
2004-05-25 16:28         ` Mike Waychison
2004-05-26  1:37           ` Ian Kent
2004-06-14 14:52       ` raven
2004-06-14 11:04   ` raven
2004-06-14 21:10     ` Jim Carter
2004-06-14 14:48   ` raven
2004-06-14 20:11     ` Jim Carter
2004-06-15  1:30       ` Ian Kent
2004-06-17 16:56         ` Jim Carter
