* [uml-devel] When /tmp is not tmpfs.
@ 2005-11-24 12:11 Rob Landley
  2005-11-24 20:40 ` Blaisorblade
                   ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-24 12:11 UTC (permalink / raw)
  To: user-mode-linux-devel

So apparently, one reason for the pathological behavior of UML (pegging the 
hard drive, which I mentioned earlier) is that by default Ubuntu doesn't 
mount tmpfs on /tmp.  This means it's part of /, which is ext3, and 
every touched page gets scheduled for writeout after a few seconds.  (The 
optimization not to do that for deleted files was apparently taken out of 
2.6.)
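
For illustration, a minimal sketch of the access pattern behind this, assuming
the UML "physical" memory is a deleted temp file mapped MAP_SHARED.  The helper
map_backing_file() and the physmem-XXXXXX name are made up for the example;
this is not UML's actual code.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temp file in `dir`, size it to
 * `len` bytes and map it MAP_SHARED.  Returns MAP_FAILED on error. */
void *map_backing_file(const char *dir, size_t len)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/physmem-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return MAP_FAILED;

    int fd = mkstemp(path);          /* creates the file with mode 0600 */
    if (fd < 0)
        return MAP_FAILED;
    unlink(path);                    /* the name goes away, the pages stay */

    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return MAP_FAILED;
    }

    /* Stores through this mapping dirty page-cache pages of the file.  On a
     * disk filesystem those pages are queued for writeout; on tmpfs there
     * is no disk file to write them to. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping keeps the file alive */
    return mem;
}
```

The same call with `dir` pointing at a tmpfs mount keeps the code unchanged
and only changes where the dirty pages end up.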

There is a tmpfs mount, it's /dev/shm.  And apparently, even if tmpfs isn't 
exposed as a separate filesystem, system V shared memory will still use it.

So my question is, could system v shared memory be used in place of the tmpfs 
mount?  (Can it be mapped in the right location and inherited across fork()?)  
Or is this just a case of "systems that don't mount tmpfs on /tmp are screwed, 
it's another prerequisite for running UML"?

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel


* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-24 12:11 [uml-devel] When /tmp is not tmpfs Rob Landley
@ 2005-11-24 20:40 ` Blaisorblade
  2005-11-25  8:26   ` Rob Landley
  2005-11-25  9:55 ` Jeff Dike
  2005-11-25 14:56 ` Nix
  2 siblings, 1 reply; 42+ messages in thread
From: Blaisorblade @ 2005-11-24 20:40 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Rob Landley

On Thursday 24 November 2005 13:11, Rob Landley wrote:
> So apparently, one reason for the pathological behavior of UML (pegging the
> hard drive, which I mentioned earlier) is that by default Ubuntu doesn't
> mount tmpfs on /tmp.  This means it's part of /, which is ext3, and
> every touched page gets scheduled for writeout after a few seconds.  (The
> optimization not to do that for deleted files was apparently taken out of
> 2.6.)

> There is a tmpfs mount, it's /dev/shm.

> And apparently, even if tmpfs isn't 
> exposed as a separate filesystem, system V shared memory will still use it.

Ah, ok... more or less it's true.

> So my question is, could system v shared memory be used in place of the
> tmpfs mount?  (Can it be mapped in the right location and inherited across
> fork()?)
IIRC you can share a SysV shmem area across arbitrary processes - anybody can 
call ftok() on a file to get a key, then shmget() and shmat() to attach the 
shmem area.

I mostly wonder about automatic cleanup.

One (mis)feature of SysV IPC is persistence till reboot (i.e. no auto-cleanup 
if the process exits).

In fact, we make processes sleep on pipes rather than use SysV semaphores 
exactly for this reason (I wanted to use futexes, but never found the time).

However, I just found out, see shmctl(2), that IPC_RMID only marks the segment 
for destruction and it is freed once the last process detaches (a refcount 
"garbage collection" algorithm), so apparently it *could* be used.
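
A minimal sketch of that pattern, assuming a private segment shared with a
forked child (sysv_shm_roundtrip() is a made-up name for illustration, not
UML code).  shmat() also accepts a requested address as its second argument,
which is what "mapped in the right location" would need.

```c
#define _GNU_SOURCE
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the child stored in the shared segment, or -1 if any
 * step failed (for example when SysV IPC is unavailable). */
int sysv_shm_roundtrip(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    int *mem = shmat(id, NULL, 0);
    /* Mark the segment for destruction right away: it stays usable while
     * attached and is freed once the last process detaches or exits. */
    shmctl(id, IPC_RMID, NULL);
    if (mem == (void *)-1)
        return -1;

    *mem = 0;
    pid_t pid = fork();
    if (pid < 0) {
        shmdt(mem);
        return -1;
    }
    if (pid == 0) {               /* child: the attachment is inherited */
        *mem = 42;
        _exit(0);
    }
    int status;
    waitpid(pid, &status, 0);

    int value = *mem;
    shmdt(mem);                   /* last detach: the kernel frees the segment */
    return value;
}
```

Because IPC_RMID is issued before the fork, nothing is left behind in
ipcs(1) output even if both processes are killed.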

The question is if we want it, and considering the new features being added to 
shmfs, the answer is probably either "no" or "we accept patches if somebody 
else is willing to maintain them" (adding yet another code path doesn't make 
me that happy - see the effort needed to make TT and SKAS3, and now SKAS0 and 
SKAS3, keep working).

> Or is this just a "systems that don't mount tmpfs on /tmp are 
> screwed, it's another prerequisite for running UML".
First, UML works anyway.

Properly set one of TMPDIR / TMP / TEMP (don't remember exact priorities, but 
IIRC TMPDIR has the highest priority) to point to /dev/shm. Actually, we could 
even make it the default (but must cater for older systems).

/dev/shm is used for POSIX shmem, so it's as standard on >=2.4 Linuxes as SysV 
shmem.
-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-24 20:40 ` Blaisorblade
@ 2005-11-25  8:26   ` Rob Landley
  0 siblings, 0 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-25  8:26 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel

On Thursday 24 November 2005 14:40, Blaisorblade wrote:
> However, I just found out, see shmctl(2), that IPC_RMID implements the
> refcount "garbage collection" algorithm, so apparently it *could* be used.
>
> The question is if we want it, and considering the new features being added
> to shmfs, the answer is probably either "no" or "we accept patches if
> somebody else is willing to maintain them" (adding yet another code path
> doesn't make me that happy - see the effort needed to make TT and SKAS3,
> and now SKAS0 and SKAS3, keep working).

Hmmm...  (Eyes to-do list...)

> > Or is this just a "systems that don't mount tmpfs on /tmp are
> > screwed, it's another prerequisite for running UML".
>
> First, UML works anyway.
>
> Set properly one of TMPDIR / TMP / TEMP (don't remember exact priorities,
> but IIRC TMPDIR has most priority) to point to /dev/shm. Actually, we could
> even make it the default (but must cater for older systems).
>
> It's used for POSIX shmem, so it's as standard on >=2.4 Linuxes as SysV
> shmem.

Expecting /dev/shm to be tmpfs seems more reliable than expecting /tmp to be.  
(After all, its original name was shmfs...)

I just added
  [ -d /dev/shm ] && export TMPDIR=/dev/shm
to my build script, and it seems to help.

Thanks,

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25  9:55 ` Jeff Dike
@ 2005-11-25  9:48   ` Rob Landley
  2005-11-25 10:52     ` Rob Landley
  0 siblings, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-25  9:48 UTC (permalink / raw)
  To: Jeff Dike; +Cc: user-mode-linux-devel

On Friday 25 November 2005 03:55, Jeff Dike wrote:
> On Thu, Nov 24, 2005 at 06:11:01AM -0600, Rob Landley wrote:
> > So my question is, could system v shared memory be used in place of the
> > tmpfs mount?  (Can it be mapped in the right location and inherited
> > across fork()?)
>
> tmpfs and shmfs are two names for the same underlying code.

Yes, but when tmpfs is not configured into the kernel it isn't exposed to 
userspace as a mountable filesystem, so sometimes the only way to _access_ it 
is through the sysv shared memory API.

I agree that using the sysv API probably isn't worth the effort if we've got a 
good workaround.  I have a workaround, so I'm unlikely to code it up.

> I think the shmfs mount is for the benefit of things that use SysV shared
> memory. 

Yup, but it was standard practice to do this back in the 2.4 days, while 
mounting /tmp as tmpfs is _not_ done on such recent systems as ubuntu "horny 
hedgehog", which shipped earlier this year.

I'm not worried about 2.4 systems because they had the "deleted files don't 
get synced to disk" hack that got yanked in 2.6, so the pathological behavior 
I'm seeing shouldn't show up on them even though for them /tmp isn't usually 
tmpfs.

Instead, this pathological behavior only shows up when you do something that 
constantly dirties a lot of pages on a 2.6 system where /tmp isn't tmpfs.

> So, no.  Just use tmpfs on /tmp.

If I can guarantee that I have root access to all the systems my code is 
running on and can make that kind of administrative change, then there's not 
a whole lot of point in me bothering with UML in the first place.

Setting TMPDIR to point to /dev/shm seems to be a decent workaround, but I'm 
not sure how portable it is.  (Is there a system out there where /tmp is 
mounted tmpfs but /dev/shm isn't?  Right now I'm testing if /dev/shm exists 
and assuming it's tmpfs if so.)
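
Rather than assuming, the filesystem type can be checked directly.  A minimal
sketch using statfs(2) and the TMPFS_MAGIC constant from <linux/magic.h>
(is_tmpfs() is a made-up helper name):

```c
#define _GNU_SOURCE
#include <linux/magic.h>   /* TMPFS_MAGIC */
#include <sys/vfs.h>       /* statfs() */

/* Returns 1 if `path` lives on tmpfs, 0 if it lives on something else,
 * and -1 if statfs() fails (e.g. the path does not exist). */
int is_tmpfs(const char *path)
{
    struct statfs sb;
    if (statfs(path, &sb) != 0)
        return -1;
    return sb.f_type == TMPFS_MAGIC ? 1 : 0;
}
```

This needs no root access, so is_tmpfs("/dev/shm") and is_tmpfs("/tmp") can
both be tried from an unprivileged build script helper.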

But documenting that UML -skas0 performance is going to not only suck rocks 
but bog down the rest of your system very very noticeably when you run on 2.6 
and its temp file isn't on tmpfs might be a useful thing to do.

>     Jeff

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-24 12:11 [uml-devel] When /tmp is not tmpfs Rob Landley
  2005-11-24 20:40 ` Blaisorblade
@ 2005-11-25  9:55 ` Jeff Dike
  2005-11-25  9:48   ` Rob Landley
  2005-11-25 14:56 ` Nix
  2 siblings, 1 reply; 42+ messages in thread
From: Jeff Dike @ 2005-11-25  9:55 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel

On Thu, Nov 24, 2005 at 06:11:01AM -0600, Rob Landley wrote:
> So my question is, could system v shared memory be used in place of the tmpfs 
> mount?  (Can it be mapped in the right location and inherited across fork()?)

tmpfs and shmfs are two names for the same underlying code.  I think the shmfs
mount is for the benefit of things that use SysV shared memory.  

So, no.  Just use tmpfs on /tmp.

				Jeff



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25  9:48   ` Rob Landley
@ 2005-11-25 10:52     ` Rob Landley
  2005-11-25 11:26       ` Rob Landley
  0 siblings, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-25 10:52 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Jeff Dike

FYI:

The mounts on a Fedora Core 4 system:

/dev/hda2 on / type ext3 (rw)
/dev/proc on /proc type proc (rw)
/dev/sys on /sys type sysfs (rw)
/dev/devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
/dev/shm on /dev/shm type tmpfs (rw)
/dev/hdb1 on /home type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
none on /var/named/chroot/proc type proc (rw)
automount(pid1716) on /misc type autofs (rw,fd=4,pgrp=1716,minproto=2,maxproto=4)
automount(pid1759) on /net type autofs (rw,fd=4,pgrp=1759,minproto=2,maxproto=4)
nfsd on /proc/fs/nfsd type nfsd (rw)

/tmp is nothing special (it inherits / which is ext3), but /dev/shm is a tmpfs 
mount which is world writeable and has the sticky bit set.
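
A survey like this can also be done programmatically.  A minimal sketch that
walks a mount table with getmntent(3) and reports the tmpfs entries
(list_tmpfs_mounts() is a made-up name; the table would normally be
/proc/mounts or /etc/mtab):

```c
#define _GNU_SOURCE
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Prints every tmpfs mount point listed in `table` and returns how many
 * were found, or -1 if the table cannot be opened. */
int list_tmpfs_mounts(const char *table)
{
    FILE *fp = setmntent(table, "r");
    if (fp == NULL)
        return -1;

    int count = 0;
    struct mntent *m;
    while ((m = getmntent(fp)) != NULL) {
        if (strcmp(m->mnt_type, "tmpfs") == 0) {
            printf("%s\n", m->mnt_dir);
            count++;
        }
    }
    endmntent(fp);
    return count;
}
```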

The mounts on the x86-64 PLD system I've been borrowing (and on which I do not 
have root access):

/dev/sda3 on / type jfs (rw)
none on /proc type proc (rw,gid=17)
sysfs on /sys type sysfs (rw)
selinuxfs on /selinux type selinuxfs (rw)
/dev/sda1 on /boot type ext2 (rw)
/dev/sda5 on /usr type jfs (rw)
/dev/sda6 on /var type jfs (rw)
/dev/sda7 on /tmp type jfs (rw)
/dev/sda8 on /home type jfs (rw)
/dev/sda9 on /srv type jfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/root/pldcd-0.95.iso on /root/pld type iso9660 (rw,loop=/dev/loop0)

/tmp is an explicit scsi mount.  /dev/shm inherits / (which is jfs), but 
that's moot because the directory is not world writeable.

The shell servers from sourceforge:

/dev/md0 on / type ext3 (rw,errors=remount-ro)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/md1 on /tmp type ext3 (rw)
/dev/md2 on /var type ext3 (rw)
/dev/md3 on /usr type ext3 (rw)
/dev/md4 on /var/local type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
pr-fs-users-a:/home/users/a on /home/users/a type nfs (rw,nosuid,nodev,nfsvers=3,udp,rsize=16384,wsize=16384,hard,intr,addr=10.5.1.153)
... and so on [about 8 gazillion more /home/users/blah mounts trimmed].

Again, /tmp is not tmpfs (it's ext3 on a raid), but /dev/shm is a tmpfs mount 
which is world writeable and has the sticky bit set.

And I reiterate that on my ubuntu laptop /tmp is not tmpfs (it inherits my 
ext3 /) but /dev/shm is a world writeable tmpfs mount that has the sticky bit 
set.
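
The "world writeable with the sticky bit set" property is easy to test with
stat(2).  A minimal sketch (is_shared_scratch_dir() is a made-up name):

```c
#define _GNU_SOURCE
#include <sys/stat.h>

/* Returns 1 if `path` is a directory that is world writable and has the
 * sticky bit set (mode 1777 style, like a typical /tmp), 0 otherwise. */
int is_shared_scratch_dir(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
        return 0;
    return (st.st_mode & S_IWOTH) && (st.st_mode & S_ISVTX);
}
```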

My conclusion from this is that /dev/shm is probably a better default 
than /tmp for User Mode Linux's physical memory file, perhaps with a fallback 
to /tmp if it can't write there.  But I'd appreciate hearing from other 
people with different systems.
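
A minimal sketch of such a default-with-fallback, assuming the only test is
whether the directory can be written to (pick_scratch_dir() is a made-up name
and this is not what UML currently does):

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <unistd.h>

/* Returns the first directory in `candidates` (a NULL-terminated list) that
 * the caller can write to and search, or `fallback` if none qualifies. */
const char *pick_scratch_dir(const char *const candidates[],
                             const char *fallback)
{
    for (size_t i = 0; candidates[i] != NULL; i++) {
        if (access(candidates[i], W_OK | X_OK) == 0)
            return candidates[i];
    }
    return fallback;
}
```

With the candidate list { "/dev/shm", NULL } and "/tmp" as the fallback, this
picks /dev/shm on the Fedora, sourceforge and ubuntu systems above and /tmp on
the PLD one.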

The assumption that tmpfs is already mounted on /tmp does not seem to be true 
on any preexisting system I can currently find out in the field.  Where have 
you seen this?

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 10:52     ` Rob Landley
@ 2005-11-25 11:26       ` Rob Landley
  0 siblings, 0 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-25 11:26 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Jeff Dike

On Friday 25 November 2005 04:52, Rob Landley wrote:
> FYI:
>
> The mounts on a Fedora Core 4 system:
...
> The mounts on the x86-64 PLD system I've been borrowing (and on which I do
> not have root access):
...
> The shell servers from sourceforge:
...
> And I reiterate that on my ubuntu laptop /tmp is not tmpfs (it inherits my
> ext3 /) but /dev/shm is a world writeable tmpfs mount that has the sticky
> bit set.
...

Found one more lying around:  A gentoo system my friend mark set up.  Can't 
easily cut and paste the mount table from here (it's a laptop), but /tmp 
inherits / which is ext3, and /dev/shm is a tmpfs mount which is world 
writeable and has the sticky bit set.

Except for the x86-64 PLD system (which is already known to be weird), it's 
unanimous so far...

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-24 12:11 [uml-devel] When /tmp is not tmpfs Rob Landley
  2005-11-24 20:40 ` Blaisorblade
  2005-11-25  9:55 ` Jeff Dike
@ 2005-11-25 14:56 ` Nix
  2005-11-25 15:03   ` Chris Lightfoot
  2 siblings, 1 reply; 42+ messages in thread
From: Nix @ 2005-11-25 14:56 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel

On Thu, 24 Nov 2005, Rob Landley uttered the following:
> There is a tmpfs mount, it's /dev/shm.  And apparently, even if tmpfs isn't 
> exposed as a separate filesystem, system V shared memory will still use it.

s/System V/POSIX/

It's the shm_open()/shm_unlink() functions you're looking for (the descriptor
shm_open() returns is closed with plain close()).

They've been present in glibc since 2.2, so you should be able to use them
without any real difficulty if need be.

> So my question is, could system v shared memory be used in place of the tmpfs 
> mount?  (Can it be mapped in the right location and inherited across fork()?)  

You could certainly do just that with POSIX shm :)
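
A minimal sketch of that, assuming a made-up object name and helper
(posix_shm_roundtrip() is not UML code).  mmap() takes an address hint as its
first argument, which covers the "right location" part of the question.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the child stored in the shared mapping, or -1 if any
 * step failed (for example when /dev/shm is not available). */
int posix_shm_roundtrip(void)
{
    char name[64];
    /* Hypothetical object name; the PID keeps concurrent runs apart. */
    snprintf(name, sizeof(name), "/uml-physmem-demo-%ld", (long)getpid());

    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    shm_unlink(name);             /* name gone; object lives while in use */

    if (ftruncate(fd, 4096) < 0) {
        close(fd);
        return -1;
    }
    int *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (mem == MAP_FAILED)
        return -1;

    *mem = 0;
    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, 4096);
        return -1;
    }
    if (pid == 0) {               /* child: MAP_SHARED mapping is inherited */
        *mem = 42;
        _exit(0);
    }
    int status;
    waitpid(pid, &status, 0);

    int value = *mem;
    munmap(mem, 4096);
    return value;
}
```

On older glibc versions this has to be linked with -lrt.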

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 14:56 ` Nix
@ 2005-11-25 15:03   ` Chris Lightfoot
  2005-11-25 15:36     ` Nix
  2005-11-25 16:03     ` Rob Landley
  0 siblings, 2 replies; 42+ messages in thread
From: Chris Lightfoot @ 2005-11-25 15:03 UTC (permalink / raw)
  To: Nix; +Cc: Rob Landley, user-mode-linux-devel

On Fri, Nov 25, 2005 at 02:56:49PM +0000, Nix wrote:
> You could certainly do just that with POSIX shm :)

Another option is to mlock the memory, which should
prevent paging, but requires root. I have a patch which
does this using a helper binary, if people would like it.

-- 
``As usual the Liberals offer a mixture of sound and original ideas.
  Unfortunately none of the sound ideas is original and none of the
  original ideas is sound.'' (Harold Macmillan)



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 15:03   ` Chris Lightfoot
@ 2005-11-25 15:36     ` Nix
  2005-11-25 16:03     ` Rob Landley
  1 sibling, 0 replies; 42+ messages in thread
From: Nix @ 2005-11-25 15:36 UTC (permalink / raw)
  To: Chris Lightfoot; +Cc: Rob Landley, user-mode-linux-devel

On Fri, 25 Nov 2005, Chris Lightfoot murmured woefully:
> On Fri, Nov 25, 2005 at 02:56:49PM +0000, Nix wrote:
>> You could certainly do just that with POSIX shm :)
> 
> Another option is to mlock the memory, which should
> prevent paging, but requires root. I have a patch which
> does this using a helper binary, if people would like it.

Well, mlocking it is certainly not practical for everyone :) while
shm_open() and friends *are* practical as a general solution.

e.g., one of my more important UMLs, my firewall:

nix@loki 27 /home/nix% ps -o rss,vsz -C uml-esperi
  RSS    VSZ
34296  99788
 1468   1624
34296  99788
34296  99788
34296  99788

That's a very large RSS because I'm sshing in through it; normally it's
more like 5MB. The host only has 128MB RAM and does many other things as
well: mlock()ing that 99MB into RAM would render the host almost
useless!

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 15:03   ` Chris Lightfoot
  2005-11-25 15:36     ` Nix
@ 2005-11-25 16:03     ` Rob Landley
  2005-11-25 19:33       ` Nix
  1 sibling, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-25 16:03 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Chris Lightfoot, Nix

On Friday 25 November 2005 09:03, Chris Lightfoot wrote:
> On Fri, Nov 25, 2005 at 02:56:49PM +0000, Nix wrote:
> > You could certainly do just that with POSIX shm :)
>
> Another option is to mlock the memory, which should
> prevent paging, but requires root. I have a patch which
> does this using a helper binary, if people would like it.

A) mlock would be a bad thing.  Not only is it a trivial DoS waiting to happen, 
but I like the UML physmem being swapped out under memory pressure.  I just 
don't want it uselessly written to disk over and over in the absence of any 
memory pressure whatsoever, consuming all I/O bandwidth to no purpose, which 
is the effect when it's not on tmpfs.

B) Still requires root.  The suid root helper program is only available in 
exactly the same circumstances in which you can just ask the admin to mount 
tmpfs somewhere for you.

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 16:03     ` Rob Landley
@ 2005-11-25 19:33       ` Nix
  2005-11-25 20:18         ` Rob Landley
  0 siblings, 1 reply; 42+ messages in thread
From: Nix @ 2005-11-25 19:33 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel, Chris Lightfoot

On Fri, 25 Nov 2005, Rob Landley uttered the following:
> A) mlock would be a bad thing.  Not only is it a trivial DoS waiting to happen, 
> but I like the UML physmem being swapped out under memory pressure.  I just 
> don't want it uselessly written to disk over and over in the absence of any 
> memory pressure whatsoever, consuming all I/O bandwidth to no purpose, which 
> is the effect when it's not on tmpfs.

Maybe this is a stupid question, but... why do *any* systems other than
extremely memory-constrained ones not mount tmpfs on /tmp? It seems to
me to have numerous advantages and no disadvantages.

In fact, even when you're memory-constrained, if you *have* diskspace that
you could spend on /tmp, you can swap to it instead, and spend the space
on virtual memory when you're not spending it on /tmp.

So, er, why?

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 19:33       ` Nix
@ 2005-11-25 20:18         ` Rob Landley
  2005-11-25 21:04           ` Nix
  2005-11-25 23:46           ` Chris Lightfoot
  0 siblings, 2 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-25 20:18 UTC (permalink / raw)
  To: Nix; +Cc: user-mode-linux-devel, Chris Lightfoot

On Friday 25 November 2005 13:33, Nix wrote:
> On Fri, 25 Nov 2005, Rob Landley uttered the following:
> > A) mlock would be a bad thing.  Not only is it a trivial DoS waiting to
> > happen, but I like the UML physmem being swapped out under memory
> > pressure.  I just don't want it uselessly written to disk over and over
> > in the absence of any memory pressure whatsoever, consuming all I/O
> > bandwidth to no purpose, which is the effect when it's not on tmpfs.
>
> Maybe this is a stupid question, but... why do *any* systems other than
> extremely memory-constrained ones not mount tmpfs on /tmp? It seems to
> me to have numerous advantages and no disadvantages.

Actually, I consider the fact the OOM killer doesn't delete files out of tmpfs 
mounts to be a potential disadvantage in this context.

Using /tmp for anything has been kind of discouraged for a while, because 
throwing any insufficiently randomized filename in there is a security hole 
waiting to happen.  By the time tmpfs was widely available as something you 
might mount on /tmp, the use of /tmp had been largely replaced with things 
like the ~/.kde directory or /var/spool/appdir with ownership and permissions 
enforced.

Most of the remaining uses of /tmp are actually for things like named sockets 
(where tmpfs really doesn't help at all), or for tiny little files (like all 
the mcop crap) that on a different day would live under /var.  It's used for 
inter-process communications, not for temporary storage space.  Long ago 
things like vi would create temporary files in /tmp, but these days it uses 
.${filename}.swp in the same directory as the file being edited.  (As a matter 
of fact, there's even a /var/tmp that konqueror recently started storing its 
cache in.  It used to be in ~/.kde.  So there isn't just _one_ tmp directory; 
if you try to tmpfs mount your /tmp then you need to do more than one.)

I suspect that the real reason nobody mounts tmpfs on /tmp is that nobody 
_bothers_.  Nobody in their right mind puts anything big under /tmp, the few 
remaining uses are largely IPC between different users on the same machine, 
and even X11 has mostly moved away from that.  Things like postfix and cups 
use subdirectories under /var/spool that aren't world readable.

Keep in mind that tmpfs used to be shmfs, and what it's good at is providing 
shared memory.  What UML really _wants_ is shared memory, which has 
traditionally been available through /dev/shm.  Insisting that /tmp behave 
like /dev/shm because otherwise what you get doesn't behave like shared 
memory A) doesn't make a whole lot of sense, B) doesn't match existing 
practice.

> In fact, even when you're memory-constrained, if you *have* diskspace that
> you could spend on /tmp, you can swap to it instead, and spend the space
> on virtual memory when you're not spending it on /tmp.

"can" doesn't mean "should".  Yes you can make a 10 gigabyte swap partition, 
but most people actively don't want one because if your system ever winds up 
using more than about twice as much swap space as it has physical memory, 
it's likely that the amount of swap thrashing you're doing is getting 
pathological.  Having a runaway app have to churn through 10 gigabytes of 
swap space before the OOM killer terminates it can turn 30 seconds of 
paralysis into 10 minutes.  Not an improvement.

Also, although it's pretty common to have 10 gigabytes of spare disk space on 
a modern laptop, it is _not_ common to have 10 gigabytes of spare swap space, 
and that's for a reason.  Extra space in your filesystem can be used for all 
sorts of things.  Extra swap space is normally wasted.

So having tmp just be a normal directory isn't really that bad of a choice.  
It normally manifests no downsides whatsoever.  And encouraging people to 
use /tmp is considered a security hole.

> So, er, why?

/dev/shm appears to be the widely available tmpfs mount, because its 
purpose is to provide shared memory.  It is not and never has been the 
purpose of /tmp to provide shared memory.

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 20:18         ` Rob Landley
@ 2005-11-25 21:04           ` Nix
  2005-11-25 22:31             ` Rob Landley
  2005-11-25 23:33             ` Blaisorblade
  2005-11-25 23:46           ` Chris Lightfoot
  1 sibling, 2 replies; 42+ messages in thread
From: Nix @ 2005-11-25 21:04 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel, Chris Lightfoot

On Fri, 25 Nov 2005, Rob Landley moaned:
> On Friday 25 November 2005 13:33, Nix wrote:
>> Maybe this is a stupid question, but... why do *any* systems other than
>> extremely memory-constrained ones not mount tmpfs on /tmp? It seems to
>> me to have numerous advantages and no disadvantages.
> 
> Actually, I consider the fact the OOM killer doesn't delete files out of tmpfs 
> mounts to be a potential disadvantage in this context.

Yeah, true, if you think the OOM killer is worthwhile (I do: most of the MM
hackers don't. I know who knows more about the Linux kernel's MM and it's
not me!)

> Using /tmp for anything has been kind of discouraged for a while, because 
> throwing any insufficiently randomized filename in there is a security hole 
> waiting to happen.

Um, atomically create a directory, spray all the files you like under
it.  Trivial. Doesn't everyone have a bit of scriptage that does that in
$LANGUAGE_OF_CHOICE?
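
For illustration, a sketch of that "bit of scriptage" in Python, assuming
the standard tempfile module is acceptable: tempfile.mkdtemp() creates a
randomly named directory accessible only to its owner, and everything else
is then created underneath it. The prefix and file name are made up for the
example.

```python
import os
import shutil
import stat
import tempfile


def use_private_workdir():
    """Create a private scratch directory, use it, and clean it up."""
    # mkdtemp() picks a random name and creates the directory with mode 0700,
    # failing rather than reusing a name that already exists.
    workdir = tempfile.mkdtemp(prefix="scratch-")
    try:
        mode = stat.S_IMODE(os.stat(workdir).st_mode)
        # Names inside the private directory can be as predictable as you like.
        with open(os.path.join(workdir, "notes.txt"), "w") as handle:
            handle.write("transient data\n")
        return mode, os.listdir(workdir)
    finally:
        shutil.rmtree(workdir)
```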

>                    By the time tmpfs was widely available as something you 
> might mount on /tmp, the use of /tmp had been largely replaced with things 
> like the ~/.kde directory or /var/spool/appdir with ownership and permissions 
> enforced.

The ~/.kde directory doesn't contain temporary files, but persistent state:
and the same is true of /var/spool, and /var/cache and /var/tmp for that
matter.

> Most of the remaining uses of /tmp are actually for things like named sockets 
> (where tmpfs really doesn't help at all), or for tiny little files (like all 
> the mcop crap) that on a different day would live under /var.  It's used for 
> inter-process communications, not for temporary storage space.  Long ago 

I suspect what causes this is that /tmp is explicitly for uses that
*don't outlive a reboot*, and how many of those are there? Not all that
damn many. --- at least, not all that damn many *programmatic* ones.

> things like vi would create temporary files in /tmp,

/var/tmp, because the entire point of those files was to survive a reboot.

>                                                      but these days it uses
> .${filename}.swp in the same directory as the file being edited.

Yes, and I absolutely despise this behaviour. Is there any way to force vim
to use /var/tmp like everyone else?

>                                                                  (As a matter 
> of fact, there's even a /var/tmp that konqueror recently started storing its 
> cache in.  It used to be in ~/.kde.  So there isn't just _one_ tmp directory; 
> if you try to tmpfs mount your /tmp then you need to do more than one.)

Since /tmp and /var/tmp serve different purposes, yes, of course. This has
always been true; right back in the early Slackware days the boot scripts
used to carefully scrub /tmp but leave /var/tmp alone.

> I suspect that the real reason nobody mounts tmpfs on /tmp is that nobody 
> _bothers_.  Nobody in their right mind puts anything big under /tmp, the few 
> remaining uses are largely IPC between different users on the same machine, 
> and even X11 has mostly moved away from that.  Things like postfix and cups 
> use subdirectories under /var/spool that aren't world readable.

Well, I'd say the major users in my case are:

- programs writing to $TMPDIR; config.guess, configure, and GCC are big users
  on my systems, but lots of other apps write here for a while. Of course
  you could point TMPDIR somewhere else, but does anyone do that?

  There are a quite surprising number of these: generally the files live
  for brief instants before being unlinked, if at all. (mkstemp() creates its
  files in $TMPDIR, after all, and often for those files minimal overhead
  is what counts; and like it or not tmpfs has lower overhead than ext*fs.)

- users. A *lot* of my users dump temporary crud in /tmp: the names of these
  files aren't predictable unless you're telepathic so we're pretty safe from
  symlink attacks. (My local users are Nice Guys anyway, or I shoot them. No
  shots have so far been necessary.)
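
On the first point above, a small sketch of how $TMPDIR gets honoured. Note
that the C mkstemp() simply uses whatever path is in the template it is
given; it is the higher-level helpers (here Python's tempfile module) that
consult $TMPDIR. The function name is an assumption for illustration.

```python
import os
import tempfile


def tempdir_for(tmpdir_value):
    """Return the directory tempfile would pick with TMPDIR=tmpdir_value."""
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        tempfile.tempdir = None  # drop the cached choice so TMPDIR is re-read
        return tempfile.gettempdir()
    finally:
        # Put the process back the way we found it.
        tempfile.tempdir = saved_cache
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
```

If the named directory does not exist or is not writable, gettempdir()
moves on to its other candidates, so the return value is worth checking.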

Maybe your users don't dump everything they don't care much about in
/tmp: mine are always sticking all sorts of things in there from
half-chewed LaTeX through to boring logfiles and stuff being looked over
on its way to the printer :)

(the half-chewed LaTeX worries me slightly: maybe a baby's dummy
wouldn't go amiss.)

> Keep in mind that tmpfs used to be shmfs, and what it's good at is providing 
> shared memory.

Yep. It just so happens that this gives good properties for transient stuff
that should vanish no later than the next reboot, and generally lives only
for as long as someone has it open.

>                 What UML really _wants_ is shared memory, which has 
> traditionally been available through /dev/shm.  Insisting that /tmp behave 
> like /dev/shm because otherwise what you get doesn't behave like shared 
> memory A) doesn't make a whole lot of sense, B) doesn't match existing 
> practice.

`Existing practice' seems to me to have pretty much wanted something,
uh, like tmpfs. But maybe your existing practice of /tmp is very different
from mine. (It certainly sounds like it.)

>> In fact, even when you're memory-constrained, if you *have* diskspace that
>> you could spend on /tmp, you can swap to it instead, and spend the space
>> on virtual memory when you're not spending it on /tmp.
> 
> "can" doesn't mean "should".  Yes you can make a 10 gigabyte swap partition, 
> but most people actively don't want one because if your system ever winds up 
> using more than about twice as much swap space as it has physical memory, 
> it's likely that the amount of swap thrashing you're doing is getting 
> pathological.

You've never used dar in infinint mode or watched large matrix maths stuff
churn through to completion :/ there really are things with insane memory
requirements and good locality of reference. (I think the most I ever saw
dar eat was 15Gb of swap. *gah*)

>                Having a runaway app have to churn through 10 gigabytes of 
> swap space before the OOM killer terminates it can turn 30 seconds of 
> paralysis into 10 minutes.  Not an improvement.

The problem there is that it's churning, i.e. that its locality of reference
is crap. Such a program should indeed not be allowed to eat that much swap.

> Also, although it's pretty common to have 10 gigabytes of spare disk space on 
> a modern laptop, it is _not_ common to have 10 gigabytes of spare swap space, 
> and that's for a reason.  Extra space in your filesystem can be used for all 
> sorts of things.  Extra swap space is normally wasted.

You can zap it if you need it for something else pretty easily. swapfiles are
no slower than swap partitions these days, and swap partitions are easy to
turn into filesystems too.

> So having tmp just be a normal directory isn't really that bad of a choice.  
> It normally manifests no downsides whatsoever.  And encouraging people to 
> use /tmp is considered a security hole.

That depends if you have telepathic local attackers :) I don't.

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 21:04           ` Nix
@ 2005-11-25 22:31             ` Rob Landley
  2005-11-27 16:48               ` Blaisorblade
  2005-11-27 18:17               ` Nix
  2005-11-25 23:33             ` Blaisorblade
  1 sibling, 2 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-25 22:31 UTC (permalink / raw)
  To: Nix; +Cc: user-mode-linux-devel, Chris Lightfoot

On Friday 25 November 2005 15:04, Nix wrote:
> The ~/.kde directory doesn't contain temporary files, but persistent state:

~/.kde/share/apps/kmail/lock is persistent state?

I do know that half the time the darn battery runs out and kde suddenly shuts 
down my desktop without the courtesy of even _warning_ me first (oh it pops 
up a window three seconds before doing it), kmail doesn't have a chance to 
zap this file before being killed and thus I have to drill down and zap the 
sucker by hand or it'll refuse to run when I boot back up.

Circa Red Hat 9, konqueror's cache files were under .kde.  I have no idea what 
the junk in .kde/share/apps/kpdf is for...

But I take your point.  They've instituted a policy and tried to clean this 
up.  Similarly, .bash_history, .bittorrent, .DCOPserver*, .mcop, and all the 
other fun stuff written into home must be considered persistent state.

> and the same is true of /var/spool,

Doesn't /var/spool/cups contain files spooled to the printer?  (I dunno, the 
only printer in the house is hooked up to my fiance's windows machine.)

> > things like vi would create temporary files in /tmp,
>
> /var/tmp, because the entire point of those files was to survive a reboot.

Is it?  I thought it was to support undo.

> >                                                      but these days it
> > uses .${filename}.swp in the same directory as the file being edited.
>
> Yes, and I absolutely despise this behaviour. Is there any way to force vim
> to use /var/tmp like everyone else?

It's a compile-time option.  (I accidentally set it to use /tmp once and had 
to figure out how to undo it.)

> - programs writing to $TMPDIR; config.guess, configure, and GCC are big
> users on my systems, but lots of other apps write here for a while. Of
> course you could point TMPDIR somewhere else, but does anyone do that?
>
>   There are a quite surprising number of these: generally the files live
>   for brief instants before being unlinked, if at all. (mkstemp() creates
> its files in $TMPDIR, after all, and often for those files minimal overhead
> is what counts; and like it or not tmpfs has lower overhead than ext*fs.)

Files that live for brief instants never get written out to disk anyway.  
That's why there's the delay before dirty pages in the page cache are 
scheduled for writeout.  So tmpfs doesn't help there.
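
A sketch for inspecting that delay, assuming a Linux kernel that exposes the
vm tunables dirty_expire_centisecs (how old dirty data must be before it is
eligible for writeout) and dirty_writeback_centisecs (how often the
writeback threads wake up) under /proc/sys/vm. The function names are made
up for the example.

```python
import os


def centisecs_to_seconds(raw):
    """Convert the text of a *_centisecs tunable into seconds."""
    return int(raw.strip()) / 100.0


def writeback_settings(base="/proc/sys/vm"):
    """Return the writeback timing tunables in seconds (None if unreadable)."""
    settings = {}
    for name in ("dirty_expire_centisecs", "dirty_writeback_centisecs"):
        try:
            with open(os.path.join(base, name)) as handle:
                settings[name] = centisecs_to_seconds(handle.read())
        except OSError:
            settings[name] = None  # not Linux, or /proc is not mounted
    return settings
```

A file created and unlinked well inside the expire interval is the case
being described here: its pages are dropped before they are old enough to
be written out.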

> - users. A *lot* of my users dump temporary crud in /tmp:

Yeah, at Rutgers we used to do that on the Sun machines to get around the disk 
quota.

> the names of these files aren't predictable unless you're telepathic
> so we're pretty safe from symlink attacks. (My local users are Nice Guys
> anyway, or I shoot them. No shots have so far been necessary.)
>
> Maybe your users don't dump everything they don't care much about in
> /tmp: mine are always sticking all sorts of things in there from
> half-chewed LaTeX through to boring logfiles and stuff being looked over
> on its way to the printer :)

Sounds like your users are old unix hands who cut their teeth on traditional 
Unix boxes in the days before Linux.

> > Keep in mind that tmpfs used to be shmfs, and what it's good at is
> > providing shared memory.
>
> Yep. It just so happens that this gives good properties for transient stuff
> that should vanish no later than the next reboot, and generally lives only
> for as long as someone has it open.

*shrug*.  The truly transient stuff never leaves the page cache, no matter 
what the filesystem.  (Especially if you mount with noatime, which is the 
norm these days.)

> >                 What UML really _wants_ is shared memory, which has
> > traditionally been available through /dev/shm.  Insisting that /tmp
> > behave like /dev/shm because otherwise what you get doesn't behave like
> > shared memory A) doesn't make a whole lot of sense, B) doesn't match
> > existing practice.
>
> `Existing practice' seems to me to have pretty much wanted something,
> uh, like tmpfs. But maybe your existing practice of /tmp is very different
> from mine. (It certainly sounds like it.)

Out there in the field, today, /tmp is not usually tmpfs.  And nobody's seen 
enough benefit in it to bother deploying it on the Fedora, Gentoo, and Ubuntu 
systems I've tested.

I suspect that knoppix uses tmpfs for /tmp, since it has no backing store.  
(Firing up knoppix 4.0 under qemu...)  Heh.  I was sort of right.  /tmp 
doesn't have anything explicitly mounted on it, but inherits the unionfs 
mount on root, which is a combination of the cdrom and a tmpfs mount 
on /ramdisk.  So it is sort of tmpfs, but not explicitly.  It seems to line 
up with Jeff's recommendations entirely by accident. :)
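
A sketch of how one might check what actually backs /tmp on a given box, by
finding the longest mount point in /proc/mounts that contains the path. The
sample table is hypothetical, and the parsing assumes the usual
"device mountpoint fstype options ..." layout.

```python
def filesystem_backing(path, mounts_text):
    """Return (mount_point, fstype) for the mount that contains path."""
    best = None
    target = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # Note: /proc/mounts escapes spaces in names as \040; not decoded here.
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if target.startswith(prefix):
            # On equal length the later line wins: a later mount shadows
            # an earlier one at the same mount point.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype)
    return best


# Hypothetical table for a system with no separate /tmp mount.
SAMPLE_MOUNTS = """\
rootfs / rootfs rw 0 0
/dev/hda1 / ext3 rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
"""
```

On a live system the second argument would be the contents of /proc/mounts.
With the sample above, /tmp resolves to the ext3 root while /dev/shm
resolves to tmpfs, which is the situation this thread started from.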

> > "can" doesn't mean "should".  Yes you can make a 10 gigabyte swap
> > partition, but most people actively don't want one because if your system
> > ever winds up using more than about twice as much swap space as it has
> > physical memory, it's likely that the amount of swap thrashing you're
> > doing is getting pathological.
>
> You've never used dar in infinint mode

Never even heard of it.

> or watched large matrix maths stuff 
> churn through to completion :/

Oh I've watched large jobs thrash the heck out of a machine all afternoon.  
Classic ray tracing, for example...

> there really are things with insane memory 
> requirements and good locality of reference. (I think the most I ever saw
> dar eat was 15Gb of swap. *gah*)

I'm not saying there aren't uses for it, I'm just saying it's not the norm and 
hence not a sane default.

> >                Having a runaway app have to churn through 10 gigabytes of
> > swap space before the OOM killer terminates it can turn 30 seconds of
> > paralysis into 10 minutes.  Not an improvement.
>
> The problem there is that it's churning, i.e. that its locality of
> reference is crap. Such a program should indeed not be allowed to eat that
> much swap.

Well, there's also the fact a high-end modern laptop probably has about 80 
gigs of storage space and the cheaper ones have 40 or even 20.  So eating 10 
gigs of that just isn't an option.

More laptops were sold last year than workstations.

> > Also, although it's pretty common to have 10 gigabytes of spare disk
> > space on a modern laptop, it is _not_ common to have 10 gigabytes of
> > spare swap space, and that's for a reason.  Extra space in your
> > filesystem can be used for all sorts of things.  Extra swap space is
> > normally wasted.
>
> You can zap it if you need it for something else pretty easily. swapfiles
> are no slower than swap partitions these days, and swap partitions are easy
> to turn into filesystems too.

I've done this, but it's not automatic.  (Did they ever make swapfiles 
reliable so they don't lock up under low memory situations?)

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 21:04           ` Nix
  2005-11-25 22:31             ` Rob Landley
@ 2005-11-25 23:33             ` Blaisorblade
  2005-11-26  2:12               ` Nix
  2005-11-26 10:44               ` Rob Landley
  1 sibling, 2 replies; 42+ messages in thread
From: Blaisorblade @ 2005-11-25 23:33 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Nix, Rob Landley, Chris Lightfoot

On Friday 25 November 2005 22:04, Nix wrote:
> On Fri, 25 Nov 2005, Rob Landley moaned:
> > On Friday 25 November 2005 13:33, Nix wrote:

> > Actually, I consider the fact the OOM killer doesn't delete files out of
> > tmpfs mounts to be a potential disadvantage in this context.

I don't quite understand this - what's the fact?

> Yeah, true, if you think the OOM killer is worthwhile (I do: most of the MM
> hackers don't. I know who knows more about the Linux kernel's MM and it's
> not me!)

Its heuristics are crap (many cases break them), and the concept is crap: 
damn hell, a C programmer has been taught to check that malloc() can return 
NULL, not that he should patch a kernel to get a meaningful behaviour.

In fact, luckily, Linux provides a "strict overcommit policy", i.e. no 
overselling of memory.
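
For reference, a sketch that reports which policy a system is using,
assuming the vm.overcommit_memory sysctl is readable at
/proc/sys/vm/overcommit_memory (0 = heuristic, 1 = always overcommit,
2 = strict accounting). The function names and the wording of the labels
are made up for the example.

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting (no overcommit)",
}


def describe_overcommit(raw):
    """Translate the text of vm.overcommit_memory into a readable label."""
    return OVERCOMMIT_MODES.get(int(raw.strip()), "unknown")


def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return the label for the running system, or None if unreadable."""
    try:
        with open(path) as handle:
            return describe_overcommit(handle.read())
    except OSError:
        return None  # not Linux, or /proc is not mounted
```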

However, the idea of an OOM could be made to work, if you can kill an app 
based on the derivative of its memory usage (i.e. how fast usage has 
increased over the last moments).
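
A toy sketch of that idea, purely to make it concrete: given a few memory
samples per process, pick the one whose usage grew fastest. This is not
what the kernel's OOM killer does; the data layout and names are
assumptions invented for the example.

```python
def fastest_grower(samples):
    """Pick the process whose memory usage grew fastest.

    samples maps a process name to [(time_seconds, resident_bytes), ...],
    oldest sample first.  Returns (name, bytes_per_second), or (None, 0.0)
    if nothing is growing.
    """
    best_name, best_rate = None, 0.0
    for name, points in samples.items():
        if len(points) < 2:
            continue
        (t0, m0), (t1, m1) = points[0], points[-1]
        if t1 <= t0:
            continue
        rate = (m1 - m0) / (t1 - t0)  # average growth over the window
        if rate > best_rate:
            best_name, best_rate = name, rate
    return best_name, best_rate
```

A big but stable process scores zero here, while a small runaway one scores
high, which is the distinction the derivative is meant to capture.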

> > Using /tmp for anything has been kind of discouraged for a while, because
> > throwing any insufficiently randomized filename in there is a security
> > hole waiting to happen.

> Um, atomically create a directory,

DoS-able if filenames are predictable...

> spray all the files you like under 
> it.  Trivial. Doesn't everyone have a bit of scriptage that does that in
> $LANGUAGE_OF_CHOICE?

I've never seen anybody do it, IIRC. Not even with mkstemp() (although today I 
discovered mkdtemp()).

> > Most of the remaining uses of /tmp are actually for things like named
> > sockets (where tmpfs really doesn't help at all), or for tiny little
> > files (like all the mcop crap) that on a different day would live under
> > /var.

There is no point in keeping them at reboot so why under /var?

> I suspect what causes this is that /tmp is explicitly for uses that
> *don't outlive a reboot*, and how many of those are there? Not all that
> damn many. --- at least, not all that damn many *programmatic* ones.

> > things like vi would create temporary files in /tmp,

> /var/tmp, because the entire point of those files was to survive a reboot.

> >                                                      but these days it
> > uses .${filename}.swp in the same directory as the file being edited.

> Yes, and I absolutely despise this behaviour. Is there any way to force vim
> to use /var/tmp like everyone else?

> Since /tmp and /var/tmp serve different purposes, yes, of course. This has
> always been true; right back in the early Slackware days the boot scripts
> used to carefully scrub /tmp but leave /var/tmp alone.

> >                 What UML really _wants_ is shared memory, which has
> > traditionally been available through /dev/shm.  Insisting that /tmp
> > behave like /dev/shm because otherwise what you get doesn't behave like
> > shared memory A) doesn't make a whole lot of sense, B) doesn't match
> > existing practice.

> `Existing practice' seems to me to have pretty much wanted something,
> uh, like tmpfs. But maybe your existing practice of /tmp is very different
> from mine. (It certainly sounds like it.)

Existing practice is there for very good reasons:

- back in 2.4, tmpfs on /tmp broke mkinitrd since it tried to loop-mount the 
new initrd, which was in /tmp. And loop-mount over tmpfs didn't work.

- now most distros tend to *suggest* mounting tmpfs there (Gentoo does suggest 
this). But since it's not back-compatible, i.e. if you don't know about it you 
lose data, it's up to the admin to do it. And btw, the admin of the network 
I'm writing from didn't notice that a partition was not mounted on /home - 
and told me "hey! I ran yum update and it removed the whole /home!".

He had never known that "mount" lists mounts...

(Btw, the problem was that he added a new external disk, but labeled it /boot, 
like an existing /boot partition, so mount -a choked with "duplicate label 
'/boot'" and it stopped before mounting /home).

Sorry for the OT, hope it's fun ;D

> >> In fact, even when you're memory-constrained, if you *have* diskspace
> >> that you could spend on /tmp, you can swap to it instead, and spend the
> >> space on virtual memory when you're not spending it on /tmp.

> > "can" doesn't mean "should".  Yes you can make a 10 gigabyte swap
> > partition, but most people actively don't want one because if your system
> > ever winds up using more than about twice as much swap space as it has
> > physical memory, it's likely that the amount of swap thrashing you're
> > doing is getting pathological.

> You've never used dar in infinint mode or watched large matrix maths stuff
> churn through to completion :/ there really are things with insane memory
> requirements and good locality of reference. (I think the most I ever saw
> dar eat was 15Gb of swap. *gah*)

Boy, be serious - we are talking about normal systems, and you know that you'd 
better run dar on properly sized systems...

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 20:18         ` Rob Landley
  2005-11-25 21:04           ` Nix
@ 2005-11-25 23:46           ` Chris Lightfoot
  2005-11-26 10:03             ` Rob Landley
  1 sibling, 1 reply; 42+ messages in thread
From: Chris Lightfoot @ 2005-11-25 23:46 UTC (permalink / raw)
  To: Rob Landley; +Cc: Nix, user-mode-linux-devel

On Fri, Nov 25, 2005 at 02:18:43PM -0600, Rob Landley wrote:
> Using /tmp for anything has been kind of discouraged for a while, because 
> throwing any insufficiently randomized filename in there is a security hole 
> waiting to happen.

Which case are you worried about here? SFAIK all the
filesystems anyone is likely to mount on /tmp implement
O_EXCL correctly, and in any case (as was remarked
elsewhere) there's always mkdir.
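
A minimal sketch of the O_EXCL behaviour being relied on, using Python's
os.open() as a thin wrapper over open(2). The function and file names are
made up for the example.

```python
import os
import shutil
import tempfile


def create_exclusively(path):
    """Create path for writing, refusing to reuse anything already there."""
    # O_CREAT | O_EXCL makes the existence check and the creation one atomic
    # step; a symlink already sitting at that name makes the call fail too.
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)


def second_create_is_refused():
    """Return True if creating the same name twice fails the second time."""
    workdir = tempfile.mkdtemp()
    try:
        target = os.path.join(workdir, "report.tmp")
        os.close(create_exclusively(target))
        try:
            os.close(create_exclusively(target))
        except FileExistsError:
            return True
        return False
    finally:
        shutil.rmtree(workdir)
```

This closes the race on the name, but a predictable name still lets
another user squat on it first and make the create fail, which is the
denial-of-service concern raised elsewhere in the thread.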

-- 
Sudden death syndrome, eh?
Sounds nasty.
What are the symptoms?              (seen on the internet)



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 23:33             ` Blaisorblade
@ 2005-11-26  2:12               ` Nix
  2005-11-26 11:47                 ` Rob Landley
  2005-11-26 10:44               ` Rob Landley
  1 sibling, 1 reply; 42+ messages in thread
From: Nix @ 2005-11-26  2:12 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Rob Landley, Chris Lightfoot

On Sat, 26 Nov 2005, blaisorblade@yahoo.it announced authoritatively:
> On Friday 25 November 2005 22:04, Nix wrote:
>> On Fri, 25 Nov 2005, Rob Landley moaned:
>> > On Friday 25 November 2005 13:33, Nix wrote:
> 
>> > Actually, I consider the fact the OOM killer doesn't delete files out of
>> > tmpfs mounts to be a potential disadvantage in this context.
> 
> I don't quite understand this - what's the fact?

I'm not sure why deleting files is considered a good thing, but I guess
it *does* mean that sticking files in a tmpfs reduces the freedom of the
OOM killer somewhat. Still, it's rarely (OK, in my experience never) a
problem.

If it's a problem you have both hostile users and no size limits on /tmp
and you therefore have bigger problems anyway. :)

>> Yeah, true, if you think the OOM killer is worthwhile (I do: most of the MM
>> hackers don't. I know who knows more about the Linux kernel's MM and it's
>> not me!)
> 
> Its heuristics are crap (many cases break them), and the concept is crap: 
> damn hell, a C programmer has been taught to check that malloc() can return 
> NULL, not that he should patch a kernel to get a meaningful behaviour.

Yeah, but it does sort of work. Personally I prefer to just never run out
of memory :)

> In fact, luckily, Linux provides a "strict overcommit policy", i.e. no 
> overselling of memory.

... alas, I run a number of programs that rely on memory overcommitment
(some of them rely on sparse files and mmap()-thereof as well).

> However, the idea of an OOM could be made to work, if you can kill an app 
> based on the derivative of its memory usage (i.e. how fast usage has 
> increased over the last moments).

... and the VM appears to be growing things that might help in that area :)

>> > Using /tmp for anything has been kind of discouraged for a while, because
>> > throwing any insufficiently randomized filename in there is a security
>> > hole waiting to happen.
> 
>> Um, atomically create a directory,
> 
> DoS-able if filenames are predictable...

... with a random name, obviously. :)

>> spray all the files you like under 
>> it.  Trivial. Doesn't everyone have a bit of scriptage that does that in
>> $LANGUAGE_OF_CHOICE?
> 
> I've never seen anybody do it, IIRC. Not even with mkstemp() (although today I 
> discovered mkdtemp()).

Oh. I do it all the time. I prefer not to work under the assumption that
I'm more brilliant than thirty years of Unix hackers and spotted
something none of them did, but so be it...

(and yes, mkdtemp() is cool, and since it's a directory name it avoids the
tiny little vast gaping problems with mktemp().)

>> `Existing practice' seems to me to have pretty much wanted something,
>> uh, like tmpfs. But maybe your existing practice of /tmp is very different
>> from mine. (It certainly sounds like it.)
> 
> Existing practice is there for very good reasons:
> 
> - back in 2.4, tmpfs on /tmp broke mkinitrd since it tried to loop-mount the 
> new initrd, which was in /tmp. And loop-mount over tmpfs didn't work.

Ah, well, I never use initrd if I can avoid it, and a bug in one tool is
a reason to *fix that tool*, not rejig the whole damn system.

> - now most distros tend to *suggest* mounting tmpfs there (Gentoo does suggest 
> this). But since it's not back-compatible, i.e. if you don't know about it you 
> lose data, it's up to the admin to do it. And btw, the admin of the network 
> I'm writing from didn't notice that a partition was not mounted on /home - 
> and told me "hey! I ran yum update and it removed the whole /home!".
> 
> He had never known that "mount" lists mounts...

Hey, I made that mistake at least once, after arranging for home
directories to get bind-mounted into place and then forgetting that I'd
done it, so three years later when the bind-mounting broke due to stupid
script bugs I was running around in a panic for some time wondering
where everyone's $HOME had gone.

(and `mount', of course, only lists mounts if you trust /proc/mounts to
be accurate. What does it look like in this brave new world of shared
subtrees? Obviously /etc/mtab *must* be a symlink to /proc/mounts, now,
only oops that breaks the quota tools...)

> (Btw, the problem was that he added a new external disk, but labeled it /boot, 
> like an existing /boot partition, so mount -a choked with "duplicate label 
> '/boot'" and it stopped before mounting /home).

I think now is an appropriate time to say

I HATE FSCKING MTAB

(in three-part harmony, probably)

> Sorry for the OT, hope it's fun ;D

It's terribly early on Saturday morning, I've got a cold, and I can't
sleep. Staying *on*-topic is likely to be impossible.

>> You've never used dar in infinint mode or watched large matrix maths stuff
>> churn through to completion :/ there really are things with insane memory
>> requirements and good locality of reference. (I think the most I ever saw
>> dar eat was 15Gb of swap. *gah*)
> 
> Boy, be serious - we are talking about normal systems, and you know that you'd 
> better run dar on properly sized systems...

I still boggle that infinint mode is the default for that tool. Its
memory requirements are sane if you ask it to use 64-bit longs instead,
but oh no the default must be an unbounded-size (minimum ~64-byte-long)
numeric class because it's so *common* for you to have >2^64 files in
your backup, or individual files >2^64 bytes long in that backup.

Bah. Silly design decision.

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 23:46           ` Chris Lightfoot
@ 2005-11-26 10:03             ` Rob Landley
  2005-11-26 10:15               ` Chris Lightfoot
  0 siblings, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-26 10:03 UTC (permalink / raw)
  To: Chris Lightfoot; +Cc: Nix, user-mode-linux-devel

On Friday 25 November 2005 17:46, Chris Lightfoot wrote:
> On Fri, Nov 25, 2005 at 02:18:43PM -0600, Rob Landley wrote:
> > Using /tmp for anything has been kind of discouraged for a while, because
> > throwing any insufficiently randomized filename in there is a security
> > hole waiting to happen.
>
> Which case are you worried about here? SFAIK all the
> filesystems anyone is likely to mount on /tmp implement
> O_EXCL correctly, and in any case (as was remarked
> elsewhere) there's always mkdir.

I think programmers got the general impression using /tmp for temporary files 
was a really stupid idea from the fact that it keeps cropping up on things 
like LWN's security section.  Here's the ones they linked to just last week 
as still being fixed by various distros:
http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2004-0968
http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2005-2672
http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2005-2851
http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2005-2104
http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2005-3124

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-26 10:03             ` Rob Landley
@ 2005-11-26 10:15               ` Chris Lightfoot
  0 siblings, 0 replies; 42+ messages in thread
From: Chris Lightfoot @ 2005-11-26 10:15 UTC (permalink / raw)
  To: Rob Landley; +Cc: Nix, user-mode-linux-devel

On Sat, Nov 26, 2005 at 04:03:54AM -0600, Rob Landley wrote:
> On Friday 25 November 2005 17:46, Chris Lightfoot wrote:
> > On Fri, Nov 25, 2005 at 02:18:43PM -0600, Rob Landley wrote:
> > > Using /tmp for anything has been kind of discouraged for a while, because
> > > throwing any insufficiently randomized filename in there is a security
> > > hole waiting to happen.
> >
> > Which case are you worried about here? SFAIK all the
> > filesystems anyone is likely to mount on /tmp implement
> > O_EXCL correctly, and in any case (as was remarked
> > elsewhere) there's always mkdir.
> 
> I think programmers got the general impression using /tmp for temporary files 
> was a really stupid idea from the fact that it keeps cropping up on things 
> like LWN's security section.  Here's the ones they linked to just last week 
> as still being fixed by various distros:
    [...]

hmm. I'm not sure any of that's an argument for avoiding
use of /tmp in new programs. I'm not really sure what the
sensible alternative is, either: at least you can sensibly
write policy about (e.g.) cleaning old files out of /tmp
if you want to, whereas if you have multiple ad-hoc
policies for temporary files, you can't.

-- 
language not worship must pink delirious sleep produce
(fridge poetry)



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 23:33             ` Blaisorblade
  2005-11-26  2:12               ` Nix
@ 2005-11-26 10:44               ` Rob Landley
  2005-11-27 16:38                 ` Blaisorblade
  2005-11-27 17:10                 ` Blaisorblade
  1 sibling, 2 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-26 10:44 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Nix, Chris Lightfoot

On Friday 25 November 2005 17:33, Blaisorblade wrote:
> On Friday 25 November 2005 22:04, Nix wrote:
> > On Fri, 25 Nov 2005, Rob Landley moaned:
> > > On Friday 25 November 2005 13:33, Nix wrote:
> > >
> > > Actually, I consider the fact the OOM killer doesn't delete files out
> > > of tmpfs mounts to be a potential disadvantage in this context.
>
> Not quite understood this - what's the fact?

That a normal user can allocate persistent memory that outlives all their 
processes, with no special privileges, and that the limits on it are per 
system rather than per user.  (In theory you can apply quota to /tmp but I've 
never seen anybody do it.  And yeah, shmfs is no worse than shmget, for 
obvious reasons.  Apparently System V wasn't big into reference counting.)

I'm not saying it's a disproportionately enormous downside, I'm just saying it 
is one compared to having /tmp on a normal filesystem.  DRAM tends to be a 
much more scarce resource than disk space these days.

> > Yeah, true, if you think the OOM killer is worthwhile (I do: most of the
> > MM hackers don't. I know who knows more about the Linux kernel's MM and
> > it's not me!)
>
> Its heuristics are crap (many cases breaking them), and the concept is crap:

Expecting DRAM to work properly is a heuristic.  (And ECC memory is a backup 
heuristic on top of that.)

> damn hell, a C programmer has been taught to check that malloc() can return
> NULL, not that he should patch a kernel to get a meaningful behaviour.

Since when does malloc() return NULL?  With the default overcommit policy it 
happily succeeds as long as there's enough virtual address space; you only run 
out of physical memory later, on page faults during normal access.  (And if 
you run in no-overcommit mode you get spurious OOMs every time a pig like 
mozilla forks, even though it's about to exec.  Unless we're back to reserving 
huge amounts of swap space, which means that in a real OOM situation we thrash 
endlessly, and which your average laptop may not have to spare...)

I'm not saying it's unsolvable.  I'm saying that the OOM killer is better than 
the hard lockups I used to see before.  (Admittedly not much better: 
"echo 0 > /proc/sys/vm/swappiness" and the OOM killer goes all rabid.  But oh 
well.)

> In fact, luckily, Linux provides a "strict overcommit policy", i.e. no
> overselling of memory.

Yup.  And the default is?  (And which distros have a different default?)
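
For reference, the policy in effect on a given box can be read back from 
/proc/sys/vm/overcommit_memory: 0 is the heuristic overcommit mode (the 
kernel's built-in default), 1 always overcommits, and 2 is the strict 
accounting mode mentioned above.  A minimal sketch in C follows; the function 
names read_overcommit_mode() and overcommit_mode_name() are made up for 
illustration and are not part of UML or the kernel.

```c
#include <stdio.h>

/*
 * Read the current overcommit policy from procfs.
 * Returns 0, 1 or 2 on success, or -1 if the file cannot be read or
 * parsed (for example when /proc is not mounted).
 */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Map the numeric policy to a human-readable description. */
const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:
        return "heuristic overcommit";
    case 1:
        return "always overcommit";
    case 2:
        return "strict accounting (no overcommit)";
    default:
        return "unknown";
    }
}
```

Printing overcommit_mode_name(read_overcommit_mode()) on a stock install is a 
quick way to see which policy a distro ships with.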

> However, the idea of an OOM could be made to work, if you can kill an app
> based on the derivative of its memory usage (i.e. how fast usage has
> increased over the last moments).

There are better heuristics, sure.

> > > Using /tmp for anything has been kind of discouraged for a while,
> > > because throwing any insufficiently randomized filename in there is a
> > > security hole waiting to happen.
> >
> > Um, atomically create a directory,
>
> DoS-able if filenames are predictable...

It is possible to work through all the ramifications of using /tmp and end up 
using it in a safe manner.

Whether it's a can of worms your average programmer should be exposed to is 
another matter.

What's the advantage of using /tmp?  Your home directory generally isn't world 
writeable (and if it is you have bigger problems), and that's a fine place 
for temp files.  Besides, lots of tempfiles are the kind that "sed -i" 
creates, and although you want them to be unpredictable (in case you do use 
it in a world writeable directory), you also want to be able to mv the result 
to the final filename, and you can't do that atomically across different 
filesystems, so having it in the same directory makes sense.  We've got much 
better ways of doing IPC these days (a temp file is still useful for users 
doing "sed -f", but in scripts you can make arbitrarily long command lines 
these days), and a program that needs to remember something can keep it in 
RAM.  What's left?  Batch jobs submitted to your outgoing mailer or your print 
spooler don't live in /tmp anymore, they live in /var/spool...
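
The "sed -i" pattern described above (unpredictable temp file in the same 
directory, then rename over the final name) can be sketched roughly like this 
in C.  This is only an illustration, not what sed actually does: the function 
name replace_file() is made up, and the sketch ignores things a real tool 
would care about, such as preserving the original file's mode and ownership 
(mkstemp() creates the file with mode 0600).

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace the contents of `path` with the string `data` by writing a
 * randomly named temp file next to it and rename()ing it into place.
 * Because both names live in the same directory, the rename stays on
 * one filesystem.  Returns 0 on success, -1 on error.
 */
int replace_file(const char *path, const char *data)
{
    char tmp[4096];
    size_t left = strlen(data);
    int fd;
    int n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);

    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;

    /* mkstemp() fills in the XXXXXX and opens the file exclusively. */
    fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    while (left > 0) {
        ssize_t w = write(fd, data, left);
        if (w < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        data += w;
        left -= (size_t)w;
    }

    if (close(fd) != 0 || rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Calling replace_file("notes.txt", "hello\n") leaves either the old or the new 
contents under that name, never a half-written file.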

> > > Most of the remaining uses of /tmp are actually for things like named
> > > sockets (where tmpfs really doesn't help at all), or for tiny little
> > > files (like all the mcop crap) that on a different day would live under
> > > /var.
>
> There is no point in keeping them at reboot so why under /var?

Why have "pid" files always lived under /var?  (It conceptually doesn't make 
sense for them to live across reboots, and yet that's been the standard all 
along...)

It's normal in /var for there to be persistent directories with known 
permissions that an application can create and then depend on being there.  
And being able to scribble in a private directory is enough of an advantage I 
don't think anybody cares about the persisting across reboot semantics one 
way or the other.

This is a guess as to the last 30 years of usage, by the way...

> > `Existing practice' seems to me to have pretty much wanted something,
> > uh, like tmpfs. But maybe your existing practice of /tmp is very
> > different from mine. (It certainly sounds like it.)
>
> Existing practice is there for very good reasons:
>
> - back in 2.4, tmpfs on /tmp broke mkinitrd since it tried to loop-mount
> the new initrd, which was in /tmp. And loop-mount over tmpfs didn't work.

I vaguely remember this being fixed. :)

> - now most distros tend to *suggest* mounting tmpfs there (Gentoo does
> suggest this). But since it's not back-compatible, aka if you don't know
> that you lose data, it's up to the admin to do it. And btw, the admin of
> the network I'm writing from didn't notice that a partition was not mounted
> on /home - and told me "hey! I ran yum update and it removed the whole
> /home!".
>
> He had never known that "mount" lists mounts...

I try not to assume that people know everything there is to know about Unix.  
I first used a Unix variant on a VAX back in the 80's (dialing into my 
father's work computer to download Amiga programs), spent three years poking 
at Sun workstations at Rutgers, and have used Linux as my primary desktop OS 
since 1998 (when I _finally_ got a video card that XFree86 could actually 
use).  I've wandered through the kernel source code, built my own 
distributions from source code for several different projects, debugged and 
reimplemented strange command line esoterica for busybox...

And despite this, I still encounter new things every couple of weeks.  I think 
the most useful one I learned this year was the "lsof" command.  The most 
recent might have been "setsid", depending on what the threshold is...  No 
wait, it was learning that "uname" isn't just a command line utility but is 
actually a syscall.  That was yesterday.

So when I see "you can configure your system like this, but the default is..." 
I immediately go "ok, 90% of the systems out there will have the default, 
we'd better at least not exhibit pathological behavior with that".

And the default seems to be that /tmp ain't tmpfs, but /dev/shm is.

> (Btw, the problem was that he added a new external disk, but labeled it
> /boot, like an existing /boot partition , so mount -a choked with
> "duplicate label '/boot'" and it stopped before mounting /home).

He's using Red Hat, isn't he? :)

(Been there, done that, moved the darn labels to /dev/hda4 and such.  Wouldn't 
recommend that with SCSI because the SCSI bus detects devices via chicken 
entrails and then enumerates them in sequence (with no gaps) on a first come 
first served basis.  With ATA, /dev/hdd3 means second controller, slave 
device, third partition, and that doesn't move unless you physically unplug 
it from its connector cable, no matter what else you plug in.  The whole 
_reason_ Red Hat has this boot label stuff is some people have an unreasoning 
love of SCSI devices. :)

> Sorry for the OT, hope it's fun ;D
>
> > >> In fact, even when you're memory-constrained, if you *have* diskspace
> > >> that you could spend on /tmp, you can swap to it instead, and spend
> > >> the space on virtual memory when you're not spending it on /tmp.
> > >
> > > "can" doesn't mean "should".  Yes you can make a 10 gigabyte swap
> > > partition, but most people actively don't want one because if your
> > > system ever winds up using more than about twice as much swap space as
> > > it has physical memory, it's likely that the amount of swap thrashing
> > > you're doing is getting pathological.
> >
> > You've never used dar in infinint mode or watched large matrix maths
> > stuff churn through to completion :/ there really are things with insane
> > memory requirements and good locality of reference. (I think the most I
> > ever saw dar eat was 15Gb of swap. *gah*)
>
> Boy, be serious - we are talking about normal systems, and you know that
> you'd better run dar on properly sized systems...

$ man dar
No manual entry for dar

google dar:  Daughters of the American Revolution, Dar Williams, Dar-us-Salam 
Islamic Publications...

google dar linux:
Some French disk archiving tool, apparently.  I generally just use tarballs or 
rsync.

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-26  2:12               ` Nix
@ 2005-11-26 11:47                 ` Rob Landley
  2005-11-27 17:37                   ` Blaisorblade
  2005-11-27 18:31                   ` Nix
  0 siblings, 2 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-26 11:47 UTC (permalink / raw)
  To: Nix; +Cc: Blaisorblade, user-mode-linux-devel, Chris Lightfoot

On Friday 25 November 2005 20:12, Nix wrote:
> If it's a problem you have both hostile users and no size limits on /tmp
> and you therefore have bigger problems anyway. :)

The size limits on /tmp aren't per-user.

> >> Yeah, true, if you think the OOM killer is worthwhile (I do: most of the
> >> MM hackers don't. I know who knows more about the Linux kernel's MM and
> >> it's not me!)
> >
> > Its heuristics are crap (many cases breaking them), and the concept is
> > crap: damn hell, a C programmer has been taught to check that malloc()
> > can return NULL, not that he should patch a kernel to get a meaningful
> > behaviour.
>
> Yeah, but it does sort of work. Personally I prefer to just never run out
> of memory :)

My laptop has 512 megs of ram, and 700 megs of swap.  I'm running QEMU to boot 
a knoppix image with 256 megs of ram, running UML to build gcc 4 (which has a 
high water mark of disk usage somewhere north of 128 megs).  I have two 
konqueror windows open with an average of 30 tabs in each.  I have kmail open 
with a threaded view of linux-kernel with 69,649 messages in that folder.  
Plus the general overhead for kde, two standalone pdf viewers, several 
terminal windows, a partridge in a pear tree, and so on.

It's been a few weeks since I've triggered the OOM killer, but I've done it.

> > However, the idea of an OOM could be made to work, if you can kill an app
> > based on the derivative of its memory usage (i.e. how fast usage has
> > increased over the last moments).
>
> ... and the VM appears to be growing things that might help in that area :)

We get better as time goes on.

My original point was that what UML wants is shared memory semantics.  
It's trusting /tmp to provide different behavior than simply using ~, and 
this turns out to be a very unreliable assumption.  There is a directory 
(/dev/shm) whose entire purpose is to provide those semantics, and which 
shouldn't even _exist_ if it doesn't.  I believe that would be a better 
directory to use.

I can submit a patch for this.  It's arch/um/os-Linux/mem.c, line 37, in 
find_tempdir().

And while I'm at it, os-Linux/start_up.c has a check_tmpexec() that has "/tmp" 
hardwired into its messages, even if that's not what find_tempdir() 
returned...
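
To make the idea concrete: rather than trusting a directory name, the lookup 
could ask the kernel what filesystem actually backs a candidate directory.  
The following is a rough sketch under my own assumptions, not the existing 
find_tempdir() code and not a proposed patch; is_tmpfs() and pick_tempdir() 
are hypothetical names.  It relies on statfs(2) and the tmpfs magic number 
from <linux/magic.h>.

```c
#include <stddef.h>
#include <sys/vfs.h>

#ifndef TMPFS_MAGIC
#define TMPFS_MAGIC 0x01021994  /* same value as in <linux/magic.h> */
#endif

/*
 * Returns 1 if `path` lives on tmpfs, 0 if it lives on some other
 * filesystem, and -1 if statfs() fails (e.g. the path does not exist).
 */
int is_tmpfs(const char *path)
{
    struct statfs st;

    if (statfs(path, &st) != 0)
        return -1;
    return st.f_type == TMPFS_MAGIC ? 1 : 0;
}

/*
 * Hypothetical selection helper: return the first candidate directory
 * that is backed by tmpfs, or `fallback` if none of them is.
 */
const char *pick_tempdir(const char *const *candidates, size_t n,
                         const char *fallback)
{
    for (size_t i = 0; i < n; i++) {
        if (candidates[i] && is_tmpfs(candidates[i]) == 1)
            return candidates[i];
    }
    return fallback;
}
```

With candidates such as { getenv("TMPDIR"), "/dev/shm", "/tmp" } this would 
prefer whichever one is really tmpfs, and only fall back to a disk-backed 
directory when nothing better exists.  A real version would presumably also 
check that the directory is writable and allows exec mappings.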

> >> > Using /tmp for anything has been kind of discouraged for a while,
> >> > because throwing any insufficiently randomized filename in there is a
> >> > security hole waiting to happen.
> >>
> >> Um, atomically create a directory,
> >
> > DoS-able if filenames are predictable...
>
> ... with a random name, obviously. :)

Like "/tmp/uml.ctl" in arch/um/drivers/daemon_kern.c, line 70?

(It's not obvious where this file is actually created, it's one of those funky 
callback things where data in a structure is used somewhere else...)
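
For comparison, the "atomically create a directory with a random name" 
approach looks roughly like this in C.  It is a sketch only: 
make_private_dir() is a made-up helper, and nothing here is taken from the UML 
sources.  mkdtemp() creates the directory with mode 0700 and fails rather than 
reusing an existing name, so a fixed-name file such as a control socket could 
then be placed inside it without being guessable from outside.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>

/*
 * Create a private directory named <base>/<prefix>XXXXXX, where the
 * XXXXXX part is replaced with random characters by mkdtemp().
 * On success the resulting path is left in `out` and 0 is returned;
 * on failure (path too long, or mkdtemp() error) -1 is returned.
 */
int make_private_dir(const char *base, const char *prefix,
                     char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s/%sXXXXXX", base, prefix);

    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return mkdtemp(out) ? 0 : -1;
}
```

A caller would do something like make_private_dir("/tmp", "uml-", buf, 
sizeof(buf)) and then create its socket under the returned directory.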

> > Never seen anybody doing it, IIRC. Not even mkstemp() (even if today I
> > discover mkdtemp()).
>
> Oh. I do it all the time. I prefer not to work under the assumption that
> I'm more brilliant than thirty years of Unix hackers and spotted
> something none of them did, but so be it...

30 years ago the Unix hackers were working on a 16-bit PDP-11 with two RK05 
disk packs storing 2.5 megabytes each.  And the reason they duplicated /bin 
and /sbin and /lib under /usr is that they ran out of space on the root disk 
and had to leak the OS into the second disk pack which had previously held 
all the user home directories.  And people never revisited this decision for 
the next three decades, despite the fact the "needed for early boot" 
rationale was entirely a pragmatic thing of the moment, and makes _no_ sense 
on a modern system ever since the invention of the initial ramdisk, let alone 
initramfs.

I personally symlink /bin, /sbin, and /lib to the corresponding /usr 
directories and consolidate the whole mess, myself.  Yes, you have to patch 
gcc's paths (in collect2) to not search _both_ /lib and /usr/lib because if 
GNU's linker finds the same symbols in two different libraries it statically 
links them in rather than trying to figure out which one is right, resulting 
in executables as big as if they're statically linked but still refusing to 
run if they can't find their shared libraries at run time.  That's a bug in 
ld.

The point is, it's important to know _what_ conclusions the 30 years of unix 
hackers came to, but keep in mind that the computing environment of 2005 is 
in some ways very different from the computing environments of 1976 or 1984.

> > - back in 2.4, tmpfs on /tmp broke mkinitrd since it tried to loop-mount
> > the new initrd, which was in /tmp. And loop-mount over tmpfs didn't work.
>
> Ah, well, I never use initrd if I can avoid it, and a bug in one tool is
> a reason to *fix that tool*, not rejig the whole damn system.

I agree initrd is kinda pointless, but initramfs isn't.  The kernel guys are 
moving towards initramfs being required someday.  These are still nebulous 
future plans with no actual deadline, but they include moving to dynamically 
assigned major/minor numbers (so you need something like udev to 
populate /dev) and having userspace find and mount the real root partition 
(think of booting from a USB key when your root partition lives on an NFS 
server, and in order to access it you have to dhcp yourself an address, 
nslookup the server name, and then log in with a public key from said USB 
stick...).  All the various partitioning schemes could be moved over to device 
mapper.  And so on.

They'd proposed a serious kernel crapectomy "for 2.7" back before 2.7 got put 
on indefinite hold.  How they're rolling it out now, we dunno.  They seem to 
be happy chewing their current mouthful, at the moment...

> (and `mount', of course, only lists mounts if you trust /proc/mounts to
> be accurate.

If the kernel doesn't know what's mounted, you have bigger problems.

> What does it look like in this brave new world of shared 
> subtrees?

I had this discussion on the kernel list a week or so back: namespaces are 
reference counted so as soon as the last process that can see a mount goes 
away, umount happens.  This means that umount -a should only zap everything 
in your current namespace, so after init kills all sub-processes it can 
then run umount -a for pid 1, and life is good.

I had this discussion because I wanted to make sure busybox umount would be 
doing it right.

> Obviously /etc/mtab *must* be a symlink to /proc/mounts, now, 
> only oops that breaks the quota tools...)

I rewrote busybox mount so that things work properly with /proc/mounts.  And I 
vaguely remember coming up with an in-house patch to fix the quota tools 
(they were upset by rootfs) something like four years ago.

> > (Btw, the problem was that he added a new external disk, but labeled it
> > /boot, like an existing /boot partition , so mount -a choked with
> > "duplicate label '/boot'" and it stopped before mounting /home).
>
> I think now is an appropriate time to say
>
> I HATE FSCKING MTAB
>
> (in three-part harmony, probably)

Everybody hates /etc/mtab.  It doesn't work if you chroot.  It can't handle 
--bind or --move mounts...  Just symlink it to /proc/mounts and recognize 
that any tool that can't handle that is a buggy tool that needs to be fixed.

> >> You've never used dar in infinint mode or watched large matrix maths
> >> stuff churn through to completion :/ there really are things with insane
> >> memory requirements and good locality of reference. (I think the most I
> >> ever saw dar eat was 15Gb of swap. *gah*)
> >
> > Boy, be serious - we are talking about normal systems, and you know that
> > you'd better run dar on properly sized systems...
>
> I still boggle that infinint mode is the default for that tool.

First time I've heard of the tool, but then back under 2.4.7 I remember I had 
rsync regularly triggering the OOM killer.  Not because rsync was leaking, 
but because the servers backing up only had 128 megs of memory and the 
balancing was _terrible_ so the dentry cache and page cache would squeeze out 
anonymous pages to the point where rsync itself got OOM killed...

People who want truly insane amounts of memory these days (often for graphics 
or video editing) tend to mmap their data files directly and work in there.  
Once again rendering insane amounts of swap less useful...

I'm under the vague impression there's some kind of madvise you can do that 
says "don't flush this before close unless you're responding to memory 
pressure".  Hmmm...  Closest I can find is MADV_RANDOM...

If we had a "treat this like it's on tmpfs" madvise, that would be ideal...

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-26 10:44               ` Rob Landley
@ 2005-11-27 16:38                 ` Blaisorblade
  2005-11-27 18:49                   ` Nix
  2005-11-27 17:10                 ` Blaisorblade
  1 sibling, 1 reply; 42+ messages in thread
From: Blaisorblade @ 2005-11-27 16:38 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel, Nix, Chris Lightfoot

On Saturday 26 November 2005 11:44, Rob Landley wrote:
> On Friday 25 November 2005 17:33, Blaisorblade wrote:

> > - back in 2.4, tmpfs on /tmp broke mkinitrd since it tried to loop-mount
> > the new initrd, which was in /tmp. And loop-mount over tmpfs didn't work.
>
> I vaguely remember this being fixed. :)

Yep, it was, but at 2.4.18 (my first kernel) it wasn't yet.

> > He had never known that "mount" lists mounts...
>
> I try not to assume that people know everything there is to know about
> Unix.

I've done my share of stupid things, but he's _paid_ as a sysadmin (though 
he's actually a programmer) and I learned this long ago, and since I started 
using Linux <=3.5 years ago, "long ago" means at the very beginning ... he 
_should_ know that. However he had backups, which shows he's well 
intentioned.

(as an aside, I do despise learning *so* fast... I wouldn't be able to use 
socket() without manuals, and there's a lot of stuff I'd like to learn well. 
And I'd really like to _start_ and finish even a little project on my own... 
it's been years since I started coding up some fun project).

> And the default seems to be that /tmp ain't tmpfs, but /dev/shm is.

I argue for that too...

> > (Btw, the problem was that he added a new external disk, but labeled it
> > /boot, like an existing /boot partition , so mount -a choked with
> > "duplicate label '/boot'" and it stopped before mounting /home).

> He's using Red Hat, isn't he? :)

Yes...

> (Been there, done that, moved the darn labels to /dev/hda4 and such. 
> Wouldn't recommend that with SCSI because the SCSI bus detects devices via
> chicken entrails and then enumerates them in sequence (with no gaps) on a
> first come first served basis.  With ATA, /dev/hdd3 means second
> controller, slave device, third partition, and that doesn't move unless you
> physically unplug it from its connector cable, no matter what else you plug
> in.  The whole _reason_ Red Hat has this boot label stuff is some people
> have an unreasoning love of SCSI devices. :)

Well, it makes sense anyhow, and though it's unusual and it sucks for the 
user, it would be much more meaningful to use labels rather than partitions 
(the same thing can happen when repartitioning, and I've seen bloody hell 
happen with partition tables).


> google dar linux:
> Some french disk archiving tool, apparently.  I generally just use tarballs
> or rsync.

It's clear Nix is using some calculation program (not sure which one).

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 22:31             ` Rob Landley
@ 2005-11-27 16:48               ` Blaisorblade
  2005-11-27 18:17               ` Nix
  1 sibling, 0 replies; 42+ messages in thread
From: Blaisorblade @ 2005-11-27 16:48 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Rob Landley, Nix, Chris Lightfoot

On Friday 25 November 2005 23:31, Rob Landley wrote:
> On Friday 25 November 2005 15:04, Nix wrote:
> > The ~/.kde directory doesn't contain temporary files, but persistent
> > state:
>
> ~/.kde/share/apps/kmail/lock is persistent state?
>
> I do know that half the time the darn battery runs out and kde suddenly
> shuts down my desktop without the courtesy of even _warning_ me first (oh
> it pops up a window three seconds before doing it), kmail doesn't have a
> chance to zap this file before being killed and thus I have to drill down
> and zap the sucker by hand or it'll refuse to run when I boot back up.
>
> Circa Red Hat 9, konqueror's cache files were under .kde.  I have no idea
> what the junk in .kde/share/apps/kpdf is for...
>
> But I take your point.  They've instituted a policy and tried to clean this
> up.  Similarly, .bash_history, .bittorrent, .DCOPserver*, .mcop, and all
> the other fun stuff written into home must be considered persistent state.
>
> > and the same is true of /var/spool,
>
> Doesn't /var/spool/cups contains files spooled to the printer?  (I dunno,
> the only printer in the house is hooked up to my fiance's windows machine.)

Spool *is* persistent... this is why cups is a daemon. Indeed, when you power 
on a printer and it starts printing without further action, you (or the 
average user, or me for a couple of seconds) say "it's a daemon's fault!"...

> > > things like vi would create temporary files in /tmp,
> >
> > /var/tmp, because the entire point of those files was to survive a
> > reboot.
>
> Is it?  I thought it was to support undo.


> Files that live for brief instants never get written out to disk anyway.
> That's why there's the delay before dirty pages in the page cache are
> scheduled for writeout.  So tmpfs doesn't help there.

That's not entirely correct as far as performance goes: the file data may 
never get written out, but on most filesystems (excluding XFS, Reiser4 and 
some experimental ext3 version) a few preparatory steps are still done (block 
allocation, for instance, which involves poking through free list bitmaps and 
is even computationally intensive*).  Delayed allocation was invented exactly 
for that.

* on a RAID array with ext3, in a benchmark, it limited writeout speed down to 
300 Mb/s instead of 500 Mb/s (See OLS 2005, ext3 paper).

> > - users. A *lot* of my users dump temporary crud in /tmp:

> *shrug*.  The truly transient stuff never leaves the page cache, no matter
> what the filesystem.  (Especially if you mount with noatime, which is the
> norm these days.)

I've seen it rarely used... only Gentoo suggests that.

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-26 10:44               ` Rob Landley
  2005-11-27 16:38                 ` Blaisorblade
@ 2005-11-27 17:10                 ` Blaisorblade
  1 sibling, 0 replies; 42+ messages in thread
From: Blaisorblade @ 2005-11-27 17:10 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel, Nix, Chris Lightfoot

On Saturday 26 November 2005 11:44, Rob Landley wrote:
> On Friday 25 November 2005 17:33, Blaisorblade wrote:
> > On Friday 25 November 2005 22:04, Nix wrote:
> > > On Fri, 25 Nov 2005, Rob Landley moaned:
> > > > On Friday 25 November 2005 13:33, Nix wrote:

> That a normal user can allocate persistent memory, that outlives all their
> processes, with no special privileges, and that the limits on it are per
> system rather than per user.  (In theory you can apply quota to /tmp but
> I've never seen anybody do it.  And yeah, shmfs is no worse than shmget,
> for obvious reasons.  Apparently System V wasn't big into reference
> counting.)

When I studied the SysV API I saw this claimed, and intended, as a 
feature.

In the same way, the absence of garbage collection on filesystems is called a 
feature.

And in fact shmctl(IPC_RMID) does reference counting: the concept is exactly 
the same one as with filesystems and deleting files (argh, it's not so for 
anything else in the SysV API, unfortunately).

Additionally, on Linux (not on other systems), you can attach a shm region 
which is already marked to be destroyed, so you can create a region, mark it 
for deletion right away, and still get the desired behaviour.
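
As an illustration of the refcounted behaviour described above, here is a 
small C sketch.  The helper name alloc_self_cleaning_shm() is made up for this 
example.  It uses the portable ordering (attach first, then mark for removal), 
so it does not depend on the Linux-only ability to attach a segment that is 
already marked for destruction.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Allocate a System V shared memory segment that cleans itself up:
 * it is attached and then immediately marked with IPC_RMID, so the
 * kernel destroys it once the last process detaches or exits.
 * Returns the attached address, or NULL on failure.
 */
void *alloc_self_cleaning_shm(size_t size)
{
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    void *p;

    if (id < 0)
        return NULL;

    p = shmat(id, NULL, 0);

    /*
     * Mark the segment for destruction whether or not the attach
     * worked; with no attachments left it is freed right away.
     */
    shmctl(id, IPC_RMID, NULL);

    if (p == (void *)-1)
        return NULL;
    return p;
}
```

The mapping stays usable after IPC_RMID and is inherited across fork(); it 
goes away when the last attached process calls shmdt() or exits, so nothing is 
left behind for an admin to clean up.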

The (big) misfeature is the lack of a purely refcounting option in the API as 
a standard, and the lack of a standard utility to clean the leftover cruft... 
(I think this can be called a potential local DoS on almost all deployed 
systems, where no such utility exists to my knowledge).

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-26 11:47                 ` Rob Landley
@ 2005-11-27 17:37                   ` Blaisorblade
  2005-11-27 18:35                     ` Nix
  2005-11-27 18:59                     ` Rob Landley
  2005-11-27 18:31                   ` Nix
  1 sibling, 2 replies; 42+ messages in thread
From: Blaisorblade @ 2005-11-27 17:37 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Rob Landley, Nix, Chris Lightfoot

On Saturday 26 November 2005 12:47, Rob Landley wrote:
> On Friday 25 November 2005 20:12, Nix wrote:
> > If it's a problem you have both hostile users and no size limits on /tmp
> > and you therefore have bigger problems anyway. :)

> My original point was that the semantics of what UML wants is shared
> memory. It's trusting /tmp to provide different behavior than simply using
> ~, and this turns out to be a very unreliable assumption.  There is a
> directory (/dev/shm) whose entire definition is to provide those semantics,
> and shouldn't even _exist_ if it doesn't.  I believe that would be a better
> directory to use.

> I can submit a patch for this.  It's arch/um/os-Linux/mem.c, line 37, in
> find_tempdir().

> And while I'm at it, os-Linux/start_up.c has a check_tmpexec() that has
> "/tmp" hardwired into its messages, even if that's not what find_tempdir()
> returned...

Good note... I'd gladly accept that.
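
For illustration, one way such a check could look in C. This is a sketch under 
assumptions, not the actual find_tempdir() and not the proposed patch: the 
names is_tmpfs() and pick_shared_mem_dir() are hypothetical, and comparing 
statfs() f_type against TMPFS_MAGIC from <linux/magic.h> is just one possible 
way to tell whether a directory is really memory-backed.

```c
#include <stddef.h>
#include <sys/vfs.h>
#include <linux/magic.h>

/* Hypothetical helper: 1 if path is on tmpfs, 0 if not, -1 if statfs() fails. */
int is_tmpfs(const char *path)
{
    struct statfs st;

    if (statfs(path, &st) < 0)
        return -1;
    return st.f_type == TMPFS_MAGIC;
}

/* Hypothetical policy: prefer /dev/shm, then /tmp, but only if tmpfs. */
const char *pick_shared_mem_dir(void)
{
    static const char *const candidates[] = { "/dev/shm", "/tmp" };
    size_t i;

    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++)
        if (is_tmpfs(candidates[i]) == 1)
            return candidates[i];
    return NULL;    /* caller decides whether to warn or fall back */
}
```

Returning NULL instead of silently falling back leaves it to the caller to 
print a message that names the directory actually chosen.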

> Like "/tmp/uml.ctl" in arch/um/drivers/daemon_kern.c, line 70?
>
> (It's not obvious where this file is actually created, it's one of those
> funky callback things where data in a structure is used somewhere else...)

It's not a file, it's an AF_UNIX socket bound there - and bind() fails if the 
file exists. So it's a different story (I was puzzled by a missing 
bind(O_EXCL), but I learned by trial there's no need).

It's created at uml_switch (not setuid) startup, which can be done by anybody.

Btw, Debian moves that socket to something under /var/run/uml-utilities or 
something like that.
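
The bind() behaviour can be checked with a small C sketch. It is only an 
illustration: the helper names and the scratch directory under /tmp are made 
up, and it has nothing to do with the uml_switch source. A second bind() to a 
path that already exists is expected to fail with EADDRINUSE.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a new AF_UNIX socket to path, return fd or -1. */
static int bind_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Returns the errno of a second bind() to an already-bound path
 * (0 if it unexpectedly succeeded), or -1 if the setup failed. */
int second_bind_errno(void)
{
    char dir[] = "/tmp/bind-demo-XXXXXX";
    char path[64];
    int first, second, err = 0;

    if (!mkdtemp(dir))
        return -1;
    snprintf(path, sizeof(path), "%s/ctl", dir);

    first = bind_unix(path);
    if (first < 0) {
        rmdir(dir);
        return -1;
    }

    second = bind_unix(path);   /* the socket file already exists */
    if (second < 0)
        err = errno;
    else
        close(second);

    close(first);
    unlink(path);
    rmdir(dir);
    return err;
}
```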

> > Oh. I do it all the time. I prefer not to work under the assumption that
> > I'm more brilliant than thirty years of Unix hackers and spotted
> > something none of them did, but so be it...

I recently realized that even the mktemp(1) utility works - it creates the 
file and returns the pathname. I kept wondering "but what if an attacker 
alters the file afterward", but I forgot the sticky bit - nobody else can 
delete my file.
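
In C the rough counterpart is mkstemp(3). A minimal sketch, with made-up 
function names and a hard-coded /tmp purely for illustration: the file is 
created exclusively with owner-only permissions, and the sticky bit on the 
directory is what stops other users from removing or replacing it.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a private temp file and return its
 * permission bits, or -1 on failure. The file is removed again. */
int temp_file_mode(void)
{
    char path[] = "/tmp/demo-XXXXXX";
    struct stat st;
    int fd = mkstemp(path);     /* opens with O_CREAT | O_EXCL */
    int rc;

    if (fd < 0)
        return -1;
    rc = fstat(fd, &st);
    unlink(path);
    close(fd);
    return rc < 0 ? -1 : (int)(st.st_mode & 0777);
}

/* Hypothetical helper: 1 if dir has the sticky bit, 0 if not, -1 on error. */
int dir_is_sticky(const char *dir)
{
    struct stat st;

    if (stat(dir, &st) < 0)
        return -1;
    return (st.st_mode & S_ISVTX) != 0;
}
```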

> And the reason they duplicated /bin
> and /sbin and /lib under /usr is that they ran out of space on the root
> disk and had to leak the OS into the second disk pack which had previously
> held all the user home directories.

I've seen this argument made for Hurd systems... However until LVM2
(and-all-the-rest)-on-root works out of the box, I'll call anything else 
crap.

> I agree initrd is kinda pointless, but initramfs isn't.  The kernel guys
> are moving towards initramfs being required someday.  These are still
> nebulous future plans with no actual deadline, but they include moving to
> dynamically assigned major/minor numbers (so you need something like udev
> to
> populate /dev),

Nice move to disable init=/bin/sh. Really. Next one is moving kdelibs into the 
kernel?

> They'd proposed a serious kernel crapectomy 
Yep, I remember.
> "for 2.7" back before 2.7 got 
> put on indefinite hold.  How they're rolling it out now, we dunno.  They
> seem to be happy chewing their current mouthful, at the moment...

> > What does it look like in this brave new world of shared
> > subtrees?

> > Obviously /etc/mtab *must* be a symlink to /proc/mounts, now,
> > only oops that breaks the quota tools...)

> I rewrote busybox mount so that things work properly with /proc/mounts. 
> And I vaguely remember coming up with an in-house patch to fix the quota
> tools (they were upset by rootfs) something like four years ago.


> > I HATE FSCKING MTAB
> >
> > (in three-part harmony, probably)
>
> Everybody hates /etc/mtab.  It doesn't work if you chroot.
Right.
> It can't handle 
> --bind or --move mounts...

In my experience it does (I have 2/3 distros and use chrooting often, so I 
loop-mount half my disk):

$ grep bind /etc/mtab
/home /mnt/gen32/home none rw,bind 0 0
/var/spool/wwwoffle /mnt/gen32/var/spool/wwwoffle none rw,bind 0 0
/mnt/gen32/var/tmp/portage /var/tmp/portage none rw,bind 0 0
/home /mnt/mdk/home none rw,bind 0 0
/mnt/win_c /mnt/gen32/mnt/win_c none rw,bind 0 0

I don't know about shared mounts...

> Just symlink it to /proc/mounts and recognize 
> that any tool that can't handle that is a buggy tool that needs to be
> fixed.

No - the kernel doesn't allow storing the full set of information that mount 
adds there. And frankly I don't want the kernel to do that.

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-25 22:31             ` Rob Landley
  2005-11-27 16:48               ` Blaisorblade
@ 2005-11-27 18:17               ` Nix
  2005-11-27 19:24                 ` Rob Landley
  1 sibling, 1 reply; 42+ messages in thread
From: Nix @ 2005-11-27 18:17 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel, Chris Lightfoot

[Sorry for response delay, steaming cold/flu]

On Fri, 25 Nov 2005, Rob Landley worried:
> On Friday 25 November 2005 15:04, Nix wrote:
>> The ~/.kde directory doesn't contain temporary files, but persistent state:
> 
> ~/.kde/share/apps/kmail/lock is persistent state?

No, but KDE is a bit of a mess in some areas, and this is one of them.

> I do know that half the time the darn battery runs out and kde suddenly shuts 
> down my desktop without the courtesy of even _warning_ me first (oh it pops 
> up a window three seconds before doing it), kmail doesn't have a chance to 
> zap this file before being killed and thus I have to drill down and zap the 
> sucker by hand or it'll refuse to run when I boot back up.

... and this is why it should be in /tmp.

> Circa Red Hat 9, konqueror's cache files were under .kde.  I have no idea what 
> the junk in .kde/share/apps/kpdf is for...

Not true as of reasonably recent Konquerors.

> But I take your point.  They've instituted a policy and tried to clean this 
> up.  Similarly, .bash_history, .bittorrent, .DCOPserver*, .mcop, and all the 
> other fun stuff written into home must be considered persistent state.

Certainly the bash history and bittorrent stuff is persistent. .mcop is
persistent (the trader cache should outlast reboots).

>> and the same is true of /var/spool,
> 
> Doesn't /var/spool/cups contains files spooled to the printer?  (I dunno, the 
> only printer in the house is hooked up to my fiance's windows machine.)

Yes. Again, if the machine reboots, you don't want to lose stuff you've got
waiting to print.

>> > things like vi would create temporary files in /tmp,
>>
>> /var/tmp, because the entire point of those files was to survive a reboot.
> 
> Is it?  I thought it was to support undo.

Nah, as this XEmacs user understands it, it's for `vi -r'. (XEmacs does that
with stuff in the local directory and/or any-directory-of-your-choice, so I
picked one under /var/tmp. :) )

>> >                                                      but these days it
>> > uses . ${filename}.swp in the same directory as the file being edited.
>>
>> Yes, and I absolutely despise this behaviour. Is there any way to force vim
>> to use /var/tmp like everyone else?
> 
> It's a compile-time option.  (I accidentally set it to use /tmp once and had 
> to figure out how to undo it.)

Is it? Oh good, I'll flip it next time I upgrade :)

>> - programs writing to $TMPDIR; config.guess, configure, and GCC are big
>> users on my systems, but lots of other apps write here for a while. Of
>> course you could point TMPDIR somewhere else, but does anyone do that?
>>
>>   There are a quite surprising number of these: generally the files live
>>   for brief instants before being unlinked, if at all. (mkstemp() creates
>> its files in $TMPDIR, after all, and often for those files minimal overhead
>> is what counts; and like it or not tmpfs has lower overhead than ext*fs.)
> 
> Files that live for brief instants never get written out to disk anyway.  

Aside: it's easy to test this by writing something that creates and
unlinks a file, dumps stuff into it, then deletes it, and loops on that:
watch the disk light. I'll write a testcase because I'm so sure I'm right.

[five minutes later]

... oops. I just, er, proved I was wrong. Ah well. You live and
learn. This was certainly true in 2.4 but in 2.6 it seems to be the case
that dirty blocks get magically undirtied if the file in question gets
completely unlinked and not kept open by anything before the blocks hit
the disk (unless the file is too large to fit in the page cache of
course; even then it might fit in tmpfs, as tmpfs is swap-backed but
even I'll admit that multi-hundred-megabyte writes to /tmp are rare
things for programs to do.)
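
A testcase along those lines might look like the C sketch below. This is a 
reconstruction for illustration, not the program that was actually run; the 
function name and parameters are made up. Point it at a directory on a 
disk-backed filesystem, loop for a while, and watch the disk light (or 
vmstat) to see whether the short-lived data ever gets written out.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical testcase: create, unlink, fill, and close `iterations`
 * short-lived files under `dir`. Returns 0 on success, -1 on failure. */
int churn_temp_files(const char *dir, int iterations, size_t bytes)
{
    char *buf = malloc(bytes ? bytes : 1);
    int i;

    if (!buf)
        return -1;
    memset(buf, 'x', bytes);

    for (i = 0; i < iterations; i++) {
        char path[4096];
        ssize_t n;
        int fd;

        snprintf(path, sizeof(path), "%s/churn-XXXXXX", dir);
        fd = mkstemp(path);
        if (fd < 0) {
            free(buf);
            return -1;
        }
        unlink(path);               /* file now lives only as long as fd */
        n = write(fd, buf, bytes);  /* dirty some page cache */
        close(fd);                  /* last reference gone */
        if (n != (ssize_t)bytes) {
            free(buf);
            return -1;
        }
    }
    free(buf);
    return 0;
}
```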

> That's why there's the delay before dirty pages in the page cache are 
> scheduled for writeout.  So tmpfs doesn't help there.

Well, it does if the consuming program takes some time to consume the
file, or the producing program takes some time to generate it (e.g. GCC;
yes, even in -pipe mode, some temporary files in /tmp are used.)

>> - users. A *lot* of my users dump temporary crud in /tmp:
> 
> Yeah, at Rutgers we used to do that on the Sun machines to get around the disk 
> quota.

Mine do it to avoid cluttering up their $HOMEs with crap. (Well, all but
one whose home directory looks like a sewer. I avoid looking in there
unless forced.)

>> Maybe your users don't dump everything they don't care much about in
>> /tmp: mine are always sticking all sorts of things in there from
>> half-chewed LaTeX through to boring logfiles and stuff being looked over
>> on its way to the printer :)
> 
> Sounds like your users are old unix hands who cut their teeth on traditional 
> Unix boxes in the days before Linux.

Two of them are for certain: I don't know about the rest. They're not
doing it for efficiency reasons, just out of tidiness.

>> `Existing practice' seems to me to have pretty much wanted something,
>> uh, like tmpfs. But maybe your existing practice of /tmp is very different
>> from mine. (It certainly sounds like it.)
> 
> Out there in the field, today, /tmp is not usually tmpfs.

Out there in the field, today, the average Linux box is running Oracle and
very little else :(

>                                                            And nobody's seen 
> enough benefit in it to bother deploying it on the Fedora, Gentoo, and Ubuntu 
> systems I've tested.

I haven't seen a non-tmpfs-for-/tmp Linux box in years. I guess this is
another transatlantic divide thing :)

>> You've never used dar in infinint mode
> 
> Never even heard of it.

A very powerful backup program with, ahem, notable memory consumption
problems in its default configuration (unless you *like* 1Gb memory
consumption per million files, approx.)

>> or watched large matrix maths stuff 
>> churn through to completion :/
> 
> Oh I've watched large jobs thrash the heck out of a machine all afternoon.  
> Classic ray tracing, for example...

Ray tracing is a worst case; it has very little locality of reference at
all (at least not unless the ray tracer has been optimized for parallelism,
which `classic' ones generally haven't been).

>> > Also, although it's pretty common to have 10 gigabytes of spare disk
>> > space on a modern laptop, it is _not_ common to have 10 gigabytes of
>> > spare swap space, and that's for a reason.  Extra space in your
>> > filesystem can be used for all sorts of things.  Extra swap space is
>> > normally wasted.
>>
>> You can zap it if you need it for something else pretty easily. swapfiles
>> are no slower than swap partitions these days, and swap partitions are easy
>> to turn into filesystems too.
> 
> I've done this, but it's not automatic.  (Did they ever make swapfiles 
> reliable so they don't lock up under low memory situations?)

As I understand it all the downsides of swapfiles (speed, reliability et
al) went away in the 2.5.x timeframe.

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-26 11:47                 ` Rob Landley
  2005-11-27 17:37                   ` Blaisorblade
@ 2005-11-27 18:31                   ` Nix
  2005-11-28  1:07                     ` Rob Landley
  1 sibling, 1 reply; 42+ messages in thread
From: Nix @ 2005-11-27 18:31 UTC (permalink / raw)
  To: Rob Landley; +Cc: Blaisorblade, user-mode-linux-devel, Chris Lightfoot

On Sat, 26 Nov 2005, Rob Landley murmured woefully:
> On Friday 25 November 2005 20:12, Nix wrote:
>> If it's a problem you have both hostile users and no size limits on /tmp
>> and you therefore have bigger problems anyway. :)
> 
> The size limits on /tmp aren't per-user.

True. TODO: add tmpfs quota support. :)

>> > However, the idea of an OOM could be made to work, if you can kill an app
>> > based on the derivative of its memory usage (i.e. how fast usage has
>> > increased over the last moments).
>>
>> ... and the VM appears to be growing things that might help in that area :)
> 
> We get better as time goes on.
> 
> My original point was that the semantics of what UML wants is shared memory.  
> It's trusting /tmp to provide different behavior than simply using ~, and 
> this turns out to be a very unreliable assumption.  There is a directory 
> (/dev/shm) whose entire definition is to provide those semantics, and 
> shouldn't even _exist_ if it doesn't.  I believe that would be a better 
> directory to use.

I have to agree, not least because it is counterintuitive to set a strict
size limit on /tmp on the host and then find you can't start big UMLs.

>> ... with a random name, obviously. :)
> 
> Like "/tmp/uml.ctl" in arch/um/drivers/daemon_kern.c, line 70?

Um. :/

>             I personally symlink /bin, /sbin, and /lib to the 
> corresponding /usr directories and consolidate the whole mess, myself.  Yes, 
> you have to patch gcc's paths (in collect2) to not search _both_ /lib 
> and /usr/lib because if gnu's linker finds the same symbols in two different 
> libraries it statically links them in rather than trying to figure out which 
> one is right, resulting in executables as big as if they're statically linked 
> but still refusing to run if they can't find their shared libraries at run 
> time.  That's a bug in ld.

I'll say! I'll see if I can fix that (if it isn't already fixed: I'm having
trouble reproducing it here, with binutils 2.16.91.0.2...)

>> > - back in 2.4, tmpfs on /tmp broke mkinitrd since it tried to loop-mount
>> > the new initrd, which was in /tmp. And loop-mount over tmpfs didn't work.
>>
>> Ah, well, I never use initrd if I can avoid it, and a bug in one tool is
>> a reason to *fix that tool*, not rejig the whole damn system.
> 
> I agree initrd is kinda pointless, but initramfs isn't.  The kernel guys are 
> moving towards initramfs being required someday.

I didn't properly understand the difference (using rootfs versus not) when
I wrote that email. I do now thanks to your nice little document recently
mentioned on l-k, and I have to agree that initramfs seems a whole lot nicer.

>                                                  These are still nebulous 
> future plans with no actual deadline, but they include moving to dynamically 
> assigned major/minor numbers (so you need something like udev to 
> populate /dev),

How terrible. :)

>                 having userspace find and mount the real root partition (so 
> when you're booting from a USB key but your root partition lives on an NFS 
> server that in order to access it you have to dhcp yourself an address, 
> nslookup the server name, and then login with a public key from said USB 
> stick...)  All the various partitioning schemes could be moved over to device 
> mapper.  And so on.

It's a little annoying for those of us *without* horribly complex boot
schemes; I guess there'll be a `default initramfs' which replicates the
current behaviour.

> They'd proposed a serious kernel crapectomy "for 2.7" back before 2.7 got put 
> on indefinite hold.  How they're rolling it out now, we dunno.  They seem to 
> be happy chewing their current mouthful, at the moment...

Yeah, the change rate of the kernel doesn't exactly seem to be at an
all-time low :)

>> What does it look like in this brave new world of shared 
>> subtrees?
> 
> I had this discussion on the kernel list a week or so back: namespaces are 
> reference counted so as soon as the last process that can see a mount goes 
> away, umount happens.  This means that umount -a should only zap everything 
> in your current namespace, so that after init kills all sub-processes it can 
> then run umount -a for pid 1, life is good.

Yeah, but what does /proc/mounts say? Does it show only references that the
querying process can see?

... actually, hey, yes, it's a symlink to /proc/self/mounts, so it does the
right thing already. Nifty.
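
For tools that want the per-process view, a small C sketch using the glibc 
getmntent(3) interface on /proc/self/mounts. The function name list_mounts() 
is made up for this example; the point is only that the same parser that 
reads /etc/mtab can read the kernel's table, which lists just the mounts in 
the caller's namespace.

```c
#include <mntent.h>
#include <stdio.h>

/* Hypothetical helper: print and count the entries in a mount table.
 * Returns the number of entries, or -1 if the table cannot be opened. */
int list_mounts(const char *table)
{
    FILE *f = setmntent(table, "r");
    struct mntent *m;
    int n = 0;

    if (!f)
        return -1;
    while ((m = getmntent(f)) != NULL) {
        printf("%s on %s type %s (%s)\n",
               m->mnt_fsname, m->mnt_dir, m->mnt_type, m->mnt_opts);
        n++;
    }
    endmntent(f);
    return n;
}
```

Calling list_mounts("/proc/self/mounts") and list_mounts("/etc/mtab") side by 
side is a quick way to see where the two tables disagree on a given box.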

>> Obviously /etc/mtab *must* be a symlink to /proc/mounts, now, 
>> only oops that breaks the quota tools...)
> 
> I rewrote busybox mount so that things work properly with /proc/mounts.  And I 
> vaguely remember coming up with an in-house patch to fix the quota tools 
> (they were upset by rootfs) something like four years ago.

Please feed it upstream to the quota tools people before I have to write the
same damn patch ;)))

>> I HATE FSCKING MTAB
>>
>> (in three-part harmony, probably)
> 
> Everybody hates /etc/mtab.  It doesn't work if you chroot.  It can't handle 
> --bind or --move mounts...  Just symlink it to /proc/mounts and recognize 
> that any tool that can't handle that is a buggy tool that needs to be fixed.

Well, ideally the kernel should allow mount(2) to feed it *arbitrary*
options in the `data' argument, reflecting those it doesn't understand
back into /proc/mounts. That would avoid breaking the quota tools and,
um, whatever else depends on this (I've seen distributed administration
tools that mark up filesystems with custom options in the expectation
that they'll land in mtab, too: I think there's some automated fstab
editor in HAL that does the same thing).

[...]
> First time I've heard of the tool, but then back under 2.4.7 I remember I had 
> rsync regularly triggering the OOM killer.  Not because rsync was leaking, 
> but because the servers backing up only had 128 megs of memory and the 
> balancing was _terrible_ so the dentry cache and page cache would squeeze out 
> anonymous pages to the point where rsync itself got OOM killed...

Ick, yes. I switched to 2.4 around that time and switched right back to
2.2 again because the MM had so many problems...

> People who want truly insane amounts of memory these days (often for graphics 
> or video editing) tend to mmap their data files directly and work in there.  
> Once again rendering insane amounts of swap less useful...

Not necessarily, given the existence of MAP_PRIVATE. (The problem with working
directly in data files without MAP_PRIVATE is that if you lose power at *any*
time, your data file is toast.)
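
A minimal C sketch of the MAP_PRIVATE point, with made-up names and a scratch 
file under /tmp standing in for the data file. Writes through a private 
mapping go to copy-on-write pages, which are anonymous memory (and so end up 
in swap under pressure), while the file on disk keeps its old contents.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the file is unchanged after writing
 * through a MAP_PRIVATE mapping, 0 if it changed, -1 on failure. */
int private_map_leaves_file_intact(void)
{
    char path[] = "/tmp/mapdemo-XXXXXX";
    const char original[] = "original";
    char check[sizeof(original)] = { 0 };
    char *p;
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);

    if (write(fd, original, sizeof(original)) != (ssize_t)sizeof(original))
        goto out;

    p = mmap(NULL, sizeof(original), PROT_READ | PROT_WRITE,
             MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        goto out;

    memcpy(p, "scribble", 8);   /* copy-on-write: private page only */
    munmap(p, sizeof(original));

    /* Re-read through the file descriptor: the edit never reached it. */
    if (pread(fd, check, sizeof(check), 0) != (ssize_t)sizeof(check))
        goto out;
    result = memcmp(check, original, sizeof(original)) == 0;
out:
    close(fd);
    return result;
}
```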

> If we had a "treat this like it's on tmpfs" madvise, that would be ideal...

Agreed. Combine that with per-user filesystems and, well, give every user
a small tmpfs mount of their own on /tmp and let apps use suitably
advised mmaps for everything else :)

(security holes? but other users can't *see* that /tmp, which is why it's
mode 640, just like their $HOME...)

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 17:37                   ` Blaisorblade
@ 2005-11-27 18:35                     ` Nix
  2005-11-27 19:10                       ` Blaisorblade
  2005-11-27 21:21                       ` Rob Landley
  2005-11-27 18:59                     ` Rob Landley
  1 sibling, 2 replies; 42+ messages in thread
From: Nix @ 2005-11-27 18:35 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Rob Landley, Chris Lightfoot

On Sun, 27 Nov 2005, blaisorblade@yahoo.it whispered secretively:
> It's not a file, it's a AF_UNIX socket bound there - and bind() fails if the 
> file exists. So it's a different story (I was puzzled by a missing 
> bind(O_EXCL), but I learned with trial there's no need).

There's an (optional) abstract namespace for AF_UNIX sockets now. It's
Linux-only, but UML isn't going to care about that :)
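
A minimal C sketch of binding in the abstract namespace, as documented in 
unix(7). The function name bind_abstract() and the socket name used with it 
are made up; nothing here is taken from uml_switch. The address starts with a 
NUL byte, so no file is created and nothing is left behind when the socket is 
closed.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a socket to `name` in the Linux abstract
 * namespace. Returns the socket fd, or -1 on failure. */
int bind_abstract(const char *name)
{
    struct sockaddr_un addr;
    size_t len = strlen(name);
    socklen_t addrlen;
    int fd;

    if (len == 0 || len > sizeof(addr.sun_path) - 1)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* sun_path[0] stays '\0': that marks the address as abstract. */
    memcpy(addr.sun_path + 1, name, len);
    addrlen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);

    if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The name is still a shared, guessable resource, so a hostile local user can 
squat on it just as with a path in /tmp; only the stale-file problem goes 
away.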

>> > Oh. I do it all the time. I prefer not to work under the assumption that
>> > I'm more brilliant than thirty years of Unix hackers and spotted
>> > something none of them did, but so be it...
> 
> I recently realized that even the mktemp(1) utility works - it creates the 
> file and returns the pathname. I kept wondering "but what if an attacker 
> alters the file afterward", but I forgot the sticky bit - nobody else can 
> delete my file.

If that utility exists :( an *awful* lot of Linux systems don't have it,
and of course in the howling wilderness that is proprietary Unix, nobody
has it at all.

>> And the reason they duplicated /bin
>> and /sbin and /lib under /usr is that they ran out of space on the root
>> disk and had to leak the OS into the second disk pack which had previously
>> held all the user home directories.
> 
> Seen this argumentation for Hurd systems... However until LVM2
> (and-all-the-rest)-on-root works out of the box, I'll call anything else 
> crap.

That's one of the jobs of the initramfs :) and it's even kept up to date
for you with new versions of the tools whenever you rebuild the kernel.

>> I agree initrd is kinda pointless, but initramfs isn't.  The kernel guys
>> are moving towards initramfs being required someday.  These are still
>> nebulous future plans with no actual deadline, but they include moving to
>> dynamically assigned major/minor numbers (so you need something like udev
>> to
>> populate /dev),
> 
> Nice move to disable init=/bin/sh. Really. Next one is moving kdelibs into the 
> kernel?

Nah, AIUI the initramfs runs *first*; it's its job to parse those parts
of the kernel parameters. (I just hope it gets it right. A lot of initrd
scripts I've seen just ignore init=, leading to much pain later on.)

> Don't know for shared mounts...

/etc/mtab assumes *one single* canonical filesystem view, so shared or
private mounts or anything smacking of them will break it completely.

(Indeed in my experience breathing heavily near it will break it
completely...)

>> Just symlink it to /proc/mounts and recognize 
>> that any tool that can't handle that is a buggy tool that needs to be
>> fixed.
> 
> No - the kernel doesn't allow storing the full set of infos which are added by 
> mount there. And frankly I don't want the kernel to do that.

Why not? It should. Only root can call mount(), so there's no real
danger that some attacker will stick megabytes of stuff in there.

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 16:38                 ` Blaisorblade
@ 2005-11-27 18:49                   ` Nix
  2005-11-27 21:25                     ` Rob Landley
  0 siblings, 1 reply; 42+ messages in thread
From: Nix @ 2005-11-27 18:49 UTC (permalink / raw)
  To: Blaisorblade; +Cc: Rob Landley, user-mode-linux-devel, Chris Lightfoot

On Sun, 27 Nov 2005, blaisorblade@yahoo.it whispered secretively:
> (as an aside, I do despise learning *so* fast... I wouldn't be able to use 
> socket() without manuals, and there's a lot of stuff I'd like to learn well. 
> And I'd really like to _start_ and finish even a little project on my own... 
> it's years I don't start coding some fun project up).

The BSD socket layer is so irregular that I think *everyone* needs to refer
to the manuals when using it, unless they write networking apps in C every
few days.

>> google dar linux:
>> Some french disk archiving tool, apparently.  I generally just use tarballs
>> or rsync.
> 
> It's clear Nix is using some calculation program (not sure what's it).

I'm using both matlab/octave *and*, when running backups, said French disk
archiver. The source is gradually being Anglicised so that the developer
base can rise a bit :)

It has numerous advantages over tar and rsync if, like me, you're stuck using
a pile of CD-R[W]s as your backup medium.

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 17:37                   ` Blaisorblade
  2005-11-27 18:35                     ` Nix
@ 2005-11-27 18:59                     ` Rob Landley
  2005-11-27 19:20                       ` Blaisorblade
  1 sibling, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-27 18:59 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Nix, Chris Lightfoot

On Sunday 27 November 2005 11:37, Blaisorblade wrote:
> > Like "/tmp/uml.ctl" in arch/um/drivers/daemon_kern.c, line 70?
> >
> > (It's not obvious where this file is actually created, it's one of those
> > funky callback things where data in a structure is used somewhere
> > else...)
>
> It's not a file, it's a AF_UNIX socket bound there - and bind() fails if
> the file exists. So it's a different story (I was puzzled by a missing
> bind(O_EXCL), but I learned with trial there's no need).
>
> It's created at uml_switch (not setuid) startup, which can be done by
> anybody.
>
> Btw, Debian moves that socket to something under /var/run/uml-utilities or
> something like that.

Any user can create /tmp/uml.ctl and the sticky bit prevents anybody else from 
deleting it, so any user can block UML switch from working right.  
Under /var/run you can have a persistent directory belonging to a GID or some 
such that UML switch is setgid to, so under /var you are at least _capable_ 
of dealing with this sort of thing...

> > > Oh. I do it all the time. I prefer not to work under the assumption
> > > that I'm more brilliant than thirty years of Unix hackers and spotted
> > > something none of them did, but so be it...
>
> I recently realized that even the mktemp(1) utility works - it creates the
> file and returns the pathname. I kept wondering "but what if an attacker
> alters the file afterward", but I forgot the sticky bit - nobody else can
> delete my file.

Although if your permissions are wrong they can alter its contents.  But yeah, 
the sticky bit's a good thing...

> > And the reason they duplicated /bin
> > and /sbin and /lib under /usr is that they ran out of space on the root
> > disk and had to leak the OS into the second disk pack which had
> > previously held all the user home directories.
>
> Seen this argumentation for Hurd systems... However until LVM2
> (and-all-the-rest)-on-root works out of the box, I'll call anything else
> crap.

What argument for Hurd systems?  Hard drives are enormous these days, and if 
your OS itself is larger than 10 gigs there's something deeply wrong with it.  
(Your application data may be huge, but that's not your OS.)

> > I agree initrd is kinda pointless, but initramfs isn't.  The kernel guys
> > are moving towards initramfs being required someday.  These are still
> > nebulous future plans with no actual deadline, but they include moving to
> > dynamically assigned major/minor numbers (so you need something like udev
> > to
> > populate /dev),
>
> Nice move to disable init=/bin/sh. Really. Next one is moving kdelibs into
> the kernel?

Try "rdinit=/bin/sh", that affects what init gets run on the initramfs.  
(Assuming the initramfs has a command shell...)

> > > I HATE FSCKING MTAB
> > >
> > > (in three-part harmony, probably)
> >
> > Everybody hates /etc/mtab.  It doesn't work if you chroot.
>
> Right.
>
> > It can't handle
> > --bind or --move mounts...
>
> In my experience it does (I have 2/3 distros and use chrooting often, so I
> loop-mount half my disk):

So this was fixed in util-linux then?  (I know _busybox_ gets it right, but 
that's because I took it into account in my rewrite...)

http://www.busybox.net/lists/busybox/2005-August/015285.html

> > Just symlink it to /proc/mounts and recognize
> > that any tool that can't handle that is a buggy tool that needs to be
> > fixed.
>
> No - the kernel doesn't allow storing the full set of infos which are added
> by mount there. And frankly I don't want the kernel to do that.

For which use?  I got the loopback functionality working just fine (to the 
point where you don't have to specify -o loop anymore).  If you try to mount a 
file instead of a block device, it'll do the losetup for you.  I'm pondering 
adding support for "mount -o offset=12345 file.img /woot" even...
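
The "file instead of a block device" check can be sketched as below.  This
is an illustration only, not the busybox code; the function name
needs_loop_device is an assumption:

```python
import os
import stat


def needs_loop_device(source):
    """Return True if `source` is a regular file (needs losetup first)."""
    mode = os.stat(source).st_mode
    if stat.S_ISBLK(mode):
        return False  # real block device: mount it directly
    if stat.S_ISREG(mode):
        return True   # image file: attach it to a loop device first
    raise ValueError("%s is neither a regular file nor a block device" % source)
```

A mount tool would call this on the source argument and only run the
losetup step when it returns True.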

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 18:35                     ` Nix
@ 2005-11-27 19:10                       ` Blaisorblade
  2005-11-27 19:43                         ` Nix
  2005-11-27 21:21                       ` Rob Landley
  1 sibling, 1 reply; 42+ messages in thread
From: Blaisorblade @ 2005-11-27 19:10 UTC (permalink / raw)
  To: Nix; +Cc: user-mode-linux-devel, Rob Landley, Chris Lightfoot

On Sunday 27 November 2005 19:35, Nix wrote:
> On Sun, 27 Nov 2005, blaisorblade@yahoo.it whispered secretively:

> > Nice move to disable init=/bin/sh. Really. Next one is moving kdelibs
> > into the kernel?

> Nah, AIUI the initramfs runs *first*; it's its job to parse those parts
> of the kernel parameters.

Hey, initramfs is virtually empty by default...

Also, I didn't expect initramfs to parse that, but rather to prepare things 
and later let the kernel do the usual stuff - something different would 
surprise me.

Plus, for deep troubleshooting (mainly for kernels) init=/bin/sh is useful.

> > No - the kernel doesn't allow storing the full set of infos which are
> > added by mount there. And frankly I don't want the kernel to do that.

> Why not? It should. Only root can call mount(), so there's no real
> danger that some attacker will stick megabytes of stuff in there.

Yep, but I don't like the kernel to become a configuration repository... it's 
conceptually similar to the in-kernel Win32 registry (though it's smaller, 
yep).
-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 18:59                     ` Rob Landley
@ 2005-11-27 19:20                       ` Blaisorblade
  2005-11-27 21:41                         ` Rob Landley
  0 siblings, 1 reply; 42+ messages in thread
From: Blaisorblade @ 2005-11-27 19:20 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Rob Landley, Nix, Chris Lightfoot

On Sunday 27 November 2005 19:59, Rob Landley wrote:
> On Sunday 27 November 2005 11:37, Blaisorblade wrote:
> > > Like "/tmp/uml.ctl" in arch/um/drivers/daemon_kern.c, line 70?

> Any user can create /tmp/uml.ctl and the sticky bit prevents anybody else
> from deleting it, so any user can block UML switch from working right.
There's a switch for the path, so it's only an inconvenience.

> Under /var/run you can have a persistent directory belonging to a GID or
> some such that UML switch is setgid to, so under /var you are at least
> _capable_ of dealing with this sort of thing...

> What argument for hurd systems?
They by default have /usr -> /.
> Hard drives are enormous these days, and 
> if your OS itself is larger than 10 gigs there's something deeply wrong
> with it. (Your application data may be huge, but that's not your OS.)

I usually size / as 500M, and split away /usr and /var on LVM. 
Especially /usr, I've been growing it via LVM 3 times this week.

> Try "rdinit=/bin/sh", that affects what init gets run on the initramfs.
> (Assuming the initramfs has a command shell...)

Missing from Documentation/filesystems/ramfs-rootfs-initramfs.txt.

> So this was fixed in util-linux then?

Don't remember the bug.

> (I know _busybox_ gets it right, but 
> that's because I took it into account in my rewrite...)

> http://www.busybox.net/lists/busybox/2005-August/015285.html

I never tested --move, only --bind.

Also, umount has other shortcomings: for instance, umount dev will 
umount /dev, even if I meant a udev mount on /mnt/gen32/dev with dev on first 
column (or something like that). I.e. umount tries to absolutize paths 
relative to /.

Also, umount /var when /var is bind-mounted elsewhere is ambiguous, and I've 
no idea of a proper solution.

(However, take this with a grain of salt - I'm trying to code rather than 
talking today, and I'm failing almost fully).

> > > Just symlink it to /proc/mounts and recognize
> > > that any tool that can't handle that is a buggy tool that needs to be
> > > fixed.

> > No - the kernel doesn't allow storing the full set of infos which are
> > added by mount there. And frankly I don't want the kernel to do that.

> For which use? 

Don't recall exactly...

> I got the loopback functionality working just fine (to the 
> point where you don't have to specify -o loop anymore.

With busybox I guess - you using it on a non-embedded system? I thought it was 
meant to be _small_ and feature-limited.

> If you try to mount 
> a file instead of a block device, it'll do the losetup for you.  I'm
> pondering adding support for "mount -o offset=12345 file.img /woot" even...

You get a namespace conflict that way - the day a fs supports offset= you're 
burned.
-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

		

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 18:17               ` Nix
@ 2005-11-27 19:24                 ` Rob Landley
  0 siblings, 0 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-27 19:24 UTC (permalink / raw)
  To: Nix; +Cc: user-mode-linux-devel, Chris Lightfoot

On Sunday 27 November 2005 12:17, Nix wrote:
> [Sorry for response delay, steaming cold/flu]
>
> On Fri, 25 Nov 2005, Rob Landley worried:
> > On Friday 25 November 2005 15:04, Nix wrote:
> >> The ~/.kde directory doesn't contain temporary files, but persistent
> >> state:
> >
> > ~/.kde/share/apps/kmail/lock is persistent state?
>
> No, but KDE is a bit of a mess in some areas, and this is one of them.

My fiance's laptop has xfce on it.  Her assessment?  "The mouse is cute."  And 
she doesn't actively hate it.

Pondering switching over to that.  It'd mean giving up Konqueror, but that's 
the Konqueror developers' fault for gluing it to hundreds of megabytes of 
unnecessary crap...

> > I do know that half the time the darn battery runs out and kde suddenly
> > shuts down my desktop without the courtesy of even _warning_ me first (oh
> > it pops up a window three seconds before doing it), kmail doesn't have a
> > chance to zap this file before being killed and thus I have to drill down
> > and zap the sucker by hand or it'll refuse to run when I boot back up.
>
> ... and this is why it should be in /tmp.

As with all pid files, it should check and see if there is a currently running 
process with that PID (and that this process is using the same binary as it 
is, which you can find under /proc) and if not, zap the pid file as stale.  
There should probably be a library function for this; the point of the file 
is to let you find the currently running instance.  You should confirm once 
you find...
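
A rough sketch of that stale-pid-file check, assuming a Linux /proc and a
pid file that holds just a decimal PID.  The function name and the
"treat unreadable as alive" policy are assumptions, not anything kmail
actually does:

```python
import os


def pidfile_is_stale(pidfile, expected_exe):
    """Return True if the pid file does not point at a live instance."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return True  # missing or garbage pid file: nothing to find
    try:
        # /proc/<pid>/exe is a symlink to the binary the process is running.
        exe = os.readlink("/proc/%d/exe" % pid)
    except FileNotFoundError:
        return True  # no such process any more
    except PermissionError:
        return False  # someone else's process: can't tell, assume alive
    # Same PID but a different binary means the PID was recycled.
    return os.path.realpath(exe) != os.path.realpath(expected_exe)
```

A caller would unlink the pid file when this returns True and carry on
starting up, instead of refusing to run.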

> > Circa Red Hat 9, konqueror's cache files were under .kde.  I have no idea
> > what the junk in .kde/share/apps/kpdf is for...
>
> Not true as of reasonably recent Konquerors.

It's now under /var/tmp, as I mentioned.  (Apparently, they want the cache to 
persist between reboots, despite the fact I told it cookies shouldn't.  So 
when _is_ /var/tmp cleaned, anyway?  Randomly?)

> > But I take your point.  They've instituted a policy and tried to clean
> > this up.  Similarly, .bash_history, .bittorrent, .DCOPserver*, .mcop, and
> > all the other fun stuff written into home must be considered persistent
> > state.
>
> Certainly the bash history and bittorrent stuff is persistent. .mcop is
> persistent (the trader cache should outlast reboots).

*shrug*.  The only one where I know what it actually does is the bash history...

> >> and the same is true of /var/spool,
> >
> > Doesn't /var/spool/cups contains files spooled to the printer?  (I dunno,
> > the only printer in the house is hooked up to my fiance's windows
> > machine.)
>
> Yes. Again, if the machine reboots, you don't want to lose stuff you've got
> waiting to print.

Actually, I do.  Very much so, in some cases.  But I can see it being a 
preference...

> >> >                                                      but these days it
> >> > uses . ${filename}.swp in the same directory as the file being edited.
> >>
> >> Yes, and I absolutely despise this behaviour. Is there any way to force
> >> vim to use /var/tmp like everyone else?
> >
> > It's a compile-time option.  (I accidentally set it to use /tmp once and
> > had to figure out how to undo it.)
>
> Is it? Oh good, I'll flip it next time I upgrade :)

Also look for a vimrc file under /etc somewhere.  You can override just about 
anything from there.

> > Files that live for brief instants never get written out to disk anyway.
>
> Aside: it's easy to test this by writing something that creates and
> unlinks a file, dumps stuff into it, then deletes it, and loops on that:
> watch the disk light. I'll write a testcase because I'm so sure I'm right.
>
> [five minutes later]
>
> ... oops. I just, er, proved I was wrong. Ah well. You live and
> learn. This was certainly true in 2.4 but in 2.6 it seems to be the case
> that dirty blocks get magically undirtied if the file in question gets
> completely unlinked and not kept open by anything before the blocks hit
> the disk (unless the file is too large to fit in the page cache of
> course; even then it might fit in tmpfs, as tmpfs is swap-backed but
> even I'll admit that multi-hundred-megabyte writes to /tmp are rare
> things for programs to do.)

You were right about noatime not being the default, though.  Should be, but 
then "should be" is Jeff's argument for tmpfs on /tmp and I'm the one pushing 
against that.  (Patch forthcoming, I added it to the front of my to-do list, 
might even get to it this evening.)
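
For reference, the kind of testcase described in the quoted text (create a
file, unlink it, dump stuff into it, close it, loop) might look like the
sketch below.  The function name, iteration count and payload size are
arbitrary assumptions; run it against a disk-backed directory and watch the
disk light:

```python
import os
import tempfile


def churn(directory, iterations=1000, size=64 * 1024):
    """Repeatedly create, unlink, fill and close short-lived files."""
    payload = b"x" * size
    for _ in range(iterations):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            os.unlink(path)        # file now lives only as an open fd
            os.write(fd, payload)  # dirty some pages in the page cache
        finally:
            os.close(fd)           # last reference gone, blocks are freed
    return iterations
```

Nothing here measures writeout by itself; it only generates the workload.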

> > That's why there's the delay before dirty pages in the page cache are
> > scheduled for writeout.  So tmpfs doesn't help there.
>
> Well, it does if the consuming program takes some time to consume the
> file, or the producing program takes some time to generate it (e.g. GCC;
> yes, even in -pipe mode, some temporary files in /tmp are used.)

In theory that's idle disk time and the sucker is very CPU limited in that 
case.  More or less by definition.  (Unless the disk is heavily bogged down by 
something else.  Of course then it's possible you're swapping, so...)

> >> - users. A *lot* of my users dump temporary crud in /tmp:
> >
> > Yeah, at Rutgers we used to do that on the Sun machines to get around the
> > disk quota.
>
> Mine do it to avoid cluttering up their $HOMEs with crap. (Well, all but
> one whose home directory looks like a sewer. I avoid looking in there
> unless forced.)

Doesn't everybody's ~ look like a junk drawer?  Every time I reinstall I start 
with a fresh home directory and the previous stuff in /home/old or some such, 
and copy stuff over as I need it.

> > Sounds like your users are old unix hands who cut their teeth on
> > traditional Unix boxes in the days before Linux.
>
> Two of them are for certain: I don't know about the rest. They're not
> doing it for efficiency reasons, just out of tidiness.

My programming style tends to have this in common with farming: The end result 
is as tidy as I can make it, but the workspace is piles of dirt with trenches 
dug in it.  (Laws and sausage are apparently made the same way.)

> >> `Existing practice' seems to me to have pretty much wanted something,
> >> uh, like tmpfs. But maybe your existing practice of /tmp is very
> >> different from mine. (It certainly sounds like it.)
> >
> > Out there in the field, today, /tmp is not usually tmpfs.
>
> Out there in the field, today, the average Linux box is running Oracle and
> very little else :(

Not in my experience.  Start by thinking about apache, and from what I've seen 
mysql installations outnumber Oracle (not in dollar volume but in units and 
users).

I was rooting for postgresql for a while, but apparently there's not as much 
middle ground as you'd think.  These days I'm rooting for the simple entirely 
in-memory databases...

> >                                                            And nobody's
> > seen enough benefit in it to bother deploying it on the Fedora, Gentoo,
> > and Ubuntu systems I've tested.
>
> I haven't seen a non-tmpfs-for-/tmp Linux box in years. I guess this is
> another transatlantic divide thing :)

You see a lot of hand-tuned systems.  I see a lot of "IT isn't really what we 
do" systems and a lot of "put it together myself with duct tape", and haven't 
seen much in between recently.

> > Oh I've watched large jobs thrash the heck out of a machine all
> > afternoon. Classic ray tracing, for example...
>
> Ray tracing is a worst case; it has very little locality of reference at
> all (at least not unless the ray tracer has been optimized for parallelism,
> which `classic' ones generally haven't been).

Hence the thrashing, yes. :)

> >> You can zap it if you need it for something else pretty easily.
> >> swapfiles are no slower than swap partitions these days, and swap
> >> partitions are easy to turn into filesystems too.
> >
> > I've done this, but it's not automatic.  (Did they ever make swapfiles
> > reliable so they don't lock up under low memory situations?)
>
> As I understand it all the downsides of swapfiles (speed, reliability et
> al) went away in the 2.5.x timeframe.

I knew they were improving it, but I hadn't been following too closely.

*ponders current setup*. I could do tmpfs backed by a swap file living on an 
ext2 partition that's loopback mounted from a hostfs that's exported from an 
ext3 partition.  I wonder if that has a bat's chance of actually working?

Of course at that point, I'd have an almost unbearable urge to stick QEMU in 
there somewhere.  On general principles.

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 19:10                       ` Blaisorblade
@ 2005-11-27 19:43                         ` Nix
  0 siblings, 0 replies; 42+ messages in thread
From: Nix @ 2005-11-27 19:43 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Rob Landley, Chris Lightfoot

On Sun, 27 Nov 2005, blaisorblade@yahoo.it whispered secretively:
> Plus, for deep troubleshooting (mainly for kernels) init=/bin/sh is useful.

init=/bin/busybox/sh is also useful for those cases when you've futzed
your libc. :)

>> > No - the kernel doesn't allow storing the full set of infos which are
>> > added by mount there. And frankly I don't want the kernel to do that.
> 
>> Why not? It should. Only root can call mount(), so there's no real
>> danger that some attacker will stick megabytes of stuff in there.
> 
> Yep, but I don't like the kernel to become a configuration repository... it's 
> conceptually similar to the in-kernel Win32 registry (though it's smaller, 
> yep).

Well, except that it's just storing, er, data about mount points. It
*already* has to store that!

-- 
`Y'know, London's nice at this time of year. If you like your cities
 freezing cold and full of surly gits.' --- David Damerell




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 18:35                     ` Nix
  2005-11-27 19:10                       ` Blaisorblade
@ 2005-11-27 21:21                       ` Rob Landley
  1 sibling, 0 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-27 21:21 UTC (permalink / raw)
  To: Nix; +Cc: Blaisorblade, user-mode-linux-devel, Chris Lightfoot

On Sunday 27 November 2005 12:35, Nix wrote:
> On Sun, 27 Nov 2005, blaisorblade@yahoo.it whispered secretively:
> > It's not a file, it's a AF_UNIX socket bound there - and bind() fails if
> > the file exists. So it's a different story (I was puzzled by a missing
> > bind(O_EXCL), but I learned with trial there's no need).
>
> There's an (optional) abstract namespace for AF_UNIX sockets now. It's
> Linux-only, but UML isn't going to care about that :)

I dunno if that's any less susceptible to attack, but it's probably "the right 
thing" anyway...
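
A small sketch of the two behaviours under discussion, assuming Linux and
Python's socket module.  The path and socket names are made up.  A
filesystem AF_UNIX bind() fails if anything already exists at the path, and
an abstract name (leading NUL byte) avoids the filesystem entirely but
still collides if the same name is already bound:

```python
import os
import socket
import tempfile


def try_bind(address):
    """Bind an AF_UNIX stream socket; return (socket, None) or (None, errno)."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.bind(address)
    except OSError as e:
        s.close()
        return None, e.errno
    return s, None


def demo():
    results = {}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.ctl")
        open(path, "w").close()           # somebody got there first
        _, results["path"] = try_bind(path)
    name = b"\0example-ctl-%d" % os.getpid()  # abstract namespace, Linux-only
    first, _ = try_bind(name)
    _, results["abstract"] = try_bind(name)
    first.close()                         # name vanishes with the last close
    return results
```

Both cases report EADDRINUSE, so the abstract namespace removes the stale
file problem but not the "someone else grabbed the name" problem.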

> > Don't know for shared mounts...
>
> /etc/mtab assumes *one single* canonical filesystem view, so shared or
> private mounts or anything smacking of them will break it completely.
>
> (Indeed in my experience breathing heavily near it will break it
> completely...)

I once asked a couple of lute players what would put a lute out of tune.  
"Large flowers" and "brightly colored wallpaper" were the immediate answers.  
(Very delicate instrument your average lute; thin wood, lightly strung, so 
it's basically out of tune by the end of any given song...)

It came to mind thinking about the reliability of mtab, for some reason...

> >> Just symlink it to /proc/mounts and recognize
> >> that any tool that can't handle that is a buggy tool that needs to be
> >> fixed.
> >
> > No - the kernel doesn't allow storing the full set of infos which are
> > added by mount there. And frankly I don't want the kernel to do that.
>
> Why not? It should. Only root can call mount(), so there's no real
> danger that some attacker will stick megabytes of stuff in there.

And only the kernel really knows what's mounted.  Userspace can try to keep 
track, assuming every program that calls the mount or umount syscall (mount, 
autofs, nfsmount, smbmount) remembers to update /etc/mtab, agrees on how, and 
never has a race condition with any other instance.  But the kernel is still 
the ultimate authority here.  If the _kernel_ doesn't know something's mounted, 
then it's not mounted.  Period.
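
Asking the kernel directly is not hard.  A minimal sketch of reading the
/proc/mounts format, assuming the usual whitespace-separated fields with
octal escapes for awkward characters; the function names are illustrative:

```python
import re


def unescape(field):
    # The kernel writes space, tab, newline and backslash as \ooo octal.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def parse_mounts(text):
    """Return a list of (device, mountpoint, fstype, [options])."""
    mounts = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = parts[:4]
        mounts.append((unescape(device), unescape(mountpoint),
                       fstype, options.split(",")))
    return mounts


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for entry in parse_mounts(f.read()):
            print(entry)
```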

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 18:49                   ` Nix
@ 2005-11-27 21:25                     ` Rob Landley
  0 siblings, 0 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-27 21:25 UTC (permalink / raw)
  To: Nix; +Cc: Blaisorblade, user-mode-linux-devel, Chris Lightfoot

On Sunday 27 November 2005 12:49, Nix wrote:

> I'm using both matlab/octave *and*, when running backups, said French disk
> archiver. The source is gradually being Anglicised so that the developer
> base can rise a bit :)
>
> It has numerous advantages over tar and rsync if, like me, you're stuck
> using a pile of CD-R[W]s as your backup medium.

I invested in a DVD burner a couple years ago.  My laptop's hard drive is 
bigger than that, but not bigger enough to make the stack unmanageable.

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 19:20                       ` Blaisorblade
@ 2005-11-27 21:41                         ` Rob Landley
  2005-11-29 16:52                           ` Blaisorblade
  0 siblings, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-27 21:41 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Nix, Chris Lightfoot

On Sunday 27 November 2005 13:20, Blaisorblade wrote:
> > Try "rdinit=/bin/sh", that affects what init gets run on the initramfs.
> > (Assuming the initramfs has a command shell...)
>
> Missing from Documentation/filesystems/ramfs-rootfs-initramfs.txt.

I only dredged through the source and found it shortly after posting that 
document.  I need to update the thing, but I'm hoping to get mdev merged into 
busybox (and switch_root properly tested, and the issue of what to do 
about /dev/console sorted out).  Then I can put together a minimal busybox 
initramfs package and reference that.

> > So this was fixed in util-linux then?
>
> Don't remember the bug.

I found it myself testing corner cases to figure out how busybox mount should 
work.

> > (I know _busybox_ gets it right, but
> > that's because I took it into account in my rewrite...)
> >
> > http://www.busybox.net/lists/busybox/2005-August/015285.html
>
> I never tested --move, only --bind.

They had the same problem when I tested.  The information being stored 
in /etc/mtab wasn't the information actually needed to umount.

> Also, umount has other shortcomings: for instance, umount dev will
> umount /dev, even if I meant a udev mount on /mnt/gen32/dev with dev on
> first column (or something like that). I.e. umount tries to absolutize
> paths relative to /.

Try the busybox version.  There were loud complaints from some of the users, 
and I fixed all the ones I knew about at the time.  (Although I believe I 
still need to add --rbind, which is trivial...)

> Also, umount /var when /var is bind-mounted elsewhere is ambiguous, and
> I've no idea of a proper solution.

Leave the bind mount.

It's a bit like:

mount /dev/hda one
mount /dev/hda two
umount /dev/hda

The kernel umounts the most recent one, not all of them.  I need to upgrade 
umount so that you can do "umount -a /dev/hda".  Right now umount -a ignores 
arguments...

It's a to-do item.
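
From the tool's side, "the most recent one" can be sketched as a scan of
/proc/mounts-style text that keeps the last match.  This assumes entries
appear in mount order, and the function name is made up; it is not the
busybox implementation:

```python
def most_recent_mountpoint(device, mounts_text):
    """Return the last-listed mount point for `device`, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            found = fields[1]  # later lines are later mounts; keep the last
    return found
```

A "umount -a /dev/hda" would instead collect every match and unmount them
in reverse order.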

> (However, take this with a grain of salt - I'm trying to code rather than
> talking today, and I'm failing almost fully).

My to-do list is growing faster than I'm retiring items on it, but what else 
is new?

> > I got the loopback functionality working just fine (to the
> > point where you don't have to specify -o loop anymore.
>
> With busybox I guess - you using it on a non-embedded system?

Yup.  I'm using it in place of the gnu tools in a system capable of rebuilding 
itself from source code.

http://www.landley.net/code/firmware

I had to upgrade lots of busybox in order to make that work, but it does now.  
(The -devel version, anyway.  1.01 is a bit behind the times.  Should be a 
new release around new year's.)

> I thought it was meant to be _small_ and feature-limited.

It's configurable.  It tries to pack as much functionality into as small a 
space as it can, and lets you configure how much functionality you want.

The "everything enabled" version is around a megabyte, but that megabyte 
replaces bzip2, coreutils, e2fsprogs, file, findutils, gawk, grep, inetutils, 
less, modutils, net-tools, patch, procps, sed, shadow, sysklogd, sysvinit, 
tar, util-linux, and vim.

> > If you try to mount
> > a file instead of a block device, it'll do the losetup for you.  I'm
> > pondering adding support for "mount -o offset=12345 file.img /woot"
> > even...
>
> You get a namespace conflict that way - the day a fs supports offset=
> you're burned.

*shrug*  The same argument could be made about any other argument that mount 
interprets.  (Currently: loop, defaults, noauto, ro, rw, nosuid, suid, dev, 
nodev, exec, noexec, sync, async, remount, atime, noatime, diratime, 
nodiratime, bind, move, and rbind.)

And no, I didn't invent mount interpreting any of that.  I can make it depend 
on loopback support (which is already a config option for mount; you can 
disable features out of the apps in busybox to shrink the size).
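
To make the namespace point concrete, here is a sketch of how a mount tool
might split "-o" options into the ones it interprets itself and the ones it
hands to the filesystem untouched.  The set is the list above; the function
name and return shape are assumptions for illustration:

```python
# Options the mount tool consumes itself (the list given above).
MOUNT_HANDLED = {
    "loop", "defaults", "noauto", "ro", "rw", "nosuid", "suid", "dev",
    "nodev", "exec", "noexec", "sync", "async", "remount", "atime",
    "noatime", "diratime", "nodiratime", "bind", "move", "rbind",
}


def split_options(optstring, handled_names=MOUNT_HANDLED):
    """Return (options mount interprets, data string for the filesystem)."""
    handled, passthrough = [], []
    for opt in optstring.split(","):
        if not opt:
            continue
        key = opt.split("=", 1)[0]
        (handled if key in handled_names else passthrough).append(opt)
    return handled, ",".join(passthrough)
```

Adding "offset" to the handled set means a filesystem that later grows its
own offset= option would never see it, which is the conflict being
described; the same is already true of every name in the set.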

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 18:31                   ` Nix
@ 2005-11-28  1:07                     ` Rob Landley
  2005-11-29 16:08                       ` Blaisorblade
  0 siblings, 1 reply; 42+ messages in thread
From: Rob Landley @ 2005-11-28  1:07 UTC (permalink / raw)
  To: Nix; +Cc: Blaisorblade, user-mode-linux-devel, Chris Lightfoot

On Sunday 27 November 2005 12:31, Nix wrote:
> >             I personally symlink /bin, /sbin, and /lib to the
> > corresponding /usr directories and consolidate the whole mess, myself. 
> > Yes, you have to patch gcc's paths (in collect2) to not search _both_
> > /lib and /usr/lib because if gnu's linker finds the same symbols in two
> > different libraries it statically links them in rather than trying to
> > figure out which one is right, resulting in executables as big as if
> > they're statically linked but still refusing to run if they can't find
> > their shared libraries at run time.  That's a bug in ld.
>
> I'll say! I'll see if I can fix that (if it isn't already fixed: I'm having
> trouble reproducing it here, with binutils 2.16.91.0.2...)

It might have been.  I noticed it 4 or 5 years ago and have gone out of my way 
to avoid the problem ever since (as a simple cleanliness thing).  I noticed 
an excessively bloated image earlier this year but I think it simply hadn't 
stripped debug info...

In theory, you can reproduce it on a standard Linux image by doing "mv /lib/* 
/usr/lib; rmdir /lib; ln -s /usr/lib /lib" (probably from a Knoppix CD, because 
the running system is _not_ going to be happy halfway through), and then 
trying to compile stuff...

If that doesn't show the problem, it's probably fixed.  (And I _think_ the 
problem was actually in collect2, not in ld.  And collect2 is part of gcc.)

> >                                                  These are still nebulous
> > future plans with no actual deadline, but they include moving to
> > dynamically assigned major/minor numbers (so you need something like udev
> > to populate /dev),
>
> How terrible. :)

If static device number assignments go away, then drivers have to register 
with sysfs in order to export device nodes.  The exports you have to bind to 
to register with sysfs are GPLONLY.  Interesting, eh?

> >                 having userspace find and mount the real root partition
> > (so when you're booting from a USB key but your root partition lives on
> > an NFS server that in order to access it you have to dhcp yourself an
> > address, nslookup the server name, and then login with a public key from
> > said USB stick...)  All the various partitioning schemes could be moved
> > over to device mapper.  And so on.
>
> It's a little annoying for those of us *without* horribly complex boot
> schemes; I guess there'll be a `default initramfs' which replicates the
> current behaviour.

Yup.  I'm doing one for busybox (slowly), and the klibc guys are also working 
their way towards one which has about a 50% chance of becoming "the 
standard".  (The busybox one may become "the standard" for embedded systems.)

The Red Hat people are slowly migrating their initrd image over to initramfs, 
although "not horribly complex" is long gone in that arena.  The gentoo 
people have theirs, and the debian people have theirs, and the Linux From 
Scratch people are evolving theirs, all home-grown...

Who else is interesting?  Possibly SuSE.  No idea what they're up to since the 
founder and lead architect quit last month.

> > They'd proposed a serious kernel crapectomy "for 2.7" back before 2.7 got
> > put on indefinite hold.  How they're rolling it out now, we dunno.  They
> > seem to be happy chewing their current mouthful, at the moment...
>
> Yeah, the change rate of the kernel doesn't exactly seem to be at an
> all-time low :)

Source control and delegation have been good to Linus.

Way back when I posted that "patch penguin" recommendation I was highlighting 
patch integration as a serious bottleneck, and as is normal for Linus he 
barfed on the proposed solution and found a better way to do it, automating 
his way around the problem with better merging tools.  This let him delegate 
entire subsystems to trusted people and trivially merge the results, and thus 
the Lieutenants layer formed between him and normal maintainers.  (It used to 
take someone like Alan Cox to maintain a separate tree and marshal the 
changes from that as a stream of patches Linus could integrate, and Linus had 
to fix up the rejects.  Now it's just a "please pull" request that takes 
Linus a minute or two to handle; the tools do all the integration work.)

All this is why they decided to try going without a development fork but 
instead doing rolling updates.  With better integration tools and a dozen 
subsystem maintainers to spread the load, they can now evaluate and merge 
each month or two what would have been a year's worth of patches back in 
2000.

If you can do a year and a half's worth of integration in three months, what's 
the point of a development fork?  They haven't quite figured out how to 
handle things like the 2.4 to 2.6 modules rewrite, but they introduced the 
feature removal schedule as a step in that direction.  So devfs->udev is 
working like "add udev, deprecate devfs, eventually yank"...

> Yeah, but what does /proc/mounts say? Does it show only references that the
> querying process can see?
>
> ... actually, hey, yes, it's a symlink to /proc/self/mounts, so it does the
> right thing already. Nifty.

Has for a while now. :)

> >> Obviously /etc/mtab *must* be a symlink to /proc/mounts, now,
> >> only oops that breaks the quota tools...)
> >
> > I rewrote busybox mount so that things work properly with /proc/mounts. 
> > And I vaguely remember coming up with an in-house patch to fix the quota
> > tools (they were upset by rootfs) something like four years ago.
>
> Please feed it upstream to the quota tools people before I have to write
> the same damn patch ;)))

Alas, that was four years ago at an employer I stopped working for when their 
venture capital ran out.  Haven't used quota since.

> > Everybody hates /etc/mtab.  It doesn't work if you chroot.  It can't
> > handle --bind or --move mounts...  Just symlink it to /proc/mounts and
> > recognize that any tool that can't handle that is a buggy tool that needs
> > to be fixed.
>
> Well, ideally the kernel should allow mount(2) to feed it *arbitrary*
> options in the `data' argument, reflecting those it doesn't understand
> back into /proc/mounts.

It does, but there are some it has to interpret already.  (For example, rw or 
ro involve setting the MS_RDONLY flag to the correct value.  There are 
several such flags: MS_NOSUID, MS_NODEV, MS_NOEXEC, MS_SYNCHRONOUS, 
MS_REMOUNT, MS_NOATIME, and so on...)

> That would avoid breaking the quota tools and, 
> um, whatever else depends on this (I've seen distributed administration
> tools that mark up filesystems with custom options in the expectation
> that they'll land in mtab, too: I think there's some automated fstab
> editor in HAL that does the same thing).

Trust me, this would just be adding to infrastructure that's already there.  
If you pass "mount -o walrus=enormous" it'll pass it on to the kernel.  
(Well, busybox will.  Don't ask me what the mainline mount does, I haven't 
looked at its sources...)
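
To make the split concrete, here is a rough sketch of how a mount 
implementation might separate the options it has to turn into flags from the 
ones it hands to the kernel untouched.  This is an illustration only, not the 
busybox source: the names split_mount_options and FLAG_OPTIONS are made up, 
and the numeric values are assumed to match Linux <sys/mount.h>.

```python
# Flag values as defined in Linux <sys/mount.h> (copied here by hand).
MS_RDONLY = 1
MS_NOSUID = 2
MS_NODEV = 4
MS_NOEXEC = 8
MS_SYNCHRONOUS = 16
MS_REMOUNT = 32
MS_NOATIME = 1024

# Hypothetical table: option name -> (bits to clear, bits to set).
FLAG_OPTIONS = {
    "ro": (0, MS_RDONLY),
    "rw": (MS_RDONLY, 0),
    "nosuid": (0, MS_NOSUID),
    "suid": (MS_NOSUID, 0),
    "nodev": (0, MS_NODEV),
    "dev": (MS_NODEV, 0),
    "noexec": (0, MS_NOEXEC),
    "exec": (MS_NOEXEC, 0),
    "sync": (0, MS_SYNCHRONOUS),
    "async": (MS_SYNCHRONOUS, 0),
    "remount": (0, MS_REMOUNT),
    "noatime": (0, MS_NOATIME),
    "atime": (MS_NOATIME, 0),
}


def split_mount_options(options):
    """Return (flags, data) for a comma-separated "-o" option string.

    Options found in FLAG_OPTIONS become bits in flags; everything else is
    kept, in order, in the data string that would be passed as the last
    argument of mount(2).
    """
    flags = 0
    leftover = []
    for opt in options.split(","):
        if not opt:
            continue  # tolerate empty fields such as "ro,,noatime"
        if opt in FLAG_OPTIONS:
            clear, set_bits = FLAG_OPTIONS[opt]
            flags = (flags & ~clear) | set_bits
        else:
            leftover.append(opt)
    return flags, ",".join(leftover)


flags, data = split_mount_options("ro,noatime,walrus=enormous")
print(hex(flags), repr(data))
```

Here "walrus=enormous" survives in the data string because nothing in the 
table claims it; what the filesystem does with it afterwards is a separate 
question.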

> [...]
>
> > First time I've heard of the tool, but then back under 2.4.7 I remember I
> > had rsync regularly triggering the OOM killer.  Not because rsync was
> > leaking, but because the servers backing up only had 128 megs of memory
> > and the balancing was _terrible_ so the dentry cache and page cache would
> > squeeze out anonymous pages to the point where rsync itself got OOM
> > killed...
>
> Ick, yes. I switched to 2.4 around that time and switched right back to
> 2.2 again because the MM had so many problems...

I stuck it out.  Same with 2.6 (which I've been using since 2.6.0-pre3, which 
once upon a time used to kernel panic if the orinoco wireless driver ever 
lost touch with its access point)...

> > People who want truly insane amounts of memory these days (often for
> > graphics or video editing) tend to mmap their data files directly and
> > work in there. Once again rendering insane amounts of swap less useful...
>
> Not necessarily, given the existence of MAP_PRIVATE. (The problem with
> working directly in data files without MAP_PRIVATE is that if you lose
> power at *any* time, your data file is toast.)

Did you catch Linus's long rant about how MAP_PRIVATE is deeply stupid and 
that Linux will never really implement it?

And you can fsync and do stuff like journaling within a file.  (Except with 
mmap it's msync.)
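
As a minimal sketch of the msync side of that (assuming a Unix system and 
CPython, where mmap.flush() is implemented with msync()), writing through a 
shared mapping and explicitly flushing it looks roughly like this.  The 
temporary file is just a stand-in for a real data file, and no journaling is 
attempted here.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    # Give the file one page of backing store so it can be mapped.
    os.ftruncate(fd, mmap.PAGESIZE)

    with mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED) as m:
        m[0:5] = b"hello"  # dirty the page through the mapping
        m.flush()          # msync() the mapping back to the file

    # Read the file back through the normal file API.
    with open(path, "rb") as f:
        on_disk = f.read(5)
finally:
    os.close(fd)
    os.unlink(path)

print(on_disk)
```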

> > If we had a "treat this like it's on tmpfs" madvise, that would be
> > ideal...
>
> Agreed. Combine that with per-user filesystems and, well, give every user
> a small tmpfs mount of their own on /tmp and let apps use suitably
> advised mmaps for everything else :)

No, just use the madvise(NO_SYNC).  Then tmpfs becomes completely irrelevant 
because any arbitrary file backed mapping can be treated as memory.  (Not 
pinned, but still treated as shared memory.)

I vaguely remember some discussion about this on linux-kernel, some time ago.  
I wonder how it came out?  (2.6.15-rc2 mman.h doesn't show anything...)

> (security holes? but other users can't *see* that /tmp, which is why it's
> mode 640, just like their $HOME...)

Or you could just add the madvise() like I mentioned and forget about tmpfs 
completely.

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.




* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-28  1:07                     ` Rob Landley
@ 2005-11-29 16:08                       ` Blaisorblade
  2005-11-29 19:38                         ` Rob Landley
  0 siblings, 1 reply; 42+ messages in thread
From: Blaisorblade @ 2005-11-29 16:08 UTC (permalink / raw)
  To: Rob Landley; +Cc: Nix, user-mode-linux-devel, Chris Lightfoot

On Monday 28 November 2005 02:07, Rob Landley wrote:
> On Sunday 27 November 2005 12:31, Nix wrote:

> Did you catch Linus's long rant about how MAP_PRIVATE is deeply stupid and
> that Linux will never really implement it?

What's that?

> And you can fsync and do stuff like journaling within a file.  (Except with
> mmap it's msync.)

> No, just use the madvise(NO_SYNC).

Which doesn't exist.

> Then tmpfs becomes completely 
> irrelevant because any arbitrary file backed mapping can be treated as
> memory.  (Not pinned, but still treated as shared memory.)

> I vaguely remember some discussion about this on linux-kernel, some time
> ago. I wonder how it came out?  (2.6.15-rc2 mman.h doesn't show
> anything...)

What I remember is MADV_TRUNCATE / MADV_FREE / and so on by Badari Pulavarty - 
the equivalent of MADV_DONTNEED with additional truncation of the tmpfs pages 
you have mmapped.

Which is used by Jeff's UML memory hot-unplug patch. IIRC it's in -mm, 
but won't be in 2.6.15 at this point.
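
For comparison, this is what plain MADV_DONTNEED already does on Linux, 
sketched on a private anonymous mapping (assumes Linux and Python 3.8 or 
later for mmap.madvise()).  The truncating variant discussed above, which 
would also release the tmpfs backing store, is not shown.

```python
import mmap

# One page of private anonymous memory (no file behind it).
m = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

m[0:4] = b"data"
before = m[0:4]

# Tell the kernel the whole mapping is no longer needed.  On Linux the pages
# of a private anonymous mapping are dropped and come back zero-filled on
# the next access.
m.madvise(mmap.MADV_DONTNEED)
after = m[0:4]

m.close()
print(before, after)
```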

> > (security holes? but other users can't *see* that /tmp, which is why it's
> > mode 640, just like their $HOME...)

> Or you could just add the madvise() like I mentioned and forget about tmpfs
> completely.

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		


* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-27 21:41                         ` Rob Landley
@ 2005-11-29 16:52                           ` Blaisorblade
  0 siblings, 0 replies; 42+ messages in thread
From: Blaisorblade @ 2005-11-29 16:52 UTC (permalink / raw)
  To: Rob Landley; +Cc: user-mode-linux-devel, Nix, Chris Lightfoot

On Sunday 27 November 2005 22:41, Rob Landley wrote:
> On Sunday 27 November 2005 13:20, Blaisorblade wrote:

> *shrug*  The same argument could be made about any other argument that
> mount interprets.  (Currently: loop, defaults, noauto, ro, rw, nosuid,
> suid, dev, nodev, exec, noexec, sync, async, remount, atime, noatime,
> diratime, nodiratime, bind, move, and rbind.)

No, it's not the same story. What keeps both pieces together is mount being a 
de-facto standard.

Any fs coder who takes one of those as an fs option will have his head lopped 
off the very moment he does it.

So, start making busybox mount incompatible with util-linux mount (I mean for 
features, not for "undefined behaviour" situations) and you'll break stuff.

( I didn't know about diratime... )

> And no, I didn't invent mount interpreting any of that.  I can make it
> depend on loopback support (which is already a config option for mount; you
> can disable features out of the apps in busybox to shrink the size).

> Rob

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		


* Re: [uml-devel] When /tmp is not tmpfs.
  2005-11-29 16:08                       ` Blaisorblade
@ 2005-11-29 19:38                         ` Rob Landley
  0 siblings, 0 replies; 42+ messages in thread
From: Rob Landley @ 2005-11-29 19:38 UTC (permalink / raw)
  To: Blaisorblade; +Cc: Nix, user-mode-linux-devel, Chris Lightfoot

On Tuesday 29 November 2005 10:08, Blaisorblade wrote:
> On Monday 28 November 2005 02:07, Rob Landley wrote:
> > On Sunday 27 November 2005 12:31, Nix wrote:
> >
> > Did you catch Linus's long rant about how MAP_PRIVATE is deeply stupid
> > and that Linux will never really implement it?
>
> What's that?

Looking it up, it turns out it was MAP_COPY he was against:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0110.0/0826.html

So I was wrong there.
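
For reference, the MAP_PRIVATE behaviour Nix was relying on can be sketched 
like this (assuming a Unix system; the temporary file and its contents are 
made up for the illustration).  Writes through a private mapping are 
copy-on-write and never reach the underlying file:

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"original")

    # Private (copy-on-write) mapping of the 8 bytes just written.
    with mmap.mmap(fd, 8, flags=mmap.MAP_PRIVATE) as m:
        m[0:8] = b"scribble"   # modifies only this process's copy
        in_memory = m[0:8]

    # Re-read the file itself: the scribble never made it to disk.
    os.lseek(fd, 0, os.SEEK_SET)
    in_file = os.read(fd, 8)
finally:
    os.close(fd)
    os.unlink(path)

print(in_memory, in_file)
```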

> > And you can fsync and do stuff like journaling within a file.  (Except
> > with mmap it's msync.)
> >
> > No, just use the madvise(NO_SYNC).
>
> Which doesn't exist.

Takes some of the fun out of it, yes.  But as a recommendation in response to 
a similar "change everything" proposal, it seems like a more sane direction 
to me.

> > Then tmpfs becomes completely
> > irrelevant because any arbitrary file backed mapping can be treated as
> > memory.  (Not pinned, but still treated as shared memory.)
> >
> > I vaguely remember some discussion about this on linux-kernel, some time
> > ago. I wonder how it came out?  (2.6.15-rc2 mman.h doesn't show
> > anything...)
>
> What I remember is MADV_TRUNCATE / MADV_FREE / and so on by Badari
> Pulavarty - the equivalent of MADV_DONTNEED with additional truncation of
> the tmpfs pages you have mmaped.
>
> Which is used for UML memory hotunplug patch from Jeff's. IIRC it's in -mm,
> but won't be in 2.6.15 at this point.

This is a vague recollection from 3-5 years ago. :)

Rob
-- 
Steve Ballmer: Innovation!  Inigo Montoya: You keep using that word.
I do not think it means what you think it means.




end of thread, other threads:[~2005-11-29 19:39 UTC | newest]

Thread overview: 42+ messages
2005-11-24 12:11 [uml-devel] When /tmp is not tmpfs Rob Landley
2005-11-24 20:40 ` Blaisorblade
2005-11-25  8:26   ` Rob Landley
2005-11-25  9:55 ` Jeff Dike
2005-11-25  9:48   ` Rob Landley
2005-11-25 10:52     ` Rob Landley
2005-11-25 11:26       ` Rob Landley
2005-11-25 14:56 ` Nix
2005-11-25 15:03   ` Chris Lightfoot
2005-11-25 15:36     ` Nix
2005-11-25 16:03     ` Rob Landley
2005-11-25 19:33       ` Nix
2005-11-25 20:18         ` Rob Landley
2005-11-25 21:04           ` Nix
2005-11-25 22:31             ` Rob Landley
2005-11-27 16:48               ` Blaisorblade
2005-11-27 18:17               ` Nix
2005-11-27 19:24                 ` Rob Landley
2005-11-25 23:33             ` Blaisorblade
2005-11-26  2:12               ` Nix
2005-11-26 11:47                 ` Rob Landley
2005-11-27 17:37                   ` Blaisorblade
2005-11-27 18:35                     ` Nix
2005-11-27 19:10                       ` Blaisorblade
2005-11-27 19:43                         ` Nix
2005-11-27 21:21                       ` Rob Landley
2005-11-27 18:59                     ` Rob Landley
2005-11-27 19:20                       ` Blaisorblade
2005-11-27 21:41                         ` Rob Landley
2005-11-29 16:52                           ` Blaisorblade
2005-11-27 18:31                   ` Nix
2005-11-28  1:07                     ` Rob Landley
2005-11-29 16:08                       ` Blaisorblade
2005-11-29 19:38                         ` Rob Landley
2005-11-26 10:44               ` Rob Landley
2005-11-27 16:38                 ` Blaisorblade
2005-11-27 18:49                   ` Nix
2005-11-27 21:25                     ` Rob Landley
2005-11-27 17:10                 ` Blaisorblade
2005-11-25 23:46           ` Chris Lightfoot
2005-11-26 10:03             ` Rob Landley
2005-11-26 10:15               ` Chris Lightfoot
