* fastboot, diskstat
From: bert hubert @ 2005-07-22 3:41 UTC
To: Andrew Morton; +Cc: linux-kernel

Hi Andrew,

I'm currently at OLS and presented http://ds9a.nl/diskstat yesterday, which
also references your ancient 'fboot' program.

I've also done experiments along those lines, and will be doing more of them
soon.

You mention it was a waste of time, do you recall if that meant:

1) that the total time for prefetching + actual boot was only 10% shorter,
   but that the actual booting did not (significantly) touch the disk?

2) that on actual boot there would still be a lot of i/o ?

Regarding 1), in my own experiments I failed to convince the kernel to
actually cache the pages I touched, I wonder if some sequential-read
detection kicked in, the one that prevents entire cd's being cached.

For my next attempt I'll try to actually lock the pages into memory.

Also, regarding the directory entries, are they accessed via the buffer
cache? Iow, would reading blocks which can't be mapped to files directly via
/dev/hda be useful?

Thanks!

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://netherlabs.nl         Open and Closed source services

* Re: fastboot, diskstat
From: Andrew Morton @ 2005-07-22 4:47 UTC
To: bert hubert; +Cc: linux-kernel

bert hubert <bert.hubert@netherlabs.nl> wrote:
>
> Hi Andrew,
>
> I'm currently at OLS and presented http://ds9a.nl/diskstat yesterday, which
> also references your ancient 'fboot' program.
>
> I've also done experiments along those lines, and will be doing more of them
> soon.
>
> You mention it was a waste of time, do you recall if that meant:
>
> 1) that the total time for prefetching + actual boot was only 10% shorter,
>    but that the actual booting did not (significantly) touch the disk?
>
> 2) that on actual boot there would still be a lot of i/o
>
> ?

eep, this was early 2001, on 2.4.whatever.

I recall trying various preloading schemes - try loading the metadata
first, then the pagecache, pagecache first then metadata, one and not the
other, etc.

Yeah, 10-15% benefit was obtainable but on a little old laptop the amount
of discontiguous I/O was still quite tremendous.

I also recall hacking the initscripts so they were _all_ launched
asynchronously. A few things broke because of dependency problems of
course, but that improved things quite noticeably. I think quite a few of
the scripts and daemons and things have explicit sleeps, and parallelising
all of those helped.

>
> Regarding 1), in my own experiments I failed to convince the kernel to
> actually cache the pages I touched, I wonder if some sequential-read
> detection kicked in, the one that prevents entire cd's being cached.

It depends how you touch the page. Remember that there is no unified
pagecache in 2.6. The pagecache for /dev/hda1 is separate from the
pagecache for /etc/passwd. If one tries to preload /etc/passwd by reading
from /dev/hda1 then that won't be effective.

So any userspace preloading scheme would have to open both /dev/hda1 and
/etc/passwd and it would then read from both fds in some intermingled
manner based on disk block address.

Although I'd suggest that it'd be easier to just get the kernel to do it:
set the disk queue size to something enormous (4096 requests?), open 100
files, launch posix_fadvise() against them all (or against sections of
them) then close the files again. Rely upon that large disk queue to
perform all the sorting. Maybe.

> For my next attempt I'll try to actually lock the pages into memory.

It shouldn't be needed. If at the end of preload there's still a decent
amount of free memory, you know that the kernel hasn't gone and thrown
anything away yet. Any machine with 256MB or more of RAM should be able to
fit all the boot-time stuff into RAM fairly comfortably.

> Also, regarding the directory entries, are they accessed via the buffer
> cache?

yes. For ext3 you can preload both inodes and directory entries via
read()s from /dev/hda1. For ext2, directory entries each have their own
pagecache and should be preloaded via read(open(/name/of/directory)).

> Iow, would reading blocks which can't be mapped to files directly via
> /dev/hda be useful?

If the blocks are directories or inodes then you _must_ preload them via
/dev/hda1's pagecache. (/dev/hda1's pagecache is the storage for
/dev/hda1's buffercache - they're the same thing).

So a scheme which would work for 2.6.x would be:

a) Boot the machine

b) Walk /dev/hda1's pagecache, record which pages are present.

c) For all files which are in dcache, walk their pagecache, work out
   which pages are present.

(nb: it might be possible to do most of the above from userspace: mmap the
file and use mincore() to find out if the page is in pagecache).
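
The userspace variant in the nb above (mmap the file, then ask mincore() which pages are resident) could be sketched roughly as follows. This is only an illustration under assumptions, not anything posted in the thread: the function name `resident_pages` is made up, mincore() is reached through ctypes because Python's mmap module does not wrap it, and a private copy-on-write mapping is used purely so that ctypes can take the mapping's address.

```python
import ctypes
import mmap
import os


def resident_pages(path):
    """Return one bool per page of `path`: True if mincore() reports the
    page as resident in memory. (Hypothetical helper, not from the thread.)"""
    size = os.path.getsize(path)
    if size == 0:
        return []
    npages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE

    # mincore() is looked up in the C library already loaded into the process.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                             ctypes.POINTER(ctypes.c_ubyte)]
    libc.mincore.restype = ctypes.c_int
    vec = (ctypes.c_ubyte * npages)()

    with open(path, "rb") as f:
        # ACCESS_COPY gives a private mapping with a writable buffer, which
        # ctypes needs for from_buffer(); nothing is ever written through it.
        m = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)
    try:
        buf = ctypes.c_char.from_buffer(m)
        try:
            if libc.mincore(ctypes.addressof(buf), size, vec) != 0:
                err = ctypes.get_errno()
                raise OSError(err, os.strerror(err))
        finally:
            del buf  # release the buffer export so the mapping can be closed
    finally:
        m.close()

    # Only the least significant bit of each vector byte is meaningful.
    return [bool(b & 1) for b in vec]
```

Run after boot over every file of interest, the resulting per-file page lists would be the data recorded in steps b) and c). Newer kernels may restrict what mincore() reveals about files the caller cannot write to, so the result is best treated as a hint.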

The above data is enough for performing a crude preload:

a) Boot the machine

b) Boost the disk queue size, set the VFS readahead to zero, open
   /dev/hda1 and all the regular files, hose reads at the disk via
   fadvise(). Restore VFS readahead and queue size, continue with boot.

More sophisticated preload would involve bmap()ping the various regular
files so the reads can be issued in LBA-sorted order. But this could be of
marginal additional benefit.

And I suspect that the whole thing will be of marginal benefit. Although
things might be better now that files are laid out with the Orlov allocator
(make sure that the distro was installed with a 2.6 kernel! The file
layout will be quite different if the installer used a 2.4 ext3).

Of course the next step is to rewrite files so that they are more
favourably laid out on disk. Tricky. Or dump all pagecache to some temp
file in a nice linear slurp and preload that, copying it all to the
appropriate per-inode pagecaches and taking care of files which have been
modified. Trickier ;)

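The fadvise() part of the crude preload (open the files, queue reads against all of them, close them again) might look something like the sketch below. It is an assumption-laden illustration rather than the scheme as actually implemented by anyone in the thread: the name `preload` is invented, the whole of each file is requested rather than selected sections, and the queue-size and readahead tuning from step b) is left out entirely.

```python
import os


def preload(paths):
    """Queue readahead for every file in `paths` and return the paths that
    were queued. (Hypothetical helper; tuning of the disk queue is omitted.)"""
    opened = []
    try:
        # Open everything first, so all requests are issued back to back.
        for path in paths:
            try:
                opened.append((path, os.open(path, os.O_RDONLY)))
            except OSError:
                continue  # missing or unreadable: skip it
        for _path, fd in opened:
            # Offset 0 with length 0 means "through the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        for _path, fd in opened:
            os.close(fd)
    return [path for path, _fd in opened]
```

A caller would pass the list of files recorded on a previous boot, for example `preload(["/etc/passwd", "/etc/fstab"])`. With a long list the per-process file descriptor limit becomes a concern, so a real tool would probably work in batches.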
* Re: fastboot, diskstat
From: Avi Kivity @ 2005-07-22 7:18 UTC
To: Andrew Morton; +Cc: bert hubert, linux-kernel

Andrew Morton wrote:

>The above data is enough for performing a crude preload:
>
>a) Boot the machine
>
>b) Boost the disk queue size, set the VFS readahead to zero, open
>   /dev/hda1 and all the regular files, hose reads at the disk via
>   fadvise(). Restore VFS readahead and queue size, continue with boot.
>

opening all these files will require synchronous reads of their
directories and inodes, so you might need to split b) into first opening
and reading /dev/hda1, then opening and reading the regular files.

>And I suspect that the whole thing will be of marginal benefit. Although
>things might be better now that files are laid out with the Orlov allocator
>(make sure that the distro was installed with a 2.6 kernel! The file
>layout will be quite different if the installer used a 2.4 ext3).
>
>Of course the next step is to rewrite files so that they are more
>favourably laid out on disk. Tricky. Or dump all pagecache to some temp
>file in a nice linear slurp and preload that, copying it all to the
>appropriate per-inode pagecaches and taking care of files which have been
>modified. Trickier ;)
>

another possibility: use a device mapper module under /dev/hda1 that
records I/O patterns, then relocates blocks to fit that pattern, so that
the normal boot sequence ends up issuing sequential disk reads.

parallelized initscripts will probably defeat this, though.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

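The split suggested here (metadata through the block device first, regular files second) could be sketched along these lines. Everything named in it is an assumption for illustration: the function `preload_two_phase`, the idea that the metadata locations are available as (offset, length) byte ranges recorded on an earlier boot, and the use of plain synchronous pread() calls for the first phase.

```python
import os


def preload_two_phase(device, metadata_extents, files):
    """Return (metadata bytes read, number of files queued).

    `metadata_extents` is a hypothetical list of (offset, length) byte
    ranges on `device`, recorded during an earlier boot."""
    # Phase 1: synchronous reads through the block device's pagecache, in
    # ascending offset order, so inodes and directories are cached first.
    total = 0
    dev = os.open(device, os.O_RDONLY)
    try:
        for offset, length in sorted(metadata_extents):
            total += len(os.pread(dev, length, offset))
    finally:
        os.close(dev)

    # Phase 2: the open() calls should now find their metadata in memory,
    # so only the file data itself is left to queue.
    queued = 0
    for path in files:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # missing or unreadable: skip it
        try:
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            queued += 1
        finally:
            os.close(fd)
    return total, queued
```

Going by the earlier reply in this thread, phase 1 would cover inodes and ext3 directories; ext2 directories have their own pagecache and would need to be read by name instead.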
* Re: fastboot, diskstat
From: Lincoln Dale @ 2005-07-22 8:50 UTC
To: Avi Kivity; +Cc: Andrew Morton, bert hubert, linux-kernel

Avi Kivity wrote:

> parallelized initscripts will probably defeat this, though.
>

put all run-once-but-never-run-again scripts into initrd / initramfs?

<evil grin>
boot into a suspend-to-disk image?

i still see the real solution at least for "desktop" machines is to
minimize the sheer amount of stuff loaded in the rc scripts.

at least for my use-every-day laptop (IBM T42), i've literally halved the
startup time by being savvy about what services are started and in many
cases not starting things until a few minutes after i've logged in.

for example, making use of NetworkManager sorts out a lot of the delay
associated with dhcp and roaming WiFi connections - so there is no
start-on-boot network cruft.
likewise, as a desktop it's completely academic if sendmail starts at T+0
seconds or T+2 minutes. same for sshd/cups/httpd/ntpd et al.

of what does run, you CAN run it in parallel & hopefully get some sense
out of the elevator being intelligent.

cheers,

lincoln.

* Re: fastboot, diskstat
From: Andre Eisenbach @ 2005-07-22 7:16 UTC
To: bert hubert, Andrew Morton, linux-kernel

2005/7/21, bert hubert <bert.hubert@netherlabs.nl>:
> I'm currently at OLS and presented http://ds9a.nl/diskstat yesterday, which
> also references your ancient 'fboot' program.

Bert,

ever so slightly off topic, but you mentioned parallelized startup in
your slides...

So checkout initng for your tests. It's a highly parallelized init
system which seriously speeds up boot. It also keeps the disks much
busier during boot and might help your testing.

Initng:
http://initng.thinktux.net

Cheers,
Andre

* Re: fastboot, diskstat
From: Jan Engelhardt @ 2005-07-22 9:31 UTC
To: Andre Eisenbach; +Cc: bert hubert, Andrew Morton, linux-kernel

>> I'm currently at OLS and presented http://ds9a.nl/diskstat yesterday, which
>> also references your ancient 'fboot' program.
>
>So checkout initng for your tests. It's a highly parallelized init
>system which seriously speeds up boot. It also keeps the disks much
>busier during boot and might help your testing.

Sharing my impression:
The downside of parallelization within a runlevel change (to keep it general)
is that the disk can get too active, and if you're starting/stopping a memory
intensive process, you're almost stuck in swapping in and out because
everyone wants a piece of mapped physram.

Jan Engelhardt
--

* Re: fastboot, diskstat
From: Diego Calleja @ 2005-07-22 18:36 UTC
To: Andre Eisenbach; +Cc: bert.hubert, akpm, linux-kernel

On Fri, 22 Jul 2005 00:16:38 -0700,
Andre Eisenbach <int2str@gmail.com> wrote:

> So checkout initng for your tests. It's a highly parallelized init
> system which seriously speeds up boot. It also keeps the disks much
> busier during boot and might help your testing.
>
> Initng:
> http://initng.thinktux.net

It's also interesting that people are porting Mac OS X's launchd to FreeBSD
(which has helped Mac OS X to improve boot times)

http://developer.apple.com/documentation/Darwin/Reference/ManPages/man8/launchd.8.html
http://wikitest.freebsd.org/moin.cgi/launchd