* Re: Linux-2.5.17
@ 2002-05-21 18:52 Wayne.Brown
2002-05-21 21:30 ` Linux-2.5.17 David S. Miller
0 siblings, 1 reply; 50+ messages in thread
From: Wayne.Brown @ 2002-05-21 18:52 UTC (permalink / raw)
To: linux-kernel
Under 2.5.17 there is a problem with gtop 1.0.9. It opens a window but never
fills in any details; there's just a blank background. The process becomes
unkillable, even with -9, and although I can do a normal shutdown, the root
partition can't be unmounted because the gtop process is still running and so
fsck is forced on reboot. There are no oops messages that I can find in any of
the logs.
Actually, this happens with all the most recent 2.5.x kernels. I'm not sure how
far back it goes, but I believe it was working OK prior to 2.5.8. (If necessary
I can try to narrow it down further.) It still works great with 2.4.19-pre8 and
2.4.19-pre8-ac5.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
@ 2002-05-22 20:00 Wayne.Brown
2002-05-23 12:17 ` Linux-2.5.17 Nick Holloway
0 siblings, 1 reply; 50+ messages in thread
From: Wayne.Brown @ 2002-05-22 20:00 UTC (permalink / raw)
To: linux-kernel
Thanks for pointing me in the right direction; I found the same code in my copy
of libgtop (1.0.9) and see the problem. Maybe now I can hack something together
to make it work.
In comparing /proc/meminfo in 2.4.19-pre8 and 2.5.17 I see that there is very
little difference except that the information gtop relies upon is missing. The
lines it needs aren't changed or rearranged, just gone altogether. Was there
any particular purpose for that, other than breaking programs like gtop? I'm a
firm believer that adding something new to a system should never break existing
functionality unless absolutely necessary. Was it necessary in this case, or
was it done because someone was offended that it wasn't "clean" enough?
Nick.Holloway@pyrites.org.uk (Nick Holloway) on 05/22/2002 06:49:59 AM
To: linux-kernel@vger.kernel.org
cc: (bcc: Wayne Brown/Corporate/Altec)
Subject: Re: Linux-2.5.17
In <86256BC1.001146A6.00@smtpnotes.altec.com> Wayne.Brown@altec.com writes:
> I can live with not building, crashing, or even eating filesystems. Those
> things will be fixed sooner or later. But breaking userspace programs -- that
> may well be permanent.
Looking at the source code to libgtop-1.0.6 (the version I have
easy access to), the parser used to extract the swap information from
/proc/meminfo is extremely fragile (read: broken). Rather than looking
at the tag at the start of each line for the one it requires, it assumes
that the "Swap:" details are on the 3rd line (and doesn't even verify
the label).
You can't expect the kernel to keep compatibility for such poor user-space
code (especially during a development cycle).
The change to /proc/meminfo came about in 2.5.1, and this removed
the first two lines from the old, inflexible layout (that has been
deprecated for a while, and should probably have been removed during the
2.1.x development cycle).
--
`O O' | Nick.Holloway@pyrites.org.uk
// ^ \\ | http://www.pyrites.org.uk/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 20:00 Linux-2.5.17 Wayne.Brown
@ 2002-05-23 12:17 ` Nick Holloway
0 siblings, 0 replies; 50+ messages in thread
From: Nick Holloway @ 2002-05-23 12:17 UTC (permalink / raw)
To: linux-kernel
In <86256BC1.0076F247.00@smtpnotes.altec.com> Wayne.Brown@altec.com writes:
> In comparing /proc/meminfo in 2.4.19-pre8 and 2.5.17 I see that there is very
> little difference except that the information gtop relies upon is missing. The
> lines it needs aren't changed or rearranged, just gone altogether. Was there
> any particular purpose for that, other than breaking programs like gtop?
I confess to submitting the patch to Linus to remove the compatibility
lines when 2.5.0 was created.
The change that made the first two lines redundant was originally made in
1.3.68[1], and the intention was that they would be removed -- the comment
in get_meminfo had:
Tagged format, for easy grepping and expansion. The above will go away
eventually, once the tools have been updated.
As I had an application that parsed /proc/meminfo around the time of
the change, I wrote the parser to handle the new format in preparation
for the old format going away. Others would have done the same.
Someone mentioned that this compatibility was still present during
2.4 development, so I made a mental note to submit a patch when 2.5
development started.
> I'm a firm believer that adding something new to a system should
> never break existing functionality unless absolutely necessary. Was it
> necessary in this case, or was it done because someone was offended that
> it wasn't "clean" enough?
Where possible, interfaces should not be changed for the sake of change,
but you shouldn't keep old interfaces because that way lies bloat.
One of the beauties of Linux (IMO) is that broken interfaces don't have
to be maintained ad infinitum. It should be expected that interfaces
may be broken during the development series of kernels, and the tools
have a chance to catch up during this cycle.
The interface that gtop (implicitly) required is that "Swap:" should be
the 3rd line in /proc/meminfo. You'll never get anyone to admit that
this is the interface to /proc/meminfo that should be maintained.
-
[1] http://groups.google.com/groups?selm=4glagm%24pno%40treflan.shout.net
--
`O O' | Nick.Holloway@pyrites.org.uk
// ^ \\ | http://www.pyrites.org.uk/
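
[Editorial illustration, not part of the original message.] The tagged
format described above lets a reader find a field by its label instead of
by its line position. A minimal sketch in C, assuming the caller has
already read the contents of /proc/meminfo into a string; the function name
meminfo_lookup is hypothetical and is not taken from libgtop:

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one "Tag:   value" line in a meminfo-style text buffer.
 * Returns 1 and stores the number on success, 0 if the tag is absent
 * or is not followed by a number.
 */
int meminfo_lookup(const char *text, const char *tag, unsigned long *value)
{
    size_t taglen = strlen(tag);
    const char *line = text;

    while (line && *line) {
        /* The tag must start the line and be followed directly by ':' */
        if (strncmp(line, tag, taglen) == 0 && line[taglen] == ':') {
            if (sscanf(line + taglen + 1, "%lu", value) == 1)
                return 1;
            return 0;
        }
        line = strchr(line, '\n');
        if (line)
            line++;            /* step past the newline */
    }
    return 0;
}
```

A lookup of "SwapTotal" done this way does not depend on which line the
field sits on, so removing or reordering other lines leaves the caller
working.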
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
@ 2002-05-22 3:02 Wayne.Brown
2002-05-22 7:12 ` Linux-2.5.17 Zwane Mwaikambo
2002-05-22 11:49 ` Linux-2.5.17 Nick Holloway
0 siblings, 2 replies; 50+ messages in thread
From: Wayne.Brown @ 2002-05-22 3:02 UTC (permalink / raw)
To: Russell King; +Cc: linux-kernel
I can live with not building, crashing, or even eating filesystems. Those
things will be fixed sooner or later. But breaking userspace programs -- that
may well be permanent. If there was a good chance it would be working again by
the time 2.6 comes out, it wouldn't bother me. But I really don't expect this
to change, so it looks like I won't be able to use 2.5 (or anything later) until
I get another version of gtop, or fix this one myself. And so there will be yet
another nonstandard change to my patchwork system that I'll have to deal with
(if I even remember it) the next time I try to upgrade (i.e., reinstall)
Slackware.
Russell King <rmk@arm.linux.org.uk> on 05/21/2002 06:29:23 PM
To: Wayne Brown/Corporate/Altec@Altec
cc: linux-kernel@vger.kernel.org
Subject: Re: Linux-2.5.17
On Tue, May 21, 2002 at 06:20:56PM -0500, Wayne.Brown@altec.com wrote:
> So, I'm just getting used to the idea of using new tools to build kernels,
> and now I learn that 2.5 breaks an ordinary program that I use all day,
> every day. It just keeps getting better and better...
The 2.<odd> series, like 2.5, is a strictly development kernel series; new
features go into these all the time. You can expect it to:
1. not build.
2. crash.
3. silently eat your filesystems.
4. break userspace programs.
or any combination of the above. If you're looking for stability, stick
with the 2.<even> series.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 3:02 Linux-2.5.17 Wayne.Brown
@ 2002-05-22 7:12 ` Zwane Mwaikambo
2002-05-22 11:49 ` Linux-2.5.17 Nick Holloway
1 sibling, 0 replies; 50+ messages in thread
From: Zwane Mwaikambo @ 2002-05-22 7:12 UTC (permalink / raw)
To: Wayne.Brown; +Cc: Linux Kernel
On Tue, 21 May 2002 Wayne.Brown@altec.com wrote:
> I get another version of gtop, or fix this one myself. And so there will be yet
> another nonstandard change to my patchwork system that I'll have to deal with
> (if I even remember it) the next time I try to upgrade (i.e., reinstall)
> Slackware.
Change isn't such a bad thing, and if you want consistency, 2.5 is hardly
what you should be using.
Cheers,
Zwane Mwaikambo
--
http://function.linuxpower.ca
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 3:02 Linux-2.5.17 Wayne.Brown
2002-05-22 7:12 ` Linux-2.5.17 Zwane Mwaikambo
@ 2002-05-22 11:49 ` Nick Holloway
1 sibling, 0 replies; 50+ messages in thread
From: Nick Holloway @ 2002-05-22 11:49 UTC (permalink / raw)
To: linux-kernel
In <86256BC1.001146A6.00@smtpnotes.altec.com> Wayne.Brown@altec.com writes:
> I can live with not building, crashing, or even eating filesystems. Those
> things will be fixed sooner or later. But breaking userspace programs -- that
> may well be permanent.
Looking at the source code to libgtop-1.0.6 (the version I have
easy access to), the parser used to extract the swap information from
/proc/meminfo is extremely fragile (read: broken). Rather than looking
at the tag at the start of each line for the one it requires, it assumes
that the "Swap:" details are on the 3rd line (and doesn't even verify
the label).
You can't expect the kernel to keep compatibility for such poor user-space
code (especially during a development cycle).
The change to /proc/meminfo came about in 2.5.1, and this removed
the first two lines from the old, inflexible layout (that has been
deprecated for a while, and should probably have been removed during the
2.1.x development cycle).
--
`O O' | Nick.Holloway@pyrites.org.uk
// ^ \\ | http://www.pyrites.org.uk/
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
@ 2002-05-21 23:20 Wayne.Brown
2002-05-21 23:29 ` Linux-2.5.17 Russell King
2002-05-21 23:33 ` Linux-2.5.17 Joel Jaeggli
0 siblings, 2 replies; 50+ messages in thread
From: Wayne.Brown @ 2002-05-21 23:20 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-kernel
Thanks for the information.
So, I'm just getting used to the idea of using new tools to build kernels, and
now I learn that 2.5 breaks an ordinary program that I use all day, every day.
It just keeps getting better and better...
"David S. Miller" <davem@redhat.com> on 05/21/2002 04:30:11 PM
To: Wayne Brown/Corporate/Altec@Altec
cc: linux-kernel@vger.kernel.org
Subject: Re: Linux-2.5.17
From: Wayne.Brown@altec.com
Date: Tue, 21 May 2002 13:52:08 -0500
Under 2.5.17 there is a problem with gtop 1.0.9.
The /proc/meminfo output changed, and this makes a lot of programs
reading that file explode.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 23:20 Linux-2.5.17 Wayne.Brown
@ 2002-05-21 23:29 ` Russell King
2002-05-21 23:33 ` Linux-2.5.17 Joel Jaeggli
1 sibling, 0 replies; 50+ messages in thread
From: Russell King @ 2002-05-21 23:29 UTC (permalink / raw)
To: Wayne.Brown; +Cc: linux-kernel
On Tue, May 21, 2002 at 06:20:56PM -0500, Wayne.Brown@altec.com wrote:
> So, I'm just getting used to the idea of using new tools to build kernels,
> and now I learn that 2.5 breaks an ordinary program that I use all day,
> every day. It just keeps getting better and better...
The 2.<odd> series, like 2.5, is a strictly development kernel series; new
features go into these all the time. You can expect it to:
1. not build.
2. crash.
3. silently eat your filesystems.
4. break userspace programs.
or any combination of the above. If you're looking for stability, stick
with the 2.<even> series.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 23:20 Linux-2.5.17 Wayne.Brown
2002-05-21 23:29 ` Linux-2.5.17 Russell King
@ 2002-05-21 23:33 ` Joel Jaeggli
1 sibling, 0 replies; 50+ messages in thread
From: Joel Jaeggli @ 2002-05-21 23:33 UTC (permalink / raw)
To: Wayne.Brown; +Cc: David S. Miller, linux-kernel
On Tue, 21 May 2002 Wayne.Brown@altec.com wrote:
>
>
> Thanks for the information.
>
> So, I'm just getting used to the idea of using new tools to build kernels, and
> now I learn that 2.5 breaks an ordinary program that I use all day, every day.
> It just keeps getting better and better...
>
I don't recall how many times stuff that touched proc broke in the
.99-1.3-2.0 era but it was a few...
the proc filesystem is not an snmp mib.
>
>
>
> "David S. Miller" <davem@redhat.com> on 05/21/2002 04:30:11 PM
>
> To: Wayne Brown/Corporate/Altec@Altec
> cc: linux-kernel@vger.kernel.org
>
> Subject: Re: Linux-2.5.17
>
>
>
> From: Wayne.Brown@altec.com
> Date: Tue, 21 May 2002 13:52:08 -0500
>
> Under 2.5.17 there is a problem with gtop 1.0.9.
>
> The /proc/meminfo output changed, and this makes a lot of programs
> reading that file explode.
--
--------------------------------------------------------------------------
Joel Jaeggli Academic User Services joelja@darkwing.uoregon.edu
-- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E --
In Dr. Johnson's famous dictionary patriotism is defined as the last
resort of the scoundrel. With all due respect to an enlightened but
inferior lexicographer I beg to submit that it is the first.
-- Ambrose Bierce, "The Devil's Dictionary"
^ permalink raw reply [flat|nested] 50+ messages in thread
* Linux-2.5.17
@ 2002-05-21 5:16 Linus Torvalds
2002-05-21 13:58 ` Linux-2.5.17 Roman Zippel
` (3 more replies)
0 siblings, 4 replies; 50+ messages in thread
From: Linus Torvalds @ 2002-05-21 5:16 UTC (permalink / raw)
To: Kernel Mailing List
Various FS updates (including merges of quota and iget_locked), and
Makefile cleanups from Kai.
And yet more TLB shootdown stuff.
Linus
-----
Summary of changes from v2.5.16 to v2.5.17
============================================
<acme@conectiva.com.br>
copy_from/to_user checking in
o drivers/sound/*.c
o fs/intermezzo/ext_attr.c
o drivers/isdn/*.c
o drivers/usr/*.c
o sound/{core,pci}/*.c
o drivers/char/*
o drivers/block/*.c
Andrew Morton <akpm@zip.com.au>
o check for dirtying of non-uptodate buffers
o reduce lock contention in do_pagecache_readahead
o larger b_size, and misc fixlets
o fix dirty page management
o reiserfs locking fix
o pdflush exclusion infrastructure
o dirty inode management
o i_dirty_buffers locking fix
o ext2: preread inode backing blocks
o pdflush exclusion
o fix ext3 race with writeback
o fix ext3 buffer-stealing
o writeback tuning
o remove PG_launder
o improved I/O scheduling for indirect blocks
<david@gibson.dropbear.id.au>
o Missing init.h in drivers/pci/power.c
<dmccr@us.ibm.com>
o Thread group exit problem reappeared
Christoph Hellwig <hch@infradead.org>
o cleanup read/write
o Small cleanup of nfsd export checks
o kNFSd cleanup of nfsd_open
o get rid of <linux/locks.h>
<jack@suse.cz>
o quota-1-newlocks
o quota-2-formats
o quota-3-register
o quota-4-getstats
o quota-5-space
o quota-6-bytes
o quota-7-quotactl
o quota-8-format1
o quota-9-format2
o quota-10-inttype
o quota-11-sync
o quota-12-compat
o quota-13-ioctl
<jaharkes@cs.cmu.edu>
o iget_locked [1-6]
<jhammer@us.ibm.com>
o ips for 2.5
<kai@tp1.ruhr-uni-bochum.de>
o Rules.make cleanup: introduce c_flags, a_flags
o Remove some cruft from top-level Makefile
o Move DocBook stuff out of top-level Makefile
o Move arch specific options to their Makefile
o Don't implicitly export all symbols
o top-level Makefile cleanup
o Remove assembler rules from top-level Makefile
o Add scripts to generate include/linux/{version,compile}.h
o Rules.make: Use variables for commands
o Small Rules.make cleanup
o Rules.make: check for changed command line
o Makefile cleanup: Don't rebuild init/version.o on each build
o IA64: Use standard AS rule
o x86_64: Use standard AS rule
o Rules.make: Remove special rule for $(export-objs)
o Fix a typo in drivers/pcmcia/Makefile
o Fix arch/alpha/boot AS rule
o Makefile: fix merge
o ISDN: Export CAPI user interface directly
o ISDN: Remove remaining MOD_{INC,DEC}_USE_COUNT from CAPI drivers
o Make AFLAGS_KERNEL use consistent with CFLAGS_KERNEL
o ISDN: CAPI: Remove duplicate statistics
o ISDN: CAPI: Remove capi_interface_user etc.
o ISDN: CAPI: Move the notification callback
o ISDN: Have the CAPI application alloc struct capi_appl
o ISDN: CAPI: Pass struct capi_appl * instead of index
o ISDN: CAPI use struct capi20_appl * in signal callback
o ISDN: CAPI: Get rid of capi_signal mechanism
o ISDN: AVM T1 ISA CAPI controller fix
o Update /BitKeeper/etc/ignore
o kbuild: Use $(CURDIR)
o kbuild: Suppress printing of '$(MAKE) -C command' line
o kbuild: Fix object-specific CFLAGS_foo.o
o Small fix for net/irda/Makefile
o Fix ext2 compilation
o Fix some compiler warnings
o kbuild: Remove generated .<object>.cmd files on 'make clean'
o kbuild: Standardize building of init/*
o kbuild: Speed up vmlinux build
<mason@suse.com>
o reiserfs bitops warnings
o reiserfs iput deadlock fix
Neil Brown <neilb@cse.unsw.edu.au>
o Increase snd buffer size for UDP
o Change MD Superblock IO to go straight to submit_bio
o Tidy up raid5 code
o Initial md/raid5 support for 2.5 (with bio)
<torvalds@transmeta.com>
o Clean up %cr3 loading on x86, fix lazy TLB problem
o Fix double i_writecount handling (Tony Luck)
o Make generic TLB shootdown friendlier to non-x86 architectures
o Fix OSS API emulation when sound is compiled as a module
o Update kernel version to 2.5.17
o New makefiles generate .*.cmd files, not .*.flags files
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 5:16 Linux-2.5.17 Linus Torvalds
@ 2002-05-21 13:58 ` Roman Zippel
2002-05-21 16:06 ` Linux-2.5.17 Linus Torvalds
2002-05-22 10:54 ` Linux-2.5.17 Martin Dalecki
` (2 subsequent siblings)
3 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2002-05-21 13:58 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
Hi,
On Mon, 20 May 2002, Linus Torvalds wrote:
> And yet more TLB shootdown stuff.
I'm a bit puzzled about how you want to do proper rss accounting: you have
now put a "tlb->freed++;" into zap_pte_range(). mmu_gather_t is supposed to
be an opaque type, and this access violates that.
bye, Roman
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 13:58 ` Linux-2.5.17 Roman Zippel
@ 2002-05-21 16:06 ` Linus Torvalds
2002-05-21 18:36 ` Linux-2.5.17 Roman Zippel
0 siblings, 1 reply; 50+ messages in thread
From: Linus Torvalds @ 2002-05-21 16:06 UTC (permalink / raw)
To: Roman Zippel; +Cc: Kernel Mailing List
On Tue, 21 May 2002, Roman Zippel wrote:
> On Mon, 20 May 2002, Linus Torvalds wrote:
>
> > And yet more TLB shootdown stuff.
>
> I'm a bit puzzled about how you want to do proper rss accounting: you have
> now put a "tlb->freed++;" into zap_pte_range(). mmu_gather_t is supposed to
> be an opaque type, and this access violates that.
I don't think there is any validity any more in the "opaque type" comment,
and I'd rather expose the fact that it _has_ to have the rss computations
inside of it than have more made-up interfaces to hide it.
The fact is, the rss cannot be computed anywhere else any more, so why
play games about it?
Linus
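
[Editorial illustration, not from the thread or the kernel tree.] One way
to picture the point about the rss computation living inside the gather
structure: the unmap loop only bumps a counter, and the count is folded
into the address space's rss once, when the gather finishes. Every name
below (toy_mm, toy_gather, toy_zap_range and so on) is a hypothetical
stand-in, and the clamp in toy_gather_finish is an assumption of this
sketch rather than a statement about the real code:

```c
#include <stddef.h>

/* Toy stand-ins; these are not the kernel's definitions. */
struct toy_mm {
    long rss;                 /* resident pages charged to this address space */
};

struct toy_gather {
    struct toy_mm *mm;
    long freed;               /* pages unmapped since the gather started */
};

void toy_gather_start(struct toy_gather *tlb, struct toy_mm *mm)
{
    tlb->mm = mm;
    tlb->freed = 0;
}

/* Unmap loop: count every present entry that gets torn down. */
void toy_zap_range(struct toy_gather *tlb, int *present, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (present[i]) {
            present[i] = 0;
            tlb->freed++;     /* the counter lives in the gather itself */
        }
    }
}

/* Finish: fold the count into rss once, never letting it go negative. */
void toy_gather_finish(struct toy_gather *tlb)
{
    long freed = tlb->freed;

    if (freed > tlb->mm->rss)
        freed = tlb->mm->rss;
    tlb->mm->rss -= freed;
    tlb->freed = 0;
}
```

The caller of the unmap loop never touches rss directly in this model; it
only sees the gather, which is the trade-off being argued about.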
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 16:06 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-21 18:36 ` Roman Zippel
2002-05-21 18:53 ` Linux-2.5.17 Linus Torvalds
0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2002-05-21 18:36 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
Hi,
Linus Torvalds wrote:
> I don't think there is any validity any more in the "opaque type" comment,
> and I'd rather expose the fact that it _has_ to have the rss computations
> inside of it than have more made-up interfaces to hide it.
>
> The fact is, the rss cannot be computed anywhere else any more, so why
> play games about it?
Basically I could agree with it, but something looks wrong. Why exactly
is pte_free_tlb() needed in the first place? Why does it call
tlb_remove_page()? A page mapped into user space has little to do with a
page used as a page table. The latter is never in the user TLB, so it doesn't
need to be removed from it, so calling tlb_remove_page() is just a more
complicated way of calling __free_page(), or am I missing something?
bye, Roman
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 18:36 ` Linux-2.5.17 Roman Zippel
@ 2002-05-21 18:53 ` Linus Torvalds
2002-05-21 23:35 ` Linux-2.5.17 Roman Zippel
0 siblings, 1 reply; 50+ messages in thread
From: Linus Torvalds @ 2002-05-21 18:53 UTC (permalink / raw)
To: Roman Zippel; +Cc: Kernel Mailing List
On Tue, 21 May 2002, Roman Zippel wrote:
>
> Basically I could agree with it, but something looks wrong. Why exactly
> is pte_free_tlb() needed in the first place? Why does it call
> tlb_remove_page()?
That is an x86-specific thing, not architected.
The _architected_ thing is
- pte_free() does the physical free of a pte pointer that was allocated
but never inserted into the page tables due to optimistic locking (see
pte_alloc_map() in mm/memory.c).
- pte_free_tlb() does the same BUT it is also an architecture-specific
hook to allow the architecture to also shoot down, in some way, whatever TLB
contents might depend on the pmd_page in question.
On x86, we do that by just adding it as another page to the TLB flush
stuff, but other architectures might just make it be the same as
pte_free() if there are no TLB issues involved.
If you care, the reason we need to do this on x86 is that the TLB walker
is speculative and almost totally asynchronous wrt the rest of the CPU
core, so we may have a CPU "TLB lookup thread" going on in parallel with
the TLB cleaning - and that TLB lookup may have looked up the pmd contents
already but not resolved the entry yet. Which is why we have to
synchronize the PMD freeing with the TLB flush - the same way we already
have to do it for the regular data pages.
Other architectures may not have this issue (or you can fix it with
alternative approaches, like using the pmd quicklists etc to avoid freeing
the pmd before the TLB flush, which is likely to be the fix in the 2.4.x
tree).
Linus
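
[Editorial illustration, not kernel code.] The difference between the two
hooks can be modelled in user space: a table page that was never inserted
can be freed at once, while one that was live has to wait until after the
flush. This is a minimal sketch under that reading of the message; the
toy_ names, the batch size, and the flush counter are all hypothetical,
and free() stands in for returning a page to the allocator:

```c
#include <stdlib.h>

#define TOY_BATCH 8

struct toy_tlb_batch {
    void *pages[TOY_BATCH];   /* table pages waiting for the next flush */
    int nr;
    int flushes;              /* how many times the "TLB" was flushed */
};

/* Never-inserted table page: nothing can have cached it, free right away. */
void toy_pte_free(void *table)
{
    free(table);
}

/* Flush first, then release everything that was waiting on the flush. */
void toy_flush_and_free(struct toy_tlb_batch *tlb)
{
    tlb->flushes++;                    /* stands in for the real TLB flush */
    for (int i = 0; i < tlb->nr; i++)
        free(tlb->pages[i]);           /* only now is reuse safe */
    tlb->nr = 0;
}

/*
 * Table page that was live in the page tables: defer the free until
 * after the flush, because a hardware walk may still be reading it.
 */
void toy_pte_free_tlb(struct toy_tlb_batch *tlb, void *table)
{
    if (tlb->nr == TOY_BATCH)
        toy_flush_and_free(tlb);       /* batch full: flush early */
    tlb->pages[tlb->nr++] = table;
}
```

An architecture without the speculative-walk problem could, in this model,
simply make toy_pte_free_tlb() call toy_pte_free().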
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 18:53 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-21 23:35 ` Roman Zippel
2002-05-22 0:10 ` Linux-2.5.17 Linus Torvalds
0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2002-05-21 23:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
Hi,
Linus Torvalds wrote:
> If you care, the reason we need to do this on x86 is that the TLB walker
> is speculative and almost totally asynchronous wrt the rest of the CPU
> core, so we may have a CPU "TLB lookup thread" going on in parallel with
> the TLB cleaning - and that TLB lookup may have looked up the pmd contents
> already but not resolved the entry yet. Which is why we have to
> synchronize the PMD freeing with the TLB flush - the same way we already
> have to do it for the regular data pages.
Alternative suggestion: remove the present bit from the pgd/pmd entry.
After you flushed the tlb, you can clean up the page tables without a
hurry. That will work on any sane system and you don't have to force
data and table pages into the same interface.
bye, Roman
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 23:35 ` Linux-2.5.17 Roman Zippel
@ 2002-05-22 0:10 ` Linus Torvalds
2002-05-22 0:31 ` Linux-2.5.17 Roman Zippel
0 siblings, 1 reply; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 0:10 UTC (permalink / raw)
To: Roman Zippel; +Cc: Kernel Mailing List
On Wed, 22 May 2002, Roman Zippel wrote:
>
> Alternative suggestion: remove the present bit from the pgd/pmd entry.
> After you flushed the tlb, you can clean up the page tables without a
> hurry. That will work on any sane system and you don't have to force
> data and table pages into the same interface.
Sounds sane, except for the fact that some architectures do not actually
care about the "Present" bit in the pgd at all.
x86, to be exact ;(
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 0:10 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-22 0:31 ` Roman Zippel
2002-05-22 0:54 ` Linux-2.5.17 Linus Torvalds
0 siblings, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2002-05-22 0:31 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
Hi,
On Tue, 21 May 2002, Linus Torvalds wrote:
> > Alternative suggestion: remove the present bit from the pgd/pmd entry.
> > After you flushed the tlb, you can clean up the page tables without a
> > hurry. That will work on any sane system and you don't have to force
> > data and table pages into the same interface.
>
> Sounds sane, except for the fact that some architectures do not actually
> care about the "Present" bit in the pgd at all.
>
> x86, to be exact ;(
IMO that's not really a problem, the pmd tables are created and destroyed
with the pgd table.
bye, Roman
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 0:31 ` Linux-2.5.17 Roman Zippel
@ 2002-05-22 0:54 ` Linus Torvalds
2002-05-22 2:17 ` Linux-2.5.17 David S. Miller
2002-05-22 13:45 ` Linux-2.5.17 Roman Zippel
0 siblings, 2 replies; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 0:54 UTC (permalink / raw)
To: Roman Zippel; +Cc: Kernel Mailing List
On Wed, 22 May 2002, Roman Zippel wrote:
> >
> > x86, to be exact ;(
>
> IMO that's not really a problem, the pmd tables are created and destroyed
> with the pgd table.
unmap()?
That's the big one, actually. The exit case we _could_ do very differently
anyway, and there are reasons that we probably should try to.
(When we exit, we could flush the TLB and at the same time do a
"speculative" switch to the mm of the next process on the run-queue of
this CPU, so that when we actually tear down the MM we would have no TLB
issues at all any more).
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 0:54 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-22 2:17 ` David S. Miller
2002-05-22 2:40 ` Linux-2.5.17 Linus Torvalds
2002-05-22 13:45 ` Linux-2.5.17 Roman Zippel
1 sibling, 1 reply; 50+ messages in thread
From: David S. Miller @ 2002-05-22 2:17 UTC (permalink / raw)
To: torvalds; +Cc: zippel, linux-kernel
From: Linus Torvalds <torvalds@transmeta.com>
Date: Tue, 21 May 2002 17:54:18 -0700 (PDT)
That's the big one, actually. The exit case we _could_ do very differently
anyway, and there are reasons that we probably should try to.
(When we exit, we could flush the TLB and at the same time do a
"speculative" switch to the mm of the next process on the run-queue of
this CPU, so that when we actually tear down the MM we would have no TLB
issues at all any more).
I think deferring this to the lazy TLB end at the next task switch is
worth pursuing.
I always wanted to also explore ways to speed up these pieces of code
we have which walk the page table tree to kill everything off.
Something simple like a very small bitmap in the mm_struct. It would
work by keeping track of which areas of the address space actually
have some mappings present. The set bits would be kept track of
pessimistically, to keep it fast and simple.
So when you add a page mapping somewhere you'd go:
set_mapping_bit(mm, address);
Then exit_mmap() would only traverse into parts of the page
tables where mappings actually existed.
Similarly for copy_page_range when dup'ing an address space.
This stuff shows up clearly on the fork/exit/exec microbenchmark
profiles.
Like I said, keep the bitmap very small, perhaps 4 unsigned longs
at the most.
Actually, what this suggests is that we blow away the page table
flushing guts of exit_mmap() and just have this
"anihilate_address_space()" thing that is 100% arch-specific and can
be used to optimize this as much as a platform wants to. We can even
provide a "boring" generic implementation protected by
HAVE_ARCH_ANIHILATE_ADDRESS_SPACE that basically looks like what we
have there today. (The interface name sucks, I know, sorry, we'll
have to come up with a nicer name :-)
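
[Editorial illustration, not part of the original message.] The small
bitmap idea can be sketched in user space. Assumptions made here, none of
them from the thread: a 32-bit address space, four 32-bit words (128 bits,
so each bit covers 32 MiB), and every name except set_mapping_bit, which
is hypothetical. Bits are only ever set, never cleared on unmap, which is
what keeps the tracking pessimistic and cheap:

```c
#include <stdint.h>

#define MAP_WORDS     4       /* "4 unsigned longs at the most" */
#define REGION_SHIFT  25      /* 2^32 bytes / 128 bits = 32 MiB per bit */

struct toy_addr_space {
    uint32_t mapping_bits[MAP_WORDS];
};

/* Called whenever a page mapping is added somewhere. */
void set_mapping_bit(struct toy_addr_space *mm, uint32_t address)
{
    uint32_t region = address >> REGION_SHIFT;          /* 0..127 */

    mm->mapping_bits[region / 32] |= UINT32_C(1) << (region % 32);
}

/* 1 if something was ever mapped in this region, 0 if it is untouched. */
int region_may_be_mapped(const struct toy_addr_space *mm, uint32_t region)
{
    return (mm->mapping_bits[region / 32] >> (region % 32)) & 1;
}

/* Teardown visits only flagged regions; returns how many it visited. */
unsigned toy_exit_mmap(struct toy_addr_space *mm)
{
    unsigned visited = 0;

    for (uint32_t region = 0; region < MAP_WORDS * 32; region++) {
        if (!region_may_be_mapped(mm, region))
            continue;         /* nothing was ever mapped here: skip */
        visited++;            /* a real walk would descend here */
    }
    for (int i = 0; i < MAP_WORDS; i++)
        mm->mapping_bits[i] = 0;
    return visited;
}
```

With mappings only near the bottom and the top of the address space, the
teardown loop in this model descends into two regions out of 128.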
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 2:17 ` Linux-2.5.17 David S. Miller
@ 2002-05-22 2:40 ` Linus Torvalds
2002-05-22 2:57 ` Linux-2.5.17 David S. Miller
0 siblings, 1 reply; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 2:40 UTC (permalink / raw)
To: David S. Miller; +Cc: zippel, linux-kernel
On Tue, 21 May 2002, David S. Miller wrote:
>
> I think deferring this to the lazy TLB end at the next task switch is
> worth pursuing.
No can do.
If we tear down the page tables, we _have_ to flush the TLB on x86,
because even if we don't touch them later on, speculative execution may
end up causing TLB fills, and if we don't tell the TLB fill hw that we've
torn down the pages (by invalidating the TLB), you can get all the same
nasty behaviour.
And we cannot just defer the TLB flush to a later date ("who cares if we
get crap in the TLB, we'll flush it anyway"), because some of the bogus
TLB contents might get the "Global" bit set too. Which would mean that
those bogus entries wouldn't be flushed at all.
In short:
- if we tear down the page tables, we _have_ to flush the TLB, even if we
turn it into a lazy TLB.
- At least on x86, once you flush the TLB, the incremental cost of doing
a full mm switch is basically zero. The TLB flush is, after all, the
real cost of the mm switch (this is likely to be true on other CPU's
too).
- so we can choose between just flushing the TLB (and leaving it lazy),
and then on the next switch_mm() we flush it again when we switch into
the next process, _OR_ we could try to opportunistically switch mm's
"early".
The early switch would at least on x86 be likely to result in the minimal
amount of TLB flushing theoretically possible. Which I kind of like (if
you can _prove_ that you cannot do better, you're in a good position ;).
But the "just flush the TLB" approach certainly also works.
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 2:40 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-22 2:57 ` David S. Miller
2002-05-22 3:21 ` Linux-2.5.17 Linus Torvalds
0 siblings, 1 reply; 50+ messages in thread
From: David S. Miller @ 2002-05-22 2:57 UTC (permalink / raw)
To: torvalds; +Cc: zippel, linux-kernel
From: Linus Torvalds <torvalds@transmeta.com>
Date: Tue, 21 May 2002 19:40:08 -0700 (PDT)
The early switch would at least on x86 be likely to result in the minimal
amount of TLB flushing theoretically possible. Which I kind of like (if
you can _prove_ that you cannot do better, you're in a good position ;).
Probably on sparc64 too. The simplest way to kill off a TLB context
on sparc64 at exit_mmap() is to just mark it invalid (this means just
clearing the cpu_vm_mask of the mm_struct using that context PID).
It is even simpler than that, at exit_mmap() time we are destroying
the mm_struct anyways, nobody references it, and thus destroy_context
does all of the work.
Unfortunately, today mmdrop() (which is where destroy_context is
invoked) happens after exit_mmap().
Maybe some kind of "switch_from_dead_context()" type of thing?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 2:57 ` Linux-2.5.17 David S. Miller
@ 2002-05-22 3:21 ` Linus Torvalds
2002-05-22 8:06 ` Linux-2.5.17 David Lang
2002-05-22 14:14 ` Linux-2.5.17 Dave McCracken
0 siblings, 2 replies; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 3:21 UTC (permalink / raw)
To: David S. Miller; +Cc: zippel, linux-kernel
On Tue, 21 May 2002, David S. Miller wrote:
>
> Unfortunately, today mmdrop() (which is where destroy_context is
> invoked) happens after exit_mmap().
>
> Maybe some kind of "switch_from_dead_context()" type of thing?
Yes, I was thinking of an extra step like that.
The problem is just finding a _good_ context to switch to. We can do this
two different ways:
- actually doing a real context switch, but with a magic
"schedule_tail()" that ends up being the rest of do_exit(). This is
_really_ hard to get right, and implies that everything after the
context switch has to be non-blocking (since we'd block in the "wrong"
process context at that point).
- my preferred solution: speculatively find _some_ process (preferably
one that we are likely to schedule next), and use that process's
"active_mm" to do a "switch_mm()" into (and set that to "current->mm")
This is kind of like the lazy TLB thing, except going the other way.
The speculative thing has the problem of finding a good process, but I
would suggest something along the lines of:
- take the first process in the run-queue on the current CPU.
- if there is no process on the run-queue, take our parent
The "parent" fallback is nice because (a) we're guaranteed to have a
parent and it is easily found and (b) we're going to wake our parent up
soon enough in "notify_parent()", so if the current runqueue is empty, the
parent is one of the likelier processes to end up there..
But no, I've not looked into the details. We've never stolen a mm from
anybody else before (lazy TLB _gives_ a mm to the next process, it doesn't
take it from anybody), so it might have nasty locking issues or something.
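The selection rule above can be sketched in user space as follows. This is only an illustration under stated assumptions: `toy_task`, `toy_runqueue` and `pick_mm_to_steal()` are hypothetical, the run-queue is modelled as a simple linked list, and skipping the exiting task itself is my assumption. The locking worries mentioned above are not modelled at all.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real kernel structures. */
struct toy_mm { int id; };

struct toy_task {
    struct toy_mm *active_mm;      /* address space the task is using */
    struct toy_task *parent;       /* assumed to be non-NULL for any exiting task */
    struct toy_task *next_on_rq;   /* next runnable task on this CPU, or NULL */
};

struct toy_runqueue { struct toy_task *head; };

/*
 * Pick the mm to switch into while 'exiting' tears itself down:
 * the first other task on this CPU's run-queue, else the parent.
 */
static struct toy_mm *pick_mm_to_steal(const struct toy_runqueue *rq,
                                       const struct toy_task *exiting)
{
    const struct toy_task *t;

    for (t = rq->head; t != NULL; t = t->next_on_rq)
        if (t != exiting)
            return t->active_mm;

    return exiting->parent->active_mm;   /* fallback: we will wake the parent soon */
}
```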
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 3:21 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-22 8:06 ` David Lang
2002-05-22 14:14 ` Linux-2.5.17 Dave McCracken
1 sibling, 0 replies; 50+ messages in thread
From: David Lang @ 2002-05-22 8:06 UTC (permalink / raw)
To: Linus Torvalds; +Cc: David S. Miller, zippel, linux-kernel
what about SMP where you may have multiple children hit this at the same
time on different CPUs?
David Lang
On Tue, 21 May 2002, Linus Torvalds wrote:
> - if there is no process on the run-queue, take our parent
>
> The "parent" fallback is nice because (a) we're guaranteed to have a
> parent and it is easily found and (b) we're going to wake our parent up
> soon enough in "notify_parent()", so if the current runqueue is empty, the
> parent is one of the likelier processes to end up there..
>
> But no, I've not looked into the details. We've never stolen a mm from
> anybody else before (lazy TLB _gives_ a mm to the next process, it doesn't
> take it from anybody), so it might have nasty locking issues or something.
>
> Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 3:21 ` Linux-2.5.17 Linus Torvalds
2002-05-22 8:06 ` Linux-2.5.17 David Lang
@ 2002-05-22 14:14 ` Dave McCracken
2002-05-22 16:10 ` Linux-2.5.17 Linus Torvalds
1 sibling, 1 reply; 50+ messages in thread
From: Dave McCracken @ 2002-05-22 14:14 UTC (permalink / raw)
To: Linus Torvalds, David S. Miller; +Cc: zippel, linux-kernel
--On Tuesday, May 21, 2002 08:21:56 PM -0700 Linus Torvalds
<torvalds@transmeta.com> wrote:
> The problem is just finding a _good_ context to switch to. We can do this
> two different ways:
>
> (...)
>
> - my preferred solution: speculatively find _some_ process (preferably
> one that we are likely to schedule next), and use that process's
> "active_mm" to do a "switch_mm()" into (and set that to "current->mm")
>
> The speculative thing has the problem of finding a good process, but I
> would suggest something along the lines of:
>
> - take the first process in the run-queue on the current CPU.
> - if there is no process on the run-queue, take our parent
What would be the incremental cost of just switching to init_mm? Granted,
it's likely to require switching again when you schedule, but this is the
exit path. It could be a fallback if nothing else looks good.
Dave McCracken
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 0:54 ` Linux-2.5.17 Linus Torvalds
2002-05-22 2:17 ` Linux-2.5.17 David S. Miller
@ 2002-05-22 13:45 ` Roman Zippel
2002-05-22 16:08 ` Linux-2.5.17 Linus Torvalds
1 sibling, 1 reply; 50+ messages in thread
From: Roman Zippel @ 2002-05-22 13:45 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
Hi,
On Tue, 21 May 2002, Linus Torvalds wrote:
> > > x86, to be exact ;(
> >
> > IMO that's not really problem, the pmd tables are created and destroyed
> > with the pgd table.
>
> unmap()?
We already don't let the general vm touch the pgd entries for the same
reason, so I don't think that's really a big problem.
Using the present bit has another consequence. unmap() would have to be done in
two phases:
1. Disable the table entries at the highest possible level. Using the
previous and following vma avoids scanning the tables (something like
free_pgtables already does, only more accurate).
2. Scan the tables and free all the disabled entries. At this point we
don't have to worry about any tlb issues anymore.
I can see a few advantages doing it this way. The first phase could be
quite fast even for large unmaps and so reducing the time holding the
page_table_lock. It avoids the race mentioned by Paul (although a
ptep_clear_present() would still be needed). It would also free up more
unused tables. The tlb shootdown stuff would be simpler as well.
On the other hand it's a rather rough idea and I don't know how feasible
it really is, but without the exit case it should become easier and IMO
worth a try.
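To make the two phases concrete, here is a toy user-space sketch. It is not based on any real kernel code: the `toy_` structures, the single-level "top table" and both functions are hypothetical, and the TLB flush that would sit between the phases is only marked by a comment.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_ENTRIES 4

/* Hypothetical lower-level table; nonzero slots stand for mapped pages. */
struct toy_pte_table { int pages[TOY_ENTRIES]; };

struct toy_top_entry {
    struct toy_pte_table *table;   /* lower-level table, or NULL */
    int present;                   /* cleared in phase 1 */
};

struct toy_top_table { struct toy_top_entry e[TOY_ENTRIES]; };

/* Phase 1: cheaply disable whole top-level entries in [first, last]. */
static int toy_unmap_disable(struct toy_top_table *top, int first, int last)
{
    int disabled = 0;

    for (int i = first; i <= last; i++) {
        if (top->e[i].table != NULL && top->e[i].present) {
            top->e[i].present = 0;
            disabled++;
        }
    }
    /* A real implementation would flush the TLB here, before phase 2. */
    return disabled;
}

/* Phase 2: scan and free everything that phase 1 disabled. */
static int toy_unmap_free(struct toy_top_table *top)
{
    int freed = 0;

    for (int i = 0; i < TOY_ENTRIES; i++) {
        struct toy_top_entry *e = &top->e[i];

        if (e->table != NULL && !e->present) {
            free(e->table);
            e->table = NULL;
            freed++;
        }
    }
    return freed;
}
```

Phase 1 touches only the top-level entries, which is why it could be short even for a large unmap; the slow scan in phase 2 runs after the mappings are already unreachable.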
bye, Roman
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 13:45 ` Linux-2.5.17 Roman Zippel
@ 2002-05-22 16:08 ` Linus Torvalds
0 siblings, 0 replies; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 16:08 UTC (permalink / raw)
To: Roman Zippel; +Cc: Kernel Mailing List
On Wed, 22 May 2002, Roman Zippel wrote:
>
> We already don't let the general vm touch the pgd entries for the same
> reason, so I don't think that's really a big problem.
> Using the present bit has another consequence. unmap() would have to be done in
> two phases:
I don't disagree. Are you interested in trying to write it up? It sounds
like a potentially good idea, with few downsides (but I can imagine some:
it does bad things to threads that just happen to share the same 4M area
for other stuff, and that start getting spurious page faults on another
CPU because _their_ area temporarily went away from under them).
I also suspect that it might simplify the TLB shootdown enough that we
wouldn't _have_ to split out the exit case and could use the shared
zapping. But I'm kind of worried about the potential threading issues.
(Rule of thumb: it's always a bad idea to cut down on parallelism, and
we'll _really_ be up shit creek if some threaded app comes along later
where munmap() ends up serializing threads too much).
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 5:16 Linux-2.5.17 Linus Torvalds
2002-05-21 13:58 ` Linux-2.5.17 Roman Zippel
@ 2002-05-22 10:54 ` Martin Dalecki
2002-05-22 12:04 ` Linux-2.5.17 Alexander Viro
` (2 more replies)
2002-05-22 11:19 ` Linux-2.5.17 Russell King
2002-05-24 13:59 ` Linux-2.5.17 Martin Dalecki
3 siblings, 3 replies; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 10:54 UTC (permalink / raw)
To: jack; +Cc: Linus Torvalds, Kernel Mailing List
Linus Torvalds wrote:
>
> Summary of changes from v2.5.16 to v2.5.17
> ============================================
>
> <jack@suse.cz>
> o quota-1-newlocks
> o quota-2-formats
> o quota-3-register
> o quota-4-getstats
> o quota-5-space
> o quota-6-bytes
> o quota-7-quotactl
> o quota-8-format1
> o quota-9-format2
> o quota-10-inttype
> o quota-11-sync
> o quota-12-compat
> o quota-13-ioctl
Please put the following crap under /proc/sys/fs,
where it belongs. OK?
[root@kozaczek fs]# pwd
/proc/fs
[root@kozaczek fs]# cat quota
Version 60501
Formats
0 0 0 0 0 0 0 8
[root@kozaczek fs]#
Or are you going to reinvent just another
case of /proc/ formatting compatibility problems?!
And the requirement to have /proc mounted for quota usage?!
I hate /proc/my/random/sandbox/becouse/I/dont/knwo/unix/and/have/no/taste
interfaces more and more...
(PS. Hah! I found finally someone today who deserves flames! :-).)
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 10:54 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 12:04 ` Alexander Viro
2002-05-22 13:07 ` Linux-2.5.17 Martin Dalecki
2002-05-22 12:14 ` Linux-2.5.17 Russell King
2002-05-22 13:06 ` Linux-2.5.17 Alan Cox
2 siblings, 1 reply; 50+ messages in thread
From: Alexander Viro @ 2002-05-22 12:04 UTC (permalink / raw)
To: Martin Dalecki; +Cc: jack, Linus Torvalds, Kernel Mailing List
On Wed, 22 May 2002, Martin Dalecki wrote:
> Or are you going to reinvent just another
> case of /proc/ formatting compatibility problems?!
> And the requirement to have /proc mounted for quota usage?!
>
> I hate /proc/my/random/sandbox/becouse/I/dont/knwo/unix/and/have/no/taste
> interfaces more and more...
>
> (PS. Hah! I found finally someone today who deserves flames! :-).)
Gives the phrase "finding yourself" a whole new meaning, doesn't it?
Al, deeply PO'd by assorted cretinisms _not_ related to the kernel.
Sigh...
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 12:04 ` Linux-2.5.17 Alexander Viro
@ 2002-05-22 13:07 ` Martin Dalecki
2002-05-22 14:38 ` Linux-2.5.17 Alexander Viro
2002-05-22 16:55 ` Linux-2.5.17 Jan Kara
0 siblings, 2 replies; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 13:07 UTC (permalink / raw)
To: Alexander Viro; +Cc: jack, Linus Torvalds, Kernel Mailing List
Alexander Viro wrote:
>
> On Wed, 22 May 2002, Martin Dalecki wrote:
>
>
>>Or are you going to reinvent just another
>>case of /proc/ formatting compatibility problems?!
>>And the requirement to have /proc mounted for quota usage?!
>>
>>I hate /proc/my/random/sandbox/becouse/I/dont/knwo/unix/and/have/no/taste
>>interfaces more and more...
>>
>>(PS. Hah! I found finally someone today who deserves flames! :-).)
>
>
> Gives the phrase "finding yourself" a whole new meaning, doesn't it?
>
> Al, deeply PO'd by assorted cretinisms _not_ related to the kernel.
> Sigh...
Looking at 2.5.17 I see the following:
-#define QUOTAFILENAME "quota"
-#define QUOTAGROUP "staff"
As usual, what goes into /proc is apparently
random bulls*it, as always. I especially love the assumption about
some group name on a system!
But it gets removed this time. So let's peer where
it gets reintroduced:
Ah... yes, patch-2.5.17, here it is:
+#ifdef CONFIG_PROC_FS
+static int read_stats(char *buffer, char **start, off_t offset, int count, int *eof, void *data)
+{
+	int len;
+	struct quota_format_type *actqf;
+
+	dqstats.allocated_dquots = nr_dquots;
+	dqstats.free_dquots = nr_free_dquots;
+
+	len = sprintf(buffer, "Version %u\n", __DQUOT_NUM_VERSION__);
+	len += sprintf(buffer + len, "Formats");
+	lock_kernel();
+	for (actqf = quota_formats; actqf; actqf = actqf->qf_next)
+		len += sprintf(buffer + len, " %u", actqf->qf_fmt_id);
 	unlock_kernel();
-	return ret;
+	len += sprintf(buffer + len, "\n%u %u %u %u %u %u %u %u\n",
+			dqstats.lookups, dqstats.drops,
+			dqstats.reads, dqstats.writes,
+			dqstats.cache_hits, dqstats.allocated_dquots,
+			dqstats.free_dquots, dqstats.syncs);
+
+	if (offset >= len) {
+		*start = buffer;
+		*eof = 1;
+		return 0;
+	}
+	*start = buffer + offset;
+	if ((len -= offset) > count)
+		return count;
+	*eof = 1;
+
+	return len;
+}
+#endif
What can we see in the above:
1. Those are first-grade candidates for read-only sysctl entries, since they
are system-global statistics, which belong in /proc/sys/fs/.
We even already have fs.dquot-nr there! Why the hell not put them
alongside it?
2. Typical string formatting, value copying and termination
problems inherent to string stuff...
3. The futile hope that tools using it will even bother to check the
Version... gtop just *right today* showed that user space programmers
won't care about it, so it gains us literally *nothing*.
If they were sysctl numbers, they would just vanish from beneath the tools if
something changed semantically, and the tools *would have no chance* to get it wrong.
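For reference, this is roughly what a user-space tool has to do to consume the text format shown earlier ("Version", then "Formats", then eight counters). It is a sketch, not code from any real quota tool: `toy_quota_stats` and `parse_quota_text()` are hypothetical names, and the layout is assumed to be exactly what read_stats() above prints.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parsed /proc/fs/quota text. */
struct toy_quota_stats {
    unsigned version;    /* value from the "Version" line */
    unsigned stats[8];   /* lookups, drops, reads, writes, cache_hits,
                            allocated_dquots, free_dquots, syncs */
};

/* Returns 0 on success, -1 if the text does not have the expected shape. */
static int parse_quota_text(const char *text, struct toy_quota_stats *out)
{
    const char *line;
    unsigned *s = out->stats;

    if (sscanf(text, "Version %u", &out->version) != 1)
        return -1;

    /* Skip the "Formats ..." line, whatever ids it lists. */
    line = strstr(text, "\nFormats");
    if (line == NULL)
        return -1;
    line = strchr(line + 1, '\n');
    if (line == NULL)
        return -1;

    if (sscanf(line + 1, "%u %u %u %u %u %u %u %u",
               &s[0], &s[1], &s[2], &s[3], &s[4], &s[5], &s[6], &s[7]) != 8)
        return -1;

    return 0;
}
```

Any change to the line order or field count breaks such a parser silently unless the tool also checks the version, which is the compatibility worry raised above.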
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 13:07 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 14:38 ` Alexander Viro
2002-05-22 13:42 ` Linux-2.5.17 Martin Dalecki
2002-05-22 16:55 ` Linux-2.5.17 Jan Kara
1 sibling, 1 reply; 50+ messages in thread
From: Alexander Viro @ 2002-05-22 14:38 UTC (permalink / raw)
To: Martin Dalecki; +Cc: jack, Linus Torvalds, Kernel Mailing List
On Wed, 22 May 2002, Martin Dalecki wrote:
> 2. Typical string formatting and value copy and termination
> problems inherent to string stuff...
s/inherent to/inherent to incompetently written/
BTW, quoted code should've used seq_file helpers - that would both
cut the code size way down and fix the damn thing.
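The seq_file helpers are a kernel facility and are not reproduced here. As a loose user-space analogy of why a bounded append helper removes the overflow and termination worries of chained sprintf() calls, consider the sketch below; `toy_buf` and `toy_append()` are hypothetical names and are not the seq_file API.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical bounded output buffer; 'size' must be at least 1. */
struct toy_buf {
    char *data;
    size_t size;     /* total capacity, including the terminating NUL */
    size_t len;      /* bytes used so far, always < size */
    int overflow;    /* set once an append did not fit */
};

/* Append formatted text, never writing past the end of the buffer. */
static void toy_append(struct toy_buf *b, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (b->overflow)
        return;

    va_start(ap, fmt);
    n = vsnprintf(b->data + b->len, b->size - b->len, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= b->size - b->len) {
        b->overflow = 1;
        b->data[b->len] = '\0';   /* drop the partial record */
        return;
    }
    b->len += (size_t)n;
}
```

The caller formats freely and checks `overflow` once at the end, instead of tracking `len` by hand after every call.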
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 14:38 ` Linux-2.5.17 Alexander Viro
@ 2002-05-22 13:42 ` Martin Dalecki
0 siblings, 0 replies; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 13:42 UTC (permalink / raw)
To: Alexander Viro; +Cc: jack, Linus Torvalds, Kernel Mailing List
Alexander Viro wrote:
>
> On Wed, 22 May 2002, Martin Dalecki wrote:
>
>
>>2. Typical string formatting and value copy and termination
>> problems inherent to string stuff...
>
>
> s/inherent to/inherent to incompetently written/
>
> BTW, quoted code should've used seq_file helpers - that would both
> cut the code size way down and fix the damn thing.
Ah... I think I will just provide the step toward
/proc/sys/fs. Code talks best in this case, I think.
jack, would you mind?
Are there any userland tool issues I should keep an eye
on?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 13:07 ` Linux-2.5.17 Martin Dalecki
2002-05-22 14:38 ` Linux-2.5.17 Alexander Viro
@ 2002-05-22 16:55 ` Jan Kara
1 sibling, 0 replies; 50+ messages in thread
From: Jan Kara @ 2002-05-22 16:55 UTC (permalink / raw)
To: Martin Dalecki; +Cc: Alexander Viro, Linus Torvalds, Kernel Mailing List
Hello,
> Alexander Viro wrote:
> >
> >On Wed, 22 May 2002, Martin Dalecki wrote:
> >
> >
> >>Or are you going to reinvent just another
> >>case of /proc/ formatting compatibility problems?!
> >>And the requirement to have /proc mounted for quota usage?!
> >>
> >>I hate /proc/my/random/sandbox/becouse/I/dont/knwo/unix/and/have/no/taste
> >>interfaces more and more...
> >>
> >>(PS. Hah! I found finally someone today who deserves flames! :-).)
> >
> >
> >Gives the phrase "finding yourself" a whole new meaning, doesn't it?
> >
> >Al, deeply PO'd by assorted cretinisms _not_ related to the kernel.
> >Sigh...
>
> Looking at 2.5.17 I see the following:
>
> -#define QUOTAFILENAME "quota"
> -#define QUOTAGROUP "staff"
>
>
> As usual, what goes into /proc is apparently
> random bulls*it, as always. I especially love the assumption about
> some group name on a system!
> But it gets removed this time. So let's peer where
> it gets reintroduced:
gets reintroduced? I think I removed QUOTAGROUP forever...
> Ah... yes, patch-2.5.17, here it is:
>
> +#ifdef CONFIG_PROC_FS
> +static int read_stats(char *buffer, char **start, off_t offset, int count,
> int *eof, void *data)
> +{
> +
<snip>
> return len;
> +}
> +#endif
>
> What can we see in the above:
>
> 1. Those are first-grade candidates for read-only sysctl entries, since they
> are system-global statistics, which belong in /proc/sys/fs/.
> We even already have fs.dquot-nr there! Why the hell not put them
> alongside it?
>
> 2. Typical string formatting, value copying and termination
> problems inherent to string stuff...
I agree that the proc code isn't good (maybe you missed the mail from
Christoph Hellwig and my answer to it...) and should be replaced.
> 3. The futile hope that tools using it will even bother to check the
> Version... gtop just *right today* showed that user space programmers
> won't care about it, so it gains us literally *nothing*.
The hope isn't futile, I think. At least the quota tools (which are
IMHO the most interesting) check the version and warn the user
about a too-new kernel.
> If they were sysctl numbers, they would just vanish from beneath the tools if
> something changed semantically, and the tools *would have no chance* to get it wrong.
The version isn't there only for format of that quota file in proc.
It's *mainly* used for detection of kernel interface to use. Previously
tools had to try a few quotactl()s and from their results they had to
guess the quota format etc. With version somewhere it's a bit easier...
Looking forward to next flame from you ;)
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 10:54 ` Linux-2.5.17 Martin Dalecki
2002-05-22 12:04 ` Linux-2.5.17 Alexander Viro
@ 2002-05-22 12:14 ` Russell King
2002-05-22 12:36 ` Linux-2.5.17 Martin Dalecki
2002-05-22 16:02 ` Linux-2.5.17 Linus Torvalds
2002-05-22 13:06 ` Linux-2.5.17 Alan Cox
2 siblings, 2 replies; 50+ messages in thread
From: Russell King @ 2002-05-22 12:14 UTC (permalink / raw)
To: Martin Dalecki; +Cc: jack, Linus Torvalds, Kernel Mailing List
On Wed, May 22, 2002 at 12:54:15PM +0200, Martin Dalecki wrote:
> Please put the following crap under /proc/sys/fs,
> where it belongs. OK?
/proc/sys is for sysctls, not random proc junk. Therefore, putting the
random crap you point out that's currently in /proc/fs in /proc/sys/fs:
> [root@kozaczek fs]# pwd
> /proc/fs
> [root@kozaczek fs]# cat quota
> Version 60501
> Formats
> 0 0 0 0 0 0 0 8
> [root@kozaczek fs]#
is even worse.
/proc/sys has a clean and clear purpose.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 12:14 ` Linux-2.5.17 Russell King
@ 2002-05-22 12:36 ` Martin Dalecki
2002-05-22 16:02 ` Linux-2.5.17 Linus Torvalds
1 sibling, 0 replies; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 12:36 UTC (permalink / raw)
To: Russell King; +Cc: jack, Linus Torvalds, Kernel Mailing List
Russell King wrote:
> On Wed, May 22, 2002 at 12:54:15PM +0200, Martin Dalecki wrote:
>
>>Please put the following crap under /proc/sys/fs,
>>where it belongs. OK?
>
>
> /proc/sys is for sysctls, not random proc junk. Therefore, putting the
> random crap you point out that's currently in /proc/fs in /proc/sys/fs:
>
>
>>[root@kozaczek fs]# pwd
>>/proc/fs
>>[root@kozaczek fs]# cat quota
>>Version 60501
>>Formats
>>0 0 0 0 0 0 0 8
>>[root@kozaczek fs]#
>
>
> is even worse.
>
> /proc/sys has a clean and clear purpose.
sysctl is for adjusting global system parameters.
So apparently it's even worse, because the above
doesn't even serve this purpose?
I thought 0 0 0 0 0 0 8 were random configuration parameters.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 12:14 ` Linux-2.5.17 Russell King
2002-05-22 12:36 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 16:02 ` Linus Torvalds
2002-05-22 15:04 ` Linux-2.5.17 Martin Dalecki
1 sibling, 1 reply; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 16:02 UTC (permalink / raw)
To: Russell King; +Cc: Martin Dalecki, jack, Kernel Mailing List
On Wed, 22 May 2002, Russell King wrote:
>
> /proc/sys has a clean and clear purpose.
Yes, but it _would_ be good to make the quota stuff use the existing
helper functions to make it much cleaner.
And some of those helper functions are definitely from sysctl's: splitting
up the quota file into multiple sysctls (_and_ moving it to /proc/sys/fs)
sounds like a good idea to me.
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 16:02 ` Linux-2.5.17 Linus Torvalds
@ 2002-05-22 15:04 ` Martin Dalecki
2002-05-22 16:58 ` Linux-2.5.17 Jan Kara
0 siblings, 1 reply; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 15:04 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Russell King, jack, Kernel Mailing List
Linus Torvalds wrote:
>
> On Wed, 22 May 2002, Russell King wrote:
>
>>/proc/sys has a clean and clear purpose.
>
>
> Yes, but it _would_ be good to make the quota stuff use the existing
> helper functions to make it much cleaner.
>
> And some of those helper functions are definitely from sysctl's: splitting
> up the quota file into multiple sysctls (_and_ moving it to /proc/sys/fs)
> sounds like a good idea to me.
Well I'm actually coding this right now :-).
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 15:04 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 16:58 ` Jan Kara
2002-05-22 16:08 ` Linux-2.5.17 Martin Dalecki
0 siblings, 1 reply; 50+ messages in thread
From: Jan Kara @ 2002-05-22 16:58 UTC (permalink / raw)
To: Martin Dalecki; +Cc: Linus Torvalds, Russell King, jack, Kernel Mailing List
> Linus Torvalds wrote:
> >
> >On Wed, 22 May 2002, Russell King wrote:
> >
> >>/proc/sys has a clean and clear purpose.
> >
> >
> >Yes, but it _would_ be good to make the quota stuff use the existing
> >helper functions to make it much cleaner.
> >
> >And some of those helper functions are definitely from sysctl's: splitting
> >up the quota file into multiple sysctls (_and_ moving it to /proc/sys/fs)
> >sounds like a good idea to me.
>
> Well I'm actually coding this right now :-).
Thanks. I'll update the quota tools to use your new files if you send me
the new layout of the interface...
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 16:58 ` Linux-2.5.17 Jan Kara
@ 2002-05-22 16:08 ` Martin Dalecki
2002-05-22 17:56 ` Linux-2.5.17 Jan Kara
0 siblings, 1 reply; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 16:08 UTC (permalink / raw)
To: Jan Kara; +Cc: Linus Torvalds, Russell King, Kernel Mailing List
Jan Kara wrote:
>>Linus Torvalds wrote:
>>
>>>On Wed, 22 May 2002, Russell King wrote:
>>>
>>>
>>>>/proc/sys has a clean and clear purpose.
>>>
>>>
>>>Yes, but it _would_ be good to make the quota stuff use the existing
>>>helper functions to make it much cleaner.
>>>
>>>And some of those helper functions are definitely from sysctl's: splitting
>>>up the quota file into multiple sysctls (_and_ moving it to /proc/sys/fs)
>>>sounds like a good idea to me.
>>
>>Well I'm actually coding this right now :-).
>
> Thanks. I'll update quota tools to use your new files if you send me
> new layout of interface...
I'm not ready right now but...
Well actually I went the cheapest way possible:
Here is the layout of the /proc/sys/fs/dqstats array:
/*
 * Statistics about disc quota.
 */
enum {
	DQSTATS_LOOKUPS,
	DQSTATS_DROPS,
	DQSTATS_READS,
	DQSTATS_WRITES,
	DQSTATS_CACHE_HITS,
	DQSTATS_ALLOCATED,	// formerly known as nr_dquots inside kernel.
	DQSTATS_FREE,		// formerly known as nr_free_dquots inside kernel.
	DQSTATS_SYNCS,
	DQSTATS_SIZE
};
extern __u32 dqstats_array[DQSTATS_SIZE];
And here is the allocated sysctl id number:
FS_DQSTATS=16, /* int: disc quota usage statistics */
All of this appears under:
static ctl_table fs_table[] = {
	{FS_DQSTATS, "dqstats", dqstats_array, sizeof(dqstats_array), 0444, NULL, &proc_dointvec},
	{},
};
inside /proc/sys/fs/dqstats
I don't think the particular fields are subject to change soon,
so I went for the array.
If they are, please feel free to complain :-).
Switch over to sysctl() and see the client code
melting down :-).
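As an illustration of how small the client side could get, here is a user-space sketch that parses the text a reader would get from the proposed /proc/sys/fs/dqstats file. Assumptions: the file exists as proposed above (it is only a proposal in this thread), its content is a single line of whitespace-separated decimal integers in enum order, and `parse_dqstats()` is a hypothetical helper, not part of any quota tool.

```c
#include <stdlib.h>

/* Index layout copied from the proposal above. */
enum {
    DQSTATS_LOOKUPS,
    DQSTATS_DROPS,
    DQSTATS_READS,
    DQSTATS_WRITES,
    DQSTATS_CACHE_HITS,
    DQSTATS_ALLOCATED,
    DQSTATS_FREE,
    DQSTATS_SYNCS,
    DQSTATS_SIZE
};

/*
 * Parse DQSTATS_SIZE whitespace-separated integers from 'text'.
 * Returns 0 on success, -1 if fewer fields than expected are present.
 */
static int parse_dqstats(const char *text, unsigned long out[DQSTATS_SIZE])
{
    char *end;

    for (int i = 0; i < DQSTATS_SIZE; i++) {
        out[i] = strtoul(text, &end, 10);
        if (end == text)
            return -1;      /* no digits found where a field was expected */
        text = end;
    }
    return 0;
}
```

A tool would read the file into a buffer, call `parse_dqstats()`, and index the result with the enum names, for example `out[DQSTATS_SYNCS]`.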
BTW: Since I already got my "required flame dose" for today, I would
rather like to express that the rest of the new quota
handling code is, well, quite nice, IMHO of course :-).
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 16:08 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 17:56 ` Jan Kara
2002-05-22 16:56 ` Linux-2.5.17 Martin Dalecki
0 siblings, 1 reply; 50+ messages in thread
From: Jan Kara @ 2002-05-22 17:56 UTC (permalink / raw)
To: Martin Dalecki; +Cc: Linus Torvalds, Russell King, Kernel Mailing List
> Jan Kara wrote:
> >>Linus Torvalds wrote:
> >>
> >>>On Wed, 22 May 2002, Russell King wrote:
> >>>
> >>>
> >>>>/proc/sys has a clean and clear purpose.
> >>>
> >>>
> >>>Yes, but it _would_ be good to make the quota stuff use the existing
> >>>helper functions to make it much cleaner.
> >>>
> >>>And some of those helper functions are definitely from sysctl's:
> >>>splitting
> >>>up the quota file into multiple sysctls (_and_ moving it to /proc/sys/fs)
> >>>sounds like a good idea to me.
> >>
> >>Well I'm actually coding this right now :-).
> >
> > Thanks. I'll update quota tools to use your new files if you send me
> >new layout of interface...
>
> I'm not ready right now but...
> Well actually I went the cheapest way possible:
>
>
> Here is the layout of the /proc/sys/fs/dquotas array:
>
> /*
> * Statistics about disc quota.
> */
> enum {
> DQSTATS_LOOKUPS,
> DQSTATS_DROPS,
> DQSTATS_READS,
> DQSTATS_WRITES,
> DQSTATS_CACHE_HITS,
> DQSTATS_ALLOCATED, // formerly known as nr_dquots inside kernel.
> DQSTATS_FREE, // formerly known as nr_free_dquots inside kernel.
> DQSTATS_SYNCS,
> DQSTATS_SIZE
> };
>
> extern __u32 dqstats_array[DQSTATS_SIZE];
>
> And here is the allocated sysctl id number:
>
> FS_DQSTATS=16, /* int: disc quota usage statistics */
>
> All of this appears under:
>
> static ctl_table fs_table[] = {
> {FS_DQSTATS, "dqstats", dqstats_array, sizeof(dqstats_array), 0444,
> NULL, &proc_dointvec},
> {},
> };
>
> inside /proc/sys/fs/dqstats
>
> I don't think the particular fields are subject to change soon,
> so I went for the array.
> If they are, please feel free to complain :-).
> Switch over to sysctl() and see the client code
> melting down :-).
The array is OK (I don't expect any changes in the statistics either).
I'd just like to have the 'version' and 'formats' fields somewhere.
Otherwise it's rather hard for the quota tools to recognize the quota
interface...
> BTW: Since I already got my "required flame dose" for today, I would
> rather like to express that the rest of the new quota
> handling code is, well, quite nice, IMHO of course :-).
Thanks :).
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 17:56 ` Linux-2.5.17 Jan Kara
@ 2002-05-22 16:56 ` Martin Dalecki
2002-05-22 18:17 ` Linux-2.5.17 Jan Kara
0 siblings, 1 reply; 50+ messages in thread
From: Martin Dalecki @ 2002-05-22 16:56 UTC (permalink / raw)
To: Jan Kara; +Cc: Linus Torvalds, Russell King, Kernel Mailing List
Jan Kara wrote:
>>Jan Kara wrote:
>>
>>>>Linus Torvalds wrote:
>>>>
>>>>
>>>>>On Wed, 22 May 2002, Russell King wrote:
>>>>>
>>>>>
>>>>>
>>>>>>/proc/sys has a clean and clear purpose.
>>>>>
>>>>>
>>>>>Yes, but it _would_ be good to make the quota stuff use the existing
>>>>>helper functions to make it much cleaner.
>>>>>
>>>>>And some of those helper functions are definitely from sysctl's:
>>>>>splitting
>>>>>up the quota file into multiple sysctls (_and_ moving it to /proc/sys/fs)
>>>>>sounds like a good idea to me.
>>>>
>>>>Well I'm actually coding this right now :-).
>>>
>>> Thanks. I'll update quota tools to use your new files if you send me
>>>new layout of interface...
>>
>>I'm not ready right now but...
>>Well actually I went the cheapest way possible:
>>
>>
>>Here is the layout of the /proc/sys/fs/dquotas array:
>>
>>/*
>> * Statistics about disc quota.
>> */
>>enum {
>> DQSTATS_LOOKUPS,
>> DQSTATS_DROPS,
>> DQSTATS_READS,
>> DQSTATS_WRITES,
>> DQSTATS_CACHE_HITS,
>> DQSTATS_ALLOCATED, // formerly known as nr_dquots inside kernel.
>> DQSTATS_FREE, // formerly known as nr_free_dquots inside kernel.
>> DQSTATS_SYNCS,
>> DQSTATS_SIZE
>>};
>>
>>extern __u32 dqstats_array[DQSTATS_SIZE];
>>
>>And here is the allocated sysctl id number:
>>
>> FS_DQSTATS=16, /* int: disc quota usage statistics */
>>
>>All of this appears under:
>>
>>static ctl_table fs_table[] = {
>> {FS_DQSTATS, "dqstats", dqstats_array, sizeof(dqstats_array), 0444,
>> NULL, &proc_dointvec},
>> {},
>>};
>>
>>inside /proc/sys/fs/dqstats
>>
>>I don't think the particular fields are subject to change soon,
>>so I went for the array.
>>If they are, please feel free to complain :-).
>>Switch over to sysctl() and see the client code
>>melting down :-).
>
> The array is OK (I don't expect any changes in statistics too).
> I'd just like to have that 'version' and 'formats' fields somewhere.
> Otherwise it's rather hard for quota tools to recognize quota
> interface...
You have the sysctl id number for this purpose, and the /proc/sys/fs file
name is right now unique. So there is no need for more
treatment here than just trying to stick to what we get once it's there.
The versioning of syscall returns I will just preserve.
Going through sysctl *will be much easier* in code
than an fs lookup of the file above.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 16:56 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 18:17 ` Jan Kara
2002-05-22 18:36 ` Linux-2.5.17 Russell King
0 siblings, 1 reply; 50+ messages in thread
From: Jan Kara @ 2002-05-22 18:17 UTC (permalink / raw)
To: Martin Dalecki
Cc: Jan Kara, Linus Torvalds, Russell King, Kernel Mailing List
> Jan Kara wrote:
> >>Jan Kara wrote:
> >>
> >>>>Linus Torvalds wrote:
> >>>>
> >>>>
> >>>>>On Wed, 22 May 2002, Russell King wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>>/proc/sys has a clean and clear purpose.
> >>>>>
> >>>>>
> >>>>>Yes, but it _would_ be good to make the quota stuff use the existing
> >>>>>helper functions to make it much cleaner.
> >>>>>
> >>>>>And some of those helper functions are definitely from sysctl's:
> >>>>>splitting
> >>>>>up the quota file into multiple sysctls (_and_ moving it to
> >>>>>/proc/sys/fs)
> >>>>>sounds like a good idea to me.
> >>>>
> >>>>Well I'm actually coding this right now :-).
> >>>
> >>>Thanks. I'll update quota tools to use your new files if you send me
> >>>new layout of interface...
> >>
> >>I'm not ready right now but...
> >>Well actually I went the cheapest way possible:
> >>
> >>
> >>Here is the layout of the /proc/sys/fs/dquotas array:
> >>
> >>/*
> >>* Statistics about disc quota.
> >>*/
> >>enum {
> >> DQSTATS_LOOKUPS,
> >> DQSTATS_DROPS,
> >> DQSTATS_READS,
> >> DQSTATS_WRITES,
> >> DQSTATS_CACHE_HITS,
> >> DQSTATS_ALLOCATED, // formerly known as nr_dquots inside kernel.
> >> DQSTATS_FREE, // formerly known as nr_free_dquots inside kernel.
> >> DQSTATS_SYNCS,
> >> DQSTATS_SIZE
> >>};
> >>
> >>extern __u32 dqstats_array[DQSTATS_SIZE];
> >>
> >>And here is the allocated sysctl id number:
> >>
> >> FS_DQSTATS=16, /* int: disc quota usage statistics */
> >>
> >>All of this appears under:
> >>
> >>static ctl_table fs_table[] = {
> >> {FS_DQSTATS, "dqstats", dqstats_array, sizeof(dqstats_array), 0444,
> >> NULL, &proc_dointvec},
> >> {},
> >>};
> >>
> >>inside /proc/sys/fs/dqstats
> >>
> >>I don't think the particular fields are subject to change soon
> >>so I went for the array.
> >>If yes - please feel rather free to complain :-).
> >>Switch over to sysctl() and see the client code
> >>melting down :-).
> >
> > The array is OK (I don't expect any changes in statistics either).
> >I'd just like to have those 'version' and 'formats' fields somewhere.
> >Otherwise it's rather hard for quota tools to recognize quota
> >interface...
>
> You have the sysctl id number for this purpose and the /proc/sys/fs file
> name is right now unique. So there is no need for more
> treatment here than just trying to stick to what we get once it's there.
> The versioning of syscall returns I will just preserve.
>
> Going through sysctl *will be much easier* in code
> than fs lookup of the file above.
OK. You convinced me that 'version' isn't needed. But how about
'formats'? Currently quotaon(8) uses this field to check which format it
should try to turn on... I can live without it as quotaon(8) might try
new format and if it doesn't succeed it will try the old one but
anyway...
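For reference, a minimal sketch of how a quota tool might read the array discussed above, assuming the proposed /proc/sys/fs/dqstats file exists and prints the eight counters as whitespace-separated decimal numbers on one line (the usual proc_dointvec output). The helper names parse_dqstats() and read_dqstats() are hypothetical and not part of any quota tool; the field order is copied from the quoted enum.

```c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

/* Field order copied from the enum quoted in this thread. */
enum {
	DQSTATS_LOOKUPS,
	DQSTATS_DROPS,
	DQSTATS_READS,
	DQSTATS_WRITES,
	DQSTATS_CACHE_HITS,
	DQSTATS_ALLOCATED,
	DQSTATS_FREE,
	DQSTATS_SYNCS,
	DQSTATS_SIZE
};

/*
 * Hypothetical helper: parse whitespace-separated decimal numbers
 * from text into out[]. Returns how many fields were parsed, which
 * is at most DQSTATS_SIZE.
 */
static int parse_dqstats(const char *text, unsigned long out[DQSTATS_SIZE])
{
	int n = 0;

	while (n < DQSTATS_SIZE) {
		char *end;
		unsigned long v;

		errno = 0;
		v = strtoul(text, &end, 10);
		if (end == text || errno != 0)
			break;	/* no further number, or out of range */
		out[n++] = v;
		text = end;
	}
	return n;
}

/*
 * Hypothetical helper: read the first line of path and parse it.
 * Returns the number of fields parsed, or -1 if the file cannot
 * be opened or read.
 */
static int read_dqstats(const char *path, unsigned long out[DQSTATS_SIZE])
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_dqstats(buf, out);
}
```

A caller would pass "/proc/sys/fs/dqstats" to read_dqstats() and treat any return value other than DQSTATS_SIZE as "interface not recognized", which is roughly the try-and-fall-back behaviour described for quotaon(8).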
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 18:17 ` Linux-2.5.17 Jan Kara
@ 2002-05-22 18:36 ` Russell King
0 siblings, 0 replies; 50+ messages in thread
From: Russell King @ 2002-05-22 18:36 UTC (permalink / raw)
To: Jan Kara; +Cc: Kernel Mailing List
On Wed, May 22, 2002 at 08:17:53PM +0200, Jan Kara wrote:
> OK. You convinced me that 'version' isn't needed. But how about
> 'formats'? Currently quotaon(8) uses this field to check which format it
> should try to turn on... I can live without it as quotaon(8) might try
> new format and if it doesn't succeed it will try the old one but
> anyway...
Each sysctl file is only supposed to carry one bit of information - one
number, one string. There should be no formatting of data.
Have a look at /proc/sys/net/ipv4/* as an example.
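To illustrate the one-value-per-file convention, here is a rough sketch of a reader that accepts a file only if it holds exactly one decimal number. The helper names parse_single_value() and read_sysctl_long() are made up for this example; /proc/sys/net/ipv4/ip_forward is one real file that follows the convention.

```c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <ctype.h>

/*
 * Hypothetical helper: succeed only if text holds exactly one
 * decimal number, optionally followed by whitespace.
 * Returns 0 and stores the number in *value, or -1 otherwise.
 */
static int parse_single_value(const char *text, long *value)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(text, &end, 10);
	if (end == text || errno != 0)
		return -1;	/* not a number, or out of range */
	while (isspace((unsigned char)*end))
		end++;
	if (*end != '\0')
		return -1;	/* more than one item in the file */
	*value = v;
	return 0;
}

/*
 * Hypothetical helper: read the first line of path and parse it
 * as a single value. Returns 0 on success, -1 on any failure.
 */
static int read_sysctl_long(const char *path, long *value)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_single_value(buf, value);
}
```

With this shape, a multi-field line such as the "0 0 0 0 0 0 0 8" from /proc/fs/quota is rejected rather than silently truncated, which is the kind of formatting problem the one-value rule avoids.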
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 10:54 ` Linux-2.5.17 Martin Dalecki
2002-05-22 12:04 ` Linux-2.5.17 Alexander Viro
2002-05-22 12:14 ` Linux-2.5.17 Russell King
@ 2002-05-22 13:06 ` Alan Cox
2 siblings, 0 replies; 50+ messages in thread
From: Alan Cox @ 2002-05-22 13:06 UTC (permalink / raw)
To: Martin Dalecki; +Cc: jack, Linus Torvalds, Kernel Mailing List
> Please put the following crap under /proc/sys/fs,
> where it belongs. OK?
>
> [root@kozaczek fs]# pwd
> /proc/fs
> [root@kozaczek fs]# cat quota
> Version 60501
> Formats
> 0 0 0 0 0 0 0 8
> [root@kozaczek fs]#
>
> Or are you going to reinvent just another
> case of /proc/ formatting compatibility problems?!
> And the requirement to have /proc mounted for quota usage?!
/proc/sys/ is sysctl space.
/proc/sys/fs/quota/version
/proc/sys/fs/quota/format/0,1,2,3..
maybe
> (PS. Hah! I found finally someone today who deserves flames! :-).)
You looked in the mirror ?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 5:16 Linux-2.5.17 Linus Torvalds
2002-05-21 13:58 ` Linux-2.5.17 Roman Zippel
2002-05-22 10:54 ` Linux-2.5.17 Martin Dalecki
@ 2002-05-22 11:19 ` Russell King
2002-05-22 11:27 ` Linux-2.5.17 David S. Miller
2002-05-22 16:23 ` Linux-2.5.17 Linus Torvalds
2002-05-24 13:59 ` Linux-2.5.17 Martin Dalecki
3 siblings, 2 replies; 50+ messages in thread
From: Russell King @ 2002-05-22 11:19 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
On Mon, May 20, 2002 at 10:16:35PM -0700, Linus Torvalds wrote:
> Various FS updates (including merges of quota and iget_locked), and
> Makefile cleanups from Kai.
>
> And yet more TLB shootdown stuff.
We seem to have inconsistent cache handling in the new TLB shootdown stuff.
Or maybe it's just my misunderstanding of what's going on; whatever it is,
the new TLB shootdown stuff appears to be quite messy.
Let's look at the flow of the 3 places where tlb_gather_mmu is used:
zap_page_range      unmap_region        exit_mmap
flush_cache_range   tlb_gather_mmu      tlb_gather_mmu
tlb_gather_mmu      unmap_page_range    flush_cache_mm
unmap_page_range    free_pgtables       unmap_page_range
tlb_finish_mmu      tlb_finish_mmu      clear_page_tables
                                        tlb_finish_mmu
So we have 3 different functions, 2 different orders of gather_mmu
and cache handling, and one with no cache handling whatsoever.
I think we have two options - either leave the cache handling up to
tlb_start_vma() (in which case, flush_cache_range and flush_cache_mm
are redundant and should be removed) or let it be up to the caller
of tlb_gather_mmu to call the right cache handling function.
I think which option fits is actually function dependent: in zap_page_range,
we're only removing one vma. In exit_mmap, we're removing all vmas.
In unmap_region, we're removing an unspecified number of vmas.
Depending on which option we choose, we'll either end up calling
flush_cache_range() many times, or flush_cache_mm() and flushing
the cache for a munmap of a small area.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 11:19 ` Linux-2.5.17 Russell King
@ 2002-05-22 11:27 ` David S. Miller
2002-05-22 16:23 ` Linux-2.5.17 Linus Torvalds
1 sibling, 0 replies; 50+ messages in thread
From: David S. Miller @ 2002-05-22 11:27 UTC (permalink / raw)
To: rmk; +Cc: torvalds, linux-kernel
From: Russell King <rmk@arm.linux.org.uk>
Date: Wed, 22 May 2002 12:19:29 +0100
So we have 3 different functions, 2 different orders of gather_mmu
and cache handling, and one with no cache handling whatsoever.
I think we have two options - either leave the cache handling up to
tlb_start_vma() (in which case, flush_cache_range and flush_cache_mm
are redundant and should be removed) or let it be up to the caller
of tlb_gather_mmu to call the right cache handling function.
We're very much in agreement with you; that is why we are
still hashing out how to make this thing as optimal as
possible.
The idea currently is that the tlb_vma_{start,end}() handle
cache and tlb flush respectively.
Ignore the exit_mmap() case, it will be optimized to shreds :-)
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-22 11:19 ` Linux-2.5.17 Russell King
2002-05-22 11:27 ` Linux-2.5.17 David S. Miller
@ 2002-05-22 16:23 ` Linus Torvalds
1 sibling, 0 replies; 50+ messages in thread
From: Linus Torvalds @ 2002-05-22 16:23 UTC (permalink / raw)
To: Russell King; +Cc: Kernel Mailing List
On Wed, 22 May 2002, Russell King wrote:
>
> We seem to have inconsistent cache handling in the new TLB shootdown stuff.
Not surprising - I've worried only about changing the TLB architecture on
x86, where the caches do not matter.
> I think we have two options - either leave the cache handling up to
> tlb_start_vma() (in which case, flush_cache_range and flush_cache_mm
> are redundant and should be removed) or let it be up to the caller
> of tlb_gather_mmu to call the right cache handling function.
I think I'd prefer the "let the tlb functions handle caches too" approach.
For many architectures, that means "tlb_start/end_vma()". Others can do it
in "tlb_remove_tlb_entry()".
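The ordering this implies can be sketched with a small user-space mock. Nothing below is kernel code: every mock_* function, struct mock_vma and the event log are stand-ins invented for illustration, and the split "cache flush on start, TLB flush on end" is an assumption taken from the description earlier in this thread.

```c
#include <stdio.h>

/* User-space mock: every name here is a stand-in, not a kernel definition. */
#define MAX_EVENTS 32

static const char *events[MAX_EVENTS];
static int nr_events;

struct mock_vma {
	unsigned long start, end;
};

/* Append one step to the event log. */
static void record(const char *what)
{
	if (nr_events < MAX_EVENTS)
		events[nr_events++] = what;
}

static void mock_tlb_gather_mmu(void) { record("tlb_gather_mmu"); }
static void mock_tlb_finish_mmu(void) { record("tlb_finish_mmu"); }

/* The cache flush happens when a vma is entered ... */
static void mock_tlb_start_vma(const struct mock_vma *vma)
{
	(void)vma;
	record("flush_cache_range");
}

/* ... and the TLB flush when it is left. */
static void mock_tlb_end_vma(const struct mock_vma *vma)
{
	(void)vma;
	record("flush_tlb_range");
}

/* One gather/finish pair around any number of vmas. */
static void mock_unmap_vmas(const struct mock_vma *vmas, int n)
{
	int i;

	mock_tlb_gather_mmu();
	for (i = 0; i < n; i++) {
		mock_tlb_start_vma(&vmas[i]);
		record("unmap_page_range");
		mock_tlb_end_vma(&vmas[i]);
	}
	mock_tlb_finish_mmu();
}

/* Print the recorded call order. */
static void dump_events(void)
{
	int i;

	for (i = 0; i < nr_events; i++)
		printf("%d: %s\n", i, events[i]);
}
```

Calling mock_unmap_vmas() on an array of vmas and then dump_events() shows the callers no longer issuing any cache flush themselves; the per-vma hooks do it, so the three call sites compared earlier in the thread would share one ordering.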
There's another issue: I think we should aim to get rid of the old
"flush_tlb_xxxx()" functions, and aim to rely entirely on the TLB
gathering. vmalloc/vfree might be the one special case (and I suspect
vfree() is going to get a lot slower to make sure it does the right thing
wrt TLB's).
Linus
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Linux-2.5.17
2002-05-21 5:16 Linux-2.5.17 Linus Torvalds
` (2 preceding siblings ...)
2002-05-22 11:19 ` Linux-2.5.17 Russell King
@ 2002-05-24 13:59 ` Martin Dalecki
3 siblings, 0 replies; 50+ messages in thread
From: Martin Dalecki @ 2002-05-24 13:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 1378 bytes --]
Thu May 23 14:37:50 CEST 2002 ide-clean-70
- Apply host chip driver cleanups by Bartlomiej Zolnierkiewicz.
- Take the draft device type driver implementation from Adam Richter and make
it actually work with some of the drivers we have at hand. Quite a lot
of it was fixed by me as well to have the desired effects.
We have added an attach method for the sub device type drivers to make it
possible for sub device type drivers to attach devices to the overall
infrastructure. UNIX has something like this, and the SCSI code is
implementing something like this; just for some unknown reason the Linux
block device operations don't have it...
- ide_drive_t is finally gone. Please use struct ata_device instead.
Hint: the ide.h specific byte type should go over time as well, since there
is no need to invent something already handled by the kernel. Please use
the unambiguous u8 type instead where possible.
- Add a bit of documentation about cabling issues. ide.txt needs a lot of
improvement at some time still.
Well, the fact that this is collecting many bits from different people
again makes this patch unfortunately rather big. I compress it
therefore, since sending the patches to lkml turned out to be successful
in attracting more people to the overall effort.
Once again I would like to express my many thanks to all of those who are
involved!
[-- Attachment #2: ide-clean-70.diff.gz --]
[-- Type: application/x-gzip, Size: 22327 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2002-05-24 15:03 UTC | newest]
Thread overview: 50+ messages
2002-05-21 18:52 Linux-2.5.17 Wayne.Brown
2002-05-21 21:30 ` Linux-2.5.17 David S. Miller
2002-05-22 7:36 ` Linux-2.5.17 Helge Hafting
-- strict thread matches above, loose matches on Subject: below --
2002-05-22 20:00 Linux-2.5.17 Wayne.Brown
2002-05-23 12:17 ` Linux-2.5.17 Nick Holloway
2002-05-22 3:02 Linux-2.5.17 Wayne.Brown
2002-05-22 7:12 ` Linux-2.5.17 Zwane Mwaikambo
2002-05-22 11:49 ` Linux-2.5.17 Nick Holloway
2002-05-21 23:20 Linux-2.5.17 Wayne.Brown
2002-05-21 23:29 ` Linux-2.5.17 Russell King
2002-05-21 23:33 ` Linux-2.5.17 Joel Jaeggli
2002-05-21 5:16 Linux-2.5.17 Linus Torvalds
2002-05-21 13:58 ` Linux-2.5.17 Roman Zippel
2002-05-21 16:06 ` Linux-2.5.17 Linus Torvalds
2002-05-21 18:36 ` Linux-2.5.17 Roman Zippel
2002-05-21 18:53 ` Linux-2.5.17 Linus Torvalds
2002-05-21 23:35 ` Linux-2.5.17 Roman Zippel
2002-05-22 0:10 ` Linux-2.5.17 Linus Torvalds
2002-05-22 0:31 ` Linux-2.5.17 Roman Zippel
2002-05-22 0:54 ` Linux-2.5.17 Linus Torvalds
2002-05-22 2:17 ` Linux-2.5.17 David S. Miller
2002-05-22 2:40 ` Linux-2.5.17 Linus Torvalds
2002-05-22 2:57 ` Linux-2.5.17 David S. Miller
2002-05-22 3:21 ` Linux-2.5.17 Linus Torvalds
2002-05-22 8:06 ` Linux-2.5.17 David Lang
2002-05-22 14:14 ` Linux-2.5.17 Dave McCracken
2002-05-22 16:10 ` Linux-2.5.17 Linus Torvalds
2002-05-22 13:45 ` Linux-2.5.17 Roman Zippel
2002-05-22 16:08 ` Linux-2.5.17 Linus Torvalds
2002-05-22 10:54 ` Linux-2.5.17 Martin Dalecki
2002-05-22 12:04 ` Linux-2.5.17 Alexander Viro
2002-05-22 13:07 ` Linux-2.5.17 Martin Dalecki
2002-05-22 14:38 ` Linux-2.5.17 Alexander Viro
2002-05-22 13:42 ` Linux-2.5.17 Martin Dalecki
2002-05-22 16:55 ` Linux-2.5.17 Jan Kara
2002-05-22 12:14 ` Linux-2.5.17 Russell King
2002-05-22 12:36 ` Linux-2.5.17 Martin Dalecki
2002-05-22 16:02 ` Linux-2.5.17 Linus Torvalds
2002-05-22 15:04 ` Linux-2.5.17 Martin Dalecki
2002-05-22 16:58 ` Linux-2.5.17 Jan Kara
2002-05-22 16:08 ` Linux-2.5.17 Martin Dalecki
2002-05-22 17:56 ` Linux-2.5.17 Jan Kara
2002-05-22 16:56 ` Linux-2.5.17 Martin Dalecki
2002-05-22 18:17 ` Linux-2.5.17 Jan Kara
2002-05-22 18:36 ` Linux-2.5.17 Russell King
2002-05-22 13:06 ` Linux-2.5.17 Alan Cox
2002-05-22 11:19 ` Linux-2.5.17 Russell King
2002-05-22 11:27 ` Linux-2.5.17 David S. Miller
2002-05-22 16:23 ` Linux-2.5.17 Linus Torvalds
2002-05-24 13:59 ` Linux-2.5.17 Martin Dalecki