* Linux 2.6.36-rc7
@ 2010-10-06 21:45 Linus Torvalds
2010-10-07 0:49 ` Stephen Rothwell
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Linus Torvalds @ 2010-10-06 21:45 UTC (permalink / raw)
To: Linux Kernel Mailing List
So I decided to break my a-week-is-eight-days rut, and actually
release -rc7 after a proper seven-day week instead. Wo-oo!
And yes, that's probably as exciting as it gets, which is just fine by
me. This should be the last -rc, I'm not seeing any reason to keep
delaying a real release. There was still more changes to
drivers/gpu/drm than I really would have hoped for, but they all look
harmless and good. Famous last words.
Apart from that, there's some mips updates, some late ACPI fixes, and
lots of small random noise. I was looking at the dirstat, and noticed
that xfs shows up at 6%, but even that is just a single commit that is
mainly comment fixups. Which just goes to show that there's really a
lot of small trivial one-liners: 167 files changed, 992
insertions(+), 562 deletions(-) in 137 non-merge commits really does
end up being about a lot of very small changes.
[ Git hint of the day: do
git log --stat --oneline --no-merges v2.6.36-rc6..v2.6.36-rc7
to get a dense view of the commits and visually pick out the ones
that aren't trivial one-liners ]
---
Akinobu Mita (1):
powerpc/512x: fix clk_get() return value
Alex Deucher (3):
drm/radeon/kms: fix up encoder info messages for DFP6
drm/radeon/kms: fix potential segfault in r600_ioctl_wait_idle
drm/radeon/kms: add quirk for MSI K9A2GM motherboard
Andrea Gelmini (2):
ACPI: Kconfig: fix typo.
MIPS: Fix a typo.
Andreas Bießmann (1):
MIPS: Octeon: Determine if helper needs to be built
Andrew Morton (3):
drivers/serial/mfd.c needs slab.h
arch/m68k/mac/macboing.c: use unsigned long for irqflags
drivers/serial/mrst_max3110.c needs linux/irq.h
Arnd Bergmann (1):
drm: i810/i830: fix locked ioctl variant
Axel Lin (1):
regulator: max8649 - fix setting extclk_freq
Bernhard Walle (2):
i2c-octeon: Return -ETIMEDOUT in octeon_i2c_wait() on timeout
MIPS: N32: Fix getdents64 syscall for n32
Boaz Harrosh (1):
um: Proper Fix for f25c80a4: remove duplicate structure field
initialization
Chris Wilson (5):
drm: Prune GEM vma entries
drm/i915: Fix refleak during eviction.
drm: Hold the mutex when dropping the last GEM reference (v2)
drm/i915: Sanity check pread/pwrite
drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow
Christoph Hellwig (1):
writeback: always use sb->s_bdi for writeback purposes
Colin Ian King (1):
ACPI: enable repeated PCIEXP wakeup by clearing PCIEXP_WAKE_STS on resume
Cyril Chemparathy (1):
regulator: fix typo in current units
Cyrill Gorcunov (1):
perf, x86: Handle in flight NMIs on P4 platform
Damian Lukowski (1):
net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out()
Dan Carpenter (1):
dma/shdma: move dereference below the NULL check
Dan Rosenberg (2):
ALSA: prevent heap corruption in snd_ctl_new()
sys_semctl: fix kernel stack leakage
Daniel J Blueman (1):
fix OMAP2 MTD build failure
Dave Airlie (2):
drm/radeon: fix PCI ID 5657 to be an RV410
drm/gem: handlecount isn't really a kref so don't make it one.
Dave Chinner (1):
xfs: force background CIL push under sustained load
David Daney (3):
MIPS: Hookup fanotify_init, fanotify_mark, and prlimit64 syscalls.
MIPS: Don't place cu2 notifiers in __cpuinitdata
MIPS: Octeon: Place cnmips_cu2_setup in __init memory.
David Howells (1):
MN10300: Fix flush_icache_range()
David S. Miller (1):
ip_gre: Fix dependencies wrt. ipv6.
Deng-Cheng Zhu (1):
MIPS: Use generic atomic64 for 32-bit kernels
Don Mullis (1):
lib/list_sort: do not pass bad pointers to cmp callback
Eric Dumazet (2):
rcu: rcu_read_lock_bh_held(): disabling irqs also disables bh
vlan: dont drop packets from unknown vlans in promiscuous mode
Eric Millbrandt (1):
powerpc/5200: tighten up ac97 reset timing
Evgeny Kuznetsov (1):
wait: using uninitialized member of wait queue
FUJITA Tomonori (1):
MIPS: TX49xx: Rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN
Florian Mickler (1):
iwl3945: queue the right work if the scan needs to be aborted
Frederic Weisbecker (2):
reiserfs: fix dependency inversion between inode and reiserfs mutexes
reiserfs: fix unwanted reiserfs lock recursion
Frederik Deweerdt (1):
perf ui hist browser: Fix segfault on 'a' for annotate
Geert Uytterhoeven (1):
fuse: Initialize total_len in fuse_retrieve()
Giel van Schijndel (1):
hwmon: f71882fg: use a muxed resource lock for the Super I/O port
H. Peter Anvin (1):
x86, cpu: After uncapping CPUID, re-run CPU feature detection
Heiko Carstens (1):
generic-ipi: Fix deadlock in __smp_call_function_single
Huang Ying (3):
ACPI, APEI, Fix APEI related table size checking
ACPI, APEI, Fix error path for memory allocation
ACPI, APEI, Fix ERST MOVE_DATA instruction implementation
Hugh Dickins (2):
ksm: fix page_address_in_vma anon_vma oops
ksm: fix bad user data when swapping
Ira W. Snyder (1):
kfifo: fix scatterlist usage
Jeff Layton (2):
cifs: set backing_dev_info on new S_ISREG inodes
cifs: prevent infinite recursion in cifs_reconnect_tcon
Jesse Barnes (1):
drm/i915: fix GMCH power reporting
Jin Dongming (2):
ACPI, APEI, Fix acpi_pre_map() return value
ACPI, APEI, HEST Fix the unsuitable usage of platform_data
Jiri Olsa (2):
oprofile: Add Support for Intel CPU Family 6 / Model 29
proc: make /proc/pid/limits world readable
Joe Perches (1):
MIPS: Remove pr_<level> uses of KERN_<level>
Joel Becker (1):
ocfs2: Don't walk off the end of fast symlinks.
Johannes Berg (1):
mac80211: fix use-after-free
Jon Povey (1):
i2c-davinci: Fix race when setting up for TX
Julia Lawall (3):
powerpc/5200: efika.c: Add of_node_put to avoid memory leak
drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code
MIPS: kspd: Adjust confusing if indentation
Keith Packard (2):
drm/i915: vblank status not valid while training display port
drm/i915: Use pipe state to tell when pipe is off
Kevin Liu (1):
mfd: Fix max8925 irq control bit incorrect setting
Kukjin Kim (1):
MAINTAINERS: update maintainer for S5P ARM ARCHITECTURES
Kumar Sanghvi (1):
Phonet: Correct header retrieval after pskb_may_pull
Kusanagi Kouichi (1):
perf tools: Fix build breakage
Len Brown (7):
intel_idle: PCI quirk to prevent Lenovo Ideapad s10-3 boot hang
ACPI: delete ZEPTO idle=nomwait DMI quirk
ACPI: expand Vista blacklist to include SP1 and SP2
ACPI: EC: add Vista incompatibility DMI entry for Toshiba Satellite L355
acpi_idle: add missing \n to printk
ACPI: acpi_pad: simplify code to avoid false gcc build warning
ACPI: invoke DSDT corruption workaround on all Toshiba Satellite
Linus Torvalds (3):
Fix up more fallout form alpha signal cleanups
modules: Fix module_bug_list list corruption race
Linux 2.6.36-rc7
Lucas De Marchi (2):
ACPI: Fix typos
cpuidle: Fix typos
Luis Henriques (1):
ACPI: fan: Fix more unbalanced code block
Manuel Lauss (1):
MIPS: Alchemy: Resolve prom section mismatches
Marcin Slusarz (1):
i7core_edac: fix panic in udimm sysfs attributes registration
Mark Brown (1):
mfd: Ignore non-GPIO IRQs when setting wm831x IRQ types
Mathieu Lacage (1):
missing inline keyword for static function in linux/dmaengine.h
Matthew Garrett (1):
ACPI: Don't report current_now if battery reports in mWh
MyungJoo Ham (1):
i2c-s3c2410: fix calculation of SDA line delay
Namhyung Kim (3):
ACPI: add missing __percpu markup in arch/x86/kernel/acpi/cstate.c
intel_idle: add missing __percpu markup
[CPUFREQ] acpi-cpufreq: add missing __percpu markup
Pekka Enberg (1):
[CPUFREQ] Fix memory leaks in pcc_cpufreq_do_osc
Petr Vandrovec (1):
MAINTAINERS: update matroxfb & ncpfs status
Rafael J. Wysocki (1):
PM / ACPI: Blacklist systems known to require acpi_sleep=nonvs
Ralf Baechle (7):
MIPS: Document why RELOC_HIDE is there.
MIPS: Audit: Fix hang in entry.S.
MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.
MIPS: Kconfig: Fix and clarify kconfig help text for VSMP and SMTC.
MIPS: GIC: Remove dependencies from Malta files.
MIPS: PNX8550: Sort out machine halt, restart and powerdown functions.
MIPS: Fix syscall 64 bit number comments.
Ricardo Mendoza (1):
MIPS: RM7000: Symbol should be static
Robert Richter (1):
oprofile, ARM: Release resources on failure
Roel Kluin (1):
spi: spi-gpio.c tests SPI_MASTER_NO_RX bit twice, but not SPI_MASTER_NO_TX
Scott Ellis (1):
omap: McBSP: tx_irq_completion used in rx_irq_handler
Shmulik Ladkani (1):
MIPS: Calculate VMLINUZ_LOAD_ADDRESS based on the length of vmlinux.bin
Simon Guinot (1):
dmaengine: fix interrupt clearing for mv_xor
Sinan Akman (1):
of/spi: Fix OF-style driver binding of spi devices
Stefano Stabellini (2):
xen: do not set xenstored_ready before xenbus_probe on hvm
xen: do not initialize PV timers on HVM if !xen_have_vector_callback
Stephane Eranian (1):
perf trace scripting: Fix extern struct definitions
Stephen Rothwell (1):
powerpc: remove unused variable
Suresh Siddha (1):
intel_idle: Voluntary leave_mm before entering deeper
Takashi Iwai (1):
ALSA: i2c/other/ak4xx-adda: Fix a compile warning with CONFIG_PROCFS=n
Thomas Gleixner (2):
x86, irq: Plug memory leak in sparse irq
x86, hpet: Fix bogus error check in hpet_assign_irq()
Thomas Hellstrom (5):
drm/vmwgfx: Fix breakage introduced by commit "drm: block
userspace under allocating buffer and having drivers overwrite it
(v2)"
vmwgfx: vt-switch (master drop) fixes
vmwgfx: Enable use of the vblank system
vmwgfx: Remove initialisation of dev::devname
vmwgfx: Fix fb VRAM pinning failure due to fragmentation
Thomas Weber (1):
intel_idle: Change mode 755 => 644
Tony Lindgren (1):
omap: Fix compile dependency to LEDS_CLASS
Vasiliy Kulikov (1):
regulator: fix device_register() error handling
Zhang Rui (3):
ACPI: fix build warnings resulting from merge window conflict
ACPI video: fix a poor warning message
ACPI: Disable Windows Vista compatibility for Toshiba P305D
christophe leroy (1):
spi/mpc8xxx: fix buffer overrun on large transfers
* Re: Linux 2.6.36-rc7
2010-10-06 21:45 Linux 2.6.36-rc7 Linus Torvalds
@ 2010-10-07 0:49 ` Stephen Rothwell
2010-10-08 15:05 ` James Bottomley
2010-10-07 16:10 ` Tvrtko Ursulin
2010-10-07 19:28 ` Tejun Heo
2 siblings, 1 reply; 29+ messages in thread
From: Stephen Rothwell @ 2010-10-07 0:49 UTC (permalink / raw)
To: Linus Torvalds
Cc: Linux Kernel Mailing List, Russell King, James Bottomley,
David Miller, netdev, John W. Linville, Michal Marek,
Dmitry Torokhov
[-- Attachment #1: Type: text/plain, Size: 2788 bytes --]
Hi Linus,
On Wed, 6 Oct 2010 14:45:13 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> This should be the last -rc, I'm not seeing any reason to keep
> delaying a real release. There was still more changes to
> drivers/gpu/drm than I really would have hoped for, but they all look
> harmless and good. Famous last words.
I have no idea how critical any of this stuff is, but linux-next contains
the following in its "current" trees, i.e. stuff that is supposed to go
into 2.6.36. These are from the arm-current, scsi-rc-fixes, net-current,
wireless-current, kbuild-current, input-current and ide-current trees
(contacts cc'd).
Arnaud Lacombe (1):
kconfig: delay symbol direct dependency initialization
[This one produces a lot of new warnings, but also fixes a bug ...]
Ben Hutchings (2):
Revert "ipv4: Make INET_LRO a bool instead of tristate."
netdev: Depend on INET before selecting INET_LRO
Dan Carpenter (1):
cls_u32: signedness bug
Dan Rosenberg (2):
sctp: prevent reading out-of-bounds memory
sctp: Fix out-of-bounds reading in sctp_asoc_get_hmac()
David Stevens (1):
ipv4: correct IGMP behavior on v3 query during v2-compatibility mode
Dmitry Torokhov (1):
Input: wacom - fix runtime PM related deadlock
Eric Dumazet (1):
caif: fix two caif_connect() bugs
Felix Fietkau (1):
ath9k_hw: fix regression in ANI listen time calculation
Henrik Rydberg (1):
Input: uinput - setup MT usage during device creation
Jeff Kirsher (4):
ixgbevf.txt: Update ixgbevf documentation
e1000.txt: Update e1000 documentation
e1000e.txt: Add e1000e documentation
MAINTAINERS: update Intel LAN Ethernet info
Johannes Berg (1):
mac80211: delete AddBA response timer
Kenneth Waters (1):
Input: joydev - fix JSIOCSAXMAP ioctl
Maciej Żenczykowski (1):
net: Fix IPv6 PMTU disc. w/ asymmetric routes
Martin K. Petersen (1):
[SCSI] Fix VPD inquiry page wrapper
Nagendra Tomar (1):
net: Fix the condition passed to sk_wait_event()
Neil Horman (1):
bonding: fix WARN_ON when writing to bond_master sysfs file
Santosh Shilimkar (1):
ARM: 6419/1: mmu: Fix MT_MEMORY and MT_MEMORY_NONCACHED pte flags
Sergei Shtylyov (2):
hpt366: add debounce delay to cable_detect() method
hpt366: fix clock turnaround
Stanislaw Gruszka (1):
skge: add quirk to limit DMA
Will Deacon (2):
ARM: 6416/1: errata: faulty hazard checking in the Store Buffer may lead to data corruption
ARM: 6412/1: kprobes-decode: add support for MOVW instruction
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
* Re: Linux 2.6.36-rc7
2010-10-06 21:45 Linux 2.6.36-rc7 Linus Torvalds
2010-10-07 0:49 ` Stephen Rothwell
@ 2010-10-07 16:10 ` Tvrtko Ursulin
2010-10-07 17:15 ` Tvrtko Ursulin
2010-10-07 19:28 ` Tejun Heo
2 siblings, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2010-10-07 16:10 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List
On Wednesday 06 Oct 2010 22:45:13 Linus Torvalds wrote:
[snip]
> And yes, that's probably as exciting as it gets, which is just fine by
> me. This should be the last -rc, I'm not seeing any reason to keep
> delaying a real release. There was still more changes to
> drivers/gpu/drm than I really would have hoped for, but they all look
> harmless and good. Famous last words.
Hi Linus,
Please see http://marc.info/?l=linux-kernel&m=128618485204253&w=2
I have sent this proposed bugfix several times now but no one is picking it up
and Eric seems to have disappeared. It would be suboptimal to release 2.6.36
with a core fanotify feature non-functional.
Thanks,
Tvrtko
Sophos Plc, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP, United Kingdom.
Company Reg No 2096520. VAT Reg No GB 348 3873 20.
* Re: Linux 2.6.36-rc7
2010-10-07 16:10 ` Tvrtko Ursulin
@ 2010-10-07 17:15 ` Tvrtko Ursulin
2010-10-07 17:33 ` Eric Paris
0 siblings, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2010-10-07 17:15 UTC (permalink / raw)
To: Linus Torvalds, Eric Paris; +Cc: Linux Kernel Mailing List
On Thursday 07 Oct 2010 17:10:46 Tvrtko Ursulin wrote:
> On Wednesday 06 Oct 2010 22:45:13 Linus Torvalds wrote:
> [snip]
>
> > And yes, that's probably as exciting as it gets, which is just fine by
> > me. This should be the last -rc, I'm not seeing any reason to keep
> > delaying a real release. There was still more changes to
> > drivers/gpu/drm than I really would have hoped for, but they all look
> > harmless and good. Famous last words.
>
> Hi Linus,
>
> Please see http://marc.info/?l=linux-kernel&m=128618485204253&w=2
>
> I have sent this proposed bugfix several times now but no one is picking it
> up and Eric seems to have disappeared. It would be suboptimal to release
> 2.6.36 with a core fanotify feature non-functional.
Unfortunately I have another showstopper. Sadly I missed it until now because
internally we were more worried about issues which were direct problems
for us, and I went too deep instead of spending more time reviewing it breadth
first.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=08ae89380a8210a9965d04083e1de78cb8bca4b1
Priority argument was dropped from the fanotify_init syscall, and since it is
a syscall once released it is set in stone. Without the priority argument, how
are multiple clients supposed to be ordered?
Co-existence between multiple clients was something which was supposed to be
designed in from the start. Use cases like hierarchical storage management,
anti-malware and content indexing should all be able to co-exist. Without a
priority argument I do not see how it can be assured HSM sees the perm event
before anti-malware, and content indexing after both of them? If there was any
discussion about dropping priority I missed it. :(
Tvrtko
* Re: Linux 2.6.36-rc7
2010-10-07 17:15 ` Tvrtko Ursulin
@ 2010-10-07 17:33 ` Eric Paris
2010-10-07 18:07 ` Alan Cox
2010-10-07 20:55 ` John Stoffel
0 siblings, 2 replies; 29+ messages in thread
From: Eric Paris @ 2010-10-07 17:33 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Linus Torvalds, Linux Kernel Mailing List, agruen
On Thu, 2010-10-07 at 18:15 +0100, Tvrtko Ursulin wrote:
> On Thursday 07 Oct 2010 17:10:46 Tvrtko Ursulin wrote:
> > On Wednesday 06 Oct 2010 22:45:13 Linus Torvalds wrote:
> Priority argument was dropped from the fanotify_init syscall, and since it is
> a syscall once released it is set in stone. Without the priority argument, how
> are multiple clients supposed to be ordered?
>
> Co-existence between multiple clients was something which was supposed to be
> designed in from the start. Use cases like hierarchical storage management,
> anti-malware and content indexing should all be able to co-exist. Without a
> priority argument I do not see how it can be assured HSM sees the perm event
> before anti-malware, and content indexing after both of them? If there was any
> discussion about dropping priority I missed it. :(
Shit. I'm trying to remember the logic. hrmph.... You could have a
real interface issue.... Shit. Let me think about it for an hour or
two.
Original idea of priorities was to allow multiple permissions decision
makers to co-exist without having the livelock problem of each trying to
grant and deny access to each other. That was solved with the
O_NONOTIFY hack and I think the priority was then thought to be useless.
But you're absolutely right, it isn't useless if we consider that an HSM
might need to run first to make sure data exists on disk before an
indexer looks at the data.
I see two possibilities off the top of my head:
I could just slap an (unused) priority field onto the end of the
fanotify_init() syscall (assuming Linus doesn't murder me) so we can
build that support out with explicit priorities down the line, which I
think might be overkill, or
The other option (without breaking ABI as it stands today) is to define
some set of the fanotify_init() flags to be a priority field, we've got
32 bits and only use 2 of them so giving 4-8 bits of that as a priority
(next cycle) isn't an issue and can be easily backwards compatible.
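As a rough illustration of that second option, carving a priority field out of a 32-bit flags word could look something like the sketch below. The EX_* names, the 8-bit width and the position in the top byte are all assumptions made up for this example; none of it is the actual fanotify ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: these names do not exist in any kernel header. */
#define EX_FAN_PRIO_SHIFT 24u
#define EX_FAN_PRIO_BITS  8u
#define EX_FAN_PRIO_MASK \
	(((UINT32_C(1) << EX_FAN_PRIO_BITS) - 1u) << EX_FAN_PRIO_SHIFT)

/* Pack a priority (0..255) into the top byte, leaving other bits alone. */
static inline uint32_t ex_flags_set_prio(uint32_t flags, unsigned int prio)
{
	assert(prio < (1u << EX_FAN_PRIO_BITS));
	return (flags & ~EX_FAN_PRIO_MASK) |
	       ((uint32_t)prio << EX_FAN_PRIO_SHIFT);
}

/* Extract the priority; a flags word with no priority bits set yields 0. */
static inline unsigned int ex_flags_get_prio(uint32_t flags)
{
	return (flags & EX_FAN_PRIO_MASK) >> EX_FAN_PRIO_SHIFT;
}
```

A caller that never touches the priority bits reads back priority 0, which is what would make such a scheme backwards compatible with existing users of the flags argument.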
-Eric
* Re: Linux 2.6.36-rc7
2010-10-07 18:07 ` Alan Cox
@ 2010-10-07 17:49 ` Eric Paris
2010-10-08 12:06 ` Andreas Gruenbacher
0 siblings, 1 reply; 29+ messages in thread
From: Eric Paris @ 2010-10-07 17:49 UTC (permalink / raw)
To: Alan Cox
Cc: Tvrtko Ursulin, Linus Torvalds, Linux Kernel Mailing List, agruen
On Thu, 2010-10-07 at 19:07 +0100, Alan Cox wrote:
> > I see two possibilities off the top of my head:
> >
> > I could just slap an (unused) priority field onto the end of the
> > fanotify_init() syscall (assuming Linus doesn't murder me) so we can
> > build that support out with explicit priorities down the line, which I
> > think might be overkill, or
> >
> > The other option (without breaking ABI as it stands today) is to define
> > some set of the fanotify_init() flags to be a priority field, we've got
> > 32 bits and only use 2 of them so giving 4-8 bits of that as a priority
> > (next cycle) isn't an issue and can be easily backwards compatible.
>
> Except you've then got a magic release that works differently to every
> other release afterwards and which sods law says will get shipped by some
> big vendor. Both your proposals are "we got the API wrong,
Correct.
> lets have one
> kernel that is special", that isn't good in the bigger scheme of things
> and will have if kernel 2.6.blah then crap forced into bits of app/lib
> code forever.
Not sure I understand this logic completely. I see both of those
options as: we'd have a 2.6.36 kernel which doesn't have a priority
feature (and would reject apps that try to use it) and that feature
could be built into 2.6.37. Apps built against 2.6.36 kernels would
still work on 2.6.37 (with a priority of 0 since that's all they could
set in 2.6.36)
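To make that compatibility argument concrete, here is a minimal sketch of the two behaviours side by side. It assumes the two flag bits in use are the low two and that a later kernel would take the priority from the top byte; the EX_* names and both helper functions are hypothetical and only model the flag check, not the real syscall.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define EX_FLAGS_OLD  UINT32_C(0x00000003)  /* the two bits used today */
#define EX_PRIO_SHIFT 24u
#define EX_PRIO_MASK  UINT32_C(0xff000000)  /* hypothetical priority field */

/* "2.6.36-style" check: any bit outside the known set is rejected. */
static int ex_init_old(uint32_t flags)
{
	return (flags & ~EX_FLAGS_OLD) ? -EINVAL : 0;
}

/* "2.6.37-style" check: the priority bits are accepted and decoded. */
static int ex_init_new(uint32_t flags, unsigned int *prio)
{
	assert(prio != NULL);
	if (flags & ~(EX_FLAGS_OLD | EX_PRIO_MASK))
		return -EINVAL;
	*prio = (flags & EX_PRIO_MASK) >> EX_PRIO_SHIFT;
	return 0;
}
```

An application written against the old check can only ever pass flags with the priority bits clear, so under the new check it would simply run at priority 0.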
> Given two chunks of "oh dear" last minute stuff would it be safer to
> simply punt and just pull the syscall/prototype itself (leaving the rest)
> for the release. That can go into the first pass of the next kernel tree,
> and if it the fixes and priority bits all work out may well then be tiny
> enough for -stable.
The safest thing would probably be to punt the syscalls to 2.6.37.
Which is sad since I know a number of people are already working against
them, but maybe that proves it's the best approach?
-Eric
* Re: Linux 2.6.36-rc7
2010-10-07 17:33 ` Eric Paris
@ 2010-10-07 18:07 ` Alan Cox
2010-10-07 17:49 ` Eric Paris
2010-10-07 20:55 ` John Stoffel
1 sibling, 1 reply; 29+ messages in thread
From: Alan Cox @ 2010-10-07 18:07 UTC (permalink / raw)
To: Eric Paris
Cc: Tvrtko Ursulin, Linus Torvalds, Linux Kernel Mailing List, agruen
> I see two possibilities off the top of my head:
>
> I could just slap an (unused) priority field onto the end of the
> fanotify_init() syscall (assuming Linus doesn't murder me) so we can
> build that support out with explicit priorities down the line, which I
> think might be overkill, or
>
> The other option (without breaking ABI as it stands today) is to define
> some set of the fanotify_init() flags to be a priority field, we've got
> 32 bits and only use 2 of them so giving 4-8 bits of that as a priority
> (next cycle) isn't an issue and can be easily backwards compatible.
Except you've then got a magic release that works differently to every
other release afterwards and which Sod's law says will get shipped by some
big vendor. Both your proposals are "we got the API wrong, let's have one
kernel that is special", which isn't good in the bigger scheme of things
and will have "if kernel 2.6.blah then" crap forced into bits of app/lib
code forever.
Given two chunks of "oh dear" last minute stuff, would it be safer to
simply punt and just pull the syscall/prototype itself (leaving the rest)
for the release? That can go into the first pass of the next kernel tree,
and if the fixes and priority bits all work out it may well then be tiny
enough for -stable.
Alan
* Re: Linux 2.6.36-rc7
2010-10-06 21:45 Linux 2.6.36-rc7 Linus Torvalds
2010-10-07 0:49 ` Stephen Rothwell
2010-10-07 16:10 ` Tvrtko Ursulin
@ 2010-10-07 19:28 ` Tejun Heo
2010-10-07 20:13 ` [dm-devel] " Milan Broz
2 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2010-10-07 19:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Linux Kernel Mailing List, just.for.lkml, herbert, hch, neilb,
dm-devel
Hello, Linus.
On 10/06/2010 11:45 PM, Linus Torvalds wrote:
> So I decided to break my a-week-is-eight-days rut, and actually
> release -rc7 after a proper seven-day week instead. Wo-oo!
>
> And yes, that's probably as exciting as it gets, which is just fine by
> me. This should be the last -rc, I'm not seeing any reason to keep
> delaying a real release. There was still more changes to
> drivers/gpu/drm than I really would have hoped for, but they all look
> harmless and good. Famous last words.
I'm afraid there is a possibly workqueue related deadlock under high
memory pressure. It happens on dm-crypt + md raid1 configuration.
I'm not yet sure whether this is caused by workqueue failing to kick
rescuers under memory pressure or the shared workqueue is making an
already existing problem more visible. I'm in the process of setting up
an environment to reproduce the problem.
http://thread.gmane.org/gmane.comp.file-systems.xfs.general/34922/focus=1044784
Thanks.
--
tejun
* Re: [dm-devel] Linux 2.6.36-rc7
2010-10-07 19:28 ` Tejun Heo
@ 2010-10-07 20:13 ` Milan Broz
2010-10-08 17:02 ` Tejun Heo
0 siblings, 1 reply; 29+ messages in thread
From: Milan Broz @ 2010-10-07 20:13 UTC (permalink / raw)
To: device-mapper development
Cc: Tejun Heo, Linus Torvalds, Linux Kernel Mailing List,
just.for.lkml, hch, herbert
On 10/07/2010 09:28 PM, Tejun Heo wrote:
> I'm afraid there is a possibly workqueue related deadlock under high
> memory pressure. It happens on dm-crypt + md raid1 configuration.
> I'm not yet sure whether this is caused by workqueue failing to kick
> rescuers under memory pressure or the shared workqueue is making an
> already existing problem more visible and in the process of setting up
> an environment to reproduce the problem.
>
> http://thread.gmane.org/gmane.comp.file-systems.xfs.general/34922/focus=1044784
Yes, XFS is very good at showing up problems in dm-crypt :)
But there was no change in dm-crypt which could itself cause such a problem;
the planned workqueue changes are not in 2.6.36 yet.
Code is basically the same for the last few releases.
So it seems that workqueue processing really changed here under memory pressure.
Milan
p.s.
Anyway, if you are able to reproduce it and you think that there is a problem
in the per-device dm-crypt workqueue, there are patches from Andi for a shared
per-cpu workqueue; maybe they can help here. (But this is really not RC material.)
Unfortunately not yet in dm-devel tree, but I have them here ready for review:
http://mbroz.fedorapeople.org/dm-crypt/2.6.36-devel/
(all 4 patches must be applied, I hope Alasdair will put them in dm quilt soon.)
* Re: Linux 2.6.36-rc7
2010-10-07 17:33 ` Eric Paris
2010-10-07 18:07 ` Alan Cox
@ 2010-10-07 20:55 ` John Stoffel
2010-10-07 21:24 ` Eric Paris
1 sibling, 1 reply; 29+ messages in thread
From: John Stoffel @ 2010-10-07 20:55 UTC (permalink / raw)
To: Eric Paris
Cc: Tvrtko Ursulin, Linus Torvalds, Linux Kernel Mailing List, agruen
>>>>> "Eric" == Eric Paris <eparis@redhat.com> writes:
Eric> On Thu, 2010-10-07 at 18:15 +0100, Tvrtko Ursulin wrote:
>> On Thursday 07 Oct 2010 17:10:46 Tvrtko Ursulin wrote:
>> > On Wednesday 06 Oct 2010 22:45:13 Linus Torvalds wrote:
>> Priority argument was dropped from the fanotify_init syscall, and since it is
>> a syscall once released it is set in stone. Without the priority argument, how
>> are multiple clients supposed to be ordered?
>>
>> Co-existence between multiple clients was something which was supposed to be
>> designed in from the start. Use cases like hierarchical storage management,
>> anti-malware and content indexing should all be able to co-exist. Without a
>> priority argument I do not see how it can be assured HSM sees the perm event
>> before anti-malware, and content indexing after both of them? If there was any
>> discussion about dropping priority I missed it. :(
Eric> Shit. I'm trying to remember the logic. hrmph.... You could
Eric> have a real interface issue.... Shit. Let me think about it
Eric> for an hour or two.
Eric> Original idea of priorities was to allow multiple permissions
Eric> decision makers to co-exist without having the livelock problem
Eric> of each trying to grant and deny access to each other. That was
Eric> solved with the O_NONOTIFY hack and I think the priority was
Eric> then thought to be useless. But you're absolutely right, it
Eric> isn't useless if we consider that an HSM might need to run first
Eric> to make sure data exists on disk before an indexer looks at the
Eric> data.
Eric> I see two possibilities off the top of my head:
Eric> I could just slap an (unused) priority field onto the end of the
Eric> fanotify_init() syscall (assuming Linus doesn't murder me) so we
Eric> can build that support out with explicit priorities down the
Eric> line, which I think might be overkill, or
Eric> The other option (without breaking ABI as it stands today) is to
Eric> define some set of the fanotify_init() flags to be a priority
Eric> field, we've got 32 bits and only use 2 of them so giving 4-8
Eric> bits of that as a priority (next cycle) isn't an issue and can
Eric> be easily backwards compatible.
So what happens when you try to register a priority level and someone
else has already gotten that level? Does the call fail? Do you get
bumped down to the next open level? Can you *tell* what level you're
at and whether or not some other decision maker is ahead of you?
So if I register an HSM module for /home, with a priority of 1, and
then register a content indexer for /home/john at priority 1, will
they clash? Who wins? The one registered first?
I tried looking in Documentation/fs/fanotify.txt but I couldn't find
it anywhere. So I had to grep around looking for the file which held
fanotify_init() so I could look it over... and then my brain started
bleeding from the lack of any comments on the various functions on WTF
they were supposed to do.
But hey, I admit I'm not a kernel programmer at all, nor a low level
FS guy, so I probably just don't have the in-depth understanding of
Linux kernel internals. I just need to spend six months hacking on
the code to come up to speed.
But I'd really like some docs in the next release which tells me as a
poor dumb sysadmin how it can and should be used and what the gotchas
are.
Thanks,
John
* Re: Linux 2.6.36-rc7
2010-10-07 20:55 ` John Stoffel
@ 2010-10-07 21:24 ` Eric Paris
2010-10-08 15:42 ` John Stoffel
0 siblings, 1 reply; 29+ messages in thread
From: Eric Paris @ 2010-10-07 21:24 UTC (permalink / raw)
To: John Stoffel
Cc: Tvrtko Ursulin, Linus Torvalds, Linux Kernel Mailing List, agruen
On Thu, 2010-10-07 at 16:55 -0400, John Stoffel wrote:
> So what happens when you try to register a priority level and someone
> else has already gotten that level? Does the call fail? Do you get
> bumped down to the next open level? Can you *tell* what level you're
> at and whether or not some other decision maker is ahead of you?
Well it hasn't been discussed and implemented so I can't answer that.
*smile*
I will tell you that the way I envision it working (and being backwards
compatible) is that priority 0 is the last thing to be serviced. If 2
things register at the same priority the order between them getting
events is unpredictable. So when an HSM uses the interface it would use
the highest priority. An AV vendor might use (highest priority / 2)
while normal inotify like listeners would all be happy using priority 0.
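The ordering described above can be sketched in a few lines of userspace C. This is only a model of the idea, assuming an 8-bit priority where 255 is the highest; struct ex_listener and both functions are hypothetical names, not kernel code. Because qsort() is not a stable sort, listeners with equal priority can end up in either order, which matches the "unpredictable" behaviour for ties.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical listener record; priority 0 is serviced last. */
struct ex_listener {
	const char *name;
	unsigned int prio;
};

/* Comparator for descending priority: higher priority sorts first. */
static int ex_cmp_prio_desc(const void *a, const void *b)
{
	const struct ex_listener *la = a;
	const struct ex_listener *lb = b;

	return (la->prio < lb->prio) - (la->prio > lb->prio);
}

/* Arrange listeners in the order they would be handed an event. */
static void ex_order_listeners(struct ex_listener *v, size_t n)
{
	assert(v != NULL || n == 0);
	if (n == 0)
		return;
	qsort(v, n, sizeof *v, ex_cmp_prio_desc);
}
```

With an HSM at 255, an AV scanner at 255 / 2 = 127 and an indexer at 0, the sorted order is HSM, then AV, then the indexer.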
> But I'd really like some docs in the next release which tells me as a
> poor dumb sysadmin how it can and should be used and what the gotchas
> are.
We have example man-like pages in the commit logs which I expected to be
used as the basis for man pages once the interface was accepted. They
aren't perfect but they are
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=52c923dd079df49f58016a9e56df184b132611d6
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2a3edf86040a7e15684525a2aadc29f532c51325
You'll also find an example program which shows all of the features at
http://git.kernel.org/?p=linux/kernel/git/agruen/fanotify-example.git;a=summary
I don't think digging around in kernel code is the right way :)
-Eric
* Re: Linux 2.6.36-rc7
2010-10-07 17:49 ` Eric Paris
@ 2010-10-08 12:06 ` Andreas Gruenbacher
2010-10-08 16:33 ` David Daney
2010-10-08 16:38 ` Linux 2.6.36-rc7 Eric Paris
0 siblings, 2 replies; 29+ messages in thread
From: Andreas Gruenbacher @ 2010-10-08 12:06 UTC (permalink / raw)
To: Eric Paris
Cc: Alan Cox, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel
On Thursday 07 October 2010 19:49:28 Eric Paris wrote:
> The safest thing would probably be to punt the syscalls to 2.6.37.
> Which is sad since I know a number of people are already working against
> them, but maybe that proves it's the best approach?
I agree with removing the syscalls from 2.6.36 because of the following
reasons:
* Reviewers have complained that the feature was not yet ready to be merged.
* At least some of the criticism did not get addressed (neither discussed nor
fixed).
* Some weaknesses in the interface design were only identified and fixed late
in the -rc phase, changing the ABI. There may be more issues, like the
priority discussion. This might leave us with a broken ABI we would need
to support forever.
(Making fanotify fit for HSM hasn't been thought through at all AFAIK but
fanotify has legitimate use cases apart from HSM, so I don't necessarily
consider this a blocker.)
* The code has been shown to contain the kinds of bugs which show that it was not
tested very well before merging.
Thanks,
Andreas
* Re: Linux 2.6.36-rc7
2010-10-07 0:49 ` Stephen Rothwell
@ 2010-10-08 15:05 ` James Bottomley
0 siblings, 0 replies; 29+ messages in thread
From: James Bottomley @ 2010-10-08 15:05 UTC (permalink / raw)
To: Stephen Rothwell
Cc: Linus Torvalds, Linux Kernel Mailing List, Russell King,
David Miller, netdev, John W. Linville, Michal Marek,
Dmitry Torokhov
On Thu, 2010-10-07 at 11:49 +1100, Stephen Rothwell wrote:
> Hi Linus,
>
> On Wed, 6 Oct 2010 14:45:13 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >
> > This should be the last -rc, I'm not seeing any reason to keep
> > delaying a real release. There was still more changes to
> > drivers/gpu/drm than I really would have hoped for, but they all look
> > harmless and good. Famous last words.
>
> I have no idea how critical any of this stuff is, but linux-next contains
> the following in its "current" trees i.e. stuff that is supposed to go
> into 2.6.36. These are from the arm-current, scsi-rc-fixes, net-current,
> wireless-current, kbuild-current, input-current and ide-curent trees
> (contacts cc'd).
The SCSI rc-fixes stuff is critical if you run into the bugs, but the
bugs are fairly rare cases for most people. I'd still like to get them
in, though (and I have another 3 rc fixes candidates going through the
test pipeline).
James
* Re: Linux 2.6.36-rc7
2010-10-07 21:24 ` Eric Paris
@ 2010-10-08 15:42 ` John Stoffel
2010-10-08 16:17 ` Tvrtko Ursulin
0 siblings, 1 reply; 29+ messages in thread
From: John Stoffel @ 2010-10-08 15:42 UTC (permalink / raw)
To: Eric Paris
Cc: John Stoffel, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, agruen
>>>>> "Eric" == Eric Paris <eparis@redhat.com> writes:
Eric> On Thu, 2010-10-07 at 16:55 -0400, John Stoffel wrote:
>> So what happens when you try to register a priority level and someone
>> else has already gotten that level? Does the call fail? Do you get
>> bumped down to the next open level? Can you *tell* what level you're
>> at and whether or not some other decision maker is ahead of you?
Eric> Well it hasn't been discussed and implemented so I can't answer that.
Eric> *smile*
Ah, but you have implemented multiple notifiers registered at the same
time, without priorities, right? So that's just the degenerate case
which I assume is handled already.
Eric> I will tell you that the way I envision it working (and being
Eric> backwards compatible) is that priority 0 is the last thing to be
Eric> serviced. If 2 things register at the same priority the order
Eric> between them getting events is unpredictable. So when an HSM
Eric> uses the interface it would use the highest priority. An AV
Eric> vendor might use (highest priority / 2) while normal inotify
Eric> like listeners would all be happy using priority 0.
Ugh, I'd prefer that priority 0 is first (highest), but I can see how
that would make ABI problems, assuming we don't punt on releasing the
ABI now.
So if the order is undefined now, that's not good. Well... maybe it's
acceptable, but I can see all kinds of problems cropping up.
Hopefully they're processed in order of notifier creation instead. So
if I insert a notifier, anyone who inserts a notifier after me gets
serviced after my notifier gets run. That would seem to be the
logical and sane semantics to use here.
Again, I'm really approaching this as a SysAdmin who'd love to use
this in an HSM environment, so ordering is a *key* requirement.
Which is why I'm going to harp on the documentation in the Kernel from
day one too.
>> But I'd really like some docs in the next release which tells me as a
>> poor dumb sysadmin how it can and should be used and what the gotchas
>> are.
Eric> We have example man-like pages in the commit logs which I expected to be
Eric> used as the basis for man pages once the interface was accepted. They
Eric> aren't perfect but they are
Eric> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=52c923dd079df49f58016a9e56df184b132611d6
Eric> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2a3edf86040a7e15684525a2aadc29f532c51325
Eric> You'll also find an example program which shows all of the
Eric> features at
Eric> http://git.kernel.org/?p=linux/kernel/git/agruen/fanotify-example.git;a=summary
Eric> I don't think digging around in kernel code is the right way :)
Sure, but putting some design documentation which explains how it's
expected to be used, with pointers to the canonical site holding docs
and such *is* expected. Just see the docs for inotify and such.
Thanks for the pointers, I'll try to find the time to look them over
and make some more comments.
John
* Re: Linux 2.6.36-rc7
2010-10-08 15:42 ` John Stoffel
@ 2010-10-08 16:17 ` Tvrtko Ursulin
2010-10-08 16:41 ` Eric Paris
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Tvrtko Ursulin @ 2010-10-08 16:17 UTC (permalink / raw)
To: John Stoffel
Cc: Eric Paris, Linus Torvalds, Linux Kernel Mailing List,
agruen@suse.de
On Friday 08 Oct 2010 16:42:06 John Stoffel wrote:
> >>>>> "Eric" == Eric Paris <eparis@redhat.com> writes:
> Eric> I will tell you that the way I envision it working (and being
> Eric> backwards compatible) is that priority 0 is the last thing to be
> Eric> serviced. If 2 things register at the same priority the order
> Eric> between them getting events is unpredictable. So when an HSM
> Eric> uses the interface it would use the highest priority. An AV
> Eric> vendor might use (highest priority / 2) while normal inotify
> Eric> like listeners would all be happy using priority 0.
>
> Ugh, I'd prefer that priority 0 is first (highest), but I can see how
> that would make ABI problems, assuming we don't punt on releasing the
> ABI now.
>
> So if the order is undefined now, that's not good. Well... maybe it's
> acceptable, but I can see all kinds of problems cropping up.
> Hopefully they're processed in order of notifier creation instead. So
> if I insert a notifier, anyone who inserts a notifier after me gets
> serviced after my notifier gets run. That would seem to be the
> logical and sane semantics to use here.
>
> Again, I'm really approaching this as a SysAdmin who'd love to use
> this in an HSM environment, so ordering is a *key* requirement.
Ordering by time of registration? I thought of that but it falls flat once any
service needs to restart or be restarted.
Priority is also not that great a concept. I may have proposed classes or
something similar at some point, don't remember any more. It would be
equivalent to having allocated priority ranges, like:
>1000 - pre-content
>=100 - access-control
<100 - content
Doesn't really solve ordering inside groups so maybe we do not need priorities
at all just these three classes?
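
The three ranges above can be read as a simple classification rule. The sketch below is only an illustration of that reading; `enum listener_class` and `class_of()` are hypothetical names, not part of any proposed interface:

```c
/* Hypothetical listener classes, named after the three ranges above. */
enum listener_class {
    CLASS_CONTENT,          /* priority < 100 */
    CLASS_ACCESS_CONTROL,   /* priority >= 100, up to and including 1000 */
    CLASS_PRE_CONTENT       /* priority > 1000 */
};

/* Map a numeric priority onto its class. */
static enum listener_class class_of(unsigned int priority)
{
    if (priority > 1000)
        return CLASS_PRE_CONTENT;
    if (priority >= 100)
        return CLASS_ACCESS_CONTROL;
    return CLASS_CONTENT;
}
```

Two listeners that land in the same class still have no defined order relative to each other, which is the gap the question above is pointing at.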
Tvrtko
Sophos Plc, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP, United Kingdom.
Company Reg No 2096520. VAT Reg No GB 348 3873 20.
* Re: Linux 2.6.36-rc7
2010-10-08 12:06 ` Andreas Gruenbacher
@ 2010-10-08 16:33 ` David Daney
2010-10-08 21:50 ` Andreas Gruenbacher
2010-10-08 16:38 ` Linux 2.6.36-rc7 Eric Paris
1 sibling, 1 reply; 29+ messages in thread
From: David Daney @ 2010-10-08 16:33 UTC (permalink / raw)
To: Andreas Gruenbacher
Cc: Eric Paris, Alan Cox, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel
On 10/08/2010 05:06 AM, Andreas Gruenbacher wrote:
> On Thursday 07 October 2010 19:49:28 Eric Paris wrote:
>> The safest thing would probably be to punt the syscalls to 2.6.37.
>> Which is sad since I know a number of people are already working against
>> them, but maybe that proves it's the best approach?
>
> I agree with removing the syscalls from 2.6.36 because of the following
> reasons:
How would the mechanics of this be achieved?
Is it enough to just unconditionally return -ENOSYS from the sys_*()
functions? Or should all the patches be reverted?
David Daney
* Re: Linux 2.6.36-rc7
2010-10-08 12:06 ` Andreas Gruenbacher
2010-10-08 16:33 ` David Daney
@ 2010-10-08 16:38 ` Eric Paris
2010-10-08 21:45 ` Andreas Gruenbacher
1 sibling, 1 reply; 29+ messages in thread
From: Eric Paris @ 2010-10-08 16:38 UTC (permalink / raw)
To: Andreas Gruenbacher
Cc: Alan Cox, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel
On Fri, 2010-10-08 at 14:06 +0200, Andreas Gruenbacher wrote:
> On Thursday 07 October 2010 19:49:28 Eric Paris wrote:
> > The safest thing would probably be to punt the syscalls to 2.6.37.
> > Which is sad since I know a number of people are already working against
> > them, but maybe that proves it's the best approach?
>
> I agree with removing the syscalls from 2.6.36 because of the following
> reasons:
I disagree with all of your reasons and have argued my position on this
topic repeatedly and don't see the need to refute your claims again.
However, THIS is potentially a real ABI problem and something which
deals with the interface. Alan seemed to lean towards pulling the
syscalls. It is relatively easily solved without changing the interface
or breaking userspace in 2.6.37. We use some set of the flags bits as a
priority (we only use 2 of the 32 bits today so we have plenty) and
order groups with highest priority first, 0 priority last, and 2+ groups
with the same priority have unpredictable ordering. I'd then call
priorities other than 0 a 2.6.37 feature. If we do it in flags I think
that leaves us with say 8 bits and thus 255 priorities. Maybe people
want more, if so, that's an interface change to add a new argument.
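
A rough sketch of what "priority in the flags bits" could look like follows. The thread does not say which bits would be used, so the position chosen here (the top 8 bits of a 32-bit word) and the names `PRIO_SHIFT`, `PRIO_MASK`, `flags_set_priority()` and `flags_get_priority()` are all assumptions made for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the top 8 bits of the 32-bit flags word hold the priority. */
#define PRIO_SHIFT 24
#define PRIO_MASK  (UINT32_C(0xff) << PRIO_SHIFT)
#define PRIO_MAX   255u

/* Store a priority in the flags word without touching the other bits. */
static uint32_t flags_set_priority(uint32_t flags, unsigned int prio)
{
    assert(prio <= PRIO_MAX);
    return (flags & ~PRIO_MASK) | ((uint32_t)prio << PRIO_SHIFT);
}

/* Extract the priority; callers that never set one get 0. */
static unsigned int flags_get_priority(uint32_t flags)
{
    return (flags & PRIO_MASK) >> PRIO_SHIFT;
}
```

A caller that passes only the flag bits in use today leaves the priority field at zero and so reads back priority 0, which is why this kind of encoding would stay compatible with existing users. Eight bits give 256 values: 0 plus 255 non-zero priorities.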
Now if Alan would still like me to pull, if anyone has any other 11th
hour interface problems, or if 255 priorities doesn't seem like enough
to someone I am wondering what the best way to unhook is. Just make the
functions return -ENOSYS as the first line or actually troll through all
of the arches and explicitly unhook and rehook to sys_ni_syscall? I
started on the latter, but it seems to be a rather large patch at this
point...
-Eric
------
I said I wouldn't refute your claims but I can't help myself on one
account which I think might mislead people.
* Some weaknesses in the interface design were only identified and fixed late
in the -rc phase, changing the ABI. There may be more issues, like the
priority discussion. This might leave us with a broken ABI we would need
to support forever.
Between rc2 and rc3 we switched the order and size of a couple of fields
to help alignment, it did break ABI, but it wasn't an interface failing.
See: 0fb85621df4f. It also led to an interesting idea about a new type
for linux/types.h which both fanotify and the networking could make use
of (but isn't picked up, I'm not sure we know who is in charge of
types.h)
I concede the interface may in fact not be perfect for every user. I
have asked for ideas and feedback on proposals for literally (not
figuratively) years. It has gone through many iterations. The
interface has been in Linus's kernel for some months and I know of
numerous people who are starting to code to it and haven't (until now)
found interface failings. We can talk about how problems might be out
there but if they haven't been found after all this time I doubt just
waiting longer is going to change anything.
* Re: Linux 2.6.36-rc7
2010-10-08 16:17 ` Tvrtko Ursulin
@ 2010-10-08 16:41 ` Eric Paris
2010-10-18 11:01 ` Tvrtko Ursulin
2010-10-08 16:54 ` John Stoffel
2010-10-08 21:03 ` Andreas Gruenbacher
2 siblings, 1 reply; 29+ messages in thread
From: Eric Paris @ 2010-10-08 16:41 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: John Stoffel, Linus Torvalds, Linux Kernel Mailing List,
agruen@suse.de
On Fri, 2010-10-08 at 17:17 +0100, Tvrtko Ursulin wrote:
> On Friday 08 Oct 2010 16:42:06 John Stoffel wrote:
> Priority is also not that great a concept. I may have proposed classes or
> something similar at some point, don't remember any more. It would be
> equivalent to having allocated priority ranges, like:
>
> >1000 - pre-content
> >=100 - access-control
> <100 - content
>
> Doesn't really solve ordering inside groups so maybe we do not need priorities
> at all just these three classes?
I originally thought of trying to enumerate the types of users and came
up with the same 3 you did. Then I thought it better to give a general
priority field which we could indicate in documentation something like
those 3 classes (exactly like you did above). I don't want to hard code
some limited number of types of users into the interface. (ok it's going
to be limited, I was thinking 8 bits, but maybe others think we need more?)
As an extreme example going with 3 fixed types of users (and thus
equivalently only 3 priorities) would not allow for hierarchies of
hierarchical storage managers. What if priority MAX only brought in
enough info for priority MAX-1 to bring in the real file? If they had
to share the single 'pre-content' priority we have another ordering
problem.
-Eric
* Re: Linux 2.6.36-rc7
2010-10-08 16:17 ` Tvrtko Ursulin
2010-10-08 16:41 ` Eric Paris
@ 2010-10-08 16:54 ` John Stoffel
2010-10-08 21:03 ` Andreas Gruenbacher
2 siblings, 0 replies; 29+ messages in thread
From: John Stoffel @ 2010-10-08 16:54 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: John Stoffel, Eric Paris, Linus Torvalds,
Linux Kernel Mailing List, agruen@suse.de
>>>>> "Tvrtko" == Tvrtko Ursulin <tvrtko.ursulin@sophos.com> writes:
Tvrtko> On Friday 08 Oct 2010 16:42:06 John Stoffel wrote:
>> >>>>> "Eric" == Eric Paris <eparis@redhat.com> writes:
Eric> I will tell you that the way I envision it working (and being
Eric> backwards compatible) is that priority 0 is the last thing to be
Eric> serviced. If 2 things register at the same priority the order
Eric> between them getting events is unpredictable. So when an HSM
Eric> uses the interface it would use the highest priority. An AV
Eric> vendor might use (highest priority / 2) while normal inotify
Eric> like listeners would all be happy using priority 0.
>>
>> Ugh, I'd prefer that priority 0 is first (highest), but I can see how
>> that would make ABI problems, assuming we don't punt on releasing the
>> ABI now.
>>
>> So if the order is undefined now, that's not good. Well... maybe it's
>> acceptable, but I can see all kinds of problems cropping up.
>> Hopefully they're processed in order of notifier creation instead. So
>> if I insert a notifier, anyone who inserts a notifier after me gets
>> serviced after my notifier gets run. That would seem to be the
>> logical and sane semantics to use here.
>>
>> Again, I'm really approaching this as a SysAdmin who'd love to use
>> this in an HSM environment, so ordering is a *key* requirement.
Tvrtko> Ordering by time of registration? I thought of that but it
Tvrtko> falls flat once any service needs to restart or be restarted.
Yup, it can fail, esp if you have indexing before the HSM
retrieval... :] That could lead to all kinds of silliness.
But then again, how *serious* is it when it does fail? For example,
we use CommVault at work on Netapps to do HSM for some data. CV
leaves a little stub file holding the key to the content to be
restored, and a note for the end user saying "Don't mess with this
file, it's a pointer to your data. Call your SysAdmin if you see
this..."
So that's one way of approaching the error handling case.
Tvrtko> Priority is also not that great a concept. I may have proposed
Tvrtko> classes or something similar at some point, don't remember any
Tvrtko> more. It would be equivalent to having allocated priority
Tvrtko> ranges, like:
Tvrtko> 1000 - pre-content
Tvrtko> =100 - access-control
Tvrtko> <100 - content
Umm, why is there only one value for the access-control part? Are you
saying there should only be one program to decide on access control?
Shouldn't there be more? So I think your classes are just passed
in by the notifier setup call:
    flags = FANOTIFY_PRE_CONTENT | FANOTIFY_EXCLUSIVE;
    add_notifier(filesystem,flags,&hook);
    flags = ACCESS | SUFFICIENT;
    add_notifier(filesystem,flags,&hook);
And you'd have flags like:
    PRE_ACCESS      1
    ACCESS          2
    POST_ACCESS     4
    EXCLUSIVE       8
    SUFFICIENT     16   /* Passing this means all ACCESS is allowed */
                        /* even with other ACCESS notifiers in the
                           chain */
to say where in the process you need to be. The EXCLUSIVE flag means
we want to be the only one doing this and to bail if someone else is
already registered.
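
As a rough illustration of the EXCLUSIVE idea, here is a toy userspace registry in C. Everything in it is hypothetical: `struct registry`, the simplified `add_notifier()` signature (no filesystem or hook argument), and the choice of error codes are assumptions, and it also assumes that an existing EXCLUSIVE holder blocks later registrations for the same stage, which the description above does not spell out:

```c
#include <errno.h>
#include <stddef.h>

/* Flag values as listed above. */
#define PRE_ACCESS    1u
#define ACCESS        2u
#define POST_ACCESS   4u
#define EXCLUSIVE     8u
#define SUFFICIENT   16u
#define STAGE_MASK   (PRE_ACCESS | ACCESS | POST_ACCESS)

#define MAX_NOTIFIERS 16

/* Hypothetical per-filesystem table of registered notifier flags. */
struct registry {
    unsigned int flags[MAX_NOTIFIERS];
    size_t count;
};

/*
 * Register a notifier. Returns 0 on success, -EINVAL if no stage bit is
 * set, -EBUSY if exclusivity would be violated, -ENOSPC if the table is full.
 */
static int add_notifier(struct registry *r, unsigned int flags)
{
    unsigned int stage = flags & STAGE_MASK;
    size_t i;

    if (stage == 0)
        return -EINVAL;

    for (i = 0; i < r->count; i++) {
        if ((r->flags[i] & stage) == 0)
            continue;   /* registered for a different stage */
        /* Same stage: refuse if either side asked to be alone. */
        if ((flags | r->flags[i]) & EXCLUSIVE)
            return -EBUSY;
    }

    if (r->count == MAX_NOTIFIERS)
        return -ENOSPC;

    r->flags[r->count++] = flags;
    return 0;
}
```

In this model a second PRE_ACCESS registration fails with -EBUSY once an exclusive PRE_ACCESS notifier exists, while an ACCESS notifier can still register because it sits at a different stage. SUFFICIENT is only stored here; its short-circuit behaviour is not modelled.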
Tvrtko> Doesn't really solve ordering inside groups so maybe we do not
Tvrtko> need priorities at all just these three classes?
Ordering is a key thing. In most cases, you don't care, but for some
cases you really do care.
Maybe we should take some of the ideas used by PAM for ordering to
use as a basis?
John
* Re: [dm-devel] Linux 2.6.36-rc7
2010-10-07 20:13 ` [dm-devel] " Milan Broz
@ 2010-10-08 17:02 ` Tejun Heo
2010-10-10 11:56 ` Torsten Kaiser
2010-10-11 10:09 ` [PATCH wq#for-next] workqueue: fix HIGHPRI handling in keep_working() Tejun Heo
0 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2010-10-08 17:02 UTC (permalink / raw)
To: Milan Broz
Cc: device-mapper development, Linus Torvalds,
Linux Kernel Mailing List, just.for.lkml, hch, herbert
Hello, again.
On 10/07/2010 10:13 PM, Milan Broz wrote:
> Yes, XFS is very good to show up problems in dm-crypt:)
>
> But there was no change in dm-crypt which can itself cause such problem,
> planned workqueue changes are not in 2.6.36 yet.
> Code is basically the same for the last few releases.
>
> So it seems that workqueue processing really changed here under memory pressure.
>
> Milan
>
> p.s.
> Anyway, if you are able to reproduce it and you think that there is problem
> in per-device dm-crypt workqueue, there are patches from Andi for shared
> per-cpu workqueue, maybe it can help here. (But this is really not RC material.)
>
> Unfortunately not yet in dm-devel tree, but I have them here ready for review:
> http://mbroz.fedorapeople.org/dm-crypt/2.6.36-devel/
> (all 4 patches must be applied, I hope Alasdair will put them in dm quilt soon.)
Okay, spent the whole day reproducing the problem and trying to
determine what's going on. In the process, I've found a bug and a
potential issue (not sure whether it's an actual issue which should be
fixed for this release yet) but the hang doesn't seem to have anything
to do with workqueue update. All the queues are behaving exactly as
expected during hang.
Also, it isn't a regression. I can reliably trigger the same deadlock
on v2.6.35.
Here's the setup, which should be mostly similar to Torsten's setup I
used to trigger the problem.
The machine is dual quad-core Opteron (8 phys cores) w/ 4GiB memory.
* 80GB raid1 of two SATA disks
* On top of that, luks encrypted device w/ twofish-cbc-essiv:sha256
* In the encrypted device, xfs filesystem which hosts 8GiB swapfile
* 12GiB tmpfs
The workload is v2.6.35 allyesconfig -j 128 build in the tmpfs. Not
too long after swap starts being used (several tens of secs), the
system hangs. IRQ handling and all are fine but no IO gets through
with a lot of tasks stuck in bio allocation somewhere.
I suspected that with md and dm stacked together, something in the
upper layer ended up exhausting a shared bio pool and tried a couple
of things but haven't succeeded at finding where the culprit is. It
probably would be best to run blktrace together and analyze how IO
gets stuck.
So, well, we seem to be broken the same way as before. No need to
delay release for this one.
Thanks.
--
tejun
* Re: Linux 2.6.36-rc7
2010-10-08 16:17 ` Tvrtko Ursulin
2010-10-08 16:41 ` Eric Paris
2010-10-08 16:54 ` John Stoffel
@ 2010-10-08 21:03 ` Andreas Gruenbacher
2010-10-09 0:46 ` John Stoffel
2 siblings, 1 reply; 29+ messages in thread
From: Andreas Gruenbacher @ 2010-10-08 21:03 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: John Stoffel, Eric Paris, Linus Torvalds,
Linux Kernel Mailing List
On Friday 08 October 2010 18:17:25 Tvrtko Ursulin wrote:
> Doesn't really solve ordering inside groups so maybe we do not need
> priorities at all just these three classes?
Applications can easily make the priority they register with configurable (and
default to something in their "range" if we define such ranges). Wouldn't
this be enough?
Andreas
* Re: Linux 2.6.36-rc7
2010-10-08 16:38 ` Linux 2.6.36-rc7 Eric Paris
@ 2010-10-08 21:45 ` Andreas Gruenbacher
0 siblings, 0 replies; 29+ messages in thread
From: Andreas Gruenbacher @ 2010-10-08 21:45 UTC (permalink / raw)
To: Eric Paris
Cc: Alan Cox, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel
On Friday 08 October 2010 18:38:28 Eric Paris wrote:
> However, THIS is potentially a real ABI problem and something which
> deals with the interface. Alan seemed to lean towards pulling the
> syscalls. It is relatively easily solved without changing the interface
> or breaking userspace in 2.6.37. We use some set of the flags bits as a
> priority (we only use 2 of the 32 bits today so we have plenty) and
> order groups with highest priority first, 0 priority last, and 2+ groups
> with the same priority have unpredictable ordering. I'd then call
> priorities other than 0 a 2.6.37 feature. If we do it in flags I think
> that leaves us with say 8 bits and thus 255 priorities.
That's a possibility but it seems quite messy for a brand new system call.
I'd still pull the system call and work out the few remaining quirkses.
Thanks.
> ------
>
> I said I wouldn't refute your claims but I can't help myself on one
> account which I think might mislead people.
>
> * Some weaknesses in the interface design were only identified and fixed
> late in the -rc phase, changing the ABI. There may be more issues, like
> the priority discussion. This might leave us with a broken ABI we would
> need to support forever.
>
> Between rc2 and rc3 we switched the order and size of a couple of fields
> to help alignment, it did break ABI, but it wasn't an interface failing.
> See: 0fb85621df4f.
See, a weakness, not a failure. What I said.
The main issue that Andreas Schwab has pointed out (and which also led to
this commit) was the packing of the structs which leads to inefficient code,
though. This hasn't been fixed but it still can be, in a backwards compatible
way.
Thanks,
Andreas
* Re: Linux 2.6.36-rc7
2010-10-08 16:33 ` David Daney
@ 2010-10-08 21:50 ` Andreas Gruenbacher
2010-10-08 21:59 ` H. Peter Anvin
0 siblings, 1 reply; 29+ messages in thread
From: Andreas Gruenbacher @ 2010-10-08 21:50 UTC (permalink / raw)
To: David Daney
Cc: Eric Paris, Alan Cox, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel
On Friday 08 October 2010 18:33:55 David Daney wrote:
> On 10/08/2010 05:06 AM, Andreas Gruenbacher wrote:
> > On Thursday 07 October 2010 19:49:28 Eric Paris wrote:
> >> The safest thing would probably be to punt the syscalls to 2.6.37.
> >> Which is sad since I know a number of people are already working against
> >> them, but maybe that proves it's the best approach?
> >
> > I agree with removing the syscalls from 2.6.36 because of the following
> > reasons:
>
> How would the mechanics of this be achieved?
>
> Is it enough to just unconditionally return -ENOSYS from the sys_*()
> functions? Or should all the patches be reverted?
Whatever works I guess ... they would get reactivated pretty soon, anyway.
Andreas
* Re: Linux 2.6.36-rc7
2010-10-08 21:50 ` Andreas Gruenbacher
@ 2010-10-08 21:59 ` H. Peter Anvin
2010-10-11 22:13 ` fanotify: disable fanotify syscalls Eric Paris
0 siblings, 1 reply; 29+ messages in thread
From: H. Peter Anvin @ 2010-10-08 21:59 UTC (permalink / raw)
To: Andreas Gruenbacher
Cc: David Daney, Eric Paris, Alan Cox, Tvrtko Ursulin, Linus Torvalds,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel
On 10/08/2010 02:50 PM, Andreas Gruenbacher wrote:
> On Friday 08 October 2010 18:33:55 David Daney wrote:
>> On 10/08/2010 05:06 AM, Andreas Gruenbacher wrote:
>>> On Thursday 07 October 2010 19:49:28 Eric Paris wrote:
>>>> The safest thing would probably be to punt the syscalls to 2.6.37.
>>>> Which is sad since I know a number of people are already working against
>>>> them, but maybe that proves it's the best approach?
>>>
>>> I agree with removing the syscalls from 2.6.36 because of the following
>>> reasons:
>>
>> How would the mechanics of this be achieved?
>>
>> Is it enough to just unconditionally return -ENOSYS from the sys_*()
>> functions? Or should all the patches be reverted?
>
> Whatever works I guess ... they would get reactivated pretty soon, anyway.
>
Returning -ENOSYS should be sufficient (that's what a non-system-call
does); it would *also* be good to block any headers from getting
exported to userspace so people don't end up compiling against the
wrong version of the kernel headers and then wonder why their code
doesn't work in the future.
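
For readers unfamiliar with the convention, the "just return -ENOSYS" option amounts to something like the sketch below. This is a userspace model, not the kernel patch: `disabled_syscall_stub()` and `feature_available()` are hypothetical names standing in for a disabled entry point and for the check a caller would make on its result:

```c
#include <errno.h>

/* Hypothetical stand-in for a system call that has been switched off. */
static long disabled_syscall_stub(void)
{
    return -ENOSYS;   /* "function not implemented" */
}

/* How a caller can tell "not implemented" apart from every other result. */
static int feature_available(long ret)
{
    return ret != -ENOSYS;
}
```

A program that sees `feature_available()` return 0 can fall back to another mechanism instead of treating the result as a hard error, which is the same thing it would have to do on a kernel that never had the call at all.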
-hpa
* Re: Linux 2.6.36-rc7
2010-10-08 21:03 ` Andreas Gruenbacher
@ 2010-10-09 0:46 ` John Stoffel
0 siblings, 0 replies; 29+ messages in thread
From: John Stoffel @ 2010-10-09 0:46 UTC (permalink / raw)
To: Andreas Gruenbacher
Cc: Tvrtko Ursulin, John Stoffel, Eric Paris, Linus Torvalds,
Linux Kernel Mailing List
>>>>> "Andreas" == Andreas Gruenbacher <agruen@suse.de> writes:
Andreas> On Friday 08 October 2010 18:17:25 Tvrtko Ursulin wrote:
>> Doesn't really solve ordering inside groups so maybe we do not need
>> priorities at all just these three classes?
Andreas> Applications can easily make the priority they register with
Andreas> configurable (and default to something in their "range" if we
Andreas> define such ranges). Wouldn't this be enough?
What about conflicts, when two applications register with the same
priority? And then one of them exits, dies, fails, and
re-registers itself.
Does it get the same slot back? Does it go to the end of the line? If
it's an HSM handling (or other content manipulation tool) what
happens?
These are the questions which need to be answered, and the ABI updated
to reflect these questions. Even if it's just "We don't care..." it
needs to be spelled out.
John
* Re: [dm-devel] Linux 2.6.36-rc7
2010-10-08 17:02 ` Tejun Heo
@ 2010-10-10 11:56 ` Torsten Kaiser
2010-10-11 10:09 ` [PATCH wq#for-next] workqueue: fix HIGHPRI handling in keep_working() Tejun Heo
1 sibling, 0 replies; 29+ messages in thread
From: Torsten Kaiser @ 2010-10-10 11:56 UTC (permalink / raw)
To: Tejun Heo
Cc: Milan Broz, device-mapper development, Linus Torvalds,
Linux Kernel Mailing List, hch, herbert
On Fri, Oct 8, 2010 at 7:02 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, again.
>
> On 10/07/2010 10:13 PM, Milan Broz wrote:
>> Yes, XFS is very good to show up problems in dm-crypt:)
>>
>> But there was no change in dm-crypt which can itself cause such problem,
>> planned workqueue changes are not in 2.6.36 yet.
>> Code is basically the same for the last few releases.
>>
>> So it seems that workqueue processing really changed here under memory pressure.
>>
>> Milan
>>
>> p.s.
>> Anyway, if you are able to reproduce it and you think that there is problem
>> in per-device dm-crypt workqueue, there are patches from Andi for shared
>> per-cpu workqueue, maybe it can help here. (But this is really not RC material.)
>>
>> Unfortunately not yet in dm-devel tree, but I have them here ready for review:
>> http://mbroz.fedorapeople.org/dm-crypt/2.6.36-devel/
>> (all 4 patches must be applied, I hope Alasdair will put them in dm quilt soon.)
>
> Okay, spent the whole day reproducing the problem and trying to
> determine what's going on. In the process, I've found a bug and a
> potential issue (not sure whether it's an actual issue which should be
> fixed for this release yet) but the hang doesn't seem to have anything
> to do with workqueue update. All the queues are behaving exactly as
> expected during hang.
>
> Also, it isn't a regression. I can reliably trigger the same deadlock
> on v2.6.35.
>
> Here's the setup, which should be mostly similar to Torsten's setup I
> used to trigger the problem.
>
> The machine is dual quad-core Opteron (8 phys cores) w/ 4GiB memory.
>
> * 80GB raid1 of two SATA disks
> * On top of that, luks encrypted device w/ twofish-cbc-essiv:sha256
> * In the encrypted device, xfs filesystem which hosts 8GiB swapfile
> * 12GiB tmpfs
>
> The workload is v2.6.35 allyesconfig -j 128 build in the tmpfs. Not
> too long after swap starts being used (several tens of secs), the
> system hangs. IRQ handling and all are fine but no IO gets through
> with a lot of tasks stuck in bio allocation somewhere.
>
> I suspected that with md and dm stacked together, something in the
> upper layer ended up exhausting a shared bio pool and tried a couple
> of things but haven't succeeded at finding where the culprit is. It
> probably would be best to run blktrace together and analyze how IO
> gets stuck.
>
> So, well, we seem to be broken the same way as before. No need to
> delay release for this one.
I instrumented mm/mempool.c, trying to find what shared pool gets exhausted.
On the last run, it seemed that the fs_bio_set from fs/bio.c runs dry.
As far as I can see, that pool is used by bio_alloc() and bio_clone().
Above bio_alloc() a dire warning says that any bio allocated that way
needs to be submitted for IO, otherwise the system could livelock.
bio_clone() does not have this warning, but as it uses the same pool
in the same way, I would expect the same rule applies.
Looking for uses of bio_alloc() and bio_clone() in drivers/md it looks
like dm-crypt uses its own pools and not the fs_bio_set.
But drivers/md/raid1.c uses this pool, and in my eyes it does it wrong.
When writing to a RAID1 array the function make_request() in raid1.c
does a bio_clone() for each drive (lines 967-1001 in 2.6.36-rc7) and
only after all bios are allocated they will be merged into the
pending_bio_list.
So a RAID1 with 3 mirrors is a sure way to lock up a system as soon as
the mempool is needed?
(The fs_bio_set pool only allocates BIO_POOL_SIZE entries and that is
defined as 2)
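
The concern can be shown with a toy model of the reserve pool. It is not the kernel's mempool code: `struct toy_pool`, `toy_pool_try_alloc()` and `clone_for_all_mirrors()` are hypothetical, and the model only covers the situation where normal allocation has already failed and the reserved entries are all that is left. A failed "try" here stands for the point where a real mempool allocation would go to sleep:

```c
#include <stddef.h>

#define BIO_POOL_SIZE 2   /* value quoted in this mail */

/* Hypothetical reserve pool: just a count of free reserved entries. */
struct toy_pool {
    size_t free_entries;
};

/*
 * Non-blocking stand-in for an allocation from the reserve.
 * Returns 1 on success, 0 where the real allocation would sleep.
 */
static int toy_pool_try_alloc(struct toy_pool *p)
{
    if (p->free_entries == 0)
        return 0;
    p->free_entries--;
    return 1;
}

/*
 * Take one entry per mirror before submitting any of them.
 * Returns how many entries were obtained.
 */
static size_t clone_for_all_mirrors(struct toy_pool *p, size_t mirrors)
{
    size_t got = 0;

    while (got < mirrors && toy_pool_try_alloc(p))
        got++;
    return got;
}
```

With two mirrors the loop gets both entries and can go on to submit them. With three mirrors it stops after two: the third request can only be satisfied by an entry being returned, but the only holder of the outstanding entries is the loop itself, which has not submitted anything yet.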
From the use of atomic_inc(&r1_bio->remaining) and the use of the
spin_lock_irqsave(&conf->device_lock, flags) when merging the bio
list, I would suspect that it's even possible that multiple CPUs
concurrently get into this allocation loop, or that the use of
multiple RAID1 devices each with only 2 drives could lock up the same
way.
What am I missing, or is the use of bio_clone() really the wrong thing?
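To make the concern above concrete, here is a minimal user-space sketch of the allocation pattern, not kernel code. It assumes the worst case under memory pressure, where only the pool's reserved entries are available, and it replaces the sleeping allocation with a non-blocking one that returns false where the real allocation would wait. All toy_* names and both helper functions are hypothetical; only the reserve size of 2 comes from the BIO_POOL_SIZE value quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: reserve size taken from BIO_POOL_SIZE as quoted above. */
#define POOL_RESERVE 2

struct toy_pool {
	int free_entries;	/* reserved entries currently available */
};

/*
 * Non-blocking stand-in for an allocation from the reserve.
 * Returns false where a sleeping allocation would wait for a free.
 */
static bool toy_pool_try_alloc(struct toy_pool *pool)
{
	if (pool->free_entries == 0)
		return false;
	pool->free_entries--;
	return true;
}

static void toy_pool_free(struct toy_pool *pool)
{
	pool->free_entries++;
}

/*
 * Pattern described above: take one clone per mirror and submit nothing
 * until all clones exist. Returns how many clones were obtained; fewer
 * than nr_mirrors means the caller would stall while holding entries
 * that nobody else can free.
 */
static int clone_all_then_submit(struct toy_pool *pool, int nr_mirrors)
{
	int got = 0;

	while (got < nr_mirrors && toy_pool_try_alloc(pool))
		got++;
	return got;
}

/*
 * Alternative: submit each clone before allocating the next one.
 * Completion is modelled as an immediate free, which is a simplification;
 * the point is only that a submitted entry eventually returns to the pool.
 */
static int clone_and_submit_each(struct toy_pool *pool, int nr_mirrors)
{
	int submitted = 0;

	for (int i = 0; i < nr_mirrors; i++) {
		if (!toy_pool_try_alloc(pool))
			break;
		toy_pool_free(pool);	/* stands in for IO completion */
		submitted++;
	}
	return submitted;
}
```

With a reserve of 2 and three mirrors, clone_all_then_submit() gets stuck after two clones, while clone_and_submit_each() gets through all three. Two writers on two-drive arrays that each take one entry before asking for their second end up in the same position in this model.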
Torsten
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH wq#for-next] workqueue: fix HIGHPRI handling in keep_working()
2010-10-08 17:02 ` Tejun Heo
2010-10-10 11:56 ` Torsten Kaiser
@ 2010-10-11 10:09 ` Tejun Heo
1 sibling, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2010-10-11 10:09 UTC (permalink / raw)
To: Milan Broz, Linus Torvalds
Cc: device-mapper development, Linux Kernel Mailing List,
just.for.lkml, hch, herbert
The policy function keep_working() didn't check GCWQ_HIGHPRI_PENDING
and could return %false with highpri work pending. This could lead to
late execution of a highpri work which was delayed due to @max_active
throttling if other works are actively consuming CPU cycles.
For example, the following could happen.
1. Work W0 which burns CPU cycles.
2. Two works W1 and W2 are queued to a highpri wq w/ @max_active of 1.
3. W1 starts executing and W2 is put to delayed queue. W0 and W1 are
both runnable.
4. W1 finishes which puts W2 to pending queue but keep_working()
incorrectly returns %false and the worker goes to sleep.
5. W0 finishes and W2 starts execution.
With this patch applied, W2 starts execution as soon as W1 finishes.
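The scenario above can be checked with a small user-space model of the policy function. This is only a sketch: struct toy_gcwq, TOY_HIGHPRI_PENDING and step4_state() are hypothetical stand-ins, and the state at step 4 assumes that the workers running W0 and W1 are both counted as running when the check is made.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GCWQ_HIGHPRI_PENDING. */
#define TOY_HIGHPRI_PENDING (1u << 0)

/* Reduced model of the state that keep_working() looks at. */
struct toy_gcwq {
	bool worklist_empty;
	int nr_running;
	unsigned int flags;
};

/* Policy before the patch: ignores pending highpri work. */
static bool keep_working_before(const struct toy_gcwq *g)
{
	return !g->worklist_empty && g->nr_running <= 1;
}

/* Policy after the patch: pending highpri work keeps the worker going. */
static bool keep_working_after(const struct toy_gcwq *g)
{
	return !g->worklist_empty &&
		(g->nr_running <= 1 || (g->flags & TOY_HIGHPRI_PENDING));
}

/*
 * Assumed state at step 4: W2 has just moved to the pending list, and
 * the workers for W0 and W1 are both still counted as running.
 */
static struct toy_gcwq step4_state(void)
{
	struct toy_gcwq g = {
		.worklist_empty = false,
		.nr_running = 2,
		.flags = TOY_HIGHPRI_PENDING,
	};
	return g;
}
```

In this model keep_working_before() returns false for the step 4 state, so the worker would go to sleep, while keep_working_after() returns true and W2 would be picked up immediately.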
Signed-off-by: Tejun Heo <tj@kernel.org>
---
This is the workqueue bug I've found while trying to debug the dm/raid
hang. Although the bug may introduce unexpected delay in scheduling a
highpri work, the delay can only be as long as the combined length of
CPU cycle burns of the already running works. Given that HIGHPRI is
currently only used by xfs and its usage, I don't think it's likely to
cause an actual issue. I'll queue it for #for-next.
Thank you.
kernel/workqueue.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f77afd9..d355278 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -604,7 +604,9 @@ static bool keep_working(struct global_cwq *gcwq)
{
atomic_t *nr_running = get_gcwq_nr_running(gcwq->cpu);
- return !list_empty(&gcwq->worklist) && atomic_read(nr_running) <= 1;
+ return !list_empty(&gcwq->worklist) &&
+ (atomic_read(nr_running) <= 1 ||
+ gcwq->flags & GCWQ_HIGHPRI_PENDING);
}
/* Do we need a new worker? Called from manager. */
--
1.7.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* fanotify: disable fanotify syscalls
2010-10-08 21:59 ` H. Peter Anvin
@ 2010-10-11 22:13 ` Eric Paris
0 siblings, 0 replies; 29+ messages in thread
From: Eric Paris @ 2010-10-11 22:13 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andreas Gruenbacher, David Daney, Alan Cox, Tvrtko Ursulin,
Linux Kernel Mailing List, Christoph Hellwig, linux-fsdevel,
H. Peter Anvin
This patch disables the fanotify syscalls by just not building them and
letting the cond_syscall() statements in kernel/sys_ni.c redirect them
to sys_ni_syscall().
It was pointed out by Tvrtko Ursulin that the fanotify interface did not
include an explicit prioritization between groups. This is necessary
for fanotify to be usable for hierarchical storage management software,
as they must get first access to the file, before inotify-like notifiers
see the file.
This feature can be added in an ABI compatible way in the next release
(by using a number of bits in the flags field to carry the info) but it
was suggested by Alan that maybe we should just hold off and do it in
the next cycle, likely with an (new) explicit argument to the syscall.
I don't like this approach best as I know people are already starting to
use the current interface, but Alan is all wise and no one on list backed
me up with just using what we have. I feel this is needlessly ripping
the rug out from under people at the last minute, but if others think it
needs to be a new argument it might be the best way forward.
Three choices:
1. Go with what we got (and implement the new feature next cycle).
2. Add a new field right now (and implement the new feature next cycle).
3. Wait till next cycle to release the ABI (and implement the new feature
next cycle).
This is number 3.
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/Kconfig | 2 +-
include/linux/Kbuild | 1 -
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/notify/Kconfig b/fs/notify/Kconfig
index 22c629e..b388443 100644
--- a/fs/notify/Kconfig
+++ b/fs/notify/Kconfig
@@ -3,4 +3,4 @@ config FSNOTIFY
source "fs/notify/dnotify/Kconfig"
source "fs/notify/inotify/Kconfig"
-source "fs/notify/fanotify/Kconfig"
+#source "fs/notify/fanotify/Kconfig"
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 626b629..4e8ea8c 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -118,7 +118,6 @@ header-y += eventpoll.h
header-y += ext2_fs.h
header-y += fadvise.h
header-y += falloc.h
-header-y += fanotify.h
header-y += fb.h
header-y += fcntl.h
header-y += fd.h
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: Linux 2.6.36-rc7
2010-10-08 16:41 ` Eric Paris
@ 2010-10-18 11:01 ` Tvrtko Ursulin
0 siblings, 0 replies; 29+ messages in thread
From: Tvrtko Ursulin @ 2010-10-18 11:01 UTC (permalink / raw)
To: Eric Paris
Cc: John Stoffel, Linus Torvalds, Linux Kernel Mailing List,
agruen@suse.de
On Friday 08 Oct 2010 17:41:46 Eric Paris wrote:
> On Fri, 2010-10-08 at 17:17 +0100, Tvrtko Ursulin wrote:
> > On Friday 08 Oct 2010 16:42:06 John Stoffel wrote:
> >
> > Priority is also not that great concept. I may have proposed classes or
> > something similar at some point, don't remember any more. It would be
> > equivalent to having allocated priority ranges, like:
> >
> >   >1000 - pre-content
> >   >=100 - access-control
> >   <100 - content
> >
> >
> > Doesn't really solve ordering inside groups so maybe we do not need
> > priorities at all just these three classes?
>
> I originally thought of trying to enumerate the types of users and came
> up with the same 3 you did. Then I thought it better to give a general
> priority field which we could indicate in documentation something like
> those 3 classes (exactly like you did above). I don't want to hard code
> some limited number of types of users into the interface. (OK, it's going
> to be limited, I was thinking 8 bits, but maybe others think we need more?)
>
> As an extreme example going with 3 fixed type of users (and thus
> equivalently only 3 priorities) would not allow for hierarchies of
> hierarchical storage managers. What if priority MAX only brought in
> enough info for priority MAX-1 to bring in the real file? If they had
> to share the single 'pre-content' priority we have another ordering
> problem.
I think I am fine with priorities if we make these three ranges documented.
Ordering between ranges, like in your chained HSM setup, would then be
sysadmin's responsibility but those setups look esoteric enough to think we
could get away with it. More importantly, I really do not see an automatic
solution which would solve all imaginable (more or less crazy) setups one
could come up with and priorities solve all normal ones plus giving some more
options than having only three classes.
If you go with 8 bits, you could even have classes plus priorities: the low
bits for in-class priority and the high two or three bits for the class
(three if an implementation with a single bit set in the class field would be
more efficient). I do not see any usability advantage in this scheme though;
it just may enable an implementation with fewer decisions based on arbitrary
numbers.
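One way the 8-bit layout described above could look, purely as an illustration: three high bits carry the class with a single bit set, and five low bits carry the in-class priority. None of these names exist in the fanotify interface; the toy_* identifiers, the 3/5 bit split and the rule that a larger value is served earlier are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [7:5] class (single bit set), [4:0] in-class priority. */
#define TOY_PRIO_BITS	5
#define TOY_PRIO_MASK	((1u << TOY_PRIO_BITS) - 1)

enum toy_class {
	TOY_CLASS_CONTENT	 = 1u << 0,
	TOY_CLASS_ACCESS_CONTROL = 1u << 1,
	TOY_CLASS_PRE_CONTENT	 = 1u << 2,
};

/* Pack a class and an in-class priority (0..31) into one byte. */
static uint8_t toy_pack(enum toy_class cls, unsigned int prio)
{
	return (uint8_t)(((unsigned int)cls << TOY_PRIO_BITS) |
			 (prio & TOY_PRIO_MASK));
}

static unsigned int toy_class_of(uint8_t v)
{
	return v >> TOY_PRIO_BITS;
}

static unsigned int toy_prio_of(uint8_t v)
{
	return v & TOY_PRIO_MASK;
}

/* With one bit per class, a class test is a single mask operation. */
static int toy_is_pre_content(uint8_t v)
{
	return (toy_class_of(v) & TOY_CLASS_PRE_CONTENT) != 0;
}
```

Because the class sits in the high bits, comparing the packed bytes as plain integers orders groups by class first and by in-class priority second, so no comparison against arbitrary thresholds is needed.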
Tvrtko
Sophos Plc, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP, United Kingdom.
Company Reg No 2096520. VAT Reg No GB 348 3873 20.
^ permalink raw reply [flat|nested] 29+ messages in thread
Thread overview: 29+ messages
2010-10-06 21:45 Linux 2.6.36-rc7 Linus Torvalds
2010-10-07 0:49 ` Stephen Rothwell
2010-10-08 15:05 ` James Bottomley
2010-10-07 16:10 ` Tvrtko Ursulin
2010-10-07 17:15 ` Tvrtko Ursulin
2010-10-07 17:33 ` Eric Paris
2010-10-07 18:07 ` Alan Cox
2010-10-07 17:49 ` Eric Paris
2010-10-08 12:06 ` Andreas Gruenbacher
2010-10-08 16:33 ` David Daney
2010-10-08 21:50 ` Andreas Gruenbacher
2010-10-08 21:59 ` H. Peter Anvin
2010-10-11 22:13 ` fanotify: disable fanotify syscalls Eric Paris
2010-10-08 16:38 ` Linux 2.6.36-rc7 Eric Paris
2010-10-08 21:45 ` Andreas Gruenbacher
2010-10-07 20:55 ` John Stoffel
2010-10-07 21:24 ` Eric Paris
2010-10-08 15:42 ` John Stoffel
2010-10-08 16:17 ` Tvrtko Ursulin
2010-10-08 16:41 ` Eric Paris
2010-10-18 11:01 ` Tvrtko Ursulin
2010-10-08 16:54 ` John Stoffel
2010-10-08 21:03 ` Andreas Gruenbacher
2010-10-09 0:46 ` John Stoffel
2010-10-07 19:28 ` Tejun Heo
2010-10-07 20:13 ` [dm-devel] " Milan Broz
2010-10-08 17:02 ` Tejun Heo
2010-10-10 11:56 ` Torsten Kaiser
2010-10-11 10:09 ` [PATCH wq#for-next] workqueue: fix HIGHPRI handling in keep_working() Tejun Heo