* 2.4.19pre9aa1
@ 2002-05-30  1:01 Andrea Arcangeli
  2002-05-30  0:38 ` 2.4.19pre9aa1 Marcelo Tosatti
  2002-05-30  1:32 ` 2.4.19pre9aa1 William Lee Irwin III
  0 siblings, 2 replies; 6+ messages in thread

From: Andrea Arcangeli @ 2002-05-30  1:01 UTC (permalink / raw)
To: linux-kernel

NOTE: this release is highly experimental; while it has been solid so
far, it's not well tested yet, so please don't use it in production
environments! (yet :)

The o1 scheduler integration will take a few weeks to settle and to
compile on all archs. I would suggest the big-iron folks give this
kernel a spin, in particular for o1 and the shm-rmid, p4/pmd and
inode-leak fixes. The only rejected feature has been the node-affine
allocation of per-cpu data structures in the numa-sched (it matters only
for NUMA, but o1 is the more sensible optimization for NUMA anyway).

Currently only x86 and alpha compile and run as expected. x86-64, ia64,
ppc, s390* and sparc64 don't compile yet. uml, worst of all, compiles
but doesn't run correctly :) Actually it runs pretty well too; it simply
hangs sometimes, and you have to press a key in the terminal before it
resumes as if nothing had happened.

URL:

	http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa1.gz
	http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre9aa1/

Diff between 2.4.19pre8aa3 and 2.4.19pre9aa1, besides migrating to pre9:
Only in 2.4.19pre8aa3: 00_block-highmem-all-18b-11.gz
Only in 2.4.19pre9aa1: 00_block-highmem-all-18b-12.gz
Only in 2.4.19pre8aa3: 00_x86-fast-pte-1
Only in 2.4.19pre9aa1: 10_x86-fast-pte-2
Only in 2.4.19pre8aa3: 20_pte-highmem-24
Only in 2.4.19pre9aa1: 20_pte-highmem-25
Only in 2.4.19pre8aa3: 30_x86_setup-boot-cleanup-3
Only in 2.4.19pre9aa1: 30_x86_setup-boot-cleanup-4
Only in 2.4.19pre8aa3: 90_init-survive-threaded-race-2
Only in 2.4.19pre9aa1: 90_init-survive-threaded-race-3
Only in 2.4.19pre8aa3: 91_zone_start_pfn-3
Only in 2.4.19pre9aa1: 91_zone_start_pfn-5
Only in 2.4.19pre8aa3: 93_NUMAQ-1
Only in 2.4.19pre9aa1: 93_NUMAQ-2

	Rediffed.

Only in 2.4.19pre8aa3: 00_compile-nfsroot-1
Only in 2.4.19pre8aa3: 00_initrd-free-2
Only in 2.4.19pre8aa3: 00_ufs-compile-1

	Merged in mainline.

Only in 2.4.19pre8aa3: 00_cpu-affinity-rml-3

	Dropped (it collided with o1, and it's not needed).

Only in 2.4.19pre8aa3: 00_cpu-affinity-syscall-rml-2.4.19-pre7-1
Only in 2.4.19pre9aa1: 00_cpu-affinity-syscall-rml-2.4.19-pre9-1

	Ported to the o1 sched with the -ac patch in rml/sched.

Only in 2.4.19pre8aa3: 00_dnotify-cleanup-1
Only in 2.4.19pre9aa1: 00_dnotify-cleanup-2

	Just keep the leftover out deletion in -ac.

Only in 2.4.19pre9aa1: 00_ext2-ext3-warning-1

	Warn when mounting an ext3 as an ext2, from Andrew Morton.

Only in 2.4.19pre9aa1: 00_free_pgtable-and-p4-tlb-race-fixes-1

	Fix the pagetable-freeing races introduced by the speculative
	random userspace tlb fills of the p4, as discussed on l-k. Also
	fix a definitive kernel SMP race condition in free_pgtables.

Only in 2.4.19pre8aa3: 00_get_pid-no-deadlock-and-boosted-3
Only in 2.4.19pre9aa1: 10_get_pid-no-deadlock-and-boosted-4

	Discard the inferior attempt in pre9 and rediff (as Ihno
	noticed, in practice the complexity dominates, and if you fill
	the pid space the fix in mainline is useless anyway). I wonder
	why this much better fix hasn't been merged instead (it has been
	submitted for both 2.2 and 2.4).
	This also fixes a longstanding fork race, present even in 2.2,
	that can lead to two tasks getting the same pid.

Only in 2.4.19pre9aa1: 00_negative-dentry-waste-ram-1

	Collect negative dentries after unlink or creat failure
	(discussed on l-k; for 2.5 a very-low-prio lru list can be
	implemented instead).

Only in 2.4.19pre8aa3: 00_rcu-poll-5
Only in 2.4.19pre9aa1: 10_rcu-poll-6

	Ported to the o1 scheduler. This has a fix in
	force_cpu_reschedule compared to the rcu-poll patch in 2.5, and
	it also saves cachelines by coalescing the per-cpu quiescent
	sequence number into the runqueue structure. The quiescent
	counter is increased at every schedule() call (i.e. every time
	we reach the quiescent point), so it is optimal to coalesce it
	into the same cacheline as the other fields later used by
	schedule(). The per-cpu quiescent++ is the _only_ fixed cost of
	rcu-poll; this was a design choice to keep the fast-path
	overhead as low as possible.

Only in 2.4.19pre9aa1: 00_sched-O1-rml-2.4.19-pre9-1.gz

	The 2.5 O(1) scheduler from Ingo Molnar, backported to 2.4 by
	Robert Love.

Only in 2.4.19pre9aa1: 00_shm_destroy-deadlock-1

	Fix an SMP deadlock due to scheduling with a spinlock held in
	IPC_RMID (fput is a blocking operation).

Only in 2.4.19pre9aa1: 00_vm86-pagetablelock-1

	Take the pagetable lock in the vm86 pagetable walking, from
	Benjamin LaHaise.

Only in 2.4.19pre9aa1: 02_sched-19pre8ac5-1

	Use the wq locks so it stays generic and the switch in wait.h
	doesn't break. From the -ac o1 sched comparison.

Only in 2.4.19pre9aa1: 02_sched-alpha-1

	alpha updates to make o1 work. This is also a good tutorial for
	porting all the other archs.

Only in 2.4.19pre9aa1: 02_sched-sparc64-1

	A little attempt that only takes care of two bits; lots of
	stuff is still missing for sparc64 and x86-64, so they won't
	compile at the moment.

Only in 2.4.19pre9aa1: 02_sched-x86-1

	Additional fix for x86 (needed because of the rcu-poll changes
	that make the runqueue structure visible to the common code).
Only in 2.4.19pre8aa3: 05_vm_03_vm_tunables-1
Only in 2.4.19pre9aa1: 05_vm_03_vm_tunables-2
Only in 2.4.19pre8aa3: 05_vm_06_swap_out-1
Only in 2.4.19pre9aa1: 05_vm_06_swap_out-2
Only in 2.4.19pre8aa3: 05_vm_07_local_pages-1
Only in 2.4.19pre9aa1: 05_vm_07_local_pages-2
Only in 2.4.19pre8aa3: 05_vm_08_try_to_free_pages_nozone-1
Only in 2.4.19pre9aa1: 05_vm_08_try_to_free_pages_nozone-2
Only in 2.4.19pre8aa3: 05_vm_09_misc_junk-1
Only in 2.4.19pre9aa1: 05_vm_09_misc_junk-2
Only in 2.4.19pre8aa3: 05_vm_17_rest-4
Only in 2.4.19pre9aa1: 05_vm_17_rest-7

	Various random rediffing to sync with the o1 scheduler
	introduction.

Only in 2.4.19pre9aa1: 05_vm_18_buffer-page-uptodate-1

	Optimization to avoid losing the page-uptodate information
	while dropping the bh, from Andrew Morton.

Only in 2.4.19pre9aa1: 10_inode-highmem-1

	Avoid letting highmem pagecache pin inodes in memory
	indefinitely; if we detect the "pinned inode" condition we
	shrink the cache. This should fix the last (known :) highmem vm
	unbalance problem.

Only in 2.4.19pre8aa3: 10_numa-sched-18

	Dropped in favour of the o1 scheduler, but it's not obsoleted
	by the o1 scheduler; for instance all the runqueues should go
	into zone-local memory so that even the spinlocks are local,
	etc.

Only in 2.4.19pre9aa1: 10_o1-sched-64-cpu-1

	Fix a 64bit bug in the o1 scheduler.

Only in 2.4.19pre9aa1: 10_o1-sched-fixes-1

	Other o1 sched fixes.

Only in 2.4.19pre9aa1: 10_o1-sched-nfs-1

	Fix nfs to compile with the o1 scheduler.

Only in 2.4.19pre8aa3: 10_parent-timeslice-9
Only in 2.4.19pre9aa1: 10_parent-timeslice-10

	Port this longstanding scheduler fix in the "share" timeslice
	algorithm on top of the o1 scheduler (with o1 the coding of the
	algorithm changed but the bug was still there). This bug has
	been noticed in real-life workloads that rendered the machine
	non-interactive during intensive bash/exec/wait workloads.

Only in 2.4.19pre8aa3: 10_tlb-state-2
Only in 2.4.19pre9aa1: 10_tlb-state-3

	The mainline patch is inferior; backed it out and resurrected
	this one.
Only in 2.4.19pre8aa3: 20_share-timeslice-2

	Obsoleted by the o1 scheduler. I had missed the code in
	wake_up_forked_process; I didn't verify with gdb that it really
	works as expected in practice, but the code looked ok. We'll
	see with the next round of benchmarks from Randy (the fact that
	2.5 wasn't forking as fast as previous -aa makes me think it
	might be related to a difference here).

Only in 2.4.19pre8aa3: 30_dyn-sched-6

	Obsoleted in favour of o1, which also provides similar dynamic
	priority.

Only in 2.4.19pre8aa3: 50_uml-patch-2.4.18-25.gz
Only in 2.4.19pre9aa1: 50_uml-patch-2.4.18-30.gz

	New patch from Jeff. NOTE: this compiles with o1 but it
	sometimes hangs until you press a character on the konsole, and
	then it restarts as if it had never hung. I checked that
	schedule_ticks keeps running and that rq->nr_running is zero,
	so I'm not sure what's wrong at the moment. It could even be a
	bug between -29 and -30; I tested only -30 with o1, OTOH -29
	was working fine w/o o1. Jeff, could you give 2.4.19pre9aa1 a
	spin and see what's wrong with uml? (It sounds like a problem
	with the uml internals that you certainly know better than
	anyone else :)

Only in 2.4.19pre8aa3: 51_uml-ac-to-aa-8
Only in 2.4.19pre9aa1: 51_uml-ac-to-aa-9
Only in 2.4.19pre9aa1: 51_uml-o1-1
Only in 2.4.19pre8aa3: 57_uml-dyn_sched-1
Only in 2.4.19pre8aa3: 59_uml-yield-1
Only in 2.4.19pre9aa1: 59_uml-yield-2

	Various updates to compile with o1.

Only in 2.4.19pre8aa3: 60_show-stack-1
Only in 2.4.19pre9aa1: 62_tux-dump-stack-1

	Use dump_stack instead.

Only in 2.4.19pre8aa3: 60_tux-exports-3
Only in 2.4.19pre9aa1: 60_tux-exports-4
Only in 2.4.19pre8aa3: 60_tux-kstat-3
Only in 2.4.19pre9aa1: 60_tux-kstat-4
Only in 2.4.19pre9aa1: 60_tux-o1-1

	Further o1 scheduler updates, this time for tux.

Only in 2.4.19pre8aa3: 70_xfs-1.1-1.gz
Only in 2.4.19pre9aa1: 70_xfs-1.1-2.gz
Only in 2.4.19pre9aa1: 75_compile-dmapi-1

	Rediffed.
Only in 2.4.19pre8aa3: 80_x86_64-common-code-3
Only in 2.4.19pre9aa1: 80_x86_64-common-code-4
Only in 2.4.19pre8aa3: 82_x86-64-compile-aa-5
Only in 2.4.19pre9aa1: 82_x86-64-compile-aa-6

	Some fixups for rejects and the first o1 bits, but not complete
	yet.

Andrea

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: 2.4.19pre9aa1
  2002-05-30  1:01 2.4.19pre9aa1 Andrea Arcangeli
@ 2002-05-30  0:38 ` Marcelo Tosatti
  2002-05-30  1:43   ` 2.4.19pre9aa1 Andrea Arcangeli
  2002-05-30  1:32 ` 2.4.19pre9aa1 William Lee Irwin III
  1 sibling, 1 reply; 6+ messages in thread

From: Marcelo Tosatti @ 2002-05-30  0:38 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel

On Thu, 30 May 2002, Andrea Arcangeli wrote:

> Only in 2.4.19pre8aa3: 00_get_pid-no-deadlock-and-boosted-3
> Only in 2.4.19pre9aa1: 10_get_pid-no-deadlock-and-boosted-4
>
> Discard the inferior attempt in pre9 and rediff (as Ihno noticed in
> practice the complexity dominates, if you fill the pid space the fix in
> mainline is useless anyways). Wonder why this much better fix isn't
> been merged instead (it is been submitted for both 2.2 and 2.4).
> This also fix a longstanding fork race present even in 2.2 that can
> lead to two tasks getting the same pid.

Could you be more verbose in explaining the problems with the current
approach and the advantages of your patch?

Thanks

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: 2.4.19pre9aa1
  2002-05-30  0:38 ` 2.4.19pre9aa1 Marcelo Tosatti
@ 2002-05-30  1:43   ` Andrea Arcangeli
  0 siblings, 0 replies; 6+ messages in thread

From: Andrea Arcangeli @ 2002-05-30  1:43 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1180 bytes --]

On Wed, May 29, 2002 at 09:38:29PM -0300, Marcelo Tosatti wrote:
> On Thu, 30 May 2002, Andrea Arcangeli wrote:
>
> > Only in 2.4.19pre8aa3: 00_get_pid-no-deadlock-and-boosted-3
> > Only in 2.4.19pre9aa1: 10_get_pid-no-deadlock-and-boosted-4
> >
> > Discard the inferior attempt in pre9 and rediff (as Ihno noticed in
> > practice the complexity dominates, if you fill the pid space the fix in
> > mainline is useless anyways). Wonder why this much better fix isn't
> > been merged instead (it is been submitted for both 2.2 and 2.4).
> > This also fix a longstanding fork race present even in 2.2 that can
> > lead to two tasks getting the same pid.
>
> Could you be more verbose in explaining the problems with the current
> approach and the advantages of your patch ?

See the attached email (I got no reply, IIRC), which I sent to you and
Linus some weeks ago. Look at it only for the commentary: the latest
version of the patch is 10_get_pid-no-deadlock-and-boosted-4, which btw
probably won't apply cleanly anymore against pristine pre9 mainline,
because I put o1 at the very top of the patch-chain, since it's the
thing most likely to need updates at the moment.

Andrea

[-- Attachment #2: Type: message/rfc822, Size: 7235 bytes --]

From: Andrea Arcangeli <andrea@suse.de>
To: Marcelo Tosatti <marcelo@conectiva.com.br>
Cc: linux-kernel@vger.kernel.org, Ihno Krumreich <ihno@suse.de>, Linus Torvalds <torvalds@transmeta.com>
Subject: get_pid fixes against 2.4.19pre7
Date: Fri, 26 Apr 2002 13:44:09 +0200
Message-ID: <20020426134409.C19278@dualathlon.random>

Hello,

Could you have a look at these get_pid fixes?
Besides fixing the deadlock when running out of pids and reducing the
complexity of get_pid() from quadratic to linear, it also addresses a
longstanding non-trivial race, present in 2.2 too, that can lead to pid
collisions even on UP (noticed today while merging two more fixes from
Ihno on the other part). The fix reduces the scalability of
simultaneous forks from different cpus a bit, but at least it's
obviously correct. Putting non-ready tasks into the tasklists asks for
trouble (signals...). For more details see the comment at the end of
the patch. If you have suggestions they're welcome, thanks.

Patch in pre7aa2 against 2.4.19pre7:

diff -urN 2.4.19pre7/include/linux/threads.h get_pid-1/include/linux/threads.h
--- 2.4.19pre7/include/linux/threads.h	Thu Apr 18 07:51:30 2002
+++ get_pid-1/include/linux/threads.h	Fri Apr 26 09:10:30 2002
@@ -19,6 +19,6 @@
 /*
  * This controls the maximum pid allocated to a process
  */
-#define PID_MAX 0x8000
+#define PID_NR 0x8000
 
 #endif
diff -urN 2.4.19pre7/kernel/fork.c get_pid-1/kernel/fork.c
--- 2.4.19pre7/kernel/fork.c	Tue Apr 16 08:12:09 2002
+++ get_pid-1/kernel/fork.c	Fri Apr 26 10:25:33 2002
@@ -37,6 +37,12 @@
 
 struct task_struct *pidhash[PIDHASH_SZ];
 
+/*
+ * Protectes next_unsafe, last_pid and it avoids races
+ * between get_pid and SET_LINKS().
+ */
+static DECLARE_MUTEX(getpid_mutex);
+
 void add_wait_queue(wait_queue_head_t *q, wait_queue_t * wait)
 {
 	unsigned long flags;
@@ -79,51 +85,105 @@
 	init_task.rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
 }
 
-/* Protects next_safe and last_pid. */
-spinlock_t lastpid_lock = SPIN_LOCK_UNLOCKED;
-
+/*
+ * Get the next free pid for a new process/thread.
+ *
+ * Strategy: last_pid and next_unsafe (excluded) are an interval where all pids
+ * are free, so next pid is just last_pid + 1 if it's also < next_unsafe.
+ * If last_pid + 1 >= next_unsafe the interval is completely used.
+ * In this case a bitmap with all used pids/tgids/pgrp/seesion is
+ * is created. This bitmap is looked for the next free pid and next_unsafe.
+ * If all pids are used, a kernel warning is issued.
+ */
 static int get_pid(unsigned long flags)
 {
-	static int next_safe = PID_MAX;
+	static int next_unsafe = PID_NR;
+#define PID_FIRST 2 /* pid 1 is init, first usable pid is 2 */
+#define PID_BITMAP_SIZE ((((PID_NR + 7) / 8) + sizeof(long) - 1 ) / (sizeof(long)))
+	/*
+	 * Even if this could be local per-thread, keep it static and protected by
+	 * the lock because we don't want to overflow the stack and we wouldn't
+	 * SMP scale better anyways. It doesn't waste disk space because it's in
+	 * the .bss.
+	 */
+	static unsigned long pid_bitmap[PID_BITMAP_SIZE];
+
+	/* from here the stuff on the stack */
 	struct task_struct *p;
-	int pid;
+	int pid, found_pid;
 
 	if (flags & CLONE_PID)
 		return current->pid;
 
-	spin_lock(&lastpid_lock);
-	if((++last_pid) & 0xffff8000) {
-		last_pid = 300;		/* Skip daemons etc. */
-		goto inside;
-	}
-	if(last_pid >= next_safe) {
-inside:
-		next_safe = PID_MAX;
+	pid = last_pid + 1;
+	if (pid >= next_unsafe) {
+		next_unsafe = PID_NR;
+		memset(pid_bitmap, 0, PID_BITMAP_SIZE*sizeof(long));
+
 		read_lock(&tasklist_lock);
-	repeat:
+
+		/*
+		 * Build the bitmap and calc next_unsafe.
+		 */
 		for_each_task(p) {
-			if(p->pid == last_pid ||
-			   p->pgrp == last_pid ||
-			   p->tgid == last_pid ||
-			   p->session == last_pid) {
-				if(++last_pid >= next_safe) {
-					if(last_pid & 0xffff8000)
-						last_pid = 300;
-					next_safe = PID_MAX;
+			set_bit(p->pid, pid_bitmap);
+			set_bit(p->pgrp, pid_bitmap);
+			set_bit(p->tgid, pid_bitmap);
+			set_bit(p->session, pid_bitmap);
+
+			if (next_unsafe > p->pid && p->pid > pid)
+				next_unsafe = p->pid;
+			if (next_unsafe > p->pgrp && p->pgrp > pid)
+				next_unsafe = p->pgrp;
+			if (next_unsafe > p->tgid && p->tgid > pid)
+				next_unsafe = p->tgid;
+			if (next_unsafe > p->session && p->session > pid)
+				next_unsafe = p->session;
+		}
+
+		/*
+		 * Release the tasklist_lock, after the unlock it may happen that
+		 * a pid is freed while it's still marked in use
+		 * in the pid_bitmap[].
+		 */
+		read_unlock(&tasklist_lock);
+
+		found_pid = find_next_zero_bit(pid_bitmap, PID_NR, pid);
+		if (found_pid >= PID_NR) {
+			next_unsafe = 0; /* depends on PID_FIRST > 0 */
+			found_pid = find_next_zero_bit(pid_bitmap, pid, PID_FIRST);
+			/* We scanned the whole bitmap without finding a free pid. */
+			if (found_pid >= pid) {
+				static long last_get_pid_warning;
+				if ((unsigned long) (jiffies - last_get_pid_warning) >= HZ) {
+					printk(KERN_NOTICE "No more PIDs (PID_NR = %d)\n", PID_NR);
+					last_get_pid_warning = jiffies;
 				}
-				goto repeat;
+				return -1;
+			}
+		}
+
+		pid = found_pid;
+
+		if (pid > next_unsafe) {
+			/* recalc next_unsafe by looking for the next bit set in the bitmap */
+			unsigned long * start = pid_bitmap;
+			unsigned long * p = start + (pid / (sizeof(long) * 8));
+			unsigned long * end = pid_bitmap + PID_BITMAP_SIZE;
+			unsigned long mask = ~((1UL << (pid & ((sizeof(long) * 8 - 1)))) - 1);
+
+			*p &= (mask << 1);
+
+			while (p < end) {
+				if (*p) {
+					next_unsafe = ffz(~*p) + (p - start) * sizeof(long) * 8;
+					break;
+				}
+				p++;
 			}
-			if(p->pid > last_pid && next_safe > p->pid)
-				next_safe = p->pid;
-			if(p->pgrp > last_pid && next_safe > p->pgrp)
-				next_safe = p->pgrp;
-			if(p->session > last_pid && next_safe > p->session)
-				next_safe = p->session;
 		}
-		read_unlock(&tasklist_lock);
 	}
-	pid = last_pid;
-	spin_unlock(&lastpid_lock);
+
+	last_pid = pid;
 
 	return pid;
 }
@@ -623,7 +683,10 @@
 	p->state = TASK_UNINTERRUPTIBLE;
 
 	copy_flags(clone_flags, p);
+	down(&getpid_mutex);
 	p->pid = get_pid(clone_flags);
+	if (p->pid < 0) /* valid pids are >= 0 */
+		goto bad_fork_cleanup;
 
 	p->run_list.next = NULL;
 	p->run_list.prev = NULL;
@@ -730,7 +793,17 @@
 		list_add(&p->thread_group, &current->thread_group);
 	}
 
+	/*
+	 * We must do the SET_LINKS() under the getpid_mutex, to avoid
+	 * another CPU to get our same PID between the release of of the
+	 * getpid_mutex and the SET_LINKS().
+	 *
+	 * In short to avoid SMP races the new child->pid must be just visible
+	 * in the tasklist by the time we drop the getpid_mutex.
+	 */
 	SET_LINKS(p);
+	up(&getpid_mutex);
+
 	hash_pid(p);
 	nr_threads++;
 	write_unlock_irq(&tasklist_lock);
@@ -757,6 +830,7 @@
 bad_fork_cleanup_files:
 	exit_files(p);	/* blocking */
 bad_fork_cleanup:
+	up(&getpid_mutex);
 	put_exec_domain(p->exec_domain);
 	if (p->binfmt && p->binfmt->module)
 		__MOD_DEC_USE_COUNT(p->binfmt->module);

Andrea

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: 2.4.19pre9aa1
  2002-05-30  1:01 2.4.19pre9aa1 Andrea Arcangeli
  2002-05-30  0:38 ` 2.4.19pre9aa1 Marcelo Tosatti
@ 2002-05-30  1:32 ` William Lee Irwin III
  2002-05-30  1:40   ` 2.4.19pre9aa1 Andrea Arcangeli
  1 sibling, 1 reply; 6+ messages in thread

From: William Lee Irwin III @ 2002-05-30  1:32 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel

On Thu, May 30, 2002 at 03:01:25AM +0200, Andrea Arcangeli wrote:
> NOTE: this release is highly experimental, while it worked solid so far
> it's not well tested yet, so please don't use in production
> environments! (yet :)
> The o1 scheduler integration will take a few weeks to settle and to
> compile on all archs. I would suggest the big-iron folks to give this
> kernel a spin, in particular for o1, shm-rmid fix, p4/pmd fix,
> inode-leak fix. The only rejected feature is been the node-affine
> allocations of per-cpu data structures in the numa-sched (matters only
> for numa, but o1 is more sensible optimization for numa anyways).
> Currently only x86 and alpha compiles and runs as expected. x86-64,
> ia64, ppc, s390*, sparc64 doesn't compile yet. uml worst of all compiles
> but it doesn't run correctly :), however it runs pretty well too, simply
> it hangs sometime and you've to press a key in the terminal and then it
> resumes as if nothing has happened.

I noticed what looked like missed wakeups in tty code in early 2.4.x
ports of the O(1) scheduler, though I saw a somewhat different failure
mode: the terminal echo would remain one character behind forever (and
if it happened again, more than one). I never got a real answer to
this, unfortunately, as it appeared to go away after a certain revision
of the scheduler. The failure mode you describe is slightly different,
but perhaps related.

And thanks for looking into shm; I understand that area is a bit
painful to work around, but fixes are certainly needed there.

Cheers,
Bill

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: 2.4.19pre9aa1
  2002-05-30  1:32 ` 2.4.19pre9aa1 William Lee Irwin III
@ 2002-05-30  1:40 ` Andrea Arcangeli
  2002-05-31 19:34   ` 2.4.19pre9aa1 Andrea Arcangeli
  0 siblings, 1 reply; 6+ messages in thread

From: Andrea Arcangeli @ 2002-05-30  1:40 UTC (permalink / raw)
To: William Lee Irwin III, linux-kernel

On Wed, May 29, 2002 at 06:32:00PM -0700, William Lee Irwin III wrote:
> On Thu, May 30, 2002 at 03:01:25AM +0200, Andrea Arcangeli wrote:
> > NOTE: this release is highly experimental, while it worked solid so far
> > it's not well tested yet, so please don't use in production
> > environments! (yet :)
> > The o1 scheduler integration will take a few weeks to settle and to
> > compile on all archs. I would suggest the big-iron folks to give this
> > kernel a spin, in particular for o1, shm-rmid fix, p4/pmd fix,
> > inode-leak fix. The only rejected feature is been the node-affine
> > allocations of per-cpu data structures in the numa-sched (matters only
> > for numa, but o1 is more sensible optimization for numa anyways).
> > Currently only x86 and alpha compiles and runs as expected. x86-64,
> > ia64, ppc, s390*, sparc64 doesn't compile yet. uml worst of all compiles
> > but it doesn't run correctly :), however it runs pretty well too, simply
> > it hangs sometime and you've to press a key in the terminal and then it
> > resumes as if nothing has happened.
>
> I noticed what looked like missed wakeups in tty code in early 2.4.x
> ports of the O(1) scheduler, though I saw a somewhat different failure
> mode, that is, the terminal echo would remain one character behind
> forever (and if it happened again, more than one). I never got a real
> answer to this, unfortunately, as it appeared to go away after a certain
> revision of the scheduler. The failure mode you describe is slightly
> different, but perhaps related.
Interesting, a tty problem could probably explain it. But since it is
reproducible only with uml, it should still be some uml internal that
broke, not a generic bug: there are no changes to the tty code, and
it's unlikely that only the tty code would break due to a generic o1
bug and that, additionally, the breakage would be reproducible only in
uml.

> And thanks for looking into shm, I understand that area is a bit
> painful to work around, but fixes are certainly needed there.

You're very welcome.

Andrea

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: 2.4.19pre9aa1
  2002-05-30  1:40 ` 2.4.19pre9aa1 Andrea Arcangeli
@ 2002-05-31 19:34   ` Andrea Arcangeli
  0 siblings, 0 replies; 6+ messages in thread

From: Andrea Arcangeli @ 2002-05-31 19:34 UTC (permalink / raw)
To: William Lee Irwin III, linux-kernel

On Thu, May 30, 2002 at 03:40:09AM +0200, Andrea Arcangeli wrote:
> On Wed, May 29, 2002 at 06:32:00PM -0700, William Lee Irwin III wrote:
> > On Thu, May 30, 2002 at 03:01:25AM +0200, Andrea Arcangeli wrote:
> > > NOTE: this release is highly experimental, while it worked solid so far
> > > it's not well tested yet, so please don't use in production
> > > environments! (yet :)
> > > The o1 scheduler integration will take a few weeks to settle and to
> > > compile on all archs. I would suggest the big-iron folks to give this
> > > kernel a spin, in particular for o1, shm-rmid fix, p4/pmd fix,
> > > inode-leak fix. The only rejected feature is been the node-affine
> > > allocations of per-cpu data structures in the numa-sched (matters only
> > > for numa, but o1 is more sensible optimization for numa anyways).
> > > Currently only x86 and alpha compiles and runs as expected. x86-64,
> > > ia64, ppc, s390*, sparc64 doesn't compile yet. uml worst of all compiles
> > > but it doesn't run correctly :), however it runs pretty well too, simply
> > > it hangs sometime and you've to press a key in the terminal and then it
> > > resumes as if nothing has happened.
> >
> > I noticed what looked like missed wakeups in tty code in early 2.4.x
> > ports of the O(1) scheduler, though I saw a somewhat different failure
> > mode, that is, the terminal echo would remain one character behind
> > forever (and if it happened again, more than one). I never got a real
> > answer to this, unfortunately, as it appeared to go away after a certain
> > revision of the scheduler. The failure mode you describe is slightly
> > different, but perhaps related.
>
> interesting, a tty problem could explain it

JFYI: the uml hang went away with 2.4.19pre9aa2, not sure why. I'm
starting to wonder whether it happened because I didn't run a full
'make distclean' while I was updating it; maybe it was miscompiled.

> reproducible only with uml it should be still some uml internal that
> broke, not a generic bug, there are no changes to the tty code and it's
> unlikely that only the tty code broke due a generic o1 bug and that
> additionally it is reproducible only in uml.
>
> >
> > And thanks for looking into shm, I understand that area is a bit
> > painful to work around, but fixes are certainly needed there.
>
> you're very welcome.
>
> Andrea

Andrea

^ permalink raw reply	[flat|nested] 6+ messages in thread
end of thread, other threads:[~2002-05-31 19:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)

2002-05-30  1:01 2.4.19pre9aa1 Andrea Arcangeli
2002-05-30  0:38 ` 2.4.19pre9aa1 Marcelo Tosatti
2002-05-30  1:43   ` 2.4.19pre9aa1 Andrea Arcangeli
2002-05-30  1:32 ` 2.4.19pre9aa1 William Lee Irwin III
2002-05-30  1:40   ` 2.4.19pre9aa1 Andrea Arcangeli
2002-05-31 19:34     ` 2.4.19pre9aa1 Andrea Arcangeli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox