From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S264341AbTLKBtT (ORCPT ); Wed, 10 Dec 2003 20:49:19 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S264353AbTLKBs4 (ORCPT ); Wed, 10 Dec 2003 20:48:56 -0500
Received: from unknown.servercentral.net ([66.225.219.162]:45265 "EHLO
	amsterdam.servershost.net") by vger.kernel.org with ESMTP
	id S264341AbTLKBsJ (ORCPT ); Wed, 10 Dec 2003 20:48:09 -0500
Message-ID: <3FD7CCD2.6010407@campogeral.com.br>
Date: Wed, 10 Dec 2003 23:48:02 -0200
From: Fernando Serboncini - Campo Geral
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031127
X-Accept-Language: en-us, en, pt-br, pt, es, it
MIME-Version: 1.0
To: linux-kernel@vger.kernel.org, Gabriel Winckler
Subject: RAID5 resync blocking on 2.6.0-test11
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-MailScanner-Information: Please contact the ISP for more information
X-MailScanner: Found to be clean
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - amsterdam.servershost.net
X-AntiAbuse: Original Domain - vger.kernel.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - campogeral.com.br
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi LKML,

I've just set up a software RAID 5 array on my Alpha (Miata)
Workstation 500au running kernel 2.6.0-test11. The array consists of
four Western Digital 250 GB drives on a promise-ata-133tx2 controller.

The initial raidstart and mke2fs worked fine. After the initial sync,
a clean boot and a new raidstart, /proc/mdstat reported that the array
was in "resync" (/var/log/messages says "raid array is not clean").
The problem is that the resync was blocking /dev/md0.
So when I ran 'mount /dev/md0', it blocked until the resync was done
(which takes a very long time on a nearly 1 TB array).

Looking through the drivers/md code, I found that md_do_sync() was the
culprit. After a few diffs against the 2.4.23 code, I noticed that in
the speed limiter part of the function the old "current->nice = "
lines had been deleted. After some more searching (and browsing LXR)
in fs/jffs2/background.c and net/bluetooth/bnep/core.c, I saw that
others had replaced "current->nice =" statements with
"set_user_nice(current,...)" calls.

I did the same. The RAID now works fine (it still resyncs at boot, and
this is not fully tested yet) and no longer blocks mounts (btw, mount
returns a lot faster than with 2.4.23).

Since I'm no kernel (nor RAID) expert (first post here, btw), I'm just
wondering whether I did something really stupid. Also, is this an
Alpha-only issue? Or a 2.6.0 issue?

Anyway, here is the patch with what I've done.

Thanks for your attention.

Fernando Serboncini

--- linux-2.6.0-test11/drivers/md/md.c	2003-11-26 18:43:29.000000000 -0200
+++ linux/drivers/md/md.c	2003-12-10 23:29:33.000000000 -0200
@@ -3290,6 +3290,8 @@
 		currspeed = (j-mddev->resync_mark_cnt)/2/((jiffies-mddev->resync_mark)/HZ +1) +1;
 
 		if (currspeed > sysctl_speed_limit_min) {
+			set_user_nice(current,19);
+
 			if ((currspeed > sysctl_speed_limit_max) ||
 					!is_mddev_idle(mddev)) {
 				current->state = TASK_INTERRUPTIBLE;
@@ -3297,6 +3299,8 @@
 				goto repeat;
 			}
 		}
+		else
+			set_user_nice(current,-20);
 	}
 	printk(KERN_INFO "md: md%d: sync done.\n",mdidx(mddev));
 	/*
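
To make the speed limiter logic in the patch above easier to follow, here is a minimal userspace sketch of the decision it makes on each pass. This is not kernel code. The function names, the enum, and the parameters are hypothetical stand-ins for j, mddev->resync_mark_cnt, jiffies, mddev->resync_mark, HZ, sysctl_speed_limit_min/max and is_mddev_idle(). It also assumes that j counts 512-byte sectors, so that the division by 2 yields KB.

```c
/* Outcomes of one pass through the patched speed limiter.
 * Hypothetical names for illustration; md.c has no such enum. */
enum limiter_action {
    BOOST_AND_CONTINUE,  /* speed <= min: patch calls set_user_nice(current,-20) */
    LOWER_AND_CONTINUE,  /* min < speed <= max and array idle: nice 19, keep going */
    LOWER_AND_SLEEP      /* speed > max or array busy: nice 19, sleep, goto repeat */
};

/* Same arithmetic as the currspeed expression in the patch context.
 * Assumption: the counters are in 512-byte sectors, so /2 gives KB.
 * The "+1" on the elapsed seconds avoids a division by zero and the
 * final "+1" keeps the result nonzero. */
static unsigned long resync_speed(unsigned long sectors_now,
                                  unsigned long sectors_at_mark,
                                  unsigned long jiffies_now,
                                  unsigned long jiffies_at_mark,
                                  unsigned long hz)
{
    return (sectors_now - sectors_at_mark) / 2
           / ((jiffies_now - jiffies_at_mark) / hz + 1) + 1;
}

/* Mirrors the if/else structure of the patched code. */
static enum limiter_action limiter_decide(unsigned long speed,
                                          unsigned long limit_min,
                                          unsigned long limit_max,
                                          int array_idle)
{
    if (speed > limit_min) {
        /* Above the guaranteed minimum: the patch lowers priority first. */
        if (speed > limit_max || !array_idle)
            return LOWER_AND_SLEEP;
        return LOWER_AND_CONTINUE;
    }
    /* At or below the minimum: the patch raises priority. */
    return BOOST_AND_CONTINUE;
}
```

For example, with hz = 100, 20000 sectors synced over 1000 jiffies gives 20000/2/(10+1)+1 = 910. With an illustrative minimum of 1000, the sketch returns BOOST_AND_CONTINUE, which corresponds to the "else" branch added by the patch.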