Date: Mon, 3 Nov 2003 18:46:41 +0100
From: Stephan von Krawczynski
To: linux@3ware.com
Cc: linux-kernel@vger.kernel.org
Subject: Bug during media scan, kernel 2.4.23-pre9
Message-Id: <20031103184641.7bceb740.skraw@ithnet.com>
Organization: ith Kommunikationstechnik GmbH

Hello,

I just encountered a really bad problem using media scan on 3ware controllers. I have 3 hds connected and configured as a RAID5. I run media scan regularly (on a daily basis). For the past two days I have been seeing this problem:

Nov 3 18:12:11 box 3w-xxxx[2039]: INFORMATION: Verify started on unit 0 on controller ID:2. (0x29)
Nov 3 18:19:41 box kernel: 3w-xxxx: scsi2: Unit #0: Command (f6e5d800) timed out, resetting card.

After that the box has problems; the controller obviously hangs. This in itself can be considered a bug, but what is really annoying is that one has no chance of finding out _which_ port caused the problem. So at this point you can play roulette and replace one of the hds, hoping that it was indeed the bad one.

It would be a lot better to degrade the unit in this case and give a hint as to which port has problems (command timed out on port ...).
This would:
a) not hang the box
b) give you a chance to replace the hd, as you would expect in RAID5

The current situation is absolutely _no good_.

Regards,
Stephan