From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932161AbaHYLbH (ORCPT ); Mon, 25 Aug 2014 07:31:07 -0400
Received: from cantor2.suse.de ([195.135.220.15]:44412 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755504AbaHYLbA (ORCPT ); Mon, 25 Aug 2014 07:31:00 -0400
Message-ID: <53FB1E6F.7080803@suse.de>
Date: Mon, 25 Aug 2014 13:30:55 +0200
From: Hannes Reinecke
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
MIME-Version: 1.0
To: Christoph Hellwig, "Elliott, Robert (Server Storage)"
CC: Yoshihiro YUNOMAE, "Martin K. Petersen",
	"linux-scsi@vger.kernel.org", "yrl.pp-manager.tt@hitachi.com",
	"linux-kernel@vger.kernel.org", "James E.J. Bottomley",
	Masami Hiramatsu, Doug Gilbert, Hidehiro Kawai
Subject: Re: scsi logging future directions, was Re: [RFC PATCH -logging 00/10] scsi/constants: Output continuous error messages on trace
References: <20140808115004.6768.97014.stgit@yuno-kbuild.novalocal>
	<94D0CD8314A33A4D9D801C0FE68B402958C1A9E5@G9W0745.americas.hpqcorp.net>
	<20140824204454.GA11423@lst.de>
In-Reply-To: <20140824204454.GA11423@lst.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/24/2014 10:44 PM, Christoph Hellwig wrote:
> On Fri, Aug 22, 2014 at 12:39:59AM +0000, Elliott, Robert (Server Storage) wrote:
>> If you trigger hundreds of errors (e.g., hot remove a device
>> during heavy IO), then all the prints to the linux serial console
>> bog down the system, causing timeouts in commands to other
>> devices and soft lockups for applications.
>>
>> Some changes that would help are:
>> 1. Put them under SCSI logging level control
>> 2. Use printk_ratelimited so an excessive number are trimmed
>>
>> Would you like to include something like this in your
>> patch set?
>
> I think we should come to an agreement where we want to go with scsi
> logging first before doing various smaller adjustments. (Although your
> example is one that's urgent enough that I'd like to put it in ASAP,
> I had issues with it a few times).
>
> I had a chat with Martin at Linuxcon about these issues, and we were
> both in favor of getting rid of the old scsi logging mechanisms and
> instead replace it by an extended version of the scsi tracepoints that
> cover all places, and dump all data from the old logging mechanism
> that people find useful.
>
> In a few places we'd still want to log normal dev_printk style errors,
> and the I/O completion is one of them, even if they really need to be
> ratelimited and condensed.
>
> If someone has arguments in favour of keeping the old logging code
> I'd love to hear them, but in practice the traceevent code has huge
> benefits:
>
> - almost zero overhead if disabled
> - can easily be used without any tools through configs, but can be used
>   even better with tools like trace-cmd or perf
> - allows both fine and coarse grained selections of events to trace
> - allows to capture statistics on each trace point without even enabling the
>   output
> - doesn't have any of the console lockup problems.
>
I've already been working on updating scsi logging infrastructure,
removing old kludges and streamlining it.
I'm all in favour of moving things over to scsi tracing; in fact
I've already moved all the current SCSI_ML_XXX statements to
tracepoints in my current patchset.

Unfortunately I haven't found time to test things out there, and
there's the patchset from Yoshihiro which needs review and
integration.
As of now I've treated this as rather low priority as no-one seemed
to mind and the patchsets will be touching each and every driver.

I'll be updating the patchset and send it for review.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)