* SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
@ 2016-03-01 11:00 Johan de Jong
  2016-03-05 20:15 ` Johan de Jong
  0 siblings, 1 reply; 8+ messages in thread
From: Johan de Jong @ 2016-03-01 11:00 UTC
To: linux-kernel

Dear developers,

(Please CC me as I am not subscribed (yet))

Writing (backing up) to multiple optical drives at the same time results
in a throughput drop by a factor of about 7-10 compared to writing to a
single drive.

After digging around, it seems the problem arose about 5 years ago, after
the Big Kernel Lock removal and the introduction of the new private
mutex "sr_mutex" in drivers/scsi/sr.c, which locks on a per-driver basis
instead of a per-device basis.

Users have reported this issue on various mailing lists, so I think
there is interest in a solution in the Linux community. So far, it looks
like this has not attracted the attention of, or not been identified as
a priority by, any of the kernel developers. However, I think a
Linux-based DIY server with multiple optical drives, used for backing up
files in multiple offline copies, is a very useful application, and it
would be unfortunate if the current behavior keeps such an application
infeasible.

Would someone be willing to look into this and/or comment on the issue?

Sincerely,

Johan de Jong

* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-01 11:00 SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex) Johan de Jong
@ 2016-03-05 20:15 ` Johan de Jong
  2016-03-05 20:47   ` Thomas Schmitt
  2016-03-05 21:25   ` Wakko Warner
  0 siblings, 2 replies; 8+ messages in thread
From: Johan de Jong @ 2016-03-05 20:15 UTC
To: linux-kernel; +Cc: Thomas Schmitt

Dear developers,

In the meantime I have applied and tested the 2013 patch by Otto Meta:

http://marc.info/?l=linux-scsi&m=135705061804384&w=2

In short, it replaces mutex_lock(&sr_mutex) (a global mutex that was
introduced in 2010 to replace lock_kernel()) by per-device mutexes, and
so allows concurrent ioctl(SG_IO) calls from different processes on
different sr devices.

I had to patch some parts by hand, as the posted patch did not apply
cleanly to my more current source tree, but I'm happy to report that the
patch indeed does what it intends and solves the performance issue with
accessing multiple sr devices concurrently.

Repeated concurrent writes to three SATA burners have run reliably and
without performance penalty. In addition, repeated concurrent drive tray
open and close commands

  eject (-t) /dev/sr0 & eject (-t) /dev/sr1 & eject (-t) /dev/sr2

result in simultaneous (as opposed to the unpatched kernel) and reliable
tray movement, with no visible indications of locking problems caused by
the patch, either physically or in the kernel logs. I have been running
the patched kernel for a number of days now to full satisfaction and
relief.

I would therefore venture to suggest that mutex_lock(&sr_mutex) is
indeed the cause of the severe performance penalty, and that the 2013
Otto Meta patch proposes a viable remedy that bears nomination for
patching into the main kernel tree.

Sincerely,

Johan de Jong

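The difference between one driver-wide lock and per-device locks can be illustrated outside the kernel. What follows is a minimal user-space sketch using POSIX mutexes, not code from the patch or from sr.c: `struct fake_cd`, `fake_cd_init()`, `global_lock` and `can_run_concurrently()` are hypothetical names invented for this illustration, with `global_lock` standing in for the role of sr_mutex. It only checks whether a command for a second drive could take its lock while a first drive is in the middle of a command.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-drive driver state (not the real struct scsi_cd). */
struct fake_cd {
    const char *name;
    pthread_mutex_t lock;   /* per-device lock, one for each drive */
};

/* Models the single driver-wide lock (the role sr_mutex plays when unpatched). */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

static void fake_cd_init(struct fake_cd *cd, const char *name)
{
    cd->name = name;
    pthread_mutex_init(&cd->lock, NULL);
}

/*
 * Returns true if a command for `other` could take its lock while `busy`
 * is holding the lock for a command of its own.
 * per_device == false: both drives contend for global_lock.
 * per_device == true:  each drive uses its own lock.
 */
static bool can_run_concurrently(struct fake_cd *busy, struct fake_cd *other,
                                 bool per_device)
{
    pthread_mutex_t *held = per_device ? &busy->lock : &global_lock;
    pthread_mutex_t *wanted = per_device ? &other->lock : &global_lock;
    bool ok;

    assert(busy != other);              /* the sketch compares two distinct drives */

    pthread_mutex_lock(held);           /* drive `busy` is mid-command */
    ok = (pthread_mutex_trylock(wanted) == 0);
    if (ok)
        pthread_mutex_unlock(wanted);   /* release what the probe acquired */
    pthread_mutex_unlock(held);
    return ok;
}
```

With two drives initialised through `fake_cd_init()`, `can_run_concurrently(&sr0, &sr1, false)` reports that the second drive has to wait, while the same call with `true` reports that it can proceed. This models only the lock contention, not what the real driver protects with the lock.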
* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-05 20:15 ` Johan de Jong
@ 2016-03-05 20:47   ` Thomas Schmitt
  2016-03-07 12:13     ` One Thousand Gnomes
  2016-03-05 21:25   ` Wakko Warner
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Schmitt @ 2016-03-05 20:47 UTC
To: linux-kernel; +Cc: jrdejong

Hi,

as developer of libburn i got several user complaints about poor
concurrent throughput. Since last year i suffer from it myself
on kernel 3.16 of Debian 8. Before i had 2.6.18 which did very well
in that aspect.

An old workaround for IDE master-slave concurrency problems brings
a certain degree of relief on some drives. See
  http://libburnia-project.org/wiki/ConcurrentLinuxSr

But the much better solution would be to remove the need for the
global lock shared by all ioctl(SG_IO) to all /dev/sr*.

Given the old reports of Otto Meta about possible race conditions
with drives at the same IDE controller, and the rareness of IDE
attached drives nowadays, i propose to keep the global sr_mutex lock
for IDE attached drives.

Question is how this can be determined from the device parameters
of the calls in question:
  struct block_device *bdev
  struct gendisk *disk
  struct scsi_cd *cd

Have a nice day :)

Thomas

* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-05 20:47   ` Thomas Schmitt
@ 2016-03-07 12:13     ` One Thousand Gnomes
  2016-03-07 13:11       ` Thomas Schmitt
  0 siblings, 1 reply; 8+ messages in thread
From: One Thousand Gnomes @ 2016-03-07 12:13 UTC
To: Thomas Schmitt; +Cc: linux-kernel, jrdejong

On Sat, 05 Mar 2016 21:47:00 +0100
"Thomas Schmitt" <scdbackup@gmx.net> wrote:

> Hi,
>
> as developer of libburn i got several user complaints about poor
> concurrent throughput. Since last year i suffer from it myself
> on kernel 3.16 of Debian 8. Before i had 2.6.18 which did very well
> in that aspect.
>
> An old workaround for IDE master-slave concurrency problems brings
> a certain degree of relief on some drives. See
> http://libburnia-project.org/wiki/ConcurrentLinuxSr
>
> But the much better solution would be to remove the need for the
> global lock shared by all ioctl(SG_IO) to all /dev/sr*.
>
> Given the old reports of Otto Meta about possible race conditions
> with drives at the same IDE controller, and the rareness of IDE
> attached drives nowadays, i propose to keep the global sr_mutex lock
> for IDE attached drives.

If there are race conditions present in the libata drivers then they want
fixing there. The old IDE drivers are basically obsoleted by libata for
all real world uses and most "IDE" devices are actually SATA now anyway.

Alan

* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-07 12:13     ` One Thousand Gnomes
@ 2016-03-07 13:11       ` Thomas Schmitt
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Schmitt @ 2016-03-07 13:11 UTC
To: gnomes; +Cc: linux-kernel, jrdejong

Hi,

i wrote:
> > Given the old reports of Otto Meta about possible race conditions
> > with drives at the same IDE controller, and the rareness of IDE
> > attached drives nowadays, i propose to keep the global sr_mutex lock
> > for IDE attached drives.

One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> wrote:
> If there are race conditions present in the libata drivers then they want
> fixing there.

From the view of software architecture: of course, yes.

But the research of Johan de Jong shows that this patch was proposed
several times and each time stalled without a decision, due to problems
when testing heavy concurrency on IDE attached drives.

Newest threads known to me (besides this one) were started by
Tim Small in November 2014:
  "[PATCH 0/4] Fix performance burning or extracting audio etc. from
   multiple optical drives."
  http://marc.info/?t=141692734400009&r=1&w=2
  "Very slow throughput when using cdparanoia on two SATA CDROM drives
   with /dev/sr but not /dev/sg"
  http://marc.info/?t=141528207400003&r=1&w=2

In the middle of the discussion Jens Axboe was positive towards the
issue. But then came IDE problems.

It is not clear to me whether the reported problems existed already
with the Big Kernel Lock and whether they do not exist with the global
sr_mutex lock which is currently in drivers/scsi/sr.c.

Especially the problem reports of Otto Meta in 2013 cannot be explained
by wrongly directed SCSI commands or confused housekeeping in the lower
drivers alone.
In
  http://marc.info/?l=linux-scsi&m=135734072119667&w=2
he reports that a drive tray was stuck out and moved in only on
command eject -t, but not on pressing the drive's eject button.

This is not SCSI MMC (as payload of ATAPI) behavior.
The SCSI command 1Eh PREVENT/ALLOW MEDIUM REMOVAL is defined in MMC-5
to override the definition in SPC-3. MMC-5, 6.14 says about it:
  "[...] requests that the Drive enable or disable the removal of the
   medium in the Drive. The Drive shall not allow medium removal if
   any Host currently has medium removal prevented."
The drive cannot protect the medium when the tray is out. So being
stuck in this state is not normal on drive firmware level.

> The old IDE drivers are basically obsoleted by libata for
> all real world uses and most "IDE" devices are actually SATA now anyway.

Of course, if we can get reports that a modern kernel on a machine
with two optical drives on the same IDE controller works fine, then
we do not have to care for older kernels.

But given the situation i see, it seems better to handle all IDE
drives like they are handled now, and to only let the SATA or USB
attached drives perform per-drive locking.

We have several positive reports with SATA drives. So i consider it
proven that no concurrency problems exist before the point where SATA
processing gets separated from IDE processing.
If concurrency problems still show up on IDE, then they cannot be
blamed on the relaxed locking of the other drives.

If IDE users want no discrimination, one could give them a kernel
configuration option and let them search for problems at their own
risk. Maybe they find out what's really wrong in IDE.

(Uninformed guess:
 include/uapi/linux/major.h and block/genhd.c function
   add_disk(struct gendisk *disk)
 make me think that one could possibly recognize IDE attached drives
 by comparing
   static int ide_majors[] = {IDE0_MAJOR, IDE1_MAJOR, IDE2_MAJOR,
                              IDE3_MAJOR, IDE4_MAJOR, IDE5_MAJOR,
                              IDE6_MAJOR, IDE7_MAJOR, IDE8_MAJOR,
                              IDE9_MAJOR, -1};
 with
   MAJOR(disk_to_dev(disk)->devt)
)

Have a nice day :)

Thomas

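The comparison guessed at in the message above can be written out as a small lookup. This is a user-space sketch of the table comparison only, not a proposal for sr.c: `major_is_ide()` and `check_examples()` are hypothetical helpers, and the numeric values are copied from include/uapi/linux/major.h so that the sketch builds without kernel headers. Note that /dev/sr* nodes themselves are registered under SCSI_CDROM_MAJOR (11), so whether such a test can tell IDE-attached drives apart inside the sr driver remains the open question of the guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as found in include/uapi/linux/major.h, repeated here for a standalone build. */
enum {
    IDE0_MAJOR = 3,  IDE1_MAJOR = 22, IDE2_MAJOR = 33, IDE3_MAJOR = 34,
    IDE4_MAJOR = 56, IDE5_MAJOR = 57, IDE6_MAJOR = 88, IDE7_MAJOR = 89,
    IDE8_MAJOR = 90, IDE9_MAJOR = 91,
    SCSI_CDROM_MAJOR = 11
};

/* Table from the message above, terminated by -1. */
static const int ide_majors[] = {
    IDE0_MAJOR, IDE1_MAJOR, IDE2_MAJOR, IDE3_MAJOR, IDE4_MAJOR,
    IDE5_MAJOR, IDE6_MAJOR, IDE7_MAJOR, IDE8_MAJOR, IDE9_MAJOR, -1
};

/* Hypothetical helper: true if `major` is one of the legacy IDE block majors. */
static bool major_is_ide(unsigned int major)
{
    for (size_t i = 0; ide_majors[i] != -1; i++)
        if ((unsigned int)ide_majors[i] == major)
            return true;
    return false;
}

/* Usage check: legacy IDE majors match, the sr major does not. */
static void check_examples(void)
{
    assert(major_is_ide(IDE0_MAJOR));
    assert(major_is_ide(IDE9_MAJOR));
    assert(!major_is_ide(SCSI_CDROM_MAJOR));
}
```

In kernel code the argument would come from MAJOR(disk_to_dev(disk)->devt) as in the message; here the caller passes the major number directly.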
* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-05 20:15 ` Johan de Jong
  2016-03-05 20:47   ` Thomas Schmitt
@ 2016-03-05 21:25   ` Wakko Warner
  2016-03-05 21:36     ` Johan de Jong
  1 sibling, 1 reply; 8+ messages in thread
From: Wakko Warner @ 2016-03-05 21:25 UTC
To: Johan de Jong; +Cc: linux-kernel, Thomas Schmitt

Johan de Jong wrote:
> In the mean time I have applied and tested the 2013 patch by Otto Meta:
>
> http://marc.info/?l=linux-scsi&m=135705061804384&w=2
>
> which, in short, replaces mutex_lock(&sr_mutex) (global mutex), that
> was introduced in 2010 to replace lock_kernel(), by per-device mutexes
> and allowing concurrent ioctl(SG_IO) in different processes with
> different sr devices.

There seem to be a few patches floating around. I've had one running on
3.3.0 for a long time w/o any issues. I'm currently using the one from
Tim Small (search for the subject "Fix performance burning or extracting
audio etc. from multiple optical drives.") on 4.x (where x is 3-4) and a
3.14.something without any issues.

I still have the emails from Tim.

My current usage is 2 systems with 3 burners from the same source.

* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-05 21:25   ` Wakko Warner
@ 2016-03-05 21:36     ` Johan de Jong
  2016-03-06  2:06       ` Wakko Warner
  0 siblings, 1 reply; 8+ messages in thread
From: Johan de Jong @ 2016-03-05 21:36 UTC
To: Wakko Warner; +Cc: linux-kernel, Thomas Schmitt

Hi Wakko,

If I remember correctly I did see you commenting on discussions on
either the Otto Meta patch, or another that proposed to remove the
mutex entirely. I was unaware of any others.

Do you have more information on why this never resulted in a successful
concerted effort to get a patch into the kernel tree? Do the patches
have drawbacks, or have they never been submitted properly? If the
latter, we might attempt it?

Best,

Johan

* Re: SCSI sr driver: parallel writes to optical serialized which hurts performance (sr_mutex)
  2016-03-05 21:36     ` Johan de Jong
@ 2016-03-06  2:06       ` Wakko Warner
  0 siblings, 0 replies; 8+ messages in thread
From: Wakko Warner @ 2016-03-06 2:06 UTC
To: Johan de Jong; +Cc: linux-kernel, Thomas Schmitt

Johan de Jong wrote:
> Hi Wakko,
>
> If I remember correctly I did see you commenting on discussions on
> either the Otto Meta patch, or another that proposed to remove the
> mutex entirely. I was unaware of any others.

I received the last set of patches from Tim more than a year ago. I
wasn't using a system with multiple drives other than my 3.3.0 box. Last
year around November I gathered some more drives and put them in another
system and I decided to test the patches. I sent a report back to the
list in November 2015 (maybe October) about the success.

> Do you have more information on why this never resulted in a successful
> concerted effort to get a patch in the kernel tree? Do the patches
> have drawbacks or have they never been submitted properly? If the
> latter, we might endeavor it?

No, sorry. I'm just a user. I haven't had any crashes with the patches.
I'd like to see the patches go in, I'm tired of patching every new
kernel.

I'm running the ones from Tim on 2 machines. One machine has the DVD
burners (exported as iSCSI targets) and the other machine connects to
it. The patches have to be on both machines for it to work (I assume).
I'm able to burn at 16x to 3 burners over iSCSI at the same time. I've
burned many discs this way without any issues (other than bad media).

Just FYI: The computer with the burners is running 4.4.1 and the iSCSI
initiator is 3.12.52.

--
 Microsoft has beaten Volkswagen's world record. Volkswagen only created
 22 million bugs.
