* Re: scsi: sg: assorted memory corruptions
From: Dmitry Vyukov @ 2018-02-01 16:21 UTC
  To: Ben Hutchings, Tejun Heo, linux-ide
  Cc: Doug Gilbert, Bart Van Assche, jejb@linux.vnet.ibm.com,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	martin.petersen@oracle.com, syzkaller@googlegroups.com

On Thu, Feb 1, 2018 at 5:17 PM, Ben Hutchings
<ben.hutchings@codethink.co.uk> wrote:
> On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
>> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
>> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
> [...]
>> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
>> > >
>> > > # readlink /sys/class/scsi_generic/sg0
>> > >
>> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
>> > >
>> > > # cat /sys/class/scsi_generic/sg0/device/vendor
>> > > ATA
>> >
>> >
>> > ^^^^^
>> > That subsystem is the culprit IMO, most likely libata.
>> >
>> > Until you can show this test failing on something other than an
>> > ATA disk, then I will treat this issue as closed.
>>
>> Hi Doug,
>>
>> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
>> it by running qemu with default flags...
>
> If the bug is in libata then it's not on Doug to fix it since he's only
> maintaining sg.


Then I think we need to CC ata maintainers rather than treat it as closed.
+Tejun, linux-ide@, you can see full thread here:
https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY

* Re: scsi: sg: assorted memory corruptions
From: Eric Biggers @ 2018-02-04 9:07 UTC
  To: Dmitry Vyukov
  Cc: Ben Hutchings, Tejun Heo, linux-ide, Doug Gilbert, Bart Van Assche,
	jejb@linux.vnet.ibm.com, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org, martin.petersen@oracle.com,
	syzkaller@googlegroups.com

On Thu, Feb 01, 2018 at 05:21:12PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> On Thu, Feb 1, 2018 at 5:17 PM, Ben Hutchings
> <ben.hutchings@codethink.co.uk> wrote:
> > On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
> >> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
> >> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
> > [...]
> >> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
> >> > >
> >> > > # readlink /sys/class/scsi_generic/sg0
> >> > >
> >> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
> >> > >
> >> > > # cat /sys/class/scsi_generic/sg0/device/vendor
> >> > > ATA
> >> >
> >> >
> >> > ^^^^^
> >> > That subsystem is the culprit IMO, most likely libata.
> >> >
> >> > Until you can show this test failing on something other than an
> >> > ATA disk, then I will treat this issue as closed.
> >>
> >> Hi Doug,
> >>
> >> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
> >> it by running qemu with default flags...
> >
> > If the bug is in libata then it's not on Doug to fix it since he's only
> > maintaining sg.
>
>
> Then I think we need to CC ata maintainers rather than treat it as closed.
> +Tejun, linux-ide@, you can see full thread here:
> https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY
>

To get memory corruption it's actually sufficient just to submit "1-byte" reads;
there's no need for the SG_NEXT_CMD_LEN ioctl or anything:

	#include <fcntl.h>
	#include <unistd.h>

	int main()
	{
		int fd = open("/dev/sg0", O_RDWR);
		char buf[43] = { [36] = 0x08 /* READ_6 */ };

		for (;;)
			write(fd, buf, sizeof(buf));
	}

(where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")

The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
is the only data byte.

Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".

I'm guessing the driver is DMA'ing to somewhere it shouldn't be...

Eric

* Re: scsi: sg: assorted memory corruptions
From: Dmitry Vyukov @ 2018-02-04 11:10 UTC
  To: Eric Biggers
  Cc: Ben Hutchings, Tejun Heo, linux-ide, Doug Gilbert, Bart Van Assche,
	jejb@linux.vnet.ibm.com, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org, martin.petersen@oracle.com,
	syzkaller@googlegroups.com

On Sun, Feb 4, 2018 at 10:07 AM, Eric Biggers <ebiggers3@gmail.com> wrote:
> On Thu, Feb 01, 2018 at 05:21:12PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
>> On Thu, Feb 1, 2018 at 5:17 PM, Ben Hutchings
>> <ben.hutchings@codethink.co.uk> wrote:
>> > On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
>> >> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgilbert@interlog.com> wrote:
>> >> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
>> > [...]
>> >> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
>> >> > >
>> >> > > # readlink /sys/class/scsi_generic/sg0
>> >> > >
>> >> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
>> >> > >
>> >> > > # cat /sys/class/scsi_generic/sg0/device/vendor
>> >> > > ATA
>> >> >
>> >> >
>> >> > ^^^^^
>> >> > That subsystem is the culprit IMO, most likely libata.
>> >> >
>> >> > Until you can show this test failing on something other than an
>> >> > ATA disk, then I will treat this issue as closed.
>> >>
>> >> Hi Doug,
>> >>
>> >> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
>> >> it by running qemu with default flags...
>> >
>> > If the bug is in libata then it's not on Doug to fix it since he's only
>> > maintaining sg.
>>
>>
>> Then I think we need to CC ata maintainers rather than treat it as closed.
>> +Tejun, linux-ide@, you can see full thread here:
>> https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY
>>
>
> To get memory corruption it's actually sufficient just to submit "1-byte" reads;
> there's no need for the SG_NEXT_CMD_LEN ioctl or anything:
>
>         #include <fcntl.h>
>         #include <unistd.h>
>
>         int main()
>         {
>                 int fd = open("/dev/sg0", O_RDWR);
>                 char buf[43] = { [36] = 0x08 /* READ_6 */ };
>
>                 for (;;)
>                         write(fd, buf, sizeof(buf));
>         }
>
> (where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")
>
> The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
> is the only data byte.
>
> Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
> fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".
>
> I'm guessing the driver is DMA'ing to somewhere it shouldn't be...

It would be good to add KASAN checks to the DMA code that issues
transfers. This is another case where a silent memory corruption
causes dozens of assorted crashes all over the kernel. If we add
checks, KASAN would pinpoint the exact stack that issues the bad
command. This may be the simplest way to debug this bug as well. I've
filed https://bugzilla.kernel.org/show_bug.cgi?id=198661 for this.

* Re: scsi: sg: assorted memory corruptions
From: Eric Biggers @ 2018-02-10 19:13 UTC
  To: Dmitry Vyukov
  Cc: Ben Hutchings, Tejun Heo, linux-ide, Doug Gilbert, Bart Van Assche,
	jejb@linux.vnet.ibm.com, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org, martin.petersen@oracle.com,
	syzkaller@googlegroups.com

On Sun, Feb 04, 2018 at 12:10:58PM +0100, Dmitry Vyukov wrote:
> >
> > To get memory corruption it's actually sufficient just to submit "1-byte" reads;
> > there's no need for the SG_NEXT_CMD_LEN ioctl or anything:
> >
> >         #include <fcntl.h>
> >         #include <unistd.h>
> >
> >         int main()
> >         {
> >                 int fd = open("/dev/sg0", O_RDWR);
> >                 char buf[43] = { [36] = 0x08 /* READ_6 */ };
> >
> >                 for (;;)
> >                         write(fd, buf, sizeof(buf));
> >         }
> >
> > (where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")
> >
> > The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
> > is the only data byte.
> >
> > Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
> > fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".
> >
> > I'm guessing the driver is DMA'ing to somewhere it shouldn't be...
>
> It would be good to add KASAN checks to the DMA code that issues
> transfers. This is another case where a silent memory corruption
> causes dozens of assorted crashes all over the kernel. If we add
> checks, KASAN would pinpoint the exact stack that issues the bad
> command. This may be the simplest way to debug this bug as well. I've
> filed https://bugzilla.kernel.org/show_bug.cgi?id=198661 for this.

It seems the problem is related to the fact that in the PRD (Physical Region
Descriptor) list for the DMA transfer for "BMDMA" ATA disks, the disk (emulated
in QEMU here: https://github.com/qemu/qemu/blob/master/hw/ide/pci.c#L89)
ignores the low bit in the lengths, causing a length of 1 (byte) to be
interpreted as 0. But, at the same time there is also a special case where a
length of 0 is interpreted as 65536 bytes. So the disk will DMA up to 65536
bytes into a 1-byte buffer, causing massive memory corruption.

I'm not sure what the best fix is, but probably it needs to be required that
the lengths in the sglist have the alignment needed for the disk.

KASAN would not have helped here unfortunately. Even if there were KASAN checks
when mapping the sglist for DMA or when filling in the PRD list, the kernel
would not have known that the disk would actually interpret "1" as "65536".

- Eric