On 11/14/2013 09:43 PM, Venkat Venkatsubra wrote:
>
> -----Original Message-----
> From: Honggang LI [mailto:honli@redhat.com]
> Sent: Wednesday, November 13, 2013 6:56 PM
> To: Josh Hunt; Venkat Venkatsubra
> Cc: David Miller; jjolly@suse.com; LKML; netdev@vger.kernel.org
> Subject: Re: [PATCH] rds: Error on offset mismatch if not loopback
>
> On 11/14/2013 01:40 AM, Josh Hunt wrote:
>> On Wed, Nov 13, 2013 at 9:16 AM, Venkat Venkatsubra wrote:
>>> -----Original Message-----
>>> From: Josh Hunt [mailto:joshhunt00@gmail.com]
>>> Sent: Tuesday, November 12, 2013 10:25 PM
>>> To: David Miller
>>> Cc: jjolly@suse.com; LKML; Venkat Venkatsubra; netdev@vger.kernel.org
>>> Subject: Re: [PATCH] rds: Error on offset mismatch if not loopback
>>>
>>> On Tue, Nov 12, 2013 at 10:22 PM, Josh Hunt wrote:
>>>> On Sat, Sep 22, 2012 at 2:25 PM, David Miller wrote:
>>>>> From: John Jolly
>>>>> Date: Fri, 21 Sep 2012 15:32:40 -0600
>>>>>
>>>>>> Attempting an rds connection from the IP address of an IPoIB
>>>>>> interface to itself causes a kernel panic due to a BUG_ON() being triggered.
>>>>>> Making the test less strict allows rds-ping to work without
>>>>>> crashing the machine.
>>>>>>
>>>>>> A local unprivileged user could use this flaw to crash the system.
>>>>>>
>>>>>> Signed-off-by: John Jolly
>>>>> Besides the questions being asked of you by Venkat Venkatsubra,
>>>>> this patch has another issue.
>>>>>
>>>>> It has been completely corrupted by your email client, it has
>>>>> turned all TAB characters into spaces, making the patch useless.
>>>>>
>>>>> Please learn how to send a patch unmolested in the body of your
>>>>> email. Test it by emailing the patch to yourself, and verifying
>>>>> that you can in fact apply the patch you receive in that email.
>>>>> Then, and only then, should you consider making a new submission of
>>>>> this patch.
>>>>>
>>>>> Use Documentation/email-clients.txt for guidance.
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>> linux-kernel" in the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> Please read the FAQ at http://www.tux.org/lkml/
>>>> I think this issue was lost in the shuffle. It appears that redhat,
>>>> ubuntu, and oracle are maintaining local patches to resolve this:
>>>>
>>>> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=822754
>>>> http://ubuntu.5.x6.nabble.com/CVE-2012-2372-RDS-local-ping-DOS-td4985388.html
>>>>
>>>> Given that Oracle has applied it I'll make the assumption that
>>>> Venkat's question was answered at some point.
>>>>
>>>> David - I can resubmit the patch with the proper signed-off-by and
>>>> formatting if you are willing to apply it unless John wants to try
>>>> again. I think it's time this got upstream.
>>>>
>>>> --
>>>> Josh
>>> Ugh.. hopefully resending with all the html crap removed...
>>>
>>> --
>>> Josh
>>>
>>> Hi Josh,
>>>
>>> No, I still didn't get an answer for how "off" could be non-zero in case of rds-ping to hit BUG_ON(off % RDS_FRAG_SIZE).
>>> Because, rds-ping uses zero byte messages to ping.
>>> If you have a test case that reproduces the kernel panic I can try it out and see how that can happen.
>>> The Oracle's internal code I checked doesn't have that patch applied.
>>>
>>> Venkat
>> No I don't have a test case. I came across this CVE while doing an
>> audit and noticed it was patched in Ubuntu's kernel and other distros,
>> but was not in the upstream kernel yet.
>> Quick googling of lkml showed
>> that there were at least two attempts to get this patch upstream, but
>> both had issues due to not following the proper submission process:
>>
>> https://lkml.org/lkml/2012/10/22/433
>> https://lkml.org/lkml/2012/9/21/505
>>
>> From my searching it appears the initial bug was found by someone at redhat:
>> https://bugzilla.redhat.com/show_bug.cgi?id=822754
>>
>> I've added Li Honggang the reporter of this issue from Redhat to the
>> mail. Hopefully he can share his testcase.
> The test case is very simple:
> Steps to Reproduce:
> 1. yum install -y rds-tools
>
> 2. [root@rdma3 ~]# ifconfig ib0 | grep 'inet addr'
>    inet addr:172.31.0.3 Bcast:172.31.0.255 Mask:255.255.255.0
>
> 3. [root@rdma3 ~]# /usr/bin/rds-ping 172.31.0.3 <<<< kernel panic (You may need to wait for a few seconds before the kernel panic.)
>> and possibly requires certain hardware as Jay writes in the first link above:
>> "...some Infiniband HCAs(QLogic, possibly others) the machine will panic..."
> This bug can be reproduced with Mellanox HCAs (mlx4_ib.ko and mthca.ko), QLogic HCA (ib_qib.ko). I did not test the QLogic HCA running "ib_ipath.ko".
>
> As I know the upstream code of RDS is broken. There are *many* RDS bugs.
>
> Best regards.
> Honggang
>> I was referring to this oracle commit:
>> https://oss.oracle.com/git/?p=redpatch.git;a=commit;h=c7b6a0a1d8d636852be130fa15fa8be10d4704e8
>>
>> I have no experience with this code. There were a few comments around
>> the reset and xmit fns about making sure the caller did certain things
>> if not they were racy, but I have no idea if that's coming into play
>> here.
>>
> Hi Honggang,
>
> I ran rds-ping over local interface for 30 minutes. I stopped it after that.
> It didn't hit any panic.
>
> # ip addr show dev ib0
> 6: ib0: mtu 2044 qdisc pfifo_fast qlen 1024
>     link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:21:28:00:01:cf:63:db brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>     inet 10.196.4.125/30 brd 10.196.4.127 scope global ib0
>     inet6 fe80::221:2800:1cf:63db/64 scope link
>        valid_lft forever preferred_lft forever
> #
>
> # rds-ping 10.196.4.125
> 1: 170 usec
> 2: 171 usec
> ....
> ....
> ....
> 1860: 173 usec
> 1861: 171 usec
> 1862: 177 usec
> 1863: 168 usec
> 1864: 171 usec
> 1865: 175 usec
> ^C#
>
> I tested with Oracle UEK2 which is based on 2.6.39 kernel. Mellanox IB adaptor.
> 19:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
>
> There is something about your setup that must be causing it for you.
> Can I work with you offline if you are available ?
>
> The panic you are hitting is not making sense to me.
>
> Venkat

Hi, Venkat

It seems we are in different time zones. Please contact me via email if you need me to do anything for this bug.

Could you please try upstream kernel 2.6.39? I confirmed that the bug can be reproduced with Mellanox and QLogic HCAs when running upstream kernel-2.6.39.

[root@rdma01 ~]# ifconfig mlx4_ib1
Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
mlx4_ib1  Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:172.31.2.1  Bcast:172.31.2.255  Mask:255.255.255.0
          inet6 addr: fe80::7ae7:d1ff:ff6b:b01/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

[root@rdma01 ~]# rpm -qf /usr/bin/rds-ping
rds-tools-2.0.6-3.el6.x86_64
[root@rdma01 ~]# uname -a
Linux rdma01.rhts.eng.nay.redhat.com 2.6.39 #1 SMP Thu Nov 14 20:25:45 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@rdma01 ~]# ibstat
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 2
    Firmware version: 2.8.600
    Hardware version: b0
    Node GUID: 0x78e7d1ffff6b0b00
    System image GUID: 0x78e7d1ffff6b0b03
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 4
        Capability mask: 0x02510868
        Port GUID: 0x78e7d1ffff6b0b01
        Link layer: InfiniBand
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 70
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02510868
        Port GUID: 0x78e7d1ffff6b0b02
        Link layer: InfiniBand
[root@rdma01 ~]# lspci | grep Mellanox
1f:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
[root@rdma01 ~]# ssh 172.31.2.2 hostname    (make sure the IPoIB interface works)
rdma02.rhts.eng.nay.redhat.com
[root@rdma01 ~]# ssh 172.31.2.1 hostname
rdma01.rhts.eng.nay.redhat.com
[root@rdma01 ~]# /usr/bin/rds-ping 172.31.2.1
(kernel panic, please see the attachment for console log)