From mboxrd@z Thu Jan 1 00:00:00 1970
From: Manish Kumar Singh
Subject: RE: [PATCH] bonding:avoid repeated display of same link status change
Date: Thu, 25 Oct 2018 23:49:51 -0700 (PDT)
Message-ID: <4976cd2b-d782-4b01-8957-133d1b37a9c8@default>
References: <20181023152924.24033-1-mk.singh@oracle.com>
 <65f98009-1ce0-d6fd-06dc-233aa115abc9@gmail.com>
 <20181023162613.GA22291@unicorn.suse.cz>
 <20181023163825.GB22291@unicorn.suse.cz>
 <20181025092930.GC22291@unicorn.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Cc: Eric Dumazet, Mahesh Bandewar (महेश बंडेवार), linux-netdev,
 Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, "David S. Miller",
 linux-kernel@vger.kernel.org
To: Michal Kubecek
Return-path:
In-Reply-To: <20181025092930.GC22291@unicorn.suse.cz>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

> -----Original Message-----
> From: Michal Kubecek [mailto:mkubecek@suse.cz]
> Sent: 25 October 2018 14:59
> To: Manish Kumar Singh
> Cc: Eric Dumazet; Mahesh Bandewar (महेश बंडेवार); linux-netdev;
> Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] bonding:avoid repeated display of same link status
> change
>
> On Thu, Oct 25, 2018 at 02:21:05AM -0700, Manish Kumar Singh wrote:
> > > From: Michal Kubecek [mailto:mkubecek@suse.cz]
> > > IMHO it does not. AFAICS multiple instances of bond_mii_monitor() cannot
> > > run simultaneously for the same bond so that there doesn't seem to be
> > > anything to collide with. (And if they could, we would need to test and
> > > set the flag atomically in bond_miimon_inspect().)
> > >
> > Yes, Michal, we are in line with your understanding.
> > When the -original- patch was posted upstream, there was no
> > -synchronization- or -race- handling code around reads/writes of this
> > added field, as we -never- saw a need for either.
> >
> > -only- writer of the added field is bond_mii_monitor.
> > -only- reader of the added field is bond_miimon_inspect.
> > -this writer & reader -never- can run concurrently.
> > -writer invokes the reader.
> >
> > Hence, IMO, uint8_t rtnl_needed is all that is needed, with
> > bond_mii_monitor doing rtnl_needed = 1 and bond_miimon_inspect doing
> > if rtnl_needed.
> >
> > Here is the gravity of the situation, seen at multiple customers (customer
> > and machine names redacted):
> >
> > 4353 May 31 02:38:57 hostname kernel: ixgbe 0000:03:00.0: removed PHC on p2p1
> > 4354 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
> > 4355 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
> > 4356 May 31 02:38:57 hostname kernel: public: link status definitely down for interface p2p1, disabling it
> > 4357 May 31 02:38:57 hostname kernel: public: making interface p2p2 the new active one
> > 4358 May 31 02:38:59 hostname kernel: ixgbe 0000:03:00.0: registered PHC device on p2p1
> > 4359 May 31 02:39:00 hostname kernel: ixgbe 0000:03:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> > 4360 May 31 02:39:00 hostname kernel: public: link status up for interface p2p1, enabling it in 200 ms
> > 4361 May 31 02:39:00 hostname kernel: public: link status definitely up for interface p2p1, 10000 Mbps full duplex
> > 4362 May 31 02:45:37 hostname journal: Missed 217723 kernel messages
> > 4363 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
> >     ---------------------
> > 11000+ APPROX SAME REPEATED MESSAGES in the same second
> >     ---------------------
> > 15877 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
> > 15878 May 31 02:45:37 hostname kernel: public: link status definitely down for interface p2p2, disabling it
> > 15879 May 31 02:45:37 hostname kernel: public: making interface p2p1 the new active one
>
> When I was replying, I didn't know this was a v2 and I haven't seen the
> v1 discussion. I have read it since and I think I understand Eric's
> point now. The thing is that just adding e.g. u8 is OK as it is now.
> However, someone could later add another u8 next to it which would also
> be perfectly OK on its own but reads/writes to these two could collide
> between each other.
>
> And as pointed out by a colleague, even having atomic_t and u8 flag in
> one 64-bit word could be a problem on architectures which cannot do an
> atomic read/write from/to a 32-bit word (sparc seems to be one).

Thanks, Michal, for explaining it. Now we understand the problem Eric was
referring to in v1 of the patch.

I can think of three ways to fix it. Please suggest which one would be a
safe and optimal fix:

1. Use type uint64_t for rtnl_needed.
2. Use type atomic64_t for rtnl_needed, with atomic64_set/atomic64_read.
3. Use type uint64_t for rtnl_needed, with spinlock protection.

I think option 3 would be overkill, keeping in mind how often
bond_mii_monitor runs.

Thanks,
Manish

>
> Michal Kubecek
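
To make the options listed above easier to compare, here is a rough userspace
sketch of what option 2 might look like. It is not kernel code and not the
posted patch: C11 <stdatomic.h> stands in for the kernel's atomic64_t,
atomic64_set and atomic64_read, and struct bond_sketch, its refcnt field, and
the three helper functions are hypothetical names invented for illustration.
Only rtnl_needed, bond_mii_monitor and bond_miimon_inspect come from the
thread.

```c
/* Userspace sketch only: C11 atomics stand in for the kernel's atomic64_t. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the bonding private data (not struct bonding). */
struct bond_sketch {
    atomic_int refcnt;             /* hypothetical unrelated neighbour field */
    /*
     * Option 1 would be a plain "uint64_t rtnl_needed;" here, so the flag
     * occupies a 64-bit word of its own and shares it with no other field.
     */
    _Atomic uint64_t rtnl_needed;  /* option 2: atomic 64-bit flag */
};

/* Writer side, playing the role the thread assigns to bond_mii_monitor(). */
void sketch_set_rtnl_needed(struct bond_sketch *b)
{
    atomic_store(&b->rtnl_needed, 1);
}

/* Clears the flag once the pending work has been handled. */
void sketch_clear_rtnl_needed(struct bond_sketch *b)
{
    atomic_store(&b->rtnl_needed, 0);
}

/* Reader side, playing the role the thread assigns to bond_miimon_inspect(). */
int sketch_rtnl_needed(struct bond_sketch *b)
{
    return atomic_load(&b->rtnl_needed) != 0;
}
```

Option 3 would wrap the same two accesses in a lock instead, which is the
extra cost described above as overkill for a field with one writer and one
reader that never run concurrently.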