From: Dimitris Michailidis <dm@chelsio.com>
To: Matt Domsch <Matt_Domsch@dell.com>
Cc: Eilon Greenstein <eilong@broadcom.com>,
	Dmitry Kravkov <dmitry@broadcom.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"narendra_k@dell.com" <narendra_k@dell.com>,
	"jordan_hargrave@dell.com" <jordan_hargrave@dell.com>
Subject: Re: [PATCH net-next] bnx2x: Add Nic partitioning mode (57712 devices)
Date: Fri, 17 Dec 2010 15:13:30 -0800
Message-ID: <4D0BEE9A.5070505@chelsio.com>
In-Reply-To: <20101217024509.GA5854@auslistsprd01.us.dell.com>

Matt Domsch wrote:
> On Thu, Dec 09, 2010 at 04:49:25PM +0200, Eilon Greenstein wrote:
>> On Mon, 2010-12-06 at 10:21 -0800, Dimitris Michailidis wrote:
>>> Matt Domsch wrote:
>> ...
>>> /sys/class/net/<ifname>/dev_id indicates the physical port <ifname> is 
>>> associated with.  At least a few drivers set up dev_id this way.
>>>
>>>
>> So we are in agreement? Can this satisfy all needs? If so, we will add
>> this scheme to the bnx2x as well.
> 
> I don't think that's enough.  Necessary, but not sufficient.
> 
> If dev_id is a field that starts over with each PCI device (e.g. is
> used to distinguish multiple ports that share the same PCI
> device), that's enough to handle the Chelsio case, but not the NPAR &
> SR-IOV case.

My understanding is that dev_id indicates the physical port of the card 
associated with an interface.  It does not reset when you move to a new 
function of the device.

> 
> If the above is true, then a value of dev_id=0 for all 1:1 PCI Device
> : Port relations is fine, and the three drivers that set dev_id
> non-zero are all multi-port, single PCI device controllers.
> 
> cxgb4/t4_hw.c:          adap->port[i]->dev_id = j;

The HW cxgb4 deals with is multi-function (actually the driver uses 
primarily function 4 nowadays), but it's virtualizable and the association 
between functions and ports is very flexible.  For example, you may have a 
2-port card but maybe the driver will be given just (a slice of) port 1.  So 
the driver will create one netdev with dev_id==1 and there won't be anything 
with dev_id 0.  You cannot determine this by looking at anything PCI-related 
or any static table.

For this driver you can get two pieces of information for an interface:
- /sys/class/net/<interface>/device points to the PCI function handling the 
interface
- /sys/class/net/<interface>/dev_id indicates the physical port of the interface

You can have several interfaces with the same device link and different 
dev_ids.  While the current driver doesn't do it, you could also have several 
interfaces with different device links but the same dev_id (the NPAR 
situation; notice again that dev_ids are not per PCI function), or interfaces 
with different device and dev_id, or even interfaces with the same device and 
dev_id.
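
To make the two lookups above concrete, here is a minimal sketch in Python.
The function name read_port_info and the sysfs_root parameter are made up for
illustration, and it assumes dev_id is printed in hex (e.g. "0x1"), which is
what I believe the kernel does; treat it as a sketch, not a tested tool.

```python
import os

def read_port_info(ifname, sysfs_root="/sys"):
    """Return (pci_function, dev_id) for a network interface.

    pci_function is the basename of the resolved 'device' link
    (e.g. '0000:0b:00.4'), or None if the interface has no device link.
    dev_id is parsed as a hexadecimal integer (assumed sysfs format).
    """
    base = os.path.join(sysfs_root, "class", "net", ifname)

    # The 'device' symlink points at the PCI function handling the interface.
    device_link = os.path.join(base, "device")
    pci_function = None
    if os.path.exists(device_link):
        pci_function = os.path.basename(os.path.realpath(device_link))

    # dev_id indicates the physical port for drivers that set it.
    with open(os.path.join(base, "dev_id")) as f:
        dev_id = int(f.read().strip(), 16)

    return pci_function, dev_id
```

Calling read_port_info("eth0") on a live system would return something like
a PCI address string and a small integer; neither value alone identifies the
port, which is why both are returned together.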

> mlx4/en_netdev.c:       dev->dev_id =  port - 1;
> sfc/siena.c:    efx->net_dev->dev_id = EFX_OWORD_FIELD(reg, FRF_CZ_CS_PORT_NUM) - 1;
> 
> Is that truly how these three controllers work: they set dev_id when
> there are multiple physical ports that a single PCI d/b/d/f drives?
> 
> My naming convention of:
>   pci<slot>#<port>
> wants to express this relationship.  If I have a card with 2 PCI
> devices, and 2 physical ports on each device, I have 4 ports to
> describe.  The dev_ids would look like: 0,1 0,1 , so I can't use that
> value directly.

I think they'd be 0,1,2,3 for drivers that set dev_id and 0,0,0,0 otherwise.

> I can make a list of PCI devices on the same card,
> look at the dev_id field of each, and run a counter:
> 
> for each slot:
>   int port=1;
>   for each pci device:
>      for each in net/<interface>/dev_id:
>         use name pci<slot>#<port>
>         port++
> 
> OK?  Can someone with such a card send me tree /sys, so I can see the
> tree does really look like I expect:
> 
> /sys/devices/pci0000:00/0000:00:1c.0/0000:0b:00.0/net/eth0/dev_id = 0
> /sys/devices/pci0000:00/0000:00:1c.0/0000:0b:00.0/net/eth1/dev_id = 1
> 
> simply finding a net/ subdir under a PCI device, each of the
> directories in net/ are interface names, with different dev_id values.

This would be the common case but in general the dev_ids don't need to be 
consecutive or start at 0, nor does a particular dev_id need to appear just 
once.
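
As an illustration of why a running counter can mislead, the sketch below
groups interfaces by the dev_id they report instead of counting them.  It is
in Python; group_by_port and the tuple layout are hypothetical, and it
assumes all interfaces passed in belong to one card whose driver sets dev_id.

```python
from collections import defaultdict

def group_by_port(interfaces):
    """Group interfaces by reported dev_id.

    interfaces: iterable of (ifname, pci_function, dev_id) tuples.
    Returns {dev_id: [(ifname, pci_function), ...]} ordered by dev_id.
    dev_ids may be sparse, may not start at 0, and may repeat.
    """
    ports = defaultdict(list)
    for ifname, pci_function, dev_id in interfaces:
        ports[dev_id].append((ifname, pci_function))
    return dict(sorted(ports.items()))

# A driver given only (a slice of) port 1 of a 2-port card:
# the single interface lands under key 1, and there is no key 0.
example = group_by_port([("eth0", "0000:0b:00.4", 1)])
```

A counter would have called that interface the first port; grouping by
dev_id keeps the number the driver actually reported.  It does not resolve
the case of drivers that never set dev_id, where every interface reports 0.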

> Now for the partitioned devices (NPAR or SR-IOV).  Here, we have
> multiple PCI devices mapped to the same port.
> 
> My naming convention of:
>   pci<slot>#<port>_<partition>
> wants to express this relationship. 
> 
> I need a way to express which port a given partition maps to.  I'm
> also presuming this is a static mapping right now, that it won't
> change around during runtime (ala Xsigo, which I have no solution here
> for; if the mapping isn't static, this is going to get trickier).
> 
> As dev_ids are only unique per PCI device, we would need a pointer to
> the "base" device.  However, in the Broadcom 57712 case, there is no
> such "base" device. :-( So, using dev_id here doesn't seem like the
> right approach for these devices.

dev_ids can handle NPAR but I do understand that dev_id 0 is ambiguous.  Two 
functions with dev_id 0 mean one thing for a driver that sets dev_id and a 
very different thing for one that doesn't.

> What if we did something like this?
> 
> /sys/devices/net_ports/port0/
> /sys/devices/pci0000:00/0000:00:1c.0/0000:0b:00.0/net/eth0/port -> 
>     /../../../../../net_ports/port0
> /sys/devices/pci0000:00/0000:00:1c.0/0000:0b:00.1/net/eth1/port -> 
>     /../../../../../net_ports/port0
> 
> 
> In this case, the port0 "name" is simply a way to group interfaces
> into ports, it's not how ports are labeled on the chassis.

If I understand you right a "port" is a group of interfaces sharing one 
physical port without saying which one.  I think dev_id does the same and 
specifies which physical port.

> 
> Do network drivers know how many ports they have?
> What are the characteristics of network ports? Ideally, physical
> location (PCI slot), and index within that physical location.

This index is the dev_id for drivers that set it.

> These
> right now I'm deriving from SMBIOS and PCI, and if not explicitly
> exposed, counting devices on the same slot and assigning port numbers
> that way, but I would love to have explicit information from the
> drivers.
> 
> Thoughts?
> 
> Thanks,
> Matt
> 

