* wip-addr
@ 2015-10-02 23:24 Marcus Watts
  2015-10-09 21:49 ` wip-addr Sage Weil
  0 siblings, 1 reply; 9+ messages in thread
From: Marcus Watts @ 2015-10-02 23:24 UTC
  To: ceph-devel

wip-addr

1. where is it?
2. current state
3. more info
4. cheap fixes
5. in case you were wondering why?

____ 1. where is it?

I've just pushed another update to wip-addr:

git@github.com:linuxbox2/linuxbox-ceph.git
wip-addr

____ 2. current state

This version
1/ compiles
2/ ran an extremely limited set of tests successfully
	(was able to bring up ceph-mon, ceph-osd).

In theory, it should do everything a recent "master" branch copy of ceph
can do and little or nothing past that.  Internally it adds "address vector"
support, some parsing/print logic, and lots of encoding rules to pass them
around, but there's nothing yet that can create address vectors and little
that makes any sensible use of them.  So this is just the back end encoding
and storage rules.

Phase 2 is to add logic to actually make it useful.
	(the very start of this is on linuxbox2 "wip-addr-p2",
	just monmap changes so far...)

____ 3. more info

There's an etherpad document that describes this in more detail,

http://pad.ceph.com/p/wip_addr

____ 4. cheap fixes

A couple of minor issues that should be easy to resolve:
1.
AsyncConnection.cc
This passes addresses back and forth as it's setting up the connection,
and it also exchanges features.  As best I can tell, it
exchanges addresses before it knows what features the other end
supports.  There should be something in here that sends addresses
only after it knows what features the other end supports, so the
address encoding can match those features.

2.
(about line 2067 in src/tools/ceph_objectstore_tool.cc)
tools - "object store tool" (used via the ceph command?).
This has a way to serialize objects.  The serialized form includes a
watch list, which in turn includes an address.  There should be an option
here to say whether to include addresses in the export.
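
The ordering concern in item 1 above can be sketched as follows.  This is
not AsyncConnection's actual code, only a minimal illustration of the rule
"do not pick an address encoding until the peer's features are known".
The feature bit, the enum, and the function name are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bit meaning "peer understands the new address
// encoding".  The real bit (if any) in wip-addr may be named and
// numbered differently.
constexpr uint64_t FEATURE_NEW_ADDR_HYPOTHETICAL = 1ULL << 59;

enum class AddrEncoding { Legacy, New };

// Decide how to encode our address for a peer.
// features_known stays false until the feature exchange has completed;
// until then the only safe choice is the legacy encoding.
inline AddrEncoding choose_addr_encoding(bool features_known,
                                         uint64_t peer_features) {
  if (!features_known)
    return AddrEncoding::Legacy;
  return (peer_features & FEATURE_NEW_ADDR_HYPOTHETICAL)
             ? AddrEncoding::New
             : AddrEncoding::Legacy;
}
```

If the handshake sends addresses before features, every call happens with
features_known == false, so the new encoding would never be selected.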

____ 5. in case you were wondering why?

The main current interest for this is to work with accelio,
which introduces 2 more transport types (accelio, via tcp or
infiniband).  So that means there are actually 6 possible choices,
{ ipv4 or ipv6 } x { simple messenger, accelio/tcp, accelio/infiniband }.
Infiniband typically would work within a data center but not
between them, so there are also reachability issues which
may not be obvious to the client.  (ipv6 has this too.)

					-Marcus Watts

* wip-addr
@ 2015-08-24 18:01 Sage Weil
  2015-08-25 10:06 ` wip-addr Marcus Watts
  0 siblings, 1 reply; 9+ messages in thread
From: Sage Weil @ 2015-08-24 18:01 UTC
  To: mwatts, ceph-devel

Hi Marcus,

I looked over the wip-addr branch a bit.  I have two basic 
questions/concerns:

1) In commit
	https://github.com/ceph/ceph/commit/73b09090466a43d5aceb979a4802de3f3f5bf24a 
we switch the type field from sa_family to transport_type.  This seems 
like the way to go, but we need to deal with the fact that lots of 
clusters out there are IPv6 and have AF_INET6 filled in here.  Probably 
both types should be interpreted to mean "existing/legacy IP messenger" or 
whatever we want to call SimpleMessenger/AsyncMessenger's protocol.

I think encode needs to make sure it fills in that value for the type when 
encoding the legacy entity_addr_t encoding, but could use a/the single 
valid value for the new encoding.  And any get_transport_type() accessor 
should also return the single valid value.
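
A minimal sketch of point 1, assuming a made-up TransportType enum (the
names and values below are not taken from the branch): old encodings that
carry AF_INET or AF_INET6 in the type field both decode to one "legacy"
transport, and the encoder writes the address family back only when talking
to a peer that expects the legacy encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/socket.h>  // AF_INET, AF_INET6

// Hypothetical transport identifiers.  Legacy and Xio use values chosen
// so they cannot be confused with AF_INET or AF_INET6.
enum class TransportType : uint32_t {
  Unknown = 0,
  Legacy  = 0x1000,  // existing SimpleMessenger/AsyncMessenger protocol
  Xio     = 0x1001,
};

// Interpret the type field of a legacy entity_addr_t encoding:
// either address family means "legacy IP messenger".
inline TransportType transport_from_legacy_type(uint32_t wire_type) {
  if (wire_type == static_cast<uint32_t>(AF_INET) ||
      wire_type == static_cast<uint32_t>(AF_INET6))
    return TransportType::Legacy;
  return TransportType::Unknown;
}

// Value to put in the type field when encoding.  For a peer that only
// knows the legacy encoding, a legacy address keeps its sa_family there;
// the new encoding always carries the single transport value.
inline uint32_t type_for_wire(TransportType t, int sa_family,
                              bool peer_has_new_encoding) {
  if (!peer_has_new_encoding && t == TransportType::Legacy)
    return static_cast<uint32_t>(sa_family);
  return static_cast<uint32_t>(t);
}
```

With this shape, a get_transport_type() accessor would only ever return
TransportType::Legacy for existing clusters, regardless of address family.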

2) In the later commit
	https://github.com/ceph/ceph/commit/9d203a2058f76414703b4fc212a1a0a960d0c672 
you introduce a grammar for printing/parsing the addrs.  This also makes 
sense since e.g. xio uses an IP to identify an endpoint.  I think we 
should identify these based on the *protocol* and not the implementation, 
though... whether we use SimpleMessenger or AsyncMessenger is not a 
property of the address.  Maybe "tcp://" makes more sense here?  Or 
perhaps no prefix at all (a bare IP address), so that this looks the same 
as it did before in the case where the default protocol(s) are in use.

I assume the xio protocol (whether it is rdma or tcp) is closely tied to 
libxio itself.. is that right?  If so, using xio in that prefix makes 
sense.  I'd include xio somewhere in the rdma prefix though (xrdma:// and 
xtcp://)?
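
To make the prefix idea concrete, here is a rough sketch of a parser that
accepts a bare address (treated as the default protocol) as well as the
tcp://, xtcp:// and xrdma:// prefixes floated above.  ParsedAddr and
parse_addr are hypothetical names, the prefix set is only the one proposed
in this mail, and the host/port part is left unparsed.

```cpp
#include <cassert>
#include <string>

// Result of splitting "proto://host:port" into its two parts.
struct ParsedAddr {
  std::string protocol;   // "tcp", "xtcp" or "xrdma"
  std::string host_port;  // everything after the prefix, not validated
  bool ok;                // false for unknown prefixes or empty addresses
};

inline ParsedAddr parse_addr(const std::string& s) {
  const std::string sep = "://";
  const std::string::size_type pos = s.find(sep);
  if (pos == std::string::npos) {
    // No prefix: a bare IP address means the default protocol, so the
    // string looks exactly as it did before prefixes existed.
    return {"tcp", s, !s.empty()};
  }
  const std::string proto = s.substr(0, pos);
  const std::string rest = s.substr(pos + sep.size());
  const bool known = proto == "tcp" || proto == "xtcp" || proto == "xrdma";
  return {proto, rest, known && !rest.empty()};
}
```

For example, parse_addr("xrdma://10.0.0.1:6789") yields protocol "xrdma"
and host_port "10.0.0.1:6789", while parse_addr("[::1]:6789") falls through
to the default "tcp".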

What do you think?

Logistically, I think the steps for getting this ready for merge are:

1) Separate out the preliminary patches that pass a feature to the addr 
encoding.. without any of the other cohortfs patches that are currently on 
this branch.  Once this builds we can merge it separately from the rest...

2) The entity_addrvec_t type.

3) The type -> transport_type switch.

4) We should make the new entity_addr_t encoding encode the sockaddr piece 
more compactly instead of eating up a full 128-byte sockaddr_storage even 
for the 16-byte IPv4 sockaddr_in.  Maybe we just need to encode an 
explicit length for the sockaddr_* piece?
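
One way the explicit-length idea in step 4 could look, as a sketch only:
write a 4-byte little-endian length followed by just that many bytes of the
sockaddr.  The function names are hypothetical and this is not Ceph's
encode/decode machinery.  It also copies the raw sockaddr bytes, which is
platform-specific, so a real encoding would likely need to normalize the
family field.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>
#include <netinet/in.h>  // sockaddr_in, sockaddr_in6, htons
#include <sys/socket.h>  // sockaddr_storage, AF_INET, AF_INET6

// Number of meaningful bytes for the address family in use.
inline size_t sockaddr_len(const sockaddr_storage& ss) {
  switch (ss.ss_family) {
    case AF_INET:  return sizeof(sockaddr_in);
    case AF_INET6: return sizeof(sockaddr_in6);
    default:       return sizeof(sockaddr_storage);  // unknown: keep all
  }
}

// Append a 4-byte little-endian length, then only that many bytes.
inline void encode_sockaddr(const sockaddr_storage& ss,
                            std::vector<uint8_t>& out) {
  const uint32_t len = static_cast<uint32_t>(sockaddr_len(ss));
  for (int i = 0; i < 4; ++i)
    out.push_back(static_cast<uint8_t>(len >> (8 * i)));
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&ss);
  out.insert(out.end(), p, p + len);
}

// Read the length back and copy into a zeroed sockaddr_storage.
// Returns false if the buffer is truncated or the length is too large.
inline bool decode_sockaddr(const std::vector<uint8_t>& in,
                            sockaddr_storage& ss) {
  if (in.size() < 4)
    return false;
  uint32_t len = 0;
  for (int i = 0; i < 4; ++i)
    len |= static_cast<uint32_t>(in[i]) << (8 * i);
  if (len > sizeof(ss) || in.size() - 4 < len)
    return false;
  std::memset(&ss, 0, sizeof(ss));
  std::memcpy(&ss, in.data() + 4, len);
  return true;
}
```

An IPv4 address then costs 4 + sizeof(sockaddr_in) bytes on the wire
instead of sizeof(sockaddr_storage).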

Something like that?
sage


end of thread, other threads:[~2015-10-12 18:04 UTC | newest]

Thread overview: 9+ messages
2015-10-02 23:24 wip-addr Marcus Watts
2015-10-09 21:49 ` wip-addr Sage Weil
     [not found]   ` <CACJqLyZh4WQZv_3isjXJy=t6YL700C2GjWuen-QG1D5=RkKHYw@mail.gmail.com>
2015-10-10  3:20     ` wip-addr Haomai Wang
2015-10-10 12:07     ` wip-addr Sage Weil
2015-10-12 17:42   ` wip-addr David Zafman
2015-10-12 18:04     ` wip-addr Sage Weil
  -- strict thread matches above, loose matches on Subject: below --
2015-08-24 18:01 wip-addr Sage Weil
2015-08-25 10:06 ` wip-addr Marcus Watts
2015-08-25 14:08   ` wip-addr Sage Weil
