Anthony Cameron edited this page Jul 9, 2013 · 4 revisions

Federation Protocol Planning

To be a truly robust and scalable service, echoplexus needs to be able to federate messages, at least to other instances of echoplexus. This page outlines a proposed plan for bringing federation to the project and notes potential pitfalls.

The longer term goals of federation:

  1. Federation amongst multiple echoplexus server instances
  2. Peers join in, with some elected to be supernodes. This can be accomplished with node-webkit.
  3. Bridging federation across protocols/services: namely, it would be great if we could interface with users on IRC

Phase One

Echoplexus should be fitted with a server metadata module. This module would provide an HTTP API for information such as:

  • the current overall version of echoplexus
  • specific modules that are enabled
    • their API version
    • the oldest API version each remains backwards-compatible with
    • any special namespaces (if applicable/differing)
  • how other instances should interact with it: the FQDN to use for the wss connection
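
As a rough illustration of the metadata listed above, here is a minimal Python sketch of what such a document could look like and how a peer might check module compatibility. None of this is specified by the plan: the field names (`version`, `modules`, `api_version`, `compatible_since`, `namespace`, `wss_host`), the use of integer API versions, and the example values are all assumptions made for the sketch.

```python
import json


def build_metadata(version, modules, wss_host):
    """Assemble the server metadata document (all field names are hypothetical)."""
    return {
        "version": version,                    # overall echoplexus version
        "modules": modules,                    # enabled modules and their API info
        "federation": {"wss_host": wss_host},  # FQDN to use for the wss connection
    }


def is_compatible(local_api, remote_module):
    """True if our API version falls inside the range the remote module supports."""
    return remote_module["compatible_since"] <= local_api <= remote_module["api_version"]


# Made-up example values; integer API versions are an assumption of this sketch.
metadata = build_metadata(
    version="0.2.0",
    modules={"chat": {"api_version": 3, "compatible_since": 2, "namespace": "/chat"}},
    wss_host="a.example.org",
)

# The HTTP API could serve this JSON body to a querying instance.
body = json.dumps(metadata, sort_keys=True)
```

A querying instance would parse `body`, keep the result in memory, and call something like `is_compatible` once per module it shares with the remote server.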

We'll use A and B to represent 2 server instances.

  1. Server operator of new echoplexus instance B points their echoplexus at established host A.
  2. B queries A's server metadata module, storing the capabilities of A in memory.
  3. B sends out a federation request
  4. A decides:
  • to let B join:
    • A and B negotiate a wss connection
      • if successful:
        • A adds B to its list of echoplexi that it will relay (via broadcast) events to
        • B adds A (reciprocation of the above)
      • if unsuccessful, they try again, backing off between attempts as failures accumulate
  • to keep B out:
    • A responds with a nack. If A is over its load capacity, it can optionally recommend another echoplexus for B to connect to
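
The decision and retry steps above can be sketched as follows. This is only an illustration under stated assumptions: the plan does not say which backoff scheme to use, so exponential backoff with a cap is a choice made here, and the names `backoff_delays` and `request_federation`, the attempt limit, and the returned status strings are hypothetical.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=5):
    """Exponential backoff delays in seconds, capped at `cap` (assumed scheme)."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def request_federation(accept, connect, delays, alternative=None, sleep=lambda s: None):
    """Simulate B's federation request to A.

    accept      -- A's decision to let B join
    connect     -- callable that attempts the wss negotiation, returns True on success
    delays      -- waits between failed attempts
    alternative -- another echoplexus A may recommend when it refuses
    sleep       -- injected so the sketch can run without real waiting
    """
    if not accept:
        return ("nack", alternative)
    for delay in delays:
        if connect():
            return ("joined", None)  # both sides now add each other to their relay lists
        sleep(delay)                 # back off before the next attempt
    return ("gave_up", None)         # giving up after N attempts is an assumption
```

In a real server `sleep` would be replaced by a timer and `connect` by the actual wss negotiation; they are injected here only so the control flow can be exercised in isolation.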

Relay

A and B each maintain a list of echoplexi that they will forward events to. For forwarding, a distinction must be made between origin and foreign events. An origin event is anything sent to the server directly by one of its own direct peers. A foreign event is anything that was relayed by another echoplexus.

For robustness, a central hub-and-spoke model should be avoided: we wouldn't want the center echoplexus to go down and completely island all the others.

Relay Process (wrt Sender)

  1. A receives a new chat message from Alice
  2. A does the normal stuff: it broadcasts the message to all of its direct peers
  3. A also adds the event data in its entirety to a FIFO queue, the infoflow, a list of things to be sent to all of A's server peers
  4. A periodically (every 500ms or less) flushes this queue (or some number of items up to a sensible (negotiated?) maximum from this queue) by broadcasting to each of its server peers

Each relayed event should carry a max TTL-type value so that it cannot circulate in a broadcast loop.
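
A minimal Python sketch of the sender side described above, assuming a single FIFO queue per server and a hop-count style TTL. The class name `Infoflow`, the `max_batch` and `default_ttl` parameters and their values, and the `ttl` field on events are hypothetical; the periodic timer is left out, so the caller is expected to invoke `flush` on its own schedule.

```python
from collections import deque


class Infoflow:
    """FIFO queue of events awaiting relay to server peers (hypothetical sketch)."""

    def __init__(self, max_batch=50, default_ttl=8):
        self.queue = deque()
        self.max_batch = max_batch      # assumed cap on items sent per flush
        self.default_ttl = default_ttl  # assumed hop limit for origin events

    def enqueue(self, event, ttl=None):
        """Queue an event with `ttl` hops remaining; refuse it once the TTL is spent."""
        ttl = self.default_ttl if ttl is None else ttl
        if ttl <= 0:
            return False  # TTL exhausted: do not relay any further
        self.queue.append(dict(event, ttl=ttl))
        return True

    def flush(self, server_peers):
        """Send up to max_batch queued events, oldest first, to every server peer."""
        count = min(self.max_batch, len(self.queue))
        batch = [self.queue.popleft() for _ in range(count)]
        if batch:
            for send in server_peers:
                send(batch)
        return batch
```

Under this scheme a server that relays a foreign event onward would call `enqueue(event, ttl=event["ttl"] - 1)`, so every hop consumes one unit and an event caught in a cycle eventually stops being forwarded.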

Relay Process (wrt Receiver)

  1. B has 0+ connections to other echoplexi, and with that, 0+ infoflow buffers
  2. While parsing each event, B checks which module it came from:
  • B checks that it hasn't seen this exact duplicate event recently (perhaps from another infoflow). If it has, toss out the event.
  • if our modules aren't API-compatible (gleaned during the capability negotiation / federation stage), toss out the event.
  • if our modules are API-compatible, the event is re-emitted internally as if it had come from a phantom direct peer and relayed to all of B's direct peers as appropriate
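
The receiver-side checks above could look roughly like this in Python. The sketch assumes every event carries a unique `id` and a `module` name, and that "recently" means "within the last N events seen"; the plan specifies none of these, and the names `Receiver`, `handle`, and `memory` are hypothetical.

```python
from collections import OrderedDict


class Receiver:
    """Filters relayed events before re-emitting them locally (hypothetical sketch)."""

    def __init__(self, compatible_modules, memory=1000):
        self.compatible = set(compatible_modules)  # learned during capability negotiation
        self.seen = OrderedDict()                  # recently seen event ids, oldest first
        self.memory = memory                       # assumed size of the "recent" window

    def handle(self, event, emit):
        """Return "duplicate", "incompatible" or "relayed" for one incoming event."""
        event_id = event["id"]
        if event_id in self.seen:
            return "duplicate"             # already received, perhaps via another infoflow
        self.seen[event_id] = True
        if len(self.seen) > self.memory:
            self.seen.popitem(last=False)  # forget the oldest id
        if event["module"] not in self.compatible:
            return "incompatible"          # module APIs do not match: toss the event
        emit(event)                        # re-emit as if from a phantom direct peer
        return "relayed"
```

Here `emit` stands in for whatever internal mechanism re-emits the event to B's direct peers. Bounding the memory of seen ids keeps the duplicate check cheap, at the cost of possibly accepting a very old duplicate again.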

Phase Two +

In this phase, an echoplexus user is promoted to a supernode based on some algorithm deciding their candidacy, and is considered a 2nd-level candidate. To be elucidated at a later date.