Federation

Federation Protocol Planning

To be a truly robust and scalable service, echoplexus needs to federate messages, at minimum with other instances of echoplexus. This page proposes a plan for bringing federation to the project and notes potential pitfalls.

The longer-term goals of federation:

Federation amongst multiple echoplexus server instances.
Peers join in, with some elected to be supernodes. This can be accomplished with node-webkit.
Bridging federation across protocols and services. In particular, it would be great if we could interface with users on IRC.

Phase One

Echoplexus should be fitted with a server metadata module. This module would provide an HTTP API for information such as:

the current overall version of echoplexus
specific modules that are enabled
- API version
- the earliest API version each module remains backwards-compatible with
- any special namespaces (if applicable/differing)
how other instances should interact with it: the FQDN to use for the wss connection
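As a rough illustration of what such a metadata document could look like, here is a minimal sketch. Every field name (`apiVersion`, `backwardsCompatibleTil`, `namespace`, `federationHost`) and both helper functions are assumptions made for this example, not an agreed format. The compatibility rule shown is one possible reading of the "backwards-compatible-til" value: each side's API version must be no older than the other side's compatibility floor.

```typescript
// Hypothetical shape of the server metadata document; all field names are assumptions.
interface ModuleInfo {
  apiVersion: number;             // the module's current API version
  backwardsCompatibleTil: number; // oldest API version the module still accepts
  namespace?: string;             // only present when it differs from the default
}

interface ServerMetadata {
  version: string;                     // overall echoplexus version
  modules: Record<string, ModuleInfo>; // enabled modules, keyed by module name
  federationHost: string;              // FQDN peers should use for the wss connection
}

// Two module versions can interoperate when each side's version is
// no older than the other side's compatibility floor.
function modulesCompatible(ours: ModuleInfo, theirs: ModuleInfo): boolean {
  return (
    theirs.apiVersion >= ours.backwardsCompatibleTil &&
    ours.apiVersion >= theirs.backwardsCompatibleTil
  );
}

// Names of modules enabled on both servers with compatible API versions.
function sharedModules(a: ServerMetadata, b: ServerMetadata): string[] {
  return Object.keys(a.modules).filter((name) => {
    const ours = a.modules[name];
    const theirs = b.modules[name];
    return ours !== undefined && theirs !== undefined && modulesCompatible(ours, theirs);
  });
}
```

An instance could compute `sharedModules` once after fetching a peer's metadata and keep the result in memory for later event filtering.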

We'll use A and B to represent two server instances.

The server operator of a new echoplexus instance B points it at an established host A.
B queries A's server metadata module, storing the capabilities of A in memory.
B sends A a federation request.
A decides:

to let B join:
- A and B negotiate a wss connection
  - if successful:
    - A adds B to its list of echoplexi to which it will relay events (via broadcast)
    - B adds A (reciprocation of the above)
  - if unsuccessful, they retry, backing off further with each additional failure
to keep B out:
- A responds with a nack. If A is over its load capacity, it can optionally recommend another echoplexus for B to connect to.
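The decision and retry steps above could be sketched as follows. This is only an illustration under stated assumptions: the reply shape, the `HostState` fields, the peer-count load limit, and the doubling backoff schedule with its default values are all hypothetical. The proposal itself only specifies a nack, an optional recommendation, and backing off on repeated failures.

```typescript
// Hypothetical reply shapes; the proposal only names a "nack" and an optional recommendation.
type FederationReply =
  | { kind: "ack" }
  | { kind: "nack"; recommend?: string };

interface HostState {
  peers: string[];        // FQDNs of echoplexi already federated with this host
  maxPeers: number;       // assumed load limit, expressed as a peer count
  alternatives: string[]; // other echoplexi this host could recommend
}

// A's side of the handshake: accept, or nack with an optional recommendation.
function decideFederation(host: HostState, requester: string): FederationReply {
  if (host.peers.indexOf(requester) !== -1) {
    return { kind: "ack" }; // already federated with this requester
  }
  if (host.peers.length >= host.maxPeers) {
    const recommend = host.alternatives.find((alt) => alt !== requester);
    return recommend === undefined ? { kind: "nack" } : { kind: "nack", recommend };
  }
  return { kind: "ack" };
}

// Delay before the next wss attempt after `failures` consecutive failures:
// doubles with each failure and never exceeds capMs.
function backoffDelayMs(failures: number, baseMs = 1000, capMs = 60000): number {
  const exponent = Math.max(0, failures - 1);
  return Math.min(capMs, baseMs * Math.pow(2, exponent));
}
```

Capping the delay keeps a long outage from pushing retries arbitrarily far apart, so federation resumes reasonably soon after the peer returns.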

Relay

A and B each maintain a list of echoplexi to which they will forward events. For forwarding, a distinction must be made between origin and foreign events. An origin event is anything sent directly by a direct peer of the server. A foreign event is anything that was relayed by another echoplexus.
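One way to carry the origin/foreign distinction is to tag each event as it arrives, based on the kind of connection that delivered it. The sketch below assumes hypothetical `Source` and `TaggedEvent` shapes and a `tagEvent` helper; none of these names come from the echoplexus codebase.

```typescript
// Hypothetical descriptor of the connection that handed this server the event.
type Source =
  | { kind: "client"; clientId: string } // a user connected directly to this server
  | { kind: "server"; host: string };    // another echoplexus relaying to us

interface TaggedEvent {
  module: string;                    // module the event belongs to
  payload: unknown;                  // event data, passed through untouched
  provenance: "origin" | "foreign";  // origin = from a direct peer, foreign = relayed
  relayedBy?: string;                // set only for foreign events
}

// Tag an incoming event according to where it came from.
function tagEvent(source: Source, module: string, payload: unknown): TaggedEvent {
  if (source.kind === "client") {
    return { module, payload, provenance: "origin" };
  }
  return { module, payload, provenance: "foreign", relayedBy: source.host };
}
```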

For robustness, a central hub-and-spoke model should be avoided: we wouldn't want the central echoplexus to go down and completely island all the others.

Relay Process (wrt Sender)

A receives a new chat message from Alice
A handles the message as usual, broadcasting it to all of its direct peers
A also adds the event data in its entirety to a FIFO queue, the infoflow, a list of things to be sent to all of A's server peers
A periodically (every 500 ms or less) flushes this queue, or some number of items from it up to a sensible (negotiated?) maximum, by broadcasting to each directly connected server peer

A maximum TTL-type value should be included with each event so that broadcast loops are avoided.
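The sender side described above might look roughly like this. It is a sketch only: the `FederatedEvent` fields, the `Infoflow` class, and the TTL convention (remaining server-to-server hops, decremented on each flush, with exhausted events never queued) are assumptions chosen for illustration.

```typescript
// Hypothetical wire shape; ttl counts the remaining server-to-server hops.
interface FederatedEvent {
  id: string;
  module: string;
  payload: unknown;
  ttl: number;
}

class Infoflow {
  private queue: FederatedEvent[] = [];
  private maxBatch: number;

  constructor(maxBatch: number) {
    this.maxBatch = maxBatch; // most events sent per flush (could be negotiated)
  }

  // Queue an event for relay; events with no hops left are dropped.
  enqueue(event: FederatedEvent): void {
    if (event.ttl > 0) this.queue.push(event);
  }

  // Remove up to maxBatch events in FIFO order, spending one hop on each.
  flush(): FederatedEvent[] {
    return this.queue
      .splice(0, this.maxBatch)
      .map((e) => ({ id: e.id, module: e.module, payload: e.payload, ttl: e.ttl - 1 }));
  }

  pending(): number {
    return this.queue.length;
  }
}

// Broadcast one batch to every server peer; returns how many events were sent.
function flushToPeers(
  flow: Infoflow,
  peers: Array<(batch: FederatedEvent[]) => void>
): number {
  const batch = flow.flush();
  if (batch.length === 0) return 0;
  for (const send of peers) send(batch);
  return batch.length;
}
```

In a running server, something like `setInterval(() => flushToPeers(flow, peers), 250)` would drive the periodic flush, with each entry in `peers` wrapping a send over one wss connection.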

Relay Process (wrt Receiver)

B has zero or more connections to other echoplexi and, with them, zero or more infoflow buffers.
While parsing each event, B checks which module it came from.

B checks that it hasn't seen this exact event recently (perhaps from another infoflow). If it has, B tosses out the event.
If the modules aren't API-compatible (as gleaned during the capability negotiation / federation stage), B tosses out the event.
If the modules are API-compatible, the event is re-emitted internally as if it had come from a phantom direct peer, and relayed to all of B's direct peers as appropriate.
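The receiver checks above can be sketched as a small pipeline. The code assumes each event carries a unique `id` for duplicate detection and that B keeps a set of module names it found API-compatible during negotiation. Both are assumptions, as are the names `RecentlySeen`, `receiveEvent`, and the time-window approach to "recently".

```typescript
// Hypothetical wire shape; a unique id per event is assumed for duplicate detection.
interface FederatedEvent {
  id: string;
  module: string;
  payload: unknown;
  ttl: number;
}

// Remembers event ids for a limited time window.
class RecentlySeen {
  private seen = new Map<string, number>(); // event id -> time first seen (ms)
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Returns true if the id was seen within the window; otherwise records it.
  check(id: string, now: number): boolean {
    this.seen.forEach((firstSeen, key) => {
      if (now - firstSeen > this.windowMs) this.seen.delete(key); // evict stale ids
    });
    if (this.seen.has(id)) return true;
    this.seen.set(id, now);
    return false;
  }
}

type Verdict = "duplicate" | "incompatible" | "accepted";

// Apply the receiver checks in order, re-emitting the event locally if it passes.
function receiveEvent(
  event: FederatedEvent,
  seen: RecentlySeen,
  compatibleModules: Set<string>,
  now: number,
  emitLocally: (event: FederatedEvent) => void
): Verdict {
  if (seen.check(event.id, now)) return "duplicate";
  if (!compatibleModules.has(event.module)) return "incompatible";
  emitLocally(event); // treated as if it came from a phantom direct peer
  return "accepted";
}
```

Passing `now` in explicitly, for example `Date.now()` in production, keeps the duplicate window easy to test.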

Phase Two +

This is the phase in which an echoplexus user is promoted to a supernode, based on some algorithm that decides their candidacy, and is considered a second-level candidate. To be elucidated at a later date.
