[mnet-devel] desiderata and open issues in ent

Zooko zooko at zooko.com
Wed Sep 17 14:40:59 BST 2003


Dear mnet-devel and p2p-hackers:

[If Robert Hettinga forwards this message to a list that you read, and you 
find it interesting, perhaps you should consider joining mnet-devel [1] or 
p2p-hackers [2].]

[Note that ent is not the only thing going on in terms of improved Mnet 
networking.  Tschechow has gotten a simplified version of ent named "router" 
running, Myers is implementing Twisted Chord, and Arno *was* working on 
multiple independent metatrackers before Real Life distracted him.]


These are the ways that ent should differ from emergent network designs like 
Chord and Kademlia.  Some of these desiderata dovetail with one another, but 
others of them appear to be conflicting.  It may be impossible to satisfy all 
of them at once, but I currently believe that we can at least partially 
achieve all of them.

1.  It should self-heal flaws in the network structure.  There is a protocol [3]
known to do this for Chord, but not (as far as I know) for Kademlia.  (re: 
recent discussion [4])

2.  It should handle the reality that a large fraction (around half) of the 
nodes are behind NAT or firewalls and can't accept incoming connections.  If 
two nodes are both restricted like that then they cannot be peers of one 
another in the ent graph.

2.b.  But it will not try to handle arbitrary underlay structure -- it will 
rely on the fact that a large fraction of the nodes form a single fully-
connected clique, and that every node has edges to almost all of the clique 
nodes.

3.  It should handle transient nodes.

3.a.  Newcomer nodes are never entrusted with the only replica of any data.  
They may be entrusted with no data at all, or at a cost of using extra 
bandwidth, they can be entrusted with an extra replica that is already stored 
by an old-timer node.

3.b.  Suppose a new node "B" joins the network, and it is no responsible for a 
part of the id space that was formerly covered by an old node "A".  Now 
suppose A is serving up 200 GB of data, 100 GB of which now falls into B's 
part of the id space.  When queries for blocks from B's space get routed to 
be, B will forward them to A, and cache copies of the response.  This 
dovetails with 3.a.

4.  It should include incentives for people to run good nodes.

4.a.  The most important incentives are probably social -- people should chat 
with one another, have (*real*, human social, not digital) reputations, and so 
forth.  That's outside the scope of ent design.

4.b.  Peering should have a selfish incentive mechanism so that people who run 
high-quality nodes get higher-quality performance from the network when they 
try to use it themselves.  (I'm inspired by the tit-for-tat pairwise 
incentives from Bram's original BitTorrent design.  I don't know if the 
current BitTorrent has changed in that respect.)

5.  It should exploit locality: nodes which have short fat pipes between them 
(they are "close to" each other in the underlay network) should be more likely 
to peer with each other than nodes that share long skinny pipes.

6.  It should exploit heterogeneity.  The capacities of nodes is expected to 
follow a power-law distribution: half of the nodes are Pentium II's on 33 Kbps 
dial-up with 5 MB of free disk space, a quarter of the nodes are Pentium IV's
on 512 Kbps DSL with 5 GB of free disk space, an eighth of the nodes are quad-
Opterons on 44 Mbps DS3 with 1 TB free disk space, a sixteenth of the nodes 
are supercomputers which are illicitly bridging the Internet to "Internet 2", 
etc.

7.  Normal emergent network desiderata: scalable, robust, efficient.

8.  Simple, analyzable, measurable.


Open issues:

 * Self-healing (1.) has to be designed for Kademlia or else locality (5.) has 
to be designed for Chord or else we have to switch to another emergent network 
design entirely for the basis of ent.

 * Id-space shadowing (3.b.) has to be designed carefully to avoid looping or 
other bad artifacts, and to behave acceptably well in the various cases of A, 
B, and other nodes coming and going.

 * Selfish peering (4.b.) has to be balanced against the system-wide consistency 
and performance desiderata.  For example, each individual node wants to link 
*only* with peers that provide good quality service to it.  However, suppose 
there is a node that provides bad-quality service, that nobody wants to peer 
with.  Suppose that node is in sole possession of a block of data.  We still 
want searches in the network to find that block!  This may be impossible, in 
which case we have to choose a trade-off, and hopefully one which is either 
qualitative or easily measurable.

 * Putting all of these together is the big trick.  Can we exploit 
heterogeneity (6.) and locality (5.) at the same time?  Can we exploit both of 
these while retaining any sort of analyzability/measurability?  Etc.


A general design idea:

One way that some of these apparently conflicting desiderata can be reconciled 
is to use redundant "special-purpose" overlay networks.  For example, 
Pastry [5] uses its free-choice property to increase network locality, while 
the very similar Kademlia [6] uses the same free-choice property to increase 
robustness, at the expense of locality.  That is: in Pastry you have to choose 
any one out of (say) a thousand nodes to be your peer, and you attempt to 
choose the one which is closest to you in the underlay, i.e. the one that has 
the fastest connection to you.  In Kademlia you have to choose any one out of 
a thousand nodes, and you attempt to choose the one that is least likely to 
drop off the net.

My idea for ent is that you have two separate overlay networks, one in which 
you prefer the most reliable nodes and the most robust emergent network 
topology, and the other in which you prefer the fastest nodes and the most 
efficient emergent network topology.  When the latter fails, you use the 
former to rebuild it.


Regards,

Zooko

http://zooko.com/log.html

[1] http://sourceforge.net/mailarchive/forum.php?forum_id=7702
[2] http://zgp.org/pipermail/p2p-hackers/
[3] http://citeseer.nj.nec.com/liben-nowell02observations.html
[4] http://zgp.org/pipermail/p2p-hackers/2003-August/001344.html
[5] http://research.microsoft.com/~antr/Pastry/
[6] http://citeseer.nj.nec.com/529075.html


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
mnet-devel mailing list
mnet-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mnet-devel




More information about the Mnet-devel mailing list