[mnet-devel] desiderata and open issues in ent
Zooko
zooko at zooko.com
Wed Sep 17 14:40:59 BST 2003
Dear mnet-devel and p2p-hackers:
[If Robert Hettinga forwards this message to a list that you read, and you
find it interesting, perhaps you should consider joining mnet-devel [1] or
p2p-hackers [2].]
[Note that ent is not the only thing going on in terms of improved Mnet
networking. Tschechow has gotten a simplified version of ent named "router"
running, Myers is implementing Twisted Chord, and Arno *was* working on
multiple independent metatrackers before Real Life distracted him.]
These are the ways that ent should differ from emergent network designs like
Chord and Kademlia. Some of these desiderata dovetail with one another, but
others of them appear to be conflicting. It may be impossible to satisfy all
of them at once, but I currently believe that we can at least partially
achieve all of them.
1. It should self-heal flaws in the network structure. There is a protocol [3]
known to do this for Chord, but not (as far as I know) for Kademlia. (re:
recent discussion [4])
2. It should handle the reality that a large fraction (around half) of the
nodes are behind NAT or firewalls and can't accept incoming connections. If
two nodes are both restricted like that then they cannot be peers of one
another in the ent graph.
2.b. But it will not try to handle arbitrary underlay structure -- it will
rely on the fact that a large fraction of the nodes form a single fully-
connected clique, and that every node has edges to almost all of the clique
nodes.
3. It should handle transient nodes.
3.a. Newcomer nodes are never entrusted with the only replica of any data.
They may be entrusted with no data at all, or at a cost of using extra
bandwidth, they can be entrusted with an extra replica that is already stored
by an old-timer node.
3.b. Suppose a new node "B" joins the network, and it is no responsible for a
part of the id space that was formerly covered by an old node "A". Now
suppose A is serving up 200 GB of data, 100 GB of which now falls into B's
part of the id space. When queries for blocks from B's space get routed to
be, B will forward them to A, and cache copies of the response. This
dovetails with 3.a.
4. It should include incentives for people to run good nodes.
4.a. The most important incentives are probably social -- people should chat
with one another, have (*real*, human social, not digital) reputations, and so
forth. That's outside the scope of ent design.
4.b. Peering should have a selfish incentive mechanism so that people who run
high-quality nodes get higher-quality performance from the network when they
try to use it themselves. (I'm inspired by the tit-for-tat pairwise
incentives from Bram's original BitTorrent design. I don't know if the
current BitTorrent has changed in that respect.)
5. It should exploit locality: nodes which have short fat pipes between them
(they are "close to" each other in the underlay network) should be more likely
to peer with each other than nodes that share long skinny pipes.
6. It should exploit heterogeneity. The capacities of nodes is expected to
follow a power-law distribution: half of the nodes are Pentium II's on 33 Kbps
dial-up with 5 MB of free disk space, a quarter of the nodes are Pentium IV's
on 512 Kbps DSL with 5 GB of free disk space, an eighth of the nodes are quad-
Opterons on 44 Mbps DS3 with 1 TB free disk space, a sixteenth of the nodes
are supercomputers which are illicitly bridging the Internet to "Internet 2",
etc.
7. Normal emergent network desiderata: scalable, robust, efficient.
8. Simple, analyzable, measurable.
Open issues:
* Self-healing (1.) has to be designed for Kademlia or else locality (5.) has
to be designed for Chord or else we have to switch to another emergent network
design entirely for the basis of ent.
* Id-space shadowing (3.b.) has to be designed carefully to avoid looping or
other bad artifacts, and to behave acceptably well in the various cases of A,
B, and other nodes coming and going.
* Selfish peering (4.b.) has to be balanced against the system-wide consistency
and performance desiderata. For example, each individual node wants to link
*only* with peers that provide good quality service to it. However, suppose
there is a node that provides bad-quality service, that nobody wants to peer
with. Suppose that node is in sole possession of a block of data. We still
want searches in the network to find that block! This may be impossible, in
which case we have to choose a trade-off, and hopefully one which is either
qualitative or easily measurable.
* Putting all of these together is the big trick. Can we exploit
heterogeneity (6.) and locality (5.) at the same time? Can we exploit both of
these while retaining any sort of analyzability/measurability? Etc.
A general design idea:
One way that some of these apparently conflicting desiderata can be reconciled
is to use redundant "special-purpose" overlay networks. For example,
Pastry [5] uses its free-choice property to increase network locality, while
the very similar Kademlia [6] uses the same free-choice property to increase
robustness, at the expense of locality. That is: in Pastry you have to choose
any one out of (say) a thousand nodes to be your peer, and you attempt to
choose the one which is closest to you in the underlay, i.e. the one that has
the fastest connection to you. In Kademlia you have to choose any one out of
a thousand nodes, and you attempt to choose the one that is least likely to
drop off the net.
My idea for ent is that you have two separate overlay networks, one in which
you prefer the most reliable nodes and the most robust emergent network
topology, and the other in which you prefer the fastest nodes and the most
efficient emergent network topology. When the latter fails, you use the
former to rebuild it.
Regards,
Zooko
http://zooko.com/log.html
[1] http://sourceforge.net/mailarchive/forum.php?forum_id=7702
[2] http://zgp.org/pipermail/p2p-hackers/
[3] http://citeseer.nj.nec.com/liben-nowell02observations.html
[4] http://zgp.org/pipermail/p2p-hackers/2003-August/001344.html
[5] http://research.microsoft.com/~antr/Pastry/
[6] http://citeseer.nj.nec.com/529075.html
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
mnet-devel mailing list
mnet-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mnet-devel
More information about the Mnet-devel
mailing list