[mnet-devel] another fix in comms/relay (was *this* the Bermuda Triangle Bug?)

Zooko O'Whielacronx zooko at zooko.com
Fri Feb 13 20:14:45 GMT 2004


I just committed a bug fix, bumping the vernum to Mnet v0.6.2.343-STABLE.  (See 
attached patch.)  Let me describe the bug and see if it sounds like it could 
have been the Bermuda Triangle Bug.

The bug was that when you sent a "pass this along" message to a relay server, 
and the pass-this-along message failed, then the failure was being treated as a 
fast-failure of the encapsulated message.

Now normally a pass-this-along message would work, and so nothing would happen.  
Sometimes the pass-this-along message could fast-fail, for example if the relay 
server can't be reached at all.  In this case, this code would cause the 
encapsulated message to fast-fail as well, which was good.  Sometimes the 
encapsulated message would resolve (for example, you would get a response to 
it), and *then* the pass-this-along message would fail (typically because of 
timeout), in which case this code would attempt to abort the transaction, but 
nothing would happen since the transaction had already completed.  Finally, 
sometimes the pass-this-along message would fail (typically because of timeout) 
and cause the transaction to be aborted, and then when the response to the 
encapsulated message arrived, it would get dropped on the floor.

So, this sounds a bit like the Bermuda Triangle Bug, because if the relay 
server's load increased so that its responses started timing out, this would 
cause the messages which were *responses* to the encapsulated messages to 
disappear without a trace.

(Even if the response didn't go through the relay server at all!)
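
To make the orderings above concrete, here is a rough toy model of the race.  It 
is only a sketch: ToyTransactions, simulate and the "inner"/"pong" values are 
made-up names for illustration, not the real Mnet transaction manager or its 
API.  The flag propagate_wrapper_failure stands in for the old behaviour of 
treating a pass-this-along failure as a fast-failure of the encapsulated 
message.

```python
class ToyTransactions:
    """Hypothetical stand-in for a transaction manager (not Mnet's real one)."""

    def __init__(self):
        self.pending = {}     # msg_id -> callback awaiting a response
        self.delivered = []   # (msg_id, response) pairs handed to a callback
        self.dropped = []     # responses that arrived with no pending transaction

    def initiate(self, msg_id):
        self.pending[msg_id] = lambda response: self.delivered.append((msg_id, response))

    def fast_fail(self, msg_id):
        # Abort the transaction; a no-op if it has already completed.
        self.pending.pop(msg_id, None)

    def handle_response(self, msg_id, response):
        callback = self.pending.pop(msg_id, None)
        if callback is None:
            # Nobody is waiting for this any more: dropped on the floor.
            self.dropped.append((msg_id, response))
            return
        callback(response)


def simulate(wrapper_times_out_first, propagate_wrapper_failure):
    """Replay one encapsulated message and its pass-this-along wrapper."""
    txns = ToyTransactions()
    txns.initiate("inner")
    events = ["wrapper-timeout", "inner-response"]
    if not wrapper_times_out_first:
        events.reverse()
    for event in events:
        if event == "wrapper-timeout":
            if propagate_wrapper_failure:
                txns.fast_fail("inner")  # old behaviour: abort the inner transaction
        else:
            txns.handle_response("inner", "pong")
    return txns
```

With propagate_wrapper_failure=True and the wrapper timing out first, the 
response ends up in dropped.  With the response arriving first, or with 
propagation turned off (roughly what the patch does), it ends up in delivered.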

I'll also add a diagnostic printout so that, if anything like this crops up 
again, we get a warning any time a well-formed response message arrives and we 
drop it on the floor.
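
A diagnostic of that kind might look roughly like the following.  This is a 
sketch under assumptions, not the code that will be committed: the message 
layout (a dict with "reference" and "body" keys), the helper names, and the use 
of the standard logging module instead of Mnet's own debug log are all 
hypothetical.

```python
import logging

log = logging.getLogger("toy.transactions")  # hypothetical logger name


def is_well_formed(msg):
    # Assumed message layout for this sketch: a dict with these two keys.
    return isinstance(msg, dict) and "reference" in msg and "body" in msg


def handle_response(pending, msg):
    """Deliver msg to its pending callback; return True if it was delivered."""
    if not is_well_formed(msg):
        return False  # malformed input is ignored without a warning
    callback = pending.pop(msg["reference"], None)
    if callback is None:
        # Well-formed, but no transaction is waiting for it: warn loudly.
        log.warning("dropping well-formed response %r: no pending transaction",
                    msg["reference"])
        return False
    callback(msg["body"])
    return True
```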

Regards,

Zooko


--- CommStrat.py        2 Feb 2004 04:52:57 -0000       1.17
+++ CommStrat.py        13 Feb 2004 19:51:05 -0000      1.18
@@ -344,16 +344,7 @@
         else:
             wrappermsgbody['comm strat sequence num'] = -1
 
-        def outcome_func_from_pass_this_along(widget, outcome, failure_reason=None, self=self, msg=msg):
-            assert idlib.equal(widget.get_counterparty_id(), self._relayer_id)
-            # debug.mojolog.write("CommStrat.Relay: Got result of `pass this along'.  self._relayer_id: %s, widget: %s, outcome: %s, failure_reason: %s\n", args=(self._relayer_id, widget, outcome, failure_reason,), v=3, vs="commstrats")
-            if failure_reason:
-                fast_fail_handler(failure_reason="couldn't contact relay server: %s" % hr(outcome), bad_commstrat=self)
-            if (not failure_reason) and (outcome.get('result') != "ok") and (outcome.get('result') != "success"):
-                # Note: `ok' is for backwards compatibility, `success' is preferred.
-                fast_fail_handler(failure_reason="got failure from relay server: %s" % hr(outcome), bad_commstrat=self)
-
-        self._mtm.initiate(self._relayer_id, 'pass this along', wrappermsgbody, outcome_func=outcome_func_from_pass_this_along, post_timeout_outcome_func=outcome_func_from_pass_this_along, commstratseqno=self._commstratseqno, demoteonfailure=False)
+        self._mtm.initiate(self._relayer_id, 'pass this along', wrappermsgbody, commstratseqno=self._commstratseqno, demoteonfailure=False)
 
 class Crypto(CommStrat):
     def __init__(self, pubkey, lowerstrategy, broker_id=None):





