Thursday, October 30, 2008

DHTs

I've decided to put both DHT papers into the same post, as they covered a lot of the same material.

These were papers on distributed hash tables. They are what they seem to be, though there are some clever tricks in each individual one. You hash an address, and the hash then maps to another machine via a series of lookups.
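To make the lookup idea concrete, here's a minimal sketch of a consistent-hash ring in Python. The node names and ring size are made up for illustration, and a real DHT like Chord resolves the successor in O(log n) messages rather than with a local table.

```python
import hashlib
from bisect import bisect_left

# Minimal sketch of the core DHT idea: hash a key onto an identifier ring
# and find the node responsible for it. Node names and ring size are made
# up; real systems (e.g. Chord) do the lookup hop-by-hop, not locally.

RING_BITS = 32

def ring_hash(value: str) -> int:
    """Map a string onto the [0, 2^RING_BITS) identifier ring."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** RING_BITS)

class HashRing:
    def __init__(self, nodes):
        # Sort node identifiers so we can binary-search for a successor.
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def lookup(self, key: str) -> str:
        """Return the node whose identifier is the first at or after the key's hash."""
        h = ring_hash(key)
        idx = bisect_left(self.ring, (h,))
        return self.ring[idx % len(self.ring)][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.lookup("some-address"))   # -> whichever node "owns" that hash
```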

What I don't understand is how these things help P2P, which both papers argue at such length. Since you don't get to organize the data in a pure P2P network, you can't really create arbitrary address -> data mappings. Both are handed to you; all you can build is a search service that returns one when given the other.

I do like DHTs as DNS replacements though; that seems like the proper model of usage. This is an actual problem too, as I can remember Comcast's DNS server choking and hindering access to Google.

I guess I don't have much to write, since we already read the Chord paper for 262. All of my clever insight is spent.

Monday, October 27, 2008

Active Networks

I still don't really know what active networks are. Wiki says:

"The active network architecture is composed of execution environments (similar to a unix shell that can execute active packets), a node operating system capable of supporting one or more execution environments. It also consists of active hardware, capable of routing or switching as well as executing code within active packets. This differs from the traditional network architecture which seeks robustness and stability by attempting to remove complexity and the ability to change its fundamental operation from underlying network components. Network processors are one means of implementing active networking concepts. Active networks have also been implemented as overlay networks."

That didn't help too much. I suppose it's reasonable that the applications should be allowed to reform the network. However, as my previous post said, this isn't going to optimize throughput, and thus won't be of huge value.

Is this a formative work for overlay networks?

Rambling aside, the paper itself was a survey of the lessons learned in implementing such a system. Not terribly valuable, in my own view.

Lastly, I am intrigued by the idea, at least for the DTN case. DTNs are highly mobile, with the network shifting around all the time, so it may be valuable to modify the network architecture there. Then again, I barely understand this to begin with. I'll muse more after lecture.

RON

Ron is a really good name for a project. It just reads well.

RON is a "Resilient Overlay Network". It makes a lot of sense in an area where there are significant network problems. The overlay allows for more informed, better routing decisions. Specifically it routes around trouble spots in the internet. It can also be used to enforce policies among the routers that can't be enforced at the BGP layer.
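As a rough illustration of the routing decision (not RON's actual code; the probe numbers and node names are invented), the overlay picks either the direct path or a one-hop detour through another overlay node based on measured link quality:

```python
# A rough sketch of the overlay-routing decision, under assumptions of my
# own: each node probes its peers and records a loss rate per virtual link,
# and we pick either the direct path or a one-hop detour through another
# overlay node, whichever is most likely to deliver the packet.
# The probe data and node names are invented for illustration.

loss = {  # measured loss rate on each overlay link (made-up numbers)
    ("A", "B"): 0.30,   # direct path is flaky
    ("A", "C"): 0.02,
    ("C", "B"): 0.01,
}

def delivery_prob(path):
    """Probability a packet survives every hop on the path."""
    p = 1.0
    for hop in zip(path, path[1:]):
        p *= 1.0 - loss.get(hop, 1.0)   # unknown links count as unusable
    return p

def best_path(src, dst, nodes):
    candidates = [[src, dst]] + [[src, mid, dst] for mid in nodes if mid not in (src, dst)]
    return max(candidates, key=delivery_prob)

print(best_path("A", "B", ["A", "B", "C"]))  # -> ['A', 'C', 'B']: route around the trouble spot
```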

The largest problem with this paper is that their justification went away. I very seldom experience outages; in fact, the last one I can remember was my home router's fault anyhow. People get paid a lot of money to make sure that core routers don't tip over. These guys did not anticipate the internet becoming quite the serious business it is now.

My guess is that this is early "Look at fun stuff we can do with overlays!" work. This is valid for spotty networks, but I've developed a hypothesis from our readings in this class: anything that doesn't optimize throughput eventually gets phased out. In the common case of a working network, the network engineers have decided which path you should take to Google, and they know WAY more than you about their network setup. You should probably trust them.

Oh, and another note. Fantastic use of e2e!

Sunday, October 19, 2008

Internet Routing Topology

This paper details the problems with the previous power-law view of network topologies. Their complaints are obvious in retrospect, as the power-law view doesn't take actual network requirements into account. They then demonstrate that incorporating actual networking requirements (throughput being the big one) into the construction of network graphs improves the modeling ability of said graph.

There's not a lot to say here really. I buy it, but it's just the normal interplay between theoreticians and practitioners. The theory folks build general models and the engineers want exact networks.

I personally side with the engineers and I find this work to be valuable. The best answer is to test on the real network, and get actual data from industry.

I'm surprised this is recent work. I suppose the internet has grown so fast that no one is exactly sure of what it looks like on the inside. The business concerns limit the amount of information that we have about the internals.

This is rambling now, but I think taking reality into account does lead to better mappings.

Power Laws

An interesting work that demonstrates that the 1998-era internet can be described concisely by some power-law relationships. The particulars of the relationships are not that important.
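For the record, "described by a power law" here means that if you plot something like node degree against rank on log-log axes, the points fall roughly on a straight line whose slope is the exponent. A tiny sketch with made-up data:

```python
import math

# Sketch of how a power-law relationship y = c * x^k shows up as a straight
# line on log-log axes; the exponent k is the slope of a least-squares fit.
# The (rank, degree) data below is invented purely for illustration.

data = [(1, 1000), (2, 510), (4, 260), (8, 130), (16, 64), (32, 33)]

xs = [math.log(r) for r, _ in data]
ys = [math.log(d) for _, d in data]
n = len(data)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)

print(f"fitted exponent k = {slope:.2f}")  # close to -1 for this fake data
```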

I totally buy the argument, as it's clear that there's a generic topology that should be consistent throughout the network. As the paper was written for the 1998 internet, it's likely that the boom of the late 90s to early 00s was not anticipated. I'd like to see a reassessment of this work for the modern internet. I should mention that they explicitly said that their assumptions do not account for the boom and bust properties of technology.

I do question the usefulness of this work. I see the value in being able to build reasonable approximations of the internet. However, this sort of mechanism only applies in areas where it's impossible to directly measure the network. Sensornets, datacenters, and the like are small enough (and generally have design documents) that we don't need to approximate anything. Perhaps it would still be valuable to model them anyway, as university researchers may not have access to those design documents.

I like the design of the work. It's very systems to build something and then figure out the theoretical grounds for it, rather than the other way around. These heuristic works make sure that the theoretical backings are of use.

Thursday, October 16, 2008

Blank!

I'm skipping the sensornet readings this week, as I have the TIER workshop on Friday. Normal ramblings will commence Tuesday of next week.

Wednesday, October 8, 2008

XORs in the Air

Hah! I walked away from reading the previous paper convinced I had a good idea for wireless encoding, and then this paper does just that. It uses an XOR encoding of packets so that a single transmission can deliver multiple packets at once. The logistics are complicated, but it works, and works well. Totally fantastic.
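The canonical example, as a toy sketch of my own (not the paper's code): a relay XORs one packet from each of two endpoints and broadcasts the result once; each endpoint decodes by XORing with the packet it already holds, so three transmissions do the work of four.

```python
# Toy reconstruction of the XOR-coding trick (not the paper's actual code):
# a relay XORs one packet from Alice with one from Bob and broadcasts the
# result once. Each endpoint XORs what it hears with the packet it already
# sent, recovering the other packet. Three transmissions instead of four.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

alice_pkt = b"hello bob "        # packets padded to equal length for the toy example
bob_pkt   = b"hello alice"[:10]

coded = xor_bytes(alice_pkt, bob_pkt)            # relay's single broadcast

assert xor_bytes(coded, alice_pkt) == bob_pkt    # Alice decodes Bob's packet
assert xor_bytes(coded, bob_pkt) == alice_pkt    # Bob decodes Alice's packet
print("both sides decoded correctly")
```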

I'm not sure what else to say. This is a totally reasonable way to exploit the gains inherent in a broadcast medium. Their insight into the benefits given to the bottlenecks was great as well.

I still like my idea of using erasure coding, but that's just going to maximize throughput, not utilize the network more effectively. This allows wireless networks to get advantages for being highly packed, which the roofnet folks never addressed.

The only thing mildly overlooked is security, which I don't think is a problem either. End-to-end encryption should provide whatever guarantees are needed, and I don't see any new attacks introduced by using this.

I'll stop fawning, but drop the other paper and emphasize this heavily!

ExOR

This paper is a follow-on to roofnet and ETX. It describes ExOR, a broadcast-based routing algorithm. The basic idea is simple: you broadcast a packet; the nodes that hear it correctly coordinate amongst themselves to figure out who actually received it; then the recipient closest to the destination broadcasts it again.
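A rough sketch of that basic idea, with invented names, link qualities, and costs (the real system batches packets and agrees on a prioritized forwarder list, which this glosses over):

```python
import random

# Rough sketch of ExOR's basic idea with invented names and numbers: the
# source broadcasts, some subset of nodes happens to receive the packet,
# and the recipient with the lowest remaining cost to the destination
# (e.g. ETX) forwards it. Real ExOR batches packets and coordinates a
# prioritized forwarder list; this sketch glosses over that machinery.

etx_to_dst = {"n1": 3.2, "n2": 1.4, "n3": 2.1, "dst": 0.0}  # made-up costs
recv_prob  = {"n1": 0.9, "n2": 0.3, "n3": 0.7, "dst": 0.1}  # made-up link qualities

def one_broadcast_round(candidates):
    """Broadcast once; whoever heard it and is cheapest to the destination forwards."""
    heard = [n for n in candidates if random.random() < recv_prob[n]]
    if not heard:
        return None                      # nobody heard it; the source retries
    return min(heard, key=etx_to_dst.get)

random.seed(1)
print("next forwarder:", one_broadcast_round(["n1", "n2", "n3", "dst"]))
```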

That's the basic idea, anyhow. The implementation is a mess, with "bundling" of packets required to mitigate the inevitable losses, and traditional routing as a fallback for the packets that still get lost. The latency costs of such a system are enormous, but that's not discussed at all by the authors; they simply treat throughput as some godly metric. There's no discussion of scalability either, and these gossip-style agreement protocols won't scale. This seems nearly wholly unusable.

That's not to say that there are no clever ideas here. It's clear that a great deal of information is wasted in a wifi network because every transmission is a broadcast: since only one router can talk at a time, we should gain some benefit by making the others actually listen. The authors listed erasure coding in related work, but said that their system did not need to identify specific paths, or disjoint paths. That's true, but only in a superficial way; they do identify the optimal-cost path, for instance.

I like the erasure coding mechanism more. I see that they are attempting to deal with the broadcast properties, but I don't think this is the right approach. It seems to me that doing the following would be better:

S broadcasts.
All the other nodes hear it. Each one waits (a random number scaled by its error rate on the packet) before broadcasting itself. This means that the nodes that received it most cleanly will broadcast first, moving the packet along. Also, the erasure coding will allow the destination to reconstruct the data once the first two or three coded packets arrive.
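A sketch of what I mean, with the backoff formula and node error rates made up for illustration:

```python
import random

# Sketch of the scheme above, with invented node names and error rates:
# every node that hears the broadcast schedules its own rebroadcast after
# a delay proportional to (random * its error rate on the packet), so the
# nodes with the cleanest copies tend to transmit first.

error_rate = {"n1": 0.05, "n2": 0.40, "n3": 0.15}   # made-up per-node error rates

def rebroadcast_order(nodes, rng):
    delays = {n: rng.random() * error_rate[n] for n in nodes}
    return sorted(nodes, key=delays.get)

rng = random.Random(7)
print(rebroadcast_order(["n1", "n2", "n3"], rng))
# nodes with low error rates tend to come first in the rebroadcast order
```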

Monday, October 6, 2008

Protocol Comparison

This paper was an analysis of various routing protocols on ad-hoc wireless networks. There wasn't anything particularly interesting. First, they create a simulator that will allow for accurate emulation of wireless links, and mobility on these links. As the work focuses on ad-hoc networks, it's mostly relevant to sensornets. Then they implement and run a variety of protocols with some reasonable workload.

Good result: the DSR algorithm is pretty much the winner. The most interesting aspect of this paper was its existence. The paper pretty much writes itself: half of it is descriptions of the routing algorithms, the second half is all results, and there is no future work. There's a good amount of work in getting it all set up, but it's just so unimaginative. Still, it's totally useful. I wouldn't have expected a paper like this out of CMU.

I'm interested in doing such work for delay tolerant networks. I don't think any particular motivating use case (such as ad-hoc) has emerged, which means it may be more difficult to set up a reasonable simulation.

It's surprisingly soft on results too. The DSR algorithm beats all the others in almost every way, losing only on bytes of overhead sent, yet this isn't emphasized in the conclusion. There must be a reason why. Is it politics? Is it bad form for a survey to make these judgments? Why be so generic?

Sunday, October 5, 2008

Path Metric

I don't buy this. This paper describes a different path metric: ETX, or Expected Transmission Count. The concept is that, on wifi links, hop count is broken. The hops have differing properties, specifically in terms of quality, and we need to take this into account when picking a path. That's a good idea, but transmission count alone is not a reasonable metric.

The author is attempting to maximize throughput. If all of the wifi links are identical aside from their expected retransmission count, then this is plausible: retransmissions after wifi losses are what drag throughput down. But if we expand this to include links with different raw rates, the metric is meaningless. To maximize throughput, we have to know both the throughput of the link and its retransmission rate; with those we can compute the expected throughput.
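To make the complaint concrete: the ETX formula below is from the paper, but the throughput-aware comparison and the link numbers are my own illustration.

```python
# The ETX formula is from the paper: ETX = 1 / (d_f * d_r), where d_f and
# d_r are the forward and reverse delivery probabilities. The "expected
# throughput" comparison and the link rates below are my own illustration
# of why counting transmissions alone breaks down across mixed-rate links.

def etx(d_f: float, d_r: float) -> float:
    return 1.0 / (d_f * d_r)

links = {
    # name: (forward delivery, reverse delivery, raw link rate in Mbps)
    "slow-but-clean": (0.80, 0.80, 11.0),   # ETX ~1.56
    "fast-but-lossy": (0.60, 0.60, 54.0),   # ETX ~2.78
}

for name, (d_f, d_r, rate) in links.items():
    e = etx(d_f, d_r)
    expected_tput = rate / e    # rough throughput-aware metric
    print(f"{name}: ETX={e:.2f}, expected throughput ~{expected_tput:.1f} Mbps")

# ETX prefers the slow-but-clean link, while the throughput estimate
# prefers the fast-but-lossy one -- the point of the complaint above.
```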

They argued that this could not cause cycles in the network, but I think that depends on their distance-vector calculations. Congestion on a wifi link may show up as retransmissions rather than just drops, so as everyone routes toward the lowest retransmission count, that traffic may itself cause retransmissions in the area, and so on.

I don't think this is useful to begin with. Most modern wifi links are one hop from the wired network, which has much higher throughput. Networks of wifi routers are generally used for sensornets, which don't often need to maximize throughput.

There's more here, but my keyboard is acting up and it's hellish to keep this post even vaguely comprehensible.

Thursday, October 2, 2008

Roofnet

Ah! A recent paper, Mobicom '05. This was a design and deployment of a multi-hop ad-hoc network around MIT. It is what you would expect: there's a router at each house, and the routers collude to create paths to network gateways. I think it's quite clever and pretty useful. It doesn't quite make sense in developed areas where there are gateways in almost all houses, but it would be nice as a way to back up your connection in case of a local failure.

The paper itself had a few confusing points, particularly on the benefits of single-hop vs. multi-hop. They argued each side of that a few times, based (seemingly) solely on what they implemented; each scheme is credited with both increasing and decreasing throughput in different parts of the paper.

I do think this makes sense in areas with less wired infrastructure. For instance, a user may purchase this to get limited bandwidth through the roofnet system. Then, as they scale up, they can get a deeper wired connection and have it subsidized by other roofnet users.

It's also interesting that they did not hit the wall with scaling the system out. There should be a physical limit to the density of nodes allowed, as determined by the hardware channels available for communication.

There is likely more here, as I found this to be a particularly neat paper, but I'm tired and fairly out of it, so this ends here for now.

Modeling Wireless Links

Well, since I can't sleep tonight (too much bad news in a day) I decided to read and write about this paper.

The content is quite simple: it notes that modeling wireless links is not always intuitive, leading to poor simulations of the technology. It then proceeds to give arguments about the appropriate way to model each particular type of wireless link property.

I found it interesting that network simulators suck so badly. They seem like simple things to model, but perhaps this was before the advent of virtualization at the scale we see today. Has this improved since the publication of this work?

It's also neat that there can be important work on simply simulating networks. Click is an example of a router infrastructure that was designed for simulation as well. Perhaps simulating networks in areas where there are not many sources of reliable data would be a valid area for work.