Monday, November 24, 2008

Map-Reduce Performance

So this is Andy and Matei's fantastic OSDI work, improving the performance of speculative execution in Hadoop. It turns out to be pretty simple, as Hadoop does everything in a very naive way. Good work, and great problem selection. Lastly, great acronym selection. Matei is great at that.

Firstly, this work lies under the huge shadow of Google. No one really knows what their system does. They published the MapReduce paper in 2004, and Hadoop is just about catching up to their system as of then. It's a solid academic area, but there's an open question as to the novelty of the work because of that.

This particular work is likely pretty novel, as Google has a fairly homogeneous environment. I think they run only about three types of machines at any point.

Another question revolves around map-reduce itself. I think it's telling how few research efforts actually USE the system, rather than work on the system. I've found that there's a set of problems that it solves, and it's really not a huge set. They're pretty much the uses we've found in the RAD Lab: log searching. It doesn't build services. It doesn't scale databases (yet). The Par Lab has map-reduce as one of the "dwarfs", a catalog of recurring parallel computation patterns. It's not the key one, as some in systems seem to think.

So the key, in my mind, is to expand the map-reduce problems. My understanding is that MSR is doing this, Yahoo as well. Turning it into a non-consistent database is interesting. Generalizing the model to encompass other parallel problems is interesting. However, the original map-reduce paper never would have been published without Google backing it. It took the experience of its deployment to get attention. How many great parallel architectures sit in dusty conference proceedings? How do we change this research area so that the simplicity of the map-reduce model is seen as an advantage?

Lastly, I think you should have people read the map-reduce paper and Stonebraker's nagging paper.

Policy Aware Switching

Firstly, I'm going to be out of class on Tuesday. I hope you guys have a good talk.

I couldn't finish it: 26 pages for something I've listened to numerous talks about. It's quite likely I've missed some details, but the idea is pretty simple:

Overlays give us a few advantages, one of which is in routing. The PAS uses that to enforce routes that packets must traverse to reach their endpoint. For instance, we can enforce, in the overlay alone, that a packet must go through a firewall before hitting me.
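To make that concrete, here's a rough sketch of what a policy table in such a layer could look like (the traffic classes and middlebox names are mine, not the paper's):

    # A toy policy table: a traffic class maps to the ordered middleboxes
    # it must traverse before delivery. Names are invented for illustration.
    POLICIES = {
        "external->web": ["firewall", "ids", "load_balancer"],
        "internal->web": ["load_balancer"],
    }

    def next_hop(traffic_class, hops_done):
        """Return the next middlebox this packet must visit, or None when it
        may finally be delivered to its real destination."""
        chain = POLICIES.get(traffic_class, [])
        return chain[hops_done] if hops_done < len(chain) else None

    # The overlay sends an external packet to the firewall first:
    print(next_hop("external->web", 0))   # -> "firewall"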

This gives huge flexibility, all the flexibility you'd expect an overlay to allow. In my nascent graduate career, I first worked on a project that lined up very well with this. PAS gives you various abilities. You can do statistical tricks, allowing packets through some of the time, for active testing or load balancing.

It has one huge problem though, which is something I've talked about a few times on this blog. It's not optimizing for the common case. Datacenter operators want performance, and this doesn't increase it. For top-tier datacenter companies such as Google, this is totally useless. They'd much rather pay their network engineers a larger amount and get better utilization. It does make sense for smaller datacenters, but not really, as they have fewer configuration problems. Lastly, the engineers themselves are not likely to adopt technology that reduces their employment.

So this is neat work, and with it you can do some really cool things with a network. It's not targeted correctly. I can think of some good uses for this, mostly from ISPs and the like who may want more direct control of their traffic. For instance, they'd like to push all P2P people over some thinner line. That's not as sexy (or well funded) as datacenters.

Wednesday, November 19, 2008

App Layer Multicast

Ok, overlays give you multicast. I see that. This is an alright system for doing it with a hierarchy. That's cool.

It could be that I'm tired, but I just don't have a lot to say about either of these items. Great, multicast. Cool. There's nothing elegant or useful here. Nothing mind-blowing, just "if I had to build a multicast overlay, this is what I'd do".

The related work section shows some of this. They don't really provide an argument as to why their system is superior to the others. They just give a big list of other work.

SRM

This describes Scalable Reliable Multicast. It uses some naming and ACK tricks to get basic functionality, then they implement a whiteboard app with it.

Ok, so nothing tricky. They pointed out that multicast doesn't fit with end-to-end, which is obvious enough to have been a brilliant observation. I now want to write a paper on end (to end)^n just so that it exists. It's a good point that I don't think I've entirely digested. In the overlay, you just create n^2 connections, each end-to-end. Are these equivalent? The notion of fate-sharing was what they used to differentiate the two. This is tricky, and I'm not in the right place to reason through this correctly.

There's a somewhat tangential interesting point there. Multicast is equivalent to many unicasts. If you buy the hierarchy, you can implement all of this above that. This is probably why multicast doesn't matter that much. With that, we get to keep end-to-end and the existing routing infrastructure, at the loss of performance for a very small set of applications.

Either way, they pretty much implemented BitTorrent in the filler of the paper, explaining possible uses of multicast with caching. That's when you know it's an important work: something like a "future work" section creates an industry.

Overall, I probably don't have much to say here. It's not too surprising how they did this, and that they punted the ordering stuff to a higher layer.

Monday, November 17, 2008

DTN

This is going to be brief, as I actually work on this. I'm currently implementing version 6 of the bundle protocol and the TCP convergence layer in Python. I've decided to be completely negative here as well. Hopefully others will have the discussion of why this is so fucking sweet.

The biggest problem in DTN is that it doesn't solve a lot of the problems it proposes to. Satellite networks? Sure, those are multihop with limited connectivity. Developing regions? Ehhh... most connections are one hop. Cell phone to tower is the only connection with disruption. Everything internal is good. Sensor networks? They don't want the abstractions, as the overhead is too high. Also, established wireless protocols seem to work fine.

The key thing about DTN is that it's so general. It applies to every network, but it isn't optimized for the characteristics of any particular network. With that, it has to compete with protocols specifically designed for the properties of each network. It can't. We need graceful degradation, allowing for transitions as network conditions change. DTN is a big broad hammer, encompassing everything. The next step is to morph that into something that can compete with scalpels, not just win in networks where nothing else currently works.

DOT

DOT describes the "data-oriented transfer" service. The idea here is to separate the data transfer, and stick another layer in our fat protocol stack. It's really heavily related to my class project, actually. Lots of the same issues, more people trying to get practical value out of the DHT work.

I'm not totally sold on loving or hating it. It seems completely unreasonable, but I'm not sure I can put my finger on why. It doesn't optimize the common case, as they point out. Flexibility is nice, but only when you are solving a problem. They are not solving a problem really, rather just trying to allow for a design space where there currently isn't one. This means a system like this will never be used, but it doesn't mean it's bad work.

I should note that one metric for success the authors had is that the SMTP client they built was robust enough for one of the authors to use. I'm sold.

I do agree that this is more elegant. That may be because I'm currently implementing a system with a lot of parallels to this. I think my biases will be laid bare in class.

In conclusion, this paper may be an argument as to why my yelling about metrics last class was wrong.

Wednesday, November 12, 2008

X-Trace

What's the X stand for? Extensible? Extra?

Anyhow, X-Trace is Rodrigo and George's work on pushing tracing into the appropriate layers to get a view of the entire system. With this, debugging is easier. It does this in the obvious ways, and has the obvious problems.

The big problem for this is adoption. To use X-Trace, you have to instrument your system, which means writing specific code in each protocol you want to trace. That's a huge headache. However, they deal with this well, allowing (limited) incremental adoption as well as focusing on an area (datacenters) that has the amount of top-down control needed to push such a sweeping change.

Performance is another problem. They analyzed the performance hit of their daemons and showed it's not a huge problem. However, their Apache trace took a 15% hit. Even assuming that number is inflated because it's research code, that's still huge. The obvious thing to me is to instrument only a limited set of things, or to be able to turn it off until you have an error, then try to reproduce it on a limited number of servers running the service. I think they eventually did this as well.
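For what it's worth, the "turn it down until you need it" idea is basically sampling at the entry point and carrying that decision in the trace metadata. A sketch of the shape of it, with invented helpers rather than the real X-Trace API:

    import random, uuid

    SAMPLE_RATE = 0.01   # trace 1% of requests; a made-up knob

    def new_task_id():
        return uuid.uuid4().hex

    def send_report(task_id, event):
        print("report", task_id, event)   # stand-in for the real report daemon

    def start_request(headers):
        # Decide once, at the entry point, whether this request is traced,
        # and carry that decision in metadata that flows with the request.
        if random.random() < SAMPLE_RATE:
            headers["X-Trace"] = new_task_id()
        return headers

    def maybe_report(headers, event):
        # Downstream layers only pay the reporting cost for sampled requests.
        if "X-Trace" in headers:
            send_report(headers["X-Trace"], event)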

I wonder how related this is to debugging on multicores. I argued with Andy about wanting a distributed GDB, and this is approaching the ballpark of such a system.

Internet Measurement

This is probably the densest paper we've read. It's an in-depth analysis of TCP's behavior. It attempts to analyze the assumptions we make when building TCP, such as the network's infrequent reordering and the packet error rate. It's an immense work.

On a note, I'm going to start a list of worst project/protocol names. New on the list: SACK. SACKs are not new to me, but this is a new list and I've recently been reminded of it.

The usual complaints apply to this work. It's old, and we're not sure how relevant the findings are to the modern internet. There were many reorderings and bursty losses due to routers updating their routes. This is presumably less likely now. Follow-on work would be great to read, as it would show what changes were implemented in reaction to these findings (not many, I'd wager).

I really like these deep measurement papers. There's so much rush to build a system when there's only a light understanding of the problem being solved. This work likely led to tens of future projects for exactly that reason. There are plenty of engineers in systems; we need more analysis.

In particular, I want more social science techniques in systems and networking, which would probably mean a deeper analysis of use cases. We build so many technologies that solve problems that don't exist. They get into SIGCOMM, get put in a binder and on a CV, and that's it. Though, to be fair, that may just be an academic problem...

Anyhow, I feel like I had something else of value to say about this work, but I can't remember. Perhaps I'll update once I remember.

Thursday, November 6, 2008

i3

Matei lied to me and told me that this was the better paper of the two.

i3 is a fairly obvious scheme to provide mobility and multicast with an overlay. There are a few interesting implementation details that they teased out of the idea, but none are crucial to the actual design.

Again I come to the question that has haunted me through this class: adoption. Overlays are nice for adoption, as you presumably control each client. This works particularly well for p2p, less so for others.

A big part of this paper was being general and applying to traditional servers, which don't need this. If there is no mobility or multicast, you only pay a performance penalty, and the lack of these is the common case. I also think it will remain the common case as we deploy more wide area wireless networks.

So, this really fits mobility and multicast in p2p networks. Each has interesting use cases I think, multicast is nice to steal stuff and mobility will allow a large class of applications on personal communication devices.

Wednesday, November 5, 2008

DOA

Woah. First thing, don't name a project DOA. Name it something better, like PANTS.

Secondly, this is 16 pages about a fairly simple idea. The idea is to generalize middleboxes in some way, dealing with all of the issues those things bring up. The most common issue is addressing machines inside of NATs, as their addresses are meaningless outside of the LAN. These guys define a new namespace that is global and flat, fudge a DNS-like resolution scheme, and allow packets to carry something like a source route.

This is an interesting point though. Essentially, source routing solves all of this, right? I mean, if I send a source-routed packet to a NAT box, with (192.168.0.10) as the last hop to route to, it would work. Assuming local DNS (such as Avahi) you could make the last hop be (darth-vader) or (waterloo) and get the packet there without global naming. I suppose that the NAT will need to be addressed in some meaningful manner, but those addresses usually are globally unique.
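A toy version of what I mean, with made-up names and addresses (203.0.113.7 standing in for the NAT's public address):

    def forward(packet):
        hop = packet["route"].pop(0)      # the hop currently holding the packet
        if packet["route"]:
            nxt = packet["route"][0]
            print(hop, "resolves", nxt, "via local naming and forwards it on")
            forward(packet)               # in reality the next box does this
        else:
            print(hop, "delivers", packet["payload"], "to the application")

    forward({"route": ["203.0.113.7", "darth-vader"], "payload": "hello"})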

Huh, that might be a good idea.

Anyhow, this work was good, but it's got that "never going to be used" problem, as it requires that both sides use it to get an advantage. I don't think this is inherent to the solution either, as my idea above is a slightly more reasonable version of this. For instance, a VoIP app may just broadcast a "register" packet saying "hey! I'm listening on this port for VoIP, if you get one, give it here". Then you've done some of the delegation they speak of, without the new naming scheme. You can get looked up via traditional DNS and bam, way around that hole-punching crap.

Lastly, note that Skype fixed this with a big server in the real world. Both clients talk to it, and then Skype tells each client what port the other has open. Doesn't work for pure P2P, unless you can negotiate a P2P node to do this for you...

Sunday, November 2, 2008

DNS Caching

This was a dense work from MIT and KAIST where they gathered traces of DNS traffic to determine the effectiveness of DNS caching. There were two primary findings: DNS requests follow the expected Zipfian distribution, and low-TTL DNS entries do not really hurt caching.

This was really, really dense, with a whole lot of figures and percentages. I think I followed most of it; the argument that the Zipfian distribution holds was the most straightforward. They argue it shows that caching is not that important, as most users only benefit from the cache for a few minutes while they browse the same site. This matters because servers are using TTL-based load balancing.
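You can get a feel for the claim with a back-of-the-envelope simulation. This is my toy model and my numbers, not theirs: Zipf-distributed lookups arriving as a Poisson process, and a cache that obeys the TTL.

    import bisect, itertools, random

    N = 10000
    CUM = list(itertools.accumulate(1.0 / (i + 1) for i in range(N)))  # Zipf, alpha = 1

    def zipf_name():
        return bisect.bisect(CUM, random.uniform(0, CUM[-1]))

    def hit_rate(ttl, lookups=50000, rate_per_sec=10):
        cache, hits, t = {}, 0, 0.0
        for _ in range(lookups):
            t += random.expovariate(rate_per_sec)   # Poisson arrivals
            name = zipf_name()
            if cache.get(name, 0.0) > t:            # still cached?
                hits += 1
            else:
                cache[name] = t + ttl               # miss: (re)fetch and cache
        return hits / lookups

    for ttl in (10, 60, 300, 3600):
        print(ttl, round(hit_rate(ttl), 3))         # how much does TTL buy you?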

Note that this only applies for addressing, not forwarding. I think this is a crucial point, as it allows us to distribute the load at that level. Since it's likely Amazon.com owns that hardware, TTL-based multiplexing makes a lot of sense.

Interesting bit about scaling out DNS. I'm always interested in how these things have changed, as I feel like the internet has become much more stable since 2000. Then, the obvious question is: how did they stabilize it?

DNS

This paper was strange. I know DNS, and this paper actually made me a little more confused. I expect it was mostly terminology. Zones, for instance, are a strange name for administrative domains.

The paper itself was great, as it dealt heavily with an issue very close to my heart: technology adoption. As I work in Information and Communication Technology for Development (ICTD), getting users of my technology is of paramount importance.

The HOSTS.TXT file was a perfect example of "good enough" technology, which is the biggest obstacle to adoption. It's just not worth it to the users unless the benefits of the new technology outweigh the losses from adoption. This is why it's nearly impossible to get new networking technology adopted. The advantages are too small compared to the risks.

The DNS folks lamented a great deal over this fact. The users who did not switch caused tons of problems, and the users who only partially transitioned caused the most. I think that the Berkeley (BSD) TCP implementation was probably the definitive reason that TCP is so widely adopted.

I'm not sure exactly what I'm trying to say. I just want a more involved discussion on how to push network technology. Right now you have to implement it at user level, that way people don't need to monkey with their systems to use it.

I'm rambling. I'll bring this up in class.

Thursday, October 30, 2008

DHTs

I've decided to put both DHT papers into the same post, as they covered a lot of the same material.

These were papers on distributed hash tables. They are what they sound like, but there are some clever tricks in each individual one. You hash a key and that maps to a responsible machine via a series of lookups.
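The lookup idea fits in a few lines. This is a generic consistent-hashing sketch, not any particular paper's protocol: hash the key, then take the first node whose ID is at or past it on the ring.

    import bisect, hashlib

    def h(s, bits=32):
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** bits)

    class Ring:
        def __init__(self, nodes):
            self.ids = sorted(h(n) for n in nodes)
            self.node_by_id = {h(n): n for n in nodes}

        def lookup(self, key):
            # first node clockwise from the key's hash, wrapping around
            i = bisect.bisect_left(self.ids, h(key)) % len(self.ids)
            return self.node_by_id[self.ids[i]]

    ring = Ring(["node-a", "node-b", "node-c"])
    print(ring.lookup("some-file.mp3"))   # the node responsible for this key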

What I don't understand is how these things help P2P, as is argued at such depth. Since you don't get to organize the data in a pure P2P network, you can't really create any address -> data mappings. Both are handed to you; you can only create a lookup service that gives you one if given the other.

I do like DHTs as DNS replacements though, that seems like the proper model of usage. This is an actual problem too, as I can remember Comcast's DNS server choking and access to Google being hindered.

I guess I don't have much to write as we read the Chord paper for 262. All of my clever insight is gone.

Monday, October 27, 2008

Active Networks

I still don't really know what active networks are. Wiki says:

"The active network architecture is composed of execution environments (similar to a unix shell that can execute active packets), a node operating system capable of supporting one or more execution environments. It also consists of active hardware, capable of routing or switching as well as executing code within active packets. This differs from the traditional network architecture which seeks robustness and stability by attempting to remove complexity and the ability to change its fundamental operation from underlying network components. Network processors are one means of implementing active networking concepts. Active networks have also been implemented as overlay networks."

That didn't help too much. I suppose it's reasonable that the applications should be allowed to reform the network. However, as my previous post said, this isn't going to optimize throughput, and thus won't be of huge value.

Is this a formative work for overlay networks?

Rambling aside, the paper itself was a survey of the lessons learned in implementing such a system. Not terribly valuable, in my own view.

Lastly, I am intrigued by the idea, at least for the DTN case. DTNs are highly mobile, with the network shifting around all the time. It may be valuable to modify the network architecture in this case. Then again, I barely understand this to begin with. I'll muse more after lecture.

RON

Ron is a really good name for a project. It just reads well.

RON is a "Resilient Overlay Network". It makes a lot of sense in an area where there are significant network problems. The overlay allows for more informed, better routing decisions. Specifically it routes around trouble spots in the internet. It can also be used to enforce policies among the routers that can't be enforced at the BGP layer.

The largest problem with this paper is that their justification went away. I very seldom experience outages. In fact, I can't remember the last one. I do know that my last outage was my home router anyhow. People get paid a lot of money to make sure that core routers don't tip over. These guys did not anticipate the internet becoming quite the serious business it is now.

My guess is that this is early "Look at fun stuff we can do with overlays!" work. This is valid for spotty networks, but I've developed a hypothesis from our readings in this class. Anything that doesn't optimize throughput eventually gets phased out. In the common case of a working network, the network engineers have decided that you should take this path to Google, and they know WAY more than you about their network setup. You should probably trust them.

Oh, and another note. Fantastic use of e2e!

Sunday, October 19, 2008

Internet Routing Topology

This paper details the problems with the previous power-law view of network topologies. Their complaints are obvious in retrospect, as the power-law view doesn't take actual network requirements into account. They then demonstrate that incorporating actual networking requirements (throughput being the big one) into the construction of network graphs improves the modeling ability of said graph.

There's not a lot to say here really. I buy it, but it's just the normal interplay between theoreticians and practitioners. The theory folks build general models and the engineers want exact networks.

I personally side with the engineers and I find this work to be valuable. The best answer is to test on the real network, and get actual data from industry.

I'm surprised this is recent work. I suppose the internet has grown so fast that no one is exactly sure of what it looks like on the inside. The business concerns limit the amount of information that we have about the internals.

This is rambling now, but I think taking reality into account does lead to better mappings.

Power Laws

An interesting work that demonstrates the 1998-era internet can be described concisely by some power law relationships. The particulars of the relationships are not that important.

I totally buy the argument, as it's clear that there's a generic topology that should be consistent throughout the network. As the paper was written for the 1998 internet, it's likely that the boom of the late 90s to early 00s was not anticipated. I'd like to see a reassessment of this work for the modern internet. I should mention that they explicitly said that their assumptions do not account for the boom and bust properties of technology.

I do question the usefulness of this work. I see the value in being able to build reasonable approximations of the internet. However, this sort of mechanism only applies in areas where it's impossible to directly measure the network. Sensornets, datacenters, and the like are small enough (and generally have design documents) that we don't need to approximate anything. Perhaps it would still be valuable to model these other networks as university researchers may not have access to the designs of the other networks.

I like the design of the work. It's very systems to build something and then figure out the theoretical grounds for it, rather than the other way around. These heuristic works make sure that the theoretical backings are of use.

Thursday, October 16, 2008

Blank!

I'm skipping the sensornet readings this week, as I have the TIER workshop on Friday. Normal ramblings will commence Tuesday of next week.

Wednesday, October 8, 2008

XORs in the Air

Hah! I walked away from reading the previous paper convinced I had a good idea for wireless encoding, and then this paper does just that. This paper uses an xor encoding of packets to send multiple packets at the same time. The logistics of it are complicated, but it works, and works well. Totally fantastic.
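The core trick is small enough to show in a toy example (mine, not the paper's code): the relay XORs two packets headed in opposite directions, and each receiver XORs out the packet it already knows.

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    p_alice = b"hi bob  "        # Alice -> Bob, relayed through the router
    p_bob   = b"hi alice"        # Bob -> Alice, relayed through the router

    coded = xor(p_alice, p_bob)  # the router broadcasts ONE packet, not two

    assert xor(coded, p_bob) == p_alice    # Bob knows p_bob, recovers p_alice
    assert xor(coded, p_alice) == p_bob    # Alice knows p_alice, recovers p_bob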

I'm not sure what else to say. This is a totally reasonable way to exploit the gains inherent in a broadcast medium. Their insight into the benefits given to the bottlenecks was great as well.

I still like my idea of using erasure coding, but that's just going to maximize throughput, not utilize the network more effectively. This allows wireless networks to get advantages for being highly packed, which the Roofnet folks never addressed.

The only thing mildly overlooked is security, which I think isn't a problem either. End-to-end encryption should provide any needed guarantees, and there are definitely no new attacks generated by using this.

I'll stop fawning, but drop the other paper and emphasize this heavily!

ExOR

This paper is a follow-on to Roofnet and ETX. It describes ExOR, a broadcast-based routing algorithm. The basic idea is simple: you broadcast a packet. Those that hear it correctly agree amongst each other as to who has received it. Then, the one closest to the destination broadcasts it again.

That's the story anyhow. The implementation is a mess, with "bundling" of packets required to mitigate the inevitable loss, and then traditional routing to deal with those lost packets. The latency losses for such a system are enormous, but that's not discussed at all by the authors. They simply emphasize throughput as some godly metric. There's no discussion of scalability either, and these agreement protocols won't scale. This seems nearly wholly unusable.

That's not to say that there are not clever ideas here. It's clear that a great deal of data is wasted in a wifi network because of broadcast. Because only one router can talk at a time, we should gain some benefit by forcing the others to listen. The authors listed erasure coding in related work, but said that their system did not need to identify specific paths, or disjoint paths. That's true, but only in a superficial way. They do identify the optimal cost path, for instance.

I like the erasure coding mechanism more. I see that they are attempting to deal with the broadcast properties, but this is not correct. It seems to me that doing the following would be better:

S broadcasts.
All others hear it. You wait (random * error rate on the packet) to broadcast yourself. This means that those who received it more cleanly will broadcast first, moving the packet along. Also, the erasure coding will allow the first two or three packets to get to the end, and allow for reconstruction there.
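A sketch of that backoff rule, with numbers I made up: every receiver schedules a rebroadcast after a delay scaled by how lossy its reception was, so the cleanest receiver usually goes first and everyone else cancels when they overhear it.

    import random

    def rebroadcast_delay(error_rate, slot=0.002):
        # error_rate in [0, 1]; cleaner reception -> shorter expected wait
        return random.random() * error_rate * slot

    receivers = {"n1": 0.1, "n2": 0.5, "n3": 0.9}   # per-node loss estimates
    schedule = sorted((rebroadcast_delay(e), n) for n, e in receivers.items())
    print("first to rebroadcast:", schedule[0][1])  # n1 wins most of the time;
    # n2 and n3 would cancel their timers when they hear n1's rebroadcast.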

Monday, October 6, 2008

Protocol Comparison

This paper was an analysis of various routing protocols on ad-hoc wireless networks. There wasn't anything particularly interesting. First, they create a simulator that will allow for accurate emulation of wireless links, and mobility on these links. As the work focuses on ad-hoc networks, it's mostly relevant to sensornets. Then they implement and run a variety of protocols with some reasonable workload.

Good result: the DSR algorithm is pretty much the winner. The most interesting aspect of this paper was its existence. The paper pretty much writes itself. Half of it is descriptions of the routing algorithms. The second half is all results. There is no future work. There's a good amount of work in getting it all set up, but it's just so unimaginative. Still, it's totally useful. I wouldn't have expected a paper like this out of CMU.

I'm interested in doing such work for delay tolerant networks. I don't think any particular motivating use case (such as ad-hoc) has emerged, which means that it may be more difficult to set up a reasonable simulation.

It's surprisingly soft on results too. The DSR algorithm beats all others in almost all ways, losing only on bytes sent in overhead. This isn't emphasized in the conclusion. There must be a reason why. Is it politics? Is it bad form for a survey to make these judgments? Why be so generic?

Sunday, October 5, 2008

Path Metric

I don't buy this. This paper describes a different path metric, ETX, or Expected Transmission Count. The concept is that, on wifi links, hop count is a broken metric. The hops have differing properties, specifically in terms of quality, and we need to take this into account when picking a path. That's a good idea, but transmission count alone is not a reasonable metric.

The authors are attempting to maximize throughput. If all of the wifi links are identical aside from expected retransmission count, then this is plausible. Retransmissions after a wifi drop are why throughput drops. If we expand this to include links with different throughput rates, this metric is totally meaningless. To maximize throughput, we have to know the throughput of the link and the retransmission rate. With these we can compute the expected throughput.
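Roughly what I'm arguing, in code. ETX for a link is 1 / (d_f * d_r) with d_f and d_r the forward and reverse delivery ratios; a throughput-aware variant (essentially what later work calls ETT) also accounts for the link rate. The numbers below are invented.

    def etx(d_f, d_r):
        return 1.0 / (d_f * d_r)

    def expected_tx_time(d_f, d_r, rate_mbps, pkt_bits=8 * 1500):
        # expected number of transmissions * time per transmission on this link
        return etx(d_f, d_r) * pkt_bits / (rate_mbps * 1e6)

    # A lossy 11 Mb/s link vs a cleaner 2 Mb/s link: ETX alone prefers the
    # clean slow link (1.11 vs 2.04), but the lossy fast link still moves a
    # packet in less time.
    print(expected_tx_time(0.7, 0.7, 11))    # ~2.2 ms
    print(expected_tx_time(0.95, 0.95, 2))   # ~6.6 ms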

They argued that this could not cause cycles in the network, but I think that depends on their distance vector calculations. Congestion in a wifi link may be exposed as retransmissions rather than just drops. So, as we route to the lowest retransmission count, this may cause retransmissions in the area, and so on.

I don't think this is useful to begin with. Most modern wifi links are one hop to the wired network, which has much higher throughput. A network of wifi routers is generally used for sensornets, which don't often need to maximize throughput.

There's more here, but my keyboard is acting up and it's hellish to keep this post even vaguely comprehensible.

Thursday, October 2, 2008

Roofnet

Ah! A recent paper, Mobicom '05. This was a design and deployment of a multi-hop ad-hoc network around MIT. It's what you'd expect: there are routers on each house, and the routers cooperate to create paths to network gateways. I think it's quite clever and pretty useful. It doesn't quite make sense in developed areas where there are gateways in almost all houses, but it would be nice as a way to back up your connection in case of a local failure.

The paper itself had a few confusing points, particularly on the benefits of single-hop vs multi-hop. They argued each side of that a few times, based (seemingly) solely on what they implemented. Each scheme both increased and decreased throughput in different areas of the paper.

I do think this makes sense in areas with less wired infrastructure. For instance, a user may purchase this to get limited bandwidth through the roofnet system. Then, as they scale up, they can get a deeper wired connection and have it subsidized by other roofnet users.

It's also interesting that they did not hit the wall with scaling the system out. There should be a physical limit to the density of nodes allowed, as determined by the hardware channels available for communication.

There is likely more here, as I found this to be a particularly neat paper, but I'm tired and fairly out of it, so this ends here for now.

Modeling Wireless Links

Well, since I can't sleep tonight (too much bad news in a day) I decided to read and write about this paper.

The content is quite simple, it notes that modeling wireless links is not always intuitive, leading to poor simulations of the technology. It then proceeds to give arguments about the appropriate ways to model each particular type of wireless link property.

I found it interesting that network simulators suck so badly. They seem like simple things to model, but perhaps this was before the advent of virtualization at the scale we see today. Has this improved since the publication of this work?

It's also neat that there can be important work on simply simulating networks. Click is an example of a router infrastructure that was designed for simulation as well. Perhaps simulating networks in areas where there are not many sources of reliable data would be a valid area for work.

Sunday, September 28, 2008

Wireless TCP

So, this paper was focused on the ways to allow TCP to run over a wireless network.

I'm intrigued that there was so much work on mechanisms that clearly violate the end-to-end principle. I'm guessing that a big reason for this is that the wireless link is the last hop. It's not like we're doing anything to the internals of the network. This also might help with mobility.

The end-to-end paper does give reasons to push functionality into the network, but not at the transport layer. The point of it is to push functionality down into lower layers if necessary, not to push layers down, as some of these solutions have done. The link-layer solution is the one that's actually used.

So, as to why this is a great idea. An interesting point is that wired and wireless networks are different. One reason not to push reliable transmission into a wired network is that it requires state at the router. This isn't possible at the high speed required of wired networks. However, wireless has a much lower bitrate, so we presumably have excess computation power (maybe we should have used it to save power...) This power is used to store and process these packets and retransmissions.
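A rough sketch of the kind of per-connection state a snoop-style base station keeps; the structure is invented for illustration, not taken from any implementation. The point is just that the wireless hop has the spare cycles to hide its own losses from the sender.

    class SnoopCache:
        def __init__(self):
            self.segments = {}    # seq -> payload, unacked toward the mobile
            self.last_ack = -1

        def on_data(self, seq, payload):
            self.segments[seq] = payload
            return payload        # forward over the wireless hop

        def on_ack(self, ack):
            if ack > self.last_ack:                  # new ACK: clean up state
                for seq in [s for s in self.segments if s <= ack]:
                    del self.segments[seq]
                self.last_ack = ack
                return ("forward_ack", ack)
            # duplicate ACK: treat it as a wireless loss, retransmit locally
            # and hide the duplicate from the sender's congestion control
            return ("local_retransmit", self.segments.get(ack + 1))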

The only thing I had not contemplated about this work before is the effect of reordering. It's not obvious to me that the link requires in-order processing, but it seems as though TCP has become dependent on a property of classical internetworking that perhaps it should not have. I'm not sure of the implications of that, but it's at least a clear indicator that this is the case.

Saturday, September 27, 2008

Macaw

I'm changing the format of the posts as Randy wants more of a discussion. The "this is what the paper was about stuff" isn't valuable for that, it should be interesting observations alone.

I aggressively skimmed this paper, as I know most of the wireless stuff pretty well. In MACAW, they assume "symmetry" among the nodes, which is quite different from how wireless is actually used. Wireless, like DTN, is generally only a one-hop technology. You use wireless to get to a wire. This is of course different in sensor nets and a few other applications. So the focus on having all nodes be equal was somewhat strange.

I like it though, as I feel that wireless's flexibility isn't highly utilized. You can, if we figured out this wireless thing, get n^2 connections of n machines without rewiring. The limitation on this is the spectrum required, which would not scale with n. In particular, it would start running off into reserved spectrum pretty quickly.

Now, this wouldn't be of value in home computers, as there are few connections between home computers. It's all to some massive server somewhere. However, in a constrained area that has many peer-to-peer connections (oh, a datacenter or a sensor net) this would be fantastic.

Datacenters are currently out of scale for this, and sensor nets do it. I think the first might be able to be solved.

The next paper will have me musing on the inherent differences between wired and wireless, which will explain why the above may not be feasible.

Thursday, September 18, 2008

Dangit

I read the wrong papers again.

Gb Switched Router

Wow, I should have read this before the last paper. This was a thorough description of how (semi) modern routers are designed, complete with descriptions of terminology used all over the previous paper. It describes the algorithms used in these routers as well. Lastly, it suggests a modification to these algorithms for multicast routing.

The paper was clear, unambiguous and totally helpful. I always assume that anything in networks/systems is actually dead simple, and this demonstrates that. This paper is fantastically valuable, and should totally be in the reading.

I really found the model of adoption for routers to be interesting. As we discussed in class, you worried that RED and other algorithms are not being used due to ignorance, or worry, or whatnot. This shows, to me at least, that it's not the case. They've pushed cutting-edge algorithms in at the router level; the difference is that these algorithms give a performance boost in the common case while RED does not. RED only benefits you in high congestion, which you avoid like the fucking plague. The other algorithms speed up your router all the time. So, I think to have an impact on the routing world you can't focus on esoteric events like congestion collapse; you have to focus on the common case or no one will care.

Wednesday, September 17, 2008

Scaling Routers

We read a paper about building a 100 Tb/s router, and the design considerations required for it. I wasn't able to pick up too much from the paper, as the jargon was thick. I did figure out that router design is still generally pretty simple, with N^2 connections between inputs and outputs all over the place, which is apparently acceptable.

I'm not sure what I was supposed to get out of this anyhow. The power discussion? They mentioned that router power usage is becoming the limiting factor, so they were switching to optical. Ok, that's a good idea, I agree. They gave a proof that it would achieve 100% of the throughput, which I guess I buy; I mean, they are switching between lots of flows and I could see that working.

I don't think this paper was particularly valuable, a more general work on router structure probably would have been more beneficial to someone at my level, at least.

Sunday, September 14, 2008

XCP

This paper details XCP, the eXplicit Control Protocol. It attempts to improve upon TCP, particularly in instances of high delay-bandwidth product. This is done by breaking end-to-end and sticking a layer above IP in the routers that explicitly informs the endpoints about current congestion, plus some complicated mathematics.

I found this interesting as I have worked on DTN, something else that addresses TCP's poor behavior on high-delay links. The systems turn out to be unrelated, as XCP relies on ACKs to carry feedback, which DTN does not assume.

The most interesting aspect of the paper is the complete disregard of the end-to-end principle. I'm not sure how meaningful this is, tying the congestion control into the routers. TCP does a beautiful thing in assuming drops mean congestion. This is an assumption based on the lower layer, requiring no changes at the router. I feel as though a more interesting research agenda may be to reassess the assumptions the transport layer can make about the IP layer, now that the IP layer has RED and such. Perhaps backing off less on a drop, these things.

Saturday, September 13, 2008

RED

This paper deals with "Random Early Detection" algorithms for marking or dropping packets. It's a particularly elegant solution to the question of which packets are dropped when the router gets congested. Instead of waiting for congestion, it drops or marks packets with higher probability as the router approaches congestion. It kicks in once the average queue size passes a minimum threshold, where the average is a weighted moving average, as you'd expect. Once the average passes a maximum threshold, the probability of a drop is one.

I consider this so elegant because conversations that send more packets have a higher probability of having one selected for dropping. It just falls out of the scheme. Totally brilliant.

I was confused about the punishment of bursty communications it claims to avoid, but I think I understand that now. Because it works on average queue size rather than actual queue size, bursts do not imply a higher chance of drop. There is a chance that the queues may fill before the average queue size catches up though...
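The whole mechanism fits in a few lines. This sketch omits RED's count-based correction that spaces drops out more evenly, and the thresholds are made up:

    import random

    class Red:
        def __init__(self, min_th=5, max_th=15, max_p=0.1, w=0.002):
            self.min_th, self.max_th, self.max_p, self.w = min_th, max_th, max_p, w
            self.avg = 0.0

        def should_drop(self, queue_len):
            # weighted moving average, so bursts don't immediately cause drops
            self.avg = (1 - self.w) * self.avg + self.w * queue_len
            if self.avg < self.min_th:
                return False
            if self.avg >= self.max_th:
                return True
            p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
            return random.random() < p

    red = Red()
    print(red.should_drop(queue_len=20))   # False: the average hasn't caught up to the burst yet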

I did some work in Randy's datacenter 294 class that dealt with congestion control in datacenter networks. I made modifications to Click, which used RED as one of the mechanisms implemented.

The paper is ungodly long for such a simple system. RED is not complicated, and I feel as though a quick view of the Wiki page would be more valuable than this in-depth analysis of the algorithm.

Thursday, September 11, 2008

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks

This paper is the design and implementation of a stateless fair queueing algorithm. The cute trick they use is that the edge routers tack state onto the packets, allowing the core to make stateful decisions without keeping per-flow state.
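The trick in miniature (parameters invented): the edge writes its estimate of the flow's rate into the packet, and the core just drops with probability max(0, 1 - fair_share / labeled_rate), no per-flow state required.

    import random

    def edge_label(packet, flow_rate_estimate):
        packet["rate"] = flow_rate_estimate      # the state travels with the packet
        return packet

    def core_forward(packet, fair_share):
        drop_p = max(0.0, 1.0 - fair_share / packet["rate"])
        return None if random.random() < drop_p else packet

    # A 10 Mb/s UDP blaster vs a 1 Mb/s TCP flow, with a 2 Mb/s fair share:
    for rate in (10.0, 1.0):
        pkt = edge_label({"payload": "x"}, rate)
        print(rate, core_forward(pkt, fair_share=2.0))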

This paper was much clearer than the FQ paper. I found some interesting tidbits in it, particularly in regard to the end-to-end argument. As we move these flow-oriented schemes into the network, I begin to question how much of TCP is being reimplemented in the network. For instance, they had a misbehaving UDP connection that hosed the TCP connections in one experiment. The stateless FQ solved this, giving each flow equal bandwidth. If you think about it, a good portion of TCP is enforcing that exact behavior. So, why are all of the hosts using TCP? If stateless FQ is deployed, they can likely use UDP and get the same throughput and congestion-avoidance behaviors.

I know there's more to TCP than just that, it just caught my attention.

I found the tagging of packets to be really interesting. That's how Click does a lot of its work, though they call it "painting" the packet. I question the security. If this is a great function, why not have the ultimate gateways, the hosts, tag their packets? Well, that's presumably a security concern. What would it take for the clients to implement this? Would it simply be a sliding window?

It's late and this isn't well reasoned, we'll see about class tomorrow.

This paper was great and should totally be left in.

Tuesday, September 9, 2008

Analysis and Simulation of a Fair Queueing Algorithm

This paper describes a fair queuing algorithm, in the sense that misbehaving clients cannot abuse the network. With standard TCP and FCFS queuing at the intermediate routers, a client who ignores the sliding window and packet drops can gain an advantage and a disproportionate share of the network. They demonstrate this, as well as a solution.

Some background exists in the TCP Daytona implementation by Savage. He implemented a form of TCP that misbehaved in precisely the wrong way to cause this. Also, during my time in Uganda, I found another problem with TCP. They have these "download accelerators" that open multiple TCP connections, allowing the one download to grab a larger portion of the network. This system doesn't solve that problem, as it queues on a per-connection basis. Is this algorithm used? I'm assuming so. Could it be modified to handle the download accelerators? They hose the network at Makerere, that much I'm sure of.

The paper was thick and dense. I wasn't able to glean all the details, but I found the algorithm and the concept simple enough. I doubt there is a version of this that reads more easily, but I would much prefer one.

The paper is valuable, but not important enough for the syllabus. This could be explained in ten minutes in class.

Saturday, September 6, 2008

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

I read this paper earlier, by accident.

Congestion Avoidance and Control

This paper is a summary of the congestion control mechanisms built into the TCP standard. One of these is slow start, where the packet rate is only increased when an ACK is received. The other is additive increase/multiplicative decrease. Lastly, the exponential backoff mechanism is briefly justified.
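A simplified sender-side sketch of those two mechanisms together; this is just the shape of the window updates, not real TCP:

    class Window:
        def __init__(self):
            self.cwnd, self.ssthresh = 1.0, 64.0

        def on_ack(self):
            if self.cwnd < self.ssthresh:
                self.cwnd += 1.0               # slow start: roughly doubles each RTT
            else:
                self.cwnd += 1.0 / self.cwnd   # additive increase: about +1 per RTT

        def on_loss(self):
            self.ssthresh = self.cwnd / 2.0    # multiplicative decrease
            self.cwnd = 1.0                    # and restart from slow start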

These algorithms assume a number of things. One is that a dropped packet indicates congestion. In highly intermittent networks (802.11) this doesn't hold, and these algorithms tend to grossly overestimate the congestion in the network.

The paper itself is great, concise and clear. It looks long, at 25 pages, though around half of it is appendix.

As far as discussion topics, my normal questions come out:
Competing systems, and why
Assumptions, and why

This leads to a discussion of DTN, UDP, and any other systems I don't know about.

I'm also curious about what other assumptions are implicit. The big one for me has been the packet-dropping one, but various works have shown some security assumptions which do not necessarily hold.

This paper should totally be kept in the reading list.

This paper was mostly technical work, with little pure research. I don't feel as though it left many open research questions. If any, it would be attacks on this system and its assumptions.

Wednesday, September 3, 2008

Inferring AS Relationships

This paper is easily summarized: they simply attempted to infer the nature of AS connections from their behavior. Nothing terribly difficult, and not rocket science. Each AS behaves differently based on who it is connected to, forwarding packets if the other party has paid, and generally not forwarding if they have not. With a slightly more nuanced view, they show that it's not that hard to guess the relationships.

As to the paper itself, it went into far too much depth on the theoretical view of such a simple argument. That should have been left for an appendix, if anything. For discussion topics, I find little value. The previous paper gave a greater depth to the discussion of the reasoning for the AS setup. The inference of AS status is not particularly interesting to me either.

As a quicker read, this paper did give a good background on AS and BGP.

This paper is redundant and doesn't add much to the syllabus. I would swap it for an earlier paper describing the decisions made to push BGP to where it is. I'm a sucker for those sorts of things.

Routing Notes

This paper was a summary of the internetworking protocols, and how ASes are connected. The majority of the discussion was about BGP.

This paper was quite valuable, giving a good description of the protocols involved and the problems that arrive from their use. I particularly enjoyed the discussion of the economic concerns for ISPs. The transition from academic network to commercial entity was no doubt difficult, but it was interesting how the user demands enforced cooperation.

An interesting aside is how early networks did not always support this framework. AOL is my example from my youth. Traditionally, AOL did not allow you to leave AOL's network. This paper allowed me to understand how and why they attempted to do that.

Lastly, the design of BGP is great. It's simple and seems to solve most of the concerns of users. However, I know of a few security concerns in the BGP protocol. The use of TCP, in particular, seems strange.

This paper was great, and should be kept in the reading. It was clear and valuable. The reading for this week is a little long, at around 30-40 pages for two days.

Open research questions were not given in this paper, as it was an overview of existing technology. They did mention issues scaling iBGP routers. I personally wonder at how far this technology can scale as we continue to increase the connectivity of our devices. As always, I wonder about alternative schemes and where they are used.

This is a description of how our network works. The implications of the paper are benign for that reason, as it's not really about the paper so much as the technology described.

Increase/Decrease Algorithms

This paper was about the algorithms to enforce congestion avoidance. They make a distinction between congestion avoidance (staying in low congestion) and congestion control (staying out of high congestion). They find that additive increase/multiplicative decrease is the appropriate answer. This is done with a rigorous proof of its correctness.
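The graphical argument for fairness is easy to check numerically. A toy with two flows and my own numbers: additive increase keeps their difference constant while multiplicative decrease halves it, so the rates converge.

    # Two flows sharing a link of capacity 100: each adds 1 per round, and
    # both halve their rates whenever the link is overloaded.
    x, y = 80.0, 10.0
    for _ in range(200):
        x, y = x + 1, y + 1
        if x + y > 100:
            x, y = x / 2, y / 2
    print(round(x, 1), round(y, 1))   # the two rates end up nearly equal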

Some background on these algorithms comes from the TCP Daytona paper by Savage. In it, he exposes the assumptions made in this work. The biggest one is that they trust the client to implement the protocol correctly. If clients don't, they can game this algorithm to improve their performance. Very cool result there.

I consider the above an interesting discussion topic. Also, an interesting question revolves around the appropriate layer in the stack for this functionality. Anything above IP likely violates the skinny waist model, as the intermediate routers are the ones using this. However, a basic layer without any congestion control is useful for more generic systems. There's seemingly no obvious answer, though I expect that implementing it at IP is probably correct, as congestion control is a basic property of networks.

The paper was nice, although it went into too much depth on the deep proofs. The graphical proofs were valuable enough.

They punted on a great deal of questions, from the guessing of the number of users to async operation. They worked on a very abstract, singular problem and didn't leave that area.

As to the implications of this work, this is what's currently used, as far as I know. Greatly impactful.

Tuesday, September 2, 2008

The Design Philosophy of the DARPA Internet Protocols

This was read with great speed before class, as bnrg has been down.

Packet switching because it was "obvious" for the apps, not any high-minded idealism
This is strange, as the packet switching model gained them so much
Interesting point about priorities. GSM is a network that was designed by industry, so accountability was priority one, which has hampered the network severely.
So there are networks which break the IP plan by providing higher-level semantics at the network level. I wonder if that has happened since IP was developed?
Good point about IP's failure if network retransmits are common. However, that can also be implemented at a higher level.
The poor-TCP-implementation problem was shown by the Savage paper on TCP Daytona
Ends with a discussion of TCP. Quite good, and shows how scalability was priority one. As processing power scales faster than population, we may be able to move away from that model...

Monday, September 1, 2008

End-To-End Arguments in System Design

More emphasis on APIs as the primary design problem
You have to have end-to-end, as there are too many failure cases.
End-to-end meaning check and verify, generally
Non-end-to-end solutions (TCP) are only optimizations
When to end? Presumably when we hit hardware levels of risk
Great example on delivery:
Don't care about network-level ACKs, since the system could die after the packet is received. You want an application ACK
E2E Duplicate packets: what if application sends multiple packets?
Need E2E, if implicitly.
Remind me to name my distributed file system something other than SWALLOW
Not enough discussion of the performance optimizations of pushing it into the network
E2E as users making calls. That's pretty funny.