Distributed Computation — A Primer

by Dabe — 28 July 2008

What follows is an attempt to describe distributed (grid) computing in layman's terms. We'll take a simple problem and analyze how to go about solving it, avoiding some common pitfalls along the way.

So let's say we have a Job. For this example, we're going to figure out which name in the phone book has the highest Scrabble score. For the purposes of this demonstration, let's assume that it takes one second to calculate the score for one name, and that there are a million names in our phone book. Thus, one computer working end-to-end would take one million seconds, which is 277 hours, or roughly 11.5 days. Obviously we can do better by splitting up the Job into multiple smaller tasks. [But being as this is a Distributed Computation primer, I'm guessing you already knew that!]

Naturally, the obvious answer is, we hand a bunch of phone books out to a bunch of people and say, "Person 1, you work on the 'A's, Person 2, the 'B's, etc." That way, we'll have 26 people working at the same time, and the work will be done much faster. Each person will say, "In my group, the highest score went to Quigley Quizzenheimer," or whatever.

But there are some further optimizations we can make. Exactly how long will it take this job to finish? Will it be 1,000,000 / 26? Doubtful. That assumes an even distribution between letters, but common letters will certainly have more names than uncommon ones; there are probably more "Smith"s than all the 'Z's put together!

So let's break up our search space into pages instead. Person 1 takes pages 1-1000, person 2, pages 1001-2000, etc. If we do that, we can divide our search space into arbitrarily small blocks -- maybe we have a hundred people working on 100-page sections, or a thousand people working on 10-page elements. Again, it's up to the programmer to define those parameters.

You'll note that at the other end of the spectrum, 1,000,000 people working on one name apiece is inefficient as well. It takes longer to assign each name than it actually does to perform the calculation. So there's got to be a "sweet spot" somewhere, where each node has just enough work to keep busy. We recommend trying to keep each task somewhere between 5-minutes and an hour, ideally.

We like to assume nodes are "unreliable" which means that, since we might be running on somebody's home computer, they might be turned off at night, or the user may start playing Quake or something. That's not the end of the world, of course, since the Frontier Grid Platform is designed to be resilient against such failures (we can re-assign a task to someone else when that happens) but still, it would suck to run for a task to run for a long time, then five minutes before we've finished calculating the ultimate answer to the ultimate question of life, the universe, and everything, our computer gets destroyed by Vogons to make way for an intergalactic space bypass or some such. (Okay, I digress... The point is, we want to try to size our tasks so they're long enough to get real work done, but not so long that we'll lose a lot of work if they get canceled. Thus the "5-minutes to 1-hour" rule-of-thumb -- you can measure that on your digital watch, which I still think is a pretty neat idea.)

Some applications, of course, can't be chopped up that small, which is okay, but in our example, where each name takes one second to calculate, that means each task should compute between 300 and 3600 names, so we would do well to divide our phone book into ~300-3000 tasks. If you had a hundred names on each page (10,000 pages) that means each task should work on between 3 and 30 pages.

Also, an obvious (but frequent) error is to distribute the entire phone book to each volunteer, when each one really only needs a couple dozen pages. The phone book in our example is the data, and we have to send that data out over the Internet each time. If we were wasting bandwidth sending out all 1,000,000 names to each task, it would dramatically slow down our entire job. Instead, we want to make sure we only send pages 1-10, and 11-20, etc. to the respective compute engines.

So that's the "plain text" version. What does that mean programmatically, in terms of the Frontier API?

Basically, what we have to do is: chop up our input into Data Elements (tear the phone book into lots of small pieces), then create a number of Tasks, assigning the corresponding Data Element to each one (that's saying "Task 1 gets pages 1-10, Task 2 gets pages 11-20, Task 3 gets pages 21-30, etc.")

You add each Task to your overall Job, and then Run it.

Listeners are what actually wait to hear results. It's worth mentioning that there are two ways to "listen" for results: you could either attach Listeners when you launch the job, then wait for your answers, OR you could launch the job with no Listeners, and later on, poll the server for results. If you know your job is going to run for a week, there's no point in keeping that window on your desktop open for seven days -- instead, you can "fire and forget" at launch time, go on vacation, and when you get back, download your results for post-processing.

Now, that's the simple case... The flexibility of the Frontier API is, it allows you to send multiple Data Elements to each task -- or even different EXECUTABLE classes.

Using multiple Data Elements is particularly useful if you have two (or more) variables you want to work on -- maybe you have a spreadsheet and you want to divide each cell into its own task. You can then create one set of Data Elements for each column (A, B, C, D, ...) and one set of Data Elements for each row (1, 2, 3, 4, ...) and when you launch the tasks, you say, "Task 1 will work on cell A1, so we need to send it the data for Column A, and Row 1. Task 2 will work on cell B1, so we need to send it the data from Column B, and Row 1," etc. If we have 20 rows and 20 columns, that's 40 different Data Elements, but combined, you can break the spreadsheet up into FOUR HUNDRED tasks. This gets exponentially more powerful as you add more and more dimensions to your data.

Multiple EXECUTABLE elements, on the other hand, would mean, maybe you have 10 different simulations you want to run, where each one uses a different algorithm, but they all operate on the same data set. This way, you could write up ten different classes and bundle them all together so Task 1 runs class MySimulation_A, Task 2 runs class MySimulation_B, etc. This might be useful, for example, in Evolutionary Computation applications, where you run a particular algorithm, tweak it slightly, run it again, and compare the results -- admittedly, more people can get their heads around chopping up the DATA, rather than the EXECUTABLE elements, but this ability is extremely powerful once you start to "think" in terms of extreme-scale computation.

So that, in a nutshell, is a very basic introduction to Distributed Computation via the Frontier API. Once you start thinking in terms of massively parallel optimization, questions like "What if?" quickly turn into "What's BEST?" No longer are you bound to running one simulation at a time, you can simultaneously model THOUSANDS of scenarios and determine which is best suited for your particular needs. But I'll have more to say about that next time...

Learn More about Computation on Demand

Visiting our office?

Get directions.

11260 Roger Bacon Drive, Suite 406
Reston, VA 20190-5227
Voice: 703-689-9689
Fax: 703-689-9695
Email: info@parabon.com