There are a number of hard math problems in designing a distributed system. Many of these problems are in a famous category in mathematics known as NP-Complete. The problems that are related to network and system reliability are in another category called #P-Complete. For example the famous Traveling Salesperson Problem (TSP) is NP-Complete, and is very hard. The TSP requires that you find the shortest path (or at least a good path) that connects all the cities in a tour, and allows the salesperson to visit each city once and only once while traveling on this path/schedule. There are *lots* of alternate routes when planning a path like the TSP tour of cities, and no real "closed form" solution to the problem that is known. If we were to take the fastest computers on earth, it would still take longer than the lifetime of the universe (using currently known algorithms and techniques) to exactly solve some of the large TSP problems. When we design a distributed system, including bandwidth, server CPU, Memory, routing, Cost, etc., there are lots of things to consider, and in particular there are lots of constraints, like max CPU utilization, Cost, etc. When we add constraints to some math problems they get hard really fast. In fact some problems, as we add constraints to a given system, appear to undergo a "phase transition" or rapid change in difficulty as "constraint density" increases, and finding solutions becomes difficult. This is related to the phase transitions we hear about in physics, where we have a phase change from water to ice upon freezing. In the field of mathematics, there is a subfield known as "combinatorial optimization" that relates to problems like the TSP and other discrete type problems. In some problem categories in the field of combinatorial optimization there are phase changes in problem complexity as we increase the constraints on a system. There are other phase change phenomenon that relate to the way we solve some of these hard problems in math. In some cases thermodynamic models of a system of equations can be effective in finding high quality solutions in a reasonable amount of time. In this thread here you can find a section I wrote about temperature and entropy:
http://www.namepros.com/domain-name...n-differences-between-dot-com-everything.html
Suppose we are trying to design a system that itself is designed to take abuse, for example servers under attack or failing, or network links or routers breaking down. In the linked thread above I mentioned a state density function Ω (omega) that is used to define temperature and entropy in physics. One of the things we can do is study the "distributed design space" and find the number of similar states to the current state of the system, and that have similar computational abilities to the abilities in the current state of the system. It is a thermodynamic argument, where you look for states of a distributed system that have a high density function Ω, or large numbers of nearby states that require minimal disruption to the operation of the system. Perhaps the nearby states are even better in some characteristics like Performance, and/or lower quality in terms of Risk (related to probability of failure of the current operating state), but they are still by some measure "neighbors". It costs us something to add survivability to a system, and it takes time to switch from one state of the system to another (and recovery time is one of the math problems in this real time design process), but in some cases the loss of the system is unacceptable. Some of these hard math problems in distributed system design need to be solved in real time if we want to design a self-healing self-protecting survivable system. (see DARPA Ultralog, I did the state solver for performance/risk etc.). Nature never seems to have too many problems solving problems that are far beyond todays computing, so it is natural that techniques that model the physics of nature, like phase change phenomenon, annealing, etc., might be useful in solving hard problems in math that reflect nature. So most of the last couple decades has been devoted to solving this NP-Complete class of problems, with focus on networking and distributed computing. Not a good topic for a party, but you asked, and it has been a fun career. I may retire soon, but overall I just loved the work... you know math geeks and all...
Like I said, my job is not a good party topic... and you know what it is New Years Eve, and for me it is time to start the party!!!
HAPPY NEW YEARS!!!
Marc