A SMALL MAC MINI SERVER CLUSTER

I have recently been building a small multi-node cluster of Mac Mini Servers as a development tool to explore some cloud services and parallel processing techniques. For this purpose, Mac Minis are great as they are dead quiet, use very little power, boot with no problems without a keyboard, mouse or monitor attached, and are easily set up to allow full remote management, configuration and screen sharing. For me the Server version is best, as only it comes with an Intel quad-core i7 CPU, giving effectively 8 processing nodes (hardware threads) each. The CPU clock speed is a bit slower than the best non-server version (2.0GHz vs 2.7GHz), but the non-server version is only dual-core.

The old and the new...: My development cluster, with an old MacBook Pro and Mac Mini as the portal to a number of newer Intel quad-core i7 Mac Mini Servers.

So far I have been mainly experimenting with JPPF in Java and nGrid with Mono and C#. The JPPF API is really quite good and the whole thing was dead easy to set up. nGrid has been a bit trickier as it uses a slightly different programming paradigm, but they are both very powerful and usable.

Of course, my original plan was to run Eucalyptus on a host of virtual machines using VirtualBox and Vagrant. However, that quickly turned out to be a lousy idea, as it generated heaps of IO and network traffic, and running Java/Mono apps on each of the virtual machines basically meant two levels of virtualization.

In the end, for pretty well all my use cases the fastest and simplest setup has been to just run a JPPF driver with a local node on each machine. For the kind of analysis and simulation I need to do, it turned out that a multithreaded approach was way faster than distributed processing. This is because the model datasets are typically quite large and each individual calculation chunk is quite small and quick - such as a single ray-trace through a large model - but many hundreds of thousands of them need to be generated. With some minor tweaking of its basic config file, JPPF seems to handle this approach really well - detecting that a single task is using multiple threads on a node and either using up any remaining processing threads or switching to a new node/machine when it needs to. Of course, trying to optimise the grouping of multi-threaded tasks into machine-sized chunks has meant a bit more work and care in designing the calculation classes, but doing so significantly reduces network traffic, IO and memory usage for any given job.
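
To give a rough idea of what a machine-sized chunk looks like, here is a minimal sketch in plain Java. It is not my actual calculation code and it does not use the JPPF API at all; the class name ChunkedRays, the traceRay stand-in and the toy model array are all made up for illustration. The idea is simply that one chunk holds the large model once and fans the many small calculations out over a local thread pool, so only the chunk boundaries and a summed result would ever need to cross the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedRays {

    /** Hypothetical stand-in for one small, quick calculation (e.g. a single ray-trace). */
    static double traceRay(double[] model, int rayIndex) {
        return model[rayIndex % model.length] * 0.5;
    }

    /**
     * One machine-sized chunk: traces rays in [from, to) against a shared
     * in-memory model using a local thread pool, and returns the summed result.
     */
    static double runChunk(double[] model, int from, int to, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Split the chunk into one contiguous sub-range per thread (rounded up).
            int perThread = (to - from + threads - 1) / threads;
            List<Future<Double>> parts = new ArrayList<>();
            for (int start = from; start < to; start += perThread) {
                final int lo = start;
                final int hi = Math.min(to, start + perThread);
                parts.add(pool.submit(() -> {
                    double sum = 0.0;
                    for (int i = lo; i < hi; i++) {
                        sum += traceRay(model, i);
                    }
                    return sum;
                }));
            }
            // Combine the partial sums; only this single value leaves the chunk.
            double total = 0.0;
            for (Future<Double> part : parts) {
                total += part.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // On a quad-core i7 with Hyper-Threading this typically reports 8.
        int threads = Runtime.getRuntime().availableProcessors();
        double[] model = new double[1_000_000]; // toy "large model"
        java.util.Arrays.fill(model, 1.0);
        double total = runChunk(model, 0, 500_000, threads);
        System.out.println(threads + " threads, total = " + total);
    }
}
```

In a real job, something like runChunk would be the body of a single distributed task, with the chunk size picked so that one task keeps all the threads on one machine busy.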