M  O  S  I  X
Cluster and Multi-Cluster Management


About MOSIX

MOSIX is an on-line management system, similar in role to an operating system, targeted at High Performance Computing (HPC) on x86 Linux clusters, multi-clusters and Clouds. MOSIX supports both interactive concurrent processes and batch jobs. It incorporates automatic resource discovery and dynamic workload distribution, features commonly found on single computers with multiple processors.

MOSIX is implemented as a software layer that allows applications to run on remote nodes as if they were running locally. Users can start regular (sequential and parallel) applications on one node, while MOSIX automatically seeks resources and transparently migrates processes to other nodes. There is no need to modify applications or link them with any library, to copy files or log in to remote nodes, or even to assign processes to different nodes - it is all done automatically, just fork and forget. Migrations are supervised by a comprehensive set of on-line algorithms that monitor the state of the resources and attempt to improve the overall performance by dynamic resource allocation, e.g. load-balancing.
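The "fork and forget" style can be illustrated with an ordinary program that simply forks its workers. The following is a minimal sketch, assuming a POSIX system and Python; the helper names `cpu_bound_work` and `fork_workers` are hypothetical and nothing in the code is MOSIX-specific. The program only creates processes and collects their results; on a MOSIX node, whether and where each child is migrated would be decided by the system, not by the program.

```python
import os


def cpu_bound_work(n):
    """Stand-in for a compute-heavy task: sum of squares below n."""
    return sum(i * i for i in range(n))


def fork_workers(inputs):
    """Fork one child per input and collect each result through a pipe."""
    children = []
    for n in inputs:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, write the result, then exit immediately
            # without running the parent's cleanup handlers.
            os.close(read_fd)
            try:
                with os.fdopen(write_fd, "w") as out:
                    out.write(str(cpu_bound_work(n)))
            finally:
                os._exit(0)
        # Parent: keep only the read end of the pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as inp:
            results.append(int(inp.read()))
        os.waitpid(pid, 0)  # reap the child
    return results


if __name__ == "__main__":
    print(fork_workers([10, 100, 1000]))
```

The program contains no node names, no remote login and no special library calls, which is the point of the process-level approach described above.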

A unique feature of MOSIX is that it operates at the process level, unlike other systems that operate at the job level. This means that the system adapts and redistributes the workload when the number of processes of a job (and/or their demands) changes (using "fork" and "exit"). This is especially useful for parallel jobs.

The latest version of MOSIX for Linux-2.6 can manage clusters and multi-clusters. Flexible management allows owners of clusters to share their computational resources, while still preserving their autonomy to disconnect their clusters at any time, without disrupting already running programs. A MOSIX multi-cluster can extend indefinitely as long as there is trust between the owners of its clusters. This must include guarantees that guest applications will not be tampered with while running in remote clusters and that no hostile computers can be connected to the local network. Since nowadays these requirements are standard within clusters and intra-organizational multi-clusters, we recommend the use of MOSIX in such cases.

MOSIX can run in native mode or in a Virtual Machine (VM). Native mode gives better performance but requires modifications to the base Linux kernel, whereas a VM can run on top of any unmodified operating system that supports virtualization, including Windows, Linux and OS X.

MOSIX is most suitable for running HPC applications with a low to moderate amount of I/O. Tests of MOSIX show that the performance of several such applications over a 1 Gb/s campus multi-cluster is nearly identical to that of a single cluster. It is particularly suitable for:

  • Efficient utilization of system-wide resources - by automatic resource discovery and load-balancing.
  • Running applications with unpredictable resource requirements or run times.
  • Running long processes - which are automatically sent to nodes in remote clusters and are migrated back when these nodes are disconnected.
  • Combining nodes of different speeds - by migrating processes among nodes based on their respective speeds, current load and available memory.
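To make the idea of choosing among nodes by speed, load and available memory concrete, here is a small illustrative heuristic in Python. It is not the MOSIX algorithm; the `Node` fields, the `effective_speed` formula and the `pick_node` helper are assumptions made only for this sketch. It discards nodes without enough free memory and then prefers the node where a newly arrived process would get the largest share of CPU speed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    name: str
    speed: float       # relative CPU speed (1.0 = reference node)
    load: int          # runnable processes currently on the node
    free_mem_mb: int   # memory available for guest processes


def effective_speed(node: Node) -> float:
    """Speed share one more process would get on this node (assumed model)."""
    return node.speed / (node.load + 1)


def pick_node(nodes: Sequence[Node], mem_needed_mb: int) -> Optional[Node]:
    """Return the best node with enough free memory, or None if none fits."""
    candidates = [n for n in nodes if n.free_mem_mb >= mem_needed_mb]
    if not candidates:
        return None
    return max(candidates, key=effective_speed)


if __name__ == "__main__":
    cluster = [
        Node("a", speed=1.0, load=0, free_mem_mb=4096),
        Node("b", speed=2.0, load=3, free_mem_mb=8192),
        Node("c", speed=3.0, load=1, free_mem_mb=512),
    ]
    best = pick_node(cluster, mem_needed_mb=1024)
    print(best.name if best else "no suitable node")
```

In this example node "c" is the fastest but lacks memory for a 1024 MB process, and node "b" is fast but heavily loaded, so the idle reference-speed node "a" is chosen. A real system would also have to weigh migration cost and re-evaluate as conditions change.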

    A few examples:

  • Scientific applications - genomic, protein sequences, molecular dynamics, quantum dynamics, nano-technology and other parallel HPC applications.
  • Engineering applications - CFD, weather forecasting, crash simulations, oil industry, ASIC design, pharmaceutical and other HPC applications.
  • Financial modeling, rendering farms, compilation farms.

    For further information, see the related papers and the FAQ.

Copyright © 1999-2010 A. Barak. All rights reserved.