MOSIX
Cluster and Multi-Cluster Grid Management


About MOSIX

MOSIX is a management system targeted at High Performance Computing (HPC) on x86-based Linux clusters and multi-cluster organizational grids. MOSIX supports both interactive concurrent processes and batch jobs. It incorporates dynamic resource discovery and automatic workload distribution of the kind commonly found on single computers with multiple processors.

MOSIX is implemented as a software layer that allows applications to run on remote nodes as if they were running locally. Users can start their regular (sequential and parallel) applications on one node, while MOSIX automatically seeks resources and transparently migrates processes to other nodes. Users do not need to modify applications or link them with any library, log in to remote nodes, or even copy files to remote nodes. Migrations are supervised by a comprehensive set of on-line algorithms that monitor the state of the resources and attempt to improve the overall performance by dynamic resource allocation, e.g. load-balancing.
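As a rough illustration of what a load-balancing migration decision can look like, here is a minimal sketch in Python. It is not the MOSIX algorithm: the `Node` fields, the `effective_load` cost (runnable processes divided by relative speed) and the `pick_target` function are all hypothetical names and rules chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical snapshot of one node's resources (not a MOSIX structure)."""
    name: str
    speed: float       # relative CPU speed, 1.0 = reference node
    load: int          # number of runnable processes
    free_mem_mb: int   # memory currently available


def effective_load(node: Node, extra: int = 0) -> float:
    """Assumed cost model: runnable processes per unit of CPU speed."""
    return (node.load + extra) / node.speed


def pick_target(current: Node, others: List[Node], mem_needed_mb: int) -> Optional[Node]:
    """Return the node a process should move to, or None if it should stay.

    The process is already counted in current.load, so staying costs
    effective_load(current); moving adds one process to the candidate.
    """
    best = None
    best_cost = effective_load(current)
    for node in others:
        if node.free_mem_mb < mem_needed_mb:
            continue  # the process would not fit in memory there
        cost = effective_load(node, extra=1)
        if cost < best_cost:
            best, best_cost = node, cost
    return best


home = Node("home", speed=1.0, load=4, free_mem_mb=2048)
candidates = [
    Node("a", speed=2.0, load=3, free_mem_mb=4096),
    Node("b", speed=1.0, load=1, free_mem_mb=256),
]
target = pick_target(home, candidates, mem_needed_mb=512)
print(target.name if target else "stay")
```

In this toy run, node "b" is skipped because it lacks memory, and node "a" wins because four processes on a node twice as fast cost less than four processes on the home node. A real system would re-evaluate such decisions continuously as the monitored state changes.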

A unique advantage of MOSIX is that it operates at the process level, unlike other systems that operate at the job level. This means that the system adapts well and redistributes the load when the number of processes in a job, or their demands, change as processes call "fork" and "exit". This is especially useful for parallel jobs.
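To make the "fork" and "exit" point concrete, the sketch below shows an ordinary POSIX program whose process count grows and shrinks at run time. Nothing in it is MOSIX-specific; `run_workers` is a hypothetical helper written for this example, using only Python's standard `os.fork`, `os._exit` and `os.waitpid` (so it needs a Unix-like system).

```python
import os


def run_workers(n):
    """Fork n child processes, wait for them, and return their exit codes."""
    pids = []
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: do some CPU-bound work, then leave via exit.
            sum(k * k for k in range(100_000))
            os._exit(0)
        # Parent: remember the child so it can be reaped later.
        pids.append(pid)
    codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        codes.append(os.WEXITSTATUS(status))
    return codes


if __name__ == "__main__":
    # The job is 1 process, then up to 5, then 1 again.
    print(run_workers(4))
```

Each child here is a separate process, so a system that manages individual processes can place each one independently as it appears, whereas a job-level scheduler would see the whole program as a single unit.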

The latest version of MOSIX for Linux-2.6 can manage a cluster and a multi-cluster organizational grid. Flexible management allows owners of clusters to share their computational resources, while still preserving their autonomy to disconnect their nodes from the grid at any time, without disrupting already running programs. A MOSIX grid can extend indefinitely as long as there is trust between the owners of its clusters. This must include guarantees that guest applications will not be tampered with while running in remote clusters and that no hostile computers can be connected to the local network. Since nowadays these requirements are standard within clusters and organizational grids, we recommend the use of MOSIX in such cases.

MOSIX can run in native mode or in a Virtual Machine (VM). Native mode gives better performance but requires modifications to the base Linux kernel, whereas a VM can run on top of any unmodified operating system that supports virtualization, including Windows, Linux and OS X.

MOSIX is most suitable for running HPC applications with a low to moderate amount of I/O. Tests of MOSIX show that the performance of several such applications over a 1 Gb/s campus grid is nearly identical to that of a single cluster. It is particularly suitable for:

  • Efficient utilization of grid-wide resources - by automatic resource discovery and load-balancing.
  • Running applications with unpredictable resource requirements or run times.
  • Running long processes - which are automatically sent to grid nodes and are migrated back when these nodes are disconnected from the grid.
  • Combining nodes of different speeds - by migrating processes among nodes based on their respective speeds, current load and available memory.

    A few examples:

  • Scientific applications - genomic, protein sequences, molecular dynamics, quantum dynamics, nano-technology and other parallel HPC applications.
  • Engineering applications - CFD, weather forecasting, crash simulations, oil industry, ASIC design, pharmaceutical and other HPC applications.
  • Financial modeling, rendering farms, compilation farms.

    For further information see the related papers and the FAQ.

Copyright © 1999-2008 Amnon Barak. All rights reserved.