The openMosix HOWTO
Live free() or die()
Kris Buytaert and Others
I. Introduction
Chapter 1. Introduction

1.1. openMosix HOWTO

In the beginning there was Mosix, then came openMosix, in my opinion a more interesting project, not only from a technical point of view but also because of its more appropriate license. I decided to focus this HOWTO on openMosix rather than on Mosix, mainly because openMosix has a bigger user base. (Moshe Bar states that about 97% of the old Mosix community has switched over to openMosix.) (20020705)

Although much of the information is valuable to users of both Mosix and openMosix, I decided to split the HOWTO. The last release of the Mosix HOWTO containing information about both Mosix and openMosix will be 0.20. My intention is to focus on the openMosix HOWTO without neglecting the Mosix users. More info on http://howto.ipng.be/Mosix-HOWTO/

1.2. Introduction

This document gives a brief description of openMosix, a software package that turns a network of GNU/Linux computers into a computer cluster. Along the way, some background to parallel processing is given, as well as a brief introduction to programs that make special use of openMosix's capabilities. The HOWTO expands on the documentation as it provides more background information and discusses the quirks of various distributions.

After this HOWTO was first created, some people from the Mosix team created openMosix (more info later), and initially both openMosix and Mosix were discussed in this HOWTO. Although much of the information is valuable to users of both Mosix and openMosix, I decided to split the HOWTO. The last release of the Mosix HOWTO containing information about both Mosix and openMosix will be 0.20 and can be found on http://howto.ipng.be/Mosix-HOWTO/Mosix-HOWTO/

Kris Buytaert got involved in this piece of work when Scot Stevenson was looking for somebody to take over the job: this was during February 2002. While initially we discussed both Mosix and openMosix, this version of the HOWTO now mainly focuses on openMosix.
Please note that the document often still mentions Mosix where it should read openMosix. You will notice that some of the headings are not as serious as they should be. Scot had planned to write the HOWTO in a slightly lighter style, as the world (and even the part of the world with a burping penguin as a mascot) is full of technical literature that is deadly. Therefore some parts still have these comments.
1.3. Disclaimer

Use the information in this document at your own risk. I disavow potential liability for the contents of this document. Use of these concepts, examples, and/or other content of this document is entirely at your own risk. All copyrights are owned by their respective owners, unless specified otherwise. Use of a term in this document should not be regarded as affecting the validity of any trademark or service mark. openMosix is Copyright (c) by Moshe Bar. Mosix is Copyright (c) by Amnon Barak. Linux is a Registered Trademark of Linus Torvalds. openMosix is licensed under version 2 of the GNU General Public License as published by the Free Software Foundation. Naming of particular products or brands should not be seen as endorsements. You are strongly recommended to take a backup of your system before major installation and backups at regular intervals.

1.4. Distribution policy

Copyright (c) 2002 by Kris Buytaert and Scot W. Stevenson. This document may be distributed under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the appendix entitled "GNU Free Documentation License".

1.5. New versions of this document

Official new versions of this document can be found on the web pages of the Linux Documentation Project. Drafts and beta versions will be available on howto.ipng.be in the appropriate subfolder. Changes to this document will usually be discussed on the openMosix mailing lists. See the openMosix website for details.

Chapter 2. So what is openMosix Anyway?

2.1. A very, very brief introduction to clustering

Most of the time, your computer is bored. Start a program like xload or top that monitors your system use, and you will probably find that your processor load is not even hitting the 1.0 mark.
If you have two or more computers, chances are that at any given time, at least one of them is doing nothing. Unfortunately, when you really do need CPU power - during a C++ compile, or encoding Ogg Vorbis music files - you need a lot of it at once. The idea behind clustering is to spread these loads among all available computers, using the resources that are free on other machines.

The basic unit of a cluster is a single computer, also called a "node". Clusters can grow in size - they "scale" - by adding more machines. A cluster as a whole will be more powerful the faster the individual computers and the faster their connection speeds are. In addition, the operating system of the cluster must make the best use of the available hardware in response to changing conditions. This becomes more of a challenge if the cluster is composed of different hardware types (a "heterogeneous" cluster), if the configuration of the cluster changes unpredictably (machines joining and leaving the cluster), and if the loads cannot be predicted ahead of time.

2.1.1. A very, very brief introduction to clustering

2.1.1.1. HPC vs Fail-over vs Load-balancing

Basically there are 3 types of clusters: Fail-over, Load-balancing and High Performance Computing. The most deployed ones are probably the Fail-over cluster and the Load-balancing cluster.
The most commonly known examples of load-balancing and fail-over clusters are web farms, databases or firewalls. People want 99.99999% uptime for their services: the internet is open 24 hours a day, 7 days a week, 365 days a year, unlike the old days when you could shut down your server when the office closed. People who are in need of CPU cycles can often afford to schedule downtime for their environments, as long as they can use the maximum power of their machines when they need it.

2.1.1.2. Supercomputers vs. clusters

Traditionally Supercomputers have only been built by a select number of vendors: a company or organization that required the performance of such a machine had to have a huge budget available for its Supercomputer. Lots of universities could not afford the costs of a Supercomputer by themselves, so they researched other alternatives. The concept of a cluster was born when people first tried to spread different jobs over more computers and then gather back the data those jobs produced. With cheaper and more common hardware available to everybody, results similar to real Supercomputers were only to be dreamed of during the first years, but as the PC platform developed further, the performance gap between a Supercomputer and a cluster of multiple personal computers became smaller.

2.1.1.3. Cluster models [(N)UMA, PVM/MPI]

There are different ways of doing parallel processing: (N)UMA, DSM, PVM and MPI are all different kinds of parallel processing schemes. Some of them are implemented in hardware, others in software, others in both. (N)UMA ((Non-)Uniform Memory Access) machines, for example, have shared access to the memory where they can execute their code. In the Linux kernel there is a NUMA implementation that accounts for the different memory access times of different regions of memory. It then is the kernel's task to use the memory that is the closest to the CPU it is using.
DSM, or Distributed Shared Memory, has been implemented in both software and hardware; the concept is to provide an abstraction layer for physically distributed memory. PVM and MPI are the tools that are most commonly being used when people talk about GNU/Linux based Beowulfs. MPI stands for Message Passing Interface. It is the open standard specification for message passing libraries. MPICH is one of the most used implementations of MPI. Next to MPICH you also can find LAM, another implementation of MPI based on the free reference implementation of the libraries. PVM or Parallel Virtual Machine is another cousin of MPI that is also quite often being used as a tool to create a Beowulf. PVM lives in user space so no special kernel modifications are required: basically each user with enough rights can run PVM.

2.1.1.4. openMosix's role

The openMosix software package turns networked computers running GNU/Linux into a cluster. It automatically balances the load between different nodes of the cluster, and nodes can join or leave the running cluster without disruption of the service. The load is spread out among nodes according to their connection and CPU speeds. Since openMosix is part of the kernel and maintains full compatibility with Linux, a user's programs, files, and other resources will all work as before without any further changes. The casual user will not notice the difference between a Linux and an openMosix system. To her, the whole cluster will function as one (fast) GNU/Linux system.

openMosix is a Linux-kernel patch which provides full compatibility with standard Linux for IA32-compatible platforms. The internal load-balancing algorithm transparently migrates processes to other cluster members. The advantage is a better load-sharing between the nodes. The cluster itself tries to optimize utilization at any time (of course the sysadmin can affect the automatic load-balancing by manual configuration during runtime).
This transparent process-migration feature makes the whole cluster look like a BIG SMP-system with as many processors as available cluster-nodes (of course multiplied with X for X-processor systems such as dual/quad systems and so on). openMosix also provides a powerful optimized File System (oMFS) for HPC-applications, which unlike NFS provides cache, time stamp and link consistency.

2.2. The story so far

2.2.1. Historical Development

Rumours say that Mosix comes from Moshe Unix. Initially Mosix started out as an application running on BSD/OS 3.0.
2.2.2. openMosix

openMosix is in addition to whatever you find at mosix.org and in full appreciation and respect for Prof. Barak's leadership in the outstanding Mosix project. Moshe Bar has been involved for a number of years with the Mosix project (www.mosix.com) and was co-project manager of the Mosix project and general manager of the commercial Mosix company. After a difference of opinions on the commercial future of Mosix, he has started a new clustering company - Qlusters, Inc. - and Prof. Barak has decided not to participate for the moment in this venture (although he did seriously consider joining) and held long running negotiations with investors. It appears that Mosix is not any longer supported openly as a GPL project. Because there is a significant user base out there (about 1000 installations world-wide), Moshe Bar has decided to continue the development and support of the Mosix project under a new name, openMosix, and under the full GPL2 license. Whatever code in openMosix comes from the old Mosix project is Copyright 2002 by Amnon Barak. All the new code is Copyright 2002 by Moshe Bar.

There could (and will) be significant changes in the architecture of the future openMosix versions. New concepts about auto-configuration, node-discovery and new user-land tools are being discussed in the openMosix mailing lists. Most of these new functionalities are already implemented while some of them, such as DSM (Distributed Shared Memory), are still being worked on at the moment I write this (March 2003).

To approach standardization and future compatibility the proc-interface has changed from /proc/mosix to /proc/hpc and the /etc/mosix.map was changed to /etc/hpc.map. More recently the standard for the config file has been set to be located in /etc/openmosix.map (this is in fact the first config file the /etc/init.d/openmosix script will look for). Adapted command-line user-space tools for openMosix are already available on the web-page of the project.
The openmosix.map config file can be replaced with a node auto-discovery system called omdiscd (openMosix auto DISCovery Daemon), which we will discuss later. openMosix is supported by various competent people (see openmosix.sourceforge.net) working together around the world. The main goal of the project is to create a standardized clustering-environment for all kinds of HPC-applications. openMosix also has a project web-page at http://openMosix.sourceforge.net with a CVS tree and mailing-lists for developers as well as users.

2.2.3. Current state

Like most active Open Source programs, openMosix's rate of change tends to outstrip the followers' ability to keep the documentation up to date. As I write this part in February 2003, openMosix 2.4.20 and the openMosix Userland Tools v0.2.4 are available, including the new autodiscovery tools. For a more recent state of development please take a look at the openMosix website.

2.3. openMosix in action: An example

openMosix clusters can take various forms. To demonstrate this, let's assume you are a student and share a dorm room with a rich computer science guy, with whom you have linked computers to form an openMosix cluster. Let's also assume you are currently converting music files from your CDs to Ogg Vorbis for your private use, which is legal in your country. Your roommate is working on a project in C++ that he says will bring World Peace. However, at just this moment he is in the bathroom doing unspeakable things, and his computer is idle. So when you start a program like bladeenc to convert Bach's .... from .wav to .ogg format, the openMosix routines on your machine compare the load on both nodes and decide that things will go faster if that process is sent from your Pentium-233 to his Athlon XP. This happens automatically: you just type or click your commands as you would if you were on a standalone machine.
All you notice is that when you start two more coding runs, things go a lot faster, and the response time doesn't suffer. Now while you're still typing ...., your roommate comes back, mumbling something about red chili peppers in cafeteria food. He resumes his tests, using a program called 'pmake', a version of 'make' optimized for parallel execution. Whatever he's doing, it uses up so much CPU time that openMosix even starts to send subprocesses to your machine to balance the load. This setup is called *single-pool*: all computers are used as a single cluster. The advantage/disadvantage of this is that your computer is part of the pool: your stuff will run on other computers, but their stuff will run on yours too.

2.4. Components

2.4.1. Process migration

With openMosix you can start a process on one machine and find out it actually runs on another machine in the cluster. Each process has its own Unique Home Node (UHN) where it gets created. Migration means that a process is split in 2 parts, a user part and a system part. The user part will be moved to a remote node while the system part will stay on the UHN. This system part is sometimes called the deputy process: this process takes care of resolving most of the system calls. openMosix takes care of the communication between these 2 processes.

2.4.2. The openMosix File System (oMFS)

oMFS is a feature of openMosix which allows you to access remote filesystems in a cluster as if they were locally mounted. The filesystems of your other nodes can be mounted on /mfs and you will, for instance, find the files in /home on node 3 on each machine in /mfs/3/home.

2.5. openMosix Test Drive

In support of openMosix, Major Chai Mee Joon is giving OM users a free trial account to his online openMosix cluster service, which users can use to test and experiment with openMosix.
The availability of this online openMosix cluster service will help new users overcome the initial openMosix configuration issues, and also provides higher computing power to openMosix users who are developing or porting their applications. To get your userid and password to the cluster: http://www.mosixcluster.com/trial.php

2.6. Pros of openMosix

Table 2-1. Pros of openMosix

2.7. Cons of openMosix

Table 2-2. Cons of openMosix
II. Installing openMosix
Chapter 3. Requirements and Planning

3.1. Hardware requirements

Installing a basic cluster requires at least 2 network connected machines, either using a cross-cable between the two network cards or using a switch or hub (a switch is much better than a hub though and only costs a few bucks more). Of course the faster your network cards, the easier it is to get better performance from your cluster. These days Fast Ethernet (100 Mbps) is standard; putting multiple ports in a machine isn't that difficult, but make sure to connect them through separate physical networks in order to gain the speed you want. Gigabit Ethernet is getting cheaper every day now but I suggest that you don't rush to the shop spending your money before you have actually tested your setup with multiple 100Mbit cards and noticed that you really do need the extra network capacity. Next to putting in a Gigabit card you might also want to try bonding different 100Mbit cards together. An even cheaper alternative can be found in Firewire, as discussed in this paper.

3.2. Hardware Setup Guidelines

Setting up a big cluster requires some thinking to be done: where are you going to put the machines? Not under a table somewhere or in the middle of your office I hope! It's ok if you just want to do some small tests, but if you are planning to deploy an N node cluster you will have to make sure that the environment that will hold these machines is capable of doing so. I'm talking about preparing one or more 19" racks to host the machines and configuring the appropriate network topology, either straight, single connected or even a 1 to 1 cross connected network between all your nodes. You will also need to make sure that there is enough power to support such a range of machines, that your air-conditioning system supports the load and that in case of power-failure your UPS can cleanly shut down all the required systems.
You might want to invest in a KVM (Keyboard, Video, Mouse) switch in order to facilitate access to the machines' consoles. But even if you don't have the number of nodes that justifies such an investment, make sure that you can always easily access the different nodes: you never know when you have to replace a CPU fan or a hard disk of a machine in trouble. If that means that you have to unload a stack of machines to reach the bottom one, hence shutting down your cluster, you are in trouble.

3.3. Software requirements

The systems we plan to use will need a basic Linux installation of your choice: Red Hat, Suse, Debian, Gentoo or any other distribution: it doesn't really matter which one. What does matter is that the kernel is at least at the 2.4 level, and that your network cards are configured correctly; next to that you'll need a healthy amount of swap space.

3.4. Planning your Cluster

When it comes to configuring openMosix clusters with a pool of servers and a set of (personal) workstations, you have different options that will have their advantages and disadvantages.
3.5. Classrooms

Although it might seem a good idea to convert your classroom into an openMosix cluster at night, you'll have to consider training your end users not to pull the power switch of those machines when they want to use them again. More recent machines support automatic shutdowns when hitting the power button, but with older machines you might lose some data now and then when this actually happens.

Chapter 4. Distribution specific installations

4.1. Installing openMosix

This chapter deals with installing openMosix on different distributions. It won't be an exhaustive list of all the possible combinations. However throughout the chapter you should find enough information on installing openMosix in your environment. Techniques for installing multiple machines with openMosix will be discussed in one of the next chapters.

4.2. Getting openMosix

You can download the latest versions of openMosix from http://sourceforge.net/project/showfiles.php?group_id=46729. You can either choose the binary (even in rpm) compiled for UP or SMP or download the source code. You will need both the kernel patch or binaries and the userland tools. Alternatively you can get the CVS version:
4.3. openMosix General Instructions

4.3.1. Kernel Compilation

Always use pure vanilla kernel-sources from http://www.kernel.org/ to compile an openMosix kernel! Please be kind enough to download the kernel using a mirror near to you and always try to download patches to the latest kernel sources you do have instead of downloading the whole thing. This is going to be much appreciated by the Linux community and will greatly increase your geeky Karma ;-)

Be sure to use the right openMosix patch depending on the kernel-version. At the moment I write this, the latest 2.4 kernel is 2.4.20 so you should download the openMosix-2.4.20-x.gz patch, where the "x" stands for the patch revision (i.e. the greater the revision number, the more recent it is). Do not use the kernel that comes with any Linux-distribution: it won't work. These kernel sources get heavily patched by the distribution-makers, so applying the openMosix patch to such a kernel is going to fail for sure! Been there, done that: trust me ;-)

Download the current version of the openMosix patch and move it into your kernel-source directory (e.g. /usr/src/linux-2.4.20). If your kernel-source directory is other than "/usr/src/linux-[version_number]", at least the creation of a symbolic link to "/usr/src/linux-[version_number]" is required. Supposing you're the root user and you've downloaded the gzipped patch file in your home directory, apply the patch using (guess what?) the patch utility:
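As a minimal sketch of the patching step just described, assuming the sources live in /usr/src/linux-2.4.20 and the patch was saved as /root/openMosix-2.4.20-x.gz (replace the x with the real revision). The helper name apply_gz_patch is made up for this illustration and is not part of openMosix:

```shell
#!/bin/sh
# apply_gz_patch SRCDIR PATCHFILE
# Hypothetical helper: applies a gzipped patch from the top of a source tree.
# PATCHFILE should be an absolute path, because we change directory first.
apply_gz_patch() {
    srcdir=$1
    patchfile=$2
    # -p1 strips the leading directory from the paths inside the patch;
    # -N ignores hunks that appear to be applied already.
    ( cd "$srcdir" && gzip -dc "$patchfile" | patch -Np1 )
}

# Example call (as root), left commented out on purpose:
# apply_gz_patch /usr/src/linux-2.4.20 /root/openMosix-2.4.20-x.gz
```

If patch reports rejected hunks, the kernel tree is most likely not a vanilla one, as warned above.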
Now compile it with:
Reboot and your openMosix-cluster-node is up!

4.3.2. Syntax of the /etc/openmosix.map file

Before starting openMosix, there has to be an /etc/openmosix.map configuration file, which must be the same on each node. The standard is now /etc/openmosix.map; /etc/mosix.map and /etc/hpc.map are old standards, but the CVS version of the tools is backwards compatible and looks for /etc/openmosix.map, /etc/mosix.map and /etc/hpc.map (in that order). The openmosix.map file contains three space-separated fields:
If a node has more than one network interface, it can be configured with the ALIAS option in the range-size field (which is equivalent to setting the range-size to 0), e.g.
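To make the three-field layout concrete, here is a small sanity check for a map file. It is only a sketch: the function name check_map is made up, the field meanings (node number, IP address or hostname, range-size) are how I understand the format, and skipping lines that start with # is an assumption about the syntax.

```shell
#!/bin/sh
# check_map FILE
# Hypothetical helper: checks that every entry has exactly three fields,
# a numeric node number and a range-size that is a number or ALIAS.
check_map() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }   # blank lines and comments (assumed syntax)
        NF != 3 {
            printf "line %d: expected 3 fields, got %d\n", NR, NF
            bad = 1
            next
        }
        $1 !~ /^[0-9]+$/ {
            printf "line %d: node number is not numeric\n", NR
            bad = 1
        }
        $3 !~ /^([0-9]+|ALIAS)$/ {
            printf "line %d: range-size must be a number or ALIAS\n", NR
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# A map this check accepts (addresses are made-up examples):
#   1 192.168.1.1   3
#   4 192.168.1.10  1
#   4 10.0.0.10     ALIAS
```

The function prints one message per problem and returns a non-zero status if any line is malformed, so it can be run before copying the file to all nodes.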
Always be sure to run the same openMosix version AND configuration on each of your Cluster's nodes! Start openMosix with the "setpe" utility on each node :
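The lookup order for the map file mentioned earlier (/etc/openmosix.map, then /etc/mosix.map, then /etc/hpc.map) can be sketched in a few lines of shell. The function find_map and its optional ROOT argument are made up for this illustration; the setpe flags in the commented call are how I recall them from the userland tools, so check them against your version first.

```shell
#!/bin/sh
# find_map [ROOT]
# Hypothetical helper: prints the first map file that exists, trying the
# names in the documented order. ROOT is an optional prefix, handy for
# testing against a copy of /etc.
find_map() {
    root=${1:-}
    for f in /etc/openmosix.map /etc/mosix.map /etc/hpc.map; do
        if [ -f "$root$f" ]; then
            printf '%s\n' "$root$f"
            return 0
        fi
    done
    echo "find_map: no map file found" >&2
    return 1
}

# Example call (as root), left commented out; the -w -f flags are an assumption:
# setpe -w -f "$(find_map)"
```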
Alternatively, you can grab the "openmosix" script which can be found in the scripts directory of the userspace-tools, copy it to the /etc/init.d directory, chmod 0755 it, then use the following commands as root:
Installation is finished now: the cluster is up and running :)

4.3.3. oMFS

First of all, the CONFIG_MOSIX_FS option in the kernel configuration has to be enabled. If the current kernel was compiled without this option, then recompilation with this option enabled is required. Also the UIDs (User IDs) and GIDs (Group IDs) on the file systems of each of the cluster's nodes must be the same. You might want to accomplish this using openldap. The CONFIG_MOSIX_DFSA option in the kernel is optional but of course required if DFSA should be used. To mount oMFS on the cluster there has to be an additional fstab entry in each node's /etc/fstab. In order to have DFSA enabled:
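A hedged sketch of adding such an entry without duplicating it on repeated runs. The entry itself (device name mfs_mnt, filesystem type mfs, option dfsa=1 or dfsa=0) is written from memory of the openMosix documentation and should be treated as an assumption; the function name add_mfs_entry is made up.

```shell
#!/bin/sh
# add_mfs_entry [FSTAB] [DFSA]
# Hypothetical helper: appends an oMFS line to FSTAB (default /etc/fstab)
# unless an uncommented entry for the /mfs mount point is already there.
# DFSA is 1 (enabled, the default) or 0 (disabled).
add_mfs_entry() {
    fstab=${1:-/etc/fstab}
    dfsa=${2:-1}
    # Already present? Then leave the file alone.
    if awk '$1 !~ /^#/ && $2 == "/mfs" { found = 1 } END { exit !found }' "$fstab"; then
        return 0
    fi
    # Assumed entry format: device, mount point, type, options, dump, pass.
    printf 'mfs_mnt\t/mfs\tmfs\tdfsa=%s\t0 0\n' "$dfsa" >> "$fstab"
}

# Example call (as root), left commented out:
# add_mfs_entry /etc/fstab 1
```

The mount point /mfs has to exist on every node before the entry can be mounted.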
With the help of some symbolic links all cluster-nodes can access the same data e.g. /work on node1
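To illustrate the idea, the following sketch builds the oMFS path of a directory on a given node, following the /mfs/3/home example from the oMFS overview. The function name mfs_path is made up, and the ln call is only shown as a comment because it changes the filesystem.

```shell
#!/bin/sh
# mfs_path NODE PATH
# Hypothetical helper: prints where PATH on node NODE shows up under /mfs.
mfs_path() {
    node=$1
    path=${2#/}    # drop one leading slash so the result has no "//"
    printf '/mfs/%s/%s\n' "$node" "$path"
}

# On every node except node 1, /work could then point at node 1's copy:
# ln -s "$(mfs_path 1 /work)" /work
```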
The following special files are excluded from the oMFS:
Creating links like:
The following system calls are supported without sending the migrated process back to its home node (the call is executed on the remote node): read, readv, write, writev, readahead, lseek, llseek, open, creat, close, dup, dup2, fcntl/fcntl64, getdents, getdents64, old_readdir, fsync, fdatasync, chdir, fchdir, getcwd, stat, stat64, newstat, lstat, lstat64, newlstat, fstat, fstat64, newfstat, access, truncate, truncate64, ftruncate, ftruncate64, chmod, chown, chown16, lchown, lchown16, fchmod, fchown, fchown16, utime, utimes, symlink, readlink, mkdir, rmdir, link, unlink, rename

Here are situations when system calls on DFSA mounted file-systems may not work:
Next to the /mfs/1/, /mfs/2/ and so on directories you will find some other directories as well.

Table 4-1. Other Directories
Note that these magic files are all ``per process''. That is, their content depends on which process opens them. A last note about oMFS is that there are versions around that return faulty results when you run "df" on those filesystems. Don't be surprised if you suddenly have about 1.3 TB available on those systems.

4.4. Red Hat and openMosix

If you are running a RedHat 7.2, 7.3 or 8.0 version, this is probably the easiest *Mosix install you have ever done. Choose the appropriate openMosix RPMs from sourceforge. They have precompiled kernels (as I write this 2.4.20) that work seamlessly: I have tested them on several machines including laptops with PCMCIA cards and servers with SCSI disks. If you are a grub user, the kernel rpm even modifies your grub.conf. So all you have to do is install 2 RPMs:
Most RedHat installations have one extra thing to fix. You often get the following error:
If you would like to use more bleeding edge patches, you can always opt for the src rpm and run rpmbuild --rebuild on it. This will install the source for you and create an initial config file. From there you can go further applying patches to openMosix. As new RedHat versions come out, they might be supported out of the box, so feel free to drop the author a note and help him keep this information updated.

4.5. Suse and openMosix

Although the RPMs are being built on a RedHat based environment, you can use most of them on other RPM based systems. Suse however has /sbin/mk_initrd as a link to /sbin/mkinitrd, which makes rpms before release 20-2 fail. Newer versions should have a fix for this.

4.6. Debian and openMosix

Installing openMosix ``the Debian way'' can be easily done as described below. The first step consists in downloading the packages from the net. I had to use a 2.4.19 kernel since the openMosix patches package is not yet available for 2.4.20 at the moment I write this. Since we are using a Debian setup we needed: http://packages.debian.org/unstable/net/openmosix.html, http://packages.debian.org/unstable/net/kernel-patch-openmosix.html, http://packages.debian.org/unstable/misc/kernel-package.html, http://packages.debian.org/unstable/devel/kernel-source-2.4.19.html. You can also apt-get install them ;). The next part is making the kernel openMosix capable. Basically, the procedure to follow is:
After rebooting with this kernel and a configured /etc/openmosix.map, you should then have a cluster of openMosix machines that talk to each-other and that do migration of processes. You can test that by running the following small script:
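As a rough sketch of such a test (not necessarily the script originally used here), the following starts a handful of CPU-bound jobs in the background so that the load balancer has something to migrate. The function names spin and run_jobs and the loop sizes are made up for this illustration.

```shell
#!/bin/sh
# spin N
# CPU-bound busy loop: adds up the integers 0..N-1 and prints the sum.
spin() {
    awk -v n="$1" 'BEGIN { s = 0; for (i = 0; i < n; i++) s += i; print s }'
}

# run_jobs COUNT N
# Starts COUNT spin jobs in the background and waits for all of them.
run_jobs() {
    count=$1
    n=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        spin "$n" > /dev/null &
        i=$((i + 1))
    done
    wait
}

# Example: eight long-running jobs; pick N large enough to run for minutes.
# run_jobs 8 500000000
```

While the jobs run, tools such as mtop or mps (described in the administration chapter) should show on which node each process is being computed.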
We also set up openMosixView on the Debian machine:
openMosixView gives you a nice interface that shows the load of different machines and gives you the possibility to migrate processes manually. A detailed discussion of openMosixView can be found elsewhere in this document.

4.7. openMosix and Gentoo

First install Gentoo Linux. Then install openMosix: type "emerge sys-apps/openmosix-user", which will install an openMosix kernel source tree in /usr/src/linux along with the openMosix userland tools. Michael Imhof, aka tantive, keeps Gentoo current for the latest openMosix version. Daniel Robbins, the President/CEO of Gentoo Technologies, Inc. and the creator of Gentoo Linux, wrote the articles we use as our Introduction to openMosix Clusters.

Chapter 5. Autodiscovery

5.1. Easy Configuration

The auto-discovery daemon (omdiscd) provides a way to automatically configure an openMosix cluster, hence eliminating the need for a /etc/mosix.map or similar manual configuration. Auto-discovery uses multicast packets to notify other nodes that it is an openMosix node. This way, adding an extra node to your openMosix cluster means that you just have to start omdiscd on that machine and it will join the cluster. However there are some small requirements. As with any openMosix cluster, you need to have networking configured correctly, mainly the routing. Without a default route, you must specify an interface to omdiscd with the -i option. Otherwise omdiscd will exit with an error like:
omdiscd has some other options that you can use. You can run omdiscd either as a daemon (the default) or in the foreground with omdiscd -n, where output goes to the screen (standard output). An interface can be specified with the -i option. Now let's have a short look at the other tool, showmap. This tool will show you the newly auto-generated openMosix map.
Auto-discovery has some other features not listed here, such as a routing mechanism for clusters with more than one network. More detailed information can be found in the README and DESIGN files in the user-land tools source tree. More recent versions of the openMosix rc scripts will first verify whether an /etc/openmosix.map file or similar exists before trying to use autoconfiguration.

5.2. Compiling auto-discovery

If you are compiling autodiscovery from source you will need to make a small modification to openmosix.c. One of the first lines will be
5.3. Troubleshooting autodiscovery

Sometimes however autodiscovery does not function as you would like; for example, a node might not see multicast traffic from other nodes. This has occurred with some PCMCIA ethernet drivers. One solution is to place the interface in promiscuous and/or multicast mode as detailed below:
I have also noticed that autodiscovery does not work with FireWire based network cards.

Chapter 6. Cluster Installation

6.1. Cluster Installations

This chapter does not deal with installing openMosix as such; it does however deal with installing multiple machines with openMosix, that is, automated or semi-automated mass installs.

6.2. DSH, Distributed Shell

At the time of this writing (May 2003) DSH's most current release is available from http://www.netfort.gr.jp/~dancer/software/downloads/ More info on the package can be found on http://www.netfort.gr.jp/~dancer/software/dsh.html The latest version available for download is 0.23.6. You will need both libdshconfig-0.20.8.tar.gz and dsh-0.23.5.tar.gz. Start with installing libdshconfig
Say we have a small cluster with a couple of nodes. To make life easier we want to type each command once but have it executed on each node. You then have to create a file in $HOME/.dsh/group/clusterwname that lists the IPs of your cluster, e.g.
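A small sketch of creating that group file from the shell. The helper write_dsh_group and the addresses are made up for this illustration; the dsh call at the end is left as a comment, and its flags (-g to pick the group, -- before the command) are as I understand the dsh documentation.

```shell
#!/bin/sh
# write_dsh_group NAME IP...
# Hypothetical helper: writes one address per line into
# $HOME/.dsh/group/NAME, creating the directory if needed.
write_dsh_group() {
    name=$1
    shift
    dir=$HOME/.dsh/group
    mkdir -p "$dir"
    printf '%s\n' "$@" > "$dir/$name"
}

# Example with made-up addresses:
# write_dsh_group clusterwname 192.168.0.1 192.168.0.2 192.168.0.3
# dsh -g clusterwname -- uptime
```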
III. Administrating openMosix
Chapter 7. Administrating openMosix

7.1. Basic Administration

openMosix provides the advantage of process migration to HPC-applications. The administrator can configure and tune the openMosix cluster by using the openMosix user-space tools or the /proc/hpc interface, which will now be described in detail. Up to openMosix version 2.4.16 the /proc interface was named /proc/mosix! From openMosix version 2.4.17 on it is named /proc/hpc.

7.2. Configuration

The values in the flat files in the /proc/hpc/admin directory represent the current configuration of the cluster. The administrator can also write their own values into these files to change the configuration during runtime, e.g.

Table 7-1. Changing /proc/hpc parameters
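A minimal sketch of changing such a value from the shell. The helpers hpc_set and hpc_get and the HPC_ADMIN variable are made up for this illustration, and the file name block in the commented example is given from memory; list /proc/hpc/admin on your own kernel to see which files exist.

```shell
#!/bin/sh
# Directory holding the admin files; can be overridden for testing.
HPC_ADMIN=${HPC_ADMIN:-/proc/hpc/admin}

# hpc_set NAME VALUE
# Hypothetical helper: writes VALUE into the admin file NAME and fails
# with a message if the file is missing or not writable.
hpc_set() {
    file=$HPC_ADMIN/$1
    if [ ! -w "$file" ]; then
        echo "hpc_set: $file is not writable (openMosix kernel? root?)" >&2
        return 1
    fi
    printf '%s\n' "$2" > "$file"
}

# hpc_get NAME
# Prints the current content of the admin file NAME.
hpc_get() {
    cat "$HPC_ADMIN/$1"
}

# Example (as root on an openMosix node); the file name is an assumption:
# hpc_set block 1
# hpc_get block
```

Changes made this way last until the next reboot, so anything permanent belongs in an init script.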
... Table 7-2. /proc/hpc/admin/
Table 7-3. Writing a 1 to the following files /proc/hpc/decay/
Table 7-4. Information about the other nodes
Table 7-5. Additional information about local processes
7.3. The userspace-tools

The following tools provide easy administration of openMosix clusters.
Table 7-6. more detailed
The mosrun command can be executed with several more command-line options. To make this easier, there are several preconfigured run-scripts for executing jobs with a special (openMosix) configuration.

Table 7-7. extra options for mosrun
In addition to the /proc interface and the command-line openMosix utilities (which use the /proc interface), patched versions of "ps" and "top" are available (called "mps" and "mtop"), which also display the openMosix node ID in a column. This is useful for finding out where a specific process is currently being computed. That summarises the command-line tools, but also have a look at openMosixview, a GUI for the most common administration tasks, which will be discussed in a later chapter.

7.4. Cluster Mask

(by Moshe Bar) Several people have asked for a feature in openMosix that allows you to specify to which nodes a given process and its children can migrate and to which nodes they cannot. Simone Ettore has just committed a new patch to the CVS which allows you to do just that. Here is how it works:
We are also going to release a simple user-land tool to set the node mask shortly, but I would like you guys to give it a try asap before we release it as openMosix 2.4.20-3.

Chapter 8. Tuning Mosix

8.1. Introduction

Some of the parts below are still from the old Mosix HOWTO. As time passes these parts will be replaced by relevant openMosix parts. Some things are still the same, but your mileage may vary.

8.2. Creating a "Master" node

Although the openMosix architecture does not require a master node as such, you might want to have a head node from which you launch processes; this might be a multihomed node from which users log in to your cluster. You then want to configure this machine so that processes migrate away from it. To do so, you have to trick the node into thinking it is the slowest node around, so that it had better migrate all its processes to the faster nodes. You will have to make it "slow" with:
8.3. Optimizing Mosix

Editorial Comment: To be checked with openMosix versions.

Log in to a normal terminal as root. Type
8.4. Channel Bonding Made Easy

Contributed by Evan Hisey

Channel bonding is actually horribly easy. This may explain the lack of documentation on this subject. A bonded network appears as a normal network to the applications. All machines on a subnet must be bonded the same way; bonded and non-bonded machines really don't talk well to each other. Channel bonding needs at least two physical subnets but can have more (currently I have a tri-bonded cluster). To enable bonding you need to compile the Channel Bonding kernel code either into the kernel or as a module (bonding.o); as of 2.4.x it is a standard option of the kernel. The NICs are set up as normal, except that you only use 'ifconfig' to initialize the first card of the bond. 'ifenslave' is used to initialize the remaining cards in the bonded connection. 'ifenslave' can be found in the linux/Documentation/networking/ directory. It will need to be compiled as it is a .c file. The basic format for use is
8.5. Updatedb

Updatedb in combination with mfs can cause some issues. You might want to add /mfs to PRUNEPATHS or mfs to PRUNEFS in your /etc/updatedb.conf to keep updatedb from indexing these mountpoints.

8.6. openMosix and FireWire

openMosix does gain performance by using another type of network device, as described in the paper about openMosix and FireWire.

Chapter 9. openMosixview

9.1. Introduction

openMosixview is the next version and a complete rewrite of Mosixview. It is a cluster-management GUI for openMosix clusters, and everybody is invited to download and use it (at your own risk and responsibility). The openMosixview suite contains 5 useful applications for monitoring and administrating an openMosix cluster.
All parts are accessible from the main application window. The most common openMosix commands are executable with a few mouse clicks. An advanced execution dialog helps to start applications on the cluster. "Priority-sliders" for each node simplify manual and automatic load-balancing. openMosixview is now adapted to the openMosix auto-discovery and gets all configuration values from the openMosix /proc interface.

9.2. openMosixview vs Mosixview

openMosixview is designed for openMosix clusters only. The Mosixview website (and all mirrors) will stay as they are, but all further development will continue with openMosixview, located at the new domain www.openmosixview.com

If you have questions, feature requests, problems during installation, comments, experiences to share etc., feel free to mail me, Matt Rechenburg, or subscribe to the openMosix/Mosixview mailing list and post there.

Changes (relative to Mosixview 1.1): openMosixview is a complete rewrite "from scratch" of Mosixview! It has the same functionality, but there are fundamental changes in ALL parts of the openMosixview source code. It is tested with a constantly changing cluster topology (required for the openMosix auto-discovery). All "buggy" parts are removed or rewritten and it (should ;) run much more stably now.
9.3. Installation

Requirements
Documentation about openMosixview

There is full HTML documentation about openMosixview included in every package. You will find the start page of the documentation in your openMosixview installation directory: openmosixview/openmosixview/docs/en/index.html

The RPM packages have their installation directories in: /usr/local/openmosixview

9.3.1. Installation of the RPM-distribution

Download the latest version of the openMosixview rpm package. Then just execute e.g.:
9.3.2. Installation of the source-distribution

Download the latest version of openMosixview, copy the tarball to e.g. /usr/local/ and unzip+untar the sources there.
9.3.3. Automatic setup-script

Just cd to the openmosixview directory and execute
9.3.4. Manual compiling

Set the QTDIR variable to your actual QT distribution, e.g.
9.3.5. Hints

(from the testers of openMosixview/Mosixview who compiled it on different Linux distributions, thanks again) Create the link /usr/lib/qt pointing to your QT-2.3.x installation, e.g. if QT-2.3.x is installed in /usr/local/qt-2.3.0
9.4. using openMosixview

9.4.1. main application

Here is a picture of the main application window. Its functionality is explained below.

openMosixview displays a row with a lamp, a button, a slider, an LCD number, two progress bars and some labels for each cluster member. The lights at the left display the openMosix ID and the status of the cluster node: red if down, green if available.

If you click on a button displaying the IP address of a node, a configuration dialog will pop up. It shows buttons to execute the most commonly used "mosctl" commands (described later in this HOWTO).

With the "speed-sliders" you can set the openMosix speed for each host. The current speed is displayed by the LCD number. You can influence the load-balancing of the whole cluster by changing these values. Processes in an openMosix cluster migrate more easily to a node with more openMosix speed than to nodes with less speed. Of course it is not the physical speed you set, but the speed openMosix "thinks" a node has. E.g. a CPU-intensive job on a cluster node whose speed is set to the lowest value of the whole cluster will look for a better processor to run on and migrate away easily.

The progress bars in the middle give an overview of the load on each cluster member. The load is displayed as a percentage, so it does not represent exactly the load written to the file /proc/hpc/nodes/x/load (by openMosix), but it should give an overview. The next progress bar shows the memory used on the nodes, as a percentage of the available memory on each host (the label to the right displays the available memory). The number of CPUs in your cluster is shown in the box to the right.

The first line of the main window contains a configuration button for "all-nodes". With this option you can configure all nodes in your cluster in the same way.

How well the load-balancing works is displayed by the progress bar in the top left. 100% is very good and means that all nodes have nearly the same load.

Use the collector and analyzer menus to manage the openMosixcollector and to open the openMosixanalyzer. These two parts of the openMosixview application suite are useful for getting an overview of your cluster over a longer period.

9.4.2. the configuration-window

This dialog will pop up if a "cluster-node" button is clicked. The openMosix configuration of each host can now be changed easily. All commands will be executed via "rsh" or "ssh" on the remote hosts (even on the local node), so "root" has to be able to "rsh" (or "ssh") to each host in the cluster without being prompted for a password (how to configure this is well described in Beowulf documentation or in the HOWTO on this page). The commands are:
If you are logged on to your cluster from a remote workstation, insert your local hostname in the edit-box below the "remote proc-box". openMosixprocs will then be displayed on your workstation and not on the cluster member you are logged on to (you may have to run "xhost +clusternode" on your workstation). There is a history in the combo-box, so you have to type the hostname only once.

9.4.3. advanced-execution

If you want to start jobs on your cluster, the "advanced execution" dialog may help you. Choose a program to start with the "run-prog" button (file-open icon), and then specify how and where the job is started in this execution dialog. There are several options to explain.

9.4.4. the command-line

You can specify additional command-line arguments in the lineedit widget at the top of the window.

Table 9-1. how to start
9.5. openMosixprocs

9.5.1. intro

This process box is really useful for managing the processes running on your cluster. The process list gives an overview of what is running where. The second column displays the openMosix node ID of each process: 0 means local, all other values are remote nodes. Migrated processes are marked with a green icon and non-movable processes have a lock.

Double-clicking a process in the list opens the migrator window, which is used for managing (e.g. migrating) the process. There are also options to migrate the remote processes away, to send SIGSTOP and SIGCONT to them or to "renice" them.

If you click on the "manage procs from remote" button, a new window will come up (the remote-procs window) displaying the processes currently migrated to this host.

9.5.2. the migrator-window

This dialog will pop up if a process in the process box is clicked. The openMosixview migrator window displays all nodes in your openMosix cluster. This window is for managing one process (with additional status information). Double-clicking a host in the list migrates the process to that host. After a short moment the process icon for the managed process will turn green, which means it is running remotely.

The "home" button sends the process to its home node. With the "best" button the process is sent to the best available node in your cluster. This migration is influenced by the load, speed, number of CPUs and what openMosix "thinks" of each node. It will probably migrate to the host with the most CPUs and/or the best speed. With the "kill" button you can kill the process immediately.

To pause a program just click the "SIGSTOP" button, and to continue it click the "SIGCONT" button. With the renice slider below you can renice the currently managed process (-20 means very fast, 0 normal and 20 very slow).

9.5.3. managing processes from remote

This dialog will pop up if the "manage procs from remote" button beneath the process box is clicked. The TabView displays processes that have migrated to the local host. These processes come from other nodes in your cluster and are currently computed on the host openMosixview is started on. As with the two buttons in the migrator window, the process is sent home by the "goto home node" button and sent to the best available node by the "goto best node" button.

9.6. openMosixcollector

The openMosixcollector is a daemon which should/could be started on one cluster member. It logs the openMosix load of each node to the directory /tmp/openmosixcollector/* These history log files, analyzed by the openMosixanalyzer (as described later), give a non-stop overview of the load, memory and processes in your cluster. There is one main log file called /tmp/openmosixcollector/cluster In addition, there are further files in this directory to which the data is written.

At startup the openMosixcollector writes its PID (process ID) to /var/run/openMosixcollector.pid

The openMosixcollector daemon restarts every 12 hours and saves the current history to /tmp/openmosixcollector[date]/* These backups are done automatically, but you can also trigger them manually.

There is an option to write a checkpoint to the history. These checkpoints are graphically marked as a blue vertical line if you analyze the history log files with the openMosixanalyzer. For example, you can set a checkpoint when you start a job on your cluster and another one at the end.

Here is the explanation of the possible command-line arguments:
You can start this daemon with its init script in /etc/init.d or /etc/rc.d/init.d. You just have to create a symbolic link in one of the runlevel directories for automatic startup.

How to analyze the created log files is described in the openMosixanalyzer section.

9.7. openMosixanalyzer

9.7.1. the load-overview

This picture shows the graphical Load-overview in the openMosixanalyzer

With the openMosixanalyzer you can have a non-stop openMosix history of your cluster. The history log files created by openMosixcollector are displayed graphically so that you have a long-term overview of what happened and happens on your cluster. The openMosixanalyzer can analyze the current "online" log files, but you can also open older backups of your openMosixcollector history logs via the file menu. The log files are placed in /tmp/openmosixcollector/* (the backups in /tmp/openmosixcollector[date]/*), and you only have to open the main history file "cluster" to take a look at older load information (the [date] in the backup directories for the log files is the date the history was saved). The start time is displayed at the top, and you have a full-day view in the openMosixanalyzer (12 h).

If you are using the openMosixanalyzer to look at "online" log files (current history), you can enable the "refresh" checkbox and the view will auto-refresh.

The load lines are normally black. If the load increases to >75 the lines are drawn in red. These values are openMosix information. The openMosixanalyzer gets this information from the files /proc/hpc/nodes/[openMosix ID]/*

The Find-out button of each node calculates several useful statistical values. Clicking it will open a small new window in which you get the average load and memory values and some more static and dynamic information about the specific node or the whole cluster.

9.7.2. statistical information about a cluster-node

If there are checkpoints written to the load history by the openMosixcollector, they are displayed as a vertical blue line. You can now compare the load values at a certain moment much more easily.

9.7.3. the memory-overview

This picture shows the graphical Memory-overview in the openMosixanalyzer

With the Memory-overview in the openMosixanalyzer you can have a non-stop memory history similar to the Load-overview. The history log files created by openMosixcollector are displayed graphically so that you have a long-term overview of what happened and happens on your cluster. It analyzes the current "online" log files, but you can also open older backups of your openMosixcollector history logs via the file menu.

The displayed values are openMosix information. The openMosixanalyzer gets this information from the files
If there are checkpoints written to the memory history by the openMosixcollector, they are displayed as a vertical blue line.

9.7.4. openMosixhistory

displays the processlist from the past

openMosixhistory gives a detailed overview of which process was running on which node. The openMosixcollector saves the process list from the host the collector was started on, and you can browse this log data with openMosixhistory. You can easily change the browsing time in openMosixhistory with the time slider. openMosixhistory can analyze the current "online" log files, but you can also open older backups of your openMosixcollector history logs via the file menu. The log files are placed in /tmp/openmosixcollector/* (the backups in /tmp/openmosixcollector[date]/*), and you only have to open the main history file "cluster" to take a look at older load information (the [date] in the backup directories for the log files is the date the history was saved). The start time is displayed at the top left and you have a 12 hour view in openMosixhistory.

9.8. openMosixmigmon

9.8.1. General

The openMosixmigmon is a monitor for migrations in your openMosix cluster. It displays all your nodes as little penguins sitting in a circle (the nodes-circle). The main penguin is the node on which openMosixmigmon runs, and around this node it shows its processes, also in a circle, as small black squares (the main process-circle). If a process migrates to one of the nodes, that node gets its own process-circle and the process moves from the main process-circle to the remote process-circle. The process is then marked green, and a line is drawn from its origin to its remote location to visualize the migration.

9.9. openmosixview FAQ
At first QT >= 2.3.x is required. The QTDIR environment variable has to be set to your QT installation directory, as described in the INSTALL file. In versions < 0.6 you can do a "make clean", delete the two files /openmosixview/Makefile and /openmosixview/config.cache, and try to compile again, because I always left the binary and object files in older versions. If you have any other problems, post them to the openMosixview mailing list (or directly to me).

Yes, as of version 0.7 there is built-in SSH support. You have to be able to ssh to each node in your cluster without a password (just as is required when using RSH).

Do not fork openMosixview in the background with & (e.g. openMosixview &). Maybe you cannot rsh/ssh (depending on what you want to use) as user root without a password to each node? Try "rsh hostname" as root. You should not be prompted for a password but should soon get a login shell. (If you use SSH, try "ssh hostname" as root.) You have to be root on the cluster because the administrative commands executed by openMosixview require root privileges. openMosixview uses "rsh" as the default! If you only have "ssh" installed on your cluster, edit (or create) the file /root/.openMosixview and put "1111" in it. This is the main configuration file for openMosixview, and the last "1" stands for "use ssh instead of rsh". This will cause openMosixview to use "ssh" even for the first start.

The openMosixview client is executed via rsh (or ssh, which you can configure with a checkbox) on the remote host. It has to be installed in /usr/bin/ on each node. If you use RSH try: "xhost +hostname" "rsh hostname /usr/bin/openMosixview_client -display your_local_host_name:0.0" or if you use SSH try: "xhost +hostname" "ssh hostname /usr/bin/openMosixview_client -display your_local_host_name:0.0" If this works it will work in openMosixview too.

openMosixview crashes with "segmentation fault"!
Maybe you still use an old version of openMosixview/Mosixview? in the mosix.map-parser (which is completely removed in openMosixview!!) (the versions openMosixview 1.2 and Mosixview > 1.0 are stable)

(automigration on/off, blocking on/off......) I want them to be preselected too. The problem is getting the information from each node. You have to log in to each cluster node because this information is not cluster-wide (to my mind). The status of each node is stored in the /proc/hpc/admin directory of that node. Anybody who knows a good way to get this information easily is invited to mail me.

9.10. openMosixview + ssh:

(this HowTo is for SSH2) You can read the reasons why you should use SSH instead of RSH every day in the newspaper, whenever another script kiddie has hacked into an insecure system/network. So SSH is a good decision in any case.
First, a running secure-shell daemon on the remote site is required. If it is not already installed, install it! (rpm -i [sshd_rpm_package_from_your_linux_distribution_cd]) If it is not already running, start it with:
If you ssh to this remote host now, you will be prompted for the passphrase of your key. Giving the right passphrase should give you a login. What is the advantage right now??? The passphrase is normally a lot longer than a password! The advantage comes from using the ssh-agent, which manages the passphrase during ssh login.
You just have to add your key to the ssh-agent with the ssh-add command.
You could (should) add the ssh-agent and ssh-add commands to your login profile, e.g.
openMosixview

There is a menu entry which toggles between using rsh and ssh with openMosixview. Just enable this and you can use openMosixview even in insecure network environments. You should also save this configuration (the possibility of saving the current config in openMosixview was added in version 0.7) because it gets initial data from the slave using rsh or ssh (just as you configured). If you choose a service which is not installed properly, openMosixview will not work! (E.g. if you cannot rsh to a slave without being prompted for a password, you cannot use openMosixview with RSH; if you cannot ssh to a slave without being prompted for a password, you cannot use openMosixview with SSH.)

Chapter 10. Other openMosix related Programs

10.1. Introduction

There are a couple of different applications available to monitor and administer openMosix. We give a short overview of them in this chapter, without going into much detail.

10.2. openMosixView

openMosixview is the most used and best known application for openMosix administration; you can read more about it in the openMosixview chapter.

10.3. openMosixapplet

The openMosixApplet lets you watch the realtime load of your openMosix cluster. It consists of a local daemon which listens for connections from applets. The applet uses chart2D to provide a good-lookin' feeling.

10.4. wmonload

wmonload is a simple but handy and small dockapp for getting an overview of the load of the cluster nodes in a small openMosix-based cluster.

10.5. openMosixWebView

openMosixWebView produces web charts for monitoring an openMosix cluster. It is a PHP script for monitoring an openMosix cluster via the web, and it uses openMosixview's openMosixCollector logs. The last release is openmosixwebview-0.2.12.tar.gz (16 Feb 2003). It is released under the GNU General Public License (GPL); see its README and FAQ files.

Chapter 11. Common Problems

11.1. Introduction

Most of the issues in this chapter could also be part of the FAQ. However, where the FAQ gives a short "how to solve" answer, here we take a closer look at them and explain why they are problems and how to solve them.

11.2. My processes won't migrate

Help, process XYZ doesn't migrate! Moshe Bar explains below why some processes migrate and why some don't. But before that, you can always look in /proc/$pid/; there is often a cantmove file which will tell you why a certain process can't migrate.

Processes can also be locked. You can check if a process is locked with:
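The same check can also be scripted. The Python sketch below assumes that the per-process entries are named /proc/<pid>/lock and /proc/<pid>/cantmove and that a lock value of 1 means the process is locked; the name of the lock entry and the meaning of its values are assumptions, so verify them on your own kernel. The helper names are hypothetical, and the proc_root parameter only exists so that the functions can be tried out on a directory tree other than the real /proc.

```python
from pathlib import Path


def migration_status(pid, proc_root="/proc"):
    """Return the contents of the lock and cantmove entries of one process.

    Missing or unreadable entries (for example on a non-openMosix kernel)
    are reported as None instead of raising an exception.
    """
    status = {}
    for name in ("lock", "cantmove"):
        entry = Path(proc_root) / str(pid) / name
        try:
            status[name] = entry.read_text().strip()
        except OSError:
            status[name] = None
    return status


def is_locked(pid, proc_root="/proc"):
    """True if the lock entry reads 1 (assumed here to mean 'locked')."""
    return migration_status(pid, proc_root)["lock"] == "1"
```

A call such as migration_status(os.getpid()) then shows both values for the current process in one step, with None wherever the kernel does not provide the entry.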
Now listen to what Moshe himself has to say about this topic.

Often people have the same kernel but on different distributions, say a mixed environment of RedHat and Debian. The rc scripts from different distros tend to start openMosix differently. Some implementations completely modify /etc/inittab to start all daemons (and their children) with
OK, this simple program should always migrate if launched more times than the number of local CPUs. So for a 2-way SMP system, starting this program 3 times will start migration if the other nodes in the cluster have at least the same speed as the local ones:
A sample program like the following will never migrate:
Programs using pipes like this one do migrate nicely:
(All of the above code is by Moshe, as Moshe Bar or as CTO of Qlusters, Inc.)

Please also refer to the openMosix man pages; they also give an adequate explanation of why processes don't migrate.

If for some reason your processes stay locked while they shouldn't, you can try to allow locked processes to migrate by simply putting
11.3. I don't see all my nodes

First of all, are you using the same kernel version on each machine? "Same kernel" refers to the version. You can build different kernel images from the same source version to meet the hardware/software needs of a given node. However, you will need to make sure that when you install openMosix on your cluster, all your machines have the openmosix-x.x.x-y kernel installed, as opposed to having one machine running openmosix-x.x.z-x, another running openmosix-x.x.x-y, another running openmosix x.x.x-z, and so on.

When you run mosmon, press t to see the total number of machines running. Does it warn you that mosix is not running?

If yes, then make sure your machine's IP is included in /etc/mosix.map (don't use 127.0.0.1; if your machine's IP is such, then you probably have problems with your dhcp server/nameserver).

If it does not tell you that mosix is not running, see what machines show up. Do you see only your machine? If yes, then your machine is most likely running a firewall and is not letting openMosix through. If not, then the problem is most likely with the machine that doesn't show up.

Also: do you have two NICs on a node? Then you have to edit the /etc/hosts file to have a line that has the following format
Maybe you used different kernel parameters on each machine? Especially if you use the 'Support clusters with a complex network topology' option, you should take care that you use the same value for the accompanying option 'Maximum network-topology complexity support' on each machine.

11.4. I often get errors: No such process

I often get the error
The above line means that the shell you were using has actually migrated to another node. This printout from bash is caused by a bug in old versions of openMosix, but a fix has been committed. (Muli Ben-Yehuda mulix@actcom.co.il)

11.5. DFSA ? MFS ?

People often get confused about what exactly MFS and DFSA are. As discussed earlier in this HOWTO, MFS is the feature of openMosix that gives you access to remote filesystems as if they were locally mounted. They are mostly mounted on /mfs. A common misunderstanding is that you need MFS in order to have openMosix working. This is not true; however, it can make things easier.

With DFSA enabled, system calls will be executed on the remote node without migrating the process back to its home node. This behaviour (direct filesystem access) causes processes to migrate to the data and not the other way around (which is common). If DFSA is not enabled, MFS is "just" a non-caching network filesystem.

Very generally speaking, if you don't have DFSA turned on, each and every I/O will go to the home node for execution. With DFSA turned on, if the file happens to reside on the node where the process finds itself, then the I/O will happen locally.

A very common error is that people mix kernels with DFSA enabled and disabled. So one has to have a way to find out whether DFSA is actually enabled. This information can be obtained by typing
11.6. Python Troubles

Some people have reported problems with Python. Closer research showed that these problems were not with openMosix but rather with glibc, although the issues seemed to manifest themselves especially on openMosix. One user solved the problem by removing /lib/i686/lib* on his machine and letting the applications link dynamically against /lib/libpthread (and others). However, bugfixes in newer glibc versions combined with more recent openMosix versions seem to have solved these problems.

Chapter 12. Hints and Tips

12.1. Locked Processes

If for some reason you find your processes are always locked to your home node and you can't find the reason, you can put the following lines into your ~/.profile as a stop-gap measure to automatically enable migration:
12.2. Choosing your processes

You will probably want to test your setup before deciding which programs you want to enable migration for. For example, if you are running KDE2 on a slow machine and have a significantly faster machine as part of your Mosix cluster, you might find resource-hungry programs like kmail are migrated out. This is not a bad thing as such; however, it can lead to a brief moment when your writing is not displayed on the screen immediately.

12.3. Java and openMosix

Green Threads JVMs allow for migration because each Java thread is a separate process. Threads other than those of Java green thread JVMs cannot be migrated by Linux, so openMosix cannot migrate programs that use them. If you have the source of your Java application, you might be able to compile the application to native code. In this case you might be able to migrate your application to another node. Gian Paolo Ghilardi wrote a paper titled Consideration on OpenMosix; it deals, amongst other topics, with Java and openMosix. http://www.democritos.it/events/openMosix/papers/crema.pdf

12.4. openMosix and Hyperthreading

Basically, openMosix performance increases with the current Linux scheduler when Hyperthreading is disabled. You can do this by either entering 'noht' as a boot option or disabling HT in the BIOS. For those who are still wondering what hyperthreading is: Intel explains it

12.5. openMosix and Firewalls

People often have questions regarding openMosix and firewalls. Amit helped me out on this matter:
The mig_daemon port is a TCP port; the info_daemon port is UDP. Hence tcp/4660 and udp/5428. Matt also mentions tcp/723 somewhere.

Chapter 13. (stress)Testing your openMosix installation

13.1. A small Test Script

The fastest way to test your openMosix cluster is by creating a small script with the following content.
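One possible variant of such a test, sketched here in Python, simply starts several independent CPU-bound processes and waits for them. Each worker is a separate interpreter running a busy loop with no I/O, which is the kind of process openMosix may decide to migrate; whether it actually does depends on your cluster, and this sketch has not been tried on openMosix. The helper names and all numbers are arbitrary assumptions.

```python
import subprocess
import sys

# Source of one worker: a nested busy loop with n * n iterations and no I/O.
BUSY_LOOP = (
    "import sys\n"
    "n = int(sys.argv[1])\n"
    "for i in range(n):\n"
    "    for j in range(n):\n"
    "        pass\n"
)


def start_workers(count, n):
    """Start `count` independent CPU-bound processes and return them."""
    return [
        subprocess.Popen([sys.executable, "-c", BUSY_LOOP, str(n)])
        for _ in range(count)
    ]


def wait_for(workers):
    """Wait for every worker and return the list of exit codes."""
    return [worker.wait() for worker in workers]
```

Running wait_for(start_workers(10, 10000)) keeps ten workers busy for a while; in the meantime you can watch with mosmon or mtop whether some of them move to other nodes.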
13.2. Perl Proggie by Charles Nadeau

Perl program to test an openMosix Cluster. Here is a quick program I wrote to test an openMosix cluster. This is taken from a posting I made to the openMosix-devel mailing list on March 6th, 2002:

"Charles wrote this little program (in Perl) to stress test his home cluster (3 P200MMX and a P166). It is a program simulating different sets of stocks in a portfolio for a given period of time. The code is well documented and it should be easy to add/remove stocks and change the average monthly yield and standard deviation for each stock. Since the problem of portfolio optimization cannot be solved analytically, it simulates a lot of portfolios and reports the results at the end. Please note that this program does not take stock correlation into account. It is not finished yet but it's a good start. I plan to add more code at the end of the program to improve the reporting format of the data (generating SVG graphs on the fly). But the simulation part works very well. In order to take advantage of the parallelism offered by openMosix, it uses the Perl module Parallel::ForkManager (from CPAN) to spawn threads that openMosix can then assign to all the machines of the cluster (it also requires another module for the statistical calculations; don't forget to install both, I provide the URLs in the comments of the code). Take a look at it and tell me what you think. Cheers!"
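The idea behind such a simulation can be sketched in Python as follows. This is not Charles's program: the stock names, yields and standard deviations are made up, the helper names are hypothetical, and os.fork with a pipe per child merely stands in for what Parallel::ForkManager does in Perl. Like the program described above, the sketch ignores correlation between stocks. It needs a system that provides os.fork, such as Linux.

```python
import os
import random
import statistics

# Hypothetical stocks: (average monthly yield, monthly standard deviation).
STOCKS = {"AAA": (0.010, 0.05), "BBB": (0.006, 0.02)}


def simulate(weights, months, runs, seed):
    """Return the mean final value of 1.0 invested for `months` months."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        value = 1.0
        for _ in range(months):
            # Independent normal returns per stock: correlation is ignored.
            growth = sum(w * rng.gauss(*STOCKS[name])
                         for name, w in weights.items())
            value *= 1.0 + growth
        finals.append(value)
    return statistics.mean(finals)


def simulate_in_children(portfolios, months=12, runs=1000):
    """Fork one child per portfolio and collect each result through a pipe."""
    readers = []
    for seed, weights in enumerate(portfolios):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, report through the pipe, never return.
            status = 1
            try:
                os.close(read_fd)
                result = simulate(weights, months, runs, seed)
                os.write(write_fd, repr(result).encode())
                status = 0
            finally:
                os._exit(status)
        os.close(write_fd)
        readers.append((pid, read_fd))
    results = []
    for pid, read_fd in readers:
        with os.fdopen(read_fd) as pipe:
            results.append(float(pipe.read()))
        os.waitpid(pid, 0)
    return results
```

Each portfolio is a dictionary of weights, for example {"AAA": 0.5, "BBB": 0.5}. Since every child is an ordinary process doing pure computation, raising months and runs gives openMosix longer-running candidates for migration.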
13.3. the openMosix stress-test

by Matt Rechenburg

13.3.1. General description

This stress test is made to test an openMosix cluster + kernel. It performs several application + kernel tests to check the stability and other features of openMosix (e.g. process migration, mfs, ...). During this test the cluster will be heavily loaded, so you should stop other running applications before starting it. When it has finished, it generates a detailed report about each component that was tested.

13.3.2. Detailed description

The openMosix stress-test runs several programs to check the functionality of the whole system. In the following part you will find a description of each test application:
13.3.3. Installing the strestest suiteFirst of all download the rpm or source package from http://www.openmosixview.com/omtest/
13.3.4. Running the tests
IV. Running Applications on openMosix
Chapter 14. Improving Compiling Performance
14.1. Introduction
This Section is a Work in Progress
Lots of people try to use openMosix as a kind of compile farm; often they come back very disappointed. This chapter of the HOWTO will try to explain in which cases your compilations will benefit from openMosix and how to improve your success rate.
First of all you have to remember one thing: openMosix will not migrate all processes you start on your cluster, only the ones that will benefit from migration to another node. For compiling this means that a process has to last long enough. Kernel compiles typically consist of numerous short compiles, none of which lasts long enough to actually migrate.
Chapter 15. Imaging with openMosix
15.1. Introduction
This Section is a Work in Progress
Computer graphics applications have always required a lot of CPU power, and this hasn't changed. In this chapter I will demonstrate with some practical examples how computer graphics can benefit from openMosix.
15.2. Povray
The Persistence of Vision Raytracer is a high-quality, totally free tool for creating stunning three-dimensional graphics. Ray-tracing is a rendering technique that calculates an image of a scene by simulating the way rays of light travel in the real world. However, it does its job backwards. In the real world, rays of light are emitted from a light source and illuminate objects. The light reflects off the objects or passes through transparent objects. This reflected light hits our eyes or perhaps a camera lens. Because the vast majority of rays never hit an observer, it would take forever to trace a scene that way, so the raytracer follows rays from the camera back into the scene instead.
These kinds of applications can easily be made parallel by using pvmpovray. Pvmpovray expects to be working on a Beowulf-style cluster and spreads its load to other nodes using pvm. The openMosix way of doing this is the same; however, we just do this on one machine and have openMosix do the load spreading work for you!
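To make the idea of a parallel render more concrete, here is a minimal Python sketch of how an image could be cut into independent horizontal strips, one per task. It is only an illustration under stated assumptions: pvmpovray does its own work division, and the function name split_rows and the image size used here are hypothetical. Each strip could, for instance, be handed to a separate render process using POV-Ray's start-row and end-row settings, and openMosix would then be free to migrate those processes.

```python
def split_rows(height, n_tasks):
    """Split image rows 1..height into n_tasks contiguous (start, end) strips.

    Rows are numbered from 1, and strip sizes differ by at most one row.
    """
    if n_tasks < 1 or height < n_tasks:
        raise ValueError("need 1 <= n_tasks <= height")
    base, extra = divmod(height, n_tasks)
    strips = []
    start = 1
    for i in range(n_tasks):
        # The first `extra` strips get one additional row each.
        rows = base + (1 if i < extra else 0)
        strips.append((start, start + rows - 1))
        start += rows
    return strips


if __name__ == "__main__":
    # Hypothetical example: a 480-row image split over 8 tasks.
    for start, end in split_rows(480, 8):
        print(start, end)
```

Equal-sized strips do not cost the same to render, since some parts of a scene are far more complex than others. That is one reason to start more tasks than there are CPUs, so that the load can even out.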
A great HOWTO on PVM Povray will show you how to set up PVMPovray. Below is a small summary.
If you are on a RH 8.0 box, I'd move libpng and zlib to .notused. This is to prevent version conflicts with other libpng and zlib versions.
One last thing that novice pvm users often don't know: pvm uses its own search paths, so you have to either put pvmpov in one of those paths or launch it with its complete pathname.
I had good results with 2 to 3 times the number of CPUs I had available.
Chapter 16. BioInformatics and openMosix
16.2. Blast
One of the more frequently used applications in this field is Blast. Blast has a patch available that makes it work more smoothly with openMosix, but that's not the only alternative. First of all, there are some known problems with this patch and with other versions of Blast: Blast tends to segfault sometimes, mostly with the preformatted databases you download from the internet. If you run formatdb on a raw database, these errors tend to go away.
Next to the openMosix Blast patch, a lot of people run MPIBlast. Given the fact that openMosix tends to speed up MPI, adding openMosix to this configuration might give you even more power for your money; however, we will have to do some extra research to be able to confirm this.
V. openMosix Development
Chapter 17. Getting started with openMosix internals
17.1. Introduction
This part has been written by Amit Shah.
There's not much documentation available right now for the kernel. I hope to write some in the coming weeks. Anyway, here's how the sources are laid out:
The openMosix code resides largely in hpc/ and include/hpc. There are lots of patches to the core kernel files everywhere, right from the arch/i386 directories to mm/, fs/, etc.
You need to read the code that interests you and that you think matters for the present situation (that shouldn't be a problem, since you've done kernel coding).
Here's what you should expect in each of the source files:
Appendix A. More Info
A.1. irc
Some of the openMosix enthusiasts spend time online helping out people on irc. We are on irc.freenode.net on #openMosix. Just join us there to discuss your problems, ideas and other stuff about openMosix.
A.3. Translations
Some people have been working on partial translations of this HOWTO, or just plain openMosix documentation in their own language. If you are working on a translation of this document, let us know.
A.3.1. Chinese
Ding Wei has written some documents in Chinese; you can read them at http://software.ccidnet.com/pub/disp/Article?columnID=732&articleID=25795&pageNO=1 Here is a local copy of the Chinese doc in PDF.
A.3.2. Spanish
Together with some colleagues, Miquel Catalán Coït has been working on a Spanish translation of the HOWTO: http://w3.akamc2.net/
A.4. Links
Appendix B. Credits
The list of people who deserve credit for this HOWTO is long; I actually lost track of all the people that should be in here. I often add their names right next to the parts they have contributed. If you feel that your name is missing here, do not hesitate to contact me and I'll gladly put your name on the list.
Scot W. Stevenson: I have to thank Scot W. Stevenson for all the work he did on this HOWTO before I took over. He made a great start for this document.
Assaf Spanier worked together with Scot in drafting the layout and the chapters of this HOWTO, and has now promised to help me out with this document.
Matthias Rechenburg should be thanked for the work he did on openMosixview and the accompanying documentation, which we included in this HOWTO.
Jean-David Marrow is the author of Clump/OS; he contributed the documentation on his distribution to the HOWTO.
Bruce Knox is the maintainer of the openMosix website; he helps where he can and gives a lot of feedback!
Evan Hisey, for putting a lot of effort into putting extra documentation in the WIKI.
Charles Nadeau, for putting a lot of effort into putting extra documentation in the WIKI.
Louis Zechter
Moshe Bar, for writing the code he wrote and helping out with the docs wherever he knows the answers!
Amit Shah, for getting started with the openMosix internals.
Mirko Caserta, for sending in huge patches to this HOWTO.
Ramon Pons, for proofreading the HOWTO and sending in some advice.
Appendix C. GNU Free Documentation License
Version 1.1, March 2000
0. PREAMBLE
The purpose of this License is to make a manual, textbook, or other written document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or non-commercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.
This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.
1. APPLICABILITY AND DEFINITIONS
This License applies to any manual or other work that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you".
A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject.
(For example, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.
The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License.
The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License.
A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, whose contents can be viewed and edited directly and straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup has been designed to thwart or discourage subsequent modification by readers is not Transparent. A copy that is not "Transparent" is called "Opaque".
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML designed for human modification. Opaque formats include PostScript, PDF, proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML produced by some word processors for output purposes only.
The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
3. COPYING IN QUANTITY
If you publish printed copies of the Document numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.
You may add a section entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy.
If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections entitled "History" in the various original documents, forming one section entitled "History"; likewise combine any sections entitled "Acknowledgements", and any sections entitled "Dedications". You must delete all sections entitled "Endorsements."
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
7. AGGREGATION WITH INDEPENDENT WORKS
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an "aggregate", and this License does not apply to the other self-contained works thus compiled with the Document, on account of their being thus compiled, if they are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one quarter of the entire aggregate, the Document's Cover Texts may be placed on covers that surround only the Document within the aggregate. Otherwise they must appear on covers around the whole aggregate.
8. TRANSLATION
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License provided that you also include the original English version of this License. In case of a disagreement between the translation and the original English version of this License, the original English version will prevail.
9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
10. FUTURE REVISIONS OF THIS LICENSE
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number.
If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
How to use this License for your documents
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
If you have no Invariant Sections, write "with no Invariant Sections" instead of saying which ones are invariant. If you have no Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being LIST"; likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.