ITG Unix Support

Powering the IonMan cluster down and back up

Powering the whole system down

  • Log in to node 1.
  • Execute:
    sudo clrun "/sbin/shutdown -h 0" all
  • Verify that the systems have powered off (this takes about 30-90 seconds). A flashing power light (the small, round, green LED on the front panel) indicates the machine is off. If the power light is solid, find out why the machine is still on:
    - Hook up the monitor and keyboard, and look at the console to see where it is in the shutdown process.
    - If it seems hung, try one last <CTRL><ALT><DEL>.
    - If nothing happens, or it gets stuck again on sending a TERM signal to all processes, it may be time to cycle power. Once you are satisfied that waiting longer will not help, push and hold the power button for five seconds until the screen goes blank, then release it. The power LED should now be flashing.
    - If for some reason the power button will not power off the machine, you may pull its power cord.
  • Now that all the other nodes are powered down, on node 1 execute the following:
    sudo /sbin/shutdown -h 0
  • Power down the DS4100 ONLY when nodes 1 and 2 are down. Turn it off using the two rocker switches on the back of the unit.
  • To turn off the UPS (not usually necessary), hold down the Power button (the one with the circle).
  • The ethernet switch can be unplugged if it needs to be powered off.
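
Before shutting down node 1, it can help to confirm from node 1 that none of the other nodes still answers on the network. The following is a minimal sketch, not part of the original procedure. The host names in NODES are hypothetical placeholders, and the ping flags are those of the Linux iputils ping. A node that no longer answers ping may still be hung rather than powered off, so the flashing power LED remains the authoritative check.

```shell
#!/bin/sh
# Hypothetical host names: replace with the names the cluster really uses.
NODES="${NODES:-node2 node3 node4 node5}"

# Succeed if the host answers a single ping within 2 seconds.
# -c (count) and -W (timeout in seconds) are Linux iputils ping options.
probe() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Print the name of every host in the argument list that still answers.
still_up() {
    for host in "$@"; do
        if probe "$host"; then
            echo "$host"
        fi
    done
}

# Example use (NODES is left unquoted on purpose so it splits into words):
#   still_up $NODES
```

An empty result means no listed node answered; any name printed is a node worth checking at the console before going further.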

Powering on the system

  • Turn on the UPS (if off) - hold down the Power button (circle button).
  • Turn on the DS4100 if it is off (2 rocker switches on back).
  • Power on node 1 by pressing the power button for about a second. You may need to use a tool to push it. Also make sure the button comes back out when you release it; sometimes it gets caught behind the hole in the front panel, which causes the machine to shut off again. Wait for the login prompt.
  • Power on each of the Mascot nodes, 6 through 10, pausing several seconds between each.
  • Power on node 2. Wait for the login prompt.
  • Power on each of the PVFS nodes, 3 through 5, pausing several seconds between each. (Nodes 9 and 10 are also PVFS nodes, but because they are Mascot nodes as well, they should already be on.)
  • Power on each of the other nodes, pausing a few seconds between each.
  • Verify that the system is ready. On node 1 check these:
    - that jobhunter is running.
    - that PVFS is operational:
        sudo pvfs2-ping -m /scratch/parallel
    - that the Mascot search DB is in a good state at http://ionman-mascot/.
    - that shares are accessible via SMB.
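
Some of the checks above can be gathered into a small script run on node 1. This is only a sketch under stated assumptions: it assumes the jobhunter process has "jobhunter" somewhere in its command line, and the curl call only confirms that http://ionman-mascot/ answers, not that the search DB is in a good state, which still needs a look at the page itself. The SMB check is left out because the share names are not given here. The function names check and verify_cluster are made up for this example.

```shell
#!/bin/sh
# Run a command quietly and report whether it succeeded.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        return 1
    fi
}

# Run each readiness check and count the failures.
verify_cluster() {
    failures=0
    # Assumption: the jobhunter command line contains the word "jobhunter".
    check "jobhunter process" pgrep -f jobhunter || failures=$((failures + 1))
    # Same command as in the procedure above.
    check "PVFS" sudo pvfs2-ping -m /scratch/parallel || failures=$((failures + 1))
    # Reachability only; inspect the page to judge the DB state.
    check "Mascot web page" curl -fsS http://ionman-mascot/ || failures=$((failures + 1))
    echo "$failures check(s) failed"
    [ "$failures" -eq 0 ]
}

# Example use on node 1:
#   verify_cluster
```

Because the output of each command is discarded, rerun any failing command by hand to see its messages.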

 

Reference http://wiki.chem.indiana.edu/HPC/PoweringTheIonManClusterDownAndBackUp